The AIs are making visual art now!

I’ve written quite a bit here about the latest generation(s) of AIs that generate text, after reading much of the Internet and so on, and given some text to start with. I’ve played with AI Dungeon, NovelAI (which I see I haven’t blogged about much; it’s cool, and imho the UI is much better than, and the AI about as good as, AI Dungeon’s), whatever the heck is inside Replika these days, and Google’s own engine (paper is out!).

I’ve also blogged before about Art Breeder, which is cool, and lets one interact with software that includes some AI elements to make new strange (or realistic) pictures. While it uses AI, Art Breeder doesn’t have quite the wild and open-ended feel that the text generators do, because it knows specifically about certain kinds of images (faces, landscapes, etc), and it knows certain things about them (the “genes” that exist, and that people who pay a certain amount of money can add new ones to), and lets you mix and match and evolve within that rather structured framework, rather than just typing stuff.

Now I’ve been playing with some visual-art AIs that are more in the generator style. These have existed for a while (the earliest I can recall being OpenAI’s “DALL-E” (a cute pun upon “Dali” the artist and “WALL-E” the cute fictional robot)), but I haven’t noticed them being more or less freely available to lazy people until like this month.

The first one I’ll mention is the “2D” mode in AI Dungeon: paying members or something can now turn on a feature that will insert pixelated images into one’s stories, generated from the story text by (according to the help text) pixray (about which I know very little). For example:

I admit I don’t find the images especially… interesting. But the idea is at least kind of fun!

Next up is “dall-e mini” (punctuation and trademark status unclear to me). It’s very simple: you give it a word or phrase, and it displays a small grid of small images which it calls “predictions”, and which are… perhaps at least vaguely related to the words, and sometimes cool.

And yes, the reason I was prompting AI Dungeon’s GPT-3 to generate names for not-yet-created artworks up in the first image there, was so I could enter the names into things like dall-e mini, and the next one we’ll talk about.

That next one is Nightcafe, which is free to use in a relatively complex sense: anyone can sign up, and you get a certain number of “credits” to use on doing stuff, and as you hit various milestones (which come pretty fast at first; I don’t know about longer term!) you get more credits for free. Credits can also be purchased with, y’know, money. Every time you want to do a thing (create a new image, make an image higher-res, etc), you spend a credit or two.

So far, I’m having fun without having given them any money. You can see all of the things that I’ve generated so far on my profile page here, including this one generated from AI Dungeon’s “The Discomfort” idea in the first image up there. The main things I’ve discovered that one can do so far are:

  • Type in some word or phrase, with optional additions selected from a list of modifiers (like “concept art”, “surrealism”, “watercolor”), and push the button to have it generate an image. That’s how I made “The Discomfort” and “Song of Hidden Beauty” which I rather like, and some others.
  • Give it some existing image (either that it generated, or uploaded from anywhere at all, for instance from dall-e mini) and choose one of a number of “style” images, and push the button to do a “style transfer” from the “style” image to the image that you gave it. See this and this, which are the same image from dall-e mini prompted with “The most beautiful thing in the world”, with two different styles transferred onto it by Nightcafe. (I don’t know if you can have it apply the “style” from one arbitrary picture to another one; that would be neat but probably harder for the engine.)
  • Sort of do both, by giving it both a word or phrase and optional modifiers, and also an existing image (or more than one even maybe), and pushing a button to have it produce something derived from all of that. For instance “The Young Mother” here is a combination of this image (which looks like a pile of cloth to me, but a friend assures me contains body parts) and the phrase “The Young Mother”.
  • Order a physical print of one’s image, suitable for framing and/or hanging! This is a fun thought, and I may do it eventually. It would look good with the Art Breeder image that I had put onto canvas by Google Photos the other month (did I tell y’all about that?) and that is now hanging over my bed.
  • Produce NFTs from the images (speaking of our recent posts on that subject). This one is pretty funny, in that the NightCafe site currently has a page called “Create NFT Art”, which basically just says that you can use their stuff to produce a cool image, and then download it and use something else to make an NFT and sell it and stuff. I don’t know if this is just an amusing low-effort way to hook into the NFT hype, or a placeholder for adding a “Make into an NFT on OpenSea” button or whatever later on.

Amusingly, the relevant help / settings text on AI Dungeon says “Creation of NFTs with AI Dungeon 2D is not currently allowed”. For what it’s worth…

Anyway! This is all pretty fun. I feel like the amazingness of GPT-style text generators has somewhat worn off, and their output has a common “convincingly-worded text without any actual understanding behind it” feeling to it, although it took quite a while for that to happen, and it’s still fun to play with now and then. And I’m already starting to suspect that the (“GPT-style”?) visual art that I’ve been looking at already has that kind of feeling to it, as though even though the images can be very different, they still somehow have the same sort of vibe.

(I don’t know if I’m claiming that I could tell an AI-generated sample (of text or art) from a human-generated one in an appropriate set of blind tests. That might be interesting!)

I haven’t really looked into what these things are trained on, for instance. I think to first order it’s some big dataset of images that have had words / descriptions attached to them by humans. How big is it? Are they all based on the same one? Are people looking into other things to train on? Etc etc? I don’t know! Maybe I will find that stuff all out eventually.

Meanwhile, here’s the first image I made with Nightcafe: Frightened Business Men with Lamps (concept art, film noir). Enjoy!

Frightened business men with lamps

Update: And how could I forget, the very notable Twitterbot: ai_curio_bot? Endless AI-generated images based on phrases entered by Twitter users! With a very similar feel yet again; not sure what the backend is here either.
