I took it into my head for some reason to see what Midjourney would do with little sub-semantic phonemes, like say “ton glam so”. When I first tried it, the results had letters (and not-quite-letters) all over them, and/or were all just faces, so I added the switches “–no words, letters, text –no face” to the prompt.
I did that as two separate –no switches without thinking, but in retrospect that may have resulted in a weight of one (1) for “ton glam so”, and weights of -0.5 each for “words, letters, text” and “face”, resulting in a total weight of zero (0), which is known to do weird / fun things (I thought I had mentioned that here earlier, but apparently not).
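If you want to check that arithmetic yourself, here’s a tiny sketch (assuming, as the Midjourney docs suggest, that each –no switch acts roughly like a ::-0.5 weighted multi-prompt term; the exact internals are anybody’s guess):

```python
# Hedged sketch: Midjourney's --no flag is (per its docs) roughly equivalent
# to appending the negated terms with weight -0.5, as in "words::-0.5".
# If that's right, the prompt in question weighs out like this:

def total_weight(weights):
    """Sum the weights of all parts of a multi-prompt."""
    return sum(weights.values())

prompt_weights = {
    "ton glam so": 1.0,            # the main prompt, default weight 1
    "words, letters, text": -0.5,  # first --no switch
    "face": -0.5,                  # second --no switch
}

print(total_weight(prompt_weights))  # 0.0 -- the weird / fun zero-total case
```

A single combined “–no words, letters, text, face” would instead give 1.0 − 0.5 = 0.5 total, which may be why the two-switch version behaves differently.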
With those switches, our initial “ton glam so” produces the rather noteworthy:
Possibly the “glam” makes “glamour” or even “glamor” salient in the model? But these are not, well, the images that I would have expected to be most salient under the category of “glamour”.
The same switches with the text prompt “so bel wip” produces the also, but very differently, noteworthy:
No relationship to “so bel wip” occurs to me, but it’s certainly consistent! Wondering if this was due to some common seed or something, I tried it again, and got:
which, whoa, definitely very similar. One more time for good luck?
I tried adding “–chaos 70”, which does something or other, and got this:
The same but just a bit more variety; two kids, possibly white, one with pointy ears, and so on. But the same interesting clothes and general style. Fascinatin’!
I tried another text prompt (without the –chaos) “plin bo san”, and got these delightful things:
Does “plin bo san” make “plane” and maybe “boat” salient? Does “san” somehow specify the aesthetic? So fascinating! What if we change the aspect ratio to three wide by two high?
OMG so delightful. I love all of these! Next, I tried “tem wo sec” and…
I mean… what?!
Then, “lus dab ba” with –chaos 60:
“mai rem den” with –chaos 70:
Ahhhh what even is happening? What are all these things??
I’m stopping now because my brain is tired, and it’s challenging to write alt-text for these! But wow, eh? Whatever is going on with these things? These are all Midjourney v4, I’m pretty sure, because that’s the default at the moment and I didn’t specify. I’m guessing the total weight of zero is part of what’s causing… whatever this is.
As I’m sure you’ve heard there’s a new level of GPT in the world. Friend Steve has been playing with it, and says that it does seem to do some stuff better, but also still makes stuff up amusingly and all. At the moment for whatever reason I can’t be arsed to investigate, or even read yet more hype / analysis about it. Similarly, Google announced a thing, and Microsoft is putting LLMs into various products whose names I don’t recognize, and I’m not reading about any of that. NovelAI’s good old open-source model works fine for all of the telling-weird-stories stuff that I need right now.
And there’s a test version of a new Midjourney engine out! Being tested! And it seems pretty cool. Hands in particular seem much more likely to have five fingers when you’d expect them to, which is a whole thing.
And I spent too much time arguing with people on the Twitter, which isn’t at all new, and which I definitely shouldn’t do because it is not healthy. So I’m trying to stop that.
Now I’m just making pretty pictures! And not thinking very much until later on sometime!
Lots of weather in those, eh? Hadn’t noticed that. :)
On art made with AI tools, that is. Reuters story here, actual letter from the Office lawyer here.
I haven’t read the whole letter in detail yet (it’s long!) but I’ve looked it over and have Initial Thoughts:
I don’t think there’s a fact-of-the-matter here, about what is copyrightable when. There are legal theories that make more and less sense, that are more and less consistent with other established theories, and so on. But these are not theories that try to model something in the real world, like the Theory of Relativity; they are more theories in the sense of Set Theory. So the Office can’t really be right or wrong here overall, but they can have made a more or less sensible decision.
The overall finding of the memo is that Kristina Kashtanova still has a copyright on Zarya of the Dawn, but only on the text, and “the selection, coordination, and arrangement of the Work’s written and visual elements”, not on the visual elements themselves (i.e. the images made with Midjourney), because those images don’t involve “sufficient creative input or intervention from a human author.”
This seems wrong to me; as other places in the document point out, the case law says that “only a modicum of creativity is necessary”, and there is certainly a modicum of creativity in prompt design and engine usage.
The argument here seems to be, not that there isn’t enough creativity in the prompts and flags and so on, but that the connection between the artist’s input and the image output isn’t strong enough. The memo says things like ‘Rather than a tool that Ms. Kashtanova controlled and guided to reach her desired image, Midjourney generates images in an unpredictable way. Accordingly, Midjourney users are not the “authors” for copyright purposes of the images the technology generates.’
But where is the existing doctrine that says anything about predictability? Jackson Pollock might like a word, and the creator of any other roughly uncontrolled or algorithmic or found-object work. The theory here seems to be that Midjourney prompts are just suggestions or ideas, and those can’t be copyrighted. Does that mean that since Pollock just had the idea of splashing paint onto canvas, and the unpredictable physics of the paint cans and the air produced the actual work, that “Autumn Rhythm” can’t be copyrighted? Or are they going to hold that there is a legal significance to the fact that the detailed movements of his arm muscles were involved? That seems dicey.
For the Office to claim that the prompts and other input did contain at least a modicum of creativity (which seems undeniable) but that that input wasn’t strongly enough connected to the output, seems to be inventing a new legal test, which it’s not at all clear to me that the Office can do on its own hook, can it?
This memo may be specifically designed to be contested, so that the question can go to a court that can do that kind of thing.
The memo may have interesting consequences for Thaler, in particular the cases in which Thaler attempted to claim copyright under work-for-hire theory, with his software as the creator. The memo explicitly makes the comparison with human work-for-hire, saying that if someone had given the same instructions to a human artist that are contained in a Midjourney prompt, and the human artist had made an image, then the person giving the instructions would not have been the creator unless work-for-hire applies (the human carrying out the instructions would have been the creator-in-fact), and that therefore they aren’t in the Midjourney case either.
To be consistent with both the memo and Thaler, the theory seems like it has to be that Midjourney is the creator-in-fact, and therefore the human isn’t (and can’t get a direct copyright as the creator), but also that software can’t be hired in the work-for-hire sense and therefore the human can’t get the copyright that way either. Which seems odd! It seems to acknowledge that the software is the creator-in-fact, but then deny both making the software the creator-in-law (because not human) and making the user the creator-in-law via work-for-hire (because I’m-not-sure).
Some other countries are different and imho somewhat more sensible about this, as in the UK’s Copyright, Designs, and Patents Act, of which Section 178 explicitly talks about “computer-generated” works, meaning “that the work is generated by computer in circumstances such that there is no human author of the work”. That’s still imho a little sketchy (I continue to think that Kashtanova is in fact the human author of the images in Zarya), but at least it then provides that “In the case of a literary, dramatic, musical or artistic work which is computer-generated, the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken.”
There’s still some room for doubt there, as for instance whether it’s Kashtanova or the Midjourney people or some combination who relevantly undertook the arrangements, but at least we aren’t in the position of saying that the author is a being that is not legally allowed to either be a creator, or confer creatorship to a human via work-for-hire.
In the case of the many, many currently-registered copyrights on images made with AI tools (including mine), it seems that if the copyright office is notified, or notices, that fact, they are likely to cancel / withdraw the registration. The theory will be that the registration materials were incorrect when they named the creator as the author of the work, without in any way informing the Copyright Office that an AI tool was used. I could, for instance, send the Copyright Office a note saying “oh by the way I hear that you want to know when AI tools are used, and in my case Midjourney was”, and then they might cancel my registration on their (imho mistaken) theory that I’m not really the author.
Since I believe their theory is mistaken, I’m not currently planning to do that. :)
If they discover it on their own hook and send me a letter telling me they’re withdrawing the registration, I will do whatever easy thing one can do to contest that, but I’m not going to like hire a lawyer or anything; life’s too short.
I’m very curious to see what others do; I would expect that Midjourney itself (assuming it’s big enough to have lawyers) will have their lawyers working on a response to this memo.
My copyrights on the Klara trilogy and Ice Dreams (casually announced here) are secure, as to the text and the image selection and arrangement and all, just not to the images per se. Which is fine. And I haven’t registered those anyway. :)
I should go back and add a note to all of my existing copyright weblog entries, pointing at this one; or, more sustainably, pointing at the entire “copyright” tag on the weblog here. Then I won’t have to keep updating it.
I’m quite happy I decided not to worry too much about this whole thing, and just make pretty pictures (see pretty picture of concerned purple aliens above).
Updates: as this is a developing topic (as opposed to my usual topics which are Timeless Truths of the Universe), you may want to check the copyright tag on the weblog here for later updates, if this post is more than a week or month old.
The story of Klara, written by me channeling the Collective Unconscious, illustrated by me using Midjourney, and narrated and set to music and videographed by the talented Karima Hoisan, is finally finished!
I originally thought it was finished at the end of the first forty-frame thing; and then when I did Part Two at about the same length, I thought it was finished; and now having struggled for months on Part Three I’m pretty sure it actually is done. :)
Having just watched Karima’s videos of all three parts in order (playlist here!), I’m glad various viewers convinced me not to stop at one or two parts. It’s pretty good!
And I say this with all modesty; I feel like this story came through me, more than like it is something that I did. The comments over in Karima’s weblog, and her narration, have suggested various meanings and facets to me that I hadn’t thought of before.
In terms of the experience of creating it, it’s been interesting to see the various phases of interaction with the AI tool. I started out Part One by creating various variations of the prompt “detailed surrealism” on the v3 engine on Midjourney, and then weaving the story around pretty much whatever came out.
It happens that in v3, that prompt pretty reliably produces scenes from a stylistically coherent universe, including the MJ Girl, who plays the part of Klara in the first two parts. In Part Two, I had a bit more of an idea of what I wanted to happen, in a general way, but continued using v3 and the same prompt. This required somewhat more work, because it would produce images that didn’t fit with the story I wanted, so I had to put those aside and make more. But the style was at least not much trouble.
Part Three was quite different. For plot reasons, being in basically a different reality, the style needed to be different. It was relatively easy to do that, by using the “test” and “testp” engines, either alone or by “remastering” images made under v3. But the resulting images, while different from those of the first two parts, weren’t nearly as consistent among themselves as those of parts one and two. So I had to play around a lot more with the workflows and the prompts, and produce quite a few more pictures, to get a reasonably consistent style.
The style of Part Three still shifts around quite a bit; the flavor of the city, the color of Klara’s hair, the cat’s fur, and many other things change somewhat from panel to panel, but I wanted a nice mixture of consistent and in flux; and that took work!
Then there was the Story issue. The beginning “recap” part of Part Three was relatively easy that way, summarizing the story of the first two parts from a different point of view. But then I quickly got stuck; I wanted to do something more satisfying and less random than I would get by letting the AI’s raw output drive the action. For whatever reason, it took me quite a while to find the story thread that I liked, and then about as long to create (or obtain, if you prefer!) the images to go with it.
(The images still drove the narrative to some extent; for instance the firefly line, which I adore, was inspired by the image that goes with it, not vice-versa.)
But finally I finished! :) And Karima made the video in record time, and there it is! Woooo!
I keep feeling like I should make it into good PDFs, or something (even) more readable, and officially post links to that; maybe even have it printed somewhere onto atoms. On the other hand, without the narrative and music and video, it would hardly be the same… :)
I asked Midjourney for some simple proofs of the Pythagorean Theorem. The results make me happy. :)
(On the text side: GPT-2 and even GPT-3 might have hallucinated something interesting. ChatGPT would just error out a few times and then give a boring literal description of one in a condescending tone. My ability to be interested in ChatGPT as an interaction partner is severely limited by how boring it is. But anyway, back to the pictures!)
Presented without comment (beyond the alt text):
I hope you find these at least as amusing, endearing, and/or thought-provoking as I do. :)
Well, it turns out that Midjourney does, maybe, to an extent. For maybe a few works?
The one that’s gotten the most attention is the 1984 photograph of Sharbat Gula by Steve McCurry, popularly known as “Afghan Girl”. The strings “afghan girl” and (haha) “afgan girl” are prohibited in Midjourney prompts at the moment. (“The phrase afghan girl is banned. Circumventing this filter to violate our rules may result in your access being revoked.”) And this is apparently because that phrase all by itself elicits what are arguably just slight variations of the original.
There’s a Twitter post that claims to show this, but I’m not certain enough it’s real to link to it. Also it’s on Twitter. But I can say that entering similar non-banned phrases like “young Afghan woman” also produces images that are at least quite similar to the photo of Gula, more similar than I would have expected. Given the size of the Midjourney training set, that image in association with those words must occur a lot of times!
(Update: it seems likely that the most widely-circulated image purporting to show Midjourney spontaneously generating close copies of the Gula “Afghan Girl” picture, is not actually that: it was made by giving the AI a copy of the original photo (!) and the prompt “afghan girl, digital art”. That the AI can make a copy of a work, given a copy of the work, is no surprise! Evidence, on a link probably usable only if you’re logged into Midjourney, is here. Given the further examples below, this doesn’t entirely undercut the point, but it’s interesting.)
The other example that I know of is “Starry Night”, which brings up variations of the van Gogh piece. This one’s out of copyright :) so I have no qualms about posting what I got:
Pretty obviously derivative in the usual sense. Derivative Work in the legal sense? I have no idea, and copyright law is sufficiently squishy and subjective that there is probably not a correct answer until and unless explicitly litigated, or the legal landscape otherwise changes significantly.
Are there other short phrases that will home in on a particular famous image? “Mona Lisa” (also out of copyright) certainly seems to:
Interesting and/or hideous variations, but still instantly recognizable.
What else might we try? “Migrant Mother” produces images that I think are clearly not derivative works:
Striking perhaps, ruined by the bizarre hands perhaps, in the same general category as the photo by Lange, but clearly of different people, in different positions, and so on. It’s not “plagiarizing” here, at any rate.
What if we tried harder? Let’s explicitly prompt with like “Migrant Mother photo, Dorothea Lange, 1936”. Whoa, yipes! Is this out of copyright? Well, if not it’s probably Fair Use in this posting anyway, so here:
Definitely derivative, and possibly Derivative. How about “Moon and Half Dome, Ansel Adams, 1960”? Well:
This is definitely not the picture that that search will get you in Google Images; if nothing else, the moon is way too large, and the top of Half Dome is a bizarre penguin-bill sort of shape. I’m guessing that this is because there are lots of other Ansel Adams pictures in the training set associated with words like “moon” and “half dome”, and mushing them all together quasi-semantically gives this set. The origin of the penguin-bill I dunno.
Maybe “Beatles Abbey Road cover, crossing the street”?
Crosswalk, front guy in white, roundish car to the left, check. Derivative in various senses, for sure. More specific prompting could presumably increase the exactness.
So I think we’ve established, to the extent of the tiny number of experiments I have the energy to do, that Midjourney (and, I would wager, other AI art tools, mutatis mutandis; I could get a Starry Night easily out of NightCafe, but not a Migrant Mother) can in fact produce images, the production of which arguably violates one or more of the rights of the copyright holder. It is most likely to do it if you explicitly try to do it (giving the most famous name of the image along with the artist and ideally the year and anything else that might help), but can also do it by accident (innocently typing “afghan girl”).
This doesn’t mean that these tools usually or typically do this; the fact that you can get an apparently infringing image out of a tool doesn’t mean that other images made with it are also infringing. To use the usual comparison, you can easily violate copyright using Photoshop, but that doesn’t suggest that there aren’t non-infringing uses of Photoshop, nor does it provide evidence that any particular image from Photoshop is infringing.
The easiest way to think about the blocking of “afg{h}an girl” from Midjourney prompts is that they have made a tool, realized that it could be used to violate copyright, and taken action to make it more difficult to use it that way in some cases.
This all bears on the question of whether images made with AI tools violate copyrights; the question of whether making the AI tools in the first place involves an infringing use is a somewhat different question, and we might talk about it some other time, although I’m still feeling kind of burnt out on the legal issues. But I did want to update on this one particular thing.
More updates: More stuff has happened in this area! For my latest deep and otherwise thoughts on the subject, I recommend the copyright tag on the weblog here.
But yes, in fact I’ve been using good ol’ Midjourney to make some wallpapers, and figured out how to get Windows to permute among them as desktop backgrounds on this brand-new Framework laptop I have (I should write a long boring geeky entry about my old Windows laptop breaking and my replacing it with this lovely new thing whose only disadvantage is that I’m still running Windows on it ewww), and I thought I would share them here as the first of the promised (or threatened) posts with tons of images made with Midjourney.
I think I will just do it as a big WordPress Gallery thing? Which means WordPress will I dunno display them in some random layout, but I hope you can still get the actual images at full size by clicking through and rightclick-saving? Or whatever?
Another “fun corners of the AI’s network” post. These are all pretty much unfiltered and unretried and unmodified results with the prompt “figure three” with the current “test” or “testp” engine (v4 said to be coming soon!) on MidJourney. I have no comment except that I find them all wonderful. :)
(There are, typically, various women’s faces, and perhaps the word “figure” got us more sort-of-bodies than we would have gotten otherwise?)
Well, this is just too much fun. :) Very good Second Life friend and collaborator liked the little Klara piece so much that she voiced it and set it to the perfect music and made it into a rather wonderful YouTube! Definitely more accessible :) and more of an experience this way than the 327MB pdf file. Wooot!
Very excited to share with you all, this off-beat, pretty long (almost 10 minutes) surreal video collaboration with Dale Innis. Those of you who read me regularly know that Dale Innis is a scripter friend who has collaborated with me and also with Natascha & I for the last 10 years and lately has been dabbling in all sorts of AI Art, especially MidJourney, which is a veritable game-changer in this blossoming field. He showed me a pdf file of slides and a story-line, that he had made and I fell in love…fell obsessed, is a better word, to try to bring this to a way more people could see it. This is how the project was born. I found, what we both agree, is the perfect music Meditative Music and I made a voice-over and edited the slides into what you’ll see below. This is a very slow-…
I first found Yeni Cavan as a story and art venue, based on a bunch of words used as prompts in the pre-Stable Diffusion NightCafe, way back in February. Since then I’ve tried to find it in various other engines and things, casually and without much luck. But after playing with the engine flows and prompts and things some, here are some images from MidJourney that I rather like; sufficiently Yeni Cavanish, I’d say, although so far I miss the little random patches of bright purple neon and such. (Maybe I’ll try some of the other venues as well eventually.)
Yeni Cavan; interior room (image started in the –hd engine)
Yeni Cavan; room interior (love the comfy couch with the … circuit board? sitting on it)
Yeni Cavan; room interior (I’d like to be there yes)
Yeni Cavan; room interior (pure v3 I think)
Yeni Cavan; room interior (pure –hd I think; intricate!)
Yeni Cavan; detailed surrealism (whee!)
Yeni Cavan; adorable surreal bots
Yeni Cavan; more detailed surrealism!
Yeni Cavan; upstanding citizen
Yeni Cavan; City Waterfront
I am losing track of the number of AI-based image-creation tools I have access to now. It’s not that huge a number, but it’s complicated! :) There’s at least:
good old ArtBreeder, which I haven’t used in ages, and which seems to have a potentially interesting new mode where you sketch a thing with a few shapes, and then type text telling the AI what to make it into,
MidJourney with the old V3 engine and the newer and lyrically named ‘test’ and ‘testp’ engines and mixmashes of those,
NightCafe, which was my main goto image tool quite some weeks, with the old Artistic and Coherent engines, but now also the new Stable Diffusion (SD) based “Stable” engine, and various workflows among those,
NovelAI which now does images as well as text; the images are also in a Discord bot, and it’s really fast; it uses some heuristic smut-blurrer (maybe just the standard SD one?) but the devs sort of promise they will eventually move it off of discord and then have few or no restrictions (similarly to their text generator),
and now I discover that I have access to Dall-E also, from OpenAI, which I have just barely begun to use (detailed surrealism).
The “you can’t copyright art made with AIs” meme seems to have withered (which is good since it’s not true, although nothing is certain), but my experiment to gather additional evidence against it has finally borne fruit (months before I expected it to, really): I have now registered my copyright in this masterpiece of mine:
A blonde porcelain doll and a worn teddy bear sit on a trunk, in a musty attic in light from the window
with the real actual US Copyright Office, who have sent me a real actual certificate testifying to it. The registration can also be found on the web (you have to go to that page and then search on Registration Number for “VA0002317843”; I have yet to find a permalink that persists, bizarrely).
I did it through LegalZoom rather than myself; it cost more (I think), but I was more confident that I was Doing It Right during the process. There were no questions about whether AI was involved, or about what software I used to create it, or anything like that. I did have to say that I’m the creator, of course, but since I am :) I don’t see a problem there.
Registering the copyright doesn’t mean it’s 100% correct, it just creates a legal presumption. Someone could still challenge it, arguing that I wasn’t really the creator at all. I think that would be very unlikely to succeed.
And in any case, here is a nice concrete counterexample to any remaining “you can’t copyright art produced with an AI” claims that might be floating around.
Extremely generous friend Karima also continues updating the virtual world region “AI Dreams in Art” with things she likes from my Twitter feed, etc, so drop by! It is getting blushingly positive reviews on the Social Medias; apparently there are significant numbers of people who have heard a lot about this AI Art stuff, but never really seen any. They seem to like mine! :)
Updates: there have been significant developments, legal and otherwise, in this area since this was initially posted; see the copyright tag on the weblog here for more.
As we’ve discussed, one of my favorite things is to give a text- or image-generating AI a vague and/or ambiguous prompt, and just see what happens. The results are sometimes kind of horrifying, but here I’m going to post a bunch of results that aren’t especially horrifying, and that are sometimes lovely.
The prompt for all of these is basically just “a photograph”. And what I really want to do (and I am realizing that there are various services out there that would let me do it without much fuss) is make a nice coffee-table book of these, accompanied by text produced by like NovelAI. Just because it would be neat.
I continue having way too much fun making images with MidJourney (and NightCafe, and now some things that I can’t quite show off yet).
I’m realizing that I’m a little weird, in that most people seem to be interested in just how exactly they can get the tool to produce an image that they’re thinking of, whereas I am almost entirely into typing somewhat random ambiguous stuff, and seeing what fun things the AI responds with.
For instance, here’s a snapshot of a whole bunch of images made using the prompt “one inside of another” with various seeds and switches and engine flows and things:
two dozen rather varied and ominous images
I love all of these (there were some I didn’t love, and I didn’t upscale those, so they aren’t here). The first two got me:
It seemed like there was really something going on there.
These two are with a slightly different engine flow than most of the others, but are no less wonderful:
What’s going on here? Is the AI showing wild creativity? Is it just starting in a basically random place due to the vague prompt, and then drifting into some weird random local minimum from there? Is that different from showing wild creativity?
Clearly there are lots of pictures of faces (especially women’s faces) and rooms with windows in the training set, so we get lots of those, that makes sense. But why do we get two different images that are (inter alia) the face of a person holding (and/or confronting) some semi-abstract sharp object? Why are there two faces which are split in half vertically, and one half striped / pixelated?
And what are these?
One thing is certainly inside of another. Is that a coincidence? Or is the AI “aware” of it in some sense?
I feel like I could swim in this stuff forever! That is what I thought at first about the GPT-3 stuff, though, and that wasn’t true. :) Still, if it’s just that I’m still in the initial flush of excitement, it’s a very fun flush.
Oh, and somewhat relatedly, here is a stealth announcement of a new graphic novel (or perhaps picture book) based on MidJourney images. This time I generated many many images from the same (small set of related) prompts, four at a time, and then tried to construct a story that would make sense with them. Note that this version is like 327MB for some reason, so click with care: Klara, Part 1.
I was thinking of a post extending the legal thoughts from last time to talk about this widespread claim (based on the Thaler decisions that I mentioned briefly there) that “Artwork made with an AI can’t be copyrighted”. It’s all over the clickbait-website press, and it’s wrong. The rulings in question said that an AI can’t be the creator-in-fact of a work (in the U.S.), so someone can’t get copyright to a work based on being the “employer” of the creator-in-fact AI. But they say nothing about the obvious alternative that a human can be the creator (simpliciter) of a work made with an AI, just as a human can be the creator of a work made with Photoshop, or a paintbrush.
Heh heh, I guess I’ve already written a bit about that here now, haven’t I? But there are various arguments and counterarguments that one could talk about that I’m not going to.
Then there’s the fact that I’ve been generating So Many Images in Midjourney, which for a while there had pretty much entirely drawn me away from NightCafe. As well as those So Many Images, I’ve started to put a bunch of them together in the GIMP in the form of a sort of amateur manga or graphic story that attempts to have an actual plot and stuff; here’s a pdf of the story (the first 10 pages of it, which is all that currently exists), at considerably reduced resolution so it isn’t like over 30MB. Feedback welcome. :)
But then! By which I mean just today I think! NightCafe has become very interesting again, due to adding the Stable Diffusion engine. Which I have been using extensively, and have noted that:
It is kind of boring compared to the other engines I’ve used, in that it seems to usually take the simplest and most quotidian interpretation of a prompt, and create the most unremarkable (and, admittedly, sometimes impressively realistic!) image possible from it.
The right set of adjectives and so on can get more interesting results from it sometimes. The prompt prefix for Yeni Cavan, for instance, produces recognizably Yeni Cavan images, but somewhat less smoky and mysterious ones than Midjourney or the NightCafe Artistic engine do.
It has some kind of risible post-censorship blurring algorithm, and if a picture looks too naughty to that algorithm, it comes out with a very heavy blur applied. I have (accidentally) gotten one NSFW image that its filter didn’t detect, and on the other hand just including “in the style of Gauguin” in a prompt seems to pretty reliably produce just a blur. (“Well, yeah, he’s in the training set, but his stuff is really too naughty to output.”) I mean, /facepalm and all.
Update: when I reported a couple of very obvious porn-filter false positives, NightCafe support replied that the filter should be gone / optional in “a few days”. Very gratifyin’!
I wish NightCafe had an “effectively free, but might be slow” generation mode like Midjourney does. After playing with Stable Diffusion for hours I’m nearly out of NightCafe credits, and given the overall experience I will probably just go back to Midjourney now and make more images for the comic. :)
So that’s those things! But mostly it’s been lots of cool pictures. We will close with a recent one from Midjourney:
“Atomic surrealism detailed render”
and something that Stable Diffusion did (rather interestingly) with the same prompt:
So I’ve been producing so many images in Midjourney. I’ve been posting the best ones (or at least the ones I decide to post) in the Twitters; you can see basically all of them there (apologies if that link’s annoying to use for non-Twitterers). And an amazing friend has volunteered to curate a display of some of them in the virtual worlds (woot!), which is inexpressibly awesome.
Lots of people use “in the style of” or even “by” with an artist’s name in their Midjourney prompts. I’ve done it occasionally, mostly with Moebius because his style is so cool and recognizable. It did imho an amazing job with this “Big Sale at the Mall, by Moebius”:
“Big Sale at the Mall, by Moebius” by Midjourney
It captures the coloration and flatness characteristic of the artist, and also the feeling of isolation in huge impersonal spaces that his stuff often features. Luck? Coolness?
While this doesn’t particularly bother me for artists who are no longer living (although perhaps it should), it seems questionable for artists who are still living and producing, and perhaps whose works have been used without their permission and without compensation in training the AI. There was this interesting exchange on Twitter, for instance:
Quick question, is there a way one can specifically ask for their name to be in a “do not EVER allow to be used in a prompt” list? Seeing folks on your discord not only use my name, but trying to figure out how to train your ai to mimic my work-
The Midjourney folks replied (as you can I hope see in the thread) that they didn’t think any of this particular artist’s works were in the training set, and that experimentally adding their name to a prompt didn’t seem to do anything to speak of; but what if it had? Does an artist have the right to say that their works which have been publicly posted, but are still under copyright of one kind or another, cannot be used to train AIs? Does this differ between jurisdictions? Where they do have such a right, do they have any means of monitoring or enforcing it?
Here’s another thread, about a new image-generating AI (it’s called “Stable Diffusion” or “Stability AI”, and you can look it up yourself; it’s in closed beta apparently and the cherrypicked images sure do look amazing!) which seems to offer an explicit list of artists, many still living and working, that it can forge, um, I mean, create in the style of:
Here is a collection of relatively modern or currently working artists that they advertise as styles to steal on their site (there are hundreds of others). This is just gross. pic.twitter.com/LLcl7dR9IN
That’s a good question! I posted a few guesses on that thread (apologies again if Twitter links are annoying). In particular (as a bulleted list for some reason):
One could argue that every work produced by an AI like this, is a derivative work of every copyrighted image that it was trained on.
An obvious counterargument would be that we don’t say that every work produced by a human artist is a derivative work of every image they’ve studied.
A human artist of course has many other inputs (life experience),
But arguably so does the AI, if only in the form of the not-currently-copyrighted works that it was also trained on (as well as the word associations and so on in the text part of the AI, perhaps).
One could argue that training a neural network on a corpus that includes a given work constitutes making a copy of that work; I can imagine a horrible tangle of technically wince-inducing arguments that reflect the “loading a web page on your computer constitutes making a copy!” arguments from the early days of the web. Could get messy!
Perhaps relatedly, the courts have found that people possess creativity / “authorship” that AIs don’t, in at least one imho badly-brought case on the subject: here. (I say “badly-brought” just because my impression is that the case was phrased as “this work is entirely computer generated and I want to copyright it as such”, rather than just “here is a work that I, a human, made with the help of a computer, and I want to assert / register my copyright”, which really wouldn’t even have required a lawsuit imho; but there may be more going on here than that.)
The simplest thing for a court to decide would be that an AI-produced work should be evaluated for violating copyright (as a derivative work) in the same way a human-produced work is: an expert looks at it, and decides whether it’s just too obviously close a knock-off.
A similar finding would be that an AI-produced work is judged that way, but under the assumption that AI-produced work cannot be “transformative” in the sense of adding or changing meaning or insights or expression or like that, because computers aren’t creative enough to do that. So it would be the same standard, but with one of the usual arguments for transformativity ruled out in advance for AI-produced works. I can easily see the courts finding that way, as it lets them use an existing (if still somewhat vague) standard, but without granting that computer programs can have creativity.
Would there be something illegal about a product whose sole or primary or a major purpose was to produce copyright-infringing derivative works? The DMCA might possibly have something to say about that, but as it’s mostly about bypassing protections (and there really aren’t any involved here), it’s more likely that rules for I dunno photocopiers or something would apply.
So whew! Having read some of the posts by working artists and illustrators bothered that their and their colleagues’ works are being used for profit in a way that might actively harm them (and having defended that side of the argument against one rather rude and rabid “it’s stupid to be concerned” person on the Twitter), I’m now feeling some more concrete qualms about the specific ability of these things to mimic current artists (and maybe non-current artists whose estates are still active).
It should be very interesting to watch the legal landscape develop in this area, especially given how glacially slowly it moves compared to the technology. I hope the result doesn’t let Big AI run entirely roughshod over the rights of individual creators; that would be bad for everyone.
But I’m still rather addicted to using the technology to make strange surreal stuff all over th’ place. :)
One of the AI “image from text” systems that’s gotten a lot of coverage, and shown some pretty amazing images, is Midjourney. There’s been a waiting list for it for some time (I vaguely recall), and lucky people who had access have been posting the occasional wild image to social medias (with, I expect, a relatively large cherrypick effect).
I ran across a mention today that, while still in beta, it was thrown open to anyone and everyone (a week or three ago?), with some number of free images to generate, and then some reasonable prices for stuff thereafter. This sounded interesting!
Bizarrely, the interface to it appears to be via Discord, which I suppose is perfectly natural to The Kids These Days, but strikes me as chaotic and strange and hard to use. You accept an invitation to some Discord channel, you go to one of many “newbie” channels, you enter a /imagine command with your text prompt, and then you watch your mentions (after looking up on the web how to find them) to see what comes out.
The /imagine command causes the bot to produce four thumbnails based on your prompt. There are rather enigmatic little buttons under the thumbnails, that let you ask for four (or, I guess, three) more thumbnail-sized variants based on one of the four, or to upscale one of the four.
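As a side note on prompt mechanics: Midjourney’s documented multi-prompt syntax assigns each `part::w` a numeric weight (unweighted parts default to 1), and a `--no thing` switch is described as equivalent to `thing::-0.5`. Here’s a minimal Python sketch (a hypothetical helper, not Midjourney’s actual code) of how those weights combine, using the two-separate-`--no`-switches prompt from above as the example:

```python
# Hypothetical sketch of Midjourney-style multi-prompt weight arithmetic.
# Per the Midjourney docs, "part::w" assigns weight w, unweighted parts
# count as 1, and "--no thing" behaves like "thing::-0.5".
def total_weight(parts):
    """Sum the weights of a multi-prompt; None means the default weight of 1."""
    return sum(w if w is not None else 1.0 for _, w in parts)

# "ton glam so --no words, letters, text --no face" parsed as three parts:
prompt = [
    ("ton glam so", None),           # unweighted -> 1.0
    ("words, letters, text", -0.5),  # from the first --no
    ("face", -0.5),                  # from the second --no
]

print(total_weight(prompt))  # 0.0 -- the total-weight-of-zero oddity
```

Which would explain how two innocent-looking `--no` switches can push the whole prompt to the weird total weight of zero mentioned earlier.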
I’ve been having some fun with it, but I don’t feel like I’ve really mastered it (or figured out when I run out of free stuff and have to decide whether to pay), because the Discord interface is just bizarre.
Anyway though :) here’s some of what I’ve done so far. I think it’s pretty comparable to the NightCafe output, really, although some of the things I’ve seen other people post seem extra-amazing.
Art Deco Print of an Elegant Woman Dressed in Skulls
Steampunk plans for a complex mechanism; blueprints
Four thumbnails from earlier in the history of the above
The Library of Time; Entrance
A page from a forbidden book
Detailed steampunk cyberpunk noir; street scene (i.e. #YeniCavan !)
I’m not sure if this will turn out to be an advance over NightCafe to the extent that I’ll start using it more, or get all excited about AI generated art again (note that NightCafe has been hinting at a new algorithm coming also, so there’s that). Especially if the interface stays on Discord!
But we’ll see! And in general the progress here seems pretty astounding; as I said to a couple of people in the last week, if I was a professional illustrator, I might be getting worried. (Whereas I don’t feel that way about professional writers or coders yet; this may just indicate that I don’t have as good an understanding of what illustrators actually do.)
I did another little run of NightCafe images, this time all with “The Library of Time” and some details in the first prompt, and then another prompt like “Fantasy, wide shot, detailed art, expert” which just sort of popped into my head. The full published Collection should be here, but I’ll post some favorites herein. The smaller ones are generally from the “Artistic” engine in NightCafe, and the larger ones are from running the “Coherent” engine over output from “Artistic”. And I would definitely like to go to this Library, thank you!
The Library of Time: Outer Offices
The Library of Time: Main Reading Room
Down in the Catacombs
Secret Hallway
Dreams of Eternity
Sunset (woo!)
I’m pleased by what the AI did with these (see the full Collection for some that I left out for length, including the Room of Secrets). The last one is rather Hudson River School, which I love.
Soothing, eh? I had to ask it specifically for grey and green for that last one; otherwise it really liked the blue-orange colorway (“colorway” is such a great word).
And finally, a completely different prompt, and even a different AI:
Two different and completely unrelated topics (OR ARE THEY?) today, just because they’re both small, and hearkening back to the days when I would just post utterly random unconnected things in a weblog entry whose title was just the date (and noting again that I could easily do that now, but for whatever reason I don’t).
First, whatsisname Wolfram and his Ruliad. Reading over some of his statements about it again, it seems like, at least often, he’s not saying that The Ruliad is a good model of our world, such that we can understand and make predictions about the world by finding properties of the model; he’s saying that The Ruliad just is the world. Or, in some sense vice-versa, that the world that we can observe and experience is a tiny subset of The Ruliad, the actual thing itself, the one and only instantiation of it (which incidentally makes his remarks about how it’s unique less obviously true).
I’m not exactly sure what this would mean, or (as usual) whether it’s falsifiable, or meaningful, or useful, or true, in any way. My initial thought is that (at the very least) every point in the Ruliad (to the extent it makes sense to talk about points) has every possible value of every possible property at once, since there is some computation that computes any given bundle of properties and values from any given inputs. So it’s hard to see how “beings like us” would experience just one particular, comparatively immensely narrow, subset of those properties at any given time.
It might be arguable, Kant-like, that beings like us will (sort of by definition) perceive three dimensions and time and matter when exposed to (when instantiated in) the infinite variety of The Ruliad, but how could it be that we perceive this particular specific detailed instance, this particular afternoon, with this particular weather and this particular number of pages in the first Seed Center edition of The Lazy Man’s Guide to Enlightenment?
The alternative theory is that we are in a specific universe, and more importantly not in many other possible universes, and that we experience what we experience, and not all of the other things that we could experience, as a result of being in the particular universe that we are in. This seems much more plausible than the theory that we are in the utterly maximal “Ruliad of everything everywhere all at once”, and that we experience the very specific things that we experience due to basically mathematical properties of formal systems that we just haven’t discovered yet.
We’ll see whether Wolfram’s Physics Project actually turns out to have “vast predictive power about how our universe must work, or at least how observers like us must perceive it to work”. :) Still waiting for that first prediction, so far…
In other news, The End of the Road is Haunted:
These are in the cheapest one-credit mode, because I was in that mood. Also I kind of love how the AI takes “haunted” to mean “creepy dolls”.
We made this World Tree series in NightCafe by appending things to “Concept art, pencil sketch, sepia tone”, starting with “Surprised by Joy”:
Surprised by Joy
Kind of adorable, eh? Might be a tardigrade.
Then for some reason we took it to The World Tree:
At the base of the World Tree
Climbing the World Tree
So high in the World Tree
And then some old friends suddenly appeared! It’s Hugo and the Lamb!
Hugo and the Lamb arrive at the World Tree
Hugo and the Lamb ascend the World Tree
(I don’t think the AI has the idea that to really look like you’re up in the World Tree, there shouldn’t be anything that looks too much like the ground right there. Unsurprisingly perhaps.)
Wind in the Limbs of the World Tree
Resting High Up In The World Tree
Eyes High Up in the World Tree
And on that somewhat eerie note, the World Tree series ends (for the moment). We don’t see Hugo and the Lamb sleeping in one of the eye-shaped little nooks, or making their way back down the Tree (perhaps via parachutes or aircraft).
Back on the ground:
Those two are amusingly similar, due to having the same starting image with different seeds. I imagine that the little character on the right there comes in and says something witty or ominous.
On the Road
Hugo and the Lamb reach the river, and cross the bridge
And across the river (surprise crossover!) they enter Hyrule:
Is Link taking care of the Lamb? Is Hugo wearing a Link costume? Was Hugo Link (or Link Hugo) all along?