Posts tagged ‘bard’

2023/03/31

It’s just predicting the next word! Well…

tl;dr: While it’s true that all LLMs do is produce likely text continuations, this doesn’t imply that they don’t have mental models, don’t reason, etc.

One thing that sensible people often say about Large Language Models like ChatGPT / GPT-n and Bard and so on is that all they do is predict the next word or, for more technical accuracy, that all they do is generate text that is likely to follow the prompt that they are given, i.e. “produce likely continuations”.

And that’s a good thing to note, in that people tend to have all sorts of other theories about what they are doing, and some of those theories are just wrong, and lead people to draw bad conclusions. For instance, people will have a more or less default theory that the model knows things about itself and tells the truth about things it knows, and take seriously its (non-factual) answers to questions like “What language are you written in?” or “What hardware are you running on?” or “Are you a tool of Chinese Communism?”.

Also, it’s true that all they do is generate text that is likely to follow the prompt, in the sense that that is the only significant criterion used during training of the underlying neural network.
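For the concretely minded, here is a toy sketch of what “produce likely continuations” means mechanically. It is only a bigram counter, nothing like a real LLM (which is a huge neural network trained on vastly more text), and the tiny corpus and the names `train_bigrams`, `most_likely_next` and `continue_text` are made up purely for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def continue_text(follows, prompt, n_words=5):
    """Greedily extend the prompt one likely word at a time."""
    words = prompt.lower().split()
    for _ in range(n_words):
        nxt = most_likely_next(follows, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

# Hypothetical training text, chosen only to keep the example small.
corpus = "the cat sat on the mat and the cat ate the fish"
follows = train_bigrams(corpus)
print(most_likely_next(follows, "the"))
print(continue_text(follows, "on the", n_words=3))
```

The only thing this toy ever “learns” is which word tends to come next, and in its case that really is all there is to it. The interesting question is what a vastly larger system has to build internally in order to do the same job well.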

But that doesn’t actually mean that that is all they do, in the more general sense. And this, at least potentially, matters.

Consider for instance the claim that “all life does is arrange to have many generations of descendants”. That is true in the same sense, since the only criterion for having survived long enough to be noticed in the current world is to have had many generations of descendants.

But, significantly, this doesn’t mean that that is all life does, in the sense that life does all sorts of other things, albeit arguably in the service of (or at least as a side effect of) having many generations of descendants.

For instance, I think it would be plainly false to say “people obviously can’t reason about the world; all they do is arrange for there to be many more generations of people!”. In fact, people can and do reason about the world. It may be that we can explain how we came to do this, by noting that one effective strategy for having many generations of descendants involves reasoning about the world in various ways; but that does not mean that we “don’t really reason” in any sense.

Similarly, I think the arguments that various smart people make, which when boiled down to a Tweet come out as roughly “LLMs don’t X; all they do is predict likely continuations!” for various values of X, are in fact not valid arguments. Even if all an LLM does is predict likely continuations, it might still do X (reason about the world, have mental models, know about truth and falsehood) because X is helpful in (or even just a likely side-effect of) one or more effective strategies for predicting likely continuations.

Put another way, if you train a huge neural network to output likely continuations of input text, it’s not obviously impossible that in choosing internal weights that allow it to do that, it might develop structures or behaviors or tendencies or features that are reasonably described as mental models or reasoning or knowledge of truth and falsehood.

This isn’t a claim that LLMs do in fact have any of these X’s; it’s just pointing out that “all it does is produce likely continuations” isn’t a valid argument that they don’t have them.

It’s still entirely valid to respond to “It told me that it’s written in Haskell!” by saying “Sure, but that’s just because that’s a likely answer to follow that question, not because it’s true”. But it’s not valid to claim more generally that a model can’t have any kind of internal model of some subset of the real world; it might very well have that, if it helps it to correctly predict continuations.

Bonus section! Current LLMs don’t in fact reason significantly, or have interesting internal models, in many cases. Amusing case from this morning: when fed some classic text rot13’d, this morning’s Bard claimed that it was a quote from Hitchhiker’s Guide to the Galaxy, whereas this morning’s ChatGPT replied with rot13’d text which, when decoded, was gibberish of the sort that an early GPT-2 might have produced from the decoded version of the prompt. No agent with a reasonable mental model of what it was doing would have done either of those things. :)
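For anyone who hasn’t met rot13: it just rotates each letter 13 places through the alphabet, so applying it twice gets you back where you started. A minimal sketch using Python’s standard `codecs` module; the `rot13` helper name and the sample sentence are just stand-ins for illustration, not the classic text mentioned above.

```python
import codecs

def rot13(text):
    """Rotate each ASCII letter 13 places; leave everything else alone."""
    return codecs.encode(text, "rot_13")

sample = "Call me Ishmael."   # stand-in sample text (an assumption)
encoded = rot13(sample)
decoded = rot13(encoded)      # rot13 is its own inverse

print(encoded)   # Pnyy zr Vfuznry.
print(decoded)   # Call me Ishmael.
```

Which is to say that decoding it is entirely mechanical; nothing about the task requires guessing.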

2023/03/26

Creativity, how does it work?

This is a random brainstorming post, I have no particular conclusions at the moment as I write the first sentence here, but I might develop something as we go along.

So far, I just have this “meme” that I made:

Critics: AI art tools can't create anything new, just copy and paste from existing art!

People using AI art tools:
[Image: what looks like a charcoal drawing of a maniacally-smiling woman with wild hair and an extra set of lower teeth, immersed to just below the shoulders in a whitecapped ocean. There is an odd sailing ship on the ocean in the background, and two more vessels (ships? dirigibles?) in the sky.]

There are two obvious reactions to this. Someone who likes AI art tools might say “haha, yeah, this shows how creative and crazy this art can be!”. And someone who agrees with the critics might say “omg, totally, that’s obviously sooo derivative!”.

The first thing to wonder is whether there is a particular image, set of images, or artist out in the world somewhere of which this image is obviously derivative. Pointers in the comments are extremely welcome! :)

Google (reverse) image search doesn’t come up with anything especially obvious. There are some images (like, at the moment, this one) that involve somewhat stylized faces with prominent hair and ocean waves and one or more ships, but the arrangement and overall style and impact are, I think, significantly different. In the past when I asked a couple of people who were all “oh, yeah, I can usually identify the particular artist or artwork that one of these AI images was taken from”, to do that with one of my images, they suddenly became very quiet. ¯\_(ツ)_/¯

If there isn’t a specific image or small set of images or an artist that one can point to and say “see, this is where this came from!”, what does that mean? I’m not an art critic (hahaha), but I think it would be pretty uncontroversial that, if a person had created that image above there entirely with real-live paper and charcoal, or even with a tablet and Photoshop, we’d say that it displayed sort of average human creativity; nothing incredible, but easily beyond (for instance) the “modicum of creativity” required by US copyright case law, enough that it could be entered in an art competition, and so on.

Once we know that it was created by a person using an AI art tool (Midjourney, in this case, with a particular prompt and engine settings and so on), is it reasonable to say something different? Does it still display creativity, or not? Does it do it differently, or in the same way? What is creativity? How is it displayed? In what does it inhere? Is it for that matter the kind of thing that inheres in things? Are there facts of the matter about it, or is it a purely squishy and subjective thing?

There are a bunch of theories that one might put together:

  • One might hold that it’s just as creative, and in the same way, as the counterfactual no-AI version, and that the creativity comes from the same place: the human who made it. One version of this narrative would say that the difference between the no-AI and the with-AI version, creativity-wise, is not different in kind from the difference between a person making it with paper and charcoal and a person making it with tablet and Photoshop, or a board and little mosaic tiles. It might be objected that the activity of choosing engine parameters and prompt strings and then culling the results is just obviously (or by dint of some specific plausible theory) different from the activities in the other cases, since those involve something like choosing a particular color for particular parts of the image, whereas the AI-tool case doesn’t.
  • One might hold that it’s just as creative (or at least that it is creative, if perhaps to a different degree), and the creativity still comes from the human, although it’s implemented (delivered, displayed, exercised, used, manifest) in a different way. One might say in this theory that the difference between the real paper and charcoal version and the Midjourney version is like the difference between a realistic drawing of a scene and a photograph of the same scene. Both born of human creativity, but through very different means, and perhaps to different degrees. And then we can get into lots of questions about the creative element(s) in various kinds of photography!
  • The two takes above can, I think, go either way on the question of whether creativity is inherent in the end result, the image, in a sort of death-of-the-author way, or whether it’s in the overall process. At the other end of some spectrum, one could say that the image made with the AI tool does not in fact display (involve, require, contain) any creativity; that our initial impression that it did just turns out to have been mistaken, and now that we know how it came to exist, we know that it didn’t involve creativity. This sort of claim pretty much rules out the position that creativity is inherent in the finished product, unless one is willing to take the (facially untenable, I think) position that this image could not in principle have been created by a human without using an AI, and that conversely no purely human-created image could in principle have been created with an AI tool.
  • That is, if you think there is no creativity in this image because it was made with an AI tool, you pretty much have to take the position that it’s not possible to tell how much creativity there is in an artwork (or a putative artwork) just by looking at it; that the creativity is not displayed by / doesn’t inhere in solely the image or object. Which seems sensible in at least one obvious way: I might think that something involved lots of creativity, until I see that it is an exact copy of something that existed before, just with a little line drawn on it. More nuancedly, we’d say that you can’t tell how much new creativity is in a thing, until you see how it was made (because it might be, say, a copy).
  • So now we have a potential claim that images made with AI tools don’t have any (or much) new creativity, because they are just processed / stolen / noisily compressed and expanded / copied and pasted versions of the material that they were trained on. Sure, there might be a little creativity in choosing the prompt or whatever, but that’s not much. The program itself can’t add any creativity because “they can’t, they just can’t” (a phrase I’ve heard from a couple of people talking on videos lately, but of course can’t find at the moment).
  • Humans also process things that they’ve seen / experienced when producing new things. I’d say we can’t really require creativity to mean “those aspects of a work that spring purely from the artist’s soul, and that would still have been there had the artist been a brain in a vat with no experience of the world or other artworks, only its own thoughts”, because then there wouldn’t be any creativity anywhere, and when common words turn out to have no referent in a theory, it generally (if not always) means that that theory is wrong.
  • Or maybe we do want to require that “sprung from the soul alone” thing, because we want to set a very high bar for True Creativity, and we are confident that there will be at least a few glorious shining examples if only we knew the truths of people’s souls! In which case we can say that a marvelous few humans have displayed true creativity through the ages, and no computer ever has (having no soul and all), and neither have the vast majority of people we loosely call “artists”. This is a theory, but not a popular one, and it means that most art displays no creativity, which again feels sort of like a reductio. It’s certainly not compatible with what the Copyright Office means by “creativity”.
  • The question of how much creativity is in the selection of prompts and engine settings and images to keep is one we can put aside (in the drawer next to the question of the creativity in a cellphone snapshot, as alluded to above). And it seems we are left with having a theory about how much creativity comes from the AI tool itself, and how much of that is what we’ve called new creativity. Possible answers include “none, there’s lots of new creativity, but it’s all from the human user”, “none, there’s no new creativity in this at all, it’s all stolen / copied from the creativity in the training set”, “about the same amount that comes from the human, they are in some sense equals in the new creation”, and “the human just types a few words, and then the software adds lots of new creativity to it, so it’s the AI”.
  • This leaves us mostly with the question of “under what circumstances is it true that a person, or a piece of software, adds new creativity to a work, when that work is to a degree influenced by other prior works that that person, or piece of software, has been exposed to?”. Or other words to that general effect. One set of answers will not especially care whether it’s a person or a piece of software; the other set (“they just can’t”) will either think that it’s important which it is, or have a set of criteria which (they will claim) only people and not software can for whatever reason satisfy.

And I’ll leave it there for now, having perhaps not been especially productive :) but having written a bunch of words and focused in (if in fact it’s a focusing) on the question of what it means to add new creativity when making something, even though the entity doing the creating is influenced by other works that existed before. People talk a lot about things like reflecting one’s lived experience, having a thought that the work will (may? is intended to?) cause the viewer to also have (some version of?), and like that. None of those seem likely to be any kind of complete explanation to me at the moment.

In legal news, of course, the US Copyright Office has issued a Copyright Registration Guidance on “Works Containing Material Generated by Artificial Intelligence”, which I gather (I have not had the mental energy to think about this very hard) just repeats the statements in the Zarya (I always want to write Zendaya) memo we briefly discussed the other day, using various phrases that are putatively synonymous but as far as I can tell are subtly different and introduce all sorts of new uncertainty to the subject.

I’m going to continue not thinking about that very hard for now, because that part of my brain is still tired.

Also! You can get onto the waiting list for the Google LLM thing (and I hear varying stories about how quickly one gets access; apparently it is sometimes quite quick). In case you’re, like, collecting those, or otherwise interested.