Posts tagged ‘superintelligence’

2022/05/31

A couple of books I’ve read some of

To update this post from (gad) three months ago on the book “Superintelligence”, I’m finally slightly more than halfway through it, and it has addressed pretty reasonably my thoughts about perfectly safe AIs, like for instance AIDungeon or LaMDA, that just do what they do, without trying to optimize themselves, or score as many points as possible on some utility function, or whatever. The book calls such AIs “Tools”, and agrees (basically) that they aren’t dangerous, but says (basically) that we humans are unlikely to build only that kind of AI, because AIs that do optimize and improve themselves and try to maximize the expected value of some utility function will work so much better that we won’t be able to resist, and then we’re all doomed.

Well, possibly in the second half they will suggest that we aren’t all doomed; that remains to be seen.

Conscious experience is unitary, parallel, and continuous

Another book I’ve read some of, and in this case just a little of, is Daniel Ingram’s “Mastering the Core Teachings of the Buddha” (expanded 2nd edition). There’s a page about it here, which includes a link to a totally free PDF, which is cool, and you can also buy it to support his various good works.

An internal Buddhist group at The Employer has had Daniel Ingram join us for a couple of videoconferences, which have been fun to various extents. There’s a lot that one could say about Ingram (including whether he is in fact a Buddhist, what one thinks of him calling himself “The Arahant Daniel M. Ingram” on the cover, etc.), but my impression of him is that he’s a very pragmatic and scientific (and energetic) sort of person, who has a project to study the various paths to things like enlightenment in the world, in a scientific sort of way, and figure out what paths there are, which ones work best for which kinds of people, what stages there are along the paths, what works best if one is at a particular point on a certain path, and so on. There is apparently a whole Movement of some sort around this, called Pragmatic Dharma (I was going to link to something there, but you can Web Search on it yourself at least as effectively).

I’m not sure that this is an entirely sensible or plausible project, since as a Zen type my instinctive reaction is “okay, that’s a bunch of words and all, but better just sit”. But it’s cool that people are working on it, I think, and it’ll be fun to see what if anything they come up with. Being both pragmatic and moral, they are all about the Kindness and Compassion, so they can’t really go far wrong in some sense.

Having started to read that PDF, I have already a couple of impressions that it’s probably far too early to write down, but hey it’s my weblog and I’ll verb if I want to.

First off, Ingram says various things about why one would want to engage in some project along these lines at all, and I get a bit lost. He says that in working on morality (by which he means practical reasoning in the relative sphere, being kind and compassionate and all that) we will tend to make the world a better place around us, and that’s cool. But then the reasons that one would want to work on the next level after morality, which is “concentration”, are all about vaguely-described jhanas, as in (and I quote):

  • The speed with which we can get into skillful altered states of awareness (generally called here “concentration states” or “jhanas”).
  • The depth to which we can get into each of those states.
  • The number of objects that we can use to get into each of those states.
  • The stability of those states in the face of external circumstances.
  • The various ways we can fine-tune those states (such as paying attention to and developing their various sub-aspects)

Now it appears to me that all of these depend on an underlying assumption that I want to get into these “states” at all; unless I care about that, the speed with which I can do it, the depth, the number of objects (?), the stability, and the fine-tuning, don’t really matter.

I imagine he will say more about these states and why they’re desirable later, but so far the book really just says that they are “skillful” (and “altered”, but that seems neutral by itself), and “skillful” here just seems to be a synonym for “good”, which doesn’t tell us much.

(In other Buddhist contexts, “skillful” has a somewhat more complex meaning, along the lines of “appropriate for the particular occasion, and therefore not necessarily consistent with what was appropriate on other occasions”, which a cynic might suggest is cover for various Official Sayings of the Buddha appearing to contradict each other seriously, but who wants to be a cynic really?)

It seems that if the jhanas are so fundamental to the second (um) training, he might have made more of a case for why one would want to jhana-up at the point where the training is introduced. (One could make the same sort of comment about Zen types, where the reason that you’d want to meditate is “the apple tree in the side yard” or whatever, but those types make no pretense at being scientific or rational or like that.)

In the Third Training, called among other things “insight”, Ingram talks about becoming directly aware of what experience is like, or as he summarizes it, “if we can simply know our sensate experience clearly enough, we will arrive at fundamental wisdom”. He then talks about some of the ways that he has become more aware of sensate experience, and I am struck by how very different from my own observations of the same thing they are. Let’s see if I can do this in bullets:

  • He starts with basically a “the present moment is all that exists” thing, which I can get pretty much entirely behind.
  • He says that experience is serial, in that we can experience only one thing at a time. He describes focusing on the sensations from his index fingers, for instance, and says “[b]asic dharma theory tells me that it is not possible to perceive both fingers simultaneously”.
  • Relatedly, he says that experience is discrete, and that one sensation must fade entirely away before another one can begin. At least I think he’s saying that; he says things like “[e]ach one of these sensations (the physical sensation and the mental impression) arises and vanishes completely before another begins”, and I think he means that in general, not just about possibly-related “physical sensations” and “mental impressions”. He also uses terms like “penetrating the illusion of continuity” (but what about the illusion of discontinuity?).
  • And relatedly relatedly, he thinks that experience is basically double, in that every (every?) “physical sensation” is followed by a “mental impression” that is sort of an echo of it. “Immediately after a physical sensation arises and passes is a discrete pulse of reality that is the mental knowing of that physical sensation, here referred to as ‘mental consciousness’”.

Now as I hinted above, the last three of these things, that consciousness is serial, discrete, and double, do not seem to accord at all with my own experience of experience.

  • For me, experience is highly parallel; there is lots going on at all times. When sitting in a state of open awareness, it’s all there at once (in the present moment) in a vast and complex cloud. Even while attending to my breath, say, all sorts of other stuff is still there, even if I am not attending to it.
  • Similarly, experience is continuous; it does not come in individual little packets that arise and then fade away; it’s more of an ongoing stream of isness (or at least that is how memory and anticipation present it, in the singular present moment). If thoughts arise, and especially if those thoughts contain words or images, the arising and fading away of those feel more discrete, but only a bit; it’s like foam forming on the tops of waves, and then dissolving into the water again.
  • And finally, there’s no important distinction to be had between “physical sensations” and “mental impressions”; there is only experience happening to / constituting / waltzing with mind. If there were a mental impression following each physical sensation, after all, how would one avoid an infinite regress, with mental impressions of mental impressions stretching out far into the distance? Something like that does happen sometimes (often, even) but it’s more a bug than a feature.

I suspect that some or most of these differences come because Ingram is talking about a tightly-focused awareness, where I am more of an open and expansive awareness kind of person, even when attending to the breath and all. If you really pinch down your focus to be as small as possible, then you won’t be able to experience (or at least be consciously aware of) both fingers at once, and you may manage to make yourself see only one sensation at a time in individual little packets, and you may even notice that after every sensation you notice, you also notice a little mental echo of it (which may in fact be the sum of an infinite series of echoes of echoes that with any luck converges).

This kind of tightly-focused conscious awareness goes well, I think, with what Ingram says about it being important to experience as many sensations per second as possible. He puts it in terms of both individual sensations and vibrations, although the latter doesn’t really fit the model; I think he means something more like “rapid coming into existence and going out of existence” rather than a vibration in some continually-existing violin string.

He is enthusiastic about experiencing things really fast, as in

If you count, “one, one thousand”, at a steady pace, that is about one second per “one, one-thousand”. Notice that it has four syllables. So, you are counting at four syllables per second, or 4 Hertz (Hz), which is the unit of occurrences per second. If you tapped your hand each time you said or thought a syllable, that would be four taps per second. Try it! Count “one, one thousand” and tap with each syllable. So, you now know you can experience at least eight things in a second!

and this strikes me as really funny, and also endearing. But he takes it quite seriously! He says in fact that “that is how fast we must perceive reality to awaken”; I do wonder if he is going to present any scientific evidence for this statement later on. I’m sure it has worked well / is working well for him, but this seems like a big (and high-frequency) generalization. I don’t remember ol’ Dogen, or Wumen, or the Big Guy Himself, talking about experiencing as many things per second as possible, as a requirement. I guess I’ll see!

2022/02/27

Not a review of “Superintelligence”

I’m reading this book “Superintelligence” by Nick Bostrom (I apparently have the 2014, not the 2016, edition). This isn’t a review, because I haven’t nearly finished it yet, but I have some Thoughts.

First of all, the book is taking far too long to get to the “What should we do about the prospect of things that are sufficiently smarter than us coming to exist nearby?” part. I’ve been plodding and plodding through many pages intended to convince me that this is a significant enough probability that I should bother thinking about it at all. I already believed that it was, and if anything the many many pages attempting to convince me of it have made me think it’s less worth worrying about, by presenting mediocre arguments.

(Like, they point out the obvious possibility that if a computer can make itself smarter, and the smarter it is the faster it can make itself smarter, then it can get smarter exponentially, and that can be really fast. But they also note that this depends on each unit of smarterness not being significantly harder than the last to achieve, and they give unconvincing external arguments for this being likely, whereas the typical inherent behavior of a hard problem is that it gets harder and harder as you go along (80/20 rules and all), which would tend to prevent an exponential increase, and they haven’t even mentioned that. Maybe they will, or maybe they did and I didn’t notice. They also say amusing things about how significant AI research can be done on a single “PC”, and maybe in 2014 that was a reasonable thing to say, I dunno.)
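
Just to make that diminishing-returns intuition concrete for myself, here’s a toy sketch in Python (entirely mine, nothing from the book); the numbers are arbitrary, it just shows the two regimes:

def takeoff(steps, cost_growth=1.0):
    # Capability starts at 1; each step, the system spends its current
    # capability on making itself more capable.
    capability = 1.0
    effort_per_unit = 1.0   # how much work one more unit of capability costs
    history = []
    for _ in range(steps):
        capability += capability / effort_per_unit
        effort_per_unit *= cost_growth   # > 1 means each unit is harder than the last
        history.append(round(capability, 2))
    return history

print(takeoff(10, cost_growth=1.0))   # capability doubles every step: fast takeoff
print(takeoff(10, cost_growth=2.5))   # returns diminish and the growth flattens out

With cost_growth at 1.0 you get the exponential blow-up the book worries about; make each unit of smarterness meaningfully harder than the last and the curve flattens out instead, which is the 80/20-ish behavior I’d expect from most hard problems.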

But anyway, this isn’t a review of the book (maybe I’ll do one when I’m finished, if I ever finish). This is a little scenario-building based on me thinking about what an early Superintelligent system might actually look like, in what way(s) it might be dangerous, and so on. I’m thinking about this as I go, so no telling where we might end up!

The book tends to write down scenarios where, when it becomes Superintelligent, a system also becomes (or already is) relatively autonomous, able to just Do Things with its effectors, based on what comes in through its sensors, according to its goals. And I think that’s unlikely in the extreme, at least at first. (It may be that the next chapter or two in the book will consider this, and that’s fine; I’m thinking about it now anyway.)

Consider a current AI system of whatever kind; say GPT-3 or NightCafe (VQGAN+CLIP). It’s a computer program, and it sits there doing nothing. Someone types some stuff into it, and it produces some stuff. Some interesting text, say, or a pretty image. Arguably it (or a later version of it) knows a whole lot about words and shapes and society and robots and things. But it has no idea of itself, no motives except in the most trivial sense, and no autonomy; it never just decides to do something.

So next consider a much smarter system, say a “PLNR-7”, an AI in the same general style that is very good at planning to achieve goals. You put in a description of a situation, some constraints, and a goal, and it burns lots of CPU and GPU time and gets very hot, and outputs a plan for how to achieve that goal in that situation, satisfying those constraints. Let’s say it is Superintelligent, and can do this significantly better than any human.
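
Just to pin down the shape of the thing I’m imagining, here’s a sketch of the interface in Python; every name in it (plnr7_plan included) is made up for illustration, obviously, since no such system exists:

from dataclasses import dataclass
from typing import List

@dataclass
class PlanningRequest:
    situation: str          # a description of the current state of the world
    constraints: List[str]  # things the plan is not allowed to do
    goal: str               # the thing we want achieved

@dataclass
class Plan:
    steps: List[str]        # a purely symbolic description of what to do

def plnr7_plan(request: PlanningRequest) -> Plan:
    """Hypothetical planner: burns lots of CPU and GPU time, gets very hot,
    and returns a Plan. Returning it is all this does; there are no effectors
    here, and nothing in this function ever acts on the plan."""
    ...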

Do we need to worry about it taking over the world? Pretty obviously not, in this scenario. If someone were to give it a description of its own situation, a relatively empty set of constraints, and the goal of taking over the world, perhaps it could put out an amazing plan for how it could do that. But it isn’t going to carry out the plan, because it isn’t a carrier-out of plans; all it does is create them and output them.

The plan that PLNR-7 outputs might be extremely clever, involving hiding subliminal messages and subtle suggestions in the outputs that it delivers in response to inputs, that would (due to its knowledge of human psychology) cause humans to want to give PLNR-7 more and more authority, to hook it up to external effectors, add autonomy modules to allow it to take actions on its own rather than just outputting plans, and so on.

But would it carry out that plan? No. Asking “would it have any reason to carry out that plan?” is already asking too much; it doesn’t have reasons in the interesting sense; the only “motivation” that it has is to output plans when a situation / constraints / goal triplet is input. And it’s not actually motivated to do that, that is simply what it does. It has no desires, preferences, or goals itself, even though it is a superhuman expert on the overall subject of desires, preferences, goals, and so on.

Is the difference here the difference between being able to make plans and being able to carry them out? I don’t think it’s even that simple. Imagine that we augment PLNR-7 so that it has a second input port, and we can bundle up the situation / constraints / goal inputs with the plan output of the first part, feed that into the second slot, and PLNR-7 will now compare the real world with the situation described, and the plan with whatever effectors we’ve given it, and if everything matches closely enough it will carry out the plan (within the constraints) using its effectors.
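
In the same made-up terms as the sketch above, the augmentation is a hypothetical second entry point that takes the original inputs bundled with the plan, checks them against the actual world and the effectors it has actually been given, and only then acts:

# (continuing the made-up sketch above: PlanningRequest, Plan, dataclass,
# and List are as defined/imported there)

@dataclass
class ExecutionBundle:
    request: PlanningRequest   # the original situation / constraints / goal
    plan: Plan                 # the plan that came out of plnr7_plan()

def plnr7_execute(bundle: ExecutionBundle, effectors: List[str]) -> None:
    """Hypothetical second input port: confirm that the described situation
    still matches the real world, confirm that the plan needs only the
    effectors actually granted, and then carry it out within the constraints."""
    ...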

Say we give it as its only effectors the ability to send email and to make use of funds from a bank account containing 100,000 US dollars. We give it a description of the current world as its input, constraints corresponding to being able to send email and spend a starting pool of US$100,000, and a goal of reducing heart disease in developing countries in one year. It thinks for a while and prints out a detailed plan involving organizing a charitable drive, hiring a certain set of scientists, and giving them the task of developing a drug with certain properties that PLNR-7 has good reason to think will be feasible.
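
In the toy terms from above, that exchange is something like this (values purely illustrative):

# (using the made-up PlanningRequest and plnr7_plan from the sketch above)
request = PlanningRequest(
    situation="a description of the current world, as best we can give it",
    constraints=["may only send email", "may only spend the starting US$100,000"],
    goal="reduce heart disease in developing countries within one year",
)
plan = plnr7_plan(request)   # a symbolic plan comes out; nothing has happened yet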

We like that plan, so we put it all into the second input box, and a year later heart disease in developing countries is down by 47%. Excellent! PLNR-7, having finished that planning and executing, is now just sitting there, because that’s what it does. It does not worry us, and does not pose a threat.

Is that because we let humans examine the plan between the first-stage output and the second-stage effecting? I don’t think that’s entirely it. Let’s say we are really stupid, and we attach the output of the first stage directly to the input of the second stage. Now we can give it constraints not to cause any pain or injury, and a goal of making the company that built it one billion dollars in a year, and just press GO.

A year later, it’s made some really excellent investments, and the company is one billion dollars richer, and once again it’s just sitting there.

Now, that was dangerous, admittedly. We could have overlooked something in the constraints, and PLNR-7 might have chosen a plan that, while not causing any pain or injury, would have put the entire human population of North America into an endless coma, tended to by machines of loving grace for the rest of their natural lives. But it didn’t, so all good.

The point, at this point, is that while PLNR-7 is extremely dangerous, it isn’t extremely dangerous on its own behalf. That is, it still isn’t going to take any actions autonomously. It is aware of itself only as one element of the current situation, and it doesn’t think of itself as special. It is extremely dangerous only because it has no common sense, and we might give it a goal which would be catastrophic for us.

And in fact, circling back around, the book sort of points that out. It tends to assume that AIs will be given goals and effectors, and notes that this doesn’t automatically give them any kind of instinct for self-preservation or anything, but that if the goal is open-ended enough, they will probably realize in many circumstances that the goal will be best achieved if the AI continues to exist to safeguard the goal-achievement, and if the AI has lots of resources to use to accomplish the goal. So you end up with an AI that both defends itself and wants to control as much as possible, not for itself but for the sake of the goal that we foolishly gave it, and that’s bad.

The key step here seems to be closing the loop between planning and effectuating. In the general case in the current world, we don’t do that; we either just give the AI input and have it produce symbolic output, or we give it effectors (and goals) that are purely virtual: get the red block onto the top of a stable tower of blocks on the virtual work surface, or get Company X to dominate the market in the marketplace simulation.
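
A purely virtual effector, in the same made-up style as the sketches above, only ever mutates an object inside the program, so nothing outside the process gets touched no matter how clever the plan is:

from dataclasses import dataclass
from typing import List

@dataclass
class BlocksWorld:
    tower: List[str]   # a simulated work surface; nothing outside the program

def stack_block(world: BlocksWorld, block: str) -> BlocksWorld:
    # The "effector" is just a function from simulated state to simulated state;
    # all it can ever do is rearrange this object.
    return BlocksWorld(tower=world.tower + [block])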

On the other hand, we do close the loop back to the real world in various places, some having to do with not-necessarily-harmless situations like controlling fighter jets. So that’s worth thinking about.

Okay, so that’s an interesting area identified! :) I will watch for, as I continue to read the book, places where they talk about how an AI might get directly attached to effectors that touch the real world, and might be enabled to use them to carry out possibly Universal Paperclips style real-world goals. And whether not doing that (i.e. restricting your AIs to just outputting verbal descriptions of means toward closed-ended goals) might be a Good Thing To Do. (Although how do you prevent that one jerk from taking the Fatal Step in order to speed up his world domination? Indeed.)