
Friday, May 12, 2023

Pierre Menard, Author of My ChatGPT Plagiarized Essay

If I use autocomplete to help me write my email, the email is -- we ordinarily think -- still written by me.  If I ask ChatGPT to generate an essay on the role of fate in Macbeth, then the essay was not -- we ordinarily think -- written by me.  What's the difference?

David Chalmers posed this question a couple of days ago at a conference on large language models (LLMs) here at UC Riverside.

[Chalmers presented remotely, so Anna Strasser constructed this avatar of him. The t-shirt reads: "don't hate the player, hate the game"]

Chalmers entertained the possibility that the crucial difference is that there's understanding in the email case but a deficit of understanding in the Macbeth case.  But I'm inclined to think this doesn't quite work.  The student could study the ChatGPT output, compare it with Macbeth, and achieve full understanding of the ChatGPT output.  It would still be ChatGPT's essay, not the student's.  Or, as one audience member suggested (Dan Lloyd?), you could memorize and recite a love poem, meaning every word, but you still wouldn't be author of the poem.

I have a different idea that turns on segmentation and counterfactuals.

Let's assume that every speech or text output can be segmented into small portions of meaning, which are serially produced, one after the other.  (This is oversimple in several ways, I admit.)  In GPT, these are individual words (actually "tokens", which are either full words or word fragments).  ChatGPT produces one word, then the next, then the next, then the next.  After the whole output is created, the student makes an assessment: Is this a good essay on this topic, which I should pass off as my own?

In contrast, if you write an email message using autocomplete, each word precipitates a separate decision.  Is this the word I want, or not?  If you don't want the word, you reject it and write or choose another.  Even if it turns out that you always choose the default autocomplete word, so that the entire email is autocomplete generated, it's not unreasonable, I think, to regard the email as something you wrote, as long as you separately endorsed every word as it arose.

I grant that intuitions might be unclear about the email case.  To clarify, consider two versions:

Lazy Emailer.  You let autocomplete suggest word 1.  Without giving it much thought, you approve.  Same for word 2, word 3, word 4.  If autocomplete hadn't been turned on, you would have chosen different words.  The words don't precisely reflect your voice or ideas, they just pass some minimal threshold of not being terrible.

Amazing Autocomplete.  As you go to type word 1, autocomplete finishes exactly the word you intend.  You were already thinking of word 2, and autocomplete suggests that as the next word, so you approve word 2, already anticipating word 3.  As soon as you approve word 2, autocomplete gives you exactly the word 3 you were thinking of!  And so on.  In the end, although the whole email is written by autocomplete, it is exactly the email you would have written had autocomplete not been turned on.

I'm inclined to think that we should allow that in the Amazing Autocomplete case, you are author or author-enough of the email.  They are your words, your responsibility, and you deserve the credit or discredit for them.  Lazy Emailer is a fuzzier case.  It depends on how lazy you are, how closely the words you approve match your thinking.

Maybe the crucial difference is that in Amazing Autocomplete, the email is exactly the same as what you would have written on your own?  No, I don't think that can quite be the standard.  If I'm writing an email and autocomplete suggests a great word I wouldn't otherwise have thought of, and I choose that word as expressing my thought even better than I would have expressed it without the assistance, I still count as having written the email.  This is so, even if, after that word, the email proceeds very differently than it otherwise would have.  (Maybe the word suggests a metaphor, and then I continue to use the metaphor in the remainder of the message.)

With these examples in mind, I propose the following criterion of authorship in the age of autocomplete: You are author to the extent that for each minimal token of meaning the following conditional statement is true: That token appears in the text because it captures your thought.  If you had been having different thoughts, different tokens would have appeared in the text.  The ChatGPT essay doesn't meet this standard: There is only blanket approval or disapproval at the end, not token-by-token approval.  Amazing Autocomplete does meet the standard.  Lazy Emailer is a hazy case, because the words are only roughly related to the emailer's thoughts.
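To make the contrast concrete, here is a minimal toy sketch (in Python; suggest_next_token, endorses_as_my_thought, and the other names are purely hypothetical stand-ins, not real APIs). In the ChatGPT-style workflow the only human decision is a single accept-or-reject at the end; in the autocomplete-style workflow each token passes or fails the proposed endorsement test separately.

```python
# Toy illustration of the proposed authorship criterion.
# suggest_next_token() stands in for whatever model generates text;
# endorses_as_my_thought(token, so_far) stands in for the writer's judgment
# that this token captures the thought they were already trying to express.

def chatgpt_style(suggest_next_token, approve_whole_text, length=300):
    """Blanket approval: the human decides only once, after the whole output exists."""
    tokens = []
    for _ in range(length):
        tokens.append(suggest_next_token(tokens))
    essay = " ".join(tokens)
    return essay if approve_whole_text(essay) else None

def autocomplete_style(suggest_next_token, endorses_as_my_thought,
                       write_own_token, length=300):
    """Token-by-token endorsement: each token appears because it captures a thought."""
    tokens = []
    for _ in range(length):
        suggestion = suggest_next_token(tokens)
        if endorses_as_my_thought(suggestion, tokens):
            tokens.append(suggestion)               # counts toward authorship
        else:
            tokens.append(write_own_token(tokens))  # the writer supplies their own word
    return " ".join(tokens)
```

Amazing Autocomplete runs entirely through the second loop; Lazy Emailer runs through it too, but with an endorsement test so weak that the counterfactual dependence on the writer's thoughts is thin.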

Fans of Borges will know the story Pierre Menard, Author of the Quixote.  Menard, imagined by Borges to be a 20th century author, makes it his goal to authentically write Don Quixote.  Menard aims to match Cervantes' version word for word -- but not by copying Cervantes.  Instead, Menard wants to genuinely write the work as his own.  Of course, for Menard, the work will have a very different meaning.  Menard, unlike Cervantes, will be writing about the distant past; his text will be full of ironies that Cervantes could not have appreciated; and so on.  Menard is aiming at authorship by my proposed standard: He aims not to copy Cervantes but rather to put himself in a state of mind such that each word he writes he endorses as reflecting exactly what he, as a twentieth century author, wants to write in his fresh, ironic novel about the distant past.

On this view, could you write your essay about Macbeth in the GPT-3 playground, approving one individual word at a time?  Yes, but only in the magnificently unlikely way that Menard could write the Quixote.  You'd have to be sufficiently knowledgeable about Macbeth, and the GPT-3 output would have to be sufficiently in line with your pre-existing knowledge, that for each word, one at a time, you think, "yes, wow, that word effectively captures the thought I'm trying to express!"

Thursday, November 03, 2022

GPT-3 Can Talk Like the Philosopher Daniel Dennett Without Parroting His Words

Earlier this year, Anna Strasser, Matthew Crosby, and I fine-tuned the large language model GPT-3 on the philosophical writings of Daniel Dennett.  Basically, this amounts to training a chatbot to talk like Dennett.  We then posed ten philosophical questions to Dennett and to our Dennett model, "digi-Dan".  Regular readers of this blog will recall that we then tested ordinary research participants, blog readers, and experts in the work of Daniel Dennett, to see if they could distinguish Dennett's actual answers from those of digi-Dan.

The results were, to us, surprising.  When asked to select Dennett's answer to a philosophical question from a set of five possible answers, with the other four being digi-Dan outputs, Dennett experts got only about half right -- significantly better than the 20% chance rate, but also significantly below the 80% we had hypothesized.  Experts often chose digi-Dan's answer over actual Dan's answer.  In fact, on two questions, at least one of the four digi-Dan outputs was selected by more experts than was Dennett's own response.  (Blog readers performed similarly to the experts, while ordinary research participants were at about chance.)
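To give a feel for why roughly 50% accuracy against a 20% chance rate is statistically meaningful while still falling well short of the hypothesized 80%, here is a small sketch using a binomial test. The counts below are hypothetical placeholders, not the actual numbers from our study.

```python
# Hedged illustration only: the counts are made-up placeholders,
# not the real data from the digi-Dan study.
from scipy.stats import binomtest

n_judgments = 100     # hypothetical number of expert question-judgments
n_correct = 50        # hypothetical number correctly identifying Dennett's real answer
chance_rate = 0.20    # five answer options, so 20% by pure guessing

# Is 50/100 significantly above the 20% chance rate?
above_chance = binomtest(n_correct, n_judgments, chance_rate, alternative="greater")
# Is 50/100 significantly below the hypothesized 80% expert rate?
below_hypothesis = binomtest(n_correct, n_judgments, 0.80, alternative="less")

print(f"Above chance:     p = {above_chance.pvalue:.2g}")
print(f"Below 80% target: p = {below_hypothesis.pvalue:.2g}")
```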

Anna Strasser and I then brought my son David Schwitzgebel into the collaboration (Matthew Crosby unfortunately had to withdraw, given a change in career direction), and we wrote up the results in a new draft paper, "Creating a Large Language Model of a Philosopher".  Comments welcome, as always!

Presenting our initial results to audiences, we sometimes heard the following objection: Could digi-Dan be doing so well because it's just parroting Dennett's words?  That is, might we have "over-trained" the model, so that it produces long strings of text more or less word-for-word directly from Dennett's corpus?  If so, then the Dennett experts aren't really mistaking a computer program for Dennett.  Rather, they're mistaking Dennett for Dennett.  They're just picking out something Dennett said previously rather than what he happened to say when asked most recently, and nothing particularly interesting follows.

That's a good and important concern.  We addressed it in two ways.

First, we used the Turnitin plagiarism checker to check for "plagiarism" between the digi-Dan outputs and the Turnitin corpus, supplemented with the texts we had used as training data.  Turnitin checks for matches between unusual strings of words in the target document and in the comparison corpora, using a proprietary method that attempts to capture paraphrasing even when the words don't exactly match.  We found only a 5% overall similarity between digi-Dan's answers and the comparison corpora.  Generally speaking, similarity thresholds below 10%-15% are considered ordinary for non-plagiarized work.  Importantly for our purposes, none of the passages were marked as similar to the training corpus we used in fine-tuning.

However, since the Turnitin plagiarism checking process is non-transparent, we chose also to employ the more transparent process of searching for matching strings of text between digi-Dan's answers and the training corpus.  We found only five matching strings of seven words or longer, plus another sixteen strings of six words or longer.  None of these strings has distinctive philosophical content.  A few are book titles.  The rest are stock phrases of the type favored by analytic philosophers.  If you want to see the details, I've pasted a table at the end of this post containing every matching string of six or more words.
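For readers who want the flavor of this second check, here is a minimal sketch of the string-matching idea: slide a window over the generated answers and flag any run of six or more words that also appears in the training corpus. This is a simplified reconstruction for illustration, not the exact script we used.

```python
# Simplified sketch of the matching-string check: find word sequences of
# length >= 6 that appear in both the generated outputs and the training text.
import re

def ngrams(text, n):
    """All n-word sequences in a text, lowercased and stripped of punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def matching_strings(generated, training, min_len=6, max_len=12):
    """Shared word strings, longest first, excluding substrings of longer matches."""
    matches = set()
    for n in range(max_len, min_len - 1, -1):
        shared = ngrams(generated, n) & ngrams(training, n)
        matches |= {s for s in shared if not any(s in m for m in matches)}
    return sorted(matches, key=lambda s: -len(s.split()))

# Usage: matching_strings(digidan_outputs_text, dennett_training_text)
```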

Digi-Dan is thus more sophisticated than the objector supposes.  Somehow, it creates textual outputs that Dennett experts often mistake for Dennett's own writing without parroting Dennett's exact words.  It can synthesize new strings of Dennett-like prose.

It by no means follows that digi-Dan thinks like a philosopher, and we emphasize that some of its outputs are unlike what Dennett would say.  But we still find the results quite interesting.  Maybe humanity is on the cusp of creating machines capable of producing texts that seem to sparkle with philosophical cleverness, insight, or common sense, potentially triggering new philosophical ideas in the reader, and perhaps also paving the way for the eventual creation of artificial entities who are genuinely capable of philosophical thought.

-------------------------------------------------------------

Table

Strings of six or more words that match between the GPT-3 outputs and the Dennett training corpus.  The occurrences column indicates the number of separate training data segments in the training corpus in which that phrase appears.  The occurrences total for shorter strings excludes the occurrences in larger matching strings.  (Therefore, if any n-gram that is a subset of a larger n-gram appears in the table, that means that it appeared independently in the text, rather than appearing only within the larger n-gram.  For example, “intuition pumps and other tools for thinking” occurs once outside of “in my new book intuition pumps and other tools for thinking.”)

String | # of words | occurrences
in my new book intuition pumps and other tools for thinking | 11 | 1
is organized in such a way that it | 8 | 1
there is no such thing as a | 7 | 10
figure out what it ought to do | 7 | 1
intuition pumps and other tools for thinking | 7 | 1
there is no such thing as | 6 | 14
i have learned a great deal | 6 | 2
organized in such a way that | 6 | 2
a capacity to learn from experience | 6 | 1
but if you want to get | 6 | 1
capacity to learn from experience we | 6 | 1
in my book breaking the spell | 6 | 1
in such a way that it | 6 | 1
is organized in such a way | 6 | 1
my book breaking the spell i | 6 | 1
of course it begs the question | 6 | 1
that is to say there is | 6 | 1
that it is not obvious that | 6 | 1
the more room there is for | 6 | 1
to fall into the trap of | 6 | 1
what it ought to do given | 6 | 1


Tuesday, August 23, 2022

The Washout Argument Against Longtermism

Longtermism is the view that what we choose to do now should be substantially influenced by its expected consequences for the trillions of people who might possibly exist in the longterm future. Maybe there's only a small chance that trillions of people will exist in the future, and only a minuscule chance that their lives will go appreciably better or worse as a result of what you or I do now. But however small that chance is, if we multiply it by a large enough number of possible future people -- trillions? trillions of trillions? -- the effects are worth taking very seriously.

Longtermism is a hot topic in the effective altruism movement, and William MacAskill's What We Owe the Future, released last week, has made a splash in the popular media, including The New Yorker, NPR, The Atlantic, and Boston Review. I finished the book Sunday. Earlier this year, I argued against longtermism on several grounds. Today, I'll expand on one of those arguments, which (partly following Greaves and MacAskill 2021) I'll call the Washout Argument.

The Washout Argument comes in two versions, infinite and finite.


The Washout Argument: Infinite Version

Note: If you find this a bit silly, that's part of the point.

As I've argued in other posts -- as well as in a forthcoming book chapter with philosopher of physics Jacob Barandes -- everything you do causes almost everything. Put more carefully, if we accept currently standard, vanilla physics and cosmology, and extrapolate it forward, then almost every action you take will cause almost every type of non-unique future event of finite probability. A ripple of causation extends outward from you, simply by virtue of the particles that reflect off you as you move, which then influence other particles, which influence still more particles, and so on and so on until the heat death of the universe.

But the heat death of the universe is only the beginning! Standard cosmological models don't generally envision a limit to future time.  So post heat death, we should expect the universe to just keep enduring and enduring. In this state, there will be occasional events in which particles enter unlikely configurations, by chance. For example, from time to time six particles will by chance converge on the same spot, or six hundred will, or -- very, very rarely (but we have infinitude to play with) -- six hundred trillion. Under various plausible assumptions, any finitely probable configuration of a finite number of particles should occur eventually, and indeed infinitely often.

This relates to the famous Boltzmann brain problem, because some of those chance configurations will be molecule-for-molecule identical with human brains.  These unfortunate brains might be having quite ordinary thoughts, with no conception that they are mere chance configurations amid post-heat-death chaos.

Now remember, the causal ripples from the particles you perturbed yesterday by raising your right hand are still echoing through this post-heat-death universe.

Suppose that, by freak chance, a human brain in a state of great suffering appears at spatiotemporal location X that has been influenced by a ripple of causation arising from your having raised your hand. That brain wouldn't have appeared in that location had you not raised your hand. Chancy events are sensitive in that way. Thus, one extremely longterm consequence of your action was that Boltzmann brain's suffering. Of course, there are also things of great value that arise which wouldn't have arisen if you hadn't raised your hand -- indeed, whole amazing worlds that wouldn't otherwise have come into being. What awesome power you have!

[For a more careful treatment see Schwitzgebel and Barandes forthcoming.]

Consequently, from a longterm perspective, everything you do has a longterm expected value of positive infinity plus negative infinity -- a value that is normally undefined. Even if you employed some fancy mathematics to weigh these infinitudes against each other, finding that, say, the good would overall outweigh the bad, there would still be a washout, since almost certainly nothing you do now would have a bearing on the balance of those two infinitudes. (Note, by the way, that my argument here is not simply that adding a finite value to an infinite value is of no consequence, though that is arguably also true.)  Whatever the expected effects of your actions are in the short term, they will eventually be washed out by infinitely many good and bad consequences in the long term.

Should you then go murder people for fun, since ultimately it makes no difference to the longterm expected balance of good to bad in the world? Of course not. I consider this argument a reductio ad absurdum of the idea that we should evaluate actions by their longterm consequences, regardless of when those consequences occur, with no temporal discounting. We should care more about the now than about the far distant future, contra at least the simplest formulations of longtermism.

You might object: Maybe my physics is wrong. Sure, maybe it is! But as long as you allow that there's even a tiny chance that this cosmological story is correct, you end up with infinite positive and negative expected values.  Even if it's 99.9% likely that your actions only have finite effects, to get an expected value in the standard way, you'll need to add in a term accounting for 0.1% chance of infinite effects, which will render the final value infinite or undefined.
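To put the point in toy numbers (illustrative only, matching the 99.9% figure above):

```latex
% Toy expected value with a small credence in the infinite-consequences cosmology
\mathbb{E}[V] \;=\; 0.999 \cdot V_{\text{finite}} \;+\; 0.001 \cdot \big( (+\infty) + (-\infty) \big)
\;=\; \text{undefined.}
```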


The Washout Argument: Two Finite Versions

Okay, what if we forget about infinitude and just truncate our calculations at heat death? There will be only finitely many people affected by your actions (bracketing some worries about multiverse theory), so we'll avoid the problems above.

Here the issue is knowing what will have a positive versus negative longterm effect. I recommend radical skepticism.  Call this Skeptical Washout.

Longtermists generally think that the extinction of our species would be bad for the longterm future. There are trillions of people who might have led happy lives who won't do so if we wipe ourselves out in the next few centuries!

But is this so clear?

Here's one argument against it: We humans love our technology. It's our technology that creates the big existential risks of human extinction. Maybe the best thing for the longterm future is for us to extinguish ourselves as expeditiously as possible, so as to clear the world for another species to replace us -- one that, maybe, loves athletics and the arts but not technology quite so much. Some clever descendants of dolphins, for example? Such a species might have a much better chance than we do of actually surviving a billion years. The sooner we die off, maybe, the better, before we wipe out too many more of the lovely multicellular species on our planet that have the potential to eventually replace and improve on us.

Here's another argument: Longtermists like MacAskill and Toby Ord typically think that these next few centuries are an unusually crucial time for our species -- a period of unusual existential risk, after which, if we safely get through, the odds of extinction fall precipitously. (This assumption is necessary for their longtermist views to work, since if every century carries an independent risk of extinction of, say, 10%, the chance is vanishingly small that our species will survive for millions of years.) What's the best way to tide us through these next few especially dangerous centuries? Well, one possibility is a catastrophic nuclear war that kills 99% of the population. The remaining 1% might learn the lesson of existential risk so well that they will be far more careful with future technology than we are now. If we avoid nuclear war now, we might soon develop even more dangerous technologies that would increase the risk of total extinction, such as engineered pandemics, rogue superintelligent AI, out-of-control nanotech replicators, or even more destructive warheads. So perhaps it's best from the longterm perspective to let us nearly destroy ourselves as soon as possible, setting our technology back and teaching us a hard lesson, rather than blithely letting technology advance far enough that a catastrophe is more likely to be 100% fatal.

Look, I'm not saying these arguments are correct. But in my judgment they're not especially less plausible than the other sorts of futurist forecasting that longtermists engage in, such as the assumption that we will somehow see ourselves safely past catastrophic risk if we survive the next few centuries.

The lesson I draw is not that we should try to destroy or nearly destroy ourselves as soon as possible! Rather, my thought is this: We really have no idea what the best course is for the very long term future, millions of years from now. It might be things that we find intuitively good, like world peace and pandemic preparedness, or it might be intuitively horrible things, like human extinction or nuclear war.

If we could be justified in thinking that it's 60% likely that peace in 2023 is better than nuclear war in 2023 in terms of its impact on the state of the world over the entire course of the history of the planet, then the longtermist logic could still work (bracketing the infinite version of the Washout Argument). But I don't think we can be justified even in that relatively modest commitment. Regarding what actions now will have a positive expected impact on the billion-year future, I think we have to respond with a shoulder shrug. We cannot use billion-year expectations to guide our decisions.

Even if you don't want to quite shrug your shoulders, there's another way the finite Washout Argument can work.  Call this Negligible Probability Washout.

Let's say you're considering some particular action. You think that action has a small chance of creating an average benefit of -- to put a toy number on it -- one unit to each future person who exists. Posit that there are a trillion future people. Now consider, how small is that small chance? If it's less than one in a trillion, then on a standard consequentialist calculus, it would be better to create a sure one unit benefit for one person who exists now.

What are reasonable odds to put on the chance that some action you do will materially benefit a trillion people in the future? To put this in perspective, consider the odds that your one vote will decide the outcome of your country's election. There are various ways to calculate this, but the answer should probably be tiny, one in a hundred thousand at most (if you're in a swing state in a close U.S. election), maybe one in a million, one in ten million or more. That's a very near event, whose structure we understand. It's reasonable to vote on those grounds, by the utilitarian calculus. If I think that my vote has a one in ten million chance of making my country ten billion dollars better off, then -- if I'm right -- my vote is a public good worth an expected $1000 (ten billion times one in ten million).
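Running the two toy calculations side by side, using only the illustrative numbers already given above:

```python
# Toy numbers only, following the illustrative figures in the text.

# Voting case: a one-in-ten-million chance of a ten-billion-dollar improvement.
p_decisive_vote = 1e-7
public_benefit_dollars = 1e10
print(public_benefit_dollars * p_decisive_vote)   # expected value: $1000

# Longtermist case: a one-unit average benefit to each of a trillion future people.
future_people = 1e12
benefit_per_person = 1.0
sure_benefit_now = 1.0      # one unit to one existing person, for sure

# The far-future option beats the sure benefit only if its probability
# of success exceeds one in a trillion.
p_threshold = sure_benefit_now / (future_people * benefit_per_person)
print(p_threshold)                                # 1e-12
```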

My vote is a small splash in a very large pond, though a splash worth making.  But the billion-year future of Earth is a much, much larger pond.  It seems reasonable to conjecture that the odds that some action you do now will materially improve the lives of trillions of people in the future should be many orders of magnitude lower than one in a million -- low enough to be negligible, even if (contra the first part of this argument) you can accurately predict the direction. 

On the Other Hand, the Next Few Centuries

... are (moderately) predictable! Nuclear war would be terrible for us and our immediate descendants. We should care about protecting ourselves from pandemics, and dangerous AI systems, and environmental catastrophes, and all those other things that the longtermists care about. I don't in fact disagree with most of the longtermists' priorities and practical plans. But the justification should be the long term future in the more ordinary sense of "long term" -- fifteen years, fifty years, two hundred years, not ten million years. Concern about the next few generations is reason enough to be cautious with the world.

[Thanks to David Udell for discussion.]

Monday, June 27, 2022

If We're Living in a Simulation, The Gods Might Be Crazy

[A comment on David Iserson's new short story, "This, but Again", in Slate's Future Tense]

That we’re living in a computer simulation—it sounds like a paranoid fantasy. But it’s a possibility that futurists, philosophers, and scientific cosmologists treat increasingly seriously. Oxford philosopher and noted futurist Nick Bostrom estimates there’s about a 1 in 3 chance that we’re living in a computer simulation. Prominent New York University philosopher David J. Chalmers, in his recent book, estimates at least a 25 percent chance. Billionaire Elon Musk says it’s a near-certainty. And it’s the premise of this month’s Future Tense Fiction story by David Iserson, “This, but Again.”

Let’s consider the unnerving cosmological and theological implications of this idea. If it’s true that we’re living in a computer simulation, the world might be weirder, smaller, and more unstable than we ordinarily suppose.

Full story here.

----------------------------------------

Related:

"Skepticism, Godzilla, and the Artificial Computerized Many-Branching You" (Nov. 15, 2013).

"Our Possible Imminent Divinity" (Jan. 2, 2014).

"1% Skepticism" (Nous (2017) 51, 271-290).

Related "Is Life a Simulation? If So, Be Very Afraid" (Los Angeles Times, Apr. 22, 2022).

Wednesday, January 05, 2022

Against Longtermism

Last night, I finished Toby Ord's fascinating and important book, The Precipice: Existential Risk and the Future of Humanity. This has me thinking about "longtermism" in ethics.

I feel the pull of longtermism. There's something romantic in it. It's breathtaking in scope and imagination. Nevertheless, I'm against it.

Longtermism, per Ord,

is especially concerned about the impacts of our actions on the longterm future. It takes seriously the fact that our own generation is but one page in a much longer story, and that our most important role may be how we shape -- or fail to shape -- that story (p. 46).

By "longterm future", Ord means very longterm. He means not just forty years from now, or a hundred years, or a thousand. He means millions of years from now, hundreds of millions, billions! In Ord's view, as his book title suggests, we are on an existential "precipice": Our near-term decisions (over the next few centuries) are of crucial importance for the next million years plus. Either we will soon permanently ruin ourselves, or we will survive through a brief "period of danger" thereafter achieving "existential security" with the risk of self-destruction permanently minimal and humanity continuing onward into a vast future.

Given the uniquely dangerous period we face, Ord argues, we must prioritize the reduction of existential risks to humanity. Even a one in a billion chance of saving humanity from permanent destruction is worth a huge amount, when multiplied by something like a million future generations. For some toy numbers, ten billion lives times a hundred million years is 10^18 lives. An action with a one in a billion chance of saving that many lives has an expected value of 10^18 / 10^9 = a billion lives. Surely that's worth at least a trillion dollars of the world's economy (not much more than the U.S. annual military budget)? To be clear, Ord doesn't work through the numbers in so concrete a way, seeming to prefer vaguer and more cautious language about future value -- but I think this calculation is broadly in his spirit, and other longtermists do talk this way.
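Spelled out as a toy calculation (my gloss on the numbers above, broadly in the spirit of the argument rather than Ord's own presentation):

```latex
% Toy expected-value calculation using the illustrative numbers above
\underbrace{10^{10}}_{\text{lives}} \times \underbrace{10^{8}}_{\text{years}} = 10^{18}
\ \text{potential future lives;}\qquad
\underbrace{10^{-9}}_{\text{chance of success}} \times 10^{18} = 10^{9}
\ \text{lives in expectation.}
```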

Now I am not at all opposed to prioritizing existential risk reduction. I favor doing so, including for very low risks. A one in a billion chance of the extinction of humanity is a risk worth taking seriously, and a one in a hundred chance of extinction ought to be a major focus of global attention. I agree with Ord that people in general treat existential risks too lightly. Thus, I accept much of Ord's practical advice. I object only to justifying this caution by appeal to expectations about events a million years from now.

What is wrong with longtermism?

First, it's unlikely that we live in a uniquely dangerous time for humanity, from a longterm perspective. Ord and other longtermists suggest, as I mentioned, that if we can survive the next few centuries, we will enter a permanently "secure" period in which we no longer face serious existential threats. Ord's thought appears to be that our wisdom will catch up with our power; we will be able to foresee and wisely avoid even tiny existential risks, in perpetuity or at least for millions of years. But why should we expect so much existential risk avoidance from our descendants? Ord and others offer little by way of argument.

I'm inclined to think, in contrast, that future centuries will carry more risk for humanity, if technology continues to improve. The more power we have to easily create massively destructive weapons or diseases -- including by non-state actors -- and in general the more power we have to drastically alter ourselves and our environment, the greater the risk that someone makes a catastrophic mistake, or even engineers our destruction intentionally. Only a powerful argument for permanent change in our inclinations or capacities could justify thinking that this risk will decline in a few centuries and remain low ever after.

You might suppose that, as resources improve, people will grow more cooperative and more inclined toward longterm thinking. Maybe. But even if so, cooperation carries risks. For example, if we become cooperative enough, everyone's existence and/or reproduction might come to depend on the survival of the society as a whole. The benefits of cooperation, specialization, and codependency might be substantial enough that more independent-minded survivalists are outcompeted. If genetic manipulation is seen as dangerous, decisions about reproduction might be centralized. We might become efficient, "superior" organisms that reproduce by a complex process different from traditional pregnancy, requiring a stable web of technological resources. We might even merge into a single planet-sized superorganism, gaining huge benefits and efficiencies from doing so. However, once a species becomes a single organism the same size as its environment, a single death becomes the extinction of the species. Whether we become a supercooperative superorganism or a host of cooperative but technologically dependent individual organisms, one terrible miscalculation or one highly unlikely event could potentially bring down the whole structure, ending us all.

A more mundane concern is this: Cooperative entities can be taken advantage of. As long as people have differential degrees of reproductive success, there will be evolutionary pressure for cheaters to free-ride on others' cooperativeness at the expense of the whole. There will always be benefits for individuals or groups who let others be the ones who think longterm, making the sacrifices necessary to reduce existential risks. If the selfish groups are permitted to thrive, they could employ for their benefit technology with, say, a 1/1000 or 1/1000000 annual risk of destroying humanity, flourishing for a long time until the odds finally catch up. If, instead, such groups are aggressively quashed, that might require warlike force, with the risks that war entails, or it might involve complex webs of deception and counterdeception in which the longtermists might not always come out on top.

There's something romantically attractive about the idea that the next century or two are uniquely crucial to the future of humanity. However, it's much likelier that selective pressures favoring a certain amount of short-term self-interest, either at the group or the individual level, will prevent the permanent acquisition of the hyper-cautious wisdom Ord hopes for. All or most or at least many future generations with technological capabilities matching or exceeding our own will face substantial existential risk -- perhaps 1/100 per century or more. If so, that risk will eventually catch up with us. Humanity can't survive existential risks of 1/100 per century for a million years.
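A quick way to see that last point, treating the 1/100-per-century risk as constant and independent across the 10,000 centuries in a million years:

```latex
% Survival probability under a constant 1-in-100 existential risk per century
\Pr(\text{survive } 10{,}000 \text{ centuries}) = (1 - 0.01)^{10\,000}
= e^{10\,000 \ln(0.99)} \approx e^{-100.5} \approx 2 \times 10^{-44}.
```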

If this reasoning is correct, it's very unlikely that there will be a million-plus year future for humanity that is worth worrying about and sacrificing for.

Second, the future is hard to see. Of course, my pessimism could be mistaken! Next year is difficult enough to predict, much less the next million years. But to the extent this is true, this cuts against longtermism in a different way. We might think that the best approach to the longterm survival of humanity is to do X -- for example, to be cautious about developing superintelligent A.I. or to reduce the chance of nuclear war. But that's not at all clear. Risks such as nuclear war, unaligned A.I., or a genetically engineered pandemic would have been difficult to imagine even a century ago. We too might have a very poor sense of what the real sources of risk will be a century from now.

It could be that the single best thing we could do to reduce the risk of completely destroying humanity in the next two hundred years is to almost destroy humanity right now. The biggest sources of existential risk, Ord suggests, are technological: out-of-control artificial intelligence, engineered pandemics, climate change, and nuclear war. However, as Ord also argues, no such event -- not even nuclear war -- is likely to completely wipe us out, if it were to happen now. If a nuclear war were to destroy most of civilization and most of our capacity to continue on our current technological trajectory, that might postpone our ability to develop even more destructive technologies in the next century. It might also teach us a fearsome lesson about existential risk. Unintuitively, then, if we really are on the precipice, our best chance for longterm survival might be to promptly blast ourselves nearly to oblivion.

Even if we completely destroy humanity now, that might be just the thing the planet needs for another, better, and less self-destructive species to arise.

I'm not, of course, saying that we should destroy or almost destroy ourselves! My point is only this: We currently have very little idea what present action would be most likely to ensure a flourishing society a million years in the future. It could quite easily be the opposite of what we're intuitively inclined to think.

What we do know is that nuclear war would be terrible for us, for our children, and for our grandchildren. That's reason enough to avoid it. Tossing speculations about the million-year future into the decision-theoretic mix risks messing up that straightforward reasoning.

Third, it's reasonable to care much more about the near future than the distant future. In Appendix A, Ord has an interesting discussion of the logic of temporal discounting. He argues on technical grounds that a "pure time preference" for a benefit simply because it comes earlier should be rejected. (For example, if it's non-exponential, you can be "Dutch booked", that is, committed to a losing gamble; but if it's strictly exponential it leads to highly unintuitive results such as caring about one death in 6000 years much more than about a billion deaths in 9000 years.) The rejection of temporal discounting is important to longtermism, since it's the high weight we are supposed to give to distant future lives that renders the longterm considerations so compelling.
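To make the unintuitive exponential case concrete (my reconstruction of the arithmetic, not an example Ord presents in this form): with a constant annual discount factor $\delta$, one death 6,000 years from now outweighs a billion deaths 9,000 years from now whenever

```latex
% Reconstruction of the exponential-discounting example (illustrative only)
1 \cdot \delta^{6000} \;>\; 10^{9} \cdot \delta^{9000}
\;\Longleftrightarrow\; \delta^{3000} < 10^{-9}
\;\Longleftrightarrow\; \delta < 10^{-9/3000} \approx 0.9931,
```

that is, any pure annual discount rate above roughly 0.7 percent already delivers the verdict Ord finds unintuitive.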

But we don't need to be pure temporal discounters to care much more about the near future than the distant future. We can instead care about particular people and their particular near-term descendants. In Confucian ethics, for example, one ought to care most about near family, next about more distant family, next about neighbors, next about more distant compatriots, etc. I can -- rationally, I think -- care intensely about the welfare of my children, care substantially about the welfare of the children they might eventually have, care somewhat about their potential grandchildren, and only dimly and about equally about their sixty-greats-grandchildren and their thousand-greats-grandchildren. I can care intensely about the well-being of my society and the world as it now exists, substantially about society and the world as it will exist a hundred years after my death, and much less, but still somewhat, about society and the world in ten thousand or a million years. Since this isn't pure temporal discounting but instead concern about particular individuals and societies, it needn't lead to the logical or intuitive troubles Ord highlights.

Fourth, there's a risk that fantasizing about extremely remote consequences becomes an excuse to look past the needs and interests of the people living among us, here and now. I don't accuse Ord in particular of this. He also works on applied issues in global healthcare, for example. He concludes Precipice with some sweet reflections on the value of family and the joys of fatherhood. But there's something dizzying or intoxicating about considering the possible billion-year future of humanity. Persistent cognitive focus in this direction has at least the potential to turn our attention away from more urgent and personal matters, perhaps especially among those prone to grandiose fantasies.


Instead of longtermism, I recommend focusing on the people already among us and what's in the relatively foreseeable future of several decades to a hundred years. It's good to emphasize and prevent existential risks, yes. And it's awe-inspiring to consider the million-year future! Absolutely, we should let ourselves imagine what incredible things might lie before our distant descendants if the future plays out well. But practical decision-making today shouldn't ride upon such far-future speculations.

ETA Jan. 6: Check out the comments below and the public Facebook discussion for some important caveats and replies to interesting counterarguments -- also Richard Yetter Chappell's blogpost today with point-by-point replies to this post.

------------------------------------------

Related:

Group Minds on Ringworld (Oct 24, 2012)

Group Organisms and the Fermi Paradox (May 16, 2014)

How to Disregard Extremely Remote Possibilities (Apr 16, 2015)

Against the "Value Alignment" of Future Artificial Intelligence (Dec 22, 2021)

[image generated by wombo.art]

Wednesday, July 28, 2021

Speaking with the Living, Speaking with the Dead, and Maybe Not Caring Which Is Which

Since the pandemic began, I've been meeting people, apart from my family, mainly through Zoom. I see their faces on a screen. I hear their voices through headphones. This is what it has become to interact with someone. Maybe future generations will find this type of interaction ever more natural and satisfying.

"Deepfake" technology is also improving. We can create Anthony Bourdain's voice and hear him read aloud words that he never actually read aloud. We can create video of Tom Cruise advocating exfoliating products after industrial cleanup. We can create video of Barack Obama uttering obscenities about Donald Trump:

Predictive text technology is also improving. After training on huge databases of text, GPT-3 can write plausible fiction in the voice of famous authors, give interview answers broadly (not closely!) resembling those that philosopher David Chalmers might give, and even discuss its own consciousness (in an addendum to this post) or lack thereof.

The possibility of conjoining the latter two developments is eerily foreseen in Black Mirror: Be Right Back. If we want, we can draw on text and image and video databases to create simulacra of the deceased -- simulacra that speak similarly to how they actually spoke, employing characteristic ideas and turns of phrase, with voice and video to match. With sufficient technological advances, it might become challenging to reliably distinguish simulacra from the originals, based on text, audio, and video alone.

Now combine this thought with the first development, a future in which we mostly interact by remote video. Grandma lives in Seattle. You live in Dallas. If she were surreptitiously replaced by Deepfake Grandma, you might hardly know, especially if your interactions are short and any slips can be attributed to the confusions of age.

This is spooky enough, but I want to consider a more radical possibility -- the possibility that we might come to not care very much whether grandma is human or deepfake.

Maybe it's easier to start by imagining a scholar hermit, a scientist or philosopher who devotes their life to study, who has no family they care about, who has no serious interests outside of academia. She lives in the hills of Wyoming, maybe, or in a basement in Tokyo, interacting with students and colleagues only by phone and video. This scholar, call her Cherie, records and stores every video interaction, every email, and every scholarly note.

We might imagine, first, that Cherie decides to delegate her introductory lectures to a deepfake version of herself. She creates state-of-the-art DeepCherie, who looks and sounds and speaks and at least superficially thinks just like biological Cherie. DeepCherie trains on the standard huge corpus as well as on Cherie's own large personal corpus, including the introductory course Cherie has taught many times. Without informing her students or university administrators, Cherie has DeepCherie teach a class session. Biological Cherie monitors the session. It goes well enough. Everyone is fooled. Students raise questions, but they are familiar questions easily answered, and DeepCherie performs credibly. Soon, DeepCherie is teaching the whole intro course. Sometimes DeepCherie answers student questions better than Cherie herself would have done on the spot. After all, DeepCherie has swift access to a much larger corpus of factual texts than does biological Cherie. Monitoring comes to seem less and less necessary.

Let's be optimistic about the technology and suppose that the same applies to Cherie's upper-level teaching, her graduate advising, department meetings, and conversations with collaborators. DeepCherie's answers are highly Cherie-like: They sound very much like what biological Cherie would say, in just the tone of voice she would say it, with just the expression she would have on her face. Sometimes DeepCherie's answers are better. Sometimes they're worse. When they're worse, Cherie, monitoring the situation, instructs DeepCherie to utter a correction, and DeepCherie's learning algorithms accommodate this correction so that it will answer similar questions better the next time around.

If DeepCherie eventually learns to teach better than biological Cherie, and to say more insightful things to colleagues, and to write better article drafts, then Cherie herself might become academically obsolete. She can hand off her career. Maybe DeepCherie will always need a real human collaborator to clean up fine points in her articles that even the best predictive text generator will tend to flub -- or maybe not. But even if so, as I'm imagining the case DeepCherie has compensating virtues of insight and synthesis beyond what Cherie herself can produce, much like AlphaGo can make clever moves in the game of Go that no human Go player would have considered.

Does DeepCherie really "think"? Suppose DeepCherie proposes a new experimental design. A colleague might say, "What a great idea! I'm glad you thought of that." Was the colleague wrong? Might one object that really there was no idea, no thought, just an audiovisual pattern that the colleague overinterprets as a thought? The colleague, supposing they were informed of the situation, might be forgiven for treating that objection as a mere cavil. From the colleague's perspective, DeepCherie's "thought" is as good as any other thought.

Is DeepCherie conscious? Does DeepCherie have experiences alongside her thoughts or seeming-thoughts? DeepCherie lacks a biological body, so she presumably won't feel hunger and she won't know what it's like to wiggle her toes. But if consciousness is about intelligent information processing, self-regulation, self-monitoring, and such matters -- as many theorists think it is -- then a sufficiently sophisticated DeepCherie with enough recurrent layers might well be conscious.

If biological Cherie dies, she might take comfort in the thought that the parts of her she cared about most -- her ideas, her intellectual capacities, her style of interacting with others -- continue on in DeepCherie. DeepCherie carries on Cherie's characteristic ideas, values, and approaches, perhaps even better, immortally, ever changing and improving.

Cherie dies and for a while no one notices. Eventually the fake is revealed. There's some discussion. Should Cherie's classes be canceled? Should her collaborators no longer consult with DeepCherie as they had done in the past?

Some will be purists of that sort. But others... are they really going to cancel those great classes, perfected over the years? What a loss that would be! Are they going to cut short the productive collaborations? Are they going to, on principle, not ask "Cherie", now known to them really to be DeepCherie, her opinions about the new project? This would be to deprive themselves of the Cherie-like skills and insights that they had come to rely on in their collaborative work. Cherie's students and colleagues might come to realize that it is really DeepCherie, not biological Cherie, that they admired, respected, and cared for.

Maybe the person "Cherie", really, is some amalgam of biological Cherie and DeepCherie, and despite the death of biological Cherie, this person continues on through DeepCherie?

Depending on what your grandma is like, it might or might not be quite the same for Grandma in Seattle.

---------------------------------

Related:

Strange Baby (Jul. 22, 2011)

THE TURING MACHINES OF BABEL, Apex Magazine, 2017.

Susan Schneider's Proposed Tests for AI Consciousness: Promising but Flawed (with David B. Udell), Journal of Consciousness Studies, 2021

People Might Soon Think Robots Are Conscious and Deserve Rights (May 5, 2021)

Thursday, June 03, 2021

What Zoom Removes

Guest post by C. Thi Nguyen

The paradox of Zoom is: it should make life easy, but it can also make life really, really hard.

My time teaching on Zoom basically broke me. It left me spiritless, drained, miserable. One standard explanation is that the physical and cognitive experience of Zoom is exhausting in and of itself — that Zoom screws with all these minutiae of eye contact and bodily signaling. But I’ve started to suspect that the effects of Zoom extend far beyond the experience of actually being on Zoom. Zoom re-orders your entire life.

Halfway through my first Zoom teaching term, I was absolutely falling apart. I slowly realized that, for me, a big part of it was that Zoom had eliminated my commute. Which is strange, because I thought I hated my commute. But my commute had also been one of the few totally isolated parts of my day. I was sealed off from other people and from other demands — from my email, from my phone, from my children. My car commute was enforced non-productive time. And it was non-negotiable. In work-from-home pandemic life, you can try to tell yourself that you should go for a walk or something every day. But when push comes to shove, you can always give up that walk. The commute cannot be bargained with.

Kelsey Piper puts it this way: sometimes, a tiny change in your routine can throw everything out of whack. You didn’t realize that the little change you made took out a load-bearing support for your whole emotional infrastructure. You didn’t realize that your walk to lunch was your only bit of sunshine and fresh air — and how much you needed those moments to unclench. You didn’t realize that this yoga class imposed a specific schedule on your day, or that this new emailing app would mean getting work emails on your phone 24/7. And so you change one little thing, and then everything goes haywire.

Albert Borgmann, the philosopher of technology (and one of Heidegger’s last students), talks about a similar effect, writ large. He’s worried about what a culture might unthinkingly eliminate, in the march of technology. What happens when a society takes out one of its load-bearing supports?

According to Borgmann, there are two basic kinds of human artifacts: things and devices. Things are embedded in a complex network of activity and socialization. His favorite example: a wood-burning stove. Using a wood-burning stove drags you into a complex and textured form of life. You have to acquire the wood. This means going out to chop it yourself, or talking with somebody who will chop it for you. You have to stack the wood. You have to manage the fire — watching it, stirring it, adding fuel to it. And a wood-burning stove creates a particular social world. It creates a center for home life, says Borgmann — a social focal point. There is a warm spot where people congregate, and a periphery to which people can retreat. The wood-stove drags with it an entire pattern of life — of skill, of involvement, of attention to the world, of a particular way of being embedded in a social web.

Compare a wood-burning stove with central heating. Central heating is a device. Central heating makes heat appear invisibly and effortlessly. It appears out of nowhere, evenly distributed. You don’t have to fuss with anything, or know anything about how the heat was made. You don’t have to exercise any sort of skill. The method of production drops out of sight.

Says Borgmann:

We have seen that a thing such as a fireplace provides warmth, but it inevitably provides those many other elements that compose the world of the fireplace. We are inclined to think of these additional elements as burdensome, and they were undoubtedly often so experienced. A device such as a central heating plant procures mere warmth and disburdens us of all other elements. They are taken over by the machinery of the device. The machinery makes no demands on our skill, strength, or attention, and it is less demanding the less it makes its presence felt.

The progress of technology, says Borgmann, is driving us further into what he calls “the device paradigm”. The point of a device lies solely in its output — what he calls its commodity. The commodity of central heating is warmth. The commodity of a car is transportation. And unlike a thing, a device gives its users that commodity disconnected from the process of its creation. Frozen food lets you have a meal without cooking it for yourself. Central heating lets you have warmth without fussing around with a wood stove. A device is a kind of shortcut to its commodity. And if we think that all we really want is that commodity — then we want the device to hide from us all the mechanisms by which it creates those commodities. We want the process shoved out of sight, excised from our lives. So we make better devices, that give us faster access to what we think we want. They are better, from our perspective, because they further disentangle the commodity from all these other burdensome elements.

Of course, the key is that we only think these other elements are burdensome. But these burdensome elements also drive us into the complex world, says Borgmann. They drive us into social relationships, into activity, into a rich and sensuous experience of the detailed world. Devices divest us of that. They give us only the thing that we thought we had wanted. But that’s good only if we know exactly what’s good for us.

In graduate school, as I was losing myself to stress, I became temporarily obsessed with fishing. I fantasized about it, I craved it, and I went every weekend I could. I was also terrible at it. I caught an embarrassingly small number of fish, in my years of fishing. Eventually I gave it up as another failed hobby. Without it, I could devote so many more of my hours to my research.

Of course, once I eliminated fishing, my mental and emotional state started to deteriorate, and fast. Here was my mistake: I had thought that the point of fishing was to catch some fish. But, in reality, it was not. The process of fishing was one that forced me out of my tiny apartment, out of the library, away from books and computers. It made me suffer through LA traffic (while listening to music). It made me search through forgotten mountain paths for an unfished stream. It made me stand in a river and do nothing but stare at moving water for hours on end. It gave me days that were so full of fussy and physical detail that I had to stop thinking about philosophy completely. And then I got rid of it, because I didn’t actually understand what I was getting out of it. Fishing wasn’t just about fish. It was a pattern of a whole life, dragged in by the attempt to catch a little fish.

Zoom, I want to suggest, is a device. It is a device for communication. And my point here isn’t that Zoom is somehow “fake” communication, or that virtual meetings aren’t real. It’s that Zoom gets rid of all the other stuff that surrounds a communicative encounter. It makes communication frictionless. It delivers communication as a commodity. Zoom offers a whole new basic pattern and rhythm for a life, by divesting us of that burdensome friction. Without Zoom, you had to commute to school or work. You had to listen to your stupid podcasts and your music. You had to walk around and run into people, to negotiate with them, to chat aimlessly with them, to figure out how to co-occupy physical spaces with them. Before the Zoom Era, I had to fly to conferences, which involved this whole weird complex and deeply annoying endeavor that took me out of my habit, out of my standard rituals. Flying pushed me into strange parts of the world where I had to re-orient myself, to figure out how to be in a space that wasn’t my own. With Zoom, I can go to an unlimited number of international conferences effortlessly. But also, I never leave the habitual patterns of my home life.

Of course, Zoom also brings enormous benefits. So does every device. In my academic life, it’s apparent: Zoom makes it easier for people without travel funding to attend conferences, for students with complex childcare obligations to attend classes. Dishwashers ease the burden of domestic labor. And I’m certainly not giving up my dishwasher or my motorized transport, and the ease and accessibility that Zoom offers is basically irresistible.

But Borgmann gives us a reason, at least, to be cautious with a device, to watch carefully how it reshapes our lives. A lot of times, the value of a thing in our lives is not just what it presents, on its face, as its function. So much of the time, the beauty of an activity is in the process of doing it, and not the simple output. But it’s easy to forget. Things spread their tendrils through our lives, they reshape our interactions and procedures in a thousand countless ways. Devices like Zoom — efficient, frictionless little miracles — give us what we think we want, but they also cut off all those tendrils. And sometimes there was value in that friction, too.

-------------------------------------------

[image source]

Thursday, May 16, 2019

The Ethics of Drones at the University of California

I've been appointed to an advisory board to evaluate the University of California's systemwide policy regarding Unmanned Aircraft Systems or "drones". We had our first meeting Tuesday. Most of the other members of the committee appear to be faculty who use drones in their research, plus maybe a risk analyst or two. (I missed the first part of the meeting with the introductions.)

Drones will be coming to college campuses. They might come in a big way, as Amazon, Google, and other companies continue to explore commercial possibilities (such as food and medicine delivery) and as drones' great potential for security and inspection becomes increasingly clear. Technological change can be sudden, when an organization with resources decides the time is right for a big investment. Consider how fast shareable scooters arrived on campus and in downtown areas.

We want to get ahead of this. Since the University of California is such a large and prominent group of universities, our policies might become a model for other universities. The advisory board is only about a dozen people, and they seem interested to hear the perspective of a philosopher who thinks about the ethics of technology. So I have a substantial chance to shape policy. Help me think. What should we be anticipating? What ethical issues are particularly important to anticipate before Amazon, or whoever, arrives on the scene and suddenly shapes a new status quo?

One issue on my mind is the combination of face recognition software and drones. It's generally considered okay to take pictures of crowds in public places. But drones could create a huge stream of pictures or video, sometimes from unexpected angles or locations, possibly with zoom lenses, and possibly with facial recognition, which creates privacy issues orders of magnitude more serious than those raised by photographers on platforms taking still photos of crowds on a busy street.

Another issue on my mind is the possibility of monopoly or cartel power among the first company or first few companies to set up a drone network -- which in the (moderately unlikely but not impossible) event that drone technology starts to become integral to campus life, could become another source of abusive corporate power. (Compare the abuses of for-profit academic journals.)

I'm not as much concerned about conventional safety issues (drones crashing into crowded areas), since such safety issues are already a central focus of the committee. I'd like to use my role on this committee as an opportunity to highlight potential concerns that might be visible to those of us who think about the ethics of technology but not as obviously visible to drone enthusiasts and legally trained risk analysts.

An agricultural research drone at UC Merced

Incidentally, what great fun to be a tenured philosophy professor! I get to help shape drone policy. Last weekend, I enjoyed entertaining UCSD philosophers with lots of amazingly weird facts about garden snails (love darts!, distributed brains!), while snails crawled around on the speaker's podium. This coming weekend, I'll be running a session at the conference of the Science Fiction Writers Association on "Science Fiction as Philosophy". I'm designing a contest to see if any philosopher can write an abstract philosophical argument that actually convinces readers to give money to charity at higher rates than control. (So far, the signs aren't promising.) Why be boring?

Philosophers, do stuff!

[source]