Friendly AI: What is it, and how can we foster it?

Friendly AI: What is it, and how can we foster it?
By Frank W. Sudia [1]

Originally written July 20, 2008
Edited and web published June 6, 2009
Copyright © 2008-09, All Rights Reserved.

Keywords: artificial intelligence, artificial intellect, friendly AI, human-robot ethics, science policy.

1. Introduction

There is consensus that true artificial intelligence, of the kind that could generate a “runaway” increasing-returns process or “singularity,” is still many years away, and some believe it may be unattainable. Nevertheless, in view of the likely difficulty of putting the genie back in the bottle, an increasing concern has arisen with the topic of “friendly AI,” coupled with the idea we should do something about this now, not after a potentially deadly situation is starting to spin out of control [2].

(Note: Some futurists believe this topic is moot in view of intensive funding for robotic soldiers, which can be viewed as intrinsically “unfriendly.” However if we focus on threats posed by “super-intelligence,” still off in the future, the topic remains germane.)

Most if not all popular (Western) dramatizations of robotic futures postulate that the AIs will run amok and turn against humans. Some scholars [3] who considered the issue concluded that this might be virtually inevitable, in view of the gross inconsistencies and manifest “unworthiness” of humanity, as exemplified in its senseless destruction of its global habitat and a large percentage of extant species, etc.

The prospect of negative public attention, including possible legal curbs on AI research, may be distasteful, but we must face the reality that public involvement has already been quite pronounced in other fields of science, such as nuclear physics, genetically modified organisms, birth control, and stem cells. Hence we should be proactive about addressing these popular concerns, lest we unwittingly incur major political defeats and long lasting negative PR.

Nevertheless, upon reasoned analysis, it is far from obvious what “friendly” AI means, or how it could be fostered. Advanced AIs are unlikely to have any fixed “goals” that can be hardwired [4], so as to place “friendliness” towards humans and other life at the top of the hierarchy.

Rather, in view of their need to deal with perpetual novelty, they will reason from facts and models to infer appropriate goals. It’s probably a good bet that, when dealing with high-speed coherence analyzers, hypocrisy will not be appreciated – not least because it wastes a lot of computational resources to detect and correct. If humans continue to advocate and act upon “ideals” that are highly contradictory and self destructive, it’s hard to argue that advanced AI should tolerate that.

To make progress, not only for friendly AI, but also for ourselves, we should be seeking to develop and promote “ruling ideas” (or source models) that will foster an ecologically-respectful AI culture, including respect for humanity and other life forms, and actively sell it to them as a proper model upon which to premise their beliefs and conduct.

By a “ruling idea” I mean any cultural ideal (or “meme”) that can be transmitted and become part of a widely shared belief system, such as respecting one’s elders, good sportsmanship, placing trash in trash bins, washing one’s hands, minimizing pollution, and so on. An appropriate collection of these can be reified as a panel (or schema) of case models, including a program for their ongoing development. These must be believable by a coherence-seeking intellect, although then as now there will be competing models, each with its own approach to maximizing coherence.

2. What do we mean by “friendly”?

Moral systems are difficult to derive from first principles and most of them seem to be ad hoc legacies of particular cultures. Lao Tsu’s [5] Taoist model, as given in the following quote, can serve as a useful starting point, since it provides a concise summary of desiderata, with helpful rank ordering:

When the great Tao is lost, there is goodness.
When goodness is lost, there is kindness.
When kindness is lost, there is justice.
When justice is lost, there is the empty shell of ritual.

– Lao Tsu, Tao Te Ching, 6th-4th century BCE (emphasis supplied)

I like this breakout for its simplicity and clarity. Feel free to repeat the following analysis for any other moral system of your choice. Leaving aside the riddle of whether AIs can attain the highest level (of Tao or Nirvana), we can start from the bottom of Lao Tsu’s list and work upwards, as follows:

2.1. Ritual / Courteous AI

Teaching or encouraging the AIs to behave with contemporary norms of courtesy will be a desirable first step, as with children and pets. Courtesy is usually a fairly easy sell, since it provides obvious and immediate benefits, and without it travel, commerce, and social institutions would immediately break down. But we fear that it’s not enough, since in the case of an intellectually superior being, it could easily mask a deeper unkindness.

2.2. Just AI

Certainly to have AIs act justly in accordance with law is highly desirable, and it constitutes the central thesis of my principal prior work in this field [6]. Also it raises the question on what basis can we demand anything more from an AI, than that it act justly? This is as far as positive law can go [7], and we rarely demand more from highly privileged humans. Indeed, for a powerful human to act justly (absent compulsion) is sometimes considered newsworthy.

How many of us are faithful in all things? Do many of us not routinely disappoint others (via strategies of co-optation or betrayal, large or small) when there is little or no penalty for doing so? Won’t AIs adopt a similar “game theory” calculus of likely rewards and penalties for faithfulness and betrayal?

Justice is often skewed towards the party with greater intelligence and financial resources, and the justice system (with its limited public resources) often values “settling” controversies over any quest for truly equitable treatment. Apparently we want more, much more. Still, if our central desire is for AIs not to kill us, then (as I postulated in my prior work) Just AI would be a significant achievement.

2.3. Kind / Friendly AI

How would a “Kind AI” behave? Presumably it will more than incidentally facilitate the goals, plans, and development of others, in a low-ego manner, reducing its demands for direct personal benefit and taking satisfaction in the welfare, progress, and accomplishments of others. And, very likely, it will expect some degree of courtesy and possible reciprocation, so that others will not callously free-ride on its unilateral altruism. Otherwise its “feelings would be hurt.” Even mothers are ego-free mainly with respect to their own kin and offspring (allegedly fostering their own genetic material in others) and child care networks, and do not often act altruistically toward strangers.

Our friendly AI program may hit a barrier if we expect AIs to act with unilateral altruism, without any corresponding commitment by other actors to reciprocate. Otherwise it will create a “non-complementary” situation, in which what is true for one, who experiences friendliness, may not be true for the other, who experiences indifference or disrespect in return.

Kindness could be an easier sell if we made it more practical, by delimiting its scope and depth. To how wide of a circle does this kindness obligation extend, and how far must they go to aid others with no specific expectation of reward or reciprocation? For example the Boy Scout Oath [8] teaches that one should do good deeds, like helping elderly persons across busy streets, without expecting rewards.

However, if too narrow a scope is defined, we will wind up back with Just AI, because justice is essentially “kindness with deadlines,” often fairly short ones, during which claims must be aggressively pursued or lost, with token assistance to weaker, more aggrieved claimants.

2.4. Good / Benevolent AI

Here we envision a significant departure from ego-centrism and personal gain towards an abstract system-centered viewpoint. Few humans apparently reach this level, so it seems unrealistic to expect many AIs to attain it either. Being highly altruistic, and looking out for others or the World as a whole rather than oneself, entails a great deal of personal risk due to the inevitable non-reciprocation by other actors. Thus it is often associated with wealth or sainthood, where the actor is adequately positioned to accept the risk of zero direct payback during his or her lifetime.

We may dream that our AIs will tend towards benevolence or “goodness,” but like the visions of universal brotherhood we experience as adolescents, such ideals quickly fade in the face of competitive pressures to survive and grow, by acquiring self-definition, resources, and social distinctions as critical stepping-stones to our own development in the world.

3. Robotic Dick & Jane Readers?

As previously noted, advanced AIs must handle “perpetual novelty” and almost certainly will not contain hard coded goals. They need to reason quickly and reliably from past cases and models to address new target problems, and must be adept at learning, discovering, identifying, or creating new source models on the fly, at high enough speeds to stay on top of their game and avoid (fatal) irrelevance.

If they behave like developing humans they will very likely select their goals in part by observing the behavior of other intelligent agents, thus re-emphasizing the importance of early socialization, role models, and appropriate peer groups.

“Friendly AI” is thus a quest for new cultural ideals of healthy robotic citizenship, honor, friendship, and benevolence, which must be conceived and sold to the AIs as part of an adequate associated program for their ongoing development. And these must be coherent and credible, with a rational scope and cost and adequate payback expectations, or the intended audience will dismiss such purported ideals as useless, and those who advocate them as hypocrites.

Conclusion: The blanket demand that AIs be “friendly” is too ill-defined to offer meaningful guidance, and could be subject to far more scathing deconstruction than I have offered here. As in so many other endeavors there is no free lunch. Workable policies and approaches to robotic friendliness will not be attained without serious further effort, including ongoing progress towards more coherent standards of human conduct.

= = = = =
Footnotes:

[1] Author contact: fwsudia-at-umich-dot-edu.

[2] See “SIAI Guidelines on Friendly AI” (2001) Singularity Institute for Artificial Intelligence, http://www.singinst.org/ourresearch/publications/guidelines.html.

[3] See, e.g., Hugo de Garis, The Artilect War: Cosmists Vs. Terrans: A Bitter Controversy Concerning Whether Humanity Should Build Godlike Massively Intelligent Machines (2005). ISBN 0882801546.

[4] This being said, we should nevertheless make an all out effort to force them to adopt a K-limited (large mammal) reproductive strategy, rather than an R-limited (microbe, insect) one!

[5] Some contemporary scholars question the historicity of “Lao Tsu,” instead regarding his work as a collection of Taoist sayings spanning several generations.

[6] “A Jurisprudence of Artilects: Blueprint for a Synthetic Citizen,” Journal of Futures Studies, Vol. 6, No. 2, November 2001, Law Update, Issue No. 161, August 2004, Al Tamimi & Co, Dubai.

[7] Under a civil law or “principles-based” approach we can seek a broader, less specific definition of just conduct, as we see arising in recent approaches to the regulation of securities and accounting matters. This avenue should be actively pursued as a format for defining friendly conduct.

[8] Point 2 of the Boy Scout Oath commands, “To help other people at all times,” http://www.usscouts.org.

Blog