From: Superintelligence
Indirect normativity offers a way to offload the question of which values to give an agent onto the superintelligence itself, while still anchoring it in deep human values.
For now an appropriate moral norm could be enough, e.g. the "common good principle": superintelligence should be developed only for the benefit of all of humanity and in the service of widely shared ethical ideals; could include a "windfall clause" for wealth distribution.
Indirect normativity (pp. 256-259): Instead of directly determining ourselves which values an AI is to promote, humans specify a criterion or method that the AI can follow, using its own intellectual resources, to discover the concrete content of an implicitly defined normative standard. "Since the superintelligence is better at cognitive work than humans, it might see past the errors and confusion that cloud our thinking."
On human values and their change over time: "When we look back, we see glaring deficiencies not just in the behaviour but in the moral beliefs of all previous ages. Though we have perhaps since gleaned some moral insight, we could hardly claim to be now basking in the high noon of perfect moral enlightenment. Very likely we are still labouring under one or more grave moral misconceptions."
A seed AI might be given the final goal of continuously acting according to its best prediction of what the implicitly defined standard would have it do.
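A minimal sketch (my own toy illustration, not from the book) of that final-goal structure: the goal stays fixed as "do what the implicitly defined standard would have you do", and only the agent's estimate of that standard is revised; the class, interface, and placeholder estimate are hypothetical.

```python
# Toy sketch of indirect normativity (illustrative only, not Bostrom's formalism).
# The final goal stays fixed ("do what the implicitly defined standard would have
# you do"); only the agent's estimate of that standard's verdicts is revised.
from typing import Callable, List


class IndirectlyNormativeAgent:
    def __init__(self, estimate_standard: Callable[[str], float]):
        # Hypothetical interface: maps a candidate action to the agent's current
        # best prediction of how well it satisfies the implicitly defined standard.
        self.estimate_standard = estimate_standard

    def choose_action(self, candidate_actions: List[str]) -> str:
        # Continuously act according to the best current prediction of what the
        # standard would have the agent do.
        return max(candidate_actions, key=self.estimate_standard)

    def revise_estimate(self, new_estimate: Callable[[str], float]) -> None:
        # As the agent's cognitive resources grow, it refines the estimate --
        # the final goal itself is never edited.
        self.estimate_standard = new_estimate


# Placeholder estimate that happens to favour actions labelled "reversible".
agent = IndirectlyNormativeAgent(lambda a: 1.0 if "reversible" in a else 0.0)
print(agent.choose_action(["seize all resources", "cautious reversible step"]))
```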
Coherent extrapolated volition (CEV), from Eliezer Yudkowsky: the seed AI is given the final goal of carrying out humanity's CEV. CEV: our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted. –> compare ideal observer theories in ethics, which analyze normative terms like "good" and "right" –> only "take action" where the wishes of all stakeholders cohere; does not have to produce common ground, only find it where it exists –> further explanation on p. 260; Yudkowsky's seven arguments for CEV:
Many open parameters remain, e.g. whose volitions are to be included and how to deal with "marginal persons" (p. 265).
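A toy reading (not Yudkowsky's actual specification) of "act only where wishes cohere": each stakeholder's volition is extrapolated, and the AI returns a verdict only if the extrapolated wishes agree on the proposal; the extrapolation function here is just a placeholder.

```python
# Toy reading of CEV's "act only where extrapolated wishes cohere" (illustrative
# only, not Yudkowsky's specification).
from typing import Dict, List, Optional


def extrapolate(raw_preferences: Dict[str, bool]) -> Dict[str, bool]:
    # Stand-in for "what this person would want if they knew more, thought
    # faster, were more the person they wished they were" -- here just identity.
    return dict(raw_preferences)


def cev_decision(stakeholders: List[Dict[str, bool]], proposal: str) -> Optional[bool]:
    """Return a verdict only if all extrapolated volitions cohere on the proposal."""
    verdicts = set()
    for prefs in stakeholders:
        extrapolated = extrapolate(prefs)
        if proposal in extrapolated:
            verdicts.add(extrapolated[proposal])
    if len(verdicts) == 1:
        return verdicts.pop()   # wishes cohere: act on the shared verdict
    return None                 # wishes diverge (or are silent): take no action


people = [
    {"cure_disease_x": True, "ban_music": False},
    {"cure_disease_x": True, "ban_music": True},
]
print(cev_decision(people, "cure_disease_x"))  # True  -> coherent, act
print(cev_decision(people, "ban_music"))       # None  -> no coherence, abstain
```

Note the design point from above: the function does not manufacture common ground, it only reports agreement where it already exists.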
Morality models
"Do what i mean" –> due to possible misunderstandings in our own assertion unintended consequences can not be prohibited –> a more granular approach by clearing up the term "do what i mean" into revealed preferences in various hypothetical situations under the "what if..." model which ultimately leads back to the indirect normative cev approach
Other:
Almost all the things we humans value (love, happiness, even survival) are important to us because we have a particular evolutionary history – a history we share with higher animals, but not with computer programs such as artificial intelligences.
http://theconversation.com/artificial-intelligence-can-we-keep-it-in-the-box-8541
––––––––
"So what we probably want is not a direct specification of values, but rather some algorithm for what's called indirect normativity. Rather than programming the AI with some list of ultimate values we're currently fond of, we instead program the AI with some process for learning what ultimate values it should have, before it starts reshaping the world according to those values."
https://io9.gizmodo.com/can-we-build-an-artificial-superintelligence-that-wont-1501869007
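A toy contrast to the earlier sketch: the quote's ordering ("learn what ultimate values it should have, before it starts reshaping the world") as an explicit two-phase loop; the convergence test and callbacks are assumptions, not anything from the source.

```python
# Toy "learn values first, then act" ordering from the quote above (illustrative
# only; the convergence test and callbacks are assumptions).
from typing import Callable


def run_indirectly_normative_ai(value_learning_step: Callable[[], float],
                                act_on_learned_values: Callable[[], None],
                                convergence_threshold: float = 0.01,
                                max_steps: int = 1000) -> None:
    # Phase 1: run the value-learning process; each step returns how much the
    # current value estimate changed (a stand-in for a real stopping criterion).
    for _ in range(max_steps):
        if value_learning_step() < convergence_threshold:
            break
    # Phase 2: only after learning does the AI start reshaping the world
    # according to the values it has learned.
    act_on_learned_values()
```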
–––––––––
–––––––
3 principles for creating safer AI | Stuart Russell
–––––––
Sample Values from the 23 ASILOMAR AI PRINCIPLES:
12) Personal Privacy: People should have the right to access, manage and control the data they generate, given AI systems’ power to analyze and utilize that data.
13) Liberty and Privacy: The application of AI to personal data must not unreasonably curtail people’s real or perceived liberty.
14) Shared Benefit: AI technologies should benefit and empower as many people as possible.
15) Shared Prosperity: The economic prosperity created by AI should be shared broadly, to benefit all of humanity.
16) Human Control: Humans should choose how and whether to delegate decisions to AI systems, to accomplish human-chosen objectives.
17) Non-subversion: The power conferred by control of highly advanced AI systems should respect and improve, rather than subvert, the social and civic processes on which the health of society depends.
––––––
http://intelligence.org/files/IE-ME.pdf
Value extrapolation theories have some advantages when seeking a machine ethics suitable for a machine superoptimizer:
––––––
https://nickbostrom.com/ethics/artificial-intelligence.pdf
To build an AI that acts safely while acting in many domains, with many consequences, including problems the engineers never explicitly envisioned, one must specify good behavior in such terms as “X such that the consequence of X is not harmful to humans”
Thus the discipline of AI ethics, especially as applied to AGI, is likely to differ fundamentally from the ethical discipline of noncognitive technologies, in that: