From Superintelligence
Oracles (177-180)
question-answering machine
variations: domain-limited oracles (e.g. mathematics), output-restricted oracles (e.g. only yes/no answers or probabilities), oracles that refuse to answer if the answer is predicted to have disastrous consequences
evaluation
- boxing methods fully applicable
- domesticity fully applicable
- reduced need to understand human intention
- untrustworthy oracles might provide answers that are hard to find but easy to verify by controllers
- multiple oracles might offer answer verification
- source of great power, might offer decisive strategic advantage to operator
- limited protection against foolish use by operator
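The two verification ideas above can be illustrated with a minimal sketch (hypothetical function names, not from the book): an answer that is hard to find but cheap to check (a claimed factorization), and cross-examination of several independent oracles.

```python
# Sketch of "hard to find, easy to verify": factoring a large number is
# computationally hard, but checking a claimed factorization from an
# untrusted oracle is trivial for the controllers.

def verify_factorization(n, claimed_factors):
    """Accept the oracle's answer only if the factors are nontrivial
    and actually multiply back to n."""
    if any(f <= 1 for f in claimed_factors):
        return False
    product = 1
    for f in claimed_factors:
        product *= f
    return product == n

# Sketch of multiple-oracle verification: accept an answer only when
# independent oracles agree; disagreement flags an untrustworthy answer.
def cross_check(answers):
    return answers[0] if all(a == answers[0] for a in answers) else None
```

For example, `verify_factorization(15, [3, 5])` passes while the trivial "answer" `[1, 15]` is rejected, and `cross_check` returns `None` as soon as any oracle dissents.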
Genie (181-184)
command execution system; executes a high-level command, then pauses until the next command is issued
variations: genies that follow intention rather than literal meaning, domain-limited genies, genies with a preview, genies that refuse to execute a command if the outcome is predicted to have disastrous consequences
evaluation
- boxing methods only partially apply
- domesticity partially applies
- can offer a preview of possible expected outcomes
- changes can be broken into stages, with review after each stage
- compared to oracles: greater need to understand human intentions
- source of great power, might offer decisive strategic advantage to operator
- limited protection against foolish use by operator
Sovereign (181-183)
system for open-ended autonomous operation; can pursue long-range objectives, many different motivation systems, possibility of using preview and sponsor ratification
evaluation
- boxing methods inapplicable
- capability control methods inapplicable (except social integration)
- domesticity inapplicable
- great need for the AI to understand human intention
- great necessity of getting it right on the first try
- source of great power, might offer decisive strategic advantage to operator
- once activated not vulnerable to hijacking by human operator, might provide precautions against foolish use
- can be used to implement veil of ignorance outcomes
Tool AIs (184-187)
system designed to work as a tool, not an agent; no goal-directed behaviour
evaluation
- boxing methods might be applicable
- powerful internal search and planning processes would be required, which might give rise to agent-like behaviour as an unintended consequence, with no safety precautions designed in –> better to design a safe agent in the first place