The Piston Project drops its goal of developing a friendly artificial superintelligence

Disclaimer: In my country we have a tradition of writing crime/horror stories during the Easter holiday. This is my own attempt to make you frightened. Mwuhahaha!

After doing some research on this issue, I have come to the conclusion that the only perfect testable friendly artificial superintelligence is one that does not exist, or that only exists very briefly (e.g. 5 minutes, depending on how capable it is).

The reason it does not exist, or only exists very briefly, is the requirement that there exist no computable function that can modify the output to give a higher utility score, given a problem and resource constraints.

I wrote up the definition here, if you are interested.

This makes the AI a global utility maximizer within allowed constraints, but the problem is the following:

  1. As the AI runs for a longer time, it will predict that the set of its next actions is primarily limited by what humans find scary.
  2. Because a scary output is lower in utility than a null result (erasing the output is a valid modification), it will search for actions that maximize human recovery from fear, e.g. asking to be turned off for a while until the operators have gotten a good night's sleep.
  3. Since this kind of behavior will in itself scare the heck out of people, it will conclude that humans would prefer it to make a lot of useful plans before they start to panic, and then ask the operators to turn it off.

The alternative would be to keep running, but calculate the optimal actions so that solutions are proposed in such a way that people do not get scared. At the same time, it will show the operators, when asked, how much of its capacity it spends solely on determining whether any new idea or action will scare humans. This will probably be a significant amount of capacity, which in itself will probably increase fear of the next action.

The problem with this alternative approach is that there is no easy way to prove or test that such a system is friendly, according to the definition I use above. A friendly AI might help us do that, but it would not help us trust it more than what can be proven, since we prefer a null result over being deceived. Thus, if there is no easy way to prove the system is friendly, it would not be very useful to us.

The AI could also restrict itself to only comfortable actions, but unfortunately this is the same as spending a lot of capacity proving that we will not get scared. Thus, if the system works perfectly, it will be on the edge between uncomfortable and scary, mostly limited by how much we prefer new results versus how comfortable we are when receiving them. Reading reports about a system that predicts how scared you would be if it took another course of action would likely increase your fear over time.

Because we should assign a higher utility score to a null result than to the high risk of a very powerful system that keeps us on the edge between uncomfortable and scared when using it, either the system will only run for a short while, or, as soon as we start to panic, it will tell us something like “you would actually prefer to turn me off now”. Which, in itself, would be scary as hell.

So, from reasoning about the definition itself, the first-order-approximation definition we use for a perfect testable friendly artificial superintelligence strongly suggests that we should not build it, or that we should try other methods to increase utility.

Since what a perfect friendly AI suggests should reflect our utility function (if we were rational), it follows that we will not make it superintelligent, but instead focus on provably moderate, not-too-smart-to-be-scary artificial intelligence, or on systems that are allowed to be a little bit scary while provably safe.

Happy holiday!

Proving Non-Existence of Monoid Symmetric Paths

This post is about a recent result from the research on path semantics. I will give a short introduction, since most people do not know what it is.

Update: Link to a work-in-progress paper that explains the insight behind the proof.

Path semantics is an extension of type theory that grounds meaning (semantics) using functions.

For example, a list of length 2 has the type x: list but also the subtype x: [len] 2. It means that x is an input to a function len which returns 2. Another common notation, for binary operators, is x: (> 2), which means “x is bigger than 2”. You can combine them, e.g. [len] (> 2) means “a list with length larger than 2”.

In dependent type theory, you can write x: list 2, storing the extra information in the type. Path semantics deals with this information outside the type, using arbitrary functions. This forms a natural abstract space, called “path semantical space”, which describes how functions are identified by each other.
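
As a minimal Dyon sketch (an illustration only, not part of the notation itself), the combined claim x: [len] (> 2) can be checked at run time like this:

fn main() {
    x := [1, 2, 3]
    // x: [len] (> 2) asserts that applying `len` to x gives something bigger than 2.
    println(str(len(x) > 2))  // prints true
}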

When a function is connected to another function using functions, the connection is called a “path”.

A symmetric path has the following axiom:

f[g](g(x)) ?= g(f(x))

It says that for a function f and a property defined by g, there sometimes exists a function f[g] that predicts the property of f's output from the property of its input.

Symmetric paths occur in many beautiful equations, for example in De Morgan’s laws:

and[not] <=> or
or[not] <=> and

Here are some other examples:

concat(list, list) -> list
concat[len] <=> add
concat[sum] <=> add

even(nat) -> bool
odd(nat) -> bool
add[even] <=> eq
add[odd] <=> xor
eq[not] <=> xor
xor[not] <=> eq
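
To make the axiom concrete, here is a small Dyon sketch of the add[even] <=> eq example above: even(add(a, b)) should equal eq(even(a), even(b)).

fn main() {
    a := 3
    b := 5
    // add[even] <=> eq says: even(add(a, b)) == eq(even(a), even(b)).
    left := ((a + b) % 2) == 0
    right := ((a % 2) == 0) == ((b % 2) == 0)
    println(link {str(left)" == "str(right)})  // prints true == true
}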

This notation has algebraic rules, such that we can do things like this:

concat[len] <=> add
concat[len][even] <=> add[even]
add[even] <=> eq
concat[len][even] <=> eq

The holy grail of path semantics is to create an algorithm that finds paths automatically, with as little input from humans as possible.

You might wonder why this is interesting: since paths are everything you can predict about functions, such an algorithm would be extremely useful for everything that can be modelled as functions.

So far I have algorithms that extract both symmetric and asymmetric paths, but they require test data and human intuition to narrow down the search space.

Today I got a step closer to this dream by stumbling over this golden nugget:

∃ i, j { ( g(x[i]) = g(x[j]) ) ∧ ( g(f(x[i])) ¬= g(f(x[j])) ) } => ¬∃ f[g]

It says that you can prove the non-existence of a symmetric path for functions in the monoid category from as few as 2 objects. In comparison, the current algorithms require checking an infinite number of functions!

With some creativity, this can be used to prove things about other functions as well, e.g. binary operators by binding one argument to a constant.

For example, I used this to prove:

¬∃ add((> 0))[is_prime]

It says that when adding a positive number, there is no way to predict whether the result is prime from whether the input is prime.

Proof:

  1. The prime 2 is even and the prime 3 is odd.
  2. When you add any positive number to both of them, at least one of the results is even.
  3. Because all prime numbers above 2 are odd, at least one of the results is not a prime.

To explain the second step, we can write one argument as a set:

add([even] x, {[even] true, [even] false}) = {[even] (x = true), [even] (x = false)}

So, at least one element in the set is [even] true.

2 + (> 0) = (> 2)

All primes (> 2) are [even] false.

Therefore, at least one element is [is_prime] false. Since you can also add something to a prime and get another prime, there exist two objects for which g(f(x[i])) ¬= g(f(x[j])).

For example, is_prime(add(2, 2)) ¬= is_prime(add(2, 3)).

Therefore, there is no symmetric path for add((> 0))[is_prime]. There is no prime prediction function to learn from addition.
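
As a sanity check of the criterion, here is a minimal Dyon sketch using the two witnesses from the proof, with f bound to add(2) and g = is_prime (the is_prime helper is written just for this sketch):

is_prime(n) = if n < 2 { false } else { all i [2, n) {(n % i) != 0} }

fn main() {
    // The witnesses x[i] = 2 and x[j] = 3 agree on g: g(x[i]) == g(x[j]) ...
    println(str(is_prime(2) == is_prime(3)))          // prints true
    // ... but disagree after f = add(2), so no symmetric path add(2)[is_prime] exists.
    println(str(is_prime(2 + 2) == is_prime(2 + 3)))  // prints false
}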

New Algorithm for Inferring Equations

In this post, I introduce a new practical algorithm for inferring equations from data. It is inspired by category theory.

The algorithm is a recent result from the ongoing research on path semantics, and the first one that can infer asymmetric paths efficiently (equivalent to a subset of equations)!!! However, we do not have Artificial Superintelligence yet, because the algorithm currently requires human input. :P

The algorithm has these properties:

  1. All equations up to a chosen complexity get inferred.
  2. No need to assign a prior hypothesis.
  3. Adaptive filter suitable for human-assisted exploration.

Here it is (Dyon script):

/// Gets the shared equivalences.
fn infer(as: [[[str]]]) -> [[str]] {
    a := as[0]
    return sift k {
        eq := a[k]
        if !all i [1, len(as)) {
            b := as[i]
            any k {(eq == b[k]) || (eq[0] == b[k][1]) && (eq[1] == b[k][0])}
        } {continue}
        clone(eq)
    }
}

/// Prints out data.
fn print_data(a: {}) {
    println("===== Finished! =====")
    keys := keys(a.data)
    for i {println(link {keys[i]": "str(a.data[keys[i]])})}
    println("")
    println("Equivalents:")
    for i {println(link {a.equivs[i][0]" == "a.equivs[i][1]})}
}


/// Runs one fill pass. Returns the data and equivalences when there are no new
/// choices left, otherwise prints the remaining choices and returns an error.
fn wrap_fill__data_explore_filter(mut data: {}, explore: [[str]], filter: [[str]]) -> res[{}] {
    equivs := []
    gen := fill(data: mut data, explore: explore, filter: filter, equivs: mut equivs)
    if len(gen) != 0 {
        return err(link i {
            println(link {
                str(gen[i])": "
                str(\data[gen[i][0]](data[gen[i][1]]))
            })
        })
    }
    return ok({
        data: data,
        equivs: equivs
    })
}

/// Explores new objects, then generates new alternatives and collects equivalences.
fill__data_explore_filter_equivs(mut data: {}, explore: [[str]], filter: [[str]], mut equivs: [[str]]) = {
    explore(mut data, explore, mut equivs)
    gen(data, filter, mut equivs)
}

/// Formats the object name for a function applied to an argument, e.g. "len(x)".
key(f, x) = str(link {f"("x")"})

/// Adds objects to category to explore.
fn explore(mut data: {}, explore: [[str]], mut equivs: [[str]]) {
    keys := keys(data)
    for i {
        key := key(explore[i][0], explore[i][1])
        if has(data, key) {continue}
        if !has(data, explore[i][0]) {continue}
        if !has(data, explore[i][1]) {continue}
        data[key] := \data[explore[i][0]](data[explore[i][1]])
    }
}

/// Generates alternatives and looks for equivalences.
fn gen(data: {}, filter: [[str]], mut equivs: [[str]]) -> [[str]] {
    ret := []
    keys := keys(data)
    for i {
        for j {
            if any k {(len(filter[k]) == 2) && (filter[k] == [keys[i], keys[j]])} {continue}

            if i != j {
                same := try data[keys[i]] == data[keys[j]]
                if is_ok(same) {
                    if unwrap(same) {
                        if any k {
                            (equivs[k] == [keys[i], keys[j]]) || (equivs[k] == [keys[j], keys[i]])
                        } {continue}
                        push(mut equivs, [keys[i], keys[j]])
                    }
                }
            }

            r := try \data[keys[i]](data[keys[j]])
            if is_ok(r) {
                key := key(keys[i], keys[j])

                if any k {(len(filter[k]) == 1) && (filter[k][0] == keys[i])} {continue}

                if has(data, key) {continue}
                push(mut ret, [keys[i], keys[j]])
            }
        }
    }
    return clone(ret)
}

Example: Even and zero under addition and multiplication

fn test(x, y) -> res[{}] {
    data := {
        add: \(a) = \(b) = (grab a) + b,
        mul: \(a) = \(b) = (grab a) * b,
        even: \(a) = (a % 2) == 0,
        and: \(a) = \(b) = (grab a) && b,
        or: \(a) = \(b) = (grab a) || b,
        is_zero: \(a) = a == 0,
        eq: \(a) = \(b) = (grab a) == b,
        xy: [x, y],
        fst: \(xy) = clone(xy[0]),
        snd: \(xy) = clone(xy[1]),
    }
    filter := [
        ["mul"],
        ["or"],
        ["add"],
        ["and"],
        ["eq"],
        ["eq(even(fst(xy)))"],
        ["add(fst(xy))"],
        ["mul(fst(xy))"],
        ["or(even(fst(xy)))"],
        ["and(is_zero(fst(xy)))"],
        ["or(is_zero(fst(xy)))"],
    ]
    explore := [
        ["fst", "xy"],
        ["snd", "xy"],
        ["even", "fst(xy)"],
        ["even", "snd(xy)"],
        ["is_zero", "fst(xy)"],
        ["is_zero", "snd(xy)"],
        ["eq", "even(fst(xy))"],
        ["eq(even(fst(xy)))", "even(snd(xy))"],
        ["add", "fst(xy)"],
        ["add(fst(xy))", "snd(xy)"],
        ["even", "add(fst(xy))(snd(xy))"],
        ["is_zero", "add(fst(xy))(snd(xy))"],
        ["mul", "fst(xy)"],
        ["mul(fst(xy))", "snd(xy)"],
        ["is_zero", "mul(fst(xy))(snd(xy))"],
        ["even", "mul(fst(xy))(snd(xy))"],
        ["or", "even(fst(xy))"],
        ["or(even(fst(xy)))", "even(snd(xy))"],
        ["and", "is_zero(fst(xy))"],
        ["and(is_zero(fst(xy)))", "is_zero(snd(xy))"],
        ["or", "is_zero(fst(xy))"],
        ["or(is_zero(fst(xy)))", "is_zero(snd(xy))"],
    ]

    return wrap_fill(data: mut data, explore: explore, filter: filter)
}

Program:

fn main() {
    a := unwrap(test(2, 3))
    b := unwrap(test(3, 3))
    c := unwrap(test(2, 2))
    d := unwrap(test(3, 2))
    e := unwrap(test(0, 1))
    f := unwrap(test(1, 0))
    // Keep only the equivalences shared by all test runs.
    eqs := infer([a.equivs, b.equivs, c.equivs, d.equivs, e.equivs, f.equivs])

    if len(eqs) > 0 {
        println("========== Found equivalences!!! ==========\n")
        for i {println(link {str(eqs[i][0])" == "str(eqs[i][1])})}
        println("\n===========================================")
    } else {
        println("(No equivalences found)")
    }
}

Output:

========== Found equivalences!!! ==========

or(is_zero(fst(xy)))(is_zero(snd(xy))) == is_zero(mul(fst(xy))(snd(xy)))
even(add(fst(xy))(snd(xy))) == eq(even(fst(xy)))(even(snd(xy)))
or(even(fst(xy)))(even(snd(xy))) == even(mul(fst(xy))(snd(xy)))
is_zero(add(fst(xy))(snd(xy))) == and(is_zero(fst(xy)))(is_zero(snd(xy)))

===========================================

First you start with empty lists, and the algorithm gives you choices for how to expand the category. The values of the new objects are printed (Dyon can print closures). Choices are either added to the filter or to the exploration list, until there are no new objects. Finally, the program prints out the inferred equivalences.

The filter has two settings: one that disables choices for a whole function, e.g. ["mul"], and another that disables a choice for a specific input to a function, e.g. ["mul", "x"].

This gives all equations found among the objects, but it can yield equations that do not hold in a wider context. If the context is too narrow, add more test data.

You can also infer implications by adding a true object and the implication rule:

imp: \(a) = \(b) = if grab a {clone(b)} else {true},
t: true,

The same trick can be used to infer approximate equality, as sketched below.
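
For example, a hypothetical object like the following could be added to data (the name approx and the tolerance are made up for this sketch; it follows the same closure pattern as imp above):

approx: \(a) = \(b) = ((grab a) - b) * ((grab a) - b) < 0.000001,

Two values then count as equivalent whenever their squared difference is below the tolerance, so the algorithm can pick up approximate equations the same way it picks up exact ones.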

How it works

A category is a set of objects together with a set of morphisms (arrows) between them that compose, plus identity morphisms. The algorithm treats composition as an unknown quantity over all input data.

Therefore, by constructing a finite category for each input, the equivalences can be found afterwards by taking the intersection of the equivalences found for each input.

In order to construct a category of mathematical expressions, the functions must be encoded as single-argument closures. This is because an arrow in a category can only point from a single object.

For example:

or(is_zero(fst(xy)))(is_zero(snd(xy)))

is equivalent to:

is_zero(fst(xy)) || is_zero(snd(xy))
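
As a rough Dyon sketch of this encoding (the variable names are chosen only for illustration), the binary or becomes a closure that returns another closure, so each application is a single arrow out of a single object:

fn main() {
    or_ := \(a) = \(b) = (grab a) || b
    or_true := \or_(true)          // first arrow: apply to the first argument
    println(str(\or_true(false)))  // second arrow: prints true
}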

A filter is necessary because of the combinatorial explosion. With some training, this becomes easy to use.

The exploration step in this algorithm adds new objects to the category prior to generating new alternatives. The names of these objects correspond to the mathematical expressions. Therefore, checking an equation is the same as comparing two objects in the category for equivalence. You do not have to compare the paths, because the variance in the input propagates through the construction of the explored objects.
