Existential Paths

In this post I will report a big breakthrough in the ongoing research on path semantics.

Link to paper: Existential Paths (work in progress)

Since people asked me what the heck path semantics is, I made an Illustrated History of Path Semantics with Stick Figures.

I also made an introduction to the notation in the last blog post about non-existence of monoid symmetric paths. Since then I have figured out how to expand the technique to asymmetric paths.

Path semantics is the idea that some functions have secrets, and those secrets can be expressed as functions, which in turn can have other secrets that can also be expressed as functions, and so on.

This is very different from how most computers treat functions, because they only use them for computation.

For example, if you define a function:

add(a, b) = a + b

In most programming languages, the computer only sees this function as a computational procedure. You give it some values as arguments and you get a value out.

However, in path semantics, you can write things like this:

add[even] <=> eq

This means that add secretly hides the behavior of eq, which is unlocked using the even function.

When you have functions connected to other functions by functions, it gets confusing, so you say add has a path to eq by even.
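
To make this concrete, here is a minimal Rust sketch of my own (not from the paper), assuming the usual reading of the notation: a path f[g] <=> h asserts that g(f(a, b)) == h(g(a), g(b)).

fn add(a: u32, b: u32) -> u32 { a + b }
fn even(x: u32) -> bool { x % 2 == 0 }
fn eq(a: bool, b: bool) -> bool { a == b }

fn main() {
    // `add[even] <=> eq` asserts: even(add(a, b)) == eq(even(a), even(b)).
    for a in 0..100u32 {
        for b in 0..100 {
            assert_eq!(even(add(a, b)), eq(even(a), even(b)));
        }
    }
    println!("add[even] <=> eq holds on the tested range");
}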

A path is a function in some mysterious space called “path semantical space” where all functions are connected. When you explore and learn about this space, you can “backproject” paths as patterns to tell something about a large number of functions.

It also happens that these paths are predictors, so you can use them in machine learning.

Path semantics is actually closely related to type theory, because you can use the notation to define sub types:

fn concat(a: [], b: []) -> [] { ... }
fn len(a: []) -> usize { ... }

// When concatenating two lists, you can think of it as sub types `[len] n` and `[len] m`
// returning a sub type `[len] n + m`.
concat(a: [len] n, b: [len] m) = [len] n + m

// Short form.
concat[len] <=> add
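
The short form asserts that len(concat(a, b)) == add(len(a), len(b)). Here is a minimal Rust sketch of my own (not from the paper) that checks this law on one example:

fn concat(a: Vec<u8>, b: Vec<u8>) -> Vec<u8> {
    let mut c = a;
    c.extend(b);
    c
}

fn main() {
    let (a, b) = (vec![1u8, 2], vec![3, 4, 5]);
    let (n, m) = (a.len(), b.len());
    // `concat[len] <=> add` asserts: len(concat(a, b)) == add(len(a), len(b)).
    assert_eq!(concat(a, b).len(), n + m);
}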

Creating a type checker for path semantics turned out to be a seemingly impossible problem…

…until today!

The new breakthrough is the discovery that there are other kinds of paths out there, and one particular kind called “existential path” is closely connected with normal paths.

In fact, it is so closely connected, that it might be possible to make a type checker using them some day.

Simply put, if you have e.g. a: [len] 0 then there exists a function len' such that:

a: [len] 0

0: [len'] true

This checks the sub type for all a. In this case, we know that there are lists of all non-negative lengths:

len'(x) = x >= 0

// Alternative notation.
0: (>= 0)

Since we have a concrete value 0, we can just evaluate len'(0) and check if it returns true.
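
Here is a minimal sketch of that check in Rust (my own function name; assuming lengths are modeled as signed integers so that len' is just non-negativity):

// The existential path len' answers: "is this value a possible output of len?"
// In other words, it is the post-condition of len.
fn len_exists(x: i64) -> bool { x >= 0 }

fn main() {
    // Checking the sub type `a: [len] 0` reduces to evaluating len'(0).
    assert!(len_exists(0));   // 0: [len'] true
    assert!(!len_exists(-1)); // no list has length -1
}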

Actually, this f' function corresponds to the post-condition of f, and a lot of smart people have already worked on this. This means we can reuse existing tools for theorem proving!

You might notice that there should also be a function len'' such that:

0: [len'] true

true: [len''] true

This is correct! It could go on forever, right? Luckily this leads to a repeating sequence:

// There are only two options for all functions.
f'' <=> {id?, true_1?}

id' <=> true_1
true_1' <=> id

So, after 2 steps, we can just hard code into the type checker everything there is to know about these functions.
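
Here is a sketch of that repeating sequence over booleans (my own encoding, where true_1 is the constant true function of one argument):

fn id(x: bool) -> bool { x }
fn true_1(_x: bool) -> bool { true }

// id reaches every value, so its existential path is constantly true: id' <=> true_1.
fn id_exists(_x: bool) -> bool { true }
// true_1 only ever returns `true`, so true_1'(x) = (x == true) = id(x): true_1' <=> id.
fn true_1_exists(x: bool) -> bool { x }

fn main() {
    for x in [false, true] {
        assert_eq!(id_exists(x), true_1(x));
        assert_eq!(true_1_exists(x), id(x));
    }
}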

This is a very recent breakthrough, but I am already thinking about how to set up the inference rules.

For example, if you type this:

a: [len] 0 & [len] (< 2)

Since these sub types are using the same function len, and 0 is a concrete value, one can reduce it to this:

0: (< 2)
0: [len'] true
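
In other words, the type checker would, as a sketch, just evaluate both conditions on the concrete value:

fn main() {
    let n: i64 = 0;  // the concrete value from `a: [len] 0`
    assert!(n < 2);  // 0: (< 2)
    assert!(n >= 0); // 0: [len'] true
}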

Slot Lambda Calculus

In this post I will describe a useful technique for formal study of language semantics.

Link to paper (work in progress)

Usually when somebody mentions the words “language semantics”, people run and hide under their beds, hoping that the person uttering these words will not find them there any time soon.

Yet, I want to show that given the right tools, semantics does not need to be that hard to wrap your head around.

After reading this post, you should be able to claim something as powerful as:

There are certain rules that are valid for meaning in all languages, and I know what they are.

One reason semantics seems hard is that we think of the world as a place of infinite possibilities, and since language is used to describe what happens in the world, the meaning of language must seemingly consist of infinite possibilities as well.

Some of these infinite possibilities are:

  • When programming, people can type in a text editor, save to a file and use a compiler (complex)
  • When talking, people can use different sounds to change the meaning of words (complex)
  • When expressing themselves in art, people can shape an experience that points to something “elsewhere” (if this is not complex, I have no idea what complex is)

All these activities are related to the meaning of language, and this makes the world a complex place.

Slot lambda calculus is a tool that takes those “infinite possibilities” and makes them very, very simple.

Instead of an infinite number of ways of constructing expressions in a language, in slot lambda calculus there is only one way. For example, if you want to express 1 + 1 = 2:

    (_ = _)(_ + _)(1)(1)(2)

This gets turned into:

    ((1 + 1) = 2)
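
Step by step, each application fills the leftmost empty slot (the same behavior the Dyon code below implements):

    (_ = _)(_ + _)(1)(1)(2)
    ((_ + _) = _)(1)(1)(2)
    ((1 + _) = _)(1)(2)
    ((1 + 1) = _)(2)
    ((1 + 1) = 2)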

So, where did (_ = _) and (_ + _) come from? They were constructed the same way:

    (_ _)(?)(_ _)("=")(\)
    (_ _)(?)(_ _)("+")(\)

The ? symbol inserts a placeholder, and \ removes all ?s and turns all strings into syntax.

This gets turned into:

    (_ (= _))
    (_ (+ _))
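
Unfolding those applications one at a time (my own reconstruction from the rules above), the first construction plausibly goes through these forms:

    (_ _)(?)(_ _)("=")(\)
    (? _)(_ _)("=")(\)
    (? (_ _))("=")(\)
    (? ("=" _))(\)
    (_ (= _))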

When you have defined a pattern like this, you can write equations to make them nicer:

    (_ (= _)) = (_ = _)
    (_ (+ _)) = (_ + _)

Slot lambda calculus behaves similarly to normal lambda calculus, but uses a different application rule. Normal lambda calculus is even simpler, but it is more suitable for computation than for reading new patterns created from existing patterns. However, there is a way to hack lambda calculus so that you get slot lambda calculus! Mwuhahaha!

This is how you do it in Dyon (without syntax extensions):

fn pass(f: \(any) -> any, val: any) -> any {
    return if typeof(val) == "closure" {
        // `val` still has empty slots: return a closure that fills
        // them first, then passes the finished value on to `f`.
        \(x) = {
            val := grab val
            new_val := \val(x)
            f := grab f
            pass(f, new_val)
        }
    } else {
        // `val` is a plain value: use it to fill the next slot of `f`.
        \f(val)
    }
}

fn main() {
    add := \(x) = \(y) = (grab x) + y
    mul := \(x) = \(y) = (grab x) * y

    a := pass(add, mul) // (_ * _) + _
    b := pass(a, 2) // (2 * _) + _
    c := pass(b, 3) // (2 * 3) + _
    d := pass(c, 1) // (2 * 3) + 1
    println(d) // prints `7`
}

Notice that this is the exact same behavior as a stack machine, which is common in the runtimes of most programming languages. The use of single argument closures has some nice mathematical properties, e.g. for inferring equations from data.

Now, when you are looking at (_ + _), you might wonder: What does it mean? When designing meaning for a new language, you might not have an interpreter, or even know in which context the expression (_ + _) is valid.

How do you figure out what (_ + _) means when all you have is (_ + _)?

Again, back to slot lambda calculus:

  1. For an expression to be interpreted, it must be constructed in its context.
  2. Any construction is reducible to a calculation with slot lambda calculus.
  3. Meaning is the rules we use when constructing a particular expression.

These are the rules that are valid for all languages, despite infinite possibilities.

You can study the context by using the construction in slot lambda calculus!

For example:

    (_ = _)(_ + _) = (_ + _ = _)

This new construction has 3 empty slots.

You now have two choices:

  1. (_ + _) shall have a unique meaning for (_ + _ = _).
  2. (_ + _) shall have same meaning wherever it is used.

The 2nd option is called “context free”. It means whenever you see the pattern (_ + _) you can reuse the same rules.

Since you can construct any pattern you like, you can create one pattern for each meaning you assign to your language. When you have done this, there is a proven technique in slot lambda calculus that shows how you can be sure that all edge cases are checked, such that all expressions are either invalid or assigned to a rule.

A “meaning” in this sense corresponds to a set of rules that decides what you should do when encountering a specific pattern. The semantics of a pattern is only well defined when the rule is deterministic, such that the same action is repeated when the pattern occurs over and over.

OK, now let us have some fun: We want to create a tiny language for describing what happens in a romantic novel.

    (_ loves _)(<name>)(<name>)
    (_ hates _)(<name>)(<name>)
    (_ ignores _)(<name>)(<name>)
    (_ see _)(<name>)(_)
        (_ see _)(<name>)(<name>)
        (_ see _ loves _)(<name>)(<name>)(<name>)
        (_ see _ hates _)(<name>)(<name>)(<name>)
        (_ see _ ignores _)(<name>)(<name>)(<name>)
        (_ see _ see _ loves _)(<name>)(<name>)(<name>)(<name>)
        (_ see _ see _ hates _)(<name>)(<name>)(<name>)(<name>)
        (_ see _ see _ ignores _)(<name>)(<name>)(<name>)(<name>)

This language can describe relationships, e.g. Bob loves Alice, but it can also describe events that affect relationships, e.g. Bob see Alice loves Carl.

With this tiny language, we can write some “code”:

Bob loves Alice.
Alice loves Carl.
Bob see Alice loves Carl.
Bob hates Carl.
Carl see Bob hates Carl.
Carl ignores Bob.
Carl see Alice.
Carl loves Alice.
Bob see Alice see Carl loves Alice.
...

Can you visualize a story from something like this? I can, I wrote it down, and it was fun. :P (no, I will not share it)

The big idea of formal languages is that you can give your own thoughts some new structure. You don’t have to think “with” the structure like a computer, but think of the world inside it.

Just create a tiny language, write something random, and watch your thoughts become alive!

Slot lambda calculus is the latest technique I use to study path semantics. Path semantics grounds meaning using functions, in a powerful way that allows us to reason about mathematics and programs. In the attempt to formalize path semantics, I built a lot of other tools to make it easier for me to wrap my head around things. I am getting closer, slowly, but it goes forward!

This is why it has been so hard to study path semantics: There are hundreds of edge cases to check! If you have N patterns, then you get at least N^2 cases you need to give an interpretation, to see if it makes sense to use the pattern in that context. For example, with 6 patterns you get 36 cases, but increase to 12 and you get 144! I am not sure how many patterns are required in a practical and convenient notation. 20? 30?

The reason we know path semantics is hard to formalize is the simple properties of slot lambda calculus. Until recently it was a complete mystery to me how to check all the cases. With slot lambda calculus it is possible to create a map and measure progress.

The Piston Project drops its goal of developing a friendly artificial superintelligence

Disclaimer: In my country we have a tradition of writing crime/horror stories in the Easter holiday. This is my own attempt to make you frightened. Mwuhahaha!

After doing some research on this issue, I have come to the conclusion that the only perfect testable friendly artificial superintelligence is one that does not exist, or only exists very briefly (e.g. 5 min, depending on how capable it is).

The reason it does not exist, or only exists very briefly, is the requirement that there exists no computable function that can modify the output to give a higher utility score, given a problem and resource constraints.

I wrote up the definition here, if you are interested.

This makes the AI a global utility maximizer within allowed constraints, but the problem is the following:

  1. As the AI runs for a longer time, it will predict that the set of its next actions will primarily be limited by what humans find scary.
  2. Because a scary output is lower in utility than a none-result (erasing the output is a valid modification), it will search for actions that maximize humans' fear recovery, e.g. asking to be turned off for a while until the operators have gotten a good night's sleep.
  3. Since this kind of behavior in itself will scare the heck out of people, it will conclude that humans would prefer it to make a lot of useful plans before the humans start to panic, and then ask the operators to be turned off.

The alternative would be to keep running, but then calculate the optimal action to propose solutions in such a way that people do not get scared. At the same time, it will show the operators, when asked, how much of its capacity it spends solely on determining whether any new idea or action will make humans scared. This will probably be a significant amount of capacity, and that in itself will probably contribute to increased fear of the next action.

The problem with this alternative approach is that there is no easy way to prove or test that such a system is friendly, according to the definition I use above. A friendly AI might help us do that, but it would not help us trust it more than what can be proven, since we prefer a none-result over being deceived. Thus, if there is no easy way to prove the system is friendly, it would not be very useful to us.

The AI could also restrict itself to only actions that are comfortable, but unfortunately this is the same as spending a lot of capacity to prove that we will not get scared. Thus, if the system works perfectly, it will be on the edge between uncomfortable and scary, and mostly limited by how much we prefer any new results vs. how comfortable we would be when receiving them. Reading reports about a system that predicts how scared you would be if it took another course of action would likely increase your fear over time.

Because we should assign a higher utility score to a none-result than to the high risk of a very powerful system that keeps us on the edge between uncomfortable and scared when using it, either the system will only run for a short while, or, as soon as we start to panic, it will tell us something like “you would actually prefer to turn me off now”. Which, in itself, would be scary as hell.

So, from reasoning about the definition itself, the first order approximation definition we use for a perfect testable friendly artificial superintelligence strongly suggests that we should not build it, or that we should try other methods to increase utility.

Since what a perfect friendly AI suggests should reflect our utility function (if we were rational), it means we will not make it superintelligent, but instead focus on provably moderate-but-not-too-smart-to-be-scary artificial intelligence, or on AI that can be a little bit scary while provably safe.

Happy holiday!
