New Algorithm for Inferring Equations

In this post, I introduce a new practical algorithm for inferring equations from data. It is inspired by category theory.

The algorithm is a recent result from the ongoing research on path semantics, and the first one that can infer asymmetric paths efficiently (equivalent to a subset of equations)!!! However, we do not have Artificial Superintelligence yet, because the algorithm currently requires human input. :P

The algorithm has these properties:

  1. All equations up to a chosen complexity gets inferred.
  2. No need to assign a prior hypothesis.
  3. Adaptive filter suitable for human assisted exploration.

Here it is (Dyon script):

/// Gets the shared equivalences.
fn infer(as: [[[str]]]) -> [[str]] {
    a := as[0]
    return sift k {
        eq := a[k]
        if !all i [1, len(as)) {
            b := as[i]
            any k {(eq == b[k]) || (eq[0] == b[k][1]) && (eq[1] == b[k][0])}
        } {continue}

/// Prints out data.
fn print_data(a: {}) {
    println("===== Finished! =====")
    keys := keys(
    for i {println(link {keys[i]": "str([keys[i]])})}
    for i {println(link {a.equivs[i][0]" == "a.equivs[i][1]})}

fn wrap_fill__data_explore_filter(mut data: {}, explore: [[str]], filter: [[str]]) -> res[{}] {
    equivs := []
    gen := fill(data: mut data, explore: explore, filter: filter, equivs: mut equivs)
    if len(gen) != 0 {
        return err(link i {
            println(link {
                str(gen[i])": "
    return ok({
        data: data,
        equivs: equivs

fill__data_explore_filter_equivs(mut data: {}, explore: [[str]], filter: [[str]], mut equivs: [[str]]) = {
    explore(mut data, explore, mut equivs)
    gen(data, filter, mut equivs)

key(f, x) = str(link {f"("x")"})

/// Adds objects to category to explore.
fn explore(mut data: {}, explore: [[str]], mut equivs: [[str]]) {
    keys := keys(data)
    for i {
        key := key(explore[i][0], explore[i][1])
        if has(data, key) {continue}
        if !has(data, explore[i][0]) {continue}
        if !has(data, explore[i][1]) {continue}
        data[key] := \data[explore[i][0]](data[explore[i][1]])

/// Generates alternatives and looks for equivalences.
fn gen(data: {}, filter: [[str]], mut equivs: [[str]]) -> [[str]] {
    ret := []
    keys := keys(data)
    for i {
        for j {
            if any k {(len(filter[k]) == 2) && (filter[k] == [keys[i], keys[j]])} {continue}

            if i != j {
                same := try data[keys[i]] == data[keys[j]]
                if is_ok(same) {
                    if unwrap(same) {
                        if any k {
                            (equivs[k] == [keys[i], keys[j]]) || (equivs[k] == [keys[j], keys[i]])
                        } {continue}
                        push(mut equivs, [keys[i], keys[j]])

            r := try \data[keys[i]](data[keys[j]])
            if is_ok(r) {
                key := key(keys[i], keys[j])

                if any k {(len(filter[k]) == 1) && (filter[k][0] == keys[i])} {continue}

                if has(data, key) {continue}
                push(mut ret, [keys[i], keys[j]])
    return clone(ret)

Example: Even and zero under addition and multiplication

fn test(x, y) -> res[{}] {
    data := {
        add: \(a) = \(b) = (grab a) + b,
        mul: \(a) = \(b) = (grab a) * b,
        even: \(a) = (a % 2) == 0,
        and: \(a) = \(b) = (grab a) && b,
        or: \(a) = \(b) = (grab a) || b,
        is_zero: \(a) = a == 0,
        eq: \(a) = \(b) = (grab a) == b,
        xy: [x, y],
        fst: \(xy) = clone(xy[0]),
        snd: \(xy) = clone(xy[1]),
    filter := [
    explore := [
        ["fst", "xy"],
        ["snd", "xy"],
        ["even", "fst(xy)"],
        ["even", "snd(xy)"],
        ["is_zero", "fst(xy)"],
        ["is_zero", "snd(xy)"],
        ["eq", "even(fst(xy))"],
        ["eq(even(fst(xy)))", "even(snd(xy))"],
        ["add", "fst(xy)"],
        ["add(fst(xy))", "snd(xy)"],
        ["even", "add(fst(xy))(snd(xy))"],
        ["is_zero", "add(fst(xy))(snd(xy))"],
        ["mul", "fst(xy)"],
        ["mul(fst(xy))", "snd(xy)"],
        ["is_zero", "mul(fst(xy))(snd(xy))"],
        ["even", "mul(fst(xy))(snd(xy))"],
        ["or", "even(fst(xy))"],
        ["or(even(fst(xy)))", "even(snd(xy))"],
        ["and", "is_zero(fst(xy))"],
        ["and(is_zero(fst(xy)))", "is_zero(snd(xy))"],
        ["or", "is_zero(fst(xy))"],
        ["or(is_zero(fst(xy)))", "is_zero(snd(xy))"],

    return wrap_fill(data: mut data, explore: explore, filter: filter)


fn main() {
    a := unwrap(test(2, 3))
    b := unwrap(test(3, 3))
    c := unwrap(test(2, 2))
    d := unwrap(test(3, 2))
    e := unwrap(test(0, 1))
    f := unwrap(test(1, 0))
    c := infer([a.equivs, b.equivs, c.equivs, d.equivs, e.equivs, f.equivs])
    if len(c) > 0 {
        println("========== Found equivalences!!! ==========\n")
        for i {println(link {str(c[i][0])" == "str(c[i][1])})}
    } else {
        println("(No equivalences found)")


========== Found equivalences!!! ==========

or(is_zero(fst(xy)))(is_zero(snd(xy))) == is_zero(mul(fst(xy))(snd(xy)))
even(add(fst(xy))(snd(xy))) == eq(even(fst(xy)))(even(snd(xy)))
or(even(fst(xy)))(even(snd(xy))) == even(mul(fst(xy))(snd(xy)))
is_zero(add(fst(xy))(snd(xy))) == and(is_zero(fst(xy)))(is_zero(snd(xy)))


First you start with empty lists, and the algorithm give you choices of how to expand the category. The value of the new objects are printed (Dyon can print closures). Choices are either added to filter or for exploration, until there are no new objects. Finally, the program prints out the inferred equivalences.

The filter has two settings, one that disables choices for a whole function, e.g. ["mul"], and another that disables a choice for a specific input to a function, e.g. ["mul", "x"].

This gives all equations found among the objects, but can yield equations that does not hold in a wider context. If the context is too narrow, add more test data.

You can also infer implications by adding a true object and the implication rule:

imp: \(a) = \(b) = if grab a {clone(b)} else {true},
t: true,

The same trick can be used infer approximate equality.

How it works

A category is a set of objects which has a set of morphisms (arrows) between them that compose, plus identity. The algorithm treats composition as an unknown quantity over all input data.

Therefore, by constructing a finite category for each input, the equivalences can be found by taking the intersection of found equivalences for all inputs afterwards.

In order to construct a category of mathematical expressions, the functions must be encoded as single-argument closures. This is because an arrow can only point from one object in a category.

For example:


is equivalent to:

is_zero(fst(xy)) || is_zero(snd(xy))

A filter is necessary because of the combinatorial explosion. With some training, this becomes easy to use.

Exploration step in this algorithm adds new objects to the category prior to generating new alternatives. The name of these objects corresponds to the mathematical expressions. Therefore, an equation is the same as comparing two objects in the category for equivalence. You do not have to compare the paths, because the variance in input propagates with the construction of the explored objects.

The Mathematics of Infinite-Things-Space

I am currently doing some research on procedurally generated 3D by inflating structures.

This blog post is about how the mathematics work.

Visualizing Infinite-Things-Space

Vertex data in a 3D structure is often stored in an array.
Ignoring the position and other attributes of vertices, each vertex can be represented as a natural number which refers to the index in the array of vertices.

To connect two vertices, I use a math formula that takes an index and outputs a new one.
The nice thing about this method is that you can compactly represent an infinite number of structures.


When the function returns the same index, I visualize it as a loop for each natural number from 0 to infinity.

By adding by one, f(x) = x + 1, the loops are disconnect from themselves and attached to each other.

To create circles with 4 vertices, use the formula f(x) = x - x % 4 + (x+1) % 4.

Mathematical operations on the natural numbers are equivalent to operations on these loops.
For any mathematical expression of a single natural number, there is an equivalent shape in Infinite-Things-Space.

In group theory, these functions are called “generators”.

The Super Function

The Super function is a higher order function that takes a generator and produces a new generator. It preserves the structure of the input generator, but now “scaled up” with a factor of n.

super(n, f: N -> N) = \(x: N) = f(x / n) * n + x % n

The Super function is the key to combine multiple generators.


f_0(x) = x - x % 4 + (x + 1) % 4
f_1(x) = super(4, \(x) = x + 1)

You can use the Super function with any generator, for example circles to create a donut topology (torus):


f_0(x) = x - x % n + (x + 1) % n
f_1 = super(n, \(x) = x - x % m + (x + 1) % m)

When n and m in the example above goes to infinity, you can start at 0,
add infinity to take an infinitesimal step in the rotation direction of the torus.
Take an Aleph-1 step (a higher infinity), and you jump from the 0-point at the first torus
to the 0-point (modulus Aleph-1) to the second torus.

Inflating 3D structures

To inflate a 3D structure created by generators, I use an algorithm that takes a list of generators
and walks along all edges for a fixed range, generating a list of spring constraints.
If a generator leads back to the same vertex, it gets ignored.

The spring constraints are passed to a simulator that inserts random positions for vertices.
An extra spring constraint drives inflation by being connected temporarily between two random vertices A and B with a target distance 10 * (A - B).length().sqrt().

Inflation causes an internal pressure outward on the whole shape, so it unfolds into its “natural form”.

The square root operation is to avoid too much acceleration between distant points,
which might lead to stretching the shape like spaghetti.

Here is a shape inflated by connecting cubes with some internal support to edges of a pentagon,
which in turn is connected by a super hundredgon (a circle shape with 100 vertices).


As you see in the picture above, the shapes do not become perfect.
This is because the springs are not possible to satisfy unless the differential topology is complete.
Because the inner circle is the same size as the outer one, the shape above twists and stretches the pentagons.

Finding a complete differential topology could be done by “tuning” the constraints while simulating,
or perhaps relaxing the target distances based on a function of the tension.
Alternatively, some new constraints could be added to keep edges at a fixed angle relative to each other.

The twisting effect can be used on purpose, e.g. on a cylinder with sinus modified distances:


Have you seen how the sweets with double twisted paper get wrinkles? This seems to be the same effect!

Cloud albedo control is the way to go

I just finished reading 3 papers sent to me by Stephen Salter about geoengineering by brightening the albedo of clouds to buy us time fixing climate change. This is so far the best approach forward that I have seen so far.

Update: Stephen Salter gave permission to host the papers so other people could read them, Link to repository.

Thumbs up for this direction, and I encourage governments and corporations to immediate start investing in this research, not only because it could reduce the risks of human extinction significantly, but also because there is a potential billion dollar industry here!

Geoengineering sounds scary, and it is indeed dangerous when diving into such actions without wisdom. However, if you consider this a problem where you are not only studying how to reverse the effect of anthropogenic greenhouse gas emissions and runaway methane processes, but also study our capabilities to improve our understanding about climate, then it seems like a good approach to invest your effort into systems that lets you iterate rapidly with feedback from the climate system itself, while providing the control to react quickly to an increasingly complex global siutation.

The basic idea is that since there are lower concentration of nuclei that help droplet formation in the clean air of open sea waters, it is possible to spray microscopic salt crystals from sea water into the air and regulate precipitation very sensitively. As a result you can control the brightness of clouds, and counter-act the average warming of the planet. Computer models shows that this can both increase and decrease precipitation in various regions around the world, so figuring out where to NOT do it is an important first step.

Cloud albedo control allows the use of coded modulation to measure with weather stations how spraying of salt nuklei affects weather patterns, and this can be made sensitive enough to collect information in real time without causing severe effects.

It will also provide valuable feedback for climate modelerers, which currently have a hard time to predict precipitation, something that could have important economic benefit such as earlier warning of flooding.

As a software engineer I recognize these good characteristics of the solution:

  1. Rapid iteration with real world feedback
  2. Low latency that gives quick response time when reacting to a changing situation
  3. Proven technology applied in a new way, which often leads to new markets and industries

Even if there somehow turned up a better solution, which is likely in the long term, we would regret not investing in a capability that lets us monitor and learn our limits safely with the technology we have today.

One particular worrysome issue is that we know very little about how increasing ice melting in the arctic areas will affect methane clathrates in shallow sea waters. Estimates indicate that there is much more carbon stored there than all the emitted greenhouse gases the past century. If we get a runaway methane situation, how do we plan to respond to it?

The obvious course of action is to deploy a research project immediately to control cloud albedo, so we have a method that we know something about, if the worst case scenario should happen. It is a question of collecting the information that engineers can use to design the appropriate system.

We can not do this with existing climate models, because they are not flexible enough. The things we see happening now was not expected in many years, and the data used for peer-reviewed science used by IPCC that politicians rely on, has a cycle of several years which is too slow when we see rapid climate changes, perhaps caused by positive feedback loops.

It is no longer a debate whether climate change is happening. Some people might disagree with scientists that human emissions are causing it. The problem we have now, is that if the methane in the arctic reaches the athmosphere in large amounts, we have a hundred times worse greenhouse gas to deal with than CO2, and a such effect could have played a role in earlier extinction events.

Remember that our actions will be judged by the coming generations, and there is no more time to fight about political stances. Imagine a future where droughts can be reduced and floods can be predicted earlier. One one hand we have the path to ignorance and destruction, on the other hand we have the path to knowledge and security.

We have the carrot and the wip, but how stubborn is the donkey?

Older Newer