Announcing Binpool

This blog post is about a new experimental uniform binary format for particle physics: Piston-Binpool.

Background

Currently I am diving deep into marine cloud brightening to understand how humanity can avoid some severe effects of climate change by using some form of clean and reversible geoengineering. If you did not know already, there have been some very worrisome observations in 2016-2017, so I thought it was a good idea to educate myself a bit and perhaps get some input on what technological challenges there are. The scientists and engineers working on this have been quite open and even sent me papers to read! Thanks guys.

While reading about this, which seems to require an understanding of climate modelling (it is understandable if you take your time), I figured doing some basic virtual physics experiments could help my brain along the way. Nothing advanced yet, just freshening up some of the things I played around with over the years.

One thing I discovered is that it is not obvious what kind of analysis you want to do with a virtual experiment. Instead of running simulations over and over, I would like to collect the raw data and later run algorithms that compute some functional relationship.

The advantage of this method is that you can exploit physical similarities between systems. For example, it is commonly known that certain physical behaviors let you transfer knowledge between a model and the full size system. So collecting data about a few particles colliding in a slow motion simulation can give you knowledge about particles colliding at high velocities over a huge area. This way you can save both time and energy, because you do not have to simulate every detail of a full system.

This is also interesting because I came across similar ideas when researching path semantics. Studying physics could help design algorithms that predict approximate behaviors of other algorithms. One long term goal is to develop a constraint solver that can assist with engineering tasks.

Motivation and design

I could not find a standard format for storing and reading particle data in such a way that you can reuse algorithms between projects, so I made my own experimental one.

Here is the format:

type format == 0 => end of stream
type format: u16, property id: u16
|- bytes == 0 => no more data, next property
|- bytes: u64, offset instance id: u64, data: [u8; bytes]

The first flag is a type format, which supports 10 common Rust numerical primitives, with vector and matrix dimensions up to 80x80.

The second flag is a unique property id that identifies the property of the particle system. This is also used to define time, e.g. as an f32 scalar.

Next the format stores the number of bytes in the data block and an offset instance id, such that you can read and write parts of the particle system (e.g. only the part that is dynamic or needed for analysis).

The number of particles in the block is derived from the number of bytes and the type format. You can also specify a custom binary format, in which case a general purpose analysis tool can just skip the block.

For each frame in a simulation, you simply write out a set of properties, and when reading back a frame you use a loop that breaks when all properties are set. More information about the usage is in the README file.
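
To make this concrete, here is a minimal sketch in Rust of how writing one property could look, following the format above. The little-endian byte order and everything named here are my own assumptions for illustration, not the actual binpool API:

use std::io::{self, Write};

// Writes one property: a (type format, property id) header followed by
// data blocks, terminated by a block with bytes == 0.
fn write_property<W: Write>(
    w: &mut W,
    type_format: u16,          // e.g. a hypothetical id for an f32 scalar
    property_id: u16,          // identifies the property, e.g. position
    blocks: &[(u64, Vec<u8>)], // (offset instance id, raw data) pairs
) -> io::Result<()> {
    w.write_all(&type_format.to_le_bytes())?;
    w.write_all(&property_id.to_le_bytes())?;
    for (offset, data) in blocks {
        w.write_all(&(data.len() as u64).to_le_bytes())?;
        w.write_all(&offset.to_le_bytes())?;
        w.write_all(data)?;
    }
    // bytes == 0 => no more data, next property.
    w.write_all(&0u64.to_le_bytes())?;
    Ok(())
}

// type format == 0 => end of stream.
fn end_stream<W: Write>(w: &mut W) -> io::Result<()> {
    w.write_all(&0u16.to_le_bytes())
}

Reading mirrors this: loop over (type format, property id) headers and their data blocks, and break when all properties are set.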

Future work

My hope is that I am able to reuse algorithms across projects, for example a command line tool that computes the energy or the number of collisions.

Existential Paths

In this post I will report a big breakthrough in the ongoing research on path semantics.

Link to paper: Existential Paths (work in progress)

Since people asked me what the heck path semantics is, I made an Illustrated History of Path Semantics with Stick Figures.

I also made an introduction to the notation in the last blog post about non-existence of monoid symmetric paths. Since then I have figured out how to expand the technique to asymmetric paths.

Path semantics is the idea that some functions have secrets, and those secrets can be expressed as functions, which in turn can have other secrets, and those secrets can also be expressed as functions etc.

This is very different from how most computers treat functions, because they only use them for computation.

For example, if you define a function:

add(a, b) = a + b

In most programming languages, the computer only sees this function as a computational procedure. You give it some values as arguments and you get a value out.

However, in path semantics, you can write things like this:

add[even] <=> eq

This means that add secretly hides the behavior of eq, which is unlocked using the even function.

When you have functions connected to other functions by functions, it gets confusing, so you say add has a path to eq by even.

A path is a function in some mysterious space called “path semantical space” where all functions are connected. When you explore and learn about this space, you can “backproject” paths as patterns to tell something about a large number of functions.

It also happens that these paths are predictors, so you can use them in machine learning.

Path semantics is actually closely related to type theory, because you can use the notation to define sub types:

fn concat(a: [], b: []) -> [] { ... }
fn len(a: []) -> usize { ... }

// When concatenating two lists, you can think of it as sub types `[len] n` and `[len] m`
// returning a sub type `[len] n + m`.
concat(a: [len] n, b: [len] m) = [len] n + m

// Short form.
concat[len] <=> add
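
As a quick sanity check, this path can be tested in plain Rust on concrete values (a spot check, not a proof):

fn main() {
    let a = vec![1, 2, 3];
    let b = vec![4, 5];
    // concat[len] <=> add: the length of a concatenation
    // equals the sum of the lengths.
    let ab: Vec<i32> = a.iter().chain(b.iter()).cloned().collect();
    assert_eq!(ab.len(), a.len() + b.len());
}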

Creating a type checker for path semantics turned out to be a seemingly impossible problem…

…until today!

The new breakthrough is the discovery that there are other kinds of paths out there, and one particular kind called “existential path” is closely connected with normal paths.

In fact, it is so closely connected, that it might be possible to make a type checker using them some day.

Simply put, if you have e.g. a: [len] 0 then there exists a function len' such that:

a: [len] 0

0: [len'] true

This checks the sub type for all a. In this case, we know that there are lists of all lengths:

len'(x) = x >= 0

// Alternative notation.
0: (>= 0)

Since we have a concrete value 0, we can just evaluate len'(0) and check if it returns true.
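
In plain Rust, this check could look like the following sketch, where len_post is my hypothetical name for len':

// len' returns true exactly for the lengths that lists can have.
fn len_post(x: i64) -> bool { x >= 0 }

fn main() {
    // Checking the sub type a: [len] 0 reduces to evaluating len'(0).
    assert!(len_post(0));
}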

Actually, this f' function corresponds to the post-condition of f, and a lot of smart people have already worked on this. This means we can reuse current tools for theorem proving!

You might notice that there should also be a function len'' such that:

0: [len'] true

true: [len''] true

This is correct! It could go on forever, right? Luckily this leads to a repeating sequence:

// There are only two options for all functions.
f'' <=> {id?, true_1?}

id' <=> true_1
true_1' <=> id

So, after 2 steps, we can just hard code in the type checker everything there is to know about these functions.
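
Here is a hedged Rust sketch of why the sequence repeats, restricted to bool for illustration, treating the existential path of a function as the predicate that holds exactly on its outputs (the names are my own):

fn id(x: bool) -> bool { x }
fn true_1(_: bool) -> bool { true }

// id hits every bool value, so id' is the constant true function.
fn id_exists(_: bool) -> bool { true }  // id' <=> true_1
// true_1 only ever outputs true, so true_1' behaves as id on bool.
fn true_1_exists(y: bool) -> bool { y } // true_1' <=> id

fn main() {
    for x in [false, true] {
        // Every output of a function satisfies its existential path.
        assert!(id_exists(id(x)));
        assert!(true_1_exists(true_1(x)));
    }
}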

This is a very recent breakthrough, but I am already thinking about how to set up the inference rules.

For example, if you type this:

a: [len] 0 & [len] (< 2)

Since these sub types are using the same function len, and 0 is a concrete value, one can reduce it to this:

0: (< 2)
0: [len'] true

Slot Lambda Calculus

In this post I will describe a useful technique for formal study of language semantics.

Link to paper (work in progress)

Usually when somebody mentions the words “language semantics”, people run and hide under their beds, hoping that the person uttering these words will not find them there any time soon.

Yet, I want to show that given the right tools, semantics does not need to be that hard to wrap your head around.

After reading this post, you should be able to claim something as powerful as:

There are certain rules that are valid for meaning in all languages, and I know what they are.

One reason semantics seems hard is that we think of the world as this place of infinite possibilities, and since language is used to describe what happens in the world, the meaning of language must consist of infinite possibilities.

Some of these infinite possibilities are:

  • When programming, people can type in a text editor, save to a file and use a compiler (complex)
  • When talking, people can use different sounds to change the meaning of words (complex)
  • When expressing themselves in art, people can shape an experience that points to something “elsewhere” (if this is not complex, I have no idea what complex is)

All these activities are related to the meaning of language, and this makes the world a complex place.

Slot lambda calculus is a tool that takes those “infinite possibilities” and makes them very, very simple.

Instead of an infinite number of ways of constructing expressions in a language, in slot lambda calculus there is only one way. For example, if you want to express 1 + 1 = 2:

    (_ = _)(_ + _)(1)(1)(2)

This gets turned into:

    ((1 + 1) = 2)

So, where did (_ = _) and (_ + _) come from? They were constructed the same way:

    (_ _)(?)(_ _)("=")(\)
    (_ _)(?)(_ _)("+")(\)

The ? symbol inserts a placeholder, and \ removes all ?s and turns all strings into syntax.

This gets turned into:

    (_ (= _))
    (_ (+ _))

When you have defined a pattern like this, you can write equations to make them nicer:

    (_ (= _)) = (_ = _)
    (_ (+ _)) = (_ + _)
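
To make the application rule concrete, here is a hedged Rust sketch where each application fills the leftmost empty slot. The Expr type and the fill function are my own illustration, not an official implementation:

// An expression is a tree whose leaves may be empty slots.
#[derive(Debug)]
enum Expr {
    Slot,
    Val(i64),
    App(String, Box<Expr>, Box<Expr>), // e.g. App("+", left, right)
}

// Fills the leftmost empty slot with arg.
// Returns the argument back if no slot was found.
fn fill(e: &mut Expr, arg: Expr) -> Option<Expr> {
    match e {
        Expr::Slot => { *e = arg; None }
        Expr::Val(_) => Some(arg),
        Expr::App(_, a, b) => match fill(a, arg) {
            Some(arg) => fill(b, arg),
            None => None,
        },
    }
}

fn main() {
    // (_ = _)(_ + _)(1)(1)(2) gets turned into ((1 + 1) = 2).
    let slot = || Box::new(Expr::Slot);
    let mut e = Expr::App("=".into(), slot(), slot());
    let plus = Expr::App("+".into(), slot(), slot());
    for arg in [plus, Expr::Val(1), Expr::Val(1), Expr::Val(2)] {
        let _ = fill(&mut e, arg);
    }
    println!("{:?}", e); // App("=", App("+", Val(1), Val(1)), Val(2))
}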

Slot lambda calculus behaves similarly to normal lambda calculus, but uses a different application rule. Normal lambda calculus is even simpler, but better suited for computation than for reading new patterns created from existing patterns. However, there is a way to hack lambda calculus so you get slot lambda calculus! Mwuhahaha!

This is how you do it in Dyon (without syntax extensions):

fn pass(f: \(any) -> any, val: any) -> any {
    return if typeof(val) == "closure" {
        // Delay: return a closure that fills the slots of val first,
        // then passes the result on to f.
        \(x) = {
            val := grab val
            new_val := \val(x)
            f := grab f
            pass(f, new_val)
        }
    } else {
        // Fill the next slot of f with the value.
        \f(val)
    }
}

fn main() {
    add := \(x) = \(y) = (grab x) + y
    mul := \(x) = \(y) = (grab x) * y

    a := pass(add, mul) // (_ * _) + _
    b := pass(a, 2) // (2 * _) + _
    c := pass(b, 3) // (2 * 3) + _
    d := pass(c, 1) // (2 * 3) + 1
    println(d) // prints `7`
}

Notice that this is the exact same behavior as a stack machine, which is common in the runtimes of most programming languages. The use of single argument closures has some nice mathematical properties, e.g. for inferring equations from data.

Now, when you are looking at (_ + _), you might wonder: What does it mean? When designing meaning for a new language, you might not have an interpreter, or even know in which context the expression (_ + _) is valid.

How do you figure out what (_ + _) means when all you have is (_ + _)?

Again, back to slot lambda calculus:

  1. For an expression to be interpreted, it must be constructed in its context.
  2. Any construction is reducible to a calculation with slot lambda calculus.
  3. Meaning is the rules we use when constructing a particular expression.

These are the rules that are valid for all languages, despite infinite possibilities.

You can study the context by using the construction in slot lambda calculus!

For example:

    (_ = _)(_ + _) = (_ + _ = _)

This new construction has 3 empty slots.

You now have two choices:

  1. (_ + _) shall have a unique meaning for (_ + _ = _).
  2. (_ + _) shall have same meaning wherever it is used.

The 2nd option is called “context free”. It means whenever you see the pattern (_ + _) you can reuse the same rules.

Since you can construct any pattern you like, you can create one pattern for each meaning you assign to your language. When you have done this, there is a proven technique in slot lambda calculus that shows how you can be sure that all edge cases are checked, such that all expressions are either invalid or assigned to a rule.

A “meaning” in this sense corresponds to a set of rules that decides what you should do when encountering a specific pattern. The semantics of a pattern is only well defined when the rule is deterministic, such that the same action is repeated when the pattern occurs over and over.

OK, now let us have some fun: We want to create a tiny language for describing what happens in a romantic novel.

    (_ loves _)(<name>)(<name>)
    (_ hates _)(<name>)(<name>)
    (_ ignores _)(<name>)(<name>)
    (_ see _)(<name>)(_)
        (_ see _)(<name>)(<name>)
        (_ see _ loves _)(<name>)(<name>)(<name>)
        (_ see _ hates _)(<name>)(<name>)(<name>)
        (_ see _ ignores _)(<name>)(<name>)(<name>)
        (_ see _ see _ loves _)(<name>)(<name>)(<name>)(<name>)
        (_ see _ see _ hates _)(<name>)(<name>)(<name>)(<name>)
        (_ see _ see _ ignores _)(<name>)(<name>)(<name>)(<name>)

This language can describe relationships, e.g. Bob loves Alice, but it can also describe events that affect relationships, e.g. Bob see Alice loves Carl.

With this tiny language, we can write some “code”:

Bob loves Alice.
Alice loves Carl.
Bob see Alice loves Carl.
Bob hates Carl.
Carl see Bob hates Carl.
Carl ignores Bob.
Carl see Alice.
Carl loves Alice.
Bob see Alice see Carl loves Alice.
...

Can you visualize a story from something like this? I can, I wrote it down, and it was fun. :P (no, I will not share it)

The big idea of formal languages is that you can give your own thoughts some new structure. You don’t have to think “with” the structure like a computer, but think of the world inside it.

Just create a tiny language, write something random, and watch your thoughts become alive!

Slot lambda calculus is the latest technique I use to study path semantics. Path semantics grounds meaning using functions, in a powerful way that allows us to reason about mathematics and programs. In the attempt to formalize path semantics, I built a lot of other tools to make it easier for me to wrap my head around things. I am getting closer, slowly, but it goes forward!

This is why it has been so hard to study path semantics: There are hundreds of edge cases to check! If you have N patterns, then you get at least N^2 cases where you need to give an interpretation, to see if it makes sense to use the pattern in that context. For example, with 6 patterns you get 36 cases, but increase to 12 and you get 144! I am not sure how many patterns are required in a practical and convenient notation. 20? 30?

The reason we know path semantics is hard to formalize is because of the simple properties of slot lambda calculus. Until recently it was a complete mystery to me how to check all the cases. With slot lambda calculus it is possible to create a map and measure progress.
