Unary Functions in an E-graph
Functions of one variable are special in numerics: thanks to
equioscillation, unary functions can be approximated efficiently by
polynomials. For this reason, it can be useful to rewrite functions of
multiple variables so that you get large unary fragments. For example,
log(exp(a) + exp(b)) can be rewritten to:
a + log(1 + exp(b - a))
The fragment log(1 + exp(x)) is a unary function; if we call it f, the
original expression can be rewritten into a + f(b - a). How can we do
this automatically?
Expressions
Let's start with finding unary functions in simple expressions.
An expression can be thought of as a DAG, where there are multiple
source nodes (one per variable) and a single sink node (the expression
root). Finding a unary function in it means finding a pair of nodes
(a, b) where all paths to a first pass through b, in which case a is a
unary function of b.
This is what we call a dominator: b dominates a if every path from an
entry nodes to a passes through b. If you have %k = f(%i, %j), then
the dominators of %k are everything that dominates both %i and %j; in
other words:
dom(%k) = { %k } ∪ (dom(%i) ∩ dom(%j))
There's a rich theory of dominators. Specifically, dominators are
transitive: if every path to a goes through b, and every path to a
goes through c, then every path to b goes through c or vice versa.
Thus, if both b and c dominate a then one dominates the other. Every
node thus has a single immediate dominator, and this immediate
dominator relationship forms a tree.
The dominator tree an efficient representation of dominator sets: the dominator set of a node is all its ancestors in the dominator tree. Intersection of dominator sets then becomes least-common-ancestor of dominator tree nodes. This makes dominator trees very efficient to compute on a DAG.
You topologically sort all nodes and traverse the topological order
starting from the variables. Each node %k = f(%i, %j)'s immediate
dominator must dominate both %i and %j, so compute their least common
ancestor in the dominator tree, which is the closest node that
dominates both of them; that's the immediate dominator of %k.
I've gone pretty fast here because dominator trees are textbook compilers concepts. Though we're doing something a little bit weird here. Normally you compute dominator trees over control flow graph nodes; and control flow graphs aren't DAGs, so you usually use a more complicated algorithm called Lengauer-Tarjan. Here, we're computing it over nodes in an expression DAG, basically on SSA values.
Let's see an example. Consider a + log(1 + exp(b - a)), which as an
SSA block is:
%a = get %vars, 0 %b = get %vars, 1 %0 = sub %b, %a %1 = exp %0 %2 = const 1 %3 = add %2, %1 %4 = add %a, %3
Note that I've added a %vars node; this dummy node is useful to
represent the root of the dominator tree. Now let's consider
dominators. %a and %b only have one input, %vars, so they are
dominated by it. Then %0's immediate dominator is the least common
ancestor of %a and %b, which is also %vars. For %1, its immediate
dominator is %0. %2 is special, since it's a constant. Then for %3,
its left argument is a constant so its immediate dominator is %1.
Finally, for %4, we take the least common ancestor of %3 and %a, which
is %vars. The final dominator tree looks like this:
%vars
- %a
- %b
- %0
- %1
- %3
- %4
Now start at node %3 and walk up the tree until just before you reach
the root; that gets you to %0. So (%3, %0) is a unary function,
specifically the function %3 = log(1 + exp(%0)). In fact, any pair of
a node and its ancestor represents a unary function.
Anyway, the point of all of this is that dominator trees give you an easy way to find unary functions in an expression. Herbie in fact uses this algorithm for regime branches.
E-graphs
But this is all too easy. The tricky thing about log(exp(a) + exp(b))
is that it doesn't obviously contain a unary fragment (except trivial
ones like exp(x) and log(x)), it first needs to be rewritten
algebraically into a + log(1 + exp(b - a)) for the unary function to
appear.
E-graphs are a good way to represent algebraic rewrites. So how can we find unary functions in an e-graph?
An e-graph consists of e-nodes and e-classes. An e-node is something
like n = (%i, %j); its dominator set is the intersection of the dominator
sets of %i and %j:
dom(f(%i, %j)) = dom(%i) ∩ dom(%j)
An e-class is a set of e-nodes %k = { n1, n2, … }. We want to find
unary functions across all expressions represented in the e-graph, so
we want the union across this set:
dom(%k) = { %k } ∪ dom(n1) ∪ dom(n2) ∪ ⋯
Now, as written, this tells us what dominates a given e-class, but it
doesn't give us a way to recover the actual unary functions. We can
fix that by make dom(%k) be a map, not a set, where each dominator
maps to a witness term describing the unary function.
Let's start with the e-class case:
dom(%k) = { %k : x } ∪ dom(n1) ∪ dom(n2) ∪ ⋯
Here the first set says that %k is a unary function of itself via the
identity function. Now for the e-node case:
dom(f(%i, %j)) = { %k: f(dom(%i)[%k], dom(%j)[%k]) }
Here %k must be an entry in dom(%i) and dom(%j), so it ranges over
their intersection.
We can phrase all this as an e-class analysis where the analysis value
for each e-class is a Map<EClassID, Expr>.
Function expressions
But wait: if we do this naively we're going to get a lot of duplicate
unary functions. For example, every single e-class will be a unary
function of itself with the same witness term, x. For numerics
purposes we probably want all unique unary functions.
Instead of deduplicating things after the fact we can instead store
the witness terms in a hash-cons; in Herbie we have a standard
hash-cons data structure called a batch. Then the e-class analysis
value is a Map<EClassID, HashConsID>.
But in fact, there's no real reason to use a separate hash-cons. We
have a perfectly good one inside the e-graph itself! So we can make
the analysis value a Map<EClassID, EClassID>. Let's just write down
the semantics here precisely:
dom(%i)[%j] = %k ⟹ %i = %k(%j)
Here, on the right-hand side, %k is interpreted as a function of x,
while %i and %j are interpreted as normal values. In fact, once we
write it like this, it's easy to see that the right-hand side can be
thought of as a new type of e-node:
%j ∈ dom(%i) ⟹ %i = app(dom(%i)[%j], %j)
In fact we don't need dom as a separate e-class analysis at all! We
can rephrase the whole thing as two rewrite rules:1 [1 Technically the
second is a rewrite schema.]
?e = app(x, ?e) f(app(?a, ?e), app(?b, ?e)) = app(f(?a, ?b), ?e)
Note that the function nodes aren't a different type from expression
nodes; they're just expressions over the variable x. This means that
you might at times prove that two functions are equal, but that's
fine: applying equal functions to equal arguments yields equal
results, so the app operator is still congruence-closed.
Identities and similar
Another thing Herbie does is find range reductions; this was first introduced on this blog but later was a major part of the MegaLibm paper.2 [2 Which, may I brag, just won a SIGPLAN research highlight?]
The app concept lets us represent this idea, too. A range reduction
for a function f is a pair of functions s and t such that f(x) =
t(f(s(x))) over at least some interval of x values. In our app
notation, that means:
f = app(t, app(f, s))
In practice, Herbie doesn't search for totally generic t functions, it
looks for concrete choices for s and t. For example, we might check if:
f = -app(f, -x)
If this is true, then f is an odd function.
Slotted e-graphs extend e-graphs with support for variables and bindings, just like here. I've been skeptical of the value of slotted e-graphs for Herbie—the paper phrases slotted e-graphs as useful for higher-order languages, while Herbie is a first-order language—but now it looks quite relevant!
I think the relationship here is that slotted e-graphs make unwrapping
app nodes efficient by baking that directly into the e-graph, while
what I'm doing with app nodes is just naively building every syntactic
beta reduction and seeing what sticks. That said, I don't understand
slotted e-graphs well enough to be confident whether it helps or not.
Footnotes:
Technically the second is a rewrite schema.
Which, may I brag, just won a SIGPLAN research highlight?
