Translating mathematics into code that a proof assistant can understand is a process that involves a fair bit of friction. There’s quite a big difference between what humans are willing to suspend disbelief about, and what a type checker will actually accept without proof. Reducing this friction is the primary task of the authors of a proof assistant — but, of course, it would be infeasible to modify the proof assistant itself to cover a new problem domain every time one comes up. To this end, we provide users with various mechanisms for automation, through which users can encode their domain-specific knowledge.
Arguably one of the oldest forms of automation, type classes originated in Haskell, all the way back in the 80’s. The idea is simple enough: we can define predicates on types (the classes) and clauses for solving those predicates (the instances) — the connection with logic programming is not superficial. By “recording the execution” of this logic program, we can get the elaborator to automatically fill in terms, according to rules we specify.
In Agda, instance search has historically been a pretty simple process: like GHC, we do not support backtracking when looking for a given instance. Moreover, the implementation itself has cut some serious corners. It works, but it’s not particularly clever, or particularly efficient. This post details the first step taken in our journey to bring the implementation of instance search up to parity with what users of proof assistants expect in the 21st century.
An example
To start with, let’s actually present an instance (ha) of the problem,
so that we have some code to refer back to. When doing mathematics (or
computation!) in a proof assistant, we often care about decidability:
for a particular proposition P
, do we have a program that can tell us
whether P holds or fails? Keeping track of decidability is valuable even
when working in classical mathematics, since the automation around a
proof must obviously be executable — and it will use certain cases
of decidability, e.g., for looking things up in a list. It’s also
really boring, and generally extremely formulaic, to keep track of.
Instance search is the perfect solution for automating the boring cases of proving things are decidable — when we can put together a decision procedure simply by the syntax of the type we’re asking about. A basic setup would look something like this:
data Dec {l} (A : Type l) : Type l where
  yes : A → Dec A
  no  : ¬ A → Dec A

instance
  -- The implementation of each instance is immaterial
  DecYes : Dec ⊤
  DecNo  : Dec ⊥
Agda allows us to do instance search for any data or record type, so we
can directly use Dec
as a “type class”. The two definitions in the
instance
block are the two trivial cases of putting together a
decision procedure: the unit type always holds, and the empty type
always fails. We can then use these instances by wrapping certain
function arguments in double-braces: ⦃ ... ⦄
, or the ASCII {{ ... }}
. A simple use-case for the setup above is a function that calls the
decision procedure for a type, and, if it succeeds, produces a proof:
is-yes : ∀ {l} {A : Type l} → Dec A → Type
is-yes (yes _) = ⊤
is-yes (no _)  = ⊥

decide : ∀ {l} {A : Type l} ⦃ prf : Dec A ⦄ {_ : is-yes prf} → A
decide ⦃ yes p ⦄ = p
_ : ⊤
_ = decide -- boring!
Instance search gets more interesting in general when we consider instances with instance arguments. In our example, they allow us to implement decision procedures for type formers, where decidability of the overall thing boils down to decidability of components. For example, decidability is closed under negation and formation of product types:
instance
  DecNot  : ⦃ _ : Dec A ⦄ → Dec (¬ A)
  DecProd : ⦃ _ : Dec A ⦄ ⦃ _ : Dec B ⦄ → Dec (A × B)
A particularly common class of decision procedures are those for
equality. This is the type-theoretic version of Haskell’s Eq
class, and, correspondingly, it shows up quite a lot when writing
metaprograms. We can also use instance search to handle decidable
equality — the code below sets up some more instances. It also
demonstrates that we can use instance search for anything that
evaluates to a data (or record) type, not just applications of those
literal types.
Discrete : ∀ {l} → Type l → Type l
Discrete A = {x y : A} → Dec (x ≡ y)

instance
  DiscreteNat  : Discrete Nat
  DiscreteInt  : Discrete Int
  DiscreteProd : ⦃ _ : Discrete A ⦄ ⦃ _ : Discrete B ⦄ → Discrete (A × B)
_ : ¬ (2 + 2 ≡ 5)
_ = decide -- slightly more interesting
Detailing the problem
To discuss Agda’s actual instance search algorithm, I think it’ll probably be for the best if I write out the terms of art that will come up.
The algorithm we’re discussing is instance search. We’ll refer to the problems it solves as instance goals, since the term “instance constraint” might get conflated with the instance context, defined below, and “FindInstance constraint” is a bit verbose.

An instance is a declaration annotated with the instance keyword. Following GHC’s terminology, we will split the type of each instance into a head and a context:

instance
  Discrete-List
    : ∀ {l} {A : Type l} ⦃ da : Discrete A ⦄
    -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the context
    → Discrete (List A)
    -- ^^^^^^^^^^^^^^^^^ the head

The head furthermore decomposes as a head symbol and a spine. The head symbol is the name of the “type class” we’re looking for: it could be a data type, a record type, or a postulate. The spine is the list of arguments to the head symbol. For Discrete-List, the head symbol is Dec, since Discrete is a definition and can reduce.

The instance table is some data structure that, given the type of an instance goal, produces a list of all instances that could match that goal.

A candidate for an instance goal is either an instance with a matching head symbol, or a local variable with ⦃...⦄ visibility drawn from the context. For example, in the body of Discrete-List, the possible candidates for a ⊢ Dec ?0 goal would include all the instances (like DecYes), but also the local variable da.
Once we’re given an instance goal, with an identifiable head symbol, we can actually try to solve it. The procedure is simple:
The first thing to do is to verify that instance search even applies to this goal. This is a basic check on the shape of the type: it must be of the form

∀ {X Y Z} ⦃ _ : C X ⦄ ⦃ _ : D Y ⦄ → T x y z

where T is one of those head symbols we mentioned before — a data type, a record type, or a postulate.

We then build a list of initial candidates, based on the definitions in the instance table and the variables in the context. The list of initial candidates is an overapproximation of the set of all possible solutions — it’s possible, at this step, that some candidates are ill-typed.

We narrow those initial candidates to those that could actually solve the instance goal. This is the most complicated — and most expensive! — step. A candidate could be rejected for a variety of reasons:

- It might be in the instance table despite not being in scope.
- It might be in scope only under a qualified name, and the user has asked us not to use qualified names for instances.
- It might not even be type-correct, since we consider all the local ⦃...⦄-variables as candidates.
However, unlike in Lean (but like in GHC), this narrowing does not do any backtracking. That is, when considering whether a candidate might solve a goal, we only consider its head, and not its context.
If we’re left with a single candidate, the instance goal is solved, and we’re technically done. If there are no candidates, that’s an immediate failure.
However, if there are multiple candidates, we don’t immediately report a failure. The type of the instance goal might be insufficiently determined, e.g. because of a metavariable, so we set this particular instance goal aside to revisit later. The second time we try it, if there’s still overlap, we report the failure to the user.
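As a rough model of the procedure above, the driver can be sketched in Python. This is purely illustrative — `solve_instance` and `check` are invented names, and `check` stands in for the recursive type-checker invocation that filters candidates:

```python
# Toy model of the instance-search driver described above. Goals and
# candidates are left abstract; `check` decides whether a candidate could
# solve the goal. Illustrative only, not Agda's actual API.

def solve_instance(goal, initial_candidates, check, second_try=False):
    # Narrow the (overapproximated) initial candidates to those that could
    # actually solve the goal. No backtracking happens here: `check` only
    # considers a candidate's head, never its context.
    viable = [c for c in initial_candidates if check(c, goal)]

    if len(viable) == 1:
        return ("solved", viable[0])       # unique candidate: done
    if not viable:
        return ("failed", "no candidates")
    if second_try:
        # Still overlapping on the second visit: report it to the user.
        return ("failed", "overlapping candidates")
    # The goal may be underdetermined (e.g. it mentions a metavariable),
    # so set it aside and revisit it later.
    return ("postponed", viable)
```

For a fully determined goal, narrowing leaves one candidate and the goal is solved immediately; for an underdetermined goal, every candidate survives and the goal is postponed once before failing.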
In most cases, the expensive part of executing a logic program is the backtracking: the combinatorial explosion in cases to consider requires complicated algorithms to deal with. It might then be surprising to learn that Agda’s instance search can often be quite slow!
The problem is that discarding the ill-typed instance candidates in the narrowing step can be quite expensive. Essentially, it’s a recursive invocation of the type checker, with all of its complexity. If the set of initial candidates is big, we may spend quite a long time just in the narrowing phase! To motivate this complexity, note that Agda accepts the following program:
data T : Type where
  foo bar : T

g : Bool → T
g false = foo
g true  = bar

postulate
  Cls : T → Type

instance
  Inst : ∀ {x} → Cls (g x)

t1 : ⦃ _ : Cls foo ⦄ → Type
t1 ⦃ x ⦄ = ⊤

t2 : t1
t2 = tt
To know that t2 : t1 is justified, we have to know that Inst — the
— the
only possible candidate, in this case — actually checks at the goal
type Cls foo
. In this case, that involves the reduction of g
and
applying injectivity. While it’s not a lot of work here, it’s easy to
imagine situations where Agda must do quite a bit of work before it can
discard a candidate.
The jumpscare
In the vocabulary, the instance table was left purposefully underspecified. That’s because it’s secretly the star of today’s show! It’s also because it would have given away the ending way too soon: foreshadowing, not spoilers. And now, the shocking conclusion!
Up through Agda 2.6.4.3, the instance table is only indexed by the head symbol.
You read that right! See for yourself, if you find it hard to believe. This means that, at each instance goal, we must discard in turn every single instance of that class. And, just to stress this point, discarding instance candidates can involve potentially unbounded work. Yeesh.
It’s not a secret that this is something we needed to improve; we’ve known this for a while. In Jesper’s blog post about the improvements they would like to see in 2024, it was point 4. So why did it take so long to solve?
I think the reason is essentially lack of shininess. This is a pure engineering problem. The heading in Jesper’s blog post includes the solution — someone just had to find the time (and spoons) to make it happen. It’s not the type of work that you could reasonably turn into a paper, especially after the Lean folks did such a stellar job, and it’s not the type of work that usually catches my eye.
But, catch my eye it (eventually) did, and I think that indexing things by terms is a common enough problem that it falls on me (having solved the problem) to write up my solution — so that future compiler engineers have something to refer to for this bit of folklore.
Discrimination trees
A discrimination tree is an associative data structure where the indices are some sort of term — in our case, Agda’s type theory. Despite the fancy name, it’s basically a trie: you traverse the term in some (consistent) order to turn it into a list of keys, which you then look up in the trie.
Note that this process doesn’t exactly extend to higher-order terms, so a certain amount of inexactness is unavoidable when looking up things in a discrimination tree. We may also choose to introduce some inexactness ourselves, when being exact would be prohibitively expensive, by deliberately excluding some classes of terms from being represented as keys.
Remember, our goal is to produce a small set of initial candidates, so that we don’t spend a lot of time discarding candidates. But it’d be a total waste to do this if turning terms into keys was too expensive! The core trade-off is the following: adding more types of key will result in smaller candidate sets, but we’ll spend longer producing those keys.
Anyway, what exactly goes into a key? A good start is the following:
data Key
  = RigidK {-# UNPACK #-} !QName {-# UNPACK #-} !Int
    -- ^ Rigid symbols (constructors, data types, record types,
    -- postulates) identified by a QName.
    --
    -- The 'Int' argument is the *arity* of the symbol as we encountered
    -- it.
  | FlexK
    -- ^ Literally anything else, treated as a wildcard.
This representation is extremely coarse, but it serves to convey the key
ideas. When traversing a term, if we see a rigid symbol (as in the
comment), we can turn it into a RigidK
. Literally everything
else — lambda abstractions, function types, universe levels,
metavariables, local variables — will get turned into a
FlexK
.
One important thing to keep in mind is that, to tell if a term is rigid,
we need to do reduction — remember the Discrete
vs. Dec
instance
head? If we didn’t reduce, then we would mistakenly have a
RigidK
for Discrete
, which would cause us to miss
Discrete-List
if the goal was instead phrased as Dec (Path (List A) ...)
. Not good!
The main workhorse is the function splitTermKey
, found
here, which simplifies to the following:
splitTermKey :: Term -> TCM (Key, [Term])
splitTermKey term = do
  hnf <- reduce term
  case hnf of
    Def head args ->
      if head is rigid -- some nonsense about blockers...
        then return (RigidK head (length args), args)
        else return (FlexK, [])
    _ -> return (FlexK, [])
As the name implies, splitTermKey
splits a term into a key and
whatever arguments the rigid symbol had. The full traversal is then
elegantly implemented as a tail-recursive function. Full disclosure, the
implementation up to this point is heavily inspired by how it’s done in
Lean 4, but it’ll diverge in the next section.
termToKeys :: Term -> TCM [Key]
termToKeys t = loop [] [t] where
  loop acc (focus:rest) = do
    (key, args) <- splitTermKey focus
    loop (key:acc) (args ++ rest)
  loop acc [] = return (reverse acc)
The idea of the loop is that we have a term-to-inspect (the “focus”) and a number of other term parts that we should get to next — the “rest”. We split the focus into a key and its arguments, record the key, and “push” the arguments in front of the rest.
Tracing the execution for a particularly simple case might look like the
following. I’ll write Agda terms in Agda syntax in these ⌜ fancy corners ⌝
.
loop [] (⌜ Dec {lzero} ⊤ ⌝:[])
The first thing we do is split the focus into a key and the arguments.
Dec
is a rigid symbol, so we get a rigid key. This key is pushed onto
the accumulator and we move onto the arguments:
  -> splitTermKey ⌜ Dec {lzero} ⊤ ⌝
  -> Dec, [ ⌜ lzero ⌝ , ⌜ ⊤ ⌝ ]
The next thing to look at is the first argument of Dec
, which was
lzero
. Since universe levels are not Def
s, they’re not
represented in our keys, and we get a “wildcard” FlexK.
loop [ Dec ] [ ⌜ lzero ⌝ , ⌜ ⊤ ⌝ ]
  -> splitTermKey ⌜ lzero ⌝
  -> *, []
Moving on again, we do the same thing for the unit type. It’s a record type, so it’s rigid, and we get an actual key. It has no arguments, though, so at this point, we’re done!
loop [ *, Dec ] [ ⌜ ⊤ ⌝ ]
  -> splitTermKey ⌜ ⊤ ⌝
  -> ⊤, []

loop [ ⊤, *, Dec ] []
  -> returns [ Dec, *, ⊤ ]
The keys that correspond to the instance head Dec {lzero} ⊤
are Dec, *, ⊤
. We now know how to store our instances in the discrimination
trie!
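To make the trie reading concrete, here is a small Python model of `splitTermKey`, `termToKeys`, and trie insertion. The term representation and all names are invented for illustration — the real implementation works over Agda’s internal syntax:

```python
# Terms are modeled as (name, [args]) for rigid symbol applications;
# anything else (levels, variables, lambdas, ...) becomes the wildcard "*".

FLEX = "*"

def split_term_key(term):
    """Split a term into a key and the arguments of its head symbol."""
    if isinstance(term, tuple):
        name, args = term
        return (name, len(args)), list(args)  # the key records the arity
    return FLEX, []

def term_to_keys(term):
    """Flatten a term into a list of keys: preorder, left to right."""
    acc, todo = [], [term]
    while todo:
        focus, rest = todo[0], todo[1:]
        key, args = split_term_key(focus)
        acc.append(key)
        todo = args + rest        # push the arguments in front of the rest
    return acc

def insert(trie, term, value):
    """Store `value` in the trie under the key sequence for `term`."""
    for key in term_to_keys(term):
        trie = trie.setdefault(key, {})
    trie.setdefault(None, set()).add(value)   # None marks "values live here"

# The instance head ⌜ Dec {lzero} ⊤ ⌝: a level (flexible) and the unit type.
dec_unit = ("Dec", ["lzero", ("⊤", [])])
```

Flattening `dec_unit` yields `[("Dec", 2), "*", ("⊤", 0)]`, matching the `[ Dec, *, ⊤ ]` key sequence from the trace above.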
Trouble in paradise
Now consider looking up instances, to solve a goal. For example, we
might be trying to solve Dec (4 ≡ 5)
, which arises in the example
involving decidable equality, above. We can turn this term into a key:
termToKeys ⌜ Dec (4 ≡ 5) ⌝ =
  [ Dec                           -- the class itself
  , *                             -- the universe level
  , _≡_                           -- the identity type
  , *                             -- another level
  , Nat                           -- the type of 4 and 5
  , suc, suc, suc, suc, zero      -- the number 4
  , suc, suc, suc, suc, suc, zero -- the number 5
  ]
This utterly sucks, and not just for the obvious reason. To turn the
term into a key, we had to look at every subterm, and reduce all of
them, even though we know that there’s only one possible instance by the
time we get to Nat
! When the numbers are 4 and 5, this isn’t an issue,
but imagine how embarrassing it would be not to solve the goal Dec (0 ≡ (2 ^ 2048))
because we got stuck generating a key with eleventy
quadspillion suc
s?
There’s also another, hidden problem. The keys are bogged down with a lot more wildcards than we could possibly ever need. For example:
instance
  extensionality-uncurry
    : ∀ {ℓ ℓ' ℓ''} {A : Type ℓ} {B : A → Type ℓ'} {C : Type ℓ''}
    → Extensionality (Σ A B → C)
  -- key: Extensionality, *, Π, Σ, *, *, *, *, *
Taking these keys literally would lead to a pretty significant amount of
pointer-chasing when we’re following a deep sequence of FlexK
s,
though this pales in comparison to the amount of extra normalisation we
would have to do.
I thought about these two problems for a literal entire week, and only arrived at a solution on a day I accidentally took two doses of my ADHD medication. Very elegantly, the solution to not storing eighty wildcards turns out to be precisely the information we need to avoid normalising literally everything!
Case trees
In Agda, pattern-matching functions are efficiently represented using case trees. I promise this digression is going somewhere. Consider this slightly silly function:
data Foo : Type where
  foo bar baz : Foo

nat-to-foo : Nat → Foo
nat-to-foo 0 = foo
nat-to-foo 1 = bar
nat-to-foo _ = baz
Internally, we represent nat-to-foo
like this. The @0
is a de Bruijn
level: in this context, it means the first argument to nat-to-foo
. We
have a tree indicating exactly what we have to evaluate:
nat-to-foo = case @0 of
  zero → foo
  suc  → case @0 of
    zero → bar
    _    → baz
A crucial detail here is that we case on the first argument twice. The
inner case tree is oblivious to its enclosing (evaluation) context.
Therefore, if the suc
pattern matches, the first argument is
replaced with the predecessor. This is a bit technical, but it has a
clear interpretation in terms of contexts! If we have something like
data Bar : Type where
  mk : Nat → Nat → Bar
then, in the function

use : Bar → …
use x = ?
matching on x
will replace the variable x
in the context by two
new variables, corresponding to the arguments of the mk
constructor:
use (mk n m) = ?
This procedure, replacing a matched term by its subterms, will be pretty important in understanding the case trees below: while nested case trees may look like they’re matching on the same argument repeatedly, or matching on something that doesn’t exist, if you trace the evolution of the “argument list” through the case trees, it’s possible to work out exactly what subterm is being matched on.
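A tiny Python model of case-tree evaluation makes the “replace a match by its subterms” step explicit. The representation is invented for illustration: a tree is either `("done", result)` or `("case", i, branches)`, with `"_"` as the catch-all branch:

```python
# Values are constructor applications (name, [args]); a case tree cases on
# position i of the spine, and a successful match splices the scrutinee's
# arguments into the spine in its place.

def eval_case(tree, spine):
    if tree[0] == "done":
        return tree[1]
    _, i, branches = tree
    con, args = spine[i]
    if con in branches:
        # Matching replaces the scrutinee by its subterms in the spine:
        # after `suc` matches, position i holds the predecessor.
        return eval_case(branches[con], spine[:i] + list(args) + spine[i+1:])
    return eval_case(branches["_"], spine)    # wildcard: spine unchanged

def nat(n):
    return ("zero", []) if n == 0 else ("suc", [nat(n - 1)])

# nat-to-foo from the text: case @0 of zero → foo; suc → case @0 of ...
nat_to_foo = ("case", 0, {
    "zero": ("done", "foo"),
    "suc":  ("case", 0, {"zero": ("done", "bar"), "_": ("done", "baz")}),
})
```

Evaluating `eval_case(nat_to_foo, [nat(1)])` first matches `suc`, leaving the predecessor at position 0, and the inner `case @0 of` then matches `zero` — exactly the double case on “the first argument” described above.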
Case trees for discrimination
While the humble case tree was originally conceived to represent pattern matching, it’s actually exactly the solution to our problems with the literal implementation of discrimination trees as tries.
We might attempt a more literal solution (and I did consider it): “run-length encoding” the sequences of wildcards in the trie representation. When looking up an instance, the number of wildcards to skip is exactly the index of the next subterm that we’ll have to turn into a key. Anything that gets skipped doesn’t have to be forced!
But assuming that we’ll force everything, and marking what needs to be skipped, is empirically backwards: in practice, very few term positions need to be forced to narrow to a single potential instance. So we choose instead to record the index of the next subterm to inspect — put another way, the next variable to case on!
So what do these “instance-case” trees actually look like in practice?
We can ask Agda to dump them by passing the -vtc.instance.discrim:60
verbosity flag. Using the Dec
and Discrete
instances we have
written so far, the instance table looks like this.
case 0 of
  Dec² → case 1 of
    ⊤⁰ → done {DecYes}
    ⊥⁰ → done {DecNo}
    Σ⁴ → case 4 of
      Const → done {DecProd}
    Pi → case 2 of
      ⊥⁰ → done {DecNot}
    _≡_⁴ → case 2 of
      Int⁰ → done {DiscreteInt}
      Nat⁰ → done {DiscreteNat}
      Σ⁴ → case 5 of
        Const → done {DiscreteProd}
Here, we can see a few leaks of the Key
data type that we
actually use in Agda. First, it represents Π-types using a
PiK
key, printed as Pi
. Second, it has a special case for
constant functions — that’s ConstK
and it’s printed as
Const
. Oh, by the way, the high number indicates that it’ll print
quite a lot of stuff — and, indeed, it’ll print the entire instance
table for every single instance goal. You were warned.
Finding instances
So, armed with the case tree above, how are we to find the candidates
for, e.g. Dec (4 ≡ ack 4 4)
? First, we start, once again, by putting
the term to be matched in a list. We’ll call this list the spine.
match [ ⌜ Dec {lzero} (_≡_ {lzero} {Nat} 4 (ack 4 4)) ⌝ ] (case 0 of ...)
We have to case on the first argument, the element with index zero on
the spine. Here, casing means calling our good friend splitTermKey
to
find out what branch to take. Dutifully, it splits Dec
into its
arguments — and we have a branch for Dec
, so that our next state is
match [ ⌜ lzero ⌝ , ⌜ _≡_ {lzero} {Nat} 4 (ack 4 4) ⌝ ] (case 1 of ...)
We now case on the second argument, the element with index 1. Since the identity type is data, this also succeeds, and we shift into
match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ Nat ⌝ , ⌜ 4 ⌝ , ⌜ ack 4 4 ⌝ ] (case 2 of ...)
The next thing we case on has index 2, so it’s the third thing on the
list: it’s Nat
, which is rigid and has no arguments. We shift to
match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ 4 ⌝ , ⌜ ack 4 4 ⌝ ] (done {DiscreteNat})
but this is a done state! We know what the instance we’re looking for is. Note the slight oddity that results from matched patterns being replaced with their subterms: Nat is gone. Anyway, observe that, while the literal “turn a term into a list of keys” reading would’ve gotten stuck exploding ack 4 4, the case-tree representation doesn’t touch it at all!
We’ve solved both the problem of having long paths of wildcards and the problem of having to potentially force every subterm. Nice!
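The laziness can be demonstrated with another small Python model, where an expensive subterm is a thunk that explodes if ever inspected. All representation choices here are invented for the sketch:

```python
# Keys are (name, arity); terms are (name, [args]), or a 0-argument function
# (a thunk) standing in for an unevaluated subterm like ⌜ ack 4 4 ⌝.

def split(term):
    term = term() if callable(term) else term   # reduce only when inspected
    name, args = term
    return (name, len(args)), list(args)

def match(tree, spine):
    if tree[0] == "done":
        return tree[1]
    _, i, branches = tree
    key, args = split(spine[i])                 # case on position i
    return match(branches[key], spine[:i] + args + spine[i+1:])

def ack_4_4():
    raise RuntimeError("we never get here: ⌜ ack 4 4 ⌝ is never forced")

# The instance-case tree for the Discrete example, pruned to DiscreteNat.
table = ("case", 0, {
    ("Dec", 2): ("case", 1, {
        ("_≡_", 4): ("case", 2, {("Nat", 0): ("done", "DiscreteNat")}),
    }),
})

goal = ("Dec", ["lzero", ("_≡_", ["lzero", ("Nat", []), ("4", []), ack_4_4])])
```

Running `match(table, [goal])` finds `DiscreteNat` while only ever splitting the positions the tree cases on; the `ack_4_4` thunk is carried along in the spine but never called.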
Meandering around
This case-tree stuff is all well and good, but what about finding
instances that aren’t fully determined? What if our instance goal was,
for example, Discrete ?0
, where we don’t know the type we’re
comparing? This is where discrimination tree lookup diverges from
pattern matching. While applying a function to a metavariable is just
stuck, in instance lookup, a metavariable means that we should take
every possible opportunity.
This is the job of the explore
state in the lookup function. If, at
any point, we case
on a metavariable, instead of getting
stuck, we pretend, for each pattern in turn, that the metavariable was
actually that all along! For example, this state from before
match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ ?0 ⌝ , ⌜ x ⌝ , ⌜ y ⌝ ] (case 2 of ...)
would result, after a trip through explore
, in the same results as
all three of
match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ x ⌝ , ⌜ y ⌝ ]
  (done {DiscreteNat})

match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ x ⌝ , ⌜ y ⌝ ]
  (done {DiscreteInt})

match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ ?1 ⌝ , ⌜ ?2 ⌝ , ⌜ ?3 ⌝ , ⌜ ?4 ⌝ , ⌜ x ⌝ , ⌜ y ⌝ ]
  (done {DiscreteProd})
Note that in the first two cases, the metavariable was simply removed from the spine, since Nat and Int both have arity zero. In the third case, we had to invent four new metas to stand in its place, since Σ has four arguments. Knowing how many placeholders we should insert is the reason that we have the arity in RigidK.
True overlap
The example from the previous section, where the goal had a
metavariable, is one source of overlap. However, it’s quite benign: we
know all the metavariables that caused us to go explore
ing, so we know
which metavariables need to be solved to make progress on the overlap.
However, there are cases where overlap is unresolvable, at the level of
the discrimination tree:
postulate
  C : Type → Type

instance
  CUnit : C ⊤
  CAny  : ∀ {A} → C A
The CUnit
instance wants to case on the first argument twice: its case
tree is
case 0 of
  C¹ → case 0 of
    ⊤⁰ → done {CUnit}
But CAny does not do any matching on the subterm of C. Its case tree looks like

case 0 of
  C¹ → done {CAny}
How do we represent this? Well, in traditional pattern matching, we
can’t. A wildcard pattern is only matched if none of the patterns
above match, but, if we’re looking up C ⊤
, we should look up
both instances: the CAny
instance is not simply a “fallback”
after C ⊤
, it really does overlap. To this effect, our case trees have
an optional “and also” subtree: we evaluate the matching case (if any)
and also that case tree, totally independently from the other cases.
However, for ease of skimming, I elected to print the “and also” case tree as though it were a wildcard pattern: it just scans better. Just keep in mind that, when Agda says
case 0 of
  C¹ → case 0 of
    ⊤⁰ → done {CUnit}
    *  → done {CAny}
the * → done {CAny}
branch will always be evaluated.
Merging case trees
Where does true overlap come from? Merging case trees. It’s inevitable that we would come to this — after all, you can import instances, and you can import multiple modules, so we would have to deal with merging the instance tables of multiple modules anyway. However, the algorithm is actually pretty simple!
First, let me transcribe the actual data type for case trees, just so we’re on the same page.
data DiscrimTree a
  = EmptyDT
    -- ^ The empty discrimination tree.
  | DoneDT (Set a)
    -- ^ Succeed with a given set of values.
  | CaseDT
    -- ^ Do case analysis on a term.
      {-# UNPACK #-} !Int       -- ^ The variable to case on.
      (Map Key (DiscrimTree a)) -- ^ The proper branches.
      (DiscrimTree a)           -- ^ A further tree, which should always be explored.
  deriving (Generic, Eq, Show)
The easiest cases are merging the empty tree against anything, and
merging a DoneDT
against (an empty tree or) another
DoneDT
.
mergeDT EmptyDT x = x
mergeDT (DoneDT s) x = case x of
  EmptyDT   -> DoneDT s
  DoneDT s' -> DoneDT (s <> s') -- set union
What if we’re merging a DoneDT
against a CaseDT
,
like in the C-any
example above? That’s what the “and-also” branch,
the third field of CaseDT
, is for.
  CaseDT i bs els -> CaseDT i bs (mergeDT (DoneDT s) els)
If we have a CaseDT
on the left, things get really interesting. The
first two cases are symmetric:
mergeDT (CaseDT i bs els) x = case x of
  EmptyDT  -> CaseDT i bs els
  DoneDT s -> CaseDT i bs (mergeDT (DoneDT s) els)
If we’re merging a legitimate case
against another, there
are two possibilities: if they case on the same thing, we can merge the
corresponding branches from each, and if they don’t, we make an
arbitrary choice of which should be on the outside, and which should be
pushed into the “and-also” branch.
  CaseDT j bs' els' -> case compare i j of
    EQ -> CaseDT j (Map.unionWith mergeDT bs bs') (mergeDT els els')
    LT -> CaseDT i bs  (mergeDT els  (CaseDT j bs' els'))
    GT -> CaseDT j bs' (mergeDT els' (CaseDT i bs  els))
In our implementation, we keep the match against the numerically earliest variable on the outside.
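For concreteness, the same merging algorithm transcribed into a small Python model, with trees as `("empty",)`, `("done", set)`, or `("case", i, branches, and_also)` — the representation is invented for the sketch, but the case analysis mirrors the Haskell above:

```python
EMPTY = ("empty",)

def done(*names):
    return ("done", frozenset(names))

def merge(a, b):
    if a == EMPTY:
        return b
    if b == EMPTY:
        return a
    if a[0] == "done" and b[0] == "done":
        return ("done", a[1] | b[1])           # set union
    if a[0] == "done":                         # ensure the case is on the left
        a, b = b, a
    _, i, branches, els = a
    if b[0] == "done":
        # Done vs Case: push the results into the "and also" branch.
        return ("case", i, branches, merge(b, els))
    _, j, branches2, els2 = b
    if i == j:
        # Same variable: merge the branch maps pointwise, and the "and also"s.
        keys = branches.keys() | branches2.keys()
        both = {k: merge(branches.get(k, EMPTY), branches2.get(k, EMPTY))
                for k in keys}
        return ("case", i, both, merge(els, els2))
    if i > j:                                  # keep the earliest variable outside
        a, b = b, a
        _, i, branches, els = a
    return ("case", i, branches, merge(els, b))

# The case trees for CUnit and CAny from the "True overlap" section:
cunit = ("case", 0, {("C", 1): ("case", 0, {("⊤", 0): done("CUnit")}, EMPTY)}, EMPTY)
cany  = ("case", 0, {("C", 1): done("CAny")}, EMPTY)
```

Merging `cunit` with `cany` puts `done("CAny")` into the “and also” slot under the C¹ branch, next to the ⊤⁰ branch — the unresolvable overlap from the `C` example.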
The upshot
To justify all this complexity, we need some performance numbers. To this effect, I added a ton more profiling points to the implementation of instance search, and checked the 1Lab with and without discrimination trees. The result was a fairly modest, but consistent, improvement in type-checking performance. Overall, we gained 10 seconds:
| Profiling point                          | 2.6.4.3   | 2.6.5     |
|------------------------------------------|-----------|-----------|
| Total                                    | 703,384ms | 693,122ms |
| Typing                                   | 48,922ms  | 50,900ms  |
| Typing.InstanceSearch                    | 918ms     | 852ms     |
| Typing.InstanceSearch.FilterCandidates   | 64,238ms  | 34,063ms  |
| Typing.InstanceSearch.InitialCandidates  | 1,954ms   | 7,546ms   |
| Typing.OccursCheck                       | 44,405ms  | 30,683ms  |
Looking at the table above, what sticks out is that the time spent discarding candidates was halved. We pay for this improvement by spending more time computing the initial candidate set. While 7s is a big number, the 1Lab has 4,695,879 instance goals, so, on a per-goal basis, we didn’t get that much slower. Interestingly, it seems that a significant portion of the occurs checker’s runtime — the part of the elaborator responsible for checking that metavariables do not appear in their own solutions — was also being spent in instance filtering.
Is this work particularly novel? No, not exactly — but, even though it’s only a small component of instance search as a whole, we can finally say that Agda now matches the state of the art in… looking up things in collections indexed by terms? Oh, well. Thanks for reading — I’m happy to answer any questions about this work on Mastodon.