Translating mathematics into code that a proof assistant can understand
is a process that involves a fair bit of friction. There’s quite a big
difference between what humans are willing to suspend disbelief about,
and what a type checker will actually accept without proof. Reducing
this friction is the primary task of the authors of a proof assistant
— but, of course, it would be infeasible to modify the proof assistant
*itself* to cover a new problem domain every time one comes up. To this
end, we provide users with various mechanisms for automation, through
which users can encode their domain-specific knowledge.

Arguably one of the oldest forms of automation, **type classes**
originated in Haskell, all the way back in the 80’s. The idea is simple
enough: we can define *predicates* on types (the **classes**) and
*clauses* for solving those predicates (the **instances**) — the
connection with logic programming is not superficial. By “recording the
execution” of this logic program, we can get the elaborator to
automatically fill in terms, according to rules we specify.

In Agda, **instance search** has historically been a pretty simple
process: like GHC, we do not support *backtracking* when looking for a
given instance. Moreover, the implementation itself has cut some serious
corners. It works, but it’s not particularly clever, or particularly
*efficient*. This post details the first step taken in our journey to
bring the implementation of instance search up to parity with what users
of proof assistants expect in the 21st century.

# An example

To start with, let’s actually present an instance (ha) of the problem,
so that we have some code to refer back to. When doing mathematics (or
computation!) in a proof assistant, we often care about *decidability*:
for a particular proposition `P`, do we have a program that can tell us
whether `P` holds or fails? Keeping track of decidability is valuable even
when working in classical mathematics, since the automation *around* a
proof must obviously be executable — and *it* will use certain cases
of decidability, e.g., for looking things up in a list. It’s also
*really boring*, and generally extremely formulaic, to keep track of.

Instance search is the perfect solution for automating the boring cases
of proving things are decidable — when we can put together a decision
procedure simply by the *syntax* of the type we’re asking about. A basic
setup would look something like this:

```
data Dec {l} (A : Type l) : Type l where
  yes : A → Dec A
  no  : ¬ A → Dec A

instance -- The implementation of each instance is immaterial
  DecYes : Dec ⊤
  DecNo  : Dec ⊥
```

Agda allows us to do instance search for any data or record type, so we
can directly use `Dec` as a “type class”. The two definitions in the
`instance` block are the two trivial cases of putting together a
decision procedure: the unit type always holds, and the empty type
always fails. We can then *use* these instances by wrapping certain
function arguments in double braces: `⦃ ... ⦄`, or the ASCII
`{{ ... }}`. A simple use-case for the setup above is a function that
calls the decision procedure for a type and, if it succeeds, produces a
proof:

```
is-yes : ∀ {l} {A : Type l} → Dec A → Type
is-yes (yes _) = ⊤
is-yes (no _)  = ⊥

decide : ∀ {l} {A : Type l} ⦃ prf : Dec A ⦄ {_ : is-yes prf} → A
decide ⦃ yes p ⦄ = p

_ : ⊤
_ = decide -- boring!
```

Instance search gets more interesting in general when we consider
instances *with instance arguments*. In our example, they allow us to
implement decision procedures for type *formers*, where decidability of
the overall thing boils down to decidability of components. For example,
decidability is closed under negation and formation of product types:

```
instance
  DecNot  : ⦃ _ : Dec A ⦄ → Dec (¬ A)
  DecProd : ⦃ _ : Dec A ⦄ ⦃ _ : Dec B ⦄ → Dec (A × B)
```

A particularly common class of decision procedures are those for
*equality*. This is the type-theoretic version of Haskell’s `Eq`
class, and, correspondingly, it shows up quite a lot when writing
metaprograms. We can also use instance search to handle decidable
equality — the code below sets up some more instances. It also
demonstrates that we can use instance search for anything that
*evaluates to* a data (or record) type, not just applications of those
literal types.

```
Discrete : ∀ {l} → Type l → Type l
Discrete A = {x y : A} → Dec (x ≡ y)

instance
  DiscreteNat  : Discrete Nat
  DiscreteInt  : Discrete Int
  DiscreteProd : ⦃ _ : Discrete A ⦄ ⦃ _ : Discrete B ⦄ → Discrete (A × B)

_ : ¬ (2 + 2 ≡ 5)
_ = decide -- slightly more interesting
```

## Detailing the problem

To discuss Agda’s actual instance search algorithm, it’ll be best if I first write out the terms of art that will come up.

The algorithm we’re discussing is **instance search**. We’ll refer to the problems it solves as **instance goals**, since the term “instance constraint” might get conflated with the instance *context*, defined below, and “`FindInstance` constraint” is a bit verbose.

An **instance** is a declaration annotated with the `instance` keyword. Following GHC’s terminology, we will split the type of each instance into a **head** and a **context**:

```
instance
  Discrete-List
    : ∀ {l} {A : Type l} ⦃ da : Discrete A ⦄
    --                   ^^^^^^^^^^^^^^^^^^^ the context
    → Discrete (List A)
    -- ^^^^^^^^^^^^^^^^ the head
```

The head furthermore decomposes as a **head symbol** and a **spine**. The head symbol is the name of the “type class” we’re looking for: it could be a data type, a record type, or a postulate. The spine consists of the arguments to the head symbol. For `Discrete-List`, the head symbol is `Dec`, since `Discrete` is a definition and can reduce.

The **instance table** is some data structure that, given the type of an instance goal, produces a list of all instances that could match that goal.

A **candidate** for an instance goal is either an instance with matching head symbol, or a local variable with `⦃...⦄` visibility drawn from the context. For example, in the body of `Discrete-List`, the possible candidates for a `⊢ Dec ?0` goal would include all the instances (like `DecYes`), but also the local variable `da`.

Once we’re given an instance goal, with an identifiable head symbol, we can actually try to solve it. The procedure is simple:

1. The first thing to do is to verify that instance search even applies to this goal. This is a basic check on the shape of the type: it must be of the form `∀ {X Y Z} ⦃ _ : C X ⦄ ⦃ _ : D Y ⦄ → T x y z`, where `T` is one of those head symbols we mentioned before — a data type, a record type, or a postulate.

2. We then build a list of initial candidates, based on the definitions in the instance table and the variables in the context. The list of initial candidates is an *over*approximation of the set of all possible solutions — it’s possible, at this step, that some candidates are ill-typed.

3. We *narrow* those initial candidates to those that could actually solve the instance goal. This is the most complicated — and most expensive! — step. A candidate could be rejected for a variety of reasons:

   - It might be in the instance table despite not being in scope.
   - It might be in scope only under a qualified name, and the user has asked us not to use qualified names for instances.
   - It might *not even be type-correct*, since we consider all the local `⦃...⦄`-variables as candidates.

   However, unlike in Lean (but like in GHC), this narrowing does not do any *backtracking*. That is, when considering whether a candidate might solve a goal, we only consider its *head*, and not its context.

4. If we’re left with a single candidate, the instance goal is solved, and we’re technically done. If there are *no* candidates, that’s an immediate failure. However, if there are *multiple* candidates, we don’t immediately report a failure. The type of the instance goal might be insufficiently determined, e.g. because of a metavariable, so we set this particular instance goal aside to revisit later. The second time we try it, if there’s still overlap, we report the failure to the user.
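Schematically, the tail end of this procedure is a filter followed by a three-way decision. The following is a toy sketch, not Agda's actual code; `Outcome`, `narrow`, and the predicate `headMatches` are hypothetical names, with `headMatches` standing in for the expensive narrowing check:

```haskell
-- A toy model of the final decision in instance search. A real
-- candidate is a term with a type; a 'String' stands in for one here.
data Outcome = Solved String | Failed | Postponed [String]
  deriving (Eq, Show)

-- 'headMatches' abstracts the narrowing check, which looks only at a
-- candidate's head, never its context (no backtracking).
narrow :: (String -> Bool) -> [String] -> Outcome
narrow headMatches initial =
  case filter headMatches initial of
    [c] -> Solved c      -- unique candidate: the goal is solved
    []  -> Failed        -- no candidates: immediate failure
    cs  -> Postponed cs  -- overlap: set aside, a meta might fix it later
```

The `Postponed` constructor models setting the goal aside to revisit once more metavariables are solved.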

In most cases, the expensive part of executing a logic program *is the
backtracking*: the combinatorial explosion in cases to consider requires
complicated algorithms to deal with. It might then be surprising to
learn that Agda’s instance search can often be *quite* slow!

The problem is that discarding the ill-typed instance candidates in the
narrowing step can be quite expensive. Essentially, it’s a recursive
invocation of the type checker, with all of *its* complexity. If the set
of initial candidates is big, we may spend quite a long time just in the
narrowing phase! To motivate this complexity, note that Agda *accepts*
the following program:

```
data T : Type where
  foo bar : T

g : Bool → T
g false = foo
g true  = bar

postulate
  Cls : T → Type

instance
  Inst : ∀ {x} → Cls (g x)

t1 : ⦃ _ : Cls foo ⦄ → Type
t1 ⦃ x ⦄ = ⊤

t2 : t1
t2 = tt
```

To know that `t2 : t1` is justified, we have to know that `Inst` — the
only possible candidate, in this case — actually checks at the goal
type `Cls foo`. In this case, that involves the reduction of `g` *and*
applying injectivity. While it’s not a lot of work here, it’s easy to
imagine situations where Agda must do quite a bit of work before it can
*discard* a candidate.

## The jumpscare

In the vocabulary, the instance *table* was left purposefully
underspecified. That’s because it’s secretly the star of today’s show!
It’s also because it would have given away the ending way too soon:
foreshadowing, not spoilers. And now, the shocking conclusion!

*Up through Agda 2.6.4.3, the instance table is only indexed by the head
symbol.*

You read that right! See for yourself, if you find it hard to believe.
This means that, *at each instance goal*, we must discard in turn *every
single instance of that class*. And, just to stress this point,
discarding instance candidates can involve potentially unbounded work.
Yeesh.

It’s not a secret that this is something we needed to improve; we’ve known this for a while. In Jesper’s blog post about the improvements they would like to see in 2024, it was point 4. So why did it take so long to solve?

I think the reason is essentially lack of shininess. This is a pure
engineering problem. The heading in Jesper’s blog post
*includes the solution* — someone just had to find the time (and
spoons) to make it happen. It’s not the type of work that you could
reasonably turn into a paper, especially after the Lean folks did such
a stellar job, and it’s not the type of work that usually catches my
eye.

But, catch my eye it (eventually) did, and I think that indexing things by terms is a common enough problem that it falls on me (having solved the problem) to write up my solution — so that future compiler engineers have something to refer to for this bit of folklore.

# Discrimination trees

A **discrimination tree** is an associative data structure where the
indices are some sort of term — in our case, terms of Agda’s type
theory. Despite the fancy name, it’s basically a trie: you traverse the
term in some (consistent) order to turn it into a list of *keys*, which
you then look up in the trie.

Note that this process doesn’t exactly extend to higher-order terms, so
there’s an unavoidable amount of *inexactness* when looking up things
in a discrimination tree. We may also choose to *introduce* some
inexactness, when being exact would be prohibitively expensive, by
deliberately excluding some classes of terms from being represented as
keys.

Remember, our goal is to produce a small set of initial candidates, so that we don’t spend a lot of time discarding candidates. But it’d be a total waste to do this if turning terms into keys was too expensive! The core trade-off is the following: adding more types of key will result in smaller candidate sets, but we’ll spend longer producing those keys.

Anyway, what exactly goes into a key? A good start is the following:

```
data Key
= RigidK {-# UNPACK #-} !QName {-# UNPACK #-} !Int
-- ^ Rigid symbols (constructors, data types, record types,
-- postulates) identified by a QName.
--
-- The 'Int' argument is the *arity* of the symbol as we encountered
-- it.
| FlexK
-- ^ Literally anything else, treated as a wildcard.
```

This representation is extremely coarse, but it serves to convey the key
ideas. When traversing a term, if we see a rigid symbol (as in the
comment), we can turn it into a `RigidK`. Literally everything
else — lambda abstractions, function types, universe levels,
metavariables, local variables — will get turned into a `FlexK`.

One important thing to keep in mind is that, to tell if a term is rigid,
we need to do reduction — remember the `Discrete` vs. `Dec` instance
head? If we didn’t reduce, then we would mistakenly have a `RigidK`
for `Discrete`, which would cause us to miss `Discrete-List` if the goal
was instead phrased as `Dec (Path (List A) ...)`. Not good!

The main workhorse is the function `splitTermKey`, found here, which
simplifies to the following:

```
splitTermKey :: Term -> TCM (Key, [Term])
splitTermKey term = do
  hnf <- reduce term
  case hnf of
    Def head args ->
      if head is rigid -- some nonsense about blockers...
        then return (RigidK head (length args), args)
        else return (FlexK, [])
    _ -> return (FlexK, [])
```

As the name implies, `splitTermKey` splits a term into a key and
whatever arguments the rigid symbol had. The full traversal is then
elegantly implemented as a tail-recursive function. Full disclosure: the
implementation up to this point is heavily inspired by how it’s done in
Lean 4, but it’ll diverge in the next section.

```
termToKeys :: Term -> TCM [Key]
termToKeys t = loop [] [t] where
  loop acc (focus:rest) = do
    (key, args) <- splitTermKey focus
    loop (key:acc) (args ++ rest)
  loop acc [] = return (reverse acc)
```

The idea of the loop is that we have a term-to-inspect (the “focus”) and a number of other term parts that we should get to next — the “rest”. We split the focus into a key and its arguments, record the key, and “push” the arguments in front of the rest.
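To make the traversal concrete, here is a runnable sketch over a hypothetical first-order term type. Agda's real `Term` is far richer, and the real `splitTermKey` runs in `TCM` so it can reduce; the primed names mark this as a toy:

```haskell
-- A toy model of the key traversal. 'Other' stands for anything
-- flexible: universe levels, metavariables, lambdas, and so on.
data Term = Def String [Term] | Other
  deriving Show

data Key = RigidK String Int | FlexK
  deriving (Eq, Show)

-- Pure stand-in for splitTermKey: no reduction, no blockers.
splitTermKey' :: Term -> (Key, [Term])
splitTermKey' (Def f args) = (RigidK f (length args), args)
splitTermKey' Other        = (FlexK, [])

termToKeys' :: Term -> [Key]
termToKeys' t = loop [] [t] where
  loop acc (focus : rest) =
    let (key, args) = splitTermKey' focus
    in loop (key : acc) (args ++ rest)
  loop acc [] = reverse acc
```

Running `termToKeys' (Def "Dec" [Other, Def "Unit" []])` yields `[RigidK "Dec" 2, FlexK, RigidK "Unit" 0]`, mirroring the `Dec, *, ⊤` trace below (with `Unit` standing in for `⊤`).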

Tracing the execution for a particularly simple case might look like the
following. I’ll write Agda terms in Agda syntax in these
`⌜ fancy corners ⌝`.

```
loop [] [ ⌜ Dec {lzero} ⊤ ⌝ ]
```

The first thing we do is split the focus into a key and the arguments.
`Dec` is a rigid symbol, so we get a rigid key. This key is pushed onto
the accumulator and we move on to the arguments:

```
  -> splitTermKey ⌜ Dec {lzero} ⊤ ⌝
  -> Dec, [ ⌜ lzero ⌝ , ⌜ ⊤ ⌝ ]
```

The next thing to look at is the first argument of `Dec`, which was
`lzero`. Since universe levels are not `Def`s, they’re not represented
in our keys, and we get a “wildcard” `FlexK`.

```
loop [ Dec ] [ ⌜ lzero ⌝ , ⌜ ⊤ ⌝ ]
  -> splitTermKey ⌜ lzero ⌝
  -> *, []
```

Moving on again, we do the same thing for the unit type. It’s a record type, so it’s rigid, and we get an actual key. It has no arguments, though, so at this point, we’re done!

```
loop [ *, Dec ] [ ⌜ ⊤ ⌝ ]
  -> splitTermKey ⌜ ⊤ ⌝
  -> ⊤, []
loop [ ⊤, *, Dec ] []
  -> returns [ Dec, *, ⊤ ]
```

The keys that correspond to the instance head `Dec {lzero} ⊤` are
`Dec, *, ⊤`. We now know how to store our instances in the
discrimination trie!
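Taken literally, “storing instances in a trie” could look like the following sketch, with instance names as values. This is a hypothetical, naive representation, not what Agda ends up using, and the exact-match `lookupT` glosses over treating `FlexK` as a wildcard:

```haskell
import qualified Data.Map.Strict as Map

data Key = RigidK String Int | FlexK
  deriving (Eq, Ord, Show)

-- A trie from key sequences to sets of instance names.
data Trie = Trie [String] (Map.Map Key Trie)
  deriving Show

emptyT :: Trie
emptyT = Trie [] Map.empty

insertT :: [Key] -> String -> Trie -> Trie
insertT [] v (Trie vs bs)       = Trie (v : vs) bs
insertT (k : ks) v (Trie vs bs) =
  Trie vs (Map.alter (Just . insertT ks v . maybe emptyT id) k bs)

-- Exact lookup: follow each key in turn. A real lookup would also have
-- to explore wildcards on both sides.
lookupT :: [Key] -> Trie -> [String]
lookupT [] (Trie vs _)       = vs
lookupT (k : ks) (Trie _ bs) = maybe [] (lookupT ks) (Map.lookup k bs)
```

For example, inserting `DecYes` under `[RigidK "Dec" 2, FlexK, RigidK "Top" 0]` and looking up the same key sequence returns `["DecYes"]`.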

## Trouble in paradise

Now consider *looking up* instances, to solve a goal. For example, we
might be trying to solve `Dec (4 ≡ 5)`, which arises in the example
involving decidable equality above. We can turn this term into a key:

```
termToKeys ⌜ Dec (4 ≡ 5) ⌝ =
  [ Dec                              -- the class itself
  , *                                -- the universe level
  , _≡_                              -- the identity type
  , *                                -- another level
  , Nat                              -- the type of 4 and 5
  , suc, suc, suc, suc, zero         -- the number 4
  , suc, suc, suc, suc, suc, zero ]  -- the number 5
```

This utterly sucks, and not just for the obvious reason. To turn the
term into a key, we had to look at *every subterm*, and *reduce* all of
them, even though we know that there’s only one possible instance by the
time we get to `Nat`! When the numbers are 4 and 5, this isn’t an issue,
but imagine how embarrassing it would be *not* to solve the goal
`Dec (0 ≡ (2 ^ 2048))` because we got stuck generating a key with
eleventy quadspillion `suc`s?

There’s also another, hidden problem. The keys are bogged down with a lot more wildcards than we could possibly ever need. For example:

```
instance
  extensionality-uncurry
    : ∀ {ℓ ℓ' ℓ''} {A : Type ℓ} {B : A → Type ℓ'} {C : Type ℓ''}
    → Extensionality (Σ A B → C)
  -- key: Extensionality, *, Π, Σ, *, *, *, *, *
```

Taking these keys literally would lead to a pretty significant amount of
pointer-chasing when we’re following a deep sequence of `FlexK`s,
though this pales in comparison to the amount of extra normalisation we
would have to do.

I thought about these two problems for a literal entire week, and only
arrived at a solution on a day I accidentally took two doses of my ADHD
medication. Very elegantly, the solution to *not storing eighty
wildcards* turns out to be precisely the information we need to *avoid
normalising literally everything*!

# Case trees

In Agda, pattern-matching functions are efficiently represented using
*case trees*. I promise this digression is going somewhere. Consider
this slightly silly function:

```
data Foo : Type where
  foo bar baz : Foo

nat-to-foo : Nat → Foo
nat-to-foo 0 = foo
nat-to-foo 1 = bar
nat-to-foo _ = baz
```

Internally, we represent `nat-to-foo` like this. The `@0` is a de Bruijn
level: in this context, it means the first argument to `nat-to-foo`. We
have a tree indicating exactly what we have to evaluate:

```
nat-to-foo = case @0 of
  zero → foo
  suc → case @0 of
    zero → bar
    _ → baz
```

A crucial detail here is that we case on the first argument *twice*. The
inner case tree is oblivious to its enclosing (evaluation) context.
Therefore, *if* the `suc`

pattern matches, the first argument is
*replaced with the predecessor*. This is a bit technical, but it has a
clear interpretation in terms of contexts! If we have something like

```
data Bar : Type where
  mk : Nat → Nat → Bar
```

then, in the function

```
use : Bar → …
use x = ?
```

matching on `x` will *replace* the variable `x` in the context by two
new variables, corresponding to the arguments of the `mk` constructor:

```
use (mk n m) = ?
```

This procedure, *replacing* a matched term by its subterms, will be
pretty important in understanding the case trees below: while nested
case trees may *look like* they’re matching on the same argument
repeatedly, or matching on something that doesn’t exist, if you trace
the evolution of the “argument list” through the case trees, it’s
possible to work out exactly what subterm is being matched on.
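As a sketch of this evaluation discipline, here is a tiny case-tree evaluator in which a matched argument is literally replaced by its subterms. The names are illustrative, and real Agda case trees have many more constructors:

```haskell
import qualified Data.Map.Strict as Map

-- First-order constructor values, e.g. suc (suc zero).
data Val = Con String [Val]

data CaseTree
  = Done String  -- a right-hand side
  | Case Int (Map.Map String CaseTree) (Maybe CaseTree)
    -- ^ scrutinee position, branches, optional catch-all

eval :: [Val] -> CaseTree -> Maybe String
eval _    (Done r) = Just r
eval args (Case i bs def) =
  let (before, Con c fields : after) = splitAt i args
  in case Map.lookup c bs of
       -- on a match, the scrutinee is *replaced* by its fields
       Just br -> eval (before ++ fields ++ after) br
       Nothing -> def >>= eval args
```

Encoding `nat-to-foo` with `zero`/`suc` constructors, the inner `Case 0` really does inspect position 0 again: after the `suc` branch fires, position 0 holds the predecessor.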

## Case trees for discrimination

While the humble case tree was originally conceived to represent pattern
matching, it’s actually *exactly* the solution to our problems with the
literal implementation of discrimination trees as tries.

We might attempt a more literal solution (and I did think about it):
“run-length encoding” the sequences of wildcards in the trie
representation. When *looking up* an instance, the number of wildcards
to skip is exactly the index of the next subterm that we’ll have to turn
into a key. Anything that gets skipped doesn’t have to be forced!

But assuming that we’ll force everything, and marking what needs to be
skipped, is empirically backwards: in practice, *very few* term
positions need to be forced to narrow to a single potential instance. So
we choose instead to record the index of the next subterm *to inspect*
— put another way, the next *variable to case on!*

So what do these “instance-case” trees actually look like in practice?
We can ask Agda to dump them by passing the `-vtc.instance.discrim:60`
verbosity flag. Using the `Dec` and `Discrete` instances we have
written so far, the instance table looks like this.

```
case 0 of
  Dec² → case 1 of
    ⊤⁰ → done {DecYes}
    ⊥⁰ → done {DecNo}
    Σ⁴ → case 4 of
      Const → done {DecProd}
    Pi → case 2 of
      ⊥⁰ → done {DecNot}
    _≡_⁴ → case 2 of
      Int⁰ → done {DiscreteInt}
      Nat⁰ → done {DiscreteNat}
      Σ⁴ → case 5 of
        Const → done {DiscreteProd}
```

Here, we can see a few leaks of the `Key` data type that we actually use
in Agda. First, it represents Π-types using a `PiK` key, printed as
`Pi`. Second, it has a special case for constant functions — that’s
`ConstK`, and it’s printed as `Const`. Oh, by the way, the high
verbosity number means that this flag will print quite a lot of
stuff — and, indeed, it’ll print *the entire instance table* for
*every single instance goal*. You were warned.

## Finding instances

So, armed with the case tree above, how are we to find the candidates
for, e.g., `Dec (4 ≡ ack 4 4)`? First, we start, once again, by putting
the term to be matched in a list. We’ll call this list the **spine**.

```
match [ ⌜ Dec {lzero} (_≡_ {lzero} {Nat} 4 (ack 4 4)) ⌝ ] (case 0 of ...)
```

We have to case on the first argument, the element with index zero on
the spine. Here, casing means calling our good friend `splitTermKey` to
find out which branch to take. Dutifully, it splits `Dec` into its
arguments — and we have a branch for `Dec`, so our next state is

```
match [ ⌜ lzero ⌝ , ⌜ _≡_ {lzero} {Nat} 4 (ack 4 4) ⌝ ] (case 1 of ...)
```

We now case on the *second* argument, the element with index 1; since
the identity type is data, this also succeeds, and we shift into

```
match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ Nat ⌝ , ⌜ 4 ⌝ , ⌜ ack 4 4 ⌝ ] (case 2 of ...)
```

The next thing we case on has index 2, so it’s the third thing on the
list: it’s `Nat`, which is rigid and has *no* arguments. We shift to

```
match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ 4 ⌝ , ⌜ ack 4 4 ⌝ ] (done {DiscreteNat})
```

but this is a done state! We know what the instance we’re looking for
is. Though, note the slight oddity that results from matched patterns
being replaced with their subterms: `Nat` is gone. Anyway, please note
that, while the literal reading of *turn a term into a key* would’ve
gotten stuck exploding `ack 4 4`, the case-tree representation doesn’t
touch it at all!

We’ve solved both the problem of having long paths of wildcards *and*
the problem of having to potentially force every subterm. Nice!

### Meandering around

This case-tree stuff is all well and good, but what about finding
instances that *aren’t* fully determined? What if our instance goal was,
for example, `Discrete ?0`, where we don’t know the type we’re
comparing? This is where discrimination tree lookup diverges from
pattern matching. While applying a function to a metavariable is just
stuck, in instance lookup, a metavariable means that we should take
*every possible opportunity*.

This is the job of the `explore` state in the lookup function. If, at
any point, we `case` on a metavariable, instead of getting stuck, we
pretend, for each pattern in turn, that the metavariable was actually
that all along! For example, *this* state from before

```
match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ ?0 ⌝ , ⌜ x ⌝ , ⌜ y ⌝ ] (case 2 of ...)
```

would result, after a trip through `explore`, in the same results as
*all three of*

```
match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ x ⌝ , ⌜ y ⌝ ]
  (done {DiscreteNat})

match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ x ⌝ , ⌜ y ⌝ ]
  (done {DiscreteInt})

match [ ⌜ lzero ⌝ , ⌜ lzero ⌝ , ⌜ ?1 ⌝ , ⌜ ?2 ⌝ , ⌜ ?3 ⌝ , ⌜ ?4 ⌝ , ⌜ x ⌝ , ⌜ y ⌝ ]
  (done {DiscreteProd})
```

Note that in the first two cases, the metavariable was simply removed
from the spine, since `Nat` and `Int` both have arity zero. In the
third case, we had to invent four new metas to stand in its place,
since `Σ` has *four* arguments. Knowing how many placeholders we should
insert is the reason that we have the arity in `RigidK`.
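Putting the whole lookup together, below is a self-contained toy model of `match`. It is a sketch, not Agda's real code; in particular, I'm assuming the “and-also” tree is matched against the unmodified spine, and a metavariable in focus triggers the `explore` behaviour by inventing fresh metas for each branch's arguments:

```haskell
import qualified Data.Map.Strict as Map

-- Toy terms: a rigid head applied to arguments, or a metavariable.
data Term = Def String [Term] | Meta

data Key = RigidK String Int
  deriving (Eq, Ord, Show)

data DiscrimTree a
  = EmptyDT
  | DoneDT [a]
  | CaseDT Int (Map.Map Key (DiscrimTree a)) (DiscrimTree a)

match :: [Term] -> DiscrimTree a -> [a]
match _     EmptyDT     = []
match _     (DoneDT vs) = vs
match spine (CaseDT i bs els) =
  let (before, focus : after) = splitAt i spine
      proper = case focus of
        -- 'explore': pretend the meta was each rigid key in turn,
        -- splicing in fresh metas for that key's arguments
        Meta -> concat
          [ match (before ++ replicate ar Meta ++ after) br
          | (RigidK _ ar, br) <- Map.toList bs ]
        -- otherwise, take the matching branch (if any), replacing
        -- the focus by its subterms
        Def f args -> maybe [] (match (before ++ args ++ after))
          (Map.lookup (RigidK f (length args)) bs)
  in proper ++ match spine els  -- the "and also" tree always runs
```

On the `CUnit`/`CAny` tree from the next section, matching either `C ⊤` or `C ?0` against this model finds both instances.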

### True overlap

The example from the previous section, where the *goal* had a
metavariable, is one source of overlap. However, it’s quite benign: we
know all the metavariables that caused us to go `explore`ing, so we know
which metavariables need to be solved to make progress on the overlap.
However, there are cases where overlap is unresolvable at the level of
the discrimination tree:

```
postulate
  C : Type → Type

instance
  CUnit : C ⊤
  CAny  : ∀ {A} → C A
```

The `CUnit` instance wants to case on the first argument twice: its
case tree is

```
case 0 of
  C¹ → case 0 of
    ⊤⁰ → done {CUnit}
```

But `CAny` does *not* do any matching on the subterm of `C`. Its case
tree looks like

```
case 0 of
  C¹ → done {CAny}
```

How do we represent this? Well, in traditional pattern matching, we
*can’t*. A wildcard pattern is only matched if none of the patterns
above match, but, if we’re looking up `C ⊤`, we should look up
*both* instances: the `CAny` instance is *not* simply a “fallback”
after `C ⊤`; it really does overlap. To this effect, our case trees have
an optional “and also” subtree: we evaluate the matching case (if any)
*and also* that case tree, totally independently from the other cases.

However, for ease of skimming, I elected to print the “and also” case tree as though it were a wildcard pattern: it just scans better. Just keep in mind that, when Agda says

```
case 0 of
  C¹ → case 0 of
    ⊤⁰ → done {CUnit}
    * → done {CAny}
```

the `* → done {CAny}` branch will always be evaluated.

## Merging case trees

Where does true overlap come from? Merging case trees. It’s inevitable
that we would come to this — after all, you *can* import instances,
and you *can* import multiple modules, so we would have to deal with
merging the instance tables of multiple modules anyway. However, the
algorithm is actually pretty simple!

First, let me transcribe the actual data type for case trees, just so we’re on the same page.

```
data DiscrimTree a
  = EmptyDT
  -- ^ The empty discrimination tree.
  | DoneDT (Set a)
  -- ^ Succeed with a given set of values.
  | CaseDT
  -- ^ Do case analysis on a term.
    {-# UNPACK #-} !Int        -- ^ The variable to case on.
    (Map Key (DiscrimTree a))  -- ^ The proper branches.
    (DiscrimTree a)            -- ^ A further tree, which should always be explored.
  deriving (Generic, Eq, Show)
```

The easiest cases are merging the empty tree against anything, and
merging a `DoneDT` against (an empty tree or) another `DoneDT`.

```
mergeDT EmptyDT x = x
mergeDT (DoneDT s) x = case x of
  EmptyDT   -> DoneDT s
  DoneDT s' -> DoneDT (s <> s') -- set union
```

What if we’re merging a `DoneDT` against a `CaseDT`, like in the
`CAny` example above? That’s what the “and-also” branch, the third
field of `CaseDT`, is for.

```
  CaseDT i bs els -> CaseDT i bs (mergeDT (DoneDT s) els)
```

If we have a `CaseDT` on the left, things get really interesting. The
first two cases are symmetric:

```
mergeDT (CaseDT i bs els) x = case x of
  EmptyDT  -> CaseDT i bs els
  DoneDT s -> CaseDT i bs (mergeDT (DoneDT s) els)
```

If we’re merging a legitimate `case` against another, there are two
possibilities: if they case on the same thing, we can merge the
corresponding branches from each, and if they don’t, we make an
arbitrary choice of which should be on the outside, and which should be
pushed into the “and-also” branch.

```
  CaseDT j bs' els' -> case compare i j of
    EQ -> CaseDT j (Map.unionWith mergeDT bs bs') (mergeDT els els')
    LT -> CaseDT i bs (mergeDT els (CaseDT j bs' els'))
    GT -> CaseDT j bs' (mergeDT els' (CaseDT i bs els))
```

In our implementation, we keep the match against the numerically earliest variable on the outside.
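To see the “and-also” branch appear, we can run a self-contained version of the merge on the `CUnit`/`CAny` trees from earlier. This is a sketch: lists instead of `Set`, `String` keys instead of `QName`s, and only the cases shown above:

```haskell
import qualified Data.Map.Strict as Map

data Key = RigidK String Int
  deriving (Eq, Ord, Show)

data DiscrimTree a
  = EmptyDT
  | DoneDT [a]
  | CaseDT Int (Map.Map Key (DiscrimTree a)) (DiscrimTree a)
  deriving (Eq, Show)

mergeDT :: DiscrimTree a -> DiscrimTree a -> DiscrimTree a
mergeDT EmptyDT x = x
mergeDT (DoneDT s) x = case x of
  EmptyDT         -> DoneDT s
  DoneDT s'       -> DoneDT (s <> s')
  CaseDT i bs els -> CaseDT i bs (mergeDT (DoneDT s) els)
mergeDT (CaseDT i bs els) x = case x of
  EmptyDT           -> CaseDT i bs els
  DoneDT s          -> CaseDT i bs (mergeDT (DoneDT s) els)
  CaseDT j bs' els' -> case compare i j of
    EQ -> CaseDT j (Map.unionWith mergeDT bs bs') (mergeDT els els')
    LT -> CaseDT i bs (mergeDT els (CaseDT j bs' els'))
    GT -> CaseDT j bs' (mergeDT els' (CaseDT i bs els))
```

Merging `CUnit`'s tree with `CAny`'s hits the `EQ` case at the root, and then the `DoneDT`-against-`CaseDT` case under the `C¹` branch, which is exactly where `done {CAny}` gets pushed into the and-also slot.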

# The upshot

To justify all this complexity, we need some performance numbers. To this effect, I added a ton more profiling points to the implementation of instance search, and checked the 1Lab with and without discrimination trees. The result was a fairly modest, but consistent, improvement in type-checking performance. Overall, we gained 10 seconds:

| Profiling point | 2.6.4.3 | 2.6.5 |
|---|---|---|
| Total | 703,384ms | 693,122ms |
| Typing | 48,922ms | 50,900ms |
| Typing.InstanceSearch | 918ms | 852ms |
| Typing.InstanceSearch.FilterCandidates | 64,238ms | 34,063ms |
| Typing.InstanceSearch.InitialCandidates | 1,954ms | 7,546ms |
| Typing.OccursCheck | 44,405ms | 30,683ms |

Looking at the table above, what sticks out is that the time spent
discarding candidates was *halved*. We pay for this improvement by
spending more time computing the initial candidate set. While 7s is a
big number, the 1Lab has *4,695,879 instance goals*, so, on a per-goal
basis, we didn’t get that much slower. Interestingly, it seems that a
significant portion of the occurs checker’s runtime — the part of the
elaborator responsible for checking that metavariables do not appear in
their own solutions — was also being spent in instance filtering.

Is this work particularly novel? No, not exactly — but, even though
it’s only a small component of instance *search* as a whole, we can
finally say that Agda now matches the state of the art in… looking up
things in collections indexed by terms? Oh, well. Thanks for reading —
I’m happy to answer any questions about this work on Mastodon.