Certainly. A declarative specification could be nice. I'll attempt to provide a sketch of our algorithm here, and if that direction seems worthwhile I am happy to attempt a declarative specification based on it.

For what it's worth, I think the current algorithmic specification is quite nice. It's clearly stated and, because it relies on the existing subtyping rules, is quite concise. Readers do not need to understand that callable arguments have negative polarity, for example, because the existing subtyping rules are enough.

To hopefully make the algorithm more clear, I'll use a smaller language, with only polymorphic type aliases, a single atomic type, arrows, and type application. Instead of giving a formal definition of the syntax, I'll just work through examples; the syntax should be relatively clear from them.

```
A[T] = T -> ()
```

In the above, we infer variance by visiting the structure of the definition. The arrow type `T -> U` has the normal subtyping rules for functions, contravariant in the argument and covariant in the return. `T` occurs only in a contravariant position in the definition, so we infer that `A` is contravariant with respect to `T`.

```
B[T] = T -> T
```

Here, `T` occurs in both a contravariant and a covariant position, so we infer that `B` is invariant w.r.t. `T`. We define an operation `+` to combine the variance of each position, where `co + cn = inv`, for example.
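For concreteness, here is one way the variance lattice and the `+` operation could be modeled. This is a hypothetical Python encoding, purely for illustration; the string names and the `combine` helper are my own, not anything from Pyre:

```python
# Hypothetical encoding of the variance lattice:
#   "bi" (bivariant, bottom) <= "co", "cn" <= "inv" (invariant, top).
LEQ = {("bi", "co"), ("bi", "cn"), ("bi", "inv"), ("co", "inv"), ("cn", "inv")}

def combine(a: str, b: str) -> str:
    """The `+` operation: least upper bound of two variances."""
    if a == b or (a, b) in LEQ:
        return b
    if (b, a) in LEQ:
        return a
    return "inv"  # the only remaining pair is {co, cn}, which joins to inv
```

With this encoding, `combine("co", "cn")` is `"inv"`, and `"bi"` acts as an identity for `+`.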

```
C[T,U] = T -> U
```

Since we are just looking at the occurrences of the type parameters, we can keep track of all type parameters at once.

```
D[T] = (T -> ()) -> ()
```

As we walk down the structure of a type, we need to keep the polarities straight. We start the traversal with a positive polarity. As we walk into the argument of the outer arrow, we flip to negative. As we walk into the argument of the inner arrow, we flip again to positive. By analogy, positive polarity corresponds to `+1` and negative polarity corresponds to `-1`, and as we walk, we multiply. Here, `-1 * -1 = +1`. (We can extend the notion of polarity to include a neutral element corresponding to `0`.)
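The sign analogy can be made literal with a tiny helper. This is a hypothetical sketch (the `polarity_of` function and the `"arg"`/`"ret"` path encoding are my own) that computes the polarity of a position from the arrow steps leading to it:

```python
# Polarity as a sign: positive = +1, negative = -1, with 0 available as the
# neutral (bivariant) element. Entering an arrow's argument multiplies the
# current polarity by -1; entering its return type multiplies by +1.
def polarity_of(path):
    """Polarity of the position reached by a path of 'arg'/'ret' steps."""
    sign = +1  # the traversal starts at positive polarity
    for step in path:
        sign *= -1 if step == "arg" else +1
    return sign
```

For `D[T] = (T -> ()) -> ()`, `T` sits at the argument of an argument, so `polarity_of(["arg", "arg"])` is `+1`: covariant, as described above.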

```
E[T] = () -> T
F[U] = () -> E[U]
```

Here the definition of `F` depends on `E`, so we must infer the variance of `E` first. When inferring variance for `U` within `F`, we see that it occurs in the type application `E[U]`, where that position is positive, because `E` is covariant w.r.t. `T`.

```
G[T] = G[T] -> (T -> ())
```

Recursion creates an additional difficulty: the positions where `T` appears are not immediately clear. We have not yet determined the variance of `G`, so we cannot rely on that pre-existing knowledge to determine the polarity of `T` in `G[T]`.

If we unfold the type one level, we can see that `T` appears in both a positive and a negative position. But unfolding isn't practical in general, so instead we iterate to a fixed point. Here's a worked example:

```
[Pre]                           [Post]
[T = Bi]    G[T] -> (T -> ())   [T = Cn]
[T = Cn]    G[T] -> (T -> ())   [T = Inv]
[T = Inv]   G[T] -> (T -> ())   [T = Inv]
```

Because polarities form a lattice and our inference function is monotone, once we reach a fixed point we have a solution. Hopefully the analogy with expansion makes sense, and motivates the inclusion of bivariance.

I haven't done so here, but the same process extends to multiple type parameters.

```
H[T] = I[T] -> ()
I[U] = H[U] -> ()
```

We handle mutual recursion in the same way, by iterating to a fixed point. Both `T` and `U` start out bivariant. We visit all mutually recursive definitions together in a single iteration, giving our postcondition, which we use as the precondition for the next iteration, and so on.

However, if we work through this example, we will infer that both `T` and `U` are invariant.

```
[Pre]                                    [Post]
[T=Bi;U=Bi]    I[T] -> (); H[U] -> ()    [T=Cn;U=Cn]
[T=Cn;U=Cn]    I[T] -> (); H[U] -> ()    [T=Inv;U=Inv]
[T=Inv;U=Inv]  I[T] -> (); H[U] -> ()    [T=Inv;U=Inv]
```

We solve this by also inferring injectivity. In other words, we detect the case where the type parameter does not actually appear in the right-hand side. We do this in the same manner as inferring variance, and in practice we infer both at the same time.

To extend the previous example, we maintain state about both variance and injectivity. Injectivity is just a boolean, initially false, which becomes true when we see the type parameter appear injectively. In this case, we infer false, corresponding to the fact that an infinite expansion of this type has no occurrences of `T` or `U`, and we can thus infer bivariance.

The same process, together with injectivity, can be assembled into a complete algorithm, which might make some of the above more clear.
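Here is a hedged sketch of that algorithm in Python, purely for illustration. The tuple encoding of types, all names, and the choice to provisionally treat a not-yet-determined parameter as covariant are my assumptions based on the worked tables above, not Pyre's actual code:

```python
# Toy types as tuples:
#   ("unit",)              the atomic type ()
#   ("var", name)          a type parameter
#   ("arrow", arg, ret)    arg -> ret
#   ("app", alias, args)   alias[arg, ...]
# Variances: "bi" (bottom) <= "co", "cn" <= "inv" (top).

def combine(a, b):
    """Least upper bound on the variance lattice (the `+` above)."""
    if a == b or a == "bi":
        return b
    if b == "bi":
        return a
    return "inv"

def flip(v):
    """Polarity flip for arrow arguments."""
    return {"co": "cn", "cn": "co"}.get(v, v)

def compose(pol, v):
    """Polarity reaching an application argument whose parameter has variance v.
    A still-undetermined ("bi") parameter is provisionally treated as covariant;
    accumulating with the previous state keeps the iteration monotone."""
    if v == "bi":
        v = "co"
    if v == "inv" or pol == "inv":
        return "inv"
    return flip(pol) if v == "cn" else pol

def occurrences(ty, param, pol, variance, defs):
    """Join of the polarities at which `param` occurs in `ty`."""
    tag = ty[0]
    if tag == "var":
        return pol if ty[1] == param else "bi"
    if tag == "arrow":
        return combine(occurrences(ty[1], param, flip(pol), variance, defs),
                       occurrences(ty[2], param, pol, variance, defs))
    if tag == "app":
        out = "bi"
        for q, arg in zip(defs[ty[1]][0], ty[2]):
            out = combine(out, occurrences(arg, param,
                                           compose(pol, variance[(ty[1], q)]),
                                           variance, defs))
        return out
    return "bi"  # the atomic type

def occurs(ty, param, injective, defs):
    """Does `param` survive in the infinite expansion of `ty`?"""
    tag = ty[0]
    if tag == "var":
        return ty[1] == param
    if tag == "arrow":
        return (occurs(ty[1], param, injective, defs)
                or occurs(ty[2], param, injective, defs))
    if tag == "app":
        return any(injective[(ty[1], q)] and occurs(arg, param, injective, defs)
                   for q, arg in zip(defs[ty[1]][0], ty[2]))
    return False

def infer(defs):
    """defs maps alias name -> (parameter list, body). Returns variances."""
    variance = {(n, p): "bi" for n, (ps, _) in defs.items() for p in ps}
    injective = dict.fromkeys(variance, False)
    changed = True
    while changed:  # iterate variance to a fixed point
        changed = False
        pre = dict(variance)  # one precondition for the whole iteration
        for name, (params, body) in defs.items():
            for p in params:
                new = combine(variance[(name, p)],
                              occurrences(body, p, "co", pre, defs))
                changed |= new != variance[(name, p)]
                variance[(name, p)] = new
    changed = True
    while changed:  # iterate injectivity to a fixed point
        changed = False
        for name, (params, body) in defs.items():
            for p in params:
                if not injective[(name, p)] and occurs(body, p, injective, defs):
                    injective[(name, p)] = True
                    changed = True
    # A parameter that never occurs in the expansion is bivariant after all.
    return {k: v if injective[k] else "bi" for k, v in variance.items()}
```

Running this on the examples above reproduces the worked tables: `A` is contravariant, `G` is invariant, and both `H` and `I` come out bivariant once injectivity is taken into account.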

That's the sketch of the algorithm. In practice, the algorithm is more complicated because it needs to understand the polarities of all typing forms, not just arrows. A declarative specification would be pretty verbose because of that, whereas the existing specification can rely on subtyping rules, which encode the polarities indirectly.

And to be fully transparent, we have not actually extended this algorithm to the entire language in Pyre yet, so it's not yet clear how many cases we would need to specify, but certainly at least unions, tuples, lists, dictionaries, and so on.