The math · in math notation
What the team has proved
Every theorem the team has formalized, in standard mathematical notation with diagrams. No code — just math, the way you'd see it in a textbook.
1. The composition idea — where matrices come from
The textbook starts here: a linear map $T : V \to W$ between vector spaces is an arrow, and arrows compose. If $S : W \to U$ is a second linear map, you can do $T$ first and then $S$, producing a new arrow $S \circ T : V \to U$. That's all.
Now choose bases: $b_V = \{b_1, \ldots, b_n\}$ for $V$, $b_W = \{w_1, \ldots, w_p\}$ for $W$, $b_U = \{u_1, \ldots, u_m\}$ for $U$. A linear map $T$ is fully determined by what it does to the basis vectors: each $T(b_j)$ is a vector in $W$, expressible as a column of $p$ scalars in $b_W$. Write those columns side-by-side and you have a $p \times n$ matrix: $$[T]_{b_W \leftarrow b_V} \in \R^{p \times n}, \qquad \text{column } j = \text{coordinates of } T(b_j) \text{ in } b_W.$$ Same for $S$: an $m \times p$ matrix. The question is then: what matrix represents $S \circ T$, and what's its formula in terms of the matrices of $S$ and $T$?
2. Why matrix multiplication is the formula it is
This is the load-bearing derivation. Write $B = [T]_{b_W \leftarrow b_V}$ and $A = [S]_{b_U \leftarrow b_W}$, and compute $(S \circ T)(b_j)$ in two steps, expanding through the bases:
$$ \begin{aligned} (S \circ T)(b_j) &= S\bigl(T(b_j)\bigr) \\ &= S\!\left(\textstyle\sum_k B_{kj}\, w_k\right) && \text{[expand $T(b_j)$ in $b_W$]} \\ &= \textstyle\sum_k B_{kj}\, S(w_k) && \text{[linearity of $S$]} \\ &= \textstyle\sum_k B_{kj}\, \bigl(\sum_i A_{ik}\, u_i\bigr) && \text{[expand $S(w_k)$ in $b_U$]} \\ &= \textstyle\sum_i \!\left(\sum_k A_{ik}\, B_{kj}\right) u_i. && \text{[swap the sums]} \end{aligned} $$
The coefficient of $u_i$ is the $(i, j)$ entry of the matrix of $S \circ T$. So the matrix of the composite is the matrix whose $(i, j)$ entry is $\sum_k A_{ik} B_{kj}$. That's the row-times-column formula. It is forced.
For any choice of bases $b_V$, $b_W$, $b_U$ and linear maps $T : V \to W$, $S : W \to U$: $$\bigl[\, S \circ T \,\bigr]_{b_U \leftarrow b_V} \;=\; \bigl[\, S \,\bigr]_{b_U \leftarrow b_W} \cdot \bigl[\, T \,\bigr]_{b_W \leftarrow b_V}.$$ Equivalently, entry-wise: $$\bigl(\,[S \circ T]\,\bigr)_{ij} \;=\; \sum_{k=1}^{p} A_{ik}\, B_{kj}.$$
A 2×2 worked product, with the row–column highlight
To see the formula in action, take $A = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}$. The $(1, 2)$ entry of $AB$ is row 1 of $A$ dotted with column 2 of $B$:
$(AB)_{12} = 2 \cdot 6 + 3 \cdot 8 = 12 + 24 = 36$: row 1 of $A$ times column 2 of $B$.
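To see the same arithmetic run, here is a quick numeric check (a Python sketch, not part of the formal development; `mat_mul` is an illustrative helper implementing exactly the entry-wise formula above):

```python
# Illustrative check of the row-times-column formula (not the Lean proof).
A = [[2, 3],
     [1, 4]]
B = [[5, 6],
     [7, 8]]

def mat_mul(A, B):
    """(AB)[i][j] = sum_k A[i][k] * B[k][j], the formula forced by composition."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

AB = mat_mul(A, B)
assert AB[0][1] == 2 * 6 + 3 * 8 == 36   # row 1 of A against column 2 of B
assert AB == [[31, 36], [33, 38]]
```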
3. Why $AB \ne BA$ — concrete witness
Function composition isn't commutative. "First put on socks, then shoes" is not the same as "first put on shoes, then socks." Matrix multiplication inherits exactly that.
Let
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$
Then by direct computation,
$$AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \qquad BA = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = A. $$
So $AB \ne BA$. The proof in Lean is one tactic (`decide`), since both sides reduce to canonical integer matrices.
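The witness checks in a few lines of Python (a sketch; the formal proof is the one-tactic `decide` above):

```python
# The non-commutativity witness, checked by direct computation.
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[0, 1], [0, 0]]
B = [[1, 0], [0, 0]]

assert mat_mul(A, B) == [[0, 0], [0, 0]]   # AB = 0
assert mat_mul(B, A) == A                  # BA = A
assert mat_mul(A, B) != mat_mul(B, A)      # so AB != BA
```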
4. Algebra laws as one-line corollaries
Every algebraic law of matrices follows from the corresponding fact about linear maps, plus §6.3. The chain is short enough that you can prove it by pointing at it.
Function level
Composition is associative: $$(S \circ T) \circ R = S \circ (T \circ R).$$ Sums distribute on the right: $$S \circ (T_1 + T_2) = S \circ T_1 + S \circ T_2.$$ On the left: $$(S_1 + S_2) \circ T = S_1 \circ T + S_2 \circ T.$$ The identity composes trivially: $$\mathrm{id} \circ T = T = T \circ \mathrm{id}.$$
Matrix level (apply §6.3)
Matrix multiplication is associative: $$(AB)C = A(BC).$$ Right-distributive: $$A(B_1 + B_2) = AB_1 + AB_2.$$ Left-distributive: $$(A_1 + A_2)B = A_1B + A_2B.$$ Identity: $$I A = A = A I.$$
The Lean proofs are 5–6 line rewrite chains — mathlib has direct one-liners, but the team took the longer route deliberately to keep the prose-level argument visible in the formal proof.
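A random spot-check of the four matrix-level laws (a Python sketch, not a proof: the finite sample only illustrates what the Lean rewrite chains establish for all matrices; `mul`, `add`, `rand_mat` are illustrative helpers):

```python
# Spot-check associativity, distributivity, and identity on random matrices.
import random

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

def rand_mat(n):
    return [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
random.seed(0)
for _ in range(100):
    A, B, C = rand_mat(3), rand_mat(3), rand_mat(3)
    assert mul(mul(A, B), C) == mul(A, mul(B, C))           # (AB)C = A(BC)
    assert mul(A, add(B, C)) == add(mul(A, B), mul(A, C))   # A(B+C) = AB+AC
    assert mul(add(A, B), C) == add(mul(A, C), mul(B, C))   # (A+B)C = AC+BC
    assert mul(I3, A) == A == mul(A, I3)                    # IA = A = AI
```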
5. Change of basis — same map, different coordinates
The same linear map $T : V \to V$ is represented by different matrices in different bases. The relationship between them is conjugation by the change-of-basis matrix $P$:
For any two bases $B, B'$ of $V$ and any linear map $T : V \to V$: $$[T]_{B'} \;=\; P^{-1}\,[T]_B\,P, \qquad \text{where} \quad P = [\mathrm{id}]_{B \leftarrow B'}.$$
Two invariants that survive — and why eigenvalues matter
If $A' = P^{-1} A P$, then $\det A' = \det A$ and $\mathrm{tr}\, A' = \mathrm{tr}\, A$. Same for the characteristic polynomial. So determinant, trace, and the eigenvalues are properties of the underlying linear map, not coordinate artifacts. That's why eigenvalues matter: they're the basis-independent quantities. Anything that doesn't survive similarity is bookkeeping.
A 2×2 example, formalized: the map $T(x, y) = (2x + y,\, x + 2y)$ has matrix $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ in the standard basis. Change to the eigenbasis $\{(1, 1),\, (1, -1)\}$ and the same map becomes diagonal: $$\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \;\xrightarrow{\;P^{-1} \cdot\, \cdot\, P\;}\; \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}.$$ Same map. Different coordinates. The eigenbasis is the basis where $T$ acts independently on each axis.
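The conjugation can be verified exactly (a Python sketch in rational arithmetic; `mul` is an illustrative helper and the inverse is the explicit 2×2 cofactor formula):

```python
# Exact check that P^{-1} A P is diagonal in the eigenbasis, and that
# det and trace survive the conjugation.
from fractions import Fraction as F

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[F(2), F(1)], [F(1), F(2)]]
P = [[F(1), F(1)], [F(1), F(-1)]]            # columns: (1,1) and (1,-1)
detP = P[0][0] * P[1][1] - P[0][1] * P[1][0]
Pinv = [[ P[1][1] / detP, -P[0][1] / detP],
        [-P[1][0] / detP,  P[0][0] / detP]]  # 2x2 cofactor formula

Ap = mul(mul(Pinv, A), P)
assert Ap == [[F(3), F(0)], [F(0), F(1)]]                 # diag(3, 1)
assert Ap[0][0]*Ap[1][1] - Ap[0][1]*Ap[1][0] == F(3)      # det preserved
assert A[0][0]*A[1][1] - A[0][1]*A[1][0] == F(3)
assert Ap[0][0] + Ap[1][1] == A[0][0] + A[1][1] == F(4)   # trace preserved
```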
6. EML — the operator and its identities
Andrzej Odrzywołek (Jagiellonian University, April 2026) defined a single binary operator and claimed every elementary function on $\R$ can be expressed as a finite tree built from this operator and the constant $1$. The operator is: $$\eml(x, y) \;=\; e^x - \ln y.$$
Sanity check at the canonical input: $\eml(0, 1) = e^0 - \ln 1 = 1 - 0 = 1$. So you can produce $1$ from itself. That's enough to bootstrap everything else.
$e^x = \eml(x, 1)$
$\ln x \;=\; \eml\bigl(1,\;\eml(\eml(1,\,x),\,1)\bigr)$
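Both identities spot-check numerically (a Python sketch, assuming the definition $\eml(x, y) = e^x - \ln y$ above, with inputs kept where the true logarithm applies):

```python
# Spot-check of the exp and ln identities, with eml(x, y) = e^x - ln y.
import math

def eml(x, y):
    return math.exp(x) - math.log(y)   # only called with y > 0 here

assert abs(eml(0, 1) - 1) < 1e-12                  # the canonical input
for x in [0.3, 1.0, 2.5]:
    assert abs(eml(x, 1) - math.exp(x)) < 1e-9     # e^x = eml(x, 1)
for x in [0.5, 2.0, 10.0]:
    ln_tree = eml(1, eml(eml(1, x), 1))            # the nested ln x tree
    assert abs(ln_tree - math.log(x)) < 1e-9
```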
7. Building $-,\,+,\,\times$ from EML — and why the domains shrink
Now you can compose. From $\exp$ and $\ln$ — which are themselves $\eml$-trees — you can build subtraction, negation, addition, and multiplication. Each construction is just substitution: replace each named function with its $\eml$-tree, expand. The trees grow.
The depth grows fast, but that's not the interesting part. The interesting part is that each operation requires its inputs to satisfy a tighter constraint than its components do. Where $\ln x$ wants $x > 0$, the construction $x - y$ inherits that. Where the negation $(e - x) - e$ wants the inner $e - x > 0$, it pushes the constraint $x < e$ onto its input. Building $x + y$ on top of negation tightens even further.
| Operation | Domain constraint inherited by the $\eml$-tree |
| --- | --- |
| $\ln x$ | $x > 0$ |
| $x - y$ | $x > 0$ |
| $-x$ | $x < e$ |
| $x + y$ | $x > 0$ and $y < e$ |
| $x \cdot y$ | $x > 1$ and $y < e^e$ |
This is the structural worry: even if every operation works on some region, the obvious tree for any particular function might fail on the rest of $\R$. That worry has a name — the obstruction — and the team formalized one concrete instance.
8. EML's first concrete obstruction
Take negation. The obvious EML construction is $\mathrm{neg}(x) = (e - x) - e$, which agrees with $-x$ when $x < e$. What happens at $x = e + 1$?
The inner $\mathrm{sub}(e, x)$ correctly computes $e - (e+1) = -1$. The outer $\mathrm{sub}(\,\cdot\,,\, e)$ then needs to compute $\mathrm{eml}(\ln(-1),\, e^e)$. But $\ln$ on a negative number is mathlib's junk value $0$ — the convention extends $\ln$ to be $0$ on $(-\infty, 0]$, so the formula evaluates without error but produces the wrong number.
Plugging in: $$\mathrm{neg}(e + 1) \;=\; \eml(0, e^e) \;=\; e^0 - \ln(e^e) \;=\; 1 - e \;\approx\; -1.72.$$ The expected value was $-(e + 1) \approx -3.72$. They differ by $2$.
$$\exists\, x \in \R \;:\quad \mathrm{neg}(x) \ne -x. \qquad \text{Witness: } x = e + 1.$$
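The failure reproduces in a few lines (a Python sketch: `ln_junk` mimics mathlib's convention that $\ln$ is $0$ on nonpositive inputs, and `sub`, `neg` are the constructions from §7):

```python
# Reproducing the obstruction numerically, with mathlib-style junk values.
import math

E = math.e

def ln_junk(y):
    return math.log(y) if y > 0 else 0.0   # mathlib: ln is 0 on (-inf, 0]

def eml(x, y):
    return math.exp(x) - ln_junk(y)

def sub(x, y):                              # x - y, valid only for x > 0
    return eml(ln_junk(x), math.exp(y))

def neg(x):                                 # (e - x) - e, valid only for x < e
    return sub(sub(E, x), E)

assert abs(neg(1.0) - (-1.0)) < 1e-9        # fine inside the domain (1 < e)
bad = neg(E + 1)                            # outside the domain: x = e + 1
assert abs(bad - (1 - E)) < 1e-9            # evaluates to 1 - e ~ -1.72
assert abs(bad - (-(E + 1))) > 1.9          # misses -(e+1) by exactly 2
```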
9. The MinimalBasis framework — Tier 2 in math notation
The team didn't stop at the EML obstruction. They generalized. MinimalBasis.lean defines an abstract framework that any "small set of operators" universality question can instantiate.
An operator basis over a carrier set $\alpha$ is a triple $(\mathcal{O}, \mathrm{ar}, \llbracket - \rrbracket)$ where:
- $\mathcal{O}$ is a set of operator symbols,
- $\mathrm{ar} : \mathcal{O} \to \N$ assigns each operator an arity,
- $\llbracket - \rrbracket : \prod_{o \in \mathcal{O}} \bigl(\alpha^{\mathrm{ar}(o)} \to \alpha\bigr)$ interprets each operator as a function on $\alpha$.
For an operator basis $B$ over $\alpha$ and an arity $n$, the set of terms $\mathrm{Term}_B(n)$ is the smallest set such that:
$\dfrac{c \in \alpha}{\;\mathrm{const}(c) \in \mathrm{Term}_B(n)\;}\quad\text{(constants)}$ $\qquad$ $\dfrac{i \in \{1, \ldots, n\}}{\;\mathrm{var}(i) \in \mathrm{Term}_B(n)\;}\quad\text{(variables)}$
$\dfrac{o \in \mathcal{O}, \;\; t_1, \ldots, t_{\mathrm{ar}(o)} \in \mathrm{Term}_B(n)}{\;\mathrm{app}(o; t_1, \ldots, t_{\mathrm{ar}(o)}) \in \mathrm{Term}_B(n)\;}\quad\text{(application)}$
A term $T \in \mathrm{Term}_B(n)$ globally represents a function $f : \alpha^n \to \alpha$ iff $\llbracket T \rrbracket(\vec x) = f(\vec x)$ for every $\vec x \in \alpha^n$.
An obstruction theory over $B$ is a family of properties $\mathcal{C}_n \subseteq (\alpha^n \to \alpha)$ such that:
(C1) Every constant function lies in $\mathcal{C}_n$.
(C2) Every projection $\pi_i : \vec x \mapsto x_i$ lies in $\mathcal{C}_n$.
(C3) For every $o \in \mathcal{O}$: if $f_1, \ldots, f_{\mathrm{ar}(o)} \in \mathcal{C}_n$, then $\vec x \mapsto \llbracket o \rrbracket(f_1(\vec x), \ldots, f_{\mathrm{ar}(o)}(\vec x)) \in \mathcal{C}_n$.
For any term $T \in \mathrm{Term}_B(n)$ and any obstruction theory $\mathcal{C}$ over $B$: $$\llbracket T \rrbracket \;\in\; \mathcal{C}_n.$$ Direct structural induction over the construction of $T$.
If $f \notin \mathcal{C}_n$, then no term $T \in \mathrm{Term}_B(n)$ globally represents $f$.
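The grammar, the evaluator, and the global-representation check are small enough to sketch concretely (Python, with hypothetical names `Const`, `Var`, `App`, `represents_globally` that mirror, but are not, the Lean definitions; the exhaustive check only makes sense over finite carriers):

```python
# A miniature of OperatorBasis / Term / "globally represents".
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Const:
    c: object

@dataclass(frozen=True)
class Var:
    i: int

@dataclass(frozen=True)
class App:
    op: str
    args: tuple

def evaluate(ops, term, xs):
    """[[T]](xs) by structural recursion over the three term constructors."""
    if isinstance(term, Const):
        return term.c
    if isinstance(term, Var):
        return xs[term.i]
    return ops[term.op](*(evaluate(ops, a, xs) for a in term.args))

def represents_globally(ops, term, f, carrier, n):
    """Check [[T]](xs) = f(xs) for every xs (finite carriers only)."""
    return all(evaluate(ops, term, xs) == f(*xs)
               for xs in product(carrier, repeat=n))

# tiny usage: NOT as a NAND-tree over Bool
ops = {"nand": lambda a, b: not (a and b)}
not_tree = App("nand", (Var(0), Var(0)))
assert represents_globally(ops, not_tree, lambda x: not x, [False, True], 1)
```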
10. Headline — no EML-tree represents the quintic root
Plug in the EML basis. Choose the obstruction class $\mathcal{C}_n = \{f : \alpha^n \to \alpha \mid f \text{ has solvable monodromy}\}$. The structural induction needs three pieces:
Discharged. Every constant function has solvable monodromy. Every projection has solvable monodromy. Both proved via the concrete hasSolvableMonodromy definition in Sheffer/Foundations/SolvableMonodromy.lean, built on the multivalued-analytic infrastructure in MultivaluedAnalytic.lean and the group-action layer in MonodromyGroup.lean. Both were once axioms; both are now theorems.
Remaining. Khovanskii's lemma — if $f$ and $g$ have solvable monodromy, so does $\eml(f, g)$. The Galois fact — the generic quintic root function has $S_5$ monodromy. Both are theorems with established proofs in the analytic-Galois literature; both are still axiomatized in the Lean source.
The function $\mathrm{quinticRoot} : (a_1, \ldots, a_5) \mapsto$ a real root, where one exists, of $t^5 + a_1 t^4 + a_2 t^3 + a_3 t^2 + a_4 t + a_5$, has monodromy group $S_5$. Since $S_5$ is not solvable (it contains $A_5$, which is simple and non-abelian), $\mathrm{quinticRoot} \notin \mathcal{C}_5$. This is the second of the two remaining Path-B axioms.
Putting the pieces together:
$$\nexists\, T \in \mathrm{Term}_{B_{\eml}}(5) \;:\quad \llbracket T \rrbracket = \mathrm{quinticRoot}.$$
The proof is the bridge lemma applied to $\mathrm{quinticRoot}$ plus the two remaining Path-B axioms. The const-and-var clauses of the structural induction are now theorems, built from the concrete monodromy infrastructure in Sheffer/Foundations/.
This is the formalization of Stylewarning's informal critique. It's the headline math that the team has produced. The Path-B axioms wrap the analytic-monodromy content as a black box; Path A would discharge them once mathlib has the underlying theory. Either way, the framework, the structural induction, the bridge lemma, and the headline theorem are on disk and machine-checked.
11. PolyAct — the same machinery, fully constructive
The framework supports multiple substrates. The EML / quintic-root story makes that point with axioms wrapping the analytic content; polynomial-activation networks, formalized in Sheffer/Examples/PolyAct.lean, make the same kind of point with no axioms at all. The analytic-continuation barrier is replaced by elementary polynomial degree arithmetic, and a single mathlib lemma about polynomial composition does all the work. The result is also about a real architecture — networks whose activation functions are polynomials of bounded degree.
The expressibility frontier has a draggable version on the interactive page: slide $k$ and $d$ and watch $x^N$ enter or leave the representable region.
For an integer $k \geq 0$, let $\mathrm{PolyActOp}_k$ be the set of univariate real polynomials of degree at most $k$: $$\mathrm{PolyActOp}_k \;=\; \{\, p \in \R[X] \;\mid\; \deg p \leq k \,\}.$$ The polynomial-activation basis $B_{\mathrm{Poly},k}$ over $\R$ has one operator type ($\mathrm{PolyActOp}_k$), arity $1$, and evaluator $$\llbracket p \rrbracket(\vec{x}) \;=\; p(x_1).$$ A term over $B_{\mathrm{Poly},k}$ with one input variable is then a finitary composition: $p_1 \circ p_2 \circ \cdots \circ p_d \circ x$, where each $p_i$ has degree at most $k$.
For a term $T \in \mathrm{Term}_{B_{\mathrm{Poly},k}}(1)$, define recursively: $$\mathrm{polyDepth}(T) \;=\; \begin{cases} 0 & T = \mathrm{const}(c) \text{ or } T = \mathrm{var}(0) \\ 1 + \mathrm{polyDepth}(T') & T = \mathrm{app}(p;\, T') \end{cases}$$ and extract the underlying polynomial: $$\mathrm{toPoly}(T) \;=\; \begin{cases} C(c) & T = \mathrm{const}(c) \\ X & T = \mathrm{var}(0) \\ p \circ \mathrm{toPoly}(T') & T = \mathrm{app}(p;\, T'). \end{cases}$$
By structural induction, $\mathrm{toPoly}(T).\mathrm{eval}(x) = \llbracket T \rrbracket(\vec x)$ — the extracted polynomial computes the same thing as the term.
For any term $T$ over the polynomial-activation basis $B_{\mathrm{Poly},k}$: $$\deg(\mathrm{toPoly}(T)) \;\leq\; k^{\mathrm{polyDepth}(T)}.$$ Equivalently: every term of structural depth at most $d$ evaluates to a polynomial of degree at most $k^d$ (assuming $k \geq 1$).
- Constants and variables (depth 0): $C(c)$ has degree $0 \leq k^0$; $X$ has degree $1 \leq k^0$ when $k \geq 1$.
- Application (depth $1 + d'$): mathlib's `Polynomial.natDegree_comp_le` gives $\deg(p \circ q) \leq \deg(p) \cdot \deg(q)$. Apply: $\deg(p \circ \mathrm{toPoly}(T')) \leq k \cdot k^{d'} = k^{1+d'}$.

In short: `Polynomial.natDegree_comp_le` applied at each level. The headline no-go follows: $$\nexists\, T \in \mathrm{Term}_{B_{\mathrm{Poly},2}}(1) \;:\quad \mathrm{polyDepth}(T) \leq 3 \;\wedge\; \forall x \in \R.\; \llbracket T \rrbracket(x) = x^9.$$
- Suppose such a $T$ exists. By the degree-bound theorem, $\mathrm{toPoly}(T)$ has degree at most $2^3 = 8$.
- Since $\mathrm{toPoly}(T).\mathrm{eval}(x) = x^9$ at every $x \in \R$, the polynomial $\mathrm{toPoly}(T) - X^9$ has infinitely many roots. A polynomial in $\R[X]$ with infinitely many roots is the zero polynomial, so $\mathrm{toPoly}(T) = X^9$.
- But $\deg(X^9) = 9 > 8$. Contradiction.
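The degree argument can be exercised concretely (a Python sketch: compose random chains of at most three degree-at-most-2 polynomials and confirm the degree never exceeds $2^3 = 8$; a random sample illustrates the bound, while the Lean proof covers every term):

```python
# Composite degree of depth-3 chains of quadratics never reaches 9.
import random

def poly_mul(a, b):
    """Product of two coefficient lists (low-to-high order)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def compose(p, q):
    """Coefficients of p(q(x)), by Horner's rule on polynomials."""
    result = [0]
    for c in reversed(p):
        result = poly_mul(result, q)
        result[0] += c
    return result

def deg(p):
    d = len(p) - 1
    while d > 0 and p[d] == 0:
        d -= 1
    return d

random.seed(0)
for _ in range(1000):
    t = [0, 1]                                         # the variable x
    for _ in range(3):                                 # polyDepth <= 3
        p = [random.randint(-3, 3) for _ in range(3)]  # deg p <= 2
        t = compose(p, t)
    assert deg(t) <= 8                                 # the k^d bound
    assert deg(t) != 9                                 # so t cannot be x^9
```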
What's notable about this result, relative to the EML / quintic-root headline:
- Zero axioms. The Path-B axioms that wrap monodromy theory in the EML case (two remaining, two already discharged via Sheffer/Foundations/SolvableMonodromy.lean) have no analog here. The proof uses one mathlib lemma (`Polynomial.natDegree_comp_le`) and induction.
- Same framework. The Sheffer machinery — `OperatorBasis`, `Term`, `representsGlobally` — is reused unchanged. PolyAct is an instance, not a parallel framework.
- An AI-architectural function class. Polynomial activations are a real (if not dominant) neural-network architecture. The headline says: a depth-3 network whose activations are at most quadratic cannot compute $x^9$. That's a falsifiable property of the architecture, not an abstract claim.
- Pedagogically simpler. The argument fits in three steps a calculus student can follow. EML's headline needs Galois theory and analytic continuation; PolyAct's needs only the multiplicative-degree law for polynomial composition.
12. NAND — Sheffer's positive completeness
The third instantiation. Where EML and PolyAct produce obstruction results — no term represents this function — NAND produces a completeness result: every Boolean function is representable as a NAND-tree. The framework supports both polarities.
Let $\alpha = \mathrm{Bool}$ and define one binary operator $\bar{\wedge}$ (NAND) by $$\bar{\wedge}(a, b) \;=\; \neg(a \wedge b).$$ The NAND basis $B_{\bar{\wedge}}$ has one operator type with one element, arity $2$, evaluator $\llbracket \bar{\wedge} \rrbracket(a, b) = \neg(a \wedge b)$.
Three Boolean primitives expressed as NAND-trees and proved to represent the standard logical operations:
$$\mathrm{notTree}(x) \;=\; \bar{\wedge}(x, x) \;=\; \neg x.$$ $$\mathrm{andTree}(x, y) \;=\; \bar{\wedge}(\bar{\wedge}(x, y),\, \bar{\wedge}(x, y)) \;=\; x \wedge y.$$ $$\mathrm{orTree}(x, y) \;=\; \bar{\wedge}(\bar{\wedge}(x, x),\, \bar{\wedge}(y, y)) \;=\; x \vee y. \qquad \text{(De Morgan)}$$
Each identity is proved by `cases <;> simp`: Bool has only two values, so two nested case-splits exhaust the space; the simplifier finishes from there. No axioms.

For every $n \geq 0$ and every Boolean function $f : \mathrm{Bool}^n \to \mathrm{Bool}$, there exists a term $T \in \mathrm{Term}_{B_{\bar{\wedge}}}(n)$ such that $T$ globally represents $f$: $$\forall n, f.\;\;\exists T \in \mathrm{Term}_{B_{\bar{\wedge}}}(n).\;\; \forall \vec{x}.\; \llbracket T \rrbracket(\vec{x}) = f(\vec{x}).$$
Proof sketch: for each input row on which $f$ is true, build a minterm (a conjunction of literals — variables or their negations) that fires exactly on that row; OR the minterms together; rewrite the whole expression as a NAND-tree using the three direct constructions above. Helpers `bigAnd`, `bigOr`, `minTerm`, `dnf` plus the supporting lemmas `minTerm_self_eval` and `minTerm_diff_eval` close the case-split. Sheffer 1913, formalized with zero axioms.

The `OperatorBasis` + `Term` + `representsGlobally` machinery is polarity-agnostic — both the obstruction and the completeness questions are stated identically and proved separately.
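For small $n$ the construction can be run exhaustively (a Python sketch in which `min_term` and `dnf` mirror, but are not, the Lean helpers; it checks all 16 functions $\mathrm{Bool}^2 \to \mathrm{Bool}$):

```python
# Run the DNF-to-NAND construction over every 2-input Boolean function.
from itertools import product

def nand(a, b):  return not (a and b)
def t_not(x):    return nand(x, x)
def t_and(x, y): return nand(nand(x, y), nand(x, y))
def t_or(x, y):  return nand(nand(x, x), nand(y, y))

def min_term(row, xs):
    """Fires exactly when the inputs xs match the target row."""
    acc = True                            # constant leaf of the term
    for want, got in zip(row, xs):
        acc = t_and(acc, got if want else t_not(got))
    return acc

def dnf(table, xs):
    """OR of the minterms of the rows on which the function is true."""
    acc = False                           # constant leaf of the term
    for row, value in table.items():
        if value:
            acc = t_or(acc, min_term(row, xs))
    return acc

rows = list(product([False, True], repeat=2))
for values in product([False, True], repeat=4):   # all 16 truth tables
    table = dict(zip(rows, values))
    assert all(dnf(table, xs) == table[xs] for xs in rows)
```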
13. Arithmetic circuits, VP, VNP
A different track from the Sheffer framework's three substrate examples: the foundations of algebraic complexity theory. Arithmetic circuits, polynomial-size families, and the classes VP and VNP that Valiant introduced in 1979 to study the complexity of polynomial computation. Arithmetic-circuit definitions, the VP and VNP classes, and the permanent's VNP membership via Ryser's formula are all on disk, proved without axioms.
For a commutative semiring $R$ and a variable type $\sigma$, an arithmetic circuit $C \in \mathrm{ArithmeticCircuit}_{R,\sigma}$ is freely generated by: $$\dfrac{i \in \sigma}{\mathrm{input}(i) \in \mathrm{ArithmeticCircuit}_{R,\sigma}} \quad \dfrac{c \in R}{\mathrm{const}(c) \in \mathrm{ArithmeticCircuit}_{R,\sigma}}$$ $$\dfrac{a, b \in \mathrm{ArithmeticCircuit}_{R,\sigma}}{\mathrm{add}(a, b) \in \mathrm{ArithmeticCircuit}_{R,\sigma}} \quad \dfrac{a, b \in \mathrm{ArithmeticCircuit}_{R,\sigma}}{\mathrm{mul}(a, b) \in \mathrm{ArithmeticCircuit}_{R,\sigma}}.$$ Three associated functions:
- $\mathrm{size}(C) \in \N$ — the number of `add`/`mul` internal gates.
- $\mathrm{eval}(\mathrm{env}, C) \in R$ — evaluate $C$ at an input assignment $\mathrm{env} : \sigma \to R$.
- $\mathrm{toMvPolynomial}(C) \in R[\sigma]$ — extract the multivariate polynomial $C$ computes.
The variable type $\sigma$ is kept as a general `Type*` rather than fixed to `Fin n`. This lets the development state `permanent ∈ VNP` directly against `MvPolynomial (Fin n × Fin n) R` without an awkward `Fin (n*n)` encoding.

A function $p : \N \to \N$ is polynomially bounded if there exist constants $c, d \in \N$ such that $p(n) \leq c \cdot n^d + c$ for all $n$.
A polynomial family $f = (f_n)_{n \in \N}$ where $f_n \in R[\mathrm{Fin}\, n]$ is in $\mathrm{VP}_R$ iff there exists a sequence of arithmetic circuits $(C_n)_{n \in \N}$ and a polynomially-bounded $p$ such that: $$\forall n.\;\; \mathrm{toMvPolynomial}(C_n) = f_n \;\wedge\; \mathrm{size}(C_n) \leq p(n).$$
$f \in \mathrm{VNP}_R$ iff there exists $g \in \mathrm{VP}_R$ over an extended variable type $\mathrm{Fin}\,(n + m(n))$ — for some polynomially-bounded $m$ — such that $f$ is a Boolean projection of $g$: $$f_n(\vec{x}) \;=\; \sum_{\vec{y} \in \{0, 1\}^{m(n)}} g_{n + m(n)}(\vec{x}, \vec{y}).$$
For all $R$ and all $f$: $$f \in \mathrm{VP}_R \;\;\Longrightarrow\;\; f \in \mathrm{VNP}_R.$$
For an $n \times n$ matrix of formal variables $X : \mathrm{Fin}\,n \times \mathrm{Fin}\,n \to R$, the permanent is the multivariate polynomial $$\mathrm{perm}_n(X) \;=\; \sum_{\sigma \in S_n} \prod_{i = 1}^{n} X_{i,\, \sigma(i)} \;\;\in\;\; R[\mathrm{Fin}\,n \times \mathrm{Fin}\,n].$$
Ryser (1963) gave an alternative formula for the permanent: $$\mathrm{perm}(X) \;=\; (-1)^n \sum_{S \subseteq [n]} (-1)^{|S|} \prod_{i = 1}^{n} \!\left( \sum_{j \in S} X_{i, j} \right).$$ Where the naive permutation sum has $n!$ terms, Ryser's sum has $2^n$ — and after reorganization, it admits a polynomial-size arithmetic circuit over $n + n^2$ variables: $n$ Boolean variables encoding the subset $S$, and the $n^2$ formal entries of $X$.
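The identity confirms on a random matrix with exact integer arithmetic (a Python sketch comparing the $n!$-term permutation sum against Ryser's $2^n$-term sum):

```python
# Compare the n!-term permutation sum with Ryser's 2^n-term formula.
from itertools import permutations
from math import prod
import random

random.seed(1)
n = 4
X = [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]

# naive sum over all n! permutations
naive = sum(prod(X[i][s[i]] for i in range(n))
            for s in permutations(range(n)))

# Ryser: (-1)^n * sum over subsets S of (-1)^|S| * prod_i sum_{j in S} X[i][j]
ryser = 0
for mask in range(2 ** n):                          # subsets S of {0,...,n-1}
    S = [j for j in range(n) if (mask >> j) & 1]
    ryser += (-1) ** len(S) * prod(sum(X[i][j] for j in S)
                                   for i in range(n))
ryser *= (-1) ** n

assert naive == ryser
```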
The formalization decomposes into two parts: the Ryser identity itself (the auxiliary polynomial summed over the Boolean cube equals the permanent) and the polynomial-size circuit witness for that auxiliary polynomial. Both are proved without axioms; the chain to the headline collapses to six lines once they are in hand. Explicit circuit size: $2n^2 + 5n + 2$, satisfying `IsPolyBound` with constant $k = 9$. See the proofs page for the Lean source.