Formalizing Length Functions in Coding Theory
A brief discussion on the formalization of length functions, which are more than just a mapping from elements of an alphabet to natural numbers. They also need to satisfy a capacity bound condition.
The theories I’m formalizing are based on Coding Theory, and the main elements are: an alphabet α
, a set of codes Code
–usually thought of as a sequence of bits–and a coding function CodingFunction
that maps elements of α
to Code
, along with a length function LengthFunction
that maps elements of α
to their lengths in bits. I realized that the definition of LengthFunction
I was using is incomplete:
structure LengthFunction (α : Type) where
length : α → ℕ
finite_domain : Fintype α
Since we represent codes as sequences of bits, we need to make sure that the number of elements from \(\alpha\) mapped to a given \(n\) is bounded by \(2^n\). This is because no more than \(2^n\) different codes can be represented using exactly \(n\) bits, and the coding function is injective. Therefore, we need to add a condition to the LengthFunction
structure:
structure LengthFunction (α : Type) where
length : α → ℕ
finite_domain : Fintype α
capacity_bound : ∀ n, (Fintype.card {a : α // length a = n}) ≤ 2^n
This condition complicates the instantiation of the LengthFunction
structure. However, if we already have an actual CodingFunction
, we can derive the LengthFunction
from it and use its injectivity to ensure the capacity bound.
structure CodingFunction (α : Type) where
toFun : α → Code
inj : Function.Injective toFun
finite_domain : Fintype α
Unfortunately, the solution is not as straightforward as I had hoped. I used an equivalence between the set of codes of length \(n\) and the set of functions from a finite set of size \(n\) to Bit
: Fin n → Bit
. An equivalence, Equiv in Lean, is a way to define a bijection between two types, which can be used to show that they have the same cardinality. In this case, the set of functions from Fin n
to Bit
is a finite type of cardinality \(2^n\), since Bit
has two elements (0 and 1), and there are \(n\) positions to fill with these bits.
To build the equivalence, I first defined an alias for the set of codes of length \(n\):
abbrev CodeLen (n : ℕ) := {c : Code // c.length = n}
For the equivalence, we need to provide four components: a function from CodeLen n
to Fin n → Bit
, a function from Fin n → Bit
to CodeLen n
, and proofs that these functions are inverses of each other.
def equivFun (n : ℕ) : CodeLen n ≃ (Fin n → Bit) where
toFun := by
rintro ⟨l, hl⟩ i
rw [←hl] at *
exact l.get i
invFun := by
intro f
refine ⟨List.ofFn (fun i : Fin n => f i), ?_⟩
simp [List.length_ofFn]
left_inv := by
rintro ⟨l, hl⟩
apply Subtype.ext
simp
subst hl
simp [List.ofFn_getElem]
right_inv := by
intro f; ext i; simp
have : (List.ofFn fun i => f i).length = n := by
simp [List.length_ofFn]
congr; simp
For the rest of the proofs, we need to have a proof that, for each n
, CodeLen n
is a Fintype
and has the correct cardinality.
instance instFintype (n : ℕ) : Fintype (CodeLen n) :=
Fintype.ofEquiv (Fin n → Bit) (equivFun n).symm
lemma card (n : ℕ) : Fintype.card (CodeLen n) = 2 ^ n := by
have h₁ := Fintype.card_congr (equivFun n)
simpa [Fintype.card_fun] using h₁
With that in place, we can finally prove the capacity_bound
condition for the LengthFunction
when using a CodingFunction
lemma CodingFunction.coding_length_bound {α : Type} [Fintype α] (c : CodingFunction α) (n : ℕ) :
(Fintype.card {a : α // (c.toFun a).length = n}) ≤ 2^n := by
let g : {a : α // (c.toFun a).length = n} → CodeLen n :=
fun ⟨a, ha⟩ => ⟨c.toFun a, ha⟩
have hg : Function.Injective g := by
rintro ⟨a, ha⟩ ⟨b, hb⟩ hxy
have hcode : c.toFun a = c.toFun b := congrArg Subtype.val hxy
have : a = b := c.inj hcode
subst this
rfl
calc Fintype.card {a : α // (c.toFun a).length = n}
_ ≤ Fintype.card (CodeLen n) := Fintype.card_le_of_injective _ hg
_ = 2 ^ n := card n
I was curious why I hadn’t noticed this earlier. I guess it’s because the Kraft-McMillan condition is stronger than the capacity bound:
\[\sum_{a\in \alpha} 2^{-\text{length}(a)} \leq 1 \implies \forall n, \text{Fintype.card } \{a : \alpha // \text{ length}(a) = n\} \leq 2^n.\]Only when I moved to a weaker CodingFunction
, one that may not satisfy the condition, did I notice the issue.
Enjoy Reading This Article?
Here are some more articles you might like to read next: