Formalizing Length Functions in Coding Theory
A brief discussion on the formalization of length functions, which are more than just a mapping from elements of an alphabet to natural numbers. They also need to satisfy a capacity bound condition.
The theories I’m formalizing are based on Coding Theory, and the main elements are: an alphabet α, a set of codes Code–usually thought of as a sequence of bits–and a coding function CodingFunction that maps elements of α to Code, along with a length function LengthFunction that maps elements of α to their lengths in bits. I realized that the definition of LengthFunction I was using is incomplete:
structure LengthFunction (α : Type) where
length : α → ℕ
finite_domain : Fintype α
Since we represent codes as sequences of bits, we need to make sure that the number of elements from \(\alpha\) mapped to a given \(n\) is bounded by \(2^n\). This is because no more than \(2^n\) different codes can be represented using exactly \(n\) bits, and the coding function is injective. Therefore, we need to add a condition to the LengthFunction structure:
structure LengthFunction (α : Type) where
length : α → ℕ
finite_domain : Fintype α
capacity_bound : ∀ n, (Fintype.card {a : α // length a = n}) ≤ 2^n
This condition complicates the instantiation of the LengthFunction structure. However, if we already have an actual CodingFunction, we can derive the LengthFunction from it and use its injectivity to ensure the capacity bound.
structure CodingFunction (α : Type) where
toFun : α → Code
inj : Function.Injective toFun
finite_domain : Fintype α
Unfortunately, the solution is not as straightforward as I had hoped. I used an equivalence between the set of codes of length \(n\) and the set of functions from a finite set of size \(n\) to Bit: Fin n → Bit. An equivalence, Equiv in Lean, is a way to define a bijection between two types, which can be used to show that they have the same cardinality. In this case, the set of functions from Fin n to Bit is a finite type of cardinality \(2^n\), since Bit has two elements (0 and 1), and there are \(n\) positions to fill with these bits.
To build the equivalence, I first defined an alias for the set of codes of length \(n\):
abbrev CodeLen (n : ℕ) := {c : Code // c.length = n}
For the equivalence, we need to provide four components: a function from CodeLen n to Fin n → Bit, a function from Fin n → Bit to CodeLen n, and proofs that these functions are inverses of each other.
def equivFun (n : ℕ) : CodeLen n ≃ (Fin n → Bit) where
toFun := by
rintro ⟨l, hl⟩ i
rw [←hl] at *
exact l.get i
invFun := by
intro f
refine ⟨List.ofFn (fun i : Fin n => f i), ?_⟩
simp [List.length_ofFn]
left_inv := by
rintro ⟨l, hl⟩
apply Subtype.ext
simp
subst hl
simp [List.ofFn_getElem]
right_inv := by
intro f; ext i; simp
have : (List.ofFn fun i => f i).length = n := by
simp [List.length_ofFn]
congr; simp
For the rest of the proofs, we need to have a proof that, for each n, CodeLen n is a Fintype and has the correct cardinality.
instance instFintype (n : ℕ) : Fintype (CodeLen n) :=
Fintype.ofEquiv (Fin n → Bit) (equivFun n).symm
lemma card (n : ℕ) : Fintype.card (CodeLen n) = 2 ^ n := by
have h₁ := Fintype.card_congr (equivFun n)
simpa [Fintype.card_fun] using h₁
With that in place, we can finally prove the capacity_bound condition for the LengthFunction when using a CodingFunction
lemma CodingFunction.coding_length_bound {α : Type} [Fintype α] (c : CodingFunction α) (n : ℕ) :
(Fintype.card {a : α // (c.toFun a).length = n}) ≤ 2^n := by
let g : {a : α // (c.toFun a).length = n} → CodeLen n :=
fun ⟨a, ha⟩ => ⟨c.toFun a, ha⟩
have hg : Function.Injective g := by
rintro ⟨a, ha⟩ ⟨b, hb⟩ hxy
have hcode : c.toFun a = c.toFun b := congrArg Subtype.val hxy
have : a = b := c.inj hcode
subst this
rfl
calc Fintype.card {a : α // (c.toFun a).length = n}
_ ≤ Fintype.card (CodeLen n) := Fintype.card_le_of_injective _ hg
_ = 2 ^ n := card n
I was curious why I hadn’t noticed this earlier. I guess it’s because the Kraft-McMillan condition is stronger than the capacity bound:
\[\sum_{a\in \alpha} 2^{-\text{length}(a)} \leq 1 \implies \forall n, \text{Fintype.card } \{a : \alpha // \text{ length}(a) = n\} \leq 2^n.\]Only when I moved to a weaker CodingFunction, one that may not satisfy the condition, did I notice the issue.
Enjoy Reading This Article?
Here are some more articles you might like to read next: