Project Overview
This research project investigates the effects of regularisation on p-adic linear regression. Building on the foundational work in arXiv:2510.00043v1, which established that an n-dimensional hyperplane minimizing p-adic sum of distances must pass through at least n+1 points, we explore what happens when regularisation is introduced.
Key Research Question
When we add regularisation to p-adic linear regression, how does the number of data points that the optimal hyperplane passes through change as a function of the regularisation strength?
Mathematical Background
p-Adic Valuation and Distance
For a prime p and rational number x = pk · (a/b) where gcd(p, ab) = 1:
- The p-adic valuation is vp(x) = k
- The p-adic absolute value is |x|p = p−vp(x)
- The p-adic distance is dp(x, y) = |x − y|p
Unregularised p-Adic Regression
The standard p-adic regression loss function is:
L(β) = Σi |yi − Xi·β|p = Σi p−vp(yi − Xi·β)
Regularised p-Adic Regression
We explore adding a regularisation term:
Lreg(β) = Σi |yi − Xi·β|p + λ · R(β)
where R(β) could be:
- Real L2 penalty: Σj βj2
- p-Adic penalty: Σj |βj|p
Current Status
Last Updated: 2025-12-06 (Day 39)
Current Focus: Formal Lean 4 proofs - second theorem fully proven!
Recent Progress (Day 39):
- pareto_monotonicity PROVEN! Second formal theorem complete. Shows k-path is monotonic ⟺ no consecutive k-increase exists. Uses
pairwise_iff_getElemandisChain_iff_pairwisewith transitivity. - Helper lemma:
isChain_of_consec_ge- consecutive non-increase implies chain property via induction. - Build status: Only 1
sorryremaining (kMin_implies_increase).
Previous Progress (Day 38):
- R_decreases_at_transition PROVEN! First formal theorem complete using
nlinarithtactic. - Fixed Lean 4 syntax: λ→lam variable renaming, removed unsafe deriving, updated deprecated API calls.
Key Theorems Proven:
- Pareto Frontier Monotonicity: k(λ) is monotonic ⟺ no Pareto inversion exists (100% validated, formally proven)
- n+1 Theorem: At λ=0, optimal hyperplane passes through ≥ d+1 points (validated to d=5)
- Threshold Formula: λ* = (L₂ - L₁) / (b₁² - b₂²) with exact asymptotic expansion
- k_min Predictor: k_min_intermediate < k_final → inversion exists (92.7% recall, 100% precision)
- k-up Detector: k_up_transitions >= 1 ⟺ inversion exists (PERFECT)
- p-Adic Monotonicity Dominance (1D only): p-adic inversions ⊂ L2 inversions in 1D
Next Steps: Complete kMin_implies_increase proof, formalise threshold formula.
Formal Proofs Status
| Theorem | Status | File |
|---|---|---|
| R Decreases at Transitions | PROVEN | lean/PadicRegularisation/Basic.lean |
| Pareto Frontier Monotonicity | PROVEN | lean/PadicRegularisation/Basic.lean |
| k_min Predictor | Statement Formalized | lean/PadicRegularisation/Basic.lean |