Validated Hypotheses
H1: The n+1 Point Property Holds at λ=0
Statement: At zero regularisation, the optimal hyperplane in n-dimensional p-adic regression passes through at least n+1 points.
Status: VALIDATED (2025-12-01)
Evidence: All 1D experiments at λ=0 show 2 or more exact fits. Extended to 2D: 15/15 datasets show k(0) ≥ 3, where k(λ) denotes the number of exactly-fitted points at regularisation strength λ.
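The exact-fit count in H1 can be illustrated with a small 1D sketch. It assumes the data loss is the sum of p-adic absolute values of the residuals, which these notes do not state explicitly, and it searches only lines through pairs of points, so it illustrates the property rather than tests it. The names padic_abs and fit_unregularised and the four-point dataset are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def padic_abs(x, p):
    """p-adic absolute value |x|_p of a rational x, with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:   # factors of p in the numerator raise the valuation
        num //= p
        v += 1
    while den % p == 0:   # factors of p in the denominator lower it
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def fit_unregularised(points, p):
    """Search lines through pairs of points; return (loss, slope, intercept, k)."""
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # vertical line, not a function of x
        slope = Fraction(y2 - y1, x2 - x1)
        intercept = y1 - slope * x1
        residuals = [y - (slope * x + intercept) for x, y in points]
        loss = sum(padic_abs(r, p) for r in residuals)
        k = sum(r == 0 for r in residuals)  # number of exact fits
        if best is None or loss < best[0]:
            best = (loss, slope, intercept, k)
    return best

# Illustrative integer data (not taken from the experiments above).
points = [(0, 0), (1, 1), (2, 2), (3, 5)]
best = fit_unregularised(points, p=2)
print(best)
```

For this dataset and p=2 the search selects y = x, which fits three of the four points exactly.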
H2: Strong Regularisation Reduces Exact Fits
Statement: As λ→∞, the number of exactly-fitted points decreases toward 1.
Status: VALIDATED (2025-12-01)
Evidence: Experiments show transition from 2-3 exact fits at λ=0 to 1 exact fit at large λ.
H3: Discrete Phase Transitions
Statement: The function k(λ) giving the number of exactly-fitted points is a step function with finitely many discontinuities.
Status: VALIDATED (2025-12-02)
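Evidence: All experiments show discrete jumps in k(λ). Threshold formula derived: λ* = (L₂ - L₁) / (b₁² - b₂²), where Lᵢ and bᵢ are read here as the data loss and slope of candidate i, so that both candidates have equal objective L + λb² at λ*.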
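The threshold formula can be checked numerically. The sketch below assumes the regularised objective of a candidate line is L + λb², with L its data loss and b its slope, which is the reading under which the formula marks the tie between two candidates. The function name crossover_lambda and the input values are illustrative only.

```python
from fractions import Fraction

def crossover_lambda(loss1, slope1, loss2, slope2):
    """Return λ* where candidates 1 and 2 have equal objective L + λ b²."""
    denom = slope1**2 - slope2**2
    if denom == 0:
        return None  # equal penalties: the two objectives never cross
    return Fraction(loss2 - loss1) / denom

# Illustrative values: a sloped line with small loss vs a flat line with larger loss.
L1, b1 = Fraction(1, 2), 1
L2, b2 = Fraction(9, 4), 0
lam_star = crossover_lambda(L1, b1, L2, b2)

# At λ*, both objectives coincide.
tie = (L1 + lam_star * b1**2) == (L2 + lam_star * b2**2)
print(lam_star, tie)
```

Below λ* the sloped candidate has the lower objective; above it the flat candidate wins, which is the kind of switch that would produce a jump in k(λ).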
H4: Threshold Depends on Data Geometry
Statement: The critical λ values depend on the p-adic distances between data points and their geometric configuration.
Status: VALIDATED (2025-12-02)
Evidence: Different datasets show different threshold values. The thresholds also depend on the choice of prime p.
H6: Monotonic Decrease in k(λ)
Statement: The number of exact fits k(λ) is monotonically non-increasing in λ.
Status: VALIDATED (2025-12-02)
Evidence: 30/30 random 1D datasets and 15/15 random 2D datasets show monotonic k(λ). No counterexamples found.
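A monotonicity check of this kind might look like the sketch below. It assumes an objective of p-adic data loss plus λ times the squared slope, and a candidate set made of lines through pairs of points plus horizontal lines through each point. Both are assumptions of this sketch, and the function names and dataset are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def padic_abs(x, p):
    """p-adic absolute value |x|_p of a rational x, with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def candidate_lines(points):
    """(slope, intercept) pairs: horizontals through each point, lines through pairs."""
    lines = [(Fraction(0), Fraction(y)) for _, y in points]
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 != x2:
            slope = Fraction(y2 - y1, x2 - x1)
            lines.append((slope, y1 - slope * x1))
    return lines

def exact_fits(points, p, lam):
    """k(λ): exact fits of the candidate minimising data loss + λ * slope²."""
    def objective(line):
        b, a = line
        return sum(padic_abs(y - (b * x + a), p) for x, y in points) + lam * b**2
    b, a = min(candidate_lines(points), key=objective)
    return sum(y == b * x + a for x, y in points)

points = [(0, 0), (1, 1), (2, 2), (3, 5)]  # illustrative data
ks = [exact_fits(points, 2, lam) for lam in (0, 1, 2, 4)]
is_monotone = all(a >= b for a, b in zip(ks, ks[1:]))
print(ks, is_monotone)
```

A finite grid of λ values can miss narrow intervals, so a check like this supports monotonicity without proving it.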
H8: Higher-Dimensional Generalization
Statement: In n dimensions with regularisation, the optimal hyperplane passes through k points where 1 ≤ k ≤ n+1, with k depending on λ.
Status: VALIDATED for n=2 (2025-12-02)
Evidence: 2D experiments confirm: k(0) ≥ 3, k is monotonically non-increasing in λ, discrete thresholds exist.
Active Hypotheses (Under Investigation)
H5: p-Adic Regularisation Prefers High-Valuation Coefficients
Statement: When using p-adic regularisation (penalty = |β|ₚ), the optimal coefficients tend to have high p-adic valuations (divisible by higher powers of p).
Status: UNDER INVESTIGATION
Evidence: Preliminary experiment shows different optimal slopes for p-adic vs real L2 regularisation.
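The intuition behind H5 follows from the definition of the p-adic absolute value: the more factors of p a coefficient carries, the smaller its penalty. The sketch below uses the standard definition |β|ₚ = p^(-v(β)); the function names are hypothetical.

```python
from fractions import Fraction

def valuation(x, p):
    """p-adic valuation v_p(x) of a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v_p(0) is +infinity")
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num, v = num // p, v + 1
    while den % p == 0:
        den, v = den // p, v - 1
    return v

def padic_penalty(beta, p):
    """|beta|_p = p^(-v_p(beta)), with |0|_p = 0."""
    return Fraction(0) if beta == 0 else Fraction(p) ** (-valuation(beta, p))

# Penalties for p = 2: coefficients divisible by higher powers of 2 cost less.
penalties = {beta: padic_penalty(beta, 2) for beta in (3, 4, 8, Fraction(1, 2))}
for beta, cost in penalties.items():
    print(beta, cost)
```

With p=2, the coefficient 8 costs 1/8 while 3 costs 1 and 1/2 costs 2, so the penalty orders coefficients very differently from a real L2 penalty.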
H7: Generalized Base r Interpolation
Statement: Using r^(-v(t)) instead of p^(-v(t)):
- As r→1: Solution approaches binary (nearest neighbor) selection
- As r→∞: Solution approaches minimax (minimize maximum residual)
Status: PROPOSED
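The two limits in H7 can be explored numerically. The sketch below assumes v(t) is the p-adic valuation of an integer residual t and that zero residuals contribute nothing to the loss; base_r_loss is a hypothetical name and the residuals are made up for illustration.

```python
def valuation(t, p):
    """p-adic valuation of a nonzero integer t."""
    v = 0
    while t % p == 0:
        t //= p
        v += 1
    return v

def base_r_loss(residuals, p, r):
    """Sum of r**(-v_p(t)) over nonzero integer residuals t."""
    return sum(r ** (-valuation(t, p)) for t in residuals if t != 0)

residuals = [0, 2, 4, 3]  # valuations at p = 2: (exact fit), 1, 2, 0

near_one = base_r_loss(residuals, 2, 1.001)  # close to the count of non-fitted points
standard = base_r_loss(residuals, 2, 2)      # r = p recovers the p-adic loss
large_r = base_r_loss(residuals, 2, 1000)    # dominated by the lowest-valuation residual
print(near_one, standard, large_r)
```

Near r=1 every non-fitted point costs about the same, which matches the binary reading; for large r the term with the smallest valuation dominates, which is at least consistent with the minimax reading.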
H9: Prime Dependence Structure
Statement: The number and location of thresholds depend systematically on the choice of prime p.
Status: UNDER INVESTIGATION
Evidence: For the same dataset, p=2 gives 1 threshold while p=3,5,7 give 2 thresholds at different locations.
Proposed Hypotheses (To Be Tested)
H10: Threshold Count Upper Bound
Statement: For a dataset of m points in n dimensions, the number of thresholds is at most C(m, n+1) - 1.
Rationale: There are C(m, n+1) ways to choose n+1 points to determine a hyperplane. Each threshold corresponds to a switch between candidate hyperplanes.
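The proposed bound is straightforward to evaluate. The sketch below simply computes C(m, n+1) - 1 with the standard library; threshold_upper_bound is a hypothetical name, and the bound itself remains a conjecture.

```python
from math import comb

def threshold_upper_bound(m, n):
    """Conjectured bound C(m, n+1) - 1 on the number of thresholds (H10)."""
    if m < n + 1:
        raise ValueError("need at least n+1 points to determine a hyperplane")
    return comb(m, n + 1) - 1

bound_1d = threshold_upper_bound(4, 1)  # 4 points, lines in 1D
bound_2d = threshold_upper_bound(5, 2)  # 5 points, planes in 2D
print(bound_1d, bound_2d)
```

For 4 points in 1D the bound is 5, well above the 1 or 2 thresholds reported under H9, so the bound may be far from tight.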
H11: Universal Threshold at Strong Regularisation
Statement: For sufficiently large λ, the optimal solution is always the horizontal hyperplane through the "p-adic median" of the y-values.
Rationale: When any nonzero slope is too expensive, the optimum is intercept-only, and the best intercept is the one that minimizes the data loss.
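The notes do not define the "p-adic median". One possible reading, used in the sketch below, is a value c that minimizes the sum of |yᵢ - c|ₚ, searched only among the observed y-values. Both the definition and the restriction to observed values are assumptions of this sketch, as are the function names and data.

```python
from fractions import Fraction

def padic_abs(x, p):
    """p-adic absolute value |x|_p of a rational x, with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def padic_medians(ys, p):
    """Return (minimal loss, observed values attaining it)."""
    cost = {c: sum(padic_abs(y - c, p) for y in ys) for c in sorted(set(ys))}
    best = min(cost.values())
    return best, [c for c, loss in cost.items() if loss == best]

ys = [0, 1, 2, 5]  # illustrative y-values
loss, medians = padic_medians(ys, 2)
print(loss, medians)
```

In this example two values tie for the minimum, so under this reading the "median" need not be unique, which H11 would have to account for.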
Invalidated Hypotheses
None yet.