Research Hypotheses

Conjectures and their status

Validated Hypotheses

H1: The n+1 Point Property Holds at λ=0

Statement: At zero regularisation, the optimal hyperplane in n-dimensional p-adic regression passes through at least n+1 points.

Status: VALIDATED (2025-12-01)

Evidence: Every 1D experiment at λ=0 shows at least 2 exact fits (n+1 = 2). In 2D, 15/15 datasets show k(0) ≥ 3 (n+1 = 3).
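The check behind this evidence can be sketched as follows. This is a minimal illustration, not the experiment code: it assumes the data loss is the sum of p-adic absolute values of the residuals, that the data are integers, and that only lines through pairs of data points are searched. The names `valuation`, `padic_abs` and `fit_summary` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def valuation(q, p):
    """p-adic valuation of a non-zero rational q."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("the valuation of 0 is +infinity")
    v, num, den = 0, q.numerator, q.denominator
    while num % p == 0:   # powers of p in the numerator raise the valuation
        num //= p
        v += 1
    while den % p == 0:   # powers of p in the denominator lower it
        den //= p
        v -= 1
    return v

def padic_abs(q, p):
    """p-adic absolute value |q|_p = p^(-v_p(q)), with |0|_p = 0."""
    q = Fraction(q)
    if q == 0:
        return Fraction(0)
    return Fraction(p) ** (-valuation(q, p))

def fit_summary(points, p):
    """For each line through two data points, return (loss, exact_fits, intercept, slope),
    sorted by loss. Assumed loss: sum of |residual|_p (no regularisation, i.e. lambda = 0)."""
    out = []
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # vertical line, not expressible as y = a + b*x
        b = Fraction(y2 - y1, x2 - x1)
        a = y1 - b * x1
        residuals = [y - (a + b * x) for x, y in points]
        loss = sum(padic_abs(r, p) for r in residuals)
        exact = sum(r == 0 for r in residuals)
        out.append((loss, exact, a, b))
    return sorted(out)

# Toy data (illustrative only): three collinear points plus one outlier.
best = fit_summary([(0, 0), (1, 1), (2, 2), (3, 5)], p=2)[0]
```

On this toy dataset the lowest-loss candidate is the line y = x, with loss 1/2 and 3 exact fits.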

H2: Strong Regularisation Reduces Exact Fits

Statement: As λ→∞, the number of exactly-fitted points decreases toward 1.

Status: VALIDATED (2025-12-01)

Evidence: Experiments show a transition from 2-3 exact fits at λ=0 to 1 exact fit at large λ.

H3: Discrete Phase Transitions

Statement: The function k(λ) giving the number of exactly-fitted points is a step function with finitely many discontinuities.

Status: VALIDATED (2025-12-02)

Evidence: All experiments show discrete jumps in k(λ). Threshold formula derived: λ* = (L₂ - L₁) / (b₁² - b₂²).
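One reading of the threshold formula, stated here as an assumption: L₁ and L₂ are the data losses of two candidate fits, b₁ and b₂ are their slopes, and the regularised cost of candidate i is Lᵢ + λ·bᵢ². Setting the two costs equal and solving for λ gives the formula above. The helper name `threshold` is hypothetical.

```python
from fractions import Fraction

def threshold(L1, b1, L2, b2):
    """Value of lambda at which L1 + lambda*b1**2 == L2 + lambda*b2**2.

    Assumes an L2 penalty lambda * slope**2 on each candidate fit.
    """
    if b1 * b1 == b2 * b2:
        # Equal penalties: the cost curves are parallel and never cross.
        raise ValueError("candidates have the same penalty; no threshold exists")
    return Fraction(L2 - L1) / Fraction(b1 * b1 - b2 * b2)

# Illustrative numbers: a sloped fit (loss 1/2, slope 1) against a flat fit (loss 9/4, slope 0).
lam_star = threshold(Fraction(1, 2), 1, Fraction(9, 4), 0)   # Fraction(7, 4)
```

Below λ* = 7/4 the sloped candidate is cheaper in this example; above it the flat candidate is cheaper, since its cost does not grow with λ.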

H4: Threshold Depends on Data Geometry

Statement: The critical λ values depend on the p-adic distances between data points and their geometric configuration.

Status: VALIDATED (2025-12-02)

Evidence: Different datasets show different threshold values. For a fixed dataset, the thresholds also depend on the choice of prime p.

H6: Monotonic Decrease in k(λ)

Statement: The number of exact fits k(λ) is monotonically non-increasing in λ.

Status: VALIDATED (2025-12-02)

Evidence: 30/30 random 1D datasets and 15/15 random 2D datasets show monotonic k(λ). No counterexamples found.
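A monotonicity check of this kind can be sketched as below. It assumes each candidate fit is summarised by a hypothetical triple (data loss, slope, exact fits) and that the regularised cost is data loss + λ·slope²; the candidate values are illustrative, not experimental results.

```python
from fractions import Fraction

def exact_fits_at(candidates, lam):
    """Exact-fit count k of the candidate minimising data_loss + lam * slope**2.

    Ties in cost are broken in favour of the candidate with more exact fits.
    """
    best = min(candidates, key=lambda c: (c[0] + lam * c[1] ** 2, -c[2]))
    return best[2]

def is_non_increasing(values):
    """True if no value in the sequence exceeds its predecessor."""
    return all(a >= b for a, b in zip(values, values[1:]))

# Hypothetical candidates: (data loss, slope, number of exact fits).
candidates = [
    (Fraction(1, 2), Fraction(1), 3),
    (Fraction(3, 4), Fraction(5, 3), 2),
    (Fraction(9, 4), Fraction(0), 1),
]
grid = [Fraction(i, 4) for i in range(17)]            # lambda = 0, 1/4, ..., 4
ks = [exact_fits_at(candidates, lam) for lam in grid]
monotone = is_non_increasing(ks)
```

Using exact rationals for λ avoids floating-point ties near a threshold, which would otherwise make the position of a jump in k(λ) ambiguous.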

H8: Higher-Dimensional Generalization

Statement: In n dimensions with regularisation, the optimal hyperplane passes through k points where 1 ≤ k ≤ n+1, with k depending on λ.

Status: VALIDATED for n=2 (2025-12-02)

Evidence: 2D experiments confirm that k(0) ≥ 3, that k(λ) is monotonically non-increasing in λ, and that discrete thresholds exist.

Active Hypotheses (Under Investigation)

H5: p-Adic Regularisation Prefers High-Valuation Coefficients

Statement: When using p-adic regularisation (penalty = |β|_p), the optimal coefficients tend to have high p-adic valuations (that is, they are divisible by higher powers of p).

Status: UNDER INVESTIGATION

Evidence: Preliminary experiment shows different optimal slopes for p-adic vs real L2 regularisation.

H7: Generalized Base r Interpolation

Statement: Replacing p^(-v(t)) with r^(-v(t)) for a general base r:

  • As r→1: Solution approaches binary (nearest neighbor) selection
  • As r→∞: Solution approaches minimax (minimize maximum residual)

Status: PROPOSED
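The two limits in H7 can be explored numerically with a sketch like the following. It assumes integer residuals, takes v(t) to be the p-adic valuation of the residual t, and takes the loss to be the sum of r^(-v(t)) over non-zero residuals; `base_r_loss` is a hypothetical name.

```python
def valuation(t, p):
    """p-adic valuation of a non-zero integer t."""
    if t == 0:
        raise ValueError("the valuation of 0 is +infinity")
    v = 0
    while t % p == 0:
        t //= p
        v += 1
    return v

def base_r_loss(residuals, p, r):
    """Sum of r**(-v_p(t)) over non-zero residuals t; exact fits contribute 0."""
    return sum(r ** (-valuation(t, p)) for t in residuals if t != 0)

residuals = [0, 2, 4, 1]                              # illustrative values, p = 2
near_one = base_r_loss(residuals, p=2, r=1.0001)      # close to 3, the count of non-zero residuals
at_p = base_r_loss(residuals, p=2, r=2.0)             # ordinary 2-adic loss: 0.5 + 0.25 + 1.0
large_r = base_r_loss(residuals, p=2, r=1e6)          # terms with the smallest valuation dominate
```

As r approaches 1 every non-zero residual contributes about 1, so the loss only distinguishes exact fits from misses. For large r the residuals with the smallest valuation dominate the sum.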

H9: Prime Dependence Structure

Statement: The number and location of thresholds depend systematically on the choice of prime p.

Status: UNDER INVESTIGATION

Evidence: For the same dataset, p=2 gives 1 threshold while p=3,5,7 give 2 thresholds at different locations.

Proposed Hypotheses (To Be Tested)

H10: Threshold Count Upper Bound

Statement: For a dataset of m points in n dimensions, the number of thresholds is at most C(m, n+1) - 1.

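Rationale: There are C(m, n+1) ways to choose n+1 points to determine a hyperplane. Each threshold corresponds to a switch between candidate hyperplanes, so if no candidate becomes optimal again after being displaced, at most C(m, n+1) - 1 switches can occur.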
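For concreteness, the proposed bound can be evaluated for small cases. The function name `threshold_bound` is hypothetical, and clamping the result at 0 for m < n+1 is an added convention, not part of the hypothesis.

```python
from math import comb

def threshold_bound(m, n):
    """Proposed upper bound C(m, n+1) - 1 on the number of thresholds
    for m data points in n dimensions (clamped at 0 when m < n+1)."""
    return max(comb(m, n + 1) - 1, 0)

bound_1d = threshold_bound(4, 1)   # C(4, 2) - 1 = 5
bound_2d = threshold_bound(5, 2)   # C(5, 3) - 1 = 9
```

The bound grows quickly with m, so observed threshold counts of 1 or 2 on small datasets are consistent with it without coming close to it.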

H11: Universal Threshold at Strong Regularisation

Statement: For sufficiently large λ, the optimal solution is always the horizontal hyperplane through the "p-adic median" of the y-values.

Rationale: When any non-zero slope costs more in penalty than it saves in data loss, the optimum has zero slope, and the intercept is then chosen to minimize the data loss alone.
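The term "p-adic median" is not defined above. One possible reading, given here purely as an assumption, is the intercept c that minimises the sum of |yᵢ - c|_p, searched over the observed y-values only. A minimal sketch for integer data, with hypothetical names `padic_abs_int` and `padic_median`:

```python
from fractions import Fraction

def padic_abs_int(n, p):
    """p-adic absolute value of an integer n, with |0|_p = 0."""
    if n == 0:
        return Fraction(0)
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return Fraction(1, p ** v)

def padic_median(ys, p):
    """Return (c, loss) where c is the data value minimising sum_i |y_i - c|_p.

    Assumption: candidates are restricted to the observed y-values;
    ties go to the first minimiser in input order.
    """
    scored = ((c, sum(padic_abs_int(y - c, p) for y in ys)) for c in ys)
    return min(scored, key=lambda pair: pair[1])

center, loss = padic_median([0, 1, 2, 5], p=2)   # illustrative y-values
```

On these values c = 1 and c = 5 both give loss 9/4, so the minimiser need not be unique. Any precise version of H11 would need a tie-breaking rule or a statement about the whole set of minimisers.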

Invalidated Hypotheses

None yet.