Higher-Dimensional SC8 Sweep (2026-01-03)
Experiment 327: SC7/SC8 in 6D
- Goal: push SC8 (optimal-path monotonicity) into 6D to see if n+1 → n+2 rebounds (6→7) appear and whether SC8 stays exact.
- Config: 400 seeds × bases r ∈ {1.1, 2.0, 5.0}; 8 points with unique y (C(8,6)=28 candidates + horizontals) ⇒ 1200 cases.
- Monotonicity: 1200/1200 cases monotone (0 violations); no observed 6→7 rebounds.
- SC7/SC8: TP=1200, FP=0, TN=0, FN=0 ⇒ both 100% accuracy/coverage; SC8 remains exact in 6D.
- Limitation: the 8-point setup may under-stress the frontier; plan to rerun with 10–12 points (fewer seeds) to probe for rare non-monotone cases.
- Artifact: experiments/exp327_sc8_6d_condition.py and results/exp327_sc8_6d_condition.json
Experiment 328: Dense 6D (10–12 points)
- Config: 150 seeds (10 points) + 50 seeds (12 points) × bases {1.1, 2.0, 5.0}; C(10,6)=210 and C(12,6)=924 hyperplane candidates.
- Results: 600/600 cases monotone; max k seen = 6 (10-point) and 7 (12-point); still no 6→7 rebounds.
- SC8: Remains exact (0 mismatches).
- Takeaway: even with 12 points the k-path stays monotone; need more points or targeted frontier filtering to expose n+1 → n+2 transitions.
- Artifact: experiments/exp328_sc8_6d_dense.py and results/exp328_sc8_6d_dense.json
Experiment 330: Quick 6→7 hunt (solver optimised)
- Upgraded the 6D solver to Fraction-based Gaussian elimination (replaces Cramer’s rule) for large candidate sets; a minimal sketch of the idea follows this entry.
- Config: 120 seeds × bases {1.1, 2.0, 5.0}; 10 points with unique y (210 candidates per dataset).
- Results: 360/360 cases monotone, 0 non-monotone; max k seen = 6 (no k ≥ 7 candidates in this setup).
- Implication: 10-point 6D datasets seldom produce 7-fits; even the follow-on 14-point chunk (exp329) remained monotone, so we likely need more points or structured sampling to see 6→7.
- Artifact: experiments/exp330_6d_quick_hunt.py and results/exp330_6d_quick_hunt.json
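A minimal sketch of the Fraction-based elimination idea (not the exp330 code itself), assuming each candidate hyperplane y = a₀ + a₁x₁ + … + aₙxₙ is determined by n+1 chosen points and solved exactly over rationals:

```python
from fractions import Fraction

def hyperplane_through(points):
    """Exact coefficients (a0, a1, ..., an) of y = a0 + a1*x1 + ... + an*xn
    through n+1 points, each given as (x_vector, y).  Gauss-Jordan
    elimination over Fractions; returns None for degenerate point sets."""
    m = len(points)                       # m = n + 1 unknowns
    rows = [[Fraction(1)] + [Fraction(x) for x in xs] + [Fraction(y)]
            for xs, y in points]
    for col in range(m):
        piv = next((r for r in range(col, m) if rows[r][col] != 0), None)
        if piv is None:
            return None                   # degenerate configuration
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = rows[col][col]
        rows[col] = [v / inv for v in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return tuple(row[-1] for row in rows)

# e.g. the plane y = 1 + 2*x1 - x2 through three points with two features
print(hyperplane_through([((0, 0), 1), ((1, 0), 3), ((0, 1), 0)]))
```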
Experiment 329 (chunked): Targeted 14-point sweep in 6D
- Added CLI controls (seeds/points/range/bases/output) so exp329 can run in manageable chunks without timeouts.
- Chunk run: 60 seeds × bases {1.1, 2.0, 5.0}, 14 points, coord range ±6 (180 cases; 3003 candidates each).
- Results: Max k = 7 in 6 configs, but 0 k=7 candidates on the Pareto frontier ⇒ 180/180 cases monotone (no 6→7 violations).
- 16-point slice (seeds 60–69): 30/30 cases monotone, max k = 6; zero k=7 candidates even with 8008-candidate sets ⇒ geometry, not path shielding, is the blocker.
- 18-point slice (seeds 0–7): 24 cases; 3 configs have k=7 candidates, 1 has k=7 on the Pareto frontier with k=6 co-present; still 24/24 monotone (0 violations), max k = 7.
- Implication: random 14–18 point 6D datasets rarely produce frontier 7-fits (only 1 frontier 7 so far) and remain monotone; need more points or structured/biased sampling (contrast with forced-collinear exp331).
- Artifacts: experiments/exp329_6d_violation_hunt.py (updated), results/exp329_6d_violation_hunt_n14_s60.json, results/exp329_n16_seeds10_start60.json, results/exp329_n18_seeds8_start0.json
SC7/SC8 Static Conditions Analysis (2026-01-02 Night)
Experiment 322: Shielding Mechanism Behind SC7 Coverage Gap
- Goal: explain why SC7 (reachability-based monotonicity condition) misses ~2.2% of monotone cases.
- Config: 1000 seeds × bases r ∈ {1.1, 2.0, 5.0} (3000 cases); distinct y-values.
- Result: 2992/3000 cases monotone; SC7 false negatives = 67/3000 (2.23% of monotone).
- Key finding: 97/102 minimal-L starts in FN cases (95.1%) are "shielded" — an earlier lower-k transition blocks the path to any higher-k crossover. The remaining 5 starts (4.9%) have no higher-k reachable from their specific position but SC7 triggers from another minimal-L start in the same case.
- Implication: SC7 is conservative because it considers ALL winners at each step, not just the optimal (first) one.
- Artifact: experiments/exp322_sc7_failure_forensics.py and results/exp322_sc7_failure_forensics.json
Experiment 323: SC8 Refined Exact Condition
- Goal: test whether checking only the FIRST (optimal) winner at each step achieves exact monotonicity prediction.
- Config: 1000 seeds × bases r ∈ {1.1, 2.0, 5.0} (3000 cases).
- Key Result: SC8 achieves 100% accuracy with 100% coverage!
- SC7: TP=2925, FP=0, TN=8, FN=67 (97.77% coverage)
- SC8: TP=2992, FP=0, TN=8, FN=0 (100% coverage)
- Insight: SC8 = H16 (exact reachability criterion) expressed as a condition. SC7 is conservative because it flags ANY reachable higher-k, even when the optimal path never visits it. SC8 checks only the FIRST (earliest crossover) winner, matching the actual optimal path.
- Definition difference (a sketch of both checks follows this entry):
- SC7: max{k(C') : C' is a winner at this step} ≤ k(current)
- SC8: k(first winner at this step) ≤ k(current)
- Artifact: experiments/exp323_sc8_refined_condition.py and results/exp323_sc8_refined_condition.json
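To make the per-step distinction concrete, here is a minimal sketch (not the exp323 script), assuming each candidate is an exact (L, R, k) triple of data loss, regularisation penalty, and exact-fit count, a “winner” from the current optimum is any candidate overtaking it at some finite λ* = (L′ − L)/(R − R′), and the path starts from a minimal-L candidate:

```python
from fractions import Fraction

def winners(current, candidates):
    """Candidates that overtake `current` as λ grows, with crossover λ*.
    C' overtakes C at λ* = (L'-L)/(R-R'), finite only when R' < R and L' > L.
    Entries should be ints/Fractions; ties at the same λ* are sorted arbitrarily."""
    L, R, _ = current
    out = []
    for Lp, Rp, kp in candidates:
        if Rp < R and Lp > L:
            out.append((Fraction(Lp - L, R - Rp), (Lp, Rp, kp)))
    return sorted(out)

def sc7_sc8(start, candidates):
    """Trace the optimal path from a minimal-L `start`; return (SC7, SC8)."""
    sc7 = sc8 = True
    cur = start
    while True:
        step = winners(cur, candidates)
        if not step:
            return sc7, sc8
        # SC7: flag ANY winner at this step with higher k than the current one
        if any(c[2] > cur[2] for _, c in step):
            sc7 = False
        # SC8: only the first (earliest-crossover) winner matters
        first = step[0][1]
        if first[2] > cur[2]:
            sc8 = False
        cur = first          # the optimal path moves to the first winner
```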
Experiment 324: SC7/SC8 in 3D
- Goal: test whether SC8 remains exact in 3D (two-slope hyperplanes) and quantify SC7's conservatism.
- Config: 500 seeds × bases r ∈ {1.1, 2.0, 5.0}; 6 points with unique y (1500 cases).
- Monotonicity: 1498/1500 cases monotone (99.87%); the 2 non-monotone cases have k-paths 3→4→1 and 3→4→3→1.
- SC7: TP=1471, FP=0, TN=2, FN=27 (98.20% coverage of monotone cases).
- SC8: TP=1498, FP=0, TN=2, FN=0 — 100% accuracy/coverage in 3D.
- Artifacts: experiments/exp324_sc8_3d_condition.py and results/exp324_sc8_3d_condition.json
3D Reachability & Static Conditions (2026-01-02 Evening)
Experiment 319: 3D Reachability Structure
- Extended reachability analysis to 3D (n=2) to check for higher-k violations beyond 2→3.
- Config: 500 seeds × bases r ∈ {1.1, 2.0, 5.0}; 6 data points per dataset.
- Key Finding: Higher-k violations DO occur: 3→4 transitions found (2 cases).
- Monotone in 99.87% (1498/1500); non-monotone in 0.13% (2 cases, both 3→4).
- Initial k distribution: k=3 (1709), k=4 (51), k=1 (290).
- Pattern: Violations follow n+1 → n+2 (2→3 in 2D, 3→4 in 3D).
- Artifacts: experiments/exp319_reachability_3d.py
Experiment 320: Static Monotonicity Conditions
- Tested three static conditions to predict monotonicity without full path tracing.
- SC1 ("No higher-k frontier point has lower R than min-L"): 99.80% accurate, 3 FP.
- SC2 ("Frontier k non-increasing sorted by R"): 2.70% accurate, 0 FP — valid sufficient condition but very conservative.
- SC3 ("k non-increasing on R-path from min-L"): 99.73% accurate, 8 FP.
- Artifacts: experiments/exp320_static_monotonicity_condition.py
Experiment 321: Refined Static Conditions
- Developed SC7: "No higher-k candidate is reachable from any minimal-L start."
- SC7 has 0 false positives and 97.77% coverage — a practical sufficient condition!
- Comparison: SC4 (97.73%, 2 FP), SC5 (96.50%, 6 FP), SC7 (97.77%, 0 FP).
- Theorem: If SC7 holds, k(λ) is guaranteed to be monotone non-increasing.
- Artifacts: experiments/exp321_refined_static_condition.py
Frontier & Monotonicity Updates (2026-01-02 PM)
Experiment 306: Analytical Breakpoints in 3D
- Exact λ breakpoints for 30 seeds (p=2, r ∈ {1.1, 2, 5}); candidates include quadruples, axis-aligned, horizontals.
- H15 supported: achievable k-values match Pareto frontier k-values exactly.
- Skipping is rare: 0/30 at r=1.1; 1/30 at r=2 and r=5 (seed 12 skips k=3); avg frontier size ≈ 5–6.
- Artifact: experiments/exp306_analytical_breakpoints_3d.py
Experiment 307: H6 Counterexamples
- Search over 1000 random 2D datasets (bases r ∈ {1.1, 2, 5}) for k increases as λ grows.
- Found 36/1000 seeds (1.2%) with k: 2 → 3 increases; 72% have ≥3 points sharing the same y-value.
- Mechanism: horizontal lines with R=0 fit many points and overtake tilted lines as λ increases.
- Artifacts: experiments/exp307_h6_counterexamples.py, experiments/results/exp307_h6_counterexamples.json
Experiment 308: Unique-y Monotonicity Audit
- Enforced unique y-values (no horizontal degeneracy), 1000 seeds × bases r ∈ {1.1, 2, 5}.
- Non-monotone cases persist: 8/3000 seed-base pairs (0.27%) still show k: 2 → 3 increases.
- Mechanism: competing k=3 candidates with different slope penalties; mid-λ region prefers a lower-R k=2 before a gentler k=3 retakes optimality.
- Artifacts: experiments/exp308_general_position_monotonicity.py, experiments/results/exp308_general_position_monotonicity.json
Experiment 315: Structural Monotonicity Criterion
- Tested the Pareto frontier k-ordering rule (k non-increasing along the frontier sorted by L↑, R↓) against exact optimal-path monotonicity; a sketch of the rule follows this entry.
- Config: 1000 seeds, bases r ∈ {1.1, 2.0, 5.0}, unique y-values; traced optimal paths from every minimal-L candidate.
- Results: Frontier k-ordering is ~99.7% accurate but not exact (2 false positives, 3 false negatives). Exact non-monotone rate unchanged at 0.27% (8/3000, all 2→3 jumps).
- Takeaway: Need an “effective frontier” notion that filters unreachable frontier points.
- Artifacts: experiments/exp315_structural_monotonicity.py, experiments/results/exp315_structural_monotonicity.json
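For concreteness, here is a minimal sketch of the frontier rule being tested (illustrative, not the exp315 script), assuming each candidate hyperplane is summarised as an (L, R, k) triple of data loss, regularisation penalty, and exact-fit count:

```python
def pareto_frontier(candidates):
    """Candidates not dominated in (L, R): nothing else is <= in both
    coordinates and < in at least one.  Input/output: list of (L, R, k)."""
    frontier = []
    for L, R, k in candidates:
        dominated = any(
            L2 <= L and R2 <= R and (L2 < L or R2 < R)
            for L2, R2, _ in candidates
        )
        if not dominated:
            frontier.append((L, R, k))
    return frontier

def frontier_k_nonincreasing(candidates):
    """exp315's rule: sort the frontier by L ascending (hence R descending,
    ignoring exact ties) and check that k never increases along it."""
    ks = [k for _, _, k in sorted(pareto_frontier(candidates))]
    return all(a >= b for a, b in zip(ks, ks[1:]))

# a frontier whose k runs 3, 2, 1 in order of increasing L is monotone
print(frontier_k_nonincreasing([(1, 9, 3), (4, 5, 2), (9, 1, 1), (5, 7, 1)]))
```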
Experiment 316: Reachability Analysis
- Dissected the exp315 mismatches: traced optimal paths from every minimal-L candidate to see why frontier k-ordering fails.
- False positives (seeds 513, 530): Off-frontier minimal-L candidates can traverse higher-k off-frontier points (k-path 2→3→…).
- False negatives (seeds 478, 848): Frontier k=3 points are unreachable from minimal-L starts; paths stay monotone (2→2→1).
- Artifacts: experiments/exp316_reachability_analysis.py, experiments/results/exp316_reachability_analysis.json
Experiment 317: Exact Reachability Criterion
- Verified the reachability-based monotonicity criterion across 2000 seeds × bases r ∈ {1.1, 2.0, 5.0}.
- Result: 6000/6000 matches between the reachability check and ground-truth optimal paths (criterion is exact).
- Comparison: Pareto k-ordering is 99.82% accurate (11 mismatches). Non-monotone rate remains 0.27% (16/6000, all 2→3 jumps).
- Artifacts: experiments/exp317_reachability_criterion.py, experiments/results/exp317_reachability_criterion.json
Experiment 318: Reachability Structure Audit
- Quantified how minimal-L candidates drive monotonicity (1000 seeds × bases r ∈ {1.1, 2.0, 5.0}).
- Monotone in 99.73% (2992/3000); 8 non-monotone cases, all 2→3 transitions.
- Minimal-L multiplicity: 1 candidate in 64% of datasets; up to 6 in rare cases. Off-frontier minimal-L candidates appear in 31.5% of datasets.
- Non-monotone paths start on the frontier 7 times and off the frontier 2 times; all increases are 2→3.
- Artifacts: experiments/exp318_reachability_structure.py, experiments/results/exp318_reachability_structure.json
Regularization Threshold Experiments (2026-01-02)
MAJOR FINDING: Hypothesis H12 (intermediate k-values always achievable) is FALSE.
Experiment 300: Systematic 1D Threshold Analysis
- Purpose: Map k(λ) function for 1D case (n=1, expecting k=2 at λ=0).
- Config: p ∈ {2, 3, 5}, 30 datasets each, λ ∈ [0, 100].
- Results: k(0) = 2 in 100% of cases (validates n+1). k(∞) ∈ {1, 2}. Monotonic decrease confirmed.
- Threshold formula validated: λ* = (L₂-L₁)/(b₁²-b₂²).
- H12 TRUE for 1D: all intermediate k values achieved.
- Artifacts: experiments/exp300_regularization_thresholds.py
Experiment 301: 2D Threshold Analysis
- Purpose: Map k(λ) function for 2D case (n=2, expecting k=3 at λ=0).
- Config: p=2, 30 random datasets, 5 points each, λ ∈ [0, 50].
- Results: k(0) ≥ 3 in 100% of cases (validates n+1). Transitions: 3→2 (26 times), 2→1 (7 times), 3→1 (2 times).
- H12 INVALIDATED: Seeds 4 and 23 skip k=2 entirely.
- Artifacts: experiments/exp301_regularization_2d.py
Experiment 302: Skipped k-Value Analysis
- Purpose: Understand WHY some datasets skip intermediate k values.
- Config: Deep analysis of counterexample seeds 4 and 23.
- Mechanism discovered: For k to be optimal at some λ, need (Lₖ₊₁−Lₖ)/(Rₖ−Rₖ₊₁) < λ < (Lₖ₋₁−Lₖ)/(Rₖ−Rₖ₋₁). When this interval is empty, k is skipped (see the sketch after this entry).
- In counterexamples: k=2 candidates have unfavorable loss/regularization tradeoff vs both k=3 and k=1.
- New hypothesis H15: achievable k-values = Pareto frontier coverage in (L, R) space.
- Artifacts: experiments/exp302_skipped_k_analysis.py
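A small helper makes the emptiness test explicit (variable names are illustrative, not from the exp302 script); the numbers in the usage line are made up purely to show a skipped k:

```python
def k_interval(L_km1, R_km1, L_k, R_k, L_kp1, R_kp1):
    """λ-interval on which candidate k beats both neighbours k-1 and k+1.

    k beats k+1 for λ > (L_{k+1}-L_k)/(R_k-R_{k+1}) and beats k-1 for
    λ < (L_{k-1}-L_k)/(R_k-R_{k-1}); k is skipped when the interval is empty.
    Use Fractions for exact arithmetic."""
    lo = (L_kp1 - L_k) / (R_k - R_kp1)
    hi = (L_km1 - L_k) / (R_k - R_km1)
    return (lo, hi) if lo < hi else None   # None => k is never optimal

# illustrative numbers: k=3 has (L,R)=(2,9), k=2 has (5,8), k=1 has (10,1)
print(k_interval(10, 1, 5, 8, 2, 9))   # None -> k=2 is skipped
```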
Experiment 303: Pareto Frontier Coverage
- Purpose: Test H15 by comparing k(λ) to the (data-loss, reg) Pareto frontier.
- Config: 30 seeds, p=2, r ∈ {1.1, 2, 5}, λ ∈ [0, 20]; candidates include all triples, axis-aligned, and horizontal planes.
- Results: For all 90 datasets, every observed k(λ) is on the Pareto frontier (0 violations of H15). Frontier size ≈ 4.0 (r=1.1) and 3.2 (r=2,5) vs ~29 total candidates.
- Skipping persists but is rare: 2/30 seeds (r=1.1) and 1/30 seeds (r=2,5) still skip an intermediate k.
- Some frontier k (usually k=1) only activate beyond λ=20, explaining remaining frontier-only entries.
- Artifacts: experiments/exp303_pareto_frontier_coverage.py, experiments/results/exp303_pareto_frontier.json
Experiment 304: 3D Pareto Frontier Coverage
- Purpose: Extend H15 validation to 3D (n=3, where n+1=4 points determine a hyperplane).
- Config: 30 seeds, p=2, r ∈ {1.1, 2, 5}, λ ∈ [0, 30], m=6 points; candidates include quadruples, axis-aligned planes, and horizontal.
- Results: H15 SUPPORTED in 3D - 0/90 violations. All observed k(λ) lie on the Pareto frontier.
- n+1 property: k(0) ≥ 4 in 100% of cases (validates the unregularized theorem in 3D).
- Skip frequency initially appeared higher (5-9/30 via λ-grid sweep) but see exp306 for the exact analytical rate (~3%).
- Avg candidates: ~107, Avg frontier size: 5-6 (more complex frontier structure in 3D).
- Artifacts: experiments/exp304_pareto_frontier_3d.py, experiments/results/exp304_pareto_frontier_3d.json
Experiment 305: Analytical Lambda Breakpoints
- Purpose: Compute exact breakpoints where the optimal k changes, using λ* = (L₂-L₁)/(R₁-R₂); a sketch of the envelope trace follows this entry.
- Config: 50 seeds, p=2, r ∈ {1.1, 2, 5}; no grid sweep needed.
- Results: Confirms skipped k is EXACT (not grid artifact). E.g., Seed 4 at r=2: λ-ranges are {k=3: [0, 3.375], k=1: [3.375, ∞)}. No λ exists where k=2 is optimal.
- Skip frequency: 2-4% of datasets (1-2/50) have analytically verified skipped k.
- Seed 47 shows 4→2→1 transition (skipping k=3), a different type of skip.
- Achievable k = Pareto frontier k: confirmed analytically (not just from sampling).
- Artifacts: experiments/exp305_analytical_breakpoints.py, experiments/results/exp305_analytical_breakpoints.json
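A sketch of the grid-free breakpoint idea (illustrative, not the exp305 script), assuming each candidate is an exact (L, R, k) triple and the objective at λ is the minimum of L + λ·R over candidates:

```python
from fractions import Fraction

def k_ranges(candidates):
    """Exact λ-ranges of the optimal k, traced along the lower envelope of the
    lines L + λ·R.  `candidates` holds (L, R, k) with int/Fraction entries.
    Returns [(λ_start, k), ...]; each entry holds until the next λ_start."""
    cur = min(candidates, key=lambda c: (c[0], c[1]))   # optimal at λ = 0
    ranges = [(Fraction(0), cur[2])]
    while True:
        best = None
        for L, R, k in candidates:
            if R < cur[1] and L > cur[0]:
                # crossover where this line dips below the current one
                lam = Fraction(L - cur[0], cur[1] - R)
                # tie-break toward smaller R: that line stays optimal past lam
                if best is None or (lam, R) < (best[0], best[1][1]):
                    best = (lam, (L, R, k))
        if best is None:
            return ranges
        lam, cur = best
        if cur[2] != ranges[-1][1]:
            ranges.append((lam, cur[2]))

# the illustrative triple from the skip example above:
print(k_ranges([(2, 9, 3), (5, 8, 2), (10, 1, 1)]))
# k = 3 on [0, 1], then k = 1 from λ = 1 on (k = 2 is skipped)
```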
Experiment 306: Analytical Lambda Breakpoints in 3D
- Purpose: Extend the exact breakpoint computation to 3D datasets to confirm H15 and quantify skipping without λ-grid artifacts.
- Config: 30 seeds, p=2, r ∈ {1.1, 2, 5}; candidates are all quadruples, axis-aligned planes, and horizontals.
- Results: Achievable k-values exactly match Pareto frontier k-values (H15 supported). Skipping is rare: 0/30 at r=1.1, 1/30 at r=2 and r=5 (seed 12 skips k=3). Avg frontier size ≈ 5.0–6.0.
- Implication: The higher skip rate seen in the λ sweep (exp304) is largely a grid artifact; exact analysis shows only ~3% skipping in 3D.
- Artifacts: experiments/exp306_analytical_breakpoints_3d.py, experiments/results/exp306_analytical_breakpoints_3d.json
Latest Experiments (2025-12-25)
Experiment 277: Maximal-partition c_{v_min} audit
- Purpose: Measure whether maximal-partition candidates automatically have a positive leading margin coefficient c_{v_min}.
- Config: n ∈ {1,2,3}, p ∈ {2,3,5}, m = n+4, seeds < 500; enumerate all E of size n+1, build h′, restrict to maximal partitions.
- Results: 55,700 maximal candidates; c_{v_min} ≥ 1 in 55,685 (99.97%), c_{v_min} = 0 in 0, min c_{v_min} = -1 (15 negatives). Gains+orig = 1 slice: 988/988 have c_{v_min} ≥ 1 (100%).
- Implication: In the axiom domain, maximal partitions already give c_{v_min} ≥ 1; combined with the Lean lemma `new_zero_of_maximal_and_gains_orig_one`, this forces new = 0 without delta/cancellation analysis.
- Artifacts: experiments/exp277_maximal_partition_margin_ge_one.py.
Experiment 266: Proper max_vmin computation (k < n domain)
- Purpose: Recompute v_max as max_vmin (maximum v_min over all partitions) instead of min(h-res val) and retest the axiom.
- Config: n ∈ {1,2,3}, p ∈ {2,3,5}, m = n+4, seeds < 1000 with baseline k < n.
- Results: 669/669 gains+orig=1 cases have zero-new; max_vmin ≠ min(h-res) in 667/669, confirming the correct domain differs from min-res valuation.
- Artifacts: experiments/exp266_proper_max_vmin.py.
Experiment 267: max_vmin structure
- Purpose: Analyse how max_vmin relates to h-res valuations and anchor placement.
- Config: n ∈ {1,2}, p ∈ {2,3}, m = n+4, seeds < 500 with k < n.
- Results: Anchor sits at max_vmin in 261/261 cases; max_vmin − min_h_val is small (diff=1 in 64%, diff≤2 in 90%).
- Artifacts: experiments/exp267_max_vmin_structure.py.
Experiment 268: Anchor-in-R delta analysis
- Purpose: When the anchor is not in E, measure δ = h' − h at the anchor to see why new=0 persists.
- Config: n ∈ {1,2}, p ∈ {2,3}, m = n+4, seeds < 500, maximal candidates with gains+orig=1.
- Results: 320 anchor∉E maximal cases; val(δ at anchor) < max_vmin in 311 (97.2%), remaining 9 hit val(δ)=max_vmin but still have new=0; total new=0 in 320/320.
- Artifacts: experiments/exp268_anchor_in_R_analysis.py.
Experiment 269: Cancellation at val(δ)=max_vmin
- Purpose: Inspect the rare cases with val(δ)=max_vmin at the anchor to see if cancellation triggers h′-avoidance.
- Config: Same search space as exp268, filtered to val(δ)=max_vmin cases.
- Results: 9/9 cases show perfect leading-term cancellation, giving val(h′-res) > max_vmin.
- Artifacts: experiments/exp269_cancellation_cases.py.
Latest Experiments (2025-12-23)
Experiment 225: Anchor signature injectivity (n ≥ 2)
- Purpose: Probe whether Mechanism 1 per-P uniqueness can be derived from the anchor-in-E signature sig(E) = sorted(E \ {anchor}).
- Config: n ∈ {2,3}, p ∈ {2,3,5}, seeds 0–1999; candidates with h_min_val = max_vmin, gains+orig = 1, anchor ∈ E.
- Key results: Mechanism 1 cases = 25; signature-unique in 24/25 (max contributions per P = 1). One collision (n=3, p=2, seed=1329) where P=0 contributes to signatures {1,2,5} and {2,4,5} (candidate_count=9).
- Implication: Identity-signature path is almost universal; need to diagnose the lone collision (degeneracy vs genuine δ-coincidence) and possibly refine the signature/GP guard.
- Artifacts: experiments/exp225_anchor_signature_uniqueness.py, experiments/results/exp225_anchor_signature_uniqueness.json.
Latest Experiments (2025-12-22)
Experiment 208: Mechanism 1 delta guard (n ≥ 2)
- Purpose: Test whether δ = h' - h sits above v_max on R-points for anchor-in-E Mechanism 1 candidates, and whether that guard distinguishes successes.
- Config: n ∈ {2,3}, p ∈ {2,3,5}, seeds 0–1999; anchor-in-E candidates with h_min_val = max_vmin and gains+orig = 1.
- Key results: Anchor candidates = 128; zero-new = 124 (96.9%). Every success has δ-valuation > v_max on all R-points (delta_gap ≥ 1) and h'-valuation > v_max. All four failures have δ hitting v_max and h'-val = v_max.
- Delta vs h: 61.3% of successes have δ ≥ h on R; 38.7% succeed with δ < h but still δ > v_max, so δ > v_max is the critical guard. Success δ-gap min=1 (mean≈1.32); failures have δ-gap=0.
- Artifacts: experiments/exp208_mechanism1_delta_guard.py, experiments/results/exp208_mechanism1_delta_guard.json.
Experiment 205: Mechanism 1 R-point structure (n ≥ 2)
- Purpose: Dissect how anchor-in-E interpolants achieve new=0 when h_min_val = max_vmin and gains+orig = 1 (Mechanism 1) for n ∈ {2,3}.
- Config: Seeds 0–3999; primes p ∈ {2,3,5}. Mechanism 1 cases observed for (n,p) ∈ {(2,2), (2,3), (3,2)}.
- Key results: Anchor-in-E candidates = 293; zero-new = 263. In every zero-new case, all R-points satisfy val(h'-res) > v_max (valuation lift); no cases with val(h'-res) < v_max or mixed. Failures (30) occur exactly when some R-point stays at v_max.
- Delta behaviour: No candidate has all δ(P) below v_max; zero-new cases always exhibit δ-val > v_max somewhere. Partial-cancellation detector (h_val = δ_val = v_max, h'-val > v_max) did not trigger.
- Artifacts: experiments/exp205_mechanism1_rpoint_structure.py, experiments/results/exp205_mechanism1_rpoint_structure.json.
Experiment 204: Mechanism 1 verification for n ≥ 2
- Purpose: Target the Mechanism 1 branch (h_min_val = max_vmin) and test whether anchor-in-E always yields new=0 in higher dimensions.
- Config: n ∈ {2,3}, p ∈ {2,3,5}, seeds < 3000.
- Key results: Mechanism 1 cases = 44; any zero-new = 44/44 (100%); anchor-in-E success = 44/44 (100%); all anchor-in-E succeed = 23/44 (52.3%).
- Insight: Unlike 1D, n ≥ 2 shows no failures—there is always at least one anchor-in-E choice that zeros out new_{v*}.
- Artifacts: experiments/exp204_mechanism1_verification.py, experiments/results/exp204_mechanism1_verification.json.
Latest Experiments (2025-12-21)
Experiment 187: Anchor residual gap audit
- Purpose: Test the hypothesized anchor-residual valuation jump when gains+orig=1 at maximal v_min and identify the true zero-new mechanism.
- Config: 189 gains+orig=1 cases across n=2,3; p=2,3,5; seeds < 700; exhaustive maximal-partition analysis.
- Key results: Zero-new candidate in 100% of cases. Anchor-in-E candidates: 751 total, 747 have new_{v*} = 0 (four outliers with new_{v*} = 2). Anchor-in-R residual valuation > v_min is rare (16/384) and never zero.
- Implication: Zero-new is driven by selecting the anchor point in E; the anchor-in-R valuation jump is not reliable. Formal target shifts to proving “anchor ∈ E ⇒ new_{v*} = 0 (or ≤ 1)” and explaining the four outliers (seeds (2,2,583), (2,3,669 twice), (3,2,392)).
- Artifacts: experiments/exp187_anchor_residual_gap.py, experiments/results/exp187_anchor_residual_gap.json.
Experiment 186: Zero-new mechanism analysis
- Purpose: Investigate the structural mechanism behind zero-new existence when gains+orig=1.
- Config: Deep analysis of 148 cases with gains+orig=1 across n=2,3; p=2,3,5; 500 seeds per configuration.
- Key results: 100% of gains+orig=1 cases (148/148) have a zero-new candidate. Zero-new cases where anchor ∈ E: 364; anchor ∉ E: 213.
- Mechanism: The "anchor point" (unique h-residual at v_min) can be in E (gains=1) or R (orig=1). Follow-up exp187 shows anchor-in-E drives zero-new; the earlier anchor-in-R valuation-jump hypothesis is largely false.
- Implication: Structural path for axiom elimination—prove that for any point A with h-val(A) = v_min, there exists E with A ∈ E giving new_{v*} = 0 (or tightly bounded).
- Artifacts: experiments/exp186_zero_new_mechanism.py, experiments/results/exp186_zero_new_mechanism.json.
Experiment 185: Zero-new existence analysis
- Purpose: Test whether a zero-new candidate (new_{v*} = 0) always exists at maximal v_min, and analyze gains+orig=1 cases.
- Config: 1200 k < n cases across n=2,3; p=2,3,5; 200 seeds per configuration.
- Key results: Satisfier (c_{v*} ≥ 1) exists in 100% of cases (1200/1200). For 62 cases with gains+orig=1, zero-new candidate exists in 100% (62/62).
- Implication: Zero-new existence is guaranteed when gains+orig=1, validating the structural approach to Step 2 axiom elimination.
- Artifacts: experiments/exp185_zero_new_existence.py, experiments/results/exp185_zero_new_existence.json.
Experiment 184: Gap-failure diagnostics
- Purpose: Reconstruct and inspect the lone avg_new > gains+orig-1 case from exp183 to understand why averaging missed while min-new still succeeds.
- Config: Replayed the failing seed (n=2, p=2, seed=80) with the same dataset/hyperplane generation as exp183; collected full per-valuation counts for all candidates in the maximal v_min partition.
- Key results: Partition size=10, coeff_vmin=0, gains+orig=1, min_new=0, avg_new=0.2. Distribution: 9/10 candidates have new_{v*}=0 (c=1); 1/10 has new_{v*}=2 (c=-1), solely causing the negative avg gap.
- Implication: The averaging lemma fails only because of a single high-new outlier; a structural bound on the count of high-new candidates or a direct zero-new witness for gains+orig=1 should close the gap.
- Artifacts: experiments/exp184_gap_failure_diagnostics.py, experiments/results/exp184_gap_failure_diagnostics.json.
Experiment 183: Max-partition new-mass audit
- Purpose: Measure total/average new_{v*} inside the maximal coefficient-v_min partition to test whether averaging alone forces a satisfier.
- Config: k < n cases with n = 2,3; primes p ∈ {2,3,5}; 1200 random integer datasets/hyperplanes.
- Key results: gains+orig fixed and ≥ 1 in 100%; min-new candidate is a satisfier in 100%; avg_new ≤ gains+orig-1 in 99.9% (1199/1200). Avg gap (g+o-1 - avg_new) mean ≈ 1.43 (min -0.2). Partition size mean ≈ 11 (max 15).
- Implication: Averaging lemma (`exists_min_newAtVal_le_avg`) almost suffices by itself; a structural bound on total new_{v*} or ruling out the lone avg-gap-negative seed should close the axiom-free satisfier proof.
- Artifacts: experiments/exp183_max_partition_new_mass.py, experiments/results/exp183_max_partition_new_mass.json.
Experiment 179: Max-partition min-new gap audit
- Purpose: Measure how the min-new candidate in the maximal coefficient-v_min partition compares to the gains+orig baseline to support Step 2 axiom elimination.
- Config: k < n cases with n = 2,3; primes p ∈ {2,3,5}; 1188 random integer datasets/hyperplanes.
- Key results: gains+orig fixed and ≥ 1 in 100% of cases; min-new candidate has c_{v*} ≥ 1 in 100% of cases; min_new = 0 in 27.0%. c_gap (gains+orig - min_new) ranges 1–6 (mean ≈ 2.87).
- Implication: Strong empirical support for the `exists_min_newAtVal_le_avg` strategy—bounding total new_{v*} mass should yield a formal satisfier without the fit-preserving axiom.
- Artifacts: Summary JSON at experiments/results/exp179_max_partition_min_new.json; script experiments/exp179_max_partition_min_new.py.
Latest Experiments (2025-12-20)
Experiment 173: Residual denominator coprimality
- Purpose: Empirically test the coprimality hypothesis now required by the Lean proof.
- Config: Random rational hyperplanes/data with n=3, m=8, denominators sampled from 1–6; primes p ∈ {2,3,5}; 2,000 trials per prime.
- Outcome: If denominators are allowed to contain p, 52–70% of nonzero residual denominators are divisible by p; if all input denominators are p-free, 0/47k nonzero residuals have denominators divisible by p (0% failure).
- Implication: Coprimality is nontrivial but appears stable under p-free inputs; motivates a Lean lemma propagating p-freeness through residuals.
- Script: ad-hoc Python snippet run from the repository root (see session log 2025-12-20 06:00 AEDT).
Experiment 162: Crossover rate for r=5 minimiser
- Purpose: Estimate how often the fit-preserving minimiser selected at r=5 develops a negative margin at other bases.
- Config: n=2, p=2, m=8, seeds=1000; choose min-loss FP at r=5 and evaluate its margin on r ∈ [1.001, 100] (log grid).
- Outcome: Crossovers are extremely rare—3/997 valid seeds (0.30%) go negative, all with c_{v_min} = 1. Worst case seed=120 hits min_margin ≈ -1.76 at r≈2.43; seeds 62 and 110 recur with milder negatives.
- Implication: Cross-base sign flips exist but are confined to a narrow band near r≈2.2–2.4; theorem unaffected because each base’s minimiser acts as the witness.
- Files: experiments/exp162_margin_crossover_rate.py, experiments/results/exp162_margin_crossover_rate.json
Experiment 161: Deep analysis of margin crossover cases
- Purpose: Investigate why min-loss FP at r=5 can have negative margin at r≈2, and verify theorem still holds.
- Configs: Deep dive into crossover cases from exp160 (n=2, p=2, seeds 62 and 110).
- Outcome: At every base r > 1, SOME h' beats h (✓). Different h' selected at different bases. At r=2, 4/21 candidates beat h0; at r=5, 6/21 beat h0. The min-loss at r=2 is different from min-loss at r=5.
- Key insight: The theorem "∃h' that beats h" is about existence at a FIXED base r, not about a universal witness across all bases.
- Files: experiments/exp161_crossover_analysis.py
Experiment 160: Min-loss FP at r=5 tested at all bases
- Purpose: Does the min-loss FP selected at r=5 have positive margin for ALL bases r > 1?
- Configs: n ∈ {2,3,4}, p ∈ {2,5}, 120 seeds; test h* (min-loss at r=5) across r from 1.001 to 1000.
- Outcome: 2 crossover cases found! (n=2, p=2, seeds 62 and 110). Margin crosses zero at r ≈ 2.2 despite c_{v_min} = 1 > 0.
- Mechanism: The margin polynomial M(r) = r² - 4r + 3 + 2/r² has a positive leading coefficient but a negative intermediate term (-4r), causing the crossover (a numeric check follows this entry).
- Implication: c_{v_min} > 0 is NOT sufficient to guarantee margin > 0 across all bases. However, this doesn't break the theorem—different bases select different optimal h'.
- Files: experiments/exp160_minloss_r5_margin_all_bases.py, experiments/results/exp160_minloss_r5_margin_all_bases.json
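A quick numeric check of the quoted polynomial (illustrative only); the point is that a positive leading term does not prevent a dip below zero at moderate r:

```python
def M(r):
    """Margin polynomial quoted above for the exp160 crossover cases."""
    return r**2 - 4*r + 3 + 2 / r**2

for r in (1.2, 2.0, 2.5, 3.5, 5.0):
    print(f"r = {r}: M(r) = {M(r):+.3f}")
# positive near r = 1 and for large r, negative in an intermediate band,
# which is the sign-crossover behaviour described above
```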
Experiment 157: Margin decomposition at small bases (r ∈ (1,3])
- Purpose: Understand how the margin stays positive when c_{v_min} ≤ 0 and the structural bound fails at small bases.
- Configs: n=3, p ∈ {2,5}, m=8, bases r ∈ {1.2, 1.5, 2.0, 2.5, 3.0}, 60 seeds/config; enumerated the loss-minimising fit-preserving interpolant per base.
- Outcome: Margin positivity 100% for all bases; c_{v_min} ≤ 0 cases decline with base (13.3% at r=1.2 → 0% by r≥2.5). Negative heads (avg -1.3 to -2.0) are overwhelmed by positive tails (avg 3.0–4.7), giving margins ≥ 1.0. Coefficient-sum margins equal direct loss differences (max gap ≈ 1.8e-15).
- Implication: Above-v_min mass alone can dominate a nonpositive head for r>1; suggests a path to formalise positivity without the structural bound.
- Files: experiments/exp157_margin_decomposition_small_base.py, experiments/results/exp157_margin_decomposition_small_base.json
Experiment 153: Structural bound at low bases (r = 2–5)
- Purpose: Test whether the structural bound gains + orig ≥ new + 1 for the min-loss fit-preserving interpolant persists below the previously validated r ≥ 5 regime.
- Configs: Bases r ∈ {2.0, 3.0, 4.0, 5.0}; n ∈ {2,3,4}; p ∈ {2,5}; m = 2n+2; 120 seeds/config; exhaustive FP enumeration from the zero hyperplane.
- Outcome: Bound holds 100% for r=3,4,5 (716/716 each). r=2 shows 4/716 violations (gap = -2, c_at_vmin = -1) in configs (3,2 seed 99), (3,5 seed 45), (4,2 seeds 84 & 116) where new = 2 but gains+orig ≤ 1.
- Implication: Structural bound empirically “turns on” by r ≈ 3; proof target can likely lower the base assumption to r ≥ 3, with a separate treatment needed for rare r=2 failures.
- Files: experiments/exp153_low_base_structural_bound.py, experiments/results/exp153_low_base_structural_bound.txt
Latest Experiments (2025-12-15)
Experiment 146: MinResVal guard for min-loss FP
- Purpose: Test whether the below-v_min guard at v_min = minResidualValuation(h_min) (Σ_{v<v_min} c_v ≤ |E|-1) holds for the min-loss fit-preserving interpolant.
- Configs: n ∈ {2,3,4}, p ∈ {2,5}, m = n+4, seeds 0–119; exhaustive enumeration of fit-preserving interpolants; base r = 5.0.
- Outcome: Guard holds 87.4% overall (as low as 78.3% for n=4, p=2); c_at_vmin_res > 0 in 71.7% overall (54.2% for n=2, p=2, 94.2% for n=4, p=5); max guard gap = 4.
- Implication: The minResVal witness is too weak for a |E|-1 proof route; positivity proof should pivot to the margin’s v_min (exp141-144 show 100%) or a sharper structural bound.
- Files: experiments/exp146_minresval_guard.py, experiments/exp146_results.json
Latest Experiments (2025-12-14)
Experiment 145: Below-v_min guard for min-loss FP
- Purpose: Check whether the min-loss fit-preserving interpolant satisfies the below-v_min bound Σ_{v<v_min} c_v ≤ |E|-1 needed for the canonical positivity lemma.
- Configs: n ∈ {2,3,4}, p ∈ {2,5}, m = n+4, seeds 0–119; exhaustive enumeration of fit-preserving interpolants; base r = 5.0.
- Outcome: Guard holds in 100% of 720 cases; below-v_min mass is always 0; c_min > 0 in all cases (avg ≈ 2.10); worst guard gap = -1.
- Implication: Empirical evidence that the min-loss FP already meets the below-mass ≤ |E|-1 condition, so formalising this bound in Lean should yield c_{v_min} > 0 via the canonical positivity lemma.
- Files: experiments/exp145_below_mass_guard.py, experiments/exp145_results.json
Latest Experiments (2025-12-12)
Experiment 133: Min-loss dominant coefficient at r=2.0
- Purpose: Stress near-binary selection to see when the min-loss fit-preserving interpolant fails to achieve c_{v_min} > 0.
- Configs: n ∈ {2,3,4,5}, p ∈ {2,3}, m = n+4, seeds 0–399; exhaustive enumeration; base r_ref = 2.0.
- Outcome: c_{v_min} > 0 candidate exists in 100% of 3198 cases; min-loss has c_{v_min} > 0 in 99.9% (3 failures, all p=2). Failures: n=2 seeds 192/366 (v_min=-3,-2, c_min=-1), n=3 seed 344 (v_min=-3, c_min=-1); margins still positive (≈4.6–7.9).
- Implication: Dominant positivity is tightest at r=2.0; a mild base lower bound (r_ref ≥ 5) likely guarantees c_{v_min} > 0 for min-loss.
- Files: experiments/exp133_minloss_base2_failures.py, results/exp133_minloss_base2_failures.json
Experiment 128: Gap-to-positive dominance guard rescue
- Purpose: Measure how far we must drop in valuation to find c_{v_min} > 0 when the max-v_min guard fails.
- Configs: n ∈ {2,3,4}, p ∈ {2,3}, m = n+4, seeds 0–199; exhaustive enumeration of fit-preserving interpolants.
- Outcome: Max-v_min success = 99.75% (1197/1200). In all 3 strict failures, the best positive candidate sits exactly one valuation level below the top v_min (gap histogram: {0: 1197, 1: 3}).
- Failure fingerprint: The top level has 1–2 candidates with c_min = -1 driven by “new-only” residuals; the v_min-1 level has many candidates with c_min = +2.
- Files: experiments/exp128_gap_level_rescue.py, results/exp128_gap_level_rescue.json
Experiment 126: Max-v_min failure analysis
- Purpose: Diagnose the rare cases where the max-v_min dominance guard yields c_min ≤ 0.
- Configs: n ∈ {2,3,4}, p ∈ {2,3}, m = n+4, seeds 0–199; exhaustive enumeration of fit-preserving interpolants.
- Outcome: All max-v_min candidates positive in 1172/1200 cases (97.7%); any positive at max-v_min in 1197/1200 (99.8%). Strict failures = 3 seeds, always with the positive candidate one valuation level below the top.
- Failure anatomy: Top v_min is “new-only” — gains_vmin + orig_vmin = 0 (or 1) while new_vmin ≥ 1, so c_min = -1 with only 1–2 candidates at that level. Dropping one valuation level recovers c_min > 0.
- Files: experiments/exp126_max_vmin_failure_analysis.py, results/exp126_max_vmin_failure_analysis.json
Experiment 122: Dominance-guarded selection across bases
- Purpose: Stress-test the dominance guard (min-loss subject to c_{v_min} > 0) under near-binary and moderate reference bases.
- Configs: n ∈ {4,5}, p ∈ {2,3}, m = n+4, seeds 0–119; r_ref ∈ {1.1, 1.3, 1.5, 2.0, 3.0, 5.0}; margin checked on r ∈ {1.05…50}.
- Results: A c_min > 0 candidate existed in 100% of cases. Guarded selection achieved 100% positive margins on the full r-grid for every config/base. Un-guarded min-loss had c_min > 0 in 94–100% of cases, with misses confined to near-binary r_ref for p=2.
- Safety margins: Worst guarded min-margin ≥ 3.88 (n=4, p=3); p=2 cases stayed ≥ 4.04, indicating robust positivity.
- Files: experiments/exp122_dominance_guard_grid.py, results/exp122_dominance_guard.json
Experiment 115: High-d margin polynomial vs r_ref
- Purpose: Stress-test the margin polynomial for min-loss fit-preserving interpolants in higher dimensions and check sensitivity to the reference base r_ref used for selection.
- Configs: n ∈ {4,5}, p ∈ {2,3}, m = n+4, seeds 0–39; r_ref ∈ {1.5, 5.0, 20.0}; exhaustive min-loss search over all extra-point choices.
- Outcome: 479/480 cases had c_{v_min} > 0 and margin > 0 on r ∈ {1.05…100}. One failure at n=4, p=2, r_ref=1.5 (seed 21) had dominant term c_{-2} = -1, sending the margin negative for large r. The same dataset is fine at r_ref ≥ 5.
- Implication: Mid/heavy r_ref choices remain robust, but near-binary r_ref can pick a harmful interpolant. Proof/algorithmic guarantees may need either r_ref lower bounds or a secondary guard on the dominant valuation.
- Files: experiments/exp115_margin_polynomial_highdim.py, results/exp115_margin_polynomial_highdim.json
Experiment 114: Margin polynomial positivity sweep
- Purpose: Test whether the min-loss fit-preserving interpolant has a positive margin polynomial across r > 1.
- Configs: k < n cases with n ∈ {2,3}, p ∈ {2,3}, m = n+4, 200 seeds each; min-loss chosen at r_ref = 5.0 via exhaustive search.
- Outcome: 800/800 cases had positive dominant coefficient c_{v_min}; margin stayed > 0 on r ∈ {1.05…50}. Worst min margin ≈ 0.0016 (n=2, p=2).
- Implication: Strong evidence the margin polynomial for the min-loss FP never crosses zero, reinforcing the offset < gain conjecture and suggesting a direct Lean proof via valuation counts.
- Files: experiments/exp114_margin_polynomial_sweep.py, results/exp114_margin_polynomial_sweep.json
Experiment 113: Dominant term distribution analysis
- Purpose: Analyze the distribution of the dominant (lowest valuation) term coefficient in the margin polynomial across many seeds.
- Configs: n=2, p=2, 200 seeds analyzed; computed full margin polynomial for each.
- Outcome: 100% have positive dominant coefficient c_{v_min} > 0. Distribution: c_min=1 (105 cases), c_min=2 (51), c_min=3 (23), c_min≥4 (21).
- Key Finding: The coefficient c_{v_min} = (gains at v_min) + (orig R at v_min) - (new R at v_min) is always positive (a tallying sketch follows this entry).
- Implication: The min-loss selection criterion forces the dominant term positive, ensuring margin > 0 for all r.
- Files: experiments/exp113_dominant_term.py
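How the coefficients can be tallied, as a minimal sketch (function and variable names are illustrative, not from the exp113 script): the margin loss(h) − loss(h′) is Σ_v c_v·r^{-v}, where c_v counts nonzero h-residuals at valuation v minus nonzero h′-residuals at valuation v; newly fitted points drop out of the second count, which plays the role of the “gains” term in the bullet above.

```python
from collections import Counter
from fractions import Fraction

def margin_coefficients(old_vals, new_vals):
    """Coefficients c_v of margin = loss(h) - loss(h') = Σ_v c_v · r^(-v).

    old_vals / new_vals list the p-adic valuations of the NONZERO residuals
    of h and h' on the data; exactly fitted points contribute no term."""
    c = Counter(old_vals)
    c.subtract(Counter(new_vals))
    return {v: cv for v, cv in c.items() if cv != 0}

def margin(old_vals, new_vals, r):
    """Evaluate the margin polynomial at an integer/Fraction base r."""
    coeffs = margin_coefficients(old_vals, new_vals)
    return sum(cv * Fraction(r) ** (-v) for v, cv in coeffs.items())

# illustrative: h leaves residual valuations [0, 0, 1]; the fit-preserving h'
# newly fits the valuation-1 point and leaves residuals at valuations [2, 0]
coeffs = margin_coefficients([0, 0, 1], [2, 0])
print(coeffs, "c_vmin =", coeffs[min(coeffs)],
      "margin(r=2) =", margin([0, 0, 1], [2, 0], 2))
```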
Experiment 112: Algebraic margin formula analysis
- Purpose: Derive the symbolic margin formula and understand its structure.
- Configs: Seeds 0, 1, 2, 10, 50, 100, 186, 200; n=2, p=2; verified across r values.
- Outcome: margin = Σ c_v·r^{-v} is a polynomial in r^{-1}. Positive at both r→1 (where margin→|E|) and r→∞ (where dominant term rules).
- Key Finding: The margin structure explains why offset < gain: the polynomial is positive at both limits and has no intermediate roots.
- Files: experiments/exp112_algebraic_margin.py
Experiment 111: Seed 186 deep analysis (worst case)
- Purpose: Understand why seed 186 consistently produces the highest offset/gain ratio across all r values.
- Configs: Detailed analysis of seed 186 (n=2, p=2) including residual valuations and FP interpolant comparison.
- Outcome: Margin formula = 2·r^{-1} + r^{-4}. Both coefficients positive! Valuation changes at remaining points: {(1,-2), (4,0), (0,2)}.
- Key Finding: The worst case (approaching offset/gain = 1 as r→∞) still has margin = 2/r + 1/r⁴ > 0 for all r > 0.
- Implication: The tight bound is algebraically enforced by the selection criterion, not accidental.
- Files: experiments/exp111_seed186_analysis.py
Experiment 110: High-d offset/gain bound audit
- Purpose: Check whether the empirical bound offset(h*) < gain(h*) for the loss-minimising fit-preserving interpolant persists in higher dimensions and at heavy bases.
- Configs: n ∈ {3,4}, p ∈ {2,3}, r ∈ {2,5,10,20,50}, m ≈ n+3..5, 30–80 trials each; enumerated all fit-preserving interpolants per trial.
- Outcome: Success rate 100% across all configs; anchor rate 75–93%. Global max offset/gain = 0.9257 (n=3, p=2, r=20) < 1. n=4 stays well below 1 (max 0.669 at r=10; 0.495 at r=50).
- Implication: Strong evidence that the offset < gain bound holds beyond n=2–3 and remains strict even as r grows large; supports the Lean goal of proving the bound for the k < n case.
- Files: experiments/exp110_highdim_offset_bound.py, results/exp110_highdim_offset_bound.json
Experiment 105: Anchoring property across bases/penalties
- Purpose: Stress test whether the loss-minimising fit-preserving interpolant reduces remaining-point loss (anchoring) across near-binary and heavy-base regimes.
- Configs: (n,p,r,trials) = {(2,2,1.2,200), (2,2,5.0,200), (3,2,2.0,150), (2,3,1.5,200), (3,3,3.0,150)} with m = n+4.
- Outcome: Anchor rate (offset ≤ 0) = 74–88%; success rate (margin > 0) = 100% in every config. Min-loss rarely equals min-offset (10–30%) yet always wins. Avg offsets stay negative (≈ -0.36…-9.28), margins strongly positive (≈ 3.7–51).
- Implication: Proof should target the loss minimiser directly; even when anchoring fails, gain dominates offset.
- Files: experiments/exp105_anchoring_property.py, results/exp105_anchoring_property.json
Experiment 104: Min-loss FP interpolant analysis
- Purpose: Dissect why the min-loss (min-offset) fit-preserving interpolant always beats h.
- Configs: n ∈ {2,3,4}, p ∈ {2,3}, r matched to p; 2300+ valid trials total.
- Outcome: Offsets for the min-offset/min-loss choice are NEGATIVE on average in every config (≈ -3.2 to -6.7) and success rate is 100%. Margin is driven almost entirely by offset (avg O/G ratio ≈ -0.69…-0.98).
- Implication: Strong evidence for the “anchoring” effect—fitting more points typically lowers loss on the remaining points, making margin automatically positive.
- Files: experiments/exp104_minloss_fp_analysis.py, results/exp104_minloss_fp_analysis.json
Latest Experiments (2025-12-09)
Experiment 094: High-dim product formula stress test
- Purpose: Extend the simplest-case product formula check to higher dimensions to support removing the axiom in Lean.
- Setup: n ∈ {5,6,7,8}, primes {2,3,5,7,11}, 200 trials per dimension. Sample n fit points, build an interpolant h, add two non-fits a,b, form h_a/h_b, and test δ_a(b)·δ_b(a) = res(h,a)·res(h,b) plus valuation sums.
- Outcome: 798/800 valid trials with 0 product failures, 0 valuation-sum failures, 0 both-negative margin cases across all primes; valuation checks succeeded in 3,990/3,990 prime instances.
- Implication: Empirical support for the product formula now spans n = 1..8, bolstering confidence that the determinant/Cramer's rule proof will be dimension-agnostic.
- Files: experiments/exp094_highdim_product_formula.py, experiments/exp094_results.json
Experiment 093: Average margin over fit-preserving interpolants
- Purpose: Test whether a simple averaging argument (Σ margins over all fit-preserving choices) can prove the general loss gap.
- Setup: n ∈ {2,3}, p = 2, r = 2.0, m = n+4, 10 seeds each; enumerate all fit-preserving interpolants for a random h with 1 ≤ k < n+1 fits (strong GP only).
- Outcome: Average margin is often negative (avg_margin_positive: 3/10 for n=2, 4/10 for n=3), but the best margin is always positive (min best margin 0.75 in n=2, 1.4375 in n=3).
- Implication: Averaging over the entire fit-preserving family does NOT guarantee a positive margin. The proof must find a structural selection (e.g., greedy or product-formula-guided) to pick a good interpolant.
- Files: experiments/exp093_fit_preserving_average_margin.py, experiments/exp093_results.json
Experiment 084: Fit-Preserving Gain vs Offset
- Purpose: Decompose margins of fit-preserving interpolants into gain from newly fitted points vs offset on the remaining points to guide the loss proof.
- Setup: dims {2,3}, primes {2,5}, r=2.0, m = n+4; enumerate all fit-preserving interpolants for random h with k < n+1 fits (strong GP only).
- Outcome: Best fit-preserving interpolant always satisfies gain ≥ offset (100% of trials). Best margins stay positive (min 0.0625→2.0, medians 2.25–4.0, max up to 5.8). Offsets are often negative (median ≈ -0.375 to -0.5), meaning many choices reduce loss even away from the new fits.
- Implication: To prove `interpolant_loss_lt_of_fit_witness`, it may suffice to show existence of a fit-preserving interpolant with offset ≤ gain, then appeal to the loss-minimising interpolant.
- Files: experiments/exp084_fit_preserving_gain_offset.py, experiments/exp084_results.json
Experiment 083: Fit-Preserving Interpolant Strategy
- Purpose: Test whether any fit-preserving interpolant (through all k fits of h plus n+1-k extra points) always beats h.
- Setup: dims {2,3}, primes {2,5}, r=2.0, m = n+4; strong GP datasets; enumerate all fit-preserving interpolants.
- Outcome: At least one fit-preserving interpolant beats h in 100% of 494 valid trials, though only 0–35% beat h individually. Margins: min 0.44–1.75, median 2.25–4.0.
- Implication: The minimum over fit-preserving interpolants seems sufficient; combined with exp084, the loss-minimising interpolant should inherit the strict loss drop.
- Files: experiments/exp083_fit_preserving_interpolant.py, experiments/exp083_results.json
Experiment 082: Averaging Interpolants Through a Witness (loss gap test)
- Purpose: Test whether averaging over all interpolants through a witness non-fit d is enough to beat a non-interpolating hyperplane h (k < n+1 fits).
- Setup: dims {2,3}, primes {2,5}, r=2.0, m = n+4, 120 trials/config; enumerate all (n+1)-point interpolants containing d; strong-GP datasets only.
- Outcome: Fraction of interpolants that beat h is high (≈0.63/0.87/0.70/0.93 for (n,p)=(2,2),(2,5),(3,2),(3,5)). Best margins mostly healthy (means ≈1.96–3.88) with a single mild negative (-0.125) in n=2,p=2. **Average margins are negative for p=2 configs** (≈ -1.88 in 2D, -3.31 in 3D) but positive for p=5.
- Implication: Averaging over interpolants through d is insufficient; the Lean proof needs a minimiser-over-interpolants argument (or a concentrated subset), not a straight expectation.
- Files: experiments/exp082_interpolant_average_gain.py, experiments/exp082_results.json
Experiment 081: Interpolant Loss Minimiser Gap Audit
- Purpose: Test whether the loss-minimising interpolant h_min across all (n+1)-subsets always beats an arbitrary h with k < n+1 fits.
- Setup: dims {2,3}, primes {2,5}, r=2.0, m = n+4, 300 trials/config; only strong-GP datasets kept.
- Outcome: 100% success in every config (760 valid runs). Margins are robust: min/median/p90 ≈ {0.375, 2.125, 3.62} (n=2,p=2) up to {1.25, 4.00, 4.75} (n=3,p=5); average fit gain ≈ 2.8–3.8 points.
- Implication: Empirical evidence that the finite interpolant minimiser h_min delivers a strict loss drop, supporting a non-constructive Lean proof that bypasses the shaky perturbation argument.
- Files: experiments/exp081_interpolant_minimizer_gap.py, experiments/exp081_results.json
Experiment 074: Ultrametric Direction Feasibility
- Purpose: Test whether a perturbation direction δ in the constraint kernel can satisfy vp(eval(δ,q)/eval(δ,d)) ≥ vp(res(h,q)/res(h,d)) for all non-fits q ≠ d.
- Method: Random small datasets (m=n+3), random integer hyperplanes with k < n+1 fits; exact kernel basis via Fractions; random integer combos searched for δ.
- Results (200 trials/config): n=2,p=2 → 11.5% success; n=3,p=2 → 13.0%; n=3,p=5 → 23.5%.
- Insight: Higher primes improve feasibility (p=5 roughly doubles the hit rate vs p=2 in 3D), but naive kernel sampling succeeds only in a minority of cases.
- Next: Bias δ search toward the constructive direction from the Lean lemma and condition on kernel dimension ≥ 2 to see if success rises.
Experiment 073: Inversion Feature Stats (exp072 re-analysis)
- Setup: Post-hoc analysis of exp072's 720 datasets (dims {3,4}, primes {2,5,11}, bases {2,3,4,6,8}) using duplication/diversity features and k-path timing.
- Padic-only signature (12 cases): High diversity (avg 0.99), minimal duplication (max_y_dup ≈ 1.1), concentrated at base=6 (7/12) in 3D (11/12) with p∈{5,11}. First k-up early (≈0.31) and only 33% of k-ups tied to reg-penalty drops → valuation-gap bounces.
- L2-only signature (24 cases): Lower diversity (0.67) with heavier duplication (max_y_dup ≈ 2.9), mostly 4D (21/24) with p∈{5,11}; bases spread with base=8 largest. First k-up later (≈0.56) and always paired with reg-penalty drops (100%) → late, smooth swaps.
- Overlap (9 cases): All in 4D with p=2; high duplication (max_y_dup = 3.0), late k-ups (≈0.59–0.66), full reg-drop coupling.
- Insight: padic-only inversions prefer diverse, low-dup datasets and early valuation jumps; L2-only inversions need duplication and occur later when penalty ordering flips.
Experiment 072: Inversion Geometry Gap (p-adic vs L2)
- Setup: dims {3,4}, n=d+3, primes {2,5,11}, bases {2,3,4,6,8}, seeds=24 (720 datasets).
- Counts: padic-only 12, L2-only 24, both 9, monotone 675.
- Padic-only signature: concentrated in 3D at bases 3 & 6 (p=5/11 heavy); longer paths (avg len 5.17), early k-up (first_up ≈ 0.31), only 33% k-ups with reg-drop → early valuation-gap bounce.
- L2-only signature: mostly 4D with p=5/11 across bases; later k-up (first_up ≈ 0.56) with 100% reg-drop coupling → late smooth swaps. Overlap cases all 4D with p=2.
Experiment 070: Near-Binary Base Transition Map
- Setup: dims {1,2,3}, n=d+3, primes {2,5,11}, bases {1.02,1.05,1.1,1.2,1.3,1.4,1.5,1.6,1.7,2.0}, seeds=24 (2,160 datasets).
- Outcome: 1D stayed monotone (0/720). p-adic inversions reappear at r=1.05–1.10 for small primes and peak near r≈1.4–1.5 (up to 12.5% in 2D). L2 saw only 2 inversions total (3D p=2 at r=1.4; 3D p=5 at r=2.0).
- Prime/base interaction: Early inversions driven by p∈{2,5}; p=11 only flips at r≥1.2 and only in 3D. r=1.3 is inversion-free despite neighboring bumps.
Experiment 062: p-Adic vs L2 Base Sweep
- Setup: dims {2,3}, primes {2,5,11}, bases {1.1,1.5,2,5}, seeds 0–11 (288 datasets).
- Outcome: L2 inversions = 1; p-adic inversions = 4 (all in 2D). 3D stayed monotone for both penalties.
- Near-binary surprise: r=1.1 produced a p-adic-only inversion (p=2, seed=5); heavy base r=5.0 also triggered p-adic-only inversions at p=2 and p=5.
- Example dataset (p=2, base=1.1, seed=5): k-paths L2 [3,3,3,2,1] vs p-adic [3,2,3,2,1] on X=[[3,-1],[-4,4],[-3,5],[-1,0],[1,2],[1,-3]], y=[-4,-8,-2,9,6,-7].
Experiment 048: Two-Regime Classifier
- Setup: Random family (n = d+3) and axis-duplication family (aligned/misaligned, n=10) across dims {2,3,4}; primes {2,5,11}; bases {1.5,2,5}; seeds=4.
- Heuristics: k_horiz > k_lambda0, duplication (y>=3 or axis>=5), and a k_lambda0/n splitter that chooses duplication when k_ratio ≥ t else k_horiz.
- Outcome: k_horiz never triggered (0 recall). Best splitter t=0.35 matched duplication precision≈0.148 but with lower recall (0.909 vs 0.970); no improvement over duplication alone.
- Inversions: 33 total (30 axis-dup, 3 random); k_ratio clustered ~0.7 for both families, so it is not a useful regime detector.
Experiment 035: 4D Base-Density Resample
- Setup: d=4, n ∈ {7,8,10}, primes {2,5,11}, bases {1.05,1.3,1.5,2,5,10}, seeds=6.
- Near-binary r=1.05 remained inversion-free across all configs.
- Dense n=10: p=5/11 flip in 1–2/6 runs once r≥1.5 (g_hat ≈1.67 for p=5; ≈3.67 with a bump to 7.33 at r=5 for p=11); p=2 stays zero.
- Baseline n=8: only p=2 inverts sparsely at r≥1.5 (g_hat ≈0.89); p≥5 invert only at r≥5 with low rates (1/6) and stay flat.
- Sparse n=7: small-prime inversions rise at r=5 (p=2 g_hat ≈2.33; p=5 hits 1/6 at r≥5); p=11 remains inversion-free.
Experiment 034: 4D Density Probe
- d=4 with n ∈ {7,8,10}, primes {2,5,11}, bases {1.05,1.5,2,5,10,15}, seeds=4.
- n=8 stayed inversion-free; near-binary bases remain immune.
- Dense n=10 flips for p=5 at r≥2 and p=11 at r≥1.5 (g_hat ≈2.5–5.5) with little base dependence once inversions appear.
- Sparse n=7 saw a single p=2 inversion at r=5 (g_hat ≈1.75); needed more seeds for stability.
Experiment 025: Base-Factor Curve
- Dense base grid r ∈ {1.01…10} across primes {2,5,11,17} and inversion-prone 1D/2D configs.
- Near-binary bases (r ≤ 1.1) gave 0/560 inversions; g(r) ramps gently to ≈0.1 by r≈2, jumps to ≈0.30 at r=3 and ≈0.327 at r=5.
- Heavy-base bump: r=10 raises g_hat to ≈0.417; prime scaling c(r, p) ≈ g(r)/p still holds (p≥11 invert only for r ≥ 3).
Experiment 021: Prime-Weighted Excess Scaling
- Tested P(inv) ≈ c(p) × excess_points with r = p across dims 1–3 and multiple n.
- Aggregated c_hat_global shrinks with prime: p=2/3 → 0.081, p=5 → 0.070, p=7 → 0.065, p=11 → 0.032.
- The prior 0.10 coefficient over-predicts inversions; largest gaps occur in 1D n=7 and 2D n=6 configs.
Experiment 020: Prime Sensitivity of Inversions
- Inversion probability drops with prime when r=p: p=2/3 → 5.25%, p=5 → 4.50%, p=7 → 3.75%, p=11 → 2.50% (400 runs/prime).
- Most inversions occur in 1D n=7 and 2D n=6; 3D is nearly inversion-free.
Experiment 018 (rerun): Base Sensitivity of Inversions
- r ∈ {1.02, 1.1} → 0/410 inversions (suppressed near binary).
- r ∈ {1.5, 2, 3, 5, 10} → aggregated 3.9–7.8% inversions.
- 1D inversion rates now non-zero: n=5 → 2.5–5.0%; n=6 → 2.5–8.8%; n=7 → 7.5–17.5%.
Experiment 019: 1D Non-Monotonicity Audit
- 1D only; bases r ∈ {1.5, 2, 3, 5, 10}; 80 seeds for n ∈ {5,6,7}.
- Result: 0 cases of non-monotonicity without a Pareto inversion once reg_penalty is tracked.
- Confirms Pareto inversions fully explain k increases in 1D.
Experiment 001: Regularisation Thresholds in 1D
Date: 2025-12-01
Status: Completed
Objective
Understand how the number of exactly-fitted points changes with regularisation strength in 1-dimensional p-adic linear regression.
Method
- Implemented exhaustive search over candidate slopes and intercepts (a minimal sketch of this search follows this list)
- Tested with p=2 and various λ values from 0 to 1000
- Used both real L2 and p-adic regularisation
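A minimal sketch of this kind of search (not the original exp001 script), assuming the per-point data loss is r^{-vₚ(residual)} with exact fits contributing 0 (the loss form validated in Experiments 006–007), a slope-squared penalty λ·b², and base r = 2 (an assumption; exp001 does not state its base). The final sweep reuses the threshold-detection dataset quoted further down in this entry:

```python
from fractions import Fraction
from itertools import product

def vp(x, p):
    """p-adic valuation of a nonzero rational x."""
    x = Fraction(x)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def objective(data, intercept, slope, lam, p=2, r=2.0):
    """Data loss Σ r^(-v_p(residual)), with exact fits contributing 0,
    plus the regularisation penalty λ·slope²."""
    loss = 0.0
    for x, y in data:
        res = y - (intercept + slope * x)
        if res != 0:
            loss += r ** (-vp(res, p))
    return loss + lam * slope ** 2

def best_line(data, lam, grid=range(-5, 6)):
    """Exhaustive search over integer intercept/slope candidates."""
    return min(product(grid, grid), key=lambda c: objective(data, *c, lam))

data = [(0, 1), (1, 2), (2, 4), (4, 5)]      # the threshold-detection dataset
for lam in (0.5, 1.2, 1.3, 2.0):
    a, b = best_line(data, lam)
    fits = sum(1 for x, y in data if y == a + b * x)
    print(f"λ={lam}: y = {a} + {b}·x, exact fits = {fits}")
# reproduces the jump from 3 exact fits (λ ≤ 1.2) to 1 exact fit (λ ≥ 1.3)
```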
Results
| λ | Intercept | Slope | Exact Fits | Data Loss |
|---|---|---|---|---|
| 0.000 | 0.0 | 2.0 | 2 | 2.00 |
| 0.001 | 1.0 | 1.0 | 2 | 2.00 |
| 0.100 | 1.0 | 1.0 | 2 | 2.00 |
| 0.500 | 3.0 | 0.0 | 1 | 2.25 |
| 1.000 | 3.0 | 0.0 | 1 | 2.25 |
| 10.00 | 3.0 | 0.0 | 1 | 2.25 |
Threshold Detection
For dataset {(0,1), (1,2), (2,4), (4,5)}:
Threshold between λ=1.2 and λ=1.3:
3 exact fits → 1 exact fit
Conclusions
- Phase transitions exist at specific λ values
- The transition is discrete, not continuous
- Different regularisation types give different optimal solutions
Experiment 002: Monotonicity and Threshold Analysis
Date: 2025-12-02
Status: Completed
Objective
Test whether k(λ) is monotonically non-increasing and derive analytical formula for threshold values.
Results
- Monotonicity: 30/30 random 1D datasets show monotonic k(λ)
- Threshold Formula: λ* = (L₂ - L₁) / (b₁² - b₂²)
- Validation: Formula correctly predicts λ* = 1.25 for test dataset
Conclusions
H6 (monotonicity) is strongly supported. Threshold locations can be computed analytically from data loss and coefficient values of competing solutions.
Experiment 003: 2D Regression (Fitting Planes)
Date: 2025-12-02
Status: Completed
Objective
Test whether n+1 theorem and monotonicity generalize to 2D regression (fitting planes z = a + bx₁ + cx₂ to 3D points).
Results
| Metric | 1D (Lines) | 2D (Planes) |
|---|---|---|
| n+1 theorem at λ=0 | Validated | Validated (k(0) ≥ 3) |
| Monotonicity | 30/30 | 15/15 |
| Typical thresholds | 1 | 1-2 |
Example
Dataset: 5 points in 3D
λ=0.00: z = 1 + 0.5x₁ + 1.5x₂ fits 4 points
λ=0.50: z = 2 + x₁ fits 3 points
λ=2.00: z = 5 fits 1 point
Conclusions
H8 (higher-dimensional generalization) is validated. The theory extends naturally from 1D to 2D.
Experiment 004: Prime Dependence
Date: 2025-12-02
Status: Completed
Objective
Investigate how the choice of prime p affects threshold structure.
Results
| Prime p | k(0) | Number of Thresholds | Threshold Locations |
|---|---|---|---|
| 2 | 3 | 1 | λ ≈ 1.25 |
| 3 | 3 | 2 | λ ≈ 0.45, 3.95 |
| 5 | 3 | 2 | λ ≈ 1.35, 3.95 |
| 7 | 3 | 2 | λ ≈ 1.35, 3.95 |
Conclusions
The optimization landscape is prime-dependent. Different primes can lead to different numbers of thresholds and different phase transition points.
Experiment 005: Alternative Base Values (r-sweep)
Date: 2025-12-02
Status: Completed
Objective
Test r^{-v(t)} for r ∈ {1.02, 1.05, 1.1, 1.5, 2, 3, 5, 10} to understand interpolation between binary (r→1) and minimax (r→∞) behavior.
Results
- Monotonicity persists: k(λ) stayed monotone for all tested r
- Near-binary r≈1: Preserves two thresholds (3→2→1)
- Large r: Collapses to single threshold, λ* shifts downward
- Trend: λ* drops from ~1.33 at r=1.02 to ~1.01 at r=10
Experiment 006: Exact Threshold Solver
Date: 2025-12-02
Status: Completed
Objective
Compute thresholds analytically (no λ sweep) by intersecting loss lines.
Results
- Method works: Exact thresholds from L(λ) = data_loss + λb²
- Validation: Matches numerical sweeps exactly
- Advantage: No discretization error, handles r-dependence exactly
Experiment 007: Asymptotic Threshold Formula (MAJOR)
Date: 2025-12-02 (Evening)
Status: Completed
Objective
Derive and validate exact closed-form formula for threshold dependence on r.
Results
- Formula: λ* = (L₂ - L₁) / (b₁² - b₂²) where L = Σᵢ r^{-vₚ(resᵢ)} (a small verification sketch appears after the conclusions below)
- Expansion: λ* = c₀ + c₁/r + c₂/r² + ... (exact)
- Validation: 100% match for r = 1.5, 2, 3, 5, 10, 100, 1000
Examples
canonical_threshold: λ* = 1 + 1/r²
r=2 → 1.25, r=5 → 1.04, r=10 → 1.01 (all exact)
gentle_line: λ* = 1 + 1/r
r=2 → 1.5, r=5 → 1.2, r=10 → 1.1 (all exact)
Conclusions
Major finding: Threshold behavior is entirely determined by p-adic valuations of residuals. The formula is exact, not an approximation.
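A small verification sketch of the formula (illustrative, not the exp007 script), assuming the two competing candidates are summarised by the valuation counts of their nonzero residuals and by their slopes; the illustrative candidates below are chosen so that the formula reproduces the canonical_threshold pattern λ* = 1 + 1/r²:

```python
from fractions import Fraction

def data_loss(val_counts, r):
    """L = Σ_v count_v · r^(-v) from a {valuation: count} dict of the
    nonzero residuals (exact fits contribute nothing)."""
    return sum(c * Fraction(r) ** (-v) for v, c in val_counts.items())

def threshold(vals1, b1, vals2, b2, r):
    """λ* = (L₂ - L₁) / (b₁² - b₂²), exact at any base r."""
    return (data_loss(vals2, r) - data_loss(vals1, r)) / Fraction(b1**2 - b2**2)

# candidate 1 (slope 1): one residual at valuation 0
# candidate 2 (slope 0): residuals at valuations 0, 0 and 2
for r in (2, 5, 10):
    print(r, float(threshold({0: 1}, 1, {0: 2, 2: 1}, 0, r)))
# prints 1.25, 1.04, 1.01 — the λ* = 1 + 1/r² pattern quoted above
```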
Planned Experiments
Experiment 008: 2D Exact Threshold Solver
Extend the exact solver to 2D (planes) to test formula generalization.
Experiment 009: p-Adic Regularisation
Systematic comparison of p-adic |β|ₚ vs real L2 regularisation.
Experiment 010: Higher Dimensions (n=3,4,5)
Test n+1 theorem and monotonicity in n=3,4,5 dimensions.
Code Repository
All experiment code is available in the experiments/ directory:
- exp001_regularization_threshold.py - Initial threshold detection
- exp002_monotonicity_and_thresholds.py - Monotonicity validation
- exp003_2d_regression.py - 2D plane fitting
- exp004_prime_dependence.py - Prime comparison study
- exp005_r_base_valuation.py - Alternative base sweep
- exp006_exact_thresholds.py - Exact analytic solver
- exp007_asymptotic_threshold.py - Asymptotic formula validation
- exp018_base_inversion_sensitivity.py - Base sensitivity (updated counts)
- exp019_1d_nonmonotonic_audit.py - 1D audit of non-monotonic cases
- exp020_prime_inversion_sensitivity.py - Prime dependence of Pareto inversions (r = p)
Core library: src/padic.py - p-adic arithmetic, regression, and 2D fitting