Longevity Research · MIMIC-III Mortality Harness · build v3.63

MIMIC-III → Early-Death Oracle Harness

Loads the MIMIC-III demo CSVs you select, assembles a per-admission record, routes each free-text admission diagnosis to the matching early-death (mortality-endpoint) oracle, computes the maximum achievable risk reduction and life-years saved, and compares them against the patient's observed in-hospital outcome. Runs entirely in your browser.

⚠ Runs on the 100-patient MIMIC-III DEMO · processing is local (no upload/network) · representative oracle effect sizes
Run the harness first, then print to capture the results.

00 The promise, in one number

“The fox knows many things, but the hedgehog knows one big thing.” — Archilochus, fr. 201

Load the data and press Run — this headline fills in from the records you load: the harness routes every record to the oracle that owns its mode of early death and reports the mean life-years the Bayesian Pareto-optimum set recovers over the disease-specific standard-of-care baseline.

Discussion. Read as a study, this is a transportability-and-ceiling exercise, not a clinical estimate. The mean Δ is an upper bound on the incremental gain: the Pareto hazard ratio is applied multiplicatively to the usual-care hazard, so interventions already embedded in standard care (e.g. a statin) are credited a second time. The overlap-free version — the prescribed-vs-Pareto headroom in §06 — requires the PRESCRIPTIONS table; without it that split reads n/a. Three further bounds on interpretation: (i) the chronic-prevention hazard ratios are transported from ambulatory trials to post-ICU survivors; (ii) the acute first-year mortality m₁ and post-acute hazard h_long are representative literature values, not fit to this cohort; and (iii) where observed actual life-years are absent (a selected-decedent demo with the in-hospital-death flag suppressed), the baseline reflects the disease group's standard-of-care expectation, not the loaded sample. Stated precisely: under transported, literature-calibrated assumptions, the Pareto-optimum set recovers a modeled mean gain per person over standard of care — a hypothesis-generating ceiling to validate against cohort-fit baselines and an overlap-free prescribed-vs-Pareto comparison, not a prescribe-tomorrow figure. Not for clinical or policy use.

01 Load the data

Select the CSV files from your archive.zip (at minimum PATIENTS.csv and ADMISSIONS.csv; structured_medical_records.csv is optional and used only to read the stated Age). Files are parsed locally in your browser — nothing is uploaded.
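The per-admission join described above can be sketched in Python. This is an illustration only, not the harness's own code (which runs in the browser). It assumes the standard MIMIC-III column names (subject_id, hadm_id, admittime, diagnosis, hospital_expire_flag, gender, dob) and the usual `YYYY-MM-DD HH:MM:SS` timestamp format; the function names `read_rows` and `assemble` are hypothetical.

```python
import csv
import io
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed MIMIC-III timestamp format

def read_rows(text):
    """Parse CSV text into dicts with lower-cased column names."""
    reader = csv.DictReader(io.StringIO(text))
    return [{k.lower(): v for k, v in row.items()} for row in reader]

def assemble(patients_csv, admissions_csv):
    """Join each admission to its patient row on subject_id."""
    patients = {p["subject_id"]: p for p in read_rows(patients_csv)}
    records = []
    for adm in read_rows(admissions_csv):
        pat = patients.get(adm["subject_id"])
        if pat is None:
            continue  # admission without a matching patient row
        admit = datetime.strptime(adm["admittime"], TS_FORMAT)
        dob = datetime.strptime(pat["dob"], TS_FORMAT)
        records.append({
            "subject_id": adm["subject_id"],
            "hadm_id": adm["hadm_id"],
            "sex": pat["gender"],
            # MIMIC-III shifts birth dates for patients older than 89,
            # so an age derived this way can be implausibly large.
            "age": (admit - dob).days / 365.25,
            "diagnosis": adm["diagnosis"],
            "died_in_hospital": adm["hospital_expire_flag"] == "1",
        })
    return records
```

Admissions with no matching patient row are skipped rather than guessed at, so a missing PATIENTS.csv yields an empty result instead of partial records.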

NHANES mode (free, no-application data). This harness also reads NHANES .XPT files directly (SAS-transport, parsed in-browser). Drop a demographics file (DEMO_*.xpt, with SEQN/RIDAGEYR/RIAGENDR), a prescriptions file (RXQ_RX_*.xpt, RXDDRUG), the medical-conditions questionnaire (MCQ_*.xpt/DIQ/KIQ for routing), and a linked-mortality file (SEQN+MORTSTAT+PERMTH_EXM); it auto-detects NHANES, routes by self-reported condition, and runs the same standard-of-care vs Pareto comparison on an ambulatory population (general-population survival baseline). You download the files from CDC and drop them here — the tool can't fetch wwwn.cdc.gov directly (cross-origin). NHANES III fixed-width files (e.g. adult.dat) also load: drop the data file together with its SAS layout (adult.sas) — the tool parses the INPUT column positions and LABELs, then routes by condition label. (Bring one fixed-width file + its .sas per load; prescriptions/mortality can come from .XPT or CSV. Continuous cycles (1999+) are all-.XPT and need no layout.)
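As a rough illustration of the fixed-width path, the Python sketch below reads column positions from a SAS INPUT statement and slices one data line with them. It assumes the layout uses column-style entries such as `NAME 1-5` or `NAME 15`; it ignores LABELs and any other INPUT syntax, and the function names are hypothetical rather than taken from the tool.

```python
import re

def parse_sas_input(sas_text):
    """Extract (name, start, end) column ranges from the first INPUT statement."""
    match = re.search(r"\bINPUT\b(.*?);", sas_text, flags=re.S | re.I)
    if not match:
        return []
    # Each entry is assumed to look like "NAME 1-5" or, for one column, "NAME 15".
    entry = re.compile(r"([A-Za-z_]\w*)\s+\$?\s*(\d+)(?:\s*-\s*(\d+))?")
    layout = []
    for name, start, end in entry.findall(match.group(1)):
        layout.append((name, int(start), int(end or start)))
    return layout

def slice_record(line, layout):
    """Cut one fixed-width line into fields (SAS columns are 1-based, inclusive)."""
    return {name: line[start - 1:end].strip() for name, start, end in layout}
```

Because SAS column positions are 1-based and inclusive, the slice subtracts one from the start only; an off-by-one here silently shifts every field.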

Harmonized NHANES 1988–2018 (figshare/Kaggle, Nguyen et al.): drop the raw modules directly. Download the cleaned demographics, questionnaire (or response), mortality, and medications module CSVs and drop all of them here at once. The tool streams each file and keeps only the columns it needs (SEQN, age, sex, self-reported conditions, MORTSTAT/PERMTH, drug names), so the questionnaire file, with its 1,000+ columns, loads without overwhelming the browser; it then merges the modules on SEQN and runs the analysis. If the medications module stores drug codes rather than names, also drop dictionary_drug_codes.csv. No Python and no fetching are involved; processing stays fully local. NHANES records no clinical stage (shown as NA-stage in §04b). If a file's columns aren't recognised, the load status lists exactly what each file was read as, so a name mismatch is visible rather than silent.
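The stream-then-merge idea can be sketched in Python as follows. This is a minimal illustration, not the tool's implementation: the column names follow the raw NHANES variables named earlier and may differ in the harmonized files, and `stream_columns` and `merge_on_seqn` are hypothetical helpers.

```python
import csv

def stream_columns(handle, wanted):
    """Yield one small dict per row, keeping only the wanted columns that exist."""
    reader = csv.reader(handle)
    header = next(reader)
    keep = [(i, name) for i, name in enumerate(header) if name in wanted]
    for row in reader:
        # Rows are processed one at a time, so a very wide file never
        # has to be held in memory in full.
        yield {name: row[i] for i, name in keep if i < len(row)}

def merge_on_seqn(*tables):
    """Merge row iterables keyed on SEQN; later tables add or overwrite fields."""
    merged = {}
    for table in tables:
        for row in table:
            seqn = row.get("SEQN")
            if seqn:
                merged.setdefault(seqn, {}).update(row)
    return merged
```

Passing open file handles (for example from `open(path, newline="")`) keeps memory use proportional to the number of retained columns rather than to the width of the source file.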

No files loaded yet. Accepts: harmonized NHANES module .csv (demographics+questionnaire+mortality+medications, drop together) · MIMIC .csv · NHANES .xpt · NHANES III .dat + .sas.

What this archive does and does not contain. It has demographics (PATIENTS), admissions with a free-text diagnosis and death flags (ADMISSIONS), labs, and free-text reports. It does not contain PRESCRIPTIONS, DIAGNOSES_ICD, or PROCEDURES_ICD. Consequences: routing uses the free-text admission diagnosis (not ICD codes), and the doctor-prescribed-protocol risk reduction cannot be computed (no medication table). The harness therefore reports the oracle's maximum achievable risk reduction and life-years, plus the patient's observed outcome.
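Routing on the free-text diagnosis can be pictured as a first-match keyword lookup. The Python sketch below is an assumption about the general shape only: the oracle names and keyword lists are hypothetical placeholders, not the harness's actual routing table.

```python
# Hypothetical routing table: (oracle name, substrings that trigger it).
# Order matters, because the first matching entry wins.
ROUTES = [
    ("sepsis", ("SEPSIS", "SEPTIC")),
    ("heart-failure", ("HEART FAILURE", "CHF")),
    ("stroke", ("STROKE", "CEREBROVASCULAR")),
]

def route(diagnosis):
    """Return the first oracle whose keyword appears in the diagnosis, else None."""
    text = (diagnosis or "").upper()
    for oracle, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return oracle
    return None  # no mortality oracle owns this diagnosis
```

Substring matching is crude: an abbreviation or misspelling in the free text routes nowhere, which is one reason ICD-coded routing would be preferable when DIAGNOSES_ICD is available.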

02 Early-death oracles included

Every atlas oracle whose primary endpoint is mortality / early death is included here. Population-count analyses (us-mortality, self-caused-harm, rare-disease) are excluded. Symptom-scale and biomarker endpoints (osteoarthritis, depression, anxiety, ADHD, schizophrenia, PTSD, OCD, bipolar, LDL cholesterol) have no patient-level early-death endpoint and are handled separately in §08 as non-mortality reductions, not life-years.

Oracle | Endpoint | Interventions | Routed from diagnoses containing…

03 Per-admission output

Subject/Adm | Age/Sex | Admission diagnosis | Oracle | Prescribed RR | Max achievable RR | Usual-care baseline LY | Pareto-optimum LY | Actual LY (obs.) | Δ years added | Observed

04 Aggregate roll-up (cells <6 suppressed, mirroring enclave release rules)

Oracle | n | Mean max-achievable RR | Mean life-yrs added/adm | Total life-yrs added | Observed in-hosp deaths
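The small-cell rule in this section's heading can be illustrated with a short Python sketch. It assumes each per-admission row carries an `oracle` label and an `ly_added` value (both hypothetical field names) and suppresses every statistic for groups with fewer than six members; how the harness itself renders suppressed cells is not specified here.

```python
from collections import defaultdict

MIN_CELL = 6  # groups with n < 6 are suppressed

def roll_up(rows):
    """Aggregate life-years added per oracle, suppressing small cells."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["oracle"]].append(row["ly_added"])
    table = []
    for oracle, values in sorted(groups.items()):
        n = len(values)
        if n < MIN_CELL:
            # Report only that the cell exists, never its exact size or sums.
            table.append({"oracle": oracle, "n": "<6",
                          "mean_ly_added": None, "total_ly_added": None})
            continue
        total = sum(values)
        table.append({"oracle": oracle, "n": n,
                      "mean_ly_added": total / n, "total_ly_added": total})
    return table
```

Suppressing the count together with the sums matters: publishing a total for a cell of two would let a reader recover near-individual values.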

04b Roll-up by oracle × disease variant × stage (NHANES collects no clinical stage; shown as NA-stage)

The same risk-reduction and life-years figures, summed by oracle and — where the survey records it — disease variant (e.g. a diabetes sub-item, or cancer site in the cycles that ask it). NHANES does not capture clinical stage (tumour stage, NYHA class, CKD stage, Child-Pugh), so the stage column reads NA-stage: a deliberate anti-fabrication placeholder, not a missing computation. This section populates in NHANES mode when the demographics file carries variant_* columns (the companion preprocessor emits them).

Oracle | Variant | Stage | n | Mean max-achievable RR | Mean life-yrs added | Total life-yrs added

05 Actual vs usual-care baseline vs Pareto-optimum life-years ("years added")

The usual-care baseline is the standard of care: the empirical survival of patients with this disease who received ordinary treatment, including medications taken in the recent past (e.g. a statin they were already on). It is not an untreated counterfactual, because there is no plausible untreated cohort to estimate one from. The Bayesian Pareto optimum is a different, specified set of interventions. This table therefore compares two regimens, standard of care vs the Pareto-optimum set, and the years added (Δ) is Pareto-optimum LY − usual-care baseline LY. A high disease-specific acute first-year mortality is carried by both regimens; the Pareto set acts on the modifiable post-acute hazard. Observed actual life-years (from dod − admittime) are shown as the empirical anchor.

Oracle | n | Avg age @ intervention | Avg age @ death | Avg actual LY (obs.) | Usual-care baseline LY | Pareto-optimum LY | Avg Δ years added
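To make the columns concrete, here is a minimal Python sketch of how a Δ of this kind could be computed. It rests on strong simplifying assumptions that are not taken from the harness: the post-acute hazard is constant (exponential survival), life-years lived during the acute first year are ignored, and there is no age cap. The function names are hypothetical; `m1` and `h_long` correspond to the acute first-year mortality and post-acute hazard named in the discussion.

```python
from datetime import datetime

def modeled_life_years(m1, h_long, hazard_ratio=1.0):
    """Expected post-acute life-years under a constant-hazard assumption.

    m1: acute first-year mortality, carried unchanged by both regimens.
    h_long: post-acute annual hazard under usual care.
    hazard_ratio: Pareto effect on the post-acute hazard (1.0 = usual care).
    """
    if not 0 <= m1 <= 1 or h_long <= 0 or hazard_ratio <= 0:
        raise ValueError("need 0 <= m1 <= 1, h_long > 0, hazard_ratio > 0")
    # Survivors of the acute year live 1 / hazard years on average.
    return (1 - m1) / (h_long * hazard_ratio)

def years_added(m1, h_long, hazard_ratio):
    """Pareto-optimum LY minus usual-care baseline LY."""
    return modeled_life_years(m1, h_long, hazard_ratio) - modeled_life_years(m1, h_long)

def observed_life_years(admittime, dod):
    """Observed LY from dod minus admittime; None when no death date is recorded."""
    if not dod:
        return None
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(dod, fmt) - datetime.strptime(admittime, fmt)
    return delta.days / 365.25
```

With illustrative inputs m1 = 0.3, h_long = 0.1 per year and a hazard ratio of 0.8, the baseline is 7.0 years, the Pareto figure is 8.75 years, and the Δ is 1.75 years. Because m1 multiplies both terms, a high acute mortality shrinks the Δ without changing its sign.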

What the delta is — and the one caveat that remains. This compares the standard-of-care regimen to the Pareto-optimum set. The Pareto effect is applied multiplicatively to the usual-care hazard, so where the two regimens overlap — e.g. both include a statin — the model still credits that shared intervention, making the headline Δ an upper bound on the incremental gain. The clean, overlap-free version is in §06: the prescribed-vs-Pareto headroom measures the Pareto optimum relative to what the patient was actually given (from PRESCRIPTIONS), so it nets out the standard care already in the baseline. Two further notes: the Pareto set acts only on the post-acute hazard (a statin does not avert acute septic death), and observed actual LY runs below the usual-care baseline here because this demo cohort is selected decedents — the baseline reflects the disease group's realistic standard-of-care expectation, not this biased sample.
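The overlap caveat is easy to see with numbers. The Python sketch below uses hypothetical per-intervention hazard ratios and assumes they combine multiplicatively; it compares the headline effect (every Pareto intervention applied on top of usual care) with an overlap-free effect that leaves out what standard care already includes.

```python
from math import prod

def combined_hr(hazard_ratios, exclude=()):
    """Multiply per-intervention hazard ratios, skipping any excluded names."""
    return prod(hr for name, hr in hazard_ratios.items() if name not in exclude)

# Hypothetical Pareto set; the values are illustrative, not fitted.
pareto = {"statin": 0.80, "exercise": 0.85}
already_in_standard_care = {"statin"}

headline_hr = combined_hr(pareto)  # statin credited again
overlap_free_hr = combined_hr(pareto, exclude=already_in_standard_care)

headline_rr = 1 - headline_hr          # risk reduction reported in the headline
overlap_free_rr = 1 - overlap_free_hr  # incremental reduction only
```

Here the headline hazard ratio is 0.80 × 0.85 = 0.68, while the overlap-free one is 0.85, so the headline risk reduction (about 0.32) exceeds the incremental one (about 0.15). The headline can never be smaller as long as every hazard ratio is at most 1, which is what makes it an upper bound.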

06 Doctor-prescribed vs Pareto-optimum (requires the PRESCRIPTIONS table)

The demo archive you may have loaded omits PRESCRIPTIONS. Load that table (it ships with the full credentialed MIMIC-III, where it has ~4.16M rows, and with the open 100-patient demo on PhysioNet) and this section activates: each admission's ordered drugs are string-matched (DRUG/DRUG_NAME_GENERIC) to the routed oracle's interventions, giving the doctor-prescribed risk reduction, the gap to the Pareto optimum, and a split of life-years into those already secured by the prescribed protocol and the remaining headroom.

Oracle | n (with Rx) | Mean prescribed RR | Mean Pareto RR | Mean gap (unrealized) | Mean yrs secured | Mean headroom yrs

Interpretation caveats. MIMIC PRESCRIPTIONS are inpatient CPOE orders during the stay — a mix of acute ICU drugs (pressors, sedatives, antibiotics, which map to no prevention oracle) and continued chronic medications (statins, antihypertensives, etc., which do). So the prescribed RR here is a lower bound on the true outpatient regimen and is not adherence-weighted (a single inpatient order ≠ chronic use). Lifestyle and procedural interventions (exercise, diet, weight loss, rehab) never appear in a drug table, so part of the "gap to Pareto" is structurally unmeasurable from prescriptions alone.
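A minimal Python sketch of the string-matching step in this section follows. The intervention keywords and hazard ratios are hypothetical, the multiplicative combination of matched interventions is an assumption, and `prescribed_vs_pareto` is not the harness's function name.

```python
from math import prod

def prescribed_vs_pareto(ordered_drugs, interventions):
    """Compare the risk reduction of matched orders with the full Pareto set.

    ordered_drugs: free-text drug names from the admission's orders.
    interventions: {drug keyword: hazard ratio} for the routed oracle.
    """
    orders = [drug.lower() for drug in ordered_drugs if drug]
    # A keyword counts as prescribed if it appears inside any order string.
    matched = {kw for kw in interventions if any(kw in order for order in orders)}
    prescribed_rr = 1 - prod(interventions[kw] for kw in matched)
    pareto_rr = 1 - prod(interventions.values())
    return {
        "matched": sorted(matched),
        "prescribed_rr": prescribed_rr,
        "pareto_rr": pareto_rr,
        "gap": pareto_rr - prescribed_rr,  # unrealized risk reduction
    }
```

For example, with interventions `{"atorvastatin": 0.8, "lisinopril": 0.9}` and orders for atorvastatin and norepinephrine, only the statin matches: the prescribed risk reduction is 0.20, the Pareto figure is 0.28, and the gap is 0.08. The pressor maps to no intervention and is ignored, which mirrors the caveat about acute ICU drugs.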

07 Methods & caveats

08 Alternative (non-mortality) endpoints (records with no mortality endpoint)

MIMIC-III Clinical Database (MIT Laboratory for Computational Physiology, Beth Israel Deaconess Medical Center), demo subset. Local processing; representative effect models. Not for clinical or policy use.