Stoploss Inference Scoring (No-MRF Data)

Overview

This page documents inference scoring for provider-payer combos with remits data but no MRF stoploss match. Unlike the MRF-validated scoring which uses a known MRF rate as an anchor, this system detects stoploss-like payment patterns from rate distributions alone, bootstrapped by known stoploss rates from the full MRF dataset.

We use a 6-signal scoring system (different signals than the MRF-validated version) that evaluates each combo on a 0-100 scale, producing a confidence tier from "Strong Candidate" (Tier 1) to "Unlikely" (Tier 5).

Two-strategy approach:

  • Intrinsic pattern detection (signals I1–I5): Score how "stoploss-like" the rate distribution looks — dominant flat rate, consistency across service types, carveout divergence
  • Payer template matching (signal I6): Cross-reference against 5,463 known MRF stoploss records across 120 payers to corroborate the inferred rate

Data Description

Source: all_remits_no_mrf_connection.csv — remit lines for inpatient claims that could NOT be matched to any MRF stoploss provision.

Metric | Value
Total rows | 258,194
Unique provider-payer combos | 6,666
Rows scored (5+ lines, after multi-cluster split) | 6,228
Multi-cluster combos | 1,761
Unique providers | 1,461
Unique payers | 193
Government payer rows (flagged) | 1,140

Payer template source: mrf_stoploss.csv — 5,463 percentage-type MRF stoploss records across 120 payers. Used to build per-payer rate frequency maps for signal I6.

What each row represents: A group of remit lines for a specific provider × payer × revenue code combination, all paid at the same allowed/billed percentage. The lines_paid_at_this_percentage field counts how many individual lines share that rate.

Key constraint: perc_allowed is rounded to 1% increments (0.10–0.90), so the ±2pp and ±5pp windows used in scoring map to integer percentage points.

Known data limitations
  • No MRF anchor — Without a known stoploss rate, we cannot distinguish stoploss from other flat-% pricing (e.g., DRG case rates that happen to be uniform). Confidence tiers are more conservative than the MRF-validated version.
  • Multi-plan mixing — Remits are NOT unique on plan. A single payer-provider combo contains lines from multiple plans. The stoploss signal may be a minority cluster.
  • 1% rate granularity — Rates are rounded to integer percentages, limiting our ability to distinguish nearby rates.
  • Government payer noise — VA, Tricare, Medicaid use schedule-based pricing that mimics flat-% stoploss. These are flagged but not excluded from scoring.

Methodology: 6-Signal Inference Framework

Design Principle: Pattern Detection Without an Anchor

The MRF-validated scoring system has a known target rate — it asks "do remits cluster near the MRF %?" This system has no target. Instead, it asks:

  1. Is there a dominant flat rate? (I1)
  2. Does it span multiple service types? (I2 — the strongest new signal)
  3. Is the rate distribution concentrated? (I3)
  4. Do carveout codes diverge? (I4)
  5. Is there enough data? (I5)
  6. Does the rate match known payer templates? (I6)

Multi-Cluster Detection

A combo's rate distribution may contain multiple distinct rate clusters — for example, a stoploss cluster at 30% and a percent-of-charges cluster at 57%. Previously, only the single modal rate was scored. Now, the system detects separate clusters and scores each independently.

Algorithm: Rates are sorted by value. If the gap between consecutive rates exceeds 5 percentage points, a new cluster begins. Clusters with fewer than 5 total lines are filtered out. Each qualifying cluster produces its own scored row.
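The split described above can be sketched in a few lines of Python. The 5pp gap and 5-line minimum mirror the documented thresholds; the input shape (a rate-to-line-count map per combo) is an assumption for illustration, not the production code.

```python
from typing import Dict, List, Tuple

GAP_PP = 5     # a gap >5pp between consecutive rates starts a new cluster
MIN_LINES = 5  # clusters with fewer than 5 total lines are filtered out

def split_rate_clusters(rate_counts: Dict[int, int]) -> List[List[Tuple[int, int]]]:
    """Split a combo's {rate: line_count} map into distinct rate clusters."""
    clusters: List[List[Tuple[int, int]]] = []
    current: List[Tuple[int, int]] = []
    prev_rate = None
    for rate, lines in sorted(rate_counts.items()):  # sort by rate value
        if prev_rate is not None and rate - prev_rate > GAP_PP:
            clusters.append(current)                 # gap exceeded: close cluster
            current = []
        current.append((rate, lines))
        prev_rate = rate
    if current:
        clusters.append(current)
    # keep only clusters with enough line volume to score
    return [c for c in clusters if sum(n for _, n in c) >= MIN_LINES]
```

For example, a combo with lines at 30-31% and at 57% yields two clusters, each scored as its own row, while a 2-line stray cluster is dropped.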

Signal adaptations per cluster:

  • I1: Modal rate share = cluster peak lines / combo total lines (measures how dominant this cluster is overall)
  • I2: Rev code groups checked within the cluster's rate range only
  • I3: Distinct rates from the full combo (unchanged — measures overall noise)
  • I4: Clinical vs carveout modal within the cluster's filtered distributions
  • I5: Cluster total lines (not combo total)
  • I6: Payer template match for the cluster's peak rate

Single-cluster combos produce identical results to the previous system (cluster_id=1, n_clusters=1).

Signal I1: Dominant Rate Concentration — weight: 20%

"How much of the volume sits at a single rate?"

Stoploss pays a flat percentage of charges, so the modal (most common) rate should capture a large share of total lines. This is the MRF-free analog of S1 from the validated scoring.

Scoring thresholds
Modal Rate Share | Score
75%+ | 100
50-74% | 80
25-49% | 60
10-24% | 40
<10% | 20

Example: John Muir (Walnut Creek) / Kaiser — 99.3% of 13,657 lines pay at 81% → score 100. Shriners Ohio / Cigna — 86.7% of 15 lines at 50% → score 100.

Signal I2: Cross-Revenue-Code Consistency — weight: 25%

"Does the dominant rate appear across multiple service types?"

The most powerful signal in this framework. Stoploss pays a flat % of total charges regardless of service type — so the same rate should appear for room & board, OR, labs, radiology, etc. DRG/case-rate pricing produces different rates per code.

Revenue codes are collapsed into ~12 UB-04 category groups to avoid inflating the count from granular codes in the same department:

Revenue code group mapping
Prefix | Group
010x-019x | Room & Board
02xx (excl 025x, 027x) | ICU / Specialty Room
025x | Pharmacy (carveout)
027x | Supplies (carveout)
030x | Lab
031x-033x | Radiology / Imaging
036x | OR Services
037x | Anesthesia
04xx | Respiratory / Cardio
0621-0638 | Drugs/Supplies (carveout)
07xx | Rehab / PT
08xx-09xx | Other Therapeutic

Signal I2 counts the distinct groups in which the dominant rate (±2pp) has at least one line:

Scoring thresholds
Rev Code Groups | Score
6+ | 100
4-5 | 80
2-3 | 50
1 only | 20
Null/blank only | 10
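A minimal sketch of the group mapping and the I2 count, under stated assumptions: revenue codes arrive as zero-padded digit strings, the ICU/Specialty bucket covers 02xx minus the 025x/027x carveouts (consistent with the example breakdowns below), and codes outside the listed prefixes fall out of the count. The function names are illustrative, not the production API.

```python
from typing import Iterable, Optional, Tuple

def rev_code_group(code: str) -> Optional[str]:
    """Map a UB-04 revenue code (e.g. "0250") to one of the ~12 category groups."""
    code = (code or "").strip()
    if not code or not code.isdigit():
        return None                                  # blank codes are scored separately
    n = int(code)                                    # "0250" -> 250
    if 100 <= n <= 199: return "Room & Board"
    if 250 <= n <= 259: return "Pharmacy (carveout)"   # carveouts checked before 02xx
    if 270 <= n <= 279: return "Supplies (carveout)"
    if 200 <= n <= 299: return "ICU / Specialty Room"
    if 300 <= n <= 309: return "Lab"
    if 310 <= n <= 339: return "Radiology / Imaging"
    if 360 <= n <= 369: return "OR Services"
    if 370 <= n <= 379: return "Anesthesia"
    if 400 <= n <= 499: return "Respiratory / Cardio"
    if 621 <= n <= 638: return "Drugs/Supplies (carveout)"
    if 700 <= n <= 799: return "Rehab / PT"
    if 800 <= n <= 999: return "Other Therapeutic"
    return None                                      # prefixes not listed in the table

def i2_group_count(lines: Iterable[Tuple[str, int]], dominant_rate: int,
                   window_pp: int = 2) -> int:
    """Count distinct rev code groups with lines at the dominant rate (±2pp)."""
    groups = {rev_code_group(code)
              for code, rate in lines
              if abs(rate - dominant_rate) <= window_pp}
    groups.discard(None)
    return len(groups)
```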

Example: Shriners Ohio / Cigna — 50% rate appears across 9 rev code groups (room & board, ICU, pharmacy, radiology, OR, anesthesia, resp/cardio, rehab, other) → score 100. Overlook / Horizon — 64% appears in only 2 groups → score 50.

Signal I3: Rate Dispersion — weight: 10%

"How many distinct payment rates exist?"

Identical logic to Signal 3 in the MRF-validated scoring. Fewer distinct rates suggests a flat percentage contract.

Scoring thresholds
Distinct Rates | Score
1-3 | 100
4-6 | 70
7-12 | 40
13-20 | 20
>20 | 10

Signal I4: Carveout Divergence — weight: 15%

"Do drugs/supplies pay differently than clinical codes?"

Computes the modal rate for carveout codes (pharmacy 025x, supplies 027x, drugs 0621-0638) vs clinical codes separately, then measures the gap.

Scoring thresholds
Gap Between Clinical & Carveout Modal Rates | Score
>10pp divergence | 100
5-10pp divergence | 70
<5pp (same rate) | 40
Insufficient carveout data (<3 lines) | 50
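A sketch of the I4 computation under the thresholds above. The input shape (rev code, rate, line count) and the handling of an empty clinical distribution are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

def is_carveout(code: str) -> bool:
    """Carveout codes per this doc: pharmacy 025x, supplies 027x, drugs 0621-0638."""
    code = (code or "").strip()
    if not code.isdigit():
        return False
    n = int(code)
    return 250 <= n <= 259 or 270 <= n <= 279 or 621 <= n <= 638

def i4_carveout_divergence(
        lines: Iterable[Tuple[str, int, int]]) -> Tuple[Optional[int], int]:
    """Return (gap_pp, score) from (rev_code, rate, n_lines) tuples."""
    clinical: Counter = Counter()
    carveout: Counter = Counter()
    for code, rate, n in lines:
        (carveout if is_carveout(code) else clinical)[rate] += n
    # below 3 carveout lines the gap is not meaningful (neutral score);
    # an empty clinical side is treated the same way (assumption)
    if sum(carveout.values()) < 3 or not clinical:
        return None, 50
    gap = abs(clinical.most_common(1)[0][0] - carveout.most_common(1)[0][0])
    if gap > 10:
        return gap, 100   # divergence: stoploss with carved-out categories
    if gap >= 5:
        return gap, 70
    return gap, 40        # same rate: stoploss without carveouts, or non-stoploss
```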

Rationale: Divergence = stoploss with carved-out categories (like Sutter Solano: clinical 55%, pharmacy 75%). Same rate = either stoploss without carveouts (like John Muir: everything at 81%) or non-stoploss.

Example: Shriners Ohio / Cigna — clinical modal 50% vs carveout modal 31% (19pp gap) → score 100. Sutter Solano / Anthem — clinical 55% vs carveout 75% (20pp gap) → score 100.

Signal I5: Line Volume — weight: 10%

"More claims = more statistical confidence."

Identical logic to Signal 5 in the MRF-validated scoring.

Scoring thresholds
Lines | Score
100+ | 100
50-99 | 80
20-49 | 60
10-19 | 40
5-9 | 20
<5 | 0

Signal I6: Payer Template Match — weight: 20%

"Does the dominant rate match a known stoploss rate for this payer?"

Cross-references the combo's inferred rate against the full MRF stoploss dataset (5,463 percentage-type records across 120 payers). This bridges the known MRF world and the unmatched remits.

Payer template construction from mrf_stoploss.csv:

  1. Filter to stoploss_reimbursement_type == 'percentage' (5,463 records after excluding rates >100%)
  2. Group by payer_id, round reimbursement_value to nearest integer
  3. Build a rate frequency map per payer: {payer_id: {rate: count_of_MRF_records}}
  4. A rate is "common" for a payer if 5+ MRF records exist at that rate

Scoring thresholds
Match Quality | Score
Within 2pp of a common payer rate (5+ MRF records) | 100
Within 2pp of any payer rate (1-4 MRF records) | 80
Within 5pp of a common payer rate | 60
No template available for this payer | 30
Rate contradicts all known rates (>10pp off) | 10
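The template build and match logic can be sketched as below. The input shape (payer_id, reimbursement_value pairs) and the handling of rates 6-10pp from all known rates (a case the threshold table leaves undefined, treated here as the neutral 30) are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

COMMON_MIN = 5  # a rate is "common" for a payer if 5+ MRF records exist at it

def build_payer_templates(
        records: Iterable[Tuple[str, float]]) -> Dict[str, Dict[int, int]]:
    """Build {payer_id: {rounded_rate: MRF record count}} from percentage-type
    stoploss records, excluding rates >100% per step 1 above."""
    templates: Dict[str, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for payer_id, value in records:
        if value > 100:
            continue
        templates[payer_id][round(value)] += 1
    return {p: dict(r) for p, r in templates.items()}

def i6_template_score(templates: Dict[str, Dict[int, int]],
                      payer_id: str, rate: int) -> int:
    """Score the cluster's peak rate against the payer's known MRF rates."""
    rates = templates.get(payer_id)
    if not rates:
        return 30                                   # no template: neutral
    def within(pp: int, min_count: int) -> bool:
        return any(abs(rate - r) <= pp for r, n in rates.items() if n >= min_count)
    if within(2, COMMON_MIN):
        return 100                                  # within 2pp of a common rate
    if within(2, 1):
        return 80                                   # within 2pp of any rate
    if within(5, COMMON_MIN):
        return 60                                   # within 5pp of a common rate
    if all(abs(rate - r) > 10 for r in rates):
        return 10                                   # contradicts all known rates
    return 30                                       # 6-10pp off: undefined, neutral (assumption)
```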

Why this works: Payers standardize stoploss rates across providers. If Cigna has 50% stoploss at 40 different hospitals in MRF data, seeing 50% as the dominant rate at hospital #41 (with no MRF match) is strong evidence — even without that specific hospital's MRF file.

Coverage: 1,506 of 4,167 scored combos (36%) matched a known MRF payer template rate.


Confidence Tiers

Without MRF confirmation, labels are more conservative than the validated version:

Score | Tier | Label | Meaning
80-100 | 1 | Strong Candidate | Highly stoploss-like pattern, often corroborated by payer template
60-79 | 2 | Likely Stoploss | Consistent flat-% pattern across service types
40-59 | 3 | Possible | Some stoploss indicators, but mixed signals
20-39 | 4 | Weak Signal | Minimal pattern, likely non-stoploss
0-19 | 5 | Unlikely | No stoploss pattern detected

Government Payer Flagging

VA (19.3% of rows), Tricare, Medicaid, and similar plans use schedule-based pricing that mimics flat-% stoploss but isn't. The govt_payer_flag column identifies these combos using keyword matching. Results are separated in the tier distribution below.
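The flag itself is simple keyword matching on the payer name. The keyword list below is hypothetical (the production list is not documented here); the sketch just shows the shape of the check.

```python
# Hypothetical keyword list -- the actual production list is not documented here.
GOVT_KEYWORDS = ("va ", "veterans", "tricare", "medicaid", "medicare")

def govt_payer_flag(payer_name: str) -> bool:
    """Flag payers whose names suggest government schedule-based pricing."""
    # pad with spaces so short tokens like "va " match at word boundaries
    name = f" {payer_name.lower().strip()} "
    return any(kw in name for kw in GOVT_KEYWORDS)
```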


Findings

Tier Distribution

Of 6,228 scored rows (5+ remit lines, no MRF match — includes multi-cluster splits from 6,666 unique combos):

Commercial Payers (5,088 rows)

Tier | Label | Count | %
1 | Strong Candidate | 291 | 6%
2 | Likely Stoploss | 1,685 | 33%
3 | Possible | 1,702 | 33%
4 | Weak Signal | 1,401 | 28%
5 | Unlikely | 9 | 0%

Government Payers — flagged (1,140 rows)

Tier | Label | Count | %
1 | Strong Candidate | 6 | 1%
2 | Likely Stoploss | 338 | 30%
3 | Possible | 402 | 35%
4 | Weak Signal | 393 | 34%
5 | Unlikely | 1 | 0%

39% of commercial rows (Tiers 1-2) show strong stoploss-like patterns. Another 33% (Tier 3) show possible signal. Multi-cluster detection splits combos with distinct rate groupings into separate rows — each cluster is scored independently, which shifts some previously high-scoring combos into lower tiers when a dominant cluster is separated from a weaker one. This is more conservative but more accurate than the previous approach of only scoring the single modal rate.

Score Statistics (Commercial)

  • Mean: 53.1
  • Median: 52.0
  • Min: 19.0
  • Max: 91.0

All Scored Rows

The full dataset of 6,228 scored rows across 6,666 provider-payer combinations. Combos with multiple rate clusters (n_clusters > 1) have separate rows for each cluster. Use column headers to sort and filter.


Cross-Validation Against MRF-Scored Combos

Methodology

To test the inference scoring system's accuracy, we ran the 6-signal framework on the 92 combos that already have MRF-validated scores (from remits_mrf_first_dollar.csv). These combos have known stoploss rates from MRF data, so we can compare the inferred tier assignment against the MRF-validated tier assignment as ground truth.

The inference scoring was applied identically — no signal used the known MRF rate. This isolates how well the pattern-detection approach works when we could check the answer.

Results Summary

Metric | Value
Exact tier match | 34 combos (37%)
Within 1 tier | 35 combos (38%)
Off by 2+ tiers | 23 combos (25%)
Mean tier delta | -0.75 (inference is systematically more optimistic)

The negative mean delta means the inference scoring consistently assigns higher confidence than the MRF-validated scoring — roughly 1 tier more optimistic on average.

Tier Confusion Matrix

Rows = MRF-validated tier (ground truth), columns = inferred tier:

 | Inf T1 | Inf T2 | Inf T3 | Inf T4 | Inf T5
MRF Tier 1 | 7 | 1 | 1 | 0 | 0
MRF Tier 2 | 2 | 14 | 4 | 0 | 0
MRF Tier 3 | 3 | 17 | 11 | 2 | 0
MRF Tier 4 | 6 | 8 | 3 | 2 | 0
MRF Tier 5 | 0 | 0 | 5 | 6 | 0

Key patterns:

  • Tier 1 holds up well — 7 of 9 MRF Tier 1 combos are also inferred Tier 1
  • Tier 3 inflates badly — 17 of 33 MRF Tier 3 combos get promoted to Inferred Tier 2, and 3 more to Tier 1
  • Tier 4 inflates worst — 6 of 19 MRF Tier 4 combos land as Inferred Tier 1; 14 of 19 are off by 2+ tiers
  • No combos reach Inferred Tier 5 — the inference system never classifies anything as "Unlikely"

Key Insight: Percent-of-Charge Contamination (Partially Corrected)

The 14 worst mismatches (MRF Tier 4 → Inferred Tier 1 or 2) were originally attributed to percent-of-charge contamination. However, with per-provision scoring on the MRF side and multi-cluster detection on the inference side, 3 of these 14 turn out to be correct detections of a real provision that the MRF scoring previously missed because it anchored to the wrong rate. The remaining 11 are genuine percent-of-charge contracts.

These mismatches occur when the inference algorithm finds a strong flat-% signal in the remits at a different rate than the known MRF stoploss rate. The rate gap ranges from 8pp to 55pp. The inference scoring detects the pattern perfectly (high I1, high I2, often a template match via I6), but in these 11 cases it is detecting the wrong contract.

Mismatch Examples

Texas Health Arlington Memorial / Aetna

  • MRF stoploss rate: 39% → MRF Tier 4 (score 26.0)
  • Inferred modal rate: 57% (97% of 30 lines) → Inferred Tier 1 (score 87.0)
  • Rate gap: 18pp — the 57% rate spans 12 rev code groups with only 2 distinct rates, producing near-perfect scores on I1 (100), I2 (100), I3 (100), and I6 (100). But 57% is the percent-of-charge rate, not the stoploss rate.

Shriners Childrens Philadelphia / Aetna

  • MRF stoploss rate: 50% → MRF Tier 4 (score 24.0)
  • Inferred modal rate: 23% (93% of 15 lines) → Inferred Tier 1 (score 81.0)
  • Rate gap: 27pp — 23% dominates across 9 rev code groups. The inference system is confident, but it is detecting a completely different contractual arrangement.

Sycamore Shoals Hospital / BCBS Tennessee

  • MRF stoploss rate: 67% → MRF Tier 4 (score 24.0)
  • Inferred modal rate: 12% (85% of 13 lines) → Inferred Tier 2 (score 66.0)
  • Rate gap: 55pp — the largest gap in the validation set. The 12% flat rate across 4 rev code groups looks stoploss-like, but the known MRF stoploss is 67%. These are entirely different contracts.

Implications for No-MRF Results

This cross-validation confirms the central risk: 25% of combos with known MRF scores are misclassified by 2+ tiers, and 11 of the 14 worst mismatches are percent-of-charge contracts that the algorithm cannot distinguish from stoploss. The same contamination pattern is almost certainly present in the 4,167 no-MRF scored combos — particularly in Tiers 1-2, where the inference system finds the strongest flat-% signals.

The algorithm is excellent at detecting flat-percentage payment patterns. It cannot determine whether that pattern is a stoploss arrangement or a percent-of-charge contract.


Example Walkthroughs

Each walkthrough shows the raw rate distribution, how each signal is calculated step-by-step, and the final score. All data is from the actual scored dataset.

Tier 1: Shriners Childrens Ohio / Cigna

Inferred stoploss: 50% of charges (15 total lines, 3 distinct rates)

Raw rate distribution (15 lines, 3 distinct rates)
Rate | Lines | % of Total
50% | 13 | 86.7%
31% | 1 | 6.7%
28% | 1 | 6.7%

The 50% rate overwhelmingly dominates — 13 of 15 lines. The two outliers (31%, 28%) are a single pharmacy line (0250) and one other line.

Revenue code breakdown
Rev Code | Lines | Rate | Group
0250 (carveout) | 1 | 31% | Pharmacy
0131 | 1 | 50% | Room & Board
0206 | 1 | 50% | ICU / Specialty
0258 | 1 | 50% | Pharmacy
0320 | 1 | 50% | Radiology
0379 | 1 | 50% | Anesthesia
0420 | 1 | 50% | Respiratory / Cardio
0430 | 1 | 50% | Respiratory / Cardio
0434 | 1 | 50% | Respiratory / Cardio
0710 | 1 | 50% | Rehab / PT

The 50% rate appears across 9 different rev code groups — room & board, ICU, pharmacy, radiology, anesthesia, respiratory, cardio, rehab. This cross-code consistency is the defining signature of stoploss: the same flat percentage regardless of service type.

Step-by-step signal scoring

Signal I1 — Dominant Rate Concentration: Modal rate 50% has 13/15 lines = 86.7% share. That's ≥75%, so score = 100.

Signal I2 — Cross-Revenue-Code Consistency: 50% (±2pp) appears in 9 rev code groups. That's ≥6, so score = 100.

Signal I3 — Rate Dispersion: 3 distinct rates. Falls in 1-3 bucket, so score = 100.

Signal I4 — Carveout Divergence: Clinical modal = 50%, carveout modal = 31%. Gap = 19pp (>10pp), so score = 100. The pharmacy code (0250) pays at a carved-out rate, while clinical codes all pay at the flat 50%.

Signal I5 — Line Volume: 15 lines, falls in 10-19 bucket, so score = 40. The only weak signal — limited data volume.

Signal I6 — Payer Template Match: Cigna's MRF data shows stoploss records at 52% (within 2pp of 50%, and a common rate with 5+ records). Score = 100.

Final Score

Signal | Weight | Score | Weighted
I1: Dominant Rate | 20% | 100 | 20.0
I2: Cross-Code Consistency | 25% | 100 | 25.0
I3: Rate Dispersion | 10% | 100 | 10.0
I4: Carveout Divergence | 15% | 100 | 15.0
I5: Line Volume | 10% | 40 | 4.0
I6: Payer Template Match | 20% | 100 | 20.0
Total: 94.0 → Tier 1 (Strong Candidate)
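The composite is a straight weighted sum of the six signal scores, using the weights from the methodology section. A minimal sketch, with the two example score sets taken from the walkthroughs:

```python
from typing import Dict

# Signal weights from the methodology section (sum to 1.0)
WEIGHTS = {"I1": 0.20, "I2": 0.25, "I3": 0.10, "I4": 0.15, "I5": 0.10, "I6": 0.20}

def weighted_score(signal_scores: Dict[str, int]) -> float:
    """Combine the six 0-100 signal scores into the 0-100 composite."""
    return round(sum(WEIGHTS[s] * v for s, v in signal_scores.items()), 1)

# Shriners Childrens Ohio / Cigna (table above)
shriners = {"I1": 100, "I2": 100, "I3": 100, "I4": 100, "I5": 40, "I6": 100}
# John Muir Walnut Creek / Kaiser (next walkthrough)
john_muir = {"I1": 100, "I2": 100, "I3": 10, "I4": 40, "I5": 100, "I6": 10}
```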

Takeaway: Textbook stoploss pattern. A single flat rate (50%) spans 9 service types, pharmacy is carved out at a different rate, and the inferred rate matches Cigna's known MRF stoploss templates. The only deduction is limited line volume (15 lines), but the signal quality is extremely high.


Tier 2: John Muir Health — Walnut Creek / Kaiser Permanente

Inferred stoploss: 81% of charges (13,657 total lines, 47 distinct rates)

Raw rate distribution (13,657 lines, top 15 of 47 distinct rates)
Rate | Lines | % of Total
81% | 13,567 | 99.3%
17% | 5 | 0.0%
33% | 5 | 0.0%
54% | 5 | 0.0%
22% | 4 | 0.0%
21% | 4 | 0.0%
82% | 3 | 0.0%
36% | 3 | 0.0%
23% | 3 | 0.0%
39% | 3 | 0.0%
… | … | …

Despite 47 distinct rates, 99.3% of all lines pay at exactly 81%. The remaining 90 lines are scattered across 46 other rates — likely a handful of edge cases or non-stoploss plans under the same payer umbrella.

Revenue code breakdown (top 10)
Rev Code | Lines | Top Rates | Notes
(blank) | 8,627 | 81% (8,624) | Unclassified — nearly all at 81%
0636 (carveout) | 1,203 | 81% (1,202) | Pharmacy detail — still at 81%
0305 | 163 | 81% (163) | Lab — all at 81%
0250 (carveout) | 163 | 81% (163) | Pharmacy — all at 81%
0301 | 162 | 81% (162) | Lab — all at 81%
0272 (carveout) | 154 | 81% (154) | Supplies — all at 81%
0300 | 148 | 81% (148) | Lab — all at 81%
0324 | 147 | 81% (147) | Radiology — all at 81%
0271 (carveout) | 146 | 81% (146) | Supplies — all at 81%
0278 (carveout) | 139 | 81% (139) | Supplies — all at 81%

Every revenue code — including all carveout codes — pays at 81%. This payer does NOT carve out drugs/supplies from the stoploss rate.

Step-by-step signal scoring

Signal I1 — Dominant Rate Concentration: Modal rate 81% has 13,567/13,657 lines = 99.3% share. Score = 100.

Signal I2 — Cross-Revenue-Code Consistency: 81% (±2pp) appears in 12 rev code groups — every single group. Score = 100.

Signal I3 — Rate Dispersion: 47 distinct rates. Falls in the >20 bucket, so score = 10. The 46 non-modal rates are noise from edge cases, but the sheer count penalizes this signal.

Signal I4 — Carveout Divergence: Clinical modal = 81%, carveout modal = 81%. Gap = 0pp (<5pp), so score = 40. No carveout divergence — everything pays at the same rate.

Signal I5 — Line Volume: 13,657 lines. Score = 100.

Signal I6 — Payer Template Match: Kaiser Permanente's MRF data does not have a stoploss record near 81%. All known Kaiser rates are much lower. The modal rate contradicts all known rates (>10pp off), so score = 10.

Final Score

Signal | Weight | Score | Weighted
I1: Dominant Rate | 20% | 100 | 20.0
I2: Cross-Code Consistency | 25% | 100 | 25.0
I3: Rate Dispersion | 10% | 10 | 1.0
I4: Carveout Divergence | 15% | 40 | 6.0
I5: Line Volume | 10% | 100 | 10.0
I6: Payer Template Match | 20% | 10 | 2.0
Total: 64.0 → Tier 2 (Likely Stoploss)

Why This Is Tier 2, Not Tier 1

This combo has arguably the strongest intrinsic pattern in the entire dataset — 99.3% of 13,657 lines at a single rate across all service types. But two signals drag the score down:

  • I3 (Rate Dispersion): 47 distinct rates scores just 10. Those 90 outlier lines create many distinct rates even though they're <1% of volume. This is a limitation of counting distinct rates without weighting by volume.
  • I6 (Payer Template): Kaiser's MRF stoploss data doesn't include an 81% rate, so the template match fails. This doesn't mean it's not stoploss — it may simply be a contract not in the MRF data, or a different arrangement structure.

The intrinsic evidence is overwhelming. A human reviewer would classify this as Tier 1 without hesitation.


Tier 3: Armstrong County Memorial Hospital / UnitedHealthcare

Inferred stoploss: 31% of charges (9 total lines, 3 distinct rates)

Raw rate distribution (9 lines, 3 distinct rates)
Rate | Lines | % of Total
21% | 6 | 66.7%
31% | 2 | 22.2%
10% | 1 | 11.1%

The 21% rate captures two-thirds of lines, but with only 9 total lines the distribution is thin. The 31% rate (2 lines) comes from a single radiology code, and the 10% outlier is a single drug detail line.

Revenue code breakdown
Rev Code | Lines | Top Rates | Notes
0636 (carveout) | 7 | 21% (6), 10% (1) | Drug detail — nearly all at modal rate
0335 | 2 | 31% (2) | Radiology — pays at a different rate

Only 2 revenue codes are present, both mapping to a single rev code group each (drugs/supplies and radiology). The limited code diversity is why I2 scores so poorly — we can't confirm cross-service consistency with data from only 2 service types.

Step-by-step signal scoring

Signal I1 — Dominant Rate Concentration: Modal rate 21% has 6/9 lines = 66.7% share. That's in the 50-74% bucket, so score = 80.

Signal I2 — Cross-Revenue-Code Consistency: 21% (±2pp) appears in only 1 rev code group. Score = 20. This is the weak link — the dominant rate only shows up in one service category, which is less convincing than seeing it across room & board, labs, radiology, etc.

Signal I3 — Rate Dispersion: 3 distinct rates. Score = 100.

Signal I4 — Carveout Divergence: Clinical modal = 31%, carveout modal = 21%. Gap = 10pp (5-10pp), so score = 70. Moderate divergence suggests possible carveout structure.

Signal I5 — Line Volume: 9 lines. Falls in 5-9 bucket, so score = 20. Very limited data.

Signal I6 — Payer Template Match: UHC's MRF data has a rate at 19% (within 2pp of modal rate 21%, 1-4 records). Score = 80.

Final Score

Signal | Weight | Score | Weighted
I1: Dominant Rate | 20% | 80 | 16.0
I2: Cross-Code Consistency | 25% | 20 | 5.0
I3: Rate Dispersion | 10% | 100 | 10.0
I4: Carveout Divergence | 15% | 70 | 10.5
I5: Line Volume | 10% | 20 | 2.0
I6: Payer Template Match | 20% | 80 | 16.0
Total: 59.5 → Tier 3 (Possible)

Takeaway: The rate distribution looks stoploss-like (concentrated, few rates, carveout divergence) and the payer template corroborates. But only 9 lines across 1 rev code group — we simply don't have enough data across service types to be confident. More claims data would likely push this toward Tier 2.


Tier 4: Overlook Medical Center / BCBS of New Jersey (Horizon)

Inferred stoploss: 64% of charges (2,552 total lines, 76 distinct rates)

Raw rate distribution (2,552 lines, top 15 of 76 distinct rates)
Rate | Lines | % of Total
64% | 630 | 24.7%
69% | 463 | 18.1%
66% | 241 | 9.4%
13% | 218 | 8.5%
21% | 142 | 5.6%
55% | 58 | 2.3%
37% | 58 | 2.3%
15% | 56 | 2.2%
35% | 44 | 1.7%
34% | 36 | 1.4%
53% | 35 | 1.4%
36% | 35 | 1.4%
67% | 32 | 1.3%
24% | 31 | 1.2%
29% | 30 | 1.2%

No single rate dominates. The "modal" rate (64%) captures only 24.7% of lines, and the distribution is spread across 76 rates. This looks like multi-plan mixing with possibly DRG-based or case-rate pricing.

Revenue code breakdown (top 8)
Rev Code | Lines | Top Rates | Notes
(blank) | 1,484 | 64% (308), 69% (255), 66% (125) | Broad distribution across many rates
0258 (carveout) | 510 | 64% (253), 69% (160), 66% (96) | Pharmacy — mimics overall distribution
0250 (carveout) | 225 | 64% (69), 69% (46), 67% (23) | Pharmacy — same pattern
0636 (carveout) | 141 | 13% (140), 24% (1) | Drug detail — divergent, pays at 13%
0278 (carveout) | 102 | 21% (102) | Supplies — divergent, all at 21%
0209 | 39 | Spread across many rates | ICU — no dominant rate
0214 | 13 | Spread | Specialty room — scattered
0343 | 10 | 13% (10) | Radiology — all at 13%

The wide rate variation across revenue codes is the opposite of a flat stoploss pattern. Different service types pay at different rates, suggesting DRG/case-rate pricing across multiple plans.

Step-by-step signal scoring

Signal I1 — Dominant Rate Concentration: Modal rate 64% has 630/2,552 lines = 24.7% share. Falls in 10-24% bucket, so score = 40.

Signal I2 — Cross-Revenue-Code Consistency: 64% (±2pp) appears in 2 rev code groups. Score = 50. The rate is concentrated in just a couple of service categories, not spread broadly.

Signal I3 — Rate Dispersion: 76 distinct rates. Falls in >20 bucket, so score = 10. Extremely dispersed.

Signal I4 — Carveout Divergence: Clinical modal = 64%, carveout modal = 64%. No divergence (<5pp), so score = 40.

Signal I5 — Line Volume: 2,552 lines. Score = 100. Plenty of data — the pattern just doesn't look like stoploss.

Signal I6 — Payer Template Match: Horizon's MRF data doesn't have any stoploss record near 64% (>10pp from all known rates). Score = 10. The inferred rate contradicts all known payer templates.

Final Score

Signal | Weight | Score | Weighted
I1: Dominant Rate | 20% | 40 | 8.0
I2: Cross-Code Consistency | 25% | 50 | 12.5
I3: Rate Dispersion | 10% | 10 | 1.0
I4: Carveout Divergence | 15% | 40 | 6.0
I5: Line Volume | 10% | 100 | 10.0
I6: Payer Template Match | 20% | 10 | 2.0
Total: 39.5 → Tier 4 (Weak Signal)

Why Multi-Plan Mixing Creates False Candidates

This combo illustrates the core challenge of inference scoring without an MRF anchor. The 64% "modal" rate is really just the largest cluster in a heavily mixed distribution — likely one plan among many, each with different pricing. Without knowing which plan has stoploss (if any), the signal is too diluted to classify with confidence. The payer template contradiction (no known Horizon stoploss at 64%) further weakens the case.


Limitations and Caveats

Percent-of-Charge Contract Ambiguity

Key Limitation — Action Required

This scoring system is equally good at detecting percent-of-charge contracts (flat % of billed charges with no dollar threshold trigger) as it is at detecting stoploss. Both produce the same signature in remits data: a dominant flat rate applied uniformly across service types.

Without dollar threshold data or contract language, remits alone cannot distinguish:

  • Stoploss: "If total charges exceed $250K, pay 60% of charges for the entire claim"
  • Percent of charges: "Pay 60% of charges on all claims" (no threshold — applies from dollar one)

Carveout divergence (signal I4) provides a partial differentiator — stoploss contracts more commonly carve out pharmacy and supplies at different rates — but many legitimate stoploss arrangements pay a flat rate on everything, including carveouts.

Cross-validation confirms this risk: When tested against 92 combos with known MRF scores, 25% were misclassified by 2+ tiers — and 11 of the 14 worst mismatches were percent-of-charge contracts that the algorithm scored as Tier 1-2. See Cross-Validation Against MRF-Scored Combos for full results.

Recommendation: Before incorporating high-confidence results (Tier 1-2) into Clear Rates, cross-reference against MRF negotiation arrangement data to exclude combos where the payer-provider relationship is known to be a simple percent-of-charge contract. This is a to-do item for future work.

No MRF Anchor

The fundamental limitation. Without a known target rate, this scoring system cannot distinguish stoploss from other flat-% pricing arrangements (case rates, per-diem with similar effect, negotiated discounts). The confidence tiers are deliberately more conservative — a "Strong Candidate" here is weaker evidence than a "Confirmed" in the MRF-validated scoring.

Government Payer Schedule Pricing

VA, Tricare, and Medicaid plans use fee schedules that produce flat payment percentages similar to stoploss. These 803 combos are flagged (govt_payer_flag) and separated in the tier distribution, but they inflate Tier 2-3 counts if not filtered.

Rate Granularity

perc_allowed is rounded to 1% increments. Two providers with true rates of 49.6% and 50.4% both appear as 50%. This makes the ±2pp matching windows in I2 and I6 coarser than they appear — effectively a ±3pp real window.

Multi-Plan Mixing

The same challenge as the MRF-validated scoring, but harder to manage without a target rate. A combo where 25% of lines pay at one flat rate and 75% pay at various DRG rates may genuinely have stoploss on one plan — but the scoring system sees it as a weak dominant rate (I1 = 60) rather than a strong minority cluster.

Template Coverage Gaps

Only 120 of ~193 payers in the remits data have MRF stoploss records. The remaining ~73 payers receive a neutral I6 score (30), which prevents penalization but also means we can't corroborate their patterns.


Comparison to MRF-Validated Scoring

Dimension | MRF-Validated | No-MRF Inference
Anchor | Known MRF rate | None (inferred from distribution)
Rows scored | 104 provisions | 6,228 cluster-rows
Multi-unit scoring | Per-provision (by MRF rate) | Per-cluster (5pp gap detection)
Top signal | S1: MRF Cluster Strength (30%) | I2: Cross-Code Consistency (25%)
Novel signal | S6: Directional Consistency | I6: Payer Template Match
Tier 1-2 rate | 33% | 39% (commercial)
Tier 1-2 meaning | Confirmed stoploss | Strong candidate (needs validation)
Key weakness | Small sample (104 provisions) | No ground truth

The two systems are complementary. MRF-validated scoring provides high-confidence confirmation for a small set. Inference scoring extends coverage 60x (104 → 6,228 rows) at the cost of certainty — the "Strong Candidate" label acknowledges that even Tier 1 results should be treated as hypotheses until MRF data becomes available.