Skip to main content

Architecture

Pipeline

Results are stored in the tq_dev.internal_dev_mmalhotra_provisions schema.

provisions_stoploss_base_{subversion}

provisions_stoploss_extracted_{subversion}

provisions_stoploss_aggregated_{subversion}
  • Base: broad capture of all rate records that may contain stoploss language — goal is recall, not precision
  • Extracted: exclusions, normalization, and field extraction — one row per note, queryable for QA before deduplication
  • Aggregated: deduplication to one row per payer/provider/network/setting, selecting the best available note

Step 1: Base (provisions_stoploss_base_{subversion})

Broadly pulls all hospital rate records that might contain a stoploss provision. A record is included if at least one note field contains any of the words: stop, loss, threshold, discount, charges, outlier — and at least one note field contains a digit.

Broad inclusions
WHERE
-- Broad inclusion: search for stoploss-related terms in any note field
(
REGEXP_LIKE(LOWER(hr.additional_generic_notes), 'stop|loss|threshold|discount|charges|outlier')
OR REGEXP_LIKE(LOWER(hr.additional_payer_notes), 'stop|loss|threshold|discount|charges|outlier')
OR REGEXP_LIKE(LOWER(hr.negotiated_algorithm), 'stop|loss|threshold|discount|charges|outlier')
)
AND hr.payer_class_name = 'Commercial'
-- Must have at least one numeric value in notes
AND (
REGEXP_LIKE(hr.additional_generic_notes, '\d')
OR REGEXP_LIKE(hr.additional_payer_notes, '\d')
OR REGEXP_LIKE(hr.negotiated_algorithm, '\d')
)
AND (ns.network_id IS NOT NULL OR hr.plan_name IS NOT NULL)

Step 2: Extract (provisions_stoploss_extracted_{subversion})

Applies all exclusion logic, normalizes note text, and extracts structured fields. This is the most complex step and the best place to QA results before deduplication.

Exclusions

Notes are dropped if they match known non-stoploss patterns.

Multi-field exclusions

Patterns that require looking at more than one note field:

PatternWhat it matchesExample
Hospital System Supply IdentifierGeneric note contains "Hospital System Supply Identifier" followed by text ending with "Lawson ID", no payer notesHospital System Supply Identifier ... Lawson ID 12345
Term LineGeneric note ends with "Term Line" followed by a number, no payer notesTerm Line 4
Outpatient billingGeneric note is exactly the standard outpatient billing setting note, no payer notesThe contract rate provided is when this service is billed outpatient.
Re-evaluated (payer notes)Payer note starts with "Re-evaluated:", no other notes, no "threshold" keywordRe-evaluated: rate updated per contract amendment
Re-evaluated (generic notes)Generic note starts with "Re-evaluated:", no other notesRe-evaluated: 85% of charges
CPT/code + contract paymentGeneric note is a CPT code or bare number; payer note starts with "contract indicates payment"generic: 99213, payer: contract indicates payment of 110%
Bare number + percentage of chargesGeneric note is a bare number; payer note is exactly a percentage followed by "OF TOTAL BILLED CHARGES" or similargeneric: 278, payer: 85% OF TOTAL BILLED CHARGES
Rev/Proc codeGeneric note matches "Rev NNN Proc NNNNN" format, no payer notesRev 200 Proc 99213
Reimb exceeds chargesGeneric note is exactly "Reimb exceeds charges. Trim to Lessor Of Claim", no other notesReimb exceeds charges. Trim to Lessor Of Claim
OPPS APCAny note starts with "OPPS" followed by whitespace or ;, then "APC"OPPS APC Type 1, OPPS; APC ...
Per-note exclusions

Applied to each note individually:

PatternWhat it matchesExample
Cost Plus MarkupContains "Cost Plus Markup"Cost Plus Markup 10%
Implants and DevicesStarts with "Implants and Devices"Implants and Devices: 80% of invoice
APC grouping requires price historyStarts with "APC grouping requires price history"APC grouping requires price history to determine...
If > $2500Starts with "If > $2500"If > $2500, invoice cost
Eligible charges greater thanStarts with "Eligible charges greater than $1,090"Eligible charges greater than $1,090 billed at invoice
Global case rateStarts with "Global case rate listed refers to the fixed fee payment made to provider..."Global case rate listed refers to the fixed fee payment made to provider...
Patients who expireStarts with "Patients who expire during treatment or who do not complete the transplant process"Patients who expire during treatment or who do not complete the transplant process...
Implant thresholdContains "implant threshold"subject to implant threshold
Rev code add-onContains "Rev" + 4-digit code starting with 0, then "in addition to IP rate"Rev 0120 ... in addition to IP rate
Paid under OPPSStarts with "Paid under OPPS;"Paid under OPPS; standard fee schedule
IPPS logicStarts with "This is being calculated using the IPPS logic from CMS."This is being calculated using the IPPS logic from CMS.
Aggregated billable gross charges for specific codeContains "aggregated billable gross charges for specific code"aggregated billable gross charges for specific code 99213
DRG Case WeightStarts with "DRG" + number + "DRG Case Weight"DRG 470 DRG Case Weight 2.89
Billable gross charges for specified codeContains "of billable gross charges" ... "for specified code"85% of billable gross charges for specified code
Medicare reimbursement headerStarts with "Medicare" + slash-delimited code + "Reimbursement:"Medicare XXX/YYY Reimbursement:
Medicare OP ReimbursementStarts with "Medicare OP Reimbursement:"Medicare OP Reimbursement: 110%
TRICARE OP ReimbursementStarts with "TRICARE OP Reimbursement:"TRICARE OP Reimbursement: 115%
Procedure Lookup TablesStarts with "Procedure Lookup Tables:"Procedure Lookup Tables: 99213
Procedure LookupStarts with "Procedure Lookup:" optionally preceded by "Sum: "Procedure Lookup: 99213, Sum: Procedure Lookup: 99213
FormulaCPEStarts with "FormulaCPE:"FormulaCPE: DRG 470
Interim ValueStarts with "Interim Value:"Interim Value: 85%
Group (Sum Of)Starts with "Group (Sum Of)"Group (Sum Of) DRG 470
Can be subject to stoploss provisionsStarts with "Can be subject to stoploss provisions"Can be subject to stoploss provisions based on contract
Percentage of costContains a number + "percent" + "of cost"85 percent of cost
Reimbursed at invoice costContains "reimbursed at" ... "of invoice cost when billed charges exceed"reimbursed at 90% of invoice cost when billed charges exceed $500
Carveout exclusion format (without Rate Threshold)Contains a newline + "Exclusion - ", and does not also contain "Rate Threshold"\nExclusion - Implants: 80% of invoice

Extraction Patterns

Notes are normalized (lowercased, hyphens removed, %"percent") before extraction. Each pattern fires at most once per note.

Stoploss Type

Rules are evaluated in priority order — first match wins.

PriorityValueSignalNotes
1first dollarNote contains: first dollar, 1st dollar, entire claim, entire admission, entire stayExplicit keyword match
2second dollarNote contains: second dollar, 2nd dollar, in addition to, over the threshold, charges over, charges in excess, after the threshold, above the threshold, for each day past, days after, excess of, over and aboveunless the note also contains in addition to other negotiated/contracted rates or in addition to other/inpatient/outpatient services (carveout structure language, not a stoploss signal). in addition to this service is intentionally kept as a second dollar signal.Explicit keyword match with carveout exclusion
3first dollarNote contains charges exceeding followed by threshold or outlier in the same sentenceGuard: does not fire if in addition to follows exceeding in the same clause — that indicates excess paid on top of a base rate (second dollar)
4first dollarNote contains revertGuard: does not fire if in excess of or that exceed that follows revert in the same clause — that indicates only the excess above the threshold is reimbursed differently (second dollar)
5first dollarNote starts with charges greater than
NULLNone of the above matched
Threshold

Dollar-based:

PatternWhat it looks forExample
k-notationNumber optionally preceded by $, immediately followed by k (not part of a longer word)$50k, 100k, $1.5k
keyword-anchoredexceeds, threshold of, threshold exceeding, or greater than followed by an optional $ and a number ≥ 5 digits. exceeds excluded when preceded by not to.exceeds $100,000, greater than $75000
structured-fixed-amount+ Fixed Amount followed by a number (auto-generated note format). Only fires if k-notation and keyword-anchored both fail.+ Fixed Amount 56687.00
numeric fallbackAny number ≥ 5 digits optionally preceded by $. Only fires if all above fail, and only if the value doesn't match an already-extracted per diem reimbursement.$25000, 150000.00

LOS-based:

PatternWhat it looks forExample
days-keyword>, exceeds, greater than, past, over, reaches, threshold of, or after followed by a number then day or daysexceeds 5 days, > 7 days, after 3 days
Reimbursement

Percentage of charges:

PatternWhat it looks forExample
structured-reimbursement-rateStoploss First/Second Dollar + Reimbursement Rate N (auto-generated note, non-zero N)Stoploss First Dollar + Reimbursement Rate 60.00 + ...
structured-poc-rateSame prefix but initial rate is zero, followed by + Percent Of Charge + Reimbursement Rate N... + Percent Of Charge + Reimbursement Rate 54.25 + ...
structured-rate-thresholdPercentage on the same line as Rate Threshold in a common auto-generated noteRate Threshold $125,000.00...reimbursed at 18.6%
decimalNumber with decimal point immediately followed by percent60.5%, 60.5 percent
integerWhole number immediately followed by percent75%, 75 percent

Per diem:

PatternWhat it looks forExample
amount-beforeNumber immediately followed by per diem1500 per diem
rate-ofper diem optionally followed by rate, then of, then a numberper diem rate of $1500
bareper diem followed directly by a numberper diem $1500
Cap
PatternWhat it looks forExample
not-to-exceed-knot to exceed, NTE, or up to a maximum (of) + number + k + per day. Number × 1,000.NTE $2k per day, up to a maximum of $2k per day
not-to-exceednot to exceed, NTE, or up to a maximum (of) + number + per daynot to exceed $500 per day, up to a maximum of $500 per day

Quality Gate

A record is only kept if:

  1. The note contains a stoploss-specific keyword (stoploss, threshold, excess, outlier, exceed) or starts with charges greater than. not to exceed alone does not qualify.
  2. At least one field was extracted: a percentage, a per diem reimbursement, a dollar threshold ≥ $50,000, or a per diem threshold

Confidence Scoring

Each record is scored by summing the weights of all patterns that fired, plus bonuses for values in plausible ranges. Higher-confidence patterns (e.g., structured auto-generated formats) score more than broad fallbacks (e.g., any 5-digit number). Max score is 15.

Scoring table
CategoryPatternPoints
Stoploss Typestoploss_type is not null1
Thresholddollar_threshold: k-notation, keyword-anchored, or structured-fixed-amount2
Thresholddollar_threshold: numeric-fallback1
Thresholddollar_threshold between 100K100K–300K (bonus)1
Thresholdper_diem_threshold: days-keyword1
Thresholdper_diem_threshold between 3–30 days (bonus)1
Reimbursementpercentage: structured-reimbursement-rate, structured-poc-rate, or structured-rate-threshold3
Reimbursementpercentage: decimal or integer2
Reimbursementpercentage_reimbursement between 20%–70% (bonus)1
Reimbursementper_diem_reimbursement: amount-before or rate-of2
Reimbursementper_diem_reimbursement: bare1
Reimbursementper_diem_reimbursement between 500500–5,000 (bonus)1
Capper_diem_cap: either pattern2

Step 3: Aggregate (provisions_stoploss_aggregated_{subversion})

Multiple notes can qualify for the same payer/provider/network/setting group. The highest-ranked note is selected as the canonical source using:

  1. Notes shared by more rate IDs are preferred (more representative)
  2. Notes with more extracted fields are preferred
  3. Alphabetical tiebreak on note type

Output Schema

ColumnTypeDescription
canonical_idVARCHARpayer_id-network_id-provider_id-setting
payer_idINT
payer_nameVARCHAR
provider_idVARCHAR
provider_nameVARCHAR
network_idBIGINTReal network ID, or deterministic hash for unmapped networks
network_nameVARCHARReal network name, or 'unmapped'
settingVARCHAR'Inpatient' or 'Outpatient'
stoploss_typeVARCHAR'first dollar', 'second dollar', or NULL
dollar_thresholdDOUBLEDollar amount above which stoploss applies
per_diem_thresholdINTEGERLength of stay (days) above which stoploss applies
stoploss_threshold_typeVARCHAR'dollar' or 'per diem'
percentage_reimbursementDOUBLEReimbursement rate as a percentage
per_diem_reimbursementDOUBLEReimbursement as a per diem dollar amount
stoploss_reimbursement_typeVARCHAR'percentage' or 'per diem'
per_diem_capDOUBLEMaximum reimbursement per day in dollars
stoploss_cap_typeVARCHAR'dollars per day' if a per diem cap was found, otherwise NULL
rate_scoreINTEGERConfidence score (0–15)
matched_textVARCHARPipe-delimited raw substrings that matched during extraction
regex_patternVARCHARPipe-delimited pattern labels
rate_countBIGINTDistinct rate IDs sharing the canonical note text
canonical_source_idBIGINTMinimum rate ID from the contributing rate group
canonical_source_noteVARCHARNote text used for extraction
canonical_source_note_typeVARCHARWhich note field the canonical note came from
other_notes_availableBOOLEANWhether other distinct qualifying notes exist for this group
source_rate_idsVARCHARComma-delimited list of all contributing rate IDs

Pattern Reference (provisions_stoploss_pattern_reference_{subversion})

A lookup table mapping every regex_pattern label to a plain-English description and its score weight.

ColumnDescription
category'Keyword anchored', 'Dollar, percentage, or day value', or 'Known pattern from auto-generated note'
regex_patternShort label matching values emitted by provisions_stoploss_extracted
descriptionPlain-English description of what the pattern matches
rate_score_weightPoints contributed to rate_score when this pattern fires

Use it to annotate query results or audit pattern coverage:

SELECT p.*, r.description, r.rate_score_weight
FROM tq_dev.internal_dev_mmalhotra_provisions.provisions_stoploss_extracted_2026_01 p
CROSS JOIN UNNEST(SPLIT(p.regex_pattern, ' | ')) AS t(pattern)
JOIN tq_dev.internal_dev_mmalhotra_provisions.provisions_stoploss_pattern_reference_2026_01 r ON t.pattern = r.regex_pattern