Data Dictionary
Column Naming Convention
All combined output tables use a consistent prefix + suffix scheme.
Prefix — identifies what is included in the price:
| Prefix | Included components | Formula |
|---|---|---|
| inst_ | Institutional / facility | inst_price |
| prof_ | Institutional + Professional Fee | inst_price + prof_professional_price |
| prof_with_optional_ | Institutional + Professional Fee + Optional Fee | inst_price + prof_all_fees_price |
prof_* prices are additive: the institutional rate is always included, and the professional fee component is added on top.
Suffix — identifies the metric:
| Suffix | Description |
|---|---|
| _price | Volume-weighted average commercial canonical rate |
| _medicare_price | Medicare reference rate from the highest-volume code |
| _weight | price / base_rate (IP base = 500) |
| _medicare_weight | medicare_price / base_rate |
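
The prefix + suffix scheme above can be enumerated mechanically. The following is a minimal sketch, not code from the pipeline: the names PREFIXES, SUFFIXES, and COLUMNS are hypothetical, and it assumes a column name is simply the prefix followed by the suffix with the shared underscore written once.

```python
# Hypothetical constants mirroring the two naming tables above.
PREFIXES = ("inst_", "prof_", "prof_with_optional_")
SUFFIXES = ("_price", "_medicare_price", "_weight", "_medicare_weight")

# Map every metric column name to its (prefix, suffix) pair.
# The prefix already ends in "_", so the suffix's leading "_" is dropped.
COLUMNS = {
    prefix + suffix[1:]: (prefix, suffix)
    for prefix in PREFIXES
    for suffix in SUFFIXES
}


def split_column(name):
    """Return the (prefix, suffix) pair for a metric column name."""
    return COLUMNS[name]
```

Under these assumptions the mapping holds 12 names, from inst_price through prof_with_optional_medicare_weight, and split_column("prof_medicare_weight") returns ("prof_", "_medicare_weight").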
combined_subcategory_fee_schedule_2026_03_12
Granularity: one row per (ssp_grouper, sub_category, pos, provider_id)
Location: tq_dev.internal_dev_csong_ssp
| Column | Type | Description |
|---|---|---|
| ssp_grouper | string | SSP identifier |
| sub_category | string | Sub-category within the SSP (0 = base / least severe) |
| pos | string | Place of service: IP (inpatient) or OP (outpatient) |
| provider_id | string | Hospital provider identifier |
| inst_price | float | Institutional commercial rate |
| inst_medicare_price | float | Institutional Medicare rate |
| inst_weight | float | inst_price / base_rate |
| inst_medicare_weight | float | inst_medicare_price / base_rate |
| prof_price | float | inst_price + professional_fee_price (commercial) |
| prof_medicare_price | float | inst_medicare_price + professional_fee_medicare_price |
| prof_weight | float | prof_price / base_rate |
| prof_medicare_weight | float | prof_medicare_price / base_rate |
| prof_with_optional_price | float | inst_price + professional_fee_price + optional_fee_price (commercial) |
| prof_with_optional_medicare_price | float | inst_medicare_price + professional_fee_medicare_price + optional_fee_medicare_price |
| prof_with_optional_weight | float | prof_with_optional_price / base_rate |
| prof_with_optional_medicare_weight | float | prof_with_optional_medicare_price / base_rate |
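
The additive column formulas in this table can be sketched as follows. This is an illustration only: the function name combined_commercial_columns is hypothetical, the input values in the usage note are made up, and the base rate is passed in by the caller because this document states only the IP value (500).

```python
IP_BASE_RATE = 500.0  # from the suffix table; the OP base rate is not stated here


def combined_commercial_columns(inst_price, professional_fee_price,
                                optional_fee_price, base_rate):
    """Build the commercial price and weight columns for one row."""
    # prof_* prices always include the institutional rate.
    prof_price = inst_price + professional_fee_price
    prof_with_optional_price = prof_price + optional_fee_price
    prices = {
        "inst_price": inst_price,
        "prof_price": prof_price,
        "prof_with_optional_price": prof_with_optional_price,
    }
    # Each *_weight column is the matching *_price divided by the base rate.
    weights = {
        name.replace("_price", "_weight"): value / base_rate
        for name, value in prices.items()
    }
    return {**prices, **weights}
```

For example, combined_commercial_columns(1000.0, 250.0, 100.0, IP_BASE_RATE) gives prof_price = 1250.0 and prof_weight = 2.5. The Medicare columns would follow the same pattern with the Medicare inputs.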
combined_ssp_fee_schedule_2026_03_12
Granularity: one row per (ssp_grouper, pos, provider_id)
Location: tq_dev.internal_dev_csong_ssp
SSP-level prices are a volume-weighted average of the sub-category prices. Weights use the same base rates as the sub-category table.
| Column | Type | Description |
|---|---|---|
| ssp_grouper | string | SSP identifier |
| pos | string | Place of service: IP or OP |
| provider_id | string | Hospital provider identifier |
| total_claim_count | bigint | Sum of sub_package_total_billed_count across all base codes in the SSP (from sub_packages) |
| inst_price | float | Institutional commercial rate |
| inst_medicare_price | float | Institutional Medicare rate |
| inst_weight | float | inst_price / base_rate |
| inst_medicare_weight | float | inst_medicare_price / base_rate |
| prof_price | float | inst_price + professional_fee_price (commercial) |
| prof_medicare_price | float | inst_medicare_price + professional_fee_medicare_price |
| prof_weight | float | prof_price / base_rate |
| prof_medicare_weight | float | prof_medicare_price / base_rate |
| prof_with_optional_price | float | inst_price + professional_fee_price + optional_fee_price (commercial) |
| prof_with_optional_medicare_price | float | inst_medicare_price + professional_fee_medicare_price + optional_fee_medicare_price |
| prof_with_optional_weight | float | prof_with_optional_price / base_rate |
| prof_with_optional_medicare_weight | float | prof_with_optional_medicare_price / base_rate |
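
The SSP-level roll-up described for this table is a volume-weighted average of sub-category prices. A minimal sketch, assuming each sub-category contributes one price and one claim volume; the function name and the choice of volume measure are assumptions, since this document does not say which volume column the pipeline uses.

```python
def volume_weighted_average(prices, volumes):
    """Average the sub-category prices, weighting each by its claim volume.

    Returns None when the total volume is zero, since the average is
    undefined in that case.
    """
    total_volume = sum(volumes)
    if total_volume == 0:
        return None
    weighted_sum = sum(price * volume for price, volume in zip(prices, volumes))
    return weighted_sum / total_volume
```

With illustrative inputs, two sub-categories priced at 1000.0 and 2000.0 with volumes 3 and 1 roll up to (3000 + 2000) / 4 = 1250.0.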
Combo SSPs
Combo SSPs (e.g., GA.2.colonoscopy_and_egd) are stored in the same combined
output tables above. They are created by combo_ssps.py using multiple-procedure
logic: the primary component (higher inst_price) is priced at 100% and the
secondary component (lower inst_price) at 50%, with primary and secondary
determined per provider. Combo SSPs have a single sub_category = '0' in
combined_subcategory_fee_schedule.
See Methodology — Step 8 for details.
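
The 100% / 50% rule can be illustrated with a short sketch. This is not the code in combo_ssps.py: the function name is hypothetical, and the sketch covers only the institutional price of a two-procedure combo for a single provider.

```python
SECONDARY_DISCOUNT = 0.5  # secondary procedure is paid at 50%


def combo_inst_price(inst_price_a, inst_price_b):
    """Price a two-procedure combo for one provider.

    The procedure with the higher inst_price is treated as primary (100%),
    the other as secondary (50%). Because the comparison uses that
    provider's own rates, the primary procedure can differ across providers.
    """
    primary = max(inst_price_a, inst_price_b)
    secondary = min(inst_price_a, inst_price_b)
    return primary + SECONDARY_DISCOUNT * secondary
```

With illustrative rates of 800.0 and 1200.0, the combo price is 1200.0 + 0.5 * 800.0 = 1600.0, regardless of the order in which the two rates are passed.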
Source Tables
manual_institutional_line_codes
Granularity: one row per (base_code, line_code)
Location: tq_dev.internal_dev_csong_ssp
Claims-derived institutional line codes not in sub_package_contents (Facility Fee).
Produced by manual_institutional_line_codes.py. Includes hardcoded combo SSP entries.
| Column | Description |
|---|---|
| base_code | Anchor billing code (or combo base code, e.g., 43250 + 45384) |
| line_code | Co-occurring procedure code |
manual_professional_line_codes
Granularity: one row per (base_code, line_code)
Location: tq_dev.internal_dev_csong_ssp
Claims-derived professional line codes not in sub_package_contents (Professional/Optional Fee).
Produced by manual_professional_line_codes.py. Also mirrors ancillary codes (anesthesia,
radiology, lab/path) from the institutional side and includes hardcoded combo SSP entries.
| Column | Description |
|---|---|
| base_code | Anchor billing code (or combo base code, e.g., 43250 + 45384) |
| line_code | Co-occurring procedure code |
institutional_line_codes
Granularity: one row per (ssp_grouper, sub_category, pos, line_code, provider_id)
Location: tq_dev.internal_dev_csong_ssp
Line-code-level detail for the institutional fee schedule. Combines sub_package_contents
(Facility Fee), manual_institutional_line_codes, and DRG anchor codes. Produced by
institutional_line_codes.py.
| Column | Description |
|---|---|
| ssp_grouper | SSP identifier |
| sub_category | Sub-category (0 = least severe) |
| pos | IP or OP |
| line_code | Procedure code or revenue code |
| label | Anchor Code, Revenue Code, Carved Out: Drug, Carved Out: Implant, or NULL |
| fee_type | Facility Fee |
| provider_id | Hospital provider identifier |
| canonical_rate | Commercial rate (anchor codes only) |
| medicare_rate | Medicare rate (anchor codes only) |
| medicare_code | MS-DRG or APC code used for Medicare pricing |
| rate_source | validated or backup |
institutional_fee_schedule_2026_03_12
Granularity: one row per (ssp_grouper, sub_category, pos, provider_id)
| Column | Description |
|---|---|
| ssp_grouper | SSP identifier |
| sub_category | Sub-category (0 = least severe) |
| pos | IP or OP |
| code | Highest-volume billing code (MS-DRG or HCPCS) in the sub-category |
| multiplier | RII tier multiplier (1.0 for non-tiered sub-categories) |
| provider_id | Hospital provider identifier |
| subcategory_price | Volume-weighted avg commercial rate at sub-category level, scaled by multiplier |
| subcategory_medicare_price | Volume-weighted avg Medicare rate at sub-category level |
| ssp_grouper_price | Commercial rate rolled up to SSP level |
| ssp_grouper_medicare_price | Medicare rate rolled up to SSP level |
| subcategory_weight | subcategory_price / base_rate |
| subcategory_medicare_weight | subcategory_medicare_price / base_rate |
| ssp_grouper_weight | ssp_grouper_price / base_rate |
| ssp_grouper_medicare_weight | ssp_grouper_medicare_price / base_rate |
professional_line_codes_2026_03_27
Granularity: one row per (ssp_grouper, sub_category, pos, base_code, line_code, fee_type, provider_id)
Location: tq_dev.internal_dev_csong_ssp
Intermediate table preserving line-code-level detail. Produced by professional_line_codes.py.
Only includes line codes where association_rate IS NULL OR association_rate > 0.3.
| Column | Description |
|---|---|
| ssp_grouper | SSP identifier |
| sub_category | Sub-category (0 = least severe) |
| pos | IP or OP |
| base_code | Anchor billing code (HCPCS/CPT) |
| line_code | Professional/optional fee line code |
| fee_type | Professional Fee or Optional Fee |
| provider_id | Hospital provider identifier |
| units | Unit multiplier applied to rates: anesthesia codes scaled by average_units / 15; anchor codes in the same SSP capped at 1; all others = 1 |
| canonical_rate | COALESCE(validated, backup) commercial rate, multiplied by units |
| canonical_rate_validated | Validated commercial rate (score = 5) |
| canonical_rate_backup | Backup commercial rate (score > 1) |
| n_rates_validated | Number of validated rate observations |
| n_rates_backup | Number of backup rate observations |
| volume | Claim volume |
| medicare_rate | Medicare rate multiplied by units: MPFS state-level rate first, then anesthesia fee schedule, then CLFS national rate as final fallback |
| rate_source | validated or backup |
| association_rate | Relative encounter proportion from claims discovery (NULL for codes from sub_package_contents with no claims data) |
| service_type | Anesthesia, Lab/Path, Radiology, or Professional — from ssp_line_code_service_types |
| line_code_description | Human-readable description from services_spines_cleaned |
| line_code_shorthand | Short label from services_spines_cleaned |
| ccs_category | CCS (Clinical Classification Software) category from services_spines_cleaned |
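
The row filter and the rate columns in this table can be sketched as follows. This is a hedged illustration, not the logic in professional_line_codes.py: all function names are hypothetical, anesthesia codes are identified by service_type as an assumption, and the anchor-code cap on units is left out and treated like the default of 1.

```python
from typing import Optional

ASSOCIATION_THRESHOLD = 0.3
ANESTHESIA_UNIT_DIVISOR = 15


def keep_line_code(association_rate: Optional[float]) -> bool:
    """Keep rows where association_rate IS NULL OR association_rate > 0.3."""
    return association_rate is None or association_rate > ASSOCIATION_THRESHOLD


def unit_multiplier(service_type: str, average_units: Optional[float]) -> float:
    """Anesthesia codes scale by average_units / 15; everything else uses 1."""
    if service_type == "Anesthesia" and average_units is not None:
        return average_units / ANESTHESIA_UNIT_DIVISOR
    return 1.0


def first_available(*rates: Optional[float]) -> Optional[float]:
    """COALESCE: return the first rate that is not None."""
    for rate in rates:
        if rate is not None:
            return rate
    return None


def canonical_rate(validated, backup, units):
    """COALESCE(validated, backup) commercial rate, multiplied by units."""
    base = first_available(validated, backup)
    return None if base is None else base * units


def medicare_rate(mpfs_state, anesthesia_schedule, clfs_national, units):
    """MPFS state rate first, then anesthesia schedule, then CLFS national."""
    base = first_available(mpfs_state, anesthesia_schedule, clfs_national)
    return None if base is None else base * units
```

With illustrative values, an anesthesia code with average_units = 45.0 gets a multiplier of 3.0, so a missing validated rate and a backup rate of 100.0 yield a canonical_rate of 300.0.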
professional_fee_schedule_2026_03_12
Granularity: one row per (ssp_grouper, sub_category, pos, provider_id)
Location: tq_dev.internal_dev_csong_ssp
Aggregated from the professional_line_codes table. Same column structure as the institutional table (prices only, no weights), plus price variants that cover Professional Fee codes only:
| Extra columns | Description |
|---|---|
| subcategory_professional_price | Commercial rate, Professional Fee codes only |
| subcategory_professional_medicare_price | Medicare rate, Professional Fee codes only |
| ssp_grouper_professional_price | Rolled-up commercial rate, Professional Fee only |
| ssp_grouper_professional_medicare_price | Rolled-up Medicare rate, Professional Fee only |
Weights for all columns (institutional and professional) are computed in prices.py.