/add-provider-type
Scaffold a new provider type into the CLD sub-DAG. Adds spine table entries to params.py, creates a new codeset SQL file, adds the provider CTE to provider.sql, wires up code plausibility in rate_object_space.sql, adds the Medicare benchmark CTE to benchmarks.sql, updates QA answer files, and updates the provider_types list in both init.py files.
Usage
/add-provider-type <provider-type-name>
Provide the provider type name (e.g. Infusion Center, Laboratory). The skill will ask you to confirm several details before making any changes.
This is an experimental skill. Adding a new provider type is inherently case-by-case. The skill provides the structural scaffold; you may need to adapt details to the specific provider type.
What You Need to Provide
Before the skill makes any edits, it will ask you to confirm:
- Provider type name — as it should appear in the data (e.g.
Infusion Center) - Spine table(s) — fully-qualified Trino table name(s) for the primary spine and optionally an additional NPIs table
- Codeset definition — how plausible codes are determined (code list, Medicare reference table, or SQL query)
- Medicare reference table — which table provides the benchmark rate (e.g.
ASC_MEDICARE_REFERENCE_TABLE,CLINICAL_LAB_REFERENCE_PRICING) - Bill type —
Outpatient,Professional,Inpatient, or multiple - Codeset table name — what to name the new intermediate table
What It Does
-
Examines spine tables via Trino — runs
DESCRIBEand sample queries on each new spine table to understand column structure before writing any code. -
Updates
params.py— addsSPINES_PROVIDER_*entries for each spine table and optionally a codeset table entry. -
Creates a codeset SQL file — in
core_licensable_data_utils/sql/spines/, using either a Medicare reference table join pattern or a manual code list, with a-- manual additionssection for future/add-codesuse. -
Updates
provider.sql— adds an NPI aggregation CTE, a main provider CTE with all required columns (NULL-filled if unavailable), and a UNION entry in the finaldfCTE. -
Updates
rate_object_space.sql— adds a new block incode_plausibility_flattenedmapping codes to the new provider type. -
Updates
benchmarks.sql— adds alatest_*_datesCTE, a raw benchmark CTE, a geographic average CTE, CASE branches in the final CTE, and LEFT JOINs. -
Updates QA answer files — adds the provider type to
PROVIDER_TYPESandPROVIDER_TYPE_COVERAGE_THRESHOLDSinanswers.py,cmv_answers.py, and optionallylookback_answers.py. -
Updates
__init__.pyfiles — adds the provider type string to theprovider_typeslist in both the sub-DAG and orchestratorDEFAULT_PARAMS.
Files Modified
| File | Change |
|---|---|
utils/params.py | Add SPINES_PROVIDER_* entries and optionally a codeset table entry |
core_licensable_data_utils/sql/spines/<name>_codeset.sql | New file — creates the plausible codeset |
sql/reference/spines/provider.sql | Add NPI CTE, main provider CTE, UNION entry |
sql/reference/rate_object_space.sql | Add block in code_plausibility_flattened |
sql/benchmarks/benchmarks.sql | Add dates CTE, benchmark CTE, geographic avg CTE, CASE branches, JOINs |
qa/answers/answers.py | Add to PROVIDER_TYPES and PROVIDER_TYPE_COVERAGE_THRESHOLDS |
qa/answers/cmv_answers.py | Add to PROVIDER_TYPES and PROVIDER_TYPE_COVERAGE_THRESHOLDS |
qa/answers/lookback_answers.py | Add if applicable (currently Hospital, Imaging Center, Laboratory only) |
__init__.py (sub-DAG) | Add to provider_types list |
core_licensable_data_orchestrator/__init__.py | Add to provider_types list |
Post-Skill Checklist
After running this skill, the following additional steps are typically needed:
- Run the new codeset SQL in Trino to materialize the table and verify row counts
- Run
/update-input-schemasto add the new spine tables to input schema checks - QA: compare provider counts in the new CTE against expectations
- QA: check rate object space counts before and after the new provider type
- Consider whether existing provider type exclusions need updating (e.g. ASC excludes infusion centers)
Reference
Add Provider Type: Reference
Codeset SQL Patterns
All codeset files live in core_licensable_data_utils/sql/spines/ and write to tq_intermediate.cld_utils.*_{{ today }}.
Pattern A: Medicare reference table (e.g. ASC, Lab)
Joins a Medicare/CMS reference pricing table with claims utilization. Best for provider types that have a defined Medicare fee schedule.
CREATE OR REPLACE TABLE tq_intermediate.cld_utils.<name>_codeset_{{ today }} AS
WITH
ref AS (
SELECT hcpcs, approx_percentile(rate, 0.5) as payment_rate
FROM <medicare_reference_table>
WHERE is_latest_start_effective_date = True
GROUP BY 1
),
util AS (
SELECT DISTINCT billing_code, billing_code_type,
sum(total_count_encounters) as total_encounters
FROM tq_intermediate.claims_benchmarks.claims_benchmarks_utilization_national
WHERE <taxonomy filter>
GROUP BY 1, 2
)
SELECT DISTINCT
'<bill_type>' as bill_type,
'HCPCS' as billing_code_type,
util.billing_code,
util.total_encounters * ref.payment_rate as revenue,
CAST(NULL AS Boolean) as facility
FROM ref JOIN util ON util.billing_code = ref.hcpcs
AND ref.payment_rate IS NOT NULL
UNION ALL
-- manual <Name> additions
SELECT '<bill_type>' as bill_type, 'HCPCS' as billing_code_type,
billing_code, NULL as revenue, CAST(NULL AS Boolean) as facility
FROM (SELECT billing_code FROM unnest(array[
-- codes added via /add-codes
]) t(billing_code)) AS manual_codes
Pattern B: Manual code list only
For provider types without a standard Medicare fee schedule.
CREATE OR REPLACE TABLE tq_intermediate.cld_utils.<name>_codeset_{{ today }} AS
SELECT billing_code
FROM UNNEST(ARRAY[
'<code1>', '<code2>', ...
]) t(billing_code)
UNION ALL
-- manual <Name> additions
SELECT billing_code
FROM UNNEST(ARRAY[
-- codes added via /add-codes
]) t(billing_code)
Provider SQL Patterns
Patterns vary by data available in the spine. Use the closest analogue:
| Provider Type | Pattern | Key characteristics |
|---|---|---|
| Hospital | Full | CCN, Definitive ID, health system join, active status filter |
| ASC | Full | Definitive ID, health system join, AMSURG specialty join, active status filter, excludes infusion centers |
| Imaging Center | Simpler | Definitive ID, no CCN, active status filter, services_* from spine |
| Infusion Center | Simpler | No Definitive ID, EIN array join, national_payer_coverage from spine |
| Laboratory | Aggregated | Filters on core_rates presence OR national_payer_coverage, aggregates by (provider_id, provider_name, state) |
Required columns (every CTE must select all of these, NULL-filled if unavailable)
provider_id, npi, definitive_id, provider_name, medicare_provider_id, ein,
city, state, state_name_full, cbsa, cbsa_name, county,
zip_code, zip3, carrier_locality,
street_address, hq_latitude, hq_longitude, website_url,
health_system_id, health_system_name, health_system_type,
health_system_city, health_system_state,
provider_type, provider_type_new, provider_subtype,
taxonomy_code, taxonomy_grouping, taxonomy_classification, taxonomy_specialization,
total_beds, operating_room_count,
services_lab, services_radiology, services_cardiology, services_dental,
services_dermatology, services_ent, services_general_other, services_endoscopy,
services_nephrology, services_neurology, services_obgyn, services_opthalmology,
services_orthopedic, services_pain, services_plastic_surgery, services_podiatry,
services_pulmonary, services_urology,
services_mra, services_ct, services_mri, services_mammography, services_cta,
services_xray, services_ultrasound, services_spect, services_pet, services_imrt,
national_payer_coverage
Rate Object Space Patterns
The code_plausibility_flattened CTE in rate_object_space.sql maps codes to provider types. Each UNION ALL block adds one mapping.
Standard codeset pattern (most provider types)
UNION ALL
SELECT DISTINCT
cs.bill_type, cs.billing_code_type, cs.billing_code, cs.facility,
'<Provider Type>' as provider_type,
NULL as provider_subtype,
NULL as provider_id,
'' as billing_code_modifier
FROM {{ schema_name }}.tmp_ref_code_spine_{{ sub_version }} cs
WHERE cs.bill_type = '<bill_type>'
AND cs.billing_code IN (
SELECT billing_code FROM {{ cld_params.Tables.<CODESET_KEY>.value }}
)
Provider subtype pattern (Infusion Center)
Used when the provider type is expressed as provider_subtype rather than provider_type:
SELECT DISTINCT
cs.bill_type, cs.billing_code_type, cs.billing_code, cs.facility,
NULL as provider_type,
'<Provider Subtype>' as provider_subtype,
NULL as provider_id,
'' as billing_code_modifier
FROM {{ schema_name }}.tmp_ref_code_spine_{{ sub_version }} cs
WHERE cs.bill_type = 'Professional'
AND cs.is_drug_code = True
AND cs.facility = False
Plausibility table pattern (ASC)
Used when codes are determined by a separate per-provider plausibility table:
SELECT DISTINCT
asc.bill_type, asc.billing_code_type, asc.billing_code, asc.facility,
'ASC' as provider_type,
NULL as provider_subtype,
asc.provider_id,
'' as billing_code_modifier
FROM {{ schema_name }}.tmp_ref_asc_plausibility_{{ sub_version }} asc
WHERE asc.is_plausible = True
Existing Provider Types (for reference)
provider_type value | Spine key(s) in params.py | Codeset | Bill type |
|---|---|---|---|
Hospital | SPINES_PROVIDER_HOSPITAL, ..._ADDITIONAL_NPIS | OUTPATIENT_CODESET | Inpatient, Outpatient |
ASC | SPINES_PROVIDER_ASC, ..._ADDITIONAL_NPIS | ASC_CODESET (via plausibility) | Outpatient |
Imaging Center | SPINES_PROVIDER_IMAGING_CENTERS | SSP packages | Professional |
Physician Group | SPINES_PROVIDER_PHYSICIAN_GROUPS + 3 associated tables | CLD_PHYSICIAN_GROUP_CODESET | Professional |
Laboratory | SPINES_PROVIDER_LABS, ..._ADDITIONAL_NPIS | LAB_HOSPITAL_AND_PGS_CODESET | Professional |
Infusion Center | SPINES_PROVIDER_INFUSION_CENTERS, ..._ADDITIONAL_NPIS | is_drug_code = True | Professional |
Benchmarks SQL Patterns
benchmarks.sql has a distinct CTE for each provider type's Medicare benchmark source. The pattern varies by whether the rate is provider-specific or national.
National rate (e.g. Laboratory — Clinical Lab Fee Schedule)
laboratories AS (
SELECT hcpcs AS billing_code, ROUND(rate, 2) as rate
FROM {{ cld_params.Tables.CLINICAL_LAB_REFERENCE_PRICING.value }} lab
WHERE is_latest_start_effective_date = True
AND modifier = '' AND rate > 0
),
lab_geographic_avg AS (
SELECT billing_code,
ROUND(approx_percentile(rate, 0.5), 2) as lab_avg_medicare_rate,
ROUND(approx_percentile(rate, 0.5), 2) as national_avg_rate
FROM laboratories GROUP BY 1
),
Provider-specific rate (e.g. ASC — ASC reference pricing by NPI)
Rate varies by provider. Requires a join to the provider spine to get provider_id. Produces state/cbsa/national geographic averages.