Skip to main content

/add-provider-type

Scaffold a new provider type into the CLD sub-DAG. Adds spine table entries to params.py, creates a new codeset SQL file, adds the provider CTE to provider.sql, wires up code plausibility in rate_object_space.sql, adds the Medicare benchmark CTE to benchmarks.sql, updates QA answer files, and updates the provider_types list in both init.py files.

Usage

/add-provider-type <provider-type-name>

Provide the provider type name (e.g. Infusion Center, Laboratory). The skill will ask you to confirm several details before making any changes.

This is an experimental skill. Adding a new provider type is inherently case-by-case. The skill provides the structural scaffold; you may need to adapt details to the specific provider type.

What You Need to Provide

Before the skill makes any edits, it will ask you to confirm:

  1. Provider type name — as it should appear in the data (e.g. Infusion Center)
  2. Spine table(s) — fully-qualified Trino table name(s) for the primary spine and optionally an additional NPIs table
  3. Codeset definition — how plausible codes are determined (code list, Medicare reference table, or SQL query)
  4. Medicare reference table — which table provides the benchmark rate (e.g. ASC_MEDICARE_REFERENCE_TABLE, CLINICAL_LAB_REFERENCE_PRICING)
  5. Bill typeOutpatient, Professional, Inpatient, or multiple
  6. Codeset table name — what to name the new intermediate table

What It Does

  1. Examines spine tables via Trino — runs DESCRIBE and sample queries on each new spine table to understand column structure before writing any code.

  2. Updates params.py — adds SPINES_PROVIDER_* entries for each spine table and optionally a codeset table entry.

  3. Creates a codeset SQL file — in core_licensable_data_utils/sql/spines/, using either a Medicare reference table join pattern or a manual code list, with a -- manual additions section for future /add-codes use.

  4. Updates provider.sql — adds an NPI aggregation CTE, a main provider CTE with all required columns (NULL-filled if unavailable), and a UNION entry in the final df CTE.

  5. Updates rate_object_space.sql — adds a new block in code_plausibility_flattened mapping codes to the new provider type.

  6. Updates benchmarks.sql — adds a latest_*_dates CTE, a raw benchmark CTE, a geographic average CTE, CASE branches in the final CTE, and LEFT JOINs.

  7. Updates QA answer files — adds the provider type to PROVIDER_TYPES and PROVIDER_TYPE_COVERAGE_THRESHOLDS in answers.py, cmv_answers.py, and optionally lookback_answers.py.

  8. Updates __init__.py files — adds the provider type string to the provider_types list in both the sub-DAG and orchestrator DEFAULT_PARAMS.

Files Modified

FileChange
utils/params.pyAdd SPINES_PROVIDER_* entries and optionally a codeset table entry
core_licensable_data_utils/sql/spines/<name>_codeset.sqlNew file — creates the plausible codeset
sql/reference/spines/provider.sqlAdd NPI CTE, main provider CTE, UNION entry
sql/reference/rate_object_space.sqlAdd block in code_plausibility_flattened
sql/benchmarks/benchmarks.sqlAdd dates CTE, benchmark CTE, geographic avg CTE, CASE branches, JOINs
qa/answers/answers.pyAdd to PROVIDER_TYPES and PROVIDER_TYPE_COVERAGE_THRESHOLDS
qa/answers/cmv_answers.pyAdd to PROVIDER_TYPES and PROVIDER_TYPE_COVERAGE_THRESHOLDS
qa/answers/lookback_answers.pyAdd if applicable (currently Hospital, Imaging Center, Laboratory only)
__init__.py (sub-DAG)Add to provider_types list
core_licensable_data_orchestrator/__init__.pyAdd to provider_types list

Post-Skill Checklist

After running this skill, the following additional steps are typically needed:

  • Run the new codeset SQL in Trino to materialize the table and verify row counts
  • Run /update-input-schemas to add the new spine tables to input schema checks
  • QA: compare provider counts in the new CTE against expectations
  • QA: check rate object space counts before and after the new provider type
  • Consider whether existing provider type exclusions need updating (e.g. ASC excludes infusion centers)

Reference

Add Provider Type: Reference

Codeset SQL Patterns

All codeset files live in core_licensable_data_utils/sql/spines/ and write to tq_intermediate.cld_utils.*_{{ today }}.

Pattern A: Medicare reference table (e.g. ASC, Lab)

Joins a Medicare/CMS reference pricing table with claims utilization. Best for provider types that have a defined Medicare fee schedule.

CREATE OR REPLACE TABLE tq_intermediate.cld_utils.<name>_codeset_{{ today }} AS
WITH
ref AS (
SELECT hcpcs, approx_percentile(rate, 0.5) as payment_rate
FROM <medicare_reference_table>
WHERE is_latest_start_effective_date = True
GROUP BY 1
),
util AS (
SELECT DISTINCT billing_code, billing_code_type,
sum(total_count_encounters) as total_encounters
FROM tq_intermediate.claims_benchmarks.claims_benchmarks_utilization_national
WHERE <taxonomy filter>
GROUP BY 1, 2
)
SELECT DISTINCT
'<bill_type>' as bill_type,
'HCPCS' as billing_code_type,
util.billing_code,
util.total_encounters * ref.payment_rate as revenue,
CAST(NULL AS Boolean) as facility
FROM ref JOIN util ON util.billing_code = ref.hcpcs
AND ref.payment_rate IS NOT NULL

UNION ALL

-- manual <Name> additions
SELECT '<bill_type>' as bill_type, 'HCPCS' as billing_code_type,
billing_code, NULL as revenue, CAST(NULL AS Boolean) as facility
FROM (SELECT billing_code FROM unnest(array[
-- codes added via /add-codes
]) t(billing_code)) AS manual_codes

Pattern B: Manual code list only

For provider types without a standard Medicare fee schedule.

CREATE OR REPLACE TABLE tq_intermediate.cld_utils.<name>_codeset_{{ today }} AS
SELECT billing_code
FROM UNNEST(ARRAY[
'<code1>', '<code2>', ...
]) t(billing_code)

UNION ALL

-- manual <Name> additions
SELECT billing_code
FROM UNNEST(ARRAY[
-- codes added via /add-codes
]) t(billing_code)

Provider SQL Patterns

Patterns vary by data available in the spine. Use the closest analogue:

Provider TypePatternKey characteristics
HospitalFullCCN, Definitive ID, health system join, active status filter
ASCFullDefinitive ID, health system join, AMSURG specialty join, active status filter, excludes infusion centers
Imaging CenterSimplerDefinitive ID, no CCN, active status filter, services_* from spine
Infusion CenterSimplerNo Definitive ID, EIN array join, national_payer_coverage from spine
LaboratoryAggregatedFilters on core_rates presence OR national_payer_coverage, aggregates by (provider_id, provider_name, state)

Required columns (every CTE must select all of these, NULL-filled if unavailable)

provider_id, npi, definitive_id, provider_name, medicare_provider_id, ein,
city, state, state_name_full, cbsa, cbsa_name, county,
zip_code, zip3, carrier_locality,
street_address, hq_latitude, hq_longitude, website_url,
health_system_id, health_system_name, health_system_type,
health_system_city, health_system_state,
provider_type, provider_type_new, provider_subtype,
taxonomy_code, taxonomy_grouping, taxonomy_classification, taxonomy_specialization,
total_beds, operating_room_count,
services_lab, services_radiology, services_cardiology, services_dental,
services_dermatology, services_ent, services_general_other, services_endoscopy,
services_nephrology, services_neurology, services_obgyn, services_opthalmology,
services_orthopedic, services_pain, services_plastic_surgery, services_podiatry,
services_pulmonary, services_urology,
services_mra, services_ct, services_mri, services_mammography, services_cta,
services_xray, services_ultrasound, services_spect, services_pet, services_imrt,
national_payer_coverage

Rate Object Space Patterns

The code_plausibility_flattened CTE in rate_object_space.sql maps codes to provider types. Each UNION ALL block adds one mapping.

Standard codeset pattern (most provider types)

UNION ALL

SELECT DISTINCT
cs.bill_type, cs.billing_code_type, cs.billing_code, cs.facility,
'<Provider Type>' as provider_type,
NULL as provider_subtype,
NULL as provider_id,
'' as billing_code_modifier
FROM {{ schema_name }}.tmp_ref_code_spine_{{ sub_version }} cs
WHERE cs.bill_type = '<bill_type>'
AND cs.billing_code IN (
SELECT billing_code FROM {{ cld_params.Tables.<CODESET_KEY>.value }}
)

Provider subtype pattern (Infusion Center)

Used when the provider type is expressed as provider_subtype rather than provider_type:

SELECT DISTINCT
cs.bill_type, cs.billing_code_type, cs.billing_code, cs.facility,
NULL as provider_type,
'<Provider Subtype>' as provider_subtype,
NULL as provider_id,
'' as billing_code_modifier
FROM {{ schema_name }}.tmp_ref_code_spine_{{ sub_version }} cs
WHERE cs.bill_type = 'Professional'
AND cs.is_drug_code = True
AND cs.facility = False

Plausibility table pattern (ASC)

Used when codes are determined by a separate per-provider plausibility table:

SELECT DISTINCT
asc.bill_type, asc.billing_code_type, asc.billing_code, asc.facility,
'ASC' as provider_type,
NULL as provider_subtype,
asc.provider_id,
'' as billing_code_modifier
FROM {{ schema_name }}.tmp_ref_asc_plausibility_{{ sub_version }} asc
WHERE asc.is_plausible = True

Existing Provider Types (for reference)

provider_type valueSpine key(s) in params.pyCodesetBill type
HospitalSPINES_PROVIDER_HOSPITAL, ..._ADDITIONAL_NPISOUTPATIENT_CODESETInpatient, Outpatient
ASCSPINES_PROVIDER_ASC, ..._ADDITIONAL_NPISASC_CODESET (via plausibility)Outpatient
Imaging CenterSPINES_PROVIDER_IMAGING_CENTERSSSP packagesProfessional
Physician GroupSPINES_PROVIDER_PHYSICIAN_GROUPS + 3 associated tablesCLD_PHYSICIAN_GROUP_CODESETProfessional
LaboratorySPINES_PROVIDER_LABS, ..._ADDITIONAL_NPISLAB_HOSPITAL_AND_PGS_CODESETProfessional
Infusion CenterSPINES_PROVIDER_INFUSION_CENTERS, ..._ADDITIONAL_NPISis_drug_code = TrueProfessional

Benchmarks SQL Patterns

benchmarks.sql has a distinct CTE for each provider type's Medicare benchmark source. The pattern varies by whether the rate is provider-specific or national.

National rate (e.g. Laboratory — Clinical Lab Fee Schedule)

laboratories AS (
SELECT hcpcs AS billing_code, ROUND(rate, 2) as rate
FROM {{ cld_params.Tables.CLINICAL_LAB_REFERENCE_PRICING.value }} lab
WHERE is_latest_start_effective_date = True
AND modifier = '' AND rate > 0
),
lab_geographic_avg AS (
SELECT billing_code,
ROUND(approx_percentile(rate, 0.5), 2) as lab_avg_medicare_rate,
ROUND(approx_percentile(rate, 0.5), 2) as national_avg_rate
FROM laboratories GROUP BY 1
),

Provider-specific rate (e.g. ASC — ASC reference pricing by NPI)

Rate varies by provider. Requires a join to the provider spine to get provider_id. Produces state/cbsa/national geographic averages.