Spines
Labs Spine
Inclusion Criteria
- The Labs spine was created based on common taxonomy codes associated with Lab NPIs. The Life Sciences team has been using these in their products via the
provider_profilestable, which we then validated in claims data by looking at the taxonomies that commonly bill lab codes, plus made it refreshable.
TAXONOMY_SPECS = [
{'code': '291U00000X'},
{'code': '293D00000X'},
{'code': '292200000X'},
{'code': '291900000X'}
]
- These taxonomy codes are looped through the 15 taxonomy switch columns in
tq_production.reference_legacy.ref_cms_nppes_npi, searching for the firstYprimary taxonomy value per type 2 NPI. If none are reported, it falls back on the first non-null taxonomy found. Taxonomy details come fromtq_production.reference_legacy.ref_nucc_taxonomy_code.
CREATE OR REPLACE TABLE {{ schema }}.tmp_nppes_taxonomy_labs_{{ sub_version }} AS
SELECT
CAST(npi AS VARCHAR) AS npi,
provider_organization_name_legal_business_name,
provider_other_organization_name,
provider_business_practice_location_address_state_name,
provider_first_line_business_practice_location_address,
provider_business_practice_location_address_city_name,
provider_business_practice_location_address_postal_code,
entity_type_code,
{% set num_codes = 15 %}
CASE
{% for i in range(1, num_codes + 1) %}
WHEN healthcare_provider_primary_taxonomy_switch_{{ i }} = 'Y'
AND healthcare_provider_taxonomy_code_{{ i }} IS NOT NULL
THEN healthcare_provider_taxonomy_code_{{ i }}
{% endfor %}
ELSE COALESCE(
{% for i in range(1, num_codes + 1) %}
healthcare_provider_taxonomy_code_{{ i }}{{ "," if i < num_codes else "" }}
{% endfor %}
)
END AS selected_taxonomy_code
FROM {{ ref_cms_nppes_npi }};
- Lab billing is nuanced and requires surfacing rates for all payers, rather than limiting to the payers in the state that the NPI is located. In order to target the high impact labs, we are including a field in spines called
national_payer_coverageto be used as an identifier in Clear Rates. This also allows a clean framework to standardize naming conventions (i.e. Quest Diagnostics Inc and Quest Diagnostics will both group to the same name/provider_id). The inclusion list came from Guardant Health's requested provider list.
NATIONAL_PROVIDERS = {
"QUEST DIAGNOSTICS": "Quest Diagnostics",
"LABORATORY CORP": "LabCorp",
"GUARDANT HEALTH": "Guardant Health",
"BIOREFERENCE": "BioReference",
"MYRIAD GENETIC": "Myriad Genetics",
"TEMPUS": "Tempus",
"CARIS": "Caris",
"NEOGENOMICS": "NeoGenomics",
"INIVATA": "Inivata", # Acquired by NeoGenomics but still in NPPES
"FOUNDATION MEDICINE": "Foundation Medicine",
"NATERA": "Natera",
"BILLIONTOONE": "BillionToOne",
"EXACT SCIENCES": "Exact Sciences",
"INFORM DIAGNOSTICS": "Inform Diagnostics"
}
WITH provider_grouping AS (
-- Map each provider name to its canonical grouped name based on wildcard patterns
SELECT
provider_name,
provider_state,
CASE
{%- for provider in national_providers %}
WHEN UPPER(provider_name) LIKE '{{ provider }}%' THEN '{{ national_providers[provider] }}'
{%- endfor %}
ELSE provider_name
END as canonical_provider_name
FROM {{ schema }}.tmp_nppes_base_labs_{{ sub_version }}
)
CASE
WHEN (
{%- for canonical_name in national_providers.values() %}
tpns.canonical_provider_name = '{{ canonical_name }}'{{ " OR" if not loop.last else "" }}
{%- endfor %}
)
THEN tpns.canonical_provider_name
ELSE REGEXP_REPLACE(
LOWER(tpns.canonical_provider_name),
'(^|[^a-z])([a-z])',
x -> x[1] || UPPER(x[2])
)
END as provider_name,
Structure
- Labs follows a similar structure to the existing spines table, by splitting the NPIs into two tables that can be traced through via
provider_id. - The tables are
tq_production.spines.spines_provider_labsandtq_production.spines.spines_provider_labs_additional_npis. - The primary NPI for a
provider_idthat will be used in the main table is determined by the lowest numericalprovider_idvalue (created by hashing primary NPI). This is grouped by thecanonical_provider_name(cleaned version using `national_providers) and the state. This means that there will be separate provider IDs for Quest NY vs Quest CA for example.
ROW_NUMBER() OVER (PARTITION BY dp.canonical_provider_name, bp.provider_state ORDER BY bp.npi ASC) as primary_rank,
MIN(bp.provider_id) OVER (PARTITION BY dp.canonical_provider_name, bp.provider_state) as group_provider_id
CAST(from_big_endian_64(xxhash64(CAST(
npi
AS varbinary
))) AS VARCHAR) as provider_id,
- The
provider_parent_systemfield will be populated for labs that have more than 5 registered NPIs.
WITH npi_counts AS (
-- Count total NPIs per canonical provider
SELECT
canonical_provider_name,
COUNT(DISTINCT npi) as npi_count
FROM {{ schema }}.tmp_nppes_primary_labs_{{ sub_version }}
GROUP BY canonical_provider_name
)
CASE
WHEN (
{%- for canonical_name in national_providers.values() %}
tpns.canonical_provider_name = '{{ canonical_name }}'{{ " OR" if not loop.last else "" }}
{%- endfor %}
)
AND nc.npi_count > 5
THEN tpns.canonical_provider_name
ELSE NULL
END as provider_parent_system