Tier Assignment

Once each SSP × code has an intensity score, 1-D KMeans clustering partitions codes into discrete tiers.


1-D KMeans

Clustering on the scalar intensity score rather than the full 14-dimensional feature matrix has two key advantages:

Ordered by construction. Cluster centroids lie on a single axis, so sorting them gives an unambiguous ordering in which tier 1 is the least intensive and tier $k$ the most. Clusters fitted in the full feature space have no natural order, so their labels would need post-hoc reordering to be interpretable. This ordering relies on PC1 being oriented correctly, so check its orientation before assigning tiers.

After fitting, tiers are relabelled so that tier 1 has the lowest centroid and tier $k$ the highest.
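The fit-then-relabel step can be sketched as follows. This is a minimal illustration using scikit-learn's `KMeans`, not necessarily the pipeline's own code; the function name `assign_tiers`, the `random_state` default, and the sample scores are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans


def assign_tiers(scores, k, random_state=0):
    """Cluster 1-D intensity scores into k tiers numbered 1 (lowest) to k (highest).

    Hypothetical helper for illustration only.
    """
    # KMeans expects a 2-D array, so the scalar scores become a single column.
    x = np.asarray(scores, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(x)

    # Raw KMeans labels are arbitrary, so rank the clusters by centroid value.
    order = np.argsort(km.cluster_centers_.ravel())  # cluster ids, lowest centroid first
    tier_of_cluster = np.empty(k, dtype=int)
    tier_of_cluster[order] = np.arange(1, k + 1)     # cluster id -> tier number
    return tier_of_cluster[km.labels_]


# Made-up intensity scores forming three well-separated groups.
scores = [0.1, 0.2, 0.15, 2.0, 2.1, 5.0, 5.2]
tiers = assign_tiers(scores, k=3)
print(tiers.tolist())  # [1, 1, 1, 2, 2, 3, 3]
```

Because the relabelling only sorts centroids, it costs nothing extra and removes any dependence on the order in which KMeans happens to number its clusters.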


Automatic $k$ selection

Rather than using a fixed number of tiers, the optimal $k$ is selected per SSP using the silhouette score:

$$s(\text{code}) = \frac{b - a}{\max(a, b)}$$

where $a$ is the mean distance from the code to the other codes in its own cluster and $b$ is the mean distance from the code to the codes in the nearest other cluster, both measured on the 1-D axis. The mean silhouette score across all codes is computed for $k \in \{2, 3, 4, 5, 6\}$, and the value of $k$ that maximises it is chosen.
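A minimal sketch of this selection loop, assuming scikit-learn's `KMeans` and `silhouette_score` (whose definition of the silhouette matches the formula above). The function name `select_k`, the guard conditions, and the sample scores are assumptions for illustration, not the pipeline's confirmed implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def select_k(scores, candidates=(2, 3, 4, 5, 6), random_state=0):
    """Return (best_k, {k: mean silhouette}) for 1-D scores. Hypothetical helper."""
    x = np.asarray(scores, dtype=float).reshape(-1, 1)
    results = {}
    for k in candidates:
        # silhouette_score needs between 2 and n_samples - 1 distinct labels.
        if k >= len(x):
            continue
        labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(x)
        if len(np.unique(labels)) < 2:
            continue
        results[k] = silhouette_score(x, labels)  # mean silhouette over all codes
    if not results:
        raise ValueError("too few distinct scores to evaluate any candidate k")
    best_k = max(results, key=results.get)
    return best_k, results


# Made-up scores with three clear groups, so k = 3 should score highest.
scores = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
best_k, results = select_k(scores)
print(best_k)  # 3
```

An SSP with very few codes cannot support every candidate value, which is why the sketch skips any $k$ that is not smaller than the number of codes.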


Validation against CMS severity (CM codes only)

For ICD-10-CM codes, the assigned tier is cross-tabulated against the CMS SDx severity label (MCC / CC / No CC/MCC). Well-calibrated tiers show a strong monotonic pattern:

| CMS severity | Tier 1 (Low) | Tier k (High) |
|---|---|---|
| No CC/MCC | dominant | rare |
| CC | mixed | mixed |
| MCC | rare | dominant |

A strong concentration along the diagonal of this cross-tab (No CC/MCC in the lowest tier, MCC in the highest) is evidence that the RII tiers are tracking true clinical severity rather than an artefact of the data.
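The cross-tabulation can be sketched with pandas as below. The column names `tier` and `severity`, the rank mapping, and the eight rows of data are invented for illustration; the monotonicity check on mean severity rank is one simple option, not necessarily the check the report uses.

```python
import pandas as pd

# Made-up example: one row per ICD-10-CM code with its tier and CMS severity label.
df = pd.DataFrame({
    "tier": [1, 1, 1, 2, 2, 3, 3, 3],
    "severity": ["No CC/MCC", "No CC/MCC", "CC", "CC", "No CC/MCC", "MCC", "MCC", "CC"],
})
severity_order = ["No CC/MCC", "CC", "MCC"]

# Share of each severity label within each tier (every column sums to 1).
xtab = pd.crosstab(df["severity"], df["tier"], normalize="columns").reindex(severity_order)
print(xtab.round(2))

# Simple monotonicity check: mean severity rank should rise with the tier number.
severity_rank = df["severity"].map({label: i for i, label in enumerate(severity_order)})
mean_rank = severity_rank.groupby(df["tier"]).mean()
print(mean_rank.is_monotonic_increasing)  # True
```

Normalising by column makes tiers of different sizes comparable, so the dominant label in each tier can be read off directly.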

PCS codes have no CMS severity labels; their tier quality is assessed through the feature profiles and scatter plots (intensity vs. LOS, intensity vs. charge) in the per-SSP report.