Price Transparency

Provider Pricing and Health Policy

Methodology

This project was generously funded by grant # 2222433 from the National Science Foundation.

Introduction

Starting January 1, 2021, almost all hospitals in the United States have been required to publicly post machine-readable data files of their “standard charges.” These data files are intended to consist of five distinct data elements for all items and services that hospitals provide: the gross charge, discounted cash price, de-identified minimum negotiated rate, de-identified maximum negotiated rate, and the payer-specific negotiated rate.

This landmark regulation sheds light on previously inaccessible prices: in particular, the prices negotiated between hospitals and individual payers. Thus, for example, it is now possible to know the price of a brain MRI at a particular hospital for payers such as Cigna, United Health, Blue Cross Blue Shield, and others. Prior to this regulation, these data were not publicly accessible.

While these data can potentially help researchers, consumers, and other stakeholders better understand the market for hospital services, they are not easily aggregated or synthesized. Each hospital posts its raw data in its own format, so files can differ radically from hospital to hospital.

For this project, we collected a sample of available hospital data. We then processed, synthesized, and documented these data and are posting them publicly. The resulting state-level files use a common data model, which facilitates cross-hospital and cross-state comparisons. Below we discuss the sample selection, cleaning methodology, and recommended uses of the data.

Sample Selection

The initial sample for this project consists of hospitals in Medicaid expansion/non-expansion state pairs. Hilltop collected data for all hospitals that lie within 50 miles of either side of a Medicaid expansion/non-expansion state border as of October 2022.

This amounted to a total of 782 hospitals across 22 states: Arkansas, Colorado, Iowa, Idaho, Kansas, Kentucky, Louisiana, Minnesota, Mississippi, Missouri, Montana, Nebraska, New Mexico, North Carolina, North Dakota, Oklahoma, South Dakota, Tennessee, Texas, Utah, Virginia, and Wyoming.

We included every hospital for which a) we could locate data during the initial data collection period (late 2022-early 2023), b) the hospital’s standard charge file was in a machine-readable format, consistent with the regulations, and c) the hospital’s standard charge file contained payer-specific negotiated rates for at least one third-party payer. We did not include hospitals with only chargemasters, although future iterations of this database may do so. Additionally, we only included those pricing observations with a standardized procedure code (e.g., a Current Procedural Terminology code). We did not include pricing observations without an associated standardized procedure code or those with only an associated internal code (i.e., a hospital-specific code). Finally, for the first release, we did not include data on prescription drugs (i.e., identified by National Drug Codes).
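
The observation-level rules above (keep only prices tied to a standardized procedure code; drop internal codes and National Drug Codes) can be sketched as a simple filter. This is a minimal Python illustration, not Hilltop's actual code: the `keep_observation` helper, the field names, and the set of code-type labels are assumptions made for the example.

```python
from typing import Dict

# Code types treated as standardized in this sketch. The labels are an
# assumption; real files may spell them differently.
STANDARD_CODE_TYPES = {"CPT", "HCPCS", "MS-DRG", "APR-DRG", "APC"}


def keep_observation(row: Dict[str, str]) -> bool:
    """Return True if the row has a non-empty code of a standardized type."""
    code = (row.get("code") or "").strip()
    code_type = (row.get("code_type") or "").strip().upper()
    return bool(code) and code_type in STANDARD_CODE_TYPES


# Made-up rows: a CPT code, a drug code, a hospital-internal code, and a
# row with no code at all.
rows = [
    {"code": "99285", "code_type": "CPT"},
    {"code": "00000-0000-00", "code_type": "NDC"},
    {"code": "A1234567", "code_type": "Internal"},
    {"code": "", "code_type": "CPT"},
]

kept = [row for row in rows if keep_observation(row)]
print(len(kept))
```

Only the first row passes: the drug code and the internal code fail the code-type check, and the last row has no code.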

Methodology for Data Standardization

While almost all hospitals in the United States are required to post machine-readable “standard charge” data files, hospitals are—as of the time of this writing—given substantial latitude in determining the structure and content of their data files. The result of this decentralization is a profusion of data files, the integration of which requires substantial effort.

In this project, Hilltop transforms hospital price transparency data sets into a common data model and then synthesizes these disparate hospital-level data sets to the state level. Hilltop based its standardization logic on the recommended “Tall” data format (version 1.1) for hospital price transparency data published by the Centers for Medicare & Medicaid Services in June 2023.
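
To illustrate what a "tall" layout means in practice, the following is a minimal Python sketch, not Hilltop's actual code. It assumes a hypothetical "wide" raw file in which each price type has its own column. The column names, the `wide_to_tall` helper, and the prices in the sample row are all invented for the example.

```python
from typing import Dict, Iterator, List

# Columns that identify the item or service; every other column is assumed
# to hold a price for one price type (hypothetical raw layout).
ID_COLUMNS: List[str] = ["code", "code_type", "description"]


def wide_to_tall(row: Dict[str, str]) -> Iterator[Dict[str, object]]:
    """Yield one tall record per non-empty price column in a wide row."""
    ids = {col: row.get(col, "") for col in ID_COLUMNS}
    for column, value in row.items():
        if column in ID_COLUMNS:
            continue
        text = (value or "").strip().replace("$", "").replace(",", "")
        if not text:
            continue  # no price posted for this price type
        try:
            price = float(text)
        except ValueError:
            continue  # skip non-numeric entries such as "N/A"
        yield {**ids, "price_type": column, "price": price}


# Made-up example row; the prices are illustrative only.
wide_row = {
    "code": "70551",
    "code_type": "CPT",
    "description": "MRI BRAIN W/O CONTRAST",
    "Gross Charge": "$3,000.00",
    "Cash Price": "1500",
    "Payer A": "1200.50",
    "Payer B": "",
}

tall_rows = list(wide_to_tall(wide_row))
for record in tall_rows:
    print(record["price_type"], record["price"])
```

Each output record carries one price and one price type, so files that started with different column layouts can be stacked into a single table.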

In each state-level data set, an observation is a single price for an item or service. It is identified by the combination of hospital, procedure code, code type, price type, setting, billing class, and revenue code.

For example, at Campbell County Memorial Hospital in Wyoming, the price of CPT code 99285 (description: “TRAUMA B ACTIVATION”) for revenue center 0683 for Cigna is $2,886.575.
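
The unit of observation can be sketched in Python as a record plus the tuple of dimensions that identifies one price. This is only an illustration: the `observation_key` helper is hypothetical, the field names follow the variable list in this document, and values the example does not state (such as `ccn`) are left as `None` rather than guessed.

```python
from typing import Dict, Optional, Tuple

# Dimensions that, together, identify a single price in the processed data.
KEY_FIELDS = (
    "ccn", "code", "code_type", "price_type",
    "setting", "billing_class", "rev_code",
)


def observation_key(row: Dict[str, object]) -> Tuple[Optional[object], ...]:
    """Return the identifying dimensions of one observation, in KEY_FIELDS order."""
    return tuple(row.get(field) for field in KEY_FIELDS)


# The Campbell County Memorial Hospital example from the text. Fields the
# text does not state are set to None instead of being filled in.
example = {
    "ccn": None,
    "hospital_name": "Campbell County Memorial Hospital",
    "state": "WY",
    "code": "99285",
    "code_type": "CPT",
    "description": "TRAUMA B ACTIVATION",
    "rev_code": "0683",  # kept as written in the text
    "setting": None,
    "billing_class": None,
    "price_type": "Cigna",
    "price": 2886.575,
}

print(observation_key(example))
```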

At times, hospitals provided additional pricing information that we included in the standardized data files. Typically, but not always, this information consists of procedure code modifiers.

The final data sets contain the fields defined in the following table.

Variable List and Description

| Variable | Type | Description | Optional |
| --- | --- | --- | --- |
| ccn | numeric | Provider identifier for a hospital facility. This is the six-digit CMS Certification Number (CCN). It can be used to link hospital price transparency data to other data sources with hospital-specific information. This is also called the Medicaid/Medicare provider number, the OSCAR provider number, the Medicare Identification Number, or the Provider Number. | No |
| hospital_name | string | This is the name of the hospital as described in the CMS Provider of Services Current Files. It is important to note that because hospital names can change over time, individuals seeking to link hospital pricing data with other data files would be better served using “ccn” as a primary linking variable. | No |
| state | string | The two-letter abbreviation of the state in which the hospital is located. | No |
| date_of_file | string | Date when the file was published. Hospitals are not consistent in publishing this information. Collected where present. | Yes |
| code | string | Hospital procedure code. Every observation in these research data sets corresponds to a standardized procedure code. Where necessary, we clean out certain text (for example, “MS” or “CPT”) to leave only alphanumeric procedure codes. | No |
| code_type | string | This is the type of procedure code for this particular price. There are many different coding systems used in health care billing, but the primary code types are HCPCS/CPT codes, MS-DRG codes (which indicate certain inpatient procedures), APR-DRG codes (which also indicate certain inpatient procedures but which are distinct from MS-DRG codes), and APC codes (which indicate certain outpatient procedures). | No |
| description | string | This is a plain-text description of the item or service for which a price is provided. These are not standardized, either within or across hospitals. | No |
| rev_code | numeric | This indicates the cost center for the particular price: that is, where in the hospital the item or service is provided or performed. Revenue codes are standardized. | Yes |
| setting | string | Indicates where in the hospital the particular item or service was provided or performed. This is indicated infrequently in the files, but we standardized this field to be “Inpatient” or “Outpatient.” | Yes |
| billing_class | string | Designates whether the price is for a professional (e.g., physician) or facility (e.g., hospital) item/service. Standardized to be “Facility,” “Professional,” or “ ” when not available. | Yes |
| price_type | string | Designates type of price for a particular item or service. In general, there are three types of prices. “Gross Charge” indicates a hospital charge; “Cash Price” indicates the discounted cash price charged to self-pay patients; and all other price types are the negotiated rates for third-party payers. Other than to standardize the nomenclature for Gross Charge and Cash Price and to remove uninformative text (such as “Negotiated Rate”) from this field, we have not edited the contents of this field. | No |
| price | numeric | Price listed for a given price type for a particular item or service. | No |
| date_added | string | Date when the raw data file was collected. | No |
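
As a hedged illustration of linking on “ccn,” the Python sketch below joins price records to a second, hospital-level table. Because “ccn” is stored as a numeric, leading zeros can be lost, so the sketch pads the value back to six characters before matching. The `normalize_ccn` and `link_on_ccn` helpers, the `beds` field, and all sample values are hypothetical.

```python
from typing import Dict, Iterable, Iterator


def normalize_ccn(value: object) -> str:
    """Render a CCN as a six-character, zero-padded string."""
    text = str(value).strip()
    if text.endswith(".0"):  # e.g. a CCN that was read in as a float
        text = text[:-2]
    return text.zfill(6)


def link_on_ccn(
    prices: Iterable[Dict[str, object]],
    hospitals: Iterable[Dict[str, object]],
) -> Iterator[Dict[str, object]]:
    """Inner-join price rows to hospital rows on the normalized CCN."""
    lookup = {normalize_ccn(h["ccn"]): h for h in hospitals}
    for row in prices:
        ccn = normalize_ccn(row["ccn"])
        match = lookup.get(ccn)
        if match is not None:
            extra = {k: v for k, v in match.items() if k != "ccn"}
            yield {**row, **extra, "ccn": ccn}


# Made-up sample data; neither the CCN nor the bed count is real.
prices = [{"ccn": 41234, "code": "99285", "price_type": "Payer A", "price": 100.0}]
hospitals = [{"ccn": "041234", "beds": 120}]

linked = list(link_on_ccn(prices, hospitals))
print(linked[0]["ccn"], linked[0]["beds"])
```

Normalizing both sides before the join avoids silently dropping hospitals whose identifier begins with a zero.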

Recommended Uses

These data are intended to be used by researchers, policymakers, regulators, or other stakeholders. Users should note that, except for purposes of publishing in an academic, peer-reviewed journal, redistribution without permission is prohibited. Please contact us with any redistribution requests.

Additionally, users of these data agree to cite the data in all articles, press releases, or other publications in which data obtained from this site are published. Such acknowledgement shall reference only the specific data obtained from this site and no other data, and should read as follows: “The data set forth at [INSERT REFERENCE TO LOCATION OF DATA OBTAINED FROM THIS SITE] of publication/press release was obtained from The Hilltop Institute website.” You must also use the following citation:

Henderson, M., & Mouslim, M. (2023). The Hilltop Institute Hospital Price Transparency Data Set: Version 1. Baltimore, MD: UMBC.

Researchers can use these files in various ways. Although the files do not cover every hospital within a given state, they can potentially be used to examine price variation across payers within a hospital, across hospitals in a given state in the sample, or across all hospitals in the sample.
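
As one example of the within-hospital comparison, the Python sketch below summarizes the spread of negotiated rates for each hospital and procedure code. It is a minimal illustration under stated assumptions: the `payer_price_spread` helper is hypothetical, the rows are made up, and any price type other than “Gross Charge” or “Cash Price” is treated as a negotiated rate, following the description of price_type in the variable list.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

NON_NEGOTIATED = {"Gross Charge", "Cash Price"}
Key = Tuple[str, str, str]


def payer_price_spread(rows: Iterable[Dict[str, object]]) -> Dict[Key, Dict[str, Optional[float]]]:
    """Summarize negotiated rates for each (ccn, code_type, code) group."""
    grouped: Dict[Key, List[float]] = defaultdict(list)
    for row in rows:
        if row["price_type"] in NON_NEGOTIATED:
            continue  # keep only negotiated rates
        key = (str(row["ccn"]), str(row["code_type"]), str(row["code"]))
        grouped[key].append(float(row["price"]))

    summary: Dict[Key, Dict[str, Optional[float]]] = {}
    for key, prices in grouped.items():
        low, high = min(prices), max(prices)
        summary[key] = {
            "n_rates": len(prices),
            "min": low,
            "max": high,
            # Ratio is undefined when the lowest rate is zero or negative.
            "max_to_min": high / low if low > 0 else None,
        }
    return summary


# Made-up rows for one hospital; payer names and prices are illustrative.
rows = [
    {"ccn": "041234", "code_type": "CPT", "code": "99285", "price_type": "Gross Charge", "price": 500.0},
    {"ccn": "041234", "code_type": "CPT", "code": "99285", "price_type": "Payer A", "price": 100.0},
    {"ccn": "041234", "code_type": "CPT", "code": "99285", "price_type": "Payer B", "price": 200.0},
    {"ccn": "041234", "code_type": "CPT", "code": "70551", "price_type": "Payer A", "price": 50.0},
]

summary = payer_price_spread(rows)
print(summary[("041234", "CPT", "99285")])
```

Because setting, billing class, and revenue code can also vary within a group, a more careful comparison would add those fields to the grouping key.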

Even after processing and standardization, these data remain challenging to use. File sizes can be large (over 1 gigabyte for certain states), and users will typically need specialized software, such as Stata or SAS, to import and process these files.
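
For users working in a general-purpose language, one way to handle large files is to read them row by row rather than loading everything into memory. The Python sketch below assumes the state files are comma-delimited text with a header row, which this document does not specify. The `stream_prices` helper, the file name in the closing comment, and the sample values are hypothetical.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def stream_prices(handle: TextIO, code: str) -> Iterator[Dict[str, str]]:
    """Yield rows for one procedure code without loading the whole file."""
    for row in csv.DictReader(handle):
        if row.get("code") == code:
            yield row


# In-memory stand-in for a state-level file; all values are made up.
sample = io.StringIO(
    "ccn,code,code_type,price_type,price\n"
    "41234,99285,CPT,Gross Charge,500.00\n"
    "41234,70551,CPT,Payer A,1200.50\n"
    "41234,99285,CPT,Payer A,310.25\n"
)

matches = list(stream_prices(sample, "99285"))
print(len(matches))

# For a real file, open it and pass the handle instead, for example:
# with open("state_file.csv", newline="", encoding="utf-8") as handle:
#     for row in stream_prices(handle, "99285"):
#         ...
```

Note that `csv.DictReader` returns every value as a string, so prices need to be converted to numbers before any arithmetic.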

Future versions of this database may seek to expand the number of hospitals available for study.

Please contact us in the event of suspected data errors or anomalies.

updated 9/27/23

Contact Us

Please contact us with questions or comments. Please contact Marsha Willis via email for all media inquiries.