One of CartoChrome's core design principles is that the platform runs entirely on free, publicly available data. No paid data licenses, no Data Use Agreements that take months to negotiate, no proprietary datasets that could disappear or change terms. Every byte of data that powers our Healthcare Access Scores comes from publicly documented sources, most of them federally maintained, with stable APIs or automated download endpoints.
Here is the complete inventory of all 21 data sources, organized by what they provide.
Provider Data (2 Sources)
**1. CMS NPPES (National Plan and Provider Enumeration System)** -- This is the foundation of our provider directory. The NPI registry contains approximately 7.5 million records, which we filter to ~4 million active, patient-facing providers. CMS publishes a full file download monthly and delta files weekly for incremental updates. Every physician, nurse practitioner, physician assistant, dentist, and other healthcare provider with an NPI appears here with their practice address, specialty taxonomy codes, and organizational affiliations.
**2. CMS Doctors and Clinicians (Physician Compare)** -- Supplements NPPES with quality measures, Medicare acceptance status, and group practice affiliations. This data allows us to distinguish between a provider who technically has an NPI and one who is actively seeing patients and accepting common insurance.
Facility Data (5 Sources)
**3. CMS Provider of Services (POS)** -- The master list of Medicare-certified hospitals and healthcare facilities, including bed counts, ownership type, services offered, and accreditation status. This powers our Hospital Inpatient access component.
**4. CMS Hospital Compare** -- Star ratings, mortality rates, readmission rates, patient experience scores (HCAHPS), and safety indicators for every Medicare-certified hospital. We use these quality metrics to weight facilities in our scoring algorithm -- a 5-star hospital contributes more to access than a 1-star facility at the same distance.
**5. CMS Care Compare** -- Quality data for nursing homes, home health agencies, and dialysis facilities. Extends our facility coverage beyond hospitals to include the long-term and specialized care facilities that are critical for chronic disease management.
**6. SAMHSA Behavioral Health Treatment Locator** -- The Substance Abuse and Mental Health Services Administration maintains a database of mental health and substance abuse treatment facilities. This powers the mental health dimension of our Provider Score, including inpatient facilities, outpatient clinics, and residential treatment centers.
**7. FDA MQSA Mammography Facilities** -- Every FDA-certified mammography facility in the United States. This is the primary input for our Breast Cancer condition-specific score and contributes to the Hospital Score (preventive/screening dimension).
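The quality weighting described under Hospital Compare (item 4) can be illustrated with a small sketch. This is not CartoChrome's published formula: the star multipliers, the Gaussian decay, the use of bed count as capacity, and the names `STAR_WEIGHT`, `distance_decay`, and `facility_contribution` are all assumptions chosen for illustration.

```python
import math

# Hypothetical multipliers per star rating; placeholder values, not the real weights.
STAR_WEIGHT = {1: 0.6, 2: 0.8, 3: 1.0, 4: 1.2, 5: 1.4}


def distance_decay(distance_km: float, radius_km: float) -> float:
    """Gaussian decay: 1.0 at distance 0, 0.01 at the catchment edge, 0 beyond it."""
    if distance_km > radius_km:
        return 0.0
    # Choose beta so that exp(-radius^2 / beta) equals 0.01.
    beta = -(radius_km ** 2) / math.log(0.01)
    return math.exp(-(distance_km ** 2) / beta)


def facility_contribution(beds: int, stars: int, distance_km: float, radius_km: float) -> float:
    """Capacity (assumed here to be bed count) scaled by quality and by distance."""
    quality = STAR_WEIGHT.get(stars, 1.0)  # unrated facilities get a neutral weight
    return beds * quality * distance_decay(distance_km, radius_km)


if __name__ == "__main__":
    # Two otherwise identical hospitals, 10 km away, inside a 30 km catchment.
    print(facility_contribution(100, 5, 10.0, 30.0))
    print(facility_contribution(100, 1, 10.0, 30.0))
```

Because distance and quality enter as separate factors, two facilities at the same distance differ only by the ratio of their star multipliers.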
Census and Demographics (3 Sources)
**8. Census ACS 5-Year Estimates** -- The American Community Survey provides demographic data at the ZCTA level: total population, age distribution, median household income, insurance coverage rates, vehicle access, disability prevalence, educational attainment, and language spoken at home. This data powers both the population demand side of our E2SFCA calculation and the six SDOH penalty sub-indices.
**9. Census TIGER/Line Shapefiles** -- The geographic boundary files for ZIP Code Tabulation Areas. These polygons are what you see on the CartoChrome map. We process them through Tippecanoe to generate optimized PMTiles vector tile archives.
**10. HUD USPS ZIP-ZCTA Crosswalk** -- The Department of Housing and Urban Development publishes a quarterly crosswalk that maps USPS mailing ZIP codes to Census ZCTAs. This is essential because ZIP codes (delivery routes) and ZCTAs (statistical areas) do not perfectly align.
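To make the ZIP-to-ZCTA step (item 10) concrete, here is a minimal sketch of rolling provider mailing ZIPs up to ZCTAs. The crosswalk rows are made up, and the column names `zip` and `zcta` and the function names are assumptions rather than the layout of the published HUD file.

```python
import csv
import io
from collections import Counter

# Made-up crosswalk rows for illustration only; column names are assumptions.
SAMPLE_CROSSWALK = """zip,zcta
99900,99900
99901,99900
"""


def load_crosswalk(text: str) -> dict[str, str]:
    """Read ZIP -> ZCTA pairs from CSV text into a lookup table."""
    return {row["zip"]: row["zcta"] for row in csv.DictReader(io.StringIO(text))}


def provider_counts_by_zcta(provider_zips, crosswalk):
    """Count providers per ZCTA and collect ZIPs the crosswalk could not place."""
    counts = Counter()
    unmatched = []
    for raw in provider_zips:
        zip5 = raw.strip()[:5]  # postal codes may carry ZIP+4 digits
        zcta = crosswalk.get(zip5)
        if zcta is None:
            unmatched.append(zip5)
        else:
            counts[zcta] += 1
    return counts, unmatched


if __name__ == "__main__":
    xwalk = load_crosswalk(SAMPLE_CROSSWALK)
    print(provider_counts_by_zcta(["999010001", "99900", "12345"], xwalk))
```

Keeping the unmatched ZIPs separate, rather than silently dropping them, makes it easier to see how many provider records fall outside the crosswalk after each quarterly refresh.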
Classification and Reference (3 Sources)
**11. USDA RUCA Codes** -- Rural-Urban Commuting Area codes classify every census tract by urbanicity level. We use these to assign each ZIP code to one of five catchment tiers, which determines the radius used in our E2SFCA calculations.
**12. HRSA HPSA Designations** -- Health Professional Shortage Area designations serve dual purposes: as a map overlay layer showing officially designated shortage areas, and as a validation target for our scoring model (our Healthcare Desert classifications should substantially overlap with HPSA designations).
**13. NUCC Taxonomy Codes** -- The National Uniform Claim Committee publishes a hierarchical classification of healthcare provider specialties. We use this to map each NPI record's taxonomy code to one of our eight scoring components.
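The way a catchment tier (item 11) feeds the E2SFCA calculation can be sketched as follows. This is a generic two-step floating catchment area computation under stated assumptions, not CartoChrome's implementation: the tier names, the radii in `TIER_RADIUS_KM`, the Gaussian weight, and the convention that each ZCTA's own radius applies in both steps are all hypothetical.

```python
import math

# Hypothetical radii (km) for five urbanicity tiers; placeholder values only.
TIER_RADIUS_KM = {
    "urban_core": 15.0,
    "suburban": 25.0,
    "large_town": 40.0,
    "small_town": 60.0,
    "rural": 100.0,
}


def weight(distance_km: float, radius_km: float) -> float:
    """Gaussian distance weight: 1.0 at distance 0, 0.01 at the radius, 0 beyond."""
    if distance_km > radius_km:
        return 0.0
    return math.exp(-(distance_km ** 2) * math.log(100) / radius_km ** 2)


def e2sfca(populations, tiers, supplies, distances):
    """Return an access score per ZCTA.

    populations: {zcta: population}     tiers: {zcta: tier name}
    supplies: {provider: capacity}      distances: {(zcta, provider): km}
    """
    def w(zcta, provider):
        d = distances.get((zcta, provider))
        if d is None:  # no distance recorded means out of reach
            return 0.0
        return weight(d, TIER_RADIUS_KM[tiers[zcta]])

    # Step 1: supply-to-demand ratio for each provider.
    ratios = {}
    for provider, capacity in supplies.items():
        demand = sum(pop * w(z, provider) for z, pop in populations.items())
        ratios[provider] = capacity / demand if demand > 0 else 0.0

    # Step 2: each ZCTA sums the weighted ratios of the providers it can reach.
    return {z: sum(r * w(z, p) for p, r in ratios.items()) for z in populations}


if __name__ == "__main__":
    print(e2sfca(
        populations={"A": 1000, "B": 1000},
        tiers={"A": "urban_core", "B": "rural"},
        supplies={"p1": 10},
        distances={("A", "p1"): 20.0, ("B", "p1"): 0.0},
    ))
```

In the example call, the provider sits 20 km from the urban-core ZCTA and therefore outside its 15 km radius, so only the rural ZCTA draws on it. The same distance can count as reachable in one tier and unreachable in another.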
Health Outcomes and Validation (4 Sources)
**14. CDC PLACES** -- Census tract-level health outcome and behavior measures, including preventive service utilization rates. This is our primary calibration target -- we validate that areas with high CartoChrome access scores actually show higher utilization of preventive services.
**15. CDC WONDER Mortality** -- County-level mortality data by cause of death. Used for validation: areas we identify as healthcare deserts should correlate with higher age-adjusted mortality for treatable conditions.
**16. County Health Rankings** -- The Robert Wood Johnson Foundation's county-level composite health rankings. Our second major validation target -- CartoChrome scores should correlate with these independent assessments.
**17. CMS Geographic Variation** -- Per-capita Medicare spending and utilization by geography. Helps us understand whether access scores predict actual healthcare utilization patterns.
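The validation checks in this section come down to measuring how closely access scores track an independent outcome measure. A minimal sketch follows, assuming both series have already been aggregated to the same geography (the sources above are published at tract and county level, so that alignment step is assumed, not shown). Pearson correlation is used as one plausible statistic; the source does not say which measure CartoChrome uses.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical, made-up values for five areas: access score and screening rate.
    access_scores = [12.0, 35.0, 48.0, 60.0, 81.0]
    screening_rates = [0.52, 0.58, 0.66, 0.64, 0.75]
    print(pearson(access_scores, screening_rates))
```

A positive coefficient against preventive-service utilization, or a negative one against treatable-condition mortality, would be consistent with the expectations stated above. It would not by itself establish that access drives the outcome.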
Infrastructure and Supplemental (4 Sources)
**18. FCC Broadband Data Collection** -- Broadband availability data that powers our telehealth modifier. ZIP codes with inadequate broadband cannot benefit from telehealth, and the modifier reflects this reality.
**19. IMLC (Interstate Medical Licensure Compact)** -- State membership data for the medical licensure compact, used to determine which telehealth providers can legally serve patients across state lines.
**20. HRSA Health Center Program Data** -- Locations and service profiles of Federally Qualified Health Centers (FQHCs), which serve as critical safety-net providers in underserved areas.
**21. Census Bureau Geocoder** -- Address-to-coordinate resolution for provider records that lack latitude/longitude data.
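For the geocoding step (item 21), a request to the Census Bureau's public single-line address endpoint might look like the sketch below. The endpoint path, the `Public_AR_Current` benchmark name, and the `result.addressMatches[].coordinates` response shape reflect the geocoder's public documentation as best understood here; treat them as assumptions to verify. The function names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for single-line address lookups; verify against current docs.
GEOCODER_URL = "https://geocoding.geo.census.gov/geocoder/locations/onelineaddress"


def build_geocoder_url(address: str, benchmark: str = "Public_AR_Current") -> str:
    """Build the request URL for a one-line address lookup returning JSON."""
    query = urlencode({"address": address, "benchmark": benchmark, "format": "json"})
    return f"{GEOCODER_URL}?{query}"


def parse_first_match(payload: dict):
    """Return (latitude, longitude) of the first match, or None if nothing matched."""
    matches = payload.get("result", {}).get("addressMatches", [])
    if not matches:
        return None
    coords = matches[0]["coordinates"]
    return coords["y"], coords["x"]  # y is latitude, x is longitude


def geocode(address: str, timeout: float = 10.0):
    """Call the geocoder over the network and return the first match, if any."""
    with urlopen(build_geocoder_url(address), timeout=timeout) as response:
        return parse_first_match(json.load(response))
```

Keeping URL construction and response parsing separate from the network call lets both be tested offline. Returning `None` for unmatched addresses allows those records to be routed to a fallback step rather than dropped.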
The Power of Public Data
The fact that all 21 sources are free and public is not just a cost advantage -- it is a transparency advantage. Any researcher, journalist, or policymaker can verify our inputs, reproduce our calculations, and audit our methodology. In an era of increasing skepticism about data-driven claims, full source transparency is a feature, not a limitation.
