California Hospital Capability Finder

Methodology

The ingestion script locates the actual header row in each CSV file, even when title or readme rows precede it. From each record it extracts hospital identity, city, county, ZIP, service category, designation tier, certifying body, verification status, source files, source URLs, and notes.
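
As a rough illustration of header detection, the sketch below scans the first rows of a file and picks the first one whose cells look like column names. The keyword set, the min_hits threshold, and the function name find_header_row are assumptions made for this example, not the script's actual rules.

```python
import csv
import io

# Hypothetical vocabulary: the real script's header keywords are not documented here.
EXPECTED_HEADER_TOKENS = {"hospital", "name", "city", "county", "zip"}


def find_header_row(text, min_hits=2, max_scan=20):
    """Return the index of the first row that looks like a header, or None."""
    reader = csv.reader(io.StringIO(text))
    for index, row in enumerate(reader):
        if index >= max_scan:
            break
        cells = [cell.strip().lower() for cell in row if cell.strip()]
        # Count the cells that contain at least one expected keyword.
        hits = sum(
            any(token in cell for token in EXPECTED_HEADER_TOKENS)
            for cell in cells
        )
        if hits >= min_hits:
            return index
    return None


# Illustrative file content with a title row and a readme row before the header.
sample = (
    "California Stroke Centers\n"
    "See readme for details\n"
    "Hospital Name,City,County,ZIP\n"
    "Example General,Fresno,Fresno,93701\n"
)
header_index = find_header_row(sample)
```

Here header_index is 2, so rows 0 and 1 would be skipped before parsing. Requiring at least two keyword hits is one way to keep a single-cell title row from being mistaken for the header.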

Hospitals are deduplicated with conservative fuzzy matching on normalized names, combined with a matching city or county. A deduplicated hospital can retain multiple designations from different sources.
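
A minimal sketch of this kind of deduplication follows, using difflib.SequenceMatcher from the Python standard library as a stand-in for whatever matcher the script really uses. The 0.92 threshold, the record field names, and the sample hospitals are all assumptions for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    lowered = name.lower().replace("&", " and ")
    cleaned = re.sub(r"[^a-z0-9 ]", " ", lowered)
    return " ".join(cleaned.split())


def _same_field(a, b, field):
    """True when both records carry the same non-empty value for a field."""
    left = (a.get(field) or "").strip().lower()
    right = (b.get(field) or "").strip().lower()
    return bool(left) and left == right


def same_hospital(a, b, threshold=0.92):
    """Conservative match: shared city or county AND a high name similarity."""
    if not (_same_field(a, b, "city") or _same_field(a, b, "county")):
        return False
    ratio = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    return ratio >= threshold  # threshold is an assumed value


def deduplicate(records):
    """Merge matching records, keeping every designation from every source."""
    merged = []
    for record in records:
        for existing in merged:
            if same_hospital(existing, record):
                existing["designations"].extend(record["designations"])
                break
        else:
            merged.append({**record, "designations": list(record["designations"])})
    return merged


# Made-up records for illustration only.
records = [
    {"name": "Sample Valley Medical Center", "city": "Fresno",
     "county": "Fresno", "designations": ["Stroke: Primary"]},
    {"name": "SAMPLE VALLEY MEDICAL CENTRE", "city": "Fresno",
     "county": "", "designations": ["Trauma: Level II"]},
    {"name": "Riverside Community Hospital", "city": "Fresno",
     "county": "Fresno", "designations": ["Stroke: Primary"]},
]
hospitals = deduplicate(records)
```

The first two records merge into one hospital that carries both designations, while the third stays separate because its name is too different even though the city matches. Requiring the location check before the name comparison is what keeps the match conservative.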

Latitude and longitude values already present in source records are treated as high-confidence coordinates. Hospitals with no location are written to a geocoding queue. A batch geocoding result is merged only when it contains coordinates and its confidence is at least as strong as the confidence of the existing location.
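
The merge rule can be sketched as below. The confidence labels and their ranking, the field names, and the coordinates are placeholders assumed for this example; the actual script may represent confidence differently.

```python
# Hypothetical confidence labels, ranked from weakest to strongest.
CONFIDENCE_RANK = {"none": 0, "low": 1, "medium": 2, "high": 3}


def needs_geocoding(hospital):
    """A hospital belongs in the geocoding queue when it has no coordinates."""
    return hospital.get("lat") is None or hospital.get("lon") is None


def merge_geocode(hospital, result):
    """Apply a batch result only if it is present and at least as confident."""
    if not result or result.get("lat") is None or result.get("lon") is None:
        return hospital  # nothing usable came back for this hospital
    incoming = result.get("confidence", "none")
    current = hospital.get("location_confidence", "none")
    if CONFIDENCE_RANK[incoming] < CONFIDENCE_RANK[current]:
        return hospital  # keep the stronger existing location
    return {
        **hospital,
        "lat": result["lat"],
        "lon": result["lon"],
        "location_confidence": incoming,
    }


# Placeholder records: the first already has source coordinates (high confidence).
hospitals = [
    {"name": "Sample Valley Medical Center", "lat": 36.0, "lon": -119.0,
     "location_confidence": "high"},
    {"name": "Sample Coast Hospital", "lat": None, "lon": None,
     "location_confidence": "none"},
]
geocoding_queue = [h for h in hospitals if needs_geocoding(h)]

# Placeholder batch output keyed by hospital name.
batch_results = {
    "Sample Valley Medical Center": {"lat": 36.5, "lon": -119.5, "confidence": "low"},
    "Sample Coast Hospital": {"lat": 34.0, "lon": -120.0, "confidence": "medium"},
}
merged = [merge_geocode(h, batch_results.get(h["name"])) for h in hospitals]
```

In this example the low-confidence result for the first hospital is ignored because its source coordinates rank higher, and the second hospital receives the medium-confidence coordinates because it had none.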

Designation confidence is derived from source verification status. Verified and recognized source records score higher than candidate or needs-verification records.
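
One possible shape for this scoring is sketched here. The numeric scores, the fallback for unknown statuses, and the status spellings are assumptions; only the ordering (verified and recognized above candidate and needs-verification) comes from the description above.

```python
# Status groups taken from the description; the exact spellings are assumed.
HIGHER_STATUSES = {"verified", "recognized"}
LOWER_STATUSES = {"candidate", "needs-verification"}


def designation_confidence(status):
    """Map a source verification status to an illustrative confidence score."""
    key = (status or "").strip().lower().replace("_", "-").replace(" ", "-")
    if key in HIGHER_STATUSES:
        return 0.9  # assumed value
    if key in LOWER_STATUSES:
        return 0.4  # assumed value
    return 0.0  # assumed fallback for missing or unrecognized statuses


scores = {
    status: designation_confidence(status)
    for status in ["Verified", "recognized", "candidate", "needs verification"]
}
```

Normalizing spaces and underscores to hyphens lets variants such as "needs verification" and "needs_verification" map to the same group.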