SIE Wellness

Product Roadmap

What we've shipped, what's next, and how our data pipeline powers better care discovery.

Current data footprint

  • Businesses: 5,732
  • With services mapped: 2,798
  • Total services: 35,124
  • Free services: 2,652
  • Geocoded: 5,635
  • Geo success rate: 99.96%

Source: internal multi‑agent pipeline analysis.
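
To put the footprint numbers in context, here is a small sketch that derives coverage ratios from them. The variable names are ours, and treating free services as a subset of total services is an assumption. The source does not state the denominator behind the 99.96% geo success rate. Measured against all 5,732 businesses, 5,635 geocoded records is about 98.3%, so the published rate is presumably computed over a smaller set, such as the addresses the geocoder actually attempted.

```python
# Figures copied from "Current data footprint".
businesses = 5732
with_services = 2798
total_services = 35124
free_services = 2652
geocoded = 5635


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of businesses that have at least one service mapped.
services_coverage = pct(with_services, businesses)

# Share of all businesses with coordinates (not the same as the
# published "geo success rate", whose denominator is not stated).
geocoded_coverage = pct(geocoded, businesses)

# Share of mapped services flagged as free (assumes free is a subset of total).
free_share = pct(free_services, total_services)
```

With these inputs, roughly 48.8% of businesses have services mapped, 98.3% are geocoded, and about 7.6% of mapped services are flagged free.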

Multi‑Agent Healthcare Data Extractor

An AI‑driven system that extracts healthcare information from business websites using multiple specialized agents working in parallel. It is designed to support vulnerable populations, including people impacted by Medicaid cuts, immigrants, and low‑income families.

Key capabilities

  • 7 specialized AI agents running in parallel with intelligent URL selection
  • Free & discounted service detection with deep keyword + model analysis
  • SSN requirement detection for undocumented‑friendly care
  • Languages, accessibility, and transportation details extraction
  • Sliding scale, self‑pay options, and financial assistance detection
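
As a rough illustration of the parallel-agent idea above, the following sketch fans one page out to seven coroutines with `asyncio.gather`. The agent names and the `run_agent` stub are hypothetical: the source says there are seven agents but does not list them, and a real agent would call a model rather than count characters.

```python
import asyncio

# Hypothetical agent names; the actual seven agents are not named in the source.
AGENT_NAMES = [
    "services",
    "insurance",
    "eligibility",
    "accessibility",
    "financial_assistance",
    "ssn_requirement",
    "contact",
]


async def run_agent(name: str, page_text: str) -> dict:
    """Stand-in for one specialized agent working on one page."""
    await asyncio.sleep(0)  # placeholder for network or model latency
    return {"agent": name, "chars_seen": len(page_text)}


async def extract_all(page_text: str) -> dict:
    """Run every agent concurrently and merge results by agent name."""
    tasks = [run_agent(name, page_text) for name in AGENT_NAMES]
    # return_exceptions=True keeps one failing agent from cancelling the rest.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    merged = {}
    for name, result in zip(AGENT_NAMES, results):
        merged[name] = None if isinstance(result, Exception) else result
    return merged
```

Calling `asyncio.run(extract_all(text))` returns one entry per agent, with `None` for any agent that raised.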

Data we extract

  • Services offered with free/discount flags (2,652 free mapped)
  • Insurance accepted, Medicaid/Medicare, self‑pay & payment plans
  • Eligibility: documentation, age groups, new patient policies
  • Accessibility: languages, interpretation, walk‑ins, transportation
  • Financial assistance and uninsured acceptance (1,729 accepting uninsured)
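
One possible shape for an extracted record is sketched below. The class and field names are illustrative assumptions, not the pipeline's actual schema. `None` is used for "not stated on the site" so that missing information is never confused with a confirmed "no".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Service:
    name: str
    is_free: bool = False
    is_discounted: bool = False


@dataclass
class ProviderRecord:
    # All field names here are hypothetical.
    name: str
    services: list[Service] = field(default_factory=list)
    insurance_accepted: list[str] = field(default_factory=list)
    accepts_medicaid: Optional[bool] = None   # None = not stated
    accepts_uninsured: Optional[bool] = None
    ssn_required: Optional[bool] = None
    languages: list[str] = field(default_factory=list)
    walk_ins: Optional[bool] = None

    def free_services(self) -> list[str]:
        """Names of services flagged as free."""
        return [s.name for s in self.services if s.is_free]
```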

Shipped

  • Faster care discovery with clear, local results
  • Thousands of provider pages with easy contact info, directions, and services
  • City & service pages that explain options and answer common questions
  • Reliable addresses and maps with a mobile‑friendly experience
  • Helpful filters: uninsured, no SSN, telehealth, and free/discounted services
  • Consistent, clean design and shareable previews when you send links
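
The filters listed above could be applied to provider records along these lines. This is a minimal sketch: the dictionary keys are hypothetical, and the choice to exclude providers whose status is unknown is our assumption, made so that a filter such as "no SSN" only returns confirmed matches.

```python
def matches_filters(
    provider: dict,
    *,
    uninsured: bool = False,
    no_ssn: bool = False,
    telehealth: bool = False,
    free_or_discounted: bool = False,
) -> bool:
    """Return True only if the provider is confirmed to satisfy every active filter."""
    if uninsured and provider.get("accepts_uninsured") is not True:
        return False
    # "No SSN" requires an explicit False; unknown (missing/None) does not qualify.
    if no_ssn and provider.get("ssn_required") is not False:
        return False
    if telehealth and provider.get("telehealth") is not True:
        return False
    if free_or_discounted and not provider.get("has_free_or_discounted"):
        return False
    return True
```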

Next (30–60 days)

  • Nationwide coverage: grow from ~5,700 to 100,000+ providers and facilities
  • Provider beta: AI agent scheduler that books appointments for patients
  • More guides and how‑tos to help people prepare for visits
  • Richer provider details (insurance, languages, hours, eligibility)
  • Stronger local pages, including neighborhood hubs in top metros

Later (60–120+ days)

  • Preventive Care Plan with transparent pricing (phased rollout)
  • Internationalization and localized content (hreflang)
  • Additional structured data (opening hours, price ranges) where reliable
  • Quality signals: editorial bylines, review workflow, and source citations
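
For the structured data item above, a sketch of emitting schema.org JSON-LD only "where reliable" might look like this. The `verified` and `price_range_verified` flags are hypothetical inputs that stand in for whatever reliability check the pipeline ends up using. The schema.org names (`MedicalClinic`, `openingHoursSpecification`, `priceRange`) are real vocabulary terms.

```python
import json


def build_json_ld(provider: dict) -> str:
    """Build JSON-LD for a provider, omitting any field not marked as verified."""
    doc = {
        "@context": "https://schema.org",
        "@type": "MedicalClinic",
        "name": provider["name"],
    }
    # Only include opening hours that passed the (hypothetical) verification step.
    specs = [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": h["day"],
            "opens": h["opens"],
            "closes": h["closes"],
        }
        for h in provider.get("hours", [])
        if h.get("verified")
    ]
    if specs:
        doc["openingHoursSpecification"] = specs
    if provider.get("price_range") and provider.get("price_range_verified"):
        doc["priceRange"] = provider["price_range"]
    return json.dumps(doc, indent=2)
```

Leaving a field out entirely, rather than publishing a guess, keeps the markup consistent with what the page can actually support.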

How the data pipeline works

1) Data sources

  • Web scraping across healthcare verticals in DMV (DC, Maryland, Virginia) cities
  • CSV ingestion preserving original fields
  • Google Places, Yelp, and official sites
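
"CSV ingestion preserving original fields" can be read as keeping each source row untouched alongside any cleaned values. A minimal sketch under that reading follows. The function name, the `name_column` parameter, and the `original` key are assumptions.

```python
import csv
import io


def ingest_csv(file_obj, name_column: str = "name") -> list[dict]:
    """Read CSV rows, adding a cleaned name while keeping the raw row intact."""
    records = []
    for row in csv.DictReader(file_obj):
        records.append(
            {
                "name": (row.get(name_column) or "").strip(),
                # Untouched copy of every source column, including unknown ones.
                "original": dict(row),
            }
        )
    return records


sample = io.StringIO("name,phone,notes\n  Example Clinic ,555-0100,walk-ins ok\n")
rows = ingest_csv(sample)
```

Here `rows[0]["name"]` is the trimmed name, while `rows[0]["original"]` still holds the padded value and the extra `notes` column.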

2) Enhanced analysis

  • Jina.ai content extraction on websites
  • Azure OpenAI for service, SSN, insurance, and assistance detection
  • Smart keyword + fuzzy matching fallback (95+ terms)
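
The keyword and fuzzy matching fallback could work roughly as follows. This sketch uses the standard library's `difflib.SequenceMatcher`. The four terms are placeholders rather than the pipeline's 95+ term list, and the 0.85 threshold is an arbitrary assumption.

```python
import re
from difflib import SequenceMatcher

# Placeholder terms; the real list is not given in the source.
FREE_CARE_TERMS = ["sliding scale", "no cost", "free clinic", "financial assistance"]


def _ngrams(words: list[str], n: int) -> list[str]:
    """All runs of n consecutive words, joined by single spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def match_terms(text: str, terms=FREE_CARE_TERMS, threshold: float = 0.85) -> dict:
    """Map each detected term to a score: 1.0 for exact, otherwise the fuzzy ratio."""
    words = re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()
    hits = {}
    for term in terms:
        grams = _ngrams(words, len(term.split()))
        if term in grams:  # exact match on word boundaries
            hits[term] = 1.0
            continue
        # Fuzzy fallback: best similarity against same-length word windows.
        best = max((SequenceMatcher(None, term, g).ratio() for g in grams), default=0.0)
        if best >= threshold:
            hits[term] = best
    return hits
```

Comparing against word windows of the same length as the term lets the fallback catch misspellings such as "slidng scale" without matching unrelated substrings.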

3) Processing & output

  • Normalization, deduplication, geocoding (99.96% success)
  • Quality validation and timestamped JSON database
  • Programmatic pages + mining statistics
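
The normalization, deduplication, and timestamped JSON steps can be sketched as below. The key format, the "keep the first record" rule, and the output field names are assumptions made for illustration. Geocoding is left out because it depends on an external service.

```python
import json
import re
from datetime import datetime, timezone


def normalize_key(name: str, address: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace to build a dedup key."""
    text = re.sub(r"[^a-z0-9\s]", " ", f"{name} {address}".lower())
    return " ".join(text.split())


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each normalized name + address key."""
    seen = {}
    for record in records:
        seen.setdefault(normalize_key(record["name"], record["address"]), record)
    return list(seen.values())


def to_timestamped_json(records: list[dict]) -> str:
    """Serialize records with a UTC generation timestamp."""
    payload = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "count": len(records),
        "records": records,
    }
    return json.dumps(payload, indent=2)
```

Under this sketch, "Main St. Clinic, 123 Main St" and "main st clinic, 123 Main St." collapse to the same key and are stored once.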