Case study · STAR Registry (multi-site)

STAR neurovascular registry — 87.7% accuracy on a 12 GB consumer GPU

Validation of CuratAI's local LLM extraction against expert REDCap ground truth on the Stroke Thrombectomy and Aneurysm Registry. A local 7B model reaches 87.7% accuracy on 170 evaluable fields per patient, within 3.2 percentage points of a cloud baseline, with no PHI leaving the institution.

The challenge

Clinical registry abstraction is one of the hidden bottlenecks in multi-site research. The Stroke Thrombectomy and Aneurysm Registry (STAR) — founded at MUSC, now spanning 85+ sites with over 15,000 enrolled patients — requires structured abstraction across 341 fields per case, drawn from the H&P, procedure note, and discharge summary.

Doing that by hand takes hours per patient. Sending the source notes to a cloud LLM is fast, but the notes are PHI, the institutions are HIPAA-covered, and routing clinical text through cloud inference is a path few IRBs will approve.

The setup

CuratAI’s ingest stage runs an open-weights 7B-parameter language model locally, on a 12 GB consumer-class GPU. The same GPU you can buy at Micro Center.

We validated against 29 patient charts from STAR. Across the 341 STAR fields, 170 were evaluable against the site’s REDCap ground truth. The remaining fields were either not applicable to the case or had no extractable answer in the source documents.
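As a rough illustration of the evaluability rule described above, the sketch below scores extracted values against ground truth and skips fields that cannot be evaluated. It approximates "not applicable" and "no extractable answer" with a blank ground-truth value, which is an assumption. The field names, the dictionary layout, and the normalized exact-match comparison are also hypothetical and are not CuratAI's actual scoring method.

```python
from typing import Optional


def normalize(value: Optional[str]) -> Optional[str]:
    """Lowercase and trim a value; treat empty strings as missing."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    return cleaned or None


def score_chart(extracted: dict, ground_truth: dict) -> tuple:
    """Return (correct, evaluable) counts for one patient chart.

    A field counts as evaluable only when the ground truth holds an answer;
    fields that are blank in the ground truth are skipped, not penalized.
    """
    correct = 0
    evaluable = 0
    for field, truth in ground_truth.items():
        truth_norm = normalize(truth)
        if truth_norm is None:
            continue  # assumed marker for a non-evaluable field
        evaluable += 1
        if normalize(extracted.get(field)) == truth_norm:
            correct += 1
    return correct, evaluable


# Hypothetical field names and values, for illustration only.
truth = {"ivtpa_given": "Yes", "occlusion_site": "M1", "aneurysm_size_mm": ""}
model = {"ivtpa_given": "yes ", "occlusion_site": "M2", "aneurysm_size_mm": "7"}
correct, evaluable = score_chart(model, truth)
print(f"{correct}/{evaluable} evaluable fields correct")
```

Keeping non-evaluable fields out of the denominator means the reported accuracy reflects only fields where a reference answer exists to compare against.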

The result

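| Field type | Fields (n) | CuratAI 7B (local) | Cloud baseline |
| --- | --- | --- | --- |
| Yes / No | 49 | 91.6% | 92.3% |
| Multiple-choice | 23 | 75.6% | 79.6% |
| Cascaded | 94 | 90.1% | 93.2% |
| Free text | 4 | 53.4% | 57.1% |
| Overall | 170 | 87.7% | 90.9% |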

A local 7B model reaches 87.7% overall accuracy — within 3.2 percentage points of a cloud-hosted model an order of magnitude larger.
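To see where the gap sits by field type, the short sketch below subtracts the local accuracy from the cloud accuracy for each row of the results table. The figures are copied from the table; the `gap_points` helper is a hypothetical name used only for this illustration.

```python
# Accuracy figures (percent) copied from the results table: (local, cloud).
results = {
    "Yes / No": (91.6, 92.3),
    "Multiple-choice": (75.6, 79.6),
    "Cascaded": (90.1, 93.2),
    "Free text": (53.4, 57.1),
    "Overall": (87.7, 90.9),
}


def gap_points(local: float, cloud: float) -> float:
    """Cloud accuracy minus local accuracy, in percentage points."""
    return round(cloud - local, 1)


gaps = {name: gap_points(local, cloud) for name, (local, cloud) in results.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap} points")
```

The gap is smallest on Yes / No fields (0.7 points) and largest on multiple-choice fields (4.0 points), with the overall figure at 3.2 points.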

The workflow that makes it work

Raw accuracy isn’t the whole story. CuratAI surfaces a confidence score with every extracted field:

  • 86% of fields land at confidence 1 with ~89% accuracy — auto-accept.
  • 14% land at confidence 2 or 3 — surfaced for human review.
  • End-to-end: 26 minutes per patient across all 341 STAR fields.
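The triage rule in the list above can be sketched as follows. The `ExtractedField` type, the field names, and the example values are hypothetical; the only behavior taken from the text is that confidence 1 is auto-accepted and confidence 2 or 3 is surfaced for human review.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: int  # 1, 2, or 3, as reported alongside each extracted field


def triage(fields):
    """Split fields into (auto_accepted, needs_review) by confidence."""
    auto_accepted = [f for f in fields if f.confidence == 1]
    needs_review = [f for f in fields if f.confidence != 1]
    return auto_accepted, needs_review


# Hypothetical extraction output for one chart.
fields = [
    ExtractedField("ivtpa_given", "Yes", 1),
    ExtractedField("occlusion_site", "M1", 2),
    ExtractedField("complication_notes", "none documented", 3),
]
accepted, review = triage(fields)
print("auto-accept:", [f.name for f in accepted])
print("human review:", [f.name for f in review])
```

Routing on the confidence level rather than on field type keeps the reviewer's queue limited to the minority of fields the model itself flags as uncertain.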

The result is a workflow that’s both fast enough to be useful and conservative enough to be trustworthy — without sending a single line of clinical text outside the institution.

Why this generalizes

STAR is a hard registry to abstract well: 341 fields, cascaded dependencies, free-text observations. The same engine ports to GWTG-Stroke, oncology registries, and custom institutional registries — registry-agnostic prompts, same architecture.
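One way to picture a registry-agnostic design with cascaded dependencies is a declarative field list in which a child field applies only when its parent holds a given value. The sketch below is an assumption about how such a schema could look, not a description of CuratAI's internals; `FieldDef`, `applicable_fields`, and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldDef:
    name: str
    kind: str  # e.g. "yes_no", "multiple_choice", "free_text"
    parent: Optional[str] = None          # field this one depends on
    required_value: Optional[str] = None  # parent value that enables it


def applicable_fields(schema, answers):
    """Return names of fields that apply given the answers so far.

    Assumes parents are listed before their children in `schema`.
    """
    active = []
    for field in schema:
        if field.parent is None:
            active.append(field.name)
        elif (field.parent in active
              and answers.get(field.parent) == field.required_value):
            active.append(field.name)
    return active


# Hypothetical three-field registry fragment.
schema = [
    FieldDef("thrombectomy_performed", "yes_no"),
    FieldDef("device_type", "multiple_choice", "thrombectomy_performed", "Yes"),
    FieldDef("device_notes", "free_text", "device_type", "Other"),
]
performed = {"thrombectomy_performed": "Yes", "device_type": "Stent retriever"}
not_performed = {"thrombectomy_performed": "No"}
print(applicable_fields(schema, performed))
print(applicable_fields(schema, not_performed))
```

Under this kind of layout, moving to another registry would mean supplying a different field list while the traversal logic stays the same.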

Reference

Submitted to Neurosurgery, 2026. Pre-print and detailed methodology available on request.
