CodeBluePediatrics

A child's life shouldn't depend on geography

Like texting the specialist you had access to at the academic center — via SMS anywhere in the U.S., or WhatsApp in 195 countries. No app. No login. Free for clinicians.

Talk to the Founder · See how it's different

Live system · Raising angel round

Benson Hsu, MD, MBA, FAAP, FCCM
Pediatric Intensivist & Professor · Princeton · Duke · McKinsey · Presidential Leadership Scholar

CodeBluePeds · SMS · WhatsApp

1 month old, fever and irritability — worried about meningitis, what's the workup?

PEDS EM — Neonatal Fever

Full sepsis eval: CBC, CRP, blood cx, UA/UCx (cath), LP mandatory. HSV PCR if <21d or seizures.

Empiric: ampicillin 75mg/kg + gentamicin 4mg/kg
✓ CodeBluePeds verified

Delivered · 14 seconds

4 out of 5

U.S. emergency departments are not fully prepared to care for children

Weyant et al., Health Affairs, Oct 2024

Nearly 9 in 10

completely rural U.S. counties have no general pediatrician at all

Ramesh & Yu, JAMA Network Open, 2023 / ABP Census

24 million

children lack timely access to high-readiness pediatric emergency care

Joseph et al., J Pediatrics, 2025

It's 2 AM. A six-month-old arrives seizing in a rural ER. The nearest pediatric specialist is 200 miles away. I've been on the other end of that call countless times. CodeBluePediatrics is what I wish every one of those physicians had in their pocket.

8,930 automated tests

2,149 safety-marked · 312 test files

100% MEDCALC-BENCH accuracy

1,100/1,100 clinical calculations correct

Zero critical findings

Every finding resolved and regression-tested

Competitive Analysis

HEAD TO HEAD — SAME QUESTION, DIFFERENT JOURNEYS

How does the physician get an answer?

CodeBluePediatrics

Text your question via SMS or WhatsApp · Free

No app, no login, no NPI. Works from any phone — rural EDs, tribal health, international clinics.

PHI automatically stripped · Safety

18 HIPAA identifiers removed before AI processing.

Specialist-routed AI generates response

Routed to the correct pediatric module. RAG context from 150 validated pathways.

Second AI independently verifies · Verify

Hallucination check by a separate model. No self-validation.

Python computes the dose · Deterministic

100% MEDCALC-BENCH accuracy. The AI never does arithmetic.

Citations verified via NCBI · Deterministic

PubMed API lookup. Not LLM-generated references.

Dose checked against 132-drug safety DB · Safety

RED/YELLOW/GREEN tiered alerts. 229 drug name aliases.

Answer delivered to your phone

Structured workup, weight-adjusted doses, protocol citations, safety caveats. ~15 seconds.
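
The PHI-stripping step in the journey above can be pictured with a minimal sketch of regex-based redaction. The three patterns, the placeholder tokens, and the name `strip_phi` are assumptions for illustration only; the engine described on this page covers all 18 identifiers and is far larger.

```python
import re

# Hypothetical patterns for three identifier types (illustration only).
PHI_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"), "[DATE]"),
]


def strip_phi(text: str) -> str:
    """Replace every match of each pattern with its placeholder token."""
    for pattern, placeholder in PHI_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Clinical content such as weights and temperatures passes through untouched; only the matched identifiers are replaced before any text would reach a model.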

CodeBluePeds: ~15s · UpToDate: 2–5 min

The Three Separations

No model validates its own output

Sonnet generates. Haiku reviews. Cross-provider when using fallback (Anthropic → OpenAI reviewer). NCBI verifies citations deterministically. Three models. Zero trust.

99.6%

of LLM-generated PMIDs were fabricated in testing across multiple frontier models. Eliminated by deterministic NCBI lookup.
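
A deterministic PMID check of this kind might look roughly like the sketch below, which queries the public NCBI E-utilities `esummary` endpoint. The function names, the injectable `fetch` parameter, and the assumed response shape (a `result` object keyed by PMID, with an `error` field on unknown IDs) are illustrative assumptions and should be checked against the E-utilities documentation.

```python
import json
import urllib.parse
import urllib.request

ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def fetch_esummary(pmids):
    """Query NCBI esummary for the given PMIDs and return the parsed JSON."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "id": ",".join(pmids), "retmode": "json"}
    )
    with urllib.request.urlopen(f"{ESUMMARY_URL}?{query}", timeout=10) as resp:
        return json.load(resp)


def verified_pmids(pmids, fetch=fetch_esummary):
    """Keep only PMIDs whose lookup returned a record without an error field."""
    result = fetch(pmids).get("result", {})
    return [p for p in pmids if p in result and "error" not in result[p]]
```

Because the lookup is a plain HTTP call, a citation either resolves to a real record or is dropped; no model is asked whether the reference exists.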

The AI doesn't do the math

Claude extracts parameters. 55 deterministic calculators compute the dose. The arithmetic never touches the model.

100%

MEDCALC-BENCH accuracy (1,100/1,100) vs ~50% for GPT-4. 36 equations + 19 clinical scores.
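
To illustrate what "the arithmetic never touches the model" can mean in practice, here is a minimal sketch of a deterministic weight-based calculation. The function name, its signature, and the optional cap are assumptions, not the page's actual calculators.

```python
from decimal import Decimal
from typing import Optional


def weight_based_dose_mg(
    weight_kg: str, mg_per_kg: str, max_mg: Optional[str] = None
) -> Decimal:
    """Multiply weight by the per-kg dose exactly, then apply an optional cap."""
    weight = Decimal(weight_kg)
    if weight <= 0:
        raise ValueError("weight must be positive")
    dose = weight * Decimal(mg_per_kg)
    if max_mg is not None:
        # Any cap is supplied by the caller; no clinical limit is hard-coded here.
        dose = min(dose, Decimal(max_mg))
    return dose
```

With the figure from the sample exchange used purely as arithmetic input, `weight_based_dose_mg("4", "75")` returns 300: a model would only extract the two parameters, and ordinary code does the multiplication.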

Silent failure is never acceptable

PostgreSQL state machine tracks every message (RECEIVED → ACKNOWLEDGED → PROCESSING → RESPONDED). Watchdog sweeps every 15s. Stuck >90s = auto-recovery. Encrypted delivery queue.

90s

Max time-to-recovery. The system knows it failed and tells the physician.
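
The message lifecycle and the watchdog sweep can be sketched in memory as follows. The state names and the 90-second threshold come from the description above; the function names and the dictionary standing in for the PostgreSQL table are assumptions.

```python
from enum import Enum


class State(Enum):
    RECEIVED = "RECEIVED"
    ACKNOWLEDGED = "ACKNOWLEDGED"
    PROCESSING = "PROCESSING"
    RESPONDED = "RESPONDED"


# Each state has exactly one legal successor.
NEXT_STATE = {
    State.RECEIVED: State.ACKNOWLEDGED,
    State.ACKNOWLEDGED: State.PROCESSING,
    State.PROCESSING: State.RESPONDED,
}

STUCK_AFTER_S = 90


def advance(current: State, target: State) -> State:
    """Allow only the single forward transition defined for each state."""
    if NEXT_STATE.get(current) is not target:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def find_stuck(messages: dict, now: float) -> list:
    """messages maps message id -> (state, last_update_timestamp_in_seconds)."""
    return [
        message_id
        for message_id, (state, updated) in messages.items()
        if state is not State.RESPONDED and now - updated > STUCK_AFTER_S
    ]
```

A watchdog would call something like `find_stuck` on each sweep and hand the returned ids to a recovery routine, so a message that stalls is surfaced rather than silently dropped.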

Safety Architecture

THE PIPELINE · ~15 SECONDS

What happens between question and answer

PHI stripped · Safety

Specialist routing → 7 subspecialties

Claude Sonnet generates response

Claude Haiku reviews · Verify

Python dose calc · Deterministic

NCBI verifies PMIDs · Deterministic

132-drug validator · Safety

Escalation detector · Safety · 23 emergency conditions

Confidence scored

Truncated → 1,600 chars for WhatsApp

Delivered via Twilio · SMS fallback
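
The truncation step near the end of the pipeline could be sketched like this. Only the 1,600-character limit comes from the page; the marker text, the function name, and the cut-at-whitespace behavior are assumptions.

```python
WHATSAPP_LIMIT = 1600


def truncate_for_delivery(text: str, limit: int = WHATSAPP_LIMIT,
                          marker: str = " [truncated]") -> str:
    """Return text unchanged if it fits, else cut it and append a visible marker."""
    if len(text) <= limit:
        return text
    cut = text[: limit - len(marker)]
    # Prefer cutting at the last space so a word or a number is not split.
    space = cut.rfind(" ")
    if space > 0:
        cut = cut[:space]
    return cut.rstrip() + marker
```

Appending a marker keeps the truncation visible to the reader instead of ending the message mid-thought.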

PICU · Peds EM · General · Neonatology · Toxicology

Zero IT lift. No EHR integration required. Your physicians text a number. That's it.

WHICH LAYER WOULD YOU REMOVE?

Layer	What It Does	Without It
PHI De-identification	Strips all 18 HIPAA Safe Harbor identifiers before LLM (2,200-line regex engine)	Patient data in training corpora
Dosing Validator	132 drugs, 229 aliases, RED/YELLOW/GREEN tiers	Hallucinated doses reach physicians
Hallucination Checker	Second model cross-checks first; cross-provider on failover	Fabricated claims delivered as fact
PMID Verification	Deterministic NCBI lookup	99.6% fabricated citations
Escalation Detector	23 emergency conditions + context	DKA misidentified as routine
Cache Poisoning Guard	Only validated responses cached	Bad response repeats forever
Circuit Breaker	Auto-opens after failures, recovers	Cascading failures crash system
Watchdog + Timeout	15s sweep, 85s hard timeout chain	Silent infinite waits
Deterministic Calcs	55 calculators — Python, not LLM	~50% arithmetic accuracy
Emergency Fallback	11 pre-computed PALS 2025 protocols for total AI outage	No guidance during cardiac arrest if AI is down
Encrypted Delivery Queue	AES-256-GCM ephemeral storage, 1-hour TTL, max 3 retries	Failed deliveries lost permanently
Confidence Scoring	v5 three-state model (base 85, three cap mechanisms)	Physicians can't gauge response reliability
Response Sanitizer	Strips de-ID placeholders, duplicate banners, unverified attributions	Internal processing artifacts leak to physicians
Dosing Drift Prevention	Automated tests cross-reference prompt doses against drug DB	Prompts and validators silently disagree on safe limits
Correlation ID Propagation	Request-scoped IDs through async Celery pipeline (~90%+ log reconstruction)	Incident investigation impossible across workers
HIPAA Compliance	PHI stripped pre-LLM, no PHI retained, BAA-ready infrastructure	Regulatory exposure for your health system
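
As one illustration of a layer from the table, a tiered dosing check with alias resolution might look like the sketch below. Every drug name and limit is a fictional placeholder, and the tier rules (including treating an unknown drug as YELLOW) are assumptions rather than the validator's actual logic.

```python
# Fictional placeholders only: not real drugs, not clinical limits.
ALIASES = {"excillin": "examplecillin", "exc": "examplecillin"}
LIMITS_MG_PER_KG = {"examplecillin": {"usual_max": 50.0, "hard_max": 100.0}}


def check_dose(drug: str, mg_per_kg: float) -> str:
    """Resolve aliases, then map the proposed dose to a RED/YELLOW/GREEN tier."""
    name = drug.strip().lower()
    name = ALIASES.get(name, name)
    limits = LIMITS_MG_PER_KG.get(name)
    if limits is None:
        return "YELLOW"  # unknown drug: flag it rather than pass it silently
    if mg_per_kg > limits["hard_max"]:
        return "RED"
    if mg_per_kg > limits["usual_max"]:
        return "YELLOW"
    return "GREEN"
```

Keeping the limits in a plain data table, separate from any prompt, is what lets automated tests compare the two and catch drift.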

Engineering

8,930 tests. Zero critical findings.

Solo-founded and architected. Every line is understood by one person — that's an asset for due diligence, not a liability.

120,898

Lines of Python

570 files (258 source + 312 test). One physician-founder built this between PICU shifts. AI-accelerated development compressed build time and freed capacity for rigorous testing and safety architecture.

8,930

Automated tests · 2,149 safety-marked

312 test files. Every finding becomes a permanent regression test. CI runs real PostgreSQL 15 + Redis 7 containers before any code reaches production.

Zero

Critical findings

Across multiple rounds of iterative review. Every finding resolved and permanently regression-tested.

Why this is hard to replicate: 16 safety layers built by a board-certified pediatric intensivist who knows which drug doses kill children. The engineering is reproducible. The clinical judgment is not. Open architecture, documented API, full test coverage — the codebase is built for handoff, not dependency.

132
Drugs · 229 aliases

150
Validated pathways

55
Clinical calculators

100%
MEDCALC-BENCH accuracy

$0.03–0.15
Cost per consultation

FOR THE CTO IN THE ROOM

Architecture Deep-Dive

Technical detail for the funder's technical advisory board.

Multi-Provider Failover: Anthropic primary → OpenAI automatic via circuit breaker (opens after 5 failures). LLMProvider ABC ensures consistent interface. Automatic return on recovery.
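
A circuit breaker with this behavior can be sketched in a few lines. The threshold of 5 failures is from the description above; the class and function names, the 30-second cool-down, and the injectable clock are assumptions, since the page does not state them.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_s = recovery_s  # hypothetical cool-down, not from the page
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow_primary(self) -> bool:
        if self.opened_at is None:
            return True
        # Once the cool-down has elapsed, the primary is tried again.
        return self.clock() - self.opened_at >= self.recovery_s

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def call_with_failover(breaker, primary, fallback, prompt):
    """Use the primary provider unless the breaker is open or the call fails."""
    if breaker.allow_primary():
        try:
            result = primary(prompt)
            breaker.record_success()
            return result
        except Exception:
            breaker.record_failure()
    return fallback(prompt)
```

Here `primary` and `fallback` stand in for any two callables sharing one interface, which is the role an abstract provider class would play.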

Cross-Provider Hallucination Check: When on OpenAI fallback, the reviewer switches to GPT-4o-mini. No model reviews its own output.

Async Pipeline: Celery-based, feature-flagged (default OFF), sync fallback. Webhook <1s. max_retries=0, acks_late, reject_on_worker_lost.

Timing Chain: Pipeline (85s) < Lock TTL (90s) ≤ Watchdog (90s) < Celery soft (100s) < Celery hard (120s) ≤ Gunicorn (120s). Cooperative budget via PipelineContext.
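
An ordering like this is easy to break by editing one timeout in isolation, so it can be worth asserting mechanically. The sketch below uses the values listed above; the names and the check itself are assumptions, and it enforces only that no inner timeout exceeds the next outer one.

```python
# Innermost timeout first, outermost last (values from the timing chain above).
TIMEOUTS_S = [
    ("pipeline", 85),
    ("lock_ttl", 90),
    ("watchdog", 90),
    ("celery_soft", 100),
    ("celery_hard", 120),
    ("gunicorn", 120),
]


def check_timing_chain(timeouts):
    """Raise if any timeout is larger than the one that should fire after it."""
    for (inner_name, inner), (outer_name, outer) in zip(timeouts, timeouts[1:]):
        if inner > outer:
            raise ValueError(
                f"{inner_name} ({inner}s) exceeds {outer_name} ({outer}s)"
            )
    return True
```

Run as a unit test, a check of this shape fails the build if someone raises the pipeline budget past the lock TTL.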

8 Thread-Safe Singletons: Double-checked locking in gevent context: DatabasePool, ClaudeClient, OpenAIClient, DeidentificationEngine, PresidioEngine, WhatsAppClient, WebhookSecurity, SMSClient.
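
For readers unfamiliar with the pattern, double-checked locking looks roughly like this in Python. `ExamplePool` is a hypothetical stand-in, not one of the classes named above, and the sketch uses plain `threading` rather than gevent.

```python
import threading


class ExamplePool:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get(cls):
        if cls._instance is None:          # first check, without the lock
            with cls._lock:
                if cls._instance is None:  # second check, under the lock
                    cls._instance = cls()
        return cls._instance
```

The first check keeps the common path lock-free; the second guarantees that two callers racing through the first check still construct only one instance.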

Three-Layer Duplicate Prevention: Redis lock (SET NX) + PostgreSQL MessageSid uniqueness + in-memory lock. Zero duplicate responses.

Encryption: AES-256-GCM delivery queue, HMAC-SHA256 phone hashing, constant-time comparison (hmac.compare_digest) for auth tokens.
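
Two of these primitives are available directly in the Python standard library. The sketch below shows keyed phone hashing and constant-time token comparison; the function names and the key handling are assumptions.

```python
import hashlib
import hmac


def hash_phone(phone_e164: str, secret_key: bytes) -> str:
    """Keyed HMAC-SHA256 of a phone number, returned as 64 hex characters."""
    return hmac.new(secret_key, phone_e164.encode("utf-8"), hashlib.sha256).hexdigest()


def token_matches(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking a timing side channel."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A keyed hash lets the system recognize a returning number without storing it, and unlike a bare SHA-256 it cannot be reversed by enumerating phone numbers unless the key is also known.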

CI/CD: GitHub Actions with real PostgreSQL 15 + Redis 7 containers. Full test suite + smoke test mandatory. mypy enforced (137 errors, threshold 150).

18
Alembic migrations (zero-downtime)

86
Pinned dependencies

7
Feature flags (all default OFF)

50
Type-annotated modules (mypy enforced)

THE FOUNDER

Benson Hsu

Pediatric Intensivist · Founder, Celeritas Health

MD, MBA, FAAP, FCCM

Princeton AB · Duke MBA · Fuqua Scholar · Harvard Kennedy · Ex-McKinsey · Board Certified: Peds, PCCM

75+ Publications & 10+ national practice guidelines
NEJM Catalyst · Harvard Business Review · Surviving Sepsis Campaign

ACCM · Former Board of Regents
Past AAP Section Chair · SCCM Executive Committee · SSC Children's Guidelines Panel

$6B+ · Health system VP · 70-member team
Enterprise Data & Analytics · $8M+ budget · Sanford Health

20 years · Clinical medicine, health systems, and technology
PICU attending · health system executive · McKinsey advisor · founder

Bush Fellow · Aspen Health Innovators · Presidential Leadership Scholar

Built by the specialist on the other end of the phone

As a PICU attending in rural America, I've been the specialist that physicians call at 2 AM when a child is deteriorating and they need help now. I know what those calls sound like — the urgency, the limited resources, the clinician doing their best with what they have. I've taken thousands of those calls. CodeBluePediatrics is what I wish every one of those physicians had in their pocket.

I didn't build this as a side project. I built it because I've spent my career at the intersection of clinical medicine, health system operations, and technology, and this is the problem that career prepared me to solve.

Clinical Depth

I manage ventilators, vasopressors, and cardiac arrests in children. Every safety layer in this system exists because I've seen what happens when clinical information is wrong.

Operational Scale

Former McKinsey. Former VP leading data infrastructure across dozens of hospitals. I know how health systems buy, implement, and scale technology — because I've been the buyer.

We're not replacing specialists. We're extending their reach.

The Opportunity

Free for clinicians. Built to scale.

CodeBluePediatrics is free for every clinician. Physicians, APPs, anyone making pediatric clinical decisions, in any setting: rural EDs, tribal health facilities, FQHCs, international clinics. Funded by health system partnerships, not clinician fees.

For health systems: Layer your institutional protocols on top of CodeBluePeds. Your clinicians get answers that follow your guidelines, not generic ones. Revenue comes from these enterprise partnerships — protocol customization, analytics, and integration.

Two markets, one platform: HIPAA-compliant clinical consultation for U.S. clinicians, and educational reference content internationally via WhatsApp across 195 countries.

Pediatrics is where we start. The architecture extends to any specialty, any setting, anywhere a clinician needs expert guidance and doesn't have it.

Why This Is Defensible

150 clinician-authored pathways: A curated pediatric knowledge base, not a generic model. Each pathway validated by a board-certified intensivist.

16 safety layers require clinical judgment to design: Knowing which drug doses kill children isn't in any training corpus. The architecture encodes domain expertise competitors can't replicate with engineering alone.

SMS/WhatsApp creates a usage flywheel: Every question improves routing accuracy. Zero-friction distribution means adoption compounds without sales teams.

First mover in pediatric-specific clinical AI via SMS: No competitor offers subspecialty-routed, safety-validated pediatric guidance via text message. The interface is the moat.

Let's talk about what we're building

Celeritas Health is raising its angel round. Clinical AI safety, pediatric medicine, global health equity — we should connect.

$150K SAFE · $4M cap

Cap reflects 120K+ lines of production code, 16 safety layers, live infrastructure, and a founder with 20 years at the intersection of clinical medicine, health systems, and technology.

$150K gets us to: 200+ drug database (currently 132), HIPAA BAA-compliant infrastructure, and first health system revenue.

Pre-revenue. 8,930 automated tests. Zero critical findings. Live system. Q2 2026: first health system pilot launch.

benson.hsu@celeritashealth.com

Are you a clinician?

CodeBluePediatrics is free. No app, no login. Text a clinical question and get an answer in about 15 seconds. Reach out for the number to try it.

Request SMS Access

For informational and educational purposes only. Not a substitute for clinical judgment. Not FDA-cleared.