About UltrasoundAI
An MVP that uses Claude Opus 4.6 vision to screen acute musculoskeletal injuries in athletes from ultrasound imaging.
⚠️ Not a medical device. No medical diagnosis. Output is screening-only and must be confirmed by a qualified clinician.
Scope
Focused on common acute sports injuries:
- Muscle: strain, hematoma, partial tear, complete rupture, tennis leg
- Tendon: tendinitis, partial tear, complete rupture (Achilles, patellar, biceps, etc.)
- Bone: cortical step-off, buckle fracture, avulsion, periosteal hematoma, traumatic effusion, dislocation / subluxation
- Soft tissue: acute hematoma, fluid tracking along fascia
Out of scope: chronic degenerative disease, tumors, neuromuscular disease (flagged as abnormal but not graded), intra-articular detail.
Stack
- Frontend: static HTML, bilingual, drag-drop upload, client-side video frame extraction
- Backend: Cloudflare Worker (330+ edge POPs, zero cold start)
- Model: Claude Opus 4.6 (vision) + adaptive thinking + structured JSON output
- Translation: two-pass — EN rationale first, then a dedicated translation call produces fluent medical Chinese
- Code: GitHub cgyagenticloud/ultrasoundai (private)
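The structured JSON output mentioned above can be sketched as a small response validator. The field names and allowed values below (`triage`, `confidence`, `findings`, `rationale`) are illustrative assumptions, not the project's actual schema:

```python
# Sketch: sanity-checking the model's structured JSON triage output.
# Field names and allowed values are assumptions, not the real schema.
import json

ALLOWED_TRIAGE = {"green", "yellow", "red"}

def parse_triage_response(raw: str) -> dict:
    """Parse and sanity-check a structured triage response string."""
    result = json.loads(raw)
    if result.get("triage") not in ALLOWED_TRIAGE:
        raise ValueError(f"unexpected triage level: {result.get('triage')!r}")
    conf = result.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        raise ValueError("confidence must be a number in [0, 1]")
    if not isinstance(result.get("findings"), list):
        raise ValueError("findings must be a list")
    return result

example = ('{"triage": "yellow", "confidence": 0.62, '
           '"findings": ["hypoechoic gap in distal Achilles"], '
           '"rationale": "partial-thickness tear suspected"}')
```

Validating before rendering keeps a malformed or truncated model reply from silently reaching the triage UI.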
Evaluation data sources (open or academic-use licenses)
| Dataset | Content | Samples | License |
| --- | --- | --- | --- |
| Zenodo 2598553 (Baxter) | Healthy gastrocnemius videos | 60 clips | CC-BY-4.0 |
| Mendeley 3jykz7wz8d (Marzola) | Healthy vs. pathological BB/GM/TA | 4k+ | CC-BY-4.0 |
| EBI BioStudies S-BIAD1482 (AnkleImage) | Multi-subject healthy ankle | ~20 | Open |
| Figshare 26889334 (AHU) | 1833 heterogeneous cases (Douyin-sourced; sampled for robustness testing) | 20 | Academic |
| PMC 3060433 + 15 more articles | Acute muscle / tendon / bone pictorial reviews | 114 | CC-BY / PMC Open |
Methodology (anti-bias rules)
We debugged three severe failure modes during development:
- Confirmation bias — early versions escalated triage to RED when the user described a dramatic injury history (e.g. "heard a pop on landing from a jump"), even when imaging was normal. Now strictly enforced: imaging is the sole source of truth; patient history can never escalate the triage level.
- Unevaluable-input misclassification — non-ultrasound inputs (screenshots, documents) must return YELLOW with low confidence, never GREEN.
- Over-conservatism — early versions flagged every imperfect image as YELLOW, drowning clinicians in false positives. Fixed with an explicit "green-on-normal-anatomy" rule.
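The three rules above can be expressed as a post-processing guard applied after the model's image read. The function signature and flags here are a sketch under assumed names, not the deployed logic:

```python
# Sketch of the anti-bias guard rules (assumed names, not the deployed code).
def apply_guards(image_triage: str, is_ultrasound: bool,
                 normal_anatomy: bool, findings: list) -> str:
    """Final triage level after the anti-bias rules."""
    # Unevaluable input: non-ultrasound images return YELLOW, never GREEN.
    if not is_ultrasound:
        return "yellow"
    # Green-on-normal-anatomy: normal anatomy with no findings must be GREEN,
    # preventing the over-conservative "everything imperfect is YELLOW" flood.
    if normal_anatomy and not findings:
        return "green"
    # Confirmation bias is blocked structurally: patient history is never an
    # input to this function, so the image-based call always stands.
    return image_triage
```

Keeping history out of the guard's inputs entirely, rather than weighting it down, is what makes the "imaging is the sole source of truth" rule enforceable.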
Results
Evaluated on a set of real clinical images from PMC open-access articles:
- Sensitivity (recall): ~96% — 27/28 true acute injuries correctly flagged
- Specificity: ~82% — 23/28 healthy samples correctly cleared
- Accuracy: ~89% (50/56)
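The metrics follow directly from the confusion counts above (27/28 injuries flagged, 23/28 healthy cleared; 27/28 ≈ 96%, 23/28 ≈ 82%, 50/56 ≈ 89%):

```python
# Screening metrics from confusion counts (standard definitions).
def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and accuracy from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),          # injuries correctly flagged
        "specificity": tn / (tn + fp),          # healthy correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Counts reported for the PMC-derived evaluation set.
m = screening_metrics(tp=27, fn=1, tn=23, fp=5)
```

For a screening tool, sensitivity is the headline number: a missed injury (false negative) is costlier than an unnecessary referral.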
Full iteration history and per-sample breakdown on the results page.
Limitations
- Not FDA / CE / NMPA cleared; no regulatory approval
- Does not replace a sonographer or sports medicine physician
- Single-frame analysis; does not evaluate dynamic / temporal scanning
- Training distribution skews North American / European adult; pediatric and extreme body habitus may be inaccurate
- Claude may occasionally fabricate findings on very low-quality inputs; treat confidence < 0.3 as "human review required"
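The last limitation can be enforced mechanically. The 0.3 threshold comes from the text above; the result shape is an assumption:

```python
# Route triage results below the confidence threshold to a human-review queue.
REVIEW_THRESHOLD = 0.3  # per the limitation above: confidence < 0.3 -> review

def route_results(results: list) -> tuple:
    """Split results into (auto-deliverable, human-review) queues."""
    auto, review = [], []
    for r in results:
        (review if r["confidence"] < REVIEW_THRESHOLD else auto).append(r)
    return auto, review
```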
Disclaimer
Released as a screening aid and educational tool. Any pain, swelling, or functional impairment must be evaluated by a qualified clinician. Output may not be used as the sole basis for clinical decisions. The authors and Anthropic assume no liability for clinical decisions made using this tool.