Jul 2024 · 14 min · Building

athletic data protocol

you save workouts from ten different creators. none of them talk to each other. a unified format that makes any workout comparable and trackable.

the origin

it started with chris heria's weighted muscle-up video.

i've been training seriously for over a decade. not casually—seriously. the kind where you track progressive overload in spreadsheets, obsess over periodization, and know the difference between a romanian deadlift and a stiff-leg deadlift at 5:30am when your brain hasn't fully woken up.

tuesday night, 11pm. bookmarking heria's new weighted calisthenics video. perfect for tomorrow.

next morning, i opened my training app to plan the week.

and i hit the wall i've hit a hundred times before.

the heria bookmark lived on youtube. my training log lived in a spreadsheet. my mobility routine was saved on instagram—somewhere in a folder i'd named "flexibility stuff" six months ago. the stretching sequence from yoga body was buried in a different app. and that HIIT finisher i'd saved on tiktok? gone. lost in an endless scroll i'd never reconstruct.

four apps. three browser tabs. one spreadsheet. all containing workouts i genuinely wanted to do. none of them talking to each other.

but here's what really broke me: even if i did manage to piece together a plan, i still wouldn't know where i actually stand.

heria's video showed weighted muscle-ups with a 20kg vest. i can't do a single muscle-up—weighted or not. so what's my path from here to there?

  • which progressions have i already tried?
  • where did i plateau?
  • what worked before?

that data exists somewhere across three different apps. good luck finding it when you need it.

the irony: i've accumulated hundreds of bookmarked workouts across a dozen platforms—freeletics for bodyweight, b42 for football conditioning, youtube for strength programs, instagram for mobility flows. thousands of hours of fitness content, saved with the best intentions.

and i can't build a single coherent training plan from any of it.

the bigger picture

that frustration sent me down a rabbit hole. and what i found was worse than i expected.

the numbers:

  • 87.6% of people watch health-related content on youtube
  • billions of workout videos across youtube, tiktok, instagram
  • 11,000+ exercises cataloged in exercisedb alone
  • zero standardized way to connect any of it

every workout exists in isolation. every training app exists in isolation. the knowledge is there. it's just trapped.

| platform | data format | interoperability |
| --- | --- | --- |
| youtube | video (unstructured) | none |
| instagram | video/image (unstructured) | none |
| freeletics | proprietary | none |
| b42 | proprietary | none |
| apple health | healthkit (siloed) | iOS only |
| google fit | google fit API (siloed) | android only |

google fit and apple health were supposed to solve this. they didn't. they're not interoperable with each other, let alone with the billions of workout videos that exist outside their ecosystems.

research published in the journal of medical internet research confirms what every serious athlete already knows: "lack of interoperability and the presence of data silos prevent users and health professionals from getting an integrated view of health and fitness data."

the infrastructure doesn't exist.

the insight

traditional approach to standardization: ask every platform and content creator to adopt common formats.

this coordination problem has proven intractable. it's why we still don't have fitness data interoperability despite decades of trying.

athletic data protocol inverts the model.

instead of expecting the world to structure its data, i'm building AI systems capable of extracting structure from any existing content format. video, image, text, audio—regardless of source, the output maps to a unified schema.

not waiting for the world to change. building intelligence that adapts.

the science

before building, i needed to understand how exercise knowledge is formally structured. and why existing approaches fail.

exercise as structured information

an exercise isn't just a name. it's a complex information structure:

| dimension | description | example values |
| --- | --- | --- |
| movement pattern | fundamental biomechanical category | push, pull, hinge, squat, lunge, carry, rotation |
| primary muscles | target muscle groups | quadriceps, hamstrings, pectoralis major |
| equipment | required apparatus | barbell, dumbbell, bodyweight, cable, machine |
| plane of motion | anatomical movement plane | sagittal, frontal, transverse |
| loading parameters | intensity and volume | weight, sets, reps, time under tension |
| tempo | eccentric/concentric timing | 3-1-2-0 (3s down, 1s pause, 2s up, 0s top) |

traditional fitness content captures only a fraction of this explicitly. a youtube video title might say "chest day workout" while the actual content demonstrates specific exercises with particular form cues and rep schemes.

the system must infer the full semantic structure from partial, implicit, and multimodal signals.

existing ontologies

researchers have tried to formalize exercise knowledge:

physical activity ontology (PACO) — 268 concepts organized into daily living activity and exercise/leisure activity hierarchies.

exercise medicine ontology (EXMO) — published december 2024. 434 classes and 9,732 axioms. first core reference ontology specifically for exercise prescription.

exercisedb — 11,000+ exercises with structured metadata. most extensive practical catalog but lacks formal ontological structure.

a systematic review of physical activity ontologies evaluated 28 ontologies against 12 quality criteria. average score: 4.23 out of 12. no ontology met all criteria.

the gap between ontological theory and practical completeness defines the challenge: build on existing frameworks while extending them to cover real-world fitness content diversity.

the multimodal challenge

exercise understanding from video requires fusing multiple information streams:

visual stream — pose estimation extracts body landmark positions over time. mediapipe provides 33 pose landmarks at real-time speeds. joint angle calculation. movement trajectory analysis.

audio stream — verbal cues ("squeeze at the top," "three more reps"), counting, music tempo, breathing patterns.

textual stream — titles, descriptions, captions, on-screen text.

the research challenge: fusion. how to weight and combine these streams when they provide complementary, redundant, or contradictory information.
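to make the fusion question concrete, here's a minimal sketch of one possible strategy: confidence-weighted voting across streams, with a heavier prior on the visual stream. the function name, weights, and example values are illustrative assumptions, not the system's actual fusion logic.

```python
# hypothetical sketch: confidence-weighted fusion of per-stream exercise labels.
# stream names, weights, and the normalization are illustrative assumptions.
from collections import defaultdict

def fuse_streams(predictions: dict[str, tuple[str, float]],
                 stream_weights: dict[str, float]) -> tuple[str, float]:
    """predictions: stream -> (exercise_label, confidence in [0, 1])."""
    scores: dict[str, float] = defaultdict(float)
    total = 0.0
    for stream, (label, confidence) in predictions.items():
        weight = stream_weights.get(stream, 1.0)
        scores[label] += weight * confidence
        total += weight
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label] / total  # normalized agreement score

# example: visual evidence outweighs a mislabeled title
fused = fuse_streams(
    {"visual": ("tricep extension", 0.92),
     "text": ("bicep curl", 0.60),
     "audio": ("tricep extension", 0.55)},
    stream_weights={"visual": 2.0, "text": 1.0, "audio": 1.0},
)
print(fused)  # -> ('tricep extension', 0.5975)
```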

the hard problems

building a system that extracts structured workout data from any format. these are the research questions:

1. multimodal fusion — how do you optimally combine visual pose data, audio transcription, and textual metadata? a video titled "best bicep exercises" might show tricep exercises due to creator error. the visual evidence should override the textual label. but the system must learn when and how to make such judgments.

2. exercise disambiguation — romanian deadlift vs. stiff-leg deadlift. bent-over row vs. pendlay row. high bar vs. low bar squat. similar movement patterns, subtle differences. joint angle thresholds may not generalize across body types, camera angles, video quality.

3. structured extraction reliability — LLMs can produce structured JSON output, but reliability varies. even 1% error rates compound into serious data quality issues at scale.

4. exercise ontology — the fitness domain lacks a universally adopted ontology. PACO contains 268 concepts. EXMO comprises 434 classes. exercisedb catalogs 11,000+ exercises. these resources overlap incompletely and use different classification principles.

5. terminology chaos — the same exercise may be called "romanian deadlift," "RDL," "stiff-leg deadlift," or "straight-leg deadlift" across different sources. some distinctions reflect genuine biomechanical differences. others are synonyms.

these are open questions. the optimal solutions aren't known. that's what makes this research.

the architecture

five-stage pipeline. arbitrary fitness content → structured JSON.

  1. ingestion — platform integration, frame extraction, OCR
  2. multimodal analysis — pose estimation, action recognition, rep counting
  3. temporal segmentation — exercise boundaries, set identification, workout phases
  4. LLM extraction — structured output with schema enforcement and confidence scoring
  5. validation & enrichment — database cross-referencing, consistency checks

input: video URLs, screenshots, text descriptions, audio files. output: standardized ADP JSON schema accessible via REST API.
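as a sketch of how those stages compose, here's a skeleton where each stage is passed in as a callable; the stage functions themselves are hypothetical placeholders, not the actual ADP implementation.

```python
# illustrative skeleton of the five-stage pipeline; the stage callables are
# hypothetical placeholders, not the actual ADP implementation.
from typing import Any, Callable

Stage = Callable[[Any], Any]

def build_pipeline(ingest: Stage, analyze: Stage, segment: Stage,
                   extract: Stage, validate: Stage) -> Stage:
    """compose the five stages into a single callable: source -> ADP JSON dict."""
    def run(source: Any) -> Any:
        media = ingest(source)        # 1. platform integration, frame extraction, OCR
        signals = analyze(media)      # 2. pose estimation, action recognition, rep counting
        segments = segment(signals)   # 3. exercise boundaries, sets, workout phases
        draft = extract(segments)     # 4. schema-constrained LLM extraction + confidence
        return validate(draft)        # 5. database cross-referencing, consistency checks
    return run
```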

the schema

the core output captures complete semantics of athletic training: session metadata (source platform, duration, phases), exercise properties (movement patterns, target muscles, equipment), performance parameters (sets, reps, tempo, rest intervals).

each extraction includes confidence scores. the system is honest about uncertainty.
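to make that concrete, here's a rough pydantic sketch of what the output could look like. field names, enums, and the example values are my illustrative assumptions, not the final ADP schema.

```python
# hypothetical sketch of the ADP output schema using pydantic v2; field names and
# example values are assumptions for illustration, not the published schema.
from pydantic import BaseModel, Field

class ExerciseEntry(BaseModel):
    name: str                          # canonical exercise name, e.g. "romanian deadlift"
    movement_pattern: str              # push, pull, hinge, squat, lunge, carry, rotation
    primary_muscles: list[str]
    equipment: list[str]
    sets: int | None = None
    reps: int | None = None
    tempo: str | None = None           # e.g. "3-1-2-0"
    rest_seconds: int | None = None
    confidence: dict[str, float] = Field(default_factory=dict)  # per-field confidence

class WorkoutSession(BaseModel):
    source_platform: str               # youtube, instagram, tiktok, ...
    source_url: str | None = None
    duration_seconds: int | None = None
    phases: list[str] = Field(default_factory=list)  # warm-up, main, finisher, cooldown
    exercises: list[ExerciseEntry]

session = WorkoutSession(
    source_platform="youtube",
    exercises=[ExerciseEntry(
        name="weighted muscle-up", movement_pattern="pull",
        primary_muscles=["latissimus dorsi", "triceps"], equipment=["weight vest"],
        sets=5, reps=3, confidence={"name": 0.93, "reps": 0.71},
    )],
)
print(session.model_dump_json(indent=2))
```

the confidence map sits alongside every field, so downstream consumers can decide what to trust.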

computer vision pipeline

mediapipe pose estimation — real-time, 33 body landmarks from video frames. joint angles, movement trajectories.
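a minimal sketch of that visual stream, using the classic mediapipe solutions API and OpenCV to compute one joint angle (left knee) per frame; the real pipeline tracks many more landmarks and full trajectories.

```python
# minimal sketch: mediapipe pose landmarks + a knee joint angle per frame.
# error handling and frame sampling are omitted for brevity.
import cv2
import numpy as np
import mediapipe as mp

mp_pose = mp.solutions.pose

def joint_angle(a, b, c) -> float:
    """angle at point b (degrees) formed by points a-b-c, each as (x, y)."""
    a, b, c = np.array(a), np.array(b), np.array(c)
    cosine = np.dot(a - b, c - b) / (np.linalg.norm(a - b) * np.linalg.norm(c - b) + 1e-9)
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))

def knee_angles(video_path: str) -> list[float]:
    angles = []
    cap = cv2.VideoCapture(video_path)
    with mp_pose.Pose(static_image_mode=False) as pose:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if not results.pose_landmarks:
                continue
            lm = results.pose_landmarks.landmark
            hip = (lm[mp_pose.PoseLandmark.LEFT_HIP].x, lm[mp_pose.PoseLandmark.LEFT_HIP].y)
            knee = (lm[mp_pose.PoseLandmark.LEFT_KNEE].x, lm[mp_pose.PoseLandmark.LEFT_KNEE].y)
            ankle = (lm[mp_pose.PoseLandmark.LEFT_ANKLE].x, lm[mp_pose.PoseLandmark.LEFT_ANKLE].y)
            angles.append(joint_angle(hip, knee, ankle))
    cap.release()
    return angles
```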

exercise classification via CNN — a convolutional neural network maps joint coordinates and angles to learned movement patterns. ensemble learning combines predictions across multiple frames.
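the ensemble step can be as simple as soft-voting across per-frame probabilities; a sketch, assuming the per-frame CNN outputs one softmax vector per frame:

```python
# illustrative soft-voting ensemble: average per-frame class probabilities into a
# single video-level label; the CNN itself and the label set are not shown here.
import numpy as np

def ensemble_classify(frame_probs: np.ndarray, labels: list[str]) -> tuple[str, float]:
    """frame_probs: (num_frames, num_classes) softmax outputs from the per-frame CNN."""
    mean_probs = frame_probs.mean(axis=0)       # soft-vote across frames
    idx = int(mean_probs.argmax())
    return labels[idx], float(mean_probs[idx])  # label + aggregate confidence
```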

action recognition benchmarks:

  • WAVd (workout action video dataset): 95.81% accuracy
  • UCF101: 93.2% accuracy
  • youtube actions: 97.2% accuracy

but these benchmarks test general action recognition, not fine-grained fitness distinctions (romanian deadlift vs. stiff-leg deadlift). developing fitness-specific evaluation benchmarks is part of the research agenda.

rep counting and set detection — temporal analysis of joint angle trajectories. research demonstrates >90% accuracy using mediapipe for landmark detection with custom repetition logic.
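a toy version of that repetition logic: a hysteresis state machine over the joint-angle trajectory. the thresholds here are placeholder values and would differ per exercise.

```python
# illustrative rep counter: hysteresis state machine over a joint-angle trajectory.
# the low/high thresholds are placeholders, tuned per exercise in practice.
def count_reps(angles: list[float], low: float = 90.0, high: float = 160.0) -> int:
    reps, in_bottom = 0, False
    for angle in angles:
        if angle < low:                      # reached the bottom of the movement
            in_bottom = True
        elif angle > high and in_bottom:     # returned to the top -> one full rep
            reps += 1
            in_bottom = False
    return reps

# e.g. fed with the knee angles from the pose sketch above:
# count_reps(knee_angles("squat_set.mp4"))
```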

why claude

LLM selection for structured extraction significantly impacts reliability.

1. structured output reliability — claude's tool use functionality enables schema-constrained output generation. constrained decoding restricts token generation to valid JSON matching the schema.
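a trimmed-down sketch of schema-constrained extraction with the anthropic python SDK; the tool schema is a small subset of the ADP schema, and the prompt is a placeholder.

```python
# sketch of schema-constrained extraction via tool use; the tool schema is a
# trimmed-down illustration and the prompt content is a placeholder.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

workout_tool = {
    "name": "record_workout",
    "description": "Record the structured workout extracted from the content.",
    "input_schema": {
        "type": "object",
        "properties": {
            "exercises": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "sets": {"type": "integer"},
                        "reps": {"type": "integer"},
                        "equipment": {"type": "array", "items": {"type": "string"}},
                        "confidence": {"type": "number"},
                    },
                    "required": ["name", "confidence"],
                },
            }
        },
        "required": ["exercises"],
    },
}

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=2048,
    tools=[workout_tool],
    tool_choice={"type": "tool", "name": "record_workout"},  # force schema-shaped output
    messages=[{"role": "user", "content": "transcript, OCR text, and pose summary go here..."}],
)
workout = next(b.input for b in response.content if b.type == "tool_use")
```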

2. constitutional AI — for a system processing user-generated content at scale, safety matters. claude's training embeds safety constraints at the model level.

3. multi-model orchestration:

| task | model | why |
| --- | --- | --- |
| primary content analysis | sonnet 4.5 | best balance of capability and cost |
| high-volume metadata extraction | haiku 4.5 | 4-5x faster, 1/3 cost |
| ambiguous cases / quality review | opus 4.5 | maximum capability for edge cases |
| real-time API responses | haiku 4.5 | low latency (<500ms) |

anthropic's insight: "sonnet 4.5 can break down a complex problem into multi-step plans, then orchestrate a team of multiple haiku 4.5s to complete subtasks in parallel."
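in practice that orchestration reduces to a routing decision per task; a sketch, where the model ids follow anthropic's published aliases and the low-confidence escalation rule is my own assumption:

```python
# illustrative routing table; task categories mirror the table above, and the
# escalation threshold is an assumption, not a documented ADP rule.
MODEL_ROUTES = {
    "content_analysis": "claude-sonnet-4-5",      # primary multimodal content analysis
    "metadata_extraction": "claude-haiku-4-5",    # high-volume, low-cost extraction
    "quality_review": "claude-opus-4-5",          # ambiguous or low-confidence cases
    "realtime_api": "claude-haiku-4-5",           # latency-sensitive responses
}

def pick_model(task: str, confidence: float | None = None) -> str:
    # escalate low-confidence extractions to the most capable model
    if confidence is not None and confidence < 0.7:
        return MODEL_ROUTES["quality_review"]
    return MODEL_ROUTES.get(task, MODEL_ROUTES["content_analysis"])
```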

the stack

| component | technology | why |
| --- | --- | --- |
| orchestration | LangChain | modular pipeline, native claude integration |
| pose estimation | mediapipe | real-time, 33 landmarks, cross-platform |
| action classification | custom CNN + ensemble | fine-tuned for fitness domain |
| video processing | FFmpeg + OpenCV | frame extraction, preprocessing |
| audio transcription | whisper | state-of-the-art accuracy |
| vector database | pinecone | exercise embedding similarity search |
| schema validation | JSON schema + pydantic | runtime type checking |
| API layer | fastAPI + openAPI | standards-compliant REST interface |

exercise ontology mapping

multi-stage resolution for mapping extracted names to canonical identifiers:

| stage | process | fallback |
| --- | --- | --- |
| 1. exact match | normalized dictionary of 11,000+ exercises | → stage 2 |
| 2. synonym resolution | learned mappings ("RDL" → "romanian deadlift") | → stage 3 |
| 3. semantic similarity | embedding similarity against exercise database | → stage 4 |
| 4. LLM classification | claude with exercise description and visual features | → stage 5 |
| 5. human review | confidence < 0.7 flagged for manual review | feedback loop |

this feedback loop continuously improves the synonym dictionary and embedding model.
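a condensed sketch of that cascade; the dictionaries, embedding lookup, and LLM call are stubbed out as parameters, and all thresholds except the 0.7 review cutoff are placeholder assumptions.

```python
# sketch of the multi-stage resolution cascade; lookups and the LLM call are
# passed in as callables, and the 0.85 threshold is a placeholder assumption.
def resolve_exercise(raw_name: str, exact: dict, synonyms: dict,
                     embed_lookup, llm_classify) -> tuple[str | None, float, str]:
    """returns (canonical_name, confidence, stage); None means route to human review."""
    key = raw_name.strip().lower()
    if key in exact:                                   # 1. exact match
        return exact[key], 1.0, "exact"
    if key in synonyms:                                # 2. learned synonym mapping
        return synonyms[key], 0.95, "synonym"
    candidate, score = embed_lookup(key)               # 3. embedding similarity
    if score >= 0.85:
        return candidate, score, "semantic"
    candidate, score = llm_classify(raw_name)          # 4. LLM classification
    if score >= 0.7:
        return candidate, score, "llm"
    return None, score, "human_review"                 # 5. flag for manual review
```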

where it stands

iterating in public. the system is partially built. the open challenges below define the research agenda.

what's working

  • multimodal extraction pipeline: video → pose → classification → structured output
  • exercise ontology mapping: five-stage resolution with fallbacks
  • preliminary test set: 87% extraction accuracy on 500-video evaluation
  • API design: REST interface with confidence scores

what's hard

1. extraction accuracy — 87% means 13% of extracted data contains errors. error distribution:

  • novel exercises not in training data: 15%
  • ambiguous visual quality: 25%
  • conflicting modality signals: 20%
  • schema edge cases: 40%

2. real-time latency — target: <5 seconds for 60-second video. current:

  • video download and preprocessing: ~2s
  • pose estimation (all frames): ~3s
  • LLM extraction: ~2s
  • total: ~7s (exceeds target)

investigating: selective frame sampling, streaming pose estimation, parallel processing, model distillation.

3. exercise disambiguation — joint angle thresholds derived from biomechanical literature may not generalize across body types, camera angles, video quality.

4. single-person assumption — current pose estimation assumes one person per video. group fitness classes, partner exercises, crowded gym footage: unsolved.

5. language — english only. german and spanish have rich fitness traditions with terminology that doesn't map directly.

the numbers

current accuracy: 87% field-level on core fields (exercise name, sets, reps, equipment).

target: 95%+. the gap requires better context models and more training data.

introducing BLOCK

athletic data protocol isn't just a research project. it's the foundation for BLOCK—a consumer app that unifies the fragmented fitness content landscape.

BLOCK transforms any workout content into your personal training library:

  • import from anywhere — youtube videos, instagram reels, tiktok workouts, screenshots, text descriptions
  • universal workout library — every exercise you've ever done or want to do, structured, searchable, organized
  • cross-platform sync — connect your freeletics history with gym sessions, your b42 football training with youtube follow-alongs
  • smart recommendations — discover workouts that match your goals, equipment, training history

BLOCK is the consumer-facing implementation of athletic data protocol. bringing interoperability to everyday athletes.

stay tuned for early access.

limitations

technical

1. accuracy ceiling — 87% means 13% of estimates contain errors. for casual tracking, probably fine. for professional athletes or injury rehab—potentially problematic.

2. language scope — english only. german, spanish, and other languages need culturally specific training.

3. single-person limitation — group fitness, partner exercises, crowded gyms remain unsolved.

4. video quality dependency — performance degrades with poor lighting, unusual camera angles, low resolution.

5. no clinical validation — not validated for medical or rehabilitation use. don't use for injury recovery without professional oversight.

ethical

platform terms — extracting data from social media raises legal questions. the system uses official APIs where available, respects rate limits, processes only publicly available content, provides opt-out for creators.

intellectual property — workout content may be protected by copyright. the system extracts factual information (exercises, sets, reps) without reproducing creative expression.

data privacy — workout history reveals sensitive info. training patterns → health conditions. exercise timing → schedules. this data needs protection.

what's next

now

  • latency optimization — selective frame sampling, parallel processing, model distillation
  • accuracy improvement — active learning pipeline for edge cases
  • multi-language — german and spanish first

later

  • public API — tiered pricing for fitness apps, training platforms
  • platform integrations — youtube, instagram, tiktok for seamless import
  • mobile SDK — on-device extraction for privacy-sensitive use cases

vision

  • real-time video analysis — extract exercises as videos play
  • wearable integration — connect extracted workouts to heart rate, calories, recovery metrics
  • progression tracking — automatic detection of strength gains, plateau patterns
  • AI coaching — personalized recommendations based on training history and goals

the point

fitness data fragmentation isn't a technical limitation. it's an architecture failure.

we have:

  • billions of workout videos
  • comprehensive exercise databases
  • robust pose estimation technology

they just don't talk to each other.

athletic data protocol builds intelligence that adapts to content as it exists. not waiting for the world to standardize. a system that understands workouts through multimodal analysis—video → pose → classification → structured data.

current state: 87% accuracy. clear path to 95%+.

research ahead: multimodal fusion. exercise disambiguation. real-time processing. multi-language support.

the impact goes beyond personal convenience. a universal layer connecting workout content to structured data enables:

  • AI coaches that learn from any content source
  • training logs that seamlessly aggregate across platforms
  • analytics that span complete fitness journeys

for anyone who's ever tried to build a coherent training plan from bookmarked videos across five different apps—this should have existed ten years ago.

i'm building it now.

references

  1. Tang, Y. et al. (2025). Video Understanding with Large Language Models: A Survey. IEEE Transactions on Circuits and Systems for Video Technology.
  2. PMC11124794. (2024). Workout Classification Using a Convolutional Neural Network in Ensemble Learning. Applied Sciences.
  3. Jin, Q. et al. (2019). Developing a Physical Activity Ontology to Support the Interoperability of Physical Activity Data. JMIR.
  4. Tian, J. et al. (2024). Core reference ontology for individualized exercise prescription. Scientific Data.
  5. IJBNPA. (2023). Content and quality of physical activity ontologies: a systematic review. International Journal of Behavioral Nutrition and Physical Activity.
  6. ExerciseDB. (2025). ExerciseDB API Documentation. https://github.com/ExerciseDB/exercisedb-api
  7. Google. (2024). MediaPipe Pose Estimation. https://developers.google.com/mediapipe
  8. Anthropic. (2025). Claude Sonnet 4.5 Model Card. San Francisco: Anthropic.
  9. Anthropic. (2025). Introducing Claude Haiku 4.5. https://www.anthropic.com/news/claude-haiku-4-5
  10. Bai, Y. et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073.

last updated: Dec 2025