
BioCoach uses AI and biomechanics to give real-time exercise feedback at home

by mcsvtln@gmail.com

Jun 6, 2026


A squat can look simple until it starts going wrong. Knees drift, backs round, shoulders tighten, and without someone watching closely, small mistakes can pile up into pain or injury. That problem became harder to ignore during the pandemic, when many people moved their workouts into living rooms and garages. The U.S. Consumer Product Safety Commission reported a 48% rise in injuries related to at-home exercise.

Now a research team from Drexel University and Michigan State University says it has built an artificial intelligence system designed to do more than count reps or cheer from the screen. Their prototype, called BioCoach, analyzes exercise video in real time and delivers feedback aimed at form, timing, and body mechanics, with explanations for why the advice matters.

“Many people who exercise at home with videos and apps don’t get high-quality assessment of their movements,” said Feng Liu, PhD, an assistant professor in Drexel’s College of Engineering and Computing, who led the research. “Feedback is often too generic or simply encouragement but no actual form coaching. Our goal with BioCoach is to provide timely, specific cues grounded in body motion, closer to the kind of guidance a knowledgeable coach would give.”

The work was published ahead of the team’s presentation at the Conference on Computer Vision and Pattern Recognition in June, hosted by the Institute of Electrical and Electronics Engineers and the Computer Vision Foundation.

Researchers from Drexel University and Michigan State University have demonstrated a program designed to use AI and computer vision to provide exercise form coaching in hopes of preventing injuries and improving outcomes. (CREDIT: Drexel University)

Teaching a machine to notice form

The main problem, according to the research, is not just whether an AI model can describe what it sees. It also has to know when to speak up, what body parts matter most in a given movement, and how to connect a visible mistake to the body mechanics underneath it.

Many existing systems rely mostly on appearance, the pixels in a video, rather than a structured understanding of how joints are moving. That can lead to vague advice or comments that arrive too late.

BioCoach was built to work differently. It uses two streams of information at once. One stream looks at visual appearance and motion patterns through a 3D convolutional neural network. The other estimates 3D skeletal movement and body shape, giving the system access to joint angles, ranges of motion, and the phases of an exercise.

That means the program is not just watching a squat or a push-up as a sequence of pictures. It is also trying to interpret how the body is moving through space, which joints are most important, and whether those joints are staying within expected limits.
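To make "joint angles" and "expected limits" concrete, here is a minimal Python sketch. It is an illustration only, not the BioCoach implementation. It assumes 3D keypoints are already available as (x, y, z) tuples; the function names, the coordinates, and the example limits are hypothetical.

```python
import math

def joint_angle_deg(a, b, c):
    """Interior angle at joint b, in degrees, formed by 3D points a-b-c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0:
        raise ValueError("degenerate joint: two keypoints coincide")
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def within_limits(angle, low, high):
    """True if the angle lies inside the allowed range (inclusive)."""
    return low <= angle <= high

# Hypothetical hip, knee, and ankle positions for a straight leg.
hip, knee, ankle = (0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)
straight_leg = joint_angle_deg(hip, knee, ankle)

# Hypothetical points forming a right angle at the middle joint.
right_angle = joint_angle_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))

# Assumed limits for illustration; not values from the paper.
knee_ok = within_limits(right_angle, 70.0, 180.0)
```

A real system would compute such angles per frame from the estimated skeleton and compare them with exercise-specific ranges.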

“Our goal was to build a system that does more than look at pixels and generate a generic comment,” Liu said. “BioCoach exposes the model to 3D motion, joint angles and exercise-specific constraints, so the feedback can point to a concrete movement issue and explain why it matters.”

Comparison with existing methods. Top: prior pixel-only VLM methods provide generic, loosely timed comments. Bottom: BioCoach fuses visual features with 3D skeletal kinematics and a biomechanics module to produce phase-aligned, anatomy-specific, quantitative cues (e.g., shoulder flexion 160°–170°), yielding more precise and biomechanics-grounded feedback along the same timeline. (CREDIT: arXiv)

From “lower more” to actual biomechanics

To train the system, the researchers started with the Qualcomm Exercise Video Dataset, or QEVD, a public benchmark containing hundreds of hours of exercise footage paired with time-stamped coaching comments.

But the original coaching notes were often brief and colloquial. A cue like “lower your body more” tells a person something is off, but not exactly what, or why.

So the team rebuilt part of the dataset in more technical language. They re-annotated videos with detailed biomechanical targets, such as “increase elbow flexion to 90 degrees at the bottom,” and paired them with short rationales, such as “increase hip/knee flexion to distribute load.”

In total, they added more than 2,400 notes to over 200 videos used to train and test BioCoach. Because the original time stamps were preserved, the team could evaluate not only whether the system gave the right kind of guidance, but also whether it gave it at the right moment.

The researchers called this updated benchmark QEVD-bio-fit-coach.

BioCoach overview. Streaming video is encoded by two backbones: a 3D CNN for visual tokens and a pose extractor for 3D skeletal kinematics. (CREDIT: arXiv)

Picking the joints that matter

One of BioCoach’s more practical ideas is that it does not treat every joint as equally important all the time.

A squat depends heavily on the hips, knees, and ankles. A push-up shifts attention to the shoulders, elbows, and wrists. A plank is different again, because the goal is often stability rather than repeated motion.

The system uses an exercise-specific attention mechanism to decide which joints deserve the most scrutiny. It then builds what the paper calls a structured biomechanical context, combining body measurements, movement cycles, and detected form problems into information that can guide the language model’s response.
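The idea of weighting joints by exercise can be sketched in a few lines of Python. This is a simplified stand-in: the article describes a learned attention mechanism, whereas the sketch below uses a hand-written table of relevance scores passed through a softmax. The joint groupings follow the examples above; the scores and function names are assumptions, not values from the paper.

```python
import math

# Hypothetical relevance scores per exercise (higher means more scrutiny).
RELEVANCE = {
    "squat": {"hip": 2.0, "knee": 2.0, "ankle": 1.0},
    "push_up": {"shoulder": 2.0, "elbow": 2.0, "wrist": 1.0},
}

def joint_weights(exercise):
    """Softmax over the relevance scores, so the weights sum to 1."""
    scores = RELEVANCE[exercise]
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {joint: math.exp(s - peak) for joint, s in scores.items()}
    total = sum(exps.values())
    return {joint: e / total for joint, e in exps.items()}

def top_joints(exercise, k=2):
    """The k joints with the largest weight for this exercise."""
    weights = joint_weights(exercise)
    return sorted(weights, key=weights.get, reverse=True)[:k]

squat_weights = joint_weights("squat")
squat_focus = top_joints("squat")
```

In this toy version the hips and knees outrank the ankles for a squat. A learned mechanism would derive such weights from data rather than from a fixed table.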

That structure is what allows the system to move from a generic correction to something more concrete. In the paper’s examples, a cue like “keep your back straight” becomes “Lumbar spine variance 12° (target: <5°). Maintain neutral spine alignment; engage the core to stabilize the lumbar region.”

The team argues that this makes the output more interpretable and easier to inspect, because the feedback is tied to explicit measurements rather than hidden inside a pattern-matching system.
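A measurement-backed cue like the lumbar spine example can be illustrated with a small Python sketch. The `Constraint` class and `make_cue` function are hypothetical names. The sketch assumes the measurement has already been computed upstream and only shows how a violated limit could be turned into text; it does not reproduce how BioCoach generates language.

```python
from dataclasses import dataclass

@dataclass
class Constraint:
    """A hypothetical upper limit on a measured quantity, in degrees."""
    name: str
    limit_deg: float
    advice: str

def make_cue(constraint, measured_deg):
    """Return a quantitative cue if the limit is exceeded, else None."""
    if measured_deg < constraint.limit_deg:
        return None
    return (
        f"{constraint.name} {measured_deg:.0f}° "
        f"(target: <{constraint.limit_deg:.0f}°). {constraint.advice}"
    )

lumbar = Constraint(
    name="Lumbar spine variance",
    limit_deg=5.0,
    advice="Maintain neutral spine alignment; engage the core to stabilize the lumbar region.",
)

cue = make_cue(lumbar, 12.0)    # limit exceeded, so a cue is produced
no_cue = make_cue(lumbar, 3.0)  # within target, so nothing is said
```

Because the number and the target both appear in the cue, a reader or a trainer can check the claim against the underlying measurement.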

Motion-Quality Context module. Given the selected joint set and the 3D skeletal kinematics, the module (a) detects repetition cycles and anchors the feedback moment; (b) time-normalizes each cycle and aligns it to a curated reference trajectory; and (c) evaluates biomechanical constraints. (CREDIT: arXiv)
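The steps named in the caption (cycle detection, time normalization, comparison with a reference) can be sketched roughly in Python. This is a simplified illustration under stated assumptions, not the paper's method: cycles are found by a fixed threshold on a single joint angle, time normalization is plain linear interpolation, and the deviation measure is a mean absolute difference. All names, thresholds, and angle values are hypothetical.

```python
def split_cycles(angles, standing=160.0):
    """Return (start, end) index pairs for spans where the angle is below `standing`."""
    cycles, start = [], None
    for i, angle in enumerate(angles):
        if angle < standing and start is None:
            start = i                      # rep begins
        elif angle >= standing and start is not None:
            cycles.append((start, i))      # rep ends (end index is exclusive)
            start = None
    return cycles                          # an unfinished final rep is ignored

def resample(values, n):
    """Linearly interpolate `values` onto n evenly spaced points (n >= 2)."""
    if n < 2 or not values:
        raise ValueError("need n >= 2 and a non-empty series")
    out = []
    for k in range(n):
        pos = k * (len(values) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out

def mean_abs_deviation(cycle, reference):
    """Average absolute difference between two equal-length trajectories."""
    return sum(abs(c - r) for c, r in zip(cycle, reference)) / len(reference)

# Hypothetical knee-angle series (degrees) covering two squat reps.
knee = [170, 150, 120, 90, 120, 150, 170, 170, 140, 100, 140, 170]
reference = [150, 120, 90, 120, 150]       # assumed reference trajectory

cycles = split_cycles(knee)
second = resample(knee[cycles[1][0]:cycles[1][1]], len(reference))
deviation = mean_abs_deviation(second, reference)
```

Resampling every rep to the same length lets a fast rep and a slow rep be compared against one reference curve, which is the purpose of the time-normalization step.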

How it stacked up against bigger names

The researchers tested BioCoach against a long list of video-language AI systems, including models from NVIDIA, ByteDance, Alibaba, Salesforce, OpenAI, MIT, Shanghai Jiao Tong University, the Chinese University of Hong Kong, Peking University, Peng Cheng Laboratory, and others.

On the original QEVD benchmark, BioCoach beat the strongest baseline, Stream-VLM from MIT and NVIDIA, on text quality and judged correctness, though its timing score was slightly lower.

On the newly re-annotated QEVD-bio-fit-coach benchmark, BioCoach came out ahead across all reported metrics. In that setting, its gains were especially strong in text quality, judged correctness, and a biomechanics-focused scoring measure the authors called LLM-Bio-Accuracy. Its timing-related score was also modestly higher than Stream-VLM’s.

The pattern matters. When the target feedback became more detailed and anatomically specific, BioCoach’s advantage grew.

The authors say that suggests explicit 3D kinematics and biomechanical context can improve both the quality and interpretability of live exercise feedback without greatly slowing the response.

“It was encouraging to see that BioCoach was able to perform so well against programs made by some of the top researchers and companies in the AI field,” Liu said. “This is still a prototype, but it shows how combining computer vision with structured biomechanical reasoning can make AI coaching systems more useful and easier to inspect.”

Qualitative timeline for a squat exercise. BioCoach produces temporally aligned, biomechanics-grounded cues with consistent phase tracking, while Stream-VLM outputs generic or mistimed feedback inconsistent with the ground-truth annotations. (CREDIT: arXiv)

What it still cannot do

The paper does not present BioCoach as a finished consumer product. It is a prototype tested on benchmark videos, not a polished app in everyday use.

The team says its next goal is to extend the system so it can estimate joint reaction forces and muscle activation patterns from video. That could help it detect subtle compensatory movements, the kind people make when they are tired, off balance, or protecting a weak area.

The broader ambition is not to replace human coaches, according to the authors, but to extend their reach between in-person sessions.

“We believe this work could ultimately support exercise and physical-therapy apps that extend the expertise of human coaches and trainers between in-person sessions,” Liu said. “A future system could help users receive more specific, timely feedback when they practice on their own, while still keeping human experts in the loop.”

Practical implications of the research

The clearest near-term use for BioCoach is in at-home fitness and physical therapy, where people often practice alone and may not realize their form is slipping.

If systems like this become reliable outside research tests, they could give users more detailed corrections than today’s typical fitness apps, especially for common movements like squats, lunges, push-ups, and planks.

The work also points toward a middle ground between fully self-guided exercise and expensive one-on-one coaching: software that can flag possible problems, explain them in plain language, and give trainers or therapists a better record of what happened between sessions.

Research findings are available online on the preprint server arXiv.

The original story “BioCoach uses AI and biomechanics to give real-time exercise feedback at home” is published in The Brighter Side of News.