September Project: Streaming Motion Capture

Motion capture is neat! There are a bunch of tools available now: iMotion, assorted Kinect libraries, and so on. I’ve tried most of them but haven’t found anything that I particularly enjoy using. Most markerless systems suffer from drift, poor localization, jitter, and a host of other issues. Inertial motion capture units have drift of their own and misbehave when there are metallic objects or magnetic fields nearby. I don’t think I’ll be able to get around the markerless issues unless I put together a fiducial system, but I can still try to make a useful piece of software.

Motion Capture Mk5 is a system that captures human poses and streams the data across the network to connected clients. The protocol should be absurdly simple so that a client can be written in almost any language in the span of a day. This makes the tool multi-purpose: it can stream capture data to Blender for animation, to VRChat for real-time interaction, or to some other software for some other reason. Running the application on a networked machine means that if you, like me, have a slightly constrained living space, you can capture from a larger volume while your machine sits off to the side and out of the way.
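To give a feel for what "absurdly simple" could mean, here is a minimal sketch of one possible wire format: one frame per line of space-separated numbers. Nothing here is decided yet. The `Joint` struct, the `encode_frame` and `decode_frame` names, the millisecond timestamp, and the quaternion ordering (x, y, z, w) are all assumptions made for illustration.

```rust
/// One joint: position plus rotation as a quaternion (x, y, z, w).
/// Hypothetical layout; the real protocol may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Joint {
    pub position: [f32; 3],
    pub rotation: [f32; 4],
}

/// Encode a frame as one newline-terminated line:
/// `<timestamp_ms> <joint_count> <px py pz qx qy qz qw>...`
pub fn encode_frame(timestamp_ms: u64, joints: &[Joint]) -> String {
    let mut line = format!("{} {}", timestamp_ms, joints.len());
    for joint in joints {
        for value in joint.position.iter().chain(joint.rotation.iter()) {
            line.push(' ');
            line.push_str(&value.to_string());
        }
    }
    line.push('\n');
    line
}

/// Parse one line back into a timestamp and joints.
/// Returns `None` if the line is malformed or has the wrong number of values.
pub fn decode_frame(line: &str) -> Option<(u64, Vec<Joint>)> {
    let mut parts = line.split_whitespace();
    let timestamp_ms: u64 = parts.next()?.parse().ok()?;
    let count: usize = parts.next()?.parse().ok()?;
    let values = parts
        .map(|p| p.parse().ok())
        .collect::<Option<Vec<f32>>>()?;
    // Each joint contributes 3 position values and 4 rotation values.
    if values.len() != count.checked_mul(7)? {
        return None;
    }
    let joints = values
        .chunks_exact(7)
        .map(|c| Joint {
            position: [c[0], c[1], c[2]],
            rotation: [c[3], c[4], c[5], c[6]],
        })
        .collect();
    Some((timestamp_ms, joints))
}
```

A text format like this costs some bandwidth compared to a packed binary layout, but a client in nearly any language can parse it with a line read and a string split, which fits the written-in-a-day goal.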

What skills do I hope to practice along the way?

  • UI development in Rust with mixed 2D/3D
  • Computer vision (if fiducials come into play)
  • Model integration into Rust via Tract or Candle (for depth estimation or pose estimation)
  • Rust networking and IPC

What kinds of deliverables will we see at the end?

I’d like to finish the month with an application that captures and sends the estimated positions and rotations of 13 joints. Ideally, the model will have depth estimation internally, too, so I can duplicate and modify the code into a SLAM tool. A Blender client or Godot client would not be out of the question.
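The "sends" half of that goal could be sketched roughly as below: push one already-encoded frame to every connected client and drop any client whose connection has gone away. The `broadcast` function is a hypothetical helper, not existing code. It is generic over `std::io::Write` so the same function would work with `std::net::TcpStream` in the app and with in-memory buffers while testing.

```rust
use std::io::Write;

/// Send one encoded frame to every client, removing clients whose write fails.
/// Returns the number of clients still connected afterwards.
/// Hypothetical helper: `W` would be `std::net::TcpStream` in a real server.
pub fn broadcast<W: Write>(clients: &mut Vec<W>, frame: &[u8]) -> usize {
    clients.retain_mut(|client| {
        // A failed write or flush is treated as a disconnect.
        let sent = client.write_all(frame);
        sent.is_ok() && client.flush().is_ok()
    });
    clients.len()
}
```

Dropping a client on the first failed write keeps the capture loop from stalling on a dead connection, at the cost of making clients reconnect after any hiccup. Whether that trade is right depends on how the clients behave in practice.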

Open Questions:

Should we start by using a pre-canned pose model? There’s a risk it will take more time to get a model building and integrated than it would to train one from scratch, but it could be a good jumping-off point.

Should we try finding fiducial markers? AprilTag-rs could be a good option, but it doesn’t build on Windows, last I checked.
