Pose2Art Project
Jerry Isdale, MIOAT.com
Maui Institute of Art and Technology
notes started oct 12 2022
This Page is now superseded by the Project
Basic idea
Create a low-cost, (mostly) open source 'smart' camera system to capture human pose and use it to drive interactive immersive art installations. Yes, it's kinda like the Microsoft Kinect/Azure 'product', but DIY and open to upgrading.
- use one or more smart cameras to capture Human Pose from video stream
- stream that data (multicast?) – pose points (OSC), raw frames, skeleton overlay (video), outline, etc.
- receive the stream to drive a CGI rendering engine using the skeleton data, etc.
- project that stream on a wall (or use all of the above streams as input to a video switcher/overlay)
Hardware:
- Edge Computing: Raspberry Pi 4 and Nvidia Jetson Nano are the target platforms I have. Google Coral may be a better low-cost alternative to the Raspberry Pi 4.
- Camera: Does not need to be high resolution; a USB webcam or a CSI-interface camera (ribbon cable, rPi camera, Arducam, etc.) will do.
- Network: Wired Ethernet is preferred over WiFi for installations to avoid interference. A single cable can connect the edge device directly to the PC, though the software configuration for that is a bit tricky.
- Rendering Engine: a decently powerful computer with a good graphics card running TouchDesigner, Unity, Unreal, or similar visual software
- Display: either a video wall or a projection setup
Options:
Multiple cameras could be used to create 3D pose tracking.
Stream video from the edge cameras to the rendering engine; not yet sure which protocol is usable (NDI, below, is one candidate).
Track multiple people, including people in contact with each other (dancing, acro-yoga, etc.).
Depth camera: cameras that provide point cloud depth data could be used.
STATUS:
This is very much a work in progress (with uneven progress).
18 Nov: I have gotten the camera/pose capture working and feeding points over the network to the PC via OSC, which feeds the data into TouchDesigner.
Currently I'm taking notes on both the Pi and the PC (with multiple boot SD cards for different OSes on the Pi).
Example Art Installations
(insert links to still/video of pose tracking in interactive environments)
----
Oct 14
10 steps in Pose2Art process
(make a graphic of this flow)
- image capture
- pose extract
- pose render (optional)
- stream send (pose data, optionally video)
- physical send (transport)
- physical rcv
- stream receive
- stream process
- render/overlay
- project/display
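A very rough sketch of how the edge-device half of that flow hangs together in code. Everything below is a stub/placeholder rather than actual Pose2Art code; the real capture, pose, and OSC pieces are surveyed further down.

```python
# Placeholder outline of steps 1-5 (capture -> pose -> send) on the edge device.
# All three helpers are stubs standing in for the real camera, model, and OSC code.
import time

def capture_frame():
    return None                        # 1. image capture (webcam / CSI camera)

def extract_pose(frame):
    return [("nose", 0.5, 0.5, 0.9)]   # 2. pose extract (e.g. a TFLite model): (name, x, y, score)

def stream_pose(points):
    pass                               # 4-5. pack as OSC messages and push them out the wire

for _ in range(300):                   # ~10 seconds at 30 fps
    points = extract_pose(capture_frame())
    stream_pose(points)
    time.sleep(1 / 30)                 # pace roughly at camera frame rate
```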
Project Plan and this Page
- This doc surveys the tech options for each of the 10 stages listed above
- survey existing solutions, focus on newer ones with multi-person options
- find one that runs on my rPi4
- build out a demo using rPi4 and TouchDesigner for rendering.
Nov 20 status:
The QEngineering Raspberry Pi image comes with TensorFlow Lite properly installed, along with a C++ demo of pose capture. Adding Libosc++ got it emitting OSC data. A fair bit of mucking around with static IPs, routes, and firewalls was required, but it finally got talking to the PC. Found at least one TouchDesigner example of reading OSC pose data and got it working. Looking into other demos, like a Kinect driving a TD Theremin simulator.
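The static-IP part of that boils down to something like the following on Raspberry Pi OS. This is a generic sketch (example addresses only, not necessarily the exact config used here); the PC end gets a matching static IP (e.g. 192.168.10.1) and a firewall rule allowing the OSC/UDP port.

```
# /etc/dhcpcd.conf on the Pi (example addresses only)
interface eth0
static ip_address=192.168.10.2/24
static routers=192.168.10.1
```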
OSC (OpenSoundControl) is currently chosen as the data transport. Its messages are VERY much user defined, and I have yet to see any 'standard' for how to name the pose data; the Kinect tracked point names might be useful.
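For now a home-grown scheme seems unavoidable. One possible layout (purely our own convention, one message per tracked point, person index in the address, using the 17 COCO keypoint names):

```
/pose/p0/nose           x y score
/pose/p0/left_shoulder  x y score
...
/pose/p0/right_ankle    x y score
/pose/numPersons        1
```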
Survey of System Demos
Web searching turned up a LOT of links on pose estimation using machine learning. Some include source code repositories and documentation; others are academic papers or other non-replicable demos. This section summarizes some of them, in the hope that one can be made to actually work.
30 oct 2022: links below this update
Attempting to run the demos has been interesting, with lots of classic dependency issues. Some Python pose examples were made to work, but alas very slowly. The QEngineering rPi example is in C++ and its basics ran much faster (6-10 fps) than the Python ones. It (and many other examples) uses the TensorFlow Lite implementations to run on the rPi. TFLite seems decent, and there are both pretrained and reduced models available, as well as the TF blog on how to train a new set on something more than the COCO yoga and dance poses. Those are options to explore after the basics are done.
Next steps are putting the pose data into a Message (OSC based) and sending that over network (UDP) to a 'server'.
More dependency issues over Socket/ASIO and OSC libraries, but some progress.
There is no 'standard' for the OSC pose messages. There are examples of pose data as OSC messages, JSON, XML, etc., with either the 17-point or 33-point models, and even some showing multi-person tracking data. Since we are writing our own code, we can define the format on both ends. The receiver will likely be TouchDesigner, at least for the first prototype.
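A minimal sketch of what the sending side could look like in Python with osc4py3 (linked below under OSC). Our C++ edge code uses Libosc++ instead, and the address scheme, IP, and port here are just the home-grown choices from above, not a standard.

```python
# Minimal OSC sender sketch using osc4py3 (addresses/IP/port are our own choices).
from osc4py3.as_eventloop import osc_startup, osc_udp_client, osc_send, osc_process, osc_terminate
from osc4py3 import oscbuildparse

osc_startup()
osc_udp_client("192.168.10.1", 5005, "render_pc")   # the PC running TouchDesigner

def send_keypoints(keypoints):
    """keypoints: iterable of (name, x, y, score) for one tracked person."""
    for name, x, y, score in keypoints:
        msg = oscbuildparse.OSCMessage("/pose/p0/" + name, ",fff", [x, y, score])
        osc_send(msg, "render_pc")
    osc_process()   # actually push the queued messages onto the wire

send_keypoints([("nose", 0.51, 0.22, 0.93)])
osc_terminate()
```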
Links to Pose Tracking demos with code
- SAT LivePose
https://sat-mtl.gitlab.io/documentation/livepose/
https://sat-mtl.gitlab.io/documentation/livepose/en/contents.html
- LivePose is a command line tool which tracks people's skeletons from an RGB or grayscale video feed (live or not), applies various filters to them (for detection, selection, improving the data, etc.) and sends the results through the network (OSC and Websocket are currently supported).
- Requires: Ubuntu 20.04, nvidia gpu (jetson, pc-rtx, etc)
- Other parts of SAT-MTL's GitLab site mention rpi distribution: https://gitlab.com/sat-mtl/distribution/mpa-bullseye-arm64-rpiL
- there are indications that LivePose has an rPi distribution and outputs OSC, so we started from that
- unfortunately it seems LivePose may not work on the Jetson Nano or rPi4, so we moved on to other options
- rpi TensorFlowLite, PoseNet
Ethan Dell's rpi_pose_estimation builds on TensorFlow Lite and seems to be simple Python with a webcam
https://github.com/ecd1012/rpi_pose_estimation
https://medium.com/analytics-vidhya/pose-estimation-on-the-raspberry-pi-4-83a02164eb8e
uses OpenPose
ActionAi and YogAI - Jetson
ActionAI is the follow-on to YogAI. The latter was touted as using the rPi, while the newer ActionAI uses the Jetson Nano.
https://github.com/smellslikeml/ActionAI
https://www.hackster.io/yogai/yogai-smart-personal-trainer-f53744
- web TensorFlow OpenPose js
There are several projects that use browser-based (JavaScript) webcam pose estimation. These might be worth looking into, though more for their use of the underlying pose tools.
https://github.com/nishagandhi/OpenPose_PythonOpenCV
https://www.youtube.com/watch?v=DpGHWa2gOcc phoneCam+touchdesigner
April Tags https://github.com/ju1ce/April-Tag-VR-FullBody-Tracker
MediaPipe https://github.com/ju1ce/Mediapipe-VR-Fullbody-Tracking
FreeMoCap
Active Oct 2022; pre-alpha
The FreeMoCap Project: A free-and-open-source, hardware-and-software-agnostic, minimal-cost, research-grade, motion capture system and platform for decentralized scientific research, education, and training
https://github.com/freemocap/freemocap
OpenPose
used in Ethan rPi pose
https://cmu-perceptual-computing-lab.github.io/openpose/web/html/doc/
https://viso.ai/deep-learning/openpose/
https://www.geeksforgeeks.org/openpose-human-pose-estimation-method/
https://github.com/CMU-Perceptual-Computing-Lab/openpose Active early 2022
https://www.youtube.com/watch?v=d3VrS4kgTn0
https://www.arxiv-vanity.com/papers/1812.08008/ OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Steam webcam
https://www.youtube.com/watch?v=bQCC2HQX2u8
https://store.steampowered.com/app/1366950/Driver4VR/
Capture Systems:
- USB webcam
- rPi (CSI bus) camera
- Arducam stereo (CSI bus)
- Intel RealSense (USB cam with depth sensor)
ML Pose Engines/Systems
- MobileNet v2 (ML network architecture)
- TensorFlow
- PyTorch
TensorFlow/TensorFlow Lite
https://pimylifeup.com/raspberry-pi-tensorflow-lite/
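To give a feel for how little code the TFLite side needs, here is a minimal single-frame sketch. The model filename is a placeholder; the input size and dtype depend on which pose .tflite you grab (e.g. MoveNet Lightning uses 192x192 and returns a (1, 1, 17, 3) array of y, x, score per COCO keypoint).

```python
# Minimal single-frame pose inference sketch with TensorFlow Lite on the Pi.
# "movenet_singlepose_lightning.tflite" is a placeholder filename.
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="movenet_singlepose_lightning.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

cap = cv2.VideoCapture(0)                    # USB webcam
ok, frame = cap.read()
if ok:
    size = inp['shape'][1]                   # e.g. 192 for MoveNet Lightning
    img = cv2.resize(frame, (size, size))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # note: some float32 model variants also expect normalized input values
    img = np.expand_dims(img, 0).astype(inp['dtype'])
    interpreter.set_tensor(inp['index'], img)
    interpreter.invoke()
    keypoints = interpreter.get_tensor(out['index'])  # (1, 1, 17, 3): y, x, score per keypoint
cap.release()
```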
PoseNet OpenCV
nVidia BodyPoseNet, TensorRT
https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/purpose_built_models/bodyposenet.html
SAT LivePose https://gitlab.com/sat-mtl/tools/livepose
MMPose https://github.com/open-mmlab/mmpose
stream protocols - video, data (osc)
Video Streaming: NDI
https://www.newtek.com/ndi/applications/
https://www.mgraves.org/2020/05/dicaffiene-using-a-raspberry-pi-4-to-display-an-ndi-stream/
https://github.com/rbalykov/ndi-rpi
Data: OpenSoundControl (OSC)
OpenSoundControl (OSC) is a data transport specification (an encoding) for realtime message communication among applications and hardware.
- Python, C++, C# bindings
- available for Unreal, Unity and TouchDesigner
- text based, hierarchical tags for messages
- encoding requires agreement on both ends
- no single standard for MoCap pose messages
https://opensoundcontrol.stanford.edu/
osc4py: https://osc4py3.readthedocs.io/en/latest/
Transport - memory, multicast, disk
The transport layer moves Pose2Art (P2A) assets between machines. This may be in-memory on the same system or across the network.
render engines - TouchDesigner, Unity, Unreal, Resolume
Rendering engines should accept at least one of raw video, video with a pose overlay, or pose data; using only the OSC pose data would drive avatars and/or animation/synthesis.
likely first demo: a TouchDesigner variant of a Kinect demo:
skeleton interaction with particles
https://www.google.com/search?q=touchdesigner+interactive+particles
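On the TouchDesigner side, an OSC In CHOP can pull the pose values in with no code at all (each address just shows up as a channel). If you want the raw messages, an OSC In DAT with a callbacks DAT also works; here is a sketch (the /pose/p0/... addresses are just this project's home-grown scheme from the sketches above).

```python
# callbacks DAT attached to an OSC In DAT listening on the chosen UDP port (e.g. 5005)
def onReceiveOSC(dat, rowIndex, message, bytes, timeStamp, address, args, peer):
    # e.g. address = '/pose/p0/nose', args = [0.51, 0.22, 0.93]
    if address.startswith('/pose/p0/') and len(args) == 3:
        point = address.rsplit('/', 1)[-1]
        x, y, score = args
        parent().store(point, (x, y, score))  # other operators can read this back with fetch()
    return
```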