-
Making it portable
01/02/2019 at 18:09 • 0 comments
The leading solution is to always stream frames over a network to a big computer with GPU processing.
Another solution is solving a simpler problem than pose estimation for camera tracking & using pose estimation only for counting reps. Only camera tracking must be portable. Counting reps will always be done near the Ryzen.
There are 2 competing libraries for GPU processing: OpenCL & CUDA. The choice depends on the CPU, GPU, & stock portfolio.
A starting point for just detecting people:
The raspberry pi does 1fps.
The odroid does 4fps.
The latest algorithm is YOLO. There are rumors of higher frame rates, but no good installation examples.
Running openpose with a GPU requires CUDA & CUDNN. CUDA requires 2GB of downloads from https://developer.nvidia.com/cuda-downloads. CUDNN comes from https://developer.nvidia.com/cudnn. Neither is linked from the nvidia.com home page.
The versions of CUDA, CUDNN, & the X11 driver must all match & there's no documentation of which combinations work. Driver 410.78 happened to work with cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64.
CUDA also requires rebooting & manually loading the nvidia-uvm module.
To test your CUDA installation:
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
make
./deviceQuery
To print the status of your graphics card:
nvidia-smi
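Since the driver, CUDA & CUDNN versions all have to line up, it helps to print them in one place. This is only a sketch: cuda_release is a made up helper name, the paths assume CUDA & CUDNN both landed in the default /usr/local/cuda prefix, & each step is skipped if the tool or file isn't there.

```shell
#!/bin/sh
# Sketch: print the pieces that have to match.
# cuda_release is a hypothetical helper, not part of CUDA.

# Pull "10.0" out of nvcc's "Cuda compilation tools, release 10.0, V10.0.130"
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Driver version & memory use as reported by the driver
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version,memory.used,memory.total \
        --format=csv,noheader
fi

# CUDA toolkit release (assumes the default /usr/local/cuda prefix)
NVCC=/usr/local/cuda/bin/nvcc
if [ -x "$NVCC" ]; then
    echo "CUDA release: $("$NVCC" --version | cuda_release)"
fi

# CUDNN version macros (assumes cudnn.h was copied into the CUDA prefix)
CUDNN_H=/usr/local/cuda/include/cudnn.h
if [ -f "$CUDNN_H" ]; then
    grep -E '#define CUDNN_(MAJOR|MINOR|PATCHLEVEL) ' "$CUDNN_H"
fi

# nvidia-uvm shows up in lsmod with an underscore
if lsmod 2>/dev/null | grep -q '^nvidia_uvm'; then
    echo "nvidia-uvm loaded"
else
    echo "nvidia-uvm not loaded, try: modprobe nvidia-uvm"
fi
```

The memory.used column is also a quick way to see how close the card is to running out of memory.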
Caffe must be rebuilt with CUDA, then openpose.
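The USE_CUDNN & CPU_ONLY steps listed below can also be scripted with sed, which helps when switching between CPU & GPU builds more than once. A sketch, assuming Makefile.config still uses the stock "NAME := 1" layout from Caffe's example config; enable_cuda_config is a made up helper name & it prints the result instead of editing in place.

```shell
#!/bin/sh
# Sketch: flip the 2 switches in a CPU-only Makefile.config.
# enable_cuda_config is a hypothetical helper; it writes the edited
# file to stdout & leaves the original untouched.
enable_cuda_config() {
    sed -e 's/^#[[:space:]]*USE_CUDNN[[:space:]]*:=/USE_CUDNN :=/' \
        -e 's/^CPU_ONLY[[:space:]]*:=/# CPU_ONLY :=/' "$1"
}

# Usage, from the caffe directory:
#   enable_cuda_config Makefile.config > Makefile.config.cuda
#   mv Makefile.config.cuda Makefile.config
#
# CUDA_ARCH still has to be edited by hand. For a Pascal card like the
# GTX 1050 (compute capability 6.1) the entry to keep would be
#   -gencode arch=compute_61,code=sm_61
```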
To build Caffe with CUDA:
edit caffe/Makefile.config
uncomment USE_CUDNN
comment out CPU_ONLY
tweak the CUDA_ARCH line
edit BLAS_INCLUDE, BLAS_LIB, LIBRARY_DIRS to include /root/countreps
All the objects need -fPIC, but nvcc complains about it. Placing -fPIC after -Xcompiler in NVCCFLAGS, CXXFLAGS, LINKFLAGS but not in COMMON_FLAGS seems to fix it. All the dependencies for caffe were installed in /root/openpose.
PATH=$PATH:/root/countreps/bin make
PATH=$PATH:/root/countreps/bin make distribute
Comment out the tools/caffe.cpp: time() function if there's an undefined reference to caffe::caffe_gpu_dot.
The output goes in the distribute directory & must be copied manually.
cp -a bin/* /root/countreps/bin/
cp -a include/* /root/countreps/include/
cp -a lib/* /root/countreps/lib/
cp -a proto/* /root/countreps/proto/
cp -a python/* /root/countreps/python/
To build openpose with CUDA:
mkdir build
cd build
cmake \
-DGPU_MODE=CUDA \
-DUSE_MKL=n \
-DOpenCV_INCLUDE_DIRS=/root/countreps/include \
-DOpenCV_LIBS_DIR=/root/countreps/lib \
-DCaffe_INCLUDE_DIRS=/root/countreps/include \
-DCaffe_LIBS=/root/countreps/lib/libcaffe.so \
-DBUILD_CAFFE=OFF \
-DPROTOBUF_LIBRARY=/root/countreps/lib \
-DProtobuf_INCLUDE_DIRS=/root/countreps/include \
-DGLOG_INCLUDE_DIR=/root/countreps/include \
-DGLOG_LIBRARY=/root/countreps/lib \
-DGFLAGS_INCLUDE_DIR=/root/countreps/include \
-DGFLAGS_LIBRARY=/root/countreps/lib \
-DCMAKE_INSTALL_PREFIX=/root/countreps/ \
..
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/countreps/lib make VERBOSE=1
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/countreps/lib make install
The dreaded
Check failed: error == cudaSuccess (2 vs. 0) out of memory
error is caused by the GPU running out of memory. Reduce the netInputSize variable to -1x256.
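To get a feel for what -1x256 means: as far as the openpose docs describe it, the -1 gets replaced by a width that keeps the input's aspect ratio, rounded to a multiple of 16. A sketch of that arithmetic, with net_width as a made up helper name & the nearest-multiple rounding taken as an assumption rather than lifted from the openpose source:

```shell
#!/bin/sh
# Sketch: width openpose would pick for "-1xHEIGHT", assuming it keeps
# the aspect ratio & rounds to the nearest multiple of 16.
# net_width is a hypothetical helper, not part of openpose.
# usage: net_width NET_HEIGHT IMAGE_WIDTH IMAGE_HEIGHT
net_width() {
    net_h=$1; img_w=$2; img_h=$3
    # integer form of 16 * round(net_h * img_w / (16 * img_h))
    echo $(( 16 * ((2 * net_h * img_w + 16 * img_h) / (32 * img_h)) ))
}

net_width 368 640 480   # default height of 368 on a 640x480 frame
net_width 256 640 480   # reduced height
```

Under those assumptions a 640x480 frame goes from 496x368 to 336x256, a bit under half the pixels, so the network's memory use should shrink by roughly that much.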
Openpose on the GeForce GTX 1050 hit 14 frames per second, but the computer can't do anything else with the GPU, like play a video. CUDA is a return to 1980s single tasking, but it's still amazing how well it can track a human pose in a blurry photo.
The terabytes of opaque libraries required to make a computer vision program are how all computing is going to be done in the future. All these libraries are going to be part of the base system. Using computer vision won't involve tweaking neural networks directly or creating training sets directly, but using libraries. Unlike decoding a video or encrypting text, computer vision libraries are the result of millions of people & careers. Bedroom hackers aren't driving this generation of software.
The problems are far too complex for an end user to be directly programming the neural network part or the tensorflow part. The current training sets contain millions of photos or every experience of a hypothetical human who lived many hundreds of years. We're not far from a neural network containing every experience of every human who ever lived.
-
Compiling openpose & the 1st test
12/24/2018 at 07:25 • 0 comments
If you have multiple computer vision projects like lions do, each one was compiled for different versions of opencv & all its dependencies, so you can't have system wide dependencies. The reason CocoaPods works is it compiles all the dependencies inside the project. That's really not much of an innovation, but it's a political mountain to convince developers not to use system wide dependencies. Creating a dependency manager with a meaningless name was all about overcoming the political mountain & legitimizing having dependencies in the project directory.
Most dependencies were already compiled for a previous project & installed in the /root/countreps prefix. A gootuber recommended another version of openpose based on tensorflow, which might have fewer dependencies.
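Keeping everything in one prefix means every build & run has to be pointed at it. The commands in this log do that one variable at a time. A sketch of doing it once per shell instead, assuming the same /root/countreps prefix (the env.sh file name & the choice of variables are made up for illustration):

```shell
# env.sh: source this before building or running, e.g. ". ./env.sh"
# Sketch only. The prefix is the one used in this log; the rest is assumption.
PREFIX=/root/countreps

# Tools installed in the prefix are found first
export PATH="$PREFIX/bin:$PATH"
# Shared libraries in the prefix are found at run time
export LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# Lets cmake's find_package() & pkg-config search the prefix as well
export CMAKE_PREFIX_PATH="$PREFIX"
export PKG_CONFIG_PATH="$PREFIX/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
```

Nothing system wide gets touched, so another project can source its own env.sh with a different prefix.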
compiling OpenBLAS for Ryzen:
make TARGET=ZEN
make TARGET=ZEN PREFIX=/root/countreps install
FLOAT in OpenBLAS/common.h conflicts with another definition & has to be renamed FLOAT_ when compiling openpose.
OpenCV must be built with GTK support.
To build opencv:
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/root/countreps/ ..
make
# this doesn't work
make install
# running this a 2nd time is what installs libopencv.so
make
To build openpose:
mkdir build
cd build
cmake \
-DGPU_MODE=CPU_ONLY \
-DUSE_MKL=n \
-DOpenCV_INCLUDE_DIRS=/root/countreps/include \
-DOpenCV_LIBS_DIR=/root/countreps/lib \
-DCaffe_INCLUDE_DIRS=/root/countreps/include \
-DCaffe_LIBS=/root/countreps/lib/libcaffe.so \
-DBUILD_CAFFE=OFF \
-DPROTOBUF_LIBRARY=/root/countreps/lib \
-DProtobuf_INCLUDE_DIRS=/root/countreps/include \
-DGLOG_INCLUDE_DIR=/root/countreps/include \
-DGLOG_LIBRARY=/root/countreps/lib \
-DGFLAGS_INCLUDE_DIR=/root/countreps/include \
-DGFLAGS_LIBRARY=/root/countreps/lib \
-DCMAKE_INSTALL_PREFIX=/root/countreps/ \
..
To compile openpose:
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/countreps/lib make VERBOSE=1
We have the simplest demo possible:
https://cdn.hackaday.io/files/1629446971396096/countreps.c
https://cdn.hackaday.io/files/1629446971396096/Makefile
to compile countreps:
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`/lib make
to run countreps:
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`/lib ./countreps
It reads JPEG photos from test_input & writes output to test_output.
The test program processed 424 frames at 640x480 resolution.
2.15 seconds per frame on the 4.1GHz Ryzen 7 2700X.
4GB of RAM required.
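For scale, the arithmetic behind those numbers as a quick awk sketch. frame_stats is a made up helper name & nothing here is measured beyond the frame count & seconds per frame given above.

```shell
#!/bin/sh
# Sketch: turn seconds per frame into fps & total run time.
# frame_stats is a hypothetical helper.
# usage: frame_stats FRAMES SECONDS_PER_FRAME
frame_stats() {
    awk -v n="$1" -v s="$2" 'BEGIN {
        printf "%.2f frames per second\n", 1 / s
        printf "%.1f seconds for %d frames (%.1f minutes)\n", n * s, n, n * s / 60
    }'
}

frame_stats 424 2.15
```

That's a bit under half a frame per second, or about 15 minutes for the whole test set.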
The 1st test brought the same disappointment as encoding MPEG video in 1995 on a 33MHz computer. It was amazingly good at tracking the subject, but too slow to do it in realtime. The next step would be compiling the CUDA dependencies. It still wouldn't be portable.
The embedded installations in modern subject tracking cameras all use Nvidia TX2 boards at $800. Maybe there could be a cheaper solution using a laptop. Just like 1995, the lion kingdom can't afford the required hardware.