SayHOLO

Description

The project is called SayHOLO. SayHOLO is a portable device that uses holographic technology to display a 3D object generated from a 2D image taken by an ESP32-CAM. With the rapid development of Virtual Reality and Augmented Reality over the past decades, researchers have devoted more resources to realistic visualization. As a result, holographic technology has a lot of potential for everyday applications.

This project is simple but leaves many directions for further work. Currently, I am working on displaying real-time video on the hologram device.
Hardware:
ESP32-CAM / FT232RL FTDI USB To TTL Serial Converter Adapter Module / Mini USB to USB Converter
Software:
Google Apps Script ( Uploading Images from ESP32-CAM )
U-2-Net ( Background Removal Algorithm with Python )
PIFuHD ( 3D Object Generator Algorithm with Python )
AWS S3 ( Storing Objects on Cloud Service )
SayHOLO Website ( Custom ReactJS Website to Display Objects on Hologram Device )

Details

SayHOLO provides an easy way to generate 3D holograms from 2D images.

The world has become more digital over the past few years. As more realistic virtual reality technology has been developed in areas such as entertainment, the military, and medical science, holographic technology has become one of the major trends that researchers want to explore. In addition, since late 2019 the world has experienced a serious pandemic, and remote meetings have become the main way people communicate with each other. To bring more entertainment into our lives, this project uses existing technologies to transform 2D images into 3D models. The device is called SayHOLO, a portable, easy-to-use device that generates a 3D human model from images taken by an ESP32-CAM. Let's look at the technical components involved in this project:

Image Transfer to Google Drive
- The ESP32-CAM program calls a script deployed on Google Apps Script, which takes the image file from the ESP32-CAM and saves it to Google Drive. Since the binary image data cannot be sent as-is in a text-based request, I first convert it to text using Base64 encoding. In short, Base64 converts a binary representation into an ASCII string that is safe for plain-text systems. The downside is that the resulting representation is around 1.3x larger than the original file, so my ESP32-CAM resolution cannot be higher than SVGA. Easing this image size limitation is something I plan to improve.
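To make the size overhead concrete, here is a minimal Python sketch of the Base64 step. It is only an illustration, not the code running on the ESP32-CAM: the 30,000-byte random buffer is a stand-in for a JPEG frame, and the helper name base64_overhead is made up for this example.

```python
import base64
import os


def base64_overhead(raw: bytes) -> float:
    """Return encoded size divided by raw size for standard Base64."""
    return len(base64.b64encode(raw)) / len(raw)


# Stand-in for a camera frame: random bytes, NOT a real JPEG (assumption).
fake_frame = os.urandom(30_000)
encoded = base64.b64encode(fake_frame)

# Base64 emits 4 ASCII characters for every 3 input bytes,
# so the encoded payload is about 4/3 of the original size.
print(len(fake_frame), len(encoded), round(base64_overhead(fake_frame), 3))

# Decoding restores the original bytes exactly.
assert base64.b64decode(encoded) == fake_frame
```

The 4/3 ratio is where the "around 1.3x" figure comes from: every byte saved on the JPEG side saves a bit more than one character on the wire.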
U-2-Net
- U-2-Net, implemented in Python, is a simple but powerful deep network architecture for salient object detection. The architecture is a two-level nested U-structure that can capture more contextual information at different scales and increase the depth of the whole architecture without significantly increasing the computational cost. With the newly designed RSU blocks, the nested U-structure enables the network to capture richer local and global information from both shallow and deep layers regardless of the resolution.
- U-2-Net is used to filter out the background from images taken by the ESP32-CAM so that the 3D formation algorithm can produce a higher-quality 3D model.

* If you are interested in this algorithm, feel free to learn more in this paper >>> U-2-Net
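As a rough picture of what "background removal" means once a saliency map exists, here is a small Python sketch. It does not run U-2-Net; it assumes a per-pixel saliency score between 0 and 1 is already available, and the threshold of 0.5, the white fill color, and the function name remove_background are my own illustrative choices rather than what the SayHOLO notebook does.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def remove_background(
    image: List[List[Pixel]],
    mask: List[List[float]],
    threshold: float = 0.5,          # assumed cut-off, not taken from U-2-Net
    fill: Pixel = (255, 255, 255),   # assumed background color
) -> List[List[Pixel]]:
    """Keep pixels whose saliency score is >= threshold; replace the rest."""
    if len(image) != len(mask) or any(
        len(row) != len(mrow) for row, mrow in zip(image, mask)
    ):
        raise ValueError("image and mask must have the same shape")
    return [
        [px if score >= threshold else fill for px, score in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


# Tiny 2x2 example: two salient pixels, two background pixels.
image = [[(10, 20, 30), (40, 50, 60)], [(70, 80, 90), (100, 110, 120)]]
mask = [[0.9, 0.1], [0.2, 0.8]]
result = remove_background(image, mask)
print(result)
```

A hard threshold is the simplest option; using the score as a soft alpha value would give smoother edges around hair and clothing.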

PIFuHD
- Recent advances in image-based 3D human shape estimation have been driven by the significant improvement in representation power afforded by deep neural networks. Due to hardware limitations in current technologies, two conflicting requirements arise: accurate predictions require large context, but precise predictions require high resolution. PIFuHD addresses this limitation by formulating a multi-level architecture that is end-to-end trainable. A coarse level observes the whole image at lower resolution and focuses on holistic reasoning. This provides context to a fine level, which estimates highly detailed geometry by observing higher-resolution images.
- This algorithm is what turns my pictures into three-dimensional models. With U-2-Net as the first stage, cleaner images are fed into PIFuHD so that the result is of higher quality.

* If you are interested in this algorithm, feel free to learn more in this paper >>> PIFuHD

Three.js
- This library is a cross-browser JavaScript library and application programming interface (API) used to create and display 3D computer graphics in a web browser using WebGL. It allows the creation of graphics processing unit (GPU)-accelerated 3D animations using the JavaScript language as part of a website without relying on proprietary browser plugins.
- I have designed a custom ReactJS website using Three.js so that the object files generated by PIFuHD can be displayed on the website.

* If you are interested in this library, feel free to learn more in this website >>> Three.js


Components

1 × ESP32-CAM with OV2640 camera module

1 × FT232RL FTDI USB To TTL Serial Converter Adapter Module

1 × Mini USB to USB converter

4 × Wires

Build Instructions


Material Preparation

Prepare one ESP32-CAM / one FTDI USB to TTL Serial Converter / one mini USB to USB converter / 4 wires

Connect your ESP32-CAM to the serial converter as shown in the following picture:

Program ESP32-CAM

Use the Arduino IDE to program your ESP32-CAM

The code can be found here on SayHOLO Github >>> IMG_To_Google_Drive

* To program your ESP32-CAM, make sure to use a jumper or wire to connect IO0 & GND. Remove it once the sketch has been compiled and uploaded.

Make sure you update the following settings with your own values:

const char* ssid     = "xxxx";   //your network SSID
const char* password = "xxxx";   //your network password

String myScript = "/macros/s/xxxx/exec";  //Create your Google Apps Script and replace the "myScript" path.

For "myScript", open Google Apps Script and create a new project.

Replace the project code with "Google_APP_Script.gs" found on Github and deploy your script to obtain the token ID (the "xxxx" part of the "myScript" path).

* Please remember to make the script public so that your ESP32-CAM can transmit to Google Drive properly.

Once everything has been set up, you should be able to see the images taken by your ESP32-CAM in your Google Drive.
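If you want to check what an upload request looks like before flashing the board, the sketch below builds one in Python. It is a hedged illustration only: the host https://script.google.com is where Apps Script web apps are served, but the form field names "filename" and "data" are hypothetical, and the real names are whatever the Arduino sketch and "Google_APP_Script.gs" on Github agree on. The function name build_upload_request is also made up for this example.

```python
import base64
from urllib.parse import parse_qs, urlencode

SCRIPT_HOST = "https://script.google.com"
my_script = "/macros/s/xxxx/exec"  # placeholder path, same as in the sketch


def build_upload_request(jpeg_bytes: bytes, filename: str):
    """Return (url, form_body) for a POST carrying a Base64 image."""
    payload = base64.b64encode(jpeg_bytes).decode("ascii")
    # Base64 output can contain '+', '/' and '=', which must be
    # percent-encoded inside a form body; urlencode takes care of that.
    # "filename" and "data" are hypothetical field names.
    body = urlencode({"filename": filename, "data": payload})
    return SCRIPT_HOST + my_script, body


# Three bytes chosen so that the Base64 text contains '+' and '/'.
url, body = build_upload_request(b"\xfb\xff\xfe", "frame.jpg")
print(url)
print(body)

# Round trip: the receiver can recover the original bytes.
fields = parse_qs(body)
assert base64.b64decode(fields["data"][0]) == b"\xfb\xff\xfe"
```

Skipping the percent-encoding step is a common cause of corrupted images on the Drive side, because an unescaped '+' in a form body is decoded as a space.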

Generate 3D Human Model

Here comes the exciting part - 3D image formation

* I am using a MacBook Pro to run the algorithm, and it does not support certain libraries that require an NVIDIA GPU. As an alternative, Google Colaboratory solves this issue.

Open the SayHOLO Google Colab ( SayHOLO ) & save it to your own Google account

First of all, run the first line to connect Google Colab with your Google account.

Then run the rest of the program to generate the 3D human model. Let's see what's going on in the Python script:

The first algorithm the image runs through is called U-2-Net. It removes the background and leaves only the image of the person. This step helps generate a more accurate 3D model.
The second algorithm is called PIFuHD, developed by people at the University of Southern California, Facebook Reality Labs, and Facebook AI Research. After the 2D image has run through PIFuHD, a 3D object file is generated.
After the object file has been generated successfully, it is modified and transmitted to AWS S3. For simplicity, I have created my own cloud storage on AWS S3, but feel free to change it to yours if you prefer.
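The write-up does not spell out how the object file is modified before upload, so the sketch below is only one plausible example of such a step: reading the vertex lines of a Wavefront OBJ file and normalizing the model so it is centered and fits in a unit-sized box, which makes it easier to frame in a viewer. The function names parse_obj_vertices and normalize and the tiny inline OBJ text are made up for this illustration, and the S3 upload itself is left out.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def parse_obj_vertices(text: str) -> List[Vertex]:
    """Collect x, y, z from every 'v' line of OBJ text (extra fields ignored)."""
    vertices = []
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "v":
            vertices.append((float(parts[1]), float(parts[2]), float(parts[3])))
    return vertices


def normalize(vertices: List[Vertex]) -> List[Vertex]:
    """Center on the bounding-box midpoint and scale the longest side to 1."""
    mins = [min(v[i] for v in vertices) for i in range(3)]
    maxs = [max(v[i] for v in vertices) for i in range(3)]
    center = [(lo + hi) / 2 for lo, hi in zip(mins, maxs)]
    extent = max(hi - lo for lo, hi in zip(mins, maxs)) or 1.0
    return [
        tuple((v[i] - center[i]) / extent for i in range(3)) for v in vertices
    ]


# Tiny hand-written OBJ: one triangle, 2 units wide and 4 units tall.
obj_text = """
v 0 0 0
v 2 0 0
v 0 4 0
f 1 2 3
"""

vertices = parse_obj_vertices(obj_text)
normalized = normalize(vertices)
print(normalized)
```

Only vertex positions are touched here; face lines such as "f 1 2 3" refer to vertices by index, so they stay valid after the coordinates are rescaled.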

Tips: The background removal algorithm works best with a clean background. If your background is too complicated, the algorithm might not be able to produce clear images. PIFuHD generates a better 3D model if the image shows the whole body; a picture of only half of your body usually leads to disastrous results.
