During the development of AI cameras, we realized that AI cameras take far more forms than just RGB or depth cameras. "Specialized cameras" such as single-point rangefinders and thermal imagers do not rely on complex image processing, yet they directly capture key physical information such as temperature and distance, giving us a new way to observe the world. This led us to ask: can we build a lightweight hardware device that makes these efficient sensing capabilities easier to use and integrates them directly with camera images, or even with AI detection results, to enable a wide range of functions?
Based on this idea, we propose SensorCam P4, a modular sensing device built around a camera. It uses the high-performance ESP32-P4 as its main controller, and its capabilities are extended through pluggable expansion backplanes. There is no need to reflash the firmware: simply insert the sensing module you need and the corresponding functions become available, for example (a rough sketch of how module detection could work follows this list):
- Insert a thermal imaging module, and the thermal image is overlaid on the camera view in real time or shown in a split-screen layout, with abnormally hot or cold regions automatically identified and marked.
- Insert a laser ranging module, and the screen shows the precise distance to an object, as well as what that object is, in real time.
- Insert a temperature, humidity, and air pressure module to capture and display environmental data.
- …
SensorCam P4 adopts a highly modular design that avoids the complexity of traditional multi-sensor integration and focuses on the deep fusion of camera images with sensing data. The camera no longer just "sees color"; it can also "see temperature", "see distance", and more, making multi-dimensional data easy to obtain. You can also add your own modules to suit your needs. The device automatically identifies the type of sensor that has been inserted and loads a dedicated UI for it: in thermal imaging mode, for example, you can choose an overlay or split-screen view, while ranging mode shows the measured value and an aiming reticle. It looks roughly like this
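Building on the hypothetical `module_type_t` from the detection sketch above, loading the matching UI can be little more than a dispatch table. The `ui_*_init()` functions here are placeholders for whatever screens the project eventually implements, not its actual interface.

```c
// Hedged sketch: load a dedicated UI for whichever module was detected.
// The ui_*_init() functions are placeholders for the real (e.g. LVGL) screens.
typedef void (*ui_init_fn_t)(void);

static void ui_camera_only_init(void) { /* plain camera preview                */ }
static void ui_thermal_init(void)     { /* overlay / split-screen thermal view */ }
static void ui_ranging_init(void)     { /* distance readout + aiming reticle   */ }
static void ui_env_init(void)         { /* temperature / humidity / pressure   */ }

static void load_ui_for_module(module_type_t type)
{
    static const ui_init_fn_t ui_table[] = {
        [MODULE_NONE]        = ui_camera_only_init,
        [MODULE_THERMAL]     = ui_thermal_init,
        [MODULE_LASER_RANGE] = ui_ranging_init,
        [MODULE_ENV_SENSOR]  = ui_env_init,
    };
    ui_table[type]();
}

// Typical startup flow: probe the backplane, then bring up the matching UI.
// load_ui_for_module(backplane_detect_module());
```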
Why choose the ESP32-P4 as the main controller?
Because its characteristics map closely onto the device's core requirements: efficiently processing camera data, handling AI tasks, and fusing sensor data. Specifically:
1. Native camera and display support
- A MIPI-CSI interface connects directly to high-resolution camera modules for stable image acquisition.
- A native MIPI-DSI interface drives MIPI displays directly, so the fused image can be shown in full quality.
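To show how these two interfaces fit together in the firmware, here is a heavily hedged sketch of the main frame path. The `camera_capture_frame()`, `fuse_sensor_data()`, and `display_present_frame()` wrappers are hypothetical; in a real build they would sit on top of ESP-IDF's camera and esp_lcd (MIPI-DSI) drivers.

```c
// Hedged sketch of the main frame path: MIPI-CSI in, fuse, MIPI-DSI out.
#include <stdint.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

typedef struct {
    uint16_t *pixels;   // RGB565 frame buffer
    int width;
    int height;
} frame_t;

// Hypothetical wrappers around the real camera and display drivers.
extern frame_t *camera_capture_frame(void);           // blocks for the next CSI frame
extern void fuse_sensor_data(frame_t *frame);          // overlay thermal / distance data
extern void display_present_frame(const frame_t *f);   // push the frame out over DSI

static void frame_pipeline_task(void *arg)
{
    (void)arg;
    for (;;) {
        frame_t *frame = camera_capture_frame();
        if (frame == NULL) {
            vTaskDelay(pdMS_TO_TICKS(10));
            continue;
        }
        fuse_sensor_data(frame);        // add the active module's data to the image
        display_present_frame(frame);   // show the fused result on the MIPI screen
    }
}

// Usage: xTaskCreate(frame_pipeline_task, "frame_pipe", 8192, NULL, 5, NULL);
```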
2. Powerful processing and built-in AI acceleration
- A 400 MHz CPU offers far more performance than earlier ESP32 chips and comfortably handles concurrent data from multiple high-speed sensors such as the camera and thermal imager.
- A built-in vector instruction set and AI acceleration support efficient local inference, so machine learning models for image recognition and object detection can run directly on the device, providing the intelligence layer for sensor fusion.
- A hardware graphics and scaling engine handles image overlay, scaling, and rendering efficiently, significantly reducing CPU load (a software reference for the overlay step is sketched below).
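As a concrete reference for what "image overlay" means here, the following sketch blends a colorized thermal frame onto the RGB565 camera frame in plain C. This is only a software baseline for illustration; on the ESP32-P4 this kind of blend and scale work is exactly what the hardware 2D engine is intended to offload, and the fixed 50 % blend is an arbitrary choice.

```c
#include <stdint.h>

// Illustrative software baseline for a thermal-over-camera overlay.
// Both buffers are RGB565; the thermal frame is assumed to be already
// colorized and scaled to the camera resolution.
static inline uint16_t blend_rgb565_50(uint16_t a, uint16_t b)
{
    // Average each RGB565 channel; masking the field LSBs avoids carries
    // spilling between the R, G, and B fields.
    return (uint16_t)(((a & 0xF7DE) >> 1) + ((b & 0xF7DE) >> 1));
}

void overlay_thermal(uint16_t *camera, const uint16_t *thermal,
                     int width, int height)
{
    for (int i = 0; i < width * height; i++) {
        camera[i] = blend_rgb565_50(camera[i], thermal[i]);
    }
}
```

A real implementation would also rescale the low-resolution thermal frame to the camera resolution before blending, another job the scaling engine can take off the CPU.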
3. Rich connectivity capabilities
- UART, SPI, I2C, ADC, and other interfaces with ample bandwidth lay the foundation for connecting several modules at once, such as ranging, thermal, and temperature/humidity sensors.
- Wi-Fi and Bluetooth 5 connectivity (provided by a companion radio chip such as the ESP32-C6, since the P4 itself has no radio) enables wireless data transfer, remote monitoring, and OTA firmware updates.
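Picking up the I2C point above, here is a hedged sketch of polling a temperature/humidity module over the same backplane bus initialized in the earlier detection snippet (it reuses `BACKPLANE_I2C_PORT` and the already-installed driver). The 0x44 address, the command bytes, and the scaling formulas are hypothetical stand-ins; a real driver would follow the chosen sensor's datasheet.

```c
// Hedged sketch: read a hypothetical temperature/humidity module over the
// backplane I2C bus set up in the detection snippet above.
#include <stdbool.h>
#include <stdint.h>
#include "freertos/FreeRTOS.h"
#include "driver/i2c.h"

#define ENV_SENSOR_ADDR 0x44   // hypothetical module address

typedef struct {
    float temperature_c;
    float humidity_pct;
} env_reading_t;

static bool env_sensor_read(env_reading_t *out)
{
    const uint8_t cmd[] = {0x24, 0x00};   // hypothetical "single measurement" command
    uint8_t raw[6] = {0};
    if (i2c_master_write_read_device(BACKPLANE_I2C_PORT, ENV_SENSOR_ADDR,
                                     cmd, sizeof(cmd), raw, sizeof(raw),
                                     pdMS_TO_TICKS(100)) != ESP_OK) {
        return false;
    }
    // Hypothetical scaling: two 16-bit raw values mapped to physical units.
    uint16_t t_raw = (uint16_t)((raw[0] << 8) | raw[1]);
    uint16_t h_raw = (uint16_t)((raw[3] << 8) | raw[4]);
    out->temperature_c = -45.0f + 175.0f * (float)t_raw / 65535.0f;
    out->humidity_pct  = 100.0f * (float)h_raw / 65535.0f;
    return true;
}
```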
4. Mature development environment
- Development on the ESP-IDF framework is efficient and backed by a complete ecosystem, including mainstream AI inference frameworks and model conversion tools, which significantly lowers the barrier to AI application development.
- The community is active, with extensive support available for low-level drivers, complex UIs (such as LVGL), AI deployment, and sensor integration.
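Since LVGL came up, here is a small, hedged sketch of what a ranging-mode screen could look like with LVGL's v8-style API: a distance label plus a simple crosshair. It is an illustration rather than the project's actual UI, and the integer formatting deliberately avoids %f because LVGL's built-in printf does not handle floats unless that option is enabled.

```c
// Hedged sketch (LVGL v8-style API): a minimal ranging-mode screen with a
// distance readout and a simple crosshair built from two line objects.
#include "lvgl.h"

static lv_obj_t *distance_label;

void ranging_ui_create(void)
{
    lv_obj_t *scr = lv_scr_act();

    // Distance readout at the top of the screen.
    distance_label = lv_label_create(scr);
    lv_label_set_text(distance_label, "-- m");
    lv_obj_align(distance_label, LV_ALIGN_TOP_MID, 0, 10);

    // Simple crosshair in the middle of the view (sizes are placeholders).
    static lv_point_t h_pts[] = { {0, 0}, {40, 0} };
    static lv_point_t v_pts[] = { {0, 0}, {0, 40} };
    lv_obj_t *h_line = lv_line_create(scr);
    lv_obj_t *v_line = lv_line_create(scr);
    lv_line_set_points(h_line, h_pts, 2);
    lv_line_set_points(v_line, v_pts, 2);
    lv_obj_align(h_line, LV_ALIGN_CENTER, 0, 0);
    lv_obj_align(v_line, LV_ALIGN_CENTER, 0, 0);
}

// Called whenever the laser ranging module reports a new measurement.
void ranging_ui_set_distance(float meters)
{
    int cm = (int)(meters * 100.0f);
    // Integer formatting avoids relying on LVGL's optional float printf support.
    lv_label_set_text_fmt(distance_label, "%d.%02d m", cm / 100, cm % 100);
}
```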
Next steps
We believe SensorCam P4 can make development, research, and engineering work noticeably more convenient. Next, we will focus on the following tasks:
- Complete the overall hardware selection.
- Design the equipment shell structure.
- Develop backplane expansion interfaces.
- Develop drivers and UI for the first batch of modules (such as cameras, laser ranging, thermal imaging).
- ...
We welcome your thoughts on this direction: which sensors do you think have the most potential when combined with camera images? We will keep posting progress updates on the project page.