When exploring "How to build an AI camera", clarifying its core functional positioning is crucial. AI cameras have a wide range of application scenarios, whether it is real-time monitoring in intelligent security, detail capture in industrial quality inspection, or dynamic recording in home care. The three major functions of RTSP streaming, photo shooting, and video recording are core pillars, directly determining the practical value of AI cameras in different scenarios.
Currently, many related products and projects face significant pain points in practice. RTSP streaming is prone to instability and excessive latency, which hurts scenarios that need real-time feedback (such as security monitoring and industrial assembly line monitoring). When taking photos and recording video, frame freezes and blurriness are common, which affects both home users recording life moments and enterprises using the camera for document shooting and scene preservation. A stable RTSP streaming solution combined with smooth photo and video capture is therefore a core element of an excellent AI camera.
After comparing multiple products, we chose the RV1126B chip as the core processor for testing. The reasons for selecting it are mainly twofold:
First, its RTSP streaming function is stable, supporting both H.264 and H.265 encoding formats, and it can adjust the bitrate according to network conditions and device performance, outputting multiple streams to adapt to different network environments, whether it is a home network with limited bandwidth or a demanding industrial local area network;
Second, the RV1126B has excellent ISP image processing capabilities, ensuring the clarity of photos and the smoothness of video recording, providing high-quality output for both long-distance capture in security scenarios and dynamic video recording in home scenarios.
In conclusion, the powerful encoding/decoding and network streaming capabilities of the RV1126B make it a strong candidate for AI Cameras.
Below, I test the camera's three major functions: taking photos, recording video, and RTSP streaming. For photos, I examine clarity, color contrast, and level of detail, using photos taken with an iPhone 15 at 4K resolution as a baseline for comparison. For video, I focus on picture smoothness, frame rate stability, and storage cost. For RTSP, I look at streaming stability, resistance to interference, and real-time performance.
Camera Photo Performance
For an AI camera, photo quality is crucial. The RV1126B as supplied is equipped with a camera unit that supports 4K resolution, which meets the needs of security and visual inspection. Below is a comparison of images taken by the RV1126B and the iPhone 15 at the same 4K resolution. The image from the RV1126B shows slight edge distortion, and its color contrast is slightly inferior to the iPhone's. This is partly because the iPhone automatically post-processes its images, enhancing color contrast and correcting distortion. In terms of image clarity, the RV1126B is no worse than the iPhone. (Because Hackaday limits the size of uploaded files, the images are compressed.)

It is worth mentioning that on the IP camera's web configuration page, users can adjust parameters such as brightness, contrast, exposure, and backlight compensation to obtain higher-quality images.

Video Recording
Video recording quality is an even more important baseline indicator for a camera, because it directly affects the accuracy of visual recognition algorithms. A good camera must deliver clear, detailed frames, smooth motion, a stable frame rate, and low storage cost.
The video quality presented by the RV1126B fully meets the above indicators. Let's take a look at its performance:

In terms of clarity and detail restoration, the RV1126B supports a maximum resolution of 4K. In the video, fast-moving people are not blurred and leave no afterimages, and high-frequency details such as fingers and text labels are clearly visible. This matters for scenarios such as recording subtle defects in industrial quality inspection and capturing life details at home.
Regarding smoothness and frame rate stability, during the subsequent 10-minute long video recording, the frame rate remained stable at 30FPS without frame drops or freezes. Furthermore, when recording fast-moving objects (such as the fast-moving person in the test video), the picture is coherent without freezing, and there is no frame tearing or blurring caused by frame rate fluctuations.
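A claim like "stable at 30FPS without frame drops" can be checked from the recording's frame timestamps, which a tool such as ffprobe can export. The following is a minimal sketch, not the procedure used in this test: the function name, the 1.5x gap threshold, and the synthetic timestamps are all assumptions made for illustration.

```python
from statistics import mean


def analyze_frame_timestamps(timestamps_s, nominal_fps=30.0, drop_factor=1.5):
    """Return the average FPS and the number of suspiciously long frame gaps.

    timestamps_s: frame presentation times in seconds, in increasing order.
    A gap longer than drop_factor nominal frame intervals is counted as a
    likely dropped frame (the 1.5 threshold is an assumption, not a standard).
    """
    if len(timestamps_s) < 2:
        raise ValueError("need at least two timestamps")
    intervals = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    nominal_interval = 1.0 / nominal_fps
    long_gaps = sum(1 for dt in intervals if dt > drop_factor * nominal_interval)
    return {"average_fps": 1.0 / mean(intervals), "long_gaps": long_gaps}


# Synthetic data, not measurements: 10 s of perfectly paced 30 FPS video,
# and the same sequence with one frame removed to simulate a drop.
steady = [i / 30.0 for i in range(301)]
with_drop = steady[:100] + steady[101:]

print(analyze_frame_timestamps(steady))
print(analyze_frame_timestamps(with_drop))
```

On a real 10-minute recording, a long_gaps count of zero and an average close to 30 would support the observation above.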
In terms of storage optimization, the RV1126B provides dynamic bitrate adjustment: it automatically lowers the bitrate when the scene changes little and raises it when the picture is complex, balancing image quality against storage. In our measurement, 168 hours (one week) of 4K video occupied approximately 786GB of storage, which fits comfortably on the 1TB or larger drives that are common on the market. This can significantly reduce storage costs for scenarios that require long-term recording and retention (such as home security and unattended warehouse monitoring).
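To put the measured figure in perspective, 786GB over 168 hours can be converted into an average bitrate. The sketch below assumes decimal units (1 GB = 10^9 bytes); the helper names are made up for this example, and the result is only an average, since the actual bitrate varies with scene complexity.

```python
def average_bitrate_mbps(size_gb, duration_hours):
    """Average bitrate in Mbit/s, assuming decimal units (1 GB = 10**9 bytes)."""
    bits = size_gb * 1e9 * 8
    seconds = duration_hours * 3600
    return bits / seconds / 1e6


def storage_gb(bitrate_mbps, duration_hours):
    """Storage in GB needed to record at a given average bitrate."""
    return bitrate_mbps * 1e6 * duration_hours * 3600 / 8 / 1e9


rate = average_bitrate_mbps(786, 168)
print(f"average bitrate: {rate:.1f} Mbit/s")  # about 10.4 Mbit/s

# How long a 1 TB (1000 GB) drive would last at the same average rate.
hours_per_tb = 1000 / (786 / 168)
print(f"1 TB lasts about {hours_per_tb / 24:.1f} days")  # about 8.9 days
```

The same helpers can be used in reverse to estimate the drive size needed for a given retention period.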
RTSP Streaming
As a core function for AI cameras to achieve real-time data transmission, RTSP streaming performance directly determines the experience of cross-device and cross-platform real-time interaction, and is crucial in scenarios such as remote monitoring and multi-terminal collaboration.
The following is a screen recording of the RTSP real-time video stream:
Streaming stability and anti-interference ability are important indicators of RTSP functionality. The RV1126B performs excellently in complex network environments. In actual tests, it maintained continuous high-load streaming for 24 hours without any connection interruptions, providing a reliable core guarantee for scenarios requiring 24/7 real-time monitoring (such as unattended computer rooms and smart park security).
The RV1126B also performs well in low-latency transmission and real-time response. In scenarios with extremely high real-time requirements (such as remote control of industrial assembly lines and emergency security incident handling), streaming latency directly affects decision-making efficiency. By optimizing the encoding process and transmission protocol, the RV1126B controls the end-to-end streaming latency within 500 milliseconds, which is much lower than the industry average. This means that when viewing the video remotely, near-synchronous real-time feedback can be achieved, ensuring rapid response at critical moments.
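One common way to check an end-to-end latency figure is to point the camera at a running millisecond clock and photograph the live clock next to the player window, then subtract the time visible in the stream from the live time. The sketch below only does the arithmetic on such reading pairs. The sample numbers are made-up placeholders, not measurements from this test, and the function name is hypothetical.

```python
from statistics import mean


def latency_report(pairs_ms, budget_ms=500.0):
    """Summarize latency from (live_clock_ms, clock_seen_in_stream_ms) pairs."""
    latencies = [live - streamed for live, streamed in pairs_ms]
    return {
        "mean_ms": mean(latencies),
        "max_ms": max(latencies),
        "within_budget": max(latencies) <= budget_ms,
    }


# Placeholder readings for illustration only (not measured data).
samples = [(12_000, 11_640), (15_000, 14_610), (18_000, 17_655)]
print(latency_report(samples))
```

Checking the maximum rather than the mean against the 500 ms budget is a deliberate choice here, since a single late frame matters in real-time scenarios.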
To meet the needs of different devices and network environments, the RV1126B supports multiple stream outputs (such as a 4K main stream and a 640x480 sub-stream) and is compatible with mainstream RTSP clients (such as VLC and FFmpeg) and various security platforms. Whether a high-performance terminal needs the high-definition main stream or a low-bandwidth device needs the standard-definition sub-stream, the client can switch between them, so the camera integrates easily across scenarios and devices.
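On the client side, switching between the main stream and the sub-stream can be as simple as picking a profile that fits the available bandwidth. The following is a rough sketch under stated assumptions: the RTSP URLs use a documentation-only IP address and hypothetical paths (the real paths depend on the camera firmware), 4K is taken to mean 3840x2160, and the bitrates and the 1.5x headroom factor are illustrative guesses rather than measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamProfile:
    name: str
    width: int
    height: int
    approx_mbps: float  # assumed bitrate, not a measured value
    url: str            # hypothetical path; check the camera's own settings


PROFILES = [
    StreamProfile("main", 3840, 2160, 10.0, "rtsp://192.0.2.10:554/main"),
    StreamProfile("sub", 640, 480, 0.5, "rtsp://192.0.2.10:554/sub"),
]


def pick_stream(available_mbps, headroom=1.5):
    """Pick the highest-bitrate profile that fits, else the smallest one."""
    for profile in sorted(PROFILES, key=lambda p: p.approx_mbps, reverse=True):
        if available_mbps >= profile.approx_mbps * headroom:
            return profile
    # Nothing fits with headroom: fall back to the lightest stream.
    return min(PROFILES, key=lambda p: p.approx_mbps)


print(pick_stream(100).name)  # main
print(pick_stream(2).name)    # sub
```

The selected URL can then be opened in VLC or passed to an FFmpeg-based tool such as ffplay.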
Overall
Overall, the RV1126B demonstrates significant advantages in building the core functions of AI cameras, but there are also some areas for optimization.
In terms of advantages, its core performance shows significant scenario adaptability in different functional dimensions:
1. Photo-taking performance: Balancing high definition and flexible adjustment
The image clarity at 4K resolution is comparable to that of mainstream consumer-grade devices, and it supports manual adjustment of multiple parameters such as brightness, contrast, and exposure. This enables it to easily cope with different light environments (such as complex light and shadow in industrial workshops and light and dark changes in home interiors), making it very suitable for scenarios that require flexible adjustment of image effects according to the scene (such as capturing subtle defects in industrial quality inspection and recording life moments at home).
2. Video recording: Balancing clarity, smoothness, and storage efficiency
- In terms of clarity, 4K resolution can accurately restore picture details, even fast-moving objects (such as products on assembly lines and children at home) can maintain sharp edges without afterimages;
- In terms of smoothness, it stably outputs at 30FPS, with no frame drops or freezes during long-term recording, ensuring the coherence of dynamic images (such as tracking moving objects in monitoring);
- In terms of storage efficiency, the dynamic bitrate adjustment function adapts the bitrate to the complexity of the picture, controlling file size while preserving image quality, which is especially suitable for scenarios that require long-term video storage such as unattended warehouses and home security.
3. RTSP streaming: Providing stable and reliable support for real-time interaction
It has the ability to operate stably under high load for 24 hours, with end-to-end latency controlled within 500 milliseconds, and supports multiple stream outputs (such as 4K main stream and standard-definition sub-stream). This enables it to adapt to scenarios such as remote monitoring (such as store operators remotely viewing customer flow) and multi-terminal collaboration (such as security systems linking multiple display devices), providing reliable support for cross-device real-time interaction.
However, there are some areas for improvement: in terms of edge distortion and color contrast during photo-taking, it is slightly inferior to consumer-grade devices optimized by in-depth algorithms. For scenarios with high requirements for picture aesthetics (such as commercial display shooting), additional post-processing may be required.
We chose the RV1126B precisely because its core performance (clarity, stability, efficiency) meets the essential needs of most AI camera application scenarios, and its shortcomings can either be compensated for by parameter adjustment or post-processing, or have little impact on the practicality of the core functions. It is therefore an excellent main controller for an AI camera.
Deng MingXi