It turns out I was completely wrong about what was going on with my faulty sensor readings. It took me weeks to figure it out, partly due to my own carelessness. Here are some things I tried that did not work:
- Piece by piece, I replaced every physical part of the system. That includes even things like the little breadboard that holds the ESP32 and the jumper wires that connect everything together.
- I changed the update intervals for the two sensors. My hypothesis was that there might be a weird timing error where accesses to the two devices on the I2C bus overlapped. I had been using 30 seconds for the TSL2591 and 60 seconds for the BME280. I changed both of them to nearby prime numbers so that they would very rarely want to be read at the same point in time.
- I moved the BME280 to a separate pair of I2C pins. The ESP32 supports two separate I2C busses, and most of the GPIOs can be configured to be SDA/SCL.
- I changed my scripting around so that the ESPHome "interval" was used to ask the sensors for updates sequentially with short pauses in between. My hypothesis was that I could force access to the I2C bus to be a single device at a time.
- I switched the TSL2591 from powersave mode to always on. In powersave mode, the device shuts down between readings and needs to go through an ADC integration cycle of about 600 ms before values can be read.
- I changed the I2C bus frequency to a couple of different values.
- With the idea that heat on the TSL2591 might be affecting the whole system, I used some screws and nuts to make stand-offs to hold the TSL2591 board a half inch or so away from the glass. Here's a picture:
- I moved the TSL2591 completely away from the water heater glass and just had it watching a little lamp.
- I modified the scripting so that if a value was not actually read from either sensor, I would reset the I2C bus and the sensors. You can do that in a lambda call. Weirdly, that sometimes worked and sometimes rebooted the ESP32.
- I added a text sensor to make it easier for me to monitor what was going on without tailing the ESPHome log all the time. The same script that sends the messages over MQTT updates that text sensor with the same value. This is a standalone system, but since I also run Home Assistant, it's easy to monitor all the sensors, view a logbook of changes, view history, etc. I used the special text value "RESET_I2C" (just for the text sensor) so it was easy for me to see when that activity kicked in.
- I modified the script to notice whether things got better after resetting I2C. If not, and it carried on for more than a few minutes, I used another special text value "REBOOT_ME" for the text sensor. Early on, I got tired of tromping up and down the basement stairs just to cycle the power, so I plugged things into a WiFi-controlled smart socket that I could toggle from Home Assistant. I also set up a Home Assistant automation that would watch for the "REBOOT_ME" text and then cycle the power. I felt pretty smart about that, but I never actually saw it happen.
    # This is an explicit text sensor that gets the same value as the
    # message we send to waterwatcher. No explicit update interval because
    # a value gets published when we do the rest of the flame status
    # processing.
    text_sensor:
      - platform: template
        name: "${node_name} flame status text"
        id: i_flame_status_text
        icon: mdi:fire
        update_interval: never
- I set up another device that was an identical twin. It has the same model of ESP32 board, the same model of TSL2591, and the same ESPHome configuration (other than the node name). I had it lying around on my desk, and it worked perfectly. That was odd, since its parts were the ones I had replaced early on in the troubleshooting.
- I moved the twin down to the basement, and it started being flaky. Could it be something like a few degrees colder temperature, trace amounts of natural gas fumes, lack of cosmic rays, ...?
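For anyone who wants to try the same I2C experiments from the list above, here is a rough ESPHome sketch of several of them combined: a second bus for the BME280, an explicit bus frequency, prime update intervals, and powersave turned off on the TSL2591. It is not my actual configuration. The pin numbers, IDs, entity names, the 100kHz value, and the specific primes 31 and 61 are all assumptions picked for illustration, and depending on your ESPHome version the BME280 platform may be called `bme280` rather than `bme280_i2c`.

```yaml
# Illustrative sketch only. Pins, IDs, names, the bus frequency, and the
# exact intervals are assumptions, not the configuration from this project.
i2c:
  - id: bus_a
    sda: GPIO21
    scl: GPIO22
    frequency: 100kHz        # try other values here when experimenting
  - id: bus_b                # second bus so the BME280 does not share wires
    sda: GPIO16
    scl: GPIO17
    frequency: 100kHz

sensor:
  - platform: tsl2591
    i2c_id: bus_a
    power_save_mode: false   # keep the sensor powered between readings
    update_interval: 31s     # prime, was 30s
    visible:
      name: "Example visible light"

  - platform: bme280_i2c     # "bme280" in older ESPHome releases
    i2c_id: bus_b
    update_interval: 61s     # prime, was 60s
    temperature:
      name: "Example temperature"
```

With 30 and 60 seconds, every BME280 read lands on the same tick as a TSL2591 read. With 31 and 61 seconds (assuming both timers start together), the two only line up every 31 × 61 = 1891 seconds, or roughly once every half hour.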
OK, if you've read this far, you may have already guessed. The problem was the USB power supply that I was using. I assume it was marginal but worked OK for a year or two. Then it degraded for some reason to become "not OK". But wait, I had replaced the power supply as one of the many physical things in the system. Both the original and the replacement were just old phone chargers that I pulled out of my junk box. It looks like I replaced Crap with Also Crap. When I finally plugged the ESP32 into a high-quality USB power supply, things instantly became rock steady and stable.
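For completeness, the power-cycling automation I mentioned in the list above can be sketched roughly like this in Home Assistant. The entity IDs are hypothetical placeholders (yours will depend on the node name and on whatever smart socket you use), and the 10 second off time is just a guess at something reasonable.

```yaml
# Illustrative sketch only. Both entity IDs are hypothetical placeholders.
automation:
  - alias: "Power-cycle the flame sensor on REBOOT_ME"
    trigger:
      - platform: state
        entity_id: sensor.example_node_flame_status_text
        to: "REBOOT_ME"
    action:
      - service: switch.turn_off
        target:
          entity_id: switch.example_smart_socket
      - delay: "00:00:10"    # leave the power off briefly before restoring it
      - service: switch.turn_on
        target:
          entity_id: switch.example_smart_socket
```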