Today I got some time to dig into the I2C issue and am pretty convinced the SGTL5000 is clock stretching when it shouldn't be doing anything at all.
As I mentioned in the previous log entry, there are three devices on the I2C bus:
- SGTL5000 Codec chip at I2C 7-bit address 0x0A
- gCore EFM8-based PMIC/RTC at I2C 7-bit address 0x12
- FT6236 Touchscreen controller at I2C 7-bit address 0x38
Different tasks access the chips as part of their normal processing. The I2C bus access functions are protected by a mutex so only one task gets to run a transaction at a time.
- The SGTL5000 is configured at start-up and then not accessed unless the volume of either the microphone or speaker changes.
- The EFM8 is accessed every 100 mSec, primarily to check whether the power button has been pressed (to turn the device off) and to get the charge/battery state for display in the GUI (the battery voltages go into an averager).
- The FT6236 is polled every 30 mSec by the GUI task looking for touch activity.
What I see is that some reads of the EFM8 get an I2C timeout. According to the Espressif documentation this can only occur if the slave stretches the clock (holds SCL low) for too long (the default limit is about 400 uSec). The failure was intermittent until I found one situation where it occurs on every read.
The EFM8 has a register called GPIO that contains the charge status bits and also a bit indicating if a Micro-SD card is installed or not. When the charge status is "Charge Done" and a Micro-SD card is installed then every single read times out.
The scope trace below shows the interleaved EFM8 and FT6236 accesses with the timeout. You can see the timeout at the beginning of the longer bursts which are accessing the EFM8.
The failing read looks like the following (I had increased the default timeout to 1200 uSec as an experiment). The EFM8 can stretch the clock but only before the first data byte and only for a very short time (I wrote its code). The failing clock stretching occurs after the first data byte. You can see the Espressif driver generates 9 clocks when it times out in an effort to clear any hung slave I2C state machine and then ends the cycle.
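For reference, the two timeout values mentioned above can be converted into the units the ESP32 I2C peripheral counts in. This is only a rough sketch: it assumes the timeout is expressed in APB clock cycles and that the APB clock runs at 80 MHz, which is how the legacy ESP-IDF call i2c_set_timeout() is understood to work. The helper name and the ASSUMED_APB_HZ constant are made up for illustration and are not part of the project code.

```c
#include <stdint.h>

/* Assumption: the I2C timeout counter runs from an 80 MHz APB clock. */
#define ASSUMED_APB_HZ 80000000u

/* Hypothetical helper: convert a timeout in microseconds to APB clock cycles.
 * The multiply is done in 64 bits so it cannot overflow for any 32-bit input. */
static inline uint32_t i2c_timeout_us_to_cycles(uint32_t timeout_us)
{
    return (uint32_t)(((uint64_t)timeout_us * ASSUMED_APB_HZ) / 1000000u);
}
```

Under those assumptions, 400 uSec works out to 32000 cycles and the experimental 1200 uSec to 96000 cycles, the kind of value that would be handed to i2c_set_timeout().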
What is interesting is that if I unplug the gCore POTS shield while the system is running, disconnecting the SGTL5000 (which at this point is not getting any I2C cycles), the read magically starts working.
The read data = 0xA in the failing case, which is curious because the SGTL5000 address is 0xA. I suspect the more random failures may have occurred when the voltage or current data being read happened to contain the value 0x0A in a data byte. But the correlation between a data byte and the SGTL5000 address is a bit tenuous because when the 7-bit address is clocked out, it's actually shifted left by one bit to make room for the read/write bit. So I'm not entirely sure what's going on but I am sure that the SGTL5000 is interfering with reads going to the gCore EFM8.
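To make the address shift concrete, here is a minimal sketch of how the first byte of an I2C transaction is formed from a 7-bit address. The helper name i2c_addr_byte is hypothetical; the bit layout (address in bits 7..1, read/write flag in bit 0, with 1 meaning read) is standard I2C and matches the (addr7 << 1) | I2C_MASTER_READ expression in the driver code further down.

```c
#include <stdint.h>

/* Hypothetical helper: build the address byte as it appears on the wire.
 * The 7-bit address occupies bits 7..1; bit 0 is the R/W flag (1 = read). */
static inline uint8_t i2c_addr_byte(uint8_t addr7, int is_read)
{
    return (uint8_t)(((addr7 & 0x7F) << 1) | (is_read ? 1 : 0));
}
```

With this, the SGTL5000 at 0x0A is addressed on the wire as 0x14 (write) or 0x15 (read), the EFM8 at 0x12 as 0x24 or 0x25, and the FT6236 at 0x38 as 0x70 or 0x71. A data byte of 0x0A therefore does not have the same bit pattern as the SGTL5000's address byte, which is why the correlation is hard to explain.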
I did another experiment where I looked at the data the Espressif I2C driver returns when it is also returning a timeout error, and it appears to be correct (i.e. it read the correct data into its buffer before noting the timeout). So I have a possible work-around: ignore the timeout error and just pretend the cycle finished correctly. It's a one-line addition to the low-level I2C access routine.
esp_err_t i2c_master_read_slave(uint8_t addr7, uint8_t *data_rd, size_t size)
{
    if (!is_initialized) {
        return ESP_FAIL;
    }
    if (size == 0) {
        return ESP_OK;
    }

    i2c_cmd_handle_t cmd = i2c_cmd_link_create();
    i2c_master_start(cmd);
    i2c_master_write_byte(cmd, (addr7 << 1) | I2C_MASTER_READ, ACK_CHECK_EN);
    if (size > 1) {
        i2c_master_read(cmd, data_rd, size - 1, ACK_VAL);
    }
    i2c_master_read_byte(cmd, data_rd + size - 1, NACK_VAL);
    i2c_master_stop(cmd);

    esp_err_t ret = i2c_master_cmd_begin(i2c_master_port, cmd, 1000 / portTICK_RATE_MS);
    i2c_cmd_link_delete(cmd);

    // SGTL5000 I2C bug work-around
    if (ret == ESP_ERR_TIMEOUT) ret = ESP_OK;

    return ret;
}
But of course this is unsettling. I could try to qualify it by looking for 0xA in one of the read data bytes, but I'm not sure that wouldn't introduce additional issues.
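A minimal sketch of what that qualification might look like, assuming the only timeouts worth forgiving are those where a read data byte equals the SGTL5000's 7-bit address. The names SGTL5000_ADDR7 and timeout_is_forgivable are hypothetical and are not part of the existing code, and the underlying correlation is unconfirmed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SGTL5000 7-bit I2C address (from the device list above). */
#define SGTL5000_ADDR7 0x0A

/* Hypothetical helper: return true if any byte in the read buffer equals the
 * SGTL5000 address, i.e. the timeout matches the suspected failure pattern. */
static bool timeout_is_forgivable(const uint8_t *data, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (data[i] == SGTL5000_ADDR7) {
            return true;
        }
    }
    return false;
}
```

The work-around line would then become something like `if (ret == ESP_ERR_TIMEOUT && timeout_is_forgivable(data_rd, size)) ret = ESP_OK;`. The trade-off is that a timeout caused by the SGTL5000 for some other reason would be reported as a real error again.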
Alternatively I could bite the bullet and replace the SGTL5000 with another codec chip which means writing and debugging another codec chip driver. I'm already planning to spin the board for a couple of minor changes so I would include the new chip there.
Codecs that look like a good fit include the MAX9867, some TLV320 parts, and the ES8388. I'm ignoring parts like the DA7212 because they come in wafer-level bump-mounted packages and I don't want to deal with that. The ES8388 is available to me in the USA through LCSC, which might take some time to get here but is the least expensive. The MAX9867 is available through normal US distribution but at 3X the cost.
So decision time. What to do? I need to sleep on this. I am going to include the kludge while I try to debug some of the other issues.