OPTIMIZER MATH BUG

Update (6-20-16):

The Bug Has Been Fixed In xc32-gcc v1.42. Get your updates!

https://hackaday.io/project/6450-operation-learn-the-mips-pic32mx1xx2xx370/log/40265-xc32-gcc-optimizer-math-bug-fixed-in-v142

And, interestingly, another bug I unknowingly uncovered was fixed, as well.

Update (11-20-15):

I have verified that this exists on the LINUX-x86 (32-bit) version of xc32-gcc v1.40, but NOT on the Windows(32-bit) version. I have not attempted to redownload the linux version of v1.40, as the version-number and date have not changed.

Briefly, as I recall: It appears the Windows version results in a "load immediate (-1) to register" (also with a "sign-extend"? or was that on the linux version?), whereas the Linux version results in only a "load immediate to register (255)". Interestingly, *the entirety* of the rest of the disassembly of the code is *identical* except for the obvious 4-byte offset of jumps thereafter.

Microchip has contacted me saying: "Our compiler team has tested it and got the expected/correct results... What version are you using?"--ish. Which is interesting, considering the code I submitted said *exactly* which version I was using...

Also, take-note: When logging in to your account, apparently HTTPS:// is *not* default, on a friggin' login-page! But you can get there by typing it in manually. Weird.

Update (10-15-15): CALL TO PIC32/xc32-gcc USERS!

There is now fully-executable and thoroughly documented (as well as significantly simplified) code available at github: xc32_mathBugTester (see the README.TXT).

I'm curious to know whether others run into the same problem. It shouldn't be difficult to modify for most PIC32's, but if you've a PIC32MX170F256B, and some spare time, you can run the already-compiled executable.

This bug was originally submitted as a ticket to MICROCHIP on 9-20-15, per @jlbrian7's suggestion and sleuthing (Thanks, buddy!).

This code was submitted today as a follow-up to a response from their tech-support today; he's "working with the compiler team."

This is the simplified code which causes the bug (with -O1):

//NOTE: 
//  PRINT_IT() must be a MACRO, or inline calls to printf(),
//    (not a function) for the bugs to occur.

#define PRINT_IT(ID, power, dir, signedPower) \
  printf("  %d: power(u8) = %" PRIu8 \
         "   dir(i8) = %" PRId8 \
         "   signedPower(i16) = %" PRIi16 "\n\r", \
         ID, power, dir, signedPower);


//This function *should* Print the following, in all cases:
//  1: power(u8) = 127   dir(i8) = -1   signedPower(i16) = -127
//  2: power(u8) = 127   dir(i8) = -1   signedPower(i16) = -127
//  3: power(u8) = 127   dir(i8) = -1   signedPower(i16) = -127
//
//WITH OPTIMIZATION it prints:
//  1: power(u8) = 127   dir(i8) = -1   signedPower(i16) = -127
//  2: power(u8) = 127   dir(i8) = 255   signedPower(i16) = -127
//  3: power(u8) = 127   dir(i8) = 255   signedPower(i16) = 32385

//REGARDLESS of the *printout*, 
//the horrendously-valued "signedPower" is actually used and assigned to the PWM 


//NOTE: Many tests have been performed (no longer included in this code)
// SEE main.c in the parent directory!!!
void testBrokenMathAndPrintout(void)
{
   uint8_t power = 127;
   int8_t dir = -1; 


   int16_t signedPower = (int16_t)power * (int16_t)dir;

   //"THE BUG" appears differently depending on how PRINT_IT is implemented
   // IN THIS FILE:
   // PRINT_IT() is merely a macro.
   //  Identical to including the printf() statement RIGHT HERE
   //  AND has the IDENTICAL result.
   PRINT_IT(1, power, dir, signedPower);


   //This case will SELDOMLY be *executed*
   // but without some *potential* reassignment to the variable 'dir' 
   // (somewhere) the bug does not appear
   // (Note how difficult it is to come up with a case that won't be optimized-out!)
   if(rand() == 0)
   {
      printf("!!! rand() == 0, else{} executed!\n\r");
      
      //THIS VALUE IS NOT EXPECTED TO BE USED
      dir = ((int8_t)(0));
   }


   PRINT_IT(2, power, dir, signedPower);


   signedPower = (int16_t)power * (int16_t)dir;

   PRINT_IT(3, power, dir, signedPower);
}

-------

Update (9-20-15): Thanks @jlbrian7 for the reminder... See the bottom for some updates...

-----

The past couple days' project-efforts have been spent trying to pinpoint the cause of some very strange behavior...

Briefly: int8_t is acting like it's much larger than that.

The trouble appeared as a result of multiplying -1 (int8_t) by another value (0-255), and getting results in the thousands.

Yes, I watched my casting quite closely, though a certain amount of the time was definitely spent wondering whether I understood casting (and many other things I've used regularly for over a decade) correctly.

Basically:

int8_t direction = -1; //Always -1, 0, or 1
uint8_t power = 127; //PWM-ish power, from 0-255 (0-100%)

int16_t signedPower = (int16_t)direction * (int16_t)power;

//results in signedPower == 32385

Not only does it print this value, it *uses* it, regardless of casting, masking, or various other tests I went through. Dig this:

int8_t direction = -1;

printf("1: direction = %" PRIi8 "\n", direction);

//do some stuff here, but DON'T actually CHANGE the value of direction

printf("2: direction = %" PRIi8 "\n", direction);

results in:

1: direction = -1
2: direction = 255

Long story short, this appears to be a result of the optimizer. Using 'xc32-gcc -O1 ...' causes the glitch, not including a -O argument doesn't.

this is a portion of the output from 'xc32-gcc -v':

gcc version 4.8.3 MPLAB XC32 Compiler v1.40 (Microchip Technology)

Not sure whether this exists in other incarnations of gcc... e.g. maybe mipsel-gcc-4.8.3?

CAUSING this bug is *difficult*... it seems to require several conditions to be met, which can be seen in the example-code below, which has been stripped down to nearly the bare-essentials to recreate the bug (and barely even remotely resembles the original code that the bug was found in).

Compiled with:

CFLAGS = -mprocessor=32MX170F256B -funsigned-char -O1

all:
   xc32-gcc -c $(CFLAGS) -o main.o main.c 
   xc32-gcc $(CFLAGS) main.o --output main.elf -Wl,-Map=main.map,--cref
   xc32-bin2hex main.elf

(Again, removing -O1 from CFLAGS causes the bug to disappear)

void brokenEventually(void)
{
   uint8_t newPower=0x7f;
   int8_t dir = -1;
   static int callNum1 = 0;
   static int callNum2 = 0;

    //This was originally all casted as int16's, same effect...
   int16_t signedPower = (int32_t)newPower * (int32_t)dir;

   //dir = -1, above, this prints out -1...
   if(callNum1 == 5)
   {
      callNum1 = 0;
      
      //something like this is necessary, otherwise the optimizer
      //can weed out the else case, which is necessary to reproduce the bug
      if(rand() != 0)
      {
            printf("1: %" PRIu8 "  %" PRId8 "  %" PRIi16 "\n\r",
                                           newPower, (int8_t)dir,
                                           signedPower);
      }
      else
      {
         //Without this, '-1' is printed in both cases
         //With it, '-1' is printed, above, and '255' is printed below
         //This happens regardless of (int8_t) cast(s).
         // I'm guessing what's happening, here, is that without some
         // assignment somewhere, it's not actually bothering to create a
         // register/memory location for it...
         // (Why, though, would it still print -1 above? Maybe because
         // technically the variable/register needn't be used until now...)
         dir = ((int8_t)(0));
      }
   }

//   signedPower = (int16_t)newPower * (int16_t)dir;
   signedPower = (int16_t)newPower * (int32_t)dir;


   
   //This prints out 255
   //(Oddly, only if dir is changed somewhere, e.g. the else case, above)
   //( which doesn't even effect this case most of the time!)
   if(callNum2 == 3)
   {
      callNum2 = 0;

      printf("2: %" PRIu8 "  %" PRId8 "  %" PRIi16 "\n\r",
                                           newPower, dir,
                                           signedPower);
   }

   callNum1++;
   callNum2++;

}

Update (9-20-15):

This seems to be a similar bug in sdcc (rather'n xc32-gcc), but I haven't analyzed it in detail:

http://sourceforge.net/p/sdcc/bugs/2324/ Thanks @jlbrian7 for finding that!

ALSO:

I experimented pretty thoroughly with the various optimization options...

I read the gcc manpage, which lists each individual optimization argument that's included in each optimization level (e.g. -O1 enables -faggressive-loop-optimizations, and several others).

In fact, I went through and enabled them all, and did not get the same bug... Now, rather'n using -O#, I'm using the individual -f<options> as shown below:

 # xc32-gcc -Q -O1 --help=optimizers:
 CFLAGS += -faggressive-loop-optimizations
 CFLAGS += -fbranch-count-reg
 CFLAGS += -fcombine-stack-adjustments
 CFLAGS += -fcommon
 CFLAGS += -fcompare-elim
 CFLAGS += -fcprop-registers
 CFLAGS += -fdce
 CFLAGS += -fdefer-pop
 CFLAGS += -fdelayed-branch
 CFLAGS += -fdelete-null-pointer-checks
 CFLAGS += -fdse
....
actually, the list is *huge*, so I'll leave you to do 
# xc32-gcc -Q -O1 --help=optimizers

Interestingly, even with *all* those optimizations, the main loop doesn't really increase in speed by much. But, with -O1, it increases nearly 300% faster. Which might make sense, if it's doing some weird stuff with optimizations like the one discovered above (if done *correctly*). These optimizations, I assume, must be Microchip's special voodoo that they keep hidden... (e.g. not shown in --help=optimizers).

Discussions

Eric Hertz wrote 08/22/2015 at 18:36

OTOH, maybe recreating the bug isn't so difficult, after all, maybe the difficult part is just the side-by-side printout of it non-buggy vs. buggy in the same code... Too tired to decipher that, at this point. In fact, I'm pretty sure this bad-math would be an issue with nearly *any* code compiled with -O1 using negative-valued int8_t's, *unless* it just happened to be a case where it could get away without assigning a register... so BEWARE.

Are you sure? yes | no

This is getting confusing!

PIC[32]sy

Discussions

Become a Hackaday.io Member