-
BEEN USING...
10/19/2016 at 14:32 • 0 comments
I've been using three different variants of the PIC32MX-series for quite some time, now...
The logs, 'round here, are a bit confusing, as there's a lot of mention of difficulties, and little explanation of accomplishments. Bad Me.
So, yes, it works.
(Yes, I'm using the xc32-gcc compiler, with free-license from Microchip, and Yes, the math-bug seems to have been fixed in the latest version).
Yes, I have been using them for some time.
Yes, I am programming them via openOCD and a FT2232H breakout-board via JTAG.
Yes, I have managed to whittle that down to a couple bash-scripts and makefile snippets 'make run' which don't require any user-interaction.
Yes, programming via JTAG takes a matter of seconds, now, rather than minutes.
I don't recall all the details of *how* I managed all this, it's all scripted now, but rereading my logs, here, I'm reminded that some amount of "magic" was involved...
So... where's that leave us...?
I guess, at this point, if you're interested in going this path, lemme know... enough "lemme knows" and I'll try to piece together exactly how I pulled it together, and dig/throw up the source-code mods, etc. as I find them.
Oh, and check out @jaromir.sukuba's page: https://hackaday.io/page/2437-myths-and-legends-of-pic-microcontrollers/discussion-67635 wherein you might get ideas for how to go the open-source route, which I've yet to accomplish.
BE SURE TO check out the "instructions" on this project's page... There seems to be some information there about how to rebuild openOCD with the necessary hacks... and more.
-
A Question!!! PBCLK Divisor 1:1 and SFRs
06/26/2016 at 15:38 • 0 comments
OK, so darn-near every register says something like "If you're using a peripheral-clock-divisor of 1:1, then do not write this register in the instruction immediately following writing this value"
...
Here's a quote from the datasheet:
When using 1:1 PBCLK divisor, the user’s software should not read/write the peripheral SFRs in the SYSCLK cycle immediately following the instruction that clears the module’s ON bit.
I have no idea what to search for, search-fu-fail, but it seems like it must be a common-enough thing...
Can we rely on xc32-gcc to make sure this never happens?
The obvious solution, not knowing, is to use a slower peripheral-clock (e.g. 1:2), but yahknow, sometimes yahs wants the speed, or need a ratio that's not 1:2^n where n>0, or something.
Has anyone run into the 1:1 PBCLK being an issue?
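Not an answer to the question, but one conservative pattern that doesn't depend on what the compiler happens to emit is to put an explicit nop between clearing the ON bit and the next SFR access. The sketch below is a stand-in that compiles off-target: `fake_T1CON`, `ON_BIT_MASK` and both function names are made up for illustration, and whether a single nop is enough on real hardware is an assumption to check against the datasheet.

```c
#include <stdint.h>

/* Stand-in for a peripheral control register so the sketch compiles off-target.
 * On a real PIC32MX the SFR from <xc.h> (e.g. T1CON) would be used instead. */
static volatile uint32_t fake_T1CON = 0x8000u;

/* Assumption: the module's ON bit is bit 15, as it is for Timer1. */
#define ON_BIT_MASK 0x8000u

/* Clear the ON bit, then burn one instruction before the module is touched again. */
static inline void module_off_then_wait(void)
{
    fake_T1CON &= ~ON_BIT_MASK;   /* on target this would be a write to T1CONCLR */
    __asm__ volatile ("nop");     /* volatile asm: the optimizer will not delete it */
}

/* The first SFR access after switching the module off happens after the padding. */
uint32_t stop_and_read(void)
{
    module_off_then_wait();
    return fake_T1CON;
}
```

Calling `stop_and_read()` here just returns the stand-in register with its ON bit cleared; on the chip the point is only the ordering of the three steps.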
-
xc32-gcc Optimizer Math Bug Fixed in v1.42
06/20/2016 at 07:35 • 0 comments
Alright! It's been 9 months since I submitted the bug, but it appears that the latest greatest xc32-gcc has Fixed It!
The ol' bug was described, experimented-with, and characterized in great-detail here: https://hackaday.io/project/6450-operation-learn-the-mips-pic32mx1xx2xx370/log/23899-optimizer-math-bug
Basically, what it boiled down to was that if you used -O1 (optimization at the highest level available in the "free" xc32-gcc), then math-errors would occur with uint8_t and int8_t. It seems what happened, on rare occasion (which I just happened to be lucky enough to encounter in my first project), was that the optimizer would treat an int8_t as though it were 32-bit, and forget to pad (sign-extend) the remaining bits. It could've worked, if it padded them correctly, but it didn't. So, e.g., I had something like:
int8_t direction = -1;
uint8_t power = 127;
int16_t signedPower = (int16_t)direction * (int16_t)power;
And, instead of getting -127, I was getting 32385. Yep, 32385 = 0xff * 127. And we all know that -1 is 0xff, when represented in an int8_t, right...?
Except, it didn't happen *all the time*... you'll have to look at the details of the link above to see *when* it happened, but it did happen, and when driving a motor, the difference between a power-level of -127 and 32385 could've caused quite a bit of finger-damage to the unwary.
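To make the arithmetic concrete, here's a small sketch that can be compiled on a PC. It is not the original test code, and both function names are made up for illustration. The first function is what the C source asks for; the second mimics the miscompiled result by zero-extending the int8_t's bit pattern instead of sign-extending it.

```c
#include <stdint.h>

/* What the C code asks for: dir is sign-extended before the multiply. */
int16_t signed_power_expected(int8_t dir, uint8_t power)
{
    return (int16_t)((int16_t)dir * (int16_t)power);
}

/* Mimics the buggy result: the int8_t's bits are zero-extended,
 * so -1 (0xff) is treated as 255. */
int16_t signed_power_zero_extended(int8_t dir, uint8_t power)
{
    uint8_t raw = (uint8_t)dir;   /* reinterpret the 8-bit pattern as unsigned */
    return (int16_t)((int16_t)raw * (int16_t)power);
}
```

With dir = -1 and power = 127, the first returns -127 and the second returns 32385, which still fits in an int16_t and so sails straight through to whatever consumes it.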
Having tested it, this apparently only occurred on the linux version of xc32-gcc (v1.40)... I tried the exact same code under WinXP, and it worked fine.
So, I thought I got the ol' brush-off, because I never saw an update, and even looking through changelogs between v1.40 and v1.42, no mention of it... But I tried v1.42 anyways, and yep, it works.
Woot!
Apparently I inadvertently discovered another bug, as well... by having a bug within my own code... I used "\n" to end one line, and "\n\r" to end those thereafter... And, of course, I was outputting via serial-port, where a lone "\n" isn't enough to carriage-return back to column-zero as well...
In 1.40, it displayed as I'd intended... (despite my bug)
But in v1.42 it displays funky... "Almost as though" it's not returning to column-zero before starting the new-line...
Yep, I forgot a "\r" and they fixed not only the math-bug, but also the carriage-return "bug." Woot!
There may be more in a bit... I've yet to reimplement -O1 in a regular-ol' project to see how my code functions... logically, there should be one level of optimization greater than what it was before, so it might be a bit faster... OTOH, my workaround for the math-bug was to use -O0 and enable all the -f<options> I could find... so... we shall see.
Indeed, my "loop-count/second" has increased from around 32,000 to 80,000 by being able to use -O1 instead of all the -f<options> explicitly. And now my bit-banged UART works darn-near perfectly:
("Why would you use a bit-banged UART on a chip which has a built-in UART peripheral...?" that's another topic entirely...)
And here's a showing of the carriage-return bug-fix. The earlier lines shouldn't've been aligned as they are, I forgot "\r"... The later lines show "1:" shifted as it should be with merely a "\n", when sent via serial-port...
I had "\n" after the "loopNum" statement, but "\n\r" after all the other statements... The old version (v1.40) apparently automatically appended the "\r" (!?), but the new version (v1.42) doesn't, so it works as-expected per the coding, which had a bug in it.
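If relying on the toolchain to append the "\r" is what bit here, one option is to do the expansion explicitly before the bytes hit the serial port. A minimal sketch, assuming the strings use a bare "\n" as the line ending; `expand_newlines` is a hypothetical helper, not part of this project's code.

```c
#include <stddef.h>

/* Copy src into dst, expanding each '\n' into "\r\n".
 * Returns the number of bytes written (excluding the terminating NUL),
 * or 0 if dst is too small to hold the result. */
size_t expand_newlines(const char *src, char *dst, size_t dst_len)
{
    size_t out = 0;

    for (; *src != '\0'; src++) {
        size_t need = (*src == '\n') ? 2 : 1;

        /* Leave room for this character (or pair) plus the final NUL. */
        if (out + need + 1 > dst_len)
            return 0;
        if (*src == '\n')
            dst[out++] = '\r';
        dst[out++] = *src;
    }

    if (dst_len == 0)
        return 0;
    dst[out] = '\0';
    return out;
}
```

The output buffer can then be handed to whatever routine actually transmits the characters.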
Now, Microchip, I may have "lucked" into discovering the math-optimization bug, but it was actually a tremendous amount of effort, on my part, to determine how to reproduce the bug, how to present it in a way that didn't include the thousands upon thousands of lines of code that I discovered it in, etc... And you lucked-into the fact that I happened to be willing to go to all that effort (hours, MANY MANY hours) to present it to you... submit it as a ticket, even work with your employee.. When I'd already found a workaround for my own purposes using the "-f<options>" and disabling the optimizer altogether (via -O<num>)... That workaround worked fine for my needs; presenting it to you was a tremendous effort. Further, you happened, apparently, to "luck" into another bug-discovery, which I might've happened to reveal to yah... So, yahknow, a bunch of free PIC32s and various other chips via the "free samples" service are nice, and the curiosity-board was pretty cool, but in reality, we're talking a sum-total of something like $50 I've gotten outta y'all, total. And if you only consider services you *don't* offer to *everyone* (again, free-samples), then only the curiosity-board, which was something like $20... And I mean, not to be rude, but upping my loop-count from 32k to 80k is pretty nice, but I could've accomplished the same (and more) by learning to move my code from FLASH to SRAM, if it was *that* important to me... So, yahknow, I think I put, easily, a good 40+ hours into that project, and at even minimum-wage, I think you've got a shitton of schwag to send my way... I get it, the odds of my "lucking" into something like this again are pretty low, so maybe a telecommuting-job-offer at 20hr/wk is a bit much to ask... But, schwag, lots of schwag... surprise-schwag is even cooler... My cat could use a toy or two as well, and I'm always up for beer... 
And as much as I'm growing to love the PIC32's, I'm still partial to AVR's when it comes to 8-bit, so now that you own 'em... maybe you could send some of those (or at least allow me free-samples of 'em, since Atmel never did despite my decade+ loyalty). I'm living in Brokeasheck these days... Things I could sell to pay the bills would be even better. My address is in the files I uploaded to y'all...!
-
PIC[32]sy
09/29/2015 at 11:10 • 0 comments
Update: Adding before-and-after images of it in-circuit.
The problem is, there's quite a bit of bare-minimum support-circuitry with the PIC32 (and probably most newer high-speed microcontrollers), including several capacitors, some resistors for the reset pin, and several power pins. Add to that, all my projects use a heartbeat LED, 9/10 also use serial I/O, and of course I need the programming-header (JTAG, in my case).
I still want to breadboard with this, but the bare-minimum, alone, creates quite a rat's nest, at least the way I breadboard...
Originally I planned on placing a PCB flat atop the chip... @antti.lukats's design for DIPSY, with its upright pins, was definitely a contributing-factor in this layout idea... But I had some difficulty figuring out how to do that with through-hole parts like the header-pins... and staring at it for a while, it finally came to me!
(Next time I'll use actual right-angle headers, instead of bending straight ones).
This is actually handy-enough I might do it with through-hole AVRs, etc. as well... So I'm thinking with my next custom PCB-run I'll include something like... what're those things called, not J-lead, right? Anyways, holes at the edge that are only half-circles, so they sit right atop a DIP's pins, where they enter its package. In this case, I just blobbed a ton of solder through the lowest row of holes until the blob finally reached the pins' faces. (The pins come out at a slight angle, so the PCB and its holes aren't actually resting flush on them).
Most of the support-circuitry is connected to one side, but some wires needed to cross-over.
Everything needed to run code is now "on-chip"... The heartbeat is fading in this image, and it's also outputting serial data (and responding to received data) on the unpopulated serial-header pins.
And it still fits in the breadboard without occupying (almost) any additional breadboard space, and should probably still fit in "machine-pin" sockets. In fact, it should work fine in a circuit designed for this chip, e.g. on a PCB with the same support-circuitry on-board.
The left connector is for (TTL-level RS-232) serial I/O. The right is JTAG.
The other nice thing about the upright PCB, rather'n a flat one, is that the chip's part-number is still visible. Handy since I've got two in this family with slightly different pinouts/functionality.
That piece of red-heatshrink says NYI and covers the currently unconnected/unhoused 7th pin... That's for SRST or TRST, should I ever find a use for it. Though, I'm almost certain I've read somewhere that neither are necessary (nor even implemented?) for PIC32 JTAGging... (I shoulda bookmarked that and posted it as a response in the earlier logs regarding the various JTAG resets...).
Here's a "before and after" of my motor-driver project:
Oh, and... starting to get the hang of this FT2232H breakout-board I've been using... Now making use of its second "channel" for the serial-port (and broke out pins for AVR programming, as well).
-
OPTIMIZER MATH BUG
08/22/2015 at 18:29 • 1 comment
Update (6-20-16):
The Bug Has Been Fixed In xc32-gcc v1.42. Get your updates!
And, interestingly, another bug I unknowingly uncovered was fixed, as well.
Update (11-20-15):
I have verified that this exists on the LINUX-x86 (32-bit) version of xc32-gcc v1.40, but NOT on the Windows(32-bit) version. I have not attempted to redownload the linux version of v1.40, as the version-number and date have not changed.
Briefly, as I recall: It appears the Windows version results in a "load immediate (-1) to register" (also with a "sign-extend"? or was that on the linux version?), whereas the Linux version results in only a "load immediate to register (255)". Interestingly, *the entirety* of the rest of the disassembly of the code is *identical* except for the obvious 4-byte offset of jumps thereafter.
Microchip has contacted me saying: "Our compiler team has tested it and got the expected/correct results... What version are you using?"--ish. Which is interesting, considering the code I submitted said *exactly* which version I was using...
Also, take-note: When logging in to your account, apparently HTTPS:// is *not* default, on a friggin' login-page! But you can get there by typing it in manually. Weird.
Update (10-15-15): CALL TO PIC32/xc32-gcc USERS!
There is now fully-executable and thoroughly documented (as well as significantly simplified) code available at github: xc32_mathBugTester (see the README.TXT).
I'm curious to know whether others run into the same problem. It shouldn't be difficult to modify for most PIC32's, but if you've a PIC32MX170F256B, and some spare time, you can run the already-compiled executable.
This bug was originally submitted as a ticket to MICROCHIP on 9-20-15, per @jlbrian7's suggestion and sleuthing (Thanks, buddy!).
This code was submitted today as a follow-up to a response from their tech-support today; he's "working with the compiler team."
This is the simplified code which causes the bug (with -O1):
//NOTE:
// PRINT_IT() must be a MACRO, or inline calls to printf(),
// (not a function) for the bugs to occur.
#define PRINT_IT(ID, power, dir, signedPower) \
   printf(" %d: power(u8) = %" PRIu8 \
          " dir(i8) = %" PRId8 \
          " signedPower(i16) = %" PRIi16 "\n\r", \
          ID, power, dir, signedPower);

//This function *should* Print the following, in all cases:
// 1: power(u8) = 127 dir(i8) = -1 signedPower(i16) = -127
// 2: power(u8) = 127 dir(i8) = -1 signedPower(i16) = -127
// 3: power(u8) = 127 dir(i8) = -1 signedPower(i16) = -127
//
//WITH OPTIMIZATION it prints:
// 1: power(u8) = 127 dir(i8) = -1 signedPower(i16) = -127
// 2: power(u8) = 127 dir(i8) = 255 signedPower(i16) = -127
// 3: power(u8) = 127 dir(i8) = 255 signedPower(i16) = 32385

//REGARDLESS of the *printout*,
//the horrendously-valued "signedPower" is actually used and assigned to the PWM

//NOTE: Many tests have been performed (no longer included in this code)
// SEE main.c in the parent directory!!!
void testBrokenMathAndPrintout(void)
{
   uint8_t power = 127;
   int8_t dir = -1;
   int16_t signedPower = (int16_t)power * (int16_t)dir;

   //"THE BUG" appears differently depending on how PRINT_IT is implemented
   // IN THIS FILE:
   // PRINT_IT() is merely a macro.
   // Identical to including the printf() statement RIGHT HERE
   // AND has the IDENTICAL result.
   PRINT_IT(1, power, dir, signedPower);

   //This case will SELDOMLY be *executed*
   // but without some *potential* reassignment to the variable 'dir'
   // (somewhere) the bug does not appear
   // (Note how difficult it is to come up with a case that won't be optimized-out!)
   if(rand() == 0)
   {
      printf("!!! rand() == 0, else{} executed!\n\r");
      //THIS VALUE IS NOT EXPECTED TO BE USED
      dir = ((int8_t)(0));
   }

   PRINT_IT(2, power, dir, signedPower);

   signedPower = (int16_t)power * (int16_t)dir;

   PRINT_IT(3, power, dir, signedPower);
}
-------
Update (9-20-15): Thanks @jlbrian7 for the reminder... See the bottom for some updates...
-----
The past couple days' project-efforts have been spent trying to pinpoint the cause of some very strange behavior...
Briefly: int8_t is acting like it's much larger than that.
The trouble appeared as a result of multiplying -1 (int8_t) by another value (0-255), and getting results in the thousands.
Yes, I watched my casting quite closely, though a certain amount of the time was definitely spent wondering whether I understood casting (and many other things I've used regularly for over a decade) correctly.
Basically:
int8_t direction = -1; //Always -1, 0, or 1
uint8_t power = 127; //PWM-ish power, from 0-255 (0-100%)
int16_t signedPower = (int16_t)direction * (int16_t)power;
//results in signedPower == 32385
Not only does it print this value, it *uses* it, regardless of casting, masking, or various other tests I went through. Dig this:
int8_t direction = -1;
printf("1: direction = %" PRIi8 "\n", direction);
//do some stuff here, but DON'T actually CHANGE the value of direction
printf("2: direction = %" PRIi8 "\n", direction);
results in:
1: direction = -1
2: direction = 255
Long story short, this appears to be a result of the optimizer. Using 'xc32-gcc -O1 ...' causes the glitch, not including a -O argument doesn't.
this is a portion of the output from 'xc32-gcc -v':
gcc version 4.8.3 MPLAB XC32 Compiler v1.40 (Microchip Technology)
Not sure whether this exists in other incarnations of gcc... e.g. maybe mipsel-gcc-4.8.3?
CAUSING this bug is *difficult*... it seems to require several conditions to be met, which can be seen in the example-code below, which has been stripped down to nearly the bare-essentials to recreate the bug (and barely even remotely resembles the original code that the bug was found in).
Compiled with:
CFLAGS = -mprocessor=32MX170F256B -funsigned-char -O1

all:
	xc32-gcc -c $(CFLAGS) -o main.o main.c
	xc32-gcc $(CFLAGS) main.o --output main.elf -Wl,-Map=main.map,--cref
	xc32-bin2hex main.elf
(Again, removing -O1 from CFLAGS causes the bug to disappear)
void brokenEventually(void)
{
   uint8_t newPower=0x7f;
   int8_t dir = -1;
   static int callNum1 = 0;
   static int callNum2 = 0;

   //This was originally all casted as int16's, same effect...
   int16_t signedPower = (int32_t)newPower * (int32_t)dir;

   //dir = -1, above, this prints out -1...
   if(callNum1 == 5)
   {
      callNum1 = 0;
      //something like this is necessary, otherwise the optimizer
      //can weed out the else case, which is necessary to reproduce the bug
      if(rand() != 0)
      {
         printf("1: %" PRIu8 " %" PRId8 " %" PRIi16 "\n\r",
                newPower, (int8_t)dir, signedPower);
      }
      else
      {
         //Without this, '-1' is printed in both cases
         //With it, '-1' is printed, above, and '255' is printed below
         //This happens regardless of (int8_t) cast(s).
         // I'm guessing what's happening, here, is that without some
         // assignment somewhere, it's not actually bothering to create a
         // register/memory location for it...
         // (Why, though, would it still print -1 above? Maybe because
         // technically the variable/register needn't be used until now...)
         dir = ((int8_t)(0));
      }
   }

   // signedPower = (int16_t)newPower * (int16_t)dir;
   signedPower = (int16_t)newPower * (int32_t)dir;

   //This prints out 255
   //(Oddly, only if dir is changed somewhere, e.g. the else case, above)
   //( which doesn't even affect this case most of the time!)
   if(callNum2 == 3)
   {
      callNum2 = 0;
      printf("2: %" PRIu8 " %" PRId8 " %" PRIi16 "\n\r", newPower, dir, signedPower);
   }

   callNum1++;
   callNum2++;
}
Update (9-20-15):
This seems to be a similar bug in sdcc (rather'n xc32-gcc), but I haven't analyzed it in detail:
http://sourceforge.net/p/sdcc/bugs/2324/ Thanks @jlbrian7 for finding that!
ALSO:
I experimented pretty thoroughly with the various optimization options...
I read the gcc manpage, which lists each individual optimization argument that's included in each optimization level (e.g. -O1 enables -faggressive-loop-optimizations, and several others).
In fact, I went through and enabled them all, and did not get the same bug... Now, rather'n using -O#, I'm using the individual -f<options> as shown below:
# xc32-gcc -Q -O1 --help=optimizers:
CFLAGS += -faggressive-loop-optimizations
CFLAGS += -fbranch-count-reg
CFLAGS += -fcombine-stack-adjustments
CFLAGS += -fcommon
CFLAGS += -fcompare-elim
CFLAGS += -fcprop-registers
CFLAGS += -fdce
CFLAGS += -fdefer-pop
CFLAGS += -fdelayed-branch
CFLAGS += -fdelete-null-pointer-checks
CFLAGS += -fdse
.... actually, the list is *huge*, so I'll leave you to do
# xc32-gcc -Q -O1 --help=optimizers
Interestingly, even with *all* those optimizations, the main loop doesn't really increase in speed by much. But, with -O1, it runs nearly 300% faster. Which might make sense, if it's doing some weird stuff with optimizations like the one discovered above (if done *correctly*). These optimizations, I assume, must be Microchip's special voodoo that they keep hidden... (e.g. not shown in --help=optimizers).
-
This is getting confusing!
08/17/2015 at 16:40 • 0 comments
-
It shoulda been simple!
08/12/2015 at 14:58 • 2 comments
UPDATE: FIXED!
Apparently the MX170 is not yet in openOCD's device-listing:
I don't fully understand it, and it was a VERY round-about way of coming to this rather simple conclusion. This does require recompilation of openOCD.
- Edit the file openocd-0.9.0/src/flash/nor/pic32mx.c
- Add your CPUTAPID (see note) to pic32mx_devs[] ('round line 120).
- NOTE, add the CPUTAPID & 0x0fffffff... or *drop* the first nibble... e.g. the MX170F256B has a CPUTAPID of 0x26610053, the line added looks like:
{0x06610053, "170F256B"},
- NOTE2: Based on my understanding of the code, it looks like you can add your device listing anywhere in the list (it doesn't have to be sorted).
- Recompile openOCD and place the executable appropriately
- (either 'make install', or if already done with an older version, just copy src/openocd over /usr/local/bin/openocd)
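The masking step in the list above can be sanity-checked on a PC before editing pic32mx.c. This is only a sketch: `devid_for_table` and `format_table_entry` are hypothetical helpers, not openOCD functions, and the output simply imitates the shape of the table line quoted above.

```c
#include <stdint.h>
#include <stdio.h>

/* Drop the top nibble of the CPUTAPID, i.e. CPUTAPID & 0x0fffffff. */
uint32_t devid_for_table(uint32_t cputapid)
{
    return cputapid & 0x0fffffffu;
}

/* Write a table line of the form {0x06610053, "170F256B"}, into buf.
 * Returns the number of characters snprintf() would have written. */
int format_table_entry(char *buf, size_t len, uint32_t cputapid, const char *name)
{
    return snprintf(buf, len, "{0x%08X, \"%s\"},",
                    (unsigned)devid_for_table(cputapid), name);
}
```

Feeding it the MX170F256B's CPUTAPID of 0x26610053 and the name "170F256B" reproduces the line shown above.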
Was:
"It shoulda been simple" the phrase of this era... re: flashing the MX170...
The latest: For the motion-control code, I planned to switch to the MX170 rather than the MX230 used in this "project" because the 170 has several 5V-tolerant pins which would lend themselves well to my H-Bridge chip. Really, technically, the only differences should be a slight pinout change, and more flash-memory. In fact, they share the same datasheet; their registers are identical everywhere I've looked, the configuration-bytes are identical (as far as the PLL, etc...). The code compiles with no changes other than switching the xc32-gcc MCU argument, appropriately. And, in fact, I've even run "diff" on the disassembly-listings as well as the intel-hex files(!) and the only differences are a few address differences (the 230 was 64KB, the 170 is 256KB, apparently some exception(?) routines are placed near the end of the FLASH).
So... what on earth could be wrong...?
The best I've come up with, after much effort, is that somehow the program-FLASH is only being written in the first row (128bytes) and then it appears to be empty, despite the flashing-procedure not giving any errors. I've tried two chips, now I'm fighting with the Really Slow Flashing Method (half an hour for 256KB?!) to see if it functions differently... I think this is an openOCD thing. It's just *really weird*. These chips should be nearly identical, in fact I'm almost certain I should be able to run the hex-file compiled for one on the other.
Not sure if/how I'd've ever figured this out without single-stepping... it's not a problem with *my* code, probably not a code problem at all (as far as the code running on the chip, judging by the diffs)... Just Weird.
Maybe it'd been worth it to use the Microchip tools (programmer, etc.) after all...
HAH! "Slow-Flash" ended-up failing on the *other* flash-bank...
So, now we have: If I "fast-flash" (using the hacked openOCD), it writes the BOOT flash properly, but not the program-flash... then I can "slow-flash" (using the unhacked openOCD) and it will write the program-flash but not the boot flash! LOL. IT RUNS! LOL!
-
Finally!
08/11/2015 at 14:29 • 0 comments
Things're getting a bit confusing 'round here, as many of my projects feed into each other... This project is a relative-go, as I've managed to get the early-stages of #commonCode (not exclusively for AVRs) running on the PIC32... First was 'heartbeat,' which was a bit of a hurdle because I didn't realize the default ADC-register-settings interfere with many GPIO pins. (I think there's a log about that). Also, some difficulties with the timer, and a few other settings here and there... (FYI: read carefully: the "primary oscillator" is *external*, most-likely for early testing you'll want the FRC "Fast R/C" oscillator, which is internal (also note, it's not highly precise... mine's running at 108%). The weird thing is... even when selecting an external oscillator, it actually ran... maybe the breadboard capacitance did it, or maybe the PLL doesn't run slower than a certain frequency, so assumes it's there... I dunno.)
OK, after 'heartbeat' usually comes 'polled-uat' which is a bitbanged UART (with only a transmitter). This should be an easy step, since the heartbeat's default functionality already assures the timer is set-up...
The third step is usually 'polled-uar' which is the bitbanged UART's *receiver.* Again, this should be an easy step; the timer's already set-up, and the heartbeat code also has Input functionality running (push a button and the heartbeat changes from fading to blinking). Right, again, easier said than done... Apparently I chose a pin who, again, had *different* defaults than GPIO... So, rather'n fight it, I just ended up swapping to another pin.
But, it didn't work. First-guess, the 108% thing was too much for bitbanged *input*... PCs are probably a bit more sophisticated in their UART reception than my code. So I scaled out that error... and everything died. And it's been dead for nearly a week. FINALLY I figured it out... My scaling managed to push some math to overflow... Actually, that was one of my earlier guesses, but my calculator (which I wrote) said it should fit in 32bits fine... Lo and behold, my calculator was only showing the lower 32 bits, and actually the scaling needed 36. Woot! DAYS UPON DAYS to figure that shizzle out.
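For anyone who wants to see that kind of overflow in isolation, here's a sketch with made-up numbers (the actual values from this project aren't recorded here). It assumes a correction factor of 1080/1000 for a clock running at 108%, applied to a value of 50,000,000; the intermediate product then needs 36 bits. The function names and both constants are hypothetical.

```c
#include <stdint.h>

/* Hypothetical correction factor: 1080/1000 for a clock at 108%. */
#define SCALE_NUM 1080u
#define SCALE_DEN 1000u

/* Multiply-then-divide entirely in 32 bits: the product wraps modulo 2^32
 * before the divide, so large inputs give garbage. */
uint32_t scale_32bit(uint32_t value)
{
    return value * SCALE_NUM / SCALE_DEN;
}

/* Same math with a 64-bit intermediate: the product survives intact. */
uint32_t scale_64bit(uint32_t value)
{
    return (uint32_t)(((uint64_t)value * SCALE_NUM) / SCALE_DEN);
}
```

With an input of 50,000,000 the 64-bit version gives 54,000,000, while the 32-bit version silently returns 2,460,392, which is the sort of result a calculator showing only the lower 32 bits of the product would also lead to.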
On the plus-side, my calculator is new and improved! My polled_uat code has been cleaned up quite a bit, as well as a few other things here and there. Also some new ideas for abstracting 'heartbeat' a bit more.
Anyways, suffice-to-say, these three 'commonThings', once running, are usually the basis for all my projects... and now we're ready to start a new, more-sophisticated project... Enter: #2.5-3D thing, wherein I'll be abstracting the motion-control aspects of 'commonCode' so they can function on PIC32, and hopefully more-easily be ported to most any architecture.
-
optimizer reliance follies
07/17/2015 at 10:59 • 5 comments
GAH! I should just delete this whole ordeal.
MPLABX/xc32-gcc--the free version--has a few restrictions...
The highest optimization-level is 1 (min 0, max 3, or "s" for size, IIRC)...
Whelp, I'm running this 32-bit system at 50MHz, what's that... 32/8=4... 50/20=5/2... 5/2*4=10 so... TEN TIMES the processing-power of my lowly AVRs...
All it's doing is fading an LED, sorta like software-PWM... And now it's noticeably flickering...
My AVR projects usually don't do that, even when they're *heavily* bogged-down with other code...
So... we've already discovered in a previous log that the world doesn't really exist, but that's OK, we have to pretend...
So, here's where we're allegedly at:
The simple "pinOn" and related MACROs rely on a lot of math via macros... Rather'n, say, doing PORTASET = 0x01, it's doing: *(&PORTA + (&PORTASET-&PORTA)) = 0x01... That was intentional. The AVR side of things does it quite similarly, such that we can use "PORTA" instead of thinking about referring to "PINA" and "DDRA" all the time... It's SIMPLE math, really, just add a constant to an address... and it's all done in macros, so it makes it easy, and easily-readable: clrpinPORT(1, PORTA), setoutPORT(1, PORTA)...
I guess I hadn't realized how much I relied on the optimizer... avr-gcc strips that entire thing, or something similarly-ugly, down to a single instruction "setbit" at the appropriate register.
xc32-gcc (with Optimization-level 1), on the other hand... well, just look at it:
(This was originally setinPORT(), to set bit 0 as an input, but in the process of trying to figure out the slow-down, I stripped a bunch of macros, resulting in this)
//TRISx to PORTx address-offset (-0x10):
#define TPO (int)((int)(&TRISA) - (int)(&PORTA))
//xSET to x address-offset (0x08):
#define SPO (int)((int)(&TRISASET) - (int)(&TRISA))

//setInput(bit0, PORTA)
(*(&(PORTA) + TPO + SPO) = RPIN_TO_MASK(0));

9d0001a4: 3c02bf88 lui v0,0xbf88 //constant
9d0001a8: 24426020 addiu v0,v0,24608 //constant
9d0001ac: 24636018 addiu v1,v1,24600 //constant
9d0001b0: 00621823 subu v1,v1,v0 //const - const
9d0001b4: 00031880 sll v1,v1,0x2 //(=const) - 2 (?)
9d0001b8: 00431821 addu v1,v0,v1 //add constant
9d0001bc: 24020001 li v0,1 //load constant
9d0001c0: ac620000 sw v0,0(v1) //write at constant
Again, that's a simple instruction, it's basically nothing more than "TRISASET = 0x01;" and the ...SET registers are such a nice addition, it should make this thing *even faster*! I'm thinking, in this architecture, two instructions, MAX, (load an immediate value to a register, write that register's contents to the TRISASET memory-location).
All the math is done with constants, if PORTA and TRISA weren't C variables, and instead were #defines, the math coulda easily been handled by preprocessor before even getting to GCC. GCC's obviously pretty good at math (nevermind optimizing), in comparison to the preprocessor... Instead, it's leaving all these repetitive constant-calculations for run-time. WEE!
It's quite the realization about just how much the optimizer does... The fact I can easily see the flicker, combined with the fact there's *no other code running* besides the fading in-and-out of the LED, could indicate we're running the same C code easily 100-times slower on a system more than twice as fast. Shocking.
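One possible reading of the `sll v1,v1,0x2` in the listing: it multiplies the offset by 4, which is what C does whenever an integer is added to a pointer to a 4-byte type. Assuming PORTA is declared as a 32-bit type (an assumption, not checked against the device headers), a byte offset obtained through (int) casts gets scaled by 4 when added to &PORTA, whereas a pointer difference like (&PORTASET-&PORTA) is already counted in elements. The sketch below shows the difference on a PC; both function names are made up for illustration.

```c
#include <stdint.h>

/* Integer added to a uint32_t pointer: C scales it by sizeof(uint32_t),
 * so a "byte offset" of 8 actually moves the pointer 32 bytes. */
volatile uint32_t *by_typed_pointer(volatile uint32_t *base, int offset)
{
    return base + offset;
}

/* Offset applied to a byte pointer instead: no scaling, 8 means 8 bytes. */
volatile uint32_t *by_byte_address(volatile uint32_t *base, int byte_offset)
{
    return (volatile uint32_t *)((volatile unsigned char *)base + byte_offset);
}
```

Given an array of 32-bit words, `by_byte_address(r, 8)` lands on `r[2]`, while `by_typed_pointer(r, 8)` lands on `r[8]`.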
(Or, I could just be stupid, see the comments)