RAK3172 interrupt failure

swagner0 · September 22, 2021, 3:21am

Hello RAK community,

I am having problems with RAK3172 interrupt servicing. When an interrupt occurs, the M4 seems to jump to an arbitrary location and not the location in the vector table. I observe this only on the RAK3172, and not on my nucleo-wl55jc development board.

Specifically, I wrote a very simple STM32CubeIDE project which loops waiting for an EXTI15_10_IRQ interrupt, triggered when PA15 is pulled low. The loop waits 10 seconds after reset before enabling EXTI15_10_IRQ. If I do not pull PA15 low, the loop runs forever. If I pull PA15 low before EXTI15_10_IRQ is enabled, the loop runs until the interrupt is enabled at 10 seconds, then jumps to a nonsense location. If I pull PA15 low after the interrupt is enabled, then the processor jumps immediately to the nonsense location.

I have verified that the interrupt vector table is correct: specifically, address 0x080000E4 (the address of the EXTI15_10 vector) holds the correct address for the handler (plus 1, indicating, correctly, that this is a thumb mode operation).
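For anyone checking their own table, here is a minimal sketch of where the 0x080000E4 figure comes from. It assumes the usual Cortex-M layout (16 core slots ahead of the device interrupts, 4 bytes per entry) and that EXTI15_10 is IRQ number 41, which is what the 0xE4 offset implies. The function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Cortex-M tables start with the initial SP plus 15 system exception slots. */
#define CORE_EXCEPTION_SLOTS 16u

/* Assumption: EXTI15_10 is IRQ number 41, as implied by the 0xE4 offset. */
#define EXTI15_10_IRQ_NUMBER 41u

/* Address of the 32-bit vector table entry for a device interrupt number. */
static uint32_t vector_entry_address(uint32_t table_base, uint32_t irq_number)
{
    assert(irq_number < 240u); /* a Cortex-M4 has at most 240 device interrupts */
    return table_base + 4u * (CORE_EXCEPTION_SLOTS + irq_number);
}
```

With a table base of 0x08000000 this gives 0x080000E4; with a base of 0 it gives 0xE4, which is the address the core actually reads when the table is reached through the alias at the start of memory.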

Note that this behaviour occurs with ANY interrupt, not only the EXTI15_10 interrupt. I just chose this one to work on because it was very simple.

Any ideas? Remember, the exact same code (.bin file) works properly on the STM32WL55xx on the nucleo-wl55jc but fails on the STM32WLE5xx on the RAK3172.

Thank you!

cstratton · September 22, 2021, 3:34am

Edit: These ideas were not the issue, see below for the actual flash map aliasing / SCB->VTOR setting problem

The STM32WL55xx is dual core M4/M0+ while the STM32WLE5xx is single core M4 only.

IIRC ST’s demo package has both dual core and single core projects. My guess is that a single core project will run on the dual core silicon, but a dual core project is likely to quickly fault on single core silicon if it manages to do anything at all.

~~That’s probably your most likely area of issue, but just for sake of completeness:~~

Are you sure the actual contents of the vector table are an even address? Because if so, that seems wrong.

This processor can only execute thumb type instructions, and not ARM ones. In the ARM family, thumb mode is indicated by having the 1’s bit set in the destination address, which means go to the 16-bit aligned address there and start executing in thumb mode. Put an even address into the vector table, and you’ll instantly get a fault since the processor would have to switch to non-existent ARM mode to fetch instructions from the target.

But perhaps that’s just a reporting difficulty - many tools will take the actual value in the memory and turn it back into the effective target address less the thumb mode bit. But if I objdump one of my elfs or hexdump my binaries, the actual contents of the vector table are odd addresses.
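The odd/even distinction can be sketched as a pair of helpers to run over words read from a hexdump of the vector table. The names are hypothetical and the example values in the note below are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a raw vector table word has bit 0 set, i.e. names a Thumb target. */
static bool is_thumb_vector(uint32_t vector_word)
{
    return (vector_word & 1u) != 0u;
}

/* Address of the first handler instruction: the raw word with bit 0 cleared.
 * This is the value many tools display instead of the raw memory contents. */
static uint32_t handler_code_address(uint32_t vector_word)
{
    assert(is_thumb_vector(vector_word)); /* an even word would fault on this core */
    return vector_word & ~UINT32_C(1);
}
```

A raw word of 0x08000401 therefore means "handler code at 0x08000400, Thumb mode", while a raw 0x08000400 in the table would be the faulting case described above.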

swagner0 · September 22, 2021, 3:42am

Hello @cstratton,

Thank you for your response. It will help me to edit my post and make it more clear.

Yes, the project is a single-core project. (I started with a STM32WLE5CC CubeMX project skeleton.) I built it for single core, and as best I can tell the M0+ core is idle when I run it on the wl55xx.

And yes, I reported “the address of the handler”, not “the address of the handler + 1” initially. As you suggest, “the address of the handler + 1” is both what is expected and what is present. I will go back and fix it in my post.

cstratton · September 22, 2021, 3:56am

Hmm,

How are you flashing the chip and what’s the setting of the boot mode pin?

What’s the VTOR or whatever exactly it’s called set to - I mean the register that tells the core where to find the in-effect vector table? If a bootloader is involved, that sometimes gets changed… or needs to be.

If the VTOR is pointing to address 0, then you’re probably also dependent on the remapping of 0x0800 0000 to 0x0000 0000, which might not exist under some funky mode.

And your stack is in RAM that actually exists on the silicon, right?
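The stack question can be checked mechanically from the first word of the vector table, which holds the initial stack pointer. This is a rough sketch with a hypothetical function name. The SRAM figures in the note below (64 KB at 0x20000000) are an assumption for this part and should be checked against the datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the initial SP (word 0 of the vector table) lies inside SRAM.
 * The stack grows downward, so an SP equal to the end of RAM is valid. */
static bool initial_sp_in_sram(uint32_t initial_sp, uint32_t sram_base, uint32_t sram_size)
{
    assert(sram_size > 0u);
    return initial_sp > sram_base && initial_sp <= sram_base + sram_size;
}
```

Under that assumption, an initial SP of 0x20010000 passes and 0x20010004 does not.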

swagner0 · September 22, 2021, 4:23am

Hey, @cstratton - we might just be on the way to my owing you a beer.

I don’t have direct access to the “real” boot mode pin - it’s buried in the module. But the module BOOT0 pin is grounded through a 10k. I’m flashing with a ST-LINK device.

The VTOR register sounds like a likely culprit: I was unaware of a register which provides a base for the vector table, but the existence (and improper setting) of such a register would exactly explain what I’m seeing. Do you know where this would be documented?

[Update: it looks like it might be described here: https://www.st.com/resource/en/programming_manual/dm00046982-stm32-cortex-m4-mcus-and-mpus-programming-manual-stmicroelectronics.pdf#page=212]

Oh, and yes - the stack and heap are both allocated OK. I checked that.

Best,
Scott

swagner0 · September 22, 2021, 5:43pm

Thank you to @cstratton for the lead that helped to solve this. The root cause of my issue is that the RAK3172 does in fact set the BOOT0 pin HIGH, which means the device likes to boot from system ROM.
There are a few ways to address the issue, including setting SCB->VTOR to 0x08000000 as Chris suggested. Another is to set the nSWBOOT0 nonvolatile option bit to 0 and the nonvolatile nBOOT0 option bit to 1, forcing User (main) flash boot.
I chose to force the memory mapping to User(main) flash very early in boot by setting SYSCFG->MEMRMP to 0 in the startup.s code as follows:

/*
 * Set memory map to main (user) FLASH :SYSCFG->MEMRMP = 0
 * Bits 2:0 MEM_MODE[2:0]: memory mapping selection
 * These bits control the memory internal mapping at address 0x0000 0000. These bits are
 * used to select the physical remap by software and so, bypass the BOOT mode setting.
 * After reset, these bits take the value selected by BOOT0 (pin or option bit depending on
 * nSWBOOT0 option bit) and BOOT1 option bit.
 *     000: Main Flash memory mapped at CPU 0x00000000
 *     001: System Flash memory mapped at CPU 0x00000000
 *     010: Reserved
 *     011: SRAM1 mapped at CPU 0x00000000
 *     100: Reserved
 *     101: Reserved
 *     110: Reserved
 *     111: Reserved
 */
  ldr     r3, =0x40010000   /* SYSCFG MEMRMP (Memory Remap) register address */
  movs    r2, #0            /* Main Flash memory mapped at CPU 0x00000000 */
  str     r2, [r3, #0]      /* Write MEMRMP register */
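For those who would rather do the remap from C, here is a rough equivalent of the assembly above. It is a sketch, not the code used in the project: the names are made up, and unlike the assembly (which writes the whole register to 0) it clears only MEM_MODE[2:0] and leaves any other bits alone. The register pointer is passed in so the logic can be tried off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYSCFG_MEMRMP_ADDRESS 0x40010000u /* register address from the assembly above */
#define MEM_MODE_MASK         UINT32_C(0x7) /* MEM_MODE[2:0] */
#define MEM_MODE_MAIN_FLASH   UINT32_C(0x0) /* 000: main Flash at CPU 0x00000000 */

/* New MEMRMP value with MEM_MODE set to main Flash, other bits preserved. */
static uint32_t memrmp_with_main_flash(uint32_t current)
{
    return (current & ~MEM_MODE_MASK) | MEM_MODE_MAIN_FLASH;
}

/* Read-modify-write of the register itself.
 * On target: map_main_flash_at_zero((volatile uint32_t *)SYSCFG_MEMRMP_ADDRESS); */
static void map_main_flash_at_zero(volatile uint32_t *memrmp)
{
    assert(memrmp != NULL);
    *memrmp = memrmp_with_main_flash(*memrmp);
}
```

Doing this in C means it runs later than the startup.s version, so any interrupt enabled before the call would still see the system ROM table.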

cstratton · September 22, 2021, 6:18pm

Glad you figured it out!

So to summarize, the issue is that the module is configured to start up from the system ROM, and that leaves the system ROM, rather than the user flash containing your vector table, aliased at the beginning of memory, which is where SCB->VTOR is configured to look for the vector table.

One can either:

- change the mapping in startup code to alias user flash to the start of memory
- change the VTOR in startup code to point to the actual unaliased user flash at 0x08000000
- change the boot mode to skip the bootloader via non-volatile option bits
- (on other hardware) change the boot mode by changing the pin strapping (e.g., what the Nucleo probably does)
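The options above can be sketched as a small host-side model of the aliasing. This is a deliberate simplification, not a description of the silicon: it ignores the BOOT1 / SRAM boot cases and the empty-flash check, and all names are hypothetical. The BOOT0 selection follows the MEMRMP comment quoted earlier (pin or option bit, depending on nSWBOOT0).

```c
#include <assert.h>
#include <stdint.h>

#define MAIN_FLASH_BASE 0x08000000u

enum region { REGION_MAIN_FLASH, REGION_SYSTEM_ROM, REGION_OTHER };

/* BOOT0 as the chip sees it: the pin when nSWBOOT0 = 1,
 * otherwise the inverse of the nBOOT0 option bit. */
static int effective_boot0(int nswboot0, int nboot0, int boot0_pin)
{
    assert((nswboot0 | nboot0 | boot0_pin) <= 1); /* each input is 0 or 1 */
    return nswboot0 ? boot0_pin : !nboot0;
}

/* Simplified reset value of MEM_MODE: 0 = main flash, 1 = system flash. */
static uint32_t reset_mem_mode(int boot0)
{
    return boot0 ? 1u : 0u;
}

/* Which memory the core ends up reading its vector table from. */
static enum region vector_table_region(uint32_t vtor, uint32_t mem_mode)
{
    if (vtor == MAIN_FLASH_BASE) {
        return REGION_MAIN_FLASH;   /* unaliased user flash */
    }
    if (vtor == 0u) {               /* goes through the alias at address 0 */
        if (mem_mode == 0u) return REGION_MAIN_FLASH;
        if (mem_mode == 1u) return REGION_SYSTEM_ROM;
    }
    return REGION_OTHER;
}
```

In this model the failing case is BOOT0 high with VTOR left at 0, which lands in system ROM. Writing MEM_MODE to 0, setting VTOR to 0x08000000, or making the effective BOOT0 low each lead back to main flash.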

Curious what the condition that’s causing the system ROM to run user code is.

I don’t use the system ROMs much, except on the STM32L07x where it’s part of the dual bank boot mechanism, but if I look at the canned startup code in use there, it happens to be setting SCB->VTOR to the flash offset in SystemInit().

What’s interesting is that if I look in the STM32CubeWL code

void SystemInit(void)
{
#if defined(USER_VECT_TAB_ADDRESS)
  /* Configure the Vector Table location add offset address ------------------*/
  SCB->VTOR = VECT_TAB_BASE_ADDRESS | VECT_TAB_OFFSET;
#endif

It only sets the VTOR if you define that flag to tell it to.

To me that’s a think-o in the stock code. And wouldn’t you know, there’s a unique copy in every example project in the archive.

My guess is that defining USER_VECT_TAB_ADDRESS (which is a flag not an address) either by uncommenting the line in that file or passing it in via compiler flag or something is what ST intended, but they could have better handled this!
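To make the effect of the flag concrete, here is a host-runnable sketch of the same conditional. The SCB block is replaced by a stand-in struct so it compiles off-target, and the two VECT_TAB_* values are assumptions for a flash-resident table; the real definitions live in the stock system file. Function names ending in _sketch are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the CMSIS SCB block so this runs on a PC (not the real header). */
typedef struct { volatile uint32_t VTOR; } fake_scb_t;
static fake_scb_t fake_scb;
#define SCB (&fake_scb)

/* A flag, not an address. Normally set in the project defines
 * or passed to the compiler as -DUSER_VECT_TAB_ADDRESS. */
#define USER_VECT_TAB_ADDRESS

/* Assumed values: vector table at the start of user flash. */
#define VECT_TAB_BASE_ADDRESS 0x08000000u
#define VECT_TAB_OFFSET       0x00000000u

static void SystemInit_sketch(void)
{
#if defined(USER_VECT_TAB_ADDRESS)
    /* Only compiled in when the flag is defined. */
    SCB->VTOR = VECT_TAB_BASE_ADDRESS | VECT_TAB_OFFSET;
#endif
}

/* Starts from the reset value of 0 and reports what VTOR ends up as. */
static uint32_t vtor_after_system_init(void)
{
    fake_scb.VTOR = 0u;
    SystemInit_sketch();
    return fake_scb.VTOR;
}

static void check_vector_table_relocated(void)
{
    assert(vtor_after_system_init() == (VECT_TAB_BASE_ADDRESS | VECT_TAB_OFFSET));
}
```

Removing the `#define USER_VECT_TAB_ADDRESS` line leaves VTOR at 0, which is the situation the stock projects start in.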

I don’t recall if ST ever made a Cortex M0 (vs M0+), which doesn’t have a VTOR. If they did, they probably chose the remapping option.

swagner0 · September 22, 2021, 6:40pm

Yes, your guess is right. I defined USER_VECT_TAB_ADDRESS in the project #defines, and this made the project work. (VECT_TAB_BASE_ADDRESS and VECT_TAB_OFFSET are correctly defined in the stock STM32CubeWL code.)
I decided not to use that solution, however, because it leaves address 0x00000000 mapped to system ROM. While I haven’t seen that cause issues, I am much more comfortable having 0x00000000 aliased to user Flash (which physically lives at 0x08000000), which is what my solution does. With that remapping, the vector table is found correctly at 0x00000000, so VTOR does not need to point at 0x08000000.
“Curious what the condition that’s causing the system ROM to run user code is.”
I am still not sure how the device gets to my reset vector in the first place, though. Since it boots into system ROM initially, there must be code there that pokes around for some signal to hang around for a download, and eventually calls the code at the Flash reset vector address.

cstratton · September 22, 2021, 6:49pm

It looks like there’s a mode that only runs the bootloader if the main (user) flash is “empty”, but it’s all mixed up between the chip programmer’s manual and the generic and specific sections of AN2606.

E.g., it sounds like you’re in a mode where you get the actual bootloader only on a never programmed or freshly erased chip. But it still always starts up through the system ROM, which leaves that rather than user flash mapped unless you explicitly change or work around that.
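As a purely speculative model of what such a ROM might do (this is not ST's code, and the details are assumptions): treat the flash as "empty" when its first word still reads as erased (0xFFFFFFFF), otherwise pick up the initial SP and reset vector from the first two words of user flash and hand over. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ERASED_WORD 0xFFFFFFFFu /* assumption: erased flash reads as all ones */

typedef struct {
    bool     run_bootloader; /* stay in the ROM bootloader */
    uint32_t initial_sp;     /* word 0 of the user vector table */
    uint32_t reset_vector;   /* word 1: reset handler address with Thumb bit */
} boot_decision_t;

/* Decide what to do from the first two words of user flash. */
static boot_decision_t decide_boot(uint32_t first_word, uint32_t second_word)
{
    boot_decision_t decision = { false, 0u, 0u };

    if (first_word == ERASED_WORD) {
        decision.run_bootloader = true; /* nothing programmed: wait for a download */
        return decision;
    }

    assert((second_word & 1u) != 0u);   /* a valid reset vector is a Thumb address */
    decision.initial_sp = first_word;
    decision.reset_vector = second_word;
    return decision;
}
```

Note that in a model like this the ROM only needs the first two words to start the user code. Nothing in it changes the memory mapping or VTOR, which matches the symptom: the reset path works but later interrupts still go through whatever is aliased at address 0.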

swagner0 · September 23, 2021, 5:24pm

Having solved this nasty little problem, I have completed a port of the ST LoRaWAN_End_Node application for use on the RAK3172. It is based on the LoRaWAN_End_Node example in the latest (v1.1.0) version of the STM32Cube MCU Package (Sept 2021) from https://www.st.com/en/embedded-software/stm32cubewl.html. The port is available at https://github.com/PrometheanDesign/LoRaWAN_End_Node.