RAK3172 does not respond to AT+ commands after 48 days

beegee

Thanks for the quick reply (as always)

Do you know any details regarding the V4.0.4 enhancements or bug fixes?

Thank You
Paul Humphreys

Yes, the main enhancements in 4.0.4 are:

  • Support the new RAK3172-T (RAK3172 stamp module with TCXO instead of “simple” oscillator)
  • USB plug/unplug wakeup call for API
  • Added a callback for when api.system.sleep.xxx() has finished
  • Some fixes in P2P, but encryption is not completely fixed

The complete changelog will be available next week.

We have around 170 devices in the field that are facing the same issue. So, if I understand correctly, performing a reset via ATZ (not a power cycle), say every 30 days, should reset the counter, and the module should then not hang? Please advise, as this issue is forcing the customer to visit the site and power cycle the devices, which is causing a good deal of inconvenience.

Welcome to the forum @arifmansoori

Are you using the RAK3172

  1. with a custom firmware written with STM32CubeIDE
  2. with our RUI3 AT command firmware and a host MCU that communicates with the RAK3172 over UART
  3. with a custom firmware written with our RUI3 API?

If 2) or 3), what is the RUI3 firmware version?

If you are using RUI3 AT command or API firmware, a reset of the module every 30 days would prevent the hanging, but it is still strongly recommended to upgrade the firmware on your devices to the latest release V4.0.5.
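For option 2, the periodic reset could be scheduled on the host MCU roughly like this. This is only a minimal sketch: `reset_scheduler_t` and `reset_poll` are hypothetical names, the host is assumed to have a 32-bit millisecond tick of its own, and the `\r\n` line ending is an assumption about the host's UART setup. The command has to go out before the module hangs, since a stuck module no longer answers on the UART.

```c
#include <stddef.h>
#include <stdint.h>

/* 30 days in milliseconds: 2,592,000,000, which still fits in a uint32_t. */
#define RESET_INTERVAL_MS UINT32_C(2592000000)

/* Hypothetical host-side state: host tick value at the last module reset. */
typedef struct {
    uint32_t last_reset_ms;
} reset_scheduler_t;

/* Call regularly with the host's millisecond tick.
 * Returns the command to send when a reset is due, otherwise NULL.
 * The unsigned subtraction stays correct across one wrap of now_ms. */
static const char *reset_poll(reset_scheduler_t *s, uint32_t now_ms)
{
    uint32_t elapsed = now_ms - s->last_reset_ms;
    if (elapsed < RESET_INTERVAL_MS) {
        return NULL;
    }
    s->last_reset_ms = now_ms;
    return "ATZ\r\n"; /* line ending is an assumption */
}
```

The host would write the returned string to the UART that connects to the RAK3172 and then wait for the module to restart before sending further commands.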

Thanks Bernd! We are using option 2, RUI3 V3.5.3.

Upgrading the firmware would mean the devices need to be shipped back to us in Australia from the EU…

Unfortunately re-flashing with the latest firmware or issuing a power cycle or ATZ reset command are the only options.

I am very sorry for the inconvenience we have created with this problem.

Hi, we ran into this same issue: our devices at our customer's site went down 49 days after deployment.

However, what we face is even worse than what was reported here. The RAK3172 is completely bricked. No resetting, power cycling or ATZ-ing can make the module function again.

We also potted all of our devices and so cannot reprogram the RAK modules.
Could you guys give us any pointers? We can reprogram the MCU that connects to the RAK3172 via UART2. Can we reprogram the RAK3172 through this MCU? Thanks!

Hi Jacky,

For us, removing the battery, waiting for the capacitors to bleed off, and then putting in a new battery returns the module to regular operation.

When the module is stuck, the UART does not work, so it is not possible to update it over UART until it has returned to regular operation. Updating over SWD does not work either, because the reset does not work.

Hi @jackyruth ,

Since the modules have already reached 49 days, there are two options left: (1) update via the SWD pins (if you have access to them), or (2) use the UART bootloader of the STM32WL, which requires you to pull up the BOOT pin and use the STM32CubeProgrammer software tool.

Hi Carlrowan,

Unfortunately, we do not have access to the SWD pins, and we left the BOOT pin floating on the RAK3172 module.

Is it still possible to program via UART?

Some users say that power cycling it (putting in a new battery) returns the module to normal operation, but that has not worked for us. Any ideas on what might be different in our case?

I appreciate your advice, but power cycling it did not work for us (even with an extended wait time for the battery to drain).

May we know which version of the affected firmware you have on your module?

Thanks!

When I created this topic, our modules used FW 3.5.3.

We just checked, and our devices use FW 3.5.4. Why aren’t they coming back online after a power cycle? =( We potted all of our devices; does that mean we have to toss our entire first batch?

I had the same problem while working on a library team.
We worked hard on the frequency divider to ensure correct timing and to deal with counter overflow. Even if we improve things by counting the overflows in the interrupt handler, we will still hit this error eventually; we only postpone it by adding a second counter that counts the overflows of the first.
I’m curious about your solution in this case. Can you share what the solution is?
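For reference, the 48 to 49 day figure matches a 32-bit millisecond counter wrapping: 2^32 ms is about 49.7 days. Whether that is the actual root cause inside RUI3 is not confirmed in this thread, so treat the following as a generic sketch of the two usual techniques, not as the RAK fix. `elapsed32` and `tick64_update` are hypothetical names. The first compares times with unsigned subtraction so a single wrap does no harm; the second counts overflows to extend the counter to 64 bits, which only wraps after roughly 584 million years at 1 ms resolution.

```c
#include <stdint.h>

/* Wrap-safe elapsed time between two 32-bit tick values.
 * Valid as long as the real interval is shorter than 2^32 ticks. */
static uint32_t elapsed32(uint32_t now, uint32_t then)
{
    uint32_t diff = now - then; /* modulo 2^32 arithmetic */
    return diff;
}

/* Software extension of a 32-bit tick counter to 64 bits. */
typedef struct {
    uint32_t last_low; /* last 32-bit value seen */
    uint32_t high;     /* number of overflows counted so far */
} tick64_t;

/* Must be called at least once per wrap period of the 32-bit counter.
 * If it can be called from both an interrupt and the main loop,
 * the caller has to protect it against concurrent access. */
static uint64_t tick64_update(tick64_t *t, uint32_t now_low)
{
    if (now_low < t->last_low) {
        t->high++; /* the low word wrapped since the last call */
    }
    t->last_low = now_low;
    return ((uint64_t)t->high << 32) | now_low;
}
```

With `elapsed32`, a timeout check such as `elapsed32(now, start) >= timeout` keeps working across the wrap, whereas a direct comparison like `now >= start + timeout` is the kind of test that can fail once the counter overflows.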

Hi Beegee / Carlos

We have a report from a customer that some of our sensors using the RAK4072 have gone offline after 1-2 months (which could be the 48-day problem).

The RAK4072 modules run firmware version v3.3.0.18 with a slave processor (EU868).

Could the software team please clarify whether this problem exists in this firmware version and, if it does, whether there is an upgrade hex file that eliminates the bug?

Can someone explain how the ATZ command is best implemented to work around this software bug?

Note: It could be affecting 180 sensors installed recently - eekkk

Thank You

Paul Humphreys

Hello @dingoxx

With RAK4072, do you mean the RAK4270?

If yes, the RAK4270 does not use RUI3, the firmware is based on older sources (RUI V2) and is not affected by this bug.

If you mean RAK3172, the affected firmware versions are
RUI3 V3.5.3
RUI3 V3.5.4
RUI3 V4.0.0
RUI3 V4.0.1
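A host MCU could check the module's reported version against the list above along these lines. This is a sketch under assumptions: the version string is assumed to have the `RUI_x.y.z_...` form seen elsewhere in this thread (for example as returned by the module's version query, which I believe is `AT+VER=?`), and `parse_rui_version` and `is_listed_affected` are hypothetical helper names. The list only covers the versions named here, so a "not listed" result is no guarantee that a device is safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    unsigned major, minor, patch;
} fw_version_t;

/* Parses strings of the assumed form "RUI_3.5.4_RAK3172-E".
 * Returns false if the string does not match that form. */
static bool parse_rui_version(const char *s, fw_version_t *v)
{
    return sscanf(s, "RUI_%u.%u.%u", &v->major, &v->minor, &v->patch) == 3;
}

/* True if the version is one of those listed as affected in this thread. */
static bool is_listed_affected(const fw_version_t *v)
{
    static const fw_version_t affected[] = {
        {3, 5, 3}, {3, 5, 4}, {4, 0, 0}, {4, 0, 1},
    };
    for (size_t i = 0; i < sizeof affected / sizeof affected[0]; i++) {
        if (v->major == affected[i].major &&
            v->minor == affected[i].minor &&
            v->patch == affected[i].patch) {
            return true;
        }
    }
    return false;
}
```

A host that finds a listed version could then enable a periodic ATZ reset as a workaround until the module can be re-flashed.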

As you are on an older version, this could be a different problem.

Do you have access to any of the devices that went offline?
Besides using a slave processor, is the device battery powered, battery plus solar panel powered, or on a permanent power supply?

fyi, I have the exact same behavior for devices in the field running RUI_3.4.11_RAK3172-E