RAK3172 does not respond to AT+ commands after 48 days

@guiff
Thank you for that test, I think that is important information.
Too bad you are controlling the JOIN manually from your host MCU. It would be interesting to know whether only the UART is malfunctioning and the device would still auto-join the network.

For the other reporters of this issue: do any of you have auto-join enabled, and do you see the hanging device joining after a hard reset but not responding on the UART?

Yesterday I discovered that another one of my devices running fw v3.5.3 stopped working after 49 days and around 15-18 hours. Allowing for oscillator tolerance, every failure at around 49 days could have the same cause.
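
For reference, the arithmetic behind the "around 49 days" figure can be sketched as follows. This assumes a 32-bit counter that ticks once per millisecond, which is an assumption here and not something confirmed about the RUI3 firmware; the struct and function names are made up for the example.

```cpp
#include <cstdint>

// Uptime broken down into whole days, hours, minutes and seconds.
struct Uptime {
    uint64_t days;
    uint64_t hours;
    uint64_t minutes;
    uint64_t seconds;
};

// Time until a 32-bit counter incremented once per millisecond wraps to zero.
// ASSUMPTION: 32-bit width and 1 ms tick; neither is confirmed for RUI3.
Uptime wrap_time_32bit_ms() {
    const uint64_t total_ms = 1ULL << 32;      // 4,294,967,296 ticks
    const uint64_t total_s = total_ms / 1000;  // whole seconds only
    Uptime u;
    u.days = total_s / 86400;
    u.hours = (total_s % 86400) / 3600;
    u.minutes = (total_s % 3600) / 60;
    u.seconds = total_s % 60;
    return u;
}
```

Under that assumption the wrap happens after 49 days, 17 hours, 2 minutes and 47 seconds, which falls inside the observed 15-18 hour window.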

We still have a module that will probably fail in a few days. Is there any pin whose state would show whether the module is still running?

@Matejisko, very interesting, thank you for sharing.

@guiff

There is no pin that would tell you the status. The only way to see whether it rejoins after a reset would be to set join to automatic and watch for the join request on the gateway or LNS.
But I understood from your commands that you initiate the join process manually from your host MCU. If the UART does not work, the module will never try to join.

Hi,

I have some news.

During the last two days, 8 of my devices stopped working after 26-27 days of continuous operation. All of these devices were flashed with RUI 4.0.0 + custom firmware, which only forces the baud rate to 9600. These devices were powered up within a few hours of each other.

Custom firmware:

void setup(void)
{
   // Force Serial to 9600 Baud
   Serial.begin(9600);
}
void loop(void)
{
   // No need for the loop, kill it
   api.system.scheduler.task.destroy();
}

Another 3 devices with RUI 3.5.3 (no custom fw) stopped working after 34 days of operation.

It is interesting that another 5 devices with RUI 4.0.0 + custom firmware are still alive, but they transmit only about once an hour with low TX power and SF=7, whereas the devices mentioned above use TX power = 16 dBm, SF=12, and a transmit period of approximately 10 minutes.

Is there any update on this topic?

Hi again,

An important update…

Today I checked my devices and all of them seem to be operating normally. I use the Helium network, and in the past few days it was being migrated to Solana. Although I read somewhere that the migration should not affect the actual data transfer, most likely there was a system outage, so I could not see the data transfers.

In reference to the 5 devices that I previously mentioned as working normally (with different settings…), I’d like to note that they are connected to my “data only” gateway, which was likely not affected by this migration.

I apologize for the rushed conclusion.

Have a nice day

Hi,

We have not performed further tests; we are awaiting a response from R&D.

Regards,

For everyone who is experiencing this problem, I want to apologize for the inconvenience that this bug is causing all of you.

We found the root cause for the RAK3172 hanging after ~48 days.

We have tested the bug fix and will release patches for all RUI3 versions that have this bug.

The bug appears in the following RUI3 versions:
RUI3 V3.5.3
RUI3 V3.5.4
RUI3 V4.0.0
RUI3 V4.0.1

For users of the standard AT command firmware that have validated their product with one of these RUI3 versions, we will provide patches that will only fix this specific bug and do not change anything else.

For users of the RUI3 BSP, we will release new BSP versions for all affected versions.

Here is the list of the affected RUI3 versions and the new version numbers of the patched firmware and BSP:

| Version | Comment |
| --- | --- |
| RUI_3.5.5 | Patch for RUI_3.5.3 |
| RUI_3.5.6 | Patch for RUI_3.5.4 |
| RUI_4.0.2 | Patch for RUI_4.0.0 |
| RUI_4.0.3 | Patch for RUI_4.0.1 |

Updated AT command firmware is available for download:

Prebuilt firmware:
[RUI_3.5.5] RAKwireless-Arduino-BSP-Index/RUI_3.5.5_230 at staging · RAKWireless/RAKwireless-Arduino-BSP-Index · GitHub
[RUI_3.5.6] RAKwireless-Arduino-BSP-Index/RUI_3.5.6_231 at staging · RAKWireless/RAKwireless-Arduino-BSP-Index · GitHub
[RUI_4.0.2] RAKwireless-Arduino-BSP-Index/RUI_4.0.2_232 at staging · RAKWireless/RAKwireless-Arduino-BSP-Index · GitHub
[RUI_4.0.3] RAKwireless-Arduino-BSP-Index/RUI_4.0.3_233 at staging · RAKWireless/RAKwireless-Arduino-BSP-Index · GitHub

The latest RUI3 BSP can be installed with the Arduino IDE Boards Manager.
The latest version with the bug fix is 4.0.3.

[RUI_4.0.3] RAKwireless-Arduino-BSP-Index/RUI_4.0.3_233 at staging · RAKWireless/RAKwireless-Arduino-BSP-Index · GitHub

That’s great news. We’ll test it as soon as the AT+ version comes out. For now we have chosen to use version 4.0.1, with logic that resets the RAK3172 every 30 days.

The patch for RUI3 V4.0.1 is already available; it is V4.0.3:

[RUI_4.0.3] RAKwireless-Arduino-BSP-Index/RUI_4.0.3_233 at staging · RAKWireless/RAKwireless-Arduino-BSP-Index · GitHub

Can you tell us what the nature of the “bug” is?
I mean… what happens to hang the device?

It is a timer overflow (as Dana suggested), but the main problem is that the handler that should capture the overflow exception was not implemented correctly. This led to an infinite loop that even disabled the hardware reset.
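
The RUI3 source for the faulty handler is not shown in this thread, so the following is only a generic sketch of how a wrapping 32-bit tick counter can break a timeout check, together with the usual unsigned-subtraction idiom that survives the wrap. It is not the actual RUI3 bug or fix, and the function names are hypothetical.

```cpp
#include <cstdint>

// Naive check: computes an absolute deadline. When start + interval wraps
// past 2^32, the deadline becomes a small number and the timeout appears
// to have expired immediately.
bool expired_naive(uint32_t now, uint32_t start, uint32_t interval) {
    const uint32_t deadline = start + interval;  // may wrap around
    return now >= deadline;
}

// Wrap-safe check: unsigned subtraction gives the elapsed ticks modulo 2^32,
// which is correct as long as the real elapsed time is below 2^32 ticks.
bool expired_safe(uint32_t now, uint32_t start, uint32_t interval) {
    const uint32_t elapsed = static_cast<uint32_t>(now - start);
    return elapsed >= interval;
}
```

With start = 0xFFFFFF00 and an interval of 0x200 ticks, the naive version reports the timeout as expired after only 0x80 ticks, while the subtraction form waits for the full interval even though the counter wraps in between.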

@beegee, a question:
What would be the steps to go from RUI3 3.5.3 to the RUI3 4.0.3_233 BSP?
I.e., do we need to use STM32CubeProgrammer, do a complete erase, and then flash RAK3172-E_latest_final.hex?
Side question… does flashing a file compiled with the BSP completely update the RAK3172 to that RUI3 version?

These sorts of issues can be challenging to find - great work.

@pmjackson
Flashing the RAK3172-E_latest_final.hex is not absolutely necessary. Usually I only update the BSP in the Arduino IDE to the latest version.

Is it enough to reset the module? I think the only way is to power cycle it.

A power cycle is required if the counter overflow has already happened.
If you reset before it happens, the counter starts again from zero.
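
A host-side workaround along these lines could look like the sketch below: the host keeps its own timestamp of the last module reset and triggers a new one well before the ~48 day mark. The class name, the 30-day interval and the way the reset is actually performed (reset pin, power switch) are assumptions for illustration and not part of RUI3.

```cpp
#include <cassert>
#include <cstdint>

// 30 days in milliseconds (2,592,000,000), which still fits in 32 bits.
constexpr uint32_t kResetIntervalMs =
    UINT32_C(30) * 24U * 60U * 60U * 1000U;

// Hypothetical helper for the host MCU: tells the caller when it is time
// to reset the module, using wrap-safe arithmetic on the host's own
// millisecond tick so the host counter may overflow without harm.
class PreventiveReset {
public:
    PreventiveReset(uint32_t boot_ms, uint32_t interval_ms)
        : last_reset_ms_(boot_ms), interval_ms_(interval_ms) {
        assert(interval_ms > 0);
    }

    // Returns true once per interval; the caller performs the actual reset.
    bool due(uint32_t now_ms) {
        const uint32_t elapsed = static_cast<uint32_t>(now_ms - last_reset_ms_);
        if (elapsed < interval_ms_) {
            return false;
        }
        last_reset_ms_ = now_ms;  // start the next interval from this reset
        return true;
    }

private:
    uint32_t last_reset_ms_;
    uint32_t interval_ms_;
};
```

On an Arduino-style host this would be polled from loop() with millis() as the time source, for example `if (sched.due(millis())) { /* reset the module */ }`.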

Hi beegee

We are just about to program 200 RAK3172-E modules.

Where can I find the latest RAK3172-E_latest_final.hex that contains the fix for the counter overflow, along with the changelog?

Thank You
Paul Humphreys

If you can wait until next week, the final release V4.0.4 will be available.

If urgent, you can use V4.0.3.

Download the file RUI_4.0.3_233_release_firmware.zip and, inside the ZIP, look for the file RUI_4.0.3_RAK3172-E_final.hex.

Changelog will be available with the official release.