RAK3172 crashing

Hello RAK support.

I have a problem with the RAK3172 crashing.
I’m using the LA915 band with ADR active.
My transmissions request acknowledgment from the gateway.
I don’t know if it’s just a coincidence, but on a few of the occasions when the RAK3172 crashed, I had received the downlink below.
Could you tell me if there is something strange in the received downlink?

{
  "params": {
    "payload": "",
    "radio": {
      "gps_time": 1415974582863,
      "delay": 0.02503681182861328,
      "datarate": 2,
      "modulation": {
        "bandwidth": 500000,
        "type": "LORA",
        "coderate": "4/5",
        "spreading": 10,
        "inverted": true
      },
      "datr": "SF10BW500",
      "hardware": {
        "status": 1,
        "chain": 0,
        "power": 26,
        "tmst": 478760964,
        "snr": -10.8,
        "rssi": -116,
        "channel": 71,
        "gps": {
          "lat": -23.125680923461914,
          "lng": -45.704471588134766,
          "alt": 620
        }
      },
      "time": 1731939369.8915317,
      "freq": 927.5,
      "size": 27
    },
    "counter_down": 9,
    "lora": {
      "header": {
        "confirmed": false,
        "adr": true,
        "ack": true,
        "version": 0,
        "type": 3,
        "pending": false
      },
      "mac_commands": [
        {
          "LinkADRReq": {
            "Redundancy": {
              "ChMaskCntl": 0,
              "NbTrans": 1
            },
            "DataRate_TXPower": {
              "TXPower": 5,
              "DataRate": 4
            },
            "ChMask": 255
          }
        }
      ]
    },
    "port": null,
    "encrypted_payload": ""
  },
  "meta": {
    "network": "358b679dd88d47948f6e108677b4b79a",
    "packet_hash": "291fc9d4a60fcbdc833b9191d311ef23",
    "application": "28502c6bc9757d48",
    "device_addr": "16327be2",
    "time": 1731939367.419,
    "device": "ac1f09fffe0d8f9a",
    "packet_id": "282663d0d6bc84a88865d32e3878eb18",
    "gateway": "647fda01de200000",
    "application_name": "TECSUS_TESTE_DEVICE",
    "organization_name": "Tecsus",
    "device_name": "TS-D31R-L-V ID40 Valvula Teste R4",
    "device_tags":
  },
  "type": "downlink",
  "decoded_data": null
}
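
To make the MAC command in this downlink easier to read, here is a minimal sketch that expands the LinkADRReq channel mask into channel numbers. It assumes LA915 follows the US915/AU915-style convention in which ChMaskCntl values 0 to 3 each select a block of 16 uplink channels and value 4 selects channels 64 to 71; that mapping for LA915 is an assumption, and the function name `enabled_channels` is hypothetical.

```python
def enabled_channels(ch_mask: int, ch_mask_cntl: int) -> list[int]:
    """Expand a 16-bit ChMask into channel indices.

    Assumes US915/AU915-style blocks: ChMaskCntl 0..3 cover 16 channels
    each, ChMaskCntl 4 covers channels 64..71. Other values are not
    handled in this sketch.
    """
    if not 0 <= ch_mask <= 0xFFFF:
        raise ValueError("ChMask must fit in 16 bits")
    if not 0 <= ch_mask_cntl <= 4:
        raise ValueError("only ChMaskCntl 0..4 is handled in this sketch")
    base = 16 * ch_mask_cntl
    width = 8 if ch_mask_cntl == 4 else 16  # block 4 has only 8 channels
    return [base + bit for bit in range(width) if ch_mask & (1 << bit)]


# Values copied from the LinkADRReq in the downlink above.
link_adr_req = {"ChMaskCntl": 0, "NbTrans": 1,
                "TXPower": 5, "DataRate": 4, "ChMask": 255}

channels = enabled_channels(link_adr_req["ChMask"], link_adr_req["ChMaskCntl"])
print("channels enabled by this LinkADRReq:", channels)
```

Under that assumption, the request enables channels 0 to 7 and asks for DataRate 4 with TXPower index 5.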

Hi @Takahashi , welcome back to the forum.

I do not see any error in the downlink you posted. Are you using the RAK3172 with RUI3 firmware? Which version? Which LNS? Do you have a repeatable scenario that triggers the crash? When it crashes, does it reset, hang, or end up in an undefined state?

Hi @Takahashi ,

I am also using the RAK3172 with LA915 (in Brazil) and am seeing a severe crash.
Please take a look to see whether your case is the same:

For the moment I had to downgrade to V4.1.1; this version has not shown the issue so far.

Hello Carlrowan.

Sorry for the delay.
I made some captures using Nordic’s Power Profiler Kit II (PPK2).
We are using the RAK3172 in our equipment, with the following configurations:

FIRMWARE_VERSION=RUI_4.2.0_RAK3172-E
CLI_VERSION=1.5.13
API_VERSION=3.2.9
MODEL_ID=rak3172
CHIP_ID=stm32wle5xx
LORAWAN_VERSION=LoRaWAN 1.0.3
DEVEUI=xxxxxxxx
APPEUI=xxxxxxxx
APPKEY=xxxxxxxx
BAND=12
MASK=0x0100
ADR=1
DR=2
TXP=0
JN1DL=5000
JN2DL=6000
RX1DL=5000
RX2DL=6000
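
As a quick sanity check on the delay values above: in LoRaWAN 1.0.x the second receive window is expected to open one second after the first, and the same holds by default for the two join-accept windows. The sketch below parses KEY=VALUE lines like the dump above and checks that relation; the helper name `parse_settings` is hypothetical and the values are copied from this post.

```python
def parse_settings(dump: str) -> dict[str, str]:
    """Parse KEY=VALUE lines into a dictionary, skipping other lines."""
    settings = {}
    for line in dump.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # only keep lines that actually contain '='
            settings[key] = value
    return settings


dump = """\
JN1DL=5000
JN2DL=6000
RX1DL=5000
RX2DL=6000
"""

settings = parse_settings(dump)
rx_gap_ms = int(settings["RX2DL"]) - int(settings["RX1DL"])
join_gap_ms = int(settings["JN2DL"]) - int(settings["JN1DL"])
print(f"RX2 opens {rx_gap_ms} ms after RX1; JN2 opens {join_gap_ms} ms after JN1")
```

With these settings both gaps are 1000 ms, so the delays themselves are consistent.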

The RAK3172 module controls the actuation of a valve.
The valve can be controlled via a LoRaWAN downlink message or via a custom ATC command that I created.
A valve command response message is sent at the end of each executed command.
The response message is sent as a confirmed uplink with three retries.

Below I will show two groups of images: the first group is of a valve command that worked and the other that resulted in a crash.

1 - Figure good01:
valve operation;
sending the response message;
reception of the confirmation of the response message;
the end occurs 10 seconds after sending the response message;
enters sleep mode.

2 - Figure good02:
shows that the ACK for the response message arrived 5 seconds after the uplink, as configured by RX1DL.

3 - Figure good03:
zoomed view of the reception of the ACK for the response message; I added some time references.

4 - Figure crash01:
valve operation;
sending the response message;
reception of the confirmation of the response message;
crash: the point at which I think the system crashed.

5 - Figure crash02:
shows that the ACK for the response message arrived 5 seconds after the uplink, as configured by RX1DL.

6 - Figure crash03:
zoomed view of the reception of the ACK for the response message; I added some time references.

Looking at figure good03, here is my interpretation, which is based on assumptions:
It appears that the downlink is received during the first 106 ms;
The downlink appears to be processed until time 112 ms;
The subsequent times appear to be the handling of MAC commands or other assignments of the downlink message.

In the case of figure crash03, my interpretation is again based on assumptions:
The downlink is received during the first 108 ms;
Downlink processing occurs up to 114 ms;
As the current signal does not show the variation seen in good03, it appears that there is something wrong with the downlink message;
At t = 1.127 s the RX2 window (the second reception window) should occur;
However, the RX2 window appears not to occur at all, possibly because of a corrupted message received on RX1;
After that, the system appears to freeze for good, as the current remains at around 5.6 mA.

Crashes occur sporadically: sometimes one happens quickly, and other times it takes a long time.
The problem appears to occur only when the response message is sent after a valve command.
Commanding the valve without sending the response message does not cause a crash.
Just sending messages of the same size (11 bytes) and type as the response message also does not cause a crash.
Could it be a stack overflow problem?
The project build indicates that 168720 bytes of storage and 32024 bytes of dynamic memory are used.
I tried changing the Debug Level setting to “2 Full Debug” in the Arduino IDE settings to get more clues, but I didn’t notice any changes.

Best Regards.


good01

good02

good03

crash01

crash02

crash03

Hello Carlrowan.

I repeated the previous test with one small change.
I stopped requesting confirmation of the response messages sent after executing the valve command.
I left the test running for more than five hours and there was no crash.
I noticed that even when the transmissions are unconfirmed, the RX1 and RX2 reception windows still occur; see the figure below.
It appears that the problem occurs only when an acknowledgment downlink is received, and only intermittently.

Best Regards.

Hi Takahashi,

Thank you for providing the details.

By the looks of it, this will be hard to resolve unless we get some clues about why it happens and how we can reproduce it. But the downlink is certainly a good clue.

Some ideas/thoughts I have:

  1. The RX1 and RX2 windows always open to check for an incoming downlink, so you will always see them.
  2. What SF is used on the downlink? Maybe you can make the downlink payload noticeably smaller, for example 4 bytes. I’ve seen LoRaWAN stack issues, especially on US915, where the device hangs when a payload and MAC commands arrive together.
  3. What is the utilization of your ROM and RAM? Are you above 80%? 90%? If you can lower it, that might help as well.

Hello Carlrowan.

Thanks for the feedback.

Here are the answers:

1 - It seems like it’s not a question.

2 - About downlink SF:
In the first message of this post, I showed the JSON content of the message I received when the system crashed.
I don’t know if it’s right, but I think it might be:

"datarate": 2,
"modulation": {
  "bandwidth": 500000,
  "type": "LORA",
  "coderate": "4/5",
  "spreading": 10,
  "inverted": true
}

2 - In the transmission I presented in figure “crash03”, I did not receive a payload in the downlink; I only requested an ACK.
Is it right to consider an ACK to be a MAC command?

3 - According to the Arduino IDE output:
Storage area: 168704 bytes (84%);
Dynamic memory: 32024 bytes (65%).
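
On the question in point 2 about whether an ACK is a MAC command: in LoRaWAN 1.0.x the acknowledgment is a flag bit in the FCtrl byte of the frame header, not a MAC command. MAC commands such as LinkADRReq travel separately, in the FOpts field or on port 0. The sketch below decodes a downlink FCtrl byte; the function name is hypothetical and the example byte 0xA5 is constructed for illustration to match the header in the first post (adr and ack set, pending clear), with an assumed FOptsLen of 5 for a single LinkADRReq.

```python
def decode_downlink_fctrl(fctrl: int) -> dict:
    """Split a LoRaWAN 1.0.x downlink FCtrl byte into its fields."""
    if not 0 <= fctrl <= 0xFF:
        raise ValueError("FCtrl is a single byte")
    return {
        "adr": bool(fctrl & 0x80),      # bit 7: ADR
        "ack": bool(fctrl & 0x20),      # bit 5: ACK flag (not a MAC command)
        "pending": bool(fctrl & 0x10),  # bit 4: FPending
        "fopts_len": fctrl & 0x0F,      # bits 3..0: bytes of MAC commands in FOpts
    }


# Illustrative byte: ADR + ACK set, 5 bytes of FOpts (one LinkADRReq).
fields = decode_downlink_fctrl(0xA5)
print(fields)
```

So a downlink can carry an ACK with no MAC commands at all (FOptsLen 0), or an ACK together with MAC commands, as in the downlink from the first post.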

Best Regards.

If it is working with RUI3 V4.1.1, please use that version for now.

We do not have any test environment for LA915. Initial testing for that (unofficial) region created by Everynet was done by customers in South America.

It could very well be that the new LoRaWAN stack has a problem with LA915.

Hi, @beegee

We conducted the same test using the same code, but compiled with versions 4.2.0 and 4.1.1. With version 4.2.0 and ACK disabled, the hardware does not crash. However, with version 4.1.1 and ACK enabled, the device crashes.

The dev board we are working with has an LED on the SWCLK pin. The device sends a message with ACK and retry every hour to the LNS. When the device crashes, the LED stays ON at half brightness, similar to when the device is in BOOT mode.

These are the last messages the LNS received.

We can perform some tests for you on LA915.

Thank you for your time!!

Mario