RAK7204 stops sending data

Hello, RAK team,

I’ve been testing the RAK7204 for a few months now. Once before, the device stopped sending data, and after a restart it started sending again. Now I have the same situation again: the device has stopped sending data. I do not want to restart it. I need to debug what the problem is, because if the test is successful we will start using it in our company for monitoring agricultural fields, and that would mean driving a few hundred kilometers to restart one node.

I need to debug this sensor. How can I do that? Do you have any tools? What steps should I take once I see that the device is not sending any data?

Device: RAK7204
FW version: V3.0.0.12
Bootloader: 3.0.2
Server: Chirpstack/loraserver

The gateway is ~5 meters from the sensor. The gateway works fine; other nodes send data through it.

Hi. You can monitor the node with the RAK Serial tool, but you have to be connected to it. To help you, we need to see some output.

Hello Velev,

Below is some output from the Serial tool. As there are no timestamps, I guess it is not so useful.

I tried sending the command at+join. The device replies OTAA Join Start… OK but never joins. Only after the command at+set_config=device:restart did the device join and send data instantly.

I guess in the future it would be perfect to add some watchdog that restarts the device if it is not able to join.

at+version
OK3.0.0.9.H

at+get_config=device:status
OK.


===============Device Status List================
Board Core: RAK811
MCU: STM32L151CB_A
LoRa chip: SX1276

===================List End======================


at+get_config=lora:status
OK.


==============LoRaWAN Status List================
Work Mode: LoRaWAN
Region: EU868
Send_interval: 600s
Auto send status: true.
Send_interval work at sleep
Join_mode: OTAA
DevEui: 60C5A8FFFE7634CD
AppEui: 60C5A8FFFE7634CD
AppKey: E1BCDE4DA6AA3B3DAD04EEDBE383B172
Class: A
Joined Network:true
IsConfirm: false
AdrEnable: true
EnableRepeaterSupport: false
RX2_CHANNEL_FREQUENCY: 869525000, RX2_CHANNEL_DR:0
RX_WINDOW_DURATION: 3000ms
RECEIVE_DELAY_1: 1000ms
RECEIVE_DELAY_2: 2000ms
JOIN_ACCEPT_DELAY_1: 5000ms
JOIN_ACCEPT_DELAY_2: 6000ms
Current Datarate: 0
Primeval Datarate: 2
ChannelsTxPower: 0
UpLinkCounter: 1802
DownLinkCounter: 1498
===================List End======================


at+set_config=lora:send_interval:1,30
Start auto send data with sleep.
OK

at+get_config=device:status
OK.


===============Device Status List================
Board Core: RAK811
MCU: STM32L151CB_A
LoRa chip: SX1276

===================List End======================


at+join
OTAA:
DevEui:60C5A8FFFE7634CD
AppEui:60C5A8FFFE7634CD
AppKey:E1BCDE4DA6AA3B3DAD04EEDBE383B172
OTAA Join Start…
OK

at+set_config=device:restart
OK,restart …

========================================================


[ASCII-art logo banner]


RAK7204 Version:3.0.0.9.H


========================================================

Selected LoRaWAN 1.0.2 Region: EU868
BME680 init success.
autosend_interval: 30s
Current work_mode:LoRaWAN, join_mode:OTAA, Class: A
OTAA:
DevEui:60C5A8FFFE7634CD
AppEui:60C5A8FFFE7634CD
AppKey:E1BCDE4DA6AA3B3DAD04EEDBE383B172
OTAA Join Start…
[LoRa]:Joined Successed!
Battery Voltage = 3.558 V
Humidity:52.113 %RH
Temperature:22.67 degree
Pressure:988.55 hPa
Gas_resistance: 8844 ohms
[LoRa]: send out
[LoRa]: Unconfirm data send OK
Go to Sleep.
Wake up.
Battery Voltage = 3.575 V
Humidity:51.606 %RH
Temperature:22.70 degree
Pressure:988.55 hPa
Gas_resistance: 13491 ohms
[LoRa]: send out
[LoRa]: Unconfirm data send OK
Go to Sleep.
Wake up.
Battery Voltage = 3.580 V
Humidity:50.918 %RH
Temperature:22.77 degree
Pressure:988.61 hPa
Gas_resistance: 19034 ohms
[LoRa]: send out
[LoRa]: Unconfirm data send OK
Go to Sleep.
Wake up.
Battery Voltage = 3.581 V
Humidity:50.269 %RH
Temperature:22.86 degree
Pressure:988.59 hPa
Gas_resistance: 24906 ohms
[LoRa]: send out
[LoRa]: Unconfirm data send OK
Go to Sleep.
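
Since the serial output above carries no timestamps, one option is to add them on the host side while capturing. The following is a minimal Python sketch, not a RAK tool: `timestamp_lines` is a hypothetical helper that prefixes each received line with the current time. Reading from the real port would need a serial library such as pyserial; that part is an assumption and is shown only as a comment, with a placeholder port name and baud rate.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable, Iterator


def timestamp_lines(
    lines: Iterable[str],
    now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
) -> Iterator[str]:
    """Yield each non-empty line prefixed with the time it was received."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip the blank lines the console prints between blocks
        yield f"{now().isoformat(timespec='seconds')} {line}"


# Assumed live usage with pyserial (port name and baud rate are placeholders):
#   import serial
#   port = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)
#   lines = (chunk.decode(errors="replace") for chunk in iter(port.readline, None))
#   for stamped in timestamp_lines(lines):
#       print(stamped, flush=True)

if __name__ == "__main__":
    # Demo with canned lines copied from the log above.
    sample = ["Wake up.\r\n", "\r\n", "[LoRa]: send out\r\n", "Go to Sleep.\r\n"]
    for stamped in timestamp_lines(sample):
        print(stamped)
```

With timestamps in the capture, the last "send out" before the node goes silent can be lined up against the network server's own log.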

Hi @Ernestas,

It is a good idea to add a watchdog, but I think it should be used for other issues, not for join failures: if there is no valid gateway, the RAK7204 will loop through join, fail, restart, join, fail, restart…, and the battery will soon be used up.

Actually, we’ve released the latest firmware V3.0.0.12.H.T, which you can find here:
https://downloads.rakwireless.com/en/LoRa/RAK7204/Firmware/
With this firmware, if the RAK7204 fails to join, it will try to join again after some time. You can use “at+set_config=lora:send_interval:X:Y” to set the join interval too.
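
The battery concern above (a join, fail, restart loop when no gateway is in range) is commonly handled by spacing the retries further apart each time. The sketch below illustrates one such approach, capped exponential backoff, in Python. It is not the RAK firmware's actual logic, and the names `join_retry_delays`, `base_s` and `cap_s` are hypothetical.

```python
from typing import List


def join_retry_delays(base_s: int = 30, cap_s: int = 86400, attempts: int = 8) -> List[int]:
    """Return the wait in seconds before each join retry.

    The wait doubles after every failed attempt and never exceeds cap_s,
    so a node with no gateway in range ends up retrying only rarely.
    """
    delays = []
    delay = base_s
    for _ in range(attempts):
        delays.append(min(delay, cap_s))
        delay *= 2
    return delays


if __name__ == "__main__":
    # Example: start at 30 s, never wait longer than one hour.
    print(join_retry_delays(base_s=30, cap_s=3600, attempts=9))
```

The cap bounds the worst-case join traffic: with a one-hour cap, a node that never finds a gateway settles at one join attempt per hour instead of restarting continuously.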

Thank you, perfect. I will try it.

Hi Ernestas, I have the same problem. Did you get it fixed? If so, how?

I have about 20 of these 7204’s with 3.0.0.14 software and they still have this problem. Although it is possible the battery would be used up trying to rejoin a Gateway that isn’t there, the 7204 may as well have a dead battery if it won’t even try to rejoin. How do you change send_interval x:y to affect the rejoin interval? Please post the command required. Also, please add a watchdog as a firmware option. I for one would adopt it as the 7204 failing to report is a much more common issue than a missing Gateway.

Mine goes offline too - and I’ve just changed the battery after only a few months so I’m not sure it sleeps properly either.

I think a watchdog would be useful so we’d at least get a reset / rejoin that we can track. But it’s not in RUI as far as I’ve seen.

Would the command be “at+set_config=lora:join_interval:X:Y”?

What happened when you tried it?

Haven’t yet. Awaiting response from RAK. It will take weeks of testing to know as the 7204’s generally run for a few weeks before failing to send data.

What’s stopping you trying it?

I suppose I could try it to see whether the module accepts the command, but there is no point if the command is wrong, and even if it is correct it will take weeks to find out whether it works, so I can wait for an answer from RAK.

Where did you get the idea for the command?

Higher up in this thread Tomi from RAK says the Send command can also be used for Join, so I just substituted Join in place of Send. I would set it to 1:86400 so it Joins once per day. We send and chart temps every 30 minutes, so I should be able to see no chart for a period of time. I can also set our software up to text me if no data received from the 7204 for “x” minutes. This way I can test if the Re-Join actually works.
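
The "text me if no data is received for x minutes" check described above can be sketched in a few lines of Python. This is an illustration only: `silent_devices` is a hypothetical helper, and the `last_seen` mapping is assumed to be filled from whatever the network server reports for each uplink (that part is not shown). With uplinks every 30 minutes, a threshold of 90 minutes (three missed reports) is one possible choice.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def silent_devices(
    last_seen: Dict[str, datetime],
    now: datetime,
    max_silence: timedelta,
) -> List[str]:
    """Return the IDs of devices whose last uplink is older than max_silence."""
    return sorted(
        dev for dev, seen in last_seen.items() if now - seen > max_silence
    )


if __name__ == "__main__":
    now = datetime(2021, 1, 1, 12, 0)
    last_seen = {
        "node-a": datetime(2021, 1, 1, 11, 45),  # reported 15 minutes ago
        "node-b": datetime(2021, 1, 1, 9, 0),    # silent for 3 hours
    }
    print(silent_devices(last_seen, now, timedelta(minutes=90)))
```

Run on a schedule, the returned list can feed an SMS or e-mail alert, and the time a node reappears shows whether a re-join actually happened.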

Hi, got on my laptop so no need for one-liners on my iPhone.

It might be worth trying an at+help to see what commands the module supports. Mine’s in pieces on my desk in the office so I can’t check, but I’ve not seen any join_interval on any of the core modules, and it’s not documented for the RAK811, which is used in the 7204.

I’m not sure quite how a join_interval could be implemented. A device joins, ideally once in its life, and then transmits uplinks. It’s then in a double-negative mode - as in, how does it know if it’s not in range or joined or that the gateways haven’t been switched off or stolen or similar. There’s been quite some discussion on the TTN forum about this.

The only sensible way is for it to perform a confirmed uplink at some appropriate interval, like once a day, at the expense of the entire local community, as the gateway can’t hear any uplinks whilst it’s transmitting its confirmation, however briefly. One way of mitigating this is to have some significant overlap of gateways, so other devices are still in range of a listening gateway whilst another one is transmitting.

The other scheme is to send a downlink at a particular interval; if the device doesn’t get its downlink as expected, it performs a re-join.

Multiple re-joins then make a mess of the join nonce, as firmware tends toward some questionable mechanisms for deriving the nonce. And if you have a few hundred devices being serviced by a gateway, all those gateway transmissions can result in a cascade: the gateway transmitting means it doesn’t hear a confirmed uplink, the device doesn’t get its confirmation and performs a join, which requires a transmitted response, which blocks another device’s confirmed uplink, and it just goes round in circles. Mathematically unlikely, but hugely entertaining / interesting when it happens; I’ve only seen it for real on devices on RS485 chains.

Ideally you want the device not to go offline or fail to sleep, thereby running the battery down. The next best alternative is a watchdog and/or an enforced reset at defined intervals, while keeping the join keys & counters in flash.

Or RAK could just fix whatever the underlying problem actually is, or take the product off the market… as this has been going on for far too long.
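
The link-check idea described above (an occasional confirmed uplink, with a re-join only after several go unanswered) can be sketched as a small counter. This is a Python illustration of the logic, not code from RUI or any RAK firmware; the class name `LinkMonitor` and the `max_missed` threshold are hypothetical. Requiring several consecutive misses is one way to avoid re-joining on a single lost confirmation.

```python
class LinkMonitor:
    """Tracks confirmed-uplink results and decides when to re-join."""

    def __init__(self, max_missed: int = 3) -> None:
        self.max_missed = max_missed
        self.missed = 0  # consecutive confirmed uplinks without an ACK

    def on_confirmed_uplink(self, ack_received: bool) -> bool:
        """Record one result; return True when a re-join should be attempted."""
        if ack_received:
            self.missed = 0
            return False
        self.missed += 1
        if self.missed >= self.max_missed:
            self.missed = 0  # start counting afresh after triggering a re-join
            return True
        return False


if __name__ == "__main__":
    monitor = LinkMonitor(max_missed=3)
    for ack in [True, False, False, False]:
        print(ack, monitor.on_confirmed_uplink(ack))
```

With one confirmed uplink per day and `max_missed=3`, a node would attempt a re-join after three days without any confirmation, which keeps the extra gateway airtime small.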

I can confirm the behaviour here. All 3 of my 7204 (firmware on 3.0.0.12) died at the same time after ~2-3 months. One device drained the battery completely. The other two devices ran for another 2-3 months until the same problem came up.
I generally like the nodes but with this bug I wouldn’t recommend them in any project.