RAK7258 with an SD card: use it or not? What is it for?

Last week I unboxed an old RAK7258 (bought maybe more than a year ago). It had never been unboxed before, so it was effectively a brand-new, never-used gateway (GW).

I configured the onboard LoRa Server and started to receive data. All was perfect until, after one week, the gateway went down and refused to start again. Luckily, I had ordered some console cables a few weeks earlier, so I could check the boot log.

I was surprised to see an error reading mmcblk0p1 (or writing, I don't remember, and sorry, I did not record the logs), and then I noticed there was a 16GB SD card in the GW (strangely, the other GW from the same order does not have an SD card).
Anyway, I removed the card, and the GW booted and worked (no data was lost; I still have the devices declared in the LoRa server).

Then I tested the card on my computer. All was fine, except for something strange in the boot region (not sure why or how that matters, because the GW should not boot from this card):

charles@mac-office:Volumes$ sudo fsck_exfat /dev/disk4
fsck_exfat: Opened /dev/rdisk4 read-only
** Checking volume.
** Checking main boot region.
   Main boot region is invalid. Trying alternate boot region.
** Checking alternate boot region.
   Alternate boot region is invalid.
** The volume  could not be verified completely.
charles@mac-office:Volumes$ sudo /usr/local/Cellar/e2fsprogs/1.45.6/sbin/badblocks -v /dev/disk4
Checking blocks 0 to 15558143
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)

So reading the card all night did not raise any errors, but when I plugged it back into the GW, it still would not boot due to errors. I then decided to reformat the SD card (FAT32) and put it back in the GW, and like magic, all is fine again.

So my question is: what is the purpose of this card? To be honest, I'm not comfortable sending this configured GW to a customer, and I don't want this issue to occur again. What are the pros and cons of using an SD card in this GW? Should we format the card with another filesystem such as ext4?

I also saw that the card is used in the startup scripts for lora_pkt_fwd and loraserver:

start() {
        config_load lorasrv
        config_get enable lorasrv enable 0
        config_get loglvl lorasrv loglvl 1
        [ "${enable}" = "0" ] && return

        local db_file="/etc/lorasrv/mote_db"
        mkdir -p /var/etc/
        rm -f /var/etc/.mote_db
        if [ -d "/mnt/mmcblk0p1" ];then
                mkdir -p /mnt/mmcblk0p1/lorasrv
                [ -f "/mnt/mmcblk0p1/lorasrv/mote_db" ] || cp -f /etc/lorasrv/mote_db /mnt/mmcblk0p1/lorasrv/mote_db
                db_file="/mnt/mmcblk0p1/lorasrv/mote_db"
        fi

        #init_config
        /usr/sbin/lorasrv -D $db_file -d $loglvl -P /var/run/lorasrv.pid

}
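
For comparison, a more defensive version of the path selection above might confirm that the card directory accepts a write before pointing lorasrv at it, and otherwise fall back to the copy on internal flash. This is only a sketch under assumptions: `pick_db_file` is a hypothetical helper, not part of the RAK firmware, and it does not check that the directory is a real mount point. It would not necessarily have avoided the boot failure described here, since the actual error message was not recorded.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the RAK firmware.
# Prints the database path to use: the SD card copy if the card directory
# exists and a test write succeeds, otherwise the copy on internal flash.
pick_db_file() {
    sd_dir="$1"      # e.g. /mnt/mmcblk0p1
    flash_db="$2"    # e.g. /etc/lorasrv/mote_db
    probe="$sd_dir/.write_probe.$$"

    # The subshell keeps a failed redirection from aborting the caller.
    if [ -d "$sd_dir" ] && ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        # Any failure while preparing the card falls back to flash.
        mkdir -p "$sd_dir/lorasrv" || { echo "$flash_db"; return 0; }
        [ -f "$sd_dir/lorasrv/mote_db" ] \
            || cp -f "$flash_db" "$sd_dir/lorasrv/mote_db" \
            || { echo "$flash_db"; return 0; }
        echo "$sd_dir/lorasrv/mote_db"
    else
        echo "$flash_db"
    fi
}
```

In the init script this would replace the `if [ -d ... ]` block with something like `db_file=$(pick_db_file /mnt/mmcblk0p1 /etc/lorasrv/mote_db)`.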

Here are the contents after boot. For now the syslog folder is empty, but before I formatted the card it contained old *.tgz syslog files.

root@RAK7258:/etc/init.d# ls -al /mnt/mmcblk0p1/*
/mnt/mmcblk0p1/lorasrv:
drwxr-xr-x    2 root     root          8192 Dec  2 10:32 .
drwxr-xr-x    5 root     root          8192 Dec  2 10:59 ..
-rwxr-xr-x    1 root     root          4096 Dec  2 10:57 mote_db
-rwxr-xr-x    1 root     root          4096 Dec  2 10:57 mote_db.0

/mnt/mmcblk0p1/packet_forwarder:
drwxr-xr-x    2 root     root          8192 Dec  1 13:20 .
drwxr-xr-x    5 root     root          8192 Dec  2 10:59 ..

/mnt/mmcblk0p1/syslog:
drwxr-xr-x    2 root     root          8192 Dec  1 13:20 .
drwxr-xr-x    5 root     root          8192 Dec  2 10:59 ..
root@RAK7258:/etc/init.d# 

So any explanation would be really useful for making the right decision. Is this card only used to back up data, or is it used for something else? And the $1,000,000 question: why do errors on this card prevent the GW from booting?

Thank you for your help

It would be interesting to see what the actual boot failure message was. I do not believe, for example, that U-Boot has any idea how to interact with the card, so my guess is that this was actually an issue with the startup scripts under Linux, and not with the actual starting of Linux from the internal NOR flash partitions. (That said, there is a bug in this version of the MT76x8 Linux kernel sources which causes a crash any time the card is inserted or removed, even if it is not mounted at the time. Several attempts to diff against a previous source tree from Mediatek which does not exhibit the issue didn't identify the cause, so at the very least one should shut down before inserting or removing the card.)
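
If the failure happens again, saving the serial console output to a file and filtering it for card-related lines would narrow this down. A minimal sketch, assuming the capture is a plain text file; `sd_boot_errors` is a hypothetical helper and the search patterns are guesses at what such messages might contain, not strings confirmed from this firmware:

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a saved console capture that
# mention the SD card driver, a FAT/exFAT filesystem, or an I/O error.
# The patterns are assumptions, not messages confirmed on the RAK7258.
sd_boot_errors() {
    grep -iE 'mmc|fat-fs|exfat|i/o error' "$1"
}
```

For example, `sd_boot_errors boot.log`, where `boot.log` is a placeholder name for the saved capture.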

Naturally, a consumer-grade SD card is not a perfectly reliable data store for critical usage. However, something like a local LoRaServer may(?) need to store enough data that the capacity of the writeable partition on the NOR flash is challenged.

In my strictly personal opinion, running the LoRaWAN network server on the gateway box itself is an idea that primarily makes sense as a technology demo, not for production usage. It could remain viable in the occasional situation where the gateway cannot be given real Internet connectivity.

But in a typical “production” setting, you almost certainly want to run the network server in cloud infrastructure, for two key reasons:

  • it’s the only way to make a network with more than one gateway work, and LoRaWAN is a protocol specifically designed to leverage the coverage of multiple gateways

  • it’s the realistic way to make the state of node sessions survive gateway damage / loss / theft. If you lose the state of the conversation with a node, you may be unable to recover that without forcing a restart of the node. It’s a lot easier to make backups of the database of a cloud hosted server than it is of the embedded database of a gateway hosted one. And with a cloud server, all you have to do to replace a gateway is install, register and configure a new one - nodes in the field will start using it immediately without even any awareness that anything has changed, as reception uses the contributions of all in-range gateways and downlink uses whichever had the strongest receipt of the uplink signal.
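
As a rough illustration of that last point, choosing the downlink gateway amounts to taking the strongest of several receipts of the same uplink. A minimal sketch, assuming a hypothetical text format of one `<gateway_id> <rssi_dbm>` pair per line; the gateway names are made up, and a real network server works from richer metadata and may weigh more than RSSI alone:

```shell
#!/bin/sh
# Hypothetical sketch: read "<gateway_id> <rssi_dbm>" lines on stdin and
# print the id of the gateway that heard the uplink most strongly
# (RSSI closest to zero). Prints nothing if the input is empty.
best_gateway() {
    awk '
        NR == 1 || $2 + 0 > best { best = $2 + 0; id = $1 }
        END { if (NR > 0) print id }
    '
}

# Three gateways received the same uplink; gw-b has the strongest signal.
printf '%s\n' 'gw-a -112' 'gw-b -87' 'gw-c -101' | best_gateway   # prints gw-b
```

With a gateway-hosted server there is only ever one line of input, which is why the multi-gateway benefit disappears.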

Additionally, the embedded copy of LoRaServer is a relatively old version; that project has since been renamed ChirpStack, and the versions of it available to run in the cloud are much newer, benefiting from a couple of years' more development progress.

So in summary, IMHO the embedded server is a good demonstration of the technology, but for deployment you want to use a cloud server rather than the built-in one. And when you do that, the SD card no longer has any real role so you might as well remove it.