Hi All, We have been developing a LoRaWAN network using RAK7249 Gateways and we are planning to deploy one gateway in one location. We installed ChirpStack and Mosquito MQTT broker in a VM and we are using MQTT Gateway bridge in RAK7249 to send data to our MQTT broker.
We observed that one Gateway is not sending the uplink frames to the MQTT broker and restarting the gateway fixed the issue. Once went through the logs in the sdcard, we found out that it has invoked oom-killer and it hasn’t send data to the MQTT broker after that.
Gateway Log:
Sat Dec 14 01:15:48 2019 kern.info quectel-CM[1368]: requestRegistrationState2 MCC: 0, MNC: 0, PS: Detached, DataCap: UNKNOW
Sat Dec 14 01:15:49 2019 kern.info quectel-CM[1368]: requestRegistrationState2 MCC: 413, MNC: 1, PS: Detached, DataCap: UNKNOW
Sat Dec 14 01:15:49 2019 kern.info quectel-CM[1368]: requestRegistrationState2 MCC: 413, MNC: 1, PS: Detached, DataCap: UNKNOW
Sat Dec 14 01:15:53 2019 kern.info quectel-CM[1368]: requestRegistrationState2 MCC: 413, MNC: 1, PS: Detached, DataCap: UNKNOW
Sat Dec 14 01:15:55 2019 daemon.info db[1658]: DataBase Backup
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.020000] loragwbr invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.030000] CPU: 0 PID: 1651 Comm: loragwbr Not tainted 3.18.45 #1
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.040000] Stack : 00000000 00000000 00000000 00000000 803741f2 00000036 00000000 00000000
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.040000] 00000001 00000000 802caa50 80326643 00000673 80373420 861955a8 00000000
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.040000] 80326730 80326854 00000000 80047748 00000006 800243d4 00000000 00000000
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.040000] 802cdfbc 861d5b7c 00000000 00000000 00000000 00000000 00000000 00000000
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.040000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.040000] ...
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.110000] Call Trace:
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.120000] [<800141a8>] show_stack+0x50/0x84
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.130000] [<8006deb0>] dump_header.isra.17+0x4c/0x13c
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.140000] [<8006e218>] oom_kill_process+0xc8/0x3c8
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.150000] [<8006e9b4>] out_of_memory+0x2f0/0x328
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.160000] [<80071bb8>] __alloc_pages_nodemask+0x628/0x68c
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.170000] [<8006d2cc>] filemap_fault+0x24c/0x438
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.180000] [<80087ad4>] __do_fault+0x34/0x9c
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.190000] [<8008aa64>] do_read_fault.isra.99+0x1a4/0x240
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.200000] [<8008b2c4>] handle_mm_fault+0x4d8/0x998
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.210000] [<800188fc>] __do_page_fault+0x168/0x3c8
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.220000] [<80004820>] ret_from_exception+0x0/0x10
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.230000]
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.230000] Mem-Info:
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.240000] Normal per-cpu:
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.240000] CPU 0: hi: 42, btch: 7 usd: 14
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.250000] active_anon:25681 inactive_anon:35 isolated_anon:0
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.250000] active_file:176 inactive_file:208 isolated_file:0
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.250000] unevictable:0 dirty:1 writeback:0 unstable:0
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.250000] free:366 slab_reclaimable:185 slab_unreclaimable:1386
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.250000] mapped:12 shmem:4359 pagetables:146 bounce:0
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.250000] free_cma:0
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.320000] Normal free:1432kB min:1420kB low:1772kB high:2128kB active_anon:102724kB inactive_anon:140kB active_file:704kB inactive_file:832kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:131072kB managed:126336kB mlocked:0kB dirty:4kB writeback:0kB mapped:48kB shmem:17436kB slab_reclaimable:740kB slab_unreclaimable:5544kB kernel_stack:480kB pagetables:584kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimaSat Dec 14 01:16:23 2019 kern.warn kernel: [425482.400000] lowmem_reserve[]: 0 0
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.400000] Normal: 236*4kB (UER) 35*8kB (ER) 1*16kB (R) 3*32kB (R) 1*64kB (R) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1400kB
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.430000] 4743 total pagecache pages
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.440000] 0 pages in swap cache
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.440000] Swap cache stats: add 0, delete 0, find 0/0
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.450000] Free swap = 0kB
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.460000] Total swap = 0kB
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.470000] 32768 pages RAM
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.470000] 0 pages HighMem/MovableOnly
Sat Dec 14 01:16:23 2019 kern.warn kernel: [425482.480000] 1184 pages reserved
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.490000] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.500000] [ 485] 0 485 226 20 4 0 0 ubusd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.520000] [ 545] 0 545 192 15 3 0 0 askfirst
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.530000] [ 1195] 0 1195 376 20 4 0 0 syslog.sh
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.550000] [ 1204] 0 1204 302 37 3 0 0 logread
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.570000] [ 1230] 0 1230 264 39 3 0 0 logd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.580000] [ 1239] 0 1239 455 47 5 0 0 rpcd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.600000] [ 1368] 0 1368 777 27 4 0 0 quectel
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.620000] [ 1403] 0 1403 402 57 3 0 0 netifd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.630000] [ 1458] 0 1458 377 21 5 0 0 crond
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.650000] [ 1492] 0 1492 375 19 3 0 0 udhcpc
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.660000] [ 1532] 0 1532 287 20 3 0 0 dropbear
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.680000] [ 1575] 0 1575 622 126 4 0 0 uhttpd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.700000] [ 1590] 0 1590 194 18 5 0 0 rteswplugd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.710000] [ 1598] 200 1598 887 59 4 0 0 mosquitto
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.730000] [ 1628] 0 1628 566 35 4 0 0 lora_pkt_fwd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.750000] [ 1630] 0 1630 5701 96 9 0 0 lora_pkt_fwd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.760000] [ 1650] 0 1650 1271 61 5 0 0 loragwbr
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.780000] [ 1651] 0 1651 21639 20401 25 0 0 loragwbr
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.800000] [ 1656] 0 1656 1271 61 5 0 0 lorasrv
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.810000] [ 1658] 0 1658 1362 124 6 0 0 lorasrv
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.830000] [ 1693] 0 1693 376 21 4 0 0 ntpd
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.850000] [ 1700] 0 1700 203 21 4 0 0 mpstat
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.860000] [ 1735] 0 1735 375 19 5 0 0 udhcpc
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.880000] [ 1878] 65534 1878 247 29 5 0 0 dnsmasq
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.890000] [15157] 0 15157 373 16 3 0 0 sleep
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.910000] [15525] 0 15525 1362 123 5 0 0 lorasrv
Sat Dec 14 01:16:23 2019 kern.info kernel: [425482.930000] [15531] 0 15531 5701 93 9 0 0 lora_pkt_fwd
Sat Dec 14 01:16:23 2019 kern.err kernel: [425482.940000] Out of memory: Kill process 1651 (loragwbr) score 627 or sacrifice child
Sat Dec 14 01:16:23 2019 kern.err kernel: [425482.960000] Killed process 1651 (loragwbr) total-vm:86556kB, anon-rss:81580kB, file-rss:24kB
Sat Dec 14 01:16:23 2019 kern.info quectel-CM[1368]: requestRegistrationState2 MCC: 413, MNC: 1, PS: Detached, DataCap: UNKNOW
Sat Dec 14 01:16:24 2019 daemon.info loraSrv[1650]: child process 1651 exited due to signal 9
Sat Dec 14 01:16:24 2019 daemon.err mqttEv[1658]: mqttEvDisconnectCB : MqttClient [gwBridgeS Mqtt Client] Disconnected : [1]
Sat Dec 14 01:16:24 2019 daemon.err mqttEv[1658]: mqttEvDisconnectCB : MqttClient [gwBridgeS Mqtt Client] Disconnected : [1]
Sat Dec 14 01:16:24 2019 daemon.err mqttEv[1658]: mqttLoopMiscTimerHandler: mosquitto [[gwBridgeS Mqtt Client]] loop misc ERR: 7
Sat Dec 14 01:16:24 2019 daemon.err mqttEv[1658]: mqttEvDisconnectCB : MqttClient [AppSrv Mqtt Client] Disconnected : [7]
Sat Dec 14 01:16:24 2019 daemon.err mqttEv[1658]: mqttEvRdReady: mosquitto [[AppSrv Mqtt Client]] loop read ERR: 7
Sat Dec 14 01:16:24 2019 daemon.info mqttEv[1658]: mqttEvRegister : Register mosquitto event [gwBridgeS Mqtt Client]
Sat Dec 14 01:16:24 2019 daemon.info gwBridge[1658]: mosqConnectCB: Mqtt Connect Success ...
Sat Dec 14 01:16:25 2019 daemon.info mqttEv[1658]: mqttEvRegister : Register mosquitto event [AppSrv Mqtt Client]
Sat Dec 14 01:16:25 2019 daemon.info appSrv[1658]: appMqttConnectCB: [AppSrv Mqtt Client] Connect Success ...
Sat Dec 14 01:16:27 2019 kern.info quectel-CM[1368]: requestRegistrationState2 MCC: 413, MNC: 1, PS: Detached, DataCap: UNKNOW
Sat Dec 14 01:16:32 2019 kern.info quectel-CM[1368]: requestRegistrationState2 MCC: 413, MNC: 1, PS: Detached, DataCap: UNKNOW
Sat Dec 14 01:16:37 2019 kern.info quectel-CM[1368]: requestRegistrationState2 MCC: 413, MNC: 1, PS: Detached, DataCap: UNKNOW
Sat Dec 14 01:16:37 2019 kern.info quectel-CM[1368]: requestSetOperatingMode(1)
Sat Dec 14 01:16:37 2019 kern.info quectel-CM[1368]: requestSetOperatingMode(0)
Has anyone else faced this issue? Since we don’t have redundancy, we want to make sure that this will not happen in the future. Thanks in advance.