RAK2247 The Lora Packet Forwarder now runs for about 4 hours before it stops with errors

Issue:
The Lora Packet Forwarder now runs for about 4 hours before it stops with errors:
lgw_setup_sx125x:407: Note: SX125x #0 version register returned 0x60 (see below sudo ./util_spi_stress)
lgw_setup_sx125x:415: Note: SX125x #0 clock output disabled
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 1)
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 2)
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 3)
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 4)
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 5)
FAIL TO LOCK PLL
Failed to setup sx125x radio for RF chain 0

I ran sudo ./util_spi_stress, with this output:
Cycle 0 > error during the 1th iteration: write 0x67, read 0x60
Repeat read of target register: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
I can see in the code that a random number is sent (e.g. 0x67) and that 0x60 is returned.
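The idea behind such a write/readback test can be sketched as follows. This is only a minimal illustration, not the util_spi_stress source: the callback types and the two simulated registers are hypothetical stand-ins for real SPI access. A link that answers every read with the same constant, like the 0x60 above, fails on practically every iteration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical register-access callbacks; a real test would go through SPI. */
typedef void    (*reg_write_fn)(uint8_t value);
typedef uint8_t (*reg_read_fn)(void);

/* Write pseudo-random bytes, read them back, return the number of mismatches. */
static int readback_errors(reg_write_fn wr, reg_read_fn rd, int iterations)
{
    int errors = 0;
    srand(1); /* fixed seed so the run is repeatable */
    for (int i = 0; i < iterations; ++i) {
        uint8_t sent = (uint8_t)(rand() & 0xFF);
        wr(sent);
        if (rd() != sent) {
            ++errors;
        }
    }
    return errors;
}

/* Simulated healthy register: returns whatever was written last. */
static uint8_t healthy_reg;
static void    healthy_write(uint8_t v) { healthy_reg = v; }
static uint8_t healthy_read(void)       { return healthy_reg; }

/* Simulated faulty link: ignores writes and always answers 0x60. */
static void    stuck_write(uint8_t v) { (void)v; }
static uint8_t stuck_read(void)       { return 0x60; }
```

Calling readback_errors(stuck_write, stuck_read, 100) reports mismatches, while the healthy pair reports none. A constant readback like this usually means the bytes on the bus never reach the expected chip at all.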

Setup:
I followed the steps from the Quickstart page (for the link, see below).
Server:
Raspberry Pi Zero (Pi 3) with SLES15.2
Details:
I am trying to install this module and can’t get the Lora Packet Forwarder to work.
I went through the following 2 steps (see below for details):
1 Adapting install.sh to get the Lora Packet Forwarder installed.
2 Adapting installation files to get the Lora Packet Forwarder to run


1 Adapting install.sh to get the Lora Packet Forwarder installed

I followed the steps from:


(It was a slightly older page that became obsolete on 10 Oct; I checked the differences.)
NOTE (on this page):
“If you want to connect the RAK2247 mPCIe board to the Linux PC directly, make sure to have the PERST# signal (Pin 22) pulled down.”
I did not know how to accomplish this, but read that it is taken care of by the software (loragw_spi.ftdi.c#L75).

a) I changed install.sh to match SLES15.2
Changed:
apt-get -y install git libftdi-dev libusb-dev
To:
zypper install -y git libgcrypt-devel glib2-devel

b) Installed gcc and make (needed by install.sh)
sudo zypper install gcc
sudo zypper install make

c) error: missing libftdi header files
No provider of ‘libftdi’ found
sudo zypper install libftdi0-devel

d) error: /usr/include/usb.h:81:2: unknown type name ‘u_int8_t’ u_int8_t bLength;
/usr/include/usb.h:184:2: error: unknown type name ‘u_int16_t’ u_int16_t bcdDevice;
Fixed by adding this line to /usr/include/usb.h (sudo nano /usr/include/usb.h):
#include <sys/types.h>

e) error: ‘PATH_MAX’ undeclared here (not in a function); did you mean ‘INT8_MAX’?
char filename[PATH_MAX + 1];
sudo nano /usr/include/usb.h
changed #include <limits.h> to #include <linux/limits.h>

2 Adapting installation files to get the Lora Packet Forwarder to run
ERROR: [main] failed to start the concentrator

f) Changed library.cfg to activate debug messages
DEBUG_HAL= 1
ERROR: FAIL TO CONNECT BOARD
Error in loragw_hal.c, from the function lgw_connect();
lgw_connect is from loragw_reg.c

DEBUG_REG= 1
ERROR CONNECTING CONCENTRATOR
Error in loragw_reg.c, from the function lgw_spi_open();
lgw_spi_open is from loragw_spi.h and is used in:
loragw_spi.native.c and loragw_spi.ftdi.c

DEBUG_SPI= 1
ERROR: MPSSE OPEN FUNCTION FAILED
Error in loragw_spi.ftdi.c OpenIndex(VID,PID,SPI0, SIX_MHZ, MSB, IFACE_A, NULL, NULL, 0);

The failing OpenIndex() is in mpsse.c.
It seems that the VID and PID differ from what the lsusb command reports.
lsusb: Bus 001 Device 004: ID 0403:6015 Future Technology Devices International, Ltd Bridge

Changed in loragw_spi.ftdi.c:
/* --- PRIVATE CONSTANTS ---------------------------------------------------- */
/* parameters for a FT2232H */ changed to ->> FT2245
#define VID 0x0403
#define PID 0x6010 changed to ->> 0x6015

Now, for the first time, the packet forwarder starts, but it stops with two errors:
ID: 0x6015
VID: 0x0403
clock: 6000000
Libmpsse version: 0x13
Note: SPI read success
lgw_connect:532: INFO: no FPGA detected or version not supported (v96)
Note: SPI read success
lgw_connect:555: ERROR: NOT EXPECTED CHIP VERSION (v96)

g) Tried different things (suggestions I got from an Internet search):
Changed the speed in loragw_spi.ftdi.c to TWO_MHZ and ONE_MHZ
mpsse = OpenIndex(VID,PID,SPI0, SIX_MHZ, MSB, IFACE_A, NULL, NULL, 0);
NO difference
h) Adapted rak2247_usb/libmpsse/src/mpsse.c to add the RAK2247 to the list of known FT2232-based devices
NO difference

i) Finally I put in printf() calls to see what values were expected instead of the (v96) above, and adapted the install files with those values:

Adapted loragw_reg.c (and adapted install.sh to copy my version after git clone).
Changed:
const uint8_t FPGA_VERSION[] = { 31, 33 }; /* several versions could be supported */
To:
const uint8_t FPGA_VERSION[] = { 31, 33, 96 }; /* several versions could be supported */
After this, the following error was gone:
lgw_connect:532: INFO: no FPGA detected or version not supported (v96)

j) Adapted loragw_reg.c again (install.sh was already adapted to copy my version after git clone).
Changed, on the third line of the register table in this file:
{-1,1,0,0,8,1,103}, /* VERSION */
To:
{-1,1,0,0,8,1,96}, /* VERSION */

Now the packet forwarder starts but stops after several hours; in the meantime it prints some messages on the screen:
l) lgw_fpga_configure:138: WARNING: FPGA TX notch frequency is out of range (0 - [126000…250000]), setting it to default (129000)
Solved by adding to global_conf.json:
->> "tx_notch_freq": 129000,
m) lgw_setup_sx125x:407: Note: SX125x #0 version register returned 0x60 (see below sudo ./util_spi_stress)
n) lgw_setup_sx125x:415: Note: SX125x #0 clock output disabled
o) lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 1)
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 2)
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 3)
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 4)
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 5)
FAIL TO LOCK PLL
p) Failed to setup sx125x radio for RF chain 0

I ran sudo ./util_spi_stress, with this output:
Cycle 0 > error during the 1th iteration: write 0x67, read 0x60
Repeat read of target register: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
I can see in the code that a random number is sent (e.g. 0x67) and that 0x60 is returned.

util_spi_stress is not the “packet forwarder”

Have you tried running the actual packet forwarder yet?

/* parameters for a FT2232H */ changed to ->> FT2245
#define VID 0x0403
#define PID 0x6010 changed to ->> 0x6015

Why would you change this? There’s no such chip as an FT2245, the 0x6015 belongs to the FT23xx series, why do you think that would be appropriate?

Generally speaking for a pi you would have been better off buying an SPI version of the card.

Thanks, Chris, for your reaction. I only ran util_spi_stress at the end, which gave the last errors: “Cycle 0 > error during the 1th iteration: write 0x67, read 0x60
Repeat read of target register: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60”
All the other error messages above that are from running sudo ./lora_pkt_fwd.

Well spotted.
My mistake: /* parameters for a FT2232H */ changed to ->> FT2245
should be FT2247; this is in a comment, so it has no effect.
Changing 0x6010 ->> 0x6015 made the difference. Before, the packet forwarder did not run at all; now it did start, because the PID was the same as the one shown by the lsusb command.

“Generally speaking for a pi you would have been better off buying an SPI version of the card.”
Could be, but we cannot. We plug a Pi Zero into an adapter card. This card offers two possibilities: serial and USB (mPCI-E). We use the serial one for a serial breakout board for serial devices (‘old’ sensors), and consequently we use USB for the LoRa concentrator.

Should be FT2247; this is in a comment, so it has no effect.
Changing 0x6010 ->> 0x6015 made the difference. Before, the packet forwarder did not run at all; now it did start, because the PID was the same as the one shown by the lsusb command.

That makes no sense, the documentation of the RAK2247 says it has an FT2232H, not an FT230x, so its PID would be that of the FT2232H, not the FT230x series.

Perhaps you are mistakenly talking to something else in the system?
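A quick way to check which chip an lsusb line describes is to compare its ID field with the expected values. The sketch below is only an illustration under two assumptions: that the card uses FTDI's default FT2232H IDs (0403:6010), and that the helper names, which are made up here, suit your code.

```c
#include <stdio.h>
#include <string.h>

/* USB IDs assumed here: FTDI vendor ID and the default FT2232H product ID. */
#define FTDI_VID     0x0403u
#define FT2232H_PID  0x6010u

/* Extract "ID vvvv:pppp" from one line of lsusb output.
 * Returns 1 on success, 0 if the line has no parsable ID field. */
static int parse_lsusb_id(const char *line, unsigned *vid, unsigned *pid)
{
    const char *p = strstr(line, "ID ");
    if (p == NULL) {
        return 0;
    }
    return sscanf(p + 3, "%4x:%4x", vid, pid) == 2;
}

/* Returns 1 if the line describes a device with the default FT2232H IDs. */
static int is_ft2232h(const char *line)
{
    unsigned vid = 0, pid = 0;
    return parse_lsusb_id(line, &vid, &pid)
        && vid == FTDI_VID && pid == FT2232H_PID;
}
```

For the lsusb line quoted earlier in this thread (ID 0403:6015), is_ft2232h() returns 0: the vendor matches FTDI, but the product ID is not the FT2232H default.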

Could be, but we cannot. We plug a Pi Zero into an adapter card. This card offers two possibilities: serial and USB (mPCI-E). We use the serial one for a serial breakout board for serial devices (‘old’ sensors), and consequently we use USB for the LoRa concentrator.

You could wire the SPI concentrator directly into the pi gpio.

But maybe it is this board that has the FT230x you seem to be talking to instead of the FT2232H on the RAK2247?

Perhaps somewhere in this process you lost the required reduction of the SPI baud rate?

Also, are you sure your power supply is sufficient to run both the concentrator and the pi? You probably need one rated for 4 amps or so, given peak loading.

Thanks for your reply again :smiley:

You can see the vid and pid with the lsusb command in linux.
The only change I made in the code section was adapting the PID from 6010 to 6015, and then the ./lora_pkt_fwd program found the device, as Linux already did. So I do not think I am talking to the wrong device. I am afraid, though, that I am talking to the wrong version of the device, as I should not have to change .c files for this, nor make the register change in the same file.

I think we did that. I saw that requirement in config.txt, but in our situation it is located in another file. I did not copy config.txt but changed this file:
sudo nano /boot/efi/extraconfig.txt
from:
dtparam=i2c_arm=on,i2c_arm_baudrate=400000
to:
dtparam=i2c_arm=on,i2c_arm_baudrate=100000
I did not mention this in the first part, as it was complex and long enough.

Our power supply is ‘only’ 2Amps, but could this be the problem?
LoRa long range, and low power 4 amps is 1200 watts
without LoRa it is 3.5 watts.

But we see another strange thing in /var/log/messages:
A lot of USB errors:
2020-10-18T00:05:02.907932+02:00 rwspi-10d9e52d kernel: [27038.815025] dwc2 3f980000.usb: hcint 0x00000402, intsts 0x04600009
2020-10-18T00:05:02.907945+02:00 rwspi-10d9e52d kernel: [27038.815103] dwc2 3f980000.usb: dwc2_hc_chhltd_intr_dma: Channel 5 - ChHltd set, but reason is unknown
2020-10-18T00:05:02.907956+02:00 rwspi-10d9e52d kernel: [27038.815112] dwc2 3f980000.usb: hcint 0x00000402, intsts 0x04600001
2020-10-18T00:03:27.487321+02:00 rwspi-10d9e52d kernel: [26943.391283] usb 1-1.2: usbfs: usb_submit_urb returned -121
This last one appears many, many times; the others once in a while.

One can see the id of something but that is not the correct id for a RAK2247, which suggests you are seeing some other USB device in your system

dtparam=i2c_arm=on,i2c_arm_baudrate=100000

This would have no bearing on the SPI clock rate of the FT2232H, that instead must be set in the code that talks to the FTDI chip…

Our power supply is ‘only’ 2Amps, but could this be the problem?

That is definitely insufficient for reliable operation. It’s really dubious even for just the pi by itself. Average power consumption will of course be low, but you need a supply capable of handling peak.

It’s also unclear if your mPCIe slot (adapter?) is providing a suitable supply to the RAK card.
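To put the numbers in perspective, here is a small sketch of the arithmetic, assuming the usual 5 V DC rail of a Raspberry Pi supply. At 5 V, 4 A corresponds to 20 W, not 1200 W, and a 2 A supply delivers at most 10 W. The function names and the headroom factor are illustrative assumptions, and any peak currents passed in are placeholders to be replaced with measured or datasheet values.

```c
/* Power in watts delivered at a given DC voltage and current. */
static double watts(double volts, double amps)
{
    return volts * amps;
}

/* Returns 1 if the supply covers the summed peak currents of all loads,
 * multiplied by a headroom factor (e.g. 1.5 means 50 % margin). */
static int supply_is_sufficient(double supply_amps,
                                const double *peak_amps, int n,
                                double headroom)
{
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        total += peak_amps[i];
    }
    return supply_amps >= total * headroom;
}
```

The point of the second function is that the check uses peak currents, not averages, which is why a supply that looks generous on average consumption can still fail.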

Dear Chris,
Thanks for your insights and help.
Probably you are right:

We removed the concentrator from the mPCI-e slot and connected it to the SPI connector.
For this we had to remove the serial card. In the beginning we could not connect until we removed the overlay entries for the serial card (#dtoverlay=spiuart1 and #dtoverlay=spiuart2) and added another, dtoverlay=i2cuarts. So maybe this card was also in the way when we were connected via mPCI-e. As mPCI-e with the serial card is the preferred option, we are still looking for a way to get it to work.

That is correct, but as you referred to the baud rate I thought you meant this one.
I did change the SPI speed in loragw_spi.ftdi.c for the mPCI-e version and in loragw_spi.native.c for the SPI version, but that did not make a difference.

This also does not seem to be the problem; moreover, I found a page with the power consumption:
Active-Mode(TX) 440 mA, Active-Mode(RX) 470 mA, and "You should use it at least 3.3V/1ADC power." See: https://downloads.rakwireless.com/LoRa/RAK2247-Mini-PCIe/Hardware-Specification/RAK2247_Mini-PCIe_LoRa_Concentrator_User_Manual_V1.1.pdf

Thank you for your help.
If you have any suggestions on how to get the mPCI-e to work they are most welcome.

What is this mystery “serial card” and what does it have to do with the LoRa Concentrator?

Also you are mistaken on the power, because you are failing to consider peak loads from the pi itself. 2 amps is marginal for the lowest end pi, even before you add the power hungry lora card. Both theory and practical experience point to your power supply being extremely insufficient. Again, anything less than a 4 amp supply and you are asking for trouble.

Dear Chris, the mystery was solved some time ago, just to let you know. You were quite right: seeing a USB device with lsusb doesn’t mean it is the one you expect. After we sent a photo to our supplier, they confirmed that this concentrator was SPI only. I’m sorry for bothering you. It would never have worked. We have now received the USB version, and in the coming days I will install this one too. So far I have been able to send data from a LoRa client (RN2483) to the rak2247-spi and log it (util_pkt_logger). I’ll try the same with the USB version. I do not expect any problems. Thanks for all your help.

The SPI card would have worked quite well (better actually than the USB version, as it is more time agile on a busy network) if wired directly to the pi’s SPI bus on the GPIO connector.

But the USB version will work on less busy networks.

This could be a new topic, as the above was about the SPI version, but it matches the title, so I decided to place it here. If that is wrong, please move this to a new topic (if possible).

Please help me to find the cause of the freeze and to get a proper .csv file.

I have now installed the (real) USB version of the concentrator, and I can see that I receive packets, but after some time the program freezes.
This happened with the programs lora_pkt_fwd and test_loragw_hal.
The program util_pkt_logger froze immediately and Ctrl-C did not work, leaving a 0-byte log file.
So I dived into the C files of this program to see if I could find the cause, hoping this would help with the others as well. Also, we like this .csv file concept.
My findings: in util_pkt_logger.c, on line 484, I commented out this line:
clock_nanosleep(CLOCK_MONOTONIC, 0, &sleep_time, NULL); /* wait a short time if no packets */
and replaced it with wait_ms(300); (from test_loragw_hal.c),
and added #include "loragw_aux.h" at the beginning (to be able to use wait_ms()).
With this change util_pkt_logger starts running like the other two, but it still freezes at some point, and Ctrl-C won’t work; Ctrl-Z and kill -9 are needed. But as long as it runs and receives packets I can stop the program and read the .csv file. Only, this file is unreadable, even the header: we see only Chinese characters.
If I add a piece of code to output the data (payload) to the screen, I see the correct data on the screen, so the correct data should be written to the file. However, the file is created before this loop, along with the header, so I assume it goes wrong there (void open_log(void), line 343).
My code to display the payload on the screen, added at line 589, just before "/* end of log file line */":
/* Print the payload as a string (added by Laurent) */
if (p->freq_hz == 867500000 && p->status == STAT_CRC_OK && p->datarate == DR_LORA_SF12)
{
    char output[p->size + 1]; /* one extra byte for the terminating NUL */
    for (j = 0; j < p->size; ++j) {
        output[j] = p->payload[j];
    }
    output[p->size] = 0;
    MSG("INFO: Client package ontvangen met string: %s\n", output);
}
The concentrator gives some Notes when starting:
lgw_connect:532: INFO: no FPGA detected or version not supported (v103)
Note: success connecting the concentrator
lgw_setup_sx125x:407: Note: SX125x #0 version register returned 0x21
lgw_setup_sx125x:415: Note: SX125x #0 clock output disabled
lgw_setup_sx125x:469: Note: SX125x #0 PLL start (attempt 1)
lgw_setup_sx125x:407: Note: SX125x #1 version register returned 0x21
lgw_setup_sx125x:412: Note: SX125x #1 clock output enabled
lgw_setup_sx125x:469: Note: SX125x #1 PLL start (attempt 1)
lgw_start:823: Note: calibration started (time: 2300 ms)
lgw_start:844: Note: calibration finished (status = 159)

The screen text after a successful Ctrl-C (before the freeze):
loragw_pkt_logger: INFO: gateway MAC address is configured to AA555A0000000101
loragw_pkt_logger: INFO: concentrator started, packet can now be received
loragw_pkt_logger: INFO: Now writing to log file pktlog_AA555A0000000101_20201211T165940Z.csv
loragw_pkt_logger: INFO: Client package ontvangen met string: Test zenden naam1
loragw_pkt_logger: INFO: Client package ontvangen met string: Test zenden naam2
loragw_pkt_logger: INFO: Client package ontvangen met string: Test zenden naam3
^Cloragw_pkt_logger: INFO: concentrator stopped successfully
loragw_pkt_logger: INFO: log file pktlog_AA555A0000000101_20201211T165940Z.csv closed, 3 packet(s) recorded
loragw_pkt_logger: INFO: Exiting packet logger program
I will attach a screenshot of the csv file that was saved in this run; it should contain 3 packets, but only shows Chinese characters.

  1. As mentioned before, make sure you are using at least a 5 ampere power supply, not the 2A one you were originally trying. You need to size this for peak loads, not average, or unreliability and random failures will result

  2. If you are logging binary data to file, you need to use something appropriate to view it, probably as a hexadecimal dump. Most likely you do not want to use csv for binary information, but rather hexdump in your output routine.

But let’s say you already have a binary file, to examine it you could do something such as

hexdump -C somefile
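For the other route mentioned in point 2, hex-encoding the payload inside the output routine, a minimal sketch could look like this. The function name and buffer handling are assumptions for illustration only, not code from util_pkt_logger.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hex-encode `len` payload bytes into `out` as uppercase digits.
 * `out_size` must be at least 2 * len + 1. Returns the number of
 * characters written (excluding the NUL), or -1 if `out` is too small. */
static int hex_encode(const uint8_t *payload, size_t len,
                      char *out, size_t out_size)
{
    if (out_size < 2 * len + 1) {
        return -1;
    }
    for (size_t i = 0; i < len; ++i) {
        /* each byte becomes exactly two hex digits plus a temporary NUL */
        snprintf(out + 2 * i, 3, "%02X", payload[i]);
    }
    out[2 * len] = '\0';
    return (int)(2 * len);
}
```

The resulting string contains only the characters 0-9 and A-F, so it can be written to a csv field with fprintf() and stays readable in any text viewer, whatever bytes the payload holds.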