Problem with dying battery on interrupt counter 19007/4631

Hi, I have made a number of rain gauges using the RAK19007 and RAK4631. All each device does is count interrupts and upload every 10 minutes. For some unknown reason, some of these devices (not all) run out of battery after an inconsistent period of time. I’m trying to figure out what could be happening. The power consumption is usually tiny: I don’t lose more than 0.1 V, even after overcast weather, and the ones that are working have been working well for over a year. I’m assuming the failing ones get caught in some sort of loop and can’t go to sleep, as the voltage drops from full to dead over a couple of days.
I’ll add the code. I’m wondering if there’s some sort of safeguard I can add, or whether there’s something in the code that could be improved to help. It’s a pretty simple sketch, the vast majority is the Deepsleep/lorawan sketch with the counter and battery voltage added.
Any help would be greatly appreciated…

#include <Arduino.h>
#include <SPI.h>
#include <LoRaWan-RAK4630.h>

uint32_t SLEEP_TIME = 600000; 
int current_sensor_payload;

//Battery Voltage************************************************************************
#define PIN_VBAT WB_A0
uint32_t vbat_pin = PIN_VBAT;

#define VBAT_MV_PER_LSB (0.73242188F) // 3.0V ADC range and 12 - bit ADC resolution = 3000mV / 4096
#define VBAT_DIVIDER_COMP (1.73)      // Compensation factor for the VBAT divider, depend on the board

#define REAL_VBAT_MV_PER_LSB (VBAT_DIVIDER_COMP * VBAT_MV_PER_LSB)

// Comment the next line if you want DEBUG output. But the power savings are not as good then!!!!!!!
//#define MAX_SAVE

/* Time the device is sleeping in milliseconds = 10 minutes * 60 seconds * 1000 milliseconds */
extern uint32_t SLEEP_TIME;

// LoRaWan stuff
int8_t initLoRaWan(void);
bool sendLoRaFrame(void);
extern SemaphoreHandle_t loraEvent;

// Main loop stuff
void periodicWakeup(TimerHandle_t unused);
extern SemaphoreHandle_t taskEvent;
extern uint8_t rcvdLoRaData[];
extern uint8_t rcvdDataLen;
extern uint8_t eventType;
extern SoftwareTimer taskWakeupTimer;

#include <Arduino.h>

//Rain Global****************************************************************************
#define rainPin WB_IO1                          //Red wire to WB_I01, Green to ground
int rain = 0;
//***************************************************************************************

/** Semaphore used by events to wake up loop task */
SemaphoreHandle_t taskEvent = NULL;

/** Timer to wakeup task frequently and send message */
SoftwareTimer taskWakeupTimer;

/** Buffer for received LoRaWan data */
uint8_t rcvdLoRaData[256];
/** Length of received data */
uint8_t rcvdDataLen = 0;

uint8_t eventType = -1;

/**
 * @brief Timer event that wakes up the loop task frequently
 * 
 * @param unused 
 */
void periodicWakeup(TimerHandle_t unused)
{
	// Switch off blue LED to save power during sleep
	digitalWrite(LED_CONN, LOW);
	eventType = 1;
	// Give the semaphore, so the loop task will wake up
	xSemaphoreGiveFromISR(taskEvent, pdFALSE);
}

/** DIO1 GPIO pin for RAK4631 */
#define PIN_LORA_DIO_1 47

/** Max size of the data to be transmitted. */
#define LORAWAN_APP_DATA_BUFF_SIZE 64
/** Number of trials for the join request. */
#define JOINREQ_NBTRIALS 8

/** Lora application data buffer. */
static uint8_t m_lora_app_data_buffer[LORAWAN_APP_DATA_BUFF_SIZE];
/** Lora application data structure. */
static lmh_app_data_t m_lora_app_data = {m_lora_app_data_buffer, 0, 0, 0, 0};

// LoRaWan event handlers
/** LoRaWan callback when join network finished */
static void lorawan_has_joined_handler(void);
/** LoRaWan callback when join failed */
static void lorawan_join_failed_handler(void);
/** LoRaWan callback when data arrived */
static void lorawan_rx_handler(lmh_app_data_t *app_data);
/** LoRaWan callback after class change request finished */
static void lorawan_confirm_class_handler(DeviceClass_t Class);
/** LoRaWan Function to send a package */
bool sendLoRaFrame(void);

/**@brief Structure containing LoRaWan parameters, needed for lmh_init()
 * 
 * Set structure members to
 * LORAWAN_ADR_ON or LORAWAN_ADR_OFF to enable or disable adaptive data rate
 * LORAWAN_DEFAULT_DATARATE OR DR_0 ... DR_5 for default data rate or specific data rate selection
 * LORAWAN_PUBLIC_NETWORK or LORAWAN_PRIVATE_NETWORK to select the use of a public or private network
 * JOINREQ_NBTRIALS or a specific number to set the number of trials to join the network
 * LORAWAN_DEFAULT_TX_POWER or a specific number to set the TX power used
 * LORAWAN_DUTYCYCLE_ON or LORAWAN_DUTYCYCLE_OFF to enable or disable duty cycles
 *                   Please note that ETSI mandates duty cycled transmissions. 
 */
static lmh_param_t lora_param_init = {LORAWAN_ADR_ON, DR_0, LORAWAN_PUBLIC_NETWORK, JOINREQ_NBTRIALS, LORAWAN_DEFAULT_TX_POWER, LORAWAN_DUTYCYCLE_OFF};

/** Structure containing LoRaWan callback functions, needed for lmh_init() */
static lmh_callback_t lora_callbacks = {BoardGetBatteryLevel, BoardGetUniqueId, BoardGetRandomSeed,
                    lorawan_rx_handler, lorawan_has_joined_handler, lorawan_confirm_class_handler, lorawan_join_failed_handler};
//  !!!! KEYS ARE MSB !!!!
/** Device EUI required for OTAA network join */
uint8_t nodeDeviceEUI[8] = {};
uint8_t nodeAppEUI[8] = {};
uint8_t nodeAppKey[16] = {};

/** Device address required for ABP network join */
uint32_t nodeDevAddr = ;
uint8_t nodeNwsKey[16] = {};
uint8_t nodeAppsKey[16] = {};

/** Flag whether to use OTAA or ABP network join method */
bool doOTAA = true;

DeviceClass_t gCurrentClass = CLASS_A;           /* class definition*/
LoRaMacRegion_t gCurrentRegion = LORAMAC_REGION_AU915; /* Region: AU915 */

int8_t initLoRaWan(void)
{
  Serial.printf("<<>> lora_rak4630_init called at %dms\n", millis());
  // Initialize LoRa chip.
  if (lora_rak4630_init() != 0)
  {
    return -1;
  }
  Serial.printf("<<>> lora_rak4630_init finished at %dms\n", millis());
  delay(100);

  // Setup the EUIs and Keys
  if (doOTAA)
  {
    lmh_setDevEui(nodeDeviceEUI);
    lmh_setAppEui(nodeAppEUI);
    lmh_setAppKey(nodeAppKey);

  }
  else
  {
    lmh_setNwkSKey(nodeNwsKey);
    lmh_setAppSKey(nodeAppsKey);
    lmh_setDevAddr(nodeDevAddr);
  }

  Serial.printf("<<>> lmh_init called at %dms\n", millis());
  // Initialize LoRaWan
  if (lmh_init(&lora_callbacks, lora_param_init, doOTAA, gCurrentClass, gCurrentRegion) != 0)
  {
    return -2;
  }
  Serial.printf("<<>> lmh_init finished at %dms\n", millis());
  delay(100);

  // For some regions we might need to define the sub band the gateway is listening to
  // This must be called AFTER lmh_init()
  if (!lmh_setSubBandChannels(2))
  {
    return -3;
  }

  // Start Join procedure
#ifndef MAX_SAVE
  Serial.printf("<<>> lmh_join called at %dms\n", millis());
#endif
  lmh_join();
  Serial.printf("<<>> lmh_join finished at %dms\n", millis());
  delay(100);
  
  return 0;
}

static void lorawan_has_joined_handler(void)
{
  Serial.printf("<<>> lorawan_has_joined_handler called at %dms\n", millis());
  delay(100);
  if (doOTAA)
  {
    uint32_t otaaDevAddr = lmh_getDevAddr();
#ifndef MAX_SAVE
    Serial.printf("OTAA joined and got dev address %08X\n", otaaDevAddr);
#endif
  }
  else
  {
#ifndef MAX_SAVE
    Serial.println("ABP joined");
#endif
  }

  // Default is Class A, where the SX1262 transceiver is in sleep mode unless a package is sent
  // If switched to Class C the power consumption is higher because the SX1262 chip remains in RX mode

  // lmh_class_request(CLASS_C);

  digitalWrite(LED_CONN, LOW);

  // Now we are connected, start the timer that will wakeup the loop frequently
  taskWakeupTimer.begin(SLEEP_TIME, periodicWakeup);
  taskWakeupTimer.start();

  eventType = 1;
  // Notify task about the event
  if (taskEvent != NULL)
  {
    Serial.printf("<<>> Waking up task at %dms\n", millis());
    xSemaphoreGive(taskEvent);
  }
}

/**@brief LoRa function for handling OTAA join failed
*/
static void lorawan_join_failed_handler(void)
{
  Serial.println("OVER_THE_AIR_ACTIVATION failed!");
  Serial.println("Check your EUI's and Keys's!");
  Serial.println("Check if a Gateway is in range!");
}
/**
 * @brief Function for handling LoRaWan received data from Gateway
 *
 * @param app_data  Pointer to rx data
 */
static void lorawan_rx_handler(lmh_app_data_t *app_data)
{
  
#ifndef MAX_SAVE
  Serial.printf("LoRa Packet received on port %d, size:%d, rssi:%d, snr:%d\n",
          app_data->port, app_data->buffsize, app_data->rssi, app_data->snr);
#endif
  switch (app_data->port)
  {
  case 3:
    // Port 3 switches the class
    if (app_data->buffsize == 1)
    {
      switch (app_data->buffer[0])
      {
      case 0:
        lmh_class_request(CLASS_A);
#ifndef MAX_SAVE
        Serial.println("Request to switch to class A");
#endif
        break;

      case 1:
        lmh_class_request(CLASS_B);
#ifndef MAX_SAVE
        Serial.println("Request to switch to class B");
#endif
        break;

      case 2:
        lmh_class_request(CLASS_C);
#ifndef MAX_SAVE
        Serial.println("Request to switch to class C");
#endif
        break;

      default:
        break;
      }
    }

    break;
  case LORAWAN_APP_PORT:
      // Copy the data into loop data buffer
      memcpy(rcvdLoRaData, app_data->buffer, app_data->buffsize);
      rcvdDataLen = app_data->buffsize;
      eventType = 0;

      // Assuming the new time is encoded as 3 bytes, MSB first. e.g. 30 => 0x00, 0x00, 0x1E
      // Downlink must be sent on Port2
      // Downlink is in seconds
      uint32_t new_time = app_data->buffer[0] << 16;
      new_time |= app_data->buffer[1] << 8;
      new_time |= app_data->buffer[2];
      
      SLEEP_TIME = new_time * 1000;
      Serial.print("New Time Set ");Serial.println(new_time);
      Serial.print("Sleep Time ");Serial.println(SLEEP_TIME);

      taskWakeupTimer.stop();
      taskWakeupTimer.setPeriod(SLEEP_TIME);
      taskWakeupTimer.start();

      // Notify task about the event
      if (taskEvent != NULL)
      {
#ifndef MAX_SAVE
        Serial.println("Waking up loop task");
#endif
        xSemaphoreGive(taskEvent);
      }

  }
}

/**
 * @brief Callback for class switch confirmation
 * 
 * @param Class The new class
 */
static void lorawan_confirm_class_handler(DeviceClass_t Class)
{
#ifndef MAX_SAVE
  Serial.printf("switch to class %c done\n", "ABC"[Class]);
#endif

  // Informs the server that switch has occurred ASAP
  m_lora_app_data.buffsize = 0;
  m_lora_app_data.port = LORAWAN_APP_PORT;
  lmh_send(&m_lora_app_data, LMH_UNCONFIRMED_MSG);
}

/**
 * @brief Send a LoRaWan package
 * 
 * @return result of send request
 */

bool sendLoRaFrame(void)
{
  if (lmh_join_status_get() != LMH_SET)
  {
    //Not joined, try again later
#ifndef MAX_SAVE
    Serial.println("Did not join network, skip sending frame");
#endif
    return false;
  }

//Battery Voltage**********************************************************************************
  // Get a raw ADC reading
  analogReference(AR_INTERNAL_3_0);
  delay(50);
  int vbat_mv = readVBAT();
  int vbat_VOLT = (vbat_mv / 10) - 300;

  analogReference(AR_INTERNAL);   //This takes it back to default (3.6V), so wind direction has full resistance

  // Convert from raw mv to percentage (based on LIPO chemistry)
  uint8_t vbat_per = mvToPercent(vbat_mv);
  
  m_lora_app_data.port = LORAWAN_APP_PORT;

  uint32_t buffSize = 0;
  m_lora_app_data_buffer[buffSize++] = highByte(rain);
  m_lora_app_data_buffer[buffSize++] = lowByte(rain);
  m_lora_app_data_buffer[buffSize++] = vbat_mv / 20;



  m_lora_app_data.buffsize = buffSize;

  lmh_error_status error = lmh_send(&m_lora_app_data, LMH_UNCONFIRMED_MSG);

  return (error == 0);
}
void setup(void)
{

  //Rainfall************************************************************************
  pinMode(rainPin, INPUT_PULLUP);
  attachInterrupt(rainPin, raincounter, FALLING);
  //********************************************************************************
  
  // Initialize Serial for debug output
  time_t timeout = millis();
  Serial.begin(115200);
  while (!Serial)
  {
    if ((millis() - timeout) < 5000)
    {
      delay(100);
    }
    else
    {
      break;
    }
  }

  //Battery Voltage*****************************************************************

  // Set the analog reference to 3.0V (default = 3.6V)
  // analogReference(AR_INTERNAL_3_0);

  // Let the ADC settle
  delay(1);

  // Get a single ADC sample and throw it away
  readVBAT();
  //******************************************************************************

 	// Create the LoRaWan event semaphore
	taskEvent = xSemaphoreCreateBinary();
	// Initialize semaphore
	xSemaphoreGive(taskEvent);

	// Initialize the built in LED
	pinMode(LED_BUILTIN, OUTPUT);
	digitalWrite(LED_BUILTIN, LOW);

	// Initialize the connection status LED
	pinMode(LED_CONN, OUTPUT);
	digitalWrite(LED_CONN, LOW);

#ifndef MAX_SAVE
	// Initialize Serial for debug output
	Serial.begin(115200);

	// On nRF52840 the USB serial is not available immediately
	while (!Serial)
	{
		if ((millis() - timeout) < 5000)
		{
			delay(100);
			digitalWrite(LED_BUILTIN, !digitalRead(LED_BUILTIN));
		}
		else
		{
			break;
		}
	}
#endif

	digitalWrite(LED_BUILTIN, LOW);

#ifndef MAX_SAVE

#endif

	// Initialize LoRaWan and start join request
	int8_t loraInitResult = initLoRaWan();

#ifndef MAX_SAVE
	if (loraInitResult != 0)
	{
		switch (loraInitResult)
		{
		case -1:
			Serial.println("SX126x init failed");
			break;
		case -2:
			Serial.println("LoRaWan init failed");
			break;
		case -3:
			Serial.println("Subband init error");
			break;
		case -4:
			Serial.println("LoRa Task init error");
			break;
		default:
			Serial.println("LoRa init unknown error");
			break;
		}

		// Without working LoRa we just stop here
		while (1)
		{
			Serial.println("Nothing I can do, just loving you");
			delay(5000);
		}
	}
	Serial.println("LoRaWan init success");
#endif

	// Take the semaphore so the loop will go to sleep until an event happens
	xSemaphoreTake(taskEvent, 10);
}

/**
 * @brief Arduino loop task. Called in a loop from the FreeRTOS task handler
 * 
 */
void loop(void)
{
	// Switch off blue LED to show we go to sleep
//	digitalWrite(LED_BUILTIN, LOW);

	// Sleep until we are woken up by an event
	if (xSemaphoreTake(taskEvent, portMAX_DELAY) == pdTRUE)
	{
		// Switch on blue LED to show we are awake
		digitalWrite(LED_BUILTIN, HIGH);
		delay(500); // Only so we can see the blue LED

		// Check the wake up reason
		switch (eventType)
		{
		case 0: // Wakeup reason is package downlink arrived
#ifndef MAX_SAVE
			Serial.println("Received package over LoRaWan");
#endif
			if (rcvdLoRaData[0] > 0x1F)
			{
#ifndef MAX_SAVE
				Serial.printf("%s\n", (char *)rcvdLoRaData);
#endif
			}
			else
			{
#ifndef MAX_SAVE
				for (int idx = 0; idx < rcvdDataLen; idx++)
				{
					Serial.printf("%X ", rcvdLoRaData[idx]);
				}
				Serial.println("");
#endif
			}

			break;
		case 1: // Wakeup reason is timer
#ifndef MAX_SAVE
			Serial.println("Timer wakeup");
#endif
			/// \todo read sensor or whatever you need to do frequently

			// Send the data package
			if (sendLoRaFrame())
			{
#ifndef MAX_SAVE
				Serial.println("LoRaWan package sent successfully");
#endif
			}
			else
			{
#ifndef MAX_SAVE
				Serial.println("LoRaWan package send failed");
				/// \todo maybe you need to retry here?
#endif
			}

			break;
		default:
#ifndef MAX_SAVE
			Serial.println("This should never happen ;-)");
#endif
			break;
		}
		// Go back to sleep
#ifndef MAX_SAVE
		Serial.println("Going back to sleep ;-)");
#endif
		digitalWrite(LED_BUILTIN, LOW);
		xSemaphoreTake(taskEvent, 10);
	}
}
//Rain*************************************************************************************
void raincounter() {

  //Debounce switch
  static unsigned long last_interrupt_time = 0;
  unsigned long interrupt_time = millis();
  // If interrupts come faster than 200ms, assume it's a bounce and ignore
  if (interrupt_time - last_interrupt_time > 200)
  {
    rain = rain + 1;
  }
  last_interrupt_time = interrupt_time;
}


float readVBAT(void)
{
  float raw;

  // Get the raw 12-bit, 0..3000mV ADC value
  analogReadResolution(12);
  raw = analogRead(vbat_pin);
  delay(50);
  analogReadResolution(10);

  return raw * REAL_VBAT_MV_PER_LSB;
}
uint8_t mvToPercent(float mvolts)
{
  if (mvolts < 3300)
    return 0;
  if (mvolts < 3600)
  {
    mvolts -= 3300;
    return mvolts / 30;
  }
  mvolts -= 3600;
  return 10 + (mvolts * 0.15F); // thats mvolts /6.66666666
}
/**
   @brief get LoRaWan Battery value
   @param mvolts
      Raw Battery Voltage
*/
uint8_t mvToLoRaWanBattVal(float mvolts)
{
  if (mvolts < 3300)
    return 0;
  if (mvolts < 3600)
  {
    mvolts -= 3300;
    return mvolts / 30 * 2.55;
  }
  mvolts -= 3600;
  return (10 + (mvolts * 0.15F)) * 2.55;
}

//***************************************************************************************

The nodes that are draining the battery faster are still sending data every 10 minutes?

I see only a few points in your code that should be improved:

(1) Make eventType volatile, so that it is always reloaded from memory while the code is running:

volatile uint8_t eventType = -1;

(2) Just to be safe, reset the eventType after you have processed the event in the loop.

(3) Your debouncing in void raincounter() might fail. You define static unsigned long last_interrupt_time = 0; within the function. A static local should be initialized only once and keep its value between calls, so the interrupt should not reset it to 0 each time, but to rule this out
I would declare static unsigned long last_interrupt_time = 0; outside of the function as a global variable.
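
A minimal host-side sketch of points (1) to (3), assuming the names rainEdge, takeEvent and EVENT_NONE (none of which are in the original sketch). The timestamp is passed in rather than read from millis() so the logic can be run off-device. It also illustrates that a function-local static in C++ is initialized once and keeps its value between calls, so either placement of last_interrupt_time should debounce the same way.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not part of the posted sketch.
static const uint8_t EVENT_NONE = 0xFF;   // what "uint8_t eventType = -1" evaluates to
static const uint32_t DEBOUNCE_MS = 200;  // same window as the original raincounter()

volatile uint8_t eventType = EVENT_NONE;  // (1) volatile: written from timer/ISR context
volatile uint32_t rainCount = 0;          // also written from the interrupt handler

// Model of the rain interrupt handler with the time injected as a parameter.
void rainEdge(uint32_t nowMs)
{
    // Initialized once, not on every call; the value persists between calls.
    static uint32_t lastEdgeMs = 0;

    // Unsigned subtraction stays correct when the millisecond counter wraps.
    if (nowMs - lastEdgeMs > DEBOUNCE_MS)
    {
        rainCount = rainCount + 1;
    }
    lastEdgeMs = nowMs;
}

// (2) Read the pending event and reset it, so a stale value is never reprocessed.
// On the device an interrupt could still fire between the read and the write;
// a critical section around these two lines would close that gap.
uint8_t takeEvent()
{
    uint8_t pending = eventType;
    eventType = EVENT_NONE;
    return pending;
}
```

In loop(), the switch would then run on the value returned by takeEvent() instead of reading eventType directly.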

Excellent, thanks for the tips @beegee, I’ll make those changes.
The devices keep sending at 10 minute intervals until the battery gets too low. I suppose that would indicate that they’re not in a loop of any sort?
Here’s a screenshot of what I’m seeing…and I have others with the same code/same device setup that have been going for over a year without issue…

Are the base boards of all devices the same PCB revision?
What base boards are you using?

There could be a mix of the earlier 5005 boards and the 19007 ones. Same core modules. They’re out in the wild now so couldn’t say definitively…

Difficult to say then. The RAK19007 is better than the RAK5005-O, but the RAK5005-O shouldn’t actually be that bad.
From your code I am guessing you are just using WB_IO1 for the input and have no other modules installed.

What kind of battery? Are you using solar to recharge?

It is difficult to find the reason with the devices not at hand.

Yep, just WB_IO1. Using 18650, no solar.
I’d be happy to send a failed device to see if you can solve the issue! I suppose a solar panel might solve the issue, but it’s a bit of a band aid solution I feel.

Are the 18650s brand new? Not to blame the batteries, but I have had some bad experiences with “recycled” cheap 18650 batteries in the past. Since then I always use the more expensive but more reliable Li-Ion batteries with integrated overcharge, undercharge and overheat protection, like these:
image

Before sending devices around, I will have a test with your code.
I have two similar devices running on a 200mAh battery (door open/close detector with a reed-relay RAK13011).
Getting ~5-6 months before I have to recharge.

My code RAK13011-Alarm-Msg-Queue

Yeah, that’s a fair call on the batteries; it’s sometimes difficult to know if a cell is genuine or a rip-off. If you have a reputable supplier that can ship to AU, I’d be keen to source a supply. Mine are rated at 3000 mAh, so I would have thought I should get at least a couple of years out of a single charge. In addition to the new battery, I might put solar on from now on.
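
As a rough cross-check of these numbers, the average current implied by a capacity and a runtime is simply capacity divided by hours. The sketch below assumes the full nominal 3000 mAh is usable, ignores self-discharge, and takes "a couple of days" as 48 hours; the function names are made up for illustration. Under those assumptions, a drain from full to dead in two days corresponds to roughly 62 mA on average, whereas the 200 mAh over 5 to 6 months quoted above works out to about 0.05 mA.

```cpp
// Back-of-envelope battery arithmetic; all inputs are assumptions, not measurements.

// Average current in mA implied by using up capacityMah over the given hours.
double averageCurrentMa(double capacityMah, double hours)
{
    return capacityMah / hours;
}

// Runtime in days for a given capacity at a constant average current.
double runtimeDays(double capacityMah, double currentMa)
{
    return capacityMah / currentMa / 24.0;
}
```

At around 0.05 mA a 3000 mAh cell would last several years on paper, so a multi-year expectation looks plausible. An average in the tens of milliamps suggests something staying awake or powered rather than normal sleep current, though only a measurement on a failing unit can confirm that.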

Sorry, I don’t know suppliers in AU. Struggling with reliable sources here in the Philippines as well.

I have built a few interrupt/counter battery-based products, specifically flow counters, and what I found is that in some rare instances the sensor and magnet stay in contact with one another when flow stops, and the device then draws a constantly higher current than expected because the interrupt pull-up is always active. What you should test is:

  1. What is your current draw when you have a constant interrupt being triggered?
  2. How many interrupts do you expect to be triggered daily or hourly? Rather than interrupt based you can make it time based depending on the product requirement, for example check every 30s for rain, that way you can toggle all your GPIOs low when you do not need to draw current.
  3. Start using #ifdef and #endif in your code to wrap all your serial debug output. Printing over the serial port takes time, and that time increases your overall current draw.
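
Point 1 can be bounded with Ohm's law before measuring anything. The sketch below assumes a 3.3 V rail and an internal pull-up of about 13 kΩ (a typical figure for the nRF52840 as far as I recall; check the datasheet), and the function name is hypothetical. With those values a permanently closed switch would sink roughly 0.25 mA through the pull-up.

```cpp
// Current through a pull-up resistor when the switch holds the pin at ground.
// Both inputs are assumed values, not measurements from the devices in this thread.
double pullupCurrentMa(double supplyVolts, double pullupOhms)
{
    // I = V / R, converted from amps to milliamps.
    return supplyVolts / pullupOhms * 1000.0;
}
```

On these assumptions a stuck contact would take well over a year to drain a 3000 mAh cell, so it would not by itself explain a drop from full to dead in a couple of days. Measuring the real current is still the deciding test.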

Most of the above is just generic knowledge

Thanks @brolly759, I really appreciate the pointers. I’m pretty sure the magnet is not staying in contact with the sensor: as it’s a rain gauge, it can’t really get “caught”. The interrupts are rare and only occur when it’s raining, but I guess I could get 100 per 20 mm of rain (or more or less, obviously). I’m in a pretty dry part of AU, with an average of only 450 mm/yr.
The uplink is time based, it counts the interrupts while asleep. Every 10 minutes.
Point 3 is good, I’m a complete amateur when it comes to code. Many thanks.
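
For point 3, one common pattern is a single logging macro that compiles to nothing when MAX_SAVE is defined, so individual prints no longer need their own #ifndef/#endif pair. In the posted sketch the #define MAX_SAVE line is commented out, and several Serial.printf calls in initLoRaWan() and the join and downlink handlers sit outside the guards altogether. The names DEBUG_LOG, kDebugEnabled and reportWakeup below are hypothetical, and std::printf stands in for Serial.printf so the sketch can be compiled on a PC.

```cpp
#include <cstdio>

// Uncomment to strip every DEBUG_LOG call from the build.
// #define MAX_SAVE

#ifndef MAX_SAVE
constexpr bool kDebugEnabled = true;
// Host stand-in; on the RAK4631 this would forward to Serial.printf instead.
#define DEBUG_LOG(...) std::printf(__VA_ARGS__)
#else
constexpr bool kDebugEnabled = false;
// Expands to a no-op, so neither the call nor its arguments are compiled in.
#define DEBUG_LOG(...) ((void)0)
#endif

// Example call site: no #ifndef/#endif pair needed around the print.
inline void reportWakeup(unsigned long nowMs)
{
    DEBUG_LOG("Timer wakeup at %lu ms\n", nowMs);
}
```

With every print routed through the macro, defining MAX_SAVE in one place removes all debug output for the field build.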