Understanding BLE Disconnections

April 6th, 2020
Bluetooth, BLE

Connections are one of the two primary ways of transferring data in BLE. Understanding and debugging them is critical to creating a reliable, low-power product, especially when that product has trouble transferring data.

Unfortunately, the Bluetooth Low Energy protocol isn’t very helpful when it comes to debugging connections. Part of this is the wireless nature of the protocol, and part is the way the protocol is defined, which makes it difficult for many to debug. Don't fear: we'll give you the tools and understanding you need to investigate BLE connections and, in particular, disconnections.

BLE Disconnection

If you’re seeing disconnections between your devices and are able to sniff the connection (using any number of tools that are available, or by looking at the error codes in HCI captures), then there are a few error codes you can expect in the packets.

The critical packets to look at, if you see them, are the LL_TERMINATE_IND packets, which contain an error code indicating why the connection terminated, and are specified as:

  • Authentication Failure - 0x05
  • Remote User Terminated Connection - 0x13
  • Connection Terminated by Local Host - 0x16
  • Remote device terminated connection due to low resources - 0x14
  • Remote device terminated connection due to Power Off - 0x15
  • Unsupported Remote Feature - 0x1A
  • Pairing with Unit Key Not Supported error code - 0x29
  • Unacceptable Connection Parameters - 0x3B

On the wireless link, the LL_TERMINATE_IND can be sent by either the Peripheral or the Central device and contains the reason why the device disconnected. Note that the set of allowed reasons is mandated by the specification. Although product developers can provide a more specific reason, you will typically see the unhelpful 0x13 or 0x16.

From the error codes above, 0x13 and 0x16 are the most common ones you will see. 0x14 and 0x15 are intentional disconnection error codes: the other device is giving you a bit more information about why it's disconnecting. That's not common, though, because few product developers use them, or they're only used in corner cases (shutting down, for example).
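When reading sniffer or HCI captures, it helps to turn the raw reason byte into a readable label. Below is a minimal Python sketch of such a lookup, using the codes listed above. The names TERMINATE_REASONS and describe_reason are ours, made up for illustration, and are not part of any Bluetooth stack.

```python
# Reason codes from the list above, keyed by their error code value.
TERMINATE_REASONS = {
    0x05: "Authentication Failure",
    0x13: "Remote User Terminated Connection",
    0x14: "Remote Device Terminated Connection due to Low Resources",
    0x15: "Remote Device Terminated Connection due to Power Off",
    0x16: "Connection Terminated by Local Host",
    0x1A: "Unsupported Remote Feature",
    0x29: "Pairing with Unit Key Not Supported",
    0x3B: "Unacceptable Connection Parameters",
}


def describe_reason(code: int) -> str:
    """Return a readable label for a reason code (hypothetical helper)."""
    name = TERMINATE_REASONS.get(code, "Unknown or unlisted reason")
    return f"0x{code:02X} ({name})"


if __name__ == "__main__":
    for code in (0x13, 0x16, 0x99):
        print(describe_reason(code))
```

Codes that are not in the table fall through to a generic label rather than raising an error, which is usually what you want when scanning a long capture.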

Let's assume a Central and a Peripheral device are connected, and the Peripheral decides to disconnect. Looking at the events and reason codes at each device, you would see the following:

  1. The Peripheral Host sends an HCI_Disconnect command to the Peripheral BLE controller. This command contains the disconnection reason (from the list above)
  2. The Peripheral Controller sends an HCI_Disconnection_Complete with 0x16 (Connection Terminated by Local Host)
  3. The Peripheral Controller sends the LL_TERMINATE_IND over the air to the Central Controller with the reason
  4. The Central Controller receives the LL_TERMINATE_IND, closes the connection and sends a HCI_Disconnection_Complete to the Central Host with the reason
  5. The Central device application receives a disconnection alert
Bluetooth LE Disconnection message flow
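The key takeaway from the flow above is that the two hosts do not see the same reason code. Here is a small Python sketch of that asymmetry. The function name reasons_reported and the dictionary keys are ours, chosen only for illustration.

```python
CONNECTION_TERMINATED_BY_LOCAL_HOST = 0x16


def reasons_reported(requested_reason: int) -> dict:
    """Reason each host sees in its HCI_Disconnection_Complete event.

    requested_reason is the value the initiating host passed to
    HCI_Disconnect; it travels to the peer inside LL_TERMINATE_IND.
    """
    return {
        # The side that asked for the disconnect is told it did so itself.
        "initiator": CONNECTION_TERMINATED_BY_LOCAL_HOST,
        # The other side sees the reason carried over the air.
        "peer": requested_reason,
    }


if __name__ == "__main__":
    seen = reasons_reported(0x13)
    print({side: hex(code) for side, code in seen.items()})
```

So if your Peripheral log shows 0x16 and your phone log shows 0x13 for the same event, both are describing the same Peripheral-initiated disconnection.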

If the Central device decided to disconnect, then the events would be reversed. This is fine when one of the devices decides to disconnect and sends an actual disconnection via an LL_TERMINATE_IND packet, but that doesn't always happen when there are issues.

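When the supervisor timeout (the supervision timeout in the specification) expires, there is no LL_TERMINATE_IND packet with a reason. Instead, the 0x08 Connection Timeout error code is provided in the HCI_Disconnection_Complete event. The 0x22 LL Response Timeout error code is different: it is reported when a Link Layer procedure gets no response from the other device in time.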

When debugging BLE devices in practice, this clean-cut chain of events gets murky, mainly because the controllers have no idea what has actually happened to the other side. If the Peripheral (or Central) device goes out of range, or the signal falls below the receiver's sensitivity, packet loss increases until the supervisor timeout expires on both devices. In this case, there will not be an LL_TERMINATE_IND packet, but each device will still generate the Disconnection Complete event.

So, in summary, BLE disconnections occur because:

  1. One of the devices disconnects by requesting a disconnection
  2. The connection drops and the devices time out
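A first triage step is to sort each HCI_Disconnection_Complete reason into one of these two buckets. The Python sketch below does that. It assumes the Core Specification value 0x08 (Connection Timeout) for a dropped link, and the function name and bucket labels are ours.

```python
LOCAL_REQUEST = {0x16}
# Reasons a peer can place in LL_TERMINATE_IND, taken from the list earlier.
PEER_REQUEST = {0x05, 0x13, 0x14, 0x15, 0x1A, 0x29, 0x3B}
# Assumption: 0x08 is "Connection Timeout" in the Core Specification.
CONNECTION_TIMEOUT = 0x08


def classify_disconnect(reason: int) -> str:
    """Sort a Disconnection Complete reason into a coarse bucket."""
    if reason in LOCAL_REQUEST:
        return "requested locally"
    if reason in PEER_REQUEST:
        return "requested by peer"
    if reason == CONNECTION_TIMEOUT:
        return "connection dropped (timeout)"
    return "other"


if __name__ == "__main__":
    for reason in (0x16, 0x13, 0x08, 0x22):
        print(hex(reason), "->", classify_disconnect(reason))
```

Anything in the "other" bucket deserves a closer look at the capture, since it does not fit either of the two common cases.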

While we covered the first case in the previous discussion, there's more to it. So let's dig deeper.

Intentional Disconnections

It may seem obvious, but either the Central or the Peripheral device can directly request a disconnection by issuing the HCI_Disconnect command. What this looks like differs between Bluetooth stacks, but underneath, Bluetooth is standardized and there is always an HCI_Disconnect command. Typically the API for this call also requires a valid error code to be provided. We've occasionally seen errant disconnect calls in application code (very unusual), but more often it is the Bluetooth stack itself, from Nordic, TI or other manufacturers, that requests a disconnect, for a variety of reasons:

  1. Failed connection parameter negotiation - The connection parameters requested at the start of or during a connection may not be accepted by the other side. In that case, the device can either stay with the current connection parameters or disconnect, so a failed parameter negotiation can result in a disconnection. Smartphones and tablets often limit the connection parameters they allow. The iPhone is notorious for requiring specific BLE connection parameters and dropping the connection if these are not met, and often you won't know why it's happening.
  2. Insufficient Channels for Connection - In cases where the adaptive frequency hopping (AFH) algorithm marks too many channels as unusable because of interference, it's possible for devices to decide that they should disconnect. This can be worse with devices from different manufacturers that use variations of the AFH algorithm.
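For the first item, it's worth sanity-checking the parameters you request before blaming the other side. The Python sketch below checks a set of parameters against the general limits we understand the Bluetooth Core Specification to define (interval of 7.5 ms to 4 s, latency of 0 to 499 events, supervision timeout of 100 ms to 32 s, and a timeout longer than (1 + latency) × interval × 2). The function name is ours, and phone-specific limits, which are stricter, are not encoded here.

```python
def check_connection_params(interval_ms: float, latency: int,
                            timeout_ms: float) -> list:
    """Return a list of problems; an empty list means nothing was flagged.

    Hypothetical helper: only the general specification ranges are checked,
    not the tighter limits a particular phone may impose.
    """
    problems = []
    if not 7.5 <= interval_ms <= 4000:
        problems.append("interval outside 7.5 ms to 4 s")
    if not 0 <= latency <= 499:
        problems.append("latency outside 0 to 499 events")
    if not 100 <= timeout_ms <= 32000:
        problems.append("supervision timeout outside 100 ms to 32 s")
    # The timeout has to leave room for at least two connection attempts.
    if timeout_ms <= (1 + latency) * interval_ms * 2:
        problems.append("supervision timeout too short for interval and latency")
    return problems


if __name__ == "__main__":
    print(check_connection_params(30, 0, 4000))
    print(check_connection_params(1000, 4, 6000))
```

Passing this check only means the request is legal. A phone can still reject it, so check the vendor's accessory guidelines for the platforms you target.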

BLE Connection Drops

The other reason why a disconnection happens is that one or both devices realize the connection has cut out. This can happen for a variety of reasons, but basically comes down to the Supervisor timeout expiring.

The Supervisor Timeout is a timer on each side of the connection in charge of making sure that the connection is still valid. The timer is kept alive as long as packets are received regularly, on the connection events specified by the connection parameters. If packets are not being received, this timer expires and the Bluetooth controller assumes the connection is dead. Packets may fail to arrive for several reasons:

  • Hardware issues
  • Interference
  • Low Signal
  • Low Sensitivity or Output Power
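To get a feel for how much packet loss the supervisor timeout tolerates, you can count how many connection events fit inside it. This is simple arithmetic, sketched in Python below. The function name is ours, and peripheral latency is ignored to keep the example short.

```python
def events_before_timeout(interval_ms: float, timeout_ms: float) -> int:
    """Number of whole connection events that fit in one timeout window.

    Roughly, this many consecutive events have to fail before the
    controller declares the connection dead.
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return int(timeout_ms // interval_ms)


if __name__ == "__main__":
    # 30 ms connection interval with a 4 s supervisor timeout.
    print(events_before_timeout(30, 4000))
```

With a 30 ms interval and a 4 second timeout, over a hundred events in a row have to be lost, so a timeout disconnection usually points to a sustained problem rather than a single bad packet.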

Hardware Design

It's critical to start with a good design. Many of the connection issues we've debugged in the past are design related, and are very difficult to fix late in the development stage. Consider this: once your design is certified, any change to it would require expensive re-certification. Fixes to improve antenna performance sometimes require mechanical redesign. So it's always best and cheapest to do a great design at the start.

One of the most critical things to do is to validate a design. We often see product developers copy a manufacturer's reference design, such as one from Nordic, TI or Silicon Labs. But the copy doesn't work as well as the original because there are differences that impact the performance. That's why it's important to do a proper design and consider everything. In a proper validation, the radio is tested with the right equipment and under the right conditions.

When two BLE devices connect, they depend on the crystal in each design being accurate so that packets are decoded successfully. Without a good crystal, packets that are otherwise "good" can be decoded incorrectly and become corrupt. Even more commonly, a receiver or transmitter with an inaccurate crystal can miss its time slot in the connection event and miss the packet entirely. Some crystals have too loose a tolerance: the Bluetooth SIG specifications imply a maximum tolerance of ±40ppm, while most crystals have a ±10ppm or ±20ppm tolerance. Aging and temperature also affect the crystal. What's more, the loading capacitors for the crystal shift its frequency.
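To put ppm figures into perspective, the Python sketch below converts a tolerance into a carrier frequency offset and into timing drift. It is plain arithmetic under our own assumptions: a carrier near the middle of the 2.4GHz band (2.44GHz), and a single combined ppm figure for both devices. Which clock actually governs timing between connection events depends on the chipset, so treat the drift number as an order-of-magnitude guide.

```python
def carrier_offset_hz(tolerance_ppm: float, carrier_hz: float = 2.44e9) -> float:
    """Frequency error at the carrier for a given crystal tolerance."""
    return tolerance_ppm * 1e-6 * carrier_hz


def timing_drift_us(combined_ppm: float, elapsed_s: float) -> float:
    """Clock drift in microseconds accumulated over elapsed_s seconds.

    1 ppm over 1 second is 1 microsecond, so the product is already in us.
    """
    return combined_ppm * elapsed_s


if __name__ == "__main__":
    print(f"40 ppm at 2.44 GHz: {carrier_offset_hz(40) / 1e3:.1f} kHz offset")
    print(f"40 ppm over 1 s:    {timing_drift_us(40, 1.0):.0f} us drift")
```

A 40ppm error is close to 100kHz at the carrier, which is why a crystal that drifts with temperature or has the wrong loading capacitors can quietly eat into your link quality.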

Interference

Another big cause of BLE connection drops is interference in the 2.4GHz band. This can come from Wi-Fi, other Bluetooth devices, microwave ovens or other systems. BLE is quite resilient to interference, but even the best wireless system can suffer. For example, using BLE in crowded places with thousands of people can be extremely challenging. It also depends on your application. If you send a packet every once in a while, the packet loss will usually be lower than if you are doing a firmware download of a large file.

Some of the steps to rule out interference are:

  • Test your system with as little interference as possible - ideally you should use a screen room (a specially built room with RF isolation), but this isn't available to most. Instead, you should try to eliminate interference by going somewhere remote or turning off Bluetooth and Wi-Fi devices. Alternatively, use a shielded box.
  • Test both Central and peripheral using SMA connectors and with 70dB or so attenuation. This will limit path loss and interference and avoid any antenna effects.

Low Signal

For BLE to work properly, the signal level at the device has to be sufficient to decode the packets. When it's not, packets can't be received and the connection will drop. Low signal is typically caused by RF design problems or antenna design problems, including an antenna that's not tuned, has metal nearby or is very weak, so that a lot of the signal never reaches the receiver.

Anything in the RF signal chain can be the issue, but some of the steps to narrow it down are:

  • Eliminate the antennas from the equation by connecting systems directly with the right attenuation
  • Test both Central and peripheral using SMA connectors and with 70dB or so attenuation. This will limit path loss and interference and avoid any antenna effects.

Low Sensitivity or Output Power

There's a lot of variability in the chipsets that are used in a design. Some achieve a fantastic sensitivity of -97dBm, others are around -90dBm. This makes a huge difference in the performance of the system. The output power also matters, and you have devices all the way from 0dBm to +8dBm, sometimes more. It's critical to select the right chipset; see our Bluetooth LE Chipset Guide.
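To see why a few dB of sensitivity or output power matter, here is a rough link budget sketch in Python. The function names are ours, and the free-space path loss formula is an idealized textbook model that we are bringing in for illustration. Real environments with walls, bodies and enclosures lose considerably more.

```python
import math


def link_margin_db(tx_dbm: float, path_loss_db: float,
                   sensitivity_dbm: float) -> float:
    """Margin between the received signal and the receiver's sensitivity."""
    received_dbm = tx_dbm - path_loss_db
    return received_dbm - sensitivity_dbm


def free_space_path_loss_db(distance_m: float, freq_hz: float = 2.44e9) -> float:
    """Idealized free-space path loss (distance in meters, frequency in Hz)."""
    return 20 * math.log10(distance_m) + 20 * math.log10(freq_hz) - 147.55


if __name__ == "__main__":
    # 0 dBm output through 70 dB of loss, for two different receivers.
    print(link_margin_db(0, 70, -90))
    print(link_margin_db(0, 70, -97))
    print(round(free_space_path_loss_db(10.0), 1))
```

With the same 0dBm transmitter and 70dB of loss, a -97dBm receiver has 7dB more margin than a -90dBm one. That extra margin is what keeps a connection alive when someone walks between the two devices.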

Debugging Connections

A wireless sniffer is almost a must for any developer, although it can be too expensive for many. Even so, a sniffer only gives you a partial view of the situation. Capturing the HCI messages between the BLE Host and Controller can be very helpful, and is sometimes necessary, to understand what happened.

One of the first action items is to determine whether the signal level is an issue. Sometimes this is obvious, but one way to find out is to use one of the available utilities on an iPhone to see the signal level of advertising packets (for peripherals). If you're receiving packets at -85dBm or below (for example, -90dBm), you will have significant packet loss, and that will explain the disconnections. We use BlueNox BLE Scan, a utility developed at Argenox, for all our work.

BlueNox Scan BLE device information
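If your scanning utility lets you export RSSI readings, a quick way to apply the -85dBm rule of thumb is to look at the share of weak readings rather than a single sample. The Python sketch below does this. The function name and the sample values are made up for illustration.

```python
def weak_signal_fraction(rssi_dbm_samples, threshold_dbm: float = -85) -> float:
    """Fraction of RSSI readings at or below the threshold."""
    samples = list(rssi_dbm_samples)
    if not samples:
        raise ValueError("need at least one RSSI sample")
    weak = sum(1 for rssi in samples if rssi <= threshold_dbm)
    return weak / len(samples)


if __name__ == "__main__":
    # Made-up advertising RSSI readings in dBm.
    readings = [-60, -86, -90, -70]
    print(f"{weak_signal_fraction(readings):.0%} of readings are weak")
```

RSSI bounces around from packet to packet, so a high share of weak readings over a minute or two tells you more than any one value.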

On the embedded device side, debugging your product is best done with a UART output that prints data about the packets and the state. This is because using a debugger will often cause a disconnection if the CPU is stopped from meeting the Bluetooth timing requirements.

If your product is designed to talk to an iOS or Android device, then the HCI Bluetooth logs are available and you can use them.

There are a lot more details in our Ultimate Guide to debugging BLE Products.
