A case study measuring periodic network traffic from smart lightbulbs.
Suppose that your home router could decide for itself what each Internet of Things (IoT) device in your home is allowed to do. It might decide, for example, that your smart fridge should only access its manufacturer's server. In order to do that, your router would need to establish what type of device it is dealing with. It would also need to know what forms of network activity devices of that type require in order to work properly and remain secure. Over the last year, various academic articles have been published describing the use of machine learning algorithms to determine device types. We therefore thought it would be useful to reflect on a topic that's important in relation to such algorithms, namely periodic network traffic.
Recognising IoT device types
Various types of IoT device can be connected to a home network, including smart lightbulbs, TVs and vacuum cleaners. Over the last year, several universities have published methods of classifying the IoT device type on the basis of the associated network traffic patterns. AuDI and N-BaIoT are two of the published methods. Being able to classify the device type is important, because devices connected to a home network don't always reveal their type themselves (even though it's technically possible to do so, e.g. using the Manufacturer Usage Description Specification or MUD). Both methods are based on machine learning. They involve analysing the network traffic associated with your IoT devices to distinguish patterns. Although they make use of different algorithms, both methods assume that devices perform activities on a periodic basis. Your smart fridge might check for firmware updates every Sunday evening, for example. It's also assumed that devices of the same type undertake broadly similar periodic activities. So TVs are expected to establish video streams and to be active mainly in the evening, while smart thermometers send packets maybe once every five minutes.
Extracting time series
SPIN is an open-source system that we've developed to protect the Internet and its users against insecure smart devices in home networks. In recent months, we've been exploring the potential of the classification methods referred to above for enhancing SPIN. The first practical step in that process was to visualise periodic network traffic. We decided to do that on the basis of the AuDI method of converting network traffic data into time series. First, the network traffic has to be separated into distinct flows. In this context, a flow is defined as a series of network packets sent from an IoT device using a given communication protocol (NTP, ARP, RTSP, etc). Within this definition (unlike the general definition of a flow), the destination is irrelevant, because what matters is not who a device is communicating with, but what it's doing. Next, each flow has to be converted into a binary time series with a sample rate of one measurement per second. In each case, the measured value is either 1, indicating that one or more packets were sent in the relevant one-second period, or 0, indicating that no packets were sent. The AuDI paper describes how the time series are used for signal analysis and classification. In the present context, we are concerned only with the visualisation of these series.
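The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not SPIN's or AuDI's actual code: the function name `flows_to_binary_series`, the packet tuple layout and the example packets are all assumptions made for the sake of the example.

```python
import math
from collections import defaultdict

def flows_to_binary_series(packets, start, duration):
    """Build one binary, one-sample-per-second time series per flow.

    packets: iterable of (timestamp_in_seconds, device, protocol) tuples
             (a hypothetical input format chosen for this sketch).
    A flow is keyed by (device, protocol); the destination is ignored.
    """
    series = defaultdict(lambda: [0] * duration)
    for timestamp, device, protocol in packets:
        second = math.floor(timestamp - start)
        if 0 <= second < duration:
            # 1 means "one or more packets in this second".
            series[(device, protocol)][second] = 1
    return dict(series)

# Made-up packets from a single device, for illustration only.
packets = [
    (0.2, "bulb", "DNS"),
    (0.7, "bulb", "DNS"),       # same second as the previous packet
    (3.1, "bulb", "ARP"),
    (6.0, "bulb", "UDP/6666"),
    (9.4, "bulb", "UDP/6666"),
]
series = flows_to_binary_series(packets, start=0, duration=10)
```

Here the two DNS packets fall within the same second and therefore yield a single 1, which is what makes the series binary rather than a packet count.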
Case study: periodic network traffic from smart lightbulbs
We used the approach described above to generate time series for four smart lightbulbs of various brands. To that end, we connected the lightbulbs to SPIN for twenty-four hours and gathered the time series. We then visualised the time series with two questions in mind: Is the network traffic from IoT devices periodic? And do devices of the same type undertake similar periodic activities?
Figure 1: Flows from a Tuya lightbulb over a five-minute period. Each colour represents a unique flow, and the circles indicate whether the flow was active at the measurement moment. See Table 1 for a link to the full interactive plot.
Figure 1 shows the active flows from a Tuya lightbulb in the first five minutes after the lightbulb is switched on. In the left-hand region of the graph, you will see a lot of coloured circles, indicating numerous active flows. For example, there was DNS traffic (two light blue circles in quick succession, UDP port 53) and HTTP traffic (three orange circles, TCP port 80). Quite soon, however, the signal seems to stabilise. The red and blue flows are both active about once a minute (ARP and MQTT on TCP port 1883, respectively), while the yellow flow is active every three seconds (UDP port 6666). If you download the interactive plot (see Table 1), you can zoom out and examine other periods. One of the things you'll notice then is that there is DHCP traffic every 5 hours and 45 minutes (purple circles, UDP port 67). There is also some non-periodic activity, or activity whose periodicity cannot be discerned from a monitoring period of this length.
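Periods such as "every three seconds" can also be read off a binary time series programmatically. The sketch below uses the median gap between active samples, which is a deliberately simple stand-in and not the signal analysis described in the AuDI paper. The function name `estimate_period` and the synthetic series are assumptions for illustration.

```python
from statistics import median

def estimate_period(series):
    """Return the median gap in seconds between active samples, or None
    if the flow was active fewer than two times."""
    active = [i for i, value in enumerate(series) if value == 1]
    gaps = [later - earlier for earlier, later in zip(active, active[1:])]
    return median(gaps) if gaps else None

# Synthetic flow that is active every 3 seconds for one minute.
synthetic = [1 if second % 3 == 0 else 0 for second in range(60)]
period = estimate_period(synthetic)
```

Note that a flow with bursts spanning several consecutive seconds would produce many gaps of one second, so a real implementation would probably need to merge bursts first. A flow such as the DHCP traffic mentioned above would also yield only a handful of gaps in a twenty-four-hour capture.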
Figure 2: Observed flows from a Baixin lightbulb over a five-minute period. See Table 1 for a link to the full interactive plot.
We can conclude that traffic from the Tuya lightbulb is (semi-)periodic. Although our test group was small, it is sufficient to demonstrate that different lightbulbs can exhibit almost identical periodic activity. That much is apparent from Figure 2, which shows the first two operational minutes of a Baixin lightbulb. While the intervals between the periods of activity may differ (see, for example, the orange circles, TCP port 80), there is considerable broad similarity. The behaviour of the two lightbulbs showed similarities later in the time series as well: both generated ICMP traffic after about six hours, as well as activity on TCP port 56010. We therefore suspect that the two makes of bulb have the same firmware loaded. The periodic activities of the other two lightbulbs were very different, however (see Table 1). With the Omeran lightbulb, the ARP, DNS and TCP port 8805 flows are active every two minutes. By contrast, the Mi Led is active via UDP port 8053 every fifteen seconds and via ARP every thirty seconds. In other words, devices of the same type sometimes have similar activity patterns, but not always. Of course, it might be that the periodic activities of other smart lightbulbs resemble those of the Omeran and Mi Led. Data points from more IoT devices would be required to determine whether that is the case.
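One way to make "similar periodic activity" concrete is to compare, per flow, the periods observed for two devices. The sketch below is a hypothetical measure of our own devising for illustration, not part of AuDI or N-BaIoT. The function name `profile_similarity` and the 20 per cent tolerance are assumptions, and the periods are the approximate values quoted in the text above (in seconds).

```python
def profile_similarity(a, b, tolerance=0.2):
    """Fraction of all observed flows that both devices share with a
    similar period (within a relative tolerance)."""
    all_flows = set(a) | set(b)
    if not all_flows:
        return 0.0
    matching = sum(
        1
        for flow in set(a) & set(b)
        if abs(a[flow] - b[flow]) <= tolerance * max(a[flow], b[flow])
    )
    return matching / len(all_flows)

# Approximate periods in seconds, taken from the observations above.
tuya = {"ARP": 60, "TCP/1883": 60, "UDP/6666": 3}
omeran = {"ARP": 120, "DNS": 120, "TCP/8805": 120}
mi_led = {"UDP/8053": 15, "ARP": 30}

tuya_vs_omeran = profile_similarity(tuya, omeran)
omeran_vs_mi_led = profile_similarity(omeran, mi_led)
```

With these figures, the only flow the devices have in common is ARP, and its period differs by a factor of two or more in each pair, so both comparisons score zero. That matches the observation that these lightbulbs behave very differently despite being the same type of device.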
| Device | First 5 minutes | Full data set |
|---|---|---|
| Tuya | Download plot | Download plot |
| Baixin | Download plot | Download plot |
| Omeran | Download plot | Download plot |
| Mi Led | Download plot | Download plot |
Table 1: Links to interactive plot HTML files
Additional time series
In view of the findings outlined above, we want to gather a large number of additional time series. To make that possible, we'll be adding a feature to SPIN so that users can visualise time series and upload them to us. We'll then apply the AuDI and N-BaIoT methods to the additional time series to build our own classification model. If the model performs well, device type classification functionality may be added to SPIN in due course. The compiled data may also be useful to researchers: methods are often evaluated using data from a controlled lab environment, whereas we want to compile time series from real users in typical home environments. Naturally, use of the planned data upload functionality will be optional, and the new features will be implemented with privacy and security in mind. We are open to suggestions on the best way to do that responsibly. We'd also like to hear whether you're interested in supporting this initiative by uploading time series and, if so, on what conditions.
This was originally published on the SIDN Labs blog.