Realtime detection of Russian crypto-phone Era with Flink DataStream and Stateful Functions

Tymur Yarosh
DevOops World … and the Universe
7 min read · Mar 25, 2022

Last night, I woke up three times to an air raid alert. At the time of writing, Russia is invading Ukraine, or at least trying to. But there are many reasons why it can’t succeed. One is the inability of Russian military authorities to choose the right tool for the job. As engineers, we know that improper use of a tool can doom a project. The Russian military is different.

I’m neither a military expert nor a professional cryptographer, so my knowledge of Russian military communication and encryption analysis may be inaccurate. This article focuses on the interoperability between the Flink DataStream API and Flink Stateful Functions.

Era

Probably Era

What is Era? Era is a Russian matryoshka-phone that transmits data over a secured connection. It’s like Skype, empowered by nanotechnologies. Russians use it to sync with commanders and decide which hospitals, maternity homes, residential buildings, universities, and other civilian infrastructure to bomb. It’s an excellent tool for Russian war criminals, but it has one drawback: it uses a 3G/4G network. I guess it’s OK to depend on your own carriers’ cellular network when protecting your own country, but I wouldn’t use it for an invasion. Depending on the enemy’s cellular infrastructure reveals your positions and can potentially lead to data leaks.

Location detection

Let’s design the detection process. To determine the terrorists' position, we need our carriers to publish an event whenever a device is in the service area of a base station. Then the carrier captures the data the device transfers. The intelligence agency downloads the data to identify the encryption algorithm or sequences similar to an Era connection and then triangulates the position using the locations of the nearby base stations. This should work, but capturing and analyzing all the data would require enormous storage, network, and computing resources. Moreover, people wouldn’t like it due to privacy issues. So, we need to capture the data of suspicious devices only. To keep things simple, we’ll consider a device suspicious if any of the following is true:

  • an active phone number was used by another device recently (a Russian terrorist could steal the phone from a civilian and move the SIM card to Era)
  • the device location changed faster than the owner could travel (a Russian terrorist could steal the IMEI)
  • the device or the phone number was first registered recently (a Russian terrorist could steal or buy a new SIM card from the carrier)

Then the whole process is:

  1. Receive service area events when a device is near a base station
  2. Find suspicious anomalies
  3. Request captured data of a suspicious device from the carrier
  4. Identify Era-like data
  5. Triangulate the location

Check out this fantastic article on position triangulation if you would like to learn the process at a high level. I will not cover this topic here.

Architecture

Now that the process is defined, let’s design the software. It consists of three logical parts: the carrier, the DataStream application, and the Stateful Functions application.

The carrier generates DeviceRegisteredEvents when a device is discovered in the service area of a base station:

DeviceRegisteredEvent example

Flink ingests the events using the DataStream API and forwards them to Stateful Functions to detect suspicious devices. While a device is being evaluated, Stateful Functions buffers all of its events. The Stateful Functions application forwards the buffered and future events of suspicious devices back to the DataStream application. The DataStream application then processes these DeviceRegisteredEvents to find the location.

Architecture

DataStream and Stateful Functions interoperability

The Flink DataStream API is a powerful tool for building acyclic graphs. That means data flows through operators in one direction only: forward. Stateful Functions implements a free data flow model where data can go from function to function in any direction. Combining both APIs in the same application enables powerful data streaming use cases. It starts with a streaming job. First, we obtain the environment and define Kafka ingresses for DeviceRegisteredEvents and DataAccessEvents:

Ingresses

We register sources and then turn them into Stateful Functions ingresses by mapping incoming data to a RoutableMessage, which associates the data with the target function. The remote Java SDK for Stateful Functions requires messages to have an explicit type, so we turn the incoming event data into a TypedValue. The implementation is almost the same for both kinds of ingested events. Here is an example from DeviceRegisteredEvent:

toTypedValue() implementation

Then we define the Stateful Functions module by registering ingresses, remote functions, and the egress:

Stateful Functions module

I’ll explain the egress later in this article. For now, you only need to know that a Stateful Functions egress can be turned into a DataStream so that the DataStream API can consume events emitted by Stateful Functions. The full interoperability is based on the following principles:

  • Send DataStream events to Stateful Functions by turning data streams into ingresses
  • Send Stateful Functions events to DataStream by turning Stateful Functions egresses into data streams
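The first principle comes down to attaching a target function address to every event before it enters Stateful Functions. Here is a minimal plain-Java sketch of that idea. It deliberately does not use the Flink or Stateful Functions classes: `Address`, `Routable`, and `toDeviceMessage` are hypothetical stand-ins, and the names `era/device` and `era/DeviceRegisteredEvent`, as well as the choice of the IMEI as the function id, are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

class RoutingSketch {
    // Assumed names: the article does not state the function type or the message type name.
    static final String DEVICE_FUNCTION = "era/device";
    static final String EVENT_TYPE = "era/DeviceRegisteredEvent";

    // Wraps a serialized event with the address of the function instance that should receive it.
    static Routable toDeviceMessage(String imei, String eventJson) {
        byte[] payload = eventJson.getBytes(StandardCharsets.UTF_8);
        return new Routable(new Address(DEVICE_FUNCTION, imei), EVENT_TYPE, payload);
    }

    public static void main(String[] args) {
        Routable message = toDeviceMessage("356938035643809", "{\"imei\":\"356938035643809\"}");
        System.out.println(message.target() + " <- " + message.typeName());
    }
}

// Hypothetical stand-ins for illustration only; these are NOT Stateful Functions classes.
// An address is a function type plus the id of one logical instance (here: one device).
record Address(String functionType, String id) {}

// A payload with an explicit type name, addressed to one function instance.
record Routable(Address target, String typeName, byte[] payload) {}
```

Keying the address by IMEI means every event of one device reaches the same logical function instance, which is what lets a function behave like a business entity with its own state.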

You can check out the whole job on GitHub.

Find suspicious device

The Stateful Functions API has many benefits; my favorite is that functions can represent business entities. This time is no exception. We need two functions: Device and Phone. A Device has a status: evaluating, trusted, or untrusted (suspicious). Evaluating means that the application doesn’t yet know whether the device is trusted or untrusted, so the evaluation process is ongoing. When a DeviceRegisteredEvent is received during the evaluation process, the Device:

  1. Shares DeviceRegisteredEvent with the Phone function to associate the phone number with the device
  2. Updates the device’s location history
  3. Appends the phone number to the device’s history
  4. Requests anomalies from the Phone function
  5. Buffers DeviceRegisteredEvent

DeviceRegisteredEvent handling when Device status is “evaluating”

When the Phone function receives a DeviceRegisteredEvent, it appends the device to its history. So if multiple devices use the same phone number, the Phone function can detect it. When the Device asks the Phone to find anomalies, the Phone function compares the Device’s phone number history with the Phone’s device history. Differences mean that multiple devices use the same phone number. This is considered an anomaly.
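One simplified reading of the Phone-side check can be sketched in plain Java. This is an assumption about the logic, not the article's implementation: `PhoneSketch` and its methods are hypothetical names, devices are identified by an IMEI string, and an anomaly is taken to mean that the phone number has been seen on any device other than the one asking.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of the Phone function's state for a single phone number.
class PhoneSketch {
    // Every device (IMEI) that has registered with this phone number, in order of first appearance.
    private final Set<String> deviceHistory = new LinkedHashSet<>();

    // Called for each DeviceRegisteredEvent shared by a Device function.
    void onDeviceRegistered(String imei) {
        deviceHistory.add(imei);
    }

    // Devices other than the requester that have used this phone number.
    Set<String> otherDevices(String requestingImei) {
        Set<String> others = new LinkedHashSet<>(deviceHistory);
        others.remove(requestingImei);
        return others;
    }

    // Assumed rule: the number being seen on any other device counts as an anomaly.
    boolean hasAnomaly(String requestingImei) {
        return !otherDevices(requestingImei).isEmpty();
    }

    public static void main(String[] args) {
        PhoneSketch phone = new PhoneSketch();
        phone.onDeviceRegistered("imei-A");
        phone.onDeviceRegistered("imei-B");
        System.out.println("Anomaly for imei-A: " + phone.hasAnomaly("imei-A"));
    }
}
```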

When the Phone completes its anomaly search, it shares the result with the Device function via an AnomaliesSearchCompletedEvent. Then the Device performs its own anomaly search, which consists of:

  1. Location validation (travel time across location history entries must be within a threshold)
  2. Device validation
    2.1 Multiple phone numbers used by the same device
    2.2 First registration was close to the date the war started
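The checks listed above can be sketched in plain Java as follows. This is an illustration under stated assumptions rather than the article's code: `LocationEntry`, `DeviceAnomalySketch`, the speed threshold, and the 14-day window are all hypothetical, the location history is assumed to be in chronological order, and distance is computed with the standard haversine formula.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Set;

class DeviceAnomalySketch {
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle (haversine) distance between two entries, in kilometers.
    static double distanceKm(LocationEntry a, LocationEntry b) {
        double dLat = Math.toRadians(b.lat() - a.lat());
        double dLon = Math.toRadians(b.lon() - a.lon());
        double h = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(a.lat())) * Math.cos(Math.toRadians(b.lat()))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
    }

    // 1. Location validation: true if any consecutive pair is farther apart than
    //    maxSpeedKmh would allow in the elapsed time (history assumed chronological).
    static boolean movedTooFast(List<LocationEntry> history, double maxSpeedKmh) {
        for (int i = 1; i < history.size(); i++) {
            LocationEntry prev = history.get(i - 1);
            LocationEntry next = history.get(i);
            double hours = Duration.between(prev.seenAt(), next.seenAt()).toMillis() / 3_600_000.0;
            if (distanceKm(prev, next) > maxSpeedKmh * Math.max(hours, 0.0)) {
                return true;
            }
        }
        return false;
    }

    // 2.1 Multiple phone numbers used by the same device.
    static boolean usedMultipleNumbers(Set<String> phoneNumbers) {
        return phoneNumbers.size() > 1;
    }

    // 2.2 First registration within `window` of the reference date (before or after it).
    static boolean registeredNear(Instant firstSeen, Instant referenceDate, Duration window) {
        return Duration.between(firstSeen, referenceDate).abs().compareTo(window) <= 0;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2022-03-01T10:00:00Z");
        List<LocationEntry> history = List.of(
                new LocationEntry(50.4501, 30.5234, t0),                     // Kyiv
                new LocationEntry(49.8397, 24.0297, t0.plusSeconds(600)));   // Lviv, 10 minutes later
        System.out.println("Moved too fast: " + movedTooFast(history, 200.0));
        System.out.println("Recent registration: " + registeredNear(
                Instant.parse("2022-02-20T00:00:00Z"),
                Instant.parse("2022-02-24T00:00:00Z"),
                Duration.ofDays(14)));
    }
}

// Hypothetical history entry: where and when the device was seen.
record LocationEntry(double lat, double lon, Instant seenAt) {}
```

Comparing the distance against `maxSpeedKmh * hours` instead of dividing by the elapsed time avoids a division by zero when two events carry the same timestamp. A real implementation would likely also need a tolerance for neighboring base stations that see the device at the same moment.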

If any anomaly is found, the Device function calls the Carrier API to start capturing data for this particular IMEI. Then, the carrier begins sending DataAccessEvents:

Example of DataAccessEvent

The Device function catches the events and analyzes the data. If it looks like Era, the device is considered suspicious. Its status changes to untrusted, and the buffered events are sent to the egress.

Otherwise, the device becomes trusted, and no future events are sent to the egress. EraRecognizer is stubbed since it’s out of the scope of this article. The whole Stateful Functions application is available on GitHub.
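The status transitions and buffering described in this section can be summarized with a small plain-Java state machine. It is a sketch, not the article's function: `DeviceSketch`, its method names, and the use of plain strings for events are hypothetical, and the egress is modeled as a simple `Consumer`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of one Device function instance; events are plain strings for brevity.
class DeviceSketch {
    enum Status { EVALUATING, TRUSTED, UNTRUSTED }

    private Status status = Status.EVALUATING;
    private final List<String> buffer = new ArrayList<>();
    private final Consumer<String> egress;

    DeviceSketch(Consumer<String> egress) {
        this.egress = egress;
    }

    Status status() {
        return status;
    }

    // Buffers while evaluating, forwards when untrusted, drops when trusted.
    void onDeviceRegistered(String event) {
        switch (status) {
            case EVALUATING -> buffer.add(event);
            case UNTRUSTED -> egress.accept(event);
            case TRUSTED -> { }
        }
    }

    // Ends the evaluation: a suspicious device flushes its buffered events to the egress.
    void onEvaluationCompleted(boolean looksLikeEra) {
        if (status != Status.EVALUATING) {
            return;
        }
        if (looksLikeEra) {
            status = Status.UNTRUSTED;
            buffer.forEach(egress);
        } else {
            status = Status.TRUSTED;
        }
        buffer.clear();
    }

    public static void main(String[] args) {
        DeviceSketch device = new DeviceSketch(event -> System.out.println("egress: " + event));
        device.onDeviceRegistered("event-1");
        device.onEvaluationCompleted(true);
        device.onDeviceRegistered("event-2");
    }
}
```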

Locate suspicious device

Suspicious DeviceRegisteredEvents are now available from the egress. To access them, we create a DataStream of TypedValue. Then we map each TypedValue to a DeviceRegisteredEvent and create a sliding window per device. The sliding window lets us refine and publish the position from the three latest events each time a new event is received. The last step is to find the location.

Processing suspicious DeviceRegisteredEvents code

FindLocationFunction is stubbed since it’s out of the scope of this article.
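To illustrate the per-device window of the three latest events without a Flink dependency, here is a minimal plain-Java sketch. `LatestThreeWindowSketch` and its `add` method are hypothetical, and events are plain strings. In the DataStream API, a keyed count window that slides by one element (`countWindow(3, 1)` on a keyed stream) is one way to get similar behavior; whether the job is built exactly that way is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a keyed sliding window holding the three latest events per device.
class LatestThreeWindowSketch {
    static final int WINDOW_SIZE = 3;

    // One window per device, keyed by IMEI.
    private final Map<String, Deque<String>> windows = new HashMap<>();

    // Adds an event and returns the device's latest events, oldest first (at most three).
    List<String> add(String imei, String event) {
        Deque<String> window = windows.computeIfAbsent(imei, key -> new ArrayDeque<>());
        window.addLast(event);
        if (window.size() > WINDOW_SIZE) {
            window.removeFirst();
        }
        return new ArrayList<>(window);
    }

    public static void main(String[] args) {
        LatestThreeWindowSketch windows = new LatestThreeWindowSketch();
        for (String station : List.of("station-1", "station-2", "station-3", "station-4")) {
            System.out.println(windows.add("imei-A", station));
        }
    }
}
```

Each returned list is what a location function would receive: once three base stations are available, every new event yields a refreshed position estimate.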

Instead of conclusions

Russians launched a full-scale war against Ukraine. Unable to beat the Ukrainian army, Russians terrorize civilians. They destroy our cities, kidnap officials and journalists, and use unconventional weapons against ordinary people. Russians have already killed more than 100 children, and this number is growing. Help Ukraine stop the war.
