The UK’s National Crime Agency (NCA) repurposed its cloud-based data analytics platform to help identify threats to life in messages sent by suspected criminals over the encrypted EncroChat phone network.
After placing a “software implant” on an EncroChat server in Roubaix, investigators from France’s digital crime unit infiltrated the encrypted phone network in April 2020, capturing 70 million messages.
The operation, supported by Europol, led to arrests in the Netherlands, Germany, Sweden, France and other countries of criminals involved in drug trafficking, money laundering and firearms offences. More than 1,100 people have been convicted under the NCA’s investigation into the French EncroChat data, Operation Venetic, which has led to more than 3,000 arrests across the UK, and more than 2,000 suspects being charged.
UK police have seized nearly six and a half tonnes of cocaine, more than three tonnes of heroin and almost 14 and a half tonnes of cannabis, along with 173 firearms, 3,500 rounds of ammunition and £80m in cash from organised crime groups.
Europol supplied British investigators with overnight downloads of data gathered from phones identified as being in the UK, through Europol’s Large File Exchange, part of its Siena secure computer network.
With an estimated 9,000 UK-based EncroChat users, the NCA needed to quickly process a large volume of potentially incriminating data, so it tasked its National Cyber Crime Unit (NCCU) with categorising the material for human investigators to analyse. To automate preprocessing once the EncroChat material arrived, NCCU staff added pre-built capabilities from Amazon Web Services (AWS) to the unit’s cloud data platform, including machine learning software capable of extracting text, handwriting and data from EncroChat text messages and photographs.
“For us, it’s about preventing harm and protecting the public,” said an NCCU spokesperson, quoted in a technology company case study. “We had a flood of unstructured data and had to operate swiftly to reduce harm to the public. Our data scientists could probably have devised ways of analysing this data themselves. But when we have more than 200 threats to life, we can’t afford to spend time doing that. Using off-the-shelf services from AWS enabled us to go from a standing start to a full capability in the space of hours. If we were to build it ourselves from scratch, that might have taken over a month of effort.”
From 10 to 300 users in two weeks
The NCCU was able to scale up its existing data analysis platform from tens of users in the NCA to 300 within two weeks of being informed of the EncroChat investigation.
Once the NCA had processed the historic messages extracted from EncroChat’s in-phone database, called Realm, along with live text messages sent from thousands of phones, it sent intelligence packages in the form of CSV files to Regional Organised Crime Units; the Police Service of Northern Ireland; Police Scotland; the Metropolitan Police; Border Force; the Prison Service; and HM Revenue & Customs.
These organisations were then responsible for analysing the data for further indications of threats to life, the drugs trade and other criminal activity.
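The packaging step described above, in which processed messages are split into one CSV file per receiving organisation, could be sketched roughly as follows. This is an illustrative sketch only, not the NCA's implementation: the article does not describe the real schema or tooling, so the column names (`handle`, `timestamp`, `category`, `text`), the `agency` routing field, the `build_packages` helper and the sample rows are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the article does not describe the real schema.
FIELDS = ["handle", "timestamp", "category", "text"]


def build_packages(records):
    """Group processed records by recipient agency and render one CSV per agency."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["agency"]].append(record)

    packages = {}
    for agency, rows in grouped.items():
        buffer = io.StringIO(newline="")
        # extrasaction="ignore" drops the routing field from the output columns.
        writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        # ISO 8601 timestamps sort chronologically as plain strings.
        writer.writerows(sorted(rows, key=lambda r: r["timestamp"]))
        packages[agency] = buffer.getvalue()
    return packages


# Invented sample data, for illustration only.
sample = [
    {"agency": "Police Scotland", "handle": "user-a",
     "timestamp": "2020-04-02T10:15:00", "category": "threat-to-life",
     "text": "placeholder message two"},
    {"agency": "Border Force", "handle": "user-b",
     "timestamp": "2020-04-01T08:00:00", "category": "drugs",
     "text": "placeholder message one"},
    {"agency": "Police Scotland", "handle": "user-c",
     "timestamp": "2020-04-01T09:30:00", "category": "firearms",
     "text": "placeholder message three"},
]

packages = build_packages(sample)
for agency, content in packages.items():
    print(agency)
    print(content)
```

Building each package in memory keeps the sketch self-contained. In practice the output would more plausibly be written to access-controlled storage, which the article does not detail.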
The NCCU had been developing a cloud-based platform to analyse data for over three years before the EncroChat operation. Digital transformation consultancy Contino won the contract to build the platform on AWS.
By shifting from its on-premises infrastructure to the cloud, the NCCU said it has been able to spend more time on investigations, and less time on procuring and maintaining hardware and managing IT infrastructure.
“Previously, we had on-premises infrastructure, which required a lot of management and prevented us from doing the data science we wanted to do,” said an NCCU spokesperson. “Our small tech team spent a considerable amount of time building and managing infrastructure.
“This was a problem, because our recruitment and retention are based on providing people with engaging and challenging work fighting cyber crime, not administering IT.”
Advanced data processing
Within a year of beginning its pilot of the analytics platform – which used services including Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Relational Database Service (Amazon RDS) – the NCCU introduced more advanced data processing capabilities.
These included the Amazon EMR big data platform, which helps scale and automate data processing, and AWS Glue, a serverless data integration service that can combine and organise data from a wide range of sources.
Because the NCA and NCCU handle sensitive and therefore potentially harmful data, they also needed the platform to be secure, so they used Amazon GuardDuty to monitor network activity for signs of malicious behaviour.
“Moving data outside of our perimeter is not a decision we take lightly,” said an NCCU spokesperson. “The transparency of AWS, its shared security model, and the access we had to documentation and experts assisted us on that journey considerably.”
Dutch drug-talk software
At the start of May 2021, the Netherlands Forensic Institute (NFI) announced that its forensic big data analysis (FBDA) team had similarly modified a computer model it had previously developed, as part of a research and development project, to scan large volumes of communications data for drug-related messages sent between suspected criminals.
The NFI told Computer Weekly at the time that the “drug-talk” software was developed in-house before being modified for “threat-to-life” detection and passed on to the police.
Using deep learning techniques, the FBDA team initially trained the model’s neural network in generic language comprehension by having it read webpages and newspaper articles, before introducing it to the messages of suspected criminals, so it could learn how they communicate.
“The team then began using similar techniques to develop a model to recognise life-threatening messages,” said the NFI in a statement. “That model was ready when the chats from EncroChat poured into the police in Driebergen on 1 April.”