A secret spy plane operated by the US Marshals hunted drug cartel kingpins in Mexico. A military contractor that tracks terrorists in Africa is also flying surveillance aircraft over US cities. In two stories published last week, BuzzFeed News revealed the activities of aircraft that their operators didn’t want to discuss.
These discoveries came not from anonymous tip-offs, but from training a computer to recognize known spy planes, then setting it loose on large quantities of flight-tracking data compiled by the website Flightradar24.
Here’s how we did it.
Surveillance aircraft often keep a low profile: The FBI, for example, registers its planes to fictitious companies to mask their true identity.
So BuzzFeed News trained a computer to find them, letting a machine-learning algorithm sift for planes whose flight patterns resembled those of aircraft operated by the FBI and the Department of Homeland Security. Last year, we reported on aerial surveillance by these planes, mapping thousands of flights over more than four months from mid-August to the end of December 2015.
First we made a series of calculations to describe the flight characteristics of almost 20,000 planes in the four months of Flightradar24 data: their turning rates, speeds and altitudes flown, the areas of rectangles drawn around each flight path, and the flights’ durations. We also included information on the manufacturer and model of each aircraft, and the four-digit squawk codes emitted by the planes’ transponders.
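To make those calculations concrete, here is a minimal Python sketch of how flight characteristics like these could be derived from one flight's position reports. It is an illustration, not the code used for this story: the tuple layout, the function names, the use of simple averages, and the flat-map approximation for the rectangle's area are all assumptions.

```python
import math
from statistics import mean

def heading_change(h1, h2):
    """Smallest signed change in degrees from heading h1 to h2 (so 350 -> 10 is +20)."""
    return (h2 - h1 + 180) % 360 - 180

def flight_features(points):
    """Summarize one flight.

    points: time-ordered tuples of
    (t_seconds, lat, lon, altitude_ft, speed_kt, heading_deg).
    This layout is hypothetical, not Flightradar24's actual format.
    """
    # Turning rate between consecutive reports, in degrees per second
    turn_rates = [
        abs(heading_change(a[5], b[5])) / (b[0] - a[0])
        for a, b in zip(points, points[1:])
        if b[0] > a[0]
    ]
    lats = [p[1] for p in points]
    lons = [p[2] for p in points]
    # Rectangle drawn around the flight path, converted to km with a
    # flat-map (equirectangular) approximation
    km_per_degree = 111.32
    height_km = (max(lats) - min(lats)) * km_per_degree
    width_km = (max(lons) - min(lons)) * km_per_degree * math.cos(math.radians(mean(lats)))
    return {
        "duration_min": (points[-1][0] - points[0][0]) / 60,
        "mean_turn_rate": mean(turn_rates),
        "mean_speed": mean(p[4] for p in points),
        "mean_altitude": mean(p[3] for p in points),
        "bbox_area_km2": width_km * height_km,
    }

# Made-up example: a plane flying one tight circle, reporting every 10 seconds
circling = [
    (i * 10,
     34.0 + 0.01 * math.sin(math.radians(30 * i)),
     -118.0 + 0.01 * math.cos(math.radians(30 * i)),
     5000, 100, (30 * i) % 360)
    for i in range(13)
]
# Made-up example: a plane flying due east in a straight line
straight = [(i * 10, 34.0, -118.0 + 0.01 * i, 30000, 450, 90) for i in range(13)]

circling_features = flight_features(circling)
straight_features = flight_features(straight)
```

In this toy data the circling plane's average turning rate is far higher than the straight-line plane's, which is zero. That is the kind of contrast a classifier can pick up on.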
Then we turned to an algorithm called the “random forest,” training it to distinguish between the characteristics of two groups of planes: almost 100 previously identified FBI and DHS planes, and 500 randomly selected aircraft.
The random forest algorithm makes its own decisions about which aspects of the data are most important. But not surprisingly, given that spy planes tend to fly in tight circles, it put most weight on the planes’ turning rates. We then used its model to assess all of the planes, calculating a probability that each aircraft was a match for those flown by the FBI and DHS.
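For readers curious about the mechanics, the Python sketch below builds a tiny random forest from scratch and runs it on made-up data. It is a teaching toy under stated assumptions, not the model used for this story: the three features, the numbers, the tree depth, and every function name are hypothetical, and a real analysis would normally rely on an established machine-learning library.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, features):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (score, f, t), score
    return best

def grow(rows, labels, n_try, rng, importance, depth=0, max_depth=4):
    """Grow one tree; each node considers only a random subset of features."""
    split = None
    if depth < max_depth and len(set(labels)) > 1:
        split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return sum(labels) / len(labels)  # leaf: share of surveillance planes
    score, f, t = split
    # Credit the chosen feature with the impurity it removed at this node
    importance[f] += len(labels) * (gini(labels) - score)
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    children = [
        grow([rows[i] for i in side], [labels[i] for i in side],
             n_try, rng, importance, depth + 1, max_depth)
        for side in (left, right)
    ]
    return (f, t, children[0], children[1])

def fit_forest(rows, labels, n_trees=50, seed=0):
    """Train each tree on a bootstrap sample; return trees and feature weights."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    n_try = max(1, round(n_features ** 0.5))
    importance = [0.0] * n_features
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow([rows[i] for i in idx], [labels[i] for i in idx],
                           n_try, rng, importance))
    total = sum(importance)
    return forest, [v / total for v in importance]

def tree_prob(tree, row):
    """Follow one tree's splits down to a leaf and return its value."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def forest_prob(forest, row):
    """Average the trees' votes into a probability of being a match."""
    return sum(tree_prob(t, row) for t in forest) / len(forest)

# Made-up training data: [turn rate, speed, altitude]. Only the turn rate
# differs between the two groups; speed and altitude are pure noise.
data_rng = random.Random(1)

def fake_plane(is_spy):
    turn = data_rng.uniform(1.5, 3.0) if is_spy else data_rng.uniform(0.0, 0.8)
    return [turn, data_rng.uniform(80, 200), data_rng.uniform(2000, 12000)]

labels = [1] * 20 + [0] * 100  # 1 = known surveillance plane, 0 = random aircraft
rows = [fake_plane(y) for y in labels]

forest, importance = fit_forest(rows, labels)
spy_like = forest_prob(forest, [2.4, 120, 6000])
ordinary = forest_prob(forest, [0.1, 120, 6000])
```

Because only the turn rate separates the two groups in this invented data, the forest should give that feature most of the weight, and score the tightly turning plane as a much likelier match than the one flying straight.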
(See here for details of the machine learning, and notes on the candidate planes it identified.)
The algorithm was not infallible: Among other candidates, it flagged several skydiving operations that circled in a relatively small area, much like a typical surveillance aircraft. But as an initial screen for candidate spy planes, it proved very effective. In addition to aircraft operated by the US Marshals and the military contractor Acorn Growth Companies, covered in our previous stories, it highlighted a variety of planes flown by law enforcement, and by the military and its contractors.
Some of these aircraft use technologies that challenge our assumptions about when and how we're being watched, tracked, or listened to. It's only by understanding when and how these technologies are used from the air that we'll be able to debate the balance between effective law enforcement, national security, and individual privacy.
Here are five of the most intriguing examples we found....