The black box, an airplane’s digital flight-data recorder, holds massive amounts of data, documenting the performance of engines, cockpit controls, hydraulic equipment, and GPS systems, typically at regular one-second intervals throughout a flight. Analysts have been studying black-box data to prevent accidents from ever occurring. Using software tools that can rapidly search the data, operators can flag problem areas and determine whether a plane needs to be pulled off the line for a physical inspection, or whether there are problems with flight procedures.

John Hansman, professor of aeronautics and astronautics and engineering systems at MIT, says today’s search methods are limited because operators must specify in advance which parameters to check. Hansman and his colleagues devised a detection tool that spots flight glitches without knowing ahead of time what to look for.

The technique uses cluster analysis, a type of data mining that groups data into subsets, or clusters, of flights sharing common patterns. Flight data that fall outside the clusters are flagged as abnormal; analysts can then inspect those flights more closely to see whether an anomaly is cause for alarm. The number of aircraft sensors has ballooned over the years, and a flight-data recorder on a Boeing 787, for example, is now able to record 2,000 flight parameters continuously for up to 50 hours, a much richer dataset than what is typically monitored.
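The flagging step can be illustrated with a minimal sketch. The article does not name the specific clustering algorithm, so the example below assumes a simple density rule in the spirit of density-based methods such as DBSCAN: a flight is flagged when too few other flights lie close to it. The function name `flag_outliers`, the thresholds `eps` and `min_neighbors`, and the six data points are all hypothetical.

```python
import math

def flag_outliers(points, eps, min_neighbors):
    """Return indices of points that have fewer than `min_neighbors`
    other points within distance `eps` (an assumed density rule)."""
    flagged = []
    for i, p in enumerate(points):
        # Count the other points that lie within eps of point i.
        neighbors = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= eps
        )
        if neighbors < min_neighbors:
            flagged.append(i)
    return flagged

# Made-up two-number summaries of six flights; the last one sits far
# from the rest and is the only one expected to be flagged.
flights = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (1.0, 0.8), (4.0, 3.5)]
abnormal = flag_outliers(flights, eps=0.5, min_neighbors=2)
print(abnormal)
```

In this toy case the first five flights sit close together and form the "normal" group, while the sixth has no neighbors within the chosen radius. A flagged flight is only a candidate for review; an analyst would still decide whether it matters.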

The team represented flight parameters as vectors, with each vector holding all the parameters from a single flight. They then plotted vectors from multiple flights in a multidimensional “hyperspace.” Vectors with similar measurements clustered together, representing “normal” flights. The outliers, or vectors outside the data clusters, signaled flights with potential problems.
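One way to picture the mapping from recordings to vectors is sketched below. This is an illustration under stated assumptions, not the team's code: it assumes every flight has already been resampled to the same number of time steps, and it rescales each dimension so that parameters measured in different units contribute comparably. The names `flight_vector` and `standardize`, and the sample readings, are hypothetical.

```python
from statistics import mean, pstdev

def flight_vector(samples):
    """Flatten per-time-step readings [(p1, p2, ...), ...] into one vector.
    Assumes all flights share the same number of time steps."""
    return [value for sample in samples for value in sample]

def standardize(vectors):
    """Rescale each dimension to zero mean and unit spread across flights."""
    columns = list(zip(*vectors))
    means = [mean(c) for c in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid
    # a division by zero.
    spreads = [pstdev(c) or 1.0 for c in columns]
    return [
        [(v - m) / s for v, m, s in zip(vec, means, spreads)]
        for vec in vectors
    ]

# Made-up readings: three time steps of (airspeed, engine speed) per flight.
recordings = {
    "flight_a": [(140.0, 60.0), (135.0, 55.0), (130.0, 50.0)],
    "flight_b": [(142.0, 61.0), (136.0, 56.0), (131.0, 50.0)],
    "flight_c": [(160.0, 80.0), (158.0, 79.0), (155.0, 50.0)],
}
vectors = [flight_vector(samples) for samples in recordings.values()]
scaled = standardize(vectors)
print(len(vectors[0]))  # 3 time steps x 2 parameters = 6 dimensions
```

With thousands of parameters sampled every second, each flight becomes a single point in a space with a very large number of dimensions, which is where the clustering and outlier search take place.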

Co-author Lishuai Li hazards a guess that cluster analysis could have helped prevent an accident on Dec. 8, 2005, when Southwest Airlines Flight 1248 slid off a runway at Chicago’s Midway International Airport and crashed into traffic while attempting to land in a snowstorm. Li says the accident was due in part to the pilots’ failure to activate the thrust reversers in time, but that the cluster-analysis technique could have picked out similarly late reverser activations in prior flights and flagged them as potential problems.

Hansman aims to test the technique on a richer dataset in the near future, though he acknowledges that getting hold of information from airline flight-data recorders can be tricky, mostly because of restrictive labor agreements on how the data can be used. He is currently in discussions with several airlines, as well as NASA, in hopes of acquiring more flight data.