Frameworks Coordinate Scientific Data Management
Saturday, 01 January 2011
Voyager 2 sailing beyond the far boundary of the solar system. The rover Opportunity churning across the red soil of Mars. Cassini-Huygens imaging the moons of Saturn. Capable of journeying well beyond the reach of human explorers, NASA’s robotic missions have probed the distant reaches of space, sending back to Earth streams of unique data and images essential to developing an understanding of our universe. These returns are ultimately housed in NASA’s Planetary Data System (PDS), an archive of data products derived from NASA’s robotic missions, from Galileo to Pioneer to Stardust and more. Appropriately massive for the information it contains, the PDS is distributed across the Nation and organized in eight nodes in conjunction with a host of NASA partner institutions.
To help researchers draw the information they need from the ever-growing repositories of the PDS, in 1998 Daniel Crichton, program manager and principal computer scientist at NASA’s Jet Propulsion Laboratory, designed a unique software framework called Object Oriented Data Technology (OODT), which transformed the PDS into an accessible virtual knowledge system. “The idea of OODT was to be able to capture all the data, the history of the data, and be able to tie and link all that together into an integrated but distributed system,” says Crichton.
OODT primarily functions as a set of building blocks for constructing systems that capture and manage complex parcels of scientific data, Crichton explains. Its cumulative power allows users to connect multiple, distributed databases and other data sources and then to search for and pull together information in varied data formats, building and populating databases with the aggregated results. During the software’s development, Crichton was careful to separate software architecture from data architecture, meaning OODT functions as a general-use tool that can plug into existing systems and be tailored and extended for their data. In addition to the PDS, NASA also uses OODT for multiple Earth science missions.
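To make the idea concrete, the following is a minimal Python sketch of the general pattern described above: several independent data sources, each with its own record format, are wrapped so that one query can search all of them and pull the matches into a single result set. It is not OODT's actual API. Every name here (`Product`, `SourceAdapter`, `federated_search`, the `node_a` and `node_b` sources, and the sample records) is hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Product:
    """Common metadata record that every source is mapped into (hypothetical schema)."""
    source: str
    target: str
    instrument: str
    uri: str


class SourceAdapter:
    """Wraps one data source and translates its native records into Products."""

    def __init__(self, name: str, records: Iterable[Any],
                 to_product: Callable[[str, Any], Product]) -> None:
        self.name = name
        self._records = list(records)
        self._to_product = to_product

    def search(self, **criteria: str) -> Iterator[Product]:
        # Translate each native record, then keep it only if every
        # requested field matches exactly.
        for raw in self._records:
            product = self._to_product(self.name, raw)
            if all(getattr(product, key) == value for key, value in criteria.items()):
                yield product


def federated_search(adapters: Iterable[SourceAdapter], **criteria: str) -> List[Product]:
    """Run the same query against every source and aggregate the matches."""
    results: List[Product] = []
    for adapter in adapters:
        results.extend(adapter.search(**criteria))
    return results


# Invented sample data: two sources that store similar information differently.
node_a = SourceAdapter(
    "node_a",
    [{"TARGET_NAME": "SATURN", "INSTRUMENT_ID": "CAM1", "PATH": "a/0001.img"},
     {"TARGET_NAME": "MARS", "INSTRUMENT_ID": "CAM2", "PATH": "a/0002.img"}],
    lambda name, r: Product(name, r["TARGET_NAME"].title(), r["INSTRUMENT_ID"], r["PATH"]),
)
node_b = SourceAdapter(
    "node_b",
    [("Saturn", "SPEC1", "b/run-17.tab")],
    lambda name, r: Product(name, r[0], r[1], r[2]),
)

hits = federated_search([node_a, node_b], target="Saturn")
for product in hits:
    print(product.source, product.instrument, product.uri)
```

In this sketch the query logic knows only about the shared `Product` record, while each source supplies its own translation function. That loosely mirrors the separation of software architecture from data architecture that Crichton describes: a new source can be added by writing one more adapter, without touching the search code.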
While developing OODT, Crichton was already thinking about applications for the software beyond NASA’s missions.
“We saw the unification and integration of science data as a real national need,” he says. Crichton and his colleagues looked into ways of better engaging the open-source software community to transfer the benefits of NASA software innovations to the public. With this in mind, Chris Mattmann, a senior computer scientist at JPL who worked with Crichton on OODT, cultivated connections at the Apache Software Foundation (ASF), based in Forest Hill, Maryland. An all-volunteer, nonprofit organization supported by major information technology companies like Google, Microsoft, and Yahoo!, the ASF manages almost 150 open-source software projects. These include the Apache HTTP Server, a key technology in the development of the World Wide Web and the world’s most widely used Web server, along with other popular developer software. Mattmann believed Apache was the ideal partner for transferring OODT for public use.
“Apache is different from other open-source communities,” he says. The organization follows a unique vetting process, he explains, that includes an incubation period to ensure that the candidate software is not only sound, but is also supported by a diverse community that will grow the software. It also provides infrastructure and leadership for housing, distributing, and managing the continued development of the technology.
“The ASF has been well known as having the ‘secret sauce’ for how to create successful, long-term, healthy open-source projects,” says ASF president Jim Jagielski. “We worry about the mailing lists, infrastructure, resources, and fundraising, and the projects can focus on what they do best, which is building great code and great communities.”