In general, and especially in the “big data” era, projects often fail to collect sufficient data about the data (metadata). This lack of metadata drastically reduces the potential use of the data, and rectifying the situation after the fact is often difficult, if not impossible: the necessary metadata must first be located, then organized and associated with the data, and often those who know about the metadata are no longer available. The proposed alternative is to prepare the data for long-term preservation based on currently accepted best practices for long-term digital archives, so that the data and the metadata are collected and prepared for the archive together. This methodology provides the best chance that the necessary metadata will be available for future analysis, and improves the prospects that the data being collected and preserved can actually be used.

CornerStone is a knowledge acquisition and synthesis framework designed for information systems that are driven by an information model. Domain knowledge is captured from domain experts using a standard modeling paradigm. The knowledge is then synthesized into a knowledge base, and the resulting content is extracted, filtered, and translated to specific file formats for use within the information system. The functional components of a model-driven information system are designed to respond to the domain’s information model, resulting in a system that is compliant with the domain’s information requirements.
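
The capture, synthesize, extract, filter, and translate steps described above can be illustrated with a minimal sketch. It is not CornerStone's actual code or data model: the names `Attribute`, `ModelClass`, `synthesize`, `extract`, `to_json`, and `to_csv`, the `steward` field used for filtering, and the choice of JSON and CSV as output formats are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    """One attribute of a modeled class (hypothetical structure)."""
    name: str
    value_type: str
    required: bool = True


@dataclass
class ModelClass:
    """A class captured from domain experts (hypothetical structure)."""
    name: str
    steward: str  # assumed governance label, used here only for filtering
    attributes: list = field(default_factory=list)


def synthesize(captured):
    """Merge captured class definitions into a knowledge base keyed by class name."""
    return {cls.name: cls for cls in captured}


def extract(knowledge_base, steward):
    """Extract and filter: keep only the classes governed by one steward."""
    return [c for c in knowledge_base.values() if c.steward == steward]


def to_json(classes):
    """Translate the selected classes to a JSON document."""
    doc = {c.name: {a.name: a.value_type for a in c.attributes} for c in classes}
    return json.dumps(doc, indent=2, sort_keys=True)


def to_csv(classes):
    """Translate the same classes to a flat CSV listing."""
    lines = ["class,attribute,type,required"]
    for c in classes:
        for a in c.attributes:
            lines.append(f"{c.name},{a.name},{a.value_type},{str(a.required).lower()}")
    return "\n".join(lines)


# Example: two captured classes, one selected and written in two formats.
captured = [
    ModelClass("Product", "core", [Attribute("title", "string"),
                                   Attribute("size", "integer", required=False)]),
    ModelClass("Instrument", "mission", [Attribute("name", "string")]),
]
knowledge_base = synthesize(captured)
core_classes = extract(knowledge_base, "core")
print(to_json(core_classes))
print(to_csv(core_classes))
```

In this sketch both output files are derived from the same knowledge base, so a change made to the captured model appears in every generated format without editing the translators.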

The information model remains independent of the implementation technology and is captured in a language that has more expressive power than the other languages in the system. The underlying metamodel incorporates current best practices for digital archives, federated object registries, and metadata registries. Multi-level governance is established for model management. Providing the core information requirements for both the information system and the information model establishes the cornerstone for the information system’s architecture.

The framework is distinguished by the guiding principles under which it was developed: the information model should remain independent of its implementation, the model should evolve independently of the implementation technology, the model should remain disentangled from any implementation technology, and the model should drive the information system in a changing domain.
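
The last principle, that the model should drive the information system, can be sketched as a generic component that reads its rules from the model instead of hard-coding them. This is a minimal illustration under stated assumptions, not the framework's implementation: the dictionary layout of the model, the `validate` function, and the attribute names `title` and `start_time` are all hypothetical.

```python
def validate(record, model):
    """Check a record against a model and return a list of problems found.

    The function contains no domain-specific rules; everything it checks
    comes from the model passed in (hypothetical layout: attribute name
    mapped to a dict with "type" and "required" entries).
    """
    problems = []
    for name, rule in model.items():
        if name not in record:
            if rule.get("required", False):
                problems.append(f"missing required attribute: {name}")
            continue
        if not isinstance(record[name], rule["type"]):
            problems.append(f"wrong type for attribute: {name}")
    return problems


# Two versions of a hypothetical model: the domain adds a requirement in v2.
model_v1 = {"title": {"type": str, "required": True}}
model_v2 = {**model_v1, "start_time": {"type": str, "required": True}}

record = {"title": "Example observation"}

print(validate(record, model_v1))
print(validate(record, model_v2))
```

The same `validate` code accepts the record under the first model and reports the missing attribute under the second, so the change in the domain is absorbed by the model alone.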

This work was done by John S. Hughes, Daniel J. Crichton, and Sean H. Hardman of Caltech for NASA’s Jet Propulsion Laboratory. This software is available for license through the Jet Propulsion Laboratory, where a license may be requested. NPO-49832