The universal data model in theoretical materials science
Creating a universal data store becomes a significant problem when unstructured data are collected using an object-oriented representation of information on top of a relational database. The domain is described by units of data that conform to a set of predefined concepts. The main concepts are “meta-type”, “instance type”, “instance”, “hierarchy type”, and “relationship attribute”. We define and describe additional concepts: “property descriptor”, “primitive property”, “composite property”, “category type”, “category”, “subcategory”, and “linked category”. These concepts determine a set of tables that fixes the entity-relationship structure of the database. A distinctive feature of this approach is that an unlimited number of objects can be arranged over a limited set of concepts. In research, it is important to process relevant and complete information about objects of different nature and with different properties. The universal data model allows information of any type and complexity to be stored by using elementary primitives to describe hierarchical links and data relationships. These primitives are data elements related to the basic concepts of graph theory, database theory, and programming. They allow one to construct the necessary level of abstraction and to use different methodological approaches to data processing. For example, the conceptual framework of materials science is connected to definitions from graph theory, which allows crystal-chemical data to be described using graph abstractions. With interrelated concepts from chemistry and discrete mathematics, we can classify basic objects, such as atoms or bonds, as well as more complex objects, such as molecules, rings, ligands, and nets. Applying the concept “is a” allows one to store information on objects with similar behavior and to treat their relations as a hierarchy following the principle “from the general to the particular”.
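The idea that a fixed set of tables can hold an unlimited number of object kinds can be illustrated with a minimal sketch. The table layout, column names, and helper functions below are assumptions made for illustration and cover only a few of the concepts named above; they are not the schema of the described system. The sketch uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3

# Hypothetical, simplified schema: one table per concept. The concept names
# come from the text; every column and constraint is an illustrative assumption.
SCHEMA = """
CREATE TABLE meta_type (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE instance_type (
    id INTEGER PRIMARY KEY,
    meta_type_id INTEGER NOT NULL REFERENCES meta_type(id),
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE instance (
    id INTEGER PRIMARY KEY,
    instance_type_id INTEGER NOT NULL REFERENCES instance_type(id),
    label TEXT NOT NULL
);
CREATE TABLE property_descriptor (
    id INTEGER PRIMARY KEY,
    instance_type_id INTEGER NOT NULL REFERENCES instance_type(id),
    name TEXT NOT NULL
);
CREATE TABLE primitive_property (
    instance_id INTEGER NOT NULL REFERENCES instance(id),
    descriptor_id INTEGER NOT NULL REFERENCES property_descriptor(id),
    value TEXT NOT NULL
);
"""


def build_store():
    """Create an in-memory database containing the fixed set of tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_row(conn, table, **cols):
    """Insert one row and return its id.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted code; the values are passed as bound parameters.
    """
    names = ", ".join(cols)
    marks = ", ".join("?" for _ in cols)
    cur = conn.execute(
        f"INSERT INTO {table} ({names}) VALUES ({marks})",
        tuple(cols.values()),
    )
    return cur.lastrowid


def table_names(conn):
    """Return the names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def properties_of(conn, instance_id):
    """Return {descriptor name: value} for one stored instance."""
    rows = conn.execute(
        "SELECT d.name, p.value FROM primitive_property AS p "
        "JOIN property_descriptor AS d ON d.id = p.descriptor_id "
        "WHERE p.instance_id = ?",
        (instance_id,),
    )
    return dict(rows.fetchall())
```

In this sketch, introducing a new kind of object, for example an “atom” instance type with an “element” property descriptor, only inserts rows through `add_row`; the list returned by `table_names` stays the same, which is the property the text attributes to the universal model.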
Applying the concept “part of” allows one to store information on objects and their relations using the principles of aggregation and composition, and to treat these relations following the principle “from the whole to the part”. For example, “is a” captures the similar behavior of objects such as atoms and void centers, or bonds and channels, while “part of” describes data hierarchies such as “atom–structure fragment–net” or “atom–bond–ring–tile”. It is shown that the universal data store is an important part of an expert system for materials science. It allows a knowledge base to be organized for creating and storing strategic rules and meta-rules, which makes the search for a solution more effective when predicting the properties of chemical substances of different classes.
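The two relations can be sketched as a small in-memory graph. This is an illustrative assumption, not the described implementation: the class `ConceptGraph`, its methods, and the general concept names “node” and “edge” are hypothetical, each concept is assumed to have at most one “is a” parent, and both relations are assumed to be acyclic. The “atom–bond–ring–tile” chain is taken from the text.

```python
class ConceptGraph:
    """Hypothetical store for "is a" and "part of" links between concepts."""

    def __init__(self):
        self.is_a = {}      # specific concept -> its single general concept
        self.part_of = {}   # part -> set of wholes that directly contain it

    def add_is_a(self, specific, general):
        self.is_a[specific] = general

    def add_part_of(self, part, whole):
        self.part_of.setdefault(part, set()).add(whole)

    def generalizations(self, concept):
        """Walk "is a" links upward, from the particular to the general."""
        chain = []
        while concept in self.is_a:
            concept = self.is_a[concept]
            chain.append(concept)
        return chain

    def wholes(self, part):
        """Collect every whole that contains `part`, directly or indirectly."""
        seen = set()
        stack = [part]
        while stack:
            for whole in self.part_of.get(stack.pop(), ()):
                if whole not in seen:
                    seen.add(whole)
                    stack.append(whole)
        return seen


graph = ConceptGraph()

# "is a": objects with similar behavior share a general concept.
# The general names "node" and "edge" are assumptions for this example.
graph.add_is_a("atom", "node")
graph.add_is_a("void center", "node")
graph.add_is_a("bond", "edge")
graph.add_is_a("channel", "edge")

# "part of": the atom-bond-ring-tile hierarchy mentioned in the text.
for part, whole in [("atom", "bond"), ("bond", "ring"), ("ring", "tile")]:
    graph.add_part_of(part, whole)
```

With these links, `graph.generalizations("void center")` and `graph.generalizations("atom")` both lead to the same general concept, while `graph.wholes("atom")` gathers the bond, ring, and tile that contain the atom. Keeping the two relations in separate maps mirrors the distinction the text draws between generalization and aggregation.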