Brief description of W@DIS
The W@DIS information system is designed to provide access to data, information, and ontologies relating to quantitative spectroscopy required for solving fundamental and applied problems pertaining to a number of subject domains: atmospheric optics, astronomy, etc.
The information system under discussion is a prototype of a next-generation information system on molecular spectroscopy based on Semantic Web technologies. The system focuses on representing spectral data for the end user, who can work with information at different levels of detail, both in the structure of the data and in the knowledge (semantic annotations) associated with these data. The end user decides which level of detail is required.
In this information system, molecular spectroscopic data are divided into three parts: parameters of energy levels, transitions, and line profiles. The first two parts are independent of temperature and are intended for tasks internal to spectroscopy. The data on line profiles depend on temperature and on the chemical composition of the molecular mixture.
Applied problems impose different requirements on the quality of the data in spectral databases. For example, calculations of solar radiation transfer place weaker requirements on data quality than calculations of the absorption of narrow-band laser radiation. In databases of this kind, data quality determines the validity of the data and the degree of trust in them. Analyzing the quality of the complete set of published data and of the expert data derived from this set, and giving researchers and software agents access to the detailed results of that analysis, makes it possible to rapidly assess the quality of expert data from different sources.
Originally, W@DIS was designed for the storage and presentation of a complete, consistent set of published spectral data on the water molecule and its isotopologues [1–4]. A complete, consistent set of data on the hydrogen sulfide molecule was then added to W@DIS [5,6]. Today W@DIS incorporates complete sets of published data for more than 20 atmospheric molecules. Completeness here refers not only to the set of publications covered, but also to information on the hyperfine structure of spectra, electronic states, the sets of all broadening substances, spectral line shape models, etc.
Structure of molecular data
A description of molecules and their spectral properties comprises data that can be represented by the XSAMS schema. W@DIS describes the structure of spectroscopic information resources in more detail than the corresponding structures of the XSAMS schema. The results of the quality analysis of published data on states and transitions presented in databases rely on an XML schema intended as a supplement to the XSAMS schema. Data imported into the information system are provided with semantic annotations characterizing data quality.
Generation of semantic annotations on data quality in W@DIS
When molecular data are uploaded or edited in W@DIS, a semantic annotation (ontology) of the data quality for the molecule concerned is generated automatically. The resulting report is a set of semantic annotations together with a classification system for them, which supports semantic search over the results of the molecular data quality analysis. The analysis draws on the published literature on states and transitions and on expert trust in the data of interest [7,8].
The W@DIS information system is designed to provide access to data, information, and logical theories of information resources required for solving fundamental and applied problems in a number of subject domains: atmospheric optics, astronomy, etc.
A key element of data representation in this information system is the elementary data source, which accumulates the values of a physical quantity (for instance, energy levels) of a single molecule published in a specific information resource (a journal or a web resource). Integrated data sources, so-called data banks, contain all three parts and represent the data for specific temperature values. These data banks are constructed in the system on the basis of relational algebra, and their metadata are described within the framework of description logics.
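The relationship between elementary data sources and data banks can be sketched as a pair of record types. This is a minimal illustration only: the class names, field names, and numeric values below are hypothetical and are not taken from the actual W@DIS schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical types for illustration; the actual W@DIS schema is not given here.
QuantumNumbers = Tuple[int, ...]  # a label identifying a state


@dataclass
class ElementaryDataSource:
    """Values of one physical quantity for one molecule from one publication."""
    molecule: str    # e.g. "H2O"
    quantity: str    # e.g. "energy_level"
    reference: str   # bibliographic reference of the publication
    values: Dict[QuantumNumbers, float] = field(default_factory=dict)


@dataclass
class DataBank:
    """Integrated data source holding all three parts at a given temperature."""
    molecule: str
    temperature_k: float
    energy_levels: List[ElementaryDataSource] = field(default_factory=list)
    transitions: List[ElementaryDataSource] = field(default_factory=list)
    line_profiles: List[ElementaryDataSource] = field(default_factory=list)


# Made-up example values, not real spectroscopic data.
source = ElementaryDataSource(
    molecule="H2O",
    quantity="energy_level",
    reference="hypothetical publication",
    values={(1, 0, 1): 10.0, (1, 1, 1): 20.0},
)
bank = DataBank(molecule="H2O", temperature_k=296.0, energy_levels=[source])
```

Keeping each publication as its own record preserves the link between a value and the resource in which it was published, which is what the quality analysis relies on.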
Our work presents a description of the information system for systematizing the energy levels of water and its isotopologues. A system for importing and downloading data over the Internet has been built. The associated knowledge base is the basis for semantic search for energy levels by attribute values within a given interval. Energy levels from different data sources can be compared by their values or by their quantum numbers.
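The two operations mentioned here, selecting energy levels within an interval and comparing levels from different sources, can be sketched in a few lines. This is an illustration under stated assumptions rather than the W@DIS implementation: the function names are hypothetical, states are identified by a tuple of quantum numbers, and the numeric values are made up.

```python
from typing import Dict, Tuple

QuantumNumbers = Tuple[int, ...]
Levels = Dict[QuantumNumbers, float]


def levels_in_interval(levels: Levels, low: float, high: float) -> Levels:
    """Select levels whose energy lies in the closed interval [low, high]."""
    return {qn: e for qn, e in levels.items() if low <= e <= high}


def compare_by_quantum_numbers(a: Levels, b: Levels) -> Dict[QuantumNumbers, float]:
    """Return a[qn] - b[qn] for every quantum-number label present in both sources."""
    return {qn: a[qn] - b[qn] for qn in sorted(a.keys() & b.keys())}


# Made-up example values, not real spectroscopic data.
source_a: Levels = {(1, 0, 1): 10.0, (1, 1, 1): 20.0, (2, 0, 2): 30.0}
source_b: Levels = {(1, 0, 1): 10.5, (2, 0, 2): 29.0}

selected = levels_in_interval(source_a, 15.0, 35.0)
differences = compare_by_quantum_numbers(source_a, source_b)
```

Here `selected` keeps the two levels of `source_a` that fall between 15 and 35, and `differences` contains entries only for the states that both sources report.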
The extended database and knowledge base contain data and metadata on the energy levels of water and its isotopologues. Some data sources contain energy levels separated into para and ortho components. The most representative data are those for H₂¹⁶O, HDO, D₂O, H₂¹⁷O, and H₂¹⁸O.
The information system rests on the assumption that published facts describing the spectral characteristics of molecules can be represented in different ways, depending on the applied tasks to be solved in the system and on the information technologies a researcher uses to process information objects such as data, information, or knowledge. The three-layer architecture of the system lets researchers work with the data and applications layer, the information layer, and the knowledge layer.
Working in the data layer requires knowledge of a natural language and of the fundamentals of molecular spectroscopy. Facts pertaining to the data layer are quantitative values of physical quantities (energy levels, vacuum wavenumbers, line intensities, etc.). These values are solutions to direct and inverse problems of molecular spectroscopy, and they belong to both primary and composite published data sources. The facts enter the information system when solutions to problems of molecular spectroscopy are imported. In the data layer, researchers can select the molecules of interest, look up and compare values of physical quantities, upload their own data for comparison with the complete set of published data, review the results of aligning their own data, export data from the information system, decompose composite data, and browse thousands of references to publications on the spectral characteristics of molecules.
The information layer describes properties of the published spectral characteristics found in the data sources accumulated in the data layer. These include properties that describe validity checks and the assessment of trust in data sources retrieved from publications. The information layer is useful for researchers who understand the meaning of the properties under study. Some of the properties concern restrictions imposed on physical quantities by mathematical models of molecules, conditions for the alignment of values of physical quantities, and trust in the data under consideration. For instance, researchers should understand the restrictions on physical quantities that result from selection rules and those imposed on the publication of information resources. In particular, the information layer presents the calculated root-mean-square deviations and the disorder in the values of physical quantities.
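As an illustration of one such property, the root-mean-square deviation between two data sources can be computed over the states they have in common. The sketch below assumes that states are matched by their quantum-number labels; the function name and the numeric values are hypothetical, and the text does not specify how W@DIS itself performs the calculation.

```python
import math
from typing import Dict, Tuple

Levels = Dict[Tuple[int, ...], float]


def rms_deviation(a: Levels, b: Levels) -> float:
    """Root-mean-square deviation over the states present in both sources."""
    common = a.keys() & b.keys()
    if not common:
        raise ValueError("the two sources share no states")
    squared = sum((a[qn] - b[qn]) ** 2 for qn in common)
    return math.sqrt(squared / len(common))


# Made-up example values, not real spectroscopic data.
source_a: Levels = {(1, 0, 1): 10.0, (1, 1, 1): 20.0}
source_b: Levels = {(1, 0, 1): 10.3, (1, 1, 1): 19.6, (2, 0, 2): 30.0}

rms = rms_deviation(source_a, source_b)
```

The state `(2, 0, 2)` appears only in `source_b`, so it does not contribute to the result.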
For computer processing of values of properties of data retrieved from publications, the properties are represented as statements written in OWL 2. Statements pertaining to a publication form a source of information about the data available in the publication. Sources of information about molecules can be imported into the system. In the formation of resources of this kind, a particular emphasis is placed on the properties relating to the problem of the validity of and trust in information resources.
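To give a feel for what such statements look like, the sketch below formats two axioms in OWL 2 functional-style syntax. The class name `InformationSource`, the property name `hasRmsDeviation`, the individual `source_1`, and the value are all hypothetical; the actual W@DIS vocabulary is not given in this description. The strings are fragments and would need prefix and ontology declarations around them to form a complete OWL 2 document.

```python
def class_assertion(cls: str, individual: str) -> str:
    """Format an OWL 2 functional-syntax axiom stating class membership."""
    return f"ClassAssertion( :{cls} :{individual} )"


def data_property_assertion(prop: str, individual: str, value: float) -> str:
    """Format an OWL 2 functional-syntax axiom with an xsd:double literal."""
    return f'DataPropertyAssertion( :{prop} :{individual} "{value}"^^xsd:double )'


# Hypothetical names and value, for illustration only.
statements = [
    class_assertion("InformationSource", "source_1"),
    data_property_assertion("hasRmsDeviation", "source_1", 0.35),
]

for statement in statements:
    print(statement)
```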
The knowledge layer of the information system involves computer processing of logical theories of information resources for molecular spectroscopy. Working with the knowledge layer requires that the researcher understand the specification language of OWL 2 ontologies. This language is based on description logic and is used to build logical theories. The main function of the knowledge layer is to enable the researcher to perform semantic search for sources of information about the spectral properties of molecules.
References
- J. Tennyson, P.F. Bernath, L.R. Brown, et al., IUPAC Critical Evaluation of the Rotational-Vibrational Spectra of Water Vapor. Part I: Energy Levels and Transition Wavenumbers for H₂¹⁷O and H₂¹⁸O, JQSRT, 2009, V. 110(9), P. 573–596.
- J. Tennyson, P.F. Bernath, L.R. Brown, et al., IUPAC Critical Evaluation of the Rotational-Vibrational Spectra of Water Vapor. Part II: Energy Levels and Transition Wavenumbers for HD¹⁶O, HD¹⁷O, and HD¹⁸O, JQSRT, 2010, V. 111(15), P. 2160–2184.
- J. Tennyson, P.F. Bernath, L.R. Brown, et al., IUPAC Critical Evaluation of the Rotational-Vibrational Spectra of Water Vapor. Part III: Energy Levels and Transition Wavenumbers for H₂¹⁶O, JQSRT, 2013, V. 117, P. 29–58.
- J. Tennyson, P.F. Bernath, L.R. Brown, et al., IUPAC Critical Evaluation of the Rotational-Vibrational Spectra of Water Vapor. Part IV: Energy Levels and Transition Wavenumbers for D₂¹⁶O, D₂¹⁷O, and D₂¹⁸O, JQSRT, 2014, V. 142, P. 93–108.
- E.R. Polovtseva, N.A. Lavrentiev, S.S. Voronina, et al., Information System for Molecular Spectroscopy. 5. Ro-vibrational Transitions and Energy Levels of the Hydrogen Sulfide Molecule, Atmos. and Oceanic Optics, 2012, V. 25, No. 2, P. 157–165.
- S.S. Voronina, O.V. Naumenko, E.R. Polovtseva, et al., Systematization of Published Spectral Data on Deuterated Isotopologues of Hydrogen Sulfide Molecule, Proc. of SPIE XX-th Int. Symp. on Atmos. and Ocean Optics: Atmos. Physics, 2014, V. 9292, 92920B.
- A. Privezentsev, D. Tsarkov, J. Tennyson, et al., Computed Knowledge Base for Description of Information Resources of Water Spectroscopy, Proc. of the 7th International Workshop on OWL: Experiences and Directions (OWLED 2010), San Francisco, California, USA, June 21–22, 2010, Eds: E. Sirin, K. Clark, CEUR-WS Proc. Vol-614, http://ceur-ws.org/Vol-614/owled2010_submission_6.pdf
- A. Fazliev, A. Privezentsev, D. Tsarkov, et al., Ontology-Based Content Trust Support of Expert Information Resources in Quantitative Spectroscopy, in: Knowledge Engineering and the Semantic Web, Communications in Computer and Information Science, V. 394, Springer, Berlin, Heidelberg, Eds: P. Klinov, D. Mouromtsev, P. 15–28.