Dataspace-Based Support Platform for Breath Gas Analysis
Disciplines
Computer Sciences (90%); Mathematics (10%)
Keywords
Breath Gas Analysis,
Analytical Workflow Management,
Dataspace Support Platform,
Grid Computing,
MATLAB,
Problem Solving Environment
Cancer is one of the leading causes of death in the western world. It is often diagnosed late in the course of the disease because the available traditional diagnostic methods are not sufficiently sensitive and specific. A promising and relatively new research domain, breath gas analysis, addresses the non-invasive diagnosis of lung and oesophageal cancers. There is strong evidence that these kinds of cancer can be detected from the concentration pattern of volatile compounds in exhaled air. The breath gas analysis community is also addressing other application areas and is currently developing new analytical methods, collecting pilot data for cancer and other diseases, and identifying marker compounds. At this stage of development, it is particularly important to give collaborating scientists and institutions (a) access to distributed breath gas data and analytical resources collected and developed at research institutions around the world and (b) an easy way to contribute to and leverage the resources of a multi-institutional environment at national and international scale. This will strongly support global collaboration among scientists, improve decisions, and increase the chance and scope of discoveries. What is needed in this context is a supporting information infrastructure that provides advanced data management and analytical services and composes them into scientific workflows, so that scientists can efficiently access, integrate, pre-process, and analyze data from multiple geographically distributed data sources and publish the results of their analyses. In this document, we propose a research effort that contributes to the development and use of such an infrastructure.
We address novel data management methods in conjunction with scientific workflow management and the automatic parameterization of these workflows, which together will provide a highly efficient and powerful scientific data management and analysis platform for the breath gas research community. Furthermore, we deal with autonomous computing-based methods for model parameterization, with the aim of supporting breath gas researchers through enhanced automatic breath gas analysis features. Our data management approach is based on the Grid and dataspace concepts and on a new positioning of the well-known MATLAB language and computing environment. The Grid is an infrastructure that enables flexible, secure, and coordinated resource sharing among dynamic collections of individuals and institutions. Dataspaces are modeled as participants (datasets) and relationships. The project will propose, experimentally implement, and evaluate a Dataspace-Based Support Platform on top of the Grid that will deliver a high level of performance, guarantee a high level of accuracy of the analytical methods, and provide rich semantic descriptions of the distributed dataset collections produced by breath gas experiments. Dataspaces specific to breath gas research will be set up for two purposes: on the one hand, to capture the relationship between source data (exhaled breath measurement data) and the data derived from it (e.g. specific cancer markers) in breath gas analysis experiments and, on the other hand, to integrate scientific understanding into these applied experiments. The research program of the project builds directly on the results of cooperative research among the project partners and on an elaborate network of international cooperation.
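The modeling of a dataspace as participants (datasets) and relationships can be illustrated with a minimal sketch. It is not the platform's implementation: the class names, the `derived_from` relationship label, and the dataset names are all hypothetical and chosen only to show how derived data can be traced back to its source data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Participant:
    """A dataset taking part in the dataspace (hypothetical structure)."""
    name: str
    kind: str  # e.g. "source" or "derived"


@dataclass
class Dataspace:
    """Participants plus (subject, predicate, object) relationships."""
    participants: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add(self, participant):
        self.participants[participant.name] = participant

    def relate(self, subject, predicate, obj):
        # Both ends must already be registered participants.
        if subject not in self.participants or obj not in self.participants:
            raise KeyError("unknown participant")
        self.relationships.append((subject, predicate, obj))

    def sources_of(self, derived):
        """Names of the datasets that `derived` was derived from."""
        return [o for s, p, o in self.relationships
                if s == derived and p == "derived_from"]


# Illustrative data only: one measurement dataset and one marker dataset.
space = Dataspace()
space.add(Participant("breath_measurements", "source"))
space.add(Participant("candidate_markers", "derived"))
space.relate("candidate_markers", "derived_from", "breath_measurements")
print(space.sources_of("candidate_markers"))
```

Keeping relationships as explicit triples, rather than as fields of the datasets themselves, is one simple way to add further relationship types later without changing the participant records.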
Exhaled breath analysis is an emerging scientific field that, like many others, is driven by big data. It has a large scientific community spread all over the world and promises a significant impact on many application domains. Recent results suggest that breath analysis makes early detection of different kinds of cancer and of other physiological and pathophysiological states possible far beyond the scope of available diagnostic methods. The breath gas analysis community is investigating and screening for hundreds of compounds in exhaled breath. This requires numerous analytical instruments as well as various statistical and data mining techniques that support the identification of specific markers. At the current stage of development, it is particularly important to give collaborating scientists and institutions access to distributed source breath gas data, derived data, and analytical resources collected and developed at research institutions around the world, and to let them easily contribute to and leverage the resources of a multi-institutional environment at national and international scale. Our project addresses these requirements by elaborating and implementing an information infrastructure, called ABA-Cloud, that provides advanced data management and analytical services for this type of data-driven and computationally intensive science. On the one hand, the developed data management approach preserves the relationship between source data (exhaled breath measurement data obtained with different analytical techniques) and the data derived from it (e.g. specific cancer markers) in breath gas analysis experiments; on the other hand, it integrates scientific understanding into these applied experiments.
Scientific analysis on top of this enhanced data set is improved in several ways: multiple problem solving environments are integrated into the life cycle of data analysis; code can be executed in parallel on multiple cores or even on remote computing machines; reinforced security mechanisms allow sensitive datasets to be included safely in analytical processes; and automatic analysis services provide hints that help detect potentially new topics of interest. These advanced capabilities are provided through an easy-to-use, domain-independent web browser application that offers researchers a collaborative online workplace. It allows them to view all required data and processing code of the various phases of an experiment in great detail, and to explore possible similarities to other experiments that used either the same data set or the same analysis code. This integrated view of experiments and their interrelations contributes significantly to the setup of future investigations as well as to preserving the understanding of finished ones.
- Universität Wien - 46%
- Österreichische Akademie der Wissenschaften - 8%
- FH Vorarlberg - 46%
- Thomas Feilhauer, FH Vorarlberg, associated research partner
- Anton Amann, Österreichische Akademie der Wissenschaften, associated research partner
Research Output
- 45 Citations
- 15 Publications
- 2012. Title: Data Life Cycle Management and Analytics Code Execution Strategies for the Breath Gas Analysis Domain; DOI: 10.1016/j.procs.2012.04.017; Type: Journal Article; Author: Elsayed I; Journal: Procedia Computer Science; Pages: 156-165
- 2013. Title: Towards a High Productivity Automatic Analysis Framework for Classification: An Initial Study; DOI: 10.1007/978-3-642-39736-3_3; Type: Book Chapter; Author: Ludescher T; Publisher: Springer Nature; Pages: 25-39
- 2012. Title: Data Stream Processing in CloudMiner; Type: Conference Proceeding Abstract; Author: Brezany P; Conference: Proceedings of the 1st Annual World Congress on Cloud Computing, Dalian, China
- 2012. Title: Dataspace Support Platform for e-Science; DOI: 10.7494/csci.2012.13.1.49; Type: Journal Article; Author: Elsayed I; Journal: Computer Science; Pages: 49
- 2012. Title: Security concept and implementation for a cloud based e-Science infrastructure; DOI: 10.1109/ares.2012.34; Type: Conference Proceeding Abstract; Author: Ludescher T; Pages: 280-285
- 2012. Title: Security Concept and Implementation for a Cloud Based e-Science Infrastructure; Type: Conference Proceeding Abstract; Author: Brezany P et al.; Conference: Seventh International Conference on Availability, Reliability and Security 2012 Society, 2012
- 2013. Title: Dataspace Support Platform for e-Science: Dataspace-based Preservation of Scientific Studies; Type: Book; Author: Elsayed I
- 2013. Title: ABA-Cloud: support for collaborative breath research; DOI: 10.1088/1752-7155/7/2/026007; Type: Journal Article; Author: Elsayed I; Journal: Journal of Breath Research; Pages: 026007
- 2013. Title: Cloud-Based Code Execution Framework for scientific problem solving environments; DOI: 10.1186/2192-113x-2-11; Type: Journal Article; Author: Ludescher T; Journal: Journal of Cloud Computing: Advances, Systems and Applications; Pages: 11
- 2010. Title: Grid-based scientific dataspace support platform for breath gas analysis; Type: Conference Proceeding Abstract; Author: Brezany P et al.; Conference: 3rd Austrian Grid Symposium, Austrian Computer Society
- 2010. Title: Towards Large-Scale Scientific Dataspaces for e-Science Applications; DOI: 10.1007/978-3-642-14589-6_8; Type: Book Chapter; Author: Elsayed I; Publisher: Springer Nature; Pages: 69-80
- 2010. Title: Portals for collaborative research communities: two distinguished case studies; DOI: 10.1002/cpe.1685; Type: Journal Article; Author: Elsayed I; Journal: Concurrency and Computation: Practice and Experience; Pages: 269-278
- 2010. Title: Semantic data infrastructure to support a scientific dataspace for breath gas analysis; Type: Conference Proceeding Abstract; Author: Brezany P et al.; Conference: Proceedings of the UK e-Science All Hands Meeting
- Year not specified. Title: Volatile Biomarkers: Non-Invasive Diagnosis in Physiology and Medicine; Type: Other; Author: Amann A
- Year not specified. Title: The Data Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business; Type: Other; Author: Atkinson M