The automation of important aspects of scientific discovery will
significantly accelerate. Our claim is that given the right knowledge
and methods, computers could autonomously carry out discovery processes
by searching hypothesis spaces in a systematic, comprehensive, and
In this project, we are investigating a novel approach to automating the hypothesize-test-evaluate discovery cycle: an intelligent system that a scientist can task with lines of inquiry to test hypotheses of interest. We are implementing this approach in DISK (automated DIscovery of Scientific Knowledge), a system that extends the existing WINGS intelligent workflow system for scientific data analysis, and we are applying it to multi-omics.
Our work to date has focused on four major research objectives:
1) Representing hypotheses and associated evidence and confidence values;
2) Formulating lines of inquiry that express how to test hypotheses by running data analysis workflows against the data available;
3) Designing a meta-analysis engine that uses meta-workflows to assess the results of lines of inquiry and to revise and extend the original hypotheses accordingly; and
4) Developing intelligent agents for interactive discovery that explain new findings to scientists.
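As a rough illustration of the first objective, a hypothesis can be modeled as a statement that carries a confidence value, the evidence behind it, and a link to the hypothesis it revises. The sketch below is not DISK's actual representation: the class names, the triple-shaped statement, and the placeholder values (`geneA`, `conditionB`, `run-1`, `0.8`) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumption: a hypothesis statement is a (subject, relation, object) triple.
Triple = Tuple[str, str, str]


@dataclass
class Evidence:
    """One piece of supporting evidence (hypothetical structure)."""
    workflow_run: str  # identifier of the analysis run that produced it
    summary: str       # short human-readable description of the result


@dataclass
class Hypothesis:
    """A statement plus its confidence, evidence, and revision history."""
    statement: Triple
    confidence: Optional[float] = None  # None until the hypothesis is tested
    evidence: List[Evidence] = field(default_factory=list)
    revision_of: Optional["Hypothesis"] = None

    def revise(self, confidence: float, new_evidence: Evidence,
               statement: Optional[Triple] = None) -> "Hypothesis":
        """Return a new hypothesis that records this one as its origin."""
        return Hypothesis(
            statement=statement or self.statement,
            confidence=confidence,
            evidence=self.evidence + [new_evidence],
            revision_of=self,
        )


# Placeholder values, not real findings.
h0 = Hypothesis(("geneA", "associatedWith", "conditionB"))
h1 = h0.revise(0.8, Evidence("run-1", "placeholder analysis result"))
```

Returning a new object from `revise` instead of updating in place keeps the original hypothesis intact, so the chain of revisions can be followed back through `revision_of`.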
The image above gives an overview of the DISK framework and illustrates how the four main objectives are integrated. First, a user defines the hypothesis to test with the help of the interactive discovery agent, which helps transform the hypothesis statements into a machine-readable representation. If the hypothesis matches a line of inquiry, the system searches for appropriate data to test it, exploring open repositories such as TCGA (The Cancer Genome Atlas).
Once the data are found, the workflows in the line of inquiry are sent to the workflow system for execution, and the results are stored in a Linked Data repository. Finally, the meta-workflows associated with the line of inquiry explore the repository, analyze the results of all the workflows, and produce a revision of the original hypothesis.
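The cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DISK or WINGS API. Every name is hypothetical, a plain list stands in for the Linked Data repository, the data search is a stub that returns made-up numbers, and the "workflow" simply computes a mean.

```python
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # assumed machine-readable hypothesis form


class LineOfInquiry:
    """Hypothetical line of inquiry: a pattern, a data query, and workflows."""

    def __init__(self, pattern: Triple,
                 find_data: Callable[[Dict[str, str]], List[float]],
                 workflows: List[Callable[[List[float]], dict]],
                 meta_workflow: Callable[[Triple, List[dict]], dict]):
        self.pattern = pattern
        self.find_data = find_data
        self.workflows = workflows
        self.meta_workflow = meta_workflow

    def match(self, hypothesis: Triple) -> Optional[Dict[str, str]]:
        """Bind '?variables' in the pattern, or return None on a mismatch."""
        bindings: Dict[str, str] = {}
        for p, h in zip(self.pattern, hypothesis):
            if p.startswith("?"):
                bindings[p] = h
            elif p != h:
                return None
        return bindings


def run_cycle(hypothesis: Triple, lines_of_inquiry: List[LineOfInquiry],
              repository: List[dict]) -> Optional[dict]:
    """Match, find data, run workflows, store results, then meta-analyze."""
    for loi in lines_of_inquiry:
        bindings = loi.match(hypothesis)
        if bindings is None:
            continue
        datasets = loi.find_data(bindings)
        if not datasets:
            continue  # nothing to test this line of inquiry against
        for workflow in loi.workflows:
            repository.append(workflow(datasets))  # store each result
        return loi.meta_workflow(hypothesis, repository)
    return None  # no line of inquiry matched the hypothesis


def toy_find_data(bindings: Dict[str, str]) -> List[float]:
    # Stand-in for querying an open repository; the numbers are made up.
    return [0.25, 0.5, 0.75]


def toy_workflow(datasets: List[float]) -> dict:
    return {"workflow": "mean", "value": sum(datasets) / len(datasets)}


def toy_meta_workflow(hypothesis: Triple, results: List[dict]) -> dict:
    # Toy rule: the revised confidence is the largest stored result value.
    score = max(r["value"] for r in results)
    return {"hypothesis": hypothesis, "confidence": score,
            "evidence": list(results)}


loi = LineOfInquiry(("?gene", "associatedWith", "?condition"),
                    toy_find_data, [toy_workflow], toy_meta_workflow)
repo: List[dict] = []
revised = run_cycle(("geneA", "associatedWith", "conditionB"), [loi], repo)
```

The point of the sketch is the separation of concerns: the line of inquiry decides which hypotheses it can test and which data it needs, the workflows only produce results, and the meta-workflow is the only step that turns stored results into a revised hypothesis.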