Data Priorities

Bringing Tools to Data to Avoid Data Migration and Redundancy

Most current approaches to Big Data analysis involve moving data to a server, HPC infrastructure, or cloud where the software tools and reference databases are pre-configured. This is inefficient: each transfer requires making redundant copies of the data and incurs additional cost and time moving data back and forth.

Since no single tool or workflow can analyze genomic data end to end, multiple copies of the data files may have to be made whenever the needed tools or workflows are not available in the environment where the data are stored. This significantly increases storage and computational costs.

Approaches are needed that let data stay local while tools and algorithms are brought to bear on it in place. This would also cover hybrid scenarios where data is distributed across local enterprise/laboratory servers, a private cloud, and public cloud infrastructures.
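The tools-to-data idea can be sketched as a dispatcher that looks up which site already holds a dataset and runs the analysis there, so only the (small) result crosses the network instead of the raw data. The following is a minimal illustrative sketch, not a real distributed system: the site registry, the site names, and the toy line-count "analysis" are all hypothetical placeholders.

```python
# Sketch of "bring the tool to the data": rather than copying files to a
# central server, a dispatcher finds the site holding each dataset and
# executes the analysis in place, returning only the result.
import tempfile

# Hypothetical registry mapping dataset id -> site that stores it.
DATA_SITES = {}

def register(dataset_id, site, path):
    DATA_SITES[dataset_id] = {"site": site, "path": path}

def run_in_place(dataset_id, analysis):
    """Run `analysis` where the data lives; only the result, never the
    raw data, crosses the network."""
    entry = DATA_SITES[dataset_id]
    # A real system would dispatch the function to entry["site"]
    # (e.g. via a container or remote executor); here we simply open
    # the local path to illustrate the data flow.
    with open(entry["path"]) as f:
        return analysis(f.read())

# Example: count records in a toy sequence file without copying it.
tmp = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
tmp.write("ACGT\nTTGA\nGGCC\n")
tmp.close()
register("sample-001", "lab-server-eu", tmp.name)

n_records = run_in_place("sample-001", lambda text: len(text.splitlines()))
print(n_records)  # 3
```

In the hybrid scenario described above, `run_in_place` would route the job to whichever environment (laboratory server, private cloud, or public cloud) already stores the dataset.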


11 votes
Idea No. 34