Store results in the cloud (e.g. variant file) with methodologies documented and workflow available in a workflow management system so the analyses can be reproduced with other data
Submitted by Community Member 4 months ago
Submitted by will fitzhugh 1 month ago
Submitted by Edwin Wang 4 months ago
One of the clear benefits of the cloud data hosting model is that all data sets of a similar type can be analyzed consistently, something individuals currently have to do themselves.
So the example of having variant files generated with a documented methodology is a good one, as long as one can get variant files for all samples produced with that same methodology.
This speaks to a broad issue of how much top-down control there will be on the cloud to ensure consistency and standards on the data. We'll have to strike the right balance here to both encourage participation and enable comprehensive analysis.
I agree that cloud arrangements with data and algorithms are an important advance. I think it is also important to recognize the value of metadata. Systems that coordinate metadata with both the primary data and the algorithms are essential.
Others have pointed out that technologies like IPython notebooks, which show code and RESULTS together, are valuable because they give you the complete package.
Storing results in the cloud will be most useful if the results and their interpretations are represented using standard semantic terminologies. These terminologies will also need to be defined, and systems developed to facilitate their use.
In conjunction we might also cultivate online dashboards for reproducibility, where results might be evaluated according to well-defined metrics to derive some form of plausibility score. This may not be straightforward for wet results, but seems well within reach for computational/algorithmic results. Imagine the time & effort saved if investigators could visit something like Angie's List for science, allowing one to ascertain at a glance which results show the greatest promise for replication. More details are in a draft proposal available from the Broad Institute at https://docs.google.com/document/d/1Ts1LAMyv9j3sb8F3UX87Gd1tfaBTQht08QQNtNgsjVQ/edit?usp=sharing
Copyright © 2013