Campaign: Data Priorities

Longitudinal sequencing

Longitudinal sequencing: obtain samples from patients at different time points, for example biopsies at diagnosis, pretreatment, post-treatment, and relapse.

Voting

0 votes
Active

Provide data subsets for download and tool development

Subsets of large data sets should be available for download so that users can test local tools and develop pipelines before uploading them to the cloud.
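One way such a subset could be drawn, sketched under assumptions: a single-pass reservoir sample that picks a fixed number of records uniformly at random from a data set too large to hold in memory. The function name `reservoir_sample` and the idea of treating the data set as an iterable of records are illustrative choices, not part of the proposal.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return up to k records chosen uniformly at random in one pass.

    Hypothetical helper: `records` stands in for any large data set that
    can be streamed record by record (lines, reads, rows).
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sample: List[T] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = record
    return sample
```

A fixed seed makes the same subset reproducible across users, which helps when comparing local tool results before a cloud run.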

Voting

2 votes
Active

Correlate genome with claims & statistical data

In addition to clinical data, tie in claims data. Test the feasibility of using the CMS virtual data center in conjunction with the NCI cloud to link data. Other multi-payer claims databases may also offer longitudinal claims histories.

Bring in statistical data, particularly from longitudinal studies (NLSY, HRES, NHANES) and from studies that have collected biospecimens. (Develop a standardized re-consent form.)

Voting

2 votes
Active

Patient Access

Allow patients and their doctors to securely access data about them.

Voting

0 votes
Active

Access to proteomic data of TCGA samples

Datasets containing the quantitative inventory of proteins in TCGA tumors are beginning to become available. Both mass spectrometry and affinity-based technologies are generating these data. The cloud should provide a means to connect these data to corresponding TCGA data.
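As a rough sketch of what "connecting" could mean in practice, the records might be joined on the TCGA sample barcode. The example below assumes both data sets are keyed by full TCGA barcodes and that the first four hyphen-separated fields (down to the two-digit sample type) identify the same tumor sample. The names `sample_key` and `link_records`, and the dictionary layout, are hypothetical.

```python
from typing import Dict, Tuple


def sample_key(barcode: str) -> str:
    """Reduce a TCGA barcode to its sample-level prefix.

    For example, TCGA-02-0001-01C-01D-0182-01 becomes TCGA-02-0001-01.
    """
    parts = barcode.split("-")
    if len(parts) < 4 or len(parts[3]) < 2:
        raise ValueError(f"not a sample-level TCGA barcode: {barcode!r}")
    # Keep project, site, participant, and the two-digit sample type.
    return "-".join(parts[:3] + [parts[3][:2]])


def link_records(
    proteomic: Dict[str, float], genomic: Dict[str, str]
) -> Dict[str, Tuple[float, str]]:
    """Pair proteomic and genomic records that share a sample key.

    Assumption: each input maps a full barcode to one value
    (a protein abundance and a genomic file name, respectively).
    """
    genomic_by_sample = {sample_key(b): v for b, v in genomic.items()}
    linked: Dict[str, Tuple[float, str]] = {}
    for barcode, abundance in proteomic.items():
        key = sample_key(barcode)
        if key in genomic_by_sample:
            linked[key] = (abundance, genomic_by_sample[key])
    return linked
```

Joining at the sample level rather than on the full aliquot barcode matters here because proteomic and genomic assays are typically run on different aliquots of the same sample.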

Voting

3 votes
Active

Connect data with available specimens for follow-up studies

Mining cancer data in the cloud is great, but to enable ongoing research there should be a connection to specimens so researchers can pursue follow-up studies. This will require storing data about specimens from studies such as TCGA: where they are, how they can be accessed, and what consent governs them. Just as the data from publications should be made available to allow reproduction of results, so should samples ...

Voting

3 votes
Active

Make it so we can still download the raw files

For those of us who don't want to use the cloud workflow, please make it so we still have access to all the raw data. Please don't lock us into your analytic approaches!

Voting

10 votes
Active

Bringing tools to data to avoid data migration and redundancy

Most current approaches to big data analysis involve moving data to a server, HPC infrastructure, or cloud where the software tools and reference databases are pre-configured. This is inefficient: it requires making a redundant copy of the data each time and incurs additional cost and time for moving data back and forth. Since there is no single tool or workflow to analyze genomic data, multiple copies ...

Voting

11 votes
Active

ENCODE datasets

Include ENCODE datasets from both normal and cancer cell lines

Voting

3 votes
Active

Access to TCGA microsatellite instability data

Provide access to TCGA microsatellite instability data for analysis

Voting

2 votes
Active

Access to TCGA tissue slide images

Provide access to TCGA tissue slide images for analysis

Voting

14 votes
Active

Access to level 1 & 2 TCGA copy number array data

Provide access to level 1 and 2 TCGA copy number array data for analysis

Voting

9 votes
Active