Problems the Data Biosphere is trying to solve

Modern technologies in the life sciences produce enormous datasets that researchers struggle to manage. Genomics datasets can easily reach the terabyte or even petabyte range, which makes extracting meaning from them difficult for all but a few researchers. The obvious solution is cloud-based data storage and computation designed to make biomedical research more accessible. But how should these systems be designed? In this blog post we explore the motivations for creating the Data Biosphere organization and the approaches we think are key to its mission.