The European Nucleotide Archive: Life on the log scale

Guy Cochrane
Seminar

Aggressive technological development in nucleic acid sequencing platforms brings unprecedented growth in data volumes and an associated broadening of the range of biological applications to which sequencing is put. The European Nucleotide Archive (http://www.ebi.ac.uk/ena/), a long-standing project under which primary public domain nucleic acid sequencing information is collected and made available in perpetuity, is no stranger to exponential growth in data. The advent of second-generation technologies, however, required some substantial new strategic and technical developments. In the talk, I will describe the design and implementation of the service under which ENA archives raw data from next-generation platforms. I will outline the service that is currently available and will touch upon ongoing developments that aim to make the service more useful for a broader user base, including improved data submission tools, locus coordinate-based data retrieval and sequence similarity search. Finally, I will outline the approach that we take to building a sustainable and affordable service into the future, based on our compression technology and an outward-looking collaborative working model.