Genome sequencing information

Genome project history

Strain BD11-00177 was sequenced because of its relevance to biodefense. The draft genome sequence was finished in August 2012. The GenBank accession number for the project is 177784. The genome project is listed in the Genome OnLine Database (GOLD) [22] as project Gi21611. Sequencing was carried out at the Dutch Organization for Applied Scientific Research (TNO) and the Swedish Defense Research Agency (FOI). Initial automatic annotation was performed using the DOE-JGI Microbial Annotation Pipeline (DOE-JGI MAP). Table 2 shows the project information and its association with MIGS 2.0 compliance.

Table 2 Project information

Growth conditions and DNA isolation

For DNA preparation, strain BD11-00177 was grown on 5% sheep blood agar plates for 72 h at 35°C in the presence of 5% CO2.

DNA was extracted using the QIAamp DNA Micro Kit according to the manufacturer's guidelines (Qiagen, Westburg b.v., Leusden, The Netherlands).

Genome sequencing and assembly

Sequencing was performed by the Microbiology and Systems Biology group at TNO and the Division for CBRN Defence and Security at FOI using the 454 Roche GS Junior and Illumina MiSeq platforms. The initial draft assembly yielded 95 large (>1,000 bp) and 86 small (<1,000 bp) non-redundant contigs totaling 1,813,372 bp. It was produced by combining 75,245 Roche/454 reads at 23× coverage and 8,289,332 Illumina reads at 690× coverage in a hybrid assembly with the Ray Assembler V2.1 [24].
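As a sanity check on the assembly figures, the reported coverage values can be related to the read counts through the usual approximation coverage ≈ (number of reads × mean read length) / genome size. The sketch below rearranges that relation to estimate the mean read length implied for each platform. The function name `implied_mean_read_length` is hypothetical, and the calculation assumes that coverage was computed against the 1,813,372 bp assembly using all reads, which the text does not state explicitly.

```python
def implied_mean_read_length(coverage: float, genome_size_bp: int, n_reads: int) -> float:
    """Rearrange coverage = n_reads * mean_length / genome_size for mean_length."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return coverage * genome_size_bp / n_reads


# Figures taken from the assembly description; the genome size is the
# reported assembly length (an assumption about the coverage denominator).
GENOME_SIZE_BP = 1_813_372

roche_454 = implied_mean_read_length(23, GENOME_SIZE_BP, 75_245)
illumina = implied_mean_read_length(690, GENOME_SIZE_BP, 8_289_332)

print(f"Roche/454: ~{roche_454:.0f} bp per read")  # ~554 bp
print(f"Illumina:  ~{illumina:.0f} bp per read")   # ~151 bp
```

Because the published coverage values are rounded, these read lengths are only approximate.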

Genome annotation

Open Reading Frames (ORFs) were predicted using the Prodigal gene prediction algorithm [23] as part of the DOE-JGI Microbial Annotation Pipeline (DOE-JGI MAP) with default parameters, followed by a round of manual curation. CRISPR elements were predicted using CRT and PILERCR [25], and the predictions from both methods were concatenated. tRNAs were identified using tRNAScan. Ribosomal RNA genes (5S, 16S, 23S) were predicted using RNAmmer [26]. With the exception of tRNA and rRNA, all models from Rfam [27] were used to search the genome sequence. For faster detection, sequences were first compared with BLAST, using a very loose cutoff, to a database containing all the ncRNA genes in Rfam. Sequences with hits to any gene belonging to an Rfam model were then searched using INFERNAL [27]. Protein-coding genes were compared to protein families (e.g., COGs, Pfam, KEGG) and to the proteomes of selected, publicly available "core" genomes, and product names were assigned based on the results of these comparisons.

Genome properties

The genome was assembled into 95 large (>1,000 bp) contigs and includes one circular chromosome with a total size of 1,813,372 bp (32.23% GC content).
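The summary figures reported for the genome (number of large and small contigs, total size, GC content) can be derived directly from a set of contig sequences. The following is a minimal sketch of such a summary, not the pipeline used in the study. The function name `summarize_contigs` and the toy sequences are hypothetical. It also makes two assumptions: GC content is computed over unambiguous A/C/G/T bases only, and a contig of exactly 1,000 bp, which the text does not classify, is counted as small.

```python
from typing import Dict, Iterable


def summarize_contigs(contigs: Iterable[str], threshold: int = 1000) -> Dict[str, float]:
    """Count large/small contigs, total length, and GC percentage."""
    n_large = n_small = total_bp = gc = acgt = 0
    for seq in contigs:
        seq = seq.upper()
        total_bp += len(seq)
        # Assumption: "large" means strictly longer than the threshold.
        if len(seq) > threshold:
            n_large += 1
        else:
            n_small += 1
        gc += seq.count("G") + seq.count("C")
        # Assumption: ambiguous bases (e.g. N) are excluded from the GC denominator.
        acgt += sum(seq.count(base) for base in "ACGT")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return {
        "n_large": n_large,
        "n_small": n_small,
        "total_bp": total_bp,
        "gc_percent": gc_percent,
    }


# Toy input for illustration only; these are not sequences from strain BD11-00177.
toy_contigs = ["ATGC" * 300, "AATT" * 100, "GGCCN"]
summary = summarize_contigs(toy_contigs)
print(summary)
```

In practice the sequences would be read from the assembler's FASTA output rather than typed in as strings.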
