Introduction to SeqIO

This page describes Bio.SeqIO. For implementation details, see the SeqIO development page. Python novices might find Peter’s introductory Biopython Workshop useful, which starts by working with sequence files using SeqIO.

The Biopython Tutorial has a chapter on Bio.SeqIO, and although there is some overlap it is well worth reading in addition to this wiki page. There is a sister interface, Bio.AlignIO, for working directly with sequence alignment files. Note that the inclusion of Bio.SeqIO (and Bio.AlignIO) in Biopython does lead to some duplication or choice in how to deal with some file formats. For example, Bio.Nexus will also read sequences from Nexus files.

My vision is that for manipulating sequence data you should try Bio.SeqIO first. Unless you have some very specific requirements, I hope this should suffice.

File Formats

The entries below describe file formats that Bio.SeqIO handles, each identified by a format name, which is a simple lowercase string.

The "abi" format reads ABI sequencing trace files, including the PHRED quality scores for the base calls. This allows ABI to FASTQ conversion.

The "ace" format reads the contig sequences from an ACE assembly file. The "clustal" format is the alignment format of Clustal X and Clustal W. For "fasta", the resulting sequences have a generic alphabet by default. The "fasta-2line" format is a FASTA variant with no line wrapping and exactly two lines per record. FASTQ files are a bit like FASTA files but also include sequencing qualities: "fastq" (also available as "fastq-sanger") covers Sanger style FASTQ files, which encode PHRED qualities using an ASCII offset of 33, and "fastq-solexa" covers Illumina style FASTQ files, which encode Solexa qualities using an ASCII offset of 64.

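The "fastq-illumina" format covers FASTQ files which encode PHRED qualities using an ASCII offset of 64. For "genbank", Biopython 1.51 onwards will also write the features table. The "imgt" format refers to the IMGT variant of the EMBL plain text file format. The "nexus" format is the NEXUS multiple alignment format, also known as PAUP format.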
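The ASCII offsets mentioned above for FASTQ quality strings can be made concrete with a small plain-Python sketch. It is not Biopython's implementation: the helper names `decode_quality` and `solexa_to_phred` are hypothetical, the dictionary keys simply reuse Biopython's format names, and the quality strings are made-up sample data.

```python
import math

# ASCII offsets for the three FASTQ variants (keys reuse Biopython's format names).
OFFSETS = {"fastq-sanger": 33, "fastq-solexa": 64, "fastq-illumina": 64}


def decode_quality(quality_string, variant="fastq-sanger"):
    """Return the raw integer scores encoded in one FASTQ quality line."""
    offset = OFFSETS[variant]
    return [ord(char) - offset for char in quality_string]


def solexa_to_phred(solexa_score):
    """Convert a Solexa score to the equivalent PHRED score (as a float)."""
    return 10 * math.log10(10 ** (solexa_score / 10) + 1)


# The same scores written with an offset of 33 and with an offset of 64.
sanger_scores = decode_quality("II5+!", "fastq-sanger")
illumina_scores = decode_quality("hhTJ@", "fastq-illumina")
print(sanger_scores)
print(illumina_scores)

# Solexa scores are on a different scale from PHRED scores, so they need
# converting rather than just re-offsetting.
print(round(solexa_to_phred(0), 2))
```

Both quality strings decode to the same list of scores, which shows that the Sanger and Illumina variants differ only in the offset. The Solexa variant stores a different kind of score, so the offset alone is not enough to compare it with the other two.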

The "phd" format covers PHD files, which are output from PHRED and used by PHRAP and CONSED for input. The "sff-trim" format is Standard Flowgram Format, applying the trimming listed in the file. The "stockholm" alignment format is also known as PFAM format. The "tab" format covers simple two column tab separated sequence files, where each line holds a record’s identifier and sequence. "qual" files are a bit like FASTA files, but instead of the sequence they record space separated integer sequencing values as PHRED quality scores. A matched pair of FASTA and QUAL files is often used as an alternative to a single FASTQ file.

With Bio.SeqIO you can treat sequence alignment file formats just like any other sequence file, but the new Bio.AlignIO module is designed to work with such alignment files directly.

Sequence Input

The main function is Bio.SeqIO.parse, which takes a file handle and a format name and returns an iterator over the records. The file can be opened using the built-in Python function open, and a with statement makes sure that the file is properly closed after reading it. If you had a different type of file, for example a ClustalW alignment file such as the opuntia example, the only change needed is the format name, "clustal". Iterators are great for when you only need the records one by one, in the order found in the file.
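To illustrate what reading records one by one means, here is a minimal plain-Python sketch of an iterator-style FASTA reader. It is only an illustration of the idea, not how Bio.SeqIO is implemented: `iter_fasta` is a hypothetical helper, and an in-memory `io.StringIO` stands in for a file opened with `open`. With Biopython installed, the corresponding call would be `SeqIO.parse(handle, "fasta")`, which yields SeqRecord objects rather than tuples.

```python
import io


def iter_fasta(handle):
    """Yield (identifier, sequence) tuples one record at a time."""
    identifier, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header line ends the previous record, if there was one.
            if identifier is not None:
                yield identifier, "".join(chunks)
            parts = line[1:].split(None, 1)
            identifier = parts[0] if parts else ""
            chunks = []
        elif identifier is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if identifier is not None:
        yield identifier, "".join(chunks)


# Made-up sample data; a real script would use open("some_file.fasta") instead.
example = ">seq1 first record\nACGT\nTTGA\n>seq2\nGGCC\n"

# The with statement closes the handle as soon as the block is left.
with io.StringIO(example) as handle:
    records = []
    for identifier, sequence in iter_fasta(handle):
        # Only one record is held by the reader at any time.
        records.append((identifier, sequence))

print(records)
```

Because the reader is a generator, nothing is parsed until the loop asks for the next record, which is why this style suits files too large to load in one go. The trade-off is that records arrive strictly in file order and cannot be revisited without reading the file again.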