Nexus file
The NEXUS file format is widely used in bioinformatics. Several popular phylogenetic programs, such as PAUP*, MrBayes, Mesquite, and MacClade, use this format.
Syntax
Text inside square brackets [ and ] is a comment and is ignored. Each block starts with BEGIN block_name; and finishes with END;
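The two syntax rules above can be sketched in a few lines of Python. This is a minimal illustration, not a full NEXUS reader: the helper names strip_comments and split_blocks are hypothetical, nested brackets are assumed to be tracked by depth, quoted labels containing brackets or the word "end" are not handled, and a repeated block name overwrites the earlier one.

```python
import re


def strip_comments(text: str) -> str:
    """Drop everything inside [ ... ], tracking nesting depth (an assumption)."""
    out = []
    depth = 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)


def split_blocks(text: str) -> dict[str, str]:
    """Map each lower-cased block name to the text between BEGIN name; and END;."""
    cleaned = strip_comments(text)
    pattern = re.compile(
        r"begin\s+(\w+)\s*;(.*?)\bend\s*;", re.IGNORECASE | re.DOTALL
    )
    return {m.group(1).lower(): m.group(2).strip() for m in pattern.finditer(cleaned)}


# Hypothetical two-block file used only for illustration.
sample = """#NEXUS
[this comment is ignored]
BEGIN taxa;
  Dimensions ntax=2;
  Taxlabels A B;
END;
BEGIN trees;
  Tree t1 = (A,B);
END;
"""

blocks = split_blocks(sample)
print(sorted(blocks))
print(blocks["trees"])
```

Keywords are matched case-insensitively here because the example files in this article mix "Begin" and "BEGIN".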
An example of a simple DNA alignment:
#NEXUS
Begin data;
Dimensions ntax=4 nchar=15;
Format datatype=dna symbols="ACTG" missing=? gap=-;
Matrix
Species1 atgctagctagctcg
Species2 atgcta??tag-tag
Species3 atgttagctag-tgg
Species4 atgttagctag-tag
;
End;
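To show how the Dimensions line relates to the Matrix, the following Python sketch reads the alignment above and checks it against ntax and nchar. The function name read_matrix is hypothetical, and the sketch assumes a non-interleaved matrix with one taxon per line and no quoted taxon names.

```python
import re

NEXUS_TEXT = """#NEXUS
Begin data;
Dimensions ntax=4 nchar=15;
Format datatype=dna symbols="ACTG" missing=? gap=-;
Matrix
Species1 atgctagctagctcg
Species2 atgcta??tag-tag
Species3 atgttagctag-tgg
Species4 atgttagctag-tag
;
End;
"""


def read_matrix(text: str) -> dict[str, str]:
    """Return {taxon: sequence} and verify the declared dimensions."""
    dims = re.search(r"ntax\s*=\s*(\d+)\s+nchar\s*=\s*(\d+)", text, re.IGNORECASE)
    if dims is None:
        raise ValueError("no Dimensions ntax=... nchar=... found")
    ntax, nchar = int(dims.group(1)), int(dims.group(2))

    # The matrix body runs from the Matrix keyword to the next semicolon.
    body = re.search(r"\bmatrix\b(.*?);", text, re.IGNORECASE | re.DOTALL)
    if body is None:
        raise ValueError("no Matrix command found")

    alignment = {}
    for line in body.group(1).splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, seq = line.split(None, 1)
        alignment[name] = seq.replace(" ", "")

    if len(alignment) != ntax:
        raise ValueError(f"expected {ntax} taxa, found {len(alignment)}")
    for name, seq in alignment.items():
        if len(seq) != nchar:
            raise ValueError(f"{name}: expected {nchar} characters, found {len(seq)}")
    return alignment


alignment = read_matrix(NEXUS_TEXT)
for name, seq in alignment.items():
    # Count the missing (?) and gap (-) symbols declared in the Format line.
    print(name, seq, seq.count("?"), seq.count("-"))
```

A mismatch between the declared dimensions and the matrix raises ValueError instead of being silently accepted.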
Basic blocks
TAXA block : The TAXA block contains information about taxa.
CHARACTERS block : The CHARACTERS block contains information about the data matrix.
DATA block : The DATA block contains the data matrix (e.g. sequence alignment).
TREES block : The TREES block contains phylogenetic trees described using the Newick format, e.g.
((A,B),C);
ASSUMPTIONS block
SETS block
TREES block
CODONS block
DISTANCES block
PAUP block : This block contains the commands used by PAUP*. (Refer to the Command Reference Document - Second Draft for a detailed description of each command.)
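As an illustration of the Newick strings stored in a TREES block, the Python sketch below parses the example tree ((A,B),C); into nested tuples. The names parse_newick and leaves are hypothetical, and the sketch assumes a tree with plain leaf labels only: branch lengths, internal node labels, and quoted names are not handled.

```python
def parse_newick(newick: str):
    """Parse a simple Newick string into nested tuples of leaf names."""
    text = newick.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick tree must end with ';'")
    text = text[:-1]
    pos = 0

    def node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume '('
            children = [node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ','
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ')'
            return tuple(children)
        # Otherwise read a leaf label up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos].strip()

    tree = node()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return tree


def leaves(tree) -> list[str]:
    """List the leaf names of a parsed tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


tree = parse_newick("((A,B),C);")
print(tree)          # (('A', 'B'), 'C')
print(leaves(tree))  # ['A', 'B', 'C']
```

The nested tuple mirrors the parentheses of the Newick string, so A and B appear as siblings and C joins them at the root.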
External links
- NEXUS file format — detailed explanation with lots of examples
- Nexus to phyloXML converter
- NeXML
- Nexus to Fasta converter