Phylo

A package for creating and manipulating phylogenies

Phylo is a Julia package that provides functionality for generating phylogenetic trees to feed into our Diversity package to calculate phylogenetic diversity. Both are currently under development, so please raise an issue if you find any problems. Currently the package can be used to make trees manually, and to generate random trees using the framework from Distributions. For instance, to construct a sampler for 5-tip non-ultrametric trees, and then generate one or two random trees of that type:

julia> using Phylo

julia> nu = Nonultrametric(5);

julia> tree = rand(nu)
RootedTree with 5 tips, 9 nodes and 8 branches.
Leaf names are tip 3, tip 5, tip 2, tip 4 and tip 1

julia> trees = rand(nu, ["Tree 1", "Tree 2"])
TreeSet with 2 trees, each with 5 tips.
Tree names are Tree 2 and Tree 1

Tree 1: RootedTree with 5 tips, 9 nodes and 8 branches.
Leaf names are tip 5, tip 3, tip 4, tip 2 and tip 1

Tree 2: RootedTree with 5 tips, 9 nodes and 8 branches.
Leaf names are tip 3, tip 5, tip 1, tip 2 and tip 4

The code also provides iterators, and filtered iterators over the branches, nodes, branchnames and nodenames of a tree:

julia> collect(nodeiter(tree))
9-element Vector{LinkNode{OneRoot, String, Dict{String, Any}, LinkBranch{OneRoot, String, Dict{String, Any}, Float64}}}:
 LinkNode tip 1, a tip of the tree with an incoming connection (branch 16).

 LinkNode tip 2, a tip of the tree with an incoming connection (branch 10).

 LinkNode tip 3, a tip of the tree with an incoming connection (branch 11).

 LinkNode tip 4, a tip of the tree with an incoming connection (branch 14).

 LinkNode tip 5, a tip of the tree with an incoming connection (branch 9).

 LinkNode Node 6, an internal node with 1 inbound and 2 outbound connections (branches 12 and 9, 10)

 LinkNode Node 7, an internal node with 1 inbound and 2 outbound connections (branches 13 and 11, 12)

 LinkNode Node 8, an internal node with 1 inbound and 2 outbound connections (branches 15 and 13, 14)

 LinkNode Node 9, a root node with 2 outbound connections (branches 15, 16)

julia> collect(nodenamefilter(isroot, tree))
1-element Vector{String}:
 "Node 9"

TreeSets are iterators themselves:

julia> collect(trees)

2-element Vector{RootedTree}:
 RootedTree with 5 tips, 9 nodes and 8 branches.
Leaf names are tip 3, tip 5, tip 1, tip 2 and tip 4

 RootedTree with 5 tips, 9 nodes and 8 branches.
Leaf names are tip 3, tip 5, tip 1, tip 2 and tip 4

The current main purpose of this package is to provide a framework for phylogenetics to use in our Diversity package. The two will be adapted together as needed until both function as required (though they are currently working together reasonably successfully).

It can also read Newick trees, either from strings or from files:

julia> using Phylo

julia> simpletree = parsenewick("((,Tip:1.0)Internal,)Root;")
RootedTree with 3 tips, 5 nodes and 4 branches.
Leaf names are Tip, Node 1 and Node 4


julia> getbranches(simpletree)
skipmissing(Union{Missing, LinkBranch{OneRoot, String, Dict{String, Any}, Float64}}[LinkBranch 1, from node Internal to node Tip (length 1.0).
, LinkBranch 2, from node Internal to node Node 1.
, LinkBranch 3, from node Root to node Internal.
, LinkBranch 4, from node Root to node Node 4.
])

julia> tree = open(parsenewick, Phylo.path("H1N1.newick"))
RootedTree with 507 tips, 1013 nodes and 1012 branches.
Leaf names are 227, 294, 295, 110, 390, ... [501 omitted] ... and 418

And it can read Nexus trees from files too:

julia> ts = open(parsenexus, Phylo.path("H1N1.trees"))
[ Info: Created a tree called "TREE1"
[ Info: Created a tree called "TREE2"
TreeSet with 2 trees, each with 507 tips.
Tree names are TREE2 and TREE1

TREE1: RootedTree with 507 tips, 1013 nodes and 1012 branches.
Leaf names are H1N1_A_BRAZIL_11_1978, H1N1_A_TAHITI_8_1998, H1N1_A_TAIWAN_1_1986, H1N1_A_BAYERN_7_1995, H1N1_A_ENGLAND_45_1998, ... [501 omitted] ... and H1N1_A_PUERTORICO_8_1934

TREE2: RootedTree with 507 tips, 1013 nodes and 1012 branches.
Leaf names are H1N1_A_BRAZIL_11_1978, H1N1_A_TAHITI_8_1998, H1N1_A_TAIWAN_1_1986, H1N1_A_BAYERN_7_1995, H1N1_A_ENGLAND_45_1998, ... [501 omitted] ... and H1N1_A_PUERTORICO_8_1934

julia> ts["TREE1"]
RootedTree with 507 tips, 1013 nodes and 1012 branches.
Leaf names are H1N1_A_BRAZIL_11_1978, H1N1_A_TAHITI_8_1998, H1N1_A_TAIWAN_1_1986, H1N1_A_BAYERN_7_1995, H1N1_A_ENGLAND_45_1998, ... [501 omitted] ... and H1N1_A_PUERTORICO_8_1934

julia> gettreeinfo(ts)
Dict{String, Dict{String, Any}} with 2 entries:
  "TREE2" => Dict("lnP"=>-1.0)
  "TREE1" => Dict("lnP"=>1.0)

And while we wait for me (or kind contributors!) to fill out the extensive functionality that phylogenetics packages in other languages already offer, the other important feature of this package is a fully(?)-functional interface to R. This allows any existing R library function to be run on Julia trees, and trees to be read from and written to disk using R helper functions. Naturally the medium-term plan is to fill in as many of these gaps as possible in Julia. For that reason the R interface is not built into the package, as that would make RCall (and R) a dependency, which I wanted to avoid. Instead, if you want to use the R interface you need to load it manually, as below:

julia> using RCall

julia> include(Phylo.path("rcall.jl", dir = "src"));

R> library(ape)

You can then translate back and forth using rcopy on R phylo objects, and RObject constructors on Julia NamedTree types to keep them in Julia, or @rput to move the object into R:

julia> rt = rcall(:rtree, 10)
RCall.RObject{RCall.VecSxp}

Phylogenetic tree with 10 tips and 9 internal nodes.

Tip labels:
  t10, t8, t1, t2, t6, t5, ...

Rooted; includes branch lengths.

julia> jt = rcopy(NamedTree, rt)
NamedTree with 10 tips, 19 nodes and 18 branches.
Leaf names are t8, t3, t7, t9, t6, ... [4 omitted] ... and t1

julia> rjt = RObject(jt); # manually translate it back to R

R> if (all.equal($rjt, $rt)) "no damage in translation"
[1] "no damage in translation"

julia> @rput rt; # Or use macros to pass R object back to R

julia> @rput jt; # And automatically translate jt back to R

R> jt

Phylogenetic tree with 10 tips and 9 internal nodes.

Tip labels:
  t10, t8, t1, t2, t6, t5, ...

Rooted; includes branch lengths.

R> if (all.equal(rt, jt)) "no damage in translation"
[1] "no damage in translation"

Phylo.Phylo - Module
Phylo package

The Phylo package provides some simple phylogenetics types (e.g. NamedTree) to interface to the Diversity package for measuring phylogenetic diversity. It also provides an interface to R for copying trees to and from that language, and can read Newick and Nexus tree files (including TreeSets that contain multiple trees).

Finally it also provides a standard abstract interface to phylogenetic trees, by defining AbstractNode, AbstractBranch and AbstractTree supertypes, and methods to interface to them. It also provides (through the Phylo.API submodule) the methods you need to (re)define in order to write your own phylogenetic type in a way that will interact cleanly with other phylogenetic packages.

source
Phylo.BinaryNode - Type
BinaryNode{B}(AbstractVector{B}, AbstractVector{B}) <: AbstractNode

A node of a strictly binary phylogenetic tree.

source
Phylo.Branch - Type
Branch

A branch connecting two AbstractNodes of a phylogenetic tree.

source
Phylo.BrownianTrait - Type
BrownianTrait{T <: AbstractTree, N <: Number}

A continuous trait evolved on a phylogenetic tree. This is a Sampleable type, so a random trait can be created using rand(). The trait to be evolved can be any continuous numeric type, including Unitful types for instance, and in the simplest case is determined by the third argument to the constructor, start:

function BrownianTrait(tree::AbstractTree, trait::String, start::Number = 0.0; σ² = missing, σ = missing, f::Function = identity)

Note that when Unitful is being used, either here or in branch lengths, the units of the σ/σ² keyword argument must be appropriate. The final keyword argument, f, is a function to transform the evolved Gaussian trait into its true value. By default this is the identity function, but it can, for instance, be abs to force a positive value on the trait, or a more complex function as required, such as a transformation to turn a continuous variable into a discrete trait.

source
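
As a rough illustration of what evolving a Brownian trait along branches means, the sketch below steps a value down a single lineage using only Julia's standard library. It does not call Phylo: evolve_lineage, its arguments and the branch lengths are hypothetical, and it assumes that each branch adds a Gaussian increment with variance σ² times the branch length, with f applied only when values are reported.

```julia
using Random

# Hypothetical helper (not part of Phylo): evolve a trait along one lineage.
# Assumption: each branch adds a Gaussian step with variance σ² * branch length.
# The transform f is applied to the reported values only, not to the hidden state.
function evolve_lineage(rng::AbstractRNG, start::Float64, lengths::Vector{Float64};
                        σ²::Float64 = 1.0, f::Function = identity)
    raw = start                  # untransformed Gaussian state
    out = Float64[f(raw)]        # value at the top of the lineage
    for len in lengths
        raw += sqrt(σ² * len) * randn(rng)
        push!(out, f(raw))
    end
    return out
end

rng = MersenneTwister(1)
lengths = [0.5, 1.0, 0.25]       # made-up branch lengths

plain = evolve_lineage(rng, 0.0, lengths)
positive = evolve_lineage(rng, 0.0, lengths; σ² = 2.0, f = abs)

println(plain)
println(positive)
```

Passing f = abs here plays the same role as the f keyword described above: the underlying walk is unchanged and only the reported trait values are transformed.
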
Phylo.DiscreteTrait - Type
DiscreteTrait{T <: AbstractTree, E <: Enum}

A discrete trait evolved on a phylogenetic tree. This is a Sampleable type, so a random trait can be created using rand(dt). The trait to be evolved must be an Enum (generally created using @enum), and is the second argument to the constructor:

function DiscreteTrait(tree::AbstractTree, ttype::Type{<:Enum}, transition_matrix::AbstractMatrix{Float64}, trait::String = "ttype")

The transition matrix holds transition rates from row to column (so row sums must be zero), and the transition probabilities in a branch are calculated as exp(transition_matrix .* branch_length).

source
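
To make the relationship between a rate matrix and per-branch transition probabilities concrete, here is a minimal sketch using only the LinearAlgebra standard library. The two-state matrix Q and the branch length are made-up values, and the sketch does not call Phylo itself.

```julia
using LinearAlgebra

# Hypothetical two-state rate matrix: rates from row state to column state.
# Each diagonal entry is minus the sum of the other entries in its row,
# so every row sums to zero.
Q = [-1.0  1.0;
      2.0 -2.0]
branch_length = 0.5

# exp applied to a square matrix is the matrix exponential in Julia,
# not an elementwise exponential.
P = exp(Q .* branch_length)

println(sum(Q, dims = 2))   # row sums of the rate matrix
println(sum(P, dims = 2))   # row sums of the transition probabilities
```

Because the rows of Q sum to zero, each row of P sums to one and can be read as the distribution over end states given the start state.
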
Phylo.Node - Type
Node{RT, NL, T}(AbstractVector{T}, AbstractVector{T}) <: AbstractNode

A node of a potentially polytomous phylogenetic tree.

source
Phylo.Nonultrametric - Type
Nonultrametric{T <: AbstractTree,
               SAMP <: Sampleable}(n::Int,
                                   sampleable::SAMP = Exponential())
Nonultrametric{T <: AbstractTree,
               SAMP <: Sampleable}(tiplabels::Vector{String},
                                   sampleable::SAMP = Exponential())

The sampler for non-ultrametric phylogenetic trees of size n or with tip labels tiplabels. Generate random trees by calling rand().

source
Phylo.PolytomousTree - Type
PolytomousTree

Phylogenetic tree object with polytomous branching, known leaves and per-node data.

source
Phylo.SymmetricDiscreteTrait - Type
SymmetricDiscreteTrait{T <: AbstractTree, E <: Enum}

The simplest possible discrete trait evolved on a phylogenetic tree. This is a Sampleable type, so a random trait can be created using rand(sdt). The trait to be evolved must be an Enum (generally created using @enum), and is the second argument to the constructor:

function SymmetricDiscreteTrait(tree::AbstractTree, ttype::Type{<:Enum}, transition_rate::Number, trait::String = "ttype")

Here a single transition_rate takes the place of the transition matrix used by DiscreteTrait, so the same rate applies to every change of state, and the transition probabilities along a branch again depend on the branch length.

source
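
One way to picture a single-rate model is as a rate matrix whose off-diagonal entries are all equal. The sketch below builds such a matrix with the standard library only; symmetric_rates is a hypothetical helper, and whether Phylo constructs its matrix in exactly this way internally is an assumption, not something taken from the package.

```julia
using LinearAlgebra

# Hypothetical helper (not part of Phylo): rate matrix in which every change
# between two distinct states happens at the same rate.
function symmetric_rates(nstates::Int, rate::Float64)
    Q = fill(rate, nstates, nstates)
    for i in 1:nstates
        Q[i, i] = -rate * (nstates - 1)   # make each row sum to zero
    end
    return Q
end

Q = symmetric_rates(3, 0.2)
P = exp(Q .* 1.5)    # transition probabilities over a branch of length 1.5

println(issymmetric(Q))
println(P)
```
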
Phylo.Ultrametric - Type
Ultrametric{T <: AbstractTree,
            SAMP <: Sampleable,
            LenUnits <: Number}(n::Int,
                                sampleable::SAMP = Exponential())
Ultrametric{T <: AbstractTree,
            SAMP <: Sampleable,
            LenUnits <: Number}(tiplabels::Vector{String},
                                sampleable::SAMP = Exponential())

The sampler for ultrametric phylogenetic trees of size n or with tip labels tiplabels. Generate random trees by calling rand().

source
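
The two constructor forms documented for the samplers can be used as in the short sketch below, which assumes Phylo is installed. The tip labels are arbitrary examples, and the printed order of leaf names may vary between runs.

```julia
using Phylo

# Sampler for ultrametric trees with five tips and default tip labels.
u = Ultrametric(5)
utree = rand(u)

# Sampler built from explicit tip labels rather than a count.
nu = Nonultrametric(["Dog", "Cat", "Human"])
ntree = rand(nu)

println(nleaves(utree))
println(getleafnames(ntree))   # label order is not guaranteed
```
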
LightGraphs.dst - Function
dst(tree::AbstractTree, branch)

Return the destination node for this branch.

source
Phylo.branchdims - Method
branchdims(::Type{<: AbstractTree})

Retrieve the dimensions of the branch lengths for the tree.

source
Phylo.branchfilter - Method
branchfilter(filterfn::Function, tree::AbstractTree)

Returns an iterator over the branches of any tree, where the AbstractBranch is filtered by the function filterfn.

source
Phylo.branchfuture - Function
branchfuture(tree::AbstractTree, node)

Find the branches between a node on a tree and its leaves.

source
Phylo.branchhistory - Function
branchhistory(tree::AbstractTree, node)

Find the branch route between a node on a tree and its root.

source
Phylo.branchiter - Method
branchiter(tree::AbstractTree)

Returns an iterator over the branches of any tree.

source
Phylo.branchnamefilter - Method
branchnamefilter(filterfn::Function, tree::AbstractTree)

Returns an iterator over the names of the branches of any tree, where the AbstractBranch is filtered by the function filterfn.

source
Phylo.branchnameiter - Method
branchnameiter(tree::AbstractTree)

Returns an iterator over the names of branches of any tree.

source
Phylo.branchroute - Function
branchroute(tree::AbstractTree, node1, node2)

Find the branch route between two nodes on a tree.

source
Phylo.createbranch! - Function
createbranch!(tree::AbstractTree, src, dst[, len::Number];
              data)

Add a branch from src to dst on tree with optional length and data. src and dst can be either nodes or node names.

source
Phylo.createnode! - Method
createnode!(tree::AbstractTree[, nodename]; data)

Create a node on a tree with optional node info.

source
Phylo.createnodes! - Function
createnodes!(tree::AbstractTree, count::Integer)
createnodes!(tree::AbstractTree, nodenames)
createnodes!(tree::AbstractTree, nodedict)

Add a number of nodes, a vector with given names, or a Dict with node names and associated node info to a tree.

source
Phylo.deletebranch! - Function
deletebranch!(tree::AbstractTree, branch)
deletebranch!(tree::AbstractTree, src, dst)

Delete the branch branch from tree, or the branch connecting the src node to the dst node.

source
Phylo.distance - Method
distance(tree::AbstractTree, node1, node2)

Distance between two nodes on a tree.

source
Phylo.distances - Method
distances(tree::AbstractTree)

Pairwise distances between all leaf nodes on a tree.

source
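
A small sketch of the two distance functions on a tree parsed from a Newick string, assuming Phylo is installed. The tree and its labels are made up for illustration, and passing node names (rather than node objects) to distance is an assumption based on how the other functions in this reference accept either.

```julia
using Phylo

# Small Newick tree with explicit branch lengths (labels are arbitrary).
tree = parsenewick("((A:1.0,B:2.0)AB:0.5,C:3.0)Root;")

# Path between two tips: A up to AB, then down to B.
d_ab = distance(tree, "A", "B")

# Pairwise distances between all leaves of the tree.
d_all = distances(tree)

println(d_ab)
println(d_all)
```
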
Phylo.droptips! - Function
droptips!(tree::AbstractTree{OneTree}, tips)

Drop tips from the phylogenetic tree tree; the tips to drop are given in tips, a vector of tips or tip names.

source
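
Pruning a tip might look like the sketch below, which assumes Phylo is installed and that tips can be identified by name as the docstring describes. The Newick string is a made-up four-tip tree.

```julia
using Phylo

tree = parsenewick("((A:1.0,B:2.0)AB:0.5,(C:1.0,D:1.0)CD:0.5)Root;")
println(nleaves(tree))        # four tips before pruning

# Remove one tip by name; the tree is modified in place.
droptips!(tree, ["A"])

println(nleaves(tree))
println(getleafnames(tree))
```
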
Phylo.getancestors - Function
getancestors(tree::AbstractTree, node)

Return the names of all of the nodes that are ancestral to this node.

source
Phylo.getbranch - Function
getbranch(tree::AbstractTree, branch)
getbranch(tree::AbstractTree, source, dest)

Returns a branch from a tree by name or by source and destination node.

source
Phylo.getbranches - Function
getbranches(::AbstractTree)

Returns the vector of branches of a single tree, or a Dict of vectors of branches for multiple trees.

source
Phylo.getbranchname - Function
getbranchname(::AbstractTree, branch)
getbranchname(branch)

Returns the branch name associated with a branch from a tree. For some branch types, it will be able to extract the branch name without reference to the tree.

source
Phylo.getbranchnames - Function
getbranchnames(tree::AbstractTree)

Return a vector of branch names of a single tree, or a Dict of vectors of branch names for multiple trees.

source
Phylo.getchildren - Function
getchildren(tree::AbstractTree, node)

Return [the name(s) of] the child node(s) for this node [name].

source
Phylo.getdescendants - Function
getdescendants(tree::AbstractTree, node)

Return the names of all of the nodes that descend from this node.

source
Phylo.getinbound - Function
getinbound(tree::AbstractTree, node)

Return the inbound branch to this node (returns name for node name, branch for node).

source
Phylo.getleafinfo - Function
getleafinfo(::AbstractTree[, label])

Retrieve the leaf info for a leaf of the tree.

source
Phylo.getleafnames - Function
getleafnames(::AbstractTree[, ::TraversalOrder])

Retrieve the leaf names from the tree (in some specific order).

source
Phylo.getleaves - Function
getleaves(::AbstractTree[, ::TraversalOrder])

Retrieve the leaves from the tree.

source
Phylo.getnodedata - Function
getnodedata(::AbstractTree, node)

Retrieve the node data for a node of the tree.

source
Phylo.getnodename - Function
getnodename(::AbstractTree, node)

Returns the node name associated with a node from a tree. For some node types, it will be able to extract the node name without reference to the tree.

source
Phylo.getnodenames - Function
getnodenames(::AbstractTree[, ::TraversalOrder])

Return a vector of node names of a single tree (identified by id for a ManyTrees tree), or a Dict of vectors of node names for multiple trees.

source
Phylo.getnodes - Function
getnodes(::AbstractTree[, ::TraversalOrder])

Returns the vector of nodes of a single tree, or a Dict of vectors of nodes for multiple trees.

source
Phylo.getoutbounds - Function
getoutbounds(tree::AbstractTree, nodename)

Return the names of the outbound branches from this node.

source
Phylo.getparent - Function
getparent(tree::AbstractTree, node)

Return [the name of] the parent node for this node [name]. Second method may not be implemented for some node types.

source
Phylo.getroot - Function
getroot(::AbstractTree)

Returns the root of a single tree (must be only one tree for a ManyTrees tree).

source
Phylo.getroots - Function
getroots(::AbstractTree)
getroots(::AbstractTree, id)

Returns a vector containing the root(s) of a single (OneTree) tree or a set of (ManyTrees) trees.

source
Phylo.gettreeinfo - Function
gettreeinfo(tree::AbstractTree)
gettreeinfo(tree::AbstractTree, treename)

Returns the info data associated with the tree(s).

source
Phylo.hasbranch - Function
hasbranch(tree::AbstractTree, branch)
hasbranch(tree::AbstractTree, source, dest)

Does tree have a branch branch or a branch from source to dest?

source
Phylo.hasheight - Function
hasheight(tree::AbstractTree, node)

Does the node have a height defined?

source
Phylo.hasinbound - Function
hasinbound(tree::AbstractTree, node)

Does the node have an inbound connection?

source
Phylo.hasnode - Method
hasnode(tree::AbstractTree, node)

Returns whether a tree has a given node (or node name) or not.

source
Phylo.hasoutboundspace - Function
hasoutboundspace(tree::AbstractTree, node)

Does the node have space for an[other] outbound connection?

source
Phylo.isinternal - Function
isinternal(tree::AbstractTree, node)

Is the node (referenced by name or node object) internal to the tree (neither root nor leaf)?

source
Phylo.isleaf - Function
isleaf(tree::AbstractTree, node)

Is the node (referenced by name or node object) a leaf of the tree?

source
Phylo.isroot - Function
isroot(tree::AbstractTree, node)

Is the node (referenced by name or node object) a root of the tree?

source
Phylo.isunattached - Function
isunattached(tree::AbstractTree, node)

Is the node (referenced by name or node object) unattached (i.e. not connected to other nodes)?

source
Phylo.keeptips! - Function
keeptips!(tree::AbstractTree{OneTree}, tips)

Keep only those tips of the phylogenetic tree tree that are found in tips, a vector of tips or tip names.

source
Phylo.nbranches - Function
nbranches(::AbstractTree)

Returns the number of branches of a single tree, or a Dict of numbers of branches for multiple trees.

source
Phylo.nleaves - Method
nleaves(::AbstractTree)

Returns the number of leaves (tips) in a tree.

source
Phylo.nnodes - Function
nnodes(::AbstractTree)

Returns the number of nodes of a single tree, or a Dict of numbers of nodes for multiple trees.

source
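
The three counting functions can be checked against a tree whose shape is known in advance. This is a minimal sketch assuming Phylo is installed; the Newick string is a made-up example with three tips, one internal node and a root.

```julia
using Phylo

tree = parsenewick("((A:1.0,B:2.0)AB:0.5,C:3.0)Root;")

println(nleaves(tree))     # tips: A, B and C
println(nnodes(tree))      # the tips plus AB and Root
println(nbranches(tree))   # one inbound branch per non-root node
```
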
Phylo.nodefilter - Method
nodefilter(filterfn::Function, tree::AbstractTree)

Returns an iterator over the nodes of any tree, where the AbstractNode is filtered by the function filterfn.

source
Phylo.nodefuture - Function
nodefuture(tree::AbstractTree, node)

Find the nodes between a node on a tree and its leaves.

source
Phylo.nodehistory - Function
nodehistory(tree::AbstractTree, node)

Find the node route between a node on a tree and its root.

source
Phylo.nodeiter - Method
nodeiter(tree::AbstractTree)

Returns an iterator over the nodes of any tree.

source
Phylo.nodenamefilter - Method
nodenamefilter(filterfn::Function, tree::AbstractTree)

Returns an iterator over the nodenames of any tree, where the AbstractNode itself is filtered by the function filterfn.

source
Phylo.nodenameiter - Method
nodenameiter(tree::AbstractTree)

Returns an iterator over the names of the nodes of any tree.

source
Phylo.nodenametype - Method
nodenametype(::Type{AbstractTree})
nodenametype(::Type{AbstractNode})
nodenametype(::Type{AbstractBranch})

Returns the type of node names from a tree, node or branch type.

source
Phylo.noderoute - Function
noderoute(tree::AbstractTree, node1, node2)

Find the node route between two nodes on a tree.

source
Phylo.nodetype - Method
nodetype(::Type{AbstractTree})

Returns the type of nodes from a tree type.

source
Phylo.nroots - Function
nroots(::AbstractTree)

Returns the number of roots in a tree. For OneTree types, Unrooted trees will return 0, OneRoot trees should return 1, and ManyRoots trees (ones with multiple subtrees) will return the number of subtrees. ManyTrees types will return a Dict of counts of the number of roots for each tree in the set.

source
Phylo.ntrees - Method
ntrees(tree::AbstractTree)

Returns the number of trees in a tree object, 1 for a OneTree tree type, and the count of trees for a ManyTrees type.

source
Phylo.roottype - Method
roottype(::Type{AbstractTree})
roottype(::Type{AbstractNode})
roottype(::Type{AbstractBranch})

Returns the root type from a tree, node or branch type.

source
Phylo.setbranchdata! - Function
setbranchdata!(::AbstractTree, branch, label, value)
setbranchdata!(::AbstractTree, branch, data)

Set the branch data for a branch of the tree.

source
Phylo.setheight! - Function
setheight!(tree::AbstractTree, nodename, height)

Set the height of the node.

source
Phylo.setnodedata! - Function
setnodedata!(::AbstractTree, node, label, value)
setnodedata!(::AbstractTree, node, data)

Set the node data for a node of the tree.

source
Phylo.traversal - Function
traversal(::AbstractTree, ::TraversalOrder)
traversal(::AbstractTree, ::TraversalOrder, init)

Return an iterable object for a tree containing nodes in the given order (preorder, inorder, postorder or breadthfirst), optionally starting from init.

source
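
A short sketch of walking a tree in two of the listed orders, assuming Phylo is installed and that preorder and postorder are exported TraversalOrder values, as the docstring's wording suggests. The Newick tree is a made-up example.

```julia
using Phylo

tree = parsenewick("((A:1.0,B:2.0)AB:0.5,C:3.0)Root;")

# Visit parents before their children (preorder) and record each node's name.
pre = [getnodename(tree, node) for node in traversal(tree, preorder)]

# Visit children before their parents (postorder).
post = [getnodename(tree, node) for node in traversal(tree, postorder)]

println(pre)
println(post)
```

In a rooted tree the root comes first in the preorder listing and last in the postorder one; the relative order of sibling subtrees is left to the package.
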
Phylo.validate! - Method
validate!(tree::AbstractTree)

Validate the tree by making sure that it is connected up correctly.

source