Samples and Features
Microbial profiles are made up of AbstractSamples and AbstractFeatures. Typically, an AbstractSample is an individual biospecimen or other observation, and contains some number of AbstractFeatures, such as taxa or gene functions. AbstractSamples may also contain arbitrary metadata.
Sample types
At its most basic, an AbstractSample simply encodes a name (which should be a unique identifier) and a place to hold metadata. The concrete type MicrobiomeSample is implemented with these two fields, the latter of which is a Dictionary from Dictionaries.jl.
You can instantiate a MicrobiomeSample with just a name (in which case the metadata dictionary will be empty), using keyword arguments for metadata entries, or with existing metadata in the form of a dictionary (with keys of type Symbol) or a NamedTuple.
julia> s1 = MicrobiomeSample("sample1")
MicrobiomeSample("sample1", {})
julia> s2 = MicrobiomeSample("sample2"; age=37)
MicrobiomeSample("sample2", {:age = 37})
julia> s3 = MicrobiomeSample("sample3", Dict(:gender=>"female", :age=>23))
MicrobiomeSample("sample3", {:age = 23, :gender = "female"})
Working with metadata
To change or add metadata, you can use the same syntax as when working with a Dictionary directly, though note that this is a bit different from the Dict type in Base Julia:
julia> insert!(s1, :age, 50)
MicrobiomeSample("sample1", {:age = 50})
julia> set!(s3, :gender, "nonbinary")
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary"})
julia> delete!(s3, :gender)
MicrobiomeSample("sample3", {:age = 23})
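The difference from Dict is easiest to see on a bare Dictionary. The following is a minimal sketch that uses Dictionaries.jl directly rather than a MicrobiomeSample; the variable names are illustrative only. It shows that insert! only adds new keys and throws if the key already exists, while set! adds or overwrites.

```julia
using Dictionaries

# An empty metadata-like dictionary with Symbol keys (illustrative, not a MicrobiomeSample)
d = Dictionary{Symbol, Any}()

insert!(d, :age, 50)          # adds a new key
set!(d, :age, 51)             # overwrites the existing value
set!(d, :gender, "female")    # set! also adds keys that are not yet present

# insert! refuses to overwrite: it throws when the key already exists
threw = try
    insert!(d, :age, 52)
    false
catch
    true
end

delete!(d, :gender)           # removes an existing key
```

After this runs, `threw` is `true`, `d[:age]` is still `51`, and `:gender` is no longer among `keys(d)`.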
You can access values of the dictionary using either getindex or getproperty syntax, that is:
julia> s3[:age]
23
julia> s3.age
23
Bulk addition of metadata is also possible, by passing a Dictionary or NamedTuple to set! or insert! (the latter will fail if any of the incoming keys are already present):
julia> insert!(s3, (gender = "nonbinary", genotype="XX"))
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary", :genotype = "XX"})
julia> set!(s3, (genotype="XY", ses=7))
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary", :genotype = "XY", :ses = 7})
Feature Types
AbstractFeature types also have a name, but other fields are optional. Microbiome.jl defines three concrete AbstractFeature types: Taxon, GeneFunction, and Metabolite.
Taxon
The Taxon type contains a name and (optionally) a rank (e.g., :phylum).
julia> ecoli = Taxon("Escherichia_coli", :species)
Taxon("Escherichia_coli", :species)
julia> uncl = Taxon("Unknown_bug")
Taxon("Unknown_bug", missing)
You can access the name and rank fields using name and taxrank respectively, and also check whether the instance has a rank with hasrank, which returns true or false.
julia> hasrank(ecoli)
true
julia> hasrank(uncl)
false
julia> taxrank(ecoli)
:species
julia> taxrank(uncl)
missing
julia> name(ecoli)
"Escherichia_coli"
julia> name(uncl)
"Unknown_bug"
For compatibility with other tools, converting a Taxon to a String will return the name prepended with the first letter of the taxonomic rank (or u if the rank is missing) and two underscores. You can convert back using taxon (note the lowercase 't'):
julia> String(uncl)
"u__Unknown_bug"
julia> String(ecoli)
"s__Escherichia_coli"
julia> String(ecoli) |> Taxon
Taxon("s__Escherichia_coli", missing)
julia> String(ecoli) |> taxon
Taxon("Escherichia_coli", :species)
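To make the "x__Name" convention concrete, here is a minimal sketch in plain Julia that splits such a string into a name and a rank. It is not the package's implementation: the helper split_rank_prefix and the table PREFIX_TO_RANK are hypothetical names introduced for illustration, and the table deliberately covers only ranks whose first letter is unambiguous.

```julia
# Hypothetical lookup table (an assumption for this sketch, not part of Microbiome.jl).
# Only ranks with an unambiguous first letter are included.
const PREFIX_TO_RANK = Dict(
    "d" => :domain, "k" => :kingdom, "p" => :phylum, "c" => :class,
    "o" => :order,  "f" => :family,  "g" => :genus,  "s" => :species,
)

# Hypothetical helper: split "x__Name" into (name, rank).
# The rank is `missing` when there is no prefix or the prefix is not in the table.
function split_rank_prefix(s::AbstractString)
    m = match(r"^([a-z])__(.+)$", s)
    m === nothing && return (String(s), missing)
    rank = get(PREFIX_TO_RANK, String(m[1]), missing)
    return (String(m[2]), rank)
end

with_rank = split_rank_prefix("s__Escherichia_coli")
no_prefix = split_rank_prefix("Unknown")
unknown_prefix = split_rank_prefix("u__Unknown_bug")
```

In practice, prefer the exported taxon function shown above; the sketch only illustrates why passing the prefixed string to the Taxon constructor keeps the prefix as part of the name.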
GeneFunction
The GeneFunction type contains a name and (optionally) a Taxon. In addition to providing both a name and Taxon, you can instantiate a GeneFunction with just a name (in which case the taxon will be missing), or with the name of the taxon as a string (in which case the taxon will not have a rank).
julia> gf1 = GeneFunction("gene1")
GeneFunction("gene1", missing)
julia> gf2 = GeneFunction("gene2", "Species_name")
GeneFunction("gene2", Taxon("Species_name", missing))
julia> gf3 = GeneFunction("gene2", Taxon("Species_name", :species))
GeneFunction("gene2", Taxon("Species_name", :species))
You can access or check for various fields using methods similar to those for Taxon:
julia> hastaxon(gf1)
false
julia> hastaxon(gf2)
true
julia> hasrank(gf2)
false
julia> hasrank(gf3)
true
julia> name(gf3)
"gene2"
julia> taxon(gf3)
Taxon("Species_name", :species)
julia> taxrank(gf3)
:species
For compatibility with other tools, converting a GeneFunction to a String will, if it has a Taxon, include the taxon name separated by |. Converting back can be done using genefunction (note the lowercase g and f).
julia> String(gf3)
"gene2|s__Species_name"
julia> genefunction(String(gf3))
GeneFunction("gene2", Taxon("Species_name", :species))
Metabolites
The Metabolite type has a name and optionally a commonname, a mass/charge ratio (mz), and a retention time (rt).
julia> m = Metabolite("name", "common", 1., 2.)
Metabolite("name", "common", 1.0, 2.0)
julia> name(m)
"name"
julia> commonname(m)
"common"
julia> masscharge(m)
1.0
julia> retentiontime(m)
2.0
julia> m2 = Metabolite("other name")
Metabolite("other name", missing, missing, missing)
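Because all three feature types share the name accessor, they can be mixed in one collection. The following is a small sketch that uses only the constructors and accessors shown above, and assumes Microbiome.jl is installed; the variable names are illustrative.

```julia
using Microbiome

# A mixed collection of the three concrete feature types
features = [
    Taxon("Escherichia_coli", :species),
    Taxon("Unknown_bug"),
    GeneFunction("gene1"),
    GeneFunction("gene2", Taxon("Species_name", :species)),
    Metabolite("name", "common", 1., 2.),
]

# `name` works on every feature type
feature_names = name.(features)

# Keep only taxa that carry a rank
taxa = filter(f -> f isa Taxon, features)
ranked_taxa = filter(hasrank, taxa)

# Keep only gene functions that are stratified by a taxon
stratified = filter(f -> f isa GeneFunction && hastaxon(f), features)
```

Here `ranked_taxa` holds only the Escherichia_coli taxon and `stratified` holds only gene2.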
Types and Methods
Microbiome.MicrobiomeSample (Type)
MicrobiomeSample(name::String, metadata::Dictionary{Symbol, T}) <: AbstractSample
MicrobiomeSample(name::String; kwargs...)
MicrobiomeSample(name::String)
Microbiome sample type that includes a name and a Dictionary of arbitrary metadata using Symbols (other than :name or :metadata) as keys.
Metadata can be accessed using getproperty or getindex on the sample itself.
Samples can be instantiated with only a name, leaving the metadata Dictionary blank.
Adding or changing metadata follows the same rules as for the normal Dictionary.
Microbiome.Taxon (Type)
Taxon(name::String, rank::Union{Missing, Symbol, Int}) <: AbstractFeature
Taxon(name::String)
Microbial taxon with a name and a rank that can be one of
:domain
:kingdom
:phylum
:class
:order
:family
:genus
:species
:subspecies
:strain
or missing. Constructors can also use numbers 0-9, or pass a string alone (in which case the rank will be stored as missing).
See also taxon.
Microbiome.name (Function)
name(t::Union{AbstractSample, AbstractFeature})
Get the name field from an AbstractSample or AbstractFeature.
Microbiome.hasrank (Function)
hasrank(t::Taxon)::Bool
Boolean function that returns true if the rank field in Taxon t is not missing, or false if it is missing.
hasrank(gf::GeneFunction)::Bool
Boolean function that returns:
true if gf has a Taxon with a non-missing rank field,
false if there's no Taxon, or
false if the Taxon has no rank
Microbiome.taxrank (Function)
taxrank(t::Union{Taxon, Missing})
Get the rank field from a Taxon t. Returns missing if the rank is not set.
taxrank(gf::GeneFunction)
Get the rank field from the taxon field of a GeneFunction gf if it has one. Returns missing if the taxon or rank is not set.
Microbiome.taxon (Function)
taxon(::AbstractString)
Return a Taxon from a string representation. If the string contains taxonomic rank information in the form "x__Thename", where x is the first letter of the rank, this information will be used.
Examples
julia> taxon("Unknown")
Taxon("Unknown", missing)
julia> taxon("s__Prevotella_copri")
Taxon("Prevotella_copri", :species)
taxon(gf::GeneFunction)
Get the taxon field from a GeneFunction, gf. Returns missing if the taxon is not set.
Microbiome.GeneFunction (Type)
GeneFunction(name::String, taxon::Union{Taxon, String, Missing}) <: AbstractFeature
GeneFunction(name::String)
Microbial gene function object with optional stratification (taxon).
Microbiome.genefunction (Function)
genefunction(n::AbstractString)
Make a GeneFunction from a string, converting anything after an initial | to a Taxon.
Microbiome.Metabolite (Type)
Metabolite(name::String, commonname::Union{Missing, String}, mz::Union{Missing, Float64}, rt::Union{Missing, Float64}) <: AbstractFeature
Metabolite(name::String)
Represents a small-molecule metabolite coming from an LCMS. The fields are
name: required, this should be a unique identifier
commonname: might refer to a chemical name like "propionate"
mz: the mass/charge ratio
rt: the retention time
Microbiome.commonname (Function)
commonname(m::Metabolite)
Accessor function for the commonname field of a Metabolite.
Microbiome.masscharge (Function)
masscharge(m::Metabolite)
Accessor function for the mz field of a Metabolite.
Microbiome.retentiontime (Function)
retentiontime(m::Metabolite)
Accessor function for the rt field of a Metabolite.