Samples and Features
Microbial profiles are made up of AbstractSamples and AbstractFeatures. Typically, an AbstractSample is an individual biospecimen or other observation, and contains some number of AbstractFeatures, such as taxa or gene functions. AbstractSamples may also contain arbitrary metadata.
Sample types
At its most basic, an AbstractSample simply encodes a name (which should be a unique identifier), and a place to hold metadata. The concrete type MicrobiomeSample is implemented with these two fields, the latter of which is a Dictionary from Dictionaries.jl.
You can instantiate a MicrobiomeSample with just a name (in which case the metadata dictionary will be empty), using keyword arguments for metadata entries, or with existing metadata in the form of a dictionary (with keys of type Symbol) or a NamedTuple.
julia> s1 = MicrobiomeSample("sample1")
MicrobiomeSample("sample1", {})
julia> s2 = MicrobiomeSample("sample2"; age=37)
MicrobiomeSample("sample2", {:age = 37})
julia> s3 = MicrobiomeSample("sample3", Dict(:gender=>"female", :age=>23))
MicrobiomeSample("sample3", {:age = 23, :gender = "female"})
Working with metadata
To change or add metadata, you can use the same syntax as for working with a Dictionary directly, though note that this differs somewhat from the Dict type in Base Julia:
julia> insert!(s1, :age, 50)
MicrobiomeSample("sample1", {:age = 50})
julia> set!(s3, :gender, "nonbinary")
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary"})
julia> delete!(s3, :gender)
MicrobiomeSample("sample3", {:age = 23})
You can access values of the dictionary using either getindex or getproperty syntax, that is:
julia> s3[:age]
23
julia> s3.age
23
Bulk addition of metadata is also possible, by passing a Dictionary or NamedTuple to set! or insert! (the latter will fail if any of the incoming keys are already present):
julia> insert!(s3, (gender = "nonbinary", genotype="XX"))
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary", :genotype = "XX"})
julia> set!(s3, (genotype="XY", ses=7))
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary", :genotype = "XY", :ses = 7})
Feature Types
AbstractFeature types also have a name, but other fields are optional. Microbiome.jl defines three concrete AbstractFeature types: Taxon, GeneFunction, and Metabolite.
Taxon
The Taxon type contains a name and (optionally) a rank (e.g. :phylum).
julia> ecoli = Taxon("Escherichia_coli", :species)
Taxon("Escherichia_coli", :species)
julia> uncl = Taxon("Unknown_bug")
Taxon("Unknown_bug", missing)
You can access the name and rank fields using name and taxrank respectively, and also check whether the instance has a rank with hasrank, which returns true or false.
julia> hasrank(ecoli)
true
julia> hasrank(uncl)
false
julia> taxrank(ecoli)
:species
julia> taxrank(uncl)
missing
julia> name(ecoli)
"Escherichia_coli"
julia> name(uncl)
"Unknown_bug"
For compatibility with other tools, converting a Taxon to a String returns the name prefixed with the first letter of the taxonomic rank (or u if the rank is missing) and two underscores. You can convert back using taxon (note the lowercase 't'):
julia> String(uncl)
"u__Unknown_bug"
julia> String(ecoli)
"s__Escherichia_coli"
julia> String(ecoli) |> Taxon
Taxon("s__Escherichia_coli", missing)
julia> String(ecoli) |> taxon
Taxon("Escherichia_coli", :species)
GeneFunction
The GeneFunction type contains a name and (optionally) a Taxon. In addition to providing both a name and Taxon, you can instantiate a GeneFunction with just a name (in which case the taxon will be missing), or with the name of the taxon (in which case it will not have a rank).
julia> gf1 = GeneFunction("gene1")
GeneFunction("gene1", missing)
julia> gf2 = GeneFunction("gene2", "Species_name")
GeneFunction("gene2", Taxon("Species_name", missing))
julia> gf3 = GeneFunction("gene2", Taxon("Species_name", :species))
GeneFunction("gene2", Taxon("Species_name", :species))
You can access or check for various fields using similar methods as for Taxon:
julia> hastaxon(gf1)
false
julia> hastaxon(gf2)
true
julia> hasrank(gf2)
false
julia> hasrank(gf3)
true
julia> name(gf3)
"gene2"
julia> taxon(gf3)
Taxon("Species_name", :species)
julia> taxrank(gf3)
:species
For compatibility with other tools, converting a GeneFunction that has a Taxon to a String appends the taxon's string form, separated by |. You can convert back using genefunction (note the lowercase g and f).
julia> String(gf3)
"gene2|s__Species_name"
julia> genefunction(String(gf3))
GeneFunction("gene2", Taxon("Species_name", :species))
Metabolites
The Metabolite type has a name and optionally a commonname, a mass / charge ratio (mz), and retention time (rt).
julia> m = Metabolite("name", "common", 1., 2.)
Metabolite("name", "common", 1.0, 2.0)
julia> name(m)
"name"
julia> commonname(m)
"common"
julia> masscharge(m)
1.0
julia> retentiontime(m)
2.0
julia> m2 = Metabolite("other name")
Metabolite("other name", missing, missing, missing)
Types and Methods
Microbiome.MicrobiomeSample — Type
MicrobiomeSample(name::String, metadata::Dictionary{Symbol, T}) <: AbstractSample
MicrobiomeSample(name::String; kwargs...)
MicrobiomeSample(name::String)
Microbiome sample type that includes a name and a Dictionary of arbitrary metadata using Symbols (other than :name or :metadata) as keys.
Metadata can be accessed using getproperty or getindex on the sample itself.
Samples can be instantiated with only a name, leaving the metadata Dictionary blank.
Adding or changing metadata follows the same rules as for the normal Dictionary.
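As a brief illustration of the constructors and accessors described in this docstring, the sketch below builds a sample from a keyword argument, adds one more entry, and reads the metadata back. It assumes Microbiome.jl is installed and uses only calls that appear elsewhere on this page; the sample name and metadata keys are made up for the example.

```julia
using Microbiome

# Hypothetical sample with one metadata entry supplied as a keyword argument
s = MicrobiomeSample("sampleA"; age=37)

# insert! adds a new key (it fails if the key is already present)
insert!(s, :batch, 2)

# The two access styles return the same value
age_by_index = s[:age]    # getindex syntax
age_by_property = s.age   # getproperty syntax

println(name(s), ": age = ", age_by_index, ", batch = ", s[:batch])
```

Because :name and :metadata are reserved, property access such as s.age only reaches metadata stored under other keys.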
Microbiome.Taxon — Type
Taxon(name::String, rank::Union{Missing, Symbol, Int}) <: AbstractFeature
Taxon(name::String)
Microbial taxon with a name and a rank that can be one of
:domain, :kingdom, :phylum, :class, :order, :family, :genus, :species, :subspecies, :strain
or missing. Constructors can also use numbers 0-9, or pass a string alone (in which case the rank will be stored as missing).
See also taxon.
Microbiome.name — Function
name(t::Union{AbstractSample, AbstractFeature})
Get the name field from an AbstractSample or AbstractFeature.
Microbiome.hasrank — Function
hasrank(t::Taxon)::Bool
Boolean function that returns true if the rank field in Taxon t is not missing, or false if it is missing.
hasrank(gf::GeneFunction)::Bool
Boolean function that returns:
true if gf has a Taxon with a non-missing rank field,
false if there's no Taxon, or
false if the Taxon has no rank
Microbiome.taxrank — Function
taxrank(t::Union{Taxon, missing})
Get the rank field from a Taxon t. Returns missing if the rank is not set.
taxrank(gf::GeneFunction)
Get the rank field from the taxon field of a GeneFunction gf if it has one. Returns missing if the taxon or rank is not set.
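Because taxrank can return missing, comparing its result with == may itself yield missing instead of a Bool. The following sketch, which assumes Microbiome.jl is installed and uses invented taxa, shows two ways to select taxa of a given rank without running into that.

```julia
using Microbiome

taxa = [Taxon("Escherichia_coli", :species),
        Taxon("Escherichia", :genus),
        Taxon("Unknown_bug")]          # rank is missing

# `missing == :species` evaluates to missing rather than false, which
# filter cannot use as a predicate result. Checking hasrank first keeps
# the predicate strictly Boolean.
species = filter(t -> hasrank(t) && taxrank(t) == :species, taxa)

# isequal treats missing as an ordinary value and always returns a Bool
species_alt = filter(t -> isequal(taxrank(t), :species), taxa)

println(name.(species))
```

Either form works; the hasrank guard makes the intent explicit, while isequal is shorter.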
Microbiome.taxon — Function
taxon(::AbstractString)
Return a Taxon from a string representation. If the string contains taxonomic rank information in the form "x__Thename" where x is the first letter of the rank, this information will be used.
Examples
julia> taxon("Unknown")
Taxon("Unknown", missing)
julia> taxon("s__Prevotella_copri")
Taxon("Prevotella_copri", :species)
taxon(gf::GeneFunction)
Get the taxon field from a GeneFunction, gf. Returns missing if the taxon is not set.
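To make the "x__Thename" convention read by taxon(::AbstractString) concrete, here is a stand-alone sketch in Base Julia. It is not the implementation used by Microbiome.jl: the helper split_rank_prefix and the table RANK_LETTERS are hypothetical, and the letter-to-rank mapping is an assumption that covers only :domain through :species (it does not try to guess how :subspecies or :strain are encoded).

```julia
# Hypothetical helper, NOT Microbiome.jl's implementation.
# Assumed mapping from prefix letter to rank; 's' => :species follows the
# "s__Prevotella_copri" example, the other letters are an assumption.
const RANK_LETTERS = Dict('d' => :domain, 'k' => :kingdom, 'p' => :phylum,
                          'c' => :class,  'o' => :order,   'f' => :family,
                          'g' => :genus,  's' => :species)

function split_rank_prefix(s::AbstractString)
    m = match(r"^([a-z])__(.+)$", s)
    # No prefix: keep the whole string as the name, with no rank
    m === nothing && return (name = String(s), rank = missing)
    letter = first(m[1])
    # Letters outside the table (such as 'u') give a missing rank
    return (name = String(m[2]), rank = get(RANK_LETTERS, letter, missing))
end

println(split_rank_prefix("s__Prevotella_copri"))
println(split_rank_prefix("Unknown"))
```

In practice, taxon itself should be preferred; the sketch is only meant to show what information the prefix carries.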
Microbiome.GeneFunction — Type
GeneFunction(name::String, taxon::Union{Taxon, String, Missing}) <: AbstractFeature
GeneFunction(name::String)
Microbial gene function object with optional stratification (taxon).
Microbiome.genefunction — Function
genefunction(n::AbstractString)
Make a GeneFunction from a string, converting anything after an initial | to a Taxon.
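The "name|taxon" layout that genefunction reads can be illustrated with a few lines of Base Julia. This is a hypothetical helper (split_genefunction is not part of Microbiome.jl) that only separates the two parts at the first | and leaves the taxon part as a plain string.

```julia
# Hypothetical helper, not Microbiome.jl's implementation: split a
# stratified gene function string at the first '|'.
function split_genefunction(s::AbstractString)
    parts = split(s, '|'; limit = 2)
    gene = String(parts[1])
    # Without a '|' there is no stratification, so the taxon part is missing
    taxon_str = length(parts) == 2 ? String(parts[2]) : missing
    return (gene = gene, taxon = taxon_str)
end

println(split_genefunction("gene2|s__Species_name"))
println(split_genefunction("gene1"))
```

Using limit = 2 means that any further | characters stay inside the taxon part rather than producing extra pieces.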
Microbiome.Metabolite — Type
Metabolite(name::String, commonname::Union{Missing, String}, mz::Union{Missing, Float64}, rt::Union{Missing, Float64}) <: AbstractFeature
Metabolite(name::String)
Represents a small-molecule metabolite coming from an LCMS. The fields are
name: required, this should be a unique identifier
commonname: might refer to a chemical name like "propionate"
mz: The mass/charge ratio
rt: The retention time
Microbiome.commonname — Function
commonname(m::Metabolite)
Accessor function for the commonname field of a Metabolite.
Microbiome.masscharge — Function
masscharge(m::Metabolite)
Accessor function for the mz field of a Metabolite.
Microbiome.retentiontime — Function
retentiontime(m::Metabolite)
Accessor function for the rt field of a Metabolite.
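Since every Metabolite field other than name may be missing, code that uses these accessors usually needs to handle that case. A minimal sketch, assuming Microbiome.jl is installed and reusing the two metabolites from the examples above:

```julia
using Microbiome

metabolites = [Metabolite("name", "common", 1.0, 2.0),
               Metabolite("other name")]   # all optional fields are missing

# Keep only metabolites with a recorded mass/charge ratio
with_mz = filter(m -> !ismissing(masscharge(m)), metabolites)

# Fall back to the required name when no common name is set
labels = [coalesce(commonname(m), name(m)) for m in metabolites]

println(labels)
```

coalesce returns its first non-missing argument, so the required name acts as a default label.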