Samples and Features

Microbial profiles are made up of AbstractSamples and AbstractFeatures. Typically, an AbstractSample is an individual biospecimen or other observation, and contains some number of AbstractFeatures, such as taxa or gene functions. AbstractSamples may also contain arbitrary metadata.

Sample types

At its most basic, an AbstractSample simply encodes a name (which should be a unique identifier), and a place to hold metadata. The concrete type MicrobiomeSample is implemented with these two fields, the latter of which is a Dictionary from Dictionaries.jl.

You can instantiate a MicrobiomeSample with just a name (in which case the metadata dictionary will be empty), using keyword arguments for metadata entries, or with existing metadata in the form of a dictionary (with keys of type Symbol) or a NamedTuple.

julia> s1 = MicrobiomeSample("sample1")
MicrobiomeSample("sample1", {})

julia> s2 = MicrobiomeSample("sample2"; age=37)
MicrobiomeSample("sample2", {:age = 37})

julia> s3 = MicrobiomeSample("sample3", Dict(:gender=>"female", :age=>23))
MicrobiomeSample("sample3", {:age = 23, :gender = "female"})
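The NamedTuple form described above can be sketched as follows. This is a minimal example, assuming the two-argument constructor accepts a NamedTuple as stated; the sample names and metadata values are made up for illustration.

```julia
using Microbiome

# Metadata supplied as a NamedTuple; the field names become Symbol keys.
s4 = MicrobiomeSample("sample4", (age = 5, site = "stool"))

# The keyword-argument form should give an equivalent set of metadata.
s5 = MicrobiomeSample("sample5"; age = 5, site = "stool")

println(name(s4), ": age = ", s4[:age], ", site = ", s4[:site])
println(name(s5), ": age = ", s5.age, ", site = ", s5.site)
```

The NamedTuple form is convenient when the metadata already exists as a row of a table, while keyword arguments read better for one-off samples.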

Working with metadata

To change or add metadata, use the same syntax you would use with a Dictionary directly. Note that this differs somewhat from the Dict type in Base Julia:

julia> insert!(s1, :age, 50)
MicrobiomeSample("sample1", {:age = 50})

julia> set!(s3, :gender, "nonbinary")
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary"})

julia> delete!(s3, :gender)
MicrobiomeSample("sample3", {:age = 23})

You can access values of the dictionary using either getindex or getproperty syntax, that is:

julia> s3[:age]
23

julia> s3.age
23

Bulk addition of metadata is also possible, by passing a Dictionary or NamedTuple to set! or insert! (the latter will fail if any of the incoming keys are already present):

julia> insert!(s3, (gender = "nonbinary", genotype="XX"))
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary", :genotype = "XX"})

julia> set!(s3, (genotype="XY", ses=7))
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary", :genotype = "XY", :ses = 7})
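The difference between insert! and set! is easiest to see on a key that already exists. The following is a small sketch, assuming insert! on a sample throws for an existing key in the same way it does for a plain Dictionary; the sample name "demo" is made up, and the exact exception type is not relied upon.

```julia
using Microbiome

s = MicrobiomeSample("demo"; age = 23)

# insert! is expected to throw because :age is already present.
insert_failed = try
    insert!(s, :age, 30)
    false
catch
    true
end

# set! overwrites the existing value (and would add the key if it were absent).
set!(s, :age, 30)

println("insert! failed: ", insert_failed)
println("age after set!: ", s[:age])
```

Using insert! when a key should be new guards against silently overwriting metadata; set! is the better choice when updates are intended.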

Feature Types

AbstractFeature types also have a name, but other fields are optional. Microbiome.jl defines three concrete AbstractFeature types: Taxon, GeneFunction, and Metabolite.

Taxon

The Taxon type contains a name and (optionally) a rank (e.g. :phylum).

julia> ecoli = Taxon("Escherichia_coli", :species)
Taxon("Escherichia_coli", :species)

julia> uncl = Taxon("Unknown_bug")
Taxon("Unknown_bug", missing)

You can access the name and rank fields using name and taxrank respectively, and also check whether the instance has a rank with hasrank, which returns true or false.

julia> hasrank(ecoli)
true

julia> hasrank(uncl)
false

julia> taxrank(ecoli)
:species

julia> taxrank(uncl)
missing

julia> name(ecoli)
"Escherichia_coli"

julia> name(uncl)
"Unknown_bug"

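For compatibility with other tools, converting a Taxon to a String returns the name prefixed with the first letter of the taxonomic rank (or u if the rank is missing) and two underscores. You can convert back using taxon (note the lowercase 't'); passing the string to the Taxon constructor instead keeps the prefix as part of the name: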

julia> String(uncl)
"u__Unknown_bug"

julia> String(ecoli)
"s__Escherichia_coli"

julia> String(ecoli) |> Taxon
Taxon("s__Escherichia_coli", missing)

julia> String(ecoli) |> taxon
Taxon("Escherichia_coli", :species)
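As a quick check of the round trip, the sketch below converts a couple of taxa to strings and parses them back with the lowercase taxon function. The taxa in the list are arbitrary examples, and the sketch assumes the genus prefix is parsed in the same way as the species prefix shown above.

```julia
using Microbiome

taxa = [Taxon("Bacteroides", :genus), Taxon("Escherichia_coli", :species)]

for t in taxa
    s = String(t)      # rank prefix, two underscores, then the name
    back = taxon(s)    # lowercase `taxon` parses the prefix back into a rank
    println(s, " -> ", name(back), " (", taxrank(back), ")")
end
```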

GeneFunction

The GeneFunction type contains a name and (optionally) a Taxon. In addition to providing both a name and Taxon, you can instantiate a GeneFunction with just a name (in which case the taxon will be missing), or with the name of the taxon (in which case it will not have a rank).

julia> gf1 = GeneFunction("gene1")
GeneFunction("gene1", missing)

julia> gf2 = GeneFunction("gene2", "Species_name")
GeneFunction("gene2", Taxon("Species_name", missing))

julia> gf3 = GeneFunction("gene2", Taxon("Species_name", :species))
GeneFunction("gene2", Taxon("Species_name", :species))

You can access or check for various fields using methods similar to those for Taxon:

julia> hastaxon(gf1)
false

julia> hastaxon(gf2)
true

julia> hasrank(gf2)
false

julia> hasrank(gf3)
true

julia> name(gf3)
"gene2"

julia> taxon(gf3)
Taxon("Species_name", :species)

julia> taxrank(gf3)
:species

For compatibility with other tools, converting a GeneFunction that has a Taxon to a String appends the taxon's string form, separated by |. You can convert back using genefunction (note the lowercase 'g' and 'f').

julia> String(gf3)
"gene2|s__Species_name"

julia> genefunction(String(gf3))
GeneFunction("gene2", Taxon("Species_name", :species))
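A typical use is parsing stratified feature names read from another tool's output. The sketch below assumes input strings in the same "gene|s__Taxon" form shown above; the gene identifiers and species are hypothetical placeholders.

```julia
using Microbiome

# Hypothetical stratified feature names in the "gene|s__Taxon" form.
rows = ["K00001|s__Escherichia_coli", "K00002|s__Prevotella_copri"]

# Broadcast the lowercase `genefunction` parser over every string.
gfs = genefunction.(rows)

for gf in gfs
    println(name(gf), " from ", name(taxon(gf)), " (", taxrank(gf), ")")
end
```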

Metabolite

The Metabolite type has a name and, optionally, a commonname, a mass/charge ratio (mz), and a retention time (rt).

julia> m = Metabolite("name", "common", 1., 2.)
Metabolite("name", "common", 1.0, 2.0)

julia> name(m)
"name"

julia> commonname(m)
"common"

julia> masscharge(m)
1.0

julia> retentiontime(m)
2.0

julia> m2 = Metabolite("other name")
Metabolite("other name", missing, missing, missing)
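Because every field other than the name may be missing, downstream code usually needs to check for missing values. The following sketch assumes the four-argument constructor accepts missing for any optional field, as its signature suggests; the feature names and numeric values are made up.

```julia
using Microbiome

mets = [
    Metabolite("feat_001", "propionate", 73.03, 1.2),  # made-up values
    Metabolite("feat_002", missing, 145.05, missing),  # only mz is known
    Metabolite("feat_003"),                            # name only
]

# Keep only the metabolites that have a mass/charge ratio recorded.
with_mz = filter(m -> !ismissing(masscharge(m)), mets)

println([name(m) for m in with_mz])
```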

Types and Methods

Microbiome.MicrobiomeSample - Type
MicrobiomeSample(name::String, metadata::Dictionary{Symbol, T}) <: AbstractSample
MicrobiomeSample(name::String; kwargs...)
MicrobiomeSample(name::String)

Microbiome sample type that includes a name and a Dictionary of arbitrary metadata using Symbols (other than :name or :metadata) as keys.

Metadata can be accessed using getproperty or getindex on the sample itself.

Samples can be instantiated with only a name, leaving the metadata Dictionary empty.

Adding or changing metadata follows the same rules as for the normal Dictionary.

Microbiome.Taxon - Type
Taxon(name::String, rank::Union{Missing, Symbol, Int}) <: AbstractFeature
Taxon(name::String)

Microbial taxon with a name and a rank that can be one of

  0. :domain
  1. :kingdom
  2. :phylum
  3. :class
  4. :order
  5. :family
  6. :genus
  7. :species
  8. :subspecies
  9. :strain

or missing. Constructors can also use the numbers 0-9 in place of the rank symbols, or take a string alone (in which case the rank will be stored as missing).

See also taxon.

Microbiome.name - Function
name(t::Union{AbstractSample, AbstractFeature})

Get the name field from an AbstractSample or AbstractFeature.

Microbiome.hasrank - Function
hasrank(t::Taxon)::Bool

Boolean function that returns true if the rank field in Taxon t is not missing, or false if it is missing.

hasrank(gf::GeneFunction)::Bool

Boolean function that returns:

  • true if gf has a Taxon with a non-missing rank field,
  • false if there's no Taxon, or
  • false if the Taxon has no rank

Microbiome.taxrank - Function
taxrank(t::Union{Taxon, Missing})

Get the rank field from a Taxon t. Returns missing if the rank is not set.

taxrank(gf::GeneFunction)

Get the rank field from the taxon field of a GeneFunction gf if it has one. Returns missing if the taxon or rank is not set.

Microbiome.taxon - Function
taxon(::AbstractString)

Return a Taxon from a string representation. If the string contains taxonomic rank information in the form "x__Thename" where x is the first letter of the rank, this information will be used.

Examples

julia> taxon("Unknown")
Taxon("Unknown", missing)

julia> taxon("s__Prevotella_copri")
Taxon("Prevotella_copri", :species)

taxon(gf::GeneFunction)

Get the taxon field from a GeneFunction, gf. Returns missing if the taxon is not set.

Microbiome.GeneFunction - Type
GeneFunction(name::String, taxon::Union{Taxon, String, Missing}) <: AbstractFeature
GeneFunction(name::String)

Microbial gene function object with optional stratification (taxon).

Microbiome.Metabolite - Type
Metabolite(name::String, commonname::Union{Missing, String}, mz::Union{Missing, Float64}, rt::Union{Missing, Float64}) <: AbstractFeature
Metabolite(name::String)

Represents a small-molecule metabolite measured by LCMS. The fields are

  • name: required; this should be a unique identifier
  • commonname: may refer to a chemical name such as "propionate"
  • mz: the mass/charge ratio
  • rt: the retention time