Microbiome.jl functions

BiobakeryUtils.jl re-exports all of the functionality from Microbiome.jl. The docstrings from that package are reproduced here, but check out the Microbiome.jl docs for more details.

Microbiome.CommunityProfile - Type
CommunityProfile{T, F, S} <: AbstractAbundanceTable{T, F, S}

An AbstractAssemblage from EcoBase.jl that uses a SparseMatrixCSC under the hood.

CommunityProfiles are tables with AbstractFeature-indexed rows and AbstractSample-indexed columns. Note that the names of samples and features can also be used for indexing.

Microbiome.GeneFunction - Type
GeneFunction(name::String, taxon::Union{Taxon, String, Missing}) <: AbstractFeature
GeneFunction(name::String)

Microbial gene function object with optional stratification (taxon).

Microbiome.Metabolite - Type
Metabolite(name::String, commonname::Union{Missing, String}, mz::Union{Missing, Float64}, rt::Union{Missing, Float64}) <: AbstractFeature
Metabolite(name::String)

Represents a small-molecule metabolite measured by LC-MS. The fields are:

  • name: required, this should be a unique identifier
  • commonname: might refer to a chemical name like "propionate"
  • mz: the mass/charge ratio
  • rt: the retention time

Microbiome.MicrobiomeSample - Type
MicrobiomeSample(name::String, metadata::Dictionary{Symbol, T}) <: AbstractSample
MicrobiomeSample(name::String; kwargs...)
MicrobiomeSample(name::String)

Microbiome sample type that includes a name and a Dictionary of arbitrary metadata using Symbols (other than :name or :metadata) as keys.

Metadata can be accessed using getproperty or getindex on the sample itself.

Samples can be instantiated with only a name, leaving the metadata Dictionary empty.

Adding or changing metadata follows the same rules as for the normal Dictionary.
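
As a minimal sketch of the metadata rules above, the following starts from a name-only sample and then adds, reads, updates, and removes entries. The sample name and metadata values are made up for illustration, and the snippet assumes Microbiome.jl is installed.

```julia
using Microbiome

# Hypothetical sample; it starts with an empty metadata Dictionary.
s = MicrobiomeSample("sample1")

insert!(s, :age, 37)            # add a new entry; errors if :age already exists
set!(s, :age, 38)               # update or insert; does not error on existing keys
insert!(s, :diet, "omnivore")

s.age                           # dot access via getproperty
s[:diet]                        # bracket access via getindex

unset!(s, :diet)                # remove an entry; no error if it is absent
get(s, :diet, missing)          # fall back to a default for absent keys
```

The insert!/set! and delete!/unset! pairs differ only in whether they throw when the key already exists (insert!) or does not exist (delete!).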

Microbiome.Taxon - Type
Taxon(name::String, rank::Union{Missing, Symbol, Int}) <: AbstractFeature
Taxon(name::String)

Microbial taxon with a name and a rank that can be one of

  1. :domain
  2. :kingdom
  3. :phylum
  4. :class
  5. :order
  6. :family
  7. :genus
  8. :species
  9. :subspecies
  10. :strain

or missing. Constructors can also use numbers 0-9 (0 for :domain through 9 for :strain), or pass a string alone (in which case the rank will be stored as missing).

See also taxon.

Base.delete! - Method
delete!(as::AbstractSample, prop::Symbol)

Delete a metadata entry of sample as using the Symbol prop if it exists, or throw an error otherwise. If you don't want an error to be thrown if the value does not exist, use unset!.

Base.delete! - Method
delete!(commp::CommunityProfile, sample::AbstractString, prop::Symbol)

Delete a metadata entry in sample from CommunityProfile commp using the Symbol prop if it exists, or throw an error otherwise. If you don't want an error to be thrown if the value does not exist, use unset!.

Base.filter - Method
filter(f, comm::CommunityProfile)

Apply f to the features of comm, and return a copy where f(feature) is true.

Base.get - Function
get(commp::CommunityProfile, sample::AbstractString, key::Symbol, default)

Return the value of the metadata in a sample stored for the given key, or the given default value if no mapping for the key is present.

Base.get - Function
get(commp::CommunityProfile, key::Symbol, default)

Return the value of the metadata stored for the given key in each sample of commp, or the given default value for samples where no mapping for the key is present.

Base.get - Method
get(as::AbstractSample, key::Symbol, default)

Return the value of the metadata in the sample as stored for the given key, or the given default value if no mapping for the key is present.

Base.get - Method
get(t::AbstractSample)

Get the metadata field from an AbstractSample. Note that this is not a copy, so modifications to the returned value will update the parent AbstractSample as well.

Base.get - Method
get(commp::CommunityProfile, cols::AbstractVector{<:Symbol}, default)

Returns an iterator of NamedTuples, one per sample, where the keys are :sample and each metadata key found in commp. Samples without a given metadata key are filled with default.

The returned values can be passed to any Tables.rowtable-compliant type, e.g. DataFrame.

Base.get - Method
get(commp::CommunityProfile)

Returns an iterator of NamedTuples, one per sample, where the keys are :sample and each metadata key found in commp. Samples without a given metadata key are filled with missing.

The returned values can be passed to any Tables.rowtable-compliant type, e.g. DataFrame.

Base.getindex - Method
getindex(as::AbstractSample, prop::Symbol)

Return the prop value in the metadata dictionary of as. This enables using bracket syntax for access, e.g. as[prop].

Base.getproperty - Method
getproperty(as::AbstractSample, prop::Symbol)

Return the prop value in the metadata dictionary of as. This enables using dot syntax for access, e.g. as.prop.

Base.haskey - Method
haskey(as::AbstractSample, key::Symbol)

Determine whether the metadata of sample as has a mapping for a given key. Use !haskey to determine whether the sample doesn't have a mapping for a given key.

Base.haskey - Method
haskey(commp::CommunityProfile, sample::AbstractString, key::Symbol)

Determine whether the metadata of sample in a CommunityProfile commp has a mapping for a given key. Use !haskey to determine whether a sample in a CommunityProfile doesn't have a mapping for a given key.

Base.insert! - Method
insert!(as::AbstractSample, prop::Symbol, val)

Insert a value val into the metadata of sample as using a Symbol prop, throwing an error if prop already exists. If you don't want an error to be thrown if the value exists, use set!.

Base.insert! - Method
insert!(commp::CommunityProfile, sample::AbstractString, prop::Symbol, val)

Insert a value val into the metadata of sample in a CommunityProfile commp using a Symbol prop, throwing an error if prop already exists. If you don't want an error to be thrown if the value exists, use set!.

Base.insert! - Method
insert!(cp::CommunityProfile, md; namecol=:sample)

Add metadata (in the form of a Tables.jl table) to a CommunityProfile. One column (namecol) should contain sample names that exist in cp, and other columns should contain metadata that will be added to the metadata of each sample.

Before starting, this will check that every value in every row is insert!able, and will throw an error if not. This requires iterating over the metadata table twice, which may be slow. If performance matters, you can use set! instead, though this will overwrite existing data.

Base.keys - Method
keys(as::AbstractSample)

Return an iterator over all keys of the metadata attached to sample as. collect(keys(as)) returns an array of keys.

Base.keys - Method
keys(commp::CommunityProfile, sample::AbstractString)

Return an iterator over all keys of the metadata attached to sample in a CommunityProfile commp. collect(keys(commp, sample)) returns an array of keys.

Dictionaries.set! - Method
set!(as::AbstractSample, prop::Symbol, val)

Update or insert a value val to the metadata of sample as using a Symbol prop. If you want an error to be thrown if the value already exists, use insert!.

Dictionaries.set! - Method
set!(commp::CommunityProfile, sample::AbstractString, prop::Symbol, val)
set!(commp::CommunityProfile, sample::AbstractString, md::Union{AbstractDict, NamedTuple})

Update or insert a value val to the metadata of sample in the CommunityProfile commp using a Symbol prop. If you want an error to be thrown if the value already exists, use insert!.

A dictionary or NamedTuple containing key => value pairs can also be passed, in which case every pair will be set!.

Dictionaries.set! - Method
set!(cp::CommunityProfile, md; namecol=:sample)

Add metadata (in the form of a Tables.jl table) to a CommunityProfile. One column (namecol) should contain sample names that exist in cp, and other columns should contain metadata that will be added to the metadata of each sample.

Dictionaries.unset! - Method
unset!(as::AbstractSample, prop::Symbol)

Delete a metadata entry of sample as using the Symbol prop. If you want an error to be thrown if the value does not exist, use delete!.

Dictionaries.unset! - Method
unset!(commp::CommunityProfile, sample::AbstractString, prop::Symbol)

Delete a metadata entry in sample from CommunityProfile commp using the Symbol prop. If you want an error to be thrown if the value does not exist, use delete!.

Microbiome.abundances - Function
abundances(at::AbstractAbundanceTable)

Get the underlying sparse matrix of an AbstractAbundanceTable. Note that this does not copy - any modifications to this matrix will update the parent.

Microbiome.braycurtis - Method
braycurtis(abt::AbstractAbundanceTable)

Returns a pairwise Bray-Curtis dissimilarity matrix.
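
For orientation, the Bray-Curtis dissimilarity between two samples a and b is commonly defined as sum(|a_i - b_i|) / sum(a_i + b_i). The plain-Julia sketch below applies that definition to every pair of columns of a matrix. It assumes non-negative abundances with samples in columns; braycurtis_pair and braycurtis_sketch are hypothetical helpers for illustration, not the package's implementation.

```julia
# Bray-Curtis dissimilarity between two abundance vectors.
# Returns NaN if both vectors are entirely zero (0 / 0).
braycurtis_pair(a, b) = sum(abs.(a .- b)) / sum(a .+ b)

# Pairwise dissimilarities between the columns (samples) of `mat`.
function braycurtis_sketch(mat::AbstractMatrix{<:Real})
    n = size(mat, 2)
    return [braycurtis_pair(view(mat, :, i), view(mat, :, j)) for i in 1:n, j in 1:n]
end

mat = [1.0 0.0;
       1.0 2.0]          # 2 features x 2 samples
d = braycurtis_sketch(mat)
d[1, 2]                  # 0.5: |1-0| + |1-2| = 2, divided by (1+0) + (1+2) = 4
```

The result is a symmetric nsamples x nsamples matrix with zeros on the diagonal.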

Microbiome.commjoin - Method
commjoin(c1::CommunityProfile, comms::CommunityProfile...)

Join multiple CommunityProfiles, creating a new CommunityProfile. For now, sample names cannot overlap in any of the input profiles.

Microbiome.featurenames - Function
featurenames(at::AbstractAbundanceTable)

Get a vector of feature names from at, equivalent to name.(features(at)).

Microbiome.featuretotals - Method
featuretotals(at::AbstractAbundanceTable)

Returns the sum of each row (feature) in at. Note that the return value is an nfeatures x 1 Matrix, not a Vector. If you need a 1D Vector, use vec(featuretotals(at)).

Microbiome.ginisimpson! - Method
ginisimpson!(abt::AbstractAbundanceTable; overwrite=false)

Adds a :ginisimpson entry to the metadata for each sample in abt with the Gini-Simpson alpha diversity of that sample (see ginisimpson). If overwrite=false (the default), uses insert! to perform this operation, so an error will be thrown if any sample already contains a :ginisimpson entry. Otherwise, uses set!.

Microbiome.ginisimpson - Method
ginisimpson(v::Union{AbstractVector, AbstractSparseMatrix})
ginisimpson(abt::AbstractAbundanceTable)

Computes the Gini-Simpson alpha diversity metric for a vector. When called on an AbstractAbundanceTable, returns a 1 x nsamples matrix with 1 entry per sample. See also ginisimpson!.
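
As a point of reference, the Gini-Simpson index is usually defined as 1 - sum(p_i^2), where p_i is the proportion of feature i in the sample. The sketch below implements that textbook definition in plain Julia. ginisimpson_sketch is a hypothetical name, and the code is not claimed to match the package's implementation line for line; it assumes a non-zero total.

```julia
# Gini-Simpson index of a single sample: 1 minus the sum of squared proportions.
function ginisimpson_sketch(v::AbstractVector{<:Real})
    total = sum(v)          # assumed non-zero
    p = v ./ total          # convert counts or abundances to proportions
    return 1 - sum(abs2, p)
end

ginisimpson_sketch([1, 1, 1, 1])   # four equally abundant features: 1 - 4 * 0.25^2 = 0.75
ginisimpson_sketch([5, 0, 0, 0])   # a single feature: 0.0, i.e. no diversity
```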

Microbiome.hasrank - Method
hasrank(gf::GeneFunction)::Bool

Boolean function that returns:

  • true if gf has a Taxon with a non-missing rank field,
  • false if there's no Taxon, or
  • false if the Taxon has no rank

Microbiome.hasrank - Method
hasrank(t::Taxon)::Bool

Boolean function that returns true if the rank field in Taxon t is not missing, or false if it is missing.

Microbiome.hastaxon - Method
hastaxon(gf::GeneFunction)::Bool

Boolean function that returns true if the taxon field in a GeneFunction gf is not missing, or false if it is missing.
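
The three cases distinguished by hastaxon and hasrank can be sketched as follows. The gene and taxon names are hypothetical, chosen only for illustration, and the snippet assumes Microbiome.jl is installed.

```julia
using Microbiome

# Hypothetical features covering the three cases.
stratified   = GeneFunction("gene1", Taxon("Prevotella_copri", :species))
unranked     = GeneFunction("gene2", Taxon("Unknown"))
unstratified = GeneFunction("gene3")

hastaxon(stratified), hasrank(stratified)       # taxon with a rank
hastaxon(unranked), hasrank(unranked)           # taxon present, rank missing
hastaxon(unstratified), hasrank(unstratified)   # no taxon at all
```

Only the first feature has both a taxon and a rank; the second has a taxon but no rank, and the third has neither.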

Microbiome.hellinger - Method
hellinger(abt::AbstractAbundanceTable)

Returns a pairwise Hellinger distance matrix.

Microbiome.jaccard - Method
jaccard(abt::AbstractAbundanceTable)

Returns a pairwise Jaccard distance matrix.

Microbiome.name - Method
name(t::Union{AbstractSample, AbstractFeature})

Get the name field from an AbstractSample or AbstractFeature.

Microbiome.pcoa - Function
pcoa(abt::AbstractAbundanceTable, f=braycurtis)

Returns eigenvectors from fitting MDS to the distance matrix generated by f, by default braycurtis.

Microbiome.present - Function
present(t::Union{Real, Missing}, minabundance::Real=0.0)
present(at::AbstractAbundanceTable, minabundance::Real=0.0)

Check if a given (non-zero) value is greater than or equal to a minimum value. If the minimum abundance is 0, just checks if value is non-zero.

If used on an AbstractAbundanceTable, returns a sparse boolean matrix of the same size.

Microbiome.prevalence - Function
prevalence(a::AbstractArray{<:Real}, minabundance::Real=0.0)
prevalence(at::AbstractAbundanceTable, minabundance::Real=0.0)

Return the fraction of values that are greater than or equal to a minimum. If the minimum abundance is 0, returns the fraction of non-zero values.

If used on an AbstractAbundanceTable, returns a prevalence value for each feature across the samples.
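
The presence and prevalence rules described above can be sketched in plain Julia on an ordinary matrix with features in rows and samples in columns. present_sketch and prevalence_sketch are hypothetical helpers that mirror the documented behavior for non-negative values; they are not the package's own code.

```julia
# A value counts as present if it is non-zero and at least `minabundance`.
present_sketch(x::Real, minabundance::Real=0.0) = x != 0 && x >= minabundance

# Fraction of entries in `values` that count as present.
prevalence_sketch(values, minabundance::Real=0.0) =
    count(x -> present_sketch(x, minabundance), values) / length(values)

mat = [0.0 0.2 0.5;
       0.0 0.0 0.1]      # 2 features x 3 samples

per_feature = [prevalence_sketch(row) for row in eachrow(mat)]         # 2/3 and 1/3
with_cutoff = [prevalence_sketch(row, 0.3) for row in eachrow(mat)]    # 1/3 and 0.0
```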

Microbiome.prevalence_filter - Method
prevalence_filter(comm::AbstractAbundanceTable; minabundance=0.0, minprevalence=0.05, renorm=false)

Return a filtered CommunityProfile where features with prevalence lower than minprevalence are removed. By default, a feature is considered "present" if its abundance is > 0, but this can be changed by setting minabundance.

Optionally, set renorm = true to calculate relative abundances after low prevalence features are removed.
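
A rough sketch of this filtering logic on a plain matrix (features in rows, samples in columns) is shown below. prevalence_filter_sketch is a hypothetical helper that reuses the keyword names from the signature above; it only illustrates the described behavior and is not the package's implementation.

```julia
function prevalence_filter_sketch(mat::AbstractMatrix{<:Real};
                                  minabundance=0.0, minprevalence=0.05, renorm=false)
    # Prevalence of each feature: fraction of samples where it is present.
    prev = [count(x -> x != 0 && x >= minabundance, row) / length(row) for row in eachrow(mat)]
    kept = mat[prev .>= minprevalence, :]          # drop low-prevalence features
    # Optionally rescale each sample (column) to sum to 1 after filtering.
    # A column that sums to zero would produce NaN here.
    return renorm ? kept ./ sum(kept, dims=1) : kept
end

mat = [0.1 0.2 0.5;
       0.0 0.0 0.1;
       0.3 0.2 0.4]      # the second feature is present in only 1 of 3 samples

filtered = prevalence_filter_sketch(mat; minprevalence=0.5, renorm=true)
size(filtered)           # (2, 3): the second feature was removed
```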

Microbiome.rankfilter - Method
rankfilter(comm::AbstractAbundanceTable, cl::Union{Symbol, Int}; keepempty=false)

Return a copy of comm, where only rows that have taxrank(feature) == cl are kept. Use keepempty = true to also keep features that don't have a rank (e.g. "UNIDENTIFIED").

Microbiome.relativeabundance! - Method
relativeabundance!(a::AbstractAbundanceTable; kind::Symbol=:fraction)

Normalize each sample in AbstractAbundanceTable to the sum of the sample.

By default, columns sum to 1.0. Use kind=:percent for columns to sum to 100.
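
The normalization itself amounts to dividing each column by its sum. A minimal in-place sketch on a plain floating-point matrix follows; relativeabundance_sketch! is a hypothetical helper that reuses the kind keyword from the signature above and assumes every column has a non-zero sum.

```julia
function relativeabundance_sketch!(mat::AbstractMatrix{<:AbstractFloat}; kind::Symbol=:fraction)
    kind in (:fraction, :percent) || throw(ArgumentError("kind must be :fraction or :percent"))
    scale = kind == :percent ? 100.0 : 1.0
    for col in eachcol(mat)          # each column is one sample
        total = sum(col)             # assumed non-zero
        col .= col ./ total .* scale # overwrite the column in place
    end
    return mat
end

mat = [1.0 2.0;
       3.0 2.0]
relativeabundance_sketch!(mat)
sum(mat, dims=1)                     # each column now sums to 1.0
```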

Microbiome.runtests - Method
Microbiome.runtests(pattern...; kwargs...)

Equivalent to ReTest.retest(Microbiome, pattern...; kwargs...). This function is defined automatically in any module containing a @testset, possibly nested within submodules.

Microbiome.samplenames - Function
samplenames(at::AbstractAbundanceTable)

Get a vector of sample names from at, equivalent to name.(samples(at)).

Microbiome.samples - Method
samples(at::AbstractAbundanceTable, name::AbstractString)

Returns the sample in at whose name is name.

Microbiome.sampletotals - Method
sampletotals(at::AbstractAbundanceTable)

Returns the sum of each column (sample) in at. Note that the return value is a 1 x nsamples Matrix, not a Vector. If you need a 1D Vector, use vec(sampletotals(at)).
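
The shapes returned by featuretotals and sampletotals match what sum with a dims argument gives on an ordinary matrix. The sketch below uses a plain Matrix as a stand-in for the abundance table (an assumption for illustration), with features in rows and samples in columns.

```julia
mat = [1.0 0.0 2.0;
       3.0 4.0 0.0]               # 2 features (rows) x 3 samples (columns)

feature_sums = sum(mat, dims=2)   # 2x1 Matrix, one total per feature
sample_sums  = sum(mat, dims=1)   # 1x3 Matrix, one total per sample

vec(sample_sums)                  # flatten to a 1D Vector
```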

Microbiome.shannon! - Method
shannon!(abt::AbstractAbundanceTable; overwrite=false)

Adds a :shannon entry to the metadata for each sample in abt with the Shannon alpha diversity of that sample (see shannon). If overwrite=false (the default), uses insert! to perform this operation, so an error will be thrown if any sample already contains a :shannon entry. Otherwise, uses set!.

Microbiome.shannon - Method
shannon(v::Union{AbstractVector, AbstractSparseMatrix})
shannon(abt::AbstractAbundanceTable)

Computes the Shannon alpha diversity metric for a vector. When called on an AbstractAbundanceTable, returns a 1 x nsamples matrix with 1 entry per sample. See also shannon!.
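
For reference, the Shannon index is usually defined as H = -sum(p_i * log(p_i)) over the non-zero proportions p_i of a sample. The plain-Julia sketch below implements that definition. shannon_sketch is a hypothetical name, and the use of the natural logarithm is an assumption of this sketch rather than something stated in the docstring.

```julia
# Shannon index of a single sample, using the natural logarithm (an assumption).
function shannon_sketch(v::AbstractVector{<:Real})
    total = sum(v)
    total == 0 && return 0.0                # an empty sample has no diversity
    p = [x / total for x in v if x > 0]     # skip zeros: 0 * log(0) is treated as 0
    return -sum(x * log(x) for x in p)
end

shannon_sketch([1, 1, 1, 1])   # four equally abundant features: log(4)
shannon_sketch([2, 2, 0, 0])   # two equally abundant features: log(2)
```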

Microbiome.taxon - Method
taxon(::AbstractString)

Return a Taxon from a string representation. If the string contains taxonomic rank information in the form "x__Thename" where x is the first letter of the rank, this information will be used.

Examples

julia> taxon("Unknown")
Taxon("Unknown", missing)

julia> taxon("s__Prevotella_copri")
Taxon("Prevotella_copri", :species)

Microbiome.taxrank - Method
taxrank(gf::GeneFunction)

Get the rank field from the taxon field of a GeneFunction gf if it has one. Returns missing if the taxon or rank is not set.

Microbiome.taxrank - Method
taxrank(t::Union{Taxon, Missing})

Get the rank field from a Taxon t. Returns missing if the rank is not set.