Microbiome.jl functions
BiobakeryUtils.jl re-exports all of the functionality from Microbiome.jl. The docstrings from that package are reproduced here, but check out the Microbiome.jl docs for more details.
Microbiome.CommunityProfile — Type
CommunityProfile{T, F, S} <: AbstractAbundanceTable{T, F, S}
An AbstractAssemblage from EcoBase.jl that uses a SparseMatrixCSC under the hood.
CommunityProfiles are tables with AbstractFeature-indexed rows and AbstractSample-indexed columns. Note that the names of samples and features can be used for indexing.
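A minimal sketch of building a small profile, assuming the three-argument constructor CommunityProfile(mat, features, samples) accepts a sparse matrix plus vectors of features and samples. The taxon and sample names are made up for illustration.

```julia
using Microbiome
using SparseArrays

# Three genus-level features and two samples (hypothetical names).
taxa = [Taxon("Bacteroides", :genus), Taxon("Prevotella", :genus), Taxon("Blautia", :genus)]
samps = MicrobiomeSample.(["s1", "s2"])

# Rows are features, columns are samples.
mat = sparse([10.0 0.0; 5.0 5.0; 0.0 15.0])

comm = CommunityProfile(mat, taxa, samps)

featurenames(comm)   # names of the row features
samplenames(comm)    # names of the column samples
abundances(comm)     # the underlying sparse matrix (not a copy)
```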
Microbiome.GeneFunction — Type
GeneFunction(name::String, taxon::Union{Taxon, String, Missing}) <: AbstractFeature
GeneFunction(name::String)
Microbial gene function object with optional stratification (taxon).
Microbiome.Metabolite — Type
Metabolite(name::String, commonname::Union{Missing, String}, mz::Union{Missing, Float64}, rt::Union{Missing, Float64}) <: AbstractFeature
Metabolite(name::String)
Represents a small-molecule metabolite coming from an LCMS. The fields are:
name: required; this should be a unique identifier
commonname: might refer to a chemical name like "propionate"
mz: the mass/charge ratio
rt: the retention time
Microbiome.MicrobiomeSample — Type
MicrobiomeSample(name::String, metadata::Dictionary{Symbol, T}) <: AbstractSample
MicrobiomeSample(name::String; kwargs...)
MicrobiomeSample(name::String)
Microbiome sample type that includes a name and a Dictionary of arbitrary metadata using Symbols (other than :name or :metadata) as keys.
Metadata can be accessed using getproperty or getindex on the sample itself.
Samples can be instantiated with only a name, leaving the metadata Dictionary blank.
Adding or changing metadata follows the same rules as for a normal Dictionary.
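The access patterns described above can be sketched as follows. The metadata keys (:age, :visit, :weight) are hypothetical, and the example assumes the keyword constructor stores each keyword as a metadata entry.

```julia
using Microbiome

# Metadata supplied as keyword arguments at construction time.
s = MicrobiomeSample("sample1"; age=37, visit=2)

s.age        # dot syntax (getproperty)
s[:visit]    # bracket syntax (getindex)

# `get` falls back to the default when the key is absent.
w = get(s, :weight, missing)

# A sample created with only a name starts with empty metadata.
s2 = MicrobiomeSample("sample2")
haskey(s2, :age)
```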
Microbiome.Taxon — Type
Taxon(name::String, rank::Union{Missing, Symbol, Int}) <: AbstractFeature
Taxon(name::String)
Microbial taxon with a name and a rank that can be one of
:domain
:kingdom
:phylum
:class
:order
:family
:genus
:species
:subspecies
:strain
or missing. Constructors can also use numbers 0-9, or pass a string alone (in which case the rank will be stored as missing).
See also taxon.
Base.delete! — Method
delete!(as::AbstractSample, prop::Symbol)
Delete a metadata entry of sample as using the Symbol prop if it exists, or throw an error otherwise. If you don't want an error to be thrown if the value does not exist, use unset!.
Base.delete! — Method
delete!(commp::CommunityProfile, sample::AbstractString, prop::Symbol)
Delete a metadata entry in sample from CommunityProfile commp using the Symbol prop if it exists, or throw an error otherwise. If you don't want an error to be thrown if the value does not exist, use unset!.
Base.filter — Method
filter(f, comm::CommunityProfile)
Apply f to the features of comm, and return a copy containing only the features for which f(feature) is true.
Base.get — Function
get(commp::CommunityProfile, sample::AbstractString, key::Symbol, default)
Return the value of the metadata in a sample stored for the given key, or the given default value if no mapping for the key is present.
Base.get — Function
get(commp::CommunityProfile, key::Symbol, default)
Return the value of the metadata in a sample stored for the given key, or the given default value if no mapping for the key is present.
Base.get — Method
get(as::AbstractSample, key::Symbol, default)
Return the value of the metadata in the sample as stored for the given key, or the given default value if no mapping for the key is present.
Base.get — Method
get(t::AbstractSample)
Get the metadata field from an AbstractSample. Note that this is not a copy, so modifications to the returned value will update the parent AbstractSample as well.
Base.get — Method
get(commp::CommunityProfile, cols::AbstractVector{<:Symbol}, default)
Returns an iterator of NamedTuples, one per sample, where keys are :sample and each metadata key found in commp. Samples without given metadata are filled with default.
Returned values can be passed to any Tables.rowtable-compliant type, e.g. DataFrame.
Base.get — Method
get(commp::CommunityProfile)
Returns an iterator of NamedTuples, one per sample, where keys are :sample and each metadata key found in commp. Samples without given metadata are filled with missing.
Returned values can be passed to any Tables.rowtable-compliant type, e.g. DataFrame.
Base.getindex — Method
getindex(as::AbstractSample, prop::Symbol)
Return the prop value in the metadata dictionary of as. This enables using bracket syntax for access, e.g. as[prop].
Base.getproperty — Method
getproperty(as::AbstractSample, prop::Symbol)
Return the prop value in the metadata dictionary of as. This enables using dot syntax for access, e.g. as.prop.
Base.haskey — Method
haskey(as::AbstractSample, key::Symbol)
Determine whether the metadata of sample as has a mapping for a given key. Use !haskey to determine whether sample as doesn't have a mapping for a given key.
Base.haskey — Method
haskey(commp::CommunityProfile, sample::AbstractString, key::Symbol)
Determine whether the metadata of sample in a CommunityProfile commp has a mapping for a given key. Use !haskey to determine whether a sample in a CommunityProfile doesn't have a mapping for a given key.
Base.insert! — Method
insert!(as::AbstractSample, prop::Symbol, val)
Insert a value val into the metadata of sample as using a Symbol prop. An error is thrown if prop already exists. If you don't want an error to be thrown if the value exists, use set!.
Base.insert! — Method
insert!(commp::CommunityProfile, sample::AbstractString, prop::Symbol, val)
Insert a value val into the metadata of sample in a CommunityProfile commp using a Symbol prop. An error is thrown if prop already exists. If you don't want an error to be thrown if the value exists, use set!.
Base.insert! — Method
insert!(cp::CommunityProfile, md; namecol=:sample)
Add metadata (in the form of a Tables.jl table) to a CommunityProfile. One column (namecol) should contain sample names that exist in cp, and other columns should contain metadata that will be added to the metadata of each sample.
Before starting, this will check that every value in every row can be added with insert!, and will throw an error if not. This requires iterating over the metadata table twice, which may be slow. If performance matters, you can use set! instead, though this will overwrite existing data.
Base.keys — Method
keys(as::AbstractSample)
Return an iterator over all keys of the metadata attached to sample as. collect(keys(as)) returns an array of keys.
Base.keys — Method
keys(commp::CommunityProfile, sample::AbstractString)
Return an iterator over all keys of the metadata attached to sample in a CommunityProfile commp. collect(keys(commp, sample)) returns an array of keys.
Dictionaries.set! — Method
set!(as::AbstractSample, prop::Symbol, val)
Update or insert a value val in the metadata of sample as using a Symbol prop. If you want an error to be thrown if the value already exists, use insert!.
Dictionaries.set! — Method
set!(commp::CommunityProfile, sample::AbstractString, prop::Symbol, val)
set!(commp::CommunityProfile, sample::AbstractString, md::Union{AbstractDict, NamedTuple})
Update or insert a value val in the metadata of sample in the CommunityProfile commp using a Symbol prop. If you want an error to be thrown if the value already exists, use insert!.
You can also pass a Dictionary or NamedTuple containing key => value pairs, all of which will be added with set!.
Dictionaries.set! — Method
set!(cp::CommunityProfile, md; namecol=:sample)
Add metadata (in the form of a Tables.jl table) to a CommunityProfile. One column (namecol) should contain sample names that exist in cp, and other columns should contain metadata that will be added to the metadata of each sample.
Dictionaries.unset! — Method
unset!(as::AbstractSample, prop::Symbol)
Delete a metadata entry of sample as using the Symbol prop. If you want an error to be thrown if the value does not exist, use delete!.
Dictionaries.unset! — Method
unset!(commp::CommunityProfile, sample::AbstractString, prop::Symbol)
Delete a metadata entry in sample from CommunityProfile commp using the Symbol prop. If you want an error to be thrown if the value does not exist, use delete!.
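The difference between the strict (insert!, delete!) and lenient (set!, unset!) metadata functions can be sketched on a single sample. The keys :age and :weight are hypothetical, and the example assumes set! and unset! are available after using Microbiome. The exact error type is not stated in the docstrings, so a bare catch is used.

```julia
using Microbiome

s = MicrobiomeSample("sample1")

insert!(s, :age, 37)      # key is new, so this succeeds
set!(s, :age, 38)         # key exists, so the value is overwritten

# insert! refuses to overwrite an existing key.
insert_failed = try
    insert!(s, :age, 39)
    false
catch
    true
end

unset!(s, :weight)        # key is absent; no error is thrown

# delete! requires the key to exist.
delete_failed = try
    delete!(s, :weight)
    false
catch
    true
end

delete!(s, :age)          # key exists, so it is removed
```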
Microbiome.abundances — Function
abundances(at::AbstractAbundanceTable)
Get the underlying sparse matrix of an AbstractAbundanceTable. Note that this does not copy: any modifications to this matrix will update the parent.
Microbiome.braycurtis — Method
braycurtis(abt::AbstractAbundanceTable)
Returns a pairwise Bray-Curtis dissimilarity matrix.
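For orientation, the quantity being computed can be sketched in plain Julia using the standard Bray-Curtis definition. This illustrates the metric and is not the package's implementation. The helper name braycurtis_pair and the toy matrix are hypothetical, and comparing samples column by column is an assumption.

```julia
# Bray-Curtis dissimilarity between two abundance vectors:
# sum of absolute differences divided by the sum of all abundances.
braycurtis_pair(a, b) = sum(abs.(a .- b)) / sum(a .+ b)

# Toy table: rows are features, columns are samples.
mat = [1.0 3.0 0.0;
       2.0 2.0 0.0;
       3.0 1.0 6.0]

n = size(mat, 2)

# Pairwise dissimilarity matrix over samples (columns).
dm = [braycurtis_pair(mat[:, i], mat[:, j]) for i in 1:n, j in 1:n]
```

The result is symmetric with zeros on the diagonal. Here samples 1 and 2 differ by 4 units out of 12 in total, giving a dissimilarity of 1/3.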
Microbiome.commjoin — Method
commjoin(c1::CommunityProfile, comms::CommunityProfile...)
Join multiple CommunityProfiles, creating a new CommunityProfile. For now, sample names cannot overlap in any of the input profiles.
Microbiome.commonname — Method
commonname(m::Metabolite)
Accessor function for the commonname field of a Metabolite.
Microbiome.featurenames — Function
featurenames(at::AbstractAbundanceTable)
Get a vector of feature names from at, equivalent to name.(features(at)).
Microbiome.features — Method
features(at::AbstractAbundanceTable)
Returns the features in at. To get feature names instead, use featurenames.
Microbiome.featuretotals — Method
featuretotals(at::AbstractAbundanceTable)
Returns the sum of each row (feature) in at. Note that the return value is an nfeatures x 1 Matrix, not a Vector. If you need a 1D Vector, use vec(featuretotals(at)).
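The Matrix-versus-Vector distinction is the same one Base Julia shows when summing along a dimension. The sketch below uses a plain matrix. Whether the package calls sum(...; dims=...) internally is an assumption, but the shapes match those described in the docstring.

```julia
mat = [1.0 2.0;
       3.0 4.0;
       5.0 6.0]              # 3 features x 2 samples

rowsums = sum(mat, dims=2)   # 3x1 Matrix, one total per feature
colsums = sum(mat, dims=1)   # 1x2 Matrix, one total per sample

size(rowsums)                # two-dimensional, not a Vector
vec(rowsums)                 # flattened to a plain Vector
vec(colsums)
```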
Microbiome.functionalprofile — Method
functionalprofile(mat, features, samples)
Microbiome.genefunction — Method
genefunction(n::AbstractString)
Make a GeneFunction from a string, converting anything after an initial | to a Taxon.
Microbiome.ginisimpson! — Method
ginisimpson!(abt::AbstractAbundanceTable; overwrite=false)
Adds a :ginisimpson entry to the metadata for each sample in abt with the Gini-Simpson alpha diversity of that sample (see ginisimpson). If overwrite=false (the default), uses insert! to perform this operation, so an error will be thrown if any sample already contains a :ginisimpson entry. Otherwise, uses set!.
Microbiome.ginisimpson — Method
ginisimpson(v::Union{AbstractVector, AbstractSparseMatrix})
ginisimpson(abt::AbstractAbundanceTable, overwrite=false)
Computes the Gini-Simpson alpha diversity metric for a vector. When called on an AbstractAbundanceTable, returns a 1 x nsamples matrix with 1 entry per sample. See also ginisimpson!.
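As a reference point, the Gini-Simpson index is conventionally defined as one minus the sum of squared relative abundances. The plain-Julia sketch below follows that textbook definition. The name ginisimpson_sketch is hypothetical and this is not the package's code.

```julia
# Gini-Simpson index: 1 minus the sum of squared relative abundances.
function ginisimpson_sketch(v)
    p = v ./ sum(v)          # convert counts to proportions
    return 1 - sum(abs2, p)
end

mixed = ginisimpson_sketch([1.0, 1.0, 2.0])    # proportions 0.25, 0.25, 0.5
single = ginisimpson_sketch([5.0, 0.0, 0.0])   # only one feature present
```

A sample dominated by a single feature scores 0, and the value approaches 1 as abundance spreads evenly over many features.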
Microbiome.hasrank — Method
hasrank(gf::GeneFunction)::Bool
Boolean function that returns:
true if gf has a Taxon with a non-missing rank field,
false if there's no Taxon, or
false if the Taxon has no rank
Microbiome.hasrank — Method
hasrank(t::Taxon)::Bool
Boolean function that returns true if the rank field in Taxon t is not missing, or false if it is missing.
Microbiome.hastaxon — Method
hastaxon(gf::GeneFunction)::Bool
Boolean function that returns true if the taxon field in a GeneFunction gf is not missing, or false if it is missing.
Microbiome.hellinger — Method
hellinger(abt::AbstractAbundanceTable)
Returns a pairwise Hellinger distance matrix.
Microbiome.jaccard — Method
jaccard(abt::AbstractAbundanceTable)
Returns a pairwise Jaccard distance matrix.
Microbiome.masscharge — Method
masscharge(m::Metabolite)
Accessor function for the mz field of a Metabolite.
Microbiome.metabolicprofile — Method
metabolicprofile(mat, features, samples)
Microbiome.name — Method
name(t::Union{AbstractSample, AbstractFeature})
Get the name field from an AbstractSample or AbstractFeature.
Microbiome.pcoa — Function
pcoa(abt::AbstractAbundanceTable, f=braycurtis)
Returns eigenvectors from fitting MDS to a distance metric generated by f, by default braycurtis.
Microbiome.present — Function
present(t::Union{Real, Missing}, minabundance::Real=0.0)
present(at::AbstractAbundanceTable, minabundance::Real=0.0)
Check if a given (non-zero) value is greater than or equal to a minimum value. If the minimum abundance is 0, just checks if the value is non-zero.
If used on an AbstractAbundanceTable, returns a sparse boolean matrix of the same size.
Microbiome.prevalence — Function
prevalence(a::AbstractArray{<:Real}, minabundance::Real=0.0)
prevalence(at::AbstractAbundanceTable, minabundance::Real=0.0)
Return the fraction of values that are greater than or equal to a minimum. If the minimum abundance is 0, returns the fraction of non-zero values.
If used on an AbstractAbundanceTable, returns a prevalence value for each feature across the samples.
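The presence rule and the per-feature prevalence described above can be sketched on a plain matrix. The helper names is_present and prevalence_sketch are hypothetical. This mirrors the docstring's wording rather than the package's code.

```julia
# A value counts as present if it is non-zero and at least `minabundance`.
is_present(x, minabundance=0.0) = x != 0 && x >= minabundance

# Fraction of samples (columns) in which each feature (row) is present.
prevalence_sketch(mat, minabundance=0.0) =
    [count(x -> is_present(x, minabundance), row) / length(row) for row in eachrow(mat)]

mat = [1.0 0.0 0.0 2.0;
       0.0 0.0 0.0 0.0;
       3.0 1.0 4.0 1.0]

p_default = prevalence_sketch(mat)        # any non-zero value counts
p_min2 = prevalence_sketch(mat, 2.0)      # only values >= 2.0 count
```

With the default threshold the three features are present in half, none, and all of the samples. Raising the threshold to 2.0 lowers the first and third values.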
Microbiome.prevalence_filter — Method
prevalence_filter(comm::AbstractAbundanceTable; minabundance=0.0, minprevalence=0.05, renorm=false)
Return a filtered CommunityProfile where features with prevalence lower than minprevalence are removed. By default, a feature is considered "present" if its abundance is > 0, but this can be changed by setting minabundance.
Optionally, set renorm = true to calculate relative abundances after low-prevalence features are removed.
Microbiome.rankfilter — Method
rankfilter(comm::AbstractAbundanceTable, cl::Union{Symbol, Int}; keepempty=false)
Return a copy of comm, where only rows that have taxrank(feature) == cl are kept. Use keepempty = true to also keep features that don't have a rank (e.g. "UNIDENTIFIED").
Microbiome.relativeabundance — Function
relativeabundance(at::AbstractAbundanceTable, kind::Symbol=:fraction)
Like relativeabundance!, but does not mutate the original.
Microbiome.relativeabundance! — Method
relativeabundance!(a::AbstractAbundanceTable; kind::Symbol=:fraction)
Normalize each sample in an AbstractAbundanceTable to the sum of the sample.
By default, columns sum to 1.0. Use kind=:percent for columns to sum to 100.
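The normalization amounts to dividing each column by its total. Below is a non-mutating sketch on a plain matrix. The name relab_sketch is hypothetical and this is not the package's implementation.

```julia
# Divide every column (sample) by its total so each column sums to `scale`.
function relab_sketch(mat; kind::Symbol=:fraction)
    scale = kind === :percent ? 100.0 : 1.0
    return mat ./ sum(mat, dims=1) .* scale
end

mat = [1.0 2.0;
       3.0 2.0]

frac = relab_sketch(mat)                  # columns sum to 1.0
pct = relab_sketch(mat; kind=:percent)    # columns sum to 100.0
```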
Microbiome.retentiontime — Method
retentiontime(m::Metabolite)
Accessor function for the rt field of a Metabolite.
Microbiome.runtests — Method
Microbiome.runtests(pattern...; kwargs...)
Equivalent to ReTest.retest(Microbiome, pattern...; kwargs...). This function is defined automatically in any module containing a @testset, possibly nested within submodules.
Microbiome.samplenames — Function
samplenames(at::AbstractAbundanceTable)
Get a vector of sample names from at, equivalent to name.(samples(at)).
Microbiome.samples — Method
samples(at::AbstractAbundanceTable, name::AbstractString)
Returns the sample in at with name name.
Microbiome.samples — Method
samples(at::AbstractAbundanceTable)
Returns the samples in at. To get sample names instead, use samplenames.
Microbiome.sampletotals — Method
sampletotals(at::AbstractAbundanceTable)
Returns the sum of each column (sample) in at. Note that the return value is a 1 x nsamples Matrix, not a Vector. If you need a 1D Vector, use vec(sampletotals(at)).
Microbiome.shannon! — Method
shannon!(abt::AbstractAbundanceTable; overwrite=false)
Adds a :shannon entry to the metadata for each sample in abt with the Shannon alpha diversity of that sample (see shannon). If overwrite=false (the default), uses insert! to perform this operation, so an error will be thrown if any sample already contains a :shannon entry. Otherwise, uses set!.
Microbiome.shannon — Method
shannon(v::Union{AbstractVector, AbstractSparseMatrix})
shannon(abt::AbstractAbundanceTable)
Computes the Shannon alpha diversity metric for a vector. When called on an AbstractAbundanceTable, returns a 1 x nsamples matrix with 1 entry per sample. See also shannon!.
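For reference, Shannon diversity is conventionally the negative sum of p * log(p) over the relative abundances p. The sketch below uses the natural logarithm, which is an assumption since the docstring does not state the base. The name shannon_sketch is hypothetical and this is not the package's code.

```julia
# Shannon diversity: -sum(p * log(p)) over non-zero proportions.
function shannon_sketch(v)
    p = v ./ sum(v)          # convert counts to proportions
    p = filter(>(0), p)      # zero entries contribute nothing to the sum
    return -sum(p .* log.(p))
end

even = shannon_sketch([1.0, 1.0, 1.0, 1.0])   # four equally abundant features
single = shannon_sketch([5.0, 0.0])           # only one feature present
```

For n equally abundant features the value is log(n). A sample with a single feature scores 0.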
Microbiome.taxon — Method
taxon(::AbstractString)
Return a Taxon from a string representation. If the string contains taxonomic rank information in the form "x__Thename" where x is the first letter of the rank, this information will be used.
Examples
julia> taxon("Unknown")
Taxon("Unknown", missing)
julia> taxon("s__Prevotella_copri")
Taxon("Prevotella_copri", :species)
Microbiome.taxon — Method
taxon(gf::GeneFunction)
Get the taxon field from a GeneFunction, gf. Returns missing if the taxon is not set.
Microbiome.taxonomicprofile — Method
taxonomicprofile(mat, features, samples)
Microbiome.taxrank — Method
taxrank(gf::GeneFunction)
Get the rank field from the taxon field of a GeneFunction gf if it has one. Returns missing if the taxon or rank is not set.
Microbiome.taxrank — Method
taxrank(t::Union{Taxon, Missing})
Get the rank field from a Taxon t. Returns missing if the rank is not set.