Groups and subsets
One of the most powerful ideas in SpatialEcology is that it lets you create views into all objects (most importantly ComMatrix and Assemblage) based on a subset of species or sites. The object will drop unused species or sites.
Let's say for instance we want to calculate the average range size for each latitudinal band for the dataset of European amphibians.
First, we load the data:
using SpatialEcology, Plots, CSV, DataFrames, Statistics
ENV["GKSwstype"]="nul"
amphdata = CSV.read(joinpath(dirname(pathof(SpatialEcology)), "..", "data", "amph_Europe.csv"), DataFrame)
amph = Assemblage(amphdata[!, 4:end],amphdata[!, 1:3], sitecolumns = false);
Assemblage with 73 species in 1010 sites
Species names:
Salamandra_salamandra, _Calotriton_asper, _Calotriton_arnoldi...Chioglossa_lusitanica, Pleurodeles_waltl
Site names:
1, 2, 3...1009, 1010
Next, let's add the range size of each species to the dataset as a trait:
addtraits!(amph, occupancy(amph), :rangesize)
73-element Array{Int64,1}:
353
11
1
5
2
86
35
419
95
39
⋮
17
100
31
3
7
24
6
16
59
Then let's get all unique latitudes
latitudes = unique(coordinates(amph)[:, 2])
37-element Array{Float64,1}:
46.5
47.5
37.5
38.5
39.5
43.5
44.5
45.5
40.5
36.5
⋮
64.5
65.5
66.5
67.5
35.5
34.5
68.5
69.5
70.5
We can use a simple loop over all the latitudes to generate the relevant subset and calculate the mean range size:
latitude_range = zeros(size(latitudes))
for (i, lat) in enumerate(latitudes)
    sites = findall(==(lat), coordinates(amph)[:,2])
    subset = view(amph, sites = sites)
    latitude_range[i] = mean(subset[:rangesize])
end
scatter(latitudes, latitude_range, xlab = "Latitude", ylab = "Mean range size")
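To make explicit what the loop above computes, here is a minimal sketch on plain arrays, without SpatialEcology. The presence matrix and latitudes are hypothetical toy data, and dropping unused species is imitated by hand with a Boolean mask; this illustrates the idea and is not how the package implements views.

```julia
using Statistics

# Hypothetical toy data: 5 sites x 3 species presence matrix (not the amphibian data).
occ = Bool[1 0 0
           1 1 0
           0 1 0
           0 1 1
           0 0 1]
site_latitude = [46.5, 47.5, 46.5, 47.5, 48.5]

# Range size of a species = number of sites it occupies.
rangesize = vec(sum(occ, dims = 1))        # [2, 3, 2]

latitudes = unique(site_latitude)          # bands in order of first appearance
latitude_range = zeros(length(latitudes))
for (i, lat) in enumerate(latitudes)
    sites = findall(==(lat), site_latitude)        # sites in this band
    present = vec(any(occ[sites, :], dims = 1))    # species occurring in the band
    latitude_range[i] = mean(rangesize[present])   # average over those species only
end

latitude_range   # approximately [2.5, 2.333, 2.0]
```

Species absent from a band do not contribute to its mean, which is what dropping unused species achieves in the subset.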
Subsetting and sampling over a factor is common enough that there is a specialized syntax for this, groupspecies and groupsites. All of the above can be expressed by grouping the assemblage over the second coordinate (latitude):
latitudinal_assemblages = groupsites(amph, coordinates(amph)[:,2], dropspecies = true)
latitude_range = [mean(lat[:rangesize]) for lat in latitudinal_assemblages]
37-element Array{Float64,1}:
255.83333333333334
217.55555555555554
212.57692307692307
190.06060606060606
169.3421052631579
150.88636363636363
164.04255319148936
170.75555555555556
169.6595744680851
178.40425531914894
⋮
628.0
628.0
628.0
680.25
680.25
680.25
673.3333333333334
740.0
740.0
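Grouping over a factor amounts to collecting, for each distinct level, the indices that carry it. A minimal sketch on a plain vector follows; group_indices is a hypothetical helper written for this illustration, not a SpatialEcology function, and the order in which groupsites returns its groups is not implied by it.

```julia
# Hypothetical helper: map each distinct factor level to the indices that carry it.
function group_indices(factor::AbstractVector)
    groups = Dict{eltype(factor), Vector{Int}}()
    for (i, level) in enumerate(factor)
        push!(get!(groups, level, Int[]), i)   # create the group on first sight
    end
    return groups
end

site_latitude = [46.5, 47.5, 46.5, 47.5, 48.5]   # toy data
groups = group_indices(site_latitude)

groups[46.5]   # [1, 3]
groups[48.5]   # [5]
```

Each index vector could then be used as the sites of one subset. A Dict has no guaranteed iteration order, so sort the keys if the order of the bands matters.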
You can also use subsetting to plot a single species:
spec = view(amph, species = ["_Bufo_bufo"])
plot(spec, title = "Common Toad", showempty = true, c = cgrad([:grey, :red], categorical = true))
#Todo make this work without wrapping sp
Note that getindex ([]) will create a view by default; to create a new Assemblage object you can use copy.
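The difference between a view and a copy is the same as for ordinary Julia arrays. The sketch below uses Base arrays only; that Assemblage views share data with their parent in the same way is an assumption of this illustration and is not checked here.

```julia
A = [1 2 3
     4 5 6]

v = view(A, 1:1, :)   # a view: shares memory with A
c = copy(v)           # a copy: an independent array

A[1, 1] = 100         # modify the parent

v[1, 1]   # 100, the view sees the change
c[1, 1]   # 1, the copy does not
```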
Index
DataFrames.aggregate
SpatialEcology.asquantiles
SpatialEcology.asquantiles!
SpatialEcology.dispersionfield
SpatialEcology.groupsites
SpatialEcology.groupspecies
StatsAPI.pairwise
API
SpatialEcology.groupspecies - Function
groupspecies(a::EcoBase.AbstractAssemblage, s::Symbol; kwargs...)
SpatialEcology.groupsites - Function
groupsites(a::EcoBase.AbstractAssemblage, s::Symbol; kwargs...)
DataFrames.aggregate - Function
aggregate(object, grid [, fun])
Aggregate object (either an Assemblage or Locations type) to grid. If object is an Assemblage{PointData} this will grid all points and return an Assemblage{GridData}. grid can be a GridTopology or a single Integer signifying the aggregation factor for already gridded data, or the cell size for point data. fun is an optional function specifying how to lump occurrences. If not specified, the default function is any for Boolean Assemblages and sum for Integer ones.
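The role of fun can be sketched on a plain matrix. The helper lump_blocks below is hypothetical and not part of SpatialEcology; it assumes an aggregation factor of n means lumping square blocks of n x n cells, which may differ from how the package handles coordinates and grid edges.

```julia
# Hypothetical helper: lump factor x factor blocks of a gridded matrix with `fun`.
function lump_blocks(grid::AbstractMatrix, factor::Integer, fun)
    nrow, ncol = size(grid)
    rows = [r:min(r + factor - 1, nrow) for r in 1:factor:nrow]
    cols = [c:min(c + factor - 1, ncol) for c in 1:factor:ncol]
    return [fun(grid[r, c]) for r in rows, c in cols]
end

presence = Bool[1 0 0 0
                0 0 0 0
                0 0 1 1
                0 0 0 1]
counts = [1 0 2 0
          0 3 0 0
          0 0 0 0
          4 0 0 1]

coarse_presence = lump_blocks(presence, 2, any)   # [true false; false true]
coarse_counts   = lump_blocks(counts, 2, sum)     # [4 2; 4 1]
```

With any, a coarse cell is occupied if at least one of its fine cells is; with sum, abundances are added up.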
Utilities
SpatialEcology.dispersionfield - Function
dispersionfield(asm, site)
StatsAPI.pairwise - Function
pairwise(f, x[, y]; symmetric::Bool=false, skipmissing::Symbol=:none)
Return a matrix holding the result of applying f to all possible pairs of entries in iterators x and y. Rows correspond to entries in x and columns to entries in y. If y is omitted then a square matrix crossing x with itself is returned.
As a special case, if f is cor, diagonal cells for which entries from x and y are identical (according to ===) are set to one even in the presence of missing, NaN or Inf entries.
Keyword arguments
symmetric::Bool=false: If true, f is only called to compute for the lower triangle of the matrix, and these values are copied to fill the upper triangle. Only allowed when y is omitted. Defaults to true when f is cor or cov.
skipmissing::Symbol=:none: If :none (the default), missing values in inputs are passed to f without any modification. Use :pairwise to skip entries with a missing value in either of the two vectors passed to f for a given pair of vectors in x and y. Use :listwise to skip entries with a missing value in any of the vectors in x or y; note that this might drop a large part of entries. Only allowed when entries in x and y are vectors.
Examples
julia> using StatsBase, Statistics
julia> x = [1 3 7
            2 5 6
            3 8 4
            4 6 2];
julia> pairwise(cor, eachcol(x))
3×3 Matrix{Float64}:
  1.0        0.744208  -0.989778
  0.744208   1.0       -0.68605
 -0.989778  -0.68605    1.0
julia> y = [1 3 missing
            2 5 6
            3 missing 2
            4 6 2];
julia> pairwise(cor, eachcol(y), skipmissing=:pairwise)
3×3 Matrix{Float64}:
  1.0        0.928571  -0.866025
  0.928571   1.0       -1.0
 -0.866025  -1.0        1.0
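To see what these calls compute, the same matrices can be built by hand with a comprehension, using only the Statistics standard library. This is a sketch of the behaviour described above, not the implementation of pairwise; cor_complete is a hypothetical helper that imitates skipmissing=:pairwise for a single pair of vectors.

```julia
using Statistics

x = [1 3 7
     2 5 6
     3 8 4
     4 6 2]

# Apply cor to every pair of columns: rows index the first argument, columns the second.
xcols = collect(eachcol(x))
result = [cor(a, b) for a in xcols, b in xcols]

# Hypothetical helper: keep only positions where neither vector is missing, then correlate.
function cor_complete(a, b)
    keep = map((u, v) -> !ismissing(u) && !ismissing(v), a, b)
    return cor(Float64.(a[keep]), Float64.(b[keep]))
end

y = [1 3       missing
     2 5       6
     3 missing 2
     4 6       2]

ycols = collect(eachcol(y))
result_missing = [cor_complete(a, b) for a in ycols, b in ycols]
```

Up to floating-point rounding, result and result_missing should agree with the two matrices printed in the examples above.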
pairwise(metric::PreMetric, a::AbstractMatrix, b::AbstractMatrix=a; dims)
Compute distances between each pair of rows (if dims=1) or columns (if dims=2) in a and b according to distance metric. If a single matrix a is provided, compute distances between its rows or columns.
a and b must have the same numbers of columns if dims=1, or of rows if dims=2.
pairwise(metric::PreMetric, a, b=a)
Compute distances between each element of collection a and each element of collection b according to distance metric. If a single iterable a is provided, compute distances between its elements.
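The dims=1 case, with rows as observations, can be sketched in plain Julia. Euclidean distance is written out by hand here as a stand-in for a metric object; no Distances.jl call is used, and the data are hypothetical.

```julia
a = [0.0 0.0
     3.0 4.0
     6.0 8.0]

# Hand-written Euclidean distance, standing in for a metric.
euclid(r, s) = sqrt(sum(abs2, r - s))

# Entry (i, j) is the distance between row i and row j of `a`.
rows = collect(eachrow(a))
d = [euclid(r, s) for r in rows, s in rows]

d[1, 2]   # 5.0
d[1, 3]   # 10.0
```

The result is square and symmetric with a zero diagonal because a is compared with itself; passing a second matrix b would give one row per row of a and one column per row of b.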
SpatialEcology.asquantiles - Function
asquantiles(x, n)
SpatialEcology.asquantiles! - Function
asquantiles!(x, n)