Working with GBIF data

In this example, we will see how the SimpleSDMLayers and GBIF.jl packages can work together. Specifically, we will plot the relationship between temperature and precipitation for a few occurrences of the kingfisher Megaceryle alcyon.

using SimpleSDMLayers
using GBIF
using Plots
using Statistics
temperature, precipitation = SimpleSDMPredictor(WorldClim, BioClim, [1,12])
2-element Array{SimpleSDMPredictor{Float32},1}:
 SDM predictor → 1080×2160 grid with 808053 Float32-valued cells
 SDM predictor → 1080×2160 grid with 808053 Float32-valued cells

We can get some occurrences for the taxon of interest:

kingfisher = GBIF.taxon("Megaceryle alcyon", strict=true)
kf_occurrences = occurrences(kingfisher, "hasCoordinate" => "true", "decimalLatitude" => (0.0, 65.0), "decimalLongitude" => (-180.0, -50.0), "limit" => 200)

for i in 1:4
  occurrences!(kf_occurrences)
end

@info kf_occurrences
[ Info: GBIF records: downloaded 1000 out of 100000

We can then extract the temperature for the first occurrence:

temperature[kf_occurrences[1]]
18.931166f0
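
To give an intuition for what such a lookup involves, here is a minimal sketch of how a coordinate could be mapped to a cell of a regular global grid of the size reported above (1080×2160, i.e. six cells per degree). This is not the implementation used by SimpleSDMLayers: the function name cell_index, its keyword arguments, and the convention of counting rows from the southern edge are all assumptions made for illustration.

```julia
# Hypothetical helper, not part of SimpleSDMLayers: maps a latitude and
# longitude (in degrees) to a (row, column) pair on a regular global grid.
function cell_index(lat, lon; nrows=1080, ncols=2160)
    # Number of cells per degree along each axis (6 for a 1080×2160 grid)
    rows_per_degree = nrows / 180
    cols_per_degree = ncols / 360
    # Assumed convention: rows start at the southern edge (-90),
    # columns start at the western edge (-180)
    row = floor(Int, (lat + 90) * rows_per_degree) + 1
    col = floor(Int, (lon + 180) * cols_per_degree) + 1
    # Points lying exactly on the northern or eastern edge go to the last cell
    return (clamp(row, 1, nrows), clamp(col, 1, ncols))
end

cell_index(45.52, -73.6)  # (814, 639)
```

Indexing a layer with an occurrence spares us this bookkeeping, since the record already carries its own latitude and longitude.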

Of course, it would be unwieldy to do this for every occurrence in our dataset, so we will see a way to do it much faster. But first, since we do not need the entire surface of the planet for our analysis, we will clip the layers to the extent of the occurrences:

temperature_clip = clip(temperature, kf_occurrences)
precipitation_clip = clip(precipitation, kf_occurrences)
SDM predictor → 289×738 grid with 75299 Float32-valued cells
  Latitudes	(13.083333333333334, 61.083333333333336)
  Longitudes	(-172.58333333333334, -49.75)

This will make future queries a little faster. By default, the clip function will add a 5% margin on every side. To get the values of a layer at every occurrence in a GBIFRecords collection, we simply pass the records as a position:

histogram2d(temperature_clip, precipitation_clip, c=:viridis)
scatter!(temperature_clip[kf_occurrences], precipitation_clip[kf_occurrences], lab="", c=:white, msc=:orange)

Indexing a layer with a GBIFRecords collection returns the layer values for all geo-localized occurrences (i.e. those for which neither the latitude nor the longitude is missing), as an array of the eltype of the layer. Note that the layer values can be nothing, in which case you might need to run filter(!isnothing, temperature_clip[kf_occurrences]) for them to work with the plotting functions.
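
When two layers are queried for the same scatter plot, filtering each vector on its own could leave them with different lengths, because a value may be missing in one layer but not the other. The sketch below drops a position whenever either value is nothing. The vectors temp_values and prec_values are made-up stand-ins for temperature_clip[kf_occurrences] and precipitation_clip[kf_occurrences], and all variable names are illustrative.

```julia
# Made-up values standing in for the results of indexing each layer with
# the occurrences; `nothing` marks a cell without data.
temp_values = [18.9f0, nothing, 12.4f0, 7.1f0]
prec_values = [1012.0f0, 850.0f0, nothing, 640.0f0]

# Keep a position only if both layers have a value there
keep = [!isnothing(t) && !isnothing(p) for (t, p) in zip(temp_values, prec_values)]

# Convert to plain Float32 vectors, which plotting functions handle directly
temp_ok = Float32[t for t in temp_values[keep]]
prec_ok = Float32[p for p in prec_values[keep]]
```

The two resulting vectors have the same length and can be passed together to scatter!.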

We can also plot the records over space, using the overloads of the latitudes and longitudes functions:

contour(temperature_clip, c=:alpine, title="Temperature", frame=:box, fill=true)
scatter!(longitudes(kf_occurrences), latitudes(kf_occurrences), lab="", c=:white, msc=:orange, ms=2)