A Spatial Tale of Two Dolphin Species and Many Data Sources

Valencia - January 20, 2026

Sara Martino

sara.martino@ntnu.no

Dept. of Mathematical Science, NTNU

Trondheim and NTNU

Introduction

Goal: Understand the spatial distribution of wild species

How do we collect data?

How do we collect data?

Traditional data sources \(\longrightarrow\) go out and search for dolphins!!

How do we collect data?

Traditional data sources \(\longrightarrow\) go out and search for dolphins!!

The observation process introduces a bias…

… but we know the searching protocol…

How do we collect data?

Traditional data sources \(\longrightarrow\) go out and search for dolphins!!
- The observation process introduces a bias…
- … but we know the searching protocol…
- ..therefore, we can correct for such bias!! 😀

How do we collect data?

Traditional data sources \(\longrightarrow\) go out and search for dolphins!!
- The observation process introduces a bias…
- … but we know the searching protocol…
- ..therefore, we can correct for such bias!! 😀

BUT These data are expensive and limited in space and time! 😔

Can we also use data from other sources ❓

The Data

All our data are presence-only
We want to merge all data sources….
…accounting for each specific bias!

The Data

Data Type	Years	N. Campaigns	Sightings (Stenella)	Sightings (Tursiope)
FERRY	2007–2018	311	133	16
UNIRM	2017–2019	73	14	98
Social media	2008–2019	❓	136	465

Notes:

We have many “Social media” data
We have both a “Spatial” and a “Temporal” bias!!

The Data

Sightings for bottlenose and striped dophins

Statistical Tools

Statistical tools

Log Gaussian Cox Processes (presence only data)
- SPDE representation of Gaussian fields
- Inference using INLA

Thinned point process (observation bias)
- Detection function
- Needs more than just INLA \(\longrightarrow\) inlabru

Joint modeling (merging of all data sources)
- Easy with INLA+inlabru

Log Gaussian Cox Processes

We observe \(N\) points in the domain \(\Omega\).
Given the intensity \(\lambda(s)\) the likelihood is given by \[ \pi(Y|\lambda) =\exp \left\{ |\Omega| - \int_{\Omega} \lambda(s) ds\right\}\prod_{i = 1}^N\lambda(s_i) \]
The log-intensity is a Gaussian process \[ \log(\lambda(s)) = Z(s); \ Z(s)\sim \text{GP} \]
‼️ Not analytically tractable ‼️

Implementation - Grid discretization

Discretize the domain into a grid
\(N_{ij}\) = # of observation in cell \((i,j)\)
\(N_{ij} \sim \text{Pois}(\Lambda_{ij})\) where \[\Lambda_{ij} = \int_{s_{ij}}\lambda(s)ds \approx |s_{ij}|\exp(z_{ij})\]
Possible models for \(Z(s)\):
- Continuous GF \(\longrightarrow\) dense covariance matrix
- GMRF \(\longrightarrow\) sparse covariance matrix
NOTE The grid serves both to approximate the latent Gaussian field and to approximate the likelihood

Implementation - SPDE Approach¹

Use the SPDE approach over a mesh to represent the GF

\[ Z(s) = \sum_{i = 1}^nz_i\phi_i(s) \]

Approximate the GF
Do not need to approximate the observation location
Efficient computationally
Fits into the INLA framework!

Thinned point process

The true intensity: \(\lambda(s)\) (..this is what we are interested in!)
The thinned intensity \[ \lambda^{\star}(s) = \lambda(s)g(s) \]
- \(g(s)\) is the thinning (detection) function
- Unless \(g(s)\) is log-linear in all parameters the INLA framework will not work!
- inlabru is an extention of INLA that allowes for non linear terms

INLA-friendly models

Latent Gaussian Models

Likelihood

\[ \pi(\mathbf{y}|\mathbf{u},\theta) = \prod_{i = 1}^n\pi(y_i|u_i,\theta) \]

INLA-friendly models

Latent Gaussian Models

Likelihood

\[ \pi(\mathbf{y}|\mathbf{u},\theta) = \prod_{i = 1}^n\pi(y_i|u_i,\theta) \] 2. Latent field

Gaussianity

\[ \mathbf{u}|\theta\sim\mathcal{N}(0,\mathbf{Q}^{-1}(\theta)) \]

Linearity of the predictor

\[ \eta = \mathbf{A}^T\mathbf{u} \]

INLA-friendly models

Latent Gaussian Models

Likelihood

\[ \pi(\mathbf{y}|\mathbf{u},\theta) = \prod_{i = 1}^n\pi(y_i|u_i,\theta) \] 2. Latent field

Gaussianity

\[ \mathbf{u}|\theta\sim\mathcal{N}(0,\mathbf{Q}^{-1}(\theta)) \]

Linearity of the predictor

\[ \eta = \mathbf{A}^T\mathbf{u} \]

Hyper-prior \(\theta\sim\pi(\theta)\)

Thinned LGCP

Remember:

We assume that \(\log(\lambda(s))\) is linear wrt the latent Gaussian field

The log-observed intensity is then:

\[ \log(\lambda^{\star}(s)) = \log(\lambda(s)) + \log(g(s)) \]

This is linear only if

\[ \log(g(s)) \]

is linear…. which is often not the case!

Can we bring these models into the family of models that are amenable to INLA?

Yes (up to a certain degree) and the method is implemented in the inlabru library

`inlabru`: linearizing the non-linear term

Let \(\tilde{\eta}(\mathbf{u})\) be the non-linear predictor, and let \(\bar{\eta}(\mathbf{u})\) be the 1st order Taylor approximation at some \(\mathbf{u}_0\) \[ \bar{\eta}(\mathbf{u}) = \tilde{\eta}(\mathbf{u}_0) + B(\mathbf{u} - \mathbf{u}_0) = [\tilde{\eta}(\mathbf{u}_0)-B\mathbf{u}_0] + B\mathbf{u} \]

where \(B\) is the derivative matrix for the non-linear predictor, evaluated at \(\mathbf{u}_0\).

The non-linear observation model \(\tilde{\pi}(\mathbf{y}|\mathbf{u}, θ)\) is approximated by \[ \bar{\pi}(\mathbf{y}|\mathbf{u}, θ) = \pi(\mathbf{y}|\bar{\eta}(\mathbf{u}), θ)\approx\pi(\mathbf{y}|\tilde{\eta}(\mathbf{u}), θ) = \tilde{\pi}(\mathbf{y}|\mathbf{u}, θ) \] This model can be fit with INLA.

Iterative INLA algorithm

Start from a point \(\mathbf{u}_0^{(0)}\)
Repeat until convergence
- Compute the first order Taylor approximation for \(\tilde{\eta}(\mathbf{u})\) \[ \bar{\eta}(\mathbf{u}) = [\tilde{\eta}(\mathbf{u}_0)-B\mathbf{u}_0] + B\mathbf{u} \]
- Fit the model with INLA
- Compute \[ \begin{aligned} \hat{\theta}^{(k)} & = \text{argmax}_{\theta}\bar{\pi}(\theta|\mathbf{y}; \mathbf{u}_0^{{(k-1)}}) \\ \hat{\mathbf{u}}^{(k)} & = \text{argmax}_{\mathbf{u}}\bar{\pi}(\mathbf{y}| \hat{\theta}; \mathbf{u}_0^{{(k-1)}}) \\ \end{aligned} \]
- choose the next linearisation point as the point \(\mathbf{u}^{(k)}\) as \[ \mathbf{u}^{(k)} = (1-\alpha)\mathbf{u}^{(k-1)} + \alpha \mathbf{u}^{(k-1)} \]
for some \(\alpha >0\) that minimises a norm of the difference between the linearised and non-linear predictors.

Some notes

Upon convergence (\(|\tilde{\eta}(\mathbf{u}_*) - \bar{\eta}(\mathbf{u}_*)| = 0\)), the overall posterior approximation accuracy depends on the degree of nonlinearity in the vicinity of \(\mathbf{u}_*\)
Some methods exist to check the approximation accuracy.
Great care has to be taken when defining prior for the non-linear terms!
More details in the inlabru ArXiv paper¹!

Modeling the intensity

The “true” (unthinned) intensity: \[ \lambda(s) = \beta_0 + \beta^TX(s) + \sum_k f_k(x_k(s)) + u(s) \]
- \(u(s)\) is a GRF with Matern correlation function
- could be spatio-temporal but would need more data!
The observed intensity:

\[ \lambda_j^{\star}(s) = t_j \ \lambda(s)\ g_j(s);\qquad j = 1,\dots \]

where

\(\lambda(s)\) is the true density
\(g_j(s)\) is the thinning function for observation process \(j\)
\(t_j\) is a time-scaling factor ( this is known for all observations processes except for the social data!)

Accounting for spatial bias: detection functions

For FERRY data \[ g_{\text{ferry}}(s) = \exp\left(-\frac{1}{\sigma^2_{\text{ferry}}} d(s)^2\right) \] where \(d(s)\) is the perpendicular distance to the transect

Accounting for spatial bias: detection functions

For FERRY data \[ g_{\text{ferry}}(s) = \exp\left(-\frac{1}{\sigma^2_{\text{ferry}}} d(s)^2\right) \] where \(d(s)\) is the perpendicular distance to the transect
For UNIRM data \[ g_{unirm}(s) = \begin{cases} 1 \text{ for } d(s)< K\\ 0 \text{ for } d(s)> K\\ \end{cases} \]

How about the SOCIAL MEDIA data?

Distance from the coastline

Assume that the closer to the coast there are more small boats…. higher detection probability close to the coast

This in is not necessary true (people like islands a lot!)
This is also a covariate often used to model species density

\[ g_{\tiny{SOCIAL\_COAST}}(s) = \exp\left\{-\frac{d(s)^2}{\xi_{\tiny{FERRY}}^2}\right\} \]

EMODnet data for boat density

EmodNET (European Marine Observation and Data Network) records boats using AIS (Automatic Identification System, mandatory above 15m length)

Detection probability is higher where boat intensity is higher
Does not consider small boats which are often those reporting sightings

\[ g_{\tiny{SOCIAL\_EMODNET}}(s) = \Phi\left\{\frac{d(s)^2}{\xi_{\tiny{EMODNET}}^2}- \mu_{\tiny{EMODNET}} \right\} \]

Putting things togeter

The true intensity:

\[ \lambda(s) = \beta_0 + \beta X(s) + u(s); \qquad u(s)\sim GRF(\rho, \sigma^2_u) \]

The observed intensities:

\[ \begin{aligned} \lambda_{\tiny{FERRY}}(s) &= t_{\tiny{FERRY}} \ \lambda(s)\ g_{\tiny{FERRY}}(s);\\ \lambda_{\tiny{UNIRM}}(s) &= t_{\tiny{UNIRM}} \ \lambda(s)\ g_{\tiny{UNIRM}}(s);\\ \lambda_{\tiny{SOCIAL}}(s) &= t_{\tiny{SOCIAL}} \ \lambda(s)\ g_{\tiny{SOCIAL\_??}}(s);\\ \end{aligned} \]

Four choices for \(g_{\small{SOCIAL}}(s)\):

No (constant) detection \(g_{\tiny{SOCIAL\_BASELINE}}(s) = 1\) (benchmark)
Detection based on distance from the coastline \(g_{\tiny{SOCIAL\_COAST}}(s)\)
Detection based on boat intensity \(g_{\tiny{SOCIAL\_EMODNET}}(s)\)
Detection based on sightings intensity \(g_{\tiny{SOCIAL\_ANIMAL}}(s)\)

Is the model identifiable?

Low sightings intensity can result from:

There are no animals in the area
There are no observer in the area
How to solve this?
- Gather information about the observation process
- Use informative prior to “guide” inference

Priors and posteriors for the parameters in the detection functions

\(g_{\tiny{SOCIAL\_COAST}}(s)\)

\(g_{\tiny{SOCIAL\_EMODNET}}(s)\)

\(g_{\tiny{SOCIAL\_ANIMAL}}(s)\)

Results

Reconstructed intensity surface (Stenella)

Posterior Median

RWPCI

\[ \text{RWPCI} = (Q_3-Q_1)/Q_2 \]

Reconstructed intensity surface (Tursiope)

Posterior Median

RWPCI

\[ \text{RWPCI} = (Q_3-Q_1)/Q_2 \]

Summary and conclusions

Complex but very topical problem
What can we do:
- Model several data sources jointly
- Correct for the bias induced by the observation process
- Recover known covariate effects
- Estimate intensity surface with associated uncertainty
INLA + inlabru give a huge model flexibility….with great power comes great responsibility, models should be carefully considered!

A Spatial Tale of Two Dolphin Species and Many Data Sources

Trondheim and NTNU

Trondheim and NTNU

Trondheim and NTNU

Introduction

Introduction

How do we collect data?

How do we collect data?

How do we collect data?

How do we collect data?

Social Data

Social Data

Social Data

The Data

The Data

The Data

Statistical Tools

Statistical tools

Log Gaussian Cox Processes

Implementation - Grid discretization

Implementation - SPDE Approach1

Thinned point process

Thinned point process

INLA-friendly models

INLA-friendly models

INLA-friendly models

Thinned LGCP

inlabru: linearizing the non-linear term

Iterative INLA algorithm

Some notes

Modeling the intensity

Modeling the intensity

Accounting for spatial bias: detection functions

Accounting for spatial bias: detection functions

Modeling spatial bias for social data

Distance from the coastline

EMODnet data for boat density

Social data sightings for all species

Putting things togeter

Is the model identifiable?

Priors and posteriors for the parameters in the detection functions

Results

Reconstructed intensity surface (Stenella)

Reconstructed intensity surface (Tursiope)

Summary and conclusions

Implementation - SPDE Approach¹

`inlabru`: linearizing the non-linear term