What is the difference between first-order and second-order stationarity?

First-order stationarity requires a constant mean across the domain. Second-order stationarity additionally requires that the covariance between any two locations depends only on their separation distance and direction, not on absolute coordinates — implying constant variance and a well-defined, bounded semivariogram.

Can I use ordinary kriging on a non-stationary dataset?

Ordinary kriging assumes at minimum the intrinsic hypothesis (stationary increments). If your data show a deterministic trend, detrend first and krige the residuals, or switch to universal kriging, which models the drift explicitly as part of the kriging system rather than as a separate preprocessing step.

What does a Levene p-value above 0.05 mean for spatial data?

A p-value above the chosen significance level (commonly 0.05) means you cannot reject homogeneity of variance across distance bins, which this workflow uses as evidence for the constant-variance condition. Failing to reject is not proof of stationarity; combine it with sill stabilisation and a near-zero residual mean before reaching a verdict.

Testing for Second-Order Stationarity in Python

TL;DR: Call test_second_order_stationarity(coords, values) below. It fits a detrending plane with numpy.linalg.lstsq, bins pairwise semivariances, runs scipy.stats.levene across the bins, and checks whether the semivariogram sill stabilises. It returns a dict with is_stationary, levene_p_value, sill_cv, and the individual pass/fail flags, built from three short helper functions.

Why This Matters

Second-order stationarity is the mathematical licence that lets you treat a spatial random field as homogeneous: same mean everywhere, same variance everywhere, and covariance that depends only on the separation vector rather than absolute location. Without it, a single global semivariogram no longer describes the field, ordinary kriging weights lose their optimality, and prediction intervals lose their probabilistic meaning. Confirming or refuting stationarity is therefore the first quantitative gate in a stationarity and trend analysis workflow, and it comes before the later stages of a spatial statistics pipeline, from variography through to kriging interpolation.

The Three Mathematical Conditions

Second-order (weak) stationarity sits between the stricter “same full distribution everywhere” and the weaker intrinsic hypothesis. It imposes three explicit constraints that must hold across the study area:

$$E[Z(\mathbf{x})] = \mu \quad \forall\, \mathbf{x}$$

$$\text{Var}[Z(\mathbf{x})] = \sigma^2 \quad \forall\, \mathbf{x}$$

$$\text{Cov}[Z(\mathbf{x}),\, Z(\mathbf{x}+\mathbf{h})] = C(\mathbf{h}) \quad \text{(depends on lag } \mathbf{h} \text{ only)}$$

The semivariogram $\gamma(\mathbf{h}) = \frac{1}{2}\text{Var}[Z(\mathbf{x}+\mathbf{h}) - Z(\mathbf{x})]$ is bounded under second-order stationarity and reaches a finite sill equal to the process variance $\sigma^2$. If the empirical variogram grows parabolically or never flattens, at least one of the three conditions fails.
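
The link between the sill and the covariance function follows from the three conditions alone. As a short derivation, using only the symbols already defined above:

```latex
\begin{aligned}
\gamma(\mathbf{h})
  &= \tfrac{1}{2}\,\text{Var}[Z(\mathbf{x}+\mathbf{h}) - Z(\mathbf{x})] \\
  &= \tfrac{1}{2}\left(\text{Var}[Z(\mathbf{x}+\mathbf{h})] + \text{Var}[Z(\mathbf{x})]
     - 2\,\text{Cov}[Z(\mathbf{x}),\, Z(\mathbf{x}+\mathbf{h})]\right) \\
  &= \sigma^2 - C(\mathbf{h}) \;=\; C(\mathbf{0}) - C(\mathbf{h})
\end{aligned}
```

When $C(\mathbf{h})$ decays to zero at large lags, $\gamma(\mathbf{h})$ levels off at $\sigma^2$, which is the plateau the sill check later in this guide looks for.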

Environment and Version Pinning

text

pip install numpy==1.26.4 scipy==1.13.0 geopandas==0.14.4

python

import numpy as np           # 1.26.4
from scipy.spatial.distance import pdist, squareform  # scipy 1.13.0
from scipy.stats import levene
import geopandas as gpd      # 0.14.4  — for CRS validation and spatial I/O

Your coordinates must be in a projected CRS (e.g., UTM or an equal-area projection) before passing them to any distance-based routine. Geographic degrees introduce metric distortion that makes lag distances meaningless. Verify with gdf.crs.is_projected and reproject with gdf.to_crs(epsg=...) if needed.

Step-by-Step Implementation

Step 1 — Load and validate spatial data

python

import geopandas as gpd
import numpy as np

gdf = gpd.read_file("soil_samples.gpkg")

# Ensure a projected CRS; reproject if necessary
if gdf.crs is None:
    raise ValueError("Layer has no CRS; assign one with gdf.set_crs(...) first.")
if not gdf.crs.is_projected:
    gdf = gdf.to_crs(epsg=32632)  # UTM zone 32N; adjust for your region

# Extract coordinate array and observation vector
coords = np.column_stack((gdf.geometry.x, gdf.geometry.y))
values = gdf["zinc_ppm"].to_numpy(dtype=np.float64)

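Projected coordinates feed directly into pdist, which computes Euclidean distances in the CRS units (metres for UTM). Treating geographic degrees as planar units would distort lags unevenly: a degree of longitude shrinks with the cosine of latitude, so east–west separations are overstated at high latitudes relative to the equator.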

Step 2 — Remove linear trend (first-order detrending)

python

def remove_linear_trend(coords, values):
    """Fit a least-squares plane z = ax + by + c and return residuals."""
    X = np.column_stack((coords, np.ones(len(coords))))
    coeffs, _, _, _ = np.linalg.lstsq(X, values, rcond=None)
    trend = X @ coeffs
    residuals = values - trend
    return residuals, coeffs

Removing the linear trend isolates the stochastic component before testing covariance structure. Without this step, a north–south temperature gradient or elevation ramp would appear as growing semivariance, masking the true covariance behaviour. Higher-order non-stationarity may require a polynomial surface ($x^2$, $xy$, $y^2$ terms) or an external covariate model.
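
If a plane is not enough, the same least-squares approach extends to a quadratic surface. The sketch below is one possible way to do it; the function name remove_quadratic_trend and the coordinate rescaling step are assumptions made for illustration, not part of the pipeline above.

```python
import numpy as np

def remove_quadratic_trend(coords, values):
    """Fit z = a*x^2 + b*x*y + c*y^2 + d*x + e*y + f and return (residuals, coeffs)."""
    coords = np.asarray(coords, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)

    # Centre and rescale so the squared terms stay numerically well conditioned
    centred = coords - coords.mean(axis=0)
    scale = np.abs(centred).max()
    if scale == 0:
        scale = 1.0
    x, y = (centred / scale).T

    # Design matrix with second-order, first-order, and intercept columns
    X = np.column_stack((x**2, x * y, y**2, x, y, np.ones_like(x)))
    coeffs, *_ = np.linalg.lstsq(X, values, rcond=None)
    return values - X @ coeffs, coeffs

# Synthetic check: an exactly quadratic surface should leave ~zero residuals
rng = np.random.default_rng(0)
demo_coords = rng.uniform(0, 1000, size=(50, 2))
demo_values = 2.0 + 0.01 * demo_coords[:, 0] + 1e-5 * demo_coords[:, 1] ** 2
demo_residuals, _ = remove_quadratic_trend(demo_coords, demo_values)
```

Note that the returned coefficients refer to the centred, rescaled coordinates, so they are not directly comparable to the plane coefficients from remove_linear_trend.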

Step 3 — Compute the empirical semivariogram

python

def empirical_semivariogram(coords, residuals, n_bins=10):
    """
    Returns (bin_centres, bin_mean_gamma, bin_variances_list).
    bin_variances_list holds the raw semivariance arrays for each bin
    — needed for Levene's test in Step 4.
    """
    dist_flat = pdist(coords)                          # upper-triangle distances
    diff_sq   = pdist(residuals.reshape(-1, 1),
                      metric="sqeuclidean")            # (z_i - z_j)^2
    gamma_flat = 0.5 * diff_sq                        # semivariance

    max_dist   = dist_flat.max()
    bin_edges  = np.linspace(0, max_dist, n_bins + 1)
    bin_idx    = np.clip(np.digitize(dist_flat, bin_edges) - 1, 0, n_bins - 1)

    bin_centres = []
    bin_means   = []
    bin_raw     = []
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.sum() > 1:
            bin_centres.append(0.5 * (bin_edges[b] + bin_edges[b + 1]))
            bin_means.append(gamma_flat[mask].mean())
            bin_raw.append(gamma_flat[mask])
        else:
            bin_raw.append(np.array([np.nan]))

    return np.array(bin_centres), np.array(bin_means), bin_raw

The Matheron estimator used here — $\hat{\gamma}(\mathbf{h}) = \frac{1}{2|N(\mathbf{h})|} \sum_{N(\mathbf{h})} [Z(\mathbf{x}_i) - Z(\mathbf{x}_j)]^2$ — is sensitive to outliers. For datasets with extreme values, prefer the Cressie-Hawkins robust estimator or apply a log-transform before computing.
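
For reference, the Cressie-Hawkins estimator replaces the mean of squared differences with the fourth power of the mean square-root absolute difference, divided by a bias correction of $0.457 + 0.494/|N(\mathbf{h})|$. A minimal sketch follows, reusing the same equal-width binning as Step 3; the function name cressie_hawkins_semivariogram is hypothetical and the demo data are synthetic.

```python
import numpy as np
from scipy.spatial.distance import pdist

def cressie_hawkins_semivariogram(coords, residuals, n_bins=10):
    """Robust semivariance per lag bin; returns (bin_centres, gamma)."""
    dist = pdist(coords)
    # |z_i - z_j|^(1/2) for every pair, in the same pair order as `dist`
    root_abs_diff = np.sqrt(pdist(residuals.reshape(-1, 1), metric="cityblock"))

    edges = np.linspace(0, dist.max(), n_bins + 1)
    idx = np.clip(np.digitize(dist, edges) - 1, 0, n_bins - 1)

    centres, gammas = [], []
    for b in range(n_bins):
        mask = idx == b
        n_pairs = mask.sum()
        if n_pairs > 1:
            fourth_power = root_abs_diff[mask].mean() ** 4
            # Halve because the classical formula estimates 2 * gamma
            gammas.append(0.5 * fourth_power / (0.457 + 0.494 / n_pairs))
            centres.append(0.5 * (edges[b] + edges[b + 1]))
    return np.array(centres), np.array(gammas)

# Synthetic demo: uncorrelated unit-variance noise should give gamma near 1
rng = np.random.default_rng(0)
demo_coords = rng.uniform(0, 1000, size=(200, 2))
demo_residuals = rng.standard_normal(200)
demo_centres, demo_gamma = cressie_hawkins_semivariogram(demo_coords, demo_residuals, n_bins=8)
```

This version returns only the per-bin estimates, so it is suited to inspecting the sill rather than feeding Levene's test, which needs the raw per-pair values.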

Step 4 — Test variance homogeneity and sill stabilisation

python

from scipy.stats import levene

def assess_stationarity(bin_raw, bin_means, alpha=0.05):
    """
    Combines Levene's test and sill-CV check into a stationarity verdict.

    Parameters
    ----------
    bin_raw   : list of np.ndarray — raw semivariances per bin from Step 3
    bin_means : np.ndarray         — mean semivariance per bin from Step 3
    alpha     : float              — significance level for Levene's test

    Returns
    -------
    dict with keys: variance_homogeneous, levene_p_value,
                    sill_reached, sill_cv, is_stationary
    """
    valid_bins = [v for v in bin_raw if not np.isnan(v).all() and len(v) > 1]

    # Levene's test (median-centred) across all populated bins
    if len(valid_bins) >= 2:
        _, p_levene = levene(*valid_bins, center="median")
        variance_homogeneous = p_levene > alpha
    else:
        p_levene = np.nan
        variance_homogeneous = False

    # Sill stabilisation: coefficient of variation of last 3 bin means < 0.15
    last = [v for v in valid_bins[-3:] if len(v) > 1]
    if last:
        sill_means = [v.mean() for v in last]
        mu = np.mean(sill_means)
        sill_cv = np.std(sill_means) / mu if mu > 0 else 1.0
        sill_reached = sill_cv < 0.15
    else:
        sill_cv = np.nan
        sill_reached = False

    return {
        "variance_homogeneous": variance_homogeneous,
        "levene_p_value": float(p_levene) if not np.isnan(p_levene) else None,
        "sill_reached": sill_reached,
        "sill_cv": float(sill_cv) if not np.isnan(sill_cv) else None,
        "is_stationary": variance_homogeneous and sill_reached,
    }
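
To make the sill criterion concrete, here is a small worked example of the coefficient of variation on two made-up tails of bin means. The helper name tail_cv and the numbers are illustrative assumptions; the arithmetic mirrors the np.std / mean ratio used in assess_stationarity.

```python
import numpy as np

def tail_cv(last_bin_means):
    """Population std divided by mean of the final bin means (1.0 if mean <= 0)."""
    means = np.asarray(last_bin_means, dtype=np.float64)
    mu = means.mean()
    return float(means.std() / mu) if mu > 0 else 1.0

flat_tail = tail_cv([0.95, 1.00, 1.05])   # semivariogram has plateaued
rising_tail = tail_cv([1.0, 1.5, 2.0])    # semivariogram is still climbing

print(f"flat tail CV:   {flat_tail:.3f}")
print(f"rising tail CV: {rising_tail:.3f}")
```

The flat tail falls well under the 0.15 threshold while the rising tail exceeds it, which is the distinction sill_reached encodes.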

Step 5 — Assemble and run the full pipeline

python

def test_second_order_stationarity(coords, values, n_bins=10, alpha=0.05):
    """
    Full second-order stationarity test for projected spatial data.

    Parameters
    ----------
    coords  : np.ndarray, shape (N, 2) — projected x,y in metres
    values  : np.ndarray, shape (N,)   — observed attribute
    n_bins  : int                       — lag bins for semivariogram
    alpha   : float                     — significance level

    Returns
    -------
    dict with stationarity verdict and all diagnostic values
    """
    coords = np.asarray(coords, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)

    if coords.shape[0] != values.shape[0]:
        raise ValueError("coords and values must have equal length.")
    if coords.ndim != 2 or coords.shape[1] != 2:
        raise ValueError("coords must be shape (N, 2).")

    residuals, trend_coeffs = remove_linear_trend(coords, values)

    # Residual mean check: a plane fit with an intercept forces this to ~0,
    # so the flag is a sanity check, not a test for leftover curved drift
    mean_residual    = float(residuals.mean())
    mean_stationary  = abs(mean_residual) < (values.std() * 0.05)

    bin_centres, bin_means, bin_raw = empirical_semivariogram(
        coords, residuals, n_bins=n_bins
    )
    result = assess_stationarity(bin_raw, bin_means, alpha=alpha)

    result["mean_residual"]   = mean_residual
    result["mean_stationary"] = mean_stationary
    result["is_stationary"]   = (
        result["is_stationary"] and mean_stationary
    )
    result["n_bins_populated"] = sum(
        1 for v in bin_raw if not np.isnan(v).all() and len(v) > 1
    )
    return result


# --- Usage ---
result = test_second_order_stationarity(coords, values, n_bins=12)
print(result)
# Illustrative output for a dataset that passes (values depend on your data;
# mean_residual is ~0 up to floating-point error):
# {'variance_homogeneous': True, 'levene_p_value': 0.312,
#  'sill_reached': True, 'sill_cv': 0.07, 'is_stationary': True,
#  'mean_residual': 1e-14, 'mean_stationary': True,
#  'n_bins_populated': 12}

Interpreting the Output

The result dict contains eight keys. The five below map onto the three mathematical conditions; levene_p_value, mean_residual, and n_bins_populated carry the supporting numbers:

Key	Condition tested	Pass criterion
`mean_stationary`	Constant mean	`abs(mean_residual) < 0.05 × std(values)`
`variance_homogeneous`	Constant variance across lags	Levene p-value > α
`sill_reached`	Bounded covariance	CV of last 3 bin means < 0.15
`sill_cv`	Sill stability magnitude	Lower is more stable
`is_stationary`	All three conditions	All three flags True

A levene_p_value close to 1.0 means the test found no sign of unequal variance across lag bins; it does not measure how similar the variances are, so read it as absence of evidence against homoskedasticity rather than proof of it. A sill_cv below 0.05 indicates a very stable plateau. Values between 0.10 and 0.15 are borderline; rerun with a different n_bins and check that the verdict holds before concluding.
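
Because a least-squares plane with an intercept drives the global residual mean to zero, mean_stationary on its own cannot reveal curved drift. One coarse complement is to compare residual means by quadrant. The sketch below assumes a hypothetical helper quadrant_residual_means and a synthetic saddle-shaped drift; neither is part of the pipeline above.

```python
import numpy as np

def quadrant_residual_means(coords, residuals):
    """Mean residual in each quadrant, split at the median x and median y."""
    coords = np.asarray(coords, dtype=np.float64)
    residuals = np.asarray(residuals, dtype=np.float64)
    mid_x, mid_y = np.median(coords, axis=0)
    east = coords[:, 0] >= mid_x
    north = coords[:, 1] >= mid_y
    quadrants = {
        "SW": ~east & ~north,
        "SE": east & ~north,
        "NW": ~east & north,
        "NE": east & north,
    }
    return {name: float(residuals[mask].mean())
            for name, mask in quadrants.items() if mask.any()}

# Synthetic demo: a saddle (x*y) drift survives plane removal
rng = np.random.default_rng(1)
demo_coords = rng.uniform(0, 1000, size=(400, 2))
demo_values = 1e-4 * demo_coords[:, 0] * demo_coords[:, 1]
X = np.column_stack((demo_coords, np.ones(len(demo_coords))))
coeffs, *_ = np.linalg.lstsq(X, demo_values, rcond=None)
demo_residuals = demo_values - X @ coeffs

local_means = quadrant_residual_means(demo_coords, demo_residuals)
print(f"global mean: {demo_residuals.mean():.2e}")
print(local_means)
```

In this demo the global mean is numerically zero while the NE and SW quadrants sit above zero and the NW and SE quadrants below it, a pattern that points to a missing $xy$ term.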

Critical Best Practices

Always project before testing

Geographic coordinates make Euclidean lag distances meaningless. At 50° latitude, for example, one degree of longitude spans roughly 72 km while one degree of latitude spans roughly 111 km. Always confirm gdf.crs.is_projected returns True before computing pairwise distances. For continental-scale datasets, use a projection suited to the whole extent (e.g., the equal-area EPSG:6933) rather than a single UTM zone.
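
The size of the distortion is easy to check with a spherical-Earth approximation. This sketch assumes a constant 111.32 km per degree of latitude, which is a simplification; the function name km_per_degree_longitude is illustrative.

```python
import math

KM_PER_DEGREE_LATITUDE = 111.32  # approximate; assumes a spherical Earth

def km_per_degree_longitude(latitude_deg):
    """Ground length of one degree of longitude at the given latitude."""
    return KM_PER_DEGREE_LATITUDE * math.cos(math.radians(latitude_deg))

for lat in (0, 30, 50, 70):
    print(f"{lat:>2} deg N: {km_per_degree_longitude(lat):6.1f} km per degree of longitude")
```

The east–west length of a degree drops by about two thirds between the equator and 70° latitude, so a lag measured in degrees mixes very different ground distances.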

Use Levene’s median variant, not the mean variant

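scipy.stats.levene already defaults to center="median" (the Brown-Forsythe variant), but pass it explicitly so the choice is visible and is not silently changed by code that overrides it. Spatial semivariance distributions are right-skewed, and bins with few pairs are easily dominated by a handful of extreme values. Median centring keeps the test robust against that non-normality; center="mean" is best reserved for symmetric, moderate-tailed data.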
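
A quick way to see which variant a call uses is to compare p-values directly. The sketch below uses synthetic chi-square samples as stand-ins for two semivariance bins; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import levene

rng = np.random.default_rng(42)
# Right-skewed samples with the same scale, mimicking two lag bins (synthetic)
bin_a = rng.chisquare(df=1, size=300)
bin_b = rng.chisquare(df=1, size=300)

p_default = levene(bin_a, bin_b).pvalue                  # no center argument
p_median = levene(bin_a, bin_b, center="median").pvalue  # Brown-Forsythe variant
p_mean = levene(bin_a, bin_b, center="mean").pvalue      # original Levene variant

print(f"default: {p_default:.4f}  median: {p_median:.4f}  mean: {p_mean:.4f}")
```

The default and median results coincide, while the mean-centred p-value generally differs on skewed data like this.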

Match `n_bins` to your sample size

A rough rule: target at least 30 pairs per bin for stable Levene statistics. With $N$ observations there are $\frac{N(N-1)}{2}$ pairs, but equal-width bins do not share them evenly: the shortest and longest lags usually hold the fewest. Choose n_bins so that even the sparsest bin holds at least 30. For 100 samples (4,950 pairs) 10 bins works well. For 500 samples (124,750 pairs) 20–25 bins is appropriate.
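
Counting pairs per bin before running the test shows whether a given n_bins meets the 30-pair target. This is a minimal sketch that reuses the equal-width binning from Step 3; the helper name pairs_per_bin and the random demo coordinates are assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist

def pairs_per_bin(coords, n_bins=10):
    """Number of point pairs falling in each equal-width lag bin."""
    dist = pdist(np.asarray(coords, dtype=np.float64))
    edges = np.linspace(0, dist.max(), n_bins + 1)
    idx = np.clip(np.digitize(dist, edges) - 1, 0, n_bins - 1)
    return np.bincount(idx, minlength=n_bins)

# Synthetic demo: 100 random points in a 1 km square
rng = np.random.default_rng(0)
demo_coords = rng.uniform(0, 1000, size=(100, 2))
counts = pairs_per_bin(demo_coords, n_bins=10)
sparse_bins = np.flatnonzero(counts < 30)

print("pairs per bin:", counts.tolist())
print("bins under 30 pairs:", sparse_bins.tolist())
```

If sparse_bins is non-empty, reducing n_bins is usually simpler than collecting more data.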

Detrend before testing, not after

Fitting the trend model on the original values and then testing residuals is the correct order. Testing the raw values for stationarity conflates drift with covariance structure: a linear north–south gradient inflates squared differences at long lags, so the test will tend to fail even if the residuals are perfectly stationary.

Cross-validate your stationarity assumption

A stationary verdict from the test above supports the use of ordinary kriging but is not sufficient on its own. Follow up with spatial k-fold cross-validation to confirm that prediction errors are spatially unbiased. If standardised errors show spatial clustering, unmodelled non-stationarity remains.
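
The fold assignment is the part of spatial cross-validation that differs from ordinary k-fold: whole blocks of neighbouring points are held out together. The sketch below shows only that step, under the assumption of square grid blocks; spatial_block_folds, block_size, and the demo data are hypothetical, and fitting and scoring a kriging model per fold is left out.

```python
import numpy as np

def spatial_block_folds(coords, block_size, n_folds=5, seed=0):
    """Assign each point a fold id so that points in one grid block share a fold."""
    coords = np.asarray(coords, dtype=np.float64)
    # Integer grid cell of every point, measured from the lower-left corner
    cells = np.floor((coords - coords.min(axis=0)) / block_size).astype(int)
    _, block_id = np.unique(cells, axis=0, return_inverse=True)
    block_id = np.asarray(block_id).ravel()

    # Shuffle blocks, then deal them out to folds round-robin
    n_blocks = block_id.max() + 1
    rng = np.random.default_rng(seed)
    fold_of_block = rng.permutation(n_blocks) % n_folds
    return fold_of_block[block_id]

# Synthetic demo: 200 points, 250 m blocks, 5 folds
rng = np.random.default_rng(0)
demo_coords = rng.uniform(0, 1000, size=(200, 2))
folds = spatial_block_folds(demo_coords, block_size=250.0, n_folds=5)

# Two points in the same 100 m block always land in the same fold
tiny = spatial_block_folds(
    np.array([[0.0, 0.0], [1.0, 1.0], [900.0, 900.0]]), block_size=100.0, n_folds=2
)
```

A block size on the order of the variogram range is a common starting point, since smaller blocks leave held-out points strongly correlated with their training neighbours.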

Troubleshooting

Symptom	Likely cause	Fix
`levene_p_value` is None	Fewer than 2 populated bins	Reduce `n_bins` or increase dataset size
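`mean_stationary` True but residual maps still show a curved gradient	The global residual mean is ~0 by construction after a plane fit with an intercept, so the flag cannot detect non-linear drift	Add $x^2$, $xy$, $y^2$ terms or use `scipy.interpolate.RBFInterpolator` for flexible detrending, then inspect local residual means
`sill_reached` False but data looks stationary visually	The farthest bins hold few pairs, so their means are noisy	Reduce `n_bins`, or add a maximum-lag cutoff near half the largest pair distance (the function above always bins out to the maximum distance)
`variance_homogeneous` False despite clean data	Outlier pairs inflating distant bins	Apply a log or Box-Cox transform, or use the Cressie-Hawkins robust estimator
`ValueError: coords must be shape (N, 2)`	Passing a 1-D array or an (N, 3) array with a z column	For (N, 3) input, slice: `coords = coords[:, :2]`; for a flat array, reshape to (N, 2)
Levene p-value oscillates between runs	Random sub-sampling of pairs added for large datasets (the code above is deterministic)	Pass a fixed seed to your sampler, e.g. `np.random.default_rng(42)`, or use all pairs instead of a sample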

Next Steps

Once stationarity is confirmed, the residuals are safe to hand off to variogram model fitting and ordinary kriging; see the Stationarity & Trend Analysis guide for the full detrend-then-model workflow. If non-stationarity persists after higher-order detrending, consider universal kriging, which embeds the drift term directly into the estimation system.

Related

Stationarity & Trend Analysis — parent guide covering the full diagnostic workflow from gradient detection to residual validation
How to Calculate Moran’s I in PySAL — apply Moran’s I to detrended residuals as a complementary stationarity check
Spatial K-Fold Cross-Validation Setup — validate the stationarity assumption downstream with spatially blocked CV
