mk_test()sens_slope()change_point_detection()seasonal_decompose_climate()rolling_trend()temporal_homogeneity()trend_significance()autocorrelation_climate()fit_gev() and rgev_sim()return_period()peaks_over_threshold()heat_wave_detection()cold_spell_detection()drought_spell()extreme_value_index()The climatestatsr package provides a self-contained,
dependency-light collection of statistical functions for climate change
research. All methods operate on standard R vectors, matrices, and data
frames and require only base-R stats,
graphics, grDevices, and
utils.
| Family | Functions |
|---|---|
| Temporal analysis | mk_test, sens_slope,
change_point_detection,
seasonal_decompose_climate, rolling_trend,
temporal_homogeneity, trend_significance,
autocorrelation_climate |
| Spatial analysis | morans_i, hot_cold_spots,
spatial_interpolate, spatial_trend_field,
cluster_climate_zones, spatial_anomaly,
elevation_lapse_rate |
| Extreme events | fit_gev, rgev_sim,
return_period, peaks_over_threshold,
heat_wave_detection, cold_spell_detection,
drought_spell, extreme_value_index |
| Climate indices | spi, spei, pdsi_simple,
heat_index, wind_chill,
frost_days, growing_degree_days,
diurnal_temp_range |
| Detection & attribution | detection_attribution,
fingerprint_analysis, optimal_fingerprint |
| Utilities | fill_gaps_climate, homogenize_series,
aggregate_climate, anomaly_baseline,
standardize_climate, climate_summary |
install.packages("climatestatsr")Temporal analysis functions test for and quantify monotonic trends, abrupt shifts, and seasonal structure in climate time series.
mk_test()The Mann-Kendall test (Mann 1945; Kendall 1975) is the standard non-parametric method for detecting monotonic trends in hydro-climatic records. It is rank-based and therefore robust to non-normality and outliers. The optional pre-whitening step (Yue and Wang 2002) removes AR(1) autocorrelation before computing the test statistic.
# Simulate 50 years of warming at 0.025 °C/year
years <- 1971:2020
temp <- 14.0 + 0.025 * seq_len(50) + stats::rnorm(50, 0, 0.4)
result <- mk_test(temp)
print(result)
#>
#> Mann-Kendall Trend Test
#> -----------------------
#> Test statistic (Z): 5.6045
#> p-value : 0.000000
#> Kendall tau : 0.5478
#> Trend direction : increasingplot(result)Mann-Kendall result: time series with Sen slope (left) and value distribution (right).
Interpreting the output:
Z — standardised statistic; |Z| > 1.96 indicates
significance at α = 0.05tau — Kendall’s rank correlation; ranges from −1
(monotone decrease) to +1 (monotone increase)p.value — two-tailed probability under H₀ of no
trendtrend — plain-language conclusion# Apply AR(1) pre-whitening for autocorrelated series
ar_temp <- as.numeric(stats::arima.sim(list(ar = 0.65), n = 60)) +
seq(0, 3, length.out = 60) + 14
mk_pw <- mk_test(ar_temp, prewhiten = TRUE)
cat("Pre-whitened MK: Z =", round(mk_pw$statistic, 3),
" p =", round(mk_pw$p.value, 4), "\n")
#> Pre-whitened MK: Z = 1.648 p = 0.0994sens_slope()Sen’s slope (Sen 1968) is the median of all pairwise slopes between observations. It provides a robust estimate of trend magnitude unaffected by outliers.
ss <- sens_slope(years, temp)
cat(sprintf(
"Sen's slope : %+.4f °C/year\n", ss$slope))
#> Sen's slope : +0.0263 °C/year
cat(sprintf(
"Rate/decade : %+.3f °C\n", ss$slope_decade))
#> Rate/decade : +0.263 °C
cat(sprintf(
"95%% CI : [%.4f, %.4f]\n", ss$slope_ci["lower"],
ss$slope_ci["upper"]))
#> 95% CI : [0.0191, 0.0353]plot(years, temp, pch = 16, cex = 0.7, col = "steelblue",
xlab = "Year", ylab = "Temperature (°C)",
main = "Annual Mean Temperature with Sen's Slope")
abline(a = ss$intercept, b = ss$slope,
col = "firebrick", lwd = 2, lty = 2)
legend("topleft", legend = c("Observed", "Sen slope"),
col = c("steelblue", "firebrick"),
pch = c(16, NA), lty = c(NA, 2), lwd = c(NA, 2), bty = "n")Observed temperature with Sen slope (red dashed) and 95% CI band.
change_point_detection()Two methods are available:
# Simulate a step change at observation 30
x <- c(stats::rnorm(30, mean = 14.0, sd = 0.5),
stats::rnorm(30, mean = 15.5, sd = 0.5))
cp <- change_point_detection(x, method = "pettitt")
cat(sprintf("Change point at index : %d\n", cp$change_point))
#> Change point at index : 31
cat(sprintf("Mean before shift : %.2f °C\n", cp$mean_before))
#> Mean before shift : 14.06 °C
cat(sprintf("Mean after shift : %.2f °C\n", cp$mean_after))
#> Mean after shift : 15.64 °C
cat(sprintf("Significant (α=0.05) : %s\n", cp$significant))
#> Significant (α=0.05) : TRUEcp_cusum <- change_point_detection(x, method = "cusum")
plot(cp_cusum$cusum, type = "l", col = "steelblue", lwd = 2,
xlab = "Index", ylab = "CUSUM",
main = "CUSUM Change-Point Detection")
abline(v = cp_cusum$change_point, col = "firebrick",
lty = 2, lwd = 2)
abline(h = 0, lty = 3, col = "gray50")
legend("topleft",
legend = c("CUSUM", sprintf("Break at %d", cp_cusum$change_point)),
col = c("steelblue", "firebrick"),
lty = c(1, 2), lwd = 2, bty = "n")CUSUM series — the peak identifies the most probable break point.
seasonal_decompose_climate()Decomposes monthly or quarterly series into trend, seasonal, and remainder components using STL (Loess-based) or classical additive decomposition.
n <- 360 # 30 years of monthly data
t_idx <- seq_len(n)
temp_m <- 15 + 0.003 * t_idx +
8 * sin(2 * pi * t_idx / 12) +
stats::rnorm(n, 0, 0.5)dc <- seasonal_decompose_climate(temp_m, frequency = 12, method = "stl")
plot(dc)STL decomposition: original, trend, seasonal component, and remainder.
The extracted trend can itself be tested for significance:
trend_mk <- mk_test(dc$trend[!is.na(dc$trend)])
cat("Trend component MK test: tau =", round(trend_mk$tau, 3),
" p =", round(trend_mk$p.value, 4), "\n")
#> Trend component MK test: tau = 0.726 p = 0rolling_trend()Applies Sen’s slope over a moving window to reveal periods of accelerating or decelerating warming.
temp2 <- 13.5 + c(0.01 * seq_len(50), 0.04 * seq_len(41)) +
stats::rnorm(91, 0, 0.4)
rt <- rolling_trend(temp2, window = 20, step = 2)
plot(rt$mid_index, rt$slope_decade, type = "l",
col = "steelblue", lwd = 2,
xlab = "Mid-window index",
ylab = "Trend (°C per decade)",
main = "Rolling 20-Year Sen Slope")
abline(h = 0, lty = 2, col = "gray50")
polygon(c(rt$mid_index, rev(rt$mid_index)),
c(rt$slope_decade * 1.15, rev(rt$slope_decade * 0.85)),
col = adjustcolor("steelblue", 0.15), border = NA)20-year rolling Sen slope: warming accelerated after index ~40.
temporal_homogeneity()The Standard Normal Homogeneity Test (Alexandersson 1986) detects inhomogeneities caused by station moves, instrument changes, or observer changes.
# Inhomogeneous series: +0.8 °C offset after observation 40
x_inh <- c(stats::rnorm(40, 0, 1), stats::rnorm(40, 0.8, 1))
snht <- temporal_homogeneity(x_inh)
cat(sprintf("T0 statistic : %.2f (critical ≈ %.2f)\n",
snht$T0, snht$critical))
#> T0 statistic : 4.12 (critical ≈ 11.89)
cat(sprintf("Break at index: %d\n", snht$break_index))
#> Break at index: 40
cat(sprintf("Significant : %s\n", snht$significant))
#> Significant : FALSEplot(snht$T_series, type = "l", col = "steelblue", lwd = 1.5,
xlab = "Index", ylab = "T statistic",
main = "SNHT — Homogeneity Test")
abline(v = snht$break_index, col = "firebrick", lty = 2, lwd = 2)
abline(h = snht$critical, col = "orange", lty = 3, lwd = 1.5)
legend("topright",
legend = c("T statistic", "Break point", "Critical value"),
col = c("steelblue", "firebrick", "orange"),
lty = c(1, 2, 3), lwd = 2, bty = "n")SNHT statistic series — peak marks the inhomogeneity break.
trend_significance()Runs Mann-Kendall on every column of a matrix and applies FDR or Bonferroni correction for simultaneous testing.
mat <- matrix(stats::rnorm(50 * 30), nrow = 50)
# impose trend in 10 stations
mat[, 1:10] <- mat[, 1:10] + outer(seq_len(50), rep(0.05, 10))
ts_res <- trend_significance(mat, correction = "fdr")
cat("Stations with significant trend:\n")
#> Stations with significant trend:
print(table(ts_res$trend))
#>
#> increasing no trend
#> 9 21autocorrelation_climate()Reports ACF, PACF and the Ljung-Box test for residual autocorrelation — important before applying parametric models.
ar_series <- as.numeric(stats::arima.sim(list(ar = 0.7), n = 100)) + 15
ac <- autocorrelation_climate(ar_series, max.lag = 20, plot = TRUE)ACF and PACF of an AR(1) climate series (ρ ≈ 0.7).
ACF and PACF of an AR(1) climate series (ρ ≈ 0.7).
cat("AR(1) sample coefficient:", round(ac$ar1, 3), "\n")
#> AR(1) sample coefficient: 0.58
cat("Ljung-Box p-value :", round(ac$ljung_box$p.value, 4), "\n")
#> Ljung-Box p-value : 0morans_i()Global Moran’s I (Moran 1950) measures whether similar values cluster spatially (I > 0) or disperse (I < 0).
set.seed(42)
n <- 50
coords <- data.frame(lon = stats::runif(n, -10, 10),
lat = stats::runif(n, 40, 60))
# Temperature decreases with latitude → positive autocorrelation
x_sp <- 26 - 0.35 * coords$lat + stats::rnorm(n, 0, 0.8)
mi <- morans_i(x_sp, coords, n_perm = 499)
cat(sprintf("Moran's I = %.4f\n", mi$I))
#> Moran's I = 0.4867
cat(sprintf("Z-score = %.3f (p = %.4f)\n", mi$Z, mi$p.value))
#> Z-score = 16.633 (p = 0.0000)
cat(mi$interpretation, "\n")
#> positive spatial autocorrelationcol_ramp <- colorRampPalette(c("steelblue", "white", "firebrick"))(50)
val_rank <- rank(x_sp)
plot(coords$lon, coords$lat,
pch = 21, cex = 1.5,
bg = col_ramp[val_rank],
xlab = "Longitude", ylab = "Latitude",
main = sprintf("Spatial Field (Moran's I = %.3f)", mi$I))Scatter plot of spatial values coloured by latitude gradient.
hot_cold_spots()The Getis-Ord Gi* statistic (Getis and Ord 1992) identifies locations where high (hot spot) or low (cold spot) values cluster locally.
set.seed(5)
vals <- ifelse(coords$lon > 4,
stats::rnorm(n, 30, 2),
stats::rnorm(n, 18, 2))
hs <- hot_cold_spots(vals, coords, dist_threshold = 5)
print(table(hs$classification))
#>
#> cold spot hot spot not significant
#> 11 17 22cols <- c("hot spot" = "firebrick",
"cold spot" = "steelblue",
"not significant"= "gray80")
plot(hs$lon, hs$lat,
pch = 21, cex = 1.6,
bg = cols[hs$classification],
xlab = "Longitude", ylab = "Latitude",
main = "Getis-Ord Gi* Hot/Cold Spots")
legend("topright", legend = names(cols), pt.bg = cols,
pch = 21, pt.cex = 1.4, bty = "n")Getis-Ord Gi* classification: hot spots (east), cold spots (west).
spatial_interpolate()Interpolates station values onto a regular grid using IDW or loess spline.
set.seed(7)
obs <- matrix(stats::runif(50), ncol = 2,
dimnames = list(NULL, c("lon","lat")))
vals_obs <- sin(obs[,"lon"] * 3) + cos(obs[,"lat"] * 3)
grd <- as.matrix(expand.grid(
lon = seq(0.05, 0.95, length.out = 20),
lat = seq(0.05, 0.95, length.out = 20)))
pred <- spatial_interpolate(obs, vals_obs, grd, method = "idw")
image(matrix(pred, 20, 20),
col = colorRampPalette(c("steelblue","white","firebrick"))(64),
main = "IDW Interpolated Climate Field",
xlab = "Longitude", ylab = "Latitude")
points((obs[,"lon"] - 0.05) / 0.9,
(obs[,"lat"] - 0.05) / 0.9, pch = 3, cex = 0.8)IDW interpolation from 25 stations to a 20×20 grid.
spatial_trend_field()Computes per-cell Mann-Kendall + Sen slope across a space-time matrix — essential for mapping where warming is occurring.
set.seed(9)
mat_st <- matrix(stats::rnorm(50 * 200), nrow = 50)
mat_st[, 101:200] <- mat_st[, 101:200] +
outer(seq_len(50), rep(0.04, 100))
stf <- spatial_trend_field(mat_st)
cat(sprintf("Cells with significant trend: %d / %d (%.0f%%)\n",
sum(stf$p.value < 0.05, na.rm = TRUE),
nrow(stf),
100 * mean(stf$p.value < 0.05, na.rm = TRUE)))
#> Cells with significant trend: 102 / 200 (51%)stf$group <- ifelse(seq_len(nrow(stf)) <= 100, "No trend", "Trend imposed")
sig_frac <- tapply(stf$p.value < 0.05, stf$group, mean, na.rm = TRUE)
barplot(sig_frac * 100,
col = c("gray70", "firebrick"),
ylab = "% Significant (α=0.05)",
main = "Spatial Trend Field Results",
border = NA)Fraction of significant trend cells by group.
cluster_climate_zones()K-means clustering on multi-variable climate normals delineates coherent climate regions.
set.seed(1)
clim_mat <- matrix(
c(stats::rnorm(200, rep(c(5, 15, 25, 10, 20), each = 40), 2),
stats::rnorm(200, rep(c(1200, 600, 200, 900, 400), each = 40), 80)),
ncol = 2,
dimnames = list(NULL, c("temp", "precip")))
cz <- cluster_climate_zones(clim_mat, k = 5)
cat("Cluster sizes:\n")
#> Cluster sizes:
print(table(cz$cluster))
#>
#> 1 2 3 4 5
#> 43 39 41 37 40pal <- c("#e41a1c","#377eb8","#4daf4a","#984ea3","#ff7f00")
plot(clim_mat[, "temp"], clim_mat[, "precip"],
pch = 21, cex = 0.9,
bg = pal[cz$cluster],
xlab = "Mean Temperature (°C)",
ylab = "Annual Precipitation (mm)",
main = "K-Means Climate Zone Classification")
legend("topright", legend = paste("Zone", 1:5),
pt.bg = pal, pch = 21, pt.cex = 1.2, bty = "n")Five K-means climate zones in temperature–precipitation space.
elevation_lapse_rate()Estimates the temperature lapse rate from station data — essential for statistical downscaling.
elev <- seq(100, 3000, by = 100)
temp_lapse <- 25 - 0.0065 * elev + stats::rnorm(30, 0, 0.4)
lr <- elevation_lapse_rate(temp_lapse, elev)
plot(elev, temp_lapse, pch = 16, cex = 0.8, col = "steelblue",
xlab = "Elevation (m a.s.l.)",
ylab = "Temperature (°C)",
main = sprintf("Environmental Lapse Rate: %.2f °C / 1000 m",
lr$lapse_rate_per_1000m))
lines(lr$data$elevation, lr$data$fitted, col = "firebrick", lwd = 2)
legend("topright",
legend = c("Station data",
sprintf("Lapse rate = %.3f °C/1000 m (R² = %.3f)",
lr$lapse_rate_per_1000m, lr$r_squared)),
col = c("steelblue", "firebrick"),
pch = c(16, NA), lty = c(NA, 1), lwd = c(NA, 2), bty = "n")Temperature vs elevation with fitted lapse rate.
fit_gev() and rgev_sim()The Generalised Extreme Value distribution (Coles 2001) is the limiting distribution for block-maxima. Three shape families are unified: Gumbel (ξ = 0), Fréchet (ξ > 0, heavy tail), Weibull (ξ < 0, bounded).
set.seed(10)
ann_max <- rgev_sim(60, mu = 34, sigma = 4.5, xi = 0.12)
gev <- fit_gev(ann_max)
print(gev)
#>
#> GEV Distribution Fit
#> ----------------------
#> Location (mu) : 33.7343 [ 32.7051, 34.7635]
#> Scale (sigma) : 3.6308 [ 2.8895, 4.3722]
#> Shape (xi) : -0.0056 [ -0.1859, 0.1748]
#> Log-likelihood : -171.903
#> AIC / BIC : 349.81 / 356.09
#> N (block max) : 60
#> Distribution : Gumbel (approx.)return_period()rp <- return_period(gev, c(2, 5, 10, 20, 50, 100, 200))
print(rp)
#> T level lower upper
#> 1 2 35.06369 -50.60298 120.7304
#> 2 5 39.15763 -310.84050 389.1558
#> 3 10 41.85400 -481.78326 565.4913
#> 4 20 44.42985 -644.74474 733.6044
#> 5 50 47.74874 -854.22033 949.7178
#> 6 100 50.22453 -1010.11831 1110.5674
#> 7 200 52.68172 -1164.53720 1269.9007with(rp, {
plot(T, level, type = "b", log = "x", pch = 16,
col = "steelblue",
xlab = "Return period (years)",
ylab = "Temperature (°C)",
main = "GEV Return Level Curve",
ylim = range(c(lower, upper), na.rm = TRUE))
polygon(c(T, rev(T)), c(upper, rev(lower)),
col = adjustcolor("steelblue", 0.18), border = NA)
lines(T, lower, lty = 2, col = "steelblue")
lines(T, upper, lty = 2, col = "steelblue")
abline(v = c(10, 100), lty = 3, col = "gray60")
})GEV return level curve with 95% delta-method confidence band.
peaks_over_threshold()The POT / GPD approach (Davison and Smith 1990) uses all exceedances above a threshold rather than only one value per year, yielding more efficient estimates for long return periods.
set.seed(11)
daily_p <- stats::rexp(365 * 30, rate = 1 / 6)
pot <- peaks_over_threshold(daily_p, threshold = 22,
n_years = 30,
return_periods = c(10, 50, 100, 200))
cat(sprintf("Threshold : %.0f mm\n", pot$threshold))
#> Threshold : 22 mm
cat(sprintf("Peaks retained: %d\n", pot$n_excess))
#> Peaks retained: 237
cat(sprintf("GPD shape (ξ) : %.4f\n", pot$xi))
#> GPD shape (ξ) : -0.0227
cat(sprintf("GPD scale (σ) : %.4f\n", pot$sigma))
#> GPD scale (σ) : 6.8418
cat("\nReturn levels:\n")
#>
#> Return levels:
print(pot$return_levels)
#> T level
#> 1 10 50.45914
#> 2 50 60.24998
#> 3 100 64.35763
#> 4 200 68.40109heat_wave_detection()dates <- seq(as.Date("2000-01-01"), by = "day",
length.out = 365 * 15)
doy <- as.integer(format(dates, "%j"))
tmax <- 28 + 10 * sin(2 * pi * doy / 365) +
stats::rnorm(length(dates), 0, 2.5)
hw <- heat_wave_detection(tmax, dates,
threshold = "p95", min_days = 3)
cat(sprintf("Heat wave events over 15 years : %d\n", nrow(hw)))
#> Heat wave events over 15 years : 13
cat(sprintf("Mean duration : %.1f days\n",
mean(hw$duration)))
#> Mean duration : 3.4 days
cat(sprintf("Maximum peak temperature : %.1f °C\n",
max(hw$peak_temp)))
#> Maximum peak temperature : 43.5 °Chw$year <- as.integer(format(hw$start_date, "%Y"))
ann_hw <- tapply(hw$duration, hw$year, length)
barplot(ann_hw,
col = "firebrick",
border = NA,
xlab = "Year", ylab = "Number of events",
main = "Annual Heat Wave Frequency (p95 threshold, ≥3 days)")Annual heat wave count and mean duration over 15 years.
cold_spell_detection()cs <- cold_spell_detection(
tmin = tmax - stats::rnorm(length(tmax), 12, 1),
dates = dates,
threshold = "p05",
min_days = 3)
cat(sprintf("Cold spell events: %d\n", nrow(cs)))
#> Cold spell events: 6
if (nrow(cs) > 0) {
cat(sprintf("Mean duration : %.1f days\n", mean(cs$duration)))
}
#> Mean duration : 3.2 daysdrought_spell()set.seed(12)
spi_vals <- stats::rnorm(360)
dates_m <- seq(as.Date("1990-01-01"), by = "month",
length.out = 360)
droughts <- drought_spell(spi_vals, dates_m, threshold = -1.0,
min_duration = 2)
cat(sprintf("Drought spells detected : %d\n", nrow(droughts)))
#> Drought spells detected : 4
cat(sprintf("Mean duration : %.1f months\n",
mean(droughts$duration)))
#> Mean duration : 2.0 months
cat(sprintf("Maximum severity : %.2f\n",
max(droughts$severity)))
#> Maximum severity : 0.86extreme_value_index()The Hill estimator (Hill 1975) characterises how heavy-tailed a distribution is — important for wind speed, flood peaks, and other power-law climate variables.
set.seed(13)
ws <- abs(stats::rnorm(500, 10, 4))^1.5
ev <- extreme_value_index(ws, k = 30)
cat(sprintf("Hill tail index (k=%d): %.4f\n", ev$k_used, ev$hill_index))
#> Hill tail index (k=30): 0.1565
plot(ev$hill_plot$k, ev$hill_plot$hill, type = "l",
col = "steelblue", lwd = 1.5,
xlab = "k (number of order statistics)",
ylab = "Hill estimate",
main = "Hill Stability Plot")
abline(v = ev$k_used, col = "firebrick", lty = 2)Hill stability plot — plateau indicates a good choice of k.
spi()The SPI (McKee et al. 1993) standardises accumulated precipitation at any time scale (1 to 48+ months) against a gamma reference distribution. Values below −1.0 indicate moderate drought; below −2.0 indicate extreme drought.
precip <- stats::rgamma(360, shape = 2, scale = 35) # 30 years
spi3 <- spi(precip, scale = 3)
spi12 <- spi(precip, scale = 12)
cat(sprintf("SPI-3 — mean: %.3f SD: %.3f\n",
mean(spi3, na.rm = TRUE), stats::sd(spi3, na.rm = TRUE)))
#> SPI-3 — mean: -0.001 SD: 1.003
cat(sprintf("SPI-12 — mean: %.3f SD: %.3f\n",
mean(spi12, na.rm = TRUE), stats::sd(spi12, na.rm = TRUE)))
#> SPI-12 — mean: -0.003 SD: 1.031old_par <- graphics::par(mfrow = c(2, 1), mar = c(3, 4, 2, 1))
col3 <- ifelse(!is.na(spi3) & spi3 >= 0, "steelblue", "firebrick")
graphics::plot(spi3, type = "h", col = col3,
xlab = "", ylab = "SPI-3",
main = "Standardised Precipitation Index")
graphics::abline(h = c(-1, 1), lty = 2, col = "gray40")
col12 <- ifelse(!is.na(spi12) & spi12 >= 0, "steelblue", "firebrick")
graphics::plot(spi12, type = "h", col = col12,
xlab = "Month", ylab = "SPI-12")
graphics::abline(h = c(-1, 1), lty = 2, col = "gray40")SPI-3 (top) and SPI-12 (bottom) drought indices over 30 years.
graphics::par(old_par)spei()The SPEI (Vicente-Serrano et al. 2010) fits a log-logistic distribution to the climatic water balance (P − PET), making it sensitive to temperature- driven evapotranspiration increases — a key advantage under climate change.
set.seed(14)
tmin_m <- abs(stats::rnorm(360, 8, 3)) + 2
tmax_m <- tmin_m + stats::runif(360, 7, 14)
pr_m <- stats::rgamma(360, 5, 0.06)
sp6 <- spei(precip = pr_m, tmin = tmin_m, tmax = tmax_m,
lat = 45, scale = 6)
cat(sprintf("SPEI-6 — mean: %.3f SD: %.3f\n",
mean(sp6, na.rm = TRUE), stats::sd(sp6, na.rm = TRUE)))
#> SPEI-6 — mean: 2.312 SD: 1.165col_sp <- ifelse(!is.na(sp6) & sp6 >= 0, "steelblue", "firebrick")
graphics::plot(sp6, type = "h", col = col_sp,
xlab = "Month", ylab = "SPEI-6",
main = "SPEI-6 (Hargreaves PET, lat = 45°)")
graphics::abline(h = c(-1, 1), lty = 2, col = "gray40")SPEI-6 — the index accounts for evapotranspiration demand.
pdsi_simple()The Palmer Drought Severity Index integrates precipitation deficit and evapotranspiration over multiple months, making it sensitive to slowly evolving droughts.
doy_m <- rep(1:12, 30)
temp_p <- 10 + 12 * sin(pi * doy_m / 6) + stats::rnorm(360, 0, 1)
pr_p <- pmax(0, 50 + 20 * cos(pi * doy_m / 6) +
stats::rnorm(360, 0, 15))
pdsi_vals <- pdsi_simple(temp_p, pr_p, lat = 40)
graphics::plot(pdsi_vals, type = "l", col = "steelblue",
xlab = "Month", ylab = "PDSI",
main = "Simplified Palmer Drought Severity Index")
graphics::abline(h = c(-2, 2), lty = 2, col = c("firebrick","steelblue"))
graphics::abline(h = 0, lty = 3, col = "gray50")Simplified PDSI — values below −2 indicate severe drought.
heat_index()The Heat Index (Rothfusz 1990; Steadman 1979) converts temperature and relative humidity to an apparent “feels-like” temperature.
temp_grid <- seq(25, 45, by = 2)
rh_grid <- seq(30, 100, by = 5)
hi_mat <- outer(temp_grid, rh_grid,
FUN = function(t, r) heat_index(t, r))
image(temp_grid, rh_grid, hi_mat,
col = colorRampPalette(c("lightyellow","orange","firebrick"))(64),
xlab = "Air temperature (°C)",
ylab = "Relative humidity (%)",
main = "Heat Index (°C)")
contour(temp_grid, rh_grid, hi_mat,
levels = c(27, 32, 41, 54), add = TRUE, col = "white")Heat index surface: apparent temperature rises steeply with humidity.
wind_chill()temp_wc <- seq(-30, 5, by = 2)
wind_wc <- seq(5, 80, by = 5)
wc_mat <- outer(temp_wc, wind_wc, FUN = wind_chill)
image(temp_wc, wind_wc, wc_mat,
col = colorRampPalette(c("navy","steelblue","white"))(64),
xlab = "Air temperature (°C)",
ylab = "Wind speed (km/h)",
main = "Wind Chill Temperature (°C)")
contour(temp_wc, wind_wc, wc_mat,
levels = c(-40, -30, -20, -10), add = TRUE, col = "gray20")Wind chill surface: perceived temperature can be far below air temperature.
frost_days()dates_d <- seq(as.Date("2010-01-01"), by = "day",
length.out = 365 * 10)
doy_d <- as.integer(format(dates_d, "%j"))
tmin_d <- 5 - 15 * sin(2 * pi * doy_d / 365) +
stats::rnorm(length(dates_d), 0, 2)
fd <- frost_days(tmin_d, dates_d, by = "year")
barplot(fd, col = "steelblue", border = NA,
xlab = "Year", ylab = "Frost days",
main = "Annual Frost Day Count (Tmin < 0 °C)")Annual frost day count simulated over 10 years.
growing_degree_days()dates_g <- seq(as.Date("2020-01-01"),
as.Date("2020-12-31"), by = "day")
doy_g <- as.integer(format(dates_g, "%j"))
tmax_g <- 22 + 12 * sin(2 * pi * doy_g / 365)
tmin_g <- 10 + 8 * sin(2 * pi * doy_g / 365)
gdd_cum <- growing_degree_days(tmax_g, tmin_g,
base_temp = 10,
cumulative = TRUE)
plot(gdd_cum, type = "l", col = "darkgreen", lwd = 2,
xlab = "Day of year",
ylab = "Cumulative GDD (base 10 °C)",
main = "Growing Degree Days 2020")
abline(v = 91, lty = 2, col = "gray60") # ~April 1
abline(v = 274, lty = 2, col = "gray60") # ~Oct 1
text(91, max(gdd_cum) * 0.05, "Apr", col = "gray40", adj = 0)
text(274, max(gdd_cum) * 0.05, "Oct", col = "gray40", adj = 1)Cumulative GDD accumulation through the growing season.
diurnal_temp_range()dtr <- diurnal_temp_range(tmax_g, tmin_g, dates_g, by = "month")
barplot(dtr, col = "steelblue", border = NA,
names.arg = month.abb,
xlab = "Month", ylab = "DTR (°C)",
main = "Mean Diurnal Temperature Range by Month")Monthly mean DTR — a proxy for cloudiness and land-surface change.
detection_attribution()A projection-based approach that tests whether the observed climate change is distinguishable from natural internal variability (Hasselmann 1979; Allen and Tett 1999).
set.seed(20)
obs_anom <- cumsum(stats::rnorm(50, 0.03, 0.12))
nat_ens <- matrix(stats::rnorm(50 * 30, 0, 0.45), ncol = 30)
forc_sig <- seq(0, 1.5, length.out = 50)
da <- detection_attribution(obs_anom, nat_ens, forc_sig)
cat(sprintf("Signal detected : %s\n", da$detected))
#> Signal detected : TRUE
cat(sprintf("Z-score : %.3f\n", da$z_score))
#> Z-score : 3.498
cat(sprintf("p-value : %.4f\n", da$p.value))
#> p-value : 0.0005
cat(sprintf("Attribution fraction: %.1f%%\n",
da$attribution_fraction * 100))
#> Attribution fraction: 14.9%hist(da$projection_natural,
col = "steelblue",
border = "white",
breaks = 15,
xlab = "Projection onto forced signal (Pearson r)",
main = "Detection Test: Observed vs Natural Variability")
abline(v = da$projection_observed,
col = "firebrick", lwd = 3)
legend("topright",
legend = c("Natural ensemble",
sprintf("Observed (r = %.3f)", da$projection_observed)),
fill = c("steelblue", NA),
border = c("white", NA),
lty = c(NA, 1),
col = c(NA, "firebrick"),
lwd = c(NA, 3), bty = "n")Observed projection vs natural ensemble distribution — signal clearly separated.
fingerprint_analysis()Extracts leading Empirical Orthogonal Functions (Lorenz 1956; von Storch and Zwiers 1999) to identify dominant spatial modes of climate variability or change.
set.seed(21)
mat_eof <- matrix(stats::rnorm(60 * 200), nrow = 60)
# Inject a dominant warming pattern in first 80 locations
mat_eof[, 1:80] <- mat_eof[, 1:80] +
outer(seq(0, 2, length.out = 60), rep(1, 80))
fp <- fingerprint_analysis(mat_eof, n_eof = 3)
cat(sprintf("EOF1 explains: %.1f%% of variance\n",
fp$var_explained[1] * 100))
#> EOF1 explains: 14.6% of variance
cat(sprintf("EOF2 explains: %.1f%% of variance\n",
fp$var_explained[2] * 100))
#> EOF2 explains: 3.5% of variance
cat(sprintf("Cumulative (3 EOFs): %.1f%%\n",
fp$cumvar[3] * 100))
#> Cumulative (3 EOFs): 21.3%matplot(fp$pc[, 1:3], type = "l",
lty = 1, lwd = 1.5,
col = c("firebrick", "steelblue", "darkgreen"),
xlab = "Time step", ylab = "PC score",
main = "Leading EOF Principal Components")
legend("topleft",
legend = sprintf("PC%d (%.1f%%)", 1:3,
fp$var_explained * 100),
col = c("firebrick", "steelblue", "darkgreen"),
lty = 1, lwd = 1.5, bty = "n")Leading PC time series — EOF1 captures the forced warming trend.
optimal_fingerprint()Estimates scaling factors for anthropogenic (ANT) and natural (NAT) signals using GLS regression (Allen and Tett 1999; Hegerl and Zwiers 2011).
set.seed(22)
obs_c <- cumsum(stats::rnorm(50, 0.03, 0.10))
all_c <- cumsum(stats::rnorm(50, 0.028, 0.05)) +
cumsum(stats::rnorm(50, 0.005, 0.03))
nat_c <- cumsum(stats::rnorm(50, 0, 0.12))
ofp <- optimal_fingerprint(obs_c, all_c, nat_c)
cat(sprintf("ANT scaling factor: %.3f\n", ofp$beta_all))
#> ANT scaling factor: 1.320
cat(sprintf("NAT scaling factor: %.3f\n", ofp$beta_nat))
#> NAT scaling factor: 1.290
cat(sprintf("Residual variance : %.4f\n", ofp$residual_variance))
#> Residual variance : 0.1269fill_gaps_climate()Three methods: linear interpolation, monthly climatology, and reference-station regression.
x_gaps <- c(10.2, NA, NA, NA, 14.0, 15.1, NA, 17.3, 18.0)
cat("Original :", x_gaps, "\n")
#> Original : 10.2 NA NA NA 14 15.1 NA 17.3 18
cat("Filled :", round(fill_gaps_climate(x_gaps), 2), "\n")
#> Filled : 10.2 11.15 12.1 13.05 14 15.1 16.2 17.3 18homogenize_series()Corrects a detected SNHT break by shifting the earlier segment to match the mean of the later segment.
set.seed(25)
x_inh <- c(stats::rnorm(40, 14.0, 0.5),
stats::rnorm(40, 15.8, 0.5))
x_hom <- homogenize_series(x_inh)
cat(sprintf("Before adjustment — mean of segment 1: %.2f\n",
mean(x_inh[1:40])))
#> Before adjustment — mean of segment 1: 13.88
cat(sprintf("After adjustment — mean of segment 1: %.2f\n",
mean(x_hom[1:40])))
#> After adjustment — mean of segment 1: 15.71
cat(sprintf("Mean of segment 2 (reference) : %.2f\n",
mean(x_hom[41:80])))
#> Mean of segment 2 (reference) : 15.71aggregate_climate()dates_agg <- seq(as.Date("2000-01-01"), by = "day",
length.out = 365 * 5)
temp_agg <- stats::rnorm(length(dates_agg), 15, 6)
ann <- aggregate_climate(temp_agg, dates_agg, to = "annual")
seas <- aggregate_climate(temp_agg, dates_agg, to = "seasonal")
cat("Annual means:\n"); print(ann)
#> Annual means:
#> period value
#> 1 2000 15.02341
#> 2 2001 15.00400
#> 3 2002 15.48877
#> 4 2003 14.96704
#> 5 2004 14.52703anomaly_baseline()yr <- 1950:2020
temp_b <- 13.5 + 0.022 * seq_len(71) + stats::rnorm(71, 0, 0.45)
anom <- anomaly_baseline(temp_b, yr, 1961, 1990)
plot(yr, anom, type = "l", col = "steelblue", lwd = 1.5,
xlab = "Year", ylab = "Anomaly (°C)",
main = "Temperature Anomalies (1961–1990 baseline)")
abline(h = 0, lty = 2, col = "gray50")
lines(stats::lowess(yr, anom, f = 0.3),
col = "firebrick", lwd = 2.5)
legend("topleft",
legend = c("Annual anomaly", "LOWESS smoother"),
col = c("steelblue", "firebrick"),
lty = 1, lwd = c(1.5, 2.5), bty = "n")Temperature anomalies relative to 1961–1990 baseline.
standardize_climate()x_raw <- stats::rnorm(120, mean = 18, sd = 5)
z <- standardize_climate(x_raw)
cat(sprintf("Raw — mean: %.2f SD: %.2f\n",
mean(x_raw), stats::sd(x_raw)))
#> Raw — mean: 17.90 SD: 5.27
cat(sprintf("Std — mean: %.6f SD: %.6f\n",
mean(z), stats::sd(z)))
#> Std — mean: -0.000000 SD: 1.000000climate_summary()temp_long <- 13.5 + 0.022 * seq_len(71) + stats::rnorm(71, 0, 0.45)
res <- climate_summary(temp_long,
variable_name = "Annual Mean Temperature (°C)")
#>
#> === Climate Summary: Annual Mean Temperature (°C) ===
#> N observations : 71 (missing: 0)
#> Mean : 14.334
#> Std. deviation : 0.676
#> Min / Max : 13.087 / 15.808
#> 5th / 95th pct : 13.228 / 15.540
#> MK Tau : 0.5517 (p = 0.0000)
#> Trend : increasing
#> Sen's slope : 0.0237 per year
#> Slope /decade : 0.2372Allen, M. R. and Tett, S. F. B. (1999). Checking for model consistency in optimal fingerprinting. Climate Dynamics 15(6), 419–434. doi:10.1007/s003820050291
Alexandersson, H. (1986). A homogeneity test applied to precipitation data. International Journal of Climatology 6(6), 661–675. doi:10.1002/joc.3370060607
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate. Journal of the Royal Statistical Society: Series B 57(1), 289–300. doi:10.1111/j.2517-6161.1995.tb02031.x
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer-Verlag, London. doi:10.1007/978-1-4471-3675-0
Davison, A. C. and Smith, R. L. (1990). Models for exceedances over high thresholds. Journal of the Royal Statistical Society: Series B 52(3), 393–442. doi:10.1111/j.2517-6161.1990.tb01796.x
Getis, A. and Ord, J. K. (1992). The analysis of spatial association by use of distance statistics. Geographical Analysis 24(3), 189–206. doi:10.1111/j.1538-4632.1992.tb00261.x
Hegerl, G. C. and Zwiers, F. (2011). Use of models in detection and attribution of climate change. Wiley Interdisciplinary Reviews: Climate Change 2(4), 570–591. doi:10.1002/wcc.121
Hill, B. M. (1975). A simple general approach to inference about the tail of a distribution. Annals of Statistics 3(5), 1163–1174. doi:10.1214/aos/1176343247
Mann, H. B. (1945). Nonparametric tests against trend. Econometrica 13(3), 245–259. doi:10.2307/1907187
McKee, T. B., Doesken, N. J. and Kleist, J. (1993). The relationship of drought frequency and duration to time scales. Proceedings of the 8th Conference on Applied Climatology, 179–184. American Meteorological Society.
Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. Biometrika 37(1), 17–23. doi:10.2307/2332142
Pettitt, A. N. (1979). A nonparametric approach to the change-point problem. Applied Statistics 28(2), 126–135. doi:10.2307/2346729
Sen, P. K. (1968). Estimates of the regression coefficient based on Kendall’s tau. Journal of the American Statistical Association 63(324), 1379–1389. doi:10.2307/2285891
Steadman, R. G. (1979). The assessment of sultriness. Part I. Journal of Applied Meteorology 18(7), 861–873. doi:10.1175/1520-0450(1979)018%3C0861:TAOSPI%3E2.0.CO;2
Thornthwaite, C. W. (1948). An approach toward a rational classification of climate. Geographical Review 38(1), 55–94. doi:10.2307/210739
Vicente-Serrano, S. M., Begueria, S. and Lopez-Moreno, J. I. (2010). A Multiscalar Drought Index Sensitive to Global Warming. Journal of Climate 23(7), 1696–1718. doi:10.1175/2009JCLI2909.1
von Storch, H. and Zwiers, F. W. (1999). Statistical Analysis in Climate Research. Cambridge University Press, Cambridge. doi:10.1017/CBO9780511612336
Yue, S. and Wang, C. Y. (2002). Applicability of prewhitening to eliminate the influence of serial correlation on the Mann-Kendall test. Water Resources Research 38(6), 4-1–4-7. doi:10.1029/2001WR000861
sessionInfo()
#> R version 4.2.1 (2022-06-23 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 26200)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=C
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] climatestatsr_0.1.2
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 R6_2.6.1 lifecycle_1.0.5 jsonlite_2.0.0
#> [5] evaluate_1.0.5 cachem_1.1.0 rlang_1.1.7 cli_3.6.5
#> [9] otel_0.2.0 rstudioapi_0.18.0 jquerylib_0.1.4 bslib_0.10.0
#> [13] rmarkdown_2.31 tools_4.2.1 xfun_0.57 yaml_2.3.12
#> [17] fastmap_1.2.0 compiler_4.2.1 htmltools_0.5.9 knitr_1.51
#> [21] sass_0.4.10