Sensitivity Analysis¶

The food systems model involves many uncertain inputs: emission factors for different greenhouse gases, crop yield potentials, health dose-response parameters, and policy-level choices like carbon prices and health valuations. Understanding which of these uncertainties matter most for model outcomes is essential for directing research priorities and communicating result robustness.

Global sensitivity analysis answers the question: which input uncertainties drive the most variation in model outcomes? This is quantified through Sobol indices, which decompose the variance of each output into contributions from individual inputs and their interactions.

First-order index (\(S_1\)): The fraction of output variance attributable to a single input, acting alone.
Total-order index (\(S_T\)): The fraction of output variance attributable to a single input, including all its interactions with other inputs.

A parameter with \(S_1 \approx S_T\) influences the output mainly through its direct effect. A parameter with \(S_T \gg S_1\) is involved in significant interactions with other parameters.

This implementation supports two surrogate modelling approaches:

Polynomial Chaos Expansion (PCE): Computes Sobol indices analytically from a polynomial approximation to the model response.
Random Forest (RF): Computes Sobol indices via Monte Carlo permutation of a fitted random forest ensemble.

Both methods are fitted to the same set of solved model scenarios, enabling direct comparison of surrogate quality and derived sensitivity indices.

Methodology¶

Polynomial Chaos Expansion¶

The central idea is to approximate the model’s input-output mapping as a polynomial in the uncertain inputs. If \(\mathbf{X} = (X_1, \ldots, X_d)\) are the uncertain parameters and \(Y\) is a scalar output, the PCE representation is:

\[Y \approx \sum_{\boldsymbol{\alpha}} c_{\boldsymbol{\alpha}}\, \Psi_{\boldsymbol{\alpha}}(\mathbf{X})\]

where \(\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_d)\) is a multi-index, \(c_{\boldsymbol{\alpha}}\) are scalar coefficients, and \(\Psi_{\boldsymbol{\alpha}}\) are multivariate orthonormal polynomials with respect to the joint input distribution. Each \(\Psi_{\boldsymbol{\alpha}}\) is a product of univariate orthonormal polynomials:

\[\Psi_{\boldsymbol{\alpha}}(\mathbf{X}) = \prod_{i=1}^d \psi_{\alpha_i}^{(i)}(X_i)\]

where \(\psi_k^{(i)}\) is the degree-\(k\) orthonormal polynomial for the marginal distribution of \(X_i\). For uniform inputs these are (normalised) Legendre polynomials; for normal inputs, Hermite polynomials.

The orthonormality property \(\mathbb{E}[\Psi_{\boldsymbol{\alpha}} \Psi_{\boldsymbol{\beta}}] = \delta_{\boldsymbol{\alpha}\boldsymbol{\beta}}\) is what makes the variance decomposition exact: the variance of the expansion is simply the sum of squared coefficients. Once the coefficients are known, Sobol indices follow directly without further model evaluations.

The polynomial basis is generated using chaospy. A cross-truncation parameter \(q \in (0, 1]\) controls the multi-index set: lower values favour lower-order interaction terms, keeping the basis compact. The default is \(q = 0.5\).

Sparse Fitting via LARS¶

With many parameters and moderate polynomial degree, the number of candidate basis terms can exceed the number of model samples. Rather than requiring a full tensor-product design, the implementation uses Least Angle Regression (LARS) with cross-validation to select a sparse subset of active terms.

LARS incrementally adds basis terms that are most correlated with the current residual, using cross-validation to choose the optimal number of active terms. This produces a parsimonious expansion that captures the dominant polynomial structure without overfitting.

The fitting uses sklearn.linear_model.LarsCV with 5-fold cross-validation.

Validation: Surrogate quality is assessed through multiple metrics:

Holdout error: When holdout_fraction > 0 (recommended), the tail of the Sobol sequence is reserved as held-out test data. The surrogate is fitted on the remaining training samples and evaluated on the holdout set. This gives an honest, out-of-sample error estimate. Using the tail preserves the space-filling quality of the training design (Sobol sequences front-load coverage).
Leave-one-out error (PCE only): A relative error computed via the hat matrix without refitting. Reported alongside holdout error for comparison.
Out-of-bag R² (RF only): The OOB score from bootstrap aggregation. Reported alongside holdout error for comparison.
R-squared (\(R^2\)): Coefficient of determination on training data.

The primary validation_error field in output files is the holdout error when available, falling back to LOO (PCE) or OOB (RF) error otherwise.

Random Forest¶

As an alternative to PCE, a Random Forest ensemble can be fitted to the same model samples. Sobol indices are computed via Monte Carlo integration: for each parameter, the marginal effect is estimated by permuting that parameter’s values while holding others fixed, then comparing the variance reduction.

Random Forests are non-parametric and can capture non-smooth or discontinuous model responses that polynomials may miss. However, they are more expensive to evaluate conditionally (requiring Monte Carlo samples at each grid point) and produce noisier sensitivity estimates.

The implementation uses sklearn.ensemble.RandomForestRegressor with OOB scoring enabled.

Sobol Indices from PCE Coefficients¶

Thanks to the orthonormality of the basis, the total variance of the expansion is:

\[D = \sum_{\boldsymbol{\alpha} \neq \mathbf{0}} c_{\boldsymbol{\alpha}}^2\]

The first-order Sobol index for parameter \(i\) sums the squared coefficients of terms where only \(\alpha_i > 0\) (no other parameter is active):

\[\begin{split}S_{1,i} = \frac{1}{D} \sum_{\substack{\boldsymbol{\alpha}:\, \alpha_i > 0 \\ \alpha_j = 0\; \forall\, j \neq i}} c_{\boldsymbol{\alpha}}^2\end{split}\]

The total-order Sobol index sums all terms where \(\alpha_i > 0\), regardless of other active parameters:

\[S_{T,i} = \frac{1}{D} \sum_{\boldsymbol{\alpha}:\, \alpha_i > 0} c_{\boldsymbol{\alpha}}^2\]

These indices satisfy \(0 \le S_{1,i} \le S_{T,i} \le 1\) and \(\sum_i S_{1,i} \le 1\) (with equality when there are no interactions).

Conditional Sensitivity Analysis¶

Some parameters are policy choices (e.g., GHG price, value per YLL) rather than epistemic uncertainties. It is often useful to ask: given a specific policy setting, how sensitive are outcomes to the remaining uncertain parameters?

Designated slice parameters are fixed at specific conditioning values. For PCE, the conditioning is performed analytically: for each slice parameter \(j\) with conditioning value \(x_j^*\), the univariate basis is evaluated at that value, and the resulting factors are absorbed into the PCE coefficients:

\[c'_{\boldsymbol{\alpha}} = c_{\boldsymbol{\alpha}} \cdot \prod_{j \in \text{slice}} \psi_{\alpha_j}^{(j)}(x_j^*)\]

Sobol indices are then computed from the transformed coefficients \(c'_{\boldsymbol{\alpha}}\), considering only terms where at least one non-slice parameter is active. The result is a set of Sobol indices for the remaining parameters, conditional on the policy choices — showing how sensitivity patterns shift as policy values change.

For RF, conditioning is done via Monte Carlo: slice parameters are held fixed while remaining parameters are sampled from their marginal distributions, and Sobol indices are computed from the resulting predictions.

Experimental Design¶

Sobol Quasi-Random Sequences¶

The model is sampled using a scrambled Sobol sequence — a quasi-random, low-discrepancy design that provides better coverage of the parameter space than simple random sampling, especially in high dimensions. The Sobol sequence fills the \([0, 1]^d\) hypercube with balanced coverage, and an inverse CDF transform maps each dimension to the corresponding parameter distribution.

The implementation uses scipy.stats.qmc.Sobol with scrambling enabled for improved uniformity. Scrambling is seeded for deterministic reproducibility (default seed: 42).

Note

The number of samples should be a power of 2 for optimal balance of the Sobol sequence (e.g., 64, 128, 256, 512, 1024).

Supported Distributions¶

Each uncertain parameter is assigned an independent marginal distribution. The joint distribution is the product of the marginals (i.e., parameters are assumed independent).

Distribution	Required fields	Optional fields	Description
`uniform` (default)	`lower`, `upper`		Flat distribution over [lower, upper]
`log_uniform`	`lower`, `upper`		Log-uniform: uniform on log scale over [lower, upper] (both > 0)
`normal`	`mean`, `std`	`bounds`	Gaussian with given mean and standard deviation
`normal_ci`	`lower`, `upper`	`confidence`, `bounds`	Normal distribution where [lower, upper] defines a confidence interval
`lognormal`	`mu`, `sigma`		Log-normal with log-scale mean and std

When the distribution field is omitted, uniform is assumed (requiring only lower and upper).

The normal_ci distribution derives its mean as (lower + upper) / 2 and its standard deviation from the confidence level (default: 0.9, i.e. 90% CI). This is useful when the literature reports uncertainty as a confidence interval around a central value rather than as explicit standard deviations.

The optional bounds field (a two-element list [lo, hi]) truncates normal and normal_ci distributions to the given range using a truncated normal. Use null for an unbounded side (e.g., bounds: [0, null] enforces non-negativity). This prevents physically meaningless values (e.g., negative multiplicative factors) in the tails of the distribution.

Configuration¶

Sensitivity analysis is configured through the _generators DSL in the scenarios file (see Configuration for the full generator syntax). A sensitivity generator uses mode: sensitivity and specifies parameter distributions rather than fixed value lists.

Full example (from the production configuration):

# config/gsa.yaml
name: "gsa"

scenarios:
  default: {}

  _generators:
    - name: gsa_{sample_id}
      mode: sensitivity
      samples: 4096
      slice_parameters: [value_per_yll, ghg_price]
      parameters:
        yield_factor:
          lower: 0.8
          upper: 1.2
          bounds: [0, null]
        ch4_factor:
          distribution: normal_ci
          lower: 0.5
          upper: 1.5
          confidence: 0.9
          bounds: [0, null]
        # ... (remaining parameters)
      template:
        sensitivity:
          crop_yields:
            all: "{yield_factor}"
          emission_factors:
            ch4: "{ch4_factor}"
            # ...
        health:
          value_per_yll: "{value_per_yll}"
        emissions:
          ghg_price: "{ghg_price}"

# Surrogate fitting methods (applied independently to the same scenarios)
sensitivity_analysis:
  holdout_fraction: 0.15
  methods:
    pce:
      grid_resolution: 50
      method_options:
        n_mc_conditional: 4096
        cross_truncation: 0.8
    rf:
      grid_resolution: 50
      method_options:
        n_estimators: 500

The generator defines only the scenario sampling design (parameters, distributions, sample count). Surrogate method configuration lives in the separate sensitivity_analysis section, allowing multiple methods to be applied to the same solved scenarios without duplication.

Health relative risk parameters use a quantile parameterization: each rr_<risk_factor> value is a quantile \(q \in [0, 1]\) that interpolates between the GBD confidence bounds [15] at every dose-response breakpoint:

\[\log(\text{RR}(q)) = (1 - q) \cdot \log(\text{RR}_{\text{low}}) + q \cdot \log(\text{RR}_{\text{high}})\]

\(q = 0\): GBD lower bound (strongest protective effect for beneficial foods)
\(q = 1\): GBD upper bound (weakest protective effect for beneficial foods)

This is applied per (risk factor, cause, exposure) point, so a single quantile parameter per risk factor produces cause-specific adjustments automatically.

Generator field reference:

name: Scenario name pattern. Use {sample_id} as a placeholder for the zero-indexed sample number (e.g., gsa_{sample_id} produces gsa_0, gsa_1, …, gsa_4095).
mode: sensitivity: Activates space-filling Sobol sampling with distribution-based parameter specifications.
samples: Number of samples to draw. Should be a power of 2.
seed (optional): Random seed for the scrambled Sobol sequence. Default: 42.
slice_parameters (optional): List of parameter names to use as conditioning variables in the conditional Sobol analysis.
parameter_groups (optional): Mapping of group names to parameter lists, used for plot organisation and colour grouping.
parameters: Mapping of parameter names to distribution specifications (see Supported Distributions above).
template: Configuration template with {param_name} placeholders that are substituted with sampled values. Type is preserved when the placeholder is the entire value.

``sensitivity_analysis`` field reference:

holdout_fraction: Fraction of samples reserved for out-of-sample validation (e.g., 0.15 for 15%). Set to 0 to disable holdout.
methods: Mapping of method names (pce, rf) to method-specific config. Each method entry supports:
- grid_resolution (default: 100): Number of grid points for conditional Sobol evaluation along each slice parameter axis.
- method_options: Method-specific hyperparameters:
  - PCE: max_degree (default: 3), cross_truncation (default: 0.5), n_mc_conditional (default: 4096).
  - RF: n_estimators (default: 500), n_mc_global (default: 16384), n_mc_conditional (default: 8192).

Parameter Range Justification¶

This section documents the uncertainty ranges assigned to each sensitivity parameter, with references to the scientific literature. The range represents the multiplicative factor applied to the model’s default values.

Distribution choice. Parameters whose ranges are derived from formal confidence intervals reported in the literature use normal_ci distributions, where lower and upper define the 90% confidence interval of a normal distribution (truncated at zero to prevent physically meaningless negative factors). This applies to the three emission-related factors (CH₄, N₂O, and LUC), whose ranges are grounded in IPCC confidence intervals. The remaining parameters (crop yields, food loss & waste, feed conversion ratios) use uniform distributions because their ranges are derived from informal expert assessments, inter-estimate disagreement, or error metrics that do not correspond to a well-defined confidence level.

The CH₄ and N₂O sensitivity factors represent combined uncertainty in both the underlying emission factors (measurement and methodology uncertainty) and the 100-year global warming potentials (GWP100) used to convert physical emissions to CO₂-equivalents. The IPCC AR6 reports GWP100 90% confidence intervals of ±40% for CH₄ and ±47% for N₂O [1]. Since EF and GWP uncertainties are independent (the former is an agricultural measurement question, the latter a climate science question), they combine approximately in quadrature.

CH₄ factor (`ch4_factor`: 0.5–1.5)¶

This factor scales all CH₄ emissions in the model: enteric fermentation, manure management (from animal production links), and rice paddy emissions (from wetland rice crop production links).

Emission factor uncertainty. The IPCC 2019 Refinement reports a Tier 1 uncertainty of ±30–50% (95% CI) for livestock CH₄ emission factors [2]. Enteric fermentation dominates total livestock CH₄ (roughly 80–90% of the total). Species-specific standard deviations in the 2019 Refinement (Tables 10.12–10.13) translate to 95% CIs of ±26% (sheep) to ±39% (cattle/buffalo). Manure management uncertainty is considerably larger: Hristov et al. report a 95% CI of ±63–65% for US manure CH₄ using gridded Monte Carlo analysis [3]. A GLEAM one-at-a-time sensitivity analysis found approximately ±39% total variation in ruminant CH₄ when perturbing all 92 input parameters [4]. The midpoint of the IPCC Tier 1 range, ±40%, is a reasonable central estimate for the emission factor alone.

Combined uncertainty. Adding GWP100 uncertainty (±40%, 90% CI [1]) in quadrature with the emission factor uncertainty (±40%, 95% CI) gives a combined uncertainty of approximately ±57%. The ±50% range is a conservative rounding of this combined estimate.

Distribution. Since both the IPCC GWP100 (90% CI) and Tier 1 emission factor (95% CI) uncertainties are reported as formal confidence intervals, the combined range is treated as a 90% CI of a normal_ci distribution (truncated at zero). This is slightly conservative relative to the quadrature result.

N₂O factor (`n2o_factor`: 0.3–1.7)¶

This factor scales all N₂O emissions in the model: manure management and manure applied to soils (from animal production links), and direct and indirect emissions from synthetic fertiliser application (from fertiliser distribution links). N₂O emission factors are among the most uncertain parameters in agricultural GHG inventories.

Emission factor uncertainty. The IPCC 2019 Refinement [5] reports the aggregated Tier 1 direct emission factor EF₁ = 0.01 with a 95% CI of 0.001–0.018, corresponding to multiplicative factors of ×0.1–1.8. When disaggregated by climate, the 95% CIs are tighter (e.g., wet-climate synthetic fertiliser: 0.013–0.019 around 0.016, i.e., ±19%) but differ substantially between wet and dry climates.

Since the model uses the aggregated EF₁ = 0.01 uniformly across countries, it mixes climate zones where the true EF differs by a factor of ~3 (wet 0.016 vs. dry 0.005). An emission-factor-only range of approximately ±50% captures:

The global aggregate N₂O uncertainty from Tian et al., who synthesise bottom-up inventories to estimate total agricultural N₂O at 3.8 Tg N/yr with a min–max range across methodologies of 2.5–5.8 Tg N/yr, i.e., factors of ×0.66–1.53 relative to the central estimate [6].
The structural uncertainty from using aggregated rather than climate-disaggregated emission factors. Hergoualc’h et al. showed that propagating the 95% CIs of disaggregated EFs gives a global estimate of 883–1,285 Gg N₂O-N/yr (±19% around the midpoint), whereas the aggregated method spans 539–2,713 Gg/yr [7].
Indirect N₂O pathways with very wide 95% CIs (EF₄ for volatilisation: 0.002–0.05; Frac_LEACH: 0.10–0.80) [5].

The range is narrower than the full IPCC aggregated 95% CI [×0.1, ×1.8] because global aggregation across countries and N sources averages out regional extremes. Evidence of possible systematic underestimation (legacy effects suggesting a true global mean EF of ~1.9% [8]) supports including factors well above 1.0 in the range.

Combined uncertainty. Adding GWP100 uncertainty (±47%, 90% CI [1]) in quadrature with the emission factor uncertainty (±50%) gives a combined uncertainty of approximately ±69%, rounded to ±70%.

Distribution. As with CH₄, both contributing uncertainties are formal confidence intervals, so the combined range is treated as a 90% CI of a normal_ci distribution (truncated at zero).

Land-use change emissions (`luc_factor`: 0.3–1.7)¶

This factor scales CO₂ emissions from land conversion (both cropland and pasture expansion). LUC emissions are among the most uncertain components of the global carbon budget, driven by uncertainty in carbon stocks, the spatial pattern of conversion, and methodological choices.

The ±70% range aligns with the IPCC AR6 WGIII assessment, which reports a 90% CI of ±70% for CO₂-LULUCF emissions — the largest fractional uncertainty of any major emission category [9]. This is consistent with:

The Global Carbon Budget: E_LUC for 2022 is 1.2 ± 0.7 GtC/yr (semi-quantitative 1σ / 68% CI), with biogeochemical parameterisation as the dominant uncertainty component [10].
Above-ground biomass carbon stock estimates: the IPCC 2019 Refinement default AGB values for tropical forests have standard deviations averaging ~55% of the mean across ecological zones and continents [11]. The GlobBiomass dataset used in global maps has a global mean error of ~50% [12].
Soil organic carbon stock change factors (F_LU, F_MG, F_I): 95% CIs of ±11–16% per factor, propagating to a combined SOC-change uncertainty of ~20–25% for well-characterised climate zones, reaching ±50% in poorly sampled regions [13].
Amortisation period choices: switching from the conventional 20-year to a 30-year period changes annualised emissions by ~33% [14].

Distribution. The IPCC AR6 WGIII directly reports ±70% as a 90% CI, making this the most straightforward case for a normal_ci distribution (truncated at zero).

Crop yield factor (`yield_factor`: 0.8–1.2)¶

This factor scales all crop production yields (efficiency on crop_production links), representing uncertainty in attainable yield estimates. The model uses GAEZ v5 attainable yield data, which are subject to climate model uncertainty, agronomic model limitations, and spatial aggregation error.

The ±20% range is supported by:

Grid-cell prediction error: Normalised root-mean-square error (NRMSE) between GAEZ attainable yields and observed yields is 18–28% across major crop groups [16].
Inter-annual variability: Coefficients of variation in national crop yields are 13–22% for major cereals, reflecting weather-driven fluctuations that the model’s single-year snapshot does not capture [17].
Climate model spread: GAEZ v5 provides yield projections under five GCMs (GFDL-ESM4, IPSL-CM6A-LR, MPI-ESM1-2-HR, MRI-ESM2-0, UKESM1-0-LL); inter-model yield spread is typically 10–25% for a given crop and region.

Yield uncertainty propagates strongly to land use (more yield means less land required) and GHG emissions (through reduced land-use change pressure).

Distribution. A uniform distribution is used because the range is synthesised from heterogeneous error metrics (NRMSE, CV, model spread) that do not correspond to a formal confidence interval at any specific level.

Food loss and waste factor (`flw_factor`: 0.7–1.3)¶

This factor scales the efficiency of food_processing links, which incorporate food loss and waste fractions from SDG 12.3.1 data (food loss index from FAO + food waste index from UNEP). A factor below 1.0 means more food is lost than the baseline estimates suggest (higher FLW); above 1.0 means less food is lost (lower FLW).

The ±30% range is supported by:

Inter-estimate disagreement: Global FLW estimates range from ~24% of food supply on a caloric basis [18] to ~33% on a mass basis [19] to ~40% when on-farm harvest losses are included [20], giving multiplicative factors of ×0.73–1.21 relative to the FAO central estimate.
Measurement bias: Self-reported consumer waste data (which feed into the UNEP Food Waste Index) systematically underestimate actual waste by 20–40% compared to direct waste compositional analysis [21].
Country-level data gaps: The SDG 12.3.1 Food Loss Index has a stated random error of ~25% at the country level [22]. Only ~12% of the global population lives in countries that directly track FLW. Much post-harvest loss data for developing countries was collected over 30 years ago [23].

The dominant bias direction in the literature is underestimation, which would support an asymmetric range skewed higher. The symmetric ±30% range is a conservative simplification.

Distribution. A uniform distribution is used because the range is derived from inter-estimate disagreement and data gap assessments rather than formal confidence intervals.

Feed conversion ratio factor (`fcr_factor`: 0.8–1.2)¶

This factor scales feed conversion efficiencies (efficiency on animal_production links), representing uncertainty in how much feed is required per unit of animal product. Higher values mean better conversion (more product per unit feed). The model uses Wirsenius (2000) regional feed energy requirements [24] converted via NRC net-energy-to-metabolisable-energy factors [25].

The ±20% range is supported by:

Inter-source disagreement: The model applies calibration multipliers (up to 2.0×) to reconcile Wirsenius-based efficiencies with GLEAM feed baselines. After calibration, residual disagreement between Wirsenius and GLEAM / Herrero et al. [26] is typically 10–30% for a given region–product pair.
Energy conversion uncertainty: The NE-to-ME conversion factors (k_m=0.60, k_g=0.40, k_l=0.60) are NRC “typical” values. Published ranges span k_m: 0.55–0.65 and k_g: 0.35–0.45, introducing ~10–15% uncertainty [25].
Temporal lag: The Wirsenius data reflects ~1994–1998 conditions. Monogastric FCRs have improved ~10–20% since then through genetic progress (~0.5–1%/year for poultry and pork); ruminant improvement is minimal.
Precedent: Alexander et al. [27] and Springmann et al. [28] both used ±20% for FCR uncertainty in Monte Carlo global food system analyses. The IPCC 2019 Refinement reports ±20% uncertainty for Tier 2 feed intake estimates [2].

The ±20% range captures data source disagreement, conversion factor uncertainty, and temporal lag without bleeding into inter-system variation (which is already represented by the model’s regionalized FCR assignment).

Distribution. A uniform distribution is used because the range is based on inter-source disagreement and precedent from other studies that also used uniform distributions for FCR uncertainty [27] [28].

Health relative risk parameters (`rr_protective`, `rr_harmful`: 0–1)¶

Relative risk uncertainty is parameterized as quantiles \(q \in [0, 1]\) that interpolate between the GBD lower and upper confidence bounds for dose-response relative risks. At \(q = 0\) the strongest effect estimate is used; at \(q = 1\) the weakest.

Rather than specifying a separate quantile per risk factor, two grouped parameters reduce dimensionality:

rr_protective: applies to all risk factors whose RR decreases with intake (e.g., fruits, vegetables, whole grains, legumes, nuts & seeds).
rr_harmful: applies to all risk factors whose RR increases with intake (e.g., red meat, processed meat).

Direction is inferred automatically from the dose-response data: for each risk factor, log_rr at the lowest intake is compared with log_rr at the highest intake. This grouping is justified because protective food groups share a common uncertainty mechanism (GBD confidence interval bounds).

Individual risk factor keys (e.g., whole_grains: 0.5) remain supported and take precedence over group keys when both are specified. However, specifying both a group key and an individual key for the same risk factor raises an error.

Production stability cost (`prod_stability_cost`: 0.05–0.5, slice parameter)¶

This parameter controls the L1 penalty cost applied to deviations of crop and animal product production from their current (baseline) levels. In the model’s production stability mode (penalty_mode: "l1"), each unit of absolute deviation incurs a cost of l1_cost (bn USD per Mha for crops/grassland, or Mha-equivalent for animals). The penalty induces the optimizer to replicate current production patterns rather than radically restructuring the food system.

The range 0.05–0.5 spans the behavioural transition zone: below ~0.05 the penalty is too weak to prevent large production shifts; by ~0.5 deviations have largely flattened to a residual floor. The central calibrated value of 0.22 pushes cropland deviation to approximately 5%.

Distribution. A uniform distribution is used because the range reflects a modelling choice (how strongly to penalise production deviations) rather than an empirically grounded uncertainty estimate.

Slice parameter. Because the production stability cost is a structural modelling choice rather than an empirical uncertainty, it is designated as a slice parameter alongside the policy axes. This separates the effect of the modeller’s regularisation choice from the variance attributable to physical parameter uncertainties. Conditional Sobol indices are reported at specific L1 cost values (0.05, 0.22, 0.5), showing how sensitivity structure shifts with regularisation strength.

Policy slice parameters¶

The value per YLL (value_per_yll: 0–20,000 USD) and GHG price (ghg_price: 0–300 USD/tCO₂-eq) are policy-choice parameters rather than epistemic uncertainties. They are designated as slice parameters: included in the PCE fit but analytically conditioned on specific values to show how sensitivity patterns shift across the policy space. Their ranges are chosen to span a wide policy-relevant domain without claiming to represent a specific probability distribution.

References

Running the Analysis¶

The sensitivity analysis requires three stages: build all sampled scenarios, solve them, then fit surrogates. Snakemake handles all dependencies automatically.

Run the full pipeline (build + solve + analyze for all samples):

tools/smk -j4 --configfile config/gsa.yaml

Run just the surrogate analysis step (after scenarios are already solved):

# PCE analysis for the default scenario group
tools/smk -j4 --configfile config/gsa.yaml -- \
    results/gsa/analysis/sobol_global_indices_gsa_pce.parquet

# RF analysis for the same scenarios
tools/smk -j4 --configfile config/gsa.yaml -- \
    results/gsa/analysis/sobol_global_indices_gsa_rf.parquet

Output paths use two wildcards: {group} identifies the scenario sampling group (e.g., gsa, gsa-l1-low) and {method} selects the surrogate (pce or rf). Both methods consume the same solved scenarios.

Note

Each sample requires a full model build and solve. Start with a small sample count (32–64) for testing, then increase (1024–4096) for production. A coarser spatial resolution also reduces per-sample cost.

Output Files¶

Four Parquet files are written to results/{name}/analysis/ per (group, method) combination:

sobol_global_indices_{group}_{method}.parquet — Global Sobol indices

Column	Type	Description
`output`	string	Output metric (`total_cost`, `ghg_emissions`, `land_use`, `yll`)
`parameter`	string	Parameter name from generator spec
`S1`	float	First-order Sobol index
`ST`	float	Total-order Sobol index

One row per (output, parameter) pair.

sobol_conditional_indices_{group}_{method}.parquet — Conditional Sobol indices (1D slices)

Column	Type	Description
`output`	string	Output metric
`parameter`	string	Parameter name (non-slice parameters only)
`S1_cond`	float	Conditional first-order Sobol index
`ST_cond`	float	Conditional total-order Sobol index
`conditional_variance`	float	Output variance when slice parameters are fixed
slice columns	float	One column per slice parameter with the conditioning value

One row per (output, parameter, conditioning-value combination).

sobol_conditional_joint_indices_{group}_{method}.parquet — Joint conditional Sobol indices (2D grid)

Same schema as above, but conditioned on all slice parameters simultaneously over a 2D grid. Used by the dominant factor phase diagram plot.

sobol_validation_{group}_{method}.parquet — Surrogate quality metrics

Column	Type	Description
`output`	string	Output metric
`validation_error`	float	Primary error metric (holdout error when available)
`r2_train`	float	R² on training data
`r2_test`	float	R² on holdout data (null if holdout disabled)
`n_train`	int	Number of training samples
`n_test`	int	Number of holdout samples
`method`	string	Surrogate method (`pce` or `rf`)

Additional method-specific columns: loo_error, n_terms, n_active_terms, max_degree (PCE); oob_error, n_estimators (RF).

Plots

Three types of sensitivity plots are generated per (group, method):

tools/smk -j4 --configfile config/gsa.yaml -- \
    results/gsa/plots/sobol_conditional_s1_vs_ghg_price_gsa_pce.pdf

Stacked area charts (sobol_conditional_s1_vs_{slice}_{group}_{method}.pdf): Conditional first-order Sobol shares vs each slice parameter. One panel per model output.
Dominant factor phase diagrams (sobol_conditional_dominant_factor_{group}_{method}.pdf): 2D policy space coloured by which parameter has the highest conditional S1.
Contour surfaces (sobol_conditional_s1_surface_{parameter}_{group}_{method}.pdf): Per-parameter conditional S1 surface over the 2D policy space.

Interpreting Results¶

Reading Sobol indices:

\(S_1 \approx 1\): This parameter is the dominant driver; reducing its uncertainty would substantially reduce output uncertainty.
\(S_T \gg S_1\): This parameter is involved in significant interactions with other parameters.
\(S_1 \approx S_T \approx 0\): This parameter has negligible influence on the output.

Example interpretation: If yield_factor has \(S_1 = 0.6\) for ghg_emissions, then 60% of the variance in GHG emissions is explained by crop yield uncertainty alone.

Validation quality:

Validation error < 0.1 indicates a reliable surrogate.
Validation error > 0.1 suggests the surrogate approximation is insufficient. For PCE, consider increasing samples or polynomial degree. For RF, consider increasing the number of estimators. Comparing PCE and RF errors can reveal whether the issue is model non-smoothness (where RF may outperform PCE) or insufficient data.

Conditional indices: These show how sensitivity patterns shift with policy choices. For instance, at low GHG prices, yield uncertainty may dominate emissions variance, while at high GHG prices, land-use-change factors may become more important as the model restructures production patterns.

Sensitivity Analysis¶

Methodology¶

Polynomial Chaos Expansion¶

Sparse Fitting via LARS¶

Random Forest¶

Sobol Indices from PCE Coefficients¶

Conditional Sensitivity Analysis¶

Experimental Design¶

Sobol Quasi-Random Sequences¶

Supported Distributions¶

Configuration¶

Parameter Range Justification¶

CH4 factor (ch4_factor: 0.5–1.5)¶

N2O factor (n2o_factor: 0.3–1.7)¶

Land-use change emissions (luc_factor: 0.3–1.7)¶

Crop yield factor (yield_factor: 0.8–1.2)¶

Food loss and waste factor (flw_factor: 0.7–1.3)¶

Feed conversion ratio factor (fcr_factor: 0.8–1.2)¶

Health relative risk parameters (rr_protective, rr_harmful: 0–1)¶

Production stability cost (prod_stability_cost: 0.05–0.5, slice parameter)¶

Policy slice parameters¶

Running the Analysis¶

Output Files¶

Interpreting Results¶

CH₄ factor (`ch4_factor`: 0.5–1.5)¶

N₂O factor (`n2o_factor`: 0.3–1.7)¶

Land-use change emissions (`luc_factor`: 0.3–1.7)¶

Crop yield factor (`yield_factor`: 0.8–1.2)¶

Food loss and waste factor (`flw_factor`: 0.7–1.3)¶

Feed conversion ratio factor (`fcr_factor`: 0.8–1.2)¶

Health relative risk parameters (`rr_protective`, `rr_harmful`: 0–1)¶

Production stability cost (`prod_stability_cost`: 0.05–0.5, slice parameter)¶