Configuration

Overview

The food-opt model is configuration-driven: all scenario parameters, crop selections, constraints, and solver options are defined in YAML configuration files under config/. This allows exploring different scenarios without modifying code.

The default configuration is config/default.yaml, structured into thematic sections.

Custom configuration files

Rather than editing the default configuration file, create a named configuration file for each scenario that overrides only the relevant parts of the default. A named configuration file must contain at least a name. For example:

# config/my_scenario.yaml
name: "my_scenario"           # Scenario name → results/my_scenario/
planning_horizon: 2040        # Override the default 2030 horizon
land:
  regional_limit: 0.6         # Tighten land availability
  slack_marginal_cost: 1e10   # Optional: raise slack penalty during validation
emissions:
  ghg_price: 250              # Raise the carbon price above the default

Any keys omitted in your custom file fall back to the defaults shown in the sections below, so you can keep overrides concise.

By default, results are saved under results/{name}/, allowing multiple scenarios coming from different configuration files to coexist. This root (and roots for processing, logs, and benchmarks) can be overridden via paths in the config.

To build and solve the model based on the above example configuration, you would run the following:

tools/smk -j4 --configfile config/my_scenario.yaml

Scenario Presets

The workflow supports scenario presets defined in config/scenarios.yaml that apply configuration overrides via a {scenario} wildcard. This allows exploring variations (e.g., with/without health constraints or GHG pricing) within a single configuration without duplicating config files.

Each scenario preset in scenarios.yaml contains a set of configuration overrides that are applied recursively on top of the base configuration. For example:

# config/scenarios.yaml
default:
  health:
    enabled: false
  emissions:
    ghg_pricing_enabled: false

HG:
  health:
    enabled: true
  emissions:
    ghg_pricing_enabled: true
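
The recursive override behaviour can be sketched in Python as follows. This is a minimal illustration and not the workflow's actual loader: the helper name deep_merge is hypothetical, and the base values are made up for the example.

```python
from copy import deepcopy


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` applied recursively."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: descend instead of replacing wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys replace or extend the base entry.
            merged[key] = deepcopy(value)
    return merged


# Illustrative base configuration (values are assumptions, not real defaults).
base = {
    "health": {"enabled": False, "value_per_yll": 1000},
    "emissions": {"ghg_pricing_enabled": False},
}
preset_hg = {"health": {"enabled": True}, "emissions": {"ghg_pricing_enabled": True}}

config = deep_merge(base, preset_hg)
```

Keys that the preset does not mention (here health.value_per_yll) keep their base value, which is why presets can stay short.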

With default path roots, the scenario name becomes part of all output paths:

  • Built models: results/{name}/build/model_scen-{scenario}.nc

  • Solved models: results/{name}/solved/model_scen-{scenario}.nc

  • Plots: results/{name}/plots/scen-{scenario}/
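
As a rough sketch, the path patterns above can be assembled like this. The function scenario_paths is a hypothetical helper written for illustration; the workflow itself defines these paths in its Snakemake rules.

```python
from pathlib import PurePosixPath


def scenario_paths(name: str, scenario: str, results_root: str = "results") -> dict:
    """Build the default output locations for one scenario preset."""
    root = PurePosixPath(results_root) / name
    return {
        "built": str(root / "build" / f"model_scen-{scenario}.nc"),
        "solved": str(root / "solved" / f"model_scen-{scenario}.nc"),
        "plots": str(root / "plots" / f"scen-{scenario}"),
    }


paths = scenario_paths("my_scenario", "HG")
```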

To build a specific scenario:

tools/smk -j4 --configfile config/my_scenario.yaml -- results/my_scenario/build/model_scen-HG.nc

This feature enables systematic sensitivity analysis and comparison across policy scenarios using a single configuration file.

Programmatic Scenario Generation

When conducting sensitivity analyses or parameter sweeps, you often need many scenarios that differ only in one or two parameter values. Writing these out manually is tedious and error-prone. The _generators DSL allows you to define scenario templates that are automatically expanded into concrete scenarios at configuration load time.

Basic structure

A generator specification has three required fields:

_generators:
  - name: scenario_{param}      # Name pattern with {placeholders}
    parameters:                 # Parameter definitions
      param:
        <value-spec>
    template:                   # Configuration template
      some_section:
        some_key: "{param}"     # Placeholder substitution

When the configuration is loaded, each generator expands into multiple concrete scenarios. The {param} placeholders in both the name and template are replaced with generated values.

Generating parameter values

There are three ways to specify parameter values:

  1. Log-spaced values (space: log): Uses logarithmic spacing, useful when sensitivity varies across orders of magnitude.

    parameters:
      price:
        space: log
        start: 5       # First value
        stop: 500      # Last value
        num: 8         # Number of points
        round: true    # Optional: round to integers
    
  2. Linear-spaced values (space: lin or omitted): Uses uniform spacing.

    parameters:
      fraction:
        space: lin
        start: 0.0
        stop: 1.0
        num: 11
    
  3. Explicit values (values): Specify exact values for non-uniform grids.

    parameters:
      n:
        values: [3, 5, 10, 20, 50, 100]
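
The three value specifications can be sketched in plain Python. This assumes that start and stop are both included, in the same way as numpy.linspace and numpy.geomspace; the function expand_values is hypothetical and the real expansion code may differ in detail (for example in how it rounds).

```python
def expand_values(spec: dict) -> list:
    """Expand one parameter spec into a list of values (endpoints inclusive)."""
    if "values" in spec:
        # Explicit values are used as given.
        return list(spec["values"])
    start, stop, num = spec["start"], spec["stop"], spec["num"]
    if num < 2:
        return [start]
    steps = [i / (num - 1) for i in range(num)]
    if spec.get("space", "lin") == "log":
        # Geometric progression: equal ratios between neighbouring values.
        values = [start * (stop / start) ** t for t in steps]
    else:
        # Arithmetic progression: equal differences between neighbours.
        values = [start + (stop - start) * t for t in steps]
    if spec.get("round", False):
        values = [int(round(v)) for v in values]
    return values


prices = expand_values({"space": "log", "start": 5, "stop": 500, "num": 8, "round": True})
fractions = expand_values({"start": 0.0, "stop": 1.0, "num": 11})
```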
    

Combination modes

When a generator has multiple parameters, the mode field controls how they are combined:

  • Zip mode (default): Pairs parameters element-wise. All parameter lists must have the same length. Generates N scenarios from N values per parameter. Use this when parameters should vary together along a single dimension.

  • Grid mode: Computes the Cartesian product. Generates M × N scenarios from M values of one parameter and N of another. Use this to explore a full parameter space.
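
The difference between the two modes corresponds to zip versus itertools.product in Python. The following sketch uses a hypothetical combine function and assumes each parameter has already been expanded into a list of values.

```python
from itertools import product


def combine(parameters: dict, mode: str = "zip") -> list:
    """Return one {parameter: value} dict per generated scenario."""
    names = list(parameters)
    columns = [parameters[name] for name in names]
    if mode == "zip":
        # Element-wise pairing requires equal-length lists.
        if len({len(column) for column in columns}) > 1:
            raise ValueError("zip mode requires parameter lists of equal length")
        rows = zip(*columns)
    elif mode == "grid":
        # Cartesian product of all parameter lists.
        rows = product(*columns)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [dict(zip(names, row)) for row in rows]


grid = combine(
    {"ghg": [0, 50, 100, 150, 200, 250, 300], "biomass": [0, 50, 100, 150, 200]},
    mode="grid",
)
names = ["ghg{ghg}_biomass{biomass}".format(**combo) for combo in grid]
```

With 7 and 5 values, grid mode yields 35 combinations, whereas zip mode would reject these two lists because their lengths differ.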

Example: Single-parameter sweep

This generator creates 8 scenarios with log-spaced GHG prices from 5 to 500:

_generators:
  - name: ghg_{ghg}
    parameters:
      ghg:
        space: log
        start: 5
        stop: 500
        num: 8
        round: true
    template:
      emissions:
        ghg_price: "{ghg}"

Result: scenarios ghg_5, ghg_10, ghg_19, …, ghg_500 (8 total).

Example: Paired parameters (zip mode)

This generator creates scenarios where GHG price and YLL value increase together:

_generators:
  - name: ghg_yll_{ghg}
    mode: zip
    parameters:
      ghg:
        space: log
        start: 5
        stop: 500
        num: 8
        round: true
      yll:
        space: log
        start: 50
        stop: 100000
        num: 8
        round: true
    template:
      emissions:
        ghg_price: "{ghg}"
      health:
        value_per_yll: "{yll}"

Result: 8 scenarios where the i-th GHG value pairs with the i-th YLL value.

Example: Parameter grid (grid mode)

This generator explores all combinations of GHG and biomass prices:

_generators:
  - name: ghg{ghg}_biomass{biomass}
    mode: grid
    parameters:
      ghg:
        values: [0, 50, 100, 150, 200, 250, 300]
      biomass:
        values: [0, 50, 100, 150, 200]
    template:
      emissions:
        ghg_price: "{ghg}"
      biomass:
        marginal_values_usd_per_tonne: "{biomass}"

Result: 35 scenarios (7 × 5 combinations).

Mixing generators with manual scenarios

Generators can coexist with manually defined scenarios in the same file:

# Manual scenario
baseline:
  validation:
    enforce_baseline_diet: true

# Generated scenarios
_generators:
  - name: sensitivity_{x}
    parameters:
      x:
        values: [1, 2, 3]
    template:
      some_param: "{x}"

Type preservation

When a placeholder is the entire value (e.g., "{param}"), the numeric type is preserved. When embedded in a string (e.g., "prefix_{param}"), values are converted to strings. This ensures configuration values have the correct types for downstream processing.
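
A minimal sketch of this substitution rule is shown below. The function substitute and its regular expressions are assumptions made for illustration, not the loader's actual code.

```python
import re

_WHOLE = re.compile(r"^\{(\w+)\}$")   # value consists of exactly one placeholder
_EMBEDDED = re.compile(r"\{(\w+)\}")  # placeholder anywhere inside a string


def substitute(template, values: dict):
    """Recursively replace {placeholders} in a template with parameter values."""
    if isinstance(template, dict):
        return {key: substitute(item, values) for key, item in template.items()}
    if isinstance(template, list):
        return [substitute(item, values) for item in template]
    if isinstance(template, str):
        whole = _WHOLE.match(template)
        if whole and whole.group(1) in values:
            # Entire value is a placeholder: keep the original int/float type.
            return values[whole.group(1)]
        # Embedded placeholder: the result is always a string.
        return _EMBEDDED.sub(
            lambda m: str(values.get(m.group(1), m.group(0))), template
        )
    return template


result = substitute(
    {"emissions": {"ghg_price": "{ghg}"}, "label": "prefix_{ghg}"},
    {"ghg": 50},
)
```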

Sensitivity analysis mode

In addition to zip and grid modes, generators support mode: sensitivity for surrogate-based global sensitivity analysis. In this mode, parameter values are drawn from a space-filling Sobol sequence transformed to specified probability distributions, rather than from fixed value lists.

Each parameter specifies a distribution instead of a value range:

_generators:
  - name: gsa_{sample_id}
    mode: sensitivity
    samples: 256
    slice_parameters: [ghg_price]
    parameters:
      yield_factor:
        lower: 0.8
        upper: 1.2
      ch4_factor:
        distribution: lognormal
        mu: 0.0
        sigma: 0.15
      ghg_price:
        lower: 0
        upper: 300
    template:
      sensitivity:
        crop_yields:
          all: "{yield_factor}"
        emission_factors:
          ch4: "{ch4_factor}"
      emissions:
        ghg_price: "{ghg_price}"

Supported distributions are uniform (default; requires lower, upper), log_uniform (requires lower, upper; both positive), normal (requires mean, std), normal_ci (requires lower, upper; optional confidence, bounds), and lognormal (requires mu, sigma).

The samples field sets the number of quasi-random samples (should be a power of 2). The slice_parameters field designates parameters for conditional analysis — these are included in the surrogate fit but can be fixed at specific values to study how sensitivity changes with policy choices. Surrogate method configuration (PCE, RF) lives in a separate sensitivity_analysis top-level section.
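
The step from a quasi-random sample to a parameter value is an inverse-CDF transform, sketched below with the standard library only. The sketch takes a unit-interval sample u as given (the workflow draws these from a Sobol sequence), covers four of the listed distributions, and assumes that mu and sigma for lognormal describe the underlying normal distribution. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def is_power_of_two(n: int) -> bool:
    """Check the recommended shape of the `samples` setting."""
    return n > 0 and n & (n - 1) == 0


def transform(u: float, spec: dict) -> float:
    """Map a sample u in (0, 1) to the distribution described by `spec`."""
    dist = spec.get("distribution", "uniform")
    if dist == "uniform":
        return spec["lower"] + u * (spec["upper"] - spec["lower"])
    if dist == "log_uniform":
        lo, hi = math.log(spec["lower"]), math.log(spec["upper"])
        return math.exp(lo + u * (hi - lo))
    if dist == "normal":
        return NormalDist(spec["mean"], spec["std"]).inv_cdf(u)
    if dist == "lognormal":
        # Assumption: mu/sigma parameterize the log of the variable.
        return math.exp(NormalDist(spec["mu"], spec["sigma"]).inv_cdf(u))
    raise ValueError(f"unsupported distribution in this sketch: {dist!r}")


yield_factor = transform(0.5, {"lower": 0.8, "upper": 1.2})
ch4_factor = transform(0.5, {"distribution": "lognormal", "mu": 0.0, "sigma": 0.15})
```

The median sample u = 0.5 maps to the centre of each distribution, which makes it a convenient sanity check for a parameter definition.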

See Sensitivity Analysis for full methodology details, output file formats, and interpretation guidance.

Configuration sections

Scenario Metadata

scenarios:
  # Each key represents a named scenario that can be activated via the
  # {scenario} wildcard in Snakemake (e.g., model_scen-default.nc).
  # The values are configuration overrides applied recursively on top
  # of the default configuration.
  default: {}
  # Example:
  # high_ghg:
  #   emissions:
  #     ghg_price: 500

# --- section: temporal ---
# Temporal configuration
#
# planning_horizon: Target year for optimization. Controls population
#   projections (UN WPP) and GDP clustering (IMF WEO).
#
# baseline_year: Reference year for observed baseline data. Controls FAOSTAT
#   production, GDD dietary intake, GBD health data, ESA CCI land cover,
#   and LUIcube grassland data.
#   Hard constraints (data must exist for exact year):
#     - LUIcube grassland: 1992-2020 (binding upper limit)
#     - ESA CCI land cover: 1992-2022
#     - GBD mortality: manual download required for matching year
#   Soft constraints (nearest available year used automatically):
#     - GDD dietary intake: latest available is 2018
#     - GBD dietary risk exposure: latest available is 2019
#
# currency_base_year: Base year for inflation-adjusted USD values.
planning_horizon: 2030
baseline_year: 2020
currency_base_year: 2024

  • planning_horizon: Target year for optimization (default: 2030). Currently determines which projected population levels (UN WPP) and GDP data (IMF WEO) are used.

  • currency_base_year: Base year for inflation-adjusted USD values (default: 2024). All cost data is automatically converted to real USD in this base year using CPI adjustments. See Crop Production (Production Costs section) for details on cost modeling.

Download Options

downloads:
  show_progress: true

Path Options

# Root directories for workflow artifacts. Defaults keep everything under the
# project directory, but these can be redirected (e.g. to scratch storage).
# Environment variables and "~" are expanded by the Snakefile.
paths:
  results_root: "results"
  processing_root: "processing"
  logs_root: "logs"
  benchmarks_root: "benchmarks"

NetCDF Options

# NetCDF export settings for PyPSA network files (build and solve outputs)
netcdf:
  float32: true  # Downcast float64 to float32 to reduce file size
  compression:   # Passed to xarray.Dataset.to_netcdf; set to null to disable
    zlib: true
    complevel: 4

paths.*_root values support environment-variable and ~ expansion in the Snakefile (for example "${GROUP_SCRATCH}/${USER}/food-opt/processing").
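
The expansion of path roots can be reproduced with the Python standard library. This is a sketch of equivalent behaviour, not the Snakefile's own code, and expand_root is a hypothetical helper name.

```python
import os


def expand_root(value: str) -> str:
    """Expand environment variables first, then a leading "~"."""
    return os.path.expanduser(os.path.expandvars(value))


# Example environment, set here only so the sketch is self-contained.
os.environ["GROUP_SCRATCH"] = "/scratch/mygroup"
os.environ["USER"] = "alice"

processing_root = expand_root("${GROUP_SCRATCH}/${USER}/food-opt/processing")
```

Note that os.path.expandvars leaves undefined variables untouched, so a misspelled variable name ends up literally in the path.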

Validation Options

validation:
  use_actual_yields: false
  use_actual_production: false
  enforce_baseline_diet: false # Set food consumption equal to current day values
  enforce_baseline_feed: false # Fix animal feed use to GLEAM baseline values
  land_slack: false # Enable land slack generators (allows exceeding regional land limits at cost)
  disable_new_cropland: false # If true, no new land can supply the cropland pool
  disable_new_pasture: false # If true, no new land can supply the pasture pool
  disable_spared_cropland: false # If true, existing cropland cannot be spared
  disable_spared_grassland: false # If true, existing grassland cannot be spared
  slack_marginal_cost: 50. # bn USD per Mt/Mha for validation slack (food groups, feed, land)
  feed_slack_cost_factor: 0.1 # Feed slack cost as fraction of slack_marginal_cost (lower separates feed from food slack)
  grassland_yield_multiplier: 1.0 # Multiplier applied to effective grassland feed yields before building grassland links
  production_stability:
    enabled: false
    penalty_mode: "hard"  # "hard" = inequality bounds, "quadratic" = soft QP penalty, "l1" = linear absolute value penalty
    quadratic_cost: 1.0  # bn USD per deviation² unit (only used when penalty_mode is "quadratic")
    l1_cost: 0.12  # bn USD per Mha deviation (crops/grassland). Calibrated jointly with animals.l1_cost to bring both land-use and animal-feed deviations to ~5% of observed 2020 totals (see notebooks/prod_stability_calibration.ipynb).
    deviation_type: "absolute"  # "absolute" or "relative" deviation from baseline
    crops:
      enabled: true
      max_relative_deviation: 0.2  # ±20%
      enable_slack: false  # Allow violating minimum production bounds at penalty cost
      min_baseline: 0.000001  # Mha floor. Hard mode: ignore near-zero baselines below this threshold. Relative penalty modes: denominator floor for near-zero/zero baselines.
    grassland:
      enabled: true
      max_relative_deviation: 0.2
      enable_slack: false
      min_baseline: 0.000001  # Mha floor
    animals:
      enabled: true
      max_relative_deviation: 0.2
      enable_slack: false  # Allow violating minimum production bounds at penalty cost
      min_baseline: 0.00001  # Mt DM floor (scaled to Mha-eq internally). Hard mode: ignore near-zero baselines below this threshold. Relative penalty modes: denominator floor for near-zero/zero baselines.
      l1_cost: 0.033  # bn USD per Mt DM deviation. Calibrated jointly with parent l1_cost; see notebooks/prod_stability_calibration.ipynb. Set to null to fall back to automatic Mha-equivalent scaling from parent l1_cost.
    land_conversion:
      enabled: true  # Penalize land-use transitions (conversion, pasture routing, sparing) from zero baseline
  animal_growth_cap:
    enabled: true  # Cap animal production growth to prevent unrealistic spatial reallocation
    max_relative_increase: 0.1  # Maximum relative increase from baseline (0.1 = +10%)
    min_baseline: 0.00001  # Mt DM floor; ignore near-zero baselines below this threshold

# --- section: food_incentives ---
food_incentives:
  enabled: false  # When true, food-level incentives are applied to the objective
  sources: []

# --- section: consumer_values ---
consumer_values:
  baseline_scenario: "baseline"  # Scenario name for consumer values extraction (must have enforce_baseline_diet=true)

# --- section: food_utility_piecewise ---
food_utility_piecewise:
  enabled: false  # When true, use piecewise diminishing marginal utility for food consumption
  n_blocks: 4
  decline_factor: 0.7  # Multiplicative utility decay by block (0 < factor <= 1)
  total_width_multiplier: 2.0  # Total incentivized quantity as multiple of baseline consumption
  min_block_width_mt: 0.00001  # Minimum width floor (Mt/year) for each utility block to avoid tiny upper bounds

# --- section: optimal_taxes ---
optimal_taxes:
  enabled: false  # When true, enables the optimal taxes/subsidies workflow

Set validation.enforce_baseline_diet to true to force the optimizer to match baseline consumption derived from the processed GDD file. When this flag is active, the diet.baseline_age and baseline_year settings determine which cohort and year are enforced. Use validation.slack_marginal_cost to set the penalty (bn USD per Mt) for the slack generators that backstop those fixed food-group loads. Keep the value high so that slack only activates when recorded production cannot meet the enforced demand targets.

Set validation.enforce_baseline_feed to true to fix animal feed use to GLEAM-derived baseline levels (see Baseline Feed Intake). The baseline is scaled from GLEAM 2.0 (2010) to the reference year and calibrated against the known GLEAM 3.0 global total using validation.gleam_calibration_year and validation.gleam_calibration_total_gt_dm.

See Validation for a detailed walkthrough of the validation workflow and diagnostic figures.

Consumer Utility Options

Two mutually exclusive options can be used to represent consumer preference in the objective:

  • food_incentives applies a single linear marginal-cost adjustment per (food, country) pair.

  • food_utility_piecewise applies a piecewise diminishing marginal utility curve per (food, country) pair.

When food_utility_piecewise.enabled is true, the workflow always reads utility blocks from results/{name}/consumer_values/utility_blocks.csv. These blocks are generated by calibrate_food_utility_blocks from:

  • baseline dual values extracted by extract_consumer_values; and

  • baseline per-food consumption from the baseline scenario solve.

The current calibration anchors marginal utility at the baseline quantity: the utility block containing baseline consumption uses the extracted dual value, with higher utility below baseline and lower utility above baseline according to food_utility_piecewise.decline_factor.
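
One way to picture this anchoring is sketched below. It rests on assumptions that this page does not confirm: blocks are taken to have equal width, the anchor block is the one whose interval starts at or below the baseline quantity, and utility changes by one factor of decline_factor per block. The function utility_blocks is hypothetical and is not the calibrate_food_utility_blocks implementation.

```python
def utility_blocks(
    baseline_mt: float,
    dual_value: float,
    n_blocks: int = 4,
    decline_factor: float = 0.7,
    total_width_multiplier: float = 2.0,
    min_block_width_mt: float = 1e-5,
) -> list:
    """Return (width, marginal_utility) pairs, ordered from first to last block."""
    # Assumption: equal-width blocks covering total_width_multiplier x baseline.
    width = max(total_width_multiplier * baseline_mt / n_blocks, min_block_width_mt)
    # Assumption: block k covers [k * width, (k + 1) * width).
    anchor = min(int(baseline_mt // width), n_blocks - 1)
    return [
        (width, dual_value * decline_factor ** (k - anchor))
        for k in range(n_blocks)
    ]


blocks = utility_blocks(baseline_mt=1.0, dual_value=2.0)
```

Under these assumptions the anchor block carries exactly the extracted dual value, earlier blocks are worth more, and later blocks are worth less.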

food_utility_piecewise cannot be combined with validation.enforce_baseline_diet in the same scenario.
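
The two exclusivity rules in this section could be checked up front with a small validation function. The sketch below is an assumption about how such a check might look; check_consumer_options is a hypothetical name and the workflow may enforce these rules differently.

```python
def check_consumer_options(config: dict) -> None:
    """Raise ValueError if incompatible consumer-utility options are combined."""
    incentives = config.get("food_incentives", {}).get("enabled", False)
    piecewise = config.get("food_utility_piecewise", {}).get("enabled", False)
    baseline_diet = config.get("validation", {}).get("enforce_baseline_diet", False)
    if incentives and piecewise:
        raise ValueError(
            "food_incentives and food_utility_piecewise are mutually exclusive"
        )
    if piecewise and baseline_diet:
        raise ValueError(
            "food_utility_piecewise cannot be combined with "
            "validation.enforce_baseline_diet"
        )


# A configuration that uses only the piecewise option passes the check.
check_consumer_options({"food_utility_piecewise": {"enabled": True}})
```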

Production Stability Bounds

The validation.production_stability section allows constraining how much crop and animal product production can deviate from current (baseline) levels. This is useful for investigating what positive changes (e.g., improved health outcomes, reduced emissions) can be achieved with limited disruption to existing production patterns.

Three penalty modes are available, selected via penalty_mode:

  • hard (default): Inequality bounds. Per-(product, country) production is bounded by:

    \[(1 - \delta) \times \text{baseline} \le \text{production} \le (1 + \delta) \times \text{baseline}\]

    where \(\delta\) is the max_relative_deviation parameter (e.g., 0.2 for ±20%).

  • l1: Soft L1 (linear absolute-value) penalty on deviations from baseline production. Each unit of absolute deviation incurs a cost of l1_cost (bn USD per Mha for crops/grassland, or Mha-equivalent for animals). An L1 cost of approximately 1.0 is roughly the lowest value that induces the model to replicate current production patterns.

  • quadratic: Soft quadratic penalty on deviations, with cost quadratic_cost (bn USD per deviation² unit).

The deviation_type option (absolute or relative) controls whether deviations are measured in absolute units or relative to the baseline.
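
The three penalty modes can be written out as small functions. This is a sketch of the arithmetic only, with hypothetical function names; the relative variant uses min_baseline as a denominator floor, as described in the configuration comments, and the model itself expresses these terms as optimization constraints and objective terms.

```python
def hard_bounds(baseline: float, max_relative_deviation: float) -> tuple:
    """Lower and upper production bounds in hard mode."""
    delta = max_relative_deviation
    return (1 - delta) * baseline, (1 + delta) * baseline


def deviation(production: float, baseline: float,
              deviation_type: str = "absolute", min_baseline: float = 1e-6) -> float:
    """Absolute or relative deviation from baseline production."""
    dev = abs(production - baseline)
    if deviation_type == "relative":
        # Floor the denominator so near-zero baselines do not blow up.
        dev /= max(baseline, min_baseline)
    return dev


def l1_penalty(production, baseline, l1_cost, **kwargs) -> float:
    return l1_cost * deviation(production, baseline, **kwargs)


def quadratic_penalty(production, baseline, quadratic_cost, **kwargs) -> float:
    return quadratic_cost * deviation(production, baseline, **kwargs) ** 2


lower, upper = hard_bounds(baseline=10.0, max_relative_deviation=0.2)
```

For a baseline of 10 and a deviation limit of 0.2, hard mode restricts production to the range from 8 to 12, while the soft modes allow any level at a cost that grows with the distance from 10.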

Configuration options:

  • production_stability.enabled: Master switch for the feature (default: false)

  • production_stability.penalty_mode: hard, l1, or quadratic (default: hard)

  • production_stability.l1_cost: L1 penalty cost (default: 0.12, only used when penalty_mode is l1)

  • production_stability.quadratic_cost: Quadratic penalty cost (default: 1.0, only used when penalty_mode is quadratic)

  • production_stability.deviation_type: absolute or relative (default: absolute)

  • production_stability.crops.enabled: Apply to crop production

  • production_stability.crops.max_relative_deviation: Maximum relative deviation for crops (0-1, hard mode only)

  • production_stability.animals.enabled: Apply to animal product production

  • production_stability.animals.max_relative_deviation: Maximum relative deviation for animal products (0-1, hard mode only)

Behavior notes:

  • Products with zero baseline production are constrained to zero (no new products introduced)

  • Products missing baseline data are skipped with a warning

  • Multi-cropping is automatically disabled when production stability is enabled

Crop Selection

crops:
# Core cereals
- wheat
- dryland-rice
- wetland-rice
- maize
- barley
- oat
- rye
- sorghum
- buckwheat
- foxtail-millet
- pearl-millet
# Legumes/pulses
- soybean
- dry-pea
- chickpea
- cowpea
- gram
- phaseolus-bean
- pigeonpea
# Roots and tubers
- white-potato
- sweet-potato
- cassava
- yam
# Vegetables
- tomato
- carrot
- onion
- cabbage
# Fruits
- banana
- citrus
- coconut
# Stimulant crops
- cocoa
- coffee
- tea
# Oil crops
- sunflower
- rapeseed
- groundnut
- sesame
- oil-palm
- olive
# Sugar crops
- sugarcane
- sugarbeet
# Fiber crops
- cotton
# Fodder / biomass (also listed in non_food_crops below)
- alfalfa
- silage-maize
- biomass-sorghum
# Note: mango and taro excluded - missing RES02 (growing season) data for GFDL-ESM4

# --- section: non_food_crops ---
# Crops not intended for human food production (fodder, biomass).
# These are excluded from foods.csv validation but still need yield/land data.
non_food_crops:
- alfalfa
- silage-maize
- biomass-sorghum

See Crop Production for the full list. Add or remove crops to explore specialized vs. diversified production systems.

Multiple Cropping

multiple_cropping:
  double_rice:
    crops:
    - wetland-rice
    - wetland-rice
    water_supplies:
    - r
    - i
  rice_wheat:
    crops:
    - wetland-rice
    - wheat
    water_supplies:
    - r
    - i
  maize_soybean:
    crops:
    - maize
    - soybean
    water_supplies:
    - r
    - i

Define sequential cropping systems as ordered lists of crops. Entries may repeat a crop (double rice) or mix cereals and legumes (rice→wheat, maize→soybean) and list multiple water_supplies (r for rainfed, i for irrigated) to build both variants. The build_multi_cropping rule checks growing-season compatibility, aggregates eligible area/yields, and sums irrigated water demand; build_model turns each combination into a multi-output land link. Leave the section empty to disable the feature. Multiple cropping zones that imply relay cropping (GAEZ classes “limited double” or “double rice … limited triple”) are still accepted here but are interpreted as sequential crop chains; relay-specific dynamics are not yet modelled.
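
The way each entry fans out into one variant per water supply can be sketched as follows. The function expand_sequences is hypothetical; the real build_multi_cropping rule additionally checks growing-season compatibility and aggregates area, yields, and water demand.

```python
def expand_sequences(multiple_cropping: dict) -> list:
    """Return (name, water_supply, crops) for every configured variant."""
    variants = []
    # An empty or missing section disables the feature.
    for name, spec in (multiple_cropping or {}).items():
        for water_supply in spec["water_supplies"]:
            variants.append((name, water_supply, tuple(spec["crops"])))
    return variants


config = {
    "double_rice": {"crops": ["wetland-rice", "wetland-rice"], "water_supplies": ["r", "i"]},
    "rice_wheat": {"crops": ["wetland-rice", "wheat"], "water_supplies": ["r", "i"]},
    "maize_soybean": {"crops": ["maize", "soybean"], "water_supplies": ["r", "i"]},
}
variants = expand_sequences(config)
```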

Country Coverage

countries:
# - ABW  # No level-1 GADM data
- AFG
- AGO
# - AIA  # No regions (microstate)
# - ALA  # No population
- ALB
# - AND  # excluded: microstate
- ARE
- ARG
- ARM
- ASM
# - ATA  # No level-1 GADM data
# - ATF  # No population
- ATG
- AUS
- AUT
- AZE
- BDI
- BEL
- BEN
# - BES  # excluded: small overseas territory
- BFA
- BGD
- BGR
# - BHR  # excluded: desert city-state
- BHS
- BIH
# - BLM  # No regions (microstate)
- BLR
- BLZ
# - BMU  # No regions (microstate)
- BOL
- BRA
- BRB
- BRN
- BTN
# - BVT  # No level-1 GADM data
- BWA
- CAF
- CAN
# - CCK  # No level-1 GADM data
- CHE
- CHL
- CHN
- CIV
- CMR
- COD
- COG
# - COK  # excluded: small island territory
- COL
- COM
- CPV
- CRI
- CUB
# - CUW  # No level-1 GADM data
# - CXR  # No level-1 GADM data
# - CYM  # excluded: small overseas territory
- CYP
- CZE
- DEU
- DJI
# - DMA  # excluded: small island state
- DNK
- DOM
- DZA
- ECU
- EGY
- ERI
# - ESH  # excluded: sparse desert territory
- ESP
- EST
- ETH
- FIN
- FJI
# - FLK  # No level-1 GADM data
- FRA
# - FRO  # excluded: small island territory
# - FSM  # excluded: small island state
- GAB
- GBR
- GEO
# - GGY  # Too small
- GHA
# - GIB  # No level-1 GADM data
- GIN
# - GLP  # excluded: overseas department
- GMB
- GNB
- GNQ
- GRC
- GRD
# - GRL  # excluded: ice-dominated
- GTM
- GUF
# - GUM  # excluded: small island territory
- GUY
# - HKG  # No level-1 GADM data
# - HMD  # No level-1 GADM data
- HND
- HRV
- HTI
- HUN
- IDN
# - IMN  # excluded: small island territory
- IND
# - IOT  # No level-1 GADM data
- IRL
- IRN
- IRQ
- ISL
- ISR
- ITA
- JAM
# - JEY  # No regions (microstate)
- JOR
- JPN
- KAZ
- KEN
- KGZ
- KHM
# - KIR  # No level-1 GADM data
# - KNA  # excluded: small island state
- KOR
# - KWT  # excluded: desert city-state
- LAO
- LBN
- LBR
- LBY
# - LCA  # excluded: small island state
# - LIE  # excluded: microstate
- LKA
- LSO
- LTU
- LUX
- LVA
# - MAC  # No level-1 GADM data
# - MAF  # No level-1 GADM data
- MAR
# - MCO  # No level-1 GADM data
- MDA
- MDG
# - MDV  # No level-1 GADM data
- MEX
# - MHL  # No regions (microstate)
- MKD
- MLI
- MLT
- MMR
- MNE
- MNG
# - MNP  # excluded: small island territory
- MOZ
- MRT
# - MSR  # excluded: small island territory
# - MTQ  # excluded: overseas department
- MUS
- MWI
- MYS
# - MYT  # excluded: overseas department
- NAM
# - NCL  # excluded: overseas territory
- NER
# - NFK  # No level-1 GADM data
- NGA
- NIC
# - NIU  # No level-1 GADM data
- NLD
- NOR
- NPL
# - NRU  # No regions (microstate)
- NZL
- OMN
- PAK
- PAN
# - PCN  # No level-1 GADM data
- PER
- PHL
# - PLW  # excluded: small island state
- PNG
- POL
- PRI
# - PRK  # excluded: no health data available for North Korea
- PRT
- PRY
- PSE
# - PYF  # excluded: overseas territory
# - QAT  # excluded: desert city-state
# - REU  # excluded: overseas department
- ROU
- RUS
- RWA
- SAU
- SDN
- SEN
# - SGP  # excluded: desert city-state (urban)
# - SGS  # No level-1 GADM data
# - SHN  # excluded: small island territory
# - SJM  # No population
- SLB
- SLE
- SLV
# - SMR  # No regions (microstate)
- SOM
# - SPM  # excluded: small island territory
- SRB
- SSD
- STP
- SUR
- SVK
- SVN
- SWE
- SWZ
# - SXM  # No level-1 GADM data
# - SYC  # excluded: small island state
- SYR
# - TCA  # excluded: small island territory
- TCD
- TGO
- THA
- TJK
# - TKL  # No regions (microstate)
- TKM
- TLS
# - TON  # excluded: small island state
- TTO
- TUN
- TUR
# - TUV  # No regions (microstate)
- TWN
- TZA
- UGA
- UKR
# - UMI  # No population
- URY
- USA
- UZB
# - VAT  # No level-1 GADM data
# - VCT  # excluded: small island state
- VEN
# - VGB  # excluded: small island territory
# - VIR  # excluded: small island territory
- VNM
- VUT
# - WLF  # excluded: overseas territory
# - WSM  # excluded: small island state
- YEM
- ZAF
- ZMB
- ZWE

Include countries/territories to model; exclude to reduce problem size. Microstates and countries missing essential data are commented out.

Spatial Aggregation

Controls regional resolution and land classification.

aggregation:
  regions:
    target_count: 400
    allow_cross_border: false
    method: "kmeans"
  simplify_tolerance_km: 5
  simplify_min_area_km: 25
  resource_class_quantiles: [0.25, 0.5, 0.75]
  # Data source for determining irrigated land area when aggregating by region/resource class.
  # - "current": use GAEZ "land equipped for irrigation" dataset (same area for all crops)
  # - "potential": use GAEZ irrigated suitability rasters (crop-specific potential area)
  irrigated_area_source: "current"

Trade-offs:

  • More regions → higher spatial resolution, longer solve time

  • Fewer resource classes → faster solving, less yield heterogeneity

Land, Water, Fertilizer, and Residues

Limits on land, fertilizer availability, and residue management.

land:
  regional_limit: 1.0 # fraction of each region's potential cropland that is made available.
  land_use_cost_usd_per_ha: 0.0 # Small optional per-hectare land-use cost to regularize land allocation (set >0 to activate)
  conversion_cost_forest_usd_per_ha: 8000 # Overnight investment cost for converting forested land to agriculture (2024 USD/ha); sources in docs/land_use.rst
  conversion_cost_nonforest_usd_per_ha: 2000 # Overnight investment cost for converting non-forested land to agriculture (2024 USD/ha); sources in docs/land_use.rst
  investment_horizon: 25 # Years over which to annualize land conversion investment costs
  discount_rate: 0.05 # Annual discount rate for annualizing land conversion investment costs
  filtering:
    min_crop_yield_t_per_ha: 0.01      # Minimum yield for crop links (t/ha); filters ~1% of entries
    min_grassland_yield_t_per_ha: 0.05 # Minimum yield for grassland links (t/ha); filters ~6% of entries
    min_area_ha: 100                    # Minimum land area (ha); filters very small resource classes
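
The investment_horizon and discount_rate settings turn a one-off conversion cost into an annual cost. A common way to do this is the capital recovery factor, sketched below; whether the model uses exactly this annuity formula is an assumption, and annualized_cost is a hypothetical helper.

```python
def annualized_cost(overnight_cost: float, discount_rate: float, horizon_years: int) -> float:
    """Spread an overnight cost over `horizon_years` as a constant annuity."""
    if discount_rate == 0:
        # Without discounting, the cost is simply divided evenly.
        return overnight_cost / horizon_years
    crf = discount_rate / (1 - (1 + discount_rate) ** -horizon_years)
    return overnight_cost * crf


forest = annualized_cost(8000, discount_rate=0.05, horizon_years=25)
nonforest = annualized_cost(2000, discount_rate=0.05, horizon_years=25)
```

With the default settings this formula gives about 568 USD per hectare per year for forest conversion and about 142 USD per hectare per year for non-forest conversion.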

Water Supply

water:
  # Water supply scenario determines which dataset is used for regional water limits:
  # - "sustainable": Water Footprint Network blue water availability by basin (Hoekstra & Mekonnen 2011)
  #                  Represents sustainable water extraction limits.
  # - "current_use": Huang et al. (2018) gridded irrigation water withdrawals
  #                  Represents actual/current agricultural water use, useful for validation.
  supply_scenario: "current_use"
  # Reference year for Huang irrigation data (only used when supply_scenario is "current_use")
  huang_reference_year: 2010

  • water.supply_scenario selects the water availability dataset: sustainable (Water Footprint Network blue water availability) or current_use (Huang et al. irrigation withdrawals). Use current_use for validation or benchmarking against present-day withdrawals.

  • water.huang_reference_year selects the year (1971-2010) used for the Huang monthly withdrawals when supply_scenario is current_use.

fertilizer:
  limit: 200_000_000  # t-N (200 Mt-N total limit in synthetic fertilizer application)
  marginal_cost_usd_per_tonne: 500 # USD per t-N of synthetic fertilizer
  # High-input agriculture N application rates (percentile of global FUBC data)
  n_percentile: 80  # Use 80th percentile for high-input systems (range: 0-100)
  # Manure nitrogen management
  manure_n_to_fertilizer: 0.75 # Fraction of N excreted in confined quarters available as fertilizer (accounting for losses during storage/handling)
residues:
  max_feed_fraction: 0.30 # Maximum fraction of residues that can be removed for animal feed (remainder must be incorporated into soil)
  max_feed_fraction_by_region: # Overrides by ISO3 country code or M49 region/sub-region name (country overrides sub-region overrides region)
    Asia: 0.70 # Asia uses a lot of crop residues for feeding; setting this higher helps livestock feed balancing in the model

  • residues.max_feed_fraction_by_region overrides the global fraction for ISO3 countries or UN M49 regions/sub-regions.

  • Precedence is: country overrides sub-region overrides region.
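
The precedence rule amounts to a lookup from most specific to least specific key. In the sketch below, the function name and the way a country's sub-region and region are passed in are assumptions made for illustration.

```python
def max_feed_fraction(iso3: str, sub_region: str, region: str,
                      default: float, overrides: dict) -> float:
    """Resolve the residue feed fraction: country > sub-region > region > global."""
    for key in (iso3, sub_region, region):
        if key in overrides:
            return overrides[key]
    return default


overrides = {"Asia": 0.70}
india = max_feed_fraction("IND", "Southern Asia", "Asia", 0.30, overrides)
france = max_feed_fraction("FRA", "Western Europe", "Europe", 0.30, overrides)
```

With the default configuration, a country in Asia resolves to 0.70 and any other country falls back to the global 0.30.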

GAEZ Data Parameters

Configures which GAEZ v5 climate scenario and input level to use.

data:
  gaez:
    # GAEZ v5 parameters
    # Note: RES05 (yields/suitability) has ENSEMBLE, but RES02 (growing season) only has individual GCMs
    climate_model: "GFDL-ESM4" # Specific GCMs: "GFDL-ESM4", "IPSL-CM6A-LR", "MPI-ESM1-2-HR", "MRI-ESM2-0", "UKESM1-0-LL"
    climate_model_ensemble: "ENSEMBLE" # Multi-model mean (only available for RES05, not RES02)
    period: "FP2140" # Future: "FP2140" (2021-2040), "FP4160" (2041-2060), "FP6180" (2061-2080), "FP8100" (2081-2100); Historical: "HP0120" (2001-2020), "HP8100" (1981-2000)
    climate_scenario: "SSP126" # "SSP126" (low emissions), "SSP370" (medium, ~RCP4.5), "SSP585" (high), "HIST" (historical)
    input_level: "H" # "H" (High), "L" (Low)
    # Variable codes for GAEZ v5
    yield_var: "RES05-YCX" # Average attainable yield, current cropland
    water_requirement_var: "RES05-WDC" # Water deficit/net irrigation requirement during crop cycle, current cropland
    suitability_var: "RES05-SX1" # Share of grid cell assessed as VS or S (very suitable or suitable)
  usda:
    # API credentials: configure in config/secrets.yaml or via USDA_API_KEY environment variable
    # See config/secrets.yaml.example for setup instructions
    retrieve_nutrition: true  # Set to true to fetch nutrition data from USDA instead of using the provided data
    # Nutrient mapping: internal name -> USDA FoodData Central name
    # USDA names must match nutrient names in FoodData Central exactly
    nutrients:
      protein: "Protein"
      carb: "Carbohydrate, by difference"
      fat: "Total lipid (fat)"
      cal: "Energy"
  land_cover:
    # ECMWF credentials: configure in config/secrets.yaml or via environment variables
    # See config/secrets.yaml.example for setup instructions
    version: "v2_1_1"
  faostat:
    qcl_production_element_code: 5510  # "Production" in tonnes (QCL dataset, covers crops and livestock)
    fbs_food_supply_element_code: 645  # "Food supply quantity (kg/capita/yr)" in FBS dataset
    fbs_other_uses_element_code: 5154  # "Other uses (non-food)" in 1000 tonnes (FBS dataset)
  soilgrids:
    target_resolution_m: 10000  # Target resolution in meters (10000m = 10km)

Scenarios:

  • SSP126: Strong mitigation (1.5-2°C warming)

  • SSP370: Moderate emissions (~3°C)

  • SSP585: High emissions (~4-5°C)

Input Levels:
  • H: Modern agriculture (fertilizer, irrigation, pest control)

  • L: Subsistence farming (minimal external inputs)

Irrigation

irrigation:
  # Which model crops are allowed to have irrigated production.
  # In GAEZ v5, all crops have both irrigated (HILM/LILM) and rainfed (HRLM/LRLM) data available.
  # List specific crops here if you want to restrict irrigation, or use "all" for all crops.
  irrigated_crops: "all"

# --- section: costs ---
costs:
  averaging_period:
    start_year: 2015
    end_year: 2024

animal_costs:
  fadn:
    high_cost_threshold_usd_per_mt: 50000
    livestock_specific_costs:
      SE330: "Other livestock specific costs"
    shared_farm_costs:
      SE340: "Machinery & building current costs"
      SE345: "Energy"
      SE350: "Contract work"
      SE356: "Other direct inputs"
      SE360: "Depreciation"
      SE370: "Wages paid"
      SE380: "Interest paid"
      SE390: "Taxes"
    grazing_cost_items:
      SE310: "Feed for grazing livestock"
      SE315: "Feed for grazing livestock home-grown"
    exclude_costs:
      SE320: "Feed for pigs & poultry"
      SE325: "Feed for pigs & poultry home-grown"
      SE375: "Rent paid"

  usda:
    request_timeout_seconds: 120
    # Conversion factors: kg per head dressed weight
    dressed_weight_kg_per_head:
      meat-cattle: 350.0
      meat-pig: 90.0
    include_items:
    - "Hired labor"
    - "Opportunity cost of unpaid labor"
    - "Bedding and litter"
    - "Custom services"
    - "Fuel, lube, and electricity"
    - "Repairs"
    - "Interest on operating capital"
    - "Marketing"
    - "Veterinary and medicine"
    - "Capital recovery of machinery and equipment"
    - "General farm overhead"
    - "Taxes and insurance"
    grazing_cost_items:
    - "Grazed feed"
    exclude_items:
    - "Homegrown harvested feed"
    - "Purchased feed"
    - "Total, feed costs"
    - "Opportunity cost of land"
    - "Total, operating costs"
    - "Costs listed"

  faostat:
    aggregate_area_code_limit: 5000
    element_codes:
      production: ["2510", "5510"]
      stocks: ["2111", "5111"]
      producing_animals: ["2313", "5318", "5313"]

crop_costs:
  non_endogenous_cost_share: 0.7  # Fraction of revenue (price * yield) attributed to non-endogenous production costs
  faostat:
    price_element_code: 5532  # Producer Price (USD/tonne)
    yield_element_code: 5412  # Yield (hg/ha)

# --- section: cost_calibration ---
# Unified cost calibration corrections extracted from production stability duals.
# Covers crops, grassland, and animals. When enabled, additive cost corrections
# are applied at build time to align model costs with revealed preferences.
cost_calibration:
  enabled: true       # Apply calibration corrections to production costs
  generate: false     # Generate calibration from solved model (breaks DAG cycle when true)
  scenario: "calibration"  # Scenario name used for calibration solve
  crop_correction_csv: "data/curated/calibration/crop_cost.csv"
  grassland_correction_csv: "data/curated/calibration/grassland_cost.csv"
  animal_correction_csv: "data/curated/calibration/animal_cost.csv"

Replace "all" with an explicit crop list under irrigated_crops to restrict irrigation, for example to represent water-scarce scenarios or to explore rainfed-only production.
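
As a minimal sketch of how the irrigated_crops setting could be interpreted, the following Python helper expands "all" to every model crop and otherwise filters the crop list. The function name resolve_irrigated_crops is hypothetical and not part of the workflow, and treating an empty list as rainfed-only production is an assumption.

```python
def resolve_irrigated_crops(setting, model_crops):
    """Return the crops allowed to have irrigated production.

    `setting` is either the string "all" or a list of crop names.
    Hypothetical helper; the actual workflow may resolve this differently.
    """
    if setting == "all":
        return list(model_crops)
    unknown = sorted(set(setting) - set(model_crops))
    if unknown:
        raise ValueError(f"irrigated_crops lists unknown crops: {unknown}")
    # Keep the model's crop order; an empty list yields no irrigated crops.
    return [crop for crop in model_crops if crop in setting]


crops = ["wheat", "maize", "wetland-rice"]
everything = resolve_irrigated_crops("all", crops)
maize_only = resolve_irrigated_crops(["maize"], crops)
rainfed_only = resolve_irrigated_crops([], crops)
```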

Macronutrients

macronutrients: {}
  # For each of "carb", "protein", "fat" and "cal" we support "min",
  # "max" and "equal" keywords, given in g/person/day (kcal/person/day
  # for "cal"); see the example below. Alternatively, use
  # "equal_to_baseline: true" to enforce per-country equality at the
  # level implied by each country's baseline diet (mutually exclusive
  # with min/max/equal).
  # carb:
  #   min: 250              # g/person/day
  #   # equal_to_baseline: true  # per-country g/person/day from baseline diet
  # protein:
  #   min: 50      # g/person/day
  # fat:
  #   min: 50      # g/person/day
  # cal:
  #   min: 2000    # kcal/person/day
  #   # equal_to_baseline: true  # per-country kcal/person/day from baseline diet

# --- section: sensitivity ---
# Multiplicative adjustment factors for sensitivity analysis. Applied after
# model construction. See config/schemas/config.schema.yaml for structure.
sensitivity: {}

# --- section: byproducts ---
# Foods that are not for direct human consumption (excluded from food group tracking)
byproducts:
- wheat-bran
- wheat-germ
- rice-bran
- barley-bran
- oat-bran
- buckwheat-hulls
- oilseed-meal
- rapeseed-meal
- ddgs
- molasses
- maize-ethanol
- sugarcane-ethanol
- cotton-lint

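Each macronutrient accepts min, max, or equal constraints in g/person/day (kcal/person/day for cal), or equal_to_baseline: true to match the level implied by each country's baseline diet. The default, macronutrients: {}, sets no macronutrient constraints.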
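
The rules stated in the configuration comments (four supported nutrients, three bound keywords, and equal_to_baseline being mutually exclusive with them) can be checked with a short Python sketch. The function validate_macronutrients is hypothetical; the workflow's own validation lives in its config schema and may differ.

```python
ALLOWED_NUTRIENTS = {"carb", "protein", "fat", "cal"}
BOUND_KEYS = {"min", "max", "equal"}


def validate_macronutrients(cfg):
    """Raise ValueError if a macronutrients mapping breaks the documented rules."""
    for nutrient, spec in cfg.items():
        if nutrient not in ALLOWED_NUTRIENTS:
            raise ValueError(f"unknown macronutrient: {nutrient}")
        unknown = set(spec) - BOUND_KEYS - {"equal_to_baseline"}
        if unknown:
            raise ValueError(f"{nutrient}: unknown keys {sorted(unknown)}")
        # equal_to_baseline cannot be combined with min/max/equal
        if spec.get("equal_to_baseline") and BOUND_KEYS & set(spec):
            raise ValueError(
                f"{nutrient}: equal_to_baseline is mutually exclusive "
                "with min/max/equal"
            )


# Mirrors the commented example: carb and cal bounds per person per day
validate_macronutrients({"carb": {"min": 250}, "cal": {"min": 2000}})
```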

Food Groups

food_groups:
  included:
  - whole_grains
  - grain
  - fruits
  - vegetables
  - legumes
  - nuts_seeds
  - starchy_vegetable
  - oil
  - red_meat
  - poultry
  - dairy
  - eggs
  - sugar
  - stimulants
  # Optional per-group constraints with "min", "max" or "equal" in g/person/day
  constraints: {}
  equal_by_country_source: null
  # Per-capita consumption caps (g/person/day) applied as e_nom_max on stores.
  # Values are set to:
  #   ceil(2 * max(TMREL, max country-level group consumption))
  # using custom baseline diet estimates from processing/{name}/baseline_diet.csv
  # and TMREL values from derived health RR curves (where available).
  max_per_capita:
    whole_grains: 300
    grain: 1403
    fruits: 658
    vegetables: 785
    legumes: 300
    nuts_seeds: 79
    starchy_vegetable: 1221
    oil: 155
    red_meat: 285
    poultry: 241
    dairy: 2865
    eggs: 213
    sugar: 133
    stimulants: 50
  # Fix relative food contributions within each food group based on baseline
  # consumption data. When enabled, the model maintains baseline ratios between
  # foods in each group (e.g., if wheat is 60% and rice 40% of grains, that
  # ratio is preserved) while allowing total group consumption to vary.
  fix_within_group_ratios:
    enabled: false

included lists the food groups tracked by the model. constraints is an optional mapping where any included group may define min, max, or equal targets in g/person/day. Leaving constraints empty disables all food group limits; add entries only for the groups you want to control.
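
The max_per_capita values follow the rule given in the configuration comment, ceil(2 * max(TMREL, max country-level group consumption)). A minimal Python sketch of that rule is shown below; the function name and the input numbers are illustrative assumptions, not the data actually used to derive the defaults.

```python
import math


def max_per_capita_cap(tmrel_g, max_country_consumption_g):
    """Cap in g/person/day: ceil(2 * max(TMREL, max country consumption)).

    `tmrel_g` may be None for groups without a derived TMREL value.
    """
    tmrel = 0.0 if tmrel_g is None else tmrel_g
    return math.ceil(2 * max(tmrel, max_country_consumption_g))


# Illustrative inputs only (not taken from baseline_diet.csv)
cap_tmrel_binding = max_per_capita_cap(150.0, 100.0)   # TMREL dominates
cap_without_tmrel = max_per_capita_cap(None, 66.2)     # consumption dominates
```

Doubling the larger of the two reference values keeps the cap generous enough that it should rarely bind, while still bounding the store capacity.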

Diet Controls

diet:
  baseline_age: "All ages"
  # Conversion factor from GDD beverage intake (cups/day) to dry commodity
  # weight (g/day). Only tea (v18) is actively used; coffee-green is sourced
  # from FAOSTAT FBS (fbs_override_foods) because GDD v17 data is unreliable
  # for many countries (e.g. 42× overestimate for India vs FAOSTAT supply).
  # Factor = serving_size_g × dry_fraction_per_g_brewed
  # Tea: 240 g/cup × 0.01 g-dry/g-brewed = 2.4 g-dry/cup
  stimulant_brewed_to_dry:
    tea: 2.4
  # Foods whose group-level consumption is overridden with waste-corrected
  # FAOSTAT Food Balance Sheet supply data. This addresses cases where GDD
  # survey intake substantially underestimates actual consumption (e.g. yam
  # in West Africa).
  fbs_override_foods:
  - yam
  - cocoa-powder
  - coffee-green

Customize baseline_age if you pre-process alternative cohorts for the baseline diet. The reference year is controlled by the top-level baseline_year parameter. These values are used whenever validation.enforce_baseline_diet is set to true.
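
The stimulant_brewed_to_dry factor is a plain multiplication, which the following Python sketch reproduces from the numbers in the configuration comment. The helper cups_to_dry_grams is a hypothetical name used only for illustration.

```python
import math

SERVING_SIZE_G = 240           # g brewed tea per cup (from the config comment)
DRY_FRACTION_PER_G_BREWED = 0.01

# Factor = serving_size_g × dry_fraction_per_g_brewed
tea_factor = SERVING_SIZE_G * DRY_FRACTION_PER_G_BREWED


def cups_to_dry_grams(cups_per_day, factor):
    """Convert GDD beverage intake (cups/day) to dry commodity weight (g/day)."""
    return cups_per_day * factor


dry_tea_g = cups_to_dry_grams(2.0, tea_factor)  # two cups of tea per day
assert math.isclose(tea_factor, 2.4)
```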

Biomass

biomass:
  crops:
  - maize
  - oil-palm
  - sugarcane
  - biomass-sorghum
  marginal_values_usd_per_tonne: 0  # USD_2024 per tonne dry matter exported to the energy sector
  enforce_baseline_demand: true  # Enforce baseline biofuel/industrial and biogas demand
  biofuel_demand_scale: 1.0  # Solve-time multiplier on enforced biofuel/industrial and biogas demand
  biogas_crop_demand: "data/curated/biogas_crop_demand.csv"  # Biogas crop demand (silage maize); null to disable
  enforce_fiber_demand: true  # Enforce baseline fiber demand (cotton lint) from FAOSTAT FBS

Per-country biomass buses track dry-matter exports to the energy sector. All foods listed under byproducts gain links to this bus, providing a disposal route for byproducts that lack feed mappings. Crops listed in biomass.crops can be diverted directly as feedstocks. The marginal_values_usd_per_tonne parameter (USD_2024 per tonne dry matter) sets the price received when biomass leaves the food system; set to 0 for free disposal.

When enforce_baseline_demand is true, biofuel and biogas crop demand is fixed at baseline levels. Each biofuel link is created with p_nom equal to baseline demand and p_min_pu = 1.0, forcing flow to match demand exactly. Two sources of demand are combined:

  • Biofuel/industrial demand from FAOSTAT Food Balance Sheets (Other uses element), routed via food buses. This captures ethanol (maize grain, sugarcane) and biodiesel (vegetable oils) demand.

  • Biogas crop demand from biogas_crop_demand (default: data/curated/biogas_crop_demand.csv), routed directly from crop buses. This captures whole-crop silage maize diverted to anaerobic digestion for biogas production. Set biogas_crop_demand to null to disable.

Biogas crop demand (data/curated/biogas_crop_demand.csv)

Country  Crop          Demand (Mt DM)  Source
-------  ------------  --------------  ------
DEU      silage-maize  14.85           FNR 2024: 900 kha biogas maize × ~47 t FM/ha × 35% DM [1]
ITA      silage-maize  2.40            ISAAC/CIB: ~125 kha biogas maize in Po Valley × ~55 t FM/ha [2]
AUT      silage-maize  0.25            Austrian Biomass Association: ~20 kha estimated [3]
CZE      silage-maize  0.42            Czech Biogas Association: ~40 kha [4]

Countries with negligible or zero biogas crop demand are omitted (zero by default). Denmark banned crop-based biogas feedstock; France caps it at 15%; Poland, Netherlands, and Belgium use manure-dominant systems.
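
The demand figures in the table are products of harvested area, fresh-matter yield, and dry-matter content. The Python sketch below reproduces the German and Italian entries approximately. The function name is hypothetical, and applying the 35% dry-matter share to Italy is an assumption, since the table states it only for Germany.

```python
def biogas_demand_mt_dm(area_kha, fresh_yield_t_per_ha, dry_matter_fraction):
    """Dry-matter demand in Mt from area (kha), yield (t FM/ha), and DM share."""
    fresh_matter_t = area_kha * 1_000 * fresh_yield_t_per_ha
    return fresh_matter_t * dry_matter_fraction / 1e6


deu = biogas_demand_mt_dm(900, 47, 0.35)  # close to the tabulated 14.85 Mt DM
ita = biogas_demand_mt_dm(125, 55, 0.35)  # close to the tabulated 2.40 Mt DM
```

Because the source yields are approximate ("~47", "~55"), the results match the table only to within rounding.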

Footnotes

When enforce_fiber_demand is true, baseline fiber demand (cotton lint) is enforced via per-country fiber buses and fixed-capacity stores. Each country with positive demand gets a fiber:{country} bus and a store:fiber:cotton-lint:{country} store whose capacity equals the FAOSTAT-derived demand. The store bounds (e_min_pu = e_max_pu = 1.0) force the store level to equal demand exactly, so cotton lint production must match baseline fiber consumption. Cotton lint is excluded from biomass byproduct routing when fiber demand is enforced to prevent double-counting.

Animal Products

animal_products:
  include:
  - meat-cattle
  - meat-pig
  - meat-chicken
  - dairy
  - eggs
  - dairy-buffalo
  - meat-sheep
  # GLEAM 3.0 production system → model product mapping.
  # Defines which model products each (Animal, LPS) system contributes to.
  # Multi-product systems (e.g. cattle grazing → dairy + meat) are split
  # using FCR-weighted shares in the feed baseline.
  #
  # Sheep/goat milk is proxied through "dairy" (cattle milk pathway) rather
  # than modeled as a separate product. At ~3-4% of global milk production
  # the volume doesn't justify a distinct product with its own efficiency,
  # emissions, and nutritional profile. The feed accounting is still correct:
  # GLEAM3 sheep/goat feed intake is captured in the dairy baseline, and
  # FAOSTAT dairy production (see faostat_items.dairy) includes sheep/goat
  # milk. The production-based scaling step reconciles any efficiency mismatch.
  gleam3_system_product_map:
    Cattle:
      Grassland: [dairy, meat-cattle]
      Mixed: [dairy, meat-cattle]
      Feedlots: [meat-cattle]
    Buffalo:
      Grassland: [dairy-buffalo, meat-cattle]
      Mixed: [dairy-buffalo, meat-cattle]
    Sheep:
      Grassland: [dairy, meat-sheep]
      Mixed: [dairy, meat-sheep]
    Goats:
      Grassland: [dairy, meat-sheep]
      Mixed: [dairy, meat-sheep]
    Chicken:
      Broiler: [meat-chicken]
      Layer: [eggs]
      Backyard: [eggs, meat-chicken]
    Pigs:
      Backyard: [meat-pig]
      Intermediate: [meat-pig]
      Industrial: [meat-pig]
  # For multi-product species (cattle, buffalo, chicken, sheep/goats),
  # the Wirsenius scaling factor f splits total GLEAM3 feed between
  # co-products. Countries with f far from the regional median likely
  # reflect GLEAM3 data quality issues rather than real efficiency
  # differences. This factor clamps f to [median/k, median*k] where k
  # is the value below. Set to a large value (e.g. 100) to disable.
  me_scaling_clamp_factor: 2.0
  # Ruminant net-to-metabolizable energy conversion efficiency factors
  # Used to convert net energy (NE) requirements to metabolizable energy (ME) requirements
  # Based on NRC (2000) typical values for mixed diets
  # ME_required = NE_m/k_m + NE_g/k_g (+ NE_l/k_l for dairy)
  # TODO: Should check the reference for this.
  net_to_metabolizable_energy_conversion:
    k_m: 0.60  # Maintenance efficiency
    k_g: 0.40  # Growth efficiency
    k_l: 0.60  # Lactation efficiency (dairy)
  # Carcass-to-retail meat conversion factors
  carcass_to_retail_meat:
    meat-cattle: 0.67  # kg boneless retail beef per kg carcass (OECD-FAO 2023)
    meat-pig: 0.73     # kg boneless retail pork per kg carcass (OECD-FAO 2023)
    meat-chicken: 0.60 # kg boneless retail chicken per kg carcass (OECD-FAO 2023)
    eggs: 1.00         # No conversion needed (whole egg = retail product)
    dairy: 1.00        # No conversion needed (whole milk = retail product)
    meat-sheep: 0.63   # kg boneless retail lamb per kg carcass (slightly lower than beef)
    dairy-buffalo: 1.00 # No conversion needed (whole milk = retail product)
  # FAOSTAT QCL item names to aggregate for each model product.
  # First item is the primary product; additional items are proxied species
  # whose production is lumped into the model product.
  faostat_items:
    dairy:
      - "Raw milk of cattle"
      - "Raw milk of goats"       # proxy: goat milk → dairy
      - "Raw milk of sheep"       # proxy: sheep milk → dairy
      - "Raw milk of camel"       # proxy: camel milk → dairy
    meat-cattle:
      - "Meat of cattle with the bone, fresh or chilled"
      - "Meat of buffalo, fresh or chilled"   # proxy: buffalo → cattle
    meat-pig:
      - "Meat of pig with the bone, fresh or chilled"
    meat-chicken:
      - "Meat of chickens, fresh or chilled"
      - "Meat of ducks, fresh or chilled"            # proxy: duck → chicken
      - "Meat of turkeys, fresh or chilled"           # proxy: turkey → chicken
      - "Meat of pigeons and other birds n.e.c., fresh, chilled or frozen"
    eggs:
      - "Hen eggs in shell, fresh"
    dairy-buffalo:
      - "Raw milk of buffalo"
    meat-sheep:
      - "Meat of sheep, fresh or chilled"
      - "Meat of goat, fresh or chilled"              # proxy: goat → sheep
  residue_crops:
  - banana
  - barley
  - chickpea
  - cowpea
  - dry-pea
  - dryland-rice
  - foxtail-millet
  - gram
  - maize
  - oat
  - pearl-millet
  - phaseolus-bean
  - pigeonpea
  - rye
  - sorghum
  - sugarcane
  - wetland-rice
  - wheat

fodder_decomposition:
  fdd_crops:
    - alfalfa
    - silage-maize
  eurostat:
    averaging_years: 5
  suitability_blend_weight: 0.7
  yield_corrections:
    enabled: true
    eurostat_moisture: 0.65
    floor: 0.2
    ceiling: 2.0

grazing:
  enabled: true
  isimip_utilization_rate: 0.60 # Applied to ISIMIP yields in merge step
  forage_overlap_crops:
  - alfalfa
  - silage-maize
  - biomass-sorghum
  grassland_forage_calibration:
    enabled: true
    generate: false
    grassland_yield_correction: "data/curated/calibration/grassland_yield.csv"
    fodder_conversion_correction: "data/curated/calibration/fodder_conversion.csv"
    exogenous_forage: "data/curated/calibration/exogenous_forage.csv"
    scenario: "default"

Set grazing.enabled to false to disable grazing and force intensive, feed-based livestock systems.
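
Two formulas from the animal_products comments lend themselves to a short illustration: the clamping of the scaling factor f to [median/k, median*k] controlled by me_scaling_clamp_factor, and the conversion ME_required = NE_m/k_m + NE_g/k_g (+ NE_l/k_l for dairy). The Python sketch below uses hypothetical function names and made-up input values, and treats the given dictionary as the countries of one region.

```python
import math
from statistics import median


def clamp_scaling_factors(factors, k=2.0):
    """Clamp each country's factor f to [median/k, median*k]."""
    m = median(factors.values())
    lower, upper = m / k, m * k
    return {country: min(max(f, lower), upper) for country, f in factors.items()}


def metabolizable_energy(ne_m, ne_g, ne_l=0.0, k_m=0.60, k_g=0.40, k_l=0.60):
    """ME requirement from net energy for maintenance, growth, and lactation."""
    return ne_m / k_m + ne_g / k_g + ne_l / k_l


# Made-up factors for three countries in one region; "C" is an outlier
clamped = clamp_scaling_factors({"A": 1.0, "B": 1.2, "C": 5.0}, k=2.0)

# Made-up net energy requirements (same energy unit in and out)
me_beef = metabolizable_energy(ne_m=30.0, ne_g=10.0)
assert math.isclose(me_beef, 75.0)
```

With k = 2.0 and a median of 1.2, the outlier is pulled down to 2.4 while the other two factors are left unchanged. A very large k (e.g. 100) makes the interval so wide that clamping is effectively disabled.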

Trade Configuration

trade:
  hubs: 20
  crop_default_trade_cost_per_km: 0.01  # USD_2024 per tonne per km (1e-2)
  crop_trade_cost_categories:
    bulk_dry_goods:
      cost_per_km: 0.006  # USD_2024 per tonne per km (6e-3)
      crops:
      - wheat
      - dryland-rice
      - wetland-rice
      - maize
      - soybean
      - barley
      - oat
      - rye
      - dry-pea
      - chickpea
      - cocoa
      - coffee
      - tea
      - cotton
    bulky_fresh:
      cost_per_km: 0.014  # USD_2024 per tonne per km (1.4e-2)
      crops:
      - white-potato
      - sweet-potato
      - yam
      - cassava
      - sugarbeet
      - biomass-sorghum
    perishable_high_value:
      cost_per_km: 0.022  # USD_2024 per tonne per km (2.2e-2)
      crops:
      - tomato
      - carrot
      - onion
      - cabbage
      - banana
      - sugarcane
      - sunflower
      - rapeseed
      - groundnut
  non_tradable_crops:
    - alfalfa
    - biomass-sorghum
    - silage-maize
  food_default_trade_cost_per_km: 0.021  # USD_2024 per tonne per km (2.1e-2)
  food_trade_cost_categories:
    chilled_meat:
      cost_per_km: 0.028  # USD_2024 per tonne per km (2.8e-2)
      foods:
      - meat-cattle
      - meat-pig
      - meat-chicken
    dairy_and_eggs:
      cost_per_km: 0.024  # USD_2024 per tonne per km (2.4e-2)
      foods:
      - dairy
      - eggs
  non_tradable_foods: []
  feed_default_trade_cost_per_km: 0.012  # USD_2024 per tonne per km (1.2e-2)
  feed_trade_cost_categories:
    grain_protein:
      cost_per_km: 0.006  # USD_2024 per tonne per km (6e-3) - matches crop bulk_dry_goods
      feeds:
      - ruminant_grain
      - ruminant_protein
      - monogastric_grain
      - monogastric_protein
    forage:
      cost_per_km: 0.012  # USD_2024 per tonne per km (1.2e-2) - 2x grain cost
      feeds:
      - ruminant_forage
    bulky_low_quality:
      cost_per_km: 0.016  # USD_2024 per tonne per km (1.6e-2) - 2.67x grain cost
      feeds:
      - ruminant_roughage
      - monogastric_low_quality
  non_tradable_feeds:
  - ruminant_forage

Increase trade costs to explore localized food systems; decrease for globalized trade.

All trade costs are expressed in USD_2024 per tonne per kilometer.
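
As a rough sketch of how the category and default costs combine, the following Python function looks up a crop's per-kilometer cost and multiplies it by distance. Only a subset of the configured crops is copied here, the function name is hypothetical, and giving non_tradable_crops precedence over category membership is an assumption.

```python
import math

TRADE = {
    "crop_default_trade_cost_per_km": 0.01,
    "crop_trade_cost_categories": {
        "bulk_dry_goods": {"cost_per_km": 0.006, "crops": ["wheat", "maize"]},
        "bulky_fresh": {"cost_per_km": 0.014, "crops": ["cassava", "biomass-sorghum"]},
        "perishable_high_value": {"cost_per_km": 0.022, "crops": ["tomato"]},
    },
    "non_tradable_crops": ["alfalfa", "biomass-sorghum", "silage-maize"],
}


def crop_trade_cost_per_tonne(crop, distance_km, trade=TRADE):
    """Trade cost in USD_2024 per tonne, or None if the crop is non-tradable."""
    if crop in trade["non_tradable_crops"]:
        return None
    for category in trade["crop_trade_cost_categories"].values():
        if crop in category["crops"]:
            return category["cost_per_km"] * distance_km
    # Crops without a category fall back to the default cost
    return trade["crop_default_trade_cost_per_km"] * distance_km


wheat_cost = crop_trade_cost_per_tonne("wheat", 1000)    # bulk dry goods
citrus_cost = crop_trade_cost_per_tonne("citrus", 1000)  # default cost
assert math.isclose(wheat_cost, 6.0) and math.isclose(citrus_cost, 10.0)
```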

Emissions Pricing

emissions:
  ghg_pricing_enabled: true # Whether to include GHG pricing in the objective function
  ghg_price: 200 # USD_2024/tCO2-eq (emissions stored in MtCO2-eq internally)
  ch4_to_co2_factor: 27.0 # IPCC AR6 GWP100 (WG1, Chapter 7, Table 7.15; https://www.ipcc.ch/report/ar6/wg1/chapter/chapter-7/)
  n2o_to_co2_factor: 273.0 # IPCC AR6 GWP100 (WG1, Chapter 7, Table 7.15; https://www.ipcc.ch/report/ar6/wg1/chapter/chapter-7/)
  rice:
    methane_emission_factor_kg_per_ha: 134.47 # kg CH4 per ha per crop (IPCC 2019 Refinement, Vol 4, Chapter 5, Tables 5.11 and 5.11A. Default for continuously flooded fields.)
    rainfed_wetland_rice_ch4_scaling_factor: 0.54 # IPCC 2019 Refinement, Vol 4, Chapter 5, Table 5.12. Scaling factor for "Regular rainfed" water regime.
  fertilizer:
    synthetic_n2o_factor: 0.010 # kg N2O-N per kg N input (IPCC 2019 Refinement, Table 11.1 aggregated default)
    # Indirect N2O emission parameters (IPCC 2019 Refinement, Chapter 11.2.2, Table 11.3)
    indirect_ef4: 0.010 # kg N2O-N per kg (NH3-N + NOx-N) volatilized and deposited (EF4)
    indirect_ef5: 0.011 # kg N2O-N per kg N leached/runoff (EF5)
    frac_gasf: 0.11 # Fraction of synthetic fertilizer N volatilized as NH3 and NOx (FracGASF)
    frac_gasm: 0.21 # Fraction of organic N and grazing N volatilized as NH3 and NOx (FracGASM)
    frac_leach: 0.24 # Fraction of applied/deposited N lost through leaching and runoff in wet climates (FracLEACH-(H))
  residues:
    incorporation_n2o_factor: 0.010 # kg N2O-N per kg residue N incorporated into soil (IPCC 2019 Refinement, Table 11.1 aggregated default)
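
To give a sense of scale for these parameters, the Python sketch below converts two of the emission factors into CO2-equivalents and prices them at the default ghg_price. The function names are hypothetical and the calculation covers direct emissions only; how the model combines these factors internally is not shown here. The 44/28 ratio converts N2O-N to N2O by molar mass.

```python
import math

GHG_PRICE = 200.0        # USD_2024 per tCO2-eq
CH4_TO_CO2 = 27.0        # GWP100 factor for CH4
N2O_TO_CO2 = 273.0       # GWP100 factor for N2O
N2O_PER_N2O_N = 44.0 / 28.0  # kg N2O per kg N2O-N (molar mass ratio)


def rice_methane_cost_usd_per_ha(ef_kg_ch4_per_ha=134.47, scaling=1.0):
    """Priced methane emissions per hectare of rice and crop cycle."""
    co2e_t = ef_kg_ch4_per_ha * scaling * CH4_TO_CO2 / 1000.0
    return co2e_t * GHG_PRICE


def synthetic_n_direct_cost_usd(n_input_kg, n2o_factor=0.010):
    """Priced direct N2O emissions from synthetic fertilizer N."""
    n2o_kg = n_input_kg * n2o_factor * N2O_PER_N2O_N
    return n2o_kg * N2O_TO_CO2 / 1000.0 * GHG_PRICE


flooded = rice_methane_cost_usd_per_ha()              # about 726 USD/ha
rainfed = rice_methane_cost_usd_per_ha(scaling=0.54)  # about 392 USD/ha
fertilizer = synthetic_n_direct_cost_usd(100.0)       # about 86 USD per 100 kg N
assert math.isclose(flooded, 726.138)
```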

Land Use Change

luc:
  horizon_years: 25
  managed_flux_mode: "zero"
  forest_fraction_threshold: 0.2  # Minimum forest fraction (0-1) to apply regrowth sequestration
  savanna_pvc_threshold: 75  # MgC/ha potential vegetation carbon; Hayek et al. 2024 threshold for closed vs open savanna
  # Data source for cropland baseline area:
  # - "gaez": GAEZ RES06-HAR (2010-2019 average harvested area), consistent with production stability
  # - "esa": ESA CCI land cover satellite data
  cropland_source: "gaez"

Controls how land use change emissions and carbon sequestration are modeled over the planning horizon.

Parameters:
  • horizon_years: Time horizon (years) for amortizing land use change emissions

  • managed_flux_mode: How to treat emissions from existing managed land ("zero" assumes no net flux from current agricultural land)

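  • forest_fraction_threshold: Minimum forest cover fraction (0-1) required for a grid cell to be eligible for regrowth sequestration when land is spared

  • savanna_pvc_threshold: Potential vegetation carbon threshold (MgC/ha) separating closed from open savanna (Hayek et al. 2024)

  • cropland_source: Data source for the cropland baseline area, either "gaez" (GAEZ RES06-HAR, 2010-2019 average harvested area) or "esa" (ESA CCI land cover satellite data)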

Health Configuration

health:
  enabled: true  # Whether to include health costs in the objective function
  region_clusters: 30
  intake_grid_points: 15  # Number of grid knots over empirical RR range
  log_rr_points: 15
  ssb_sugar_g_per_100g: 5.7  # ≈50 kcal per 226.8 g sugar-sweetened beverage (SSB) implies ~5.7 g sugar per 100 g
  value_per_yll: 50000  # USD_2024 per year of life lost
  intake_cap_g_per_day: 1000  # Uniform generous cap on intake grids and clipping
  intake_age_min: 11  # GDD adult band starts at 11; set to 11 to retain adult intake data. Note however that GBD chronic disease risk factors are for adults of >=25 years.
  # Dietary risk factors to consider (must match GDD data items)
  risk_factors:
  - fruits
  - vegetables
  - nuts_seeds
  - legumes
  - red_meat
  - whole_grains
  # GBD also covers seafood omega-3 and processed meat risk factors,
  # but fish/seafood and processed meat are not modelled as food groups.
  # GBD has data on sugar-sweetened beverage intake as a risk factor,
  # from which we can in theory derive added sugar intake risk
  # factors. The epidemiological evidence for this is, however,
  # lacking, and so we don't count "sugar" as a risk factor.
  # - sugar
  # Health outcomes/causes to consider (must be present in IHME GBD data and relative risks)
  causes:
  - CHD              # Coronary/Ischemic Heart Disease
  - Stroke           # Stroke (all types)
  - T2DM             # Type 2 Diabetes Mellitus
  - CRC              # Colorectal Cancer
  # Mapping of risk factors to the causes they affect
  risk_cause_map:
    fruits: [CHD, Stroke, T2DM]
    vegetables: [CHD, Stroke]
    nuts_seeds: [CHD, T2DM]
    legumes: [CHD]
    red_meat: [CHD, Stroke, T2DM, CRC]
    whole_grains: [CHD, Stroke, T2DM, CRC]
    # sugar: [CHD, Stroke, T2DM, CRC]
  # Per risk-factor overrides using log-linear RR from literature CSV files.
  # When a risk factor maps to a CSV path, the GBD dose-response curve is replaced
  # with a log-linear curve derived from the CSV, age-corrected using relative
  # attenuation factors from the GBD data.
  alternative_rr:
    red_meat: "data/curated/red_meat_rr_log_linear.csv"
  # Multi-objective clustering settings for grouping countries into health clusters
  clustering:
    weights:
      geography: 1.0    # Weight for geographic proximity
      gdp: 0.5          # Weight for GDP per capita similarity
      population: 0.3   # Weight for population balance across clusters

Reduce region_clusters or log_rr_points to speed up solving.

The value_per_yll parameter monetizes health impacts in USD_2024 per year of life lost (YLL).
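
When editing risk_factors, causes, or risk_cause_map, the three entries need to stay consistent with each other. The Python sketch below checks this and shows the monetization implied by value_per_yll. Both function names are hypothetical, and the YLL figure is a made-up input.

```python
RISK_FACTORS = ["fruits", "vegetables", "nuts_seeds", "legumes", "red_meat", "whole_grains"]
CAUSES = ["CHD", "Stroke", "T2DM", "CRC"]
RISK_CAUSE_MAP = {
    "fruits": ["CHD", "Stroke", "T2DM"],
    "vegetables": ["CHD", "Stroke"],
    "nuts_seeds": ["CHD", "T2DM"],
    "legumes": ["CHD"],
    "red_meat": ["CHD", "Stroke", "T2DM", "CRC"],
    "whole_grains": ["CHD", "Stroke", "T2DM", "CRC"],
}


def check_risk_cause_map(risk_factors, causes, risk_cause_map):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    for risk in risk_factors:
        if risk not in risk_cause_map:
            problems.append(f"risk factor without causes: {risk}")
    for risk, mapped in risk_cause_map.items():
        if risk not in risk_factors:
            problems.append(f"mapped risk factor not enabled: {risk}")
        for cause in mapped:
            if cause not in causes:
                problems.append(f"{risk} maps to unknown cause: {cause}")
    return problems


def health_cost_usd(years_of_life_lost, value_per_yll=50_000):
    """Monetized health burden in USD_2024."""
    return years_of_life_lost * value_per_yll


problems = check_risk_cause_map(RISK_FACTORS, CAUSES, RISK_CAUSE_MAP)
cost = health_cost_usd(1_000)  # made-up burden of 1000 YLL
```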

Solver Configuration

solving:
  solver: highs
  # solver: gurobi
  # io_api controls how the model is communicated to the solver:
  # - 'lp' or 'mps': Write problem to file (LP/MPS format) which solver reads
  # - 'direct': Use solver's Python API directly (e.g., gurobipy) for faster performance
  # - null: Use linopy's default (typically 'lp')
  io_api: "direct"
  threads: 1  # Number of threads to use for solving
  # The calculate_fixed_duals option induces linopy to solve the MILP,
  # then fix all integer variables to their optimal values, then solve
  # the resulting LP in order to get dual variables for model
  # constraints.
  calculate_fixed_duals: true
  options_gurobi:
    LogToConsole: 0
    OutputFlag: 1
    Method: 2
    MIPGap: 0.001  # target 0.1% relative optimality gap
    MIPFocus: 2
  options_highs:
    solver: "choose"
    mip_rel_gap: 0.001  # align relative gap with gurobi setting
  export_for_tuning: false  # Export model to MPS before solving (for Gurobi parameter tuning)
  time_limit: null  # Solver-internal time limit in minutes (null = no limit)
  runtime: 5  # Maximum solver runtime in minutes (used by SLURM)
  mem_mb: 6000  # Maximum solve_model memory in MB (used by SLURM)
  inline_analysis: false  # When true, analysis runs inside the solve process (no intermediate .nc)

# --- section: remote_solve ---
remote_solve:
  enabled: false  # If true, solve_model is executed remotely over SSH
  local_scenarios: ["baseline"]  # Scenarios that must always solve locally (currently only "baseline" is supported)
  host: "user@login.cluster"  # Placeholder SSH host or alias; customize for your setup
  workdir: "~/path/to/food-opt"  # Placeholder remote project root containing this repository
  pixi_env: "default"  # Placeholder remote pixi environment passed to tools/smk -e
  use_slurm: false  # Set true when remote solves should be submitted via --slurm
  slurm_account: ""  # SLURM account for remote job submission
  slurm_partition: ""  # SLURM partition for remote compute jobs
  sync_workflow: false  # Sync workflow/ and config/ code before remote solve (may dirty remote git state)
  sync_pixi_files: false  # Sync pixi.toml and pixi.lock to remote workdir
  ssh_options: []  # Extra ssh CLI args, e.g. ["-o", "ControlMaster=auto"]
  rsync_options: []  # Extra rsync CLI args
  preflight_check: true  # If true, create remote workdir before syncing

Solver choice:
  • HiGHS: Open-source, fast, good for most problems

  • Gurobi: Commercial, often faster for very large problems, requires license (free for academic users)

The remote_solve subsection allows delegating only solve_model to a remote SSH host (for example an HPC login node) while keeping model building and analysis local. See Workflow & Execution for setup instructions and usage details. Set remote_solve.local_scenarios (default: ["baseline"]) for scenarios that must always use the local solve_model rule.
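
The interaction between solver, the options_<solver> blocks, and remote_solve.local_scenarios can be sketched in Python as follows. Both helpers are hypothetical; in particular, selecting the options block by the solver name is an assumption based on the key naming, not a description of the workflow code.

```python
def solver_options(solving):
    """Pick the options block matching the configured solver, if any."""
    return solving.get(f"options_{solving['solver']}", {})


def solves_remotely(scenario, remote_solve):
    """True if solve_model for this scenario would be delegated over SSH."""
    return bool(
        remote_solve["enabled"]
        and scenario not in remote_solve["local_scenarios"]
    )


solving = {
    "solver": "highs",
    "options_gurobi": {"Method": 2, "MIPGap": 0.001},
    "options_highs": {"solver": "choose", "mip_rel_gap": 0.001},
}
remote = {"enabled": True, "local_scenarios": ["baseline"]}

options = solver_options(solving)                    # the HiGHS block
baseline_remote = solves_remotely("baseline", remote)  # stays local
other_remote = solves_remotely("my_scenario", remote)  # delegated
```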

Plotting Configuration

plotting:
  comparison_scenarios:
  - "scen-default"

  # Crop groups for map visualizations. Each group has a display name, a hex
  # color, and a list of member crops. Every crop listed under the top-level
  # `crops` key should belong to exactly one group (validated at startup).
  # Colors sourced from ColorBrewer Dark2 and Paired palettes.
  crop_groups:
    Cereals:
      color: "#E6AB02"  # Dark2 #6
      crops: [wheat, dryland-rice, wetland-rice, maize, barley, oat, rye,
              sorghum, buckwheat, foxtail-millet, pearl-millet]
    Legumes:
      color: "#666666"  # Dark2 #8
      crops: [soybean, dry-pea, chickpea, cowpea, gram, phaseolus-bean, pigeonpea]
    "Roots & tubers":
      color: "#A6761D"  # Dark2 #7
      crops: [white-potato, sweet-potato, cassava, yam]
    Vegetables:
      color: "#1B9E77"  # Dark2 #1
      crops: [tomato, carrot, onion, cabbage]
    Fruits:
      color: "#D95F02"  # Dark2 #2
      crops: [banana, citrus, coconut]
    Oilseeds:
      color: "#7570B3"  # Dark2 #3
      crops: [sunflower, rapeseed, groundnut, sesame, oil-palm, olive]
    "Sugar crops":
      color: "#E7298A"  # Dark2 #4
      crops: [sugarcane, sugarbeet]
    Stimulants:
      color: "#B15928"  # Paired #12
      crops: [cocoa, coffee, tea]
    "Fiber crops":
      color: "#1F78B4"  # Paired #2
      crops: [cotton]
    "Feed crops":
      color: "#66A61E"  # Dark2 #5
      crops: [alfalfa, silage-maize, biomass-sorghum, grassland]

  colors:
    # Sensitivity parameter colors and groups (tab20b palette).
    # Group order determines stacking order; colors within a group are adjacent hues.
    parameter_groups:
      Emissions:                                    # blues, light → dark
        color: "#6b6ecf"
        parameters:
          ch4_factor: "#9c9ede"
          n2o_factor: "#6b6ecf"
          luc_factor: "#5254a3"
      Agricultural:                                 # greens, light → dark
        color: "#8ca252"
        parameters:
          yield_factor: "#cedb9c"
          fcr_factor: "#b5cf6b"
          flw_factor: "#8ca252"
      "Health risk":                                # reds, light → dark
        color: "#d6616b"
        parameters:
          rr_protective: "#e7969c"
          rr_harmful: "#d6616b"
      "Policy valuation":                           # purples, light → dark
        color: "#ce6dbd"
        parameters:
          ghg_price: "#de9ed6"
          value_per_yll: "#ce6dbd"
    crops:
      wheat: "#C58E2D"
      'dryland-rice': "#E0B341"
      'wetland-rice': "#F7E29E"
      maize: "#F1C232"
      barley: "#B68D23"
      oat: "#D4B483"
      rye: "#A67C52"
      sorghum: "#A0522D"
      buckwheat: "#8B5A2B"
      'foxtail-millet': "#E3C878"
      'pearl-millet': "#D9A441"
      soybean: "#7B4F2A"
      'dry-pea': "#B9925B"
      chickpea: "#D7B377"
      cowpea: "#8C5C38"
      gram: "#A47038"
      'phaseolus-bean': "#6E3B1E"
      pigeonpea: "#9C6B3E"
      'white-potato': "#8FB98B"
      'sweet-potato': "#CE7B3A"
      cassava: "#6E8B3D"
      yam: "#4F6F2C"
      tomato: "#C0392B"
      carrot: "#E67E22"
      onion: "#D35400"
      cabbage: "#27AE60"
      banana: "#F7DC6F"
      citrus: "#F39C12"
      coconut: "#8E735B"
      sunflower: "#F1C40F"
      rapeseed: "#F5B041"
      groundnut: "#A8683C"
      sesame: "#C97A2B"
      'oil-palm': "#A04000"
      olive: "#6E7D57"
      cocoa: "#5C3317"
      coffee: "#6F4E37"
      tea: "#4B7A2E"
      cotton: "#F5F5DC"
      sugarcane: "#9B59B6"
      sugarbeet: "#AF7AC5"
      alfalfa: "#1ABC9C"
      'biomass-sorghum': "#16A085"
      grassland: "#7FB77E"
    # Colors sourced from ColorBrewer Dark2, Set2, and Paired palettes.
    food_groups:
      whole_grains: "#E6AB02"  # Dark2 #6
      grain: "#FFD92F"  # Set2 #6
      fruits: "#FC8D62"  # Set2 #2
      vegetables: "#66C2A5"  # Set2 #1
      legumes: "#8DA0CB"  # Set2 #3
      nuts_seeds: "#A6761D"  # Dark2 #7
      starchy_vegetable: "#E78AC3"  # Set2 #4
      oil: "#E5C494"  # Set2 #7
      red_meat: "#E31A1C"  # Paired #6
      poultry: "#FB9A99"  # Paired #5
      dairy: "#A6CEE3"  # Paired #1
      eggs: "#FDBF6F"  # Paired #7
      stimulants: "#B15928"  # Paired #12

  fallback_cmaps:
    crops: "Set3"

Customize visualization colors for publication-quality plots. The colors.food_groups palette is applied consistently across all food-group charts and maps; extend it if you add new groups to data/curated/food_groups.csv.
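
The crop_groups comment states that every crop should belong to exactly one group. A minimal Python sketch of such a check is given below; the function name is hypothetical and the workflow's own startup validation may be implemented differently. Only a few groups are copied from the configuration.

```python
from collections import Counter


def check_crop_groups(crops, crop_groups):
    """Return (missing, duplicated) crops relative to the group definitions."""
    counts = Counter(
        crop for group in crop_groups.values() for crop in group["crops"]
    )
    missing = sorted(crop for crop in crops if counts[crop] == 0)
    duplicated = sorted(crop for crop in crops if counts[crop] > 1)
    return missing, duplicated


crop_groups = {
    "Vegetables": {"color": "#1B9E77", "crops": ["tomato", "carrot", "onion", "cabbage"]},
    "Sugar crops": {"color": "#E7298A", "crops": ["sugarcane", "sugarbeet"]},
    "Fiber crops": {"color": "#1F78B4", "crops": ["cotton"]},
}

missing, duplicated = check_crop_groups(
    ["tomato", "cotton", "sugarcane", "wheat"], crop_groups
)
# "wheat" is reported as missing because the Cereals group was not copied here
```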