.. SPDX-FileCopyrightText: 2026 Koen van Greevenbroek
..
.. SPDX-License-Identifier: CC-BY-4.0

Tutorial
========

This tutorial walks you through two complete modelling exercises with
``food-opt``. It assumes you have finished the :doc:`introduction` (clone,
``pixi install``, credentials, manually-downloaded datasets). You'll leave
each part with solved scenarios, auto-generated plots, and a handful of
hand-rolled comparisons built in a notebook.

Both parts use a reduced spatial resolution of 200 optimisation regions so
that they complete in a few minutes on a laptop (once the one-off raw-data
download — about half an hour, depending on your connection — has run). The
tutorial configs live under ``config/tutorial/``.

.. toctree::
   :hidden:

   tutorials/tutorial_01_analysis
   tutorials/tutorial_02_analysis

Part 1 — GHG prices at a fixed diet
-----------------------------------

In this first exercise, we solve three scenarios that are identical except
for the greenhouse-gas price applied to the objective function, and we hold
consumption at the observed 2020 diet in all three. Because the diet is
fixed, every difference between scenarios comes from how **production** —
which crops are grown where, which livestock systems are used, and how trade
flows are routed — reorganises when emissions become more costly.

Step 1 — Look at the config
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Open ``config/tutorial/01_ghg_prices.yaml``. The file is short — every key
not listed here falls back to ``config/default.yaml``:

.. literalinclude:: ../config/tutorial/01_ghg_prices.yaml
   :language: yaml

A few things to note:

* ``name: "tutorial_01"`` controls the output directory: everything lands
  under ``results/tutorial_01/``.
* ``aggregation.regions.target_count: 200`` keeps the LP small enough to
  solve in minutes. The full-resolution default is 400; values below 200 fail
  the per-country clustering step because there are more countries in the
  default list than regions.
* ``planning_horizon`` and ``baseline_year`` are both 2020, aligning the
  model with the most recent year for which GDD dietary data exist.
* The ``scenarios:`` block defines three scenarios that each set
  ``validation.enforce_baseline_diet: true``. That flag forces consumption
  per food group to equal the observed 2020 diet in every country.
* ``health.value_per_yll: 0`` disables the health-cost objective. Health
  costs are the subject of separate documentation — we keep them out of the
  tutorial on purpose.

If you want to experiment, you can copy this file to a new name (e.g.
``config/tutorial/01_my_variant.yaml``), change the ``name`` field, and edit
any overrides you like.

Step 2 — Dry run
~~~~~~~~~~~~~~~~

Before committing to a full run, it's worth asking Snakemake what it *would*
do:

.. code-block:: bash

   tools/smk -j4 --configfile config/tutorial/01_ghg_prices.yaml -n

The ``-n`` flag prints the planned execution graph without actually running
anything. On a clean checkout you'll see data-preparation rules (downloads,
region clustering, yield aggregation), the model build, three solves (one per
scenario), analysis extraction, and plotting.

Step 3 — Run the workflow
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   tools/smk -j4 --configfile config/tutorial/01_ghg_prices.yaml

The first run is the longest because Snakemake has to download raw datasets
(GAEZ, GADM, UN WPP, FAOSTAT, ESA CCI, …). Subsequent runs of any tutorial or
other configuration reuse the same cached data. The build step itself is
shared across all scenarios; only the three solves and the downstream
analysis/plots are scenario-specific.

When the workflow finishes, you will find:

* ``results/tutorial_01/build/model.nc`` — the PyPSA network before solving.
* ``results/tutorial_01/solved/model_scen-{baseline,ghg_mid,ghg_high}.nc`` —
  the three solved networks.
* ``results/tutorial_01/analysis/scen-{baseline,ghg_mid,ghg_high}/*.parquet``
  — standardised statistics extracted from each solve (see :doc:`analysis`
  for the full schema).
* ``results/tutorial_01/plots/scen-*/*.pdf`` — auto-generated figures.
* ``results/tutorial_01/plots/comparison/`` — cross-scenario comparison
  plots, produced because we set ``plotting.comparison_scenarios: "all"``.

Step 4 — Analyse in a notebook
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The companion notebook :doc:`tutorials/tutorial_01_analysis` walks through
five quick comparisons across the three scenarios: total agricultural land,
the cropland vs grassland split, net GHG emissions by gas, the composition of
animal feed, and the objective-cost breakdown. Open it in the docs to browse
the rendered outputs, or download it and run it locally against your own
``results/tutorial_01/`` directory. To run it yourself:

.. code-block:: bash

   pixi run -e dev jupyter lab docs/tutorials/tutorial_01_analysis.ipynb

Because all three scenarios share the same (baseline) diet, anything that
moves between them reflects production-side reorganisation. Total
agricultural land typically falls sharply as the GHG price rises (because
marginal land is released and the regrowing land sequesters carbon), the gas
composition of net emissions shifts, and the objective's ``ghg_cost`` column
becomes strongly negative — at these prices, net emissions are negative, so
``ghg_price × emissions`` is a revenue term in the objective.

.. note::

   The notebook opens with a short contextualisation that is worth reading:
   even at ``baseline``, this tutorial's model uses less land than the real
   world and produces net-negative emissions by default. Serious studies
   "coerce" the model toward observed production using
   ``validation.production_stability`` (see ``config/sensitivity.yaml`` and
   ``config/gsa.yaml``) or hard constraints (see ``config/validation.yaml``).
   The tutorial omits both to keep the config short.
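The sign logic behind that "revenue term" is worth making concrete. The
numbers below are made up purely for illustration (they are not model
outputs), but they show why a positive price multiplied by net-negative
emissions reduces the objective:

.. code-block:: python

   # Illustrative only: hypothetical magnitudes, not values from food-opt.
   # The objective includes ghg_price * net_emissions; when net emissions
   # are negative, this term is negative, i.e. a revenue in the objective.
   ghg_price = 0.2          # hypothetical price, bn USD per Mt CO2e
   net_emissions = -1500.0  # hypothetical net-negative emissions, Mt CO2e

   ghg_cost = ghg_price * net_emissions
   assert ghg_cost < 0  # the model is paid for net sequestration

Raising the price makes this term more strongly negative, which is exactly
why higher-price scenarios release more land for regrowth.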
Part 1 — Summary
~~~~~~~~~~~~~~~~

At this point you've exercised the full end-to-end workflow: config, build,
solve, analysis, and custom post-processing. But because consumption was held
fixed, Tutorial 1 can't tell you whether a different *diet* would reduce
emissions more cheaply — the model had no way to weigh "change what people
eat" against "change how food is produced". Part 2 adds that missing piece.

Part 2 — Letting diet respond via consumer values
-------------------------------------------------

In Part 1, we fixed consumption with ``enforce_baseline_diet: true``. That
guarantees realism (nobody is forced to eat something unusual), but it also
rules out dietary shift as a mitigation option. A more interesting model lets
the optimiser decide when giving up some of today's diet is worth the GHG
savings — which requires pricing the cost of deviating from today's diet.

``food-opt`` does that by **deriving consumer values from a baseline solve**:

1. Solve a baseline scenario with ``enforce_baseline_diet: true``. The
   per-(food, country) equality constraints on the ``food_consumption``
   links are binding, and their **dual variables** (shadow prices) represent
   each food's marginal utility under today's diet — expressed as bn USD per
   Mt.
2. Feed those consumer values into a **piecewise
   diminishing-marginal-utility curve** centred at baseline consumption.
   Each block represents an additional increment of consumption beyond (or
   below) baseline, with decreasing utility.
3. In subsequent scenarios, drop ``enforce_baseline_diet`` and enable the
   piecewise curve. Consumption is now free to move, but the optimiser
   "pays" for deviations — so small dietary shifts are cheap while large
   ones become expensive.

The workflow automates steps 1–2: the ``extract_consumer_values`` and
``calibrate_food_utility_blocks`` rules run automatically whenever a scenario
needs the calibrated blocks.
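The idea behind the piecewise curve can be sketched in a few lines of
Python. This is a simplified illustration, **not** the
``calibrate_food_utility_blocks`` implementation: it assumes the curve is
centred at baseline consumption, that blocks below baseline scale the dual
value up by the inverse of the decline factor, and it hard-codes one food's
baseline consumption and dual value as made-up numbers:

.. code-block:: python

   def utility_blocks(baseline_mt, dual_value, n_blocks=4,
                      decline_factor=0.7, total_width_multiplier=2.0):
       """Sketch of a piecewise diminishing-marginal-utility curve.

       Returns (width, marginal_value) pairs ordered from zero
       consumption up to total_width_multiplier * baseline_mt.
       """
       # The curve spans 0 .. 2x baseline, so each of the n_blocks
       # below (and above) baseline covers baseline_mt / n_blocks.
       width = baseline_mt * total_width_multiplier / (2 * n_blocks)
       # Below baseline: each block foregone was worth more than the dual.
       below = [(width, dual_value / decline_factor ** k)
                for k in range(n_blocks, 0, -1)]
       # Above baseline: each extra block is worth less than the last.
       above = [(width, dual_value * decline_factor ** (k + 1))
                for k in range(n_blocks)]
       return below + above

   # Hypothetical food: 10 Mt baseline consumption, dual of 1 bn USD/Mt.
   blocks = utility_blocks(baseline_mt=10.0, dual_value=1.0)
   values = [v for _, v in blocks]
   # Marginal value declines monotonically with consumption ...
   assert all(a > b for a, b in zip(values, values[1:]))
   # ... and the blocks span exactly twice the baseline consumption.
   assert sum(w for w, _ in blocks) == 20.0

Summing ``width * marginal_value`` over the blocks consumed gives the total
utility the optimiser credits for that consumption level, which is what makes
small deviations from baseline cheap and large ones expensive.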
Step 1 — Look at the config
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Open ``config/tutorial/02_consumer_values.yaml``:

.. literalinclude:: ../config/tutorial/02_consumer_values.yaml
   :language: yaml

The key differences from Part 1:

* ``food_utility_piecewise.enabled: true`` at the top level turns on the
  piecewise utility curve globally.
* ``consumer_values.baseline_scenario: "baseline"`` tells the calibration
  step which scenario's dual variables to extract. The name must match one of
  the scenarios below.
* The ``baseline`` scenario keeps ``enforce_baseline_diet: true`` and
  **explicitly disables** the piecewise curve
  (``food_utility_piecewise.enabled: false``). These two settings are
  mutually exclusive — attempting to combine them raises a validation error.
* The ``ghg_mid`` and ``ghg_high`` scenarios inherit the top-level
  ``food_utility_piecewise`` settings and no longer set
  ``enforce_baseline_diet``, so consumption is free.

The piecewise-utility parameters themselves are worth a brief look:

* ``n_blocks: 4`` — the curve has four steps above and below baseline.
* ``decline_factor: 0.7`` — each successive block is worth 70% of the
  previous one, giving diminishing returns.
* ``total_width_multiplier: 2.0`` — the curve spans from 0 up to twice
  baseline consumption.

See :doc:`configuration` for the full description.

Step 2 — Solve the baseline first
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Part 2 involves two sequential steps: the baseline must be solved before
consumer values can be extracted and the other scenarios can build their
utility blocks. Snakemake handles the dependency automatically, but it is
instructive to do the baseline on its own first:

.. code-block:: bash

   tools/smk -j4 --configfile config/tutorial/02_consumer_values.yaml -- \
       results/tutorial_02/solved/model_scen-baseline.nc

After this finishes you will have:

* ``results/tutorial_02/solved/model_scen-baseline.nc`` — the baseline
  solution.
* ``results/tutorial_02/consumer_values/baseline/values.csv`` — the extracted
  dual variables.
* ``results/tutorial_02/consumer_values/baseline/utility_blocks.csv`` — the
  calibrated piecewise utility curve.

The companion notebook :doc:`tutorials/tutorial_02_analysis` begins with a
quick look at the extracted values — the ``value_bnusd_per_mt`` column of
``values.csv`` ranks each (food, country) pair by the marginal utility the
baseline implies.

Step 3 — Solve the remaining scenarios
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   tools/smk -j4 --configfile config/tutorial/02_consumer_values.yaml

Now both the mid- and high-GHG scenarios solve, using the same calibrated
utility blocks. On a laptop, each solve takes a few minutes longer than in
Part 1 because the LP has extra variables for the piecewise blocks.

Step 4 — Compare against Tutorial 1 in a notebook
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The companion notebook :doc:`tutorials/tutorial_02_analysis` covers three
comparisons:

* Global food-group consumption across the three scenarios, to see whether —
  and which — food groups actually shift once the diet is free.
* The objective breakdown with the ``consumer_values`` column visible
  alongside ``ghg_cost`` (the two forces trading off against each other).
* A side-by-side comparison of net GHG emissions between Tutorial 1 (fixed
  diet) and Tutorial 2 (flexible diet) at identical GHG prices. The gap
  between the two is a rough measure of the demand-side mitigation potential.

Also have a look at the auto-generated comparison plot at
``results/tutorial_02/plots/consumer_values/consumption_comparison.pdf``,
which shows the same pattern per food group.

Gotchas
~~~~~~~

A few things that commonly trip people up:

* ``food_utility_piecewise.enabled: true`` and
  ``validation.enforce_baseline_diet: true`` cannot be active for the same
  scenario. The baseline scenario enables the latter and disables the former;
  all other scenarios do the opposite.
* ``consumer_values.baseline_scenario`` must name a scenario that exists and
  that has ``enforce_baseline_diet: true``. If it doesn't, the calibration
  rule fails with a validation error.
* The calibrated utility blocks are **specific to the baseline scenario**
  that produced them. If you change the baseline (e.g. different
  ``planning_horizon`` or ``baseline_year``), rerun the baseline solve so
  that the values and blocks are regenerated.

Where to go from here
---------------------

You have now solved two small scenario sets, inspected the output files, and
built a handful of comparisons by hand. Some natural next steps:

* **Scale up the GHG price sweep.** ``config/sensitivity.yaml`` and
  ``config/ghg_yll_grid.yaml`` do the same thing at full resolution, with
  log-spaced GHG prices generated programmatically via the scenario
  generator DSL.
* **Turn on health costs.** :doc:`health` describes the Global Burden of
  Disease integration and how ``health.value_per_yll`` prices diet-related
  disease burden alongside the environmental objectives.
* **Perform a global sensitivity analysis.** :doc:`sensitivity_analysis`
  describes the polynomial-chaos and random-forest surrogate workflows used
  for Sobol-index decomposition.
* **Learn the rule graph.** :doc:`workflow` documents every rule in the
  pipeline; :doc:`results` and :doc:`analysis` document every output file and
  column.
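If you do scale up the price sweep, log-spaced price grids of the kind those
configs use are easy to generate yourself. A minimal sketch — the price
range, count, and scenario-name pattern here are made up for illustration
(check ``config/sensitivity.yaml`` for the project's actual generator):

.. code-block:: python

   # Sketch: generate log-spaced GHG prices for a scenario sweep.
   # Range, count, and naming are hypothetical; see config/sensitivity.yaml
   # for how food-opt's scenario generator DSL actually does this.
   n = 7
   lo_exp, hi_exp = -2, 1  # 10**-2 .. 10**1
   prices = [10 ** (lo_exp + i * (hi_exp - lo_exp) / (n - 1))
             for i in range(n)]
   for i, p in enumerate(prices):
       print(f"ghg_{i}: {p:.4g}")

Log spacing keeps the ratio between successive prices constant, which is
usually what you want when sweeping a price over several orders of magnitude.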