Nutrition

Overview

The nutrition module ensures that the optimized food system meets population dietary requirements. This includes:

  • Macronutrient constraints: Carbohydrates, protein, fat, and calories per capita

  • Food group constraints: Consumption of whole grains, fruits, vegetables, etc.

  • Population scaling: Aggregating per-capita needs to regional/national totals

Macronutrients

Configuration

Macronutrient constraints are specified in config/default.yaml:

macronutrients: {}
  # For each of "carb", "protein", "fat" and "cal" we support "min",
  # "max" and "equal" keywords, given in g/person/day (kcal/person/day
  # for "cal"); see example below. Alternatively, use
  # "equal_to_baseline: true" to enforce per-country equality at the
  # level implied by each country's baseline diet (mutually exclusive
  # with min/max/equal).
  # carb:
  #   min: 250              # g/person/day
  #   # equal_to_baseline: true  # per-country g/person/day from baseline diet
  # protein:
  #   min: 50      # g/person/day
  # fat:
  #   min: 50      # g/person/day
  # cal:
  #   min: 2000    # kcal/person/day
  #   # equal_to_baseline: true  # per-country kcal/person/day from baseline diet

# --- section: sensitivity ---
# Multiplicative adjustment factors for sensitivity analysis. Applied after
# model construction. See config/schemas/config.schema.yaml for structure.
sensitivity: {}

# --- section: byproducts ---
# Foods that are not for direct human consumption (excluded from food group tracking)
byproducts:
- beet-pulp
- wheat-bran
- wheat-germ
- rice-bran
- barley-bran
- oat-bran
- buckwheat-hulls
- oilseed-meal
- palm-kernel-meal
- rapeseed-meal
- ddgs
- molasses
- maize-ethanol
- maize-gluten-feed
- maize-gluten-meal
- maize-starch
- sugarcane-ethanol
- cotton-lint

# --- section: gleam3_feed_attribution ---
# When distributing a GLEAM3 intake bucket (e.g. "By-products" =
# ~200 Mt/yr ruminant DM intake globally) across model feed categories,
# ``compute_gleam3_feed_fractions`` weights each contained model entity
# by its per-country production potential, derived from FAOSTAT crop
# production × the foods.csv pathway factor that produces it. For
# pathways where the realised share of the source crop is materially
# below 1.0 (so the unmodified potential over-states the entity's true
# share of intake), supply a dispatch-share override here. Pathways
# not listed default to 1.0.
gleam3_feed_attribution:
  pathway_dispatch_shares:
    # Wet-milled corn (HFCS / starch / corn-oil / gluten meal+feed):
    # ~15-20 % of US corn × US share of global maize production (~30 %)
    # ⇒ roughly 5-8 % of global maize. USDA ERS, Corn and Other Feed
    # Grains: https://www.ers.usda.gov/topics/crops/corn-and-other-feed-grains/
    maize_wetmill: 0.07
    # Dry-milled corn fuel ethanol: ~40 % of US corn × US ~30 % share
    # ⇒ ~12 % of global maize. USDA ERS Feed Grains Sector at a Glance.
    maize_ethanol: 0.12
    # Sugarcane ethanol: Brazil ~50 % of cane to ethanol vs sugar, India
    # and most other producers mostly sugar; ~25 % global average. F.O.
    # Licht / OECD-FAO Agricultural Outlook 2023-2032 oilseeds chapter.
    sugarcane_ethanol: 0.25

Constraint types:

  • min: Lower bound (≥)

  • max: Upper bound (≤)

  • equal: Exact requirement (=)

Model Implementation

Macronutrients are realised in the PyPSA network as a per-country store for each nutrient, fed by food-consumption links that convert food flows (Mt/year) into nutrient flows. The construction happens in workflow/scripts/build_model/nutrition.py; the bounds themselves are added at solve time in workflow/scripts/solve_model/core.py.

Nutrient buses and stores. add_macronutrient_loads creates one bus nutrient:{nutrient}:{country} and one extendable store store:nutrient:{nutrient}:{country} for every configured nutrient and country. The store carrier name equals the nutrient (e.g. protein); its unit (Mt for mass nutrients, PJ for energy) tells downstream code how to convert per-capita requirements into network units.

Food → nutrient conversion. add_food_nutrition_links adds one multi-output link consume:{food}:{country} per food and country, with bus0 = food:{food}:{country} and additional output buses for every nutrient. The efficiency on bus i equals the food’s content of nutrient i, taken from data/curated/nutrition.csv (USDA FDC SR Legacy values per 100 g) and rescaled by _nutrition_efficiency_factor so that food flows in Mt/year map onto the carrier units above. Foods listed under byproducts in the config are excluded from these links so they cannot enter human consumption.
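The rescaling from "per 100 g" values to link efficiencies follows from the units alone. The sketch below is an illustration under that assumption, not the actual _nutrition_efficiency_factor; the function name and the sample values (12 g protein and 360 kcal per 100 g) are hypothetical.

```python
KCAL_TO_PJ = 4.184e-12  # 1 kcal = 4184 J, 1 PJ = 1e15 J
GRAMS_PER_MT = 1e12


def nutrition_efficiency(value_per_100g: float, nutrient_unit: str) -> float:
    """Nutrient output per Mt of food entering a consume link.

    value_per_100g: g (mass nutrients) or kcal (energy) per 100 g of food.
    nutrient_unit: "Mt" for mass nutrients, "PJ" for energy.
    """
    if nutrient_unit == "Mt":
        # g nutrient per 100 g food is a plain mass fraction.
        return value_per_100g / 100.0
    if nutrient_unit == "PJ":
        # 1 Mt of food is 1e12 g, i.e. 1e10 portions of 100 g.
        kcal_per_mt = value_per_100g * GRAMS_PER_MT / 100.0
        return kcal_per_mt * KCAL_TO_PJ
    raise ValueError(f"unsupported nutrient unit: {nutrient_unit!r}")


# Illustrative values only, not taken from nutrition.csv.
protein_eff = nutrition_efficiency(12.0, "Mt")  # Mt protein per Mt food
energy_eff = nutrition_efficiency(360.0, "PJ")  # PJ per Mt food
print(protein_eff, energy_eff)
```

With these inputs, one Mt of food deposits 0.12 Mt on the protein store and roughly 15.06 PJ on the energy store.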

Bounds on store level. Because the network is solved over a single representative snapshot, the macronutrient store level Store-e equals the annual nutrient throughput. add_macronutrient_constraints adds one constraint per nutrient and per country to the linopy model, selecting Store-e for the matching nutrient stores and comparing it to a population-scaled RHS. The RHS conversion is

\[\begin{split}\text{rhs}_{n,c} = \begin{cases} \dfrac{r_n \cdot p_c \cdot 365}{10^{12}} & \text{(mass nutrients, Mt/year)} \\[6pt] r_n \cdot p_c \cdot 365 \cdot K_{\mathrm{kcal\rightarrow PJ}} & \text{(energy, PJ/year)} \end{cases}\end{split}\]

where \(r_n\) is the per-capita daily requirement (g/day or kcal/day), \(p_c\) is the population of country \(c\) (persons), and \(K_{\mathrm{kcal\rightarrow PJ}} = 4.184 \times 10^{-12}\). Configuration entries map to operators as

  • min: x → Store-e ≥ rhs

  • max: x → Store-e ≤ rhs

  • equal: x or equal_to_baseline: true → Store-e == rhs
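As a worked example of the RHS conversion, the following sketch evaluates both branches of the formula for a hypothetical country of 10 million people. The function name is an assumption for illustration; only the arithmetic is taken from the formula above.

```python
KCAL_TO_PJ = 4.184e-12  # K_{kcal -> PJ} from the formula above
DAYS_PER_YEAR = 365
GRAMS_PER_MT = 1e12


def macronutrient_rhs(requirement_per_day: float, population: float, unit: str) -> float:
    """Population-scaled annual requirement in network units.

    requirement_per_day: g/person/day for unit "Mt", kcal/person/day for "PJ".
    """
    annual = requirement_per_day * population * DAYS_PER_YEAR
    if unit == "Mt":
        return annual / GRAMS_PER_MT
    if unit == "PJ":
        return annual * KCAL_TO_PJ
    raise ValueError(f"unsupported unit: {unit!r}")


# Hypothetical country with 10 million inhabitants.
protein_rhs = macronutrient_rhs(50, 10_000_000, "Mt")   # 0.1825 Mt/year
energy_rhs = macronutrient_rhs(2000, 10_000_000, "PJ")  # about 30.54 PJ/year
print(protein_rhs, energy_rhs)
```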

Equality constraints silence any min/max on the same nutrient; equal_to_baseline uses each country’s baseline per-capita intake (_compute_baseline_macronutrient_by_country) as the RHS instead of a global value, so countries hold their own current diet on that nutrient.
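The precedence rules can be summarised in a few lines. This is a minimal sketch of the behaviour described here and in the config comments, assuming a plain dict per nutrient; resolve_operators is a hypothetical helper, not a function from solve_model/core.py.

```python
from typing import List, Optional, Tuple


def resolve_operators(spec: dict, baseline: Optional[float] = None) -> List[Tuple[str, float]]:
    """Return (operator, per-capita value) pairs for one nutrient in one country."""
    if spec.get("equal_to_baseline"):
        clashing = {"min", "max", "equal"} & spec.keys()
        if clashing:
            # The config comments declare these keys mutually exclusive.
            raise ValueError(f"equal_to_baseline cannot be combined with {sorted(clashing)}")
        if baseline is None:
            raise ValueError("equal_to_baseline needs a baseline value for this country")
        return [("==", baseline)]
    if "equal" in spec:
        # An equality overrides any min/max on the same nutrient.
        return [("==", spec["equal"])]
    bounds = []
    if "min" in spec:
        bounds.append((">=", spec["min"]))
    if "max" in spec:
        bounds.append(("<=", spec["max"]))
    return bounds


print(resolve_operators({"min": 50, "max": 90}))
print(resolve_operators({"min": 50, "max": 90, "equal": 70}))
print(resolve_operators({"equal_to_baseline": True}, baseline=63.5))
```

The baseline value of 63.5 g/person/day is made up; in the model it would come from the per-country baseline intake.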

Food Groups

Beyond macronutrients, the model can also constrain consumption of food groups. Food groups are also used to assess dietary risk factors (see Health Impacts).

Configuration

food_groups:
  included:
  - whole_grains
  - grain
  - fruits
  - vegetables
  - legumes
  - nuts_seeds
  - starchy_vegetable
  - oil
  - red_meat
  - poultry
  - dairy
  - eggs
  - sugar
  - stimulants
  - animal_fat
  # Optional per-group constraints with "min", "max" or "equal" in g/person/day
  constraints: {}
  equal_by_country_source: null
  # Per-capita consumption caps (g/person/day) applied as e_nom_max on stores.
  # Values are set to:
  #   ceil(2 * max(TMREL, max country-level group consumption))
  # using custom baseline diet estimates from processing/{name}/baseline_diet.csv
  # and TMREL values from derived health RR curves (where available).
  max_per_capita:
    whole_grains: 300
    grain: 1403
    fruits: 658
    vegetables: 785
    legumes: 300
    nuts_seeds: 79
    starchy_vegetable: 1221
    oil: 155
    red_meat: 285
    poultry: 241
    dairy: 2865
    eggs: 213
    sugar: 133
    stimulants: 50
    animal_fat: 50
  # Fix relative food contributions within each food group based on baseline
  # consumption data. When enabled, the model maintains baseline ratios between
  # foods in each group (e.g., if wheat is 60% and rice 40% of grains, that
  # ratio is preserved) while allowing total group consumption to vary.
  fix_within_group_ratios:
    enabled: false

# --- section: weight_conversion ---
# Mass-basis conversion tables, keyed "<from>_to_<to>". Each table maps a
# food (or food group) name to a multiplicative factor; foods not listed
# default to 1.0. Consumed by the diet pipeline (baseline_diet, FLW,
# health RR conversions) and by the animal-product pipeline (FAOSTAT QCL
# carcass → retail conversion and feed→retail ME normalisation).
# Bases recognised model-wide: dry, fresh, cooked, carcass, brewed.
weight_conversion:
  # GBD whole_grain is reported on a dry whole-grain basis (the TMREL
  # 100-150 g/day is calibrated on dry content); GBD legumes is reported
  # on a cooked basis. Convert to model basis (dry) via 0.45 / 0.40.
  cooked_to_dry:
    grain: 0.45
    whole_grains: 0.45
    legumes: 0.40
  # GBD red_meat is reported in cooked basis; the model uses raw retail
  # mass. Inflation factor 1/0.7 ≈ 1.43 lands GBD red_meat exposure on
  # raw retail basis. (Plan: complement with a basis correction to the
  # health module's red_meat RR function so attributable burden uses
  # cooked-basis exposure end to end.)
  cooked_to_fresh:
    red_meat: 1.43
    poultry: 1.0
  # Green-tea-leaf → made (dry) tea: FAO uses 0.22 as the standard
  # processing yield (1 kg of green leaf yields ~0.22 kg of dry made
  # tea). Applied in the FBS override so the supply-side GAEZ tea yields
  # (made-tea basis) and the demand-side baseline consumption (derived
  # from FBS green-leaf supply) land in the same basis.
  fresh_to_dry:
    tea-dried: 0.22
  # Carcass → retail (boneless, edible) conversion for meat products.
  # Source: OECD-FAO Agricultural Outlook 2023-2032, Meat chapter,
  # Box 6.1 ("Edible retail weight"). Cross-reference: USDA Agricultural
  # Handbook 697 (1992), Table 7. The food bus carries retail mass, while
  # FAOSTAT QCL reports meat in carcass weight — these factors land FBS
  # supply (FBS override path), QCL production (prepare_faostat_animal_
  # production), implicit FLW (prepare_food_loss_waste), and ME-per-kg
  # requirements (build_feed_to_animal_products) on the retail/fresh basis
  # the model uses internally. Eggs and dairy aren't listed because their
  # FBS supply is already in retail mass (factor 1.0 by default).
  carcass_to_fresh:
    meat-cattle: 0.67  # OECD-FAO 2023 Box 6.1: Beef 67%
    meat-pig: 0.73     # OECD-FAO 2023 Box 6.1: Pigmeat 73%
    meat-sheep: 0.66   # OECD-FAO 2023 Box 6.1: Sheep 66%
    meat-chicken: 0.60 # OECD-FAO 2023 Box 6.1: Poultry 60%

List the active groups under food_groups.included and only specify constraints for the ones that need limits (min, max, or equal in g/person/day). Leaving constraints empty allows the optimizer to choose any mix of foods that satisfies macronutrient and other requirements.

Foods are assigned to groups in data/curated/food_groups.csv (one food,group row per food). Unmapped foods bypass the group buses entirely — their nutrients still count, but they are unconstrained at the group level.
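The rule quoted in the max_per_capita comment, ceil(2 * max(TMREL, max country-level group consumption)), can be reproduced as follows. This is a sketch under assumptions: the function name, the country codes and the consumption figures are hypothetical, and groups without a TMREL are assumed to fall back to the consumption maximum alone.

```python
import math
from typing import Mapping, Optional


def per_capita_cap(country_consumption: Mapping[str, float], tmrel: Optional[float] = None) -> int:
    """Cap in g/person/day: twice the larger of TMREL and the highest baseline intake."""
    highest = max(country_consumption.values())
    reference = highest if tmrel is None else max(tmrel, highest)
    return math.ceil(2 * reference)


# Illustrative baseline intakes (g/person/day) for two made-up countries.
baseline = {"AAA": 41.3, "BBB": 96.8}
cap_with_tmrel = per_capita_cap(baseline, tmrel=150.0)  # TMREL dominates
cap_without_tmrel = per_capita_cap(baseline)            # consumption dominates
print(cap_with_tmrel, cap_without_tmrel)
```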

Model Implementation

Food groups mirror the macronutrient pattern but route a mass of food to a per-country store instead of a nutrient mass:

  1. Group buses and stores. add_food_group_buses_and_loads (workflow/scripts/build_model/nutrition.py) creates a bus group:{group}:{country} and an extendable store store:group:{group}:{country} (carrier group_{group}) for every included group. When food_groups.max_per_capita is set, the store’s e_nom_max is pre-clamped to the corresponding Mt/year cap (g/person/day × population × 365 / 10¹²), so the network rejects infeasible diets up-front.

  2. Food → group routing. The same multi-link that carries food into the nutrient buses (consume:{food}:{country}) also adds an extra output bus group:{group}:{country} with efficiency 1, looked up from food_groups.csv. One unit of food therefore deposits one unit of mass on its group’s store, in addition to its nutrient contributions.

  3. Population-scaled bounds. add_food_group_constraints (workflow/scripts/solve_model/core.py) selects the Store-e variables for each group_{group} carrier and adds one linopy constraint per country:

    \[\text{Store-e}_{g,c}\;\{\le,\ge,=\}\;\frac{r_g \cdot p_c \cdot 365}{10^{12}}\quad [\text{Mt/year}]\]

    with operator chosen from min/max/equal in the config. As with macronutrients, an equal bound silences min/max for that group.

  4. Per-country equality from the baseline diet. When the diet module is configured to anchor a group to current per-country consumption (diet.enforce_baseline or an equality CSV), the solver builds a per_country_equal mapping {group: {country: g/person/day}} from the baseline diet and feeds it to add_food_group_constraints. The equality RHS then uses the country-specific value instead of a global one — useful when the goal is to hold today’s group mix fixed and let the model choose within-group composition.

This setup keeps dietary diversity decoupled from macronutrient adequacy: the optimizer can satisfy energy/protein/fat from a narrow set of foods only if no binding group min (e.g. fruits, vegetables) prevents it.
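The cap in step 1 and the equality RHS in steps 3 and 4 use the same unit conversion. The sketch below illustrates both; the helper names are hypothetical, and falling back to the global value for countries without a per-country entry is an assumption rather than documented behaviour.

```python
from typing import Dict, Mapping, Optional

DAYS_PER_YEAR = 365
GRAMS_PER_MT = 1e12


def to_mt_per_year(per_capita_g_day: float, population: float) -> float:
    """Convert g/person/day into Mt/year for a given population."""
    return per_capita_g_day * population * DAYS_PER_YEAR / GRAMS_PER_MT


def equality_rhs(
    group: str,
    global_equal: Optional[float],
    per_country_equal: Mapping[str, Mapping[str, float]],
    population: Mapping[str, float],
) -> Dict[str, float]:
    """Equality RHS per country; country-specific values replace the global one."""
    country_values = per_country_equal.get(group, {})
    rhs = {}
    for country, persons in population.items():
        value = country_values.get(country, global_equal)
        if value is None:
            continue  # no equality requested for this country
        rhs[country] = to_mt_per_year(value, persons)
    return rhs


population = {"AAA": 10_000_000, "BBB": 5_000_000}  # made-up countries
fruit_cap = to_mt_per_year(658, population["AAA"])  # e_nom_max from max_per_capita
fruit_equal = equality_rhs("fruits", 200.0, {"fruits": {"AAA": 180.0}}, population)
print(fruit_cap, fruit_equal)
```

Here the fruits cap for AAA is about 2.40 Mt/year, and the equality RHS is 0.657 Mt/year for AAA (baseline value) and 0.365 Mt/year for BBB (global value).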

Population Data

Population projections come from the UN World Population Prospects (WPP) 2024 revision.

Data Processing

The prepare_population rule (workflow/scripts/prepare_population.py):

  1. Load WPP data: data/downloads/WPP_population.csv.gz

  2. Filter:

    • Countries in config['countries']

    • Planning horizon year (config['planning_horizon'], e.g., 2030)

    • Medium variant projection

  3. Aggregate: Sum population by country (converts thousands → persons)

  4. Output:

    • processing/{name}/population.csv: Total population by country

    • processing/{name}/population_age.csv: Age-structured population for health module
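The filter and aggregate steps can be sketched with plain Python. The column names (iso3, year, variant, pop_thousands) and the sample rows are assumptions chosen for illustration; the actual WPP file layout and the logic in prepare_population.py may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set


def total_population(
    rows: Iterable[Mapping[str, object]],
    countries: Set[str],
    year: int,
    variant: str = "Medium",
) -> Dict[str, float]:
    """Sum age-group rows per country and convert thousands to persons."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        if row["iso3"] in countries and row["year"] == year and row["variant"] == variant:
            totals[row["iso3"]] += float(row["pop_thousands"]) * 1000.0
    return dict(totals)


# Hypothetical rows: one per country, year, variant and age group.
rows = [
    {"iso3": "AAA", "year": 2030, "variant": "Medium", "pop_thousands": 2500.0},
    {"iso3": "AAA", "year": 2030, "variant": "Medium", "pop_thousands": 6000.0},
    {"iso3": "AAA", "year": 2030, "variant": "Medium", "pop_thousands": 1500.0},
    {"iso3": "AAA", "year": 2025, "variant": "Medium", "pop_thousands": 9500.0},
    {"iso3": "AAA", "year": 2030, "variant": "High", "pop_thousands": 10400.0},
    {"iso3": "CCC", "year": 2030, "variant": "Medium", "pop_thousands": 700.0},
]
population = total_population(rows, countries={"AAA", "BBB"}, year=2030)
print(population)
```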

Age Structure

Age-structured population is used in the health module to weight dietary risk factors by demographic composition (children vs. adults vs. elderly have different disease burdens).

Nutritional Content Data

The file data/curated/nutrition.csv contains nutritional composition for each food product, sourced from the USDA FoodData Central database. This data is retrieved from the SR Legacy (Standard Reference) database, which provides laboratory-analyzed nutrient data for foods.

Data source: U.S. Department of Agriculture, Agricultural Research Service. FoodData Central, 2019. https://fdc.nal.usda.gov/

Content: Macronutrient values (protein, carbohydrates, fat) and energy (kcal) per 100g of food product.

License: Public domain under CC0 1.0 Universal. See Data Sources for full details.

The FAO Nutrient Conversion Table for Supply Utilization Accounts (2024 edition) is also stored locally in data/downloads/fao_nutrient_conversion_table_for_sua_2024.xlsx via the download_fao_nutrient_conversion_table workflow rule, providing FAO-authored nutrient factors for cross-checking FAOSTAT supply data (subject to FAO’s non-commercial use guidance). workflow/scripts/prepare_fao_edible_portion.py distils the edible portion coefficients from sheet 03 of that workbook for all configured crops, materialising them in processing/{name}/fao_edible_portion.csv for downstream use.

When the model assembles crop→food conversion links it rescales dry-matter crop production to fresh edible food mass using these coefficients together with moisture fractions from data/curated/crop_moisture_content.csv: dry harvests are uplifted by edible_portion_coefficient / (1 - moisture_fraction) before applying the pathway-specific processing factors from data/curated/foods.csv. Each processing pathway can produce multiple food products with factors that maintain mass balance (sum ≤ 1.0). Crops flagged in data/curated/yield_unit_conversions.csv are the few cases where GAEZ reports processed outputs (sugar or oil); those entries handle the unit conversion back to dry matter so that downstream processing can proceed uniformly.
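The dry-matter uplift and the pathway split amount to two multiplications. The following is a minimal sketch of that arithmetic, assuming a single crop and pathway; the function name, the food names and all numeric values are hypothetical.

```python
from typing import Dict, Mapping


def crop_to_food_outputs(
    dry_matter_mt: float,
    edible_portion: float,
    moisture_fraction: float,
    pathway_factors: Mapping[str, float],
) -> Dict[str, float]:
    """Convert dry-matter crop mass into fresh food masses for one pathway."""
    if not 0.0 <= moisture_fraction < 1.0:
        raise ValueError("moisture_fraction must lie in [0, 1)")
    if sum(pathway_factors.values()) > 1.0 + 1e-9:
        raise ValueError("pathway factors must sum to at most 1.0 (mass balance)")
    # Uplift from dry matter to fresh edible mass.
    fresh_edible_mt = dry_matter_mt * edible_portion / (1.0 - moisture_fraction)
    # Split the fresh edible mass across the pathway's products.
    return {food: fresh_edible_mt * factor for food, factor in pathway_factors.items()}


# Illustrative numbers: 8.8 Mt dry matter at 12 % moisture, fully edible.
outputs = crop_to_food_outputs(
    dry_matter_mt=8.8,
    edible_portion=1.0,
    moisture_fraction=0.12,
    pathway_factors={"flour": 0.75, "bran": 0.20},
)
print(outputs)
```

In this example 8.8 Mt of dry matter corresponds to about 10 Mt of fresh edible mass, of which roughly 7.5 Mt becomes flour and 2.0 Mt bran; the remaining 5 % is processing loss.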

Retrieval:

  • The repository includes pre-fetched nutritional data from USDA

  • To update with fresh data, enable data.usda.retrieve_nutrition: true in the config

  • Run: snakemake -- data/curated/nutrition.csv (requires network access and API key)

  • Food-to-USDA mappings are maintained in data/curated/usda_food_mapping.csv

  • A shared API key is included in the repository; users can optionally obtain their own free API key at https://fdc.nal.usda.gov/api-key-signup

Per-Capita vs. Total Consumption

The model works with total annual flows (Mt/year), whereas nutritional requirements are specified per capita per day. For mass quantities the conversion is:

\[\text{Total requirement (Mt/year)} = \frac{\text{per capita (g/day)} \times \text{population} \times 365}{10^{12}}\]

From the model’s perspective:

  • Food buses carry total food availability (Mt)

  • Nutrient buses carry total nutrient availability (Mt for mass, PJ for energy)

  • Constraints compare these totals to population-scaled requirements
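When reading results, the same relation is applied in reverse to turn store levels back into per-capita intake. The sketch below assumes this inverse is wanted for reporting; both function names are hypothetical, and the energy constant is the one used in the macronutrient RHS.

```python
KCAL_TO_PJ = 4.184e-12
DAYS_PER_YEAR = 365
GRAMS_PER_MT = 1e12


def per_capita_g_day(total_mt_per_year: float, population: float) -> float:
    """Mass flow in Mt/year expressed as g/person/day."""
    return total_mt_per_year * GRAMS_PER_MT / (population * DAYS_PER_YEAR)


def per_capita_kcal_day(total_pj_per_year: float, population: float) -> float:
    """Energy flow in PJ/year expressed as kcal/person/day."""
    return total_pj_per_year / KCAL_TO_PJ / (population * DAYS_PER_YEAR)


# Hypothetical store levels for a country of 10 million people.
protein_intake = per_capita_g_day(0.1825, 10_000_000)      # about 50 g/person/day
energy_intake = per_capita_kcal_day(30.5432, 10_000_000)   # about 2000 kcal/person/day
print(protein_intake, energy_intake)
```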

Dietary Patterns

The model does not currently prescribe specific dietary patterns (e.g., Mediterranean, vegetarian, EAT-Lancet). Instead, diets emerge from two mechanisms:

  1. Lower / upper bounds: Ensure minimum nutritional adequacy and enforce any configured caps

  2. Cost minimization: Subject to those bounds, minimize environmental + health costs

Workflow Integration

Nutritional constraints are incorporated through the build_model rule, with the bounds themselves attached when the model is solved:

  1. Load population: processing/{name}/population.csv

  2. Load nutrition data: data/curated/nutrition.csv

  3. Create nutrient buses: Per-country buses for each nutrient

  4. Create food → nutrient links: Based on nutritional content

  5. Add per-country constraints: Population × requirement bounds, added at solve time (workflow/scripts/solve_model/core.py)

No separate rule is needed; nutrition is integrated into the model structure.