Introduction¶
Overview¶
food-opt is a global food systems optimization model designed to study the trade-offs between positive health outcomes and desirable environmental outcomes. The model uses a resource-flow-based structure implemented with PyPSA/linopy to jointly optimize food production, processing, trade, and consumption patterns.
Key Objectives¶
The model balances multiple objectives:
Environmental sustainability: Minimize greenhouse gas emissions (CO₂, CH₄, N₂O), land use change, nitrogen pollution, and water use
Health outcomes: Minimize disease burden from dietary risk factors
These objectives are co-optimized while operating within biophysical limits on crop yields, land availability, and irrigation capacity, and while satisfying constraints on nutritional adequacy in terms of macronutrients.
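Read purely schematically, and only as a hedged summary of the objectives and prices described in this introduction rather than the model's exact formulation, the co-optimization can be written as a single monetized minimization:

$$
\min_{x}\;\; p_{\mathrm{GHG}}\, E_{\mathrm{GHG}}(x) \;+\; v_{\mathrm{YLL}}\, \mathrm{YLL}(x) \;+\; C_{\mathrm{other}}(x)
\qquad \text{subject to land, irrigation, yield, and macronutrient-adequacy constraints,}
$$

where x stands for the production, processing, trade, and consumption decisions, p_GHG for the configured greenhouse gas price, v_YLL for the configured value per year of life lost, E_GHG for CO₂-equivalent emissions, YLL for years of life lost from dietary risk factors, and C_other for the remaining environmental terms (land use change, nitrogen pollution, water use), which enter as costs or constraints as determined by the model and its configuration.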
Key Features¶
Food System Coverage¶
Crop production: More than 60 different crops with spatially-explicit yield potentials
Livestock systems: Multiple production systems (grazing vs. feed-based) for meat and dairy
Food processing and trade: Accounting for waste losses, trade frictions, and more
Nutritional assessment: Mapping to food group-based dietary risk factors and health outcomes
Environmental Impact Assessment¶
Greenhouse gas emissions from production, land use change, and nitrogen fertilization
Land use change impacts with spatially-explicit carbon storage estimates
Water use constraints based on irrigation infrastructure and basin-level availability
Nitrogen pollution from fertilizer application
Health and Nutrition¶
Macronutrient constraints to ensure basic nutritional adequacy
Integration with Global Burden of Disease dietary risk factors
Population-level health impact assessment in terms of years of life lost
Health valuation via a configurable value-per-YLL constant
Global Extent & Flexible Spatial Resolution¶
Geophysical input data at high-resolution gridcell level (0.05° × 0.05°)
Optimization at configurable sub-national regional scale
Global coverage with detailed country and regional analysis

Global model coverage showing optimization regions. The model divides the world into sub-national regions (here 250 regions) that balance spatial detail with computational tractability. Each colored polygon represents an optimization region created by clustering administrative units.¶
Getting Started¶
Prerequisites¶
Installation¶
Clone the repository:
git clone <repository-url>
cd food-opt
Install dependencies:
uv sync
Retrieve the Global Dietary Database and Global Burden of Disease datasets manually (see
data/manually_downloaded/README.md
and the Manual Download Checklist). All other required datasets are downloaded automatically on the first workflow run.
Quick Start¶
Running Your First Model¶
The quickest path to a runnable scenario is:
Copy the defaults:
cp config/default.yaml config/my_scenario.yaml
Edit config/my_scenario.yaml and add a name field near the top (for example name: "my_scenario").
Run the workflow with your scenario file:
tools/smk -j4 --configfile config/my_scenario.yaml all
This sequence will:
Download required global datasets (GAEZ, GADM, UN population, etc.)
Process and harmonize spatial data for the configured countries
Build the linear programming model
Solve the optimization problem
Generate summary statistics and visualizations
Results will be saved under results/my_scenario/.
Understanding the Workflow¶
The Snakemake workflow is organized into stages:
Data preparation: Population, regions, resource classes, crop yields
Model building: Assemble PyPSA network with all constraints
Solving: Run the linear program with configured solver
Visualization: Generate maps, plots, and CSV exports
You can target individual stages by specifying the output file. For example, to only build the model without solving:
tools/smk -j4 --configfile config/my_scenario.yaml results/my_scenario/build/model.nc
Or to just prepare regional aggregation:
tools/smk -j4 --configfile config/my_scenario.yaml processing/my_scenario/regions.geojson
See Workflow & Execution for detailed information on the workflow stages.
Configuring Your First Scenario¶
The baseline configuration (config/default.yaml) provides a starting point. Key parameters to adjust (illustrated in the sketch below):
countries: List of ISO 3166-1 alpha-3 country codes to include
aggregation.regions.target_count: Number of optimization regions (trade-off between detail and solve time)
crops: Which crops to include in the model
emissions.ghg_price: Carbon price in USD/tCO2-eq (flows are stored in MtCO2-eq internally)
macronutrients: Minimum dietary requirements
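For orientation, a minimal scenario file touching these keys might look like the sketch below. The key names come from the list above, but the exact nesting, accepted values, and any additional required fields should be verified against config/default.yaml; the country codes, crop names, and numeric values are purely illustrative, not recommended settings.

name: "my_scenario"

countries: ["DEU", "FRA", "BRA"]        # ISO 3166-1 alpha-3 codes

aggregation:
  regions:
    target_count: 250                   # number of optimization regions

crops: ["wheat", "maize", "soybean"]    # illustrative subset of the available crops

emissions:
  ghg_price: 100                        # carbon price in USD/tCO2-eq (illustrative value)

# macronutrients: keep or adjust the minimum-requirement block from config/default.yaml

Lower values of aggregation.regions.target_count solve faster but lose spatial detail, so it is a natural first knob to turn when experimenting.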
After editing the configuration, create a new named scenario by changing the name field at the top of the file, then run:
tools/smk -j4 --configfile config/<your-name>.yaml all
Results will be saved under results/<your-name>/.
Project Structure¶
The repository is organized as follows:
food-opt/
├── config/                  # Configuration files for scenarios and parameters
│   └── default.yaml         # Baseline configuration file
├── data/                    # Input data (downloaded and processed)
│   ├── downloads/           # Raw downloaded datasets
│   ├── crops.csv            # Crop definitions
│   ├── foods.csv            # Crop-to-food processing pathways
│   └── nutrition.csv        # Nutritional content (from USDA FoodData Central)
├── processing/              # Intermediate processed datasets
│   └── {config_name}/       # Processing outputs per scenario
├── results/                 # Model outputs and analysis
│   └── {config_name}/       # Results per scenario
│       ├── build/           # Built model before solving
│       ├── solved/          # Solved model with optimal values
│       └── plots/           # Visualizations and CSV exports
├── workflow/                # Snakemake workflow
│   ├── Snakefile            # Main workflow definition
│   ├── rules/               # Modular rule definitions
│   └── scripts/             # Data processing and modeling scripts
├── tools/                   # Utility wrappers
│   └── smk                  # Memory-capped Snakemake wrapper
├── notebooks/               # Exploratory analyses
└── vendor/                  # Bundled third-party dependencies
Important Notes¶
The results/ directory contains auto-generated files; never edit these manually.
Several CSV files (data/feed_conversion.csv, data/feed_to_animal_products.csv, data/food_groups.csv) contain mock placeholder data.
Always use the tools/smk wrapper to run Snakemake, as it enforces memory limits to prevent system instability.
The first run will take significant time to download global datasets (several GB).