API Reference¶

This section provides automatically generated documentation for the workflow scripts and utility modules.

Core Scripts¶

Model Building and Solving¶

Food systems optimization model builder.

Modular package for constructing PyPSA networks representing global food production, conversion, trade, and nutrition constraints.

Component Naming and Accessing Conventions¶

This module follows a consistent naming and attribute scheme for all PyPSA components. Never parse component names to extract metadata; always use the carrier and custom columns instead.

Naming Scheme¶

Names use : as the delimiter. It is uncommon in data values, so splitting on it is safe in the rare case where parsing cannot be avoided:

Pattern: {type}:{specifier}:{scope}

Buses:

crop:{crop}:{country}           e.g., crop:wheat:USA
food:{food}:{country}           e.g., food:bread:USA
feed:{category}:{country}       e.g., feed:ruminant_forage:USA
residue:{item}:{country}        e.g., residue:wheat_straw:USA
group:{group}:{country}         e.g., group:cereals:USA
nutrient:{nutrient}:{country}   e.g., nutrient:protein:USA
land:cropland:{region}_c{class}_{water}            e.g., land:cropland:usa_east_c1_r
land:pasture:{region}_c{class}                    e.g., land:pasture:usa_east_c1
land:existing_cropland:{region}_c{class}_{water}  e.g., land:existing_cropland:usa_east_c1_r
land:new:{region}_c{class}_{water}                e.g., land:new:usa_east_c1_r
land:existing_grassland_convertible:{region}_c{class}  e.g., land:existing_grassland_convertible:usa_east_c1
land:existing_grassland_marginal:{region}_c{class}     e.g., land:existing_grassland_marginal:usa_east_c1
water:{region}                  e.g., water:usa_east
fertilizer:supply               (global)
fertilizer:{country}            e.g., fertilizer:USA
emission:{type}                 e.g., emission:co2, emission:ghg

Links:

produce:{crop}_{water}:{region}_c{class}  e.g., produce:wheat_rainfed:usa_east_c1
produce:multi_{combo}_{water}:{region}_c{class}
produce:grassland:{region}_c{class}
pathway:{pathway}:{country}               e.g., pathway:milling:USA
convert:{item}_to_{category}:{country}    e.g., convert:wheat_to_ruminant_grain:USA
animal:{product}_{feed}:{country}         e.g., animal:beef_grassfed:USA
consume:{food}:{country}                  e.g., consume:bread:USA
use:existing_land:{region}_c{class}_{water}
use:existing_to_pasture:{region}_c{class}
convert:new_land:{region}_c{class}_{water}
convert:new_to_pasture:{region}_c{class}
use:existing_grassland_{type}_to_pasture:{region}_c{class}
spare:land:{region}_c{class}_{water}
spare:existing_grassland_{type}:{region}_c{class}
distribute:fertilizer:{country}
incorporate:residue_{item}:{country}
aggregate:{from}_to_{to}                  e.g., aggregate:ch4_to_ghg
trade:{commodity}:{from}_{to}
biomass:{item}:{country}

Stores:

store:group:{group}:{country}    e.g., store:group:cereals:USA
store:nutrient:{nutrient}:{country}  e.g., store:nutrient:protein:USA
store:water:{region}             e.g., store:water:usa_east
store:fertilizer:{country}       e.g., store:fertilizer:USA
store:emission:{type}            e.g., store:emission:ghg

Generators:

supply:land_{type}:{region}_c{class}_{water}  e.g., supply:land_existing_cropland:usa_east_c1_r
supply:fertilizer
slack:{type}:{scope}             e.g., slack:water:usa_east
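
The patterns above can be produced with ordinary string formatting. The following is a minimal sketch rather than the package's own code: the helper names component_name and land_scope are hypothetical and exist only for this illustration.

```python
from typing import Optional


def component_name(kind: str, specifier: str, scope: str) -> str:
    """Join the three parts of a name with the ':' delimiter."""
    for part in (kind, specifier, scope):
        if ":" in part:
            # A ':' inside a part would make the name ambiguous.
            raise ValueError(f"Name part {part!r} must not contain ':'")
    return f"{kind}:{specifier}:{scope}"


def land_scope(region: str, resource_class: int, water: Optional[str] = None) -> str:
    """Build the scope used by land-related names: {region}_c{class}[_{water}]."""
    scope = f"{region}_c{resource_class}"
    return f"{scope}_{water}" if water else scope


print(component_name("crop", "wheat", "USA"))
print(component_name("land", "cropland", land_scope("usa_east", 1, "r")))
print(component_name("produce", "wheat_rainfed", land_scope("usa_east", 1)))
```

The three printed names reproduce the crop bus, cropland bus, and crop production link examples listed above.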

Carrier Column¶

Use the carrier column for type identification. For links, the carrier identifies the link type only; specific items (crops, foods, products) are stored in metadata columns.

Buses: crop_{crop}, food_{food}, feed_{category}, residue_{item}, group_{group}, {nutrient}, land_cropland, land_pasture, land_existing_cropland, land_existing_grassland_convertible, land_existing_grassland_marginal, land_new, water, fertilizer, co2, ch4, n2o, ghg
Links:
- crop_production: Crop production (use crop column for specific crop)
- crop_production_multi: Multi-cropping production (use crop column for combination)
- grassland_production: Grassland/pasture production
- animal_production: Animal product production (use product, feed_category columns)
- food_consumption: Food consumption (use food, food_group columns)
- food_processing: Food processing pathways (use pathway, crop columns)
- feed_conversion: Crop/food to feed conversion (use crop, feed_category columns)
- trade_crop: Crop trade (use crop column)
- trade_food: Food trade (use food column)
- trade_feed: Feed trade (use feed_category column)
- biomass_crop: Crop to biomass (use crop column)
- biomass_byproduct: Byproduct to biomass (use food column)
- fertilizer_distribution: Fertilizer distribution
- emission_aggregation: GHG emission aggregation
- land_use, land_conversion, existing_to_pasture, new_to_pasture, existing_grassland_to_pasture, spare_land, spare_existing_grassland, residue_incorporation

Custom Columns¶

All components have consistent domain-specific columns for filtering:

Buses:
- country: str | NaN - country code (NaN for global/regional)
- region: str | NaN - region name (for land/water buses)
Links:
- country: str | NaN - country code
- region: str | NaN - region name
- crop: str | NaN - crop name
- food: str | NaN - food name
- food_group: str | NaN - food group name
- product: str | NaN - animal product name
- feed_category: str | NaN - feed category
- resource_class: int | NaN - land quality class
- water_supply: str | NaN - “irrigated” or “rainfed”
- land_type: str | NaN - e.g. “convertible” or “marginal” for grassland pools
Stores:
- country: str | NaN - country code
- food_group: str | NaN - food group name
- nutrient: str | NaN - nutrient name
Generators:
- country: str | NaN - country code
- region: str | NaN - region name
Global Constraints:
- country: str | NaN - country code
- food_group: str | NaN - food group name
- nutrient: str | NaN - nutrient name
- product: str | NaN - product name
- crop: str | NaN - crop name

Accessing Components¶

Use regular pandas indexing with the carrier and domain columns. Fail fast when no components are found:

# Get food group stores for a specific group
group_stores = n.stores.static[n.stores.static["carrier"] == f"group_{group}"]
if group_stores.empty:
    raise ValueError(f"No stores found for food group '{group}'")

# Get crop production links for a specific country
crop_links = n.links.static[
    (n.links.static["carrier"] == "crop_production") &
    (n.links.static["crop"] == crop) &
    (n.links.static["country"] == country)
]
if crop_links.empty:
    raise ValueError(f"No production links for crop '{crop}' in '{country}'")

# Get all consumption links
consume_links = n.links.static[n.links.static["carrier"] == "food_consumption"]
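
The filter-then-check pattern can also be wrapped in a small helper. This is a sketch under stated assumptions: select_components is a hypothetical name, not part of the package, and the toy DataFrame only stands in for n.links.static so that the example runs without a network.

```python
import pandas as pd


def select_components(static: pd.DataFrame, **filters: object) -> pd.DataFrame:
    """Return rows whose columns equal all given values; raise if none match."""
    mask = pd.Series(True, index=static.index)
    for column, value in filters.items():
        mask &= static[column] == value
    selected = static[mask]
    if selected.empty:
        # Fail fast instead of silently continuing with an empty selection.
        raise ValueError(f"No components found for {filters}")
    return selected


# Toy stand-in for n.links.static; names and values are illustrative only.
links = pd.DataFrame(
    {
        "carrier": ["crop_production", "crop_production", "food_consumption"],
        "crop": ["wheat", "maize", float("nan")],
        "country": ["USA", "USA", "USA"],
    },
    index=[
        "produce:wheat_rainfed:usa_east_c1",
        "produce:maize_rainfed:usa_east_c1",
        "consume:bread:USA",
    ],
)

wheat_links = select_components(
    links, carrier="crop_production", crop="wheat", country="USA"
)
print(list(wheat_links.index))
```

The selection uses only the carrier and metadata columns, so the component names in the index never need to be parsed.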

Solving utilities for the food systems optimization model.

Modular package providing specialized constraint builders for the PyPSA linopy model during optimization.

Data Preparation¶

Process UN WPP population data for total and age-specific counts.

Pre-compute health data for SOS2 linearisation in the solver.

class workflow.scripts.prepare_health_costs.RelativeRiskTable[source]¶

Bases: dict[tuple[str, str], dict[str, ndarray]]

Container mapping (risk, cause) to exposure grids and log RR values.
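
To illustrate the shape described above, a mapping of this kind can be built and queried as follows. This is a minimal sketch that uses plain lists instead of ndarrays; the inner key names exposures and log_rr, the risk and cause labels, and all numbers are placeholders chosen for the example, not values or names taken from the model. Interpolating linearly between grid points corresponds to what a piecewise-linear (SOS2) formulation represents.

```python
from bisect import bisect_right

# (risk, cause) -> exposure grid and log relative risk at each grid point.
# All labels and numbers below are placeholders for illustration.
table = {
    ("example_risk", "example_cause"): {
        "exposures": [0.0, 50.0, 100.0],
        "log_rr": [0.0, 0.1, 0.3],
    }
}


def interpolate_log_rr(table, risk, cause, exposure):
    """Linearly interpolate log RR at `exposure`, clamping to the grid ends."""
    entry = table[(risk, cause)]
    xs, ys = entry["exposures"], entry["log_rr"]
    if exposure <= xs[0]:
        return ys[0]
    if exposure >= xs[-1]:
        return ys[-1]
    hi = bisect_right(xs, exposure)  # first grid point strictly above exposure
    lo = hi - 1
    weight = (exposure - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + weight * (ys[hi] - ys[lo])


print(interpolate_log_rr(table, "example_risk", "example_cause", 75.0))
```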

workflow.scripts.prepare_health_costs.main()[source]¶

Main entry point for health cost preparation.

Return type: None

workflow.scripts.build_regions.cluster_regions(gdf, target_count, allow_cross_border, method='kmeans', random_state=0)[source]¶

Cluster level-1 administrative regions into target_count clusters.

Clustering is based on centroids in a projected CRS (EPSG:3857) for a reasonable Euclidean approximation. When cross-border clustering is not allowed, clustering is performed per country and the per-country targets are allocated proportionally to the number of base regions.

Parameters:

gdf (GeoDataFrame)
target_count (int)
allow_cross_border (bool)
method (str)
random_state (int)

Return type:

GeoDataFrame
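
The proportional allocation of per-country targets can be sketched as below. The documentation does not state how fractional shares are rounded, so this example assumes the largest-remainder method; allocate_targets is a hypothetical helper, not a function of build_regions.

```python
import math


def allocate_targets(base_counts: dict, target_count: int) -> dict:
    """Split target_count across countries in proportion to their base regions."""
    total = sum(base_counts.values())
    if total <= 0:
        raise ValueError("At least one base region is required")
    # Exact (fractional) share of the target for each country.
    quotas = {c: target_count * n / total for c, n in base_counts.items()}
    allocation = {c: math.floor(q) for c, q in quotas.items()}
    leftover = target_count - sum(allocation.values())
    # Hand the remaining clusters to the largest fractional remainders.
    by_remainder = sorted(
        quotas, key=lambda c: quotas[c] - allocation[c], reverse=True
    )
    for country in by_remainder[:leftover]:
        allocation[country] += 1
    return allocation


print(allocate_targets({"A": 5, "B": 3, "C": 2}, 4))
```

Under this assumed rounding rule a country with few base regions can receive zero clusters; a real implementation would likely need a minimum of one per country.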

Yield and Land Processing¶

SPDX-FileCopyrightText: 2025 Koen van Greevenbroek

SPDX-License-Identifier: GPL-3.0-or-later

workflow.scripts.compute_resource_classes.read_raster_float(path)[source]¶

Parameters: path (str)

workflow.scripts.aggregate_class_areas.read_raster_float(path)[source]¶

Parameters: path (str)

workflow.scripts.aggregate_class_areas.load_scaled_fraction(path, *, target_shape=None, target_transform=None, target_crs=None)[source]¶

Parameters:

path (str)
target_shape (tuple[int, int] | None)

Return type:

ndarray

workflow.scripts.aggregate_class_areas.raster_bounds(transform, width, height)[source]¶

Parameters:

width (int)
height (int)

Water Resources¶

workflow.scripts.build_region_water_availability.compute_month_overlaps(start_day, length_days)[source]¶

Return array of day overlaps per month for given season.

start_day is 1-indexed (GAEZ convention). length_days can exceed 365; values above one year are capped at 365 to avoid infinite wrap.

Parameters:

start_day (float)
length_days (float)

Return type:

ndarray
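
The overlap computation described above can be sketched in plain Python. This is an illustration under assumptions, not the script's implementation: it assumes a 365-day non-leap year, returns a list rather than an ndarray, and the name month_overlaps_sketch is hypothetical.

```python
MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap year


def month_overlaps_sketch(start_day: float, length_days: float) -> list:
    """Days of a season falling in each calendar month (index 0 = January)."""
    length = min(max(length_days, 0.0), 365.0)  # cap the season at one year
    start = (start_day - 1.0) % 365.0  # 1-indexed day -> 0-based offset
    end = start + length
    # Split a season that runs past the year end into two segments.
    if end <= 365.0:
        segments = [(start, end)]
    else:
        segments = [(start, 365.0), (0.0, end - 365.0)]

    overlaps = []
    month_start = 0.0
    for days in MONTH_LENGTHS:
        month_end = month_start + days
        total = sum(
            max(0.0, min(seg_end, month_end) - max(seg_start, month_start))
            for seg_start, seg_end in segments
        )
        overlaps.append(total)
        month_start = month_end
    return overlaps


print(month_overlaps_sketch(350, 30))
```

A 30-day season starting on day 350 wraps around the year end, contributing 16 days to December and 14 days to January.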

workflow.scripts.build_region_water_availability.build_basin_region_shares(basins_path, regions)[source]¶

Parameters:

basins_path (str)
regions (GeoDataFrame)

Return type:

DataFrame

workflow.scripts.build_region_water_availability.compute_region_monthly_water(shares, monthly_basin, regions)[source]¶

Parameters:

shares (DataFrame)
monthly_basin (DataFrame)
regions (list[str])

Return type:

DataFrame

workflow.scripts.build_region_water_availability.load_crop_growing_seasons(crop_files)[source]¶

Parameters: crop_files (Iterable[str])
Return type: DataFrame

workflow.scripts.build_region_water_availability.compute_region_growing_water(region_month_water, crop_seasons, regions)[source]¶

Parameters:

region_month_water (DataFrame)
crop_seasons (DataFrame)
regions (list[str])

Return type:

DataFrame

Process Huang et al. gridded irrigation water withdrawal data.

Aggregates monthly gridded irrigation water use (km³/month at 0.5° resolution) to model regions, producing outputs compatible with the sustainable water availability data from the Water Footprint Network.

This script produces the same output format as build_region_water_availability.py so that the two data sources can be used interchangeably.

Reference: Huang et al. (2018). Reconstruction of global gridded monthly sectoral water withdrawals for 1971-2010 and analysis of their spatiotemporal patterns. Hydrology and Earth System Sciences, 22, 2117-2133. https://doi.org/10.5194/hess-22-2117-2018

workflow.scripts.process_huang_irrigation_water.compute_month_overlaps(start_day, length_days)[source]¶

Return array of day overlaps per month for given season.

Parameters:

start_day (float)
length_days (float)

Return type:

ndarray

workflow.scripts.process_huang_irrigation_water.aggregate_gridded_to_regions(data_array, lon, lat, regions_gdf)[source]¶

Aggregate a gridded array to regions by summation.

Parameters:

data_array (ndarray) – 2D array (lat, lon) with water withdrawal values.
lon (ndarray) – 1D longitude coordinates.
lat (ndarray) – 1D latitude coordinates.
regions_gdf (GeoDataFrame) – GeoDataFrame with ‘region’ column and geometry.

Return type:

DataFrame

Returns:

DataFrame with ‘region’ and ‘value’ columns.

workflow.scripts.process_huang_irrigation_water.load_crop_growing_seasons(crop_files)[source]¶

Load and aggregate crop growing seasons from yield files.

This is a copy of the function from build_region_water_availability.py to ensure consistency.

Parameters: crop_files (Iterable[str])
Return type: DataFrame

workflow.scripts.process_huang_irrigation_water.compute_region_growing_water(region_month_water, crop_seasons, regions)[source]¶

Compute growing-season weighted water availability.

This mirrors the function from build_region_water_availability.py but uses ‘water_available_m3’ column from monthly data.

Parameters:

region_month_water (DataFrame)
crop_seasons (DataFrame)
regions (list[str])

Return type:

DataFrame

workflow.scripts.process_huang_irrigation_water.process_huang_irrigation(nc_path, regions_path, crop_files, reference_year=2010)[source]¶

Process Huang et al. irrigation NetCDF to regional water data.

Parameters:

nc_path (str) – Path to the extracted Huang irrigation NetCDF file.
regions_path (str) – Path to the regions GeoJSON file.
crop_files (list[str]) – List of crop yield file paths for growing season data.
reference_year (int) – Year to use for water withdrawal (default: 2010).

Returns:

Tuple of:
- DataFrame with monthly region water (region, month, water_available_m3)
- DataFrame with growing season water (same format as build_region_water_availability)

Return type:

tuple[DataFrame, DataFrame]

Utility Modules¶

workflow.scripts.raster_utils.calculate_all_cell_areas(src, *, repeat=True)[source]¶

Return per-pixel area in hectares for a geographic (lon/lat) raster.

Parameters:

src (DatasetReader) – Raster opened with rasterio, expected in lon/lat coordinates.
repeat (bool) – When True (default) repeat the per-row areas across columns, yielding a 2D array matching the raster shape. When False, return the 1D per-row areas without repeating, which is useful when the caller can rely on broadcasting to avoid materialising the full 2D matrix.

Return type:

ndarray
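
Because cell area on a lon/lat grid depends only on latitude, one value per row is sufficient, which is what repeat=False exposes. The sketch below shows the idea under a spherical-Earth assumption; the actual function may use a different Earth model, and row_cell_areas_ha with its parameters is a hypothetical stand-in that takes grid geometry directly instead of a rasterio dataset.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical assumption


def row_cell_areas_ha(lat_top_deg, pixel_height_deg, pixel_width_deg, n_rows):
    """Area in hectares of one cell in each row of a north-up lon/lat grid."""
    dlon = math.radians(pixel_width_deg)
    areas = []
    for row in range(n_rows):
        north = math.radians(lat_top_deg - row * pixel_height_deg)
        south = math.radians(lat_top_deg - (row + 1) * pixel_height_deg)
        # Area of a lon/lat cell on a sphere: R^2 * dlon * (sin(north) - sin(south)).
        area_m2 = EARTH_RADIUS_M**2 * dlon * (math.sin(north) - math.sin(south))
        areas.append(area_m2 / 10_000.0)  # m^2 -> ha
    return areas


# Global 0.5 degree grid: 360 rows, 720 columns.
areas = row_cell_areas_ha(90.0, 0.5, 0.5, 360)
total_ha = sum(areas) * 720
print(f"{total_ha:.4e} ha")
```

Multiplying each row value by the number of columns and summing recovers the surface area of the sphere, which is a convenient sanity check for any per-row area routine.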

workflow.scripts.raster_utils.scale_fraction(arr)[source]¶

Scale array to 0..1 if stored as 0..100 or 0..10000; clip to [0,1].

Parameters: arr (ndarray)
Return type: ndarray
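
A possible way to implement this kind of rescaling is shown below. The documentation does not say how the storage convention is detected, so the sketch assumes it is inferred from the largest finite value; scale_fraction_sketch and its thresholds are illustrative assumptions, not the module's code.

```python
import numpy as np


def scale_fraction_sketch(arr):
    """Rescale percent (0..100) or basis-point (0..10000) data to 0..1."""
    out = np.asarray(arr, dtype=float)
    finite = out[np.isfinite(out)]
    peak = finite.max() if finite.size else 0.0
    if peak > 100.0:
        out = out / 10_000.0  # assumed stored as 0..10000
    elif peak > 1.0:
        out = out / 100.0  # assumed stored as 0..100
    # Clip to the valid range; NaN entries stay NaN.
    return np.clip(out, 0.0, 1.0)


print(scale_fraction_sketch(np.array([0.0, 50.0, 100.0])))
print(scale_fraction_sketch(np.array([0.0, 2500.0, 10000.0])))
```

A heuristic based on the maximum cannot distinguish a percent raster whose values never exceed 1 from a true fraction raster, so callers with such data would need to pass the scale explicitly.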

workflow.scripts.raster_utils.raster_bounds(transform, width, height)[source]¶

Calculate bounding box from raster transform and dimensions.

Parameters:

width (int)
height (int)
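
For a north-up raster, the bounding box follows directly from the six affine coefficients. The sketch below assumes the usual affine ordering (a, b, c, d, e, f) with x = a*col + b*row + c and y = d*col + e*row + f, takes the coefficients as a plain tuple rather than a transform object, and assumes a (left, bottom, right, top) return order; raster_bounds_sketch is a hypothetical name.

```python
def raster_bounds_sketch(coefficients, width, height):
    """Return (left, bottom, right, top) for an unrotated raster."""
    a, b, c, d, e, f = coefficients
    if b != 0 or d != 0:
        raise ValueError("Rotated or sheared transforms are not handled here")
    # Corner coordinates of the first and last pixel edges.
    x0, x1 = c, c + a * width
    y0, y1 = f, f + e * height
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


# Global 0.5 degree grid with its origin at the top-left corner.
print(raster_bounds_sketch((0.5, 0.0, -180.0, 0.0, -0.5, 90.0), 720, 360))
```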

workflow.scripts.raster_utils.read_raster_float(path)[source]¶

Open raster and return array + source, converting nodata to NaN.

Returns: (array as float32, rasterio source - caller must close)
Return type: tuple
Parameters: path (str)

workflow.scripts.raster_utils.load_raster_array(path)[source]¶

Load raster as float32 array, converting nodata to NaN.

Parameters: path (str)
Return type: ndarray

Visualization Scripts¶

workflow.scripts.plotting.plot_regions_map.plot_regions_map(regions_path, output_path)[source]¶

Parameters:

regions_path (str)
output_path (str)

Return type:

None

Plot a Plate Carrée map of resource classes by grid cell.

workflow.scripts.plotting.plot_resource_classes_map.plot_resource_classes_map(classes_path, regions_path, output_path)[source]¶

Parameters:

classes_path (Path)
regions_path (Path)
output_path (Path)

Return type:

None

workflow.scripts.plotting.plot_crop_production_map.main()[source]¶

Return type: None

Plot objective breakdown and visualize health risk factors by region.

class workflow.scripts.plotting.plot_health_impacts.HealthInputs(risk_breakpoints, cluster_cause, cause_log_breakpoints, cluster_summary, clusters, cluster_risk_baseline)[source]¶

Bases: object

Parameters:

risk_breakpoints (DataFrame)
cluster_cause (DataFrame)
cause_log_breakpoints (DataFrame)
cluster_summary (DataFrame)
clusters (DataFrame)
cluster_risk_baseline (DataFrame)

risk_breakpoints¶

cluster_cause¶

cause_log_breakpoints¶

cluster_summary¶

clusters¶

cluster_risk_baseline¶

__init__(risk_breakpoints, cluster_cause, cause_log_breakpoints, cluster_summary, clusters, cluster_risk_baseline)¶

Parameters:

risk_breakpoints (DataFrame)
cluster_cause (DataFrame)
cause_log_breakpoints (DataFrame)
cluster_summary (DataFrame)
clusters (DataFrame)
cluster_risk_baseline (DataFrame)

class workflow.scripts.plotting.plot_health_impacts.HealthResults(cause_costs, risk_costs, intake, cluster_population)[source]¶

Bases: object

Parameters:

cause_costs (DataFrame)
risk_costs (DataFrame)
intake (DataFrame)
cluster_population (Mapping[int, float])

cause_costs¶

risk_costs¶

intake¶

cluster_population¶

__init__(cause_costs, risk_costs, intake, cluster_population)¶

Parameters:

cause_costs (DataFrame)
risk_costs (DataFrame)
intake (DataFrame)
cluster_population (Mapping[int, float])

workflow.scripts.plotting.plot_health_impacts.sanitize_identifier(value)[source]¶

Parameters: value (str)
Return type: str

workflow.scripts.plotting.plot_health_impacts.sanitize_food_name(food)[source]¶

Parameters: food (str)
Return type: str

workflow.scripts.plotting.plot_health_impacts.compute_health_results(n, inputs, risk_factors, value_per_yll, tmrel_g_per_day, food_groups_df)[source]¶

Compute health costs from optimized network, relative to TMREL intake levels.

Parameters:

n (Network)
inputs (HealthInputs)
risk_factors (list[str])
value_per_yll (float)
tmrel_g_per_day (dict[str, float])
food_groups_df (DataFrame)

Return type:

HealthResults

workflow.scripts.plotting.plot_health_impacts.compute_baseline_risk_costs(n, inputs, risk_factors, value_per_yll, tmrel_g_per_day, food_groups_df)[source]¶

Compute baseline health costs by risk factor and by cause, relative to TMREL intake levels.

Health costs represent the monetized burden from deviations from optimal (TMREL) intake. Both total costs and individual risk factor contributions are measured relative to TMREL.

Return type:

tuple[DataFrame, DataFrame]

Returns:

(risk_costs_df, cause_costs_df) where risk_costs has columns (cluster, risk_factor, cost) and cause_costs has columns (cluster, cause, cost, log_total, rr_total, coeff).

Parameters:

n (Network)
inputs (HealthInputs)
risk_factors (list[str])
value_per_yll (float)
tmrel_g_per_day (dict[str, float])
food_groups_df (DataFrame)

workflow.scripts.plotting.plot_health_impacts.build_cluster_risk_tables(risk_costs_df, cluster_population)[source]¶

Parameters:

risk_costs_df (DataFrame)
cluster_population (Mapping[int, float])

Return type:

tuple[dict[str, dict[int, float]], dict[str, dict[int, float]]]

workflow.scripts.plotting.plot_health_impacts.compute_total_health_costs_per_capita(cause_costs, cluster_population)[source]¶

Compute total health costs per capita for each cluster.

The total cost per cluster is the sum across causes (health costs are additive across different health outcomes).

Parameters:

cause_costs (DataFrame)
cluster_population (Mapping[int, float])

Return type:

dict[int, float]
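
The additive aggregation described above amounts to a group-by sum followed by a division by population. The following is a minimal sketch, assuming cause_costs has the cluster and cost columns listed for compute_baseline_risk_costs; per_capita_costs_sketch is a hypothetical name and the numbers are made up for the example.

```python
import pandas as pd


def per_capita_costs_sketch(cause_costs, cluster_population):
    """Sum costs across causes per cluster and divide by cluster population."""
    totals = cause_costs.groupby("cluster")["cost"].sum()
    return {
        int(cluster): float(total) / cluster_population[int(cluster)]
        for cluster, total in totals.items()
    }


# Illustrative data only.
cause_costs = pd.DataFrame(
    {
        "cluster": [0, 0, 1],
        "cause": ["cause_a", "cause_b", "cause_a"],
        "cost": [10.0, 30.0, 5.0],
    }
)
population = {0: 4.0, 1: 10.0}

print(per_capita_costs_sketch(cause_costs, population))
```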

workflow.scripts.plotting.plot_health_impacts.plot_health_map(gdf, cluster_lookup, per_capita_by_risk, output_path, top_risks, *, diverging=True, value_label='Health cost per capita (bnUSD)', total_per_capita=None)[source]¶

Parameters:

gdf (GeoDataFrame)
cluster_lookup (Mapping[str, int])
per_capita_by_risk (Mapping[str, Mapping[int, float]])
output_path (Path)
top_risks (Iterable[str])
diverging (bool)
value_label (str)
total_per_capita (Mapping[int, float] | None)

Return type:

None

workflow.scripts.plotting.plot_health_impacts.build_health_region_table(gdf, cluster_lookup, cost_by_risk, per_capita_by_risk)[source]¶

Parameters:

gdf (GeoDataFrame)
cluster_lookup (Mapping[str, int])
cost_by_risk (Mapping[str, Mapping[int, float]])
per_capita_by_risk (Mapping[str, Mapping[int, float]])

Return type:

DataFrame

workflow.scripts.plotting.plot_health_impacts.main()[source]¶

Return type: None

Plot global food consumption by food group per person per day.

workflow.scripts.plotting.plot_food_consumption.main()[source]¶

Return type: None