API Reference¶
This section provides automatically generated documentation for the workflow scripts and utility modules.
Core Scripts¶
Model Building and Solving¶
Food systems optimization model builder.
Modular package for constructing PyPSA networks representing global food production, conversion, trade, and nutrition constraints.
Component Naming and Accessing Conventions¶
This module follows a consistent naming and attribute scheme for all PyPSA components. Never parse component names to extract metadata; always use the carrier and metadata columns instead.
Naming Scheme¶
Names use ":" as the delimiter (uncommon in data values, so safe for parsing if ever needed):
Pattern: {type}:{specifier}:{scope}
Buses:
crop:{crop}:{country} e.g., crop:wheat:USA
food:{food}:{country} e.g., food:bread:USA
feed:{category}:{country} e.g., feed:ruminant_forage:USA
residue:{item}:{country} e.g., residue:wheat_straw:USA
group:{group}:{country} e.g., group:cereals:USA
nutrient:{nutrient}:{country} e.g., nutrient:protein:USA
land:cropland:{region}_c{class}_{water} e.g., land:cropland:usa_east_c1_r
land:pasture:{region}_c{class} e.g., land:pasture:usa_east_c1
land:existing_cropland:{region}_c{class}_{water} e.g., land:existing_cropland:usa_east_c1_r
land:new:{region}_c{class}_{water} e.g., land:new:usa_east_c1_r
land:existing_grassland_convertible:{region}_c{class} e.g., land:existing_grassland_convertible:usa_east_c1
land:existing_grassland_marginal:{region}_c{class} e.g., land:existing_grassland_marginal:usa_east_c1
water:{region} e.g., water:usa_east
fertilizer:supply (global)
fertilizer:{country} e.g., fertilizer:USA
emission:{type} e.g., emission:co2, emission:ghg
Links:
produce:{crop}_{water}:{region}_c{class} e.g., produce:wheat_rainfed:usa_east_c1
produce:multi_{combo}_{water}:{region}_c{class}
produce:grassland:{region}_c{class}
pathway:{pathway}:{country} e.g., pathway:milling:USA
convert:{item}_to_{category}:{country} e.g., convert:wheat_to_ruminant_grain:USA
animal:{product}_{feed}:{country} e.g., animal:beef_grassfed:USA
consume:{food}:{country} e.g., consume:bread:USA
use:existing_land:{region}_c{class}_{water}
use:existing_to_pasture:{region}_c{class}
convert:new_land:{region}_c{class}_{water}
convert:new_to_pasture:{region}_c{class}
use:existing_grassland_{type}_to_pasture:{region}_c{class}
spare:land:{region}_c{class}_{water}
spare:existing_grassland_{type}:{region}_c{class}
distribute:fertilizer:{country}
incorporate:residue_{item}:{country}
aggregate:{from}_to_{to} e.g., aggregate:ch4_to_ghg
trade:{commodity}:{from}_{to}
biomass:{item}:{country}
Stores:
store:group:{group}:{country} e.g., store:group:cereals:USA
store:nutrient:{nutrient}:{country} e.g., store:nutrient:protein:USA
store:water:{region} e.g., store:water:usa_east
store:fertilizer:{country} e.g., store:fertilizer:USA
store:emission:{type} e.g., store:emission:ghg
Generators:
supply:land_{type}:{region}_c{class}_{water} e.g., supply:land_existing_cropland:usa_east_c1_r
supply:fertilizer
slack:{type}:{scope} e.g., slack:water:usa_east
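The naming scheme above can be illustrated with a small helper that joins name parts with the ":" delimiter. This is only a sketch: `component_name` is a hypothetical function, not part of the package, and the validation rules (non-empty parts, no delimiter inside a part) are assumptions.

```python
DELIMITER = ":"


def component_name(*parts: str) -> str:
    """Join name parts with ':' (hypothetical helper, not the package API)."""
    if len(parts) < 2:
        raise ValueError("A component name needs at least a type and a specifier")
    for part in parts:
        # Assumption: parts must be non-empty and must not contain the delimiter.
        if not part or DELIMITER in part:
            raise ValueError(f"Invalid name part: {part!r}")
    return DELIMITER.join(parts)


print(component_name("crop", "wheat", "USA"))  # crop:wheat:USA
print(component_name("store", "group", "cereals", "USA"))  # store:group:cereals:USA
```

The helper only builds names. Metadata such as crop or country should still be read from the component columns rather than recovered by splitting names.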
Carrier Column¶
Use the carrier column for type identification. Carriers identify link type only;
specific items (crops, foods, products) are stored in metadata columns.
Buses:
crop_{crop}, food_{food}, feed_{category}, residue_{item}, group_{group}, {nutrient}, land_cropland, land_pasture, land_existing_cropland, land_existing_grassland_convertible, land_existing_grassland_marginal, land_new, water, fertilizer, co2, ch4, n2o, ghg
Links:
crop_production: Crop production (use crop column for specific crop)
crop_production_multi: Multi-cropping production (use crop column for combination)
grassland_production: Grassland/pasture production
animal_production: Animal product production (use product, feed_category columns)
food_consumption: Food consumption (use food, food_group columns)
food_processing: Food processing pathways (use pathway, crop columns)
feed_conversion: Crop/food to feed conversion (use crop, feed_category columns)
trade_crop: Crop trade (use crop column)
trade_food: Food trade (use food column)
trade_feed: Feed trade (use feed_category column)
biomass_crop: Crop to biomass (use crop column)
biomass_byproduct: Byproduct to biomass (use food column)
fertilizer_distribution: Fertilizer distribution
emission_aggregation: GHG emission aggregation
land_use, land_conversion, existing_to_pasture, new_to_pasture, existing_grassland_to_pasture, spare_land, spare_existing_grassland, residue_incorporation
Custom Columns¶
All components have consistent domain-specific columns for filtering:
Buses:
country: str | NaN - country code (NaN for global/regional)
region: str | NaN - region name (for land/water buses)
Links:
country: str | NaN - country code
region: str | NaN - region name
crop: str | NaN - crop name
food: str | NaN - food name
food_group: str | NaN - food group name
product: str | NaN - animal product name
feed_category: str | NaN - feed category
resource_class: int | NaN - land quality class
water_supply: str | NaN - “irrigated” or “rainfed”
land_type: str | NaN - e.g. “convertible” or “marginal” for grassland pools
Stores:
country: str | NaN - country code
food_group: str | NaN - food group name
nutrient: str | NaN - nutrient name
Generators:
country: str | NaN - country code
region: str | NaN - region name
Global Constraints:
country: str | NaN - country code
food_group: str | NaN - food group name
nutrient: str | NaN - nutrient name
product: str | NaN - product name
crop: str | NaN - crop name
Accessing Components¶
Use regular pandas indexing with the carrier and domain columns. Fail fast when
no components are found:
# Get food group stores for a specific group
group_stores = n.stores.static[n.stores.static["carrier"] == f"group_{group}"]
if group_stores.empty:
    raise ValueError(f"No stores found for food group '{group}'")

# Get crop production links for a specific country
crop_links = n.links.static[
    (n.links.static["carrier"] == "crop_production")
    & (n.links.static["crop"] == crop)
    & (n.links.static["country"] == country)
]
if crop_links.empty:
    raise ValueError(f"No production links for crop '{crop}' in '{country}'")

# Get all consumption links
consume_links = n.links.static[n.links.static["carrier"] == "food_consumption"]
Solving utilities for the food systems optimization model.
Modular package providing specialized constraint builders for the PyPSA linopy model during optimization.
Data Preparation¶
Process UN WPP population data for total and age-specific counts.
Pre-compute health data for SOS2 linearisation in the solver.
- class workflow.scripts.prepare_health_costs.RelativeRiskTable[source]¶
Bases: dict[tuple[str, str], dict[str, ndarray]]
Container mapping (risk, cause) to exposure grids and log RR values.
- workflow.scripts.prepare_health_costs.main()[source]¶
Main entry point for health cost preparation.
- Return type:
- workflow.scripts.build_regions.cluster_regions(gdf, target_count, allow_cross_border, method='kmeans', random_state=0)[source]¶
Cluster level-1 administrative regions into target_count clusters.
Clustering is based on centroids in a projected CRS (EPSG:3857) for a reasonable Euclidean approximation. When cross-border clustering is not allowed, clustering is performed per country and the per-country targets are allocated proportionally to the number of base regions.
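The proportional allocation of per-country targets described above could look roughly like the sketch below. The documentation does not state the rounding rule, so the largest-remainder method used here is an assumption, and `allocate_targets` is a hypothetical name rather than a function from build_regions.

```python
def allocate_targets(region_counts: dict[str, int], target_count: int) -> dict[str, int]:
    """Split target_count across countries in proportion to their base regions.

    Hypothetical sketch using the largest-remainder method (an assumption).
    """
    total = sum(region_counts.values())
    if not 0 < target_count <= total:
        raise ValueError("target_count must be between 1 and the number of base regions")

    # Integer arithmetic avoids floating-point noise when ranking remainders.
    allocation = {c: target_count * n // total for c, n in region_counts.items()}
    remainders = {c: target_count * n % total for c, n in region_counts.items()}

    # Hand the clusters lost to rounding down to the largest remainders;
    # ties are broken alphabetically so the result is deterministic.
    leftover = target_count - sum(allocation.values())
    ranked = sorted(region_counts, key=lambda c: (-remainders[c], c))
    for country in ranked[:leftover]:
        allocation[country] += 1
    return allocation


print(allocate_targets({"A": 10, "B": 6, "C": 4}, 7))  # {'A': 4, 'B': 2, 'C': 1}
```

Under this rule a country with very few base regions can receive zero clusters. How cluster_regions treats that case is not documented here.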
Yield and Land Processing¶
SPDX-FileCopyrightText: 2025 Koen van Greevenbroek
SPDX-License-Identifier: GPL-3.0-or-later
- workflow.scripts.aggregate_class_areas.load_scaled_fraction(path, *, target_shape=None, target_transform=None, target_crs=None)[source]¶
Water Resources¶
- workflow.scripts.build_region_water_availability.compute_month_overlaps(start_day, length_days)[source]¶
Return array of day overlaps per month for given season.
start_day is 1-indexed (GAEZ convention). length_days can exceed 365; values above one year are capped at 365 to avoid infinite wrap.
- Parameters:
basins_path (str)
regions (GeoDataFrame)
- Return type:
DataFrame
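The month-overlap computation described for compute_month_overlaps (1-indexed start day, season length capped at 365 days, wrap-around past the end of the year) can be sketched in plain Python. This is an illustration under assumptions, not the script's code: it uses a non-leap calendar, returns a list instead of an array, and `month_overlaps` is a hypothetical name.

```python
DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap year


def month_overlaps(start_day: int, length_days: int) -> list[int]:
    """Number of season days falling in each calendar month (hypothetical sketch)."""
    length = max(0, min(length_days, 365))  # cap at one year
    season_start = (start_day - 1) % 365    # convert to a 0-indexed day of year
    season_end = season_start + length      # may run past 365 and wrap

    overlaps = [0] * 12
    month_start = 0
    for month, days in enumerate(DAYS_PER_MONTH):
        month_end = month_start + days
        # Compare against the month in this year and in the following year,
        # which covers the part of the season that wraps around.
        for shift in (0, 365):
            overlap = min(season_end, month_end + shift) - max(season_start, month_start + shift)
            if overlap > 0:
                overlaps[month] += overlap
        month_start = month_end
    return overlaps


# A 30-day season starting on day 350 covers 16 days of December and 14 of January.
print(month_overlaps(350, 30))
```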
- workflow.scripts.build_region_water_availability.compute_region_monthly_water(shares, monthly_basin, regions)[source]¶
- workflow.scripts.build_region_water_availability.compute_region_growing_water(region_month_water, crop_seasons, regions)[source]¶
Process Huang et al. gridded irrigation water withdrawal data.
Aggregates monthly gridded irrigation water use (km³/month at 0.5° resolution) to model regions, producing outputs compatible with the sustainable water availability data from the Water Footprint Network.
This script produces the same output format as build_region_water_availability.py so that the two data sources can be used interchangeably.
- Reference:
Huang et al. (2018). Reconstruction of global gridded monthly sectoral water withdrawals for 1971-2010 and analysis of their spatiotemporal patterns. Hydrology and Earth System Sciences, 22, 2117-2133. https://doi.org/10.5194/hess-22-2117-2018
- workflow.scripts.process_huang_irrigation_water.compute_month_overlaps(start_day, length_days)[source]¶
Return array of day overlaps per month for given season.
- workflow.scripts.process_huang_irrigation_water.aggregate_gridded_to_regions(data_array, lon, lat, regions_gdf)[source]¶
Aggregate a gridded array to regions by summation.
- Parameters:
- Return type:
DataFrame
- Returns:
DataFrame with ‘region’ and ‘value’ columns.
- workflow.scripts.process_huang_irrigation_water.load_crop_growing_seasons(crop_files)[source]¶
Load and aggregate crop growing seasons from yield files.
This is a copy of the function from build_region_water_availability.py to ensure consistency.
- workflow.scripts.process_huang_irrigation_water.compute_region_growing_water(region_month_water, crop_seasons, regions)[source]¶
Compute growing-season weighted water availability.
This mirrors the function from build_region_water_availability.py but uses the ‘water_available_m3’ column from the monthly data.
- workflow.scripts.process_huang_irrigation_water.process_huang_irrigation(nc_path, regions_path, crop_files, reference_year=2010)[source]¶
Process Huang et al. irrigation NetCDF to regional water data.
- Parameters:
- Returns:
DataFrame with monthly region water (region, month, water_available_m3)
DataFrame with growing season water (same format as build_region_water_availability)
- Return type:
Tuple of
Utility Modules¶
- workflow.scripts.raster_utils.calculate_all_cell_areas(src, *, repeat=True)[source]¶
Return per-pixel area in hectares for a geographic (lon/lat) raster.
- Parameters:
src (DatasetReader) – Raster opened with rasterio, expected in lon/lat coordinates.
repeat (bool) – When True (default) repeat the per-row areas across columns, yielding a 2D array matching the raster shape. When False, return the 1D per-row areas without repeating, which is useful when the caller can rely on broadcasting to avoid materialising the full 2D matrix.
- Return type:
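Because pixel area in a lon/lat raster depends only on latitude, one value per row is enough, which is why calculate_all_cell_areas offers repeat=False. The sketch below shows one way to compute per-row areas. It assumes a spherical Earth and a north-up grid; the actual function may use a different Earth model, and `row_areas_ha` and its parameters are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation (assumption)


def row_areas_ha(top_lat: float, pixel_height_deg: float,
                 pixel_width_deg: float, n_rows: int) -> list[float]:
    """Area in hectares of a single pixel in each row of a north-up lon/lat grid."""
    dlon = math.radians(pixel_width_deg)
    areas = []
    for row in range(n_rows):
        lat_top = math.radians(top_lat - row * pixel_height_deg)
        lat_bottom = math.radians(top_lat - (row + 1) * pixel_height_deg)
        # Area of a latitude/longitude cell on a sphere.
        area_m2 = EARTH_RADIUS_M**2 * dlon * (math.sin(lat_top) - math.sin(lat_bottom))
        areas.append(area_m2 / 10_000.0)  # 1 ha = 10,000 m^2
    return areas


# Global 1-degree grid: 180 rows, each row area shared by 360 columns.
areas = row_areas_ha(top_lat=90.0, pixel_height_deg=1.0, pixel_width_deg=1.0, n_rows=180)
total_ha = sum(areas) * 360
```

Summing the row areas over all columns of a global grid recovers the surface area of the sphere, which is a convenient sanity check.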
- workflow.scripts.raster_utils.scale_fraction(arr)[source]¶
Scale array to 0..1 if stored as 0..100 or 0..10000; clip to [0,1].
- workflow.scripts.raster_utils.raster_bounds(transform, width, height)[source]¶
Calculate bounding box from raster transform and dimensions.
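A bounding box can be derived from an affine transform by mapping the corners of the pixel grid to world coordinates. The sketch below assumes the six-coefficient convention (a, b, c, d, e, f), where x = a*col + b*row + c and y = d*col + e*row + f, and returns (left, bottom, right, top). The return order of raster_bounds is not documented here, so that ordering and the name `bounds_from_transform` are assumptions.

```python
def bounds_from_transform(
    transform: tuple[float, float, float, float, float, float],
    width: int,
    height: int,
) -> tuple[float, float, float, float]:
    """Return (left, bottom, right, top) for a raster (hypothetical sketch)."""
    a, b, c, d, e, f = transform
    # Map the four corners of the pixel grid (col, row) to world coordinates.
    corners = [(0, 0), (width, 0), (0, height), (width, height)]
    xs = [a * col + b * row + c for col, row in corners]
    ys = [d * col + e * row + f for col, row in corners]
    return min(xs), min(ys), max(xs), max(ys)


# Global 0.5-degree, north-up grid with its origin at (-180, 90).
print(bounds_from_transform((0.5, 0.0, -180.0, 0.0, -0.5, 90.0), 720, 360))
# (-180.0, -90.0, 180.0, 90.0)
```

Taking the minimum and maximum over all four corners keeps the result correct for south-up or rotated transforms as well.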
Visualization Scripts¶
Plot a Plate Carrée map of resource classes by grid cell.
- workflow.scripts.plotting.plot_resource_classes_map.plot_resource_classes_map(classes_path, regions_path, output_path)[source]¶
- Parameters:
classes_path (Path)
regions_path (Path)
output_path (Path)
- Return type:
Plot objective breakdown and visualize health risk factors by region.
- class workflow.scripts.plotting.plot_health_impacts.HealthInputs(risk_breakpoints, cluster_cause, cause_log_breakpoints, cluster_summary, clusters, cluster_risk_baseline)[source]¶
Bases: object
- Parameters:
risk_breakpoints (DataFrame)
cluster_cause (DataFrame)
cause_log_breakpoints (DataFrame)
cluster_summary (DataFrame)
clusters (DataFrame)
cluster_risk_baseline (DataFrame)
- risk_breakpoints¶
- cluster_cause¶
- cause_log_breakpoints¶
- cluster_summary¶
- clusters¶
- cluster_risk_baseline¶
- __init__(risk_breakpoints, cluster_cause, cause_log_breakpoints, cluster_summary, clusters, cluster_risk_baseline)¶
- Parameters:
risk_breakpoints (DataFrame)
cluster_cause (DataFrame)
cause_log_breakpoints (DataFrame)
cluster_summary (DataFrame)
clusters (DataFrame)
cluster_risk_baseline (DataFrame)
- class workflow.scripts.plotting.plot_health_impacts.HealthResults(cause_costs, risk_costs, intake, cluster_population)[source]¶
Bases: object
- Parameters:
- cause_costs¶
- risk_costs¶
- intake¶
- cluster_population¶
- workflow.scripts.plotting.plot_health_impacts.compute_health_results(n, inputs, risk_factors, value_per_yll, tmrel_g_per_day, food_groups_df)[source]¶
Compute health costs from optimized network, relative to TMREL intake levels.
- Parameters:
- Return type:
- workflow.scripts.plotting.plot_health_impacts.compute_baseline_risk_costs(n, inputs, risk_factors, value_per_yll, tmrel_g_per_day, food_groups_df)[source]¶
Compute baseline health costs by risk factor and by cause, relative to TMREL intake levels.
Health costs represent the monetized burden from deviations from optimal (TMREL) intake. Both total costs and individual risk factor contributions are measured relative to TMREL.
- Return type:
tuple[DataFrame, DataFrame]
- Returns:
(risk_costs_df, cause_costs_df) where risk_costs has columns (cluster, risk_factor, cost) and cause_costs has columns (cluster, cause, cost, log_total, rr_total, coeff).
- Parameters:
- workflow.scripts.plotting.plot_health_impacts.build_cluster_risk_tables(risk_costs_df, cluster_population)[source]¶
- workflow.scripts.plotting.plot_health_impacts.compute_total_health_costs_per_capita(cause_costs, cluster_population)[source]¶
Compute total health costs per capita for each cluster.
The total cost per cluster is the sum across causes (health costs are additive across different health outcomes).
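The per-capita calculation described above (sum costs across causes within each cluster, then divide by the cluster population) can be sketched with plain Python containers. The real function works on DataFrames; the record layout, the name `total_cost_per_capita`, and the error handling below are assumptions for illustration only.

```python
def total_cost_per_capita(
    cause_costs: list[tuple[int, str, float]],
    cluster_population: dict[int, float],
) -> dict[int, float]:
    """Sum (cluster, cause, cost) records per cluster and divide by population."""
    totals: dict[int, float] = {}
    for cluster, _cause, cost in cause_costs:
        # Costs are additive across causes within a cluster.
        totals[cluster] = totals.get(cluster, 0.0) + cost

    per_capita = {}
    for cluster, total in totals.items():
        population = cluster_population[cluster]
        if population <= 0:
            raise ValueError(f"Cluster {cluster} has non-positive population")
        per_capita[cluster] = total / population
    return per_capita


# Illustrative numbers only, not model output.
records = [(0, "CHD", 100.0), (0, "Stroke", 50.0), (1, "CHD", 30.0)]
print(total_cost_per_capita(records, {0: 10.0, 1: 3.0}))  # {0: 15.0, 1: 10.0}
```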
- workflow.scripts.plotting.plot_health_impacts.plot_health_map(gdf, cluster_lookup, per_capita_by_risk, output_path, top_risks, *, diverging=True, value_label='Health cost per capita (bnUSD)', total_per_capita=None)[source]¶
- workflow.scripts.plotting.plot_health_impacts.build_health_region_table(gdf, cluster_lookup, cost_by_risk, per_capita_by_risk)[source]¶
Plot global food consumption by food group per person per day.