Food Processing & Trade¶
Food Processing¶
Overview¶
The food processing module converts raw agricultural products (crops and animal products) into final food products consumed by the population. This captures:
Multi-output processing: Single crops can produce multiple co-products (e.g., wheat → white flour + bran + germ)
Alternative pathways: Different processing options for the same crop (e.g., white flour vs. wholemeal flour from wheat)
Mass balance: Processing losses and byproducts are explicitly tracked
Unit conversion: Conversion from dry matter (DM) to commercial commodity weight as traded/consumed (per-crop policy declared in
data/curated/crop_moisture_content.csv)
Processing is represented in the model as PyPSA multi-output links with crop buses as inputs and multiple food buses as outputs. Each pathway creates one link per country, with efficiencies adjusted for food loss and waste factors.
Data Files¶
The two files below, created and distributed for internal food-opt use, define possible food processing pathways and food groups.
- data/curated/foods.csv
Defines crop-to-food processing pathways using a pathway-based format that supports multi-output processing. Each pathway can convert one crop into one or more food products, with conversion factors maintaining mass balance.
Columns:
pathway: Unique identifier for the processing pathway (e.g., white_flour, milled_rice)
crop: Input crop name (must match config crops list)
food: Output food product name
factor: Conversion factor (mass of food output per unit mass of crop input). Pathway efficiencies are then multiplied by the per-crop dry-matter-to-food factor from data/curated/crop_moisture_content.csv. The food_conversion column controls whether the moisture inversion is applied: inverse_moisture for almost every crop, identity only for tea, where the listed moisture refers to the as-harvested fresh leaf form rather than the as-traded commercial commodity.
description: Explanation of the conversion and source reference
Multi-output pathways: Multiple rows with the same pathway name represent co-products from a single processing operation. For example, the white_flour pathway produces flour-white (0.75), wheat-bran (0.20), and wheat-germ (0.03) from wheat, with factors summing to ≤ 1.0 to respect mass balance.
Alternative pathways: Different pathways for the same crop represent processing alternatives that the model can choose between based on demand and costs. For example, wheat can be processed via the white_flour or wholemeal_flour pathways.
Sugar and oil crops: Yields for sugarcane, sugarbeet, and oil-palm are first converted back to whole-crop dry matter (see Crop Production) and then uplifted to fresh mass using the moisture table. Pathway factors therefore reflect fresh extraction efficiencies: 0.24 for palm oil (24 % oil from fresh fruit bunches), 0.14 for cane sugar, and 0.10 for beet sugar.
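The mass-balance rule for multi-output pathways (factors of one pathway summing to at most 1.0) can be checked with a few lines of Python. This is a minimal sketch rather than the project's own validation code: the function names are hypothetical, and the rows simply restate the white_flour example from the text in the pathway/crop/food/factor column layout.

```python
from collections import defaultdict

# Rows mirror the pathway/crop/food/factor columns of foods.csv; the values
# are the white_flour example quoted in the text.
rows = [
    {"pathway": "white_flour", "crop": "wheat", "food": "flour-white", "factor": 0.75},
    {"pathway": "white_flour", "crop": "wheat", "food": "wheat-bran", "factor": 0.20},
    {"pathway": "white_flour", "crop": "wheat", "food": "wheat-germ", "factor": 0.03},
]


def pathway_totals(rows):
    """Sum output factors per pathway (co-products share a pathway name)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["pathway"]] += float(row["factor"])
    return dict(totals)


def mass_balance_violations(rows, tol=1e-9):
    """Return pathways whose summed factors exceed 1.0 (plus a small tolerance)."""
    return {p: t for p, t in pathway_totals(rows).items() if t > 1.0 + tol}


totals = pathway_totals(rows)
violations = mass_balance_violations(rows)
print(totals, violations)
```

The remaining share (here about 2 % of the wheat input) is the processing loss that the pathway does not assign to any output.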
Cotton ginning: Cotton is processed via the cotton_ginning pathway, which produces three outputs from seed cotton: cotton lint (0.38), cottonseed oil (0.083), and oilseed meal (0.275). Cotton lint is a non-food byproduct routed to per-country fiber demand stores (see Configuration); cottonseed oil enters the food system as an edible oil; oilseed meal enters the generic oilcake pool available as animal feed.
Sugar beet pulp: The sugarbeet_sugar pathway emits beet-pulp (BPULP) as a coproduct of sucrose extraction at 0.05 t dry pulp per t fresh beet, consistent with Feedipedia (Legrand 2015): 1 t fresh beet yields ~0.5 t wet pulp (≈ 10–12 % DM), commonly dried to ~0.05 t at 90 % DM. The pulp is low in protein (~8 % CP) and high in pectin and digestible fibre, so it categorises as a ruminant grain feed rather than a protein feed (the model still gains a defensible supply route for the BPULP share of the GLEAM3 “By-products” intake bucket).
Maize wet-milling: An alternative to the existing maize_grain and maize_ethanol pathways, maize_wetmill represents the corn-refining route that produces starch (the dominant industrial product, much of it converted to HFCS) plus the two gluten coproducts that the GLEAM3 baseline labels MZGLTM (gluten meal, ~60 % CP) and MZGLTF (gluten feed, ~21 % CP). Per the Corn Refiners Association feed handbook, one 56-lb bushel of corn yields about 32 lb starch, 11.4 lb gluten feed, and 3 lb gluten meal, which translates to 0.571, 0.204, and 0.054 t per t maize. Corn germ oil (~0.029) and steepwater are not separately tracked. Maize starch is a byproduct entity with no human-food sink in this model and is routed to biomass disposal. Gluten meal categorises naturally as a protein ruminant feed (N ≈ 106 g/kg DM) and gluten feed as a grain ruminant feed (N ≈ 39 g/kg DM); both have GLEAM3 entries only on the ruminant side.
Palm oil and palm kernel: The palm_oil pathway produces both crude palm oil from the mesocarp (0.24 t per t FFB) and palm-kernel-meal from the kernel-oil pressing step (0.022 t per t FFB). The factor for the meal byproduct is derived from the Malaysian Palm Oil Board’s industry-standard kernel extraction rate (KER ≈ 4.5 % of FFB, see the MPOB_KER source URL in foods.csv) combined with Feedipedia’s ~50 % press cake yield from kernels (mechanical extraction, FEEDIPEDIA_PKM source URL). The ~2 % palm kernel oil that would jointly emerge from the kernel-pressing step is not separately tracked; it is small relative to global vegetable oil supply and is lumped into wastage. Palm-kernel-meal itself has a relatively low crude protein content (~17 % CP, N ≈ 27 g/kg DM) and would auto-categorise as a grain feed; the model overrides this to protein in data/curated/feed_category_overrides.csv to match GLEAM3’s accounting convention (PKEXP is booked under the “Oil seed cakes” intake bucket together with the higher-protein meals) and to ensure the new supply lands on the right feed bus for closing the gap. See Exogenous Protein Feed for the rest of the protein-feed story.
Note
When validation.use_actual_yields is true, the sugarcane, sugarbeet, and oil-palm rasters already deliver whole-crop fresh mass, so the workflow skips the conversion above and relies on the moisture table to convert to dry matter before applying extraction factors.
- data/curated/food_groups.csv
Maps foods to food groups for dietary constraint aggregation and health impact assessment. Each food must be assigned to exactly one food group.
Columns:
food: Food product name (must match foods produced in data/curated/foods.csv)
group: Food group identifier (e.g., grain, whole_grains, legumes, oil, byproduct)
Coverage: This file must include all foods that can be produced according to data/curated/foods.csv pathways, including byproducts (bran, meal, hulls, etc.). Foods without group assignments will generate warnings and will not contribute to food group constraints or health impact calculations.
Food groups: Standard groups include grains, whole_grains, legumes, nuts_seeds, oil, starchy_vegetable, fruits, vegetables, sugar, stimulants, byproduct, red_meat, prc_meat, poultry, dairy, and eggs. Fish/seafood is not currently modelled. Additional groups can be defined by extending config.food_groups.included.
Byproduct handling: Foods assigned to the byproduct group (such as wheat-bran, rice-bran, oat-bran, wheat-germ, oilseed-meal, rapeseed-meal, and buckwheat-hulls) are excluded from direct human consumption. Instead, these byproducts can be utilized as animal feed (see Feed Conversion), making them available for livestock production systems. Byproducts also have links to per-country biomass buses, providing a disposal route for byproducts that lack feed mappings; surplus can be exported to the energy sector at the configured biomass.marginal_values_usd_per_tonne (set to 0 for free disposal).
Diet-food disposal: A few foods that are part of the human diet still need an extra biomass disposal route because actual production exceeds what the modelled diet can absorb. Examples include cottonseed oil (a fixed-coefficient byproduct of cotton grown for fiber) and foods with substantial unmodelled non-food demand such as coconut (coir, charcoal, husk fuel), foxtail-millet (birdseed/forage), and sesame (high post-harvest losses). These are listed under biomass.disposal_foods; see disposal foods for the full list and rationale.
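The coverage requirement (every food produced by a pathway needs exactly one group) amounts to a set difference between the two files. The sketch below illustrates that check under stated assumptions: ungrouped_foods is a hypothetical helper, and the small inputs are illustrative stand-ins for the food column of foods.csv and the food-to-group mapping of food_groups.csv.

```python
def ungrouped_foods(pathway_foods, food_to_group):
    """Foods produced by some pathway but missing a food-group assignment."""
    return sorted(set(pathway_foods) - set(food_to_group))


# Illustrative stand-ins for the two CSV files (not the real file contents).
pathway_foods = ["flour-white", "wheat-bran", "wheat-germ"]
food_to_group = {"flour-white": "grain", "wheat-bran": "byproduct"}

missing = ungrouped_foods(pathway_foods, food_to_group)
for food in missing:
    print(f"warning: food '{food}' has no food group assignment")
```

Because the mapping is a dictionary keyed by food, it also enforces the "exactly one group" rule by construction; a loader reading the CSV would still need to reject duplicate food rows.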
Food Loss & Waste Adjustments¶
The workflow incorporates food loss (pre-retail) and food waste (retail & household) adjustments when converting crops to foods. The UN tracks food loss and waste under Sustainable Development Goal 12.3: the FAO prepares the food loss data and UNEP prepares the food waste data. Both are available through a UN Statistics Division API.
workflow/scripts/prepare_food_loss_waste.py retrieves:
- SDG indicator 12.3.1 data (series AG_FLS_PCT and AG_FOOD_WST_PC) from the UN Statistics Division API, using ISO-3 area codes.
- Food Balance Sheets data (FBS domain) from FAOSTAT to obtain country-level per-capita food supply.
UNSD reports food loss as a percentage. Regional totals (ALP product code) are available for M49 regions, while product-level breakdowns (CRL_PUL, FRT_VGT, RT_TBR, ANMPROD) exist only for the global series. The script therefore:
1. Pulls the latest world loss percentages by product type.
2. Converts them into correction factors by dividing each product share by the global ALP total (e.g. fruits & vegetables ≈ 25 % / 13 % ≈ 1.9).
3. Applies these factors to each country’s regional ALP percentage, yielding group-specific loss fractions for the model food groups.
Food waste is reported as kilograms per capita per year. To convert this to a fraction of available food supply, the script retrieves the FAOSTAT FBS Grand Total item (kg/capita/year), converts both to grams/day, and computes waste_fraction = waste_g_day / supply_g_day.
The resulting dataset processing/{name}/food_loss_waste.csv lists, for every country and model food group, the derived loss_fraction and waste_fraction.
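The loss and waste arithmetic described above can be written out as three small functions. This is a sketch of the calculation only, not the code in prepare_food_loss_waste.py: the function names are hypothetical, the 25 % and 13 % figures are the approximate global values quoted in the text, and the regional ALP percentage and per-capita waste and supply figures are made-up inputs for illustration.

```python
DAYS_PER_YEAR = 365.0


def loss_correction_factors(global_pct_by_product, global_alp_pct):
    """Ratio of each product-level global loss percentage to the global ALP total."""
    return {code: pct / global_alp_pct for code, pct in global_pct_by_product.items()}


def group_loss_fraction(regional_alp_pct, factor):
    """Scale a regional all-products loss percentage and convert it to a fraction."""
    return regional_alp_pct * factor / 100.0


def waste_fraction(waste_kg_per_capita_year, supply_kg_per_capita_year):
    """Waste as a share of food supply, with both converted to grams per day."""
    waste_g_day = waste_kg_per_capita_year * 1000.0 / DAYS_PER_YEAR
    supply_g_day = supply_kg_per_capita_year * 1000.0 / DAYS_PER_YEAR
    return waste_g_day / supply_g_day


# Global values from the text's example; regional and per-capita inputs are hypothetical.
factors = loss_correction_factors({"FRT_VGT": 25.0}, global_alp_pct=13.0)
fruit_veg_loss = group_loss_fraction(regional_alp_pct=10.0, factor=factors["FRT_VGT"])
waste = waste_fraction(waste_kg_per_capita_year=80.0, supply_kg_per_capita_year=800.0)
print(round(factors["FRT_VGT"], 2), round(fruit_veg_loss, 3), round(waste, 3))
```

The conversion to grams per day cancels in the ratio, so the waste fraction is the same as dividing the two annual figures directly; it is kept here only to follow the steps as described.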
During build_model the crop→food conversion links multiply the baseline processing efficiency by (1 - loss_fraction) * (1 - waste_fraction) for the relevant country-food group pair. The same FLW adjustment is applied to animal product links (feed→dairy, feed→meat, etc.), ensuring consistent treatment across all food pathways.
Dairy-specific loss: The SDG data groups all animal products (meat, dairy, eggs) together under the ANMPROD category. However, dairy has significantly lower losses than meat due to better cold chain infrastructure and faster processing. To correct for this, the script compares FAOSTAT dairy production with food supply data to calculate an implicit dairy loss fraction, which is typically near zero. This FBS-derived loss fraction replaces the SDG value for dairy products.
Because all factors are multiplicative (dry matter → fresh mass → edible portion → usable food), their ordering does not affect the final efficiency.
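As a worked illustration of the multiplicative adjustment, the sketch below applies (1 - loss_fraction) * (1 - waste_fraction) to a baseline processing efficiency and confirms that reordering the factors gives the same result. The function name is hypothetical; the 0.75 baseline is the flour-white factor from the pathway example, while the loss and waste fractions are invented values.

```python
import math


def adjusted_efficiency(base_efficiency, loss_fraction, waste_fraction):
    """Baseline processing efficiency scaled by the food loss and waste survival shares."""
    return base_efficiency * (1.0 - loss_fraction) * (1.0 - waste_fraction)


base = 0.75           # flour-white factor from the white_flour example
loss, waste = 0.10, 0.05  # hypothetical country/food-group fractions

eff = adjusted_efficiency(base, loss, waste)
reordered = (1.0 - waste) * base * (1.0 - loss)

# Multiplication commutes, so the two agree up to floating-point rounding.
print(round(eff, 5), math.isclose(eff, reordered))
```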
Trade¶
Overview¶
The trade module enables inter-regional flows of crops and food products, subject to transport costs.
To avoid creating a complete graph of region-to-region links (entailing \(O(n^2)\) links for \(n\) regions), the model uses a hub-based topology:
Country buses: Each country has local crop/food buses
Hub buses: A small number of hub nodes (configured count)
Hub connections: Regions connect to nearest hubs; hubs connect to each other
This reduces links from \(O(n^2)\) to \(O(n \times h + h^2)\), where \(n\) = regions and \(h\) = hubs.
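To make the scaling concrete, the following sketch counts links per traded commodity under both topologies. The counting convention is an assumption made for illustration (directed links, one link from each region to its nearest hub, and one link per ordered hub pair); the function names and the region and hub counts are hypothetical.

```python
def complete_graph_links(n_regions):
    """Directed links if every region were linked to every other region."""
    return n_regions * (n_regions - 1)


def hub_topology_links(n_regions, n_hubs):
    """One link per region to its nearest hub, plus one per ordered hub pair."""
    return n_regions + n_hubs * (n_hubs - 1)


n_regions, n_hubs = 150, 10  # hypothetical sizes
full = complete_graph_links(n_regions)
hubbed = hub_topology_links(n_regions, n_hubs)
print(full, hubbed)
```

If region-to-hub connections are created in both directions, the first term doubles, which does not change the overall scaling.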
Configuration¶
Inter-hub transport costs (trade_cost_per_t_km) live alongside the
farm-to-wholesale marketing markups (marketing_cost_per_t) inside
the unified commodities: block. See
Configuration for the full literal include, and Production Costs
(Default marketing-cost parameters (USD_2024 per tonne)) for the per-class default values and
sources.
Trade Cost Categories¶
Transport costs differentiate by commodity handling requirements. The seven crop / food classes and three feed classes used in the model are listed (together with marketing markups and references) under Default marketing-cost parameters (USD_2024 per tonne). Representative examples:
Bulk dry goods: Cereals, legumes in containers/bulk carriers
Bulky fresh: Potatoes, cassava, sugar beets
Perishable high-value / fresh produce: Fruits, vegetables, sugarcane requiring refrigeration
Chilled meat: Temperature-controlled meat transport
Dairy and eggs: Cold-chain packaged dairy and eggs
Feed byproducts: Brans, meals, distillers grains in bulk
Industrial byproducts: Cotton lint, ethanol, starch
Hub Location¶
Hub positions are determined by k-means clustering on region centroids:
Compute population-weighted centroid for each region
Run k-means with k = configured hub count
Assign each region to nearest hub
Create hub-hub distance matrix for hub-to-hub transport
K-means minimizes the sum of squared distances between region centroids and their assigned hubs, which spreads hubs spatially and keeps country-to-hub distances short.
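The hub-location steps can be sketched with a plain implementation of Lloyd's k-means algorithm. This is an illustration, not the workflow's code: it assumes region centroids are already available as planar (x, y) pairs, uses straight-line distance instead of a geographic distance, and initializes hubs deterministically from the first k points; the function names are hypothetical.

```python
import math


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm on 2-D points; returns (centers, labels)."""
    centers = [points[i] for i in range(k)]  # deterministic init: first k points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: index of the nearest center for each point
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centers.append(
                    (sum(x for x, _ in members) / len(members),
                     sum(y for _, y in members) / len(members))
                )
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_labels == labels and new_centers == centers:
            break  # converged
        centers, labels = new_centers, new_labels
    return centers, labels


def hub_distance_matrix(centers):
    """Pairwise straight-line distances between hubs."""
    return [[math.dist(a, b) for b in centers] for a in centers]


# Hypothetical region centroids in arbitrary planar units.
centroids = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
hubs, region_to_hub = kmeans(centroids, k=2)
distances = hub_distance_matrix(hubs)
print(region_to_hub, distances[0][1])
```

A production version would typically use a library k-means with several random restarts and a distance measure suited to geographic coordinates.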
Hub-based trade network showing trade hubs (green circles) and trade links: country-to-hub links (thin) and hub-to-hub links (thick).¶
Non-Tradable Commodities¶
Certain products are designated non-tradable:
Fodder crops (alfalfa, biomass sorghum) via commodities.crops.non_tradable: too bulky/low-value to transport
Foods listed in commodities.foods.non_tradable (optional): keeps fragile or policy-protected items local
Feed categories listed in commodities.feeds.non_tradable (default: fresh forage): the local-only feed pool
Non-tradable crops, foods, or feeds must be consumed (as food, feed, or byproducts) within their production region.
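In code, the non-tradable lists act as a filter applied before any trade links are created. The sketch below shows that filter for crops; the nested dictionary mimics the commodities.crops.non_tradable key, while the helper name and the crop identifiers are illustrative assumptions rather than the project's actual values.

```python
def tradable(items, non_tradable):
    """Keep input order, dropping anything on the non-tradable list."""
    blocked = set(non_tradable)
    return [item for item in items if item not in blocked]


# Hypothetical config fragment and crop list.
config = {"commodities": {"crops": {"non_tradable": ["alfalfa", "biomass-sorghum"]}}}
crops = ["wheat", "maize", "alfalfa", "biomass-sorghum"]

tradable_crops = tradable(crops, config["commodities"]["crops"]["non_tradable"])
print(tradable_crops)
```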
Model Implementation¶
Trade links are created in workflow/scripts/build_model.py:
# Pseudocode: tradable_crops, regions, nearest_hub, hub_pairs, distance,
# hub_distance, and cost_per_km stand in for data prepared elsewhere
for crop in tradable_crops:
    # Country-to-hub links
    for region in regions:
        hub = nearest_hub(region)
        n.add(
            "Link",
            f"trade_{crop}_{region}_to_{hub}",
            bus0=f"crop_{crop}_{region}",
            bus1=f"crop_{crop}_hub{hub}",
            p_nom=float("inf"),  # No capacity limit
            marginal_cost=distance * cost_per_km,
        )
    # Hub-to-hub links
    for hub_i, hub_j in hub_pairs:
        n.add(
            "Link",
            f"trade_{crop}_hub{hub_i}_to_hub{hub_j}",
            bus0=f"crop_{crop}_hub{hub_i}",
            bus1=f"crop_{crop}_hub{hub_j}",
            p_nom=float("inf"),  # No capacity limit
            marginal_cost=hub_distance * cost_per_km,
        )
The same structure applies to all foods (including animal products and byproducts),
so every consumable item can flow through the hub network unless it is listed under
commodities.foods.non_tradable.