.. SPDX-FileCopyrightText: 2026 Koen van Greevenbroek .. .. SPDX-License-Identifier: CC-BY-4.0 .. _analysis: Analysis ======== This section describes post-hoc analyses that can be performed on solved models to extract insights about production, consumption, and the environmental and health impacts of food systems. .. _statistics-extraction: Statistics Extraction --------------------- The statistics extraction produces standardized Parquet files summarizing key model outputs. These files provide a consistent interface for downstream analysis and visualization, extracting data from the solved PyPSA network using actual dispatch flows rather than capacity-based estimates. Running the Extraction ~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: bash # Extract all statistics for a scenario tools/smk -j4 --configfile config/.yaml -- \ results/{name}/analysis/scen-default/crop_production.parquet # Or request any downstream plot to trigger extraction automatically Output Files ~~~~~~~~~~~~ All statistics are written to ``results/{name}/analysis/scen-{scenario}/``. **crop_production.parquet** — Crop production by crop, region, and country .. csv-table:: :header: Column, Type, Unit, Description ``crop``, string, —, "Crop identifier (e.g., ``wheat``, ``maize``, ``grassland``)" ``region``, string, —, "Production region identifier" ``country``, string, —, "ISO 3166-1 alpha-3 country code" ``production_mt``, float, Mt, "Production quantity in megatonnes" Sources include single-crop production links, grassland production, and multicropping links (where multiple crops share the same land). **land_use.parquet** — Land allocation by crop, region, resource class, and water supply .. csv-table:: :header: Column, Type, Unit, Description ``crop``, string, —, "Crop identifier" ``region``, string, —, "Production region identifier" ``resource_class``, int, —, "Land suitability class (0 = least productive, higher integers = more productive; number of classes set by ``aggregation.resource_class_quantiles``)" ``water_supply``, string, —, "Water regime (``rainfed`` or ``irrigated``)" ``country``, string, —, "ISO 3166-1 alpha-3 country code" ``area_mha``, float, Mha, "Cultivated area in million hectares" For multicropping systems, land area is attributed to individual crops proportionally by their yield (efficiency) on that land. **animal_production.parquet** — Livestock product output by product and country .. csv-table:: :header: Column, Type, Unit, Description ``product``, string, —, "Product identifier (e.g., ``dairy``, ``meat-cattle``, ``eggs``)" ``country``, string, —, "ISO 3166-1 alpha-3 country code" ``production_mt``, float, Mt, "Production quantity in megatonnes" **food_consumption.parquet** — Food consumption and macronutrients by food and country .. csv-table:: :header: Column, Type, Unit, Description ``food``, string, —, "Food identifier (e.g., ``wheat``, ``bread``, ``beef``)" ``country``, string, —, "ISO 3166-1 alpha-3 country code" ``consumption_mt``, float, Mt, "Total consumption in megatonnes" ``protein_mt``, float, Mt, "Protein content in megatonnes" ``carb_mt``, float, Mt, "Carbohydrate content in megatonnes" ``fat_mt``, float, Mt, "Fat content in megatonnes" ``cal_pj``, float, PJ, "Energy content in petajoules" ``consumption_g_per_person_day``, float, g/person/day, "Per-capita daily consumption" ``protein_g_per_person_day``, float, g/person/day, "Per-capita daily protein intake" ``carb_g_per_person_day``, float, g/person/day, "Per-capita daily carbohydrate intake" ``fat_g_per_person_day``, float, g/person/day, "Per-capita daily fat intake" ``cal_kcal_per_person_day``, float, kcal/person/day, "Per-capita daily energy intake" **food_group_consumption.parquet** — Consumption aggregated by food group and country Has the same columns as ``food_consumption.parquet``, except with ``food_group`` instead of ``food``. Food groups aggregate related foods (e.g., ``cereals``, ``fruits``, ``red_meat``) for higher-level analysis. **feed_by_source.parquet** — Animal feed consumption decomposed by supply source Each row attributes a portion of an animal-class draw from a feed-category bus back to the upstream supply on that bus, using the bus's source mix as attribution weights. All quantities are on a dry-matter basis (every feed bus in the model is uniformly DM). Trade flows between countries net out at the global level for a given feed_category and are excluded from the source list; attribution is to primary (non-trade) inflows. .. csv-table:: :header: "Column", "Type", "Unit", "Description" ``product``, str, –, "Raw animal product name (e.g., ``meat-cattle``, ``dairy``, ``eggs``)" ``animal``, str, –, "Animal-class display label (e.g., ``Cattle``, ``Sheep``)" ``feed_category``, str, –, "Raw feed category at the animal_production input (``ruminant_forage``, ``monogastric_low_quality``, etc.)" ``source_key``, str, –, "Stable internal source identifier; one of ``grassland``, ``residue``, ``fodder_crop``, ``grain_crop``, ``protein_crop``, ``food_byproduct``, ``exog_forage_cal``, ``exog_protein_cal``, ``exog_browse``, ``exog_swill``, ``exog_other``" ``source``, str, –, "Human-readable source label (e.g., ``Crop residues``, ``Exog. browse / leaves``)" ``mt_dm``, float, Mt DM, "Attributed feed mass (dry matter)" ``feed_by_category.parquet`` and ``feed_by_animal.parquet`` are coarser views (by feed category alone, or by animal alone) that drop the source breakdown. Example Usage ~~~~~~~~~~~~~ Load statistics in Python for custom analysis: .. code-block:: python import pandas as pd # Load crop production production = pd.read_parquet("results/opt/analysis/scen-default/crop_production.parquet") # Total wheat production globally wheat_total = production[production["crop"] == "wheat"]["production_mt"].sum() # Load consumption with per-capita values consumption = pd.read_parquet("results/opt/analysis/scen-default/food_consumption.parquet") # Average per-capita protein intake avg_protein = consumption["protein_g_per_person_day"].mean() GHG Intensity ------------- The GHG intensity analysis computes greenhouse gas emissions attributable to each unit of food consumed. This provides a consumption-centric view of impacts, tracing emissions through trade and processing networks back to production. **GHG intensity** measures the greenhouse gas emissions per unit of food consumed (kg CO₂e per kg food). Unlike production-based accounting, this consumption-attributed metric traces emissions through the entire supply chain: if wheat is grown in one country, milled into flour, and consumed in another, the emissions from farming, processing, and transport are all attributed to the final consumption. GHG Attribution Methodology ~~~~~~~~~~~~~~~~~~~~~~~~~~~ The GHG attribution uses a flow-based approach via sparse matrix algebra. The network of production, processing, and trade links forms a directed graph where each node (bus) receives material from upstream and passes it downstream. Emissions occur at production links (e.g., fertilizer N₂O, enteric CH₄). The key insight is that emission intensity propagates through the network: the intensity at any bus equals its direct emissions plus the weighted average of upstream intensities. This gives a linear system: .. math:: \rho = e + M \rho where :math:`\rho` is the vector of emission intensities at each bus, :math:`e` is the vector of direct emission contributions, and :math:`M` is the weighted adjacency matrix (flow fractions). Solving :math:`(I - M)\rho = e` yields the consumption-attributed intensity at each food bus. Running the GHG Extraction ~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: bash # Extract consumption-attributed GHG intensity for a scenario tools/smk -j4 --configfile config/.yaml -- \ results/{name}/analysis/scen-default/ghg_attribution.parquet Output files: ``results/{name}/analysis/scen-{scenario}/ghg_attribution.parquet`` Per-country, per-food consumption-attributed GHG intensity including: .. csv-table:: :header: Column, Type, Unit, Description ``country``, string, —, "ISO 3166-1 alpha-3 country code" ``food``, string, —, "Food identifier" ``food_group``, string, —, "Food group" ``consumption_mt``, float, Mt, "Consumption quantity" ``ghg_kgco2e_per_kg``, float, kgCO2e/kg, "GHG intensity" ``ghg_usd_per_t``, float, USD/t, "Monetized GHG damage" ``results/{name}/analysis/scen-{scenario}/ghg_attribution_totals.parquet`` Total consumption-attributed GHG emissions by country and food group: .. csv-table:: :header: Column, Type, Unit, Description ``country``, string, —, "ISO 3166-1 alpha-3 country code" ``food_group``, string, —, "Food group" ``ghg_mtco2eq``, float, MtCO2eq, "Total emissions attributed to consumption" Net Emissions ------------- The net emissions extraction reads the solved network's emission aggregation links directly, providing the absolute net GHG balance including negative emissions from spared land sequestration. .. code-block:: bash # Extract net emissions for a scenario tools/smk -j4 --configfile config/.yaml -- \ results/{name}/analysis/scen-default/net_emissions.parquet ``results/{name}/analysis/scen-{scenario}/net_emissions.parquet`` Net GHG emissions by gas and source category: .. csv-table:: :header: Column, Type, Unit, Description ``gas``, string, —, "Gas type (co2, ch4, n2o)" ``source``, string, —, "Emission source category" ``mtco2eq``, float, MtCO2eq, "Emissions in CO2 equivalents" Health Impacts -------------- The health impacts analysis computes marginal years of life lost (YLL) per unit of food consumed, based on dose-response curve derivatives at current population intake levels. **Health impact** measures the years of life lost (YLL) per unit of food consumed. This is computed as the marginal effect—the derivative of the dose-response curve at current population intake levels. Foods with protective effects (fruits, vegetables, legumes) have negative values, while foods associated with health risks (processed meat, excess red meat) have positive values. Health Attribution Methodology ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Health impacts are computed by evaluating the slope of the piecewise-linear dose-response curves at current intake levels. For each (health cluster, risk factor) pair: 1. Current per-capita intake is computed from consumption flows and population 2. The slope of the log-relative-risk curve at this intake is determined 3. The chain rule converts this to YLL per unit intake change: .. math:: \frac{d(\text{YLL})}{d(\text{intake})} = \frac{\text{YLL}_\text{base}}{\text{RR}_\text{ref}} \cdot \text{RR} \cdot \frac{d(\log \text{RR})}{d(\text{intake})} 4. Units are converted from YLL per g/capita/day to YLL per Mt food The result captures how marginal changes in consumption affect population health outcomes, accounting for where each country currently sits on the dose-response curve. Running the Health Extraction ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: bash # Extract health marginals for a scenario tools/smk -j4 --configfile config/.yaml -- \ results/{name}/analysis/scen-default/health_marginals.parquet Output files: ``results/{name}/analysis/scen-{scenario}/health_marginals.parquet`` Per-country, per-food-group marginal health impacts including: .. csv-table:: :header: Column, Type, Unit, Description ``country``, string, —, "ISO 3166-1 alpha-3 country code" ``food_group``, string, —, "Food group (risk factor)" ``yll_per_mt``, float, YLL/Mt, "Marginal years of life lost per megatonne" ``health_usd_per_t``, float, USD/t, "Monetized marginal health damage" ``results/{name}/analysis/scen-{scenario}/health_totals.parquet`` Total years of life lost by health cluster: .. csv-table:: :header: Column, Type, Unit, Description ``health_cluster``, int, —, "Health cluster identifier" ``yll_myll``, float, MYLL, "Total years of life lost in millions" ``results/{name}/analysis/scen-{scenario}/health_attribution.parquet`` YLL attributed to each risk factor by health cluster and disease cause, using proportional allocation based on excess log-relative-risk: .. csv-table:: :header: Column, Type, Unit, Description ``health_cluster``, int, —, "Health cluster identifier" ``cause``, string, —, "Disease cause (e.g. CHD, Stroke)" ``food_group``, string, —, "Risk factor / food group" ``yll_myll``, float, MYLL, "Attributed years of life lost in millions" Sample Results ~~~~~~~~~~~~~~ The following figures show consumption-weighted global averages of GHG intensity and health impacts by food group: .. _fig-analysis-ghg: .. figure:: https://github.com/Sustainable-Solutions-Lab/GLADE/releases/download/doc-figures/analysis_marginal_ghg.png :alt: Bar chart showing GHG intensity by food group :align: center :width: 80% Global average GHG intensity by food group (consumption-weighted). Animal products (red meat, dairy) show the highest emissions per kg, while plant-based foods generally have lower intensities. .. _fig-analysis-yll: .. figure:: https://github.com/Sustainable-Solutions-Lab/GLADE/releases/download/doc-figures/analysis_marginal_yll.png :alt: Bar chart showing health impact by food group :align: center :width: 80% Global average health impact by food group (consumption-weighted). Negative values indicate protective effects (fruits, vegetables, legumes, whole grains), while positive values indicate health risks. The magnitude reflects the marginal impact at current global intake levels. Generating Global Average Plots ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: bash # Generate global average plots tools/smk -j4 --configfile config/.yaml -- \ results/{name}/plots/scen-default/marginal_ghg_global.pdf \ results/{name}/plots/scen-default/marginal_yll_global.pdf Objective Breakdown ------------------- The objective breakdown analysis extracts the cost components that make up the model's objective function, grouped into high-level categories. This enables analysis of how different cost drivers contribute to the total system cost. Running the Objective Extraction ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: bash # Extract objective breakdown for a scenario tools/smk -j4 --configfile config/.yaml -- \ results/{name}/analysis/scen-default/objective_breakdown.parquet Output file: ``results/{name}/analysis/scen-{scenario}/objective_breakdown.parquet`` Single-row Parquet file with cost categories in billion USD: .. csv-table:: :header: Column, Type, Unit, Description ``crop_production``, float, bn USD, "Land use and yield-related costs" ``trade``, float, bn USD, "Import/export costs" ``fertilizer``, float, bn USD, "Synthetic fertilizer costs" ``processing``, float, bn USD, "Food processing/conversion costs" ``consumption``, float, bn USD, "Consumption-related costs" ``animal_production``, float, bn USD, "Livestock production costs" ``feed_conversion``, float, bn USD, "Feed processing costs" ``consumer_values``, float, bn USD, "Utility from food consumption (negative)" ``biomass_exports``, float, bn USD, "Revenue from biomass exports (negative)" ``biomass_routing``, float, bn USD, "Internal biomass flow costs" ``health_burden``, float, bn USD, "Health costs from YLL" ``ghg_cost``, float, bn USD, "Emissions costs" The script validates that extracted categories sum to the model's reported objective value and raises an error if they don't match (within 1% tolerance). It also raises errors for unrecognized component patterns to ensure the analysis is updated when the model structure changes. .. seealso:: :doc:`validation` A complementary analysis approach that fixes production and demand to observed values, using slack variables to reveal data inconsistencies.