The Brief

Fresh Earth operates at the intersection of agriculture, carbon markets, and ESG compliance. The company's IBIS platform tracks carbon sequestration and emissions across agricultural supply chains — satellite data, sensor networks, supply-chain provenance records, and the complex methodologies of voluntary carbon market standards all feeding into a single system.

IBIS-SIM extends this with simulation capabilities: given a planned land-use change or farming practice change, what is the projected carbon impact over a defined time horizon?

The Chief of AI mandate was twofold: improve the ML systems underpinning both platforms, and increase the velocity at which the team could ship new features and model updates.

What Was Delivered

Serverless ML pipeline on AWS Lambda and S3, scaling horizontally for batch agricultural data processing
Per-source ingestion streams with distinct normalization layers: satellite imagery, IoT sensors, logistics records, manual surveys
SHAP-based explainability layer producing per-prediction feature attribution for every carbon estimate
IBIS-SIM: Monte Carlo simulation engine producing probability distributions over carbon outcomes, not point estimates
Model registry with A/B testing infrastructure and automated rollback on performance degradation
Deployment time cut from 45 minutes to under 10
30% improvement in release velocity across the quarter

The Approach

Carbon accounting at scale has two hard problems.

The first is data heterogeneity: satellite imagery, IoT sensor readings, logistics records, manual survey data, and regulatory filings all need to feed into the same models using consistent methodology. Each source has its own format, quality characteristics, and failure modes.

The second is auditability: carbon credits are financial instruments, so the models behind them must produce explainable outputs that auditors can review and that hold up when challenged in a dispute.

"The auditability requirement changed every model architecture decision. If you cannot explain why the model produced a specific carbon estimate, the credit is not saleable."

The architecture centered on a serverless ML pipeline — AWS Lambda for computation, S3 for data storage, SQS for job queuing — that could scale horizontally for batch processing of large agricultural datasets while remaining cost-efficient during low-volume periods.
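As a rough illustration of the queue-driven shape described above, the sketch below shows a Lambda handler consuming a batch of SQS messages. The job format (a JSON body pointing at a file in S3) and the names `make_handler` and `process_job` are assumptions for illustration, not the platform's actual code. The real processing step is injected so the control flow can be read on its own.

```python
import json
from typing import Any, Callable

# Hypothetical job processor: takes a decoded job dict and raises on failure.
JobProcessor = Callable[[dict[str, Any]], None]


def make_handler(process_job: JobProcessor) -> Callable[[dict[str, Any], Any], dict[str, Any]]:
    """Build a Lambda handler for an SQS event source.

    Each SQS record body is assumed to be a JSON job description, for
    example {"bucket": "...", "key": "..."} pointing at a batch file in S3.
    """

    def handler(event: dict[str, Any], context: Any) -> dict[str, Any]:
        failures = []
        for record in event.get("Records", []):
            try:
                job = json.loads(record["body"])
                process_job(job)
            except Exception:
                # Report only this message as failed so SQS can redeliver it
                # without re-running the rest of the batch.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}

    return handler
```

Returning `batchItemFailures` only has an effect when partial batch responses (`ReportBatchItemFailures`) are enabled on the SQS event source mapping. Without that setting, a single failing message causes the whole batch to be retried.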

The Build

Ingestion layer. Each data source was treated as a distinct ingestion stream with its own normalization layer. Satellite imagery went through preprocessing (cloud masking, atmospheric correction, NDVI calculation) before being ingested. IoT sensor data went through anomaly detection and gap-filling. This separation of concerns made it straightforward to update individual streams without touching the core model.
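Two of the normalization steps named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the production code: the function names, the boolean cloud mask, and the choice of linear interpolation for gap-filling are all assumptions. NDVI itself is the standard ratio (NIR - Red) / (NIR + Red).

```python
import numpy as np


def ndvi(nir: np.ndarray, red: np.ndarray, cloud_mask: np.ndarray) -> np.ndarray:
    """NDVI = (NIR - Red) / (NIR + Red); cloudy or empty pixels become NaN."""
    nir = nir.astype(float)
    red = red.astype(float)
    total = nir + red
    out = np.full(nir.shape, np.nan)
    # Skip pixels flagged as cloud and pixels where the denominator is zero.
    valid = (total != 0) & ~cloud_mask.astype(bool)
    out[valid] = (nir[valid] - red[valid]) / total[valid]
    return out


def fill_gaps(timestamps: np.ndarray, values: np.ndarray) -> np.ndarray:
    """Linearly interpolate NaN sensor readings between valid neighbours.

    Assumes timestamps are sorted ascending. Gaps at either end take the
    nearest valid reading, which is how np.interp treats out-of-range points.
    """
    values = values.astype(float)
    missing = np.isnan(values)
    if missing.all():
        return values  # nothing to interpolate from
    filled = values.copy()
    filled[missing] = np.interp(
        timestamps[missing], timestamps[~missing], values[~missing]
    )
    return filled
```

Keeping each step as a pure function of arrays is one way to get the separation described above: a stream can swap its gap-filling strategy without the model code noticing.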

ML models. Primarily gradient-boosted trees for tabular supply-chain data, and CNN-based models for satellite imagery. Both were managed through a lightweight MLOps layer with versioning, A/B testing infrastructure, and rollback capabilities built early and used often.
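The versioning, A/B routing, and rollback behaviour can be illustrated with a toy in-memory registry. Everything here is hypothetical: the class name, the hash-based traffic split, and the 5% error tolerance are assumptions chosen to show the mechanics, and the source does not describe how the real registry was implemented.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelRegistry:
    """Toy registry: one live version plus an optional challenger."""

    live: str
    history: List[str] = field(default_factory=list)
    challenger: Optional[str] = None
    challenger_share: float = 0.0  # fraction of traffic sent to the challenger

    def route(self, request_id: str) -> str:
        """Deterministically assign a request to the live or challenger model."""
        if self.challenger is None:
            return self.live
        digest = hashlib.sha256(request_id.encode()).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
        return self.challenger if bucket < self.challenger_share else self.live

    def promote(self) -> None:
        """Make the challenger live and remember the version it replaced."""
        if self.challenger is None:
            raise ValueError("no challenger to promote")
        self.history.append(self.live)
        self.live, self.challenger, self.challenger_share = self.challenger, None, 0.0

    def check_and_rollback(
        self, live_error: float, baseline_error: float, tolerance: float = 0.05
    ) -> bool:
        """Restore the previous version if error degrades beyond the tolerance."""
        degraded = live_error > baseline_error * (1 + tolerance)
        if degraded and self.history:
            self.live = self.history.pop()
            return True
        return False
```

Hashing the request identifier, rather than drawing a random number per request, means the same field or shipment always hits the same model version during a test, which keeps A/B comparisons reproducible.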

Explainability layer. SHAP values to produce per-prediction feature attribution. For a given carbon estimate, the system produces a ranked list of factors that drove the estimate and their relative contributions. Not a nice-to-have — a requirement for the auditor-facing report generation.
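To make "per-prediction feature attribution" concrete, the sketch below computes exact Shapley values for a single prediction by enumerating feature coalitions, then ranks the features by contribution. It is a teaching version under stated assumptions: absent features are replaced by a single baseline input, the cost is exponential in the number of features, and the feature names in the usage note are invented. The source does not say which SHAP explainer the production system used, and in practice a library implementation would replace this brute-force loop.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Tuple


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values for one prediction, relative to a baseline input."""
    n = len(x)

    def value(coalition: frozenset) -> float:
        # Features outside the coalition take their baseline value.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def ranked_attribution(
    names: Sequence[str], phi: Sequence[float]
) -> List[Tuple[str, float]]:
    """Rank features by absolute contribution, largest first."""
    return sorted(zip(names, phi), key=lambda pair: abs(pair[1]), reverse=True)
```

For a hypothetical estimate with features such as `soil_organic_carbon`, `tillage_intensity`, and `cover_crop_days`, `ranked_attribution` returns the ordered list an auditor-facing report would present. The contributions sum to the difference between the prediction and the baseline prediction, which is the property that makes the breakdown checkable.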

IBIS-SIM. Monte Carlo simulation: the carbon model runs with sampled parameter distributions to produce probability distributions over outcomes rather than point estimates. This is more honest about the inherent uncertainty in carbon projection, and more useful for the land-use planning decisions IBIS-SIM informs.

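from __future__ import annotations  # LandUseScenario and SimulationResult are defined elsewhere

import numpy as np


class CarbonSimulator:
    def simulate(
        self,
        baseline: LandUseScenario,
        intervention: LandUseScenario,
        n_samples: int = 10_000,
        horizon_years: int = 10,
    ) -> SimulationResult:
        # Each run draws one set of model parameters and returns a single
        # projected carbon outcome for the intervention versus the baseline.
        samples = [
            self._run_single(baseline, intervention, horizon_years)
            for _ in range(n_samples)
        ]
        # Report the spread of outcomes rather than a single point estimate.
        p10, p50, p90 = np.percentile(samples, [10, 50, 90])
        return SimulationResult(p10=p10, p50=p50, p90=p90)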
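The excerpt above leaves `_run_single` undefined. The standalone sketch below shows one way the sampling step could look, vectorized with NumPy. It is an assumption-laden illustration rather than the IBIS-SIM model: the normally distributed sequestration rates, the shared standard deviation, the linear accumulation over the horizon, and the function name `simulate_net_carbon` are all invented for the example.

```python
import numpy as np


def simulate_net_carbon(
    baseline_rate: float,
    intervention_rate: float,
    rate_sd: float,
    area_ha: float,
    horizon_years: int = 10,
    n_samples: int = 10_000,
    seed: int = 0,
) -> dict[str, float]:
    """Monte Carlo estimate of additional carbon stored versus the baseline.

    Rates are in tonnes CO2e per hectare per year. Each sample draws both
    rates from normal distributions (an illustrative assumption) and
    accumulates the difference linearly over the horizon.
    """
    rng = np.random.default_rng(seed)  # fixed seed keeps runs reproducible
    base = rng.normal(baseline_rate, rate_sd, size=n_samples)
    interv = rng.normal(intervention_rate, rate_sd, size=n_samples)
    net = (interv - base) * area_ha * horizon_years
    p10, p50, p90 = np.percentile(net, [10, 50, 90])
    return {"p10": float(p10), "p50": float(p50), "p90": float(p90)}
```

Seeding the generator matters in an audited setting: a reviewer who reruns the simulation with the same inputs and seed should get the same percentiles.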

The Outcome

The 30% release-velocity improvement came from three sources: automated testing infrastructure that caught regressions before production, a model registry that made promoting a new model version straightforward without manual coordination, and refactored deployment pipelines that cut deploy time from 45 minutes to under 10.

The IBIS platform is processing millions of data points per month. The explainability reports are used in carbon credit audits. IBIS-SIM is in active use for land-use planning decisions.

Lessons

In ML systems that feed financial instruments, the model is not the product. The audit trail is the product. Every design decision should ask: can an auditor who knows nothing about our codebase understand why the model made this prediction?

Serverless ML has a cold-start problem that matters at the wrong times. For Fresh Earth's use case — batch processing with some latency tolerance — this was acceptable. For real-time applications where a user is waiting, provisioned concurrency or a different architecture is required.

Probability distributions are almost always more honest and more useful than point estimates. The additional complexity is worth it. Presenting a carbon estimate as a range (p10: 450 tonnes, p50: 520 tonnes, p90: 610 tonnes) is more defensible than presenting 520 tonnes as a fact.
