Monday, October 6, 2025

Lightning AI invites Sylow Team to Talk about AI Trust Protocols

Gianni Crivello
Sylow's CEO, Ethan Henley, at the Lightning AI HQ

To the Reader:

The following is a write-up of a talk Sylow gave at the Lightning AI Headquarters in New York on September 30, 2025. A recording can be found at https://youtu.be/8ufF0CK9I5A for your viewing!

Lightning AI HQ

Introduction: The Payment Infra Problem

We're approaching an inflection point where autonomous agents are capable of sophisticated decision-making but lack the infrastructure to act on those decisions in economic contexts. The challenge isn't really about making payments happen, since current rails move $15T annually just fine. The challenge is about trust: how do we allow autonomous systems to initiate financial transactions without either constraining them so severely they become useless, or trusting them so completely that we enable catastrophic errors?

The Payments Problem

This talk began with a deceptively simple question: what actually constitutes a payment? Strip away the complexity of card networks and ACH rails, and you find four essential requirements that define value transfer between economic actors.

  1. Identity establishes who the parties are. In traditional systems, this means SSN for individuals, EIN for businesses—stable identifiers tied to legal entities with established reputations and recourse mechanisms.
  2. Authorization determines what permissions exist to execute transactions. Today this primarily means checking that a human physically possesses a card or can authenticate to a bank account.
  3. Settlement handles the actual money movement through the payment rails.
  4. Reconciliation provides the audit trail proving what happened.

Current infrastructure excels at these requirements—4.8B credentials supporting 150M merchants moving $15T annually. The problem for agents is that every step of this architecture presumes a human is always in the decision loop, can be reached for real-time authorization, and will notice fraudulent activity within days or weeks.

The Agents Problem

Let's talk definitions, because the term “agent” is as ubiquitous as it is nebulous. For the purposes of this article, an agent is autonomous software that pursues goals through a perception → decision → action loop, which distinguishes it from traditional software that executes predetermined workflows (such as RPA). The critical difference: agents can act under uncertainty, making decisions in novel situations without explicit programming for each case.

So now the fundamental question crystallizes: how does trust scale when the decision-maker isn't directly human?

Traditional approaches fail immediately. You can't give an agent your credit card and hope for the best—the blast radius of a compromised agent is unbounded. You can't require human approval for every micro-decision—that defeats the purpose of autonomous operation. You can't rely on humans noticing fraudulent activity after the fact—agents can execute thousands of transactions in the time it takes to check your email.

The trust problem manifests in several interdependent challenges. How do we verify an agent represents its claimed principal and not a hijacked session? How do we ensure agent actions align with user intent, not hallucinated goals or adversarial prompts? How do we maintain audit trails that meaningfully capture autonomous decision sequences?

We are of the opinion that the answers require rethinking authorization from first principles.

Authorization Architectures: Why OAuth 2 Breaks

OAuth 2 dominates web authorization for good reason: it elegantly solved the problem of allowing third-party applications to access user resources without exposing credentials. But every design decision in OAuth 2 reveals assumptions about human interaction patterns. This is the same problem we ran into with payment infrastructure: humans at every step!

OAuth 2.0 flow

Specifically, OAuth 2 was made for humans behind browsers. The authorization flow assumes you can redirect a user to an identity provider, show them a consent screen, and have them click "approve" in real-time. It answers the question "who is accessing my server" with coarse-grained permissions: user, admin, paid_user, guest. The tokens are long-lived by necessity—fetching new ones requires human interaction. The model assumes the resource owner is the end user making the request.

For agentic systems, each of these assumptions becomes a constraint. Agents don't browse—they execute programmatic workflows that might span hours or days. They need fine-grained permissions that change as tasks progress, not static role assignments. They need short-lived tokens that can be refreshed automatically without breaking autonomous operation. And critically, the agent initiating a request often isn't the resource owner—a healthcare agent might release medical records to a user, or a procurement agent might purchase services on behalf of a team.

GNAP: Authorization Patterns for Autonomous Systems

GNAP (Grant Negotiation and Authorization Protocol, RFC 9635) represents a different approach built on acknowledging that authorization flows don't always involve browsers or immediate human interaction. Rather than treating non-browser clients as edge cases to be hacked into an OAuth flow, GNAP makes them first-class citizens.

The core innovation is key-bound tokens. Instead of bearer tokens that work for whoever possesses them, GNAP tokens are cryptographically bound to specific agent keys. This means stolen tokens are useless without the corresponding private key—addressing one of the fundamental security weaknesses in autonomous systems.
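To make the key-binding idea concrete, here is a minimal sketch using Node's built-in crypto module. It is not GNAP's wire format: the BoundToken and SignedRequest shapes, the token string, and the way the token and body are concatenated before signing are all assumptions made for illustration. GNAP's actual proofing methods (HTTP Message Signatures, for example) cover more of the request than this.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical shapes for illustration only; not taken from RFC 9635.
interface BoundToken {
  value: string;       // opaque token string issued by the authorization server
  boundKey: KeyObject; // public key the token was bound to at issuance
}

interface SignedRequest {
  token: string;
  body: string;
  signature: Buffer;
}

// The agent proves possession of its private key by signing token + body.
function signRequest(token: string, body: string, privateKey: KeyObject): SignedRequest {
  const signature = sign(null, Buffer.from(`${token}\n${body}`), privateKey);
  return { token, body, signature };
}

// The resource server accepts the request only if the signature verifies
// against the key the token was bound to.
function acceptRequest(req: SignedRequest, issued: BoundToken): boolean {
  if (req.token !== issued.value) return false;
  return verify(null, Buffer.from(`${req.token}\n${req.body}`), issued.boundKey, req.signature);
}

const agent = generateKeyPairSync("ed25519");
const thief = generateKeyPairSync("ed25519");
const issued: BoundToken = { value: "tok_example", boundKey: agent.publicKey };

// The legitimate agent signs with the bound key; the thief holds the token but not the key.
const legit = signRequest(issued.value, '{"action":"checkout"}', agent.privateKey);
const stolen = signRequest(issued.value, '{"action":"checkout"}', thief.privateKey);

console.log(acceptRequest(legit, issued));  // true
console.log(acceptRequest(stolen, issued)); // false
```

The sketch leaves out replay protection (nonces, timestamps), which a real deployment would need on top of the signature check.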

The protocol enables flexible authorization flows for complex delegation scenarios. When a healthcare agent needs to release medical records, GNAP can model the situation where the resource owner (hospital), the requesting agent (AI assistant), and the beneficiary (patient) are three distinct entities with different cryptographic identities. This flexibility extends to the temporal dimension: tokens are short-lived by design, with explicit validity periods that map to specific task phases.
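The healthcare scenario might translate into a grant request roughly like the sketch below. The field names (access_token, access, client, key, proof, interact, expires_in) follow a simplified subset of RFC 9635, but the access type "medical-records", the hospital.example URL, the placeholder key value, and the fail-closed expiry policy are assumptions for illustration, not something the talk specified.

```typescript
// Simplified subset of the RFC 9635 grant request and access token shapes.
interface AccessRight {
  type: string;
  actions?: string[];
  locations?: string[];
}

interface GrantRequest {
  access_token: { access: AccessRight[] };
  client: { key: { proof: string; jwk: Record<string, string> } };
  interact?: { start: string[] };
}

interface IssuedToken {
  value: string;
  expires_in?: number; // lifetime in seconds, counted from issuance
  access: AccessRight[];
}

// The agent (client) identifies itself by key, names the patient's records as the
// resource, and asks for an interaction mode that does not need a browser redirect,
// so the resource owner can approve out of band.
function buildRecordsRequest(agentJwk: Record<string, string>, patientId: string): GrantRequest {
  return {
    access_token: {
      access: [{
        type: "medical-records", // hypothetical access type
        actions: ["release"],
        locations: [`https://hospital.example/patients/${patientId}/records`],
      }],
    },
    client: { key: { proof: "httpsig", jwk: agentJwk } },
    interact: { start: ["user_code"] },
  };
}

// Policy choice of this sketch: a token with no stated lifetime is treated as unusable.
function isTokenUsable(token: IssuedToken, issuedAtMs: number, nowMs: number): boolean {
  if (token.expires_in === undefined) return false;
  return nowMs < issuedAtMs + token.expires_in * 1000;
}

const request = buildRecordsRequest(
  { kty: "OKP", crv: "Ed25519", x: "<public-key-placeholder>" },
  "patient-123",
);
const token: IssuedToken = { value: "tok_example", expires_in: 300, access: request.access_token.access };

console.log(isTokenUsable(token, 0, 299000)); // true
console.log(isTokenUsable(token, 0, 300000)); // false
```

A five-minute lifetime is only an example; the point is that the validity window is explicit and can be sized to one task phase.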

The architecture separates concerns cleanly:

GNAP Process Flow

Trust protocols (GNAP) handle identity and authorization. Payment protocols (x402, AP2) handle actual transaction execution. This separation means you can evolve authorization patterns without rebuilding payment infrastructure, and vice versa.
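One way to picture this separation in code is as two narrow interfaces that only meet at a grant object. Everything in this sketch is hypothetical: TrustLayer, PaymentRail, Grant, the spending cap, and the stub rails are illustrative names, not APIs defined by GNAP, x402, or AP2.

```typescript
// Hypothetical interfaces; none of these names come from the protocols themselves.
interface Grant { agentId: string; maxAmountCents: number }
interface Receipt { rail: string; agentId: string; amountCents: number }

interface TrustLayer {
  authorize(agentId: string, amountCents: number): Grant | null;
}

interface PaymentRail {
  name: string;
  settle(grant: Grant, amountCents: number): Receipt;
}

// The purchase path depends only on the two interfaces, so either side can be swapped.
function purchase(trust: TrustLayer, rail: PaymentRail, agentId: string, amountCents: number): Receipt {
  const grant = trust.authorize(agentId, amountCents);
  if (!grant) throw new Error("authorization denied");
  if (amountCents > grant.maxAmountCents) throw new Error("amount exceeds grant");
  return rail.settle(grant, amountCents);
}

// Stand-in trust layer with an assumed spending cap of 50000 cents.
const trust: TrustLayer = {
  authorize: (agentId, amountCents) =>
    amountCents <= 50000 ? { agentId, maxAmountCents: amountCents } : null,
};

// Stub rails standing in for real payment protocols.
function makeStubRail(name: string): PaymentRail {
  return {
    name,
    settle: (grant, amountCents) => ({ rail: name, agentId: grant.agentId, amountCents }),
  };
}

const receiptA = purchase(trust, makeStubRail("rail-a"), "agent-1", 12000);
const receiptB = purchase(trust, makeStubRail("rail-b"), "agent-1", 12000);
console.log(receiptA.rail, receiptB.rail); // rail-a rail-b
```

The same trust object authorizes both purchases while the rail changes underneath it, which is the property the separation is meant to preserve.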


MCP as the Agent-to-Service Interface

Our opinions don't stop at GNAP. We believe that the Model Context Protocol (MCP) can serve as the standardized interface layer between agents and merchant services. Rather than each merchant implementing custom agent integrations, MCP provides a common vocabulary for exposing capabilities.

The pattern is straightforward in concept but sophisticated in execution. A merchant MCP server registers tools that agents can call:

server.registerTool("addToCart", {...}, async () => ({...}));

server.registerTool("checkout", {...}, async () => ({...}));

Notice that permission boundaries map to tool boundaries! Agents can call addToCart freely—it's exploratory, reversible, and carries no direct financial risk. But checkout triggers the GNAP authorization flow, creating an explicit user consent checkpoint.

We believe this separation elegantly handles the consent problem. When a user says "Take my Pinterest inspiration, a photo of my living room, my budget, and these product links to design my living room," the agent can autonomously explore possibilities, calling addToCart dozens of times as it evaluates options. Only when the agent determines it has a complete solution and calls checkout does the user receive an authorization request.

The user then sees a consolidated purchase request—not a stream of approval prompts for each consideration. The agent maintains autonomous operation throughout the discovery phase. The financial commitment happens in a single, well-defined moment with full context visible to the user. This solution offers an elegant user experience while prioritizing safety.
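The explore-then-authorize flow can be sketched end to end with a small stand-in registry. This is not the MCP SDK: ToolRegistry, buildMerchantServer, and the requestConsent callback (which stands in for the GNAP authorization flow) are hypothetical names, and the budget check is an assumed policy.

```typescript
// Hypothetical stand-in for a merchant's tool registry; not the MCP SDK API.
type ToolResult = { status: "ok" | "denied"; detail: string };
type ToolHandler = (args: Record<string, unknown>) => ToolResult;

class ToolRegistry {
  private tools = new Map<string, ToolHandler>();

  registerTool(name: string, handler: ToolHandler): void {
    this.tools.set(name, handler);
  }

  call(name: string, args: Record<string, unknown> = {}): ToolResult {
    const handler = this.tools.get(name);
    if (!handler) throw new Error(`unknown tool: ${name}`);
    return handler(args);
  }
}

interface CartItem { sku: string; priceCents: number }

// requestConsent stands in for the authorization flow: it receives the whole
// cart at once and returns the user's decision.
function buildMerchantServer(
  requestConsent: (items: CartItem[], totalCents: number) => boolean,
): ToolRegistry {
  const registry = new ToolRegistry();
  const cart: CartItem[] = [];

  // Exploratory and reversible: no consent check.
  registry.registerTool("addToCart", (args) => {
    cart.push({ sku: String(args.sku), priceCents: Number(args.priceCents) });
    return { status: "ok", detail: `cart has ${cart.length} item(s)` };
  });

  // Financial commitment: a single consolidated consent checkpoint.
  registry.registerTool("checkout", () => {
    const total = cart.reduce((sum, item) => sum + item.priceCents, 0);
    const approved = requestConsent([...cart], total);
    return approved
      ? { status: "ok", detail: `charged ${total} cents` }
      : { status: "denied", detail: "user declined" };
  });

  return registry;
}

let prompts = 0;
const server = buildMerchantServer((_items, totalCents) => {
  prompts += 1;                // count how often the user is interrupted
  return totalCents <= 200000; // assumed budget of 200000 cents
});

server.call("addToCart", { sku: "sofa", priceCents: 120000 });
server.call("addToCart", { sku: "lamp", priceCents: 8000 });
const result = server.call("checkout");

console.log(prompts);       // 1
console.log(result.detail); // charged 128000 cents
```

However many times the agent calls addToCart, the user is prompted once, at checkout, with the full cart and total in view.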

The Current Agentic Payments Ecosystem

While we are prescriptive about how trust infrastructure should scale in agentic commerce, it is also worth looking at who is working on the payments infrastructure itself (which, for the purposes of this presentation, we have largely treated as a solved problem).

It is important to note that as of the date of publishing, no single approach has achieved clear dominance, and the best patterns likely haven't been discovered yet. Google seeks to define how these systems get implemented at scale with their AP2 specification, and OpenAI released their protocol alongside Stripe and Shopify about 2 weeks later. This competition will likely shape the agentic commerce landscape.

What's notable across these protocols is the lack of standardized trust infrastructure (i.e., what we are sounding the alarm about). Regardless of who wins, GNAP can be the layer that handles authorization and trust establishment, while the payment protocols handle actual transaction execution. This architectural separation allows each layer to evolve independently: you can swap payment rails without rebuilding authorization, and vice versa. Whether this separation persists as the ecosystem matures or whether integrated approaches emerge remains an open question, but one we are highly opinionated about.

Wrapping Up: Open Questions and Challenges

The patterns demonstrated—cryptographic identity, granular authorization with consent checkpoints, auditable transaction chains—solve the bounded case where agents explore and humans authorize. But fundamental challenges remain unresolved.

Agent-to-agent commerce presents the hardest problem. When both parties in a transaction are autonomous systems, how do you establish trust? Who bears liability when an agent is hijacked? How do you handle disputes when neither party has a human to explain what happened? The ATXP protocol is exploring this space, but solutions remain speculative.

Intent verification over time becomes problematic as agent operation extends beyond simple tasks. An agent monitoring cloud costs for weeks might identify a $10K optimization opportunity. The authorization checkpoint provides safety, but it doesn't verify the agent's reasoning is sound. You either force users to become domain experts or you trust the agent's judgment—neither scales well.

Regulatory compliance looms as an unavoidable constraint. Financial regulations assume human decision-makers with specific responsibilities. How do KYC, AML, and consumer protection rules apply when an agent initiates transactions? If an agent makes a fraudulent purchase, does the user bear full liability since they authorized the agent? These questions lack clear answers in any current solution, and it is likely that agentic payments will gain momentum in the far less regulated crypto ecosystem (x402 is best positioned, currently) before augmenting traditional payment rails.

Our conclusion is that regardless of who is the first to answer these questions, the separation of trust infrastructure from payment rails means these layers can evolve independently, and that is a good thing. GNAP-style authorization can mature while payment protocols compete and consolidate. What matters is that, like most things in this AI landscape, the big players will likely define the patterns everyone else follows. The architecture we've demonstrated provides one path. If this is one that you find interesting, we would love to have your contribution to our open-source project! https://github.com/TwigBush/TwigBush