# Under the hood of our provider-agnostic front office system

> And why this “invisible stack” every cloud needs goes far beyond billing 

Most cloud platforms don't start life thinking seriously about the billing or commercial experience. Anxious to get to market, teams reach for something like Stripe to solve the “billing” thing. And for good reason! Stripe's developer experience is genuinely excellent and the platform handles a bit of everything that you might need. 

But in the cloud world, so-called “billing” isn’t just a thing that needs to happen; it’s a core part of the experience. From pricing, spend management, reports, and commits to purchase orders, tax, and budgets, the things that traditional software businesses push into a back-office afterthought are _product-level experiences_ in the cloud. At the same time, hyperscale operators like AWS and GCP have set a gold standard, and enterprise customers now expect that standard from every cloud service they use.

At Datum, we think about this as moving the back office to the front office. And since our mission is to help the next wave of clouds thrive in the AI era, we’re investing meaningfully in [Milo](https://github.com/milo-os/), an open source system of record (and increasingly a system of action). The goal is a “back office” system that can enable a modern customer experience while helping operators scale and respond to the needs of their unique business.

A new part of that system is `milo-os/billing`. This post focuses on one part of it: how we built the payments layer and, specifically, how we integrated Stripe without letting Stripe _become_ our billing system. Future posts will go deeper on other components: the service catalog, price book, billing account management, metering, and more. But the payments architecture is a useful entry point, because it illustrates the design principle driving our work.

## It’s a trap!

The allure of a direct Stripe integration is strong and well-earned. A `stripe_customer_id` column on a user table here, product catalog items mirrored there, email templates referencing "your card on file," a few SDK calls scattered through your handlers. None of these choices are wrong in isolation and each one is a fairly obvious thing to do at the time.

But each one is also a thread tying your domain model to a specific provider. And when you need something different (negotiated contracts, multi-currency settlement, usage-based pricing layered with commits and discounts), untangling those threads is a substantial project that usually lands at the worst possible time: when you’re growing fast and racing to keep up with step-function jumps in the business and product.

The pattern we've seen across infra-cloud teams that adopt a simplistic “billing equals payments” mindset is that it works fine, until it doesn't. The moment you're trying to model usage-based metering alongside commit-based pricing alongside per-seat subscriptions, mixing them together inside a single (even powerful) integration becomes a Gordian knot.

## A control-plane mindset, applied to the front office

When we designed Datum's control-plane architecture, we modeled our platform APIs on Kubernetes-style CRDs to inherit a useful substrate: discovery, validation, RBAC, audit, and event-driven reconciliation. Our commercial system builds on that foundation.

Digging into the stack, there are a few key components we think every cloud service provider needs to own natively:

- **Service catalog** — what you sell and how it's structured

- **Billing account** — the financial relationship with each organization

- **Price book** — how products map to prices across customer segments

We built these as native services in [Milo](http://github.com/milo-os/), keeping the data model and financial lifecycle inputs provider-agnostic by design. On top of that foundation, we layer pluggable integrations for metering, payments, tax, and invoicing that can be swapped out as needed.

This post focuses on the payments layer, which offers a concrete illustration of how that pluggability works in practice.

## How the payments layer works

The system is built around a small set of CRDs split across two API groups.

The first, `billing.miloapis.com`, is the consumer-facing API. Every Datum customer interacts with these resources directly, through the portal or kubectl:

- **BillingAccount** — carries currency, payment terms, contact info, tax registrations, and billing address. Scoped one-per-Organization today, with a path to multi-account billing down the road.
- **BillingAccountBinding** — links each Project to the BillingAccount paying for its consumption.
- **PaymentMethod** — an abstract payment instrument with a brand, a last-four, and a `Pending / AwaitingConfirmation / Active / Failed` lifecycle. No provider-specific fields.
- **PaymentMethodClass** — the indirection point. It names a provider and points at the provider's typed configuration via `parametersRef`.
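
To make the `PaymentMethod` lifecycle concrete, here is a minimal Python sketch of the four phases as a small state machine. It is illustrative only: the field names (`class_name`, `last4`) and the transition table are our assumptions for this example, since the list above names the phases but not the allowed edges between them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(str, Enum):
    PENDING = "Pending"
    AWAITING_CONFIRMATION = "AwaitingConfirmation"
    ACTIVE = "Active"
    FAILED = "Failed"


# Assumed transition table (hypothetical): which phase may follow which.
ALLOWED = {
    Phase.PENDING: {Phase.AWAITING_CONFIRMATION, Phase.FAILED},
    Phase.AWAITING_CONFIRMATION: {Phase.ACTIVE, Phase.FAILED},
    Phase.ACTIVE: set(),
    Phase.FAILED: set(),
}


@dataclass
class PaymentMethod:
    """Provider-neutral payment instrument: no Stripe-specific fields."""

    name: str
    class_name: str                  # hypothetical: name of the PaymentMethodClass
    phase: Phase = Phase.PENDING
    brand: Optional[str] = None
    last4: Optional[str] = None

    def advance(self, target: Phase) -> None:
        # Reject any edge that is not in the assumed transition table.
        if target not in ALLOWED[self.phase]:
            raise ValueError(
                f"illegal transition {self.phase.value} -> {target.value}"
            )
        self.phase = target
```

Nothing in this shape mentions a provider, which is the property the real resource is designed to preserve.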

The second API group belongs to the payments provider itself. For Stripe, that's `stripe.billing.miloapis.com`. Customers never touch it directly. It contains:

- **StripeProviderConfig** — Stripe SDK configuration: publishable key, API version, webhook secret.
- **StripePaymentMethod** — Stripe-side state for each PaymentMethod: the SetupIntent, the client secret, the upstream `cus_…` ID. Owner-referenced by its parent PaymentMethod, so it shares the parent's lifecycle.

Another provider — Adyen, Braintree, or anyone else — plugs in by introducing its own CRD group, referenced via `PaymentMethodClass.parametersRef`. The resources in `billing.miloapis.com` never change.

## The parametersRef pattern

If you've worked with Kubernetes Gateway API, this shape will look familiar. It's the same indirection that lets a Gateway be backed by Envoy, Istio, or Cilium without the Gateway resource knowing which. Here it is applied to payments:

```yaml
apiVersion: billing.miloapis.com/v1alpha1
kind: PaymentMethodClass
metadata:
  name: stripe-default
spec:
  provider: stripe
  parametersRef:
    group: stripe.billing.miloapis.com
    kind: StripeProviderConfig
    name: stripe-production
```

A PaymentMethod references a PaymentMethodClass. The class names a provider and points at a typed configuration resource that the provider owns. Stripe's config carries a publishable key and API version. A future Adyen provider might need entirely different fields — and that's fine, because none of it leaks into the PaymentMethod schema or the customer experience.
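
The lookup a provider performs can be sketched in a few lines of Python. This is a toy model rather than Milo's code: the `CONFIGS` dictionary stands in for the API server, and the config field names and values (`publishableKey`, `apiVersion`) are placeholders we made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParametersRef:
    group: str
    kind: str
    name: str


@dataclass(frozen=True)
class PaymentMethodClass:
    name: str
    provider: str
    parameters_ref: ParametersRef


# Stand-in for the API server: typed provider configs keyed by (group, kind, name).
# Field names and values are hypothetical placeholders.
CONFIGS = {
    ("stripe.billing.miloapis.com", "StripeProviderConfig", "stripe-production"): {
        "publishableKey": "pk_live_placeholder",
        "apiVersion": "placeholder",
    },
}


def resolve_config(cls: PaymentMethodClass) -> dict:
    """Follow parametersRef to the provider-owned configuration."""
    ref = cls.parameters_ref
    key = (ref.group, ref.kind, ref.name)
    if key not in CONFIGS:
        raise LookupError(f"no {ref.kind} named {ref.name!r} in group {ref.group}")
    return CONFIGS[key]


stripe_default = PaymentMethodClass(
    name="stripe-default",
    provider="stripe",
    parameters_ref=ParametersRef(
        group="stripe.billing.miloapis.com",
        kind="StripeProviderConfig",
        name="stripe-production",
    ),
)
```

Calling `resolve_config(stripe_default)` returns the Stripe-shaped dictionary. A second provider would add its own key and its own config shape to the store without touching `PaymentMethodClass`.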

## What the flow looks like end-to-end

When Datum's Cloud Portal collects a card, it doesn't call Stripe directly. It creates a PaymentMethod referencing the org's PaymentMethodClass and waits. Here's what happens:

1. The `stripe-provider` controller (deployed separately, in its own repo) sees the new resource and creates a `StripePaymentMethod` as an owner-referenced child.
2. It mints a SetupIntent against the configured Stripe account and writes the client secret to `StripePaymentMethod.status.setupIntent`.
3. The base PaymentMethod advances to `AwaitingConfirmation`.
4. The portal reads the client secret from the Stripe child resource and hands it to Stripe Elements in the browser.
5. When the SetupIntent succeeds, Stripe POSTs a webhook to the stripe-provider's ingress — not to the portal, not to the billing service. The provider verifies the signature, projects a normalized brand/last-four to `PaymentMethod.status.details`, and advances the phase to `Active`.

The billing service never imports the Stripe SDK. The portal never calls Stripe's API directly. Both observe the same `PaymentMethod.status` and react to it, the same way they'd observe any other Datum resource transitioning.
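
The steps above can be simulated in plain Python to show who writes what. This is a sketch under stated assumptions, not the `stripe-provider` implementation: `FakeStripe`, `reconcile`, and `on_setup_intent_succeeded` are hypothetical names, the client secret is a placeholder, and the resources are modeled as simple in-memory objects.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PaymentMethod:
    name: str
    phase: str = "Pending"
    details: dict = field(default_factory=dict)


@dataclass
class StripePaymentMethod:
    owner: str                            # name of the parent PaymentMethod
    client_secret: Optional[str] = None


class FakeStripe:
    """Stand-in for the Stripe API; returns a placeholder client secret."""

    def create_setup_intent(self) -> str:
        return "seti_placeholder_secret_placeholder"


def reconcile(pm: PaymentMethod, children: dict, stripe: FakeStripe) -> StripePaymentMethod:
    """Steps 1-3: create the child, mint a SetupIntent, advance the parent."""
    if pm.name in children:
        return children[pm.name]          # already reconciled; stay idempotent
    child = StripePaymentMethod(owner=pm.name)
    child.client_secret = stripe.create_setup_intent()
    children[pm.name] = child
    pm.phase = "AwaitingConfirmation"
    return child


def on_setup_intent_succeeded(pm: PaymentMethod, brand: str, last4: str) -> None:
    """Step 5: once the webhook signature is verified, project normalized details."""
    pm.details = {"brand": brand, "last4": last4}
    pm.phase = "Active"


pm = PaymentMethod(name="card-1")
children: dict = {}
child = reconcile(pm, children, FakeStripe())
# Step 4 happens in the browser: the portal hands child.client_secret to Stripe Elements.
on_setup_intent_succeeded(pm, brand="visa", last4="4242")
```

Only `reconcile` and the webhook handler touch anything Stripe-shaped. A consumer that reads just `pm.phase` and `pm.details` would work unchanged with a different provider behind it.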

The one piece of provider-specific code the portal carries is the in-browser card form — required by PCI compliance, since card details can't touch our servers. The portal loads `@stripe/stripe-js` when the active PaymentMethodClass points at the stripe-provider, hands it the client secret, and lets Stripe render the iframe. A future provider with a different in-browser SDK plugs in the same way: keyed off the class, scoped to the card-collection step, entirely separate from how the rest of the platform reasons about a PaymentMethod.

## What comes for free

Because the billing service is just another aggregated API on Milo, we get a number of things along with it:

- **Auth and RBAC.** Org owners get billing-admin permissions through the same IAM Role and PolicyBinding resources that grant all other org permissions. No separate login, no second source of truth.
- **Activity feed.** Every BillingAccount and PaymentMethod change emits to Milo's activity stream automatically, routed into the same audit log the portal already renders for project changes.
- **Quotas.** "One BillingAccount per Organization" is a ResourceRegistration plus a ClaimCreationPolicy enforced at the admission layer — no quota-check code needed in the billing controllers.

## Tradeoffs worth knowing about

While we do get some important things “for free”, building this way isn’t without costs. Here are a few things worth flagging:

**Webhook routing.** Each provider terminates its own webhooks at its own ingress. That's intentional: only the stripe-provider should ever verify a Stripe signature. But it means every provider you ship needs its own public endpoint, its own TLS chain, and its own deploy story. We think that is a manageable cost for cloud teams, but it is real.
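
For a sense of what signature verification involves, here is a standard-library Python sketch of Stripe's documented scheme: the `Stripe-Signature` header carries a timestamp (`t=`) and one or more `v1=` signatures, each an HMAC-SHA256 of `"{timestamp}.{payload}"` keyed with the endpoint's signing secret. The function names and the secret below are ours, and a production provider would more likely call the helper in Stripe's official library than hand-roll this.

```python
import hashlib
import hmac
import time


def sign(payload: bytes, secret: str, timestamp: int) -> str:
    """Compute the v1 signature for a payload at a given timestamp."""
    signed = f"{timestamp}.".encode() + payload
    return hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()


def verify(payload: bytes, header: str, secret: str,
           tolerance: int = 300, now=None) -> bool:
    """Return True if the header holds a fresh, matching v1 signature."""
    parts = [p.split("=", 1) for p in header.split(",") if "=" in p]
    timestamps = [v for k, v in parts if k.strip() == "t"]
    candidates = [v for k, v in parts if k.strip() == "v1"]
    if not timestamps or not candidates:
        return False
    try:
        ts = int(timestamps[0])
    except ValueError:
        return False
    current = time.time() if now is None else now
    if abs(current - ts) > tolerance:
        return False                      # too old or too far ahead: possible replay
    expected = sign(payload, secret, ts).encode()
    # Constant-time comparison against every v1 candidate in the header.
    return any(hmac.compare_digest(expected, c.encode()) for c in candidates)
```

The signing secret is the only input that has to live with the provider, which is one reason to keep webhook termination inside the provider's own deployment.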

**Provider state leaks.** Stripe knows things the PaymentMethod resource will never represent: dispute history, fraud signals, granular decline reasons. You have to make a deliberate decision about what to surface through standard status fields versus what to keep in a provider-specific sub-object. The temptation to "just add another field to PaymentMethod" is always there, but we suggest you resist it, or you'll reinvent provider lock-in one field at a time.
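
One way to hold that line is to make the split explicit in code. The sketch below assumes a Stripe PaymentMethod payload with a `card` object containing `brand` and `last4`; the function name, the allow-list, and the sample values are hypothetical and only illustrate the idea of projecting a fixed set of fields.

```python
# Hypothetical allow-list: the only card fields projected onto PaymentMethod.
NORMALIZED_FIELDS = ("brand", "last4")


def split_state(stripe_payment_method: dict) -> tuple:
    """Separate normalized details from provider-only state."""
    card = stripe_payment_method.get("card") or {}
    normalized = {k: card.get(k) for k in NORMALIZED_FIELDS}
    provider_only = {k: v for k, v in card.items() if k not in NORMALIZED_FIELDS}
    return normalized, provider_only


# Placeholder payload for illustration only.
sample = {
    "id": "pm_placeholder",
    "card": {
        "brand": "visa",
        "last4": "4242",
        "fingerprint": "placeholder",
        "checks": {"cvc_check": "pass"},
    },
}
normalized, provider_only = split_state(sample)
```

Here `normalized` is what would land in `PaymentMethod.status.details`, while `provider_only` stays on the Stripe-side resource. Widening the allow-list then becomes a visible, reviewable change instead of an incidental one.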

**Multi-currency.** Today we settle payments in USD only. The currency field is in the API and providers are aware of it, but every provider implementation will eventually have to reason about which currencies it can charge in and how that interacts with tax locality and similar concerns. This is a known future complexity, not something we’ve solved yet.

## What's next

Our billing service is shipping soon to power Datum Cloud, including an [Amberflo](https://amberflo.io/) provider for metering and rating. Everything in `milo-os/billing` is open source under AGPLv3.

Payments are a piece of the front-office system, but there is more to the stack. Upcoming posts in this series will cover:

- The **service catalog**: how we model what Datum sells 
- The **price book**: handling pricing, commits, and discounts in a provider-agnostic way
- **Metering and rating** with Amberflo as an initial pluggable provider
- How **multi-account billing** and cross-project allocation will work

To keep up with our work, or jump in to help extend it, please visit the project on [GitHub](http://github.com/milo-os/billing).