---
title: "Owning the pipe: physical replication, cloud neutrality, and the escape from DBaaS lock-in"
date: 2026-04-14T10:32:59+10:00
description: "Why the physical replication stream is the key primitive that DBaaS providers deliberately withhold — and how a cloud-neutral stack built on PostgreSQL, Kubernetes, and CloudNativePG gives it back to you."
tags: ["postgresql", "postgres", "kubernetes", "k8s", "cloudnativepg", "cnpg", "dok", "data on kubernetes", "dbaas", "sovereignty", "wal", "physical-replication", "open-source", "cncf"]
cover: cover.jpg
thumb: thumb.jpg
draft: false
---

_This article examines how managed database services deliberately suppress
access to the physical replication stream, turning operational convenience into
permanent lock-in. It makes the case for a cloud-neutral stack — PostgreSQL,
Kubernetes, and CloudNativePG — as the only architecture that returns full
operational sovereignty to the organisation that owns the data._

<!--more-->

---

Over the past decade, Kubernetes has done something remarkable: it turned
infrastructure into a portable abstraction. Compute workloads can now move
between any cloud, any data centre, and any bare-metal cluster without
rewriting a line of application code. The underlying hardware has been
effectively commoditised.

The database has not.

While every other layer of the stack has been liberated, the data layer
remains tethered to whoever operates it. PostgreSQL sits at the centre of this
story. As the world's most deployed open-source relational database, it is
also the engine most targeted by hyperscaler DBaaS offerings — and the one
whose most powerful primitive is most deliberately withheld: the WAL stream,
PostgreSQL's physical replication mechanism.

## The Day 2 reality of managed databases

The appeal of Database-as-a-Service is real. On Day 1, you click a button and a
production-grade PostgreSQL cluster appears. No storage provisioning, no
replication configuration, no backup policy to write. It is genuinely
impressive, and it is easy to understand why organisations reach for it.

Day 2 is where the architecture reveals itself.

High availability, disaster recovery, point-in-time recovery, performance
tuning, major version upgrades — all of this is managed through a proprietary
control plane that your team does not own, cannot inspect, and cannot export.
The operational intelligence that should live in your platform, expressed as
code, reviewed by your engineers, and versioned in your repositories, is instead
locked inside a hyperscaler's console.

This is not merely an inconvenience. When you need to respond to a compliance
requirement, a regulatory change, or a geopolitical shift that demands you move
workloads to a different jurisdiction or cloud, you discover that the
operational steering wheel is not in your hands. The muscle memory required to
operate your database at scale was never yours to begin with.

## The physical replication gap

The most consequential thing a managed database provider withholds is access to
the WAL stream — the physical replication stream that is the beating heart of
PostgreSQL.

Physical replication is what makes it possible to maintain a byte-for-byte
replica of a primary instance in real time. It underpins streaming WAL to
object storage for backup and point-in-time recovery, live standby clusters
across regions, and the kind of frictionless, ongoing portability that makes
cloud neutrality operational rather than aspirational.

The distinction between PostgreSQL's logical tools matters here. Logical
backup and restore — pg_dump and pg_restore — require a maintenance window
proportional to dataset size, which makes them impractical for large
production databases. Logical replication is a different matter entirely:
operating continuously at the level of decoded changes, it is well suited to a
controlled, one-time migration out of a managed service and is the foundation
of blue-green major version upgrades. It is, in fact, the exact mechanism
described in the migration section later in this article. But logical
replication is not designed for permanent, ongoing portability: it does not
replicate DDL, sequences, or large objects, and it cannot sustain the
continuous multi-cluster replication that operational sovereignty requires
over the long term.

That sustained capability requires the WAL stream. And managed database
providers deliberately do not expose it. This is not an oversight — it is the
architecture of lock-in. Once your data reaches the scale where ongoing
physical replication matters, and that stream is withheld, the cost of leaving
grows faster than the cost of staying. The provider knows it.

## The cloud-neutral resolution

The solution is not to avoid the cloud. It is to refuse the false choice between
cloud convenience and operational control.

A cloud-neutral PostgreSQL architecture, built on open-source components, gives
you both. The stack is straightforward:

- **Compute:** Kubernetes — the software-defined, portable infrastructure layer
  that runs identically on any cloud or bare-metal environment.
- **Operator:** CloudNativePG — the open-source Kubernetes operator that
  codifies all Day 2 operational tasks declaratively.
- **Engine:** Standard PostgreSQL — unmodified, fully open, with no proprietary
  extensions or behavioural divergence.

What makes this stack significant is not any individual component, but the fact
that the entire configuration lives in your version control system as Kubernetes
manifests. High availability topology, backup schedules, retention policies,
replication configuration, resource limits — all of it is declarative, auditable
and portable. It moves with you.

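To make this concrete, here is a sketch of what such a manifest can look like:
a minimal CloudNativePG `Cluster` with three instances and continuous WAL
archiving to object storage. All names, sizes, bucket paths, and secrets are
illustrative placeholders, and recent CloudNativePG releases are moving object
storage backups towards the Barman Cloud plugin — so treat the in-tree
`barmanObjectStore` stanza below as one possible shape, not the only one.

```yaml
# Illustrative sketch: a three-instance HA cluster with continuous WAL
# archiving to object storage and a 30-day recovery window.
# Resource names, bucket, and credentials are placeholders.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: app-db
spec:
  instances: 3                 # one primary, two streaming replicas
  storage:
    size: 100Gi
  backup:
    retentionPolicy: "30d"     # point-in-time recovery window
    barmanObjectStore:
      destinationPath: s3://example-backups/app-db
      s3Credentials:
        accessKeyId:
          name: backup-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-creds
          key: SECRET_ACCESS_KEY
```

Everything the operator needs — topology, storage, backup policy — lives in
this one reviewable, versionable document, and the same manifest applies
unchanged on any conformant Kubernetes cluster.
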
I explored the broader implications of this approach in a
[post on the CNCF blog](https://www.cncf.io/blog/2024/11/20/cloud-neutral-postgres-databases-with-kubernetes-and-cloudnativepg/),
if you want to go deeper on the cloud-neutrality angle.

## What CloudNativePG actually delivers on Day 2

[CloudNativePG](https://cloudnative-pg.io) was purpose-built for the Day 2
problem. As a CNCF Sandbox project — the first relational database operator to
enter the CNCF since 2018 and the first ever for PostgreSQL — it automates the
full lifecycle of a PostgreSQL cluster on Kubernetes: automated failover,
synchronous replication, point-in-time recovery, rolling updates, major version
upgrades, and more.

Crucially, because CNPG manages standard PostgreSQL with full access to the
engine internals, the physical replication stream is yours. You own the pipe.

You can stream your WAL to object storage for backup and PITR. You can maintain
a physical standby in a separate Kubernetes cluster — in a different region or
a different cloud entirely — using CloudNativePG's
[distributed topology for replica clusters](https://cloudnative-pg.io/docs/current/replica_cluster#distributed-topology).
You can migrate your entire dataset to a new environment by promoting that
standby — with downtime measured in seconds, not hours.

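A sketch of the replica side, assuming a source cluster named `app-db` that is
reachable from the second Kubernetes cluster. Hostnames and secret names are
placeholders, and this uses the classic replica cluster form — the linked
documentation covers the fuller distributed topology variant.

```yaml
# Illustrative sketch: a continuously streaming physical replica of a
# primary running in another Kubernetes cluster, region, or cloud.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: app-db-dr
spec:
  instances: 3
  storage:
    size: 100Gi
  bootstrap:
    pg_basebackup:
      source: app-db           # clone from the primary, then keep streaming
  replica:
    enabled: true
    source: app-db             # stay in continuous recovery from this source
  externalClusters:
    - name: app-db
      connectionParameters:
        host: app-db.primary.example.com
        user: streaming_replica
        dbname: postgres
      password:
        name: app-db-replication
        key: password
```

Promotion is then a declarative change to the `replica` stanza rather than a
manual re-seeding of data — which is what keeps the downtime of a cutover in
the range of seconds.
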
This is the capability that managed services deliberately withhold, and it is
the capability that makes portability permanent rather than theoretical.
## Observability as a first-class concern

Sovereignty over data and compute is necessary but not sufficient. If your
metrics, logs, and traces are trapped in a proprietary cloud console, you lose
operational visibility the moment you move.

CloudNativePG integrates natively with the CNCF observability stack. It
produces [structured JSON logs directly to stdout](https://cloudnative-pg.io/docs/current/logging),
making them immediately consumable by any log aggregation pipeline. It exposes
a rich set of [PostgreSQL metrics via a native Prometheus endpoint](https://cloudnative-pg.io/docs/current/monitoring),
and it supports OpenTelemetry for distributed tracing.

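As a small illustration, wiring those metrics into a Prometheus Operator setup
is a single declarative switch on the cluster — the name and sizes here are
placeholders:

```yaml
# Illustrative sketch: ask CNPG to create a PodMonitor so an existing
# Prometheus Operator installation scrapes every instance automatically.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: app-db
spec:
  instances: 3
  storage:
    size: 100Gi
  monitoring:
    enablePodMonitor: true     # CNPG creates the PodMonitor for you
```

Each instance then serves its metrics endpoint directly, with no
vendor-specific agent in the path.
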
Your "eyes and ears" are as portable as your data. There is no
proprietary dashboard you must replicate or vendor-specific agent you must
re-instrument when you change cloud providers.

## Migrating without a maintenance window

For organisations currently running on a managed database service, the migration
path follows a clear sequence.

First, build a parallel environment. Use
[logical replication](https://cloudnative-pg.io/docs/current/logical_replication)
to synchronise your data from the managed service into a CNPG-managed cluster.
This phase can run indefinitely alongside production — it is low-risk,
reversible, and gives your team the operational experience of running the new
platform under real load before it matters.

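In plain SQL terms, the mechanics look roughly like this — connection details,
object names, and the user are placeholders, and the CloudNativePG
documentation linked above covers the declarative equivalents:

```sql
-- On the source database (the managed service), with a user that has
-- the replication privileges the provider allows:
CREATE PUBLICATION migration_pub FOR ALL TABLES;

-- On the CNPG-managed target, after restoring the schema (logical
-- replication does not carry DDL), subscribe to the source:
CREATE SUBSCRIPTION migration_sub
  CONNECTION 'host=source.example.com port=5432 dbname=app user=migrator password=secret'
  PUBLICATION migration_pub;
```

The initial table synchronisation copies the existing rows; from then on,
changes stream continuously until you decide to cut over.
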
Second, perform the cutover. Because the data is continuously synchronised,
the cutover is a controlled pivot rather than a disruptive migration. Downtime
is a function of the replication lag at the moment you flip, not of dataset
size.

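One way to watch that lag from the source side, since logical replication
walsenders appear alongside physical ones in `pg_stat_replication` (exact
column availability varies slightly by PostgreSQL version):

```sql
-- Approximate replication lag in bytes for each connected consumer:
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes
FROM pg_stat_replication;
```

When the lag is consistently near zero, the flip itself is quick: pause writes
on the source, let the remaining delta drain, and redirect applications to the
new cluster.
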
Third, maintain permanent portability. Once you are within the CloudNativePG
ecosystem and running standard PostgreSQL with full WAL access, you can replicate
your cluster anywhere — different cloud, different region, bare metal — using
native physical replication. The investment in moving is a one-time cost. The
freedom it buys is permanent.

The financial services sector illustrates this well. At KubeCon Amsterdam,
Laurent Parodi and I gave a
[talk](https://www.youtube.com/watch?v=m0LBKjlxrog) in which he walked through
how HSBC approached this migration, navigating the intersection of strict
regulatory requirements and the operational scale you would expect from one of
the world's largest financial institutions. It is one of the more instructive
real-world examples of this architecture in a heavily regulated environment.

## Staying in the cloud, leaving the DBaaS

For many organisations, the most immediate path forward does not require moving
away from the cloud at all. If your applications already run on a
hyperscaler-managed Kubernetes service — Amazon EKS, Azure Kubernetes Service,
Google GKE — you are already closer to the solution than you might think.

The logical first step is not to migrate to a different provider or to bare
metal. It is to move the PostgreSQL database from the hyperscaler's DBaaS
offering — Amazon RDS, Azure Database for PostgreSQL Flexible Server — into the
Kubernetes cluster you already operate, colocated with the applications that
connect to it. CloudNativePG runs identically on EKS, AKS, or any other
conformant Kubernetes distribution. Your application manifests do not change.
Your network topology typically improves, since the database now lives inside
the same cluster rather than behind a managed service endpoint.

The outcome is immediate and compounding: you recover the operational
intelligence currently locked inside RDS or Flexible Server, you eliminate the
DBaaS premium from your cloud bill, and — crucially — you regain access to the
WAL stream. From that point, replicating to a second region, streaming WAL to
object storage, or moving to a different environment entirely are all decisions
you make on your own terms, at a time of your choosing.

For a step-by-step walkthrough of this migration — covering Amazon RDS, Azure
Database for PostgreSQL Flexible Server, and Google Cloud SQL as source systems
— I wrote
[CloudNativePG Recipe 5]({{< relref "../20240327-zero-cutover-migrations/index.md" >}}),
which covers the full logical replication setup for a near-zero-downtime
cutover into Kubernetes. Some operational details will have evolved with newer
releases, but the approach and the underlying mechanics remain sound.

If you are running on Azure AKS specifically, this [walkthrough on deploying
CloudNativePG on AKS](https://www.youtube.com/watch?v=KEApG5twaA4) is a good
companion. The same logic applies across all hyperscaler Kubernetes offerings:
today, the cloud is not the problem. The DBaaS is.

## Compliance is now a pull force

For organisations operating under the EU Data Act or preparing for the Cyber
Resilience Act, operational sovereignty is no longer purely an architectural
preference — it is a compliance requirement. Both frameworks demand demonstrable
data portability and the ability to move critical workloads between providers or
onto private infrastructure.

A cloud-neutral architecture built on open standards is the most direct path to
satisfying these requirements, and the architecture described here is precisely
what auditors and regulators mean when they ask for evidence of portability. It
is also the architecture that gives you the operational capability to actually
execute a migration under time pressure, rather than just asserting in a
compliance document that you could.

## The bottom line

DBaaS lock-in is not inevitable. It is the product of a specific architectural
choice — handing Day 2 operational responsibility to a managed service that
withholds the one primitive that makes portability possible at scale.

The alternative is not to build everything yourself. CloudNativePG handles the
hard operational problems. Kubernetes handles infrastructure portability.
Standard PostgreSQL handles your data, with no proprietary divergence. The
stack is mature, production-proven, and already running mission-critical
workloads at organisations including IBM, Google Cloud, Microsoft Azure, HSBC,
Tesla, GEICO Tech, and Novo Nordisk. The
[full adopters list](https://github.com/cloudnative-pg/cloudnative-pg/blob/main/ADOPTERS.md)
is publicly maintained and growing.

Owning the pipe — keeping access to the physical replication stream — is the
difference between a database that can follow your organisation wherever it
needs to go, and one that cannot.

That distinction is worth building for.

---

If you are interested in the practicalities of running this stack in production,
I encourage you to explore the [CloudNativePG documentation](https://cloudnative-pg.io/docs/)
and [get in touch with the community](https://github.com/cloudnative-pg#getting-in-touch).
The project is open, governed transparently under the CNCF, and built to remain
so.

The themes in this article also formed the basis of a talk I gave with Floor
Drees at [Open Sovereign Cloud Day, KubeCon EU 2026](https://colocatedeventseu2026.sched.com/event/2H5Uc/beyond-the-dbaas-trap-achieving-data-sovereignty-with-kubernetes-and-cloudnativepg-floor-drees-gabriele-bartolini-edb)
— titled "Beyond the DBaaS Trap: Achieving Data Sovereignty with Kubernetes
and CloudNativePG". If you prefer the spoken version, that is a good
companion to this article.

---

Stay tuned for the upcoming recipes! For the latest updates, consider
subscribing to my [LinkedIn](https://www.linkedin.com/in/gbartolini/) and
[Twitter](https://twitter.com/_GBartolini_) channels.

If you found this article informative, feel free to share it within your
network on social media using the provided links below. Your support is
immensely appreciated!

_This article was drafted and refined with the assistance of Claude (Anthropic).
All technical content, corrections and editorial direction are the author's own._