How-to: Cloudnative FerretDB with Automated Recovery from Continuous Backups
Why?
FerretDB is an open source compatibility layer: a MongoDB-compatible database server that stores its data in PostgreSQL. I think this is amazing, but I will save the “Why FerretDB?” talk for another blog post.
This is a technical guide that will walk you through how to get the following:
- A FerretDB deployment backed by a highly available and secured cloudnative-pg postgres cluster
- Barman-cloud plugin backups for point-in-time restores
- Automatic recovery on redeploy
All in those 3 steps. What this means for me is that I get a completely open source MongoDB-compatible database that behaves like the rest of my Kubernetes homelab: if I delete the whole cluster and redeploy it from manifests using GitOps, all my data automatically comes back from backups, without any manual intervention. This lets me play with my homelab without worrying about data restores: everything comes back up on a fresh bootstrap.
Step 0: Prerequisites
- S3-compatible object storage. Either get a bucket from a cloud provider near you, or set up something like Garage or Versity Gateway on your NAS. If you go with a cloud provider, make sure to pick one that is not too expensive per write transaction, as postgres will be writing there constantly.
- A Kubernetes cluster (I’m using Kubernetes 1.34 on Talos 1.11 at the time of writing)
- The cloudnative-pg operator installed on it; a single command should get you going (see the sketch after this list). I’m running 1.27 at the moment.
- The barman-cloud plugin, for backups of cloudnative-pg provisioned databases. Check the plugin’s installation instructions. I tested this with 0.6.0.
- Both cloudnative-pg and barman-cloud require cert-manager to be installed on the cluster
- In addition, to run postgresql databases, you need some kind of CSI provisioner for persistent volumes, preferably on local host storage. Check out democratic-csi or openebs-local
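For the operator, installation is typically a single server-side apply of the release manifest. A minimal sketch, assuming the 1.27 release line (the exact manifest URL and patch version are assumptions here, so double-check them against the cloudnative-pg documentation before running):

# Sketch: install the cloudnative-pg operator from its release manifest.
# Verify the URL and version against the official docs first.
kubectl apply --server-side -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.27/releases/cnpg-1.27.0.yaml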
With that out of the way, let’s get started.
Step 1: FerretDB backed by cloudnative-pg
I will be sharing YAML throughout this tutorial. You can copy it to your machine and then run kubectl apply -f filename.yaml to apply it to your kubernetes cluster, or put it in your GitOps repo, whatever you want.
The following YAML gives you a highly available cloudnative-pg cluster of 3 nodes, with 2 FerretDB servers in front to serve your application with MongoDB-compatible requests. Feel free to change these numbers, or add a resources block, if you prefer.
The comments explain some of the how and why.
---
# yaml-language-server: $schema=https://github.com/datreeio/CRDs-catalog/raw/refs/heads/main/postgresql.cnpg.io/cluster_v1.json
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: ferretdb-pg-cluster
  namespace: ferretdb
spec:
  # Keep this in sync with the correct image for FerretDB itself:
  # read FerretDB release notes and upgrade them together
  imageName: ghcr.io/ferretdb/postgres-documentdb:17-0.106.0-ferretdb-2.5.0
  instances: 3
  # postgres-documentdb needs these IDs
  postgresUID: 999
  postgresGID: 999
  storage:
    size: 10Gi
    # This should be the name of the storage class
    # as configured in your provisioner
    storageClass: local-hostpath
  postgresql:
    shared_preload_libraries:
      - pg_cron
      - pg_documentdb_core
      - pg_documentdb
    parameters:
      # pg_cron needs to know which database FerretDB uses
      cron.database_name: ferretDB
      # These parameters are necessary to run ferretdb without superuser access
      # Copied from https://github.com/FerretDB/documentdb/blob/ferretdb/packaging/10-preload.sh
      documentdb.enableCompact: "true"
      documentdb.enableLetAndCollationForQueryMatch: "true"
      documentdb.enableNowSystemVariable: "true"
      documentdb.enableSortbyIdPushDownToPrimaryKey: "true"
      documentdb.enableSchemaValidation: "true"
      documentdb.enableBypassDocumentValidation: "true"
      documentdb.enableUserCrud: "true"
      documentdb.maxUserLimit: "100"
    pg_hba:
      # pg_cron always runs as `postgres`
      - host ferretDB postgres localhost trust
      # This is needed to prevent fe_sendauth error
      - host ferretDB ferret localhost trust
  bootstrap:
    initdb:
      database: ferretDB
      owner: ferret
      postInitApplicationSQL:
        - create extension if not exists pg_cron;
        - create extension if not exists documentdb cascade;
        - grant documentdb_admin_role to ferret;
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ferretdb
  namespace: ferretdb
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ferretdb
  template:
    metadata:
      labels:
        app: ferretdb
    spec:
      containers:
        - name: ferretdb
          # Keep this in sync with the correct image for the postgresql cluster:
          # always read FerretDB release notes and upgrade them together
          image: ghcr.io/ferretdb/ferretdb:2.5.0
          ports:
            - containerPort: 27017
          env:
            - name: FERRETDB_POSTGRESQL_URL
              valueFrom:
                secretKeyRef:
                  # This secret gets automatically generated by cloudnative-pg
                  name: ferretdb-pg-cluster-app
                  key: uri
---
apiVersion: v1
kind: Service
metadata:
  name: ferretdb-service
  namespace: ferretdb
spec:
  selector:
    app: ferretdb
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
  type: NodePort
This is a bit longer than the basic example that was on FerretDB’s blog earlier this year, but that is because we need to run postgres without connecting as the postgres superuser.
After applying, watch the pods come up one by one:
kubectl -n ferretdb get pods -w
After startup is done, check it all works by running:
kubectl get cluster.postgresql.cnpg.io -n ferretdb
And check that it says Cluster in healthy state under STATUS, like in the example output below:
NAMESPACE NAME AGE INSTANCES READY STATUS PRIMARY
ferretdb ferretdb-pg-cluster 33h 3 3 Cluster in healthy state ferretdb-pg-cluster-1
Running without superuser access is great for security, and it is necessary for the way cloudnative-pg handles backups in our case.
Try out your FerretDB service now: just connect to the node it is running on (since we used NodePort), or, if you’re using a cloud Kubernetes cluster outside your home network, change that NodePort to ClusterIP and use kubectl port-forward. You can use any Mongo client you wish; this is what it looks like with mongosh:
mongosh 'mongodb://ferret:the-password-from-the-cloudnative-pg-secret@ip-of-the-node-it-is-running-on-or-localhost-if-port-forwarding:27017'
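The password placeholder above comes from the same auto-generated secret. Assuming the default ferretdb-pg-cluster-app secret layout (it should contain a password key alongside the uri), you can read it with:

kubectl -n ferretdb get secret ferretdb-pg-cluster-app \
  -o jsonpath='{.data.password}' | base64 -d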
Then run some Mongo commands.
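For example, a quick insert and read-back will confirm writes and reads work end to end (the database and collection names here are just made up for illustration):

use testdb
db.greetings.insertOne({ message: "hello from FerretDB on postgres" })
db.greetings.find()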
Step 2: Enable backups
We will use the barman-cloud plugin to take backups. You may read about a legacy way of taking backups with cloudnative-pg, but there are 2 reasons why we use the plugin instead: the legacy way will be removed in the next cloudnative-pg release, and it requires barman-cloud (the backup tool) to be part of the postgres image used. Our FerretDB-provided image does not have barman-cloud. Luckily, the plugin approach adds a separate barman-cloud container to each database pod, so it works regardless of which postgres image is running.
Postgres backups come in multiple flavors. Today we’ll look at the 2 that barman provides: base backups, which are like “save points” that can be restored from scratch in full, and Write-Ahead Log (WAL) archives, a log of all database transactions that can be replayed. WALs can be replayed all the way to the latest state, or up to some point in time between the earliest base backup and the latest WAL.
So the combination of base backups and WALs gives us both point-in-time restore and restore up to the latest state, with little or no data loss in case of an accidental (or intentional) cluster-wide outage. How these 2 are combined is entirely handled by the tooling. All we need to do is ensure backups are made automatically, and optionally specify a point in time to restore to.
So let’s get those backups rolling. First we connect the postgres cluster to the S3 bucket using an ObjectStore object from the barman-cloud plugin:
---
apiVersion: v1
kind: Secret
metadata:
  name: s3-creds
  namespace: ferretdb
stringData:
  ACCESS_KEY_ID: your S3 access key ID goes here
  SECRET_KEY: your S3 secret key goes here
---
# yaml-language-server: $schema=https://github.com/datreeio/CRDs-catalog/raw/refs/heads/main/barmancloud.cnpg.io/objectstore_v1.json
apiVersion: barmancloud.cnpg.io/v1
kind: ObjectStore
metadata:
  name: ferretdb-backupstore
  namespace: ferretdb
spec:
  retentionPolicy: 14d
  configuration:
    destinationPath: s3://bucketname/optionalsubfolder/ # Change this
    # Change the endpoint URL to whatever your cloud provider told you to use
    # NOTE: should be https if using a cloud bucket
    endpointURL: http://versity.storage.svc.cluster.local:7070
    s3Credentials:
      accessKeyId:
        name: s3-creds
        key: ACCESS_KEY_ID
      secretAccessKey:
        name: s3-creds
        key: SECRET_KEY
Fill in the following S3 connection data in the above YAML before applying:
- credentials:
  - access key ID
  - secret key
- bucket name (and optional subfolder) NOTE: ensure the bucket exists first
- endpoint URL
And never commit unencrypted secrets to your Git repo (the ObjectStore is safe, but the Secret should be encrypted with SOPS or filled using external-secrets if you are doing GitOps). Alternatively, .gitignore it.
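If you prefer not to keep the Secret in a YAML file at all, an alternative is to create it straight from the command line, with the same name and keys as above (the values here are placeholders):

kubectl -n ferretdb create secret generic s3-creds \
  --from-literal=ACCESS_KEY_ID='your-access-key-id' \
  --from-literal=SECRET_KEY='your-secret-key'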
Then add this plugins section to the spec of your postgresql cluster:
---
# yaml-language-server: $schema=https://github.com/datreeio/CRDs-catalog/raw/refs/heads/main/postgresql.cnpg.io/cluster_v1.json
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: ferretdb-pg-cluster
  namespace: ferretdb
spec:
  [...]
  plugins:
    - name: barman-cloud.cloudnative-pg.io
      isWALArchiver: true
      parameters:
        barmanObjectName: ferretdb-backupstore
        serverName: pg
Check that the barman-cloud pods are all started after a few seconds. You should see 2/2 under the READY column for every pod that is part of the cluster in the output of the following command:
kubectl -n ferretdb get pods
Like the ferretdb-pg-cluster pods here:
NAME READY STATUS RESTARTS AGE
ferretdb-79dccb7d5d-8kbnq 1/1 Running 0 34h
ferretdb-79dccb7d5d-wnlc9 1/1 Running 0 34h
ferretdb-pg-cluster-1 2/2 Running 0 33h
ferretdb-pg-cluster-2 2/2 Running 0 33h
ferretdb-pg-cluster-3 2/2 Running 0 33h
You can also check your S3 storage to see if the WALs are actually ending up in there.
For example, you could install rclone and connect it to your cloud storage by running rclone config. Then afterwards, you could run rclone ls name-of-your-remote: to see all the files in there. You should see files in a wals/ subdirectory of the path you configured in your ObjectStore.
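As a sketch, assuming barman-cloud’s usual destinationPath/serverName/ layout and the serverName pg we configure in the plugins section above, the listing would look something like:

# Adjust the remote name, bucket and subfolder to your own setup
rclone ls name-of-your-remote:bucketname/optionalsubfolder/pg/wals/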
So now we have the barman-cloud containers streaming the WALs to your cloud object storage. But in order to be able to restore, we also need a “save point” or base backup. I like to set this up in such a way that it is taken periodically, for example every night, and cloudnative-pg makes this very easy. No cronjob needed, just this:
---
# yaml-language-server: $schema=https://raw.githubusercontent.com/datreeio/CRDs-catalog/refs/heads/main/postgresql.cnpg.io/scheduledbackup_v1.json
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: ferretdb
  namespace: ferretdb
spec:
  backupOwnerReference: self
  cluster:
    name: ferretdb-pg-cluster # Should be the name of your postgres cluster
  schedule: "0 0 1 * * *" # 6-field cron format (seconds come first): every night at 01:00
  immediate: true # Means: take a backup right now as well
  method: plugin
  pluginConfiguration:
    name: barman-cloud.cloudnative-pg.io
Check that it is applied:
kubectl -n ferretdb get scheduledbackup
Check that it spawns a regular Backup object:
kubectl -n ferretdb get backup
And check the phase. You can add -w to watch it instead of having to keep refreshing. Eventually the PHASE should be completed.
NAMESPACE NAME AGE CLUSTER METHOD PHASE ERROR
ferretdb ferretdb-20250924091153 33h ferretdb-pg-cluster plugin completed
And you should see a recovery window in the status of your ObjectStore:
kubectl -n ferretdb describe objectstore ferretdb-backupstore
On the bottom of the output you should see something like:
Status:
Server Recovery Window:
Pg:
First Recoverability Point: 2025-09-24T09:12:01Z
Last Successful Backup Time: 2025-09-25T01:00:07Z
And that’s it. Now you could leave it running for a while, write some data, delete some data, and try out point-in-time restore using the cloudnative-pg documentation.
But for this tutorial, we will go to the initially promised final step.
Step 3: Automatic recovery
It is important that your backups are properly wired up and you have a recovery window in your ObjectStore before you start this step.
The use case is the following: imagine your whole kubernetes cluster is wiped. Maybe there’s water damage to your house, a lightning strike, or (more common in the homelab) you have messed up your cluster so seriously that it is easier to start over and re-apply all the YAML files you have been keeping track of than to actually fix the problem. And if you are using a proper GitOps solution like FluxCD, applying all the YAMLs is the same as just deploying Flux into a fresh cluster.
Let’s set up our postgres cluster configuration to automatically restore from the ObjectStore. Add the lines marked with + to the postgres cluster document (without the +), comment out the initdb section, and re-apply it:
---
# yaml-language-server: $schema=https://github.com/datreeio/CRDs-catalog/raw/refs/heads/main/postgresql.cnpg.io/cluster_v1.json
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: ferretdb-pg-cluster
  namespace: ferretdb
+ annotations:
+   # required for seamless bootstrap: https://github.com/cloudnative-pg/cloudnative-pg/issues/5778#issuecomment-2783417464
+   cnpg.io/skipEmptyWalArchiveCheck: "enabled"
spec:
  # Keep this in sync with the correct image for FerretDB itself:
  # read FerretDB release notes and upgrade them together
  imageName: ghcr.io/ferretdb/postgres-documentdb:17-0.106.0-ferretdb-2.5.0
  instances: 3
  # postgres-documentdb needs these IDs
  postgresUID: 999
  postgresGID: 999
  storage:
    size: 10Gi
    # This should be the name of the storage class as configured in your provisioner
    storageClass: local-hostpath
  postgresql:
    shared_preload_libraries:
      - pg_cron
      - pg_documentdb_core
      - pg_documentdb
    parameters:
      # pg_cron needs to know which database FerretDB uses
      cron.database_name: ferretDB
      # The parameters below are necessary to run ferretdb without superuser access
      # Copied from https://github.com/FerretDB/documentdb/blob/ferretdb/packaging/10-preload.sh
      documentdb.enableCompact: "true"
      documentdb.enableLetAndCollationForQueryMatch: "true"
      documentdb.enableNowSystemVariable: "true"
      documentdb.enableSortbyIdPushDownToPrimaryKey: "true"
      documentdb.enableSchemaValidation: "true"
      documentdb.enableBypassDocumentValidation: "true"
      documentdb.enableUserCrud: "true"
      documentdb.maxUserLimit: "100"
    pg_hba:
      # pg_cron always runs as `postgres`
      - host ferretDB postgres localhost trust
      # This is needed to prevent fe_sendauth error
      - host ferretDB ferret localhost trust
  bootstrap:
+   recovery:
+     source: &source pg
    # initdb:
    #   database: ferretDB
    #   owner: ferret
    #   postInitApplicationSQL:
    #     - create extension if not exists pg_cron;
    #     - create extension if not exists documentdb cascade;
    #     - grant documentdb_admin_role to ferret;
  plugins:
    - name: barman-cloud.cloudnative-pg.io
      isWALArchiver: true
      parameters:
        barmanObjectName: ferretdb-backupstore
        serverName: pg
+ externalClusters:
+   - name: pg
+     plugin:
+       name: barman-cloud.cloudnative-pg.io
+       parameters:
+         barmanObjectName: ferretdb-backupstore
+         serverName: pg
Don’t forget the annotation! Without it, barman-cloud will refuse to continue writing backups to the same ObjectStore and serverName after recovery. It is generally safe to use this annotation.¹
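As an aside: the recovery bootstrap above restores to the latest state in the object store. If you want the point-in-time restore mentioned in Step 2 instead, a recoveryTarget can be added under the same recovery section. A rough sketch with an example timestamp (check the cloudnative-pg recovery documentation for the full set of options):

  bootstrap:
    recovery:
      source: pg
      recoveryTarget:
        # restore up to this moment instead of replaying all WALs (example value)
        targetTime: "2025-09-24 12:00:00+00"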
Now check this has applied successfully:
kubectl get cluster.postgresql.cnpg.io -n ferretdb
kubectl describe cluster.postgresql.cnpg.io -n ferretdb
(You can also shorten these to just cluster if you have no other resources named cluster except the cnpg cluster.)
If it has, go ahead and delete your entire postgres cluster:
kubectl delete -n ferretdb cluster.postgresql.cnpg.io ferretdb-pg-cluster --force
Check that all the postgres cluster pods and PVCs are gone (you might have to kubectl delete --force some of them):
kubectl -n ferretdb get pods
kubectl -n ferretdb get pvc
If they are all gone, here comes the test: re-apply the postgresql YAML, and it should start a full recovery:
kubectl apply -f postgres-cluster.yaml
Check the pods:
kubectl -n ferretdb get pods -w
Check the logs:
kubectl -n ferretdb logs name-of-the-pod
And soon, your cluster should be back and healthy. Connect to it as in Step 1 and check that your data is restored. Do a happy dance, then go to bed and sleep soundly, knowing your data is safe, no manual intervention is required for restores, and you are using a completely open source document database solution.
If things change and you want to check how I am running FerretDB, go to github.com/fhoekstra/home-ops, press t and type ferretdb.
If you found an error, I would love to accept your PR to this blog at https://codeberg.org/fhoekstra/blog
If you need help setting this up at home, come to the Home Operations Discord.
Attribution
I did not come up with any of this myself, I just combined the ideas from some amazing people around me, tested it and wrote it down. Special thanks go to @eaglesemanation for figuring out how to run FerretDB against cloudnative-pg without superuser access. That made this setup possible.
I should also thank @tholinka and @Phycoforce from the Home-Operations community for their help with barman-cloud backups. Their home-ops repos are a great resource.
You can browse many more repos using cloudnative-pg and barman-cloud by searching kubesearch.dev to get ideas of other ways to set this up.
Footnotes
1. But if you are going to do a major upgrade (for example to PostgreSQL 18), you might want to do that using a fresh cluster with its own new serverName, instead of keeping the backups in the same place. Thanks to @MASTERBLASTER in the Home-Operations community for this addition. ↩︎