Backing up and restoring
Backups
BigAnimal backs up the data in your PostgreSQL clusters. You can change the retention period on the Backups tab when you create or edit your cluster. Depending on the cloud your cluster is deployed on, BigAnimal uses either of the following cloud object storage solutions.
Cloud | Object storage solution | |
---|---|---|
AWS | Amazon S3 (standard tier) | |
Azure | Azure Blob Storage (in the "hot" access tier with geo-zone-redundant storage (GZRS)) | |
Google Cloud | Cloud Storage |
Your organization is responsible for the charges associated with the cloud object storage solution.
PostgreSQL clusters in BigAnimal are continuously backed up through a combination of base backups and transaction log (WAL) archiving. When a new cluster is created, an initial "base" backup is taken. After that, every time a WAL file is closed, which is, by default, up to every 5 minutes, it's uploaded to the cloud object storage solution. If your cluster has faraway replicas, BigAnimal copies the WAL files from your cloud object storage solution and asynchronously transfers them to the faraway replicas. Your organization is responsible for the charges associated with the cloud object storage solution.
You can change the retention period on the Additional Settings tab when you create or edit your cluster.
Replication lag with faraway replicas
With faraway replicas, the primary server writes to the archive, which is moved to the object store. The replica reads from the object store as files arrive. Prior to the archive replicating across to the replica, the log file must write at least 16MB (the WAL segment size) of data to the WAL to cut a new archive file. If the time interval for closing a WAL file is too large, it can introduce a delay until the replica receives the latest record. The amount of time that it takes to fill up the 16MB log file and copy it to the archive is the replication lag. The replication lag is the period of time when data can be lost. It can be minutes or hours, depending on the amount of activity in your database. You need to measure the replication lag to determine if it's acceptable from a possible data-loss perspective. If it isn't acceptable, consider using a distributed high-availability cluster.
To determine the replication lag, you can compare the last log sequence number (LSN) in the 16MB log file on the primary to the LSN in the log file in the replica.
Restores
If a restore is necessary—for example, in case of an accidental DROP TABLE
statement—you can restore clusters to any point in the backup retention period.
Cluster restores aren't performed in place on an existing cluster. Instead, a new cluster is created and initialized with data from the backup archive. Restores must replay the transaction logs between the most recent full database backup and the target restore point. Thus restore times (that is, RTO) depend on the write activity in the source cluster.
BigAnimal retains backups for 30 days by default. You can restore backups into a new cluster in any region supported by BigAnimal.
Perform a cluster restore
The restore operation is available for any cluster that has at least one available backup. For newly created clusters, the initial backup becomes available a few minutes after the new cluster is fully operational.
Restore a single-node or primary/standby high-availability cluster
- Select the cluster you want to restore on the Clusters page in the BigAnimal portal.
- From Quick Actions, select Restore.
- On the Restore Cluster page:
- Fill in the required fields.
- To restore to the last possible recovery point, in the Source section, in the Point in Time Restore field, select Now on the calendar. Or, to restore further back in time, choose a timestamp.
- Review your selections in Cluster Summary and select Restore Cluster to begin the restore process.
- The new cluster is now available on the Clusters page.
Restore a distributed high-availability cluster
- On the Clusters page in the BigAnimal portal, select the cluster you want to restore.
- Select Quick Actions > Restore.
- In the Cluster Settings tab, enter a cluster name and password for your restored cluster.
- Select Next: Data Group.
- Select the Node Settings tab.
- In the Source section, select Fully Restore or Point in Time Restore. A point-in-time restore restores the data group as it was at the specified date and time.
- In the Nodes section, select Two Data Nodes or Three Data Nodes. For more information on node architecture, see Distributed high availability.
- Follow Steps 3-5 in Creating a distributed high-availability cluster.
- Select Restore.