How to Backup and Restore Gitpod

⚠️ Gitpod Self-hosted has been replaced with Gitpod Dedicated, a self-hosted, single-tenant managed service that runs in your private cloud account but is managed by us.
Try out Gitpod Dedicated.

For business continuity purposes, it is important to think about how you would restore your ability to use Gitpod, and thus develop software, in the event of a catastrophic failure of Gitpod or the underlying infrastructure it runs on. This guide assumes that you will use a backup and restore strategy for disaster recovery, and walks you through what needs to be backed up and how to restore from those backups. Please see our background reading on disaster recovery for more information.

Important: When using Gitpod in a production setting, we recommend you base your installation on the single cluster reference architecture. Using in-cluster dependencies is not recommended because they provide no means of producing backups, and the database/storage systems sit within the failure domain of the cluster. If possible, consider using Gitpod SaaS.

Note: We recommend regularly performing a trial recovery using this method to ensure that it works in practice and to give yourself the chance to spot any unforeseen issues.

What to back up

It is critical to consider what needs to be backed up and ensure you take the necessary steps to secure each of the listed elements. What needs to be backed up is closely aligned with Gitpod’s architecture and how it runs.

Database

The database is a central component of Gitpod: all metadata about users and workspaces, as well as settings of the Gitpod instance (such as auth providers), are stored there. This makes the database a critical component. In the case of a database outage, you are not able to log in, use the Gitpod dashboard, or start workspaces. We recommend using a cloud provider native relational database service that supports MySQL - see required components. This means you can rely on each service's own best practices for disaster recovery, for example the automated backup features of Amazon RDS or Google Cloud SQL.
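As a minimal sketch only (the instance name, backup window, and retention period below are placeholder assumptions, not values from this guide), automated backups for a managed MySQL instance might be enabled like this:

# AWS RDS: keep automated backups for 7 days on a hypothetical instance "gitpod-db"
aws rds modify-db-instance \
  --db-instance-identifier gitpod-db \
  --backup-retention-period 7 \
  --apply-immediately

# Google Cloud SQL: enable daily backups and binary logging (point-in-time recovery) on "gitpod-db"
gcloud sql instances patch gitpod-db \
  --backup-start-time=03:00 \
  --enable-bin-log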

Object Storage

Gitpod uses object storage to store blob data. This includes workspace backups, which are created when a workspace stops and are used to restore its state upon restart. As such, to secure the work of your users, it is critical to think about backing up this data and/or relying on the best practices for disaster recovery of the object storage service being used. For example:

  • AWS S3: You can consider using cross-region replication to increase reliability further - although S3 already stores your data across multiple geographically distant Availability Zones by default.
  • Google Cloud Storage: Consider using the Multi-Regional Storage option for additional availability.
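For illustration only (the bucket names below are placeholder assumptions), enabling S3 versioning - a common prerequisite for cross-region replication - or creating a multi-region Google Cloud Storage bucket might look like this:

# Enable versioning on the bucket holding workspace backups (bucket name is a placeholder)
aws s3api put-bucket-versioning \
  --bucket gitpod-workspace-backups \
  --versioning-configuration Status=Enabled

# Google Cloud Storage: create a multi-region bucket instead of a single-region one
gcloud storage buckets create gs://gitpod-workspace-backups --location=EU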

OCI Image Registry

Gitpod uses an image registry to cache images and to store the images it builds on behalf of users. Note: for non-airgapped environments, this is not the registry that contains the images of Gitpod’s services. Losing this data means that workspace starts may take longer because images need to be re-built. Consider implementing best practices for securing the registry you are using.
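As one possible illustration, and only if you use Amazon ECR as your registry (the registry ID and regions below are placeholders), cross-region replication of the registry could be configured like this:

# Replicate all repositories in this registry to a second region (values are placeholders)
aws ecr put-replication-configuration \
  --replication-configuration '{"rules":[{"destinations":[{"region":"eu-west-1","registryId":"111122223333"}]}]}'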

Application Config

Important: KOTS Snapshots will NOT save any data from your Gitpod database, registry, or object storage. They will also not back up any data outside of your gitpod namespace. They will back up:

  • the KOTS dashboard
  • the KOTS configuration
  • the version of Gitpod installed
  • the TLS certificate generated by cert-manager (if enabled)

Although you could simply re-install Gitpod using the regular installation path, this can take some time and you would need to re-configure it to the state it was last in. To minimize your recovery time, persist the application configuration instead (ideally regularly).
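Snapshot schedules can typically be configured in the Snapshots section of the KOTS dashboard. As an alternative illustration (the schedule name, cron expression, and retention below are assumptions for this sketch), a plain Velero schedule covering the gitpod namespace might look like this:

# Nightly Velero backup of the gitpod namespace, kept for 30 days.
# Note: this is a generic Velero schedule, not a KOTS-managed snapshot.
velero schedule create gitpod-nightly \
  --schedule="0 2 * * *" \
  --include-namespaces gitpod \
  --ttl 720h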

Configuring Velero

Velero is an open source tool to safely back up and restore, perform disaster recovery, and migrate Kubernetes cluster resources and persistent volumes. It is used by KOTS to connect to your backup location, and it supports many storage backends, including AWS, Azure, and GCP storage solutions.

Please follow the installation instructions in the Velero documentation. KOTS requires the Restic integration to function correctly, which can be enabled by appending the --use-restic flag to the velero install command.
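As an illustrative sketch only (the provider, plugin version, bucket name, and credentials file are placeholder assumptions - follow the Velero documentation for your environment), an installation against AWS S3 with Restic enabled might look like this:

# Install Velero with the AWS object storage plugin and Restic enabled
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.5.0 \
  --bucket gitpod-velero-backups \
  --backup-location-config region=eu-west-1 \
  --secret-file ./credentials-velero \
  --use-restic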

Triggering Your First Backup

For full details of the KOTS backup solution, please see the KOTS documentation.

You can create a new backup in the Snapshots section of your KOTS dashboard, or via the KOTS CLI by running the following command:

kubectl kots backup --namespace gitpod

When that has finished, you will be able to list your backups:

kubectl kots backup ls

And it will display a list that looks similar to this:

NAME              STATUS       ERRORS    WARNINGS    STARTED                          COMPLETED                        EXPIRES
instance-ab1cd    Completed    0         0           2022-08-11 13:36:38 +0100 BST    2022-08-11 13:36:54 +0100 BST    29d
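If you want more detail on a particular backup than this listing provides, and assuming you have the Velero CLI installed locally, the underlying Velero backup can be inspected directly (the backup name below is taken from the example listing above):

# Show detailed status, included resources, and any errors for the backup
velero backup describe instance-ab1cd --details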

Cluster Configuration

To reduce the time it takes you to re-create a cluster, you can move to an infrastructure-as-code workflow, e.g. by codifying the infrastructure you need using Terraform.
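A minimal sketch of such a workflow, assuming your cluster definition lives in a Terraform module (the directory name below is a placeholder):

# Re-create the cluster infrastructure from code rather than by hand
cd infrastructure/gitpod-cluster
terraform init
terraform plan -out=recovery.tfplan
terraform apply recovery.tfplan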

How to restore

The following explains how you might restore Gitpod after its underlying cluster fails.

  1. Recreate your infrastructure. Ideally, you do this using something like a Terraform script.
  2. Configure Velero using the instructions above - it is recommended that you install the same version that you used previously.
  3. List your available backups using:
     kubectl kots backup ls
  4. Restore the backup using:
     kubectl kots restore --from-backup instance-ab1cd
  5. Load the KOTS dashboard:
     kubectl kots admin-console --namespace gitpod
  6. Hit the “Redeploy” button.
  7. This should result in your Gitpod instance having the same state as before, thus allowing your users to pick up where they left off.
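As a quick sanity check after the redeploy (the gitpod namespace below matches the installation namespace used elsewhere in this guide), you can verify that the application components come back up and, assuming your KOTS CLI version provides it, list the applications KOTS manages:

# All Gitpod workloads in the gitpod namespace should eventually report Running or Completed
kubectl get pods --namespace gitpod

# List the applications managed by KOTS and their reported state
kubectl kots get apps --namespace gitpod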

Security considerations

Application Config

Velero should be deployed to a different namespace than Gitpod. The Velero deployment will contain secrets that allow access to your chosen backup destination. Your Kubernetes cluster should be configured to limit access to these resources using role-based access control (RBAC).
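As a rough sketch of what such a restriction could look like (the namespace, role name, and group are assumptions for illustration, not part of this guide), read access to the secrets in the Velero namespace might be granted only to a dedicated operations group - keeping in mind that RBAC is additive, so you also need to avoid granting broader secret access elsewhere:

# Grant read access to secrets in the velero namespace to a hypothetical "backup-admins" group only
kubectl create role velero-secret-reader \
  --namespace velero \
  --verb=get,list \
  --resource=secrets

kubectl create rolebinding velero-secret-reader-binding \
  --namespace velero \
  --role=velero-secret-reader \
  --group=backup-admins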

You should always consult the Velero documentation to ensure that you are following their best practice guidelines and preserving the integrity of your backup artifacts.
