
Archiving Bitbucket Server repositories to S3 or GCS
A quick script example using “git bundle” to help archive git repositories to AWS Glacier or GCS Coldline storage.
Bitbucket Server is organised into projects, each of which contains multiple repositories. Let’s take the example of an organisation migrating tens of thousands of repositories away from its hosted Bitbucket Server setup.
Not all of these repositories were being migrated. Tens, if not hundreds, of the projects were very old, and some teams wanted to ‘cold archive’ them rather than move them to the new Git hosting — with the choice being either S3 Glacier or GCP Coldline storage.
Note that this article is aimed at end-users rather than the administrators of a Bitbucket Server, who have access to other tools for full or partial backups.
A quick introduction to the relevant concepts of the Bitbucket Server API:
- Authorisation requires a token, passed to the API via the `Authorization: Bearer TOKEN` header.
- Given a project key `PROJECT`, the API offers an endpoint `rest/api/1.0/projects/PROJECT/repos` to get its repositories.
- The API pages its results, with a `limit=NNN` query parameter to control how many records are returned per page.
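Before automating anything, it helps to see what the API actually returns. Below is a trimmed sample of the paged JSON shape (the slugs are invented, and real responses carry more metadata per repository) along with the `jq` expression used later to extract the repository slugs:

```shell
# A trimmed, illustrative sample of the JSON returned by
# rest/api/1.0/projects/PROJECT/repos (real responses have more fields).
response='{
  "size": 2,
  "isLastPage": true,
  "values": [
    { "slug": "billing-service", "name": "Billing Service" },
    { "slug": "legacy-ui", "name": "Legacy UI" }
  ]
}'

# Extract just the repository slugs, one per line:
printf '%s\n' "${response}" | jq -r '.values[].slug'
# billing-service
# legacy-ui
```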
The pseudo-logic we are after is:
- Obtain the list of repositories for the project
- Iterate over each one, clone it, and use `git bundle` to create a backup
- Push all backups to GCP (or AWS)
Here’s the frame of a trivial shell script that summarises how to use the API to automate this task (requires `jq` to be installed to parse the JSON):

```shell
#!/bin/sh
TOKEN="XXXXX"
SERVER="https://my-bitbucket-server"
project="$1"
bucket="$2"
workdir=$(pwd)

# List the project's repository slugs (limit=999 assumes fewer than 999 repos)
repositories=$(curl -s -H "Authorization: Bearer ${TOKEN}" \
    "${SERVER}/rest/api/1.0/projects/${project}/repos?limit=999" \
    | jq -r '.values[].slug')

for repository in ${repositories}
do
    dir="${workdir}/${project}-${repository}"
    git clone "${SERVER}/scm/${project}/${repository}.git" "${dir}"
    cd "${dir}"
    git bundle create "${dir}.bundle" --all
    cd "${workdir}"
done

gsutil -m cp *.bundle "${bucket}"
```

Run it as `script.sh "PROJECT" "gs://your-target-bucket"`.
From here you could enhance the script to push bundles individually, tidy up local disk space as you go, add paging, and so on. You could take a seed file of repository URLs to be more selective. Or you could have it create a new Git repository for each (for example in GitHub, using the GitHub API) and `git push --mirror` the repository into its new target location.
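On the paging point: the single `limit=999` call above fetches only one page. Bitbucket Server's paged responses report `isLastPage` and `nextPageStart`, so a paging-aware listing can be sketched like this (the function name is mine, and error handling is omitted):

```shell
# Sketch: walk every page of /rest/api/1.0/projects/PROJECT/repos,
# printing one repository slug per line.
list_all_repos() {
    # $1 = server base URL, $2 = project key, $3 = bearer token
    start=0
    while : ; do
        page=$(curl -s -H "Authorization: Bearer $3" \
            "$1/rest/api/1.0/projects/$2/repos?limit=100&start=${start}")
        printf '%s\n' "${page}" | jq -r '.values[].slug'
        # The server flags the final page; otherwise it says where to resume.
        if [ "$(printf '%s\n' "${page}" | jq -r '.isLastPage')" = "true" ]; then
            break
        fi
        start=$(printf '%s\n' "${page}" | jq -r '.nextPageStart')
    done
}

# Usage: list_all_repos "https://my-bitbucket-server" "PROJECT" "${TOKEN}"
```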
A note from the author
Thank you for reading this article — I hope you found it useful, and please do follow me for other articles on DevOps, CI/CD, and Cloud.