On Compute Engine we can do snapshots, which are basically backups. Could you try to figure out how we could create a script to take automated snapshots every day and keep only a certain number of recent ones?
My solution is slightly simpler: I want to snapshot all disks, not just the primary disk.
By listing all disks in the project, one script handles every server, as long as it is run from within the gcloud project (it could be modified to run from outside the project as well).
Tidying up older snapshots doesn't need complex date processing, since it can be handled from the gcloud command line using a filter.
https://gitlab.com/alan8/google-cloud-auto-snapshot
#!/bin/bash
# Loop through all disks within this project and create a snapshot of each.
# Using --format="value(name,zone.basename())" avoids parsing the human-readable table.
gcloud compute disks list --format="value(name,zone.basename())" | while read DISK_NAME ZONE; do
    gcloud compute disks snapshot "$DISK_NAME" --snapshot-names "auto-$DISK_NAME-$(date +%s)" --zone "$ZONE"
done
#
# Snapshots are incremental and don't need to be deleted; deleting a snapshot merges
# its data into the next one, so deleting doesn't lose anything.
# Having too many snapshots is unwieldy, though, so this script deletes them after 60 days.
#
gcloud compute snapshots list --filter="creationTimestamp<'$(date -d "-60 days" "+%Y-%m-%d")' AND name~'^auto-'" --uri | while read SNAPSHOT_URI; do
    gcloud compute snapshots delete --quiet "$SNAPSHOT_URI"
done
#
Also note that OS X users have to use something like
$(date -j -v-60d "+%Y-%m-%d")
for the creationTimestamp filter.
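To make the snapshots daily, a cron entry is all that's needed. A minimal sketch, assuming the script above is saved as /opt/google-cloud-auto-snapshot.sh (the path and schedule are just examples):
# Run the snapshot script every day at 04:00.
0 4 * * * /opt/google-cloud-auto-snapshot.sh >> /var/log/auto-snapshot.log 2>&1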
If nothing else, I know that [--set-scheduling]
is a situational gcloud flag, and there's a wait [process]
that will prevent the current command from executing until that process is complete. Combine that with the &&
operator (which executes the next command in a statement only after the previous one completes), and stringing this together shouldn't be too hard. Just run it at startup (when you create an instance there is a startup-script option) and have it count time, or make one of the regular maintenance functions trigger the commands. But honestly, why mix syntax if you don't have to?
This could work (don't copy/paste):
gcloud config set compute/zone wait [datetime-function] && \
gcloud compute disks snapshot snap1 snap2 snap3 \
--snapshot-names ubuntu12 ubuntu14 debian8 \
--description=\
'--format="multi(\
info:format=list always-display-title compact,\
data:format=list always-display-title compact\
)"'
In theory, gcloud will set compute/zone but wait until the specified time. Because of the double ampersand (&&), the next command will not execute until the first command completes successfully. I may have gone overboard on the description, but I did so to show the simplicity of it; I know it won't work as-is, but I also know I'm not far off. Looking at all this code, one might believe we're attempting to solve the immortality sequence. I don't think working it out in a bash file is the best way: gcloud made a command line for people who don't know the command line. We've been taught (or learned, or haven't learned yet) to write code in a way appropriate to the environment. I say we apply that here and use the Cloud SDK to our advantage.
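For what it's worth, here is a minimal runnable sketch of the &&-chaining idea, with a hypothetical disk name and zone (my-disk, us-central1-a) standing in for the placeholders above:
# Create a snapshot; only if creation succeeded, list the automated snapshots for that disk.
gcloud compute disks snapshot my-disk \
    --snapshot-names "auto-my-disk-$(date +%s)" \
    --zone us-central1-a && \
gcloud compute snapshots list --filter="name~'^auto-my-disk-'" --uri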
There's also a third-party service called VMPower.io which can automate the capture, retention and restore of snapshots for Google Cloud. It isn't free, but it will do what you're looking for without having to write any code.
In my example, I have a maintenance window that creates a snapshot for MySQL. It assumes the service account has permission to execute gcloud snapshot commands. Hope it helps:
#!/bin/bash
days_to_keep=7
# Read the disk device name, zone and project from the metadata server.
disk=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/disks/1/device-name" -H "Metadata-Flavor: Google")
zone=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google")
project=$(curl -s "http://metadata.google.internal/computeMetadata/v1/project/project-id" -H "Metadata-Flavor: Google")
zone=$(basename ${zone})
# Derive the snapshot storage region (e.g. us-central1) from the zone (e.g. us-central1-a).
storage_location=$(echo ${zone} | sed 's/-[a-z]$//')
# Stop MySQL so the snapshot captures a consistent database state.
systemctl stop mysqld
sleep 5
# flush file system buffers
sync
# create snapshot
gcloud -q compute disks snapshot ${disk} --project=${project} --snapshot-names=${disk}-$(date +%s) --zone=${zone} --storage-location=${storage_location}
systemctl start mysqld
delete_date=$(date -d "-${days_to_keep} days" "+%Y-%m-%d")
# Get the list of this disk's snapshots that are older than the cutoff.
to_del=$(gcloud compute snapshots list --filter="name~'^${disk}' AND creationTimestamp<'${delete_date}'" --format "csv[no-heading](name)")
# Delete the bulk of old snapshots, if there are any.
if [[ -n ${to_del} ]]
then
    gcloud compute snapshots delete -q ${to_del}
fi
There is now a feature called "Snapshot Schedule" available in GCP.
It still seems to be in beta, and there is not much documentation on this feature yet, but enabling it is straightforward: first you create a snapshot schedule, then you assign it to persistent disks.
See also the command-line reference for creating a schedule with the corresponding gcloud command:
gcloud beta compute resource-policies create-snapshot-schedule
https://cloud.google.com/sdk/gcloud/reference/beta/compute/resource-policies/create-snapshot-schedule
To assign the schedule to a persistent disk you can use the command
gcloud beta compute disks add-resource-policies
https://cloud.google.com/sdk/gcloud/reference/beta/compute/disks/add-resource-policies
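Putting the two commands together, a minimal sketch with hypothetical names (daily-schedule, my-disk) and an example region/zone; check the references above for the full flag list:
# Create a schedule that snapshots once a day at 04:00 UTC and keeps snapshots for 7 days.
gcloud beta compute resource-policies create-snapshot-schedule daily-schedule \
    --region=us-central1 \
    --start-time=04:00 \
    --daily-schedule \
    --max-retention-days=7
# Attach the schedule to an existing persistent disk.
gcloud beta compute disks add-resource-policies my-disk \
    --zone=us-central1-a \
    --resource-policies=daily-schedule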
Update 2019-02-15: Since yesterday there is a blog announcement about the scheduled snapshots feature and also a page in the Compute Engine documentation for scheduled snapshots.
The script assumes $HOSTNAME is the same as the disk name (my primary system disk takes the same name as the VM instance, i.e. $HOSTNAME; change to your liking). Ultimately, wherever it says $HOSTNAME, it needs to point to the system disk of your VM.
gcloud creates incremental diff snapshots, and the oldest contains the most information; you do not need to worry about creating a full snapshot. Deleting the oldest makes the new oldest snapshot the base that future incrementals build from. This is all done with Google-side logic, so it is automagic to gcloud.
We have this script set to run on a cron job every hour. It creates an incremental snapshot (about 1 to 2 GB) and deletes any that are older than the retention period. Google transparently rebases the oldest remaining snapshot (previously an incremental) to be the base snapshot. You can test this by deleting the base snapshot and refreshing the snapshot list (console.cloud.google.com); the rebase occurs in the background, so give it a minute or so. Afterwards you'll notice the oldest snapshot is the base, and its size reflects the used portion of the disk you are snapshotting.
#!/bin/bash
. ~/.bash_profile > /dev/null 2>&1 # source environment for cron jobs
retention=7 #days
zone=$(gcloud config get-value compute/zone 2>/dev/null)
date=$(date +"%Y%m%d%H%M")
expire=$(date -d "-${retention} days" +"%Y%m%d%H%M")
snapshots=$(gcloud compute snapshots list --filter="name~'^${HOSTNAME}-'" --format="value(name)")
# Delete snapshots older than $expire
for snapshot in ${snapshots}
do
    # The timestamp is the last hyphen-separated field of the snapshot name.
    snapdate=$(echo ${snapshot} | awk -F- '{print $NF}')
    if (( snapdate <= expire )); then
        gcloud compute snapshots delete ${snapshot} --quiet
    fi
done
# Create new snapshot
gcloud compute disks snapshot ${HOSTNAME} --snapshot-names ${HOSTNAME}-${date} --zone ${zone} --description "${HOSTNAME} disk snapshot ${date}"
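The cron entry that drives it looks something like this (the script path is illustrative):
# Run the snapshot script at the top of every hour.
0 * * * * /opt/gcp-snapshot.sh >> /var/log/gcp-snapshot.log 2>&1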