Automated CI/CD Data Snapshots
from rhymepurple@lemmy.ml to selfhosted@lemmy.world on 13 Jun 03:11
https://lemmy.ml/post/16814669
cross-posted from: lemmy.ml/post/16693054
Is there a feature in a CI/CD pipeline that creates a snapshot or backup of a service’s data prior to running a deployment? The steps of an ideal workflow that I am searching for are similar to:
- CI tool identifies new version of service and creates a pull request
- Manually merge pull request
- CD tool identifies changes to Git repo
- CD tool creates data snapshot and/or data backup
- CD tool deploys update
- Issue with deployment identified that requires rollback
- Git repo reverted to prior commit and/or Git repo manually modified to prior version of service
- CD tool identifies the rolled back version
- (OPTIONAL) CD tool creates data snapshot and/or data backup
- CD tool reverts to snapshot taken prior to upgrade
- CD tool deploys service to prior version per the Git repo
- (OPTIONAL) CD tool prunes data snapshot and/or data backup based on provided parameters (eg - delete snapshots after _ days, only keep 3 most recently deployed snapshots, only keep snapshots for major version releases, only keep one snapshot for each latest major, minor, and patch version, etc.)
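The pruning step at the end of the list can be sketched independently of any particular CD tool. The snippet below is only an illustration of two of the example policies (delete after a number of days, always keep the most recent few). The `Snapshot` record, the `select_for_pruning` function, and the default values are all hypothetical names chosen for this sketch, not part of any existing tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Snapshot:
    name: str          # hypothetical snapshot identifier
    version: str       # service version that was deployed when it was taken
    created: datetime  # when the snapshot was created


def select_for_pruning(snapshots, now, max_age_days=30, keep_latest=3):
    """Return the snapshots to delete, newest first.

    A snapshot is deleted only if it is older than max_age_days AND
    it is not among the keep_latest most recent snapshots.
    """
    newest_first = sorted(snapshots, key=lambda s: s.created, reverse=True)
    protected = {s.name for s in newest_first[:keep_latest]}
    cutoff = now - timedelta(days=max_age_days)
    return [
        s for s in newest_first
        if s.name not in protected and s.created < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2024, 6, 13)
    # Five snapshots taken 0, 20, 40, 60 and 80 days ago.
    snaps = [
        Snapshot(f"snap-{i}", f"1.{i}.0", now - timedelta(days=20 * i))
        for i in range(5)
    ]
    for s in select_for_pruning(snaps, now):
        print("would delete", s.name)
```

Keeping the selection logic as a pure function makes the policy easy to test before it is wired to anything that actually deletes data. The version-based policies (one snapshot per major, minor, and patch release) would need the `version` field parsed, which is left out here.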
Not sure what you’re doing, but if we’re talking about a bog standard service backed by a db, I don’t think having automated reverts of that data is the best idea. You might lose something! That said, triggering a snapshot of your db as a step before deployment is a pretty reasonable idea.
Reverting a service back to a previous version should be straightforward enough, and any dedicated CI/CD tool should have an API to get you information from the last successful deploy, whether that is the actual artifact you’re deploying or a reference to a registry.
As you’re probably unsurprised to hear, there are a ton of ways to skin this cat. You might consider investing in preventative measures: testing your data migration in a lower environment, splitting out db change commits from service logic commits, or doing some sort of blue/green or canary deployment.
I get fairly nerd-sniped when it comes to build pipelines so happy to talk more concretely if you’d like to provide some more details!
Thanks for the reply! I am currently looking to do this for a Kubernetes cluster running various services to more reliably (and frequently) perform upgrades with automated rollbacks when necessary. At some point in the future, it may include services I am developing, but at the moment that is not the intended use case.
I am not currently familiar enough with the CI/CD pipeline (currently Renovatebot and ArgoCD) to reliably accomplish automated rollbacks, but I believe I can get everything working with the exception of rolling back a data backup (especially for upgrades that contain backwards incompatible database changes). In terms of storage, I am open to using various selfhosted services/platforms even if it means drastically changing the setup (eg - moving from TrueNAS to Longhorn, moving from Ceph to Proxmox, etc.) if it means I can accomplish this without a noticeable performance degradation to any of the services.
I understand that it can be challenging (or maybe impossible) to reliably generate backups while the services are running. I also understand that the best way to do this for databases would be to stop the service and perform a database dump. However, I’m not too concerned with losing <10 seconds of data (or however long the backup jobs take) if the backups can be performed in a way that does not result in corrupted data. Realistically, the most common use cases for the rollbacks would be invalid Kubernetes resources/application configuration as a result of the upgrade or the removal/change of a feature that I depend on.
Are you using PersistentVolumes? If your storage class supports it, it looks like there’s a volume snapshot concept you can use. Have you looked into that?
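For reference, a volume snapshot request is an ordinary Kubernetes object, so it can be generated like any other manifest. The sketch below builds one as a Python dict and prints it as JSON. It assumes the cluster has a CSI driver with snapshot support and the snapshot CRDs and controller installed. The PVC name, namespace, snapshot class, and the `volume_snapshot_manifest` helper are hypothetical placeholders.

```python
import json


def volume_snapshot_manifest(pvc_name, namespace, snapshot_class, label):
    """Build a VolumeSnapshot manifest for an existing PVC.

    All argument values are supplied by the caller; nothing here
    talks to a cluster.
    """
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {
            "name": f"{pvc_name}-{label}",
            "namespace": namespace,
        },
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    manifest = volume_snapshot_manifest(
        pvc_name="app-data",
        namespace="apps",
        snapshot_class="csi-snapclass",
        label="pre-1-4-0",
    )
    print(json.dumps(manifest, indent=2))
```

The JSON output can be piped to `kubectl apply -f -`. Note that a volume snapshot taken while the service is running is at best crash-consistent, so a database may still need to replay its log after a restore.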
Yes, I am using PersistentVolumes. I have played around with different tools that have backup/snapshot abilities, but I haven’t seen a way to integrate that functionality with a CD tool. I’m sure if I spent enough time working through things, I may be able to put together something that allows the CD tool to take a snapshot. However, I think that having it handle rollbacks would be a bit too much for me to handle without assistance.
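On the rollback side, Kubernetes can provision a new PersistentVolumeClaim from an existing volume snapshot through the claim's `dataSource` field. The sketch below only generates such a claim. It assumes a snapshot named `app-data-pre-1-4-0` already exists in the same namespace. The `restore_pvc_manifest` helper, the storage class, the size, and the access mode are hypothetical, and the CD-tool wiring the question asks about is not shown.

```python
import json


def restore_pvc_manifest(pvc_name, namespace, snapshot_name, storage_class, size):
    """Build a PVC manifest whose contents are restored from a VolumeSnapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc_name, "namespace": namespace},
        "spec": {
            "storageClassName": storage_class,
            # Points the new claim at the snapshot to restore from.
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
            "accessModes": ["ReadWriteOnce"],  # assumption: single-node access
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    manifest = restore_pvc_manifest(
        pvc_name="app-data-restored",
        namespace="apps",
        snapshot_name="app-data-pre-1-4-0",
        storage_class="csi-storage",
        size="10Gi",
    )
    print(json.dumps(manifest, indent=2))
```

A restore like this creates a new claim rather than rewinding the existing one in place, so the service would likely need to be stopped and pointed at the restored claim. That extra step is probably the part that needs the most care in an automated rollback.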