Using rsync for backups, because it's not shiny and new (tinkerbetter.tube)
from mesamunefire@piefed.social to selfhosted@lemmy.world on 03 Oct 17:19
https://piefed.social/post/1332496

You might not even like rsync. Yeah it’s old. Yeah it’s slow. But if you’re working with Linux you’re going to need to know it.

In this video I walk through my favorite everyday flags for rsync.

Support the channel:
https://patreon.com/VeronicaExplains
https://ko-fi.com/VeronicaExplains
https://thestopbits.bandcamp.com

Here’s a companion blog post, where I cover a bit more detail: https://vkc.sh/everyday-rsync

Also, @BreadOnPenguins made an awesome rsync video and you should check it out: https://www.youtube.com/watch?v=eifQI5uD6VQ

Lastly, I left out all of the ssh setup stuff because I made a video about that and the blog post goes into a smidge more detail. If you want to see a video covering the basics of using SSH, I made one a few years ago and it’s still pretty good: https://www.youtube.com/watch?v=3FKsdbjzBcc

Chapters:
1:18 Invoking rsync
4:05 The --delete flag for rsync
5:30 Compression flag: -z
6:02 Using tmux and rsync together
6:30 but Veronica… why not use (insert shiny object here)

#homelab #linux #rsync #self-hosting #selfhosted #sysadmin

mesamunefire@piefed.social on 03 Oct 17:21

I've personally used rsync for backups for about... 15 years or so? It's worked out great. An awesome video going over all the basics and what you can do with it.

Eldritch@piefed.world on 03 Oct 17:25

And I generally enjoy Veronica's presentation. Knowledgeable and simple.

mesamunefire@piefed.social on 03 Oct 17:29

Her https://tinkerbetter.tube/w/ffhBwuXDg7ZuPPFcqR93Bd made me learn a new way of looking at data. There were some tricks I haven't done before. She has such good videos.

Eldritch@piefed.world on 03 Oct 17:59

Yep, I found her through YouTube. Her and Action Retro's content is always great, with some Adrian Black on the side.

confusedpuppy@lemmy.dbzer0.com on 03 Oct 18:07

I use rsync for many of the reasons covered in the video. It’s widely available and has a long history. To me that feels important because it’s had time to become stable and reliable. Using Linux is a hobby for me so my needs are quite low. It’s nice to have a tool that just works.

I use it for all my backups and moving my backups to off network locations as well as file/folder transfers on my own network.

I even made my own tool (codeberg.org/taters/rTransfer) to simplify all my rsync commands into readable files because rsync commands can get quite long and overwhelming. It’s especially useful chaining multiple rsync commands together to run under a single command.

I’ve tried other backup and syncing programs and I’ve had bad experiences with all of them. Other backup programs have failed to restore my system. Syncing programs constantly stop working and I got tired of always troubleshooting. Rsync, when set up properly, has given me a lot fewer headaches.

solrize@lemmy.ml on 03 Oct 17:31

I’ve been using borg because of the backend encryption and because the deduplication and snapshot features are really nice. It could be interesting to have cross-archive deduplication but maybe I can get something like that by reorganizing my backups. I do use rsync for mirroring and organizing downloads, but not really for backups. It’s a synchronization program as the name implies, not really intended for backups.

cmgvd3lw@discuss.tchncs.de on 03 Oct 17:55

I think Arch wiki recommends rsync for backups

i_stole_ur_taco@lemmy.ca on 03 Oct 18:00

The thing I hate most about rsync is that I always fumble to get the right syntax and flags.

This is a problem because once it’s working I never have to touch it ever again because it just works and keeps working. There’s not enough time to memorize the usage.

mesamunefire@piefed.social on 03 Oct 18:06

I feel this too. I have a couple of “spells” that work wonders, written in a literal small notebook along with other one-liners collected over the years. It's my spell book lol.

NuXCOM_90Percent@lemmy.zip on 03 Oct 18:17

One trick that one of my students taught me a decade or so ago is to actually make an alias to list the useful flags.

Yes, a lot of us think we are smart and set up aliases/functions and have a huge list of them that we never remember or, even worse, ONLY remember. What I noticed her doing was having something like goodman-rsync that would just echo out a list of the most useful flags and what they actually do.

So nine times out of 10 I just want rsync -azvh --progress ${SRC} ${DEST} but when I am doing something funky and am thinking “I vaguely recall how to do this”? dumbman rsync and I get a quick cheat sheet of what flags I have found REALLY useful in the past or even just explaining what azvh actually does without grepping past all the crap I don’t care about in the man page. And I just keep that in the repo of dotfiles I copy to machines I work on regularly.
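A minimal sketch of that cheat-sheet idea, assuming a shell function kept in your dotfiles. The function name `rsync_cheatsheet` is made up for this example, and the flag descriptions are paraphrased from the rsync man page:

```shell
# Hypothetical helper (the name is an assumption, not from the comment above):
# it only prints a personal cheat sheet and never runs rsync itself.
rsync_cheatsheet() {
    cat <<'EOF'
rsync -azvh --progress SRC DEST
  -a          archive mode: recurse and preserve permissions, times, symlinks, etc.
  -z          compress file data during the transfer
  -v          verbose output
  -h          human-readable numbers
  --progress  show progress during the transfer
  --delete    delete files in DEST that no longer exist in SRC
  -n          dry run (same as --dry-run): show what would change, change nothing
EOF
}
```

Source the file from your shell startup script and run `rsync_cheatsheet` whenever the flags have faded from memory.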

tal@olio.cafe on 03 Oct 18:53

Most Unix commands will show a short list of the most-helpful flags if you use --help or -h.

muix@lemmy.sdf.org on 03 Oct 20:48

tldr and atuin have been my main way of remembering complex but frequent flag combinations

oddlyqueer@lemmy.ml on 03 Oct 18:22

This is why I still don’t know sed and awk syntax lol. I eventually get the data in the shape I need and then move on, and never imprint how they actually work. Still feel like a script kiddie every time I use them (so once every few years).

tal@olio.cafe on 03 Oct 18:51

sed can do a bunch of things, but I overwhelmingly use it for a single operation in a pipeline: the s// operation. I think that that’s worth knowing.

sed 's/foo/bar/'

will replace the first text in each line matching the regex “foo” with “bar”.

That’ll already handle a lot of cases, but a few other helpful sub-uses:

sed 's/foo/bar/g'

will replace all text matching regex “foo” with “bar”, even if there is more than one match per line

sed 's/\([0-9a-f][0-9a-f]*\)/0x\1/g'

will take the text inside the backslash-escaped parens and put that matched text back in the replacement text, where one has ‘\1’. In the above example, that’s finding all hexadecimal strings and prefixing them with ‘0x’

If you want to match a literal “/”, the easiest way to do it is to just use a different separator; if you use something other than a “/” as separator after the “s”, sed will expect that later in the expression too, like this:

sed 's%/%SLASH%g'

will replace all instances of a “/” in the text with “SLASH”.
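For anyone who wants to try these, here is a quick sketch that feeds each form some sample input. The sample strings are made up, and the hex pattern here requires at least one digit so that it cannot match an empty string:

```shell
# Each command reads sample text from printf; the expected output is in the comment.

# First match on each line only.
printf 'foo foo\n' | sed 's/foo/bar/'                      # bar foo

# Every match on the line.
printf 'foo foo\n' | sed 's/foo/bar/g'                     # bar bar

# Capture group: \1 is whatever the backslash-escaped parentheses matched.
printf '1f 2a\n' | sed 's/\([0-9a-f][0-9a-f]*\)/0x\1/g'    # 0x1f 0x2a

# Alternate separator, so "/" needs no escaping.
printf '/usr/bin\n' | sed 's%/%SLASH%g'                    # SLASHusrSLASHbin
```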

NuXCOM_90Percent@lemmy.zip on 03 Oct 18:01

I would generally argue that rsync is not a backup solution. But it is one of the best transfer/archiving solutions.

Yes, it is INCREDIBLY powerful and is often 90% of what people actually want/need. But to be an actual backup solution you still need infrastructure around that. Bare minimum is a crontab. But if you are actually backing something up (not just copying it to a local directory) then you need some logging/retry logic on top of that.

At which point you are building your own borg, as it were. Which, to be clear, is a great thing to do. But… backups are incredibly important and it is very much important to understand what a backup actually needs to be.
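To make the "crontab plus logging/retry logic" point concrete, here is a rough sketch of what that scaffolding around rsync might look like. The function names, paths, host name, retry count, and delay are all placeholders, not anything from the thread:

```shell
#!/bin/sh
# Sketch only: names, paths, and numbers below are illustrative assumptions.

# Run a command up to $1 times, sleeping $2 seconds between failed attempts.
retry() {
    attempts=$1; delay=$2; shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Copy SRC to DEST with rsync, appending timestamps and all output to LOG.
backup() {
    src=$1; dest=$2; log=$3
    echo "=== backup started: $(date)" >>"$log"
    if retry 3 60 rsync -a --delete "$src" "$dest" >>"$log" 2>&1; then
        echo "=== backup finished OK: $(date)" >>"$log"
    else
        echo "=== backup FAILED after 3 attempts: $(date)" >>"$log"
        return 1
    fi
}

# Example crontab entry (nightly at 02:30), assuming this file is saved as a script
# that ends with a call such as:
#   backup /home/me/ backuphost:/srv/backups/me/ /var/log/backup.log
# 30 2 * * * /usr/local/bin/nightly-backup.sh
```

Even this much still leaves out failure notifications, retention, and restore testing, which is roughly the gap that tools like borg fill.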

tal@olio.cafe on 03 Oct 18:40

I would generally argue that rsync is not a backup solution.

Yeah, if you want to use rsync specifically for backups, you’re probably better off using something like rdiff-backup, which uses the rsync algorithm (via librsync) to generate backups and store them efficiently, and driving it from something like backupninja, which will run the task periodically and notify you if it fails.

rsync: one-way synchronization

unison: bidirectional synchronization

git: synchronization of text files with good interactive merging.

rdiff-backup: rsync-based backups. I used to use this and moved to restic, as the backupninja target for rdiff-backup has kind of fallen into disrepair.

That doesn’t mean “don’t use rsync”. I mean, rsync’s a fine tool. It’s just…not really a backup program on its own.

neidu3@sh.itjust.works on 03 Oct 19:20

+1 for rdiff-backup. Been using it for 20 years or so, and I love it.

non_burglar@lemmy.world on 03 Oct 20:30

I use rsync and a pruning script in crontab on my NFS mounts. I’ve tested it numerous times breaking containers and restoring them from backup. It works great for me at home because I don’t need anything older than 4 monthly, 4 weekly, and 7 daily backups.

However, in my job I prefer something like bacula. The extra features and granularity of restore options make a world of difference when someone calls because they deleted prod files.
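The rsync-plus-pruning-script setup described above could look something like the sketch below for the simplest case, keeping only the newest N snapshots. It assumes one directory per snapshot, named so that sorting by name also sorts by age (for example 2025-10-03). That layout and the function name are assumptions, and a monthly/weekly/daily scheme would need one such directory per tier:

```shell
# Sketch: keep only the newest KEEP snapshot directories directly under DIR.
# Assumes directory names sort oldest-first and contain no newlines.
prune_snapshots() {
    dir=$1; keep=$2
    total=$(find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l)
    excess=$((total - keep))
    [ "$excess" -gt 0 ] || return 0
    # Oldest names sort first, so the first $excess entries are the ones to drop.
    find "$dir" -mindepth 1 -maxdepth 1 -type d | sort | head -n "$excess" |
    while IFS= read -r old; do
        rm -rf -- "$old"
    done
}
```

For example, `prune_snapshots /mnt/backups/daily 7` would leave the seven most recent daily directories in place.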

tomkatt@lemmy.world on 03 Oct 18:03

Rsync is great. I’ve been using it to back up my book library from my local Calibre collection to my NAS for years, it’s absurdly simple and convenient. Plus, -ruv lets me ignore unchanged files and back up recursively, and if I clean up locally and need that replicated, I just need to add --delete.
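As a sketch of those two invocations, wrapped in functions so the paths stay out of the way. The function names are invented for this example, and SRC and DEST stand in for something like a Calibre library folder and a mounted NAS path:

```shell
# -r recurse, -u skip files that are newer on the destination, -v list what is copied.
# A trailing slash on SRC copies the directory's contents rather than the directory itself.
sync_library() {
    rsync -ruv "$1" "$2"
}

# Same, but also remove files from DEST that were deleted from SRC.
# --dry-run only prints what would happen; drop it once the output looks right.
mirror_library_preview() {
    rsync -ruv --delete --dry-run "$1" "$2"
}
```

Running the preview first is cheap insurance, since --delete pointed at the wrong destination removes files there without asking.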

dohpaz42@lemmy.world on 03 Oct 18:08

Here’s how I approach old and slow:

  1. Older software is mature and battle tested. It’s been around long enough that the developers should know what they’re doing, and have built a strong community for help and support.
  2. Slow is okay when it comes to accuracy. Would I love to back up my gigabytes (peanuts compared to some of you folks out there with data centers in your attics) in seconds? Yes. But more importantly, I’d rather have my data be valid for if I ever need to do any kind of restore. And I’ve been around the block enough times in my career to see many useless backups.

calliope@retrolemmy.com on 03 Oct 18:12

Tangentially, I don’t see people talk about rclone a lot, which is like rsync for cloud storage.

It’s awesome for moving things from one provider to another, for example.

Eldritch@piefed.world on 03 Oct 18:15

It's fine. But yes, in the Linux space we tend to want to host ourselves, not have to trust some administrator of some cloud we don't know/trust.

calliope@retrolemmy.com on 03 Oct 18:24

“Cloud” storage doesn’t mean “someone else’s computer” all the time. I thought that would be obvious in a self-hosting forum, but here we are.

Like… NextCloud, for example. I thought people self-hosted cloud storage?

Rclone has adapters for many, many things.

“In the Linux space?!” Why are you being patronizing about this when you don’t seem to know what you’re talking about?

Eldritch@piefed.world on 03 Oct 20:24

I mention in the Linux space only because it's what I'm familiar with and didn't want to make assumptions about groups I'm not familiar with. Unlike you, who's looking for a way to take umbrage and talk past people. I went to college for IT and have done it for 30 years.

In network and IT planning, the cloud is the wider network outside your own, the part you don't have mapped, often depicted by a "cloud". If I have a personal data pool on one of my own networks and need it from another, it may transmit via the "cloud", but it isn't IN the cloud. It's on a personal server. If the server is in your house, and you can point exactly to where your data is, then the rule of thumb is that it is in your house, not the cloud. If it's hosted on a system you couldn't directly point to, on a network you have no knowledge of, especially a shared system, then things literally and figuratively are getting cloudier.

That said, marketing, as it often does, appropriates and misuses words based around buzz. And I am not about to admonish hobbyists who use it in the marketing sense. I understand, I get it.

If you host in OSX on Apple Silicon, that's great. If you host on a 68k Mac or Amiga you're a fucking mad lad! If you're hosting under Windows, any TCP port in the storm mate. If you are hosting from a Linux distribution that is not God's chosen, cool how is it working out? If you are hosting from BeOS or Haiku, you are a glorious oddball and absolutely my sort of person. And if you are hosting from an appliance that you really don't know what it's running, welcome to the hobby. It's a good starting point. And a li'l data in the cloud isn't a crime. We all have some. But if you can't easily point to it, can you really know you have it?

Landless2029@lemmy.world on 03 Oct 18:39

I tried rclone once because I wanted to sync a single folder from documents and freaked out when it looked like it was going to purge all documents except for my targeted folder.

Then I just did it via the portal…

calliope@retrolemmy.com on 03 Oct 19:45

rsync can sometimes look similarly scary! I very clearly remember triple-checking what it’s doing.

rclone works amazingly well if you have hundreds of folders or thousands of files and you can’t be bothered to babysit a portal.

davidvasandani@social.coop on 03 Oct 19:06

@calliope It’s also great for local or remote backups over ssh, smb, etc.

calliope@retrolemmy.com on 03 Oct 19:47

It has been remarkably useful! I keep trying to tell people about it but apparently I am just their main use case or something.

I would have loved it when I was using Samba to share files on my local network decades ago. It’s like a Swiss Army knife!

Landless2029@lemmy.world on 03 Oct 18:35

I need a breakdown like this for Rclone. I’ve got 1TB of OneDrive free and nothing to do with it.

I’d love to set up a home server and back up some stuff to it.

tal@olio.cafe on 03 Oct 18:58

slow

rsync is pretty fast, frankly. Once it’s run once, if you have -a or -t passed, it’ll synchronize mtimes. If the modification time and file size match, by default, rsync won’t look at a file further, so subsequent runs will be pretty fast. You can’t really beat that for speed unless you have some sort of monitoring system in place (like, filesystem-level support for identifying modifications).
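A small sketch of that size-and-mtime quick check in action, using throwaway temp directories. The function name is made up here, and the "prints nothing" expectation assumes nothing touches the files between the two runs:

```shell
# Copy a file with -a (which preserves mtimes), then run rsync again with
# -i (--itemize-changes), which prints one line per item it would update.
quick_check_demo() {
    src=$(mktemp -d); dest=$(mktemp -d)
    echo hello >"$src/file.txt"
    rsync -a "$src/" "$dest/"
    # Size and mtime now match on both sides, so this second run should print nothing.
    rsync -a -i "$src/" "$dest/"
    rm -rf "$src" "$dest"
}
```

If you would rather have rsync compare file contents instead of trusting size and mtime, the -c (--checksum) option does that, at the cost of reading every file on both sides.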

portnull@lemmy.dbzer0.com on 03 Oct 19:00

Maybe I am missing something but how does it handle snapshots?

I use rsync all the time but only for moving data around effectively. But not for backups as it doesn’t (AFAIK) handle snapshots.

stratself@lemdro.id on 03 Oct 19:14

Rsync depends on OpenSSH, but it definitely isn’t SFTP. I’ve tried using it against an SFTPGo instance, and lost some files because it runs its own binary, bypassing SFTPGo’s permission checks. Instead, I’ve opted for rclone with the SFTP backend, which does everything rsync does and complies well.

In fact, while SFTPGo’s main developer published a fix for this bug, he also expressed intention to drop support for the command entirely. I think I’m just commenting to give a heads up for any passerby.

Jessica@discuss.tchncs.de on 03 Oct 19:17

If you’re trying to back up Windows OS drives for some reason, robocopy works quite similarly to rsync.

probable_possum@leminal.space on 03 Oct 19:32

rsnapshot is a script for the purpose of repeatedly creating deduplicated copies (hardlinks) for one or more directories. You can choose how many hourly, daily, weekly,… copies you’d like to keep and it removes outdated copies automatically. It wraps rsync and ssh (public key auth) which need to be configured beforehand.
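The hard-link snapshot idea can also be sketched with plain rsync and its --link-dest option. This is not how rsnapshot itself is implemented, just a similar effect. The function name, the timestamped directory layout, and the "latest" symlink are assumptions for illustration:

```shell
# Sketch: each run creates ROOT/<timestamp>. Files unchanged since the previous
# snapshot become hard links into it, so they take no extra space.
# Pass SRC with a trailing slash and ROOT as an absolute path.
snapshot() {
    src=$1; root=$2
    stamp=$(date +%Y-%m-%d_%H%M%S)
    mkdir -p "$root"
    if [ -d "$root/latest" ]; then
        rsync -a --link-dest="$root/latest" "$src" "$root/$stamp"
    else
        rsync -a "$src" "$root/$stamp"
    fi || return 1
    # Point "latest" at the snapshot that was just made.
    ln -sfn "$stamp" "$root/latest"
}
```

Each timestamped directory then looks like a full copy and can be restored with an ordinary copy, and deleting an old snapshot does not affect files that newer snapshots still link to.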