fedora-classroom
LOGS

00:59:49 <jds2001> #startmeeting Fedora Classroom - intro to rsync
01:00:08 <jds2001> #topic Intro
01:00:44 <jds2001> The idea for this class came up when nirik was doing his session on preupgrade
01:00:57 * nirik waves
01:01:05 <jds2001> There was the suggestion there (which is a good one) to back up your stuff prior to doing the upgrade
01:01:06 <tw2113> hi
01:01:49 <jds2001> But there weren't any ways presented to do so.  I think that rsync is an easy way to back stuff up. (and do lots of other things)
01:02:05 <jds2001> So that's the first topic.
01:02:11 <onekopaka> like keep mirrors in sync (usually ;-) )
01:02:15 <jds2001> #topic Using rsync for backup
01:02:25 <jds2001> onekopaka: that's another topic :)
01:02:53 <jds2001> the basic purpose in life of rsync, as you might guess by its name, is to keep two things in sync.
01:03:17 <jds2001> I'm just going to discuss the client side of rsync in this class, setting up an rsync server is beyond the scope
01:03:28 <jds2001> (but look at man rsyncd.conf if you're interested)
01:03:59 <jds2001> so the easiest form of rsync is to back up via ssh
01:04:22 <jds2001> all you need is another machine to ssh to, and you can sync files between them
01:04:30 <jds2001> no rsync server is required.
01:04:49 <jds2001> both machines obviously need to have ssh installed, and rsync :)
01:05:18 <jds2001> and the form to do that would be rsync -av /some/dir/ user@machine:/some/dir
01:05:29 <jds2001> I used a trailing slash in the source spec of that rsync command
01:05:52 <jds2001> that's important, as it says to consider all of the files in the directory, rather than the directory itself.
01:06:12 <jds2001> did i lose anyone there?
01:06:21 <tw2113> question
01:06:26 <jds2001> sure
01:06:44 <tw2113> with the remote location, are single quotes needed around the destination?
01:06:57 <tw2113> i've always used user@machine:'/some/dir'
01:07:01 <jds2001> not unless it has some weird characters in it.
01:07:08 <tw2113> cool
01:07:22 <tw2113> i'm done
01:07:46 <jds2001> the -a flag is actually a combination flag for several others
01:07:56 <jds2001> basically it says to "archive" the directory
01:08:04 <jds2001> preserve timestamps, permissions, owner, etc
01:08:35 <jds2001> from the rsync manpage: archive mode; equals -rlptgoD (no -H,-A,-X)
01:09:17 <jds2001> the -v is just for verbose
01:09:28 <jds2001> it outputs all the files as it does it.
01:10:28 <jds2001> if there are already files in the destination by the same name, then they are only copied if they are different.
01:10:52 <jds2001> and the beauty of rsync is that only the differences are copied, not the entire file again.
01:11:04 <jds2001> any questions/comments?
01:11:43 <jds2001> guess not.
01:11:53 <BounceCat> am I correct to understand it is a unidirectional sync?
01:12:06 <jds2001> BounceCat: it is.
01:12:13 <jds2001> i.e. one side is copied to the other.
01:12:25 <jds2001> but you can make the source the remote machine if you want.
01:12:34 <BounceCat> ok
01:12:47 <tw2113> just flip the locations
01:12:51 <jds2001> yeah
01:12:57 <tw2113> put the "user@...." stuff first
01:13:22 <jds2001> #topic other uses for rsync
01:14:03 <jds2001> There are a number of other uses for rsync as well.  If you have multiple webservers and a master repository of content for them, you could use rsync in cron in order to keep them in sync.
01:14:38 <jds2001> In Fedora Infrastructure, one of the uses we have is to gather httpd logs from multiple machines and aggregate them into a single location
01:15:14 <fenris02> for web servers, you'll need to remember to modify etags handling if you rsync.  (default etag includes inum info, which is not valid across hosts.)
01:16:00 <jds2001> fenris02: etags have always been mysterious for me, maybe you can enlighten me how they're used after class?
01:16:06 <fenris02> http://developer.yahoo.net/blog/archives/2007/07/high_performanc_11.html
01:16:49 <jds2001> basically, any time i need to keep two things in sync, or copy things from one place to another, i generally use rsync.
01:17:09 <jds2001> You can even use rsync locally, I just did that today, when setting up a home directory for a user
01:17:22 <jds2001> both the source and destination can be local filesystem paths
01:17:47 <jds2001> the only restriction is that you can't use two remote paths (and I *have* had use for that, but maybe I'm unique :) )
01:18:46 <onekopaka> jds2001: well in that case, why not just ssh into the remote and rsync locally there?
01:18:59 <onekopaka> </obvious>
01:19:07 <jds2001> onekopaka: it's been more like user@host:/path user@otherhost:/path
01:19:17 <onekopaka> jds2001: oh.
01:19:22 <jds2001> and host and otherhost couldn't talk directly to each other.
01:19:30 <onekopaka> jds2001: well.
01:19:33 <onekopaka> jds2001: that sucks.
01:19:43 <tw2113> i think i tried that once for poo and giggles
01:20:16 <jds2001> anyhow.
01:20:20 <fenris02> ssh user@host1 "tar cfz -" | ssh user@host2 "cd /path/to/; tar xzf -"
01:20:29 <jds2001> fenris02: exactly :)
01:20:43 <fenris02> nasty.  rsync to local, and then out again is far nicer if you have space
01:21:24 <jds2001> the uses for rsync are as limitless as your imagination :)
01:21:55 <jds2001> #topic Syncing with public rsync servers
01:22:20 <jds2001> There are servers that run an rsync daemon, which serves the rsync protocol without the need for an account on them
01:22:38 <jds2001> that's the server side I'm not covering here (but can cover later if there's interest)
01:22:55 <jds2001> mainly, these are upstream servers for mirrors.
01:23:42 <jds2001> so the source side of those would be rsync://<whatever>/some/dir
01:24:15 <jds2001> if you eliminate a source dir, you will get a list of 'modules' that the server offers
01:24:42 <jds2001> im sorry, if you eliminate a destination dir.
01:24:46 <jds2001> and a source dir :)
01:25:12 <jds2001> so if I do rsync rsync://mirrors.tummy.com
01:25:14 <jds2001> I get
01:25:24 <jds2001> [jds2001@rugrat convert2]$ rsync rsync://mirrors.tummy.com
01:25:24 <jds2001> fedora         	Fedora - RedHat community project
01:25:24 <jds2001> epel           	Fedora EPEL - RedHat community project
01:25:24 <jds2001> fedora-enchilada	Fedora - RedHat community project
01:25:24 <jds2001> pub            	The full mirror
01:25:48 <jds2001> those are the modules defined in nirik's rsyncd.conf
01:26:01 * nirik nods
01:26:40 <jds2001> If I wanted to sync fedora-enchilada (which is the complete content of the master mirrors), I could do the following (and I'll explain all of these switches):
01:27:48 <jds2001> rsync -avH --delay-updates --delete --delete-after rsync://mirrors.tummy.com/fedora-enchilada/ /some/path
01:28:02 <jds2001> some of those are important when syncing mirrors.
01:28:09 <jds2001> -H means to preserve hardlinks
01:28:19 <jds2001> everyone know what hardlinks are?
01:28:30 <onekopaka> I guess.
01:28:39 <jds2001> i take that as no :)
01:28:58 <jds2001> they are ways for multiple directory entries to refer to the same inode on a filesystem.
01:29:11 <onekopaka> I'm not sure about the difference between hard and symbolic.
01:29:17 <onekopaka> that's my issue.
01:29:31 <jds2001> oh, good question
01:29:50 <jds2001> so a symbolic link is simply a pointer to another location on the filesystem.
01:30:00 <jds2001> each one is an inode of its own.
01:30:13 <onekopaka> oh, okay.
01:30:23 <jds2001> the big difference is that hardlinks are restricted to the same filesystem (since they are directory entries that point to the same inode)
01:30:37 <jds2001> and symbolic links can point anywhere you please.
01:30:46 <jds2001> (including places that don't exist)
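A quick sketch of the hard link versus symbolic link difference described above, using only standard tools in a temp directory. The file names and the inode_of helper are hypothetical:

```shell
#!/bin/sh
work=$(mktemp -d)
echo data > "$work/original"
ln "$work/original" "$work/hardlink"   # second directory entry, same inode
ln -s original "$work/symlink"         # separate inode that stores the path "original"

# Helper (illustrative): print the inode number of a path without following symlinks.
inode_of() { ls -i "$1" | awk '{print $1}'; }
inode_original=$(inode_of "$work/original")
inode_hard=$(inode_of "$work/hardlink")
inode_sym=$(inode_of "$work/symlink")
echo "original=$inode_original hardlink=$inode_hard symlink=$inode_sym"

# Remove the original name: the hard link still reaches the data,
# while the symlink now points at a place that does not exist.
rm "$work/original"
hard_after=$(cat "$work/hardlink")
if [ -e "$work/symlink" ]; then sym_after=ok; else sym_after=dangling; fi
echo "hardlink content: $hard_after, symlink: $sym_after"
rm -rf "$work"
```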
01:31:18 <jds2001> the big use of hardlinks is identical files in multiple places
01:32:03 <jds2001> for example, multilib stuff.  The i386 and x86_64 versions are in the x86_64 directory, but the i386 version is also in the i386 directory
01:32:11 <jds2001> on a fedora mirror, the space savings are huge
01:32:12 <fenris02> identical selinux labels, ownership and permissions.
01:33:12 <jds2001> [jstanley@monster fedora]$ du -sh .
01:33:12 <jds2001> 302G	.
01:33:19 <jds2001> [jstanley@monster fedora]$ du -shl .
01:33:20 <jds2001> 398G	.
01:33:44 <onekopaka> wow
01:33:44 <jds2001> so hardlinks on my mirror are saving me 96GB of actual disk.
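The same du comparison can be tried at small scale. This is a sketch with a made-up 1 MiB file hard-linked into two directories, loosely mimicking the multilib layout mentioned earlier; the names are illustrative only:

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/x86_64" "$work/i386"

# One 1 MiB file, reachable under two directories via a hard link.
head -c 1048576 /dev/urandom > "$work/x86_64/pkg.i386.rpm"
ln "$work/x86_64/pkg.i386.rpm" "$work/i386/pkg.i386.rpm"

# Plain du counts each inode once; -l counts every hard link separately.
actual_kb=$(du -sk "$work" | awk '{print $1}')
counted_kb=$(du -skl "$work" | awk '{print $1}')
echo "actual: ${actual_kb}K, counting links separately: ${counted_kb}K"
rm -rf "$work"
```

The gap between the two numbers is the space the hard link saves, which is the same arithmetic as 398G minus 302G above.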
01:34:01 * onekopaka doesn't have 302GB of space anywhere.
01:34:46 <jds2001> onekopaka: and that's not everything :)
01:34:59 <jds2001> nirik: how big is everything about?
01:35:14 * jds2001 doesn't carry debug, ppc
01:35:16 <nirik> not sure off hand.
01:35:30 <jds2001> no biggie
01:35:45 <jds2001> so the next thing says --delay-updates
01:36:15 <jds2001> what that does is delay updating any files until the rsync run is completed.  This keeps your mirror consistent during the sync process
01:36:21 <jds2001> either all updated, or all not.
01:37:10 <jds2001> --delete and --delete-after say to delete any files that are on the local filesystem that aren't on the remote, but only after the sync has completed.
01:37:25 <jds2001> (older rsync versions default to doing it beforehand to save space; rsync 3.0.0 and later delete during the transfer)
01:37:58 <jds2001> and then the source and destination.
01:38:12 <jds2001> I mentioned that I didn't carry everything on my mirror.
01:38:25 <jds2001> I do that through the use of excludes.
01:38:48 <jds2001> rsync --exclude-from=/home/jstanley/mirrorsync/fedora-excludes -avH --delay-updates --delete --delete-after rsync://mirror.hiwaay.net/fedora-linux-updates/ /mirror/fedora/updates
01:38:57 <jds2001> that's my real rsync line from my mirror
01:39:24 <jds2001> so --exclude-from= points to a file with a newline separated list of things that I wish to exclude
01:39:33 <fenris02> no -c ?  extraneous?
01:40:13 <jds2001> im not sure what -c does
01:40:35 <jds2001> oh
01:40:36 <fenris02> -c, --checksum              skip based on checksum, not mod-time & size
01:40:48 <jds2001> that creates load on the rsync server
01:40:55 <jds2001> and time and size are sufficient.
01:41:03 <jds2001> some rsync servers actually forbid -c
01:41:26 <jds2001> (I ran into that when I found I had a bunch of corrupted files on my mirror)
01:41:37 <jds2001> I quickly changed upstream mirrors obviously :)
01:41:45 <fenris02> good to know.  thanks.
01:42:43 <jds2001> so in my exclude file, I have things like
01:42:44 <jds2001> ppc/
01:42:44 <jds2001> ppc64/
01:43:02 <jds2001> so any directory called ppc or ppc64 will be excluded
01:43:03 * nirik tsks. No ppc machines? :)
01:43:30 <jds2001> nirik: i actually have one on the shelf :)
01:43:39 * jds2001 may have to get it down sometime :()
01:43:40 <jds2001> :)
01:43:45 <jds2001> dual 1.25GHz G4 :)
01:44:15 <nirik> much better than mine. Feel free to send it my way. :)
01:44:45 * nirik notes to get back to the topic that many packages build on librsync to provide rsync functionality inside them. rdiff-backup, etc.
01:45:36 <jds2001> any questions?
01:47:17 <jds2001> well that's all that I had....
01:47:52 <jds2001> next up is mether talking about bodhi and koji, sure to be interesting
01:47:58 <jds2001> but for now...
01:48:01 <jds2001> #endmeeting