Robust, Non-Incremental, One-Time, Perfect Backup

ZeroAltitude

Hello,

I want to ask the opinion of some Mac OS X/Unix adepts or gurus. I want to create a one-time, non-incremental, but perfect backup of the single hard drive in my TiBook. I have an 80 GB FireWire Maxtor drive.

This is what I did:

Terminal
$ su root
$ cd /Volumes/Maxtor
$ mkdir sysback
$ cd sysback
$ tar cvXef - / | gzip -9 - >sysback.tar.gz

Some explanation:

tar option c: create
option v: report verbosely
option X: do not cross mount points (in particular, don't tar /Volumes! :))
option e: die on first error
option f: the archive file is the next argument
- : stdout (hence the pipe into gzip)

I believe that in theory, this should create a compressed copy of my volume /.
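For completeness, here is the restore I have in mind (untested, just a sketch; /Volumes/NewDrive is a made-up name for wherever a freshly formatted target disk would mount, and I'm assuming tar's usual p flag to preserve permissions):

Terminal
$ su root
$ cd /Volumes/NewDrive
$ gzip -dc /Volumes/Maxtor/sysback/sysback.tar.gz | tar xvpf -

That is, decompress the archive and unpack it in place, keeping ownership and permissions.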

Am I correct? Does anyone have alternate suggestions for what I am trying to accomplish?

Thanks!

-ZeroAltitude
 
I note that I expect the following:

If I were to use this gzipped archive to recreate a hard drive, the chances are very high that my aliases and Dock references would be broken when starting from that drive, and would all have to be recreated.

I think I could live with that.

How can I avoid that? Let me know if you know.

-0
 
Also keep in mind that tar will trash any and all resource forks. It's possible that this isn't a problem for you, i.e., you have no Classic applications or files and no data formats that use resource forks... but that's pretty unlikely...

I would recommend you look at pax (http://www.versiontracker.com/moreinfo.fcgi?id=11144&db=mac). It operates pretty much like tar (with some limitations, like no archiving to a remote host) and, combined with gzip, should be a good solution for what you want to do.
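I haven't double-checked hfspax's man page, so treat this as a sketch, but assuming it keeps the standard pax switches (-w to write an archive to stdout, -X to stay on one device, -r to read it back, -p e to preserve permissions and ownership), the backup and restore would look roughly like:

$ sudo hfspax -w -X / | gzip -9 > /Volumes/Maxtor/sysback/sysback.pax.gz
$ cd /Volumes/NewDrive && gzip -dc /Volumes/Maxtor/sysback/sysback.pax.gz | sudo hfspax -r -p e

(/Volumes/NewDrive is just a stand-in for whatever target volume you restore onto.)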

Good luck!

-alex.
 
Hello,

Thanks for the input. I come from the PC/Unix world, and was not aware that 'tar would trash resource forks.' So now I have a question.

I looked up the spec of HFS+, and I do understand what data forks and resource forks *are*. But why would tar trash them? Doesn't tar in some relevant sense replicate the file layout on disk?

If anyone can explain this, I would be most, most appreciative. Now, I'm off to look at hfspax!

-0
 
Think of it this way:

In a sense, tar is a filesystem format. A tar file defines a filesystem (keep in mind that in UNIX a filesystem is any directory and it's children). Tar is also cross-platform. Given this, it has to cater to the least common denominator. Non HFS[+] filesystems simply do not have support for mulitiple forks; there's no where for them to go.(1) This is why we have macbinary, to wrap up the two forks (which are really like two files with a meta-level association) into one "true" single file. Like the various microsoft (and other) filesystem formats tar simply has no where to put that other file, so it drops the resource fork.

HFSpax is designed around HFS+, so you avoid this problem, but of course you can't use it as a cross-platform data exchange solution either...
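If you want to see the fork-dropping for yourself, try something along these lines. MyOldApp is just a stand-in for any file you know carries a resource fork, and I'm going from memory on the ..namedfork/rsrc path trick (it lets you peek at the fork from the shell), so your mileage may vary:

$ ls -l MyOldApp/..namedfork/rsrc
$ tar cf /tmp/test.tar MyOldApp
$ mkdir /tmp/unpacked && cd /tmp/unpacked && tar xf /tmp/test.tar
$ ls -l MyOldApp/..namedfork/rsrc

The first ls should show a non-zero resource fork; after the round trip through tar it comes back empty, because the archive had nowhere to put it.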

hope this helps clear things up a bit.
-alex.


(1) This is why, when you serve AppleShare from a UNIX box using something like XINET's KA-Share, the AppleShare server software has to do some "tricky handling" of resource forks (i.e., creating a .rsrc folder in which it stores all the resource forks for the files in the parent folder of .rsrc). The server then reunites the forks on the fly when serving to a Mac client, so the client sees a normal AppleShare volume complete with resource forks.
 
Thanks alex. I read the hfspax documentation, and that, combined with your explanation, means something concrete to me. So I'll reiterate, to make sure I understand (feel free to not reply, this is for my, and possibly others', benefit).

The HFS+ filesystem is unique, or almost unique, in that it supports more than just the data element of a file. It supports a 'resource fork', which holds certain information such as window information and, apparently, other application-state information. It also supports 'Finder info', which includes information about the file type.

These other 'forks' of the file are *not* a part of the intuitive 'data' of the file (e.g. the executable code, the raw/constant data repository, etc).

Now, these forks are all associated with each other by means of information in the HFS+ file tables. Unfortunately, the Mac OS X ports of the file-handling tools (such as ls, tar, etc.) are not 100% compatible with all of HFS+.

For example, you *can* see resource forks with 'ls' under OS X (sometimes -- mileage varies). But this is done by a kind of magic in OS X -- it converts HFS+ information about resource forks into specially named file listings, even though there are no real files answering to those listings.

tar, too, does not fully understand HFS+. Even if tar's file-listing mechanism, like ls, could 'see' the resource forks, it would not produce a properly joined HFS+ file when you restored it: it would produce a really odd, separate file, a data fork (in HFS+ terms) no longer associated with the file it once belonged to (the deeper HFS+ association having been broken and badly reattached, to be metaphorical).

Thanks so much,

-0
 
I've written a Perl script that uses hfspax to make backups, incremental or full. Though if you want to make a full backup, there's not much advantage over plain hfspax + gzip.

Besides multi-level incremental backups, it also lets you exclude files based on name (exact string or regular expression) or file size, so that you don't need to back up things like your browser cache and that 1 GB file you forgot you still had. If anyone wants a copy, speak up.
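In the meantime, if you only want the exclusion part, you can fake most of it by feeding hfspax a file list from find. This is just a sketch and assumes hfspax reads pathnames from stdin like standard pax (-w write mode, -d so listed directories aren't recursed into, since find already lists their contents); the Caches pattern and the size cutoff (-size -2097152 is roughly "under 1 GB" in 512-byte blocks) are only examples. Run it as root (su first) so every file is readable:

$ find / -xdev -name Caches -prune -o -size -2097152 -print | hfspax -w -d | gzip -9 > /Volumes/Maxtor/filtered.pax.gz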
 
Hi Gen,

That sounds cool. You should post it here! You know what I would be interested in is working with you or others on creating a nice graphical front-end for a tool like your script. An open-source project for people like us. Well, it's just a thought, but I think a good one.

-0
 
I'll put some documentation together and put a link up.

I'd be happy to donate the code to anyone who wants to further develop it.
 
Launch the Terminal.
Type:
sudo ditto -V -rsrc /Volumes/MainVolume /Volumes/80gigMaxtor

Substitute the appropriate volume names, and use quotes where needed to escape spaces in them. It may take a while to complete, but it will print a verbose progress report as it copies the contents of the first volume to the second. Ditto will preserve all resource forks and creator code information, as well as UNIX file permissions.
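Restoring later should just be the same command with the volume names swapped (I haven't tested booting from the result, so no promises on that front):

sudo ditto -V -rsrc /Volumes/80gigMaxtor /Volumes/MainVolume

Ditto copies the contents of the first path into the second, overwriting anything with the same name, so double-check which volume is the target before you run it.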
 