Admin Mistakes: GNU, BSD TAR and POSIX Compatibility
Background
So, you’re writing a simple shell script to archive some files and move them to another host. For the archives you’re using the TAR command. Simple, isn’t it?
Problem
After a couple of days you have to extract the data from an archive to search for something, but when you attempt to extract it you get errors similar to the following.
tar: Ignoring unknown extended header keyword `XXXXXXXXX'
tar: Ignoring unknown extended header keyword `XXXXXXXXX'
tar: Ignoring unknown extended header keyword `XXXXXXXXX'
And of course the data are not extracted properly.
Mistake
The files were archived on a Mac (Snow Leopard), which uses BSD TAR, and the destination host was Linux, which uses GNU TAR. As you might have guessed, there is an incompatibility between BSD and GNU TAR in how they handle vendor extended attributes. Specifically, BSD TAR writes them as extended header keywords (as defined in IEEE Std 1003.1-2001 (POSIX.1-2001)), while GNU TAR doesn’t recognize them.
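Before mixing hosts, it can help to check which implementation each one actually runs. The following is a minimal sketch, assuming that GNU TAR mentions "GNU tar" and BSD TAR mentions "bsdtar" on the first line of `tar --version`; the function name `tar_flavor` is made up for this example.

```shell
# Report which tar implementation is first on the PATH.
# Assumption: GNU tar prints "tar (GNU tar) <version>" and
# BSD tar prints "bsdtar <version> - libarchive <version>".
tar_flavor() {
    case "$(tar --version 2>/dev/null | head -n 1)" in
        *"GNU tar"*) echo gnu ;;
        *bsdtar*)    echo bsd ;;
        *)           echo unknown ;;
    esac
}

tar_flavor
```

Running it on both the source and the destination host shows quickly whether the two ends of the transfer match.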
Resolution
There are a few options for avoiding this mistake. The best one is simply to use either BSD or GNU TAR on both ends, not a mix of the two. Another is to use the "--format" option to pick an archive format that both systems handle the same way. Here is the relevant documentation for BSD TAR:
--format format
        (c, r, u mode only) Use the specified format for the created archive. Supported formats include ``cpio'', ``pax'', ``shar'', and ``ustar''. Other formats may also be supported; see libarchive-formats(5) for more information about currently-supported formats. In r and u modes, when extending an existing archive, the format specified here must be compatible with the format of the existing archive on disk.
And for GNU TAR:
--posix
        like --format=posix
--format FORMAT
        selects output archive format
        v7 - Unix V7
        oldgnu - GNU tar <=1.12
        gnu - GNU tar 1.13
        ustar - POSIX.1-1988
        posix - POSIX.1-2001
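As a rough sketch of the "--format" approach, the archive can be created in ustar format, which both excerpts above list and which has no extended headers to carry vendor keywords. The directory and file names here are placeholders invented for the example.

```shell
# Scratch directory with a placeholder file to archive.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/data"
echo "hello" > "$workdir/data/example.txt"

# Create the archive in ustar format instead of the tar default.
tar --format=ustar -cf "$workdir/backup.tar" -C "$workdir" data

# List the contents to confirm the archive is readable.
tar -tf "$workdir/backup.tar"
```

The trade-off is that ustar is an older format with tighter limits on path length and file size, so it is worth testing against real data before relying on it.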
Finally, you could use the "--pax-option" option of GNU TAR to delete these attributes. Here is its man page documentation:
--pax-option KEYWORD-LIST
        used only with POSIX.1-2001 archives to modify the way tar handles extended header keywords
For example, if your warnings were like:
tar: Ignoring unknown extended header keyword `somefile.ino'
tar: Ignoring unknown extended header keyword `somefile.nlink'
You could use the following option to make GNU TAR ignore them:
--pax-option="delete=somefile.ino,delete=somefile.nlink"
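Put together, an extraction with this option might look like the sketch below. It assumes GNU TAR on the extracting host. The pattern `somefile.*` is the placeholder keyword prefix from the warnings above, so replace it with whatever your own warnings show. The sample archive is created locally only so the example is self-contained.

```shell
# Scratch directories and a placeholder file.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/src" "$demo/out"
echo "hello" > "$demo/src/example.txt"

# Build a small POSIX.1-2001 (pax) archive to extract from.
tar --format=posix -cf "$demo/sample.tar" -C "$demo/src" example.txt

# Placeholder pattern: matches every keyword starting with "somefile.".
pattern='somefile.*'

# Extract while ignoring extended header keywords that match the pattern.
tar --pax-option="delete=$pattern" -xf "$demo/sample.tar" -C "$demo/out"

cat "$demo/out/example.txt"
```

A single glob pattern covers all keywords with the same prefix, which is shorter than listing each keyword in its own `delete=` entry.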
NOW I HATE MACS
b (@0hi)
May 16, 2012 at 03:18
Just to make it clear. It’s not Mac-specific. Any OS using BSD TAR will have the same behaviour.
xorl
May 16, 2012 at 08:49