Admin Mistakes: GNU, BSD TAR and POSIX Compatibility
So, you’re writing a simple shell script to archive some files and move them to another host. For the archives you’re using the TAR command. Simple, isn’t it?
After a couple of days you have to extract the data from an archive to search for something, but when you attempt to extract it you get errors similar to the following:
    tar: Ignoring unknown extended header keyword `XXXXXXXXX'
    tar: Ignoring unknown extended header keyword `XXXXXXXXX'
    tar: Ignoring unknown extended header keyword `XXXXXXXXX'
And of course the data are not extracted properly.
The files were archived on a Mac (Snow Leopard), which uses BSD TAR, and the destination host was Linux, which uses GNU TAR. As you might have guessed, there is an incompatibility between BSD and GNU TAR regarding the handling of vendor extended attributes. Specifically, BSD TAR writes them into the archive as extended header keywords (as defined in IEEE Std 1003.1-2001 (POSIX.1-2001)), while GNU TAR doesn’t recognize them, hence the warnings.
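To see which implementation a given host provides, you can inspect the first line of the version output. This is a minimal sketch: the function name tar_flavor is made up for illustration, and it assumes GNU TAR identifies itself with the string "GNU tar" and BSD TAR with "bsdtar".

```shell
#!/bin/sh
# Report which tar implementation is first on PATH.
# Assumption: GNU tar prints "GNU tar" and BSD tar prints "bsdtar"
# on the first line of `tar --version`.
tar_flavor() {
    case "$(tar --version 2>/dev/null | head -n 1)" in
        *"GNU tar"*) echo gnu ;;
        *bsdtar*)    echo bsd ;;
        *)           echo unknown ;;
    esac
}

echo "tar flavor on $(uname -s): $(tar_flavor)"
```

Running this on both the source and the destination host tells you whether you are mixing implementations before the archives start piling up.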
There are a few different options for avoiding this mistake. The best one is simply to use either BSD or GNU TAR on both ends rather than mixing the two. Another option is to use the “--format” option to choose an archive format that both systems handle. Here is the relevant documentation for BSD TAR:
    --format format
            (c, r, u mode only) Use the specified format for the created archive. Supported formats include ``cpio'', ``pax'', ``shar'', and ``ustar''. Other formats may also be supported; see libarchive-formats(5) for more information about currently-supported formats. In r and u modes, when extending an existing archive, the format specified here must be compatible with the format of the existing archive on disk.
And for GNU TAR:
    --posix
            like --format=posix
    --format FORMAT
            selects output archive format
            v7 - Unix V7
            oldgnu - GNU tar <=1.12
            gnu - GNU tar 1.13
            ustar - POSIX.1-1988
            posix - POSIX.1-2001
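As a rough illustration, here is how an archive could be created in the ustar format, which appears in both excerpts above. The directory and file names are placeholders, and the sketch assumes a mktemp that accepts -d without a template.

```shell
#!/bin/sh
# Create an archive in the ustar format, which both manual excerpts list.
# All paths below are placeholders created in a scratch directory.
workdir=$(mktemp -d)
mkdir "$workdir/data"
echo "hello" > "$workdir/data/file.txt"

# ustar (POSIX.1-1988) predates extended headers, so no vendor
# keywords are written into the archive.
tar --format=ustar -cf "$workdir/backup.tar" -C "$workdir" data

# List the contents to confirm the archive is readable.
tar -tf "$workdir/backup.tar"
```

The trade-off is that ustar has tighter limits on path length and file size than the POSIX.1-2001 format, so it may not suit every data set.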
Finally, you could use the “--pax-option” option of GNU TAR to delete these attributes. Here is its man page documentation:
    --pax-option KEYWORD-LIST
            used only with POSIX.1-2001 archives to modify the way tar handles extended header keywords
For example, if your warnings looked like this:
    tar: Ignoring unknown extended header keyword `somefile.ino'
    tar: Ignoring unknown extended header keyword `somefile.nlink'
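You could use the option with a “delete=” keyword whose pattern matches them, something like “--pax-option=delete=somefile.*”, to delete them.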
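A minimal sketch of a full extraction command follows, assuming the tar on the destination host is GNU TAR and using its documented delete=PATTERN keyword. The paths are placeholders, and the "somefile.*" pattern stands in for whatever prefix your own warnings show. The sample archive is built locally only so that the command has something to extract; it does not contain the vendor keywords itself.

```shell
#!/bin/sh
# Sketch: ignore a family of extended header keywords while extracting
# with GNU tar. Assumptions: "tar" is GNU tar, all paths are placeholders,
# and "somefile.*" stands for the prefix shown in your own warnings.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/dest"
echo "hello" > "$workdir/src/file.txt"

# Build a small POSIX.1-2001 (pax) archive so there is something to extract.
tar --format=posix -cf "$workdir/archive.tar" -C "$workdir" src

# delete=PATTERN makes tar skip matching keywords when reading the archive.
# Quote the pattern so the shell does not expand the "*".
tar --pax-option='delete=somefile.*' -xf "$workdir/archive.tar" -C "$workdir/dest"

ls "$workdir/dest/src"
```

In a real script only the extraction line would be needed, pointed at the archive that came from the BSD TAR host.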