    # Return the hash of the contents of the specified file, as a hex string
    # ...
    # Traverse the specified path and update the hash with a description of its ...
    # ...
        else: pass  # silently skip symlinks and other special files
    # ...
    for root in sys.argv[1:]: traverse(h, root)

Migration to POSIX archive format affects GNU Tar based checksums

This answer is intended as a supplementary update to the approach of using Tar output to hash the contents of directories, as proposed (among other things) in the excellent answers of Warren Young and Gilles some time ago.

Since then, at least openSUSE (since its release 12.2) has changed its default GNU Tar format from 'GNU tar 1.13.x format' to the (slightly) superior 'POSIX 1003.1-2001 (pax) format'. Upstream, the GNU Tar developers are also discussing the same migration; see for example the last paragraph on this page of the GNU Tar manual:

"The default format for GNU tar is defined at compilation time."
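To make the effect of the archive format on checksums concrete, here is a minimal sketch using Python's standard `tarfile` module as a stand-in for GNU Tar (an assumption: the bytes GNU Tar itself writes may differ in further details). It archives the same single file, with fixed metadata, once in GNU format and once in pax format, and hashes the raw archive bytes. The helper name `archive_digest` and the member name are hypothetical.

```python
import hashlib
import io
import tarfile


def archive_digest(fmt):
    """Build a one-file tar archive in memory using the given tarfile
    format constant and return the SHA-256 of the raw archive bytes."""
    payload = b"hello\n"
    info = tarfile.TarInfo(name="somedir/file.txt")  # hypothetical member name
    info.size = len(payload)
    info.mtime = 0  # fixed metadata, so only the archive format varies
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        tar.addfile(info, io.BytesIO(payload))
    return hashlib.sha256(buf.getvalue()).hexdigest()


gnu = archive_digest(tarfile.GNU_FORMAT)
pax = archive_digest(tarfile.PAX_FORMAT)
print("gnu:", gnu)
print("pax:", pax)
print("identical" if gnu == pax else "different")
```

Even with identical file contents and metadata, the two archives are not byte-identical, so a checksum taken over Tar output is only comparable between systems that write the same archive format.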
The right way depends on exactly why you're asking:

Option 1: Compare Data Only

If you just need a hash of the tree's file contents, this will do the trick:

    $ find -s somedir -type f -exec md5sum {} + | sort -z   # file hashes
    $ echo 'End of hashed data.'                            # End of input marker

Here's a minimally tested Python script that builds a hash describing a hierarchy of files. It takes directories and file contents into account, ignores symbolic links and other special files, and fails with a fatal error if any file can't be read.
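As a rough illustration of the behaviour described for the Python script (directories and regular files are hashed, symbolic links and other special files are skipped, and an unreadable file is fatal), a minimal Python 3 sketch might look like the following. The function names `file_hash`, `traverse`, and `tree_hash`, the choice of SHA-256, and the exact record layout fed into the hash are assumptions made for illustration, not necessarily what the original script does.

```python
#!/usr/bin/env python3
import hashlib
import os
import stat
import sys


def file_hash(path):
    """Return the SHA-256 of the file's contents as a hex string."""
    h = hashlib.sha256()
    with open(path, "rb") as f:  # raises OSError if the file can't be read
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def traverse(h, path):
    """Update hash object h with a description of path and its contents."""
    st = os.lstat(path)  # lstat, so symlinks are seen as links, not followed
    quoted = repr(path).encode("utf-8", "backslashreplace")
    if stat.S_ISDIR(st.st_mode):
        h.update(b"dir " + quoted + b"\n")
        # Sort entries so the result does not depend on directory order.
        for entry in sorted(os.listdir(path)):
            traverse(h, os.path.join(path, entry))
    elif stat.S_ISREG(st.st_mode):
        h.update(b"reg " + quoted + b" " + str(st.st_size).encode() + b" ")
        h.update(file_hash(path).encode() + b"\n")
    # Anything else (symlinks, devices, sockets, ...) is silently skipped.


def tree_hash(roots):
    """Return one hex digest describing all the given root paths."""
    h = hashlib.sha256()
    for root in roots:
        traverse(h, root)
    h.update(b"end\n")  # end-of-input marker
    return h.hexdigest()


if __name__ == "__main__":
    print(tree_hash(sys.argv[1:]))
```

Note that in this sketch the path names as given on the command line are part of the hashed data, so two copies of a tree only produce the same digest when they are addressed by the same relative path, for example by changing into each parent directory and running the script on `somedir`.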