Hoover Library & Archives Introduces New Open Source Software On GitHub

Monday, September 26, 2016

By Jim Sam


This week, the Hoover Institution Library & Archives announces the debut of its GitHub page. GitHub is the world’s leading repository of open-source software, and we are thrilled to be able to share some tools we developed for our audiovisual labs.

The first set of tools consists of utilities for creating and verifying indexed MD5 checksum files on a Macintosh computer. These checksum files are essential for confirming that our files have not changed, either during upload to our preservation server or while they reside on the server over time.

We made these because our preferred checksum app does not support Macintosh, and we were unable to find another app that could produce indexed checksum files. Using the Python programming language, which ships with every Mac, we were able to solve this problem.
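To give a rough idea of what such a utility involves, here is a minimal Python sketch. It is not the code published on the Hoover GitHub page. It assumes that an "indexed" checksum file means a single manifest with one line per file, holding the MD5 digest and the file's relative path. The function names md5_of_file, write_manifest, and verify_manifest are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root, manifest_path):
    """Write one 'digest  relative/path' line per file under root (assumed layout)."""
    root = Path(root)
    manifest_path = Path(manifest_path)
    lines = []
    for path in sorted(root.rglob("*")):
        # Skip directories and the manifest itself if it lives under root.
        if path.is_file() and path.resolve() != manifest_path.resolve():
            relative = path.relative_to(root).as_posix()
            lines.append(f"{md5_of_file(path)}  {relative}")
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def verify_manifest(root, manifest_path):
    """Return the relative paths that are missing or whose digest has changed."""
    root = Path(root)
    failures = []
    for line in Path(manifest_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, relative = line.split("  ", 1)
        target = root / relative
        if not target.is_file() or md5_of_file(target) != expected:
            failures.append(relative)
    return failures
```

In this sketch, write_manifest would run once before upload, and verify_manifest would run against the copy on the server, both right after transfer and at later intervals. An empty list from verify_manifest means every listed file still matches its recorded checksum.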

The second tool automates what audio engineers call a null test. It tells us whether two different files have the same audio content by taking advantage of the way audio signals of opposite polarity cancel each other: when one signal is inverted and summed with the other, identical audio leaves only silence. This has a variety of uses. For example, it confirms that when we embed metadata into the header of a file, the audio data itself remains intact.
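The idea behind a null test can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool published on the Hoover GitHub page: it handles only 16-bit PCM WAV files, it uses Python's standard wave module, and the names read_pcm16, null_residual, and audio_matches are hypothetical.

```python
import struct
import wave


def read_pcm16(path):
    """Return ((channels, sample_rate), samples) for a 16-bit PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        layout = (wav.getnchannels(), wav.getframerate())
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2
    # WAV sample data is little-endian; samples stay interleaved by channel.
    samples = struct.unpack("<%dh" % count, frames[: count * 2])
    return layout, samples


def null_residual(samples_a, samples_b):
    """Invert the polarity of b, sum it with a, and return the peak residual."""
    if len(samples_a) != len(samples_b):
        raise ValueError("sample counts differ, so the signals cannot null")
    return max((abs(a + (-b)) for a, b in zip(samples_a, samples_b)), default=0)


def audio_matches(path_a, path_b):
    """True when both files share a layout and their audio nulls to silence."""
    layout_a, samples_a = read_pcm16(path_a)
    layout_b, samples_b = read_pcm16(path_b)
    if layout_a != layout_b or len(samples_a) != len(samples_b):
        return False
    return null_residual(samples_a, samples_b) == 0
```

Because the wave module reads only the format and audio data chunks, two files that differ solely in embedded metadata should still null completely in this sketch, while a file-level checksum of the same two files would differ.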

For more on how these tools were built, be sure to catch Hoover archivist Jim Sam’s talk “Let the Robots Do the Work” at the International Association of Sound and Audiovisual Archives annual conference on Wednesday, September 28, 2016, in the Library of Congress’ Madison building.


Archivist Jim Sam is a specialist in sound recordings at Hoover Library & Archives.