Link to ETD Workflow Google doc:
https://docs.google.com/document/d/1Qs_uIWPjDfUIcQhxJiMlgvgN8XpexZn2CTs0eSGTOxo/edit
HathiTrust Support: feedback@issues.hathitrust.org
As a condition of our membership in the HathiTrust (HT) Partnership, we are required to submit records for our print holdings annually. HT requires that the holdings be separated into three files: Single Part Monographs, Multi-Part Monographs, and Serials. Information on submitting holdings data to HT is available at: https://www.hathitrust.org/print_holdings
Rick Leveille has the specifications for generating the files that will be sent to HT.
The Libraries contributes content digitized by the Open Content Alliance (Internet Archive) to HathiTrust (HT). HT already works with the Internet Archive (IA), which simplifies the process: we only need to provide HT with MARC records that carry the IA-specific data (the IA Identifier and the ARK Identifier) in the 955 tag. More information on submitting bibliographic records is available here: https://www.hathitrust.org/bib_specifications
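As an illustration of what a 955 field carrying the two identifiers might look like in MARCXML, here is a minimal Python sketch. It is not taken from the production script. The subfield codes (b for the IA Identifier, q for the ARK Identifier) are assumptions that should be checked against the bib specifications page above, and the identifier values in the usage line are made-up placeholders.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = "http://www.loc.gov/MARC21/slim"


def build_955(ia_identifier, ark, id_code="b", ark_code="q"):
    """Build a MARCXML 955 datafield holding the IA and ARK identifiers.

    ASSUMPTION: the default subfield codes ("b" and "q") are not stated in
    this document; verify them against the HT bib specifications.
    """
    field = ET.Element(
        f"{{{MARC_NS}}}datafield",
        {"tag": "955", "ind1": " ", "ind2": " "},
    )
    for code, value in ((id_code, ia_identifier), (ark_code, ark)):
        subfield = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        subfield.text = value
    return field


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    example = build_955("exampleid00univ", "ark:/13960/t00000000")
    print(ET.tostring(example, encoding="unicode"))
```

Keeping the subfield codes as parameters makes it easy to adjust the sketch if the specification calls for different codes.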
On the fcweb.library.umass.edu server, there is a PHP script (www/html/ht/get_IA_data2.php) that takes a file of comma-separated Aleph bib numbers and IA Identifiers and creates the appropriate MARCXML for submission to HathiTrust. It then FTPs the file to HT and sends the email notification that HT requires. It works fairly well, but occasionally encounters a problem that needs to be addressed.
The file of comma-separated values can be created using the Pick List that is returned from OCA. Talk to Lisa Persons about where the latest files are; they are typically located in W:\Open Content Alliance\Pick lists\Completed picklists. The first two columns of the Pick List should contain the bib number and the IA identifier. If there are errors, a column noting them will be inserted between the first and second columns. Delete the errors from the Pick List, then delete the empty column. Delete all other columns and the header, and save the file as “ialist_YYYYMMDD_[local identifying information].csv” (no spaces). Follow the steps below to complete the processing:
php get_IA_data2.php
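The Pick List cleanup described above is normally done by hand in a spreadsheet. As a rough sketch of the same steps in Python, the following drops the header, skips rows flagged in the error column, keeps only the bib number and IA identifier, and builds the output file name. The function names are hypothetical, and the sketch assumes that "deleting errors" means dropping the whole row for any entry with a note in the error column.

```python
import csv
import io
from datetime import date


def clean_pick_list(rows, has_header=True, error_column=False):
    """Return (bib_number, ia_identifier) pairs from raw Pick List rows.

    ASSUMPTION: when error_column is True, an extra column sits between the
    bib number and the IA identifier, and any text in it marks a bad row.
    """
    data = list(rows)[1:] if has_header else list(rows)
    ia_index = 2 if error_column else 1
    cleaned = []
    for row in data:
        if len(row) <= ia_index:
            continue  # blank or truncated row
        if error_column and row[1].strip():
            continue  # row carries an error note; leave it out
        bib, ia_id = row[0].strip(), row[ia_index].strip()
        if bib and ia_id:
            cleaned.append((bib, ia_id))
    return cleaned


def output_name(label, day=None):
    """Build ialist_YYYYMMDD_[label].csv with spaces removed from the label."""
    day = day or date.today()
    return f"ialist_{day:%Y%m%d}_{label.replace(' ', '')}.csv"


def write_pairs(pairs, handle):
    """Write the two-column CSV with no header row."""
    csv.writer(handle, lineterminator="\n").writerows(pairs)


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        ["Bib", "Error", "IA ID", "Title"],
        ["000123456", "", "exampleid00univ", "A title"],
        ["000123457", "missing pages", "otherid00univ", "Another title"],
    ]
    buffer = io.StringIO()
    write_pairs(clean_pick_list(sample, error_column=True), buffer)
    print(output_name("massdocs"))
    print(buffer.getvalue(), end="")
```

To process a real Pick List saved as CSV, the rows could be read with csv.reader and the result written to a file opened with newline="".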
For digitized content other than digitized theses and dissertations, you can also create the file of comma-separated values by printing the 856 field from Aleph in Aleph sequential format for the records that you want to upload to HT. This will provide you with the bib number and IA identifier, but you will need to massage the data to get it into the proper format. This has been useful for Massachusetts Documents records, because many of the print records never had an OCLC number.
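The data massaging mentioned above could be scripted along the following lines. This is only a sketch under stated assumptions: each Aleph sequential line starts with a 9-digit system number, followed by the tag in columns 11-13 and the field content from column 19 onward, and the 856 $$u holds an archive.org/details/ URL whose last segment is the IA identifier. The function name and the sample lines are hypothetical; check a real export before relying on the column positions.

```python
import re

# ASSUMPTION: the IA identifier is the path segment after archive.org/details/.
IA_URL = re.compile(r"archive\.org/details/([^/$\s?#]+)")


def pairs_from_aleph_seq(lines):
    """Extract unique (bib_number, ia_identifier) pairs from 856 lines.

    ASSUMPTION: fixed layout of system number (cols 1-9), tag (cols 11-13),
    indicators (cols 14-15), and content starting at col 19.
    """
    seen = set()
    pairs = []
    for line in lines:
        if len(line) < 18 or line[10:13] != "856":
            continue  # not an 856 field
        match = IA_URL.search(line[18:])
        if not match:
            continue  # 856 that does not point at the Internet Archive
        pair = (line[0:9], match.group(1))
        if pair not in seen:
            seen.add(pair)
            pairs.append(pair)
    return pairs


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "000123456 85641 L $$uhttp://www.archive.org/details/exampleid00univ$$zInternet Archive",
        "000123456 24510 L $$aA title",
        "000123457 8564  L $$uhttp://example.org/other",
    ]
    for bib, ia_id in pairs_from_aleph_seq(sample):
        print(f"{bib},{ia_id}")
```

The output lines follow the same bib number, IA identifier layout as the cleaned Pick List file.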