Saturday, 25 January 2014 09:00

Ohio Archaeological & Historical Society Publications


This series of links is for the publications, the full count of which is not known to this writer, released by today's Ohio Historical Society.  Originally published under the title Ohio Archæological and Historical Publications, in a series of 113 volumes, they contain the quarterly publications by year in book form.  At present the only volume on this site is the twentieth in the series, published in 1911.  The same material, though not in archival format, is available from the Publications web site of the Ohio Historical Society.  Through their efforts the various volumes are full-text searchable, an amazing collection not often fully utilized.  The full record set on their site differs from that hosted herein in one manner only: the material presented here displays the actual page, not the text of the page.

Image: Ohio Historical Society (captured image), OHS text / page

The other primary portal for accessing this invaluable record set is the Internet repository Archive.Org.  While usable in a roughshod fashion, it leaves a lot to be desired on many accounts.  First, it is not a true archive in the purest sense.  The images are little more than second or third generation photostatic copies, in most cases without the corresponding digital archive supporting the low quality searchable PDF files.  Usable, yes, after a disorganized, blurry fashion.  Desirable for material that should remain viable for future generations, no.  The image below is one of the better ones by far.

Second, because of the low quality imaging alluded to above, the inherent searchability of the PDF file is severely limited.  In most cases it is useless.  It is the thought of this writer that Archive.Org failed with their funding, and the time would be better spent on other more esoteric material.  This criticism is not directed solely at that fine organization, but more towards the firms that willy-nilly host the material on their servers.  Archive.Org needs to establish better protocols for the vetting process, including better standards, for the hosting of their "archive."

The Ohio Historical Society web site serves a far better purpose, other than the aforementioned lack of the imagery utilized in creating a true digital archive.

Now to the pages presented on this site, in brief.  A sacrificial volume, in this case that of 1911, was located and unbound to the signature level.  The signatures were each scanned at 400 pixels per inch in 24-bit color, and each single page was saved to a file.  This set of images was saved into a “Raw” directory after the overscan was removed and a path created.  As the final step for this set of truly archival images, metadata was applied to the record set.  In this instance metadata can be thought of as the old card catalogs those of a certain age remember at our local library.  Only instead of describing a book, it has been applied to the pages of that book.
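As a rough check on what these scan settings produce, the arithmetic can be sketched in a few lines of Python.  The page size of roughly 6.10 by 9.00 inches is taken from the metadata table further down; the function name is only illustrative and is not part of the workflow described here.

```python
def scan_dimensions(width_in, height_in, ppi, bits_per_pixel=24):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    # 24-bit color stores 3 bytes per pixel before any compression.
    uncompressed_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, uncompressed_bytes


# Approximate page size from the metadata table, scanned at 400 ppi.
width_px, height_px, size_bytes = scan_dimensions(6.10, 9.00, 400)
print(width_px, height_px, round(size_bytes / 1_000_000, 1))
# prints: 2440 3600 26.4
```

The result is close to the 2450 X 3585 pixel array recorded in the metadata, which is listed as varying from page to page, and suggests on the order of 26 MB per page before LZW compression.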

Images: Archive.Org image from PDF; Archive.Org image

From this master set a duplicate set was created and placed in a directory entitled “Enhanced”.  Using the aforementioned path, in essence the image sans anything not particular to the page itself, the page was tonally adjusted using several techniques particular to digital imaging.  With additional adjustments applied as well, the result is a series of images suitable for use on the Internet.  This series of actions was stored in a script, and that script was then applied to the record set.
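The tonal adjustment can be illustrated with a simple linear levels stretch.  This is only a sketch under assumptions: the black and white points per channel are those given in the Notes of the metadata table, but the actual script, the imaging software, and the additional color corrections used on this record set are not reproduced here, and the function names are hypothetical.

```python
def make_levels_table(black, white):
    """Build a 256-entry lookup table stretching [black, white] to [0, 255]."""
    span = white - black
    table = []
    for value in range(256):
        scaled = round((value - black) * 255 / span)
        # Values below the black point clip to 0, above the white point to 255.
        table.append(max(0, min(255, scaled)))
    return table


# Black and white points per channel, as listed in the metadata Notes.
LEVELS = {"red": (98, 233), "green": (99, 231), "blue": (77, 211)}
TABLES = {name: make_levels_table(b, w) for name, (b, w) in LEVELS.items()}


def adjust_pixel(rgb):
    """Apply the per-channel levels tables to one (r, g, b) pixel."""
    r, g, b = rgb
    return (TABLES["red"][r], TABLES["green"][g], TABLES["blue"][b])


print(adjust_pixel((98, 99, 77)))     # the black points map to (0, 0, 0)
print(adjust_pixel((233, 231, 211)))  # the white points map to (255, 255, 255)
```

Because the mapping is a lookup table, the same three tables can be applied to every page in the set, which is the kind of repeatable step that lends itself to a stored script.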

The “Enhanced” directory thus created then becomes the input for the online images.  A subtle drop shadow for aesthetic viewing was added and the images saved to a “PNG” directory.  This, again, was performed using a set of scripts, permitting an easier digital workflow.  This set of Internet images was then individually processed as to the total image dimensions.  These are the images viewable on this web site.

The metadata then becomes the input to a spreadsheet; this step is specific to Joomla, the CMS software used on this web site.  The spreadsheet was created with software specific to Joomla, as it permits ease of use in creating the over 500 pages to be built for the online record set.  In one section of this spreadsheet was placed the text of each page.  In essence the text, having been OCRed, was copied and pasted into a text file in which errors were corrected.  The resultant text was then fed back into the spreadsheet.  Why, you may ask?  This OCRed text becomes the basis of the search engine, both internal and external, that is used.  The final step for the presentation was the uploading of the spreadsheet to a program that constructs the individual pages.
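The spreadsheet step might be sketched as follows, with one row per page carrying a title, an image reference, and the corrected OCR text.  The column names, the file naming pattern, and the helper functions are all assumptions made for illustration; the layout expected by the actual Joomla import software is not described here.

```python
import csv
import io

# Hypothetical column layout; the real import tool may expect other names.
FIELDNAMES = ["title", "image", "fulltext"]


def build_page_rows(volume_title, page_texts):
    """Create one spreadsheet row per page from corrected OCR text."""
    rows = []
    for number, text in enumerate(page_texts, start=1):
        rows.append({
            "title": f"{volume_title}, page {number}",
            "image": f"PNG/page_{number:04d}.png",  # assumed naming pattern
            # Collapse line breaks and repeated spaces left over from OCR.
            "fulltext": " ".join(text.split()),
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_page_rows("Volume XX", ["Prehistoric  Earthworks\nin Wisconsin"])
print(rows_to_csv(rows))
```

Keeping the corrected text in its own column is what lets a search engine index each page individually, as described above.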

Images: BrethrenArchives.Com (two images)

As part of this overall process three PDFs were generated.  The first meets the standards set forth for PDF/A-1a.  Why this seemingly lower quality standard?  Because there is no text in the file; the pages are images!  They are not content generated by a modern computer word processing program.  This is a fallacy that Archive.Org, or more precisely the vendors crafting the online content for these invaluable old volumes, is perpetuating.  This master PDF is the basis for two additional PDFs.

One is a “Report PDF” file and the other is the OCR PDF.  The Report PDF contains all the particulars of the original images: the image color spectrum, the original metadata, in essence all the particulars of the files, as well as a thumbnail of each.  The OCR PDF file is, as the title describes, the images with Optical Character Recognition applied internally by Adobe Acrobat.  Again, this file is a PDF/A-1a standards compliant file.  Again, why this lower standard?  The file contains “images of antiquity” and was not created by a modern word processing or image editing program.  They are the same as the photographs in your shoe box under the bed.  The resultant OCRed file, after a careful examination of the embedded fonts, will show hundreds upon hundreds, and perhaps even thousands upon thousands, of “fonts.”  Not a one, likely, will be a recognized or standards compliant font of today.  This is another fallacy perpetuated by firms creating Archive.Org content when using the higher compliance standard PDF formats.

The final step of this mini-project, and the one perhaps more important to archivists, is the creation of archival-friendly media.  Yes, no media disk has yet been invented that meets the same standards as time-trusted microfilm.  However, when was the last time you had access to a film reader at your present space?  Not many people have access to one, and digital will slowly, inevitably take precedence over film.  You might state that I am on the cutting edge.  Though the media used in this project is purported by the manufacturer, Taiyo Yuden in this case, to exceed a 70-year life span, caution should always be used.  Thus the standards in place for all institutional procedures should be adhered to.  This is stated because it is hoped that whichever archive eventually takes possession of these records will have such protocols in place.

One last thing.  Earlier the concept of metadata was commented on; it is important to the record set created herein, as well as to all other record sets digitized by this writer.  Below are two tables demonstrating the differences between properly generated metadata and that of what has quickly become a de facto online repository, Archive.Org:

Metadata Comparison

Repository: Personal archives of A. Wayne Webb

Date Range: 1911

Object Metadata:

Author: A. Wayne Webb

Title: Ohio Archæological and Historical Society Publications: Volume XX

Description: Volume 20 of the series Ohio Archæological and Historical Society Publications. This particular volume, printed in 1911 by Frederick J. Herr (Columbus, Ohio, publisher), contains articles entitled as follows: Prehistoric Earthworks in Wisconsin, The Place of the Ohio Valley in America, A Vanishing Race, Some Local History, Delaware in the Days of 1812, Tarhe — The Crane, General Harmar's Expedition, Four Cycles: A Centennial Ode, Jefferson's Ordinance of 1784, Rufus Putnam Memorial Association, William Henry Rice — In Memoriam, The Bunch of Grapes Tavern, General Roeliff Brinkerhoff, Site of Fort Gower, The Ice Age in North America, The Wilderness Trail, Poems on Ohio, Logan — The Mingo Chief, 1710-1780: Draper Manuscripts, The Kendal Community, The Ohio River, Birthplace of Little Turtle, Recollections of Newark, Ohio, A Visit to Fort Ancient, Pipe's Cliff, The Cincinnati Municipal Elections of 1828, Oberlin's Part in the Slavery Conflict, Twenty-Sixth Annual Meeting of the Ohio State Archaeological and Historical Society, May 31, 1911, To Cincinnati, A Prophecy, General Roeliff Brinkerhoff, Celebration of the Surrender of General John H. Morgan, Early Steamboat Travel on the Ohio River, That Old Log House Where Used to be Our Farm, William H. West, The Battle of Lake Erie in Ballad and History, and Brady's Leap.

Record ID: Ohio Historical Society Publications

Record Group Descriptor: N/A

Series: Volume XX

Stored: Personal archives

Source Format: Paper – 6.10" X 9.00" (approx.)

Technical Metadata:

Operator: A. Wayne Webb

Scanner: Microtek 9800XL

Dynamic Range: 3.7

File Format: Tagged Image File Format (TIF)

Color Mode: 24-Bit Color

Spatial Resolution: 400 ppi

Image Quality: 2 (no obvious visible defects)

Scale: 100%

Gamma Correction: Adobe RGB (1998)

Color Calibration: Kodak IT8.7/2-1993

Author: A. Wayne Webb

Compression: LZW

Pixel Array: 2450 X 3585 (varies)

Image Creation Date: May 11, 2011

Informational Metadata:

Keywords: Ohio Archæological and Historical Society Publications: Volume 20; Volume XX; 1911; Ohio Archaeological and Historical Society Publications; periodical, quarterly

Notes:

Contains 505 images including book and metadata target.  Scanned at 400dpi gray scale with no tonal adjustments and saved with no adjustments or scaling.  Scanner used: Microtek 9800XL with 3.7 dynamic range.
Duplicate set of images tonally adjusted using selectively the Red channel at 98 (black) and 233 (white), Green channel at 99 (black) and 231 (white) and Blue channel at 77 (black) and 211 (white).  Additional color correction and level mode channel adjustments were used.
The scanner was calibrated using a Kodak Q–60 Color Input Target IT8.7/2-1993 calibration target.

Rights Usage Terms:

None.  All images and content are copyrighted 2014 by A. Wayne Webb of Millville, New Jersey.

Archive.Org metadata for the same volume:

Scanning Center: indiana

LCCN: 90643025

Shiptracking: LDS33002

Mediatype: texts

Page Progression: lr

Identifier: ohioarcheological20ohio

Scanner: scribe1.indiana.archive.org

Ppi: 500

Camera: Canon EOS 5D Mark II

Operator: volunteer-pamela-shipley@...

Scandate: 20130711193832

Republisher: volunteer-wayne-shipley@...

Imagecount: 510

Identifier-access: http://archive.org/details/ohioarchological20ohio

Identifier-ark: ark:/13960/t3zs4dr8f

Bookplateleaf: 0006

Ocr: ABBYY FineReader 8.0

As can be seen above, the two differ drastically in their descriptive approaches.  The one gives clearly the parameters used in the record set creation, while the other leaves a series of questions difficult to answer.
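One way to make the contrast concrete is to check each listing against a short list of capture parameters.  The checklist below is a hypothetical one chosen for illustration, not an archival standard, and the two records are abbreviated from the listings above.

```python
# Hypothetical checklist of capture parameters; not an official standard.
CAPTURE_FIELDS = [
    "scanner", "file format", "color mode",
    "spatial resolution", "color calibration", "compression",
]

# "Ppi" is treated as another name for spatial resolution.
ALIASES = {"ppi": "spatial resolution"}


def missing_capture_fields(record):
    """List the checklist fields that a metadata record does not state."""
    present = set()
    for key in record:
        name = key.strip().lower()
        present.add(ALIASES.get(name, name))
    return [field for field in CAPTURE_FIELDS if field not in present]


# Abbreviated from the two listings above.
webb_record = {
    "Scanner": "Microtek 9800XL", "File Format": "TIF",
    "Color Mode": "24-Bit Color", "Spatial Resolution": "400 ppi",
    "Color Calibration": "Kodak IT8.7/2-1993", "Compression": "LZW",
}
archive_org_record = {
    "Scanner": "scribe1.indiana.archive.org", "Ppi": "500",
    "Camera": "Canon EOS 5D Mark II", "Ocr": "ABBYY FineReader 8.0",
    "Imagecount": "510",
}

print(missing_capture_fields(webb_record))
# prints: []
print(missing_capture_fields(archive_org_record))
# prints: ['file format', 'color mode', 'color calibration', 'compression']
```

A different checklist would give a different tally, so the output is only as meaningful as the fields chosen for it.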

It matters not that this particular Archive.Org record set uses this series of descriptive phrases.  In other words, this is not an oddity.  The metadata applied to other old books is all the same, a banal series of meaningless words.

To further confound an archivist, or those interested in preserving these volumes for posterity, once the cropped images have been compiled into a PDF, the images (and yes, they are never tonally adjusted) are tossed into the digital trash-bin.  Some of the PDFs are of so low a quality as to appear as if a sheen of Vaseline were smeared atop them.

One final step, if applicable, for the books stored on this site.  During the scanning process, if a photographic image is deemed worthy enough, it is scanned at a resolution of no less than 1,200 ppi, and more often than not at the higher 2,400 ppi, and then carefully adjusted, removing the moiré pattern at the same time.  More often than not it is only in these old volumes that such images yet remain.

Cordially,

A. Wayne Webb

Digital Archivist

Image: Indian Scout

Volume XX (1911)
Last modified on Sunday, 02 February 2014 09:21
A. Wayne Webb

A long-time historian of the German Baptist Brethren church, and its more modern derivative bodies, Mr. Webb has moved on to become a recognized authority in digitally archiving manuscripts, both published works and singular documents.  He served as the Editor of Brethren Roots, 2002 to 2008, as published by The Fellowship of Brethren Genealogists.  He has also created and maintains a series of Internet web sites devoted to his passion, German Baptist Brethren history.
