Project Gutenberg: The First Digital Library
... the free - and first - digital library was born on July 4th, 1971, when Michael Hart finished typing the first e-book into a computer at the University of Illinois. Instead of sending the 5 KB file directly to the 100 or so users of the network, which risked overloading its fragile servers, Hart told them where the text was stored. There was no hyperlink to click yet, because the web was still about 20 years away. Thus was born the idea of Project Gutenberg (PG): a library holding every literary work in the public domain, as easy to reach as clicking a hyperlink is today.
"The idea is that whether they were avid readers of print or not in the past, people should easily be able to look up quotations they hear in conversation, movies, or music, or they read in books, newspapers and magazines, within a library containing all of these quotations in an easy to use format." - Marie Lebert
PG's digital library was original in both concept and process. The process Hart started is still used today, and human proofreading, its most distinctive feature, is applied to the works published on PG.
...Let's Look at the Steps Involved in Digitizing a Book...
1. Submit a Title
Today, an international team of volunteers starts by reviewing submitted titles for eligibility. They check whether the work is in the public domain and, if it is, digitization begins.
2. OCRization
Optical Character Recognition (OCR) software recognizes the printed characters in the scanned page images and converts them into an editable document.
3. Human Proofreading
After texts are OCRized, they undergo human proofreading - a unique aspect of PG's model. Two different volunteers check the OCR text against the scanned page image and fix any errors before publishing.
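One way to picture the value of a second proofreader is to compare the two corrected versions and flag every line where they disagree. The sketch below is only an illustration under that assumption: the function name `disagreements` and the idea of comparing the two passes line by line are hypothetical, not a description of PG's actual tooling.

```python
from itertools import zip_longest


def disagreements(pass_a, pass_b):
    """Return (line_number, text_a, text_b) for each line where two passes differ.

    Hypothetical helper: assumes each proofreader hands back a full corrected
    copy of the same page, with the same line breaks.
    """
    lines_a = pass_a.splitlines()
    lines_b = pass_b.splitlines()
    flagged = []
    # zip_longest pads the shorter version with "" so extra lines are flagged too.
    pairs = zip_longest(lines_a, lines_b, fillvalue="")
    for number, (text_a, text_b) in enumerate(pairs, start=1):
        if text_a != text_b:
            flagged.append((number, text_a, text_b))
    return flagged


# Made-up example: proofreader B missed an OCR error ("rn" read for "m").
first_pass = "When in the Course of human events\nit becomes necessary"
second_pass = "When in the Course of human events\nit becornes necessary"

for number, text_a, text_b in disagreements(first_pass, second_pass):
    print(f"line {number}: {text_a!r} vs {text_b!r}")
```

Any flagged line would then be checked against the scanned page image again before the text is published.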
These three steps produce an average conversion accuracy of 99.95% - a standard also used by the Library of Congress.
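To put the 99.95% figure in perspective, the short calculation below converts it into an expected number of wrong characters. It assumes the accuracy is measured per character, which the text above does not state, and the function name `expected_errors` and the sample sizes are made up for illustration.

```python
def expected_errors(char_count, accuracy=0.9995):
    """Expected number of wrong characters at a given per-character accuracy.

    Assumption: accuracy is counted per character, not per word or per page.
    """
    return char_count * (1.0 - accuracy)


# A 5 KB text is roughly 5,000 characters; a long novel might be about 500,000.
for size in (5_000, 500_000):
    print(f"{size} characters: about {expected_errors(size):.1f} errors expected")
```

Under that assumption, 99.95% accuracy means about 5 errors for every 10,000 characters.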
The proofread text is published as an EPUB file. Some advantages of the EPUB format...
- Extends the lifespan of the text
- Easy to share and access
- Content is indexed for search
- Reflowable - the text layout adapts to the output device
- Readers can highlight, annotate, and comment
What about the Internet Archive?
To Gutenberg, or to Archive?
That is the question...