Project Gutenberg welcomes contributions of eBooks from people with
the interest, time, and skillset needed to meet our submission
standards. Details of the process and the standards are on our
copyright clearance site (copy.pglaf.org) and our upload site.
Join Distributed Proofreaders, Instead
For most people interested in producing eBooks, we recommend starting
with Distributed Proofreaders (https://www.pgdp.net). With
Distributed Proofreaders, you can get involved with different portions
of the production pipeline described below. This is a much easier way
to get started, and results in very high quality eBooks.
If you only want to suggest a book for digitization, DP has online
forums for this, or you can send an email (contact information
is on the site).
Distributed Proofreaders maintains canonical guidance on production.
Being a Solo Producer
If you are interested in producing an eBook yourself, without involving
Distributed Proofreaders, here is some guidance. Start with the material above,
though, including the DP links.
In a nutshell, the production process typically involves the following:
- Identify a candidate printed book. Confirm it is not already in the
collection, or in process by other volunteers. Use the Collection
Development Policy to guide you on eligibility.
- Obtain a copyright clearance for the printed book. Usually this is
based on scanned title page and verso page demonstrating the printed
book was published more than 95 years ago. See the Copyright
- Obtain scans of the book. This may be done using your own scanner,
or there might be online scans available for reuse. Scans
must come from the exact same print edition as your copyright
- Perform optical character recognition (OCR) on the scans, to make an
approximate representation of the book in plain text.
- Proofread, proofread, proofread: Correct the OCR output by carefully
fixing any errors it introduced. Remove page headers and
footers. Rejoin words that were hyphenated across line breaks. Restore
italics and other formatting.
- Format: Generate valid and well-formed HTML source. Different tools
are available for this, and the work usually involves editing the HTML
source code directly. Note that many tools produce convoluted,
non-standard, or invalid HTML, which can be very difficult to clean up
for Project Gutenberg: poor HTML is not accepted, even if it is valid.
- Check, and recheck. The upload site has various tools, including
tools to test proper conversion to derived formats.
- Upload your work, using the copyright clearance key generated
- Coordinate with the Project Gutenberg production volunteers (known
as “whitewashers,” after the Mark Twain book) on final formatting and
- Once the eBook is added to the Project Gutenberg collection, confirm
it is appearing correctly, and all metadata are correct.
- If possible, stay in touch into the future. If we receive errata
reports that require access to source material, or are stylistic or
subjective in nature, we might get in touch to discuss potential
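
As a rough illustration of the cleanup described in the proofreading step above (removing page headers and footers, and de-hyphenating), here is a minimal Python sketch. The function names and the sample running head are hypothetical and are not part of any Project Gutenberg or Distributed Proofreaders tool. The sketch assumes the OCR output is plain text, one page at a time, and that you already know the running head text for the book.

```python
import re


def strip_running_heads(page: str, running_heads: set) -> str:
    """Drop lines that are bare page numbers or known running heads.

    `running_heads` is an assumption of this sketch: the caller supplies
    the exact header/footer strings printed on each page of the book.
    """
    kept = []
    for line in page.splitlines():
        bare = line.strip()
        # Ignore a page number printed before or after the running head.
        core = re.sub(r"^\d+\s+|\s+\d+$", "", bare)
        if bare.isdigit() or core in running_heads:
            continue
        kept.append(line)
    return "\n".join(kept)


def dehyphenate(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line.

    This is deliberately naive: it also joins words that are genuinely
    hyphenated (for example "well-" / "known"), so the result still
    needs a human proofreading pass.
    """
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


if __name__ == "__main__":
    # Made-up sample page, for illustration only.
    page = "12 THE ADVENTURES\nIt was an exam-\nple of text.\n13"
    cleaned = dehyphenate(strip_running_heads(page, {"THE ADVENTURES"}))
    print(cleaned)
```

A script like this only reduces the mechanical work. It cannot restore italics or judge ambiguous hyphens, so careful comparison against the page scans is still needed.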