Distributed Proofreaders
Distributed Proofreaders (commonly abbreviated as DP or PGDP) is a web-based project that supports the development of e-texts for Project Gutenberg by allowing many volunteers to work together in proofreading drafts of e-texts. The site had digitized 49,000 titles as of the date of the cited source.<ref>Template:Cite web</ref>
History
Distributed Proofreaders was founded by Charles Franks in 2000 as an independent site to assist Project Gutenberg.<ref name=":0">Template:Cite book</ref><ref name="brit1">Template:Cite web</ref> Distributed Proofreaders became an official Project Gutenberg site in 2002.<ref name=brit1 />
On 8 November 2002, Distributed Proofreaders was slashdotted,<ref>Template:Cite web</ref><ref>Template:Cite web</ref> and more than 4,000 new members joined in one day. This influx of new proofreaders and software developers helped to increase the quantity and quality of e-text production.
In 2006, the Distributed Proofreaders Foundation was formed to provide Distributed Proofreaders with its own legal entity and not-for-profit status, separate from Project Gutenberg.<ref name=":1">Template:Cite web</ref><ref name="globemail">Template:Cite web</ref> The founding trustees were Charles Franks, Juliet Sutherland, and Gregory B. Newby.<ref name=":1" />
DP-contributed e-texts comprised more than half of the works in Project Gutenberg by 2009.<ref name="brit1" /> In July 2015, the 30,000th e-text produced by Distributed Proofreaders was posted to Project Gutenberg.
Proofreading process
DP servers are located in the United States, and therefore works must be cleared by Project Gutenberg as being in the public domain according to United States copyright law before they can be proofread and eventually published.<ref name=":2">Template:Cite journal</ref>
Public domain works, typically books with expired copyright, are scanned by volunteers or sourced from digitization projects, and the images are run through optical character recognition (OCR) software.<ref name=":2" /> Since OCR software is far from perfect, the resulting text always includes errors.<ref>Template:Cite book</ref> To correct them, pages are made available to volunteers via the Internet; the original page image and the recognized text appear side by side.<ref>Template:Cite conference</ref> Each page is presented to multiple volunteers, who enter corrections, and their work is combined into a dataset that minimizes errors.<ref>Template:Cite book</ref> This approach spreads the time-consuming work of error correction across many people, in a manner akin to distributed computing.<ref name=":0" />
A post-processor combines the pages and prepares the text for uploading to Project Gutenberg.<ref name=":2" />
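The idea of combining several volunteers' corrections and then assembling the pages can be illustrated with a minimal sketch. This is not DP's actual software, and the site's real workflow may differ; the per-line majority vote, the function names (`merge_line`, `merge_page`, `assemble_book`), and the sample text are all assumptions made for illustration.

```python
from collections import Counter


def merge_line(candidates: list[str]) -> str:
    """Return the most common version of one line.

    Ties are broken in favour of the version submitted first.
    """
    counts = Counter(candidates)
    best = max(counts.values())
    return next(text for text in candidates if counts[text] == best)


def merge_page(versions: list[list[str]]) -> list[str]:
    """Merge several volunteers' versions of one page, line by line.

    Assumes every version has the same number of lines.
    """
    return [merge_line(list(lines)) for lines in zip(*versions)]


def assemble_book(pages: dict[int, list[str]]) -> str:
    """Join merged pages into one text, in page-number order."""
    return "\n".join("\n".join(pages[n]) for n in sorted(pages))


# Hypothetical example: three volunteers correct the same OCR'd page.
volunteer_versions = [
    ["The quick brown fox", "jumps ovcr the lazy dog"],
    ["The quick brown fox", "jumps over the lazy dog"],
    ["Tbe quick brown fox", "jumps over the lazy dog"],
]
merged = merge_page(volunteer_versions)
print(merged)
```

In this sketch, an error that survives in one volunteer's version is outvoted by the other two, which is one simple way that redundancy can reduce errors.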
Besides custom software created to support the project, DP also runs a forum and a wiki for project coordinators and participants.
Related projects
DP Europe
Distributed Proofreaders Europe started in January 2004, hosted by Project Rastko in Serbia.<ref>Template:Cite web</ref> This site had the ability to process text in Unicode UTF-8 encoding. The books proofread there centered on European culture, with a considerable proportion of non-English texts, including Hebrew, Arabic, Urdu, and many others. DP Europe produced 787 e-texts, the last of them in November 2011.
DP Canada
In December 2007, Distributed Proofreaders Canada launched to support the production of e-books for Project Gutenberg Canada and to take advantage of shorter Canadian copyright terms. Although it was established by members of the original Distributed Proofreaders site, it is a separate entity.<ref>Template:Cite web</ref> All its projects are posted to Faded Page, its book archive website. In addition, it supplies books to Project Gutenberg Canada and, where copyright laws are compatible, to the original Project Gutenberg.
Milestones
The source for many of these entries is the DP Timeline.<ref>Template:Cite web</ref>
| Milestone | Date | e-text |
|---|---|---|
| First | 1 Oct 2000 | The Odyssey, Homer, Lang tr. (first pages for proofreading) |
| 1,000th | 19 Feb 2003 | Tales of St. Austin's, P. G. Wodehouse |
| 2,000th | 3 Sep 2003 | Hamlet — the 'Bad Quarto', William Shakespeare |
| 3,000th | 14 Jan 2004 | The Anatomy of Melancholy, Robert Burton |
| 4,000th | 6 Apr 2004 | Aventures du Capitaine Hatteras, Jules Verne |
| 5,000th | 24 Aug 2004 | A Short Biographical Dictionary of English Literature, John William Cousin |
| 10,000th | 9 Mar 2007 | (See 10,000th E-book below) |
| 15,000th | 12 May 2009 | Philosophical Transactions of the Royal Society - Vol 1 - 1666, Various. Henry Oldenburg (editor) |
| 20,000th | 10 Apr 2011 | (See 20,000th E-book below) |
| 25,000th | 10 Apr 2013 | The Art and Practice of Silver Printing, H. P. Robinson and Capt. Abney<ref>Template:Cite web</ref> |
| 30,000th | 7 July 2015 | Graded Literature Readers: Fourth Book<ref>Template:Cite web</ref> |
| 35,000th | 26 Jan 2018 | Shores of the Polar Sea, a Narrative of the Arctic Expedition of 1875–1876<ref>Template:Cite web</ref> |
| 40,000th | 10 Oct 2020 | All four volumes of London Labour and the London Poor<ref>Template:Cite web</ref> |
| 45,000th | 18 Jan 2023 | Down the Mackenzie and Up the Yukon in 1906, Elihu Stewart<ref>Template:Cite web</ref> |
10,000th E-book
On 9 March 2007, Distributed Proofreaders announced the completion of more than 10,000 titles. In celebration, a collection of fifteen titles was published:
- Slave Narratives, Oklahoma (A Folk History of Slavery in the United States From Interviews with Former Slaves) by the U.S. Work Projects Administration (English)
- Eighth annual report of the Bureau of ethnology. (1891 N 08 / 1886–1887) edited by John Wesley Powell (English)
- R. Caldecott's First Collection of Pictures and Songs by Randolph Caldecott [Illustrator] (English)
- Como atravessei Àfrica (Volume II) by Serpa Pinto (Portuguese)
- Triplanetary by E. E. "Doc" Smith (English)
- Heidi by Johanna Spyri (English)
- Heimatlos by Johanna Spyri (German)
- October 27, 1920 issue of Punch (English)
- Sylva, or, A Discourse of Forest-Trees by John Evelyn (English)
- Encyclopedia of Needlework by Therese de Dillmont (English)
- The annals of the Cakchiquels by Francisco Ernantez Arana (fl. 1582), translated and edited by Daniel G. Brinton (1837–1899) (English with Central American Indian)
- The Shanty Book, Part I, Sailor Shanties (1921) by Richard Runciman Terry (1864–1938) (English)
- Le marchand de Venise by William Shakespeare, translated by François Guizot (French)
- Agriculture for beginners, Rev. ed. by Charles William Burkett (English)
- Species Plantarum (Part 1) by Carl Linnaeus (Carl von Linné) (Latin)
20,000th E-book
On 10 April 2011, the 20,000th book milestone was celebrated as a group release of bilingual books:<ref>Distributed Proofreaders celebrates 20,000 books posted, Distributed Proofreaders, 10 April 2011</ref>
- The Renaissance in Italy–Italian Literature, Vol 1, John Addington Symonds (English with Italian)
- Märchen und Erzählungen für Anfänger; erster Teil, H. A. Guerber (German with English)
- Gedichte und Sprüche, Walther von der Vogelweide (Middle High German (Template:Circa–1500) with German)
- Studien und Plaudereien im Vaterland, Sigmon Martin Stern (German with English)
- Caos del Triperuno, Teofilo Folengo (Italian with Latin)
- Niederländische Volkslieder, Hoffmann von Fallersleben (German with Dutch)
- A "San Francisco", Salvatore Di Giacomo (Italian with Neapolitan)
- O' voto, Salvatore Di Giacomo (Italian with Neapolitan)
- De Latino sine Flexione & Principio de Permanentia, Giuseppe Peano (1858–1932) (Latin with Latino sine Flexione)
- Cappiddazzu paga tuttu, Nino Martoglio and Luigi Pirandello (Italian with Sicilian)
- The International Auxiliary Language Esperanto, George Cox (English with Esperanto)
- Lusitania: canti popolari portoghesi, Ettore Toci (Italian with French)
30,000th E-book
On 7 July 2015, the 30,000th book milestone was celebrated with a group of thirty texts. One was numbered 30,000:<ref>Template:Cite web</ref>
- Graded Literature Readers: Fourth Book, edited by Harry Pratt Judson and Ida C. Bender (1900)
40,000th E-book
On 10 October 2020, the 40,000th book milestone was celebrated with the completion of a four-volume work, London Labour and the London Poor, by Henry Mayhew.<ref>Template:Cite web</ref>