E-Print or E-Text?
The golden future of Open Access (OA) eprint archives still leaves a lot of questions unanswered.
Which is the best data format for archiving articles?
(Most people believe: PDF)
Should repositories accept every possible version of an article that has also been published in print?
Possible types:
* author's text file given to the editors (before or after peer review or without peer review)
* author's text file with indication of printed pagination
* publisher's PDF
* publisher's PDF with additional materials (corrections, additions, abstract)
* author's text file with such additional materials
* scanned printed publication (images) without e-text
* scanned printed publication (images) with OCR-text in one PDF (cf. "Paper Capture")
* scanned printed publication (images) and separate e-text (OCR'd or author's file)
* scanned printed publication with author's manuscript notes
and so on ...
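The types listed above mostly combine a few independent properties: where the file comes from, whether it carries searchable e-text, whether the printed pagination is recoverable, and which extras are attached. A minimal sketch of that idea follows. The names `Source`, `DepositedVersion` and `describe` are hypothetical and chosen for illustration only; they are not taken from any repository software.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Source(Enum):
    """Where the deposited file comes from (hypothetical categories)."""
    AUTHOR_FILE = "author's text file"
    PUBLISHER_PDF = "publisher's PDF"
    SCAN = "scanned printed publication"


@dataclass(frozen=True)
class DepositedVersion:
    source: Source
    has_etext: bool               # searchable text is present
    original_pagination: bool     # printed page breaks can be recovered
    extras: Tuple[str, ...] = ()  # e.g. ("corrections", "abstract")

    def describe(self) -> str:
        """Return a one-line, human-readable summary of this version."""
        parts = [self.source.value]
        parts.append("with e-text" if self.has_etext else "without e-text")
        parts.append(
            "original pagination"
            if self.original_pagination
            else "no original pagination"
        )
        if self.extras:
            parts.append("plus " + ", ".join(self.extras))
        return ", ".join(parts)


# A scan deposited as page images only, as in one of the list items above.
scan_only = DepositedVersion(Source.SCAN, has_etext=False, original_pagination=True)
print(scan_only.describe())
```

Describing a deposit this way makes it easy to see that "scan with OCR text in one PDF" and "scan without e-text" differ in a single property, which may help when comparing what different repositories accept.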
Which is the best solution for each academic discipline?
If there is no permission to use the publisher's PDF, or no such PDF exists, should the author archive his own text file or a scanned version of the printed text?
Do repositories accept PDFs with a text layer behind the page image? See
http://www.dclab.com/pdfconversion3.asp
http://www.experts-exchange.com/Web/Graphics/Adobe_Acrobat/Q_21089485.html
http://www.designer-info.com/master.htm?http://www.designer-info.com/Writing/paper_to_pdf.htm
(This seems the simplest solution if one wants both image and e-text: good OCR software such as ABBYY FineReader can generate such PDFs.)
I suppose
http://dlc.dlib.indiana.edu/archive/00001319/
is such a PDF.
A little sample section
Samples from E-LIS (all PDFs)
Mixed collection (publisher's PDFs and other)
http://eprints.rclis.org/archive/00003487/
No original pagination (printed on pp. 47-52 = 6 pp., PDF: 8 pp.)
and two others from the same source
http://eprints.rclis.org/archive/00003506/
No original pagination
http://eprints.rclis.org/archive/00003143/
No original pagination
http://eprints.rclis.org/archive/00002436/
(1978)
No original pagination
Samples from Oxford Eprints (all PDFs)
Mostly publisher's PDFs (esp. journals from Oxford U Press!)
http://eprints.ouls.ox.ac.uk/archive/00000895/
http://eprints.ouls.ox.ac.uk/archive/00000899/
Scanned articles without E-text
Samples from UQ (AU), mostly PDFs
http://eprint.uq.edu.au/archive/00001728/
Scanned article without E-text
http://eprint.uq.edu.au/archive/00002065/
Preprint without original pagination?
http://eprint.uq.edu.au/archive/00001951/
No original pagination
http://eprint.uq.edu.au/archive/00000774/
Preprint, not revised
http://eprint.uq.edu.au/archive/00000756/
HTML
Samples from QUT (AU), all PDFs
http://eprints.qut.edu.au/archive/00000787/
http://eprints.qut.edu.au/archive/00000729/
No original pagination
Samples from PhilSci Archive
http://philsci-archive.pitt.edu/archive/00002213/
Word format; what sort of paper?
http://philsci-archive.pitt.edu/archive/00001798/
Preprint, PDF, publication data only in the PDF
http://philsci-archive.pitt.edu/archive/00001333/
HTML
Samples from the Digital Library of the Commons
http://dlc.dlib.indiana.edu/archive/00001319/
Scanned article with E-text (PDF)
Samples from NELLCO
http://lsr.nellco.org/suffolk/fp/papers/9/
No original pagination, PDF
http://lsr.nellco.org/upenn/wps/papers/42/
Scanned article without E-text and abstract (E-Text) as preface, PDF
http://lsr.nellco.org/upenn/wps/papers/39/
Same case as the previous item; the source of publication (Hofstra Law Review) is not indicated
Samples from UC, Postprint section
http://repositories.cdlib.org/postprints/276/
"Preprint" in the Postprint section, PDF with abstract as preface, printed text without original pagination
http://repositories.cdlib.org/postprints/260/
Same case, but with original pagination
http://repositories.cdlib.org/postprints/164/
Publisher's preprint PDF without pagination