Prior to offering transcription of scanned printed text as a service, I built up some experience with Project Gutenberg. I didn’t really have any experience of transcription from handwritten texts, but when offered the chance to try, I gave it a go. Some thoughts on the process follow.
With scanned printing, there are really three stages: OCR (Optical Character Recognition), then proofreading, then final formatting. OCR happens by computer, and is pretty fast – it just requires the right software. Proofreading – making sure the text has been captured accurately, getting rid of spurious bits and pieces – is slower. The final formatting, once you have a good text, is a quick pass through. All of these stages are really present for handwritten texts too, but the OCR has to be done by eye! (If there’s software that can reliably read handwriting, I’d like to know about it!)
Proofreading is quite slow – especially when there are things like names of people or places that may have been obvious to the writer, but aren’t so obvious without their mental context. It helps to have some sort of overview of the whole document, as the same names may crop up elsewhere.
The final format will depend on what is to be done with the document, but once the text is in place, it’s easy enough for it to be bashed into any required page or file format. The task of initially transcribing from handwriting was shared between people, and it was interesting how much extra work was required simply to get the different extracts back into the same format. Note to self: make sure this is defined properly in advance next time!
What was most interesting was the sense of personal involvement in people’s stories. We were transcribing a kind of visitors’ book. To follow small elements of the family history over the years was a surprisingly touching experience.