by Stommel, Martin and Frieder, Gideon
Abstract:
Document enhancement tools are a valuable help in the study of historic documents. Given proper filter settings, many effects that impair the legibility can be evened out (e.g. washed out ink, stained and yellowed paper). However, because of differing authors, languages, handwriting styles, fonts and paper conditions, no single parameter set fits all documents. Therefore, the parameters are usually tuned to each individual document in a time-consuming manual process. To simplify this procedure, this paper introduces a classifier for the legibility of an enhanced historic text document. Experiments on the binarisation of a set of documents from 1938 to 1946 show that the classifier can be used to automatically derive robust filter settings for a variety of documents.
Reference:
Stommel, Martin and Frieder, Gideon, "Automatic Estimation of the Legibility of Binarised Mixed Handwritten and Typed Documents", Technical report, TZI Universität Bremen, no. 56, 2010.
Bibtex Entry:
@TECHREPORT{Stommel2010b,
author = {Stommel, Martin and Frieder, Gideon},
title = {Automatic Estimation of the Legibility of Binarised Mixed Handwritten
and Typed Documents},
institution = {TZI Universit{\"a}t Bremen},
year = {2010},
number = {56},
abstract = {Document enhancement tools are a valuable help in the study of historic
documents. Given proper filter settings, many effects that impair
the legibility can be evened out (e.g. washed out ink, stained and
yellowed paper). However, because of differing authors, languages,
handwriting styles, fonts and paper conditions, no single parameter set
fits all documents. Therefore, the parameters are usually tuned
to each individual document in a time-consuming manual process.
To simplify this procedure, this paper introduces a classifier for
the legibility of an enhanced historic text document. Experiments
on the binarisation of a set of documents from 1938 to 1946 show
that the classifier can be used to automatically derive robust filter
settings for a variety of documents.},
series = {TZI-Berichte},
timestamp = {2012.11.06}
}