[gs-devel] how to extract text correctly from pdf when kerning happens
chen bin
chenbin.sh at gmail.com
Thu Aug 2 17:23:57 PDT 2007
I am trying to extract text from PDF files. My algorithm for combining
characters into words is simple and works well in most situations.
I use one tenth of the height of the smaller character as the threshold
to judge whether two characters are close enough to be in one word.
My algorithm fails when kerning is applied.
For example:
"$1 EACH" becomes "$1EACH" in my algorithm.
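
The gap test described above can be sketched roughly as follows. This is
only an illustration, not the original implementation: the Glyph record,
the layout helper, and every coordinate are made up for the example. It
assumes glyphs arrive as bounding boxes on a single line and that no
glyph is emitted for the space itself.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x0: float      # left edge of the bounding box (hypothetical units)
    x1: float      # right edge of the bounding box
    height: float  # bounding-box height


def join_glyphs(glyphs, ratio=0.1):
    """Join glyphs on one line; insert a space where the gap is too wide.

    Two neighbours stay in the same word when the horizontal gap between
    them is at most `ratio` times the height of the smaller glyph.
    """
    out = []
    prev = None
    for g in sorted(glyphs, key=lambda g: g.x0):
        if prev is not None:
            gap = g.x0 - prev.x1
            threshold = ratio * min(prev.height, g.height)
            if gap > threshold:
                out.append(" ")
        out.append(g.char)
        prev = g
    return "".join(out)


def layout(pairs, width=5.0, height=10.0):
    """Hypothetical helper: place glyphs left to right.

    `pairs` is a list of (char, gap_before) tuples; all sizes are invented.
    """
    glyphs, x = [], 0.0
    for char, gap_before in pairs:
        x += gap_before
        glyphs.append(Glyph(char, x, x + width, height))
        x += width
    return glyphs


# Word gap of 3.0 units: wider than the 1.0 threshold, so a space appears.
loose = layout([("$", 0.0), ("1", 0.2), ("E", 3.0),
                ("A", 0.2), ("C", 0.2), ("H", 0.2)])
# Same text, but the word gap has shrunk to 0.8 units (assumed kerning).
tight = layout([("$", 0.0), ("1", 0.2), ("E", 0.8),
                ("A", 0.2), ("C", 0.2), ("H", 0.2)])

print(join_glyphs(loose))  # $1 EACH
print(join_glyphs(tight))  # $1EACH
```

In the second case the gap between "1" and "E" falls below one tenth of
the glyph height, so the two words are merged. This reproduces the
failure; it does not detect kerning.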
Is there any way to tell whether kerning has been applied when
extracting characters from a PDF?
Regards
Chen Bin
--
help me, help you.