Log of #ghostscript at irc.freenode.net.

20190715 
kens StephDesc, so you are talking about Windows, the 'Ghostscript printer driver' isn't a real driver. It's a PPD file (PostScript Printer Description), the actual driver is supplied by Microsoft. The output is pretty much exactly the same as you would get by printing from any Windows application to a PostScript printer.06:58.11 
  The reason we supply it is pretty much the same reason for the 'press return to continue' prompt, instead of 'press any key'. By supplying a PPD file we can tell people to use that for creating PostScript files to send to Ghostscript rather than saying 'use any PostScript printer driver'06:59.17 
  That being the case, we can't help you with the PostScript it produces, other than saying 'talk to Microsoft'06:59.56 
  Having said that....07:00.02 
  If you start from a Windows application, and use the Microsoft print system to create the PostScript, the printer driver does embed an extension (which we support) which allows for text to be labelled with its Unicode code point.07:00.56 
  I suspect your problem is that the application you are printing from does **NOT** use that API. What it does is generate the PostScript itself and embed it in the output from the driver using the 'PostScript pass-through' feature of the device driver.07:01.45 
  I would guess you are printing from Acrobat.07:01.58 
  So, the problem is not us, it's either Microsoft or, more likely, Adobe's products which are causing your problem.07:02.24 
  To re-iterate what Robin said yesterday; if you have a PDF file from which you can extract the text, then use that.07:02.45 
  Perhaps if you were to explain what you are actually trying to achieve (and why) we might be able to offer some suggestions. As it is, in the absence of any clue what you are trying to achieve, and with no file to look at, all I can tell you is 'not us guv'07:03.34 
  Oh, and it's nothing to do with fonts.07:04.06 
StephDesc Kens, Thanks for your explanation, and Yes, I'm going to process the PDF file directly. But just to understand well, you told me that PostScript does not support ToUnicode CMap information, so even if Acrobat Reader or the MS driver were ok, the text could still not be extracted, no?07:37.22 
  As the PDF file uses CID fonts..07:37.57 
kens It's not really to do with fonts07:38.10 
  The PostScript language has no means (as standard) to associate any given glyph with a specific character code in a known encoding (such as Unicode)07:38.49 
  PostScript is designed for printing, nothing else, so you don't care what the character is, as long as it's drawn correctly.07:39.09 
StephDesc ok, thanks I know what you mean07:39.32 
kens It so happens that, in general, programmers have used ASCII for Latin text, but it's far from guaranteed07:39.37 
StephDesc Thanks again for all this information07:39.49 
kens For non-Latin text there is no single simple standard07:39.51 
  Now the Microsoft PostScript driver embeds extra information in the font dictionaries it creates07:40.18 
  non-standard info, but it causes no harm07:40.25 
  A consumer which understands that information (and Ghostscript is one such, as is Acrobat Distiller) can use that information to attach a Unicode value to each glyph code.07:40.58 
StephDesc yes, I read some articles about that..07:41.40 
kens So if (say) Microsoft Word printed some Devanagari text, then the extra info there would let us find that in the PostScript and create a PDF file with a ToUnicode CMap and you would be able to extract the text07:41.41 
  But, if the application creates the PostScript itself, and injects it into the output of the MS driver, using the PostScript pass-through mechanism, then that information is not present07:42.17 
  So we don't have any information to use. If it so happens that the text is encoded with ASCII, or Identity, then we can still extract the information and use it (there are actually another couple of cases, but that's a bit obscure), but in the general case, the information is gone at that point07:43.27 
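A minimal sketch of the point above, not Ghostscript's actual code: text extraction is a lookup from character code to Unicode, and without a table the only fallback is to guess that the codes are ASCII. The function `extract_text`, the sample codes, and the table are all hypothetical, chosen only to illustrate the two cases.

```python
from typing import Dict, Optional


def extract_text(codes: bytes, to_unicode: Optional[Dict[int, str]] = None) -> str:
    """Turn character codes into text.

    With a code-to-Unicode table (the role a ToUnicode CMap plays in a PDF
    file) the lookup is exact. Without one, we can only guess that the codes
    are ASCII, which works only if the producer happened to use ASCII.
    """
    if to_unicode is not None:
        # Unknown codes become U+FFFD (replacement character).
        return "".join(to_unicode.get(c, "\ufffd") for c in codes)
    # No table: assume printable ASCII, mark everything else as unknown.
    return "".join(chr(c) if 0x20 <= c < 0x7F else "\ufffd" for c in codes)


# Hypothetical font whose codes were assigned in order of first use.
codes = bytes([1, 2, 3, 3, 4])
table = {1: "h", 2: "e", 3: "l", 4: "o"}

print(extract_text(codes, table))  # the table recovers the text
print(extract_text(codes))         # no table: nothing usable comes back
print(extract_text(b"hello"))      # no table, but the codes happen to be ASCII
```

The glyphs draw identically in all three cases; only the extra table decides whether the text can be recovered afterwards.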
  This is essentially what Robin meant when he said don't do any more conversions than you absolutely need to.07:43.48 
StephDesc I understand... he's totally right07:44.13 
  Thanks Kens07:45.23 
kens You're welcome, it's an unfortunately complicated situation07:45.38 
StephDesc hum.. Yes! ;-)07:45.53 
kens It's basically because PostScript predates Unicode, or its widespread adoption at least, and neither PDF nor PostScript was ever intended to be editable :-)07:46.50 
voices is ghostscript a language19:42.50 
  i just wrote (drew?) my first postscript diagram.19:43.38 
  a square 19:43.57 
  with a hello world caption in times roman19:45.00 
  the syntax is quite.. logical19:46.33 
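For readers who have not seen PostScript, here is a rough sketch of what such a diagram might look like. It is not the file voices wrote, which is not in the log. The helper `square_with_caption`, the coordinates, and the 14 point size are all assumptions. It is written in Python so that it emits the PostScript as text, which could then be saved to a file and opened with Ghostscript.

```python
def square_with_caption(x=100, y=100, side=200, caption="hello world"):
    """Return a PostScript program that draws a square with a caption below it."""
    # Backslash and parentheses are special inside a PostScript string literal.
    safe = caption.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    lines = [
        "%!PS",
        "newpath",
        f"{x} {y} moveto",            # bottom-left corner, in points
        f"{side} 0 rlineto",          # bottom edge
        f"0 {side} rlineto",          # right edge
        f"{-side} 0 rlineto",         # top edge
        "closepath",                  # left edge back to the start
        "stroke",
        "/Times-Roman findfont 14 scalefont setfont",
        f"{x} {y - 20} moveto",       # caption baseline, below the square
        f"({safe}) show",
        "showpage",
    ]
    return "\n".join(lines) + "\n"


print(square_with_caption())
```

Operands come first and the operator last (`100 100 moveto`), which is the stack-based style the "logical" remark may be referring to.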