IRC Logs

Log of #ghostscript at irc.freenode.net.

2014/12/10
kubilayrd hey all13:34.28 
  I'm using ghostscript to convert a PDF file to images per each page13:34.45 
  though I've been having troubles about the image quality13:35.02 
  so I switched from jpeg to pngalpha13:35.14 
  is it the right way to go?13:35.26 
kens Better, jpeg is a bad choice unless you need really small images and can stand the lossy compression13:35.43 
kubilayrd I thought so13:36.39 
  Is dDOINTERPOLATE default?13:36.46 
nsz jpeg is good if the pages have photo content.. if it's text then don't use jpeg13:37.02 
kubilayrd I need better text output13:37.33 
kens By default images will be interpolated if they have the Interpolate flag set in the image dictionary. Otherwise they are not interpolated. -dDOINTERPOLATE and -dNOINTERPOLATE override the settings in the image dictionary13:37.34 
  define 'better'13:37.45 
kubilayrd well, they look... fuzzy?13:37.57 
  I mean, same resolution but pixelated13:38.09 
chrisl Don't use pngalpha.....13:38.19 
kubilayrd here's a sample of my command:13:38.22 
kens Only if you set -dTextAlphaBits and they will be pixels :-)13:38.22 
kubilayrd gs -dNOPAUSE -dDOINTERPOLATE -dBATCH -sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -dUseCropBox -sOutputFile=page-%d.png -r600 src.pdf13:38.29 
  what do you suggest to improve this?13:38.36 
kens Well there's your first problem, TextAlphaBits13:38.40 
  That will result in fuzzy text13:38.49 
kubilayrd should I remove it at all?13:38.57 
  or to set a different value13:39.01 
kens Remove it13:39.04 
kubilayrd allright, giving a shot now13:39.15 
kens As chrisl says, use png not pngalpha13:39.22 
kens can't recall the correct device name(s) atm13:39.36 
Robin_Watts gs -sDEVICE=png16m -r1800 -dDownScaleFactor=3 -o page-%d.png src.pdf13:39.41 
kubilayrd thanks, trying now13:40.04 
kens I was going to suggest using that as well13:40.10 
Robin_Watts That will give you antialiased text - it'll look good for screen use.13:40.12 
kens Beat me to it13:40.17 
Robin_Watts If you are going to a 600dpi printer, better to use:13:40.28 
  gs -sDEVICE=png16m -r600 -o page-%d.png src.pdf13:40.39 
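[Editor's sketch] The two command lines above differ only in whether the page is rendered at a multiple of the target resolution and then downscaled. A minimal Python sketch of that relationship follows. The helper name `render_args` is hypothetical; the switches are the ones quoted in the log, and nothing here runs Ghostscript.

```python
def render_args(src, out_pattern, final_dpi, downscale=1):
    """Build a Ghostscript argument list for rendering a PDF to PNG pages.

    With downscale > 1 the page is rendered at final_dpi * downscale and
    then reduced by that factor, which is what produces the antialiasing.
    """
    args = ["gs", "-sDEVICE=png16m", f"-r{final_dpi * downscale}"]
    if downscale > 1:
        args.append(f"-dDownScaleFactor={downscale}")
    args += ["-o", out_pattern, src]
    return args


# Antialiased output for screen use (render at 1800 dpi, downscale by 3).
screen = render_args("src.pdf", "page-%d.png", 600, downscale=3)
# Plain 600 dpi output with no antialiasing.
printer = render_args("src.pdf", "page-%d.png", 600)

print(" ".join(screen))
print(" ".join(printer))
```

The resulting list could be passed to `subprocess.run` on a machine with Ghostscript installed.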
chrisl Erm, but if he doesn't want fuzzy text, that's just going to be a different fuzzy......13:40.42 
kens It does depend what you want13:40.50 
Robin_Watts but I guess you're unlikely to be printing if you're using png13:40.54 
kubilayrd right13:41.07 
kens And I prefer the downscaled output to the TextAlphaBits13:41.10 
  600 dpi is quite high for screen display though13:41.36 
chrisl Hmm, don't like either......13:41.42 
kens Well it's more that I *really* don't like the result of TextAlphaBits13:42.13 
kubilayrd I'm converting PDF files of a magazine to images for an iPad magazine app13:42.55 
  images are well... okay but texts are not13:43.06 
kens thought the iPad could read PDF directly13:43.15 
chrisl I'd be looking at a PDF viewer rather than converting to images.....13:43.34 
kubilayrd yeah, if the customer didn't ask for it :)13:43.41 
kens Yeah I was going to suggest MuPDF13:43.43 
  Customers require training :-)13:43.56 
kubilayrd their PDF sizes differs from 30MB to... 400MB13:44.10 
  differ*13:44.14 
chrisl After all, a major point of PDF is to scale nicely13:44.18 
kens Well then a better bet is to create a reduced content PDF. Presumably the files are so big because they contain large high resolution images (in fact it rather sounds like the PDFs are nothing *but* large high resolution images)13:45.10 
kubilayrd gs output for a 400MB pdf is almost a quarter13:45.10 
  right you are13:45.23 
kens It sort of sounds like the workflow is image->PDF->image which is madness13:45.34 
kubilayrd unfortunately yes, it is silly13:46.34 
kens You could use Ghostscript to produce a lower resolution PDF file (the images are lower resolution, but the text would remain as text, if it was text in the first place)13:46.35 
  Note that if the 'text' is actually image data, then much of this discussion is moot13:47.02 
kubilayrd nope, it is not13:47.16 
kens OK then I would suggest creating a low res PDF so that the text remains scalable and then shipping that instead of bitmap format. It will likely be smaller than a PNG and the text will scale nicely13:47.56 
kubilayrd what's the sDEVICE for lowering the PDF res?13:49.35 
kens Not that simple13:49.45 
  You need to use the pdfwrite device and set some switches.13:50.00 
kubilayrd gs documentation has it?13:50.35 
kens You'll need DownsampleColorImages=true (same for Gray and Mono), ColorImageDownsampleThreshold, ColorImageDownsampleType and ColorImageResolution (and again for gray and mono) and yes, this is all in ps2pdf.htm13:51.18 
  Don't blame me for the settings, they are to match Adobe :-(13:51.41 
  Oops you'll want ColorImageFilter too, probably best not to DCT the images13:52.21 
kubilayrd Before I get confused I better ask: Do we do this process with a single line of gs command but different parameters?13:52.59 
kens One command line, lots of parameters13:53.10 
  gs -sDEVICE=pdfwrite -dDownsampleColorImages=true -dColorImageDownsampleThreshold=1.5 -dColorImageDownsampleType=/Bicubic -dColorImageResolution=150......13:54.23 
  You can always store the settings in a file and use the @ syntax to read the settings13:54.48 
  You may need to experiment/research a bit to find out what settings are good, they depend on the input13:55.56 
kubilayrd I surely do :)13:56.05 
kens I suggest setting the resolution of images to somewhere around 1-2 times the screen resolution, downsample threshold from 1 to 1.5 (depends if you have any small images), set the ImageFilter to Flate and the DownsampleType to Bicubic (but try out subsample too)13:57.50 
  You can lose metadata (like annotations and actions and so on) by doing the conversion to a new PDF file, but it's certainly going to lose less than conversion to an image file :-)13:58.59 
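[Editor's sketch] The pdfwrite advice above (one command line, many switches, optionally stored in a file and read with the @ syntax) can be sketched in Python as below. The helper name and the file names `downsample.args`, `small.pdf` and `src.pdf` are hypothetical. The `AutoFilter...Images=false` and `/FlateEncode` switches are an assumption about how "set the ImageFilter to Flate" is spelled, and should be checked against the Ghostscript documentation. Mono images are left out here for brevity; the log says the same family of switches exists for them.

```python
import tempfile
from pathlib import Path


def downsample_switches(resolution=150, threshold=1.5, kind="Bicubic"):
    """Return pdfwrite switches that downsample colour and gray images."""
    switches = ["-sDEVICE=pdfwrite"]
    for space in ("Color", "Gray"):
        switches += [
            f"-dDownsample{space}Images=true",
            f"-d{space}ImageDownsampleThreshold={threshold}",
            f"-d{space}ImageDownsampleType=/{kind}",
            f"-d{space}ImageResolution={resolution}",
            # Assumption: a fixed filter only applies with auto-filter off.
            f"-dAutoFilter{space}Images=false",
            f"-d{space}ImageFilter=/FlateEncode",
        ]
    return switches


# Store one switch per line so the command line can use the @ syntax.
args_file = Path(tempfile.mkdtemp()) / "downsample.args"
args_file.write_text("\n".join(downsample_switches()) + "\n")

command = ["gs", f"@{args_file}", "-o", "small.pdf", "src.pdf"]
print(" ".join(command))
```

As the log says, good values depend on the input, so the defaults here are only a starting point for experiments.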
kubilayrd well, metadata is not important14:09.15 
  does png16m render better than png?14:10.21 
  the text I mean14:10.27 
kens It's better than using pngalpha14:10.35 
  But otherwise the text is dependent on the size of the text and the resolution of the output14:11.02 
kubilayrd what's the sense of using dDownScaleFactor instead of setting lower res?14:12.10 
kens You get anti-aliasing by rendering to a higher resolution and then downsampling the resulting high res bitmap to a low res bitmap14:12.38 
kubilayrd does it affect dimensions as well?14:13.04 
kens Of course, as chrisl points out, that will also make the text blurry14:13.06 
Robin_Watts kubilayrd: It's conceptually simpler, and can give better results when antialiased edges overlap.14:13.19 
kens It affects the content, but not the dimensions14:13.19 
Robin_Watts It affects dimensions too, depending on how they are measured.14:13.38 
  The point is that we render at 1800 dpi, which might give (say) 3000x3000 output pixels.14:14.07 
kens I can't think of a dimension which is altered, provided the factor is integer14:14.17 
Robin_Watts We then downscale by 3, and you end up at 1000x1000 pixels.14:14.18 
kubilayrd but it makes text blurry, the latest downscale, right?14:14.46 
kens Any anti-aliasing will do that14:15.01 
Robin_Watts Hence the downscale factor changing DOES change the dimensions - but we boosted the resolution given to the main ghostscript core to compensate.14:15.01 
  kubilayrd: You say blurry, I say antialiased.14:15.14 
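[Editor's sketch] The arithmetic behind the exchange above: the downscale factor divides the pixel dimensions, and the boosted resolution multiplies them back. The Python below is a rough illustration only. The function name is hypothetical, the 120 point page size is chosen so that the numbers match the 3000x3000 example, and Ghostscript's own rounding may differ.

```python
def output_pixels(page_points, render_dpi, downscale=1):
    """Pixels along one page edge after rendering and optional downscaling.

    page_points is the page edge in PostScript points (72 per inch).
    """
    rendered = page_points * render_dpi // 72
    return rendered // downscale


edge = 120  # hypothetical page edge in points
print(output_pixels(edge, 1800))      # high resolution render only
print(output_pixels(edge, 1800, 3))   # render at 1800 dpi, downscale by 3
print(output_pixels(edge, 600))       # render directly at 600 dpi
```

The last two calls give the same pixel count, which is why the final image dimensions are unchanged when the resolution is multiplied by the downscale factor.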
kens The point of anti-aliasing is that it's intended to make small text easier to read at low resolution14:15.36 
  Arguably if the resolution is high enough that you can tell that it's anti-aliased, then you don't need it14:16.09 
  But it's a personal preference14:16.24 
kubilayrd it may cause memory pressure when the dimensions or the file size is too high14:16.59 
kens Anti-aliasing cannot affect the memory, the bitmap image is the same size14:17.22 
  If you mean creating the high res bitmap and downsampling then yes, it will use more memory to create the high res bitmap14:17.49 
kubilayrd I mean turning off downscaling14:18.13 
kens But the result is the same size. Presumably you are doing the rendering on something other than an iPad, so the memory usage shouldn't be a problem14:18.15 
  If you don't want anti-aliasing then don't use downscaling *and* don't set the high resolution.14:18.43 
  If you want downscaling then you must render to a higher resolution.14:19.14 
  BTW none of this has any effect with the pdfwrite device, only when rendering to a bitmap14:19.32 
kubilayrd So, no way to get an anti-aliased text if the resolution is low?14:20.02 
kens Yes, we've given you 2 ways to get anti-aliased text. One is use TextAlphaBits, the other is render to a higher resolution and downsample to the lower resolution.14:20.46 
  The *final* resolution is the same in each case. Therefore the image is the same size in each case, therefore no memory worries14:21.07 
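[Editor's sketch] A rough way to see the memory point made above: the intermediate high resolution bitmap is larger by the square of the downscale factor, while the final bitmap is the same size either way. The Python below assumes 3 bytes per pixel (24-bit RGB, as png16m produces) and reuses the 3000x3000 example. The figures are uncompressed raster sizes, not a measurement of Ghostscript's actual memory use, which may be lower if the page is rendered in bands.

```python
def bitmap_bytes(width_px, height_px, bytes_per_pixel=3):
    """Uncompressed size of a raster with the given pixel dimensions."""
    return width_px * height_px * bytes_per_pixel


high_res = bitmap_bytes(3000, 3000)   # intermediate render at 1800 dpi
final = bitmap_bytes(1000, 1000)      # after downscaling by 3

print(high_res, final, high_res // final)
```

Only the machine doing the rendering pays for the larger intermediate bitmap; the image shipped to the device is the smaller one.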
kubilayrd You said TextAlphaBits should be removed14:21.20 
kens You complained the text was blurry14:21.29 
  blurry = anti-aliased14:21.40 
kubilayrd I got that14:21.59 
kens Then I don't understand your questions14:22.22 
kubilayrd If I set the resolution to a higher value without downscaling, then does it help me prevent anti-aliased text?14:22.59 
kens The resolution is nothing to do with anti-aliasing14:23.12 
  You won't get anti-aliased text unless you take some specific action to get it14:23.26 
  Either by setting TextAlphaBits or by using a DownScaleFactor along with a higher resolution14:23.54 
kubilayrd What might be a reason for getting an antialiased text?14:27.14 
  I mean, in which cases?14:27.22 
kens There are four possibilities:14:28.34 
  1) You set TextAlphaBits14:28.34 
  2) You set GraphicsAlphaBits and the text isn't text, it's part of an image14:28.34 
  3) The text is part of an image with /Interpolate true14:28.34 
  4) You set DownScaleFactor14:28.34 
  Note that 3) could also be 'the text is part of an image and you set -dDOINTERPOLATE'14:29.06 
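[Editor's sketch] The list above can be turned into a small check over a command line. The Python below is illustrative only: the function name is hypothetical, it treats values above 1 as "set" (an assumption), and it cannot see case 3 when the /Interpolate flag lives inside the PDF's image dictionaries rather than on the command line.

```python
def antialiasing_sources(switches):
    """List the command-line switches that could produce anti-aliased text."""
    found = []
    for switch in switches:
        name, _, value = switch.partition("=")
        if name in ("-dTextAlphaBits", "-dGraphicsAlphaBits", "-dDownScaleFactor"):
            # Assumption: a value of 1 (or a missing value) means "off".
            if value.isdigit() and int(value) > 1:
                found.append(name[2:])
        elif name == "-dDOINTERPOLATE":
            found.append("DOINTERPOLATE")
    return found


original = (
    "gs -dNOPAUSE -dDOINTERPOLATE -dBATCH -sDEVICE=pngalpha "
    "-dTextAlphaBits=4 -dGraphicsAlphaBits=4 -dUseCropBox "
    "-sOutputFile=page-%d.png -r600 src.pdf"
).split()

print(antialiasing_sources(original))
```

Run against the command posted earlier in the log, this reports three of the possible sources at once.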
kubilayrd And if it makes the text harder to read, why would someone need that?14:29.54 
kens Some people feel that, at low resolutions and small point sizes, anti-aliasing makes the text *easier* to read14:30.19 
kubilayrd I see14:30.46 
chrisl I find it often looks nicer at a glance, but isn't actually any easier to read14:31.29 
kens fetches coffee, back shortly14:31.29 
kubilayrd it does look nice if I dont zoom14:31.49 
chrisl Well, don't zoom14:32.03 
kubilayrd zoom is a critical need for a magazine app14:32.49 
chrisl But raster image data isn't (nicely) scalable - hence our recommendation to use PDF14:33.22 
kubilayrd what you say is that if I'm to use zoom, then converted images are not enough14:36.01 
chrisl They are sub-optimal at best14:36.21 
kens I agree, if you want it zoomable then (especially for text) you must not render to a bitmap format.14:37.41 
  Bitmaps are not scalable14:37.58 
kubilayrd even when the resolution is higher?14:38.28 
kens What you are doing then is *downsampling* the high resolution image, and then reducing the downsampling as you 'zoom in'. In effect you are starting by zooming out.14:39.24 
  And of course, what you have is a high resolution image, which uses up lots of space.14:39.43 
  Including the text and white space. In a PDF file the text is compact, the white space uses nothing and vector (linework) is compact14:40.25 
kubilayrd I understand14:41.05 
  bitmap for good, pdf for best14:41.10 
  in my case14:41.12 
chrisl PDF for all - that's the point of being a scalable for mat14:41.43 
  format, even.....14:41.49 
kens There are potential reasons for using a bitmap, but in general, for viewing, you are better keeping the PDF as a PDF, even if that means reducing the size by reducing the quality of the images contained in the PDF14:41.51 
kubilayrd maybe when the content isn't scalable14:42.50 
chrisl When the content isn't scalable to start with, you don't *lose* anything keeping it as PDF14:43.45 
kubilayrd thought ghostscript was more into the image manipulation14:45.30 
kens Ghostscript doesn't do image manipulation, that's ImageMagick14:45.52 
  Ghostscript is a PostScript and PDF interpreter and rendering library14:46.07 
kubilayrd though IM uses gs afaik14:46.55 
kens It does, but only for rendering PostScript and PDF to an image format, then ImageMagick can work on the image.14:47.19 
kubilayrd I see14:48.11 
avih does/did pdf and postscript have much in common? does one use the other as an underlaying technology?14:48.21 
kens PDF originally had the same imaging model as PostScript14:48.37 
  Nowadays its a superset14:48.48 
kubilayrd thank you guys, catch you later14:48.51 
avih pdf is a superset of postscript then?14:49.09 
kens No14:49.14 
avih oh14:49.18 
kens PostScript is a programming language, PDF is not14:49.22 
  and never has been14:49.30 
  But the underlying imaging model is a superset14:49.44 
  Also PostScript is a streamable format, PDF isn't (yes I know about linearisation, it's nonsense)14:50.25 
mSIK hi there, is there anyone who knows how to play with mupdf viewer and bookmarks?14:50.26 
avih and what does "imaging model" mostly refers to? is it about images as part of the document? or imaging the output rendering? or something else?14:51.02 
kens avih the available graphics primitives, such as colour spaces, paths, etc.14:51.23 
avih ah. thx14:51.35 
kens So for example PostScript and PDF have a curveto operator, and an arc operator, but no circle operator, you have to make a circle from arcs.14:52.07 
avih i see. so PDF is a declarative imaging tool which defines the output with the imaging model, and postscript uses a subset of this imaging model, but is also programmable?14:55.44 
  (weird.. never really learned how pdf works.. and TBH also never got too close to postscript)14:56.42 
kens More or less, yes. PostScript is the older format, and the fact that it's a programming language has advantages and disadvantages. Adobe actually started with the old Illustrator file format (which is itself based on PostScript) for the PDF syntax14:56.45 
  With PDF 1.4 the major divergence in the graphics model occurred when Adobe added transparency14:57.20 
avih so transparency was hard or impossible to implement with pre-1.4? or is it like advancements in css, where most things could be implemented for a long time but new stuff makes it easier/more modular to produce?14:58.30 
kens There is no transparency in PDF versions up to 1.3, it was a feature added with version 1.4 of the specification14:58.59 
  Which is why,if you tell Ghostscript's pdfwrite device to create a PDF 1.3 file from a PDF 1.4 input which contains transparency, the output file is a big bitmap.15:00.03 
avih ah.15:00.31 
  and i'm guessing either most PDF viewers today support 1.4 or newer, or most PDF files around are 1.3 or earlier?15:02.16 
kens All viewers support at least PDF 1.4 nowadays15:02.33 
  It was released a long time ago15:02.42 
avih oh15:02.48 
kens still has old versions of Acrobat though :-)15:03.01 
avih old enough to not support 1.4? how old would old be?15:03.25 
kens Hmm, around about the year 200015:03.51 
avih i stopped using adobe pdf reader long while ago.15:03.59 
chrisl There are also specialist PDF subsets that specifically exclude transparency (and other features)15:04.01 
kens Good point, yes15:04.10 
  Though even those have been superseded by newer revisions that do support transparency15:04.32 
avih these days it's sumatra on windows, or pdf.js in firefox15:04.33 
  or on osx PDFNut iirc15:04.43 
kens OS/X supports PDF natively15:04.55 
chrisl Good grief, pdf.js is awful..... :-(15:05.00 
avih ah, so mostly they would be viewers on top of the OS thing?15:05.14 
kens Indeed it is, it handles transparency very badly too (if indeed it manages at all)15:05.24 
  avih I don't know much about OS/X PDF viewers15:05.39 
avih yes, pdf.js is not quite there yet, but works good enough on many pdf files15:05.40 
kens PDF 1.4 seems to have been released in 200115:06.12 
avih and it has the huge advantage of not requiring a plugin15:06.21 
kens It's also in Javascript, so it's slow15:06.45 
avih and i guess with less security exposure than embedding an exsicting pdf renderer into firefox15:06.57 
  existing*15:07.09 
kens I would not be too sure about that15:07.11 
avih how so?15:07.18 
kens Well,its running Javascript15:07.26 
  GS doesn't run javascript so an offending script in the PDF file cannot be dangerous, for example15:07.56 
avih and js gets much more security reviews than any single pdf viewer, i'd guess15:08.15 
kens And therefore also has a larger security surface15:08.46 
avih therefore? because it gets more reviews??15:09.10 
kens No because it gets more usage15:09.19 
chrisl High security is great, but if it doesn't actually display my PDFs in a useful way, it's of limited use to me15:09.29 
avih chrisl: obviously :)15:09.44 
kens Same reason Windows gets more viruses than a Mac, if you're going to target one platform, which one do you target ?15:09.45 
kens is very happy GS doesn't run Java.15:10.22 
avih i only started examining pdf.js recently. other than font rendering issues on windows (anti aliasing is sometimes b0rked somehow), it displays the pdf documents i'm using good enough15:10.50 
kens It doesn't do a very good job with a wide range of the PDF files I get.15:11.09 
  Broken output, crashes (killed Firefox once), incorrect rendering, etc15:11.35 
avih hmm15:11.42 
kens I expect it works adequately for sufficiently simple files15:11.48 
  We do, after all, see the worst offenders15:12.04 
avih you do :)15:12.13 
chrisl I found very few PDFs I was viewing worked satisfactorily - ignoring all the insane PDFs we have to look at for GS and mupdf15:12.38 
kens Most of the ones posted to Stack Overflow questions work adequately15:13.20 
avih well, you got more pdf knowledge than most, contribute to pdf.js ;)15:13.22 
kens Javascript ? Not likely.15:13.40 
avih :)15:13.46 
  file bugs? :)15:13.58 
chrisl I'd rather spend the time making Ghostscript better, or mupdf15:14.06 
kens returns to trying to make sense of GhostPCL15:14.54 
avih i'm off to my stuff too. thanks for the enlightening conversation :)15:15.22 
chrisl Now PCL, there *is* a bonkers pdl.....15:15.28 
kens chrisl, by the way I forgot to mention, the PS and PDF interpreters now work as well as they ever have (or better) with my new device code. Just got about 6 PCL files with genuine problems, and 10 XPS files which show invisible differences15:16.13 
chrisl kens: that's very cool. How much is going to reusable for other purposes?15:17.01 
kens Most of it, I very much hope.15:17.20 
  When I get rid of the final problems I'll rewrite Robin's spy device to use this approach and see how that goes15:17.38 
  That'll give me a useful device I can use for an example15:17.52 
  Then it'll be Zoltan's pattern problem.15:18.13 
  I will need to refactor some of the code to make it usable in a general fashion. Making it work has been my main priority15:18.48 
chrisl Hmm, our default iodev_no_file_status() returns gs_error_undefinedfilename - that would not have been my choice.....15:21.42 
kens Well, at least it returns an error and doesn't just crash like so many of the device methods15:22.13 
chrisl I think I would have preferred invalidaccess15:22.54 
kens Seems reasonable.15:23.11 
chrisl Too late now.....15:23.21 
kens You can change it easily enough I'd have thought15:23.40 
  Surely nothing relies on it being undefinedfilename ?15:23.52 
kens sees a huge pit opening up in front15:24.04 
chrisl Yeh, I'd rather not make that assumption.....15:24.22 
kens I found another case buried in the compositor that assumes that a device method can be called unless it is specifically NULL yesterday. Another one that needs to be fixed.15:25.30 
  Hmm, the PCL interpreter (on Windows) does a pl_main_universe_dnit() and then later tries to release a parameter list. It looks to me like the parameter list is already freed and overwritten by then :-(15:42.20 
Robin_Watts In 15 mins or so, it will be 10/12/14 16:18:2016:06.01 
kens Only in the UK, the US have a different date format :-)16:06.26 
Robin_Watts Their loss :)16:06.41 
jordyd Why is it that GenericResourceDir is defined in Resource/Init/gs_res.ps when it’s a LL3 feature?16:26.02 
chrisl We need it to find the other init files16:27.30 
  And besides, who wants a non-LL3 PS interpreter???16:28.04 
jordyd Fair enough16:29.37 
chrisl Don't forget a lot of stuff that was introduced in Level 2 and then 3 already existed, or something very similar, as unofficial extensions before Adobe actually released the specs16:30.50 
  jordyd: tbh, if it weren't for the large amount of effort and no externally visible benefit, we'd rip out the (probably broken) support for building Level 1 and 2 only interpreters16:47.05 
mvrhel_laptop kens: simple question for you. I need to do some translations when using the xpswrite device for what I am doing with gsview. is there an easy way for me to do this with some -c '1 0 0 1 dx dy .setdefaultmatrix' option or something? This is probably something I should know but it is not obvious to me 17:58.25 
  oops kens is gone17:58.38 
  chrisl: maybe you know17:58.44 
chrisl mvrhel_laptop: what's the input?18:03.12 
mvrhel_laptop pdf18:03.16 
chrisl Then, no, I don't think there is18:03.30 
  Although, I'm not sure what .setdefaultmatrix does, so.....18:04.18 
mvrhel_laptop so I have been using -dDEVICEWIDTHPOINTS=xx and -dDEVICEHEIGHTPOINTS=xx to specify the paper size and that is working fine to give me what I want but in certain cases I want to get everything centered 18:04.40 
  ok. I may have to do this a bit differently then18:04.52 
chrisl The problem is that the PDF interpreter does an initgraphics before drawing pages, which zaps scaling/transforms in the graphics state18:06.06 
mvrhel_laptop and -dPSFitPage works fine18:06.47 
  to force things to scale18:06.53 
chrisl PSFitPage?18:07.07 
mvrhel_laptop that is a command line option18:07.14 
  to scale the content to the page size18:07.22 
chrisl Oh, then just to -dFITPAGE18:07.31 
mvrhel_laptop hmm our documentation needs a bit of updating then18:07.50 
chrisl Or rather -dFitPage18:07.56 
  FitPage is documented in Use.htm18:08.27 
mvrhel_laptop ok I see18:08.37 
rayjj mvrhel: the "standard" way of doing translation is to use the 'Install' procedure of setpagedevice. That changes the default matrix.18:09.09 
  mvrhel_laptop: eg., -c "<< /Install { 10 20 translate } >> setpagedevice"18:09.44 
mvrhel_laptop rayjj: ah. that is what I need18:09.52 
  rayjj comes and saves the day18:10.04 
chrisl Hmm, I didn't think that worked with PDF.....18:10.04 
rayjj mvrhel_laptop: you can also rotate, scale, etc18:10.12 
mvrhel_laptop is this stuff in our documentation?18:10.55 
chrisl It's Postscript18:11.07 
mvrhel_laptop fair enough18:11.14 
rayjj mvrhel_laptop: chrisl: I just tried: gswin32c -c "<< /Install { 100 200 translate } >> setpagedevice" -f examples/annots.pdf18:11.38 
  it works fine18:11.43 
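[Editor's sketch] For the centering case raised earlier, the Install approach could be combined with a computed offset. The Python below is only an illustration: the function name and the content size are hypothetical, the page and content sizes are in points, and whether this combination behaves as intended with -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS for a given file is untested here.

```python
def centering_args(page_w, page_h, content_w, content_h):
    """Build Ghostscript arguments that translate content to the page centre."""
    dx = (page_w - content_w) / 2
    dy = (page_h - content_h) / 2
    # Doubled braces emit literal { } around the PostScript procedure body.
    install = f"<< /Install {{ {dx:g} {dy:g} translate }} >> setpagedevice"
    return [
        f"-dDEVICEWIDTHPOINTS={page_w}",
        f"-dDEVICEHEIGHTPOINTS={page_h}",
        "-c", install,
        "-f",
    ]


# Hypothetical 412x392 pt content on a 612x792 pt page.
args = centering_args(612, 792, 412, 392)
print(args[3])
```

The input file name would follow the `-f`, as in the command line quoted above.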
mvrhel_laptop thanks rayjj18:11.48 
rayjj mvrhel_laptop: Install is part of the PLRM18:11.55 
chrisl I wonder why I thought it didn't work - maybe we broke it a while back, then fixed it, of course......18:12.09 
rayjj in the setpagedevice section (also describes BeginPage EndPage)18:12.18 
  chrisl: probably18:12.28 
  chrisl: it probably doesn't work with -dFitPage18:13.00 
  mvrhel_laptop: BTW, -dFitPage is _supposed_ to work with PDF _or_ PS input18:13.42 
mvrhel_laptop yes I see that now that I read the doc...18:14.33 
  thanks rayhh18:14.36 
  rayjj18:14.38 
rayjj mvrhel_laptop: it is supposed to rotate for "best fit"18:15.41 
mvrhel_laptop right18:15.47 
  that seems to be working nicely18:15.58 
  and this translation thing is the last part I need. 18:16.18 