Ghostscript IRC logs

Log of #ghostscript at irc.freenode.net.

20141210
kubilayrd	hey all	13:34.28
	I'm using ghostscript to convert a PDF file to images per each page	13:34.45
	though I've been having troubles about the image quality	13:35.02
	so I switched from jpeg to pngalpha	13:35.14
	is it the right way to go?	13:35.26
kens	Better, jpeg is a bad choice unless you need really small images and can stand the lossy compression	13:35.43
kubilayrd	I thought so	13:36.39
	Is dDOINTERPOLATE default?	13:36.46
nsz	jpeg is good if the pages have photo content.. if it's text then don't use jpeg	13:37.02
kubilayrd	I need better text output	13:37.33
kens	By default images will be interpolated if they have the Interpolate flag set in the image dictionary. Otherwise they are not interpolated. -dDOINTERPOLATE and -dNOINTERPOLATE override the settings in the image dictionary	13:37.34
	define 'better'	13:37.45
kubilayrd	well, they look... fuzzy?	13:37.57
	I mean, same resolution but pixelated	13:38.09
chrisl	Don't use pngalpha.....	13:38.19
kubilayrd	here's a sample of my command:	13:38.22
kens	Only if you set -dTextAlphaBits and they will be pixels :-)	13:38.22
kubilayrd	gs -dNOPAUSE -dDOINTERPOLATE -dBATCH -sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -dUseCropBox -sOutputFile=page-%d.png -r600 src.pdf	13:38.29
	what do you suggest to improve this?	13:38.36
kens	Well there's your first problem, TextAlphaBits	13:38.40
	That will result in fuzzy text	13:38.49
kubilayrd	should I remove it at all?	13:38.57
	or to set a different value	13:39.01
kens	Remove it	13:39.04
kubilayrd	allright, giving a shot now	13:39.15
kens	As chrisl says, use png not pngalpha	13:39.22
*kens*	can't recall the correct device name(s) atm	13:39.36
Robin_Watts	gs -sDEVICE=png16m -r1800 -dDownScaleFactor=3 -o page-%d.png src.pdf	13:39.41
kubilayrd	thanks, trying now	13:40.04
kens	I was going to suggest using that as well	13:40.10
Robin_Watts	That will give you antialiased text - it'll look good for screen use.	13:40.12
kens	Beat me to it	13:40.17
Robin_Watts	If you are going to a 600dpi printer, better to use:	13:40.28
	gs -sDEVICE=png16m -r600 -o page-%d.png src.pdf	13:40.39
chrisl	Erm, but if he doesn't want fuzzy text, that's just going to be a different fuzzy......	13:40.42
kens	It does depend what you want	13:40.50
Robin_Watts	but I guess you're unlikely to be printing if you're using png	13:40.54
kubilayrd	right	13:41.07
kens	And I prefer the downscaled output to the TextAlphaBits	13:41.10
	600 dpi is quite high for screen display though	13:41.36
chrisl	Hmm, don't like either......	13:41.42
kens	Well it's more that I really don't like the result of TextAlphaBits	13:42.13
kubilayrd	I'm converting PDF files of a magazine to images for an iPad magazine app	13:42.55
	images are well... okay but texts are not	13:43.06
*kens*	thought the iPad could read PDF directly	13:43.15
chrisl	I'd be looking at a PDF viewer rather than converting to images.....	13:43.34
kubilayrd	yeah, if the customer didn't ask for it :)	13:43.41
kens	Yeah I was going to suggest MuPDF	13:43.43
	Customers require training :-)	13:43.56
kubilayrd	their PDF sizes differs from 30MB to... 400MB	13:44.10
	differ*	13:44.14
chrisl	After all, a major point of PDF is to scale nicely	13:44.18
kens	Well then a better bet is to create a reduced content PDF. Presumably the files are so big because they contain large high resolution images (in fact it rather sounds like the PDFs are nothing but large high resolution images)	13:45.10
kubilayrd	gs output for a 400MB pdf is almost a quarter	13:45.10
	right you are	13:45.23
kens	It sort of sounds like the workflow is image->PDF->image which is madness	13:45.34
kubilayrd	unfortunately yes, it is silly	13:46.34
kens	You could use Ghostscript to produce a lower resolution PDF file (the images are lower resolution, but the text would remain as text, if it was text in the first place)	13:46.35
	Note that if the 'text' is actually image data, then much of this discussion is moot	13:47.02
kubilayrd	nope, it is not	13:47.16
kens	OK then I would suggest creating a low res PDF so that the text remains scalable and then shipping that instead of bitmap format. It will likely be smaller than a PNG and the text will scale nicely	13:47.56
kubilayrd	what's the sDEVICE for lowering the PDF res?	13:49.35
kens	Not that simple	13:49.45
	You need to use the pdfwrite device and set some switches.	13:50.00
kubilayrd	gs documentation has it?	13:50.35
kens	You'll need DownsampleColorImages=true (same for Gray and Mono), ColorImageDownsampleThreshold, ColorImageDownsampleType and ColorImageResolution (and again for gray and mono) and yes, this is all in ps2pdf.htm	13:51.18
	Don't blame me for the settings, they are to match Adobe :-(	13:51.41
	Oops you'll want ColorImageFilter too, probably best not to DCT the images	13:52.21
kubilayrd	Before I get confused I better ask: Do we do this process with a single line of gs command but different parameters?	13:52.59
kens	One command line, lots of parameters	13:53.10
	gs -sDEVICE=pdfwrite -dDownsampleColorImages=true -dColorImageDownsampleThreshold=1.5 -dColorImageDownsampleType=/Bicubic -dColorImageResolution=150......	13:54.23
	You can always store the settings in a file and use the @ syntax to read the settings	13:54.48
	You may need to experiment/research a bit to find out what settings are good, they depend on the input	13:55.56
kubilayrd	I surely do :)	13:56.05
kens	I suggest setting the resolution of images to somewhere around 1-2 times the screen resolution, downsample threshold from 1 to 1.5 (depends if you have any small images), set the ImageFilter to Flate and the DownsampleType to Bicubic (but try out subsample too)	13:57.50
	You can lose metadata (like annotations and actions and so on) by doing the conversion to a new PDF file, but it's certainly going to lose less than conversion to an image file :-)	13:58.59
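Editor's sketch of the single command line kens describes above, written out for colour, gray and mono images. It only prints the command so it can be inspected before running. The function name, the file names and the 150 dpi default are placeholders, and the AutoFilter switches are an addition not mentioned in the chat (as far as I understand it, an explicit image filter is only honoured when auto-filtering is off). Treat the values as a starting point to experiment from, not as recommended settings.

```shell
# Print (not run) a pdfwrite command that downsamples images and keeps text as text.
# usage: downsample_cmd <input.pdf> <output.pdf> [image_dpi]
downsample_cmd() {
    in=$1; out=$2; res=${3:-150}   # 150 dpi is a placeholder, not a recommendation
    printf '%s ' gs -sDEVICE=pdfwrite -o "$out" \
        -dDownsampleColorImages=true -dColorImageDownsampleType=/Bicubic \
        -dColorImageDownsampleThreshold=1.5 -dColorImageResolution="$res" \
        -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode \
        -dDownsampleGrayImages=true -dGrayImageDownsampleType=/Bicubic \
        -dGrayImageDownsampleThreshold=1.5 -dGrayImageResolution="$res" \
        -dAutoFilterGrayImages=false -dGrayImageFilter=/FlateEncode \
        -dDownsampleMonoImages=true -dMonoImageDownsampleType=/Subsample \
        -dMonoImageDownsampleThreshold=1.5 -dMonoImageResolution="$res"
    printf '%s\n' "$in"            # input file goes last
}

downsample_cmd src.pdf small.pdf 150
```

The same switches, one per line, can be kept in a text file and passed with the @ syntax kens mentions, which keeps the command line short while experimenting.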
kubilayrd	well, metadata is not important	14:09.15
	does png16m render better than png?	14:10.21
	the text I mean	14:10.27
kens	Its better than using pngalpha	14:10.35
	But otherwise the text is dependent on the size of the text and the resolution of the output	14:11.02
kubilayrd	what's the sense of using dDownScaleFactor instead of setting lower res?	14:12.10
kens	You get anti-aliasing by rendering to a higher resolution and then downsampling the resulting high res bitmap to a low res bitmap	14:12.38
kubilayrd	does it affect dimensions as well?	14:13.04
kens	Of course, as chrisl points out, that will also make the text blurry	14:13.06
Robin_Watts	kubilayrd: It's conceptually simpler, and can give better results when antialiased edges overlap.	14:13.19
kens	It affects the content, but not the dimensions	14:13.19
Robin_Watts	It affects dimensions too, depending on how they are measured.	14:13.38
	The point is that we render at 1800 dpi, which might give (say) 3000x3000 output pixels.	14:14.07
kens	I can't think of a dimension which is altered, provided the factor is integer	14:14.17
Robin_Watts	We then downscale by 3, and you end up at 1000x1000 pixels.	14:14.18
kubilayrd	but it makes text blurry, the latest downscale, right?	14:14.46
kens	Any anti-aliasing will do that	14:15.01
Robin_Watts	Hence the downscale factor changing DOES change the dimensions - but we boosted the resolution given to the main ghostscript core to compensate.	14:15.01
	kubilayrd: You say blurry, I say antialiased.	14:15.14
kens	The point of anti-aliasing is that its intended to make small text easier to read at low resolution	14:15.36
	Arguably if the resolution is high enough that you can tell that it's anti-aliased, then you don't need it	14:16.09
	But its a personal preference	14:16.24
kubilayrd	it may cause memory pressure when the dimensions or the file size is too high	14:16.59
kens	Anti-aliasing cannot affect the memory, the bitmap image is the same size	14:17.22
	If you mean creating the high res bitmap and downsampling then yes, it will use more memory to create the high res bitmap	14:17.49
kubilayrd	I mean turning off downscaling	14:18.13
kens	But the result is the same size. Presumably you are doing the rendering on something other than an iPad, so the memory usage shouldn't be a problem	14:18.15
	If you don't want anti-aliasing then don't use downscaling and don't set the high resolution.	14:18.43
	If you want downscaling then you must render to a higher resolution.	14:19.14
	BTW none of this has any effect with the pdfwrite device, only when rendering to a bitmap	14:19.32
kubilayrd	So, no way to get an anti-aliased text if the resolution is low?	14:20.02
kens	Yes, we've given you 2 ways to get anti-aliased text. One is use TextAlphaBits, the other is render to a higher resolution and downsample to the lower resolution.	14:20.46
	The final resolution is the same in each case. Therefore the image is the same size in each case, therefore no memory worries	14:21.07
kubilayrd	You said TextAlphaBits should be removed	14:21.20
kens	You complained the text was blurry	14:21.29
	blurry = anti-aliased	14:21.40
kubilayrd	I got that	14:21.59
kens	Then I don't understand your questions	14:22.22
kubilayrd	If I set the resolution to a higher value without downscaling, then does it help me prevent anti-aliased text?	14:22.59
kens	The resolution is nothing to do with anti-aliasing	14:23.12
	You won't get anti-aliased text unless you take some specific action to get it	14:23.26
	Either by setting TextAlphaBits or by using a DownScaleFactor along with a higher resolution	14:23.54
kubilayrd	What might be a reason for getting an antialiased text?	14:27.14
	I mean, in which cases?	14:27.22
kens	There are four possibilities:	14:28.34
	1) You set TextAlphaBits	14:28.34
	2) You set GraphicsAlphaBits and the text isn't text, it's part of an image	14:28.34
	3) The text is part of an image with /Interpolate true	14:28.34
	4) You set DownscaleFactor	14:28.34
	Note that 3) could also be 'the text is part of an image and you set -dDOINTERPOLATE'	14:29.06
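Editor's sketch putting the rendering choices discussed so far side by side: no anti-aliasing, anti-aliasing via TextAlphaBits, and anti-aliasing via rendering at three times the target resolution and downscaling. It prints the command rather than running it. The function name, the mode names and the fixed factor of 3 are illustrative choices; the switches themselves are the ones quoted earlier in the log.

```shell
# usage: render_cmd <plain|alphabits|downscale> <target_dpi> <input.pdf>
render_cmd() {
    mode=$1; dpi=$2; in=$3
    case $mode in
        plain)     opts="-r$dpi" ;;                               # no anti-aliasing
        alphabits) opts="-r$dpi -dTextAlphaBits=4" ;;             # anti-aliased text only
        downscale) opts="-r$((dpi * 3)) -dDownScaleFactor=3" ;;   # render high, then shrink
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    printf '%s\n' "gs -sDEVICE=png16m $opts -o page-%d.png $in"
}

render_cmd plain 600 src.pdf
render_cmd alphabits 600 src.pdf
render_cmd downscale 600 src.pdf
```

All three produce pages with the same pixel dimensions for a given target resolution, which is kens's point about memory use on the viewing side.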
kubilayrd	And if it makes the text harder to read, why would someone need that?	14:29.54
kens	Some people feel that, at low resolutions and small point sizes, anti-aliasing makes the text easier to read	14:30.19
kubilayrd	I see	14:30.46
chrisl	I find it often looks nicer at a glance, but isn't actually any easier to read	14:31.29
*kens*	fetches coffee, back shortly	14:31.29
kubilayrd	it does look nice if I dont zoom	14:31.49
chrisl	Well, don't zoom	14:32.03
kubilayrd	zoom is a critical need for a magazine app	14:32.49
chrisl	But raster image data isn't (nicely) scalable - hence our recommendation to use PDF	14:33.22
kubilayrd	what you say is that if I'm to use zoom, then converted images are not enough	14:36.01
chrisl	They are sub-optimal at best	14:36.21
kens	I agree, if you want it zoomable then (especially for text) you must not render to a bitmap format.	14:37.41
	Bitmaps are not scalable	14:37.58
kubilayrd	even when the resolution is higher?	14:38.28
kens	What you are doing then is downsampling the high resolution image, and then reducing the downsampling as you 'zoom in'. In effect you are starting by zooming out.	14:39.24
	And of course, what you have is a high resolution image, which uses up lots of space.	14:39.43
	Including the text and white space. In a PDF file the text is compact, the white space uses nothing and vector (linework) is compact	14:40.25
kubilayrd	I understand	14:41.05
	bitmap for good, pdf for best	14:41.10
	in my case	14:41.12
chrisl	PDF for all - that's the point of being a scalable for mat	14:41.43
	format, even.....	14:41.49
kens	There are potential reasons for using a bitmap, but in general, for viewing, you are better keeping the PDF as a PDF, even if that means reducing the size by reducing the quality of the images contained in the PDF	14:41.51
kubilayrd	maybe when the content isn't scalable	14:42.50
chrisl	When the content isn't scalable to start with, you don't lose anything keeping it as PDF	14:43.45
kubilayrd	thought ghostscript was more into the image manipulation	14:45.30
kens	Ghostscript doesn't do image manipulation, that's ImageMagick	14:45.52
	Ghostscript is a PostScript and PDF interpreter and rendering library	14:46.07
kubilayrd	though IM uses gs afaik	14:46.55
kens	It does, but only for rendering PostScript and PDF to an image format, then ImageMagick can work on the image.	14:47.19
kubilayrd	I see	14:48.11
avih	does/did pdf and postscript have much in common? does one use the other as an underlaying technology?	14:48.21
kens	PDF originally had the same imaging model as PostScript	14:48.37
	Nowadays its a superset	14:48.48
kubilayrd	thank you guys, catch you later	14:48.51
avih	pdf is a superset of postscript then?	14:49.09
kens	No	14:49.14
avih	oh	14:49.18
kens	PostScript is a programming language, PDF is not	14:49.22
	and never has been	14:49.30
	But the underlying imaging model is a superset	14:49.44
	Also PostScript is a streamable format, PDF isn't (yes I know about linearisation, it's nonsense)	14:50.25
mSIK	hi there, is there anyone who knows how to play with mupdf viewer and bookmarks?	14:50.26
avih	and what does "imaging model" mostly refer to? is it about images as part of the document? or imaging the output rendering? or something else?	14:51.02
kens	avih the available graphics primitives, such as colour spaces, paths, etc.	14:51.23
avih	ah. thx	14:51.35
kens	So for example PostScript and PDF have a curveto operator, and an arc operator, but no circle operator, you have to make a circle from arcs.	14:52.07
avih	i see. so PDF is a declarative imaging tool which defines the output with the imaging model, and postscript uses a subset of this imaging model, but is also programmable?	14:55.44
	(weird.. never really learned how pdf works.. and TBH also never got too close to postscript)	14:56.42
kens	More or less, yes. PostScript is the older format, and the fact that it's a programming language has advantages and disadvantages. Adobe actually started with the old Illustrator file format (which is itself based on PostScript) for the PDF syntax	14:56.45
	With PDF 1.4 the major divergence in the graphics model occurred when Adobe added transparency	14:57.20
avih	so transparency was hard or impossible to implement with pre-1.4? or is it like advancements in css, where most things could be implemented for a long time but new stuff makes it easier/more modular to produce?	14:58.30
kens	There is no transparency in PDF versions up to and including 1.3, it was a feature added with version 1.4 of the specification	14:58.59
	Which is why, if you tell Ghostscript's pdfwrite device to create a PDF 1.3 file from a PDF 1.4 input which contains transparency, the output file is a big bitmap.	15:00.03
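Editor's sketch of the kind of invocation kens is referring to: asking pdfwrite for PDF 1.3 output by way of the CompatibilityLevel switch. The function and file names are placeholders, and the command is only printed here, not run.

```shell
# usage: pdf13_cmd <input.pdf> <output.pdf>
pdf13_cmd() {
    # -dCompatibilityLevel=1.3 requests a PDF version that predates transparency
    printf '%s\n' "gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.3 -o $2 $1"
}

pdf13_cmd in.pdf out13.pdf
```

Going by kens's description, pages that use transparency come out as bitmaps in that case, so the output can be much larger than the input.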
avih	ah.	15:00.31
	and i'm guessing either most PDF viewers today support 1.4 or newer, or most PDF files around are 1.3 or earlier?	15:02.16
kens	All viewers support at least PDF 1.4 nowadays	15:02.33
	It was released a long time ago	15:02.42
avih	oh	15:02.48
*kens*	still has old versions of Acrobat though :-)	15:03.01
avih	old enough to not support 1.4? how old would old be?	15:03.25
kens	Hmm, around about the year 2000	15:03.51
avih	i stopped using adobe pdf reader long while ago.	15:03.59
chrisl	There are also specialist PDF subsets that specifically exclude transparency (and other features)	15:04.01
kens	Good point, yes	15:04.10
	Though even those have been superseded by newer revisions that do support transparency	15:04.32
avih	these days it's sumatra on windows, or pdf.js in firefox	15:04.33
	or on osx PDFNut iirc	15:04.43
kens	OS/X supports PDF natively	15:04.55
chrisl	Good grief, pdf.js is awful..... :-(	15:05.00
avih	ah, so mostly they would be viewers on top of the OS thing?	15:05.14
kens	Indeed it is, it handles transparency very badly too (if indeed it manages at all)	15:05.24
	avih I don't know much about OS/X PDF viewers	15:05.39
avih	yes, pdf.js is not quite there yet, but works good enough on many pdf files	15:05.40
kens	PDF 1.4 seems to have been released in 2001	15:06.12
avih	and it has the huge advantage of not requiring a plugin	15:06.21
kens	Its also in Javascript, so its slow	15:06.45
avih	and i guess with less security exposure than embedding an exsicting pdf renderer into firefox	15:06.57
	existing*	15:07.09
kens	I would not be too sure about that	15:07.11
avih	how so?	15:07.18
kens	Well, it's running Javascript	15:07.26
	GS doesn't run javascript so an offending script in the PDF file cannot be dangerous, for example	15:07.56
avih	and js gets much more security reviewes than any single pdf viewer, i'd guess	15:08.15
kens	And therefore also has a larger security surface	15:08.46
avih	therefore? because it gets more reviews??	15:09.10
kens	No because it gets more usage	15:09.19
chrisl	High security is great, but if it doesn't actually display my PDFs in a useful way, it's of limited use to me	15:09.29
avih	chrisl: obviously :)	15:09.44
kens	Same reason Windows gets more viruses than a Mac, if you're going to target one platform, which one do you target ?	15:09.45
*kens*	is very happy GS doesn't run Java.	15:10.22
avih	i only started examining pdf.js recently. other than font rendering issues on windows (anti aliasing is sometimes b0rked somehow), it displays the pdf documents i'm using good enough	15:10.50
kens	It doesn't do a very good job with a wide range of the PDF files I get.	15:11.09
	Broken output, crashes (killed Firefox once), incorrect rendering, etc	15:11.35
avih	hmm	15:11.42
kens	I expect it works adequately for sufficiently simple files	15:11.48
	We do, after all, see the worst offenders	15:12.04
avih	you do :)	15:12.13
chrisl	I found very few PDFs I was viewing worked satisfactorily - ignoring all the insane PDFs we have to look at for GS and mupdf	15:12.38
kens	Most of the ones posted to Stack Overflow questions work adequately	15:13.20
avih	well, you got more pdf knowledge than most, contribute to pdf.js ;)	15:13.22
kens	Javascript ? Not likely.	15:13.40
avih	:)	15:13.46
	file bugs? :)	15:13.58
chrisl	I'd rather spend the time making Ghostscript better, or mupdf	15:14.06
*kens*	returns to trying to make sense of GhostPCL	15:14.54
avih	i'm off to my stuff too. thanks for the enlightening conversation :)	15:15.22
chrisl	Now PCL, there is a bonkers pdl.....	15:15.28
kens	chrisl, by the way I forgot to mention, the PS and PDF interpreters now work as well as they ever have (or better) with my new device code. Just got about 6 PCL files with genuine problems, and 10 XPS files which show invisible differences	15:16.13
chrisl	kens: that's very cool. How much is going to reusable for other purposes?	15:17.01
kens	Most of it, I very much hope.	15:17.20
	When I get rid of the final problems I'll rewrite Robin's spy device to use this approach and see how that goes	15:17.38
	That'll give me a useful device I can use for an example	15:17.52
	Then it'll be Zoltan's pattern problem.	15:18.13
	I will need to refactor some of the code to make it usable in a general fashion. Making it work has been my main priority	15:18.48
chrisl	Hmm, our default iodev_no_file_status() returns gs_error_undefinedfilename - that would not have been my choice.....	15:21.42
kens	Well, at least it returns an error and doesn't just crash like so many of the device methods	15:22.13
chrisl	I think I would have preferred invalidaccess	15:22.54
kens	Seems reasonable.	15:23.11
chrisl	Too late now.....	15:23.21
kens	You can change it easily enough I'd have thought	15:23.40
	Surely nothing relies on it being undefinedfilename ?	15:23.52
*kens*	sees a huge pit opening up in front	15:24.04
chrisl	Yeh, I'd rather not make that assumption.....	15:24.22
kens	Yesterday I found another case buried in the compositor that assumes that a device method can be called unless it is specifically NULL. Another one that needs to be fixed.	15:25.30
	Hmm, the PCL interpreter (on Windows) does a pl_main_universe_dnit() and then later tries to release a parameter list. It looks to me like the parameter list is already freed and overwritten by then :-(	15:42.20
Robin_Watts	In 15 mins or so, it will be 10/12/14 16:18:20	16:06.01
kens	Only in the UK, the US have a different date format :-)	16:06.26
Robin_Watts	Their loss :)	16:06.41
jordyd	Why is it that GenericResourceDir is defined in Resource/Init/gs_res.ps when it's a LL3 feature?	16:26.02
chrisl	We need it to find the other init files	16:27.30
	And besides, who wants a non-LL3 PS interpreter???	16:28.04
jordyd	Fair enough	16:29.37
chrisl	Don't forget a lot of stuff that was introduced in Level 2 and then 3 already existed, or something very similar, as unofficial extensions before Adobe actually released the specs	16:30.50
	jordyd: tbh, if it weren't for the large amount of effort and no externally visible benefit, we'd rip out the (probably broken) support for building Level 1 and 2 only interpreters	16:47.05
mvrhel_laptop	kens: simple question for you. I need to do some translations when using the xpswrite device for what I am doing with gsview. is there an easy way for me to do this with some -c '1 0 0 1 dx dy .setdefaultmatrix' option or something? This is probably something I should know but it is not obvious to me	17:58.25
	oops kens is gone	17:58.38
	chrisl: maybe you know	17:58.44
chrisl	mvrhel_laptop: what's the input?	18:03.12
mvrhel_laptop	pdf	18:03.16
chrisl	Then, no, I don't think there is	18:03.30
	Although, I'm not sure what .setdefaultmatrix does, so.....	18:04.18
mvrhel_laptop	so I have been using -dDEVICEWIDTHPOINTS=xx and -dDEVICEHEIGHTPOINTS=xx to specify the paper size and that is working fine to give me what I want but in certain cases I want to get everything centered	18:04.40
	ok. I may have to do this a bit differently then	18:04.52
chrisl	The problem is that the PDF interpreter does an initgraphics before drawing pages, which zaps scaling/transforms in the graphics state	18:06.06
mvrhel_laptop	and -dPSFitPage works fine	18:06.47
	to force things to scale	18:06.53
chrisl	PSFitPage?	18:07.07
mvrhel_laptop	that is a command line option	18:07.14
	to scale the content to the page size	18:07.22
chrisl	Oh, then just to -dFITPAGE	18:07.31
mvrhel_laptop	hmm our documentation needs a bit of updating then	18:07.50
chrisl	Or rather -dFitPage	18:07.56
	FitPage is documented in Use.htm	18:08.27
mvrhel_laptop	ok I see	18:08.37
rayjj	mvrhel: the "standard" way of doing translation is to use the 'Install' procedure of setpagedevice. That changes the default matrix.	18:09.09
	mvrhel_laptop: eg., -c "<< /Install { 10 20 translate } >> setpagedevice"	18:09.44
mvrhel_laptop	rayjj: ah. that is what I need	18:09.52
	rayjj: comes and save the day	18:10.04
chrisl	Hmm, I didn't think that worked with PDF.....	18:10.04
rayjj	mvrhel_laptop: you can also rotate, scale, etc	18:10.12
mvrhel_laptop	is this stuff in our documentation?	18:10.55
chrisl	It's Postscript	18:11.07
mvrhel_laptop	fair enough	18:11.14
rayjj	mvrhel_laptop: chrisl: I just tried: gswin32c -c "<< /Install { 100 200 translate } >> setpagedevice" -f examples/annots.pdf	18:11.38
	it works fine	18:11.43
mvrhel_laptop	thanks rayjj	18:11.48
rayjj	mvrhel_laptop: Install is part of the PLRM	18:11.55
chrisl	I wonder why I thought it didn't work - maybe we broke it a while back, then fixed it, of course......	18:12.09
rayjj	in the setpagedevice section (also describes BeginPage EndPage)	18:12.18
	chrisl: probably	18:12.28
	chrisl: it probably doesn't work with -dFitPage	18:13.00
	mvrhel_laptop: BTW, -dFitPage is _supposed_ to work with PDF _or_ PS input	18:13.42
mvrhel_laptop	yes I see that now that I read the doc...	18:14.33
	thanks rayhh	18:14.36
	rayjj	18:14.38
rayjj	mvrhel_laptop: it is supposed to rotate for "best fit"	18:15.41
mvrhel_laptop	right	18:15.47
	that seems to be working nicely	18:15.58
	and this translation thing is the last part I need.	18:16.18
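Editor's sketch of how rayjj's Install suggestion might be used for the centering mvrhel_laptop mentions: compute half the difference between paper size and content size, and feed that to translate. Everything here beyond the switches quoted in the log is an assumption: the function name, the output file name, the idea that the content size is known in advance and sits at the lower-left origin, and the use of whole points. The command is printed, not run.

```shell
# usage: center_cmd <paper_w_pt> <paper_h_pt> <content_w_pt> <content_h_pt> <input.pdf>
center_cmd() {
    dx=$(( ($1 - $3) / 2 ))   # horizontal offset in points
    dy=$(( ($2 - $4) / 2 ))   # vertical offset in points
    printf '%s\n' "gs -sDEVICE=xpswrite -o out.xps -dDEVICEWIDTHPOINTS=$1 -dDEVICEHEIGHTPOINTS=$2 -c \"<< /Install { $dx $dy translate } >> setpagedevice\" -f $5"
}

center_cmd 612 792 400 600 src.pdf
```

As rayjj notes, this probably does not combine with -dFitPage, so it is a sketch for the unscaled case only.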