Ghostscript IRC logs

Log of #ghostscript at irc.freenode.net.

20160715
fredross-perry	Robin, tor8, sebras - several commits on my master. One of them is the one that moves the java sources into a sub folder.	00:18.56
aiena	How do I render pdf to images properly? I used this command "gs -dNOPAUSE -dBATCH -sDEVICE=jpeg -r250 -sOutputFile='%00d.jpg' ./SILVER\ QUEEN.pdf" to render out jpg's pf each page but some embedded fonts render poorly	06:26.14
	of	06:26.22
chrisl	aiena: Well, I wouldn't use jpeg, since the quantisation will blur high frequency color changes (like the edges of glyphs). You might get better perceived results by using -dTextAlphaBits=2 or -dTextAlphaBits=4. Finally, use a higher resolution: 250dpi doesn't give many pixels for detailed objects like glyphs	06:29.46
aiena	so what dpi do you recommend ?	06:30.30
chrisl	Well, the higher the better, but bare minimum I'd use would be 300	06:31.18
aiena	chrisl: so its better to render at high dpi and then scale down the images	06:32.43
chrisl	aiena: it is better to use a higher resolution: whether you render at low resolution, or scale down afterwards, you're losing detail which will always make text harder to read	06:34.33
	However, TextAlphaBits effectively renders text at high resolution, and downsamples to an antialiased representation which some people prefer	06:35.50
Timmy	Hello	06:39.33
ghostbot	Welcome to #ghostscript, the channel for Ghostscript and MuPDF. If you have a question, please ask it, don't ask to ask it. Do be prepared to wait for a reply as devs will check the logs and reply when they come on line.	06:39.33
Timmy	I was wondering if someone could help me understand the scaling that is involved with the placement of an annotation onto a pdf.	06:40.51
aiena	hmm my computer froze and I lost the flags specified they are not in my irc logs too	06:41.07
	can someone pplayback previous conversation a bit for me	06:41.19
	playback	06:41.24
chrisl	aiena: http://ghostscript.com/irclogs/current.html	06:41.36
aiena	thanks	06:41.56
chrisl	aiena: I'd like to be able to give a clearer, more definitive answer, but it's all about trade-offs, and what's the best trade-off in one situation may not be the same as in others........	06:43.02
aiena	I agree	06:43.15
	this is for rendering of pages on mobile	06:43.27
chrisl	Rendering on mobile, or rendering for mobile?	06:45.32
aiena	chrisl: for mobile / tablet	06:48.34
	chrisl: hmm still have a problem	06:48.42
	I rendered at dpi of 500 some fonts are not rendering correctly	06:48.56
	so its not a resolution issue	06:49.05
	in the pdf they render fine	06:49.12
chrisl	Okay, can you be more specific about what's wrong?	06:49.36
aiena	chrisl: I'll make a screen shot	06:49.58
	maybe you will understand. ImageMagick doesn't give the same problem, and I think ImageMagick also uses gs	06:50.28
chrisl	What versions of gs are you using, and does it give any warning/error messages?	06:50.47
aiena	I am using Ghostscript 9.18 (2015-10-05) and no errors it just says "Processing pages 1 through 145.\n" then "Page1\n" ... "Page145\n" and cleanly terminates.	06:55.12
	chrisl: see http://paste.opensuse.org/view/raw/ed2d1cfc left is gs output right is pdf view	06:56.08
	in okular	06:56.14
chrisl	Oh, looks like a bad Truetype font......	06:56.39
aiena	yes rest of pages seem ok	06:57.09
chrisl	Try adding -dGridFitTT=0 to your command line	06:57.16
aiena	chrisl: rerunning same command will just overwrite files on linux right	06:57.54
chrisl	Yeh	06:58.14
aiena	gs -dNOPAUSE -dBATCH -sDEVICE=jpeg -r500 -dGridFitTT=0 -sOutputFile='%00d.jpg' "./SILVER QUEEN.pdf"	06:58.15
alexcher	aiena: How does this file look in Acrobat?	07:02.05
aiena	not sure dont have it	07:02.20
	but if okular renders it fine Adobes product should work better	07:02.34
chrisl	Okular uses poppler which, IIRC, renders TrueType fonts unhinted, whereas we default to using the hints in a TTF	07:05.13
	With Acrobat the experience I have is conflicting on that front.....	07:05.39
aiena	chrisl: I checked on another system Acrobat renders the same pdfg fine	07:07.16
	pdf	07:07.18
chrisl	aiena: that doesn't mean the PDF or the embedded font is correct - Acrobat is hideously tolerant of breakages	07:08.10
aiena	TOLERANT OF BREAKAGES AS IN IT WILL REPLACE THE EMBEDDED FONT WITH ANOTHER	07:08.48
	oops	07:08.50
	meant it in small forgive me	07:08.55
chrisl	It does, but in this case I suspect it either doesn't apply hints, or only applies a subset of hints - being closed source, and Adobe being notoriously tight lipped about such things, it's very hard to know	07:09.56
	aiena: Anyway, did the GridFitTT make a difference?	07:12.01
aiena	yes	07:12.06
	it did the font rendered correctly	07:12.14
	so pdf looks awesome as image now	07:12.31
	is it safe to include that param for all pdf's	07:12.46
chrisl	Pretty much: you'll get the same kind of output that Okular produces	07:13.12
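The options suggested so far (a higher -r value, -dTextAlphaBits, and -dGridFitTT=0) can be collected in one place. The following is only a sketch: the function name gs_render_args and the '%03d.jpg' output pattern are made up for illustration, and the flags are the ones quoted in the conversation above.

```python
import shlex


def gs_render_args(pdf_path, device="jpeg", dpi=300, text_alpha_bits=4,
                   grid_fit_tt=0, output_pattern="%03d.jpg"):
    """Build an argv list that renders every page of a PDF to image files.

    gs_render_args and output_pattern are illustrative names, not part of
    Ghostscript; the flags themselves are the ones discussed in the log.
    """
    return [
        "gs",
        "-dNOPAUSE", "-dBATCH",
        f"-sDEVICE={device}",
        f"-r{dpi}",                            # chrisl: 300 as a bare minimum
        f"-dTextAlphaBits={text_alpha_bits}",  # antialiased text (2 or 4)
        f"-dGridFitTT={grid_fit_tt}",          # 0 renders TrueType unhinted
        f"-sOutputFile={output_pattern}",
        pdf_path,
    ]


args = gs_render_args("./SILVER QUEEN.pdf", dpi=500)
# shlex.join shows how the list would look when typed into a POSIX shell.
print(shlex.join(args))
```

Passing the list straight to subprocess.run(args, check=True) avoids shell quoting problems with the space in the file name.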
aiena	:)	07:13.32
	Well Acrobat seems to produce same kind of o/p as Okular too	07:13.45
	so I was like gs is so good and mature so why was it rendering it weird	07:14.03
chrisl	It's possible that there is a bug in the Freetype hinting code, but I've also seen plenty of TTFs for which the hinting really does produce output like you were seeing	07:14.37
aiena	what exactly is hinting	07:15.07
	I really don't understand how pdf works. Just use it on a day to day basis	07:15.23
chrisl	This isn't really PDF, it's fonts.	07:15.40
aiena	hmm yeah got it	07:16.20
chrisl	Hinting is a guide to the rendering engine on how best to fit the points in the outline of a glyph to the physical grid of the pixels in the output medium	07:16.29
aiena	its how fonts are rendered	07:16.31
	thanks so hinting is a vector to raster guide	07:17.03
chrisl	Yes	07:17.15
aiena	but rasterisation is the engines job	07:17.18
	is the hinting stored in the document itself	07:17.45
chrisl	It's in the font	07:17.53
	in TTF, hinting can be global to the font, specific to each glyph, or both	07:18.21
aiena	ok	07:19.33
chrisl	The problem is the way TTF hinting works is, urm, really poor :-(	07:19.59
aiena	Ok	07:20.22
	chrisl: is there a flag to change jpegs quality	07:20.59
	e.g. I want each jpeg at 90 % q	07:21.06
	what is gs default	07:21.14
chrisl	aiena: http://www.ghostscript.com/doc/9.19/Devices.htm#JFIF	07:21.44
aiena	thanks	07:22.04
	chrisl: which png device should I use	07:24.50
	I am unclear as there are several options	07:24.57
	e.g. what would gimp use	07:25.15
	suppose I were to render out to png	07:25.30
	I dont want alpha	07:25.42
chrisl	Probably png16m	07:25.52
aiena	png16m is 8bit png right by default	07:26.13
chrisl	8 bit per sample, RGB	07:26.29
aiena	ok	07:26.36
chrisl	So, 24 bits per pixel	07:26.40
	Being lossless compressed, the PNG files will be larger than JPEGs, but will give more readable results for text	07:28.43
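The figures chrisl gives (8 bits per sample, RGB, so 24 bits per pixel) fix the uncompressed size of a rendered page, which is what the PNG or JPEG encoder then has to shrink. A small worked example, assuming a US Letter page (8.5 x 11 inches); the page size and the function name raster_bytes are assumptions and do not come from the conversation.

```python
def raster_bytes(width_in, height_in, dpi, samples_per_pixel=3, bits_per_sample=8):
    """Return (width_px, height_px, uncompressed_bytes) for a rendered page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bits = width_px * height_px * samples_per_pixel * bits_per_sample
    return width_px, height_px, bits // 8


# Assumed page size: US Letter, 8.5 x 11 inches.
print(raster_bytes(8.5, 11, 300))  # (2550, 3300, 25245000)
print(raster_bytes(8.5, 11, 500))  # (4250, 5500, 70125000)
```

Going from 300 to 500 dpi multiplies the pixel count by (500/300)^2, a little under 2.8 times, before any compression is applied.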
aiena	chrisl: but my findings are peculiar	07:38.32
	for pure image pages the pngs are larger	07:38.43
	for pure text pages png's are smaller than the corresponding jpeg	07:38.57
	I m trying to understand why that is the case	07:39.07
	is it because gs converts from truecolor to gs etc per page	07:39.35
kens	jpeg is a lossy format, it achieves compression by discarding data which is 'indistinguishable' to the human eye	07:39.36
aiena	for png but jpeg is always RGB	07:39.53
	kens: but my pngs are smaller than jpegs in certain cases	07:40.14
kens	So for image data, particularly photographs, it will achieve good compression. For data with a lot of high frequencies (black/white transitions, like text) it will achieve a poor looking result and compress badly	07:40.24
aiena	about half the size	07:40.25
	kens: Ah	07:40.44
kens	Large areas of white/black compress well with compression schemes which work in other ways.	07:40.53
	You need to choose a compression scheme which works well for the kind of data you expect to get.	07:41.19
aiena	Ah thank you. So for books png is a better format	07:41.23
chrisl	Sorry, I had assumed pages of mixed content	07:41.26
aiena	chrisl: yes it is mixed content	07:41.37
kens	For mostly text data, png will compress better, and will give nicer looking output, so a double win	07:41.46
aiena	probably from books on wildlife etc jpeg is better	07:41.53
	for literature png will give a cleaner result because of kens explanation	07:42.10
kens	For mostly photographic data (JPEG = Joint Photographic Experts Group) jpeg will compress better	07:42.15
	The lossy nature of JPEG means that black/white transitions turn into gray blurry transitions, so text is a bad thing to apply JPEG to	07:42.53
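The PNG half of kens's explanation can be checked with the Python standard library alone, since PNG stores its pixel data with DEFLATE, the scheme zlib implements. The sketch below compresses a synthetic "text-like" page (large white areas with solid black bars) and a "photo-like" page (random noise, an extreme stand-in for photographic detail). Both pages are invented test data, and the JPEG side is not shown because it needs a third-party library.

```python
import random
import zlib

WIDTH, HEIGHT = 400, 400  # one byte per pixel, greyscale


def text_like_page():
    """Mostly white page with solid black bars, roughly like lines of text."""
    rows = []
    for y in range(HEIGHT):
        if y % 40 < 8:
            rows.append(b"\x00" * 300 + b"\xff" * (WIDTH - 300))
        else:
            rows.append(b"\xff" * WIDTH)
    return b"".join(rows)


def photo_like_page(seed=0):
    """Random noise: a worst case for a lossless, run-oriented compressor."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))


text_size = len(zlib.compress(text_like_page()))
photo_size = len(zlib.compress(photo_like_page()))
print("text-like:", text_size, "bytes; photo-like:", photo_size, "bytes")
```

Both inputs are the same size before compression, so any difference in the printed numbers comes purely from how well long runs of identical pixels suit DEFLATE.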
chrisl	Of course, the best option is to use some sort of container file that allows you to use the best format for each individual type of data - like, um, PDF........	07:44.17
aiena	chrisl: ah	09:31.25
chrisl	Ah?	09:32.42
aiena	urm I meant I understood	09:49.03
	I was under the impression pdf stores data as vector and raster	09:49.21
chrisl	It does, that's the point.	09:49.45
	Generally, things like text are smaller when stored as "text" rather than as rendered bitmaps of text, they also remain scalable. Similarly with line art. OTOH, photographic images can be stored as JPEG, for the most compact representation.	09:51.40
kens	It also can use different compression strategies for different kinds of image data. So monochrome images (which are black and white and do not compress well with JPEG) can be compressed with JBIG2 or CCITT G4 fax, while colour images can be compressed with JPEG	09:52.01
aiena	kens: How can I make gs use png for black and white and jpeg for color	10:57.10
kens	select different device	10:57.25
aiena	kens: can gs detect which pages of a pdf are plain text and use png for that and pages with color use jpeg at q90 ?	10:57.53
kens	Not really	10:58.07
	You would have to interpret the page to discover whether it is colour or not. And even then it could be incorrect	10:58.32
	There's no provision for changing device in the middle of a job. You could write a device which tracked colour usage and wrote the output file with different compression	11:00.14
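Since the device cannot change in the middle of a job, one workaround is to run a separate job per page and "select different device" each time. The sketch below only builds the command lines. It assumes the colour/not-colour decision for each page has already been made by some other means (the page_is_colour mapping is hypothetical input, and as kens notes such a classification may be wrong), and it uses the -dFirstPage/-dLastPage and -dJPEGQ switches from the Ghostscript documentation.

```python
def per_page_commands(pdf_path, page_is_colour, jpeg_quality=90, dpi=300):
    """Build one gs argv list per page, choosing the device by page type.

    page_is_colour maps 1-based page numbers to True/False; how that
    mapping is produced is outside the scope of this sketch.
    """
    commands = []
    for page, is_colour in sorted(page_is_colour.items()):
        if is_colour:
            device_args = ["-sDEVICE=jpeg", f"-dJPEGQ={jpeg_quality}"]
            out = f"{page:03d}.jpg"
        else:
            device_args = ["-sDEVICE=png16m"]
            out = f"{page:03d}.png"
        commands.append([
            "gs", "-dNOPAUSE", "-dBATCH", f"-r{dpi}",
            f"-dFirstPage={page}", f"-dLastPage={page}",
            *device_args, f"-sOutputFile={out}", pdf_path,
        ])
    return commands


for cmd in per_page_commands("book.pdf", {1: False, 2: True}):
    print(" ".join(cmd))
```

Starting Ghostscript once per page re-opens the PDF every time, so this trades speed for per-page control.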
psmlbhor	hello. Is there any provision with MuPDF to read input from stdin like the '-_' option in ghostscript?	11:02.39
Robin_Watts	psmlbhor: Using what tool?	11:05.02
	Mupdf (the C lib) will read from any source you want.	11:05.12
	mutool etc are a different kettle of fish.	11:05.50
chrisl	And it still won't stream from stdin	11:06.12
aiena	kens: ok I have decided to use png for everything it seems like a better option every choice you pick involves some tradeoff. Besides sticking to one standard format makes things easier for apps too	11:08.50
kens	Yeah I would use PNG I expect	11:09.20
psmlbhor	Robin_Watts, Does mutool draw or convert support something like that ?	11:11.12
sebras	tor8: i think dealing with writeByte wouldn't be a problem?	11:11.34
	tor8: even if .length changes.	11:11.50
	tor8: but it is also writable for the world and that just feels wrong...	11:12.14
chrisl	psmlbhor: reading PDF from stdin is a rather pointless exercise, because the format isn't streamable.	11:12.49
Robin_Watts	psmlbhor: No, mutool draw does not, because PDF needs to be accessed randomly.	11:13.16
psmlbhor	chrisl, Robin_Watts : thank you. That will help	11:13.45
Robin_Watts	psmlbhor: If you want to handle streaming (so processing while data is still arriving) that can be done.	11:14.09
	but that involves cunning buffering code.	11:14.35
psmlbhor	Robin_Watts, how can that be done?	11:15.13
Robin_Watts	Read docs/progressive.txt	11:15.28
chrisl	psmlbhor: for the vast majority of PDF files, you just end up buffering up the entire input file to temporary storage.	11:15.56
Robin_Watts	Even with linearised files you'll still end up buffering the whole input file to temporary storage.	11:16.35
psmlbhor	yes I was thinking the same	11:16.59
chrisl	At best, even linearised files only allow accelerated display of the first page	11:17.04
Robin_Watts	The key difference is that a) you might be able to process the early pages faster, and b) if you have control over the fetching mechanism, you can arrange to get the bits of the file you need earlier.	11:17.21
	chrisl: No, our implementation is better than that.	11:17.31
chrisl	Then it's random access, not streaming	11:17.56
Robin_Watts	Yes. If you have control over the fetching mechanism, then you can do better than streaming.	11:18.24
	and often you can get accelerated access to pages other than page 1 even with just streaming.	11:18.58
psmlbhor	Robin_Watts, chrisl : Ok, I'll read progressive.txt. I think it is relevant to me	11:20.04
Robin_Watts	psmlbhor: My advice would be to get it working with simple 'download the whole file, then decode it' first.	11:20.52
	Then worry about using progressive mode as an enhancement.	11:21.17
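The simple approach Robin_Watts recommends (get the whole file first, then decode it) amounts to buffering the incoming stream into temporary storage, because PDF needs random access. A minimal sketch using only the standard library; spool_to_tempfile is a made-up helper name, and the in-memory stream stands in for stdin or a network download.

```python
import io
import os
import shutil
import tempfile


def spool_to_tempfile(stream, suffix=".pdf"):
    """Copy a binary stream to a named temporary file and return its path.

    The caller is responsible for deleting the file when finished.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        shutil.copyfileobj(stream, tmp)
        return tmp.name


# Demo with an in-memory stream; in real use this might be sys.stdin.buffer.
demo_path = spool_to_tempfile(io.BytesIO(b"%PDF-1.4\n"))
print(os.path.getsize(demo_path))  # 9
os.remove(demo_path)
```

The returned path is an ordinary seekable file, so it can be handed to a tool that expects a file name, such as mutool draw.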
psmlbhor	Robin_Watts, ok. Thanks	11:22.00
tor8	sebras: yes, matching java's List with .size() and .get(i) methods seems like a better idea.	11:22.55
	Robin_Watts: new trick I learned today ... 'git rebase --whitespace=fix' to fix up whitespace errors in a commit	12:32.09
	Robin_Watts: might be worth passing on to fred when he shows up	12:32.28
	fredross-perry: (for the logs) you may also want to run 'git config user.email fred.ross-perry@artifex.com' since your commits are currently signed by Fred-Ross-Perrys-Computer.local :)	12:34.58
fredross-perry	tor8 - thanks for the git tips	15:09.28
