| 20160715 |
fredross-perry | Robin, tor8, sebras - several commits on my master. One of them is the one that moves the Java sources into a subfolder. | 00:18.56 |
aiena | How do I render a PDF to images properly? I used this command "gs -dNOPAUSE -dBATCH -sDEVICE=jpeg -r250 -sOutputFile='%00d.jpg' ./SILVER\ QUEEN.pdf" to render out JPGs of each page, but some embedded fonts render poorly | 06:26.14 |
chrisl | aiena: Well, I wouldn't use jpeg, since the quantisation will blur high frequency color changes (like the edges of glyphs). You might get better perceived results by using -dTextAlphaBits=2 or -dTextAlphaBits=4. Finally, use a higher resolution: 250dpi doesn't give many pixels for detailed objects like glyphs | 06:29.46 |
aiena | so what dpi do you recommend ? | 06:30.30 |
chrisl | Well, the higher the better, but *bare* minimum I'd use would be 300 | 06:31.18 |
aiena | chrisl: so its better to render at high dpi and then scale down the images | 06:32.43 |
chrisl | aiena: it is better to use a higher resolution: whether you render at low resolution, or scale down afterwards, you're losing detail, which will always make text harder to read | 06:34.33 |
| However, TextAlphaBits effectively renders text at high resolution, and downsamples to an antialiased representation which *some* people prefer | 06:35.50 |
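The flags discussed so far can be collected into one command line. This is a minimal sketch, not a recommended configuration: the helper name `gs_render_args`, the output pattern, and the default device are assumptions made for illustration, while `-r`, `-dTextAlphaBits`, `-dNOPAUSE`, `-dBATCH`, `-sDEVICE` and `-sOutputFile` are the Ghostscript options used in this conversation.

```python
def gs_render_args(pdf_path, out_pattern="page-%03d.png", dpi=300,
                   device="png16m", text_alpha_bits=4):
    """Build a Ghostscript argument list for rendering a PDF to page images.

    The defaults (300 dpi, 4 text alpha bits, png16m) are illustrative
    choices, not values mandated by Ghostscript.
    """
    # TextAlphaBits accepts 1 (no antialiasing), 2 or 4.
    if text_alpha_bits not in (1, 2, 4):
        raise ValueError("text_alpha_bits must be 1, 2 or 4")
    return [
        "gs", "-dNOPAUSE", "-dBATCH",
        f"-sDEVICE={device}",
        f"-r{dpi}",
        f"-dTextAlphaBits={text_alpha_bits}",
        f"-sOutputFile={out_pattern}",
        pdf_path,
    ]

# To actually render (requires gs on PATH):
#   import subprocess
#   subprocess.run(gs_render_args("SILVER QUEEN.pdf"), check=True)
```

Passing the arguments as a list avoids having to escape the space in a file name such as "SILVER QUEEN.pdf".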
Timmy | Hello | 06:39.33 |
ghostbot | Welcome to #ghostscript, the channel for Ghostscript and MuPDF. If you have a question, please ask it, don't ask to ask it. Do be prepared to wait for a reply as devs will check the logs and reply when they come on line. | 06:39.33 |
Timmy | I was wondering if someone could help me understand the scaling that is involved with the placement of an annotation onto a PDF. | 06:40.51 |
aiena | hmm, my computer froze and I lost the flags you specified; they are not in my IRC logs either | 06:41.07 |
| can someone play back the previous conversation a bit for me | 06:41.19 |
chrisl | aiena: http://ghostscript.com/irclogs/current.html | 06:41.36 |
aiena | thanks | 06:41.56 |
chrisl | aiena: I'd like to be able to give a clearer, more definitive answer, but it's all about trade-offs, and what's the best trade-off in one situation may not be the same as in others........ | 06:43.02 |
aiena | I agree | 06:43.15 |
| this is for rendering of pages on mobile | 06:43.27 |
chrisl | Rendering *on* mobile, or rendering *for* mobile? | 06:45.32 |
aiena | chrisl: for mobile / tablet | 06:48.34 |
| chrisl: hmm, still have a problem | 06:48.42 |
| I rendered at a dpi of 500 and some fonts are not rendering correctly | 06:48.56 |
| so it's not a resolution issue | 06:49.05 |
| in the pdf they render fine | 06:49.12 |
chrisl | Okay, can you be more specific about what's wrong? | 06:49.36 |
aiena | chrisl: I'll make a screen shot | 06:49.58 |
| maybe you will understand. ImageMagick doesn't give the same problem; I think ImageMagick also uses gs | 06:50.28 |
chrisl | What version of gs are you using, and does it give any warning/error messages? | 06:50.47 |
aiena | I am using Ghostscript 9.18 (2015-10-05) and no errors it just says "Processing pages 1 through 145.\n" then "Page1\n" ... "Page145\n" and cleanly terminates. | 06:55.12 |
| chrisl: see http://paste.opensuse.org/view/raw/ed2d1cfc left is gs output right is pdf view | 06:56.08 |
| in okular | 06:56.14 |
chrisl | Oh, looks like a bad Truetype font...... | 06:56.39 |
aiena | yes rest of pages seem ok | 06:57.09 |
chrisl | Try adding -dGridFitTT=0 to your command line | 06:57.16 |
aiena | chrisl: rerunning same command will just overwrite files on linux right | 06:57.54 |
chrisl | Yeh | 06:58.14 |
aiena | gs -dNOPAUSE -dBATCH -sDEVICE=jpeg -r500 -dGridFitTT=0 -sOutputFile='%00d.jpg' "./SILVER QUEEN.pdf" | 06:58.15 |
alexcher | aiena: How does this file look in Acrobat? | 07:02.05 |
aiena | not sure, don't have it | 07:02.20 |
| but if okular renders it fine, Adobe's product should work better | 07:02.34 |
chrisl | Okular uses poppler which, IIRC, renders TrueType fonts unhinted, whereas we default to using the hints in a TTF | 07:05.13 |
| With Acrobat the experience I have is conflicting on that front..... | 07:05.39 |
aiena | chrisl: I checked on another system, Acrobat renders the same pdf fine | 07:07.16 |
chrisl | aiena: that doesn't mean the PDF or the embedded font is correct - Acrobat is hideously tolerant of breakages | 07:08.10 |
aiena | TOLERANT OF BREAKAGES AS IN IT WILL REPLACE THE EMBEDDED FONT WITH ANOTHER | 07:08.48 |
| oops | 07:08.50 |
| meant it in small forgive me | 07:08.55 |
chrisl | It does, but in this case I suspect it either doesn't apply hints, or only applies a subset of hints - being closed source, and Adobe being notoriously tight lipped about such things, it's very hard to know | 07:09.56 |
| aiena: Anyway, did the GridFitTT make a difference? | 07:12.01 |
aiena | yes | 07:12.06 |
| it did the font rendered correctly | 07:12.14 |
| so pdf looks awesome as image now | 07:12.31 |
| is it safe to include that param for all pdf's | 07:12.46 |
chrisl | Pretty much: you'll get the same kind of output that Okular produces | 07:13.12 |
aiena | :) | 07:13.32 |
| Well, Acrobat seems to produce the same kind of o/p as Okular too | 07:13.45 |
| so I was like, gs is so good and mature, so why was it rendering it weird | 07:14.03 |
chrisl | It's possible that there is a bug in the Freetype hinting code, but I've also seen plenty of TTFs where the hinting really does produce output like you were seeing | 07:14.37 |
aiena | what exactly is hinting | 07:15.07 |
| I really don't understand how pdf works. Just use it on a day to day basis | 07:15.23 |
chrisl | This isn't really PDF, it's fonts. | 07:15.40 |
aiena | hmm yeah got it | 07:16.20 |
chrisl | Hinting is a guide to the rendering engine on how best to fit the points in the outline of a glyph to the physical grid of the pixels in the output medium | 07:16.29 |
aiena | its how fonts are rendered | 07:16.31 |
| thanks so hinting is a vector to raster guide | 07:17.03 |
chrisl | Yes | 07:17.15 |
aiena | but rasterisation is the engines job | 07:17.18 |
| is the hinting stored in the document itself | 07:17.45 |
chrisl | It's in the font | 07:17.53 |
| in TTF, hinting can be global to the font, specific to each glyph, or both | 07:18.21 |
aiena | ok | 07:19.33 |
chrisl | The problem is the way TTF hinting works is, urm, really poor :-( | 07:19.59 |
aiena | Ok | 07:20.22 |
| chrisl: is there a flag to change jpegs quality | 07:20.59 |
| e.g. I want each jpeg at 90 % q | 07:21.06 |
| what is gs default | 07:21.14 |
chrisl | aiena: http://www.ghostscript.com/doc/9.19/Devices.htm#JFIF | 07:21.44 |
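The linked device documentation describes a quality setting for the jpeg device, `-dJPEGQ=N`, on a 0 to 100 scale. The sketch below builds a command line using it together with the `-dGridFitTT=0` switch from earlier in the conversation. The helper name `gs_jpeg_args` and the output pattern are assumptions for illustration; the default quality Ghostscript applies when the flag is omitted should be checked on the linked page rather than taken from this sketch.

```python
def gs_jpeg_args(pdf_path, quality=90, dpi=300, grid_fit_tt=None,
                 out_pattern="page-%03d.jpg"):
    """Build a Ghostscript argument list for JPEG output at a given quality.

    grid_fit_tt=None leaves Ghostscript's default TrueType hinting alone;
    grid_fit_tt=0 adds -dGridFitTT=0 to disable hinting.
    """
    if not 0 <= quality <= 100:
        raise ValueError("quality must be between 0 and 100")
    args = ["gs", "-dNOPAUSE", "-dBATCH", "-sDEVICE=jpeg",
            f"-r{dpi}", f"-dJPEGQ={quality}"]
    if grid_fit_tt is not None:
        args.append(f"-dGridFitTT={grid_fit_tt}")
    args += [f"-sOutputFile={out_pattern}", pdf_path]
    return args

# To actually render (requires gs on PATH):
#   import subprocess
#   subprocess.run(gs_jpeg_args("SILVER QUEEN.pdf", grid_fit_tt=0), check=True)
```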
aiena | thanks | 07:22.04 |
| chrisl: which png device should I use | 07:24.50 |
| I am unclear as there are several options | 07:24.57 |
| e.g. what would gimp use | 07:25.15 |
| suppose I were to render out to png | 07:25.30 |
| I dont want alpha | 07:25.42 |
chrisl | Probably png16m | 07:25.52 |
aiena | png16m is 8bit png right by default | 07:26.13 |
chrisl | 8 bit per sample, RGB | 07:26.29 |
aiena | ok | 07:26.36 |
chrisl | So, 24 bits per pixel | 07:26.40 |
| Being losslessly compressed, the PNG files will be larger than JPEGs, but will give more readable results for text | 07:28.43 |
aiena | chrisl: but my findings are peculiar | 07:38.32 |
| for pure image pages the pngs are larger | 07:38.43 |
| for pure text pages png's are smaller than the corresponding jpeg | 07:38.57 |
| I m trying to understand why that is the case | 07:39.07 |
| is it because gs converts from truecolor to grayscale etc. per page | 07:39.35 |
kens | jpeg is a lossy format, it achieves compression by discarding data which is 'indistinguishable' to the human eye | 07:39.36 |
aiena | for png but jpeg is always RGB | 07:39.53 |
| kens: but my pngs are smaller than jpegs in certain cases | 07:40.14 |
kens | So for image data, particularly photographs, it will achieve good compression. For data with a lot of high frequencies (black/white transitions, like text) it will achieve a poor looking result and compress badly | 07:40.24 |
aiena | about half the size | 07:40.25 |
| kens: Ah | 07:40.44 |
kens | Large areas of white/black compress well with compression schemes which work in other ways. | 07:40.53 |
| You need to choose a compression scheme which works well for the kind of data you expect to get. | 07:41.19 |
aiena | Ah thank you. So for books png is a better format | 07:41.23 |
chrisl | Sorry, I had assumed pages of mixed content | 07:41.26 |
aiena | chrisl: yes it is mixed content | 07:41.37 |
kens | For mostly text data, png will compress better, and will give nicer looking output, so a double win | 07:41.46 |
aiena | probably from books on wildlife etc jpeg is better | 07:41.53 |
| for literature png will give a cleaner result because of kens explanation | 07:42.10 |
kens | For mostly photographic data (JPEG = Joint Photographic Experts Group) jpeg will compress better | 07:42.15 |
| The lossy nature of JPEG means that black/white transitions turn into gray blurry transitions, so text is a bad thing to apply JPEG to | 07:42.53 |
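The point about flat black and white areas can be seen with a rough experiment. PNG compresses with DEFLATE, so the sketch below uses Python's `zlib` as a stand-in for it, on two synthetic 8-bit grayscale pages: one mostly white with short black runs (a crude imitation of text), and one of random noise (a crude, worst-case stand-in for dense photographic detail). The page generators are invented for illustration, and the sketch does not model JPEG at all.

```python
import random
import zlib

def make_text_like_page(width=400, height=400):
    """Mostly white page with a few short black runs on 'text' rows."""
    rng = random.Random(0)
    rows = []
    for y in range(height):
        row = bytearray([255]) * width
        if y % 10 < 6:  # six rows of "ink", then four rows of line spacing
            for _ in range(20):
                start = rng.randrange(width - 4)
                row[start:start + 4] = b"\x00" * 4
        rows.append(bytes(row))
    return b"".join(rows)

def make_noisy_page(width=400, height=400):
    """Every pixel random: nothing for a lossless scheme to exploit."""
    rng = random.Random(0)
    return bytes(rng.randrange(256) for _ in range(width * height))

def deflate_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)

print("text-like:", deflate_ratio(make_text_like_page()))
print("noisy:    ", deflate_ratio(make_noisy_page()))
```

The text-like page shrinks to a small fraction of its raw size, while the noisy page barely shrinks at all, which matches the observation that PNG wins on pure text pages and loses on pure image pages.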
chrisl | Of course, the best option is to use some sort of container file that allows you to use the best format for each individual type of data - like, um, PDF........ | 07:44.17 |
aiena | chrisl: ah | 09:31.25 |
chrisl | Ah? | 09:32.42 |
aiena | urm I meant I understood | 09:49.03 |
| I was under the impression pdf stores data as vector and raster | 09:49.21 |
chrisl | It does, that's the point. | 09:49.45 |
| Generally, things like text are smaller when stored as "text" rather than as rendered bitmaps of text, they also remain scalable. Similarly with line art. OTOH, photographic images can be stored as JPEG, for the most compact representation. | 09:51.40 |
kens | It can also use different compression strategies for different kinds of image data. So monochrome images (which are black and white and do not compress well with JPEG) can be compressed with JBIG2 or CCITT G4 fax, while colour images can be compressed with JPEG | 09:52.01 |
aiena | kens: How can I make gs use png for black and white and jpeg for color | 10:57.10 |
kens | select different device | 10:57.25 |
aiena | kens: can gs detect which pages of a pdf are plain text and use png for that and pages with color use jpeg at q90 ? | 10:57.53 |
kens | Not really | 10:58.07 |
| You would have to interpret the page to discover whether it is colour or not. And even then it could be incorrect | 10:58.32 |
| There's no provision for changing device in the middle of a job. You could write a device which tracked colour usage and wrote the output file with different compression | 11:00.14 |
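One workaround outside a single job is a two-pass approach: first interpret every page to measure colour usage, then render each page separately with the chosen device. The sketch below assumes the installed Ghostscript provides the `inkcov` device, which prints one line of C, M, Y, K coverage per page; the sample lines in the comments are illustrative, not output from the PDF discussed here. The helper names, the zero threshold, and the choice of `pnggray` for non-colour pages are all assumptions. As noted above, the result can still be wrong, for example when a grey is built from C, M and Y rather than K.

```python
def page_is_colour(inkcov_line, threshold=0.0):
    """Return True if an inkcov line reports any C, M or Y coverage.

    Expects a line shaped like:
        ' 0.10022  0.09563  0.10071  0.06259 CMYK OK'
    """
    fields = inkcov_line.split()
    cyan, magenta, yellow, _black = (float(v) for v in fields[:4])
    return max(cyan, magenta, yellow) > threshold

def choose_device(inkcov_line):
    """Pick jpeg for colour pages and pnggray for everything else."""
    return "jpeg" if page_is_colour(inkcov_line) else "pnggray"

# First pass (requires gs on PATH), one output line per page:
#   gs -q -dNOPAUSE -dBATCH -sDEVICE=inkcov -o - input.pdf
# Second pass, page N only, with the device chosen for that page:
#   gs -dNOPAUSE -dBATCH -dFirstPage=N -dLastPage=N -sDEVICE=... ...
```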
psmlbhor | hello. Is there any provision with MuPDF to read input from stdin like the '-_' option in ghostscript? | 11:02.39 |
Robin_Watts | psmlbhor: Using what tool? | 11:05.02 |
| Mupdf (the C lib) will read from any source you want. | 11:05.12 |
| mutool etc are a different kettle of fish. | 11:05.50 |
chrisl | And it still won't *stream* from stdin | 11:06.12 |
aiena | kens: ok, I have decided to use png for everything. It seems like the better option; every choice you pick involves some tradeoff. Besides, sticking to one standard format makes things easier for apps too | 11:08.50 |
kens | Yeah I would use PNG I expect | 11:09.20 |
psmlbhor | Robin_Watts, Does mutool draw or convert support something like that ? | 11:11.12 |
sebras | tor8: i think dealing with writeByte wouldn't be a problem? | 11:11.34 |
| tor8: even if .length changes. | 11:11.50 |
| tor8: but it is also writable for the world and that just feels wrong... | 11:12.14 |
chrisl | psmlbhor: reading PDF from stdin is a rather pointless exercise, because the format isn't streamable. | 11:12.49 |
Robin_Watts | psmlbhor: No, mutool draw does not, because PDF needs to be accessed randomly. | 11:13.16 |
psmlbhor | chrisl, Robin_Watts : thank you. That will help | 11:13.45 |
Robin_Watts | psmlbhor: If you want to handle streaming (so processing while data is still arriving) that *can* be done. | 11:14.09 |
| but that involves cunning buffering code. | 11:14.35 |
psmlbhor | Robin_Watts, how can that be done? | 11:15.13 |
Robin_Watts | Read docs/progressive.txt | 11:15.28 |
chrisl | psmlbhor: for the vast majority of PDF files, you just end up buffering up the entire input file to temporary storage. | 11:15.56 |
Robin_Watts | Even with linearised files you'll still end up buffering the whole input file to temporary storage. | 11:16.35 |
psmlbhor | yes I was thinking the same | 11:16.59 |
chrisl | At best, even linearised files only allow accelerated display of the first page | 11:17.04 |
Robin_Watts | The key difference is that a) you might be able to process the early pages faster, and b) if you have control over the fetching mechanism, you can arrange to get the bits of the file you need earlier. | 11:17.21 |
| chrisl: No, our implementation is better than that. | 11:17.31 |
chrisl | Then it's random access, not streaming | 11:17.56 |
Robin_Watts | Yes. If you have control over the fetching mechanism, then you can do better than streaming. | 11:18.24 |
| and often you can get accelerated access to pages other than page 1 even with just streaming. | 11:18.58 |
psmlbhor | Robin_Watts, chrisl : Ok, I'll read progressive.txt. I think it is relevant to me | 11:20.04 |
Robin_Watts | psmlbhor: My advice would be to get it working with simple 'download the whole file, then decode it' first. | 11:20.52 |
| Then worry about using progressive mode as an enhancement. | 11:21.17 |
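The simple "get the whole file first, then decode it" route can be sketched as spooling the incoming bytes to temporary storage and handing the resulting path to the tool. The helper names here are hypothetical, and the `mutool draw` invocation in the comment is only an example of what might be run on the spooled file; nothing in this sketch uses MuPDF's progressive mode.

```python
import os
import shutil
import tempfile

def spool_to_tempfile(stream, suffix=".pdf"):
    """Copy a binary stream (e.g. sys.stdin.buffer) to a temp file.

    Returns the path; the caller is responsible for deleting it.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        shutil.copyfileobj(stream, tmp)
        return tmp.name

def remove_spooled(path):
    """Delete a file created by spool_to_tempfile."""
    os.remove(path)

# Typical use, reading a PDF piped in on stdin (requires mutool on PATH):
#   import subprocess, sys
#   path = spool_to_tempfile(sys.stdin.buffer)
#   try:
#       subprocess.run(["mutool", "draw", "-o", "page-%d.png", path], check=True)
#   finally:
#       remove_spooled(path)
```

Because PDF needs random access, the tool only starts once the whole input has arrived, which is the buffering described above.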
psmlbhor | Robin_Watts, ok. Thanks | 11:22.00 |
tor8 | sebras: yes, matching java's List with .size() and .get(i) methods seems like a better idea. | 11:22.55 |
| Robin_Watts: new trick I learned today ... 'git rebase --whitespace=fix' to fix up whitespace errors in a commit | 12:32.09 |
| Robin_Watts: might be worth passing on to fred when he shows up | 12:32.28 |
| fredross-perry: (for the logs) you may also want to run 'git config user.email fred.ross-perry@artifex.com' since your commits are currently signed by Fred-Ross-Perrys-Computer.local :) | 12:34.58 |
fredross-perry | tor8 - thanks for the git tips | 15:09.28 |