| | 2012/12/22 |
Robin_Watts | no. | 00:13.28 |
bencc | Robin_Watts: do you know how it works? | 00:40.19 |
Robin_Watts | bencc: I don't know, but I have some ideas as to how it might - and gs doesn't do anything like that. | 01:14.22 |
ray_work | bencc: Have you tried -dDOINTERPOLATE on the scanned PDF ? | 01:29.03 |
| bencc: That's a gs option that smooths scanned images. But gs doesn't do any OCR or speck removal (these are usually the task of software like ABBYY FineReader or ScanSoft OCR) | 01:30.37 |
| bencc: AFAIK, Acrobat (even full Acrobat, not just the free reader) does not do any OCR, either | 01:31.50 |
bencc | ray_work: Acrobat writer does OCR | 01:44.06 |
| is OCR related to improved readability of scanned images? | 01:44.34 |
| I don't really need the actual text just want it to look nicer | 01:44.46 |
| -dDOINTERPOLATE didn't improve the PDF | 01:45.00 |
ray_work | bencc: I have full acrobat, and I don't know how to enable OCR. What options are you setting for that ? | 01:45.24 |
bencc | Tools -> Document Processing -> optimize scanned pdf | 01:45.46 |
| actually, I'm not using the OCR feature and still getting an excellent text quality | 01:46.10 |
ray_work | bencc: send a bug report with a sample file (can be a dummy) to bugs.ghostscript.com | 01:46.12 |
bencc | ray_work: ok, thanks | 01:46.31 |
| ray_work: the problem might be with a specific PDF reader | 01:46.58 |
ray_work | bencc: you should see _some_ difference with -dDOINTERPOLATE | 01:47.02 |
bencc | but still, after using Acrobat optimization it looks great also in this reader | 01:47.27 |
ray_work | bencc: at least if you are viewing it with GS | 01:47.38 |
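One way to check ray_work's point is to render the same page with Ghostscript twice, once with and once without -dDOINTERPOLATE, and compare the two bitmaps. The following is a minimal sketch, assuming `gs` is on the PATH; the helper names, the output file names, and the choice of the png16m device at 150 dpi are illustrative assumptions, not anything specified in the discussion.

```python
import shutil
import subprocess


def build_gs_command(pdf_path, out_png, interpolate, dpi=150):
    """Build a Ghostscript command line that renders page 1 of pdf_path to a PNG."""
    cmd = [
        "gs", "-dBATCH", "-dNOPAUSE", "-dSAFER",
        "-sDEVICE=png16m", f"-r{dpi}",
        "-dFirstPage=1", "-dLastPage=1",
    ]
    if interpolate:
        # Ask gs to smooth images while rendering.
        cmd.append("-dDOINTERPOLATE")
    cmd += [f"-sOutputFile={out_png}", pdf_path]
    return cmd


def render_both(pdf_path):
    """Render with and without interpolation; return the two output paths."""
    if shutil.which("gs") is None:
        raise RuntimeError("gs not found on PATH")
    outputs = []
    # File names below are hypothetical, chosen only for this sketch.
    for interpolate, name in ((False, "plain.png"), (True, "interpolated.png")):
        subprocess.run(build_gs_command(pdf_path, name, interpolate), check=True)
        outputs.append(name)
    return outputs
```

Calling `render_both("scan.pdf")` (a placeholder file name) leaves two PNGs that can be compared side by side in any image viewer, which takes the PDF viewer out of the comparison.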
bencc | ok | 01:48.05 |
ray_work | bencc: curious -- windows or mac/linux ? | 01:48.06 |
bencc | windows | 01:48.11 |
| using gs on ubuntu | 01:48.16 |
| and watching the PDF using pdf.js in the browser | 01:48.27 |
ray_work | bencc: OK. If you put a bug report in, one of us will have a look. | 01:48.44 |
bencc | ok. thanks | 01:49.40 |
ray_work | bencc: the browser may be ignoring the /Interpolate true of the scanned image, but Acrobat may be interpolating the image into the PDF so that any viewer will see the interpolated (smoothed) image. Of course, the file is much larger that way | 01:50.23 |
| bencc: please attach the PDF with and without the "Tools -> Document Processing -> optimize scanned pdf" | 01:51.26 |
| bencc: or the original PDF and the one with "Tools -> Document Processing -> optimize scanned pdf" | 01:51.51 |
bencc | ok | 01:52.07 |
| the original is 115KB | 01:52.13 |
| the optimized is 50KB | 01:52.18 |
ray_work | bencc: since doing image smoothing when the file has /Interpolate true is "implementation dependent" according to the spec, the pdf.js may be ignoring it | 01:53.23 |
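What "interpolating the image into the PDF" could mean, as ray_work speculates, is resampling the scanned pixels ahead of time so that a viewer which ignores /Interpolate still shows smooth edges. The sketch below illustrates that idea with plain bilinear upscaling of a grayscale grid. It is only an assumption about the general technique, not a description of what Acrobat actually does, and `upscale_bilinear` is a hypothetical helper.

```python
def upscale_bilinear(pixels, factor):
    """Upscale a 2-D list of gray values (0..255) by an integer factor."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(src_h * factor):
        # Map the destination row back to a fractional source row.
        sy = min(y / factor, src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(src_w * factor):
            sx = min(x / factor, src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend the four surrounding source pixels.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out


# A hard black-to-white edge gains an intermediate gray step.
smoothed = upscale_bilinear([[0, 255]], 2)
```

The output has `factor * factor` times as many pixels as the input, which is why a file smoothed this way would be expected to grow before compression.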
bencc | Acrobat has checkboxes for JPEG2000, JBIG2 and filters for text sharpening, deskew and descreen | 01:53.26 |
| so I'm probably looking for a command line tool that can improve the actual embedded images | 01:54.05 |
| like Acrobat does | 01:54.10 |
ray_work | bencc: none of those imply OCR | 01:54.18 |
bencc | right | 01:54.24 |
| I don't need OCR | 01:54.27 |
| http://tv.adobe.com/watch/acrobat-x-tips-tricks/quick-tip-how-to-optimize-a-scanned-to-pdf-document/ | 01:55.09 |
ray_work | bencc: we'll have a look. GS has several different image filters. JPEG2000 and JBIG2 generally can't improve quality, text sharpening might be what is doing it. And whether or not 'descreen' is in effect depends on what the scanned PDF was | 01:56.29 |
bencc | does GS have text sharpening? | 01:57.10 |
| what does descreen mean? | 01:57.20 |
ray_work | bencc: thanks for the URL, I'll view it. I have a scanner (Fujitsu ScanSnap) that scans to PDF. | 01:57.20 |
bencc | thank you | 01:57.53 |
ray_work | bencc: if the scanned PDF image is not "contone" (BitsPerComponent 8), as a tiffg4 or ccitt g4 image may be, the scanning software may represent shades of gray with a halftone pattern | 01:58.51 |
bencc | ok | 01:59.26 |
ray_work | bencc: so 'descreen' generally does a modified low pass filter to try and derive a gray shade for a screened area | 01:59.36 |
bencc | you gave me an idea | 02:00.01 |
| to only apply one change at a time and see what improves the PDF quality | 02:00.20 |
ray_work | bencc: it generally is helpful on photo images, but text alone will scan with gray at the edges, so edge smoothing is what is needed | 02:00.55 |
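The descreen idea described above can be sketched with the simplest possible low-pass filter: average each 1-bit pixel with its neighbours so that a halftone pattern collapses to the gray shade it was standing in for. This uses a plain box filter, not the "modified" filter ray_work mentions, and `box_lowpass` is a hypothetical helper written for illustration.

```python
def box_lowpass(bitmap, radius=1):
    """Average each pixel with its neighbours; bitmap holds 0 (black) or 1 (white)."""
    h, w = len(bitmap), len(bitmap[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = count = 0
            # Sum the window around (x, y), clipped at the image border.
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += bitmap[ny][nx]
                    count += 1
            row.append(total / count)
        out.append(row)
    return out


# A checkerboard is a 50% halftone screen; filtering recovers roughly mid-gray.
checker = [[(x + y) % 2 for x in range(6)] for y in range(6)]
gray = box_lowpass(checker)
```

The same averaging also blurs sharp text edges, which is consistent with the point that text needs edge smoothing rather than descreening.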
bencc | so is there a chance GS will support a feature like this? | 02:00.59 |
| I mean text sharpening | 02:01.19 |
ray_work | bencc: well, if you have a lottery ticket, there's always a chance ;-) | 02:01.44 |
bencc | I don't | 02:02.20 |
ray_work | bencc: I haven't examined text sharpening to see what it looks like, so I'll wait until you post example files | 02:02.28 |
bencc | it could be out of scope or not | 02:02.32 |
| ok. cool | 02:02.40 |
ray_work | bencc: but we see LOTS of scanned PDFs, so being able to print/display them better would be useful to our user/customer base, I would think | 02:03.34 |
| bencc: and we have a few on the staff with image processing experience (mvrhel, Robin_Watts, myself, and maybe others) | 02:04.16 |
bencc | ray_work: which component is related to this bug? | 02:06.07 |
ray_work | bencc: I get email for all bug reports. I won't be attending to IRC for a while (dinner) | 02:06.11 |
bencc | ok. thanks | 02:06.29 |
ray_work | bencc: probably just "graphics library" | 02:06.38 |
| (or pdfwrite if you are trying to make an improved PDF from a scanned PDF) -- just guess and we'll figure it out | 02:07.18 |
bencc | ok | 02:07.19 |
aibek | hello all. Are PostScript files produced by GPL GhostScript copyrighted? I see this copyright notice in the file: | 03:20.40 |
| % This copyright applies to everything between here and the %%EndProlog: | 03:21.20 |
| % Copyright (C) 2010 Artifex Software, Inc. All rights reserved. | 03:21.21 |
| %%BeginResource: procset GS_pswrite_2_0_1001 1.001 0 | 03:21.21 |
| ... | 03:21.21 |
| %%EndProlog | 03:21.21 |
| I could find no information about this on the WWW. | 03:21.30 |
| Nothing on http://www.ghostscript.com or http://pages.cs.wisc.edu/~ghost/ | 03:22.16 |
| This is present in all files produced by pdf2ps, gs v. 8.71. | 04:01.28 |
| I am using GPL Ghostscript 871. Someone said on the Debian IRC channel that the notice is not present in GPL Ghostscript 905. | 04:52.09 |
mvrhel_laptop | ray_work: you there? | 06:17.07 |
ray_work | oops. mvrhel not here anymore | 06:35.29 |
| mvrhel: (for the logs) just call me... I am avail this weekend and Monday | 06:43.34 |