| <<<Back 1 day (to 2016/03/07) | 20160308 |
frague | Hi guys | 16:03.21 |
| I'm trying to use ghostscript through the python-ghostscript binding, to thumbnail some PDF files, but I get an "undefined" error | 16:04.30 |
| I've tried many different PDF files, but always the same issue | 16:05.13 |
| (not from the PDF) | 16:05.26 |
| My parameters : | 16:05.53 |
| '-sDEVICE=png16m', '-dNOPAUSE', '-dNODISPLAY', '-dBATCH', '-dSAFER', '-r96x96', '-dFirstPage=1', '-dLastPage=1', (u'-sOutputFile=/home/fguerin/Boulot/workspace/django/intranet/' u'media/django_turnit/tourcoinginfo_22_web.001.png') | 16:06.03 |
| If I use the CLI version of GS, no issue. | 16:06.48 |
| Does somebody have an idea ? | 16:07.13 |
| I'm using GS release 9.18 on a debian / sid | 16:07.42 |
kens | Can't offer any help with Python. Whatever 'bindings' you are using, they are not from us | 16:13.00 |
| I have no idea what the (u'-sOutputFile...) stuff is about though | 16:14.05 |
malc_ | kens: u'' is just a unicode string literal | 16:14.41 |
kens | And tyhe braces ? | 16:14.57 |
frague | unicode parameters are not supported ? | 16:15.18 |
malc_ | kens: thye? | 16:15.27 |
| tyhe | 16:15.30 |
kens | I didn't say that, I asked what the braces are about ? | 16:15.34 |
| '(' and ')' | 16:15.40 |
malc_ | (a, b, c, ...) = syntax for tuple creation | 16:16.03 |
frague | The braces are for an array, normalized at gs initialization | 16:16.04 |
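A side note on the parenthesised item in the pasted arguments: in Python, parentheses only build a tuple when they contain a comma. Two adjacent string literals inside parentheses are simply joined into one string, so the last item in that list appears to be a single `-sOutputFile=...` argument. A minimal sketch of the difference (the variable names are illustrative, not from frague's code):

```python
# Adjacent string literals are concatenated into one string; the
# parentheses here only let the expression span two lines.
output_arg = (u'-sOutputFile=/home/fguerin/Boulot/workspace/django/intranet/'
              u'media/django_turnit/tourcoinginfo_22_web.001.png')

# A tuple needs a comma, even when it holds a single element.
one_tuple = (u'-sOutputFile=out.png',)

print(type(output_arg).__name__)
print(type(one_tuple).__name__)
```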
kens | It might help if you were to quote the actual error | 16:16.09 |
| So this is some kind of Python stuff, like I said, I can't help with Python, I don't speak it | 16:16.28 |
frague | Just 'undefined' | 16:16.36 |
malc_ | kens: it's mostly tsssssshhststs | 16:16.48 |
kens | No, there will be more, on the back channel | 16:16.58 |
malc_ | easy to pick up | 16:16.59 |
frague | I've nothing on stderr / stdout | 16:17.28 |
kens | Then how are you getting undefined ? | 16:17.40 |
| malc_ I don't need to learn any more languages :-) | 16:18.22 |
malc_ | kens: how many are under your belt? | 16:18.54 |
kens | frague, if you get nothing on stderr or stdout, where does the 'undefined' come from ? | 16:19.04 |
frague | I think it is the message returned and consumed through an exception | 16:19.18 |
kens | malc_ Not counting assembly languages ? Hmm.... I guess 7 | 16:19.47 |
malc_ | kens: lemme guess c,c++,c#,java? | 16:20.16 |
kens | frague either the message comes from Ghostscript, in which case there should be *much* more, or it doesn't, in which case Ghostscript is not the problem. | 16:20.17 |
malc_ | objc | 16:20.20 |
kens | Not C# or java or objective C | 16:20.30 |
frague | it's from the returncode of the liggs.gsapi_run* | 16:20.33 |
kens | BASIC, fortran, Pascal etc | 16:20.41 |
malc_ | kens: BASIC, fortran, Pascal, C, C++ = 5 | 16:21.03 |
kens | frague then there will be data being returned on the back channel, you need to capture that and tell us what it says. I can't tell you how to do that. | 16:21.09 |
frague | libgs.gsapi_run_* | 16:21.09 |
kens | malc_ I got interrupted | 16:21.21 |
malc_ | sorry | 16:21.25 |
frague | I'll try this... | 16:21.27 |
kens | PostScript is another of course, | 16:21.39 |
malc_ | let's call it Forth | 16:22.04 |
kens | Its still a programming language :) | 16:22.17 |
malc_ | ofcourse :) | 16:22.29 |
frague | return code is -21 | 16:24.29 |
kens | Sure, but that doesn't tell us *why* it's that value | 16:24.43 |
| That information is printed on stderr or stdout | 16:24.54 |
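For reference, the -21 reported above lines up with the 'undefined' text seen earlier: Ghostscript's interpreter numbers the standard PostScript errors as small negative integers, and -21 is `undefined`. The sketch below maps a few of those codes back to names. The table is a partial subset, and the helper name `gs_error_name` is hypothetical, not part of any Ghostscript binding.

```python
# Partial subset of Ghostscript's PostScript-level error codes.
# Codes not listed here fall through to a generic message.
GS_ERROR_NAMES = {
    -1: "unknownerror",
    -7: "invalidaccess",
    -9: "invalidfileaccess",
    -12: "ioerror",
    -13: "limitcheck",
    -15: "rangecheck",
    -18: "syntaxerror",
    -20: "typecheck",
    -21: "undefined",
    -22: "undefinedfilename",
    -25: "VMerror",
}


def gs_error_name(code):
    """Return the PostScript error name for a Ghostscript return code."""
    return GS_ERROR_NAMES.get(code, "unlisted code %d" % code)


print(gs_error_name(-21))
```

As kens says, the code alone only names the error class; the accompanying text on stdout/stderr is what says where it happened.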
Robin_Watts | frague: One possible test would be for you to try running the pdf file through gs NOT using python. | 16:26.28 |
kens | Robin_Watts : he said he did from the command line and it was OK | 16:26.42 |
Robin_Watts | kens: OK. | 16:26.47 |
frague | I did it, and it worked | 16:26.52 |
kens | That's why I want to know what GS is complaining about | 16:26.59 |
Robin_Watts | OK, so the problem is quite possibly how GS is being invoked. | 16:27.07 |
| And there is a trick we can do to capture that. | 16:27.16 |
frague | It's about using the gs api | 16:27.33 |
Robin_Watts | OK. | 16:27.48 |
frague | from <stdout>: Unrecoverable error: undefined in obj | 16:28.23 |
| Operand stack: 475 0 | 16:28.36 |
kens | Well, that suggests the PDF interpreter has not been initialised. Can't see any way that you could get that result. | 16:28.54 |
| Especially not if the file works from the command line | 16:29.06 |
| On stdout it should also have echoed the version of GS being used, start by checking it's the one you expect and not some ancient version | 16:29.47 |
frague | GPL Ghostscript 9.18 (2015-10-05) Copyright (C) 2015 Artifex Software, Inc. All rights reserved. This software comes with NO WARRANTY: see the file PUBLIC for details. Unrecoverable error: undefined in obj Operand stack: 475 0 | 16:30.36 |
kens | And you are certain the same PDF file works from the command line ? Because I can't see any way that could be the case | 16:31.17 |
frague | (I just did apt-get it from sid / debian) | 16:31.19 |
| gs -q -dBATCH -sDEVICE=png16m -dNOPAUSE -dSAFER -r96 -dFirstPage=1 -dLastPage=1 -sOutputFile=./test.png /home/fguerin/Boulot/workspace/django/intranet/media/filer_public/f6/bb/f6bb80cc-55f5-45c3-9b39-e4c0f982ca00/tourcoinginfo_22_web.pdf | 16:31.39 |
kens | Well, I have no idea. If GS works from the command line, then it should work from anywhere | 16:32.13 |
frague | '-sDEVICE=png16m', '-dNOPAUSE', '-dNODISPLAY', '-dBATCH', '-dSAFER', '-r96x96', '-dFirstPage=1', '-dLastPage=1', (u'-sOutputFile=/home/fguerin/Boulot/workspace/django/intranet/' u'media/django_turnit/tourcoinginfo_22_web.001.png') | 16:32.41 |
kens | If it were me I would start by reducing the complexity of the command line. Make paths smaller, remove switches I didn't need etc | 16:32.49 |
frague | << args passed to GS | 16:32.55 |
kens | Well you don't have an input file in those args | 16:33.10 |
| If you are trying to pass a PDF file as PostScript that's not going to work | 16:33.42 |
frague | ? | 16:34.17 |
kens | If you are expecting to be able to read chunks of data from a PDF file and pass them to Ghostscript, then you can't. | 16:34.36 |
| Instead pass the input filename as one of the arguments | 16:35.18 |
| Just like on the command line | 16:35.24 |
frague | I pass the file name via a "run_filename" command that launches the libgs.gsapi_run_file API command | 16:35.49 |
| I'll try this | 16:36.11 |
kens | Right | 16:36.16 |
frague | Worked better ! - same message as in the CLI | 16:38.46 |
| but no output | 16:40.01 |
| :( | 16:40.10 |
| (no file created) | 16:40.21 |
kens | Try it without the Unicode and a simpler path | 16:40.28 |
chrisl | -dNODISPLAY == no output | 16:40.28 |
kens | OH, well there you go | 16:40.35 |
| From Use.htm: | 16:41.26 |
| " | 16:41.26 |
| -dNODISPLAY | 16:41.26 |
| Initializes Ghostscript with a null device (a device that discards the output image) rather than the default device or the device selected with -sDEVICE=. This is usually useful only when running PostScript code whose purpose is to compute something rather than to produce an output image. " | 16:41.26 |
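Putting the two fixes from this exchange together (pass the input file as an argument, and drop -dNODISPLAY), a corrected argument list might look like the sketch below. The function name `build_thumbnail_args` and the file names are hypothetical. The leading "gs" entry is there because the Ghostscript API treats the first argument like C's argv[0] and ignores it, so real switches should start at the second position.

```python
def build_thumbnail_args(pdf_path, png_path, dpi=96):
    """Build a Ghostscript argument list that renders page 1 to a PNG."""
    return [
        "gs",                          # argv[0]: program name, ignored by the API
        "-dBATCH",
        "-dNOPAUSE",
        "-dSAFER",
        "-sDEVICE=png16m",             # no -dNODISPLAY, which would discard output
        "-r%d" % dpi,
        "-dFirstPage=1",
        "-dLastPage=1",
        "-sOutputFile=%s" % png_path,
        pdf_path,                      # input file last, as on the command line
    ]


args = build_thumbnail_args("input.pdf", "thumb.png")
# Assumption: with python-ghostscript this list would be handed over as
# ghostscript.Ghostscript(*args); that call is not executed in this sketch.
print(args)
```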
frague | I'll try this tomorrow, I've to go now... (french, it's 17:47 here ) | 16:47.46 |
| Bye and thank you for your help ! | 16:48.05 |
rayjj | frague: (for the logs) you can use -Z: to have Ghostscript print out its argument list (to make sure they are coming in properly). For example: gswin64c -Z: examples/tiger.eps will print: | 18:18.12 |
| % Init started, instance 0x00000148E6FF6DD0, with args: -dDisplayFormat=198788 -dDisplayResolution=120 -Z: examples/tiger.eps | 18:18.14 |
| similarly on linux, but without the -dDisplayFormat=... and -dDisplayResolution=... since those are specific to the windows version | 18:18.15 |
| frague: it also prints out: % Start time = .... when initialization of the interpreter is complete | 18:20.05 |
| note, these messages go to stderr not stdout | 18:22.18 |
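Since the -Z: messages go to stderr, a wrapper script has to capture that stream separately to see them. The sketch below shows one way to do that from Python with the standard `subprocess` module. The helper name `run_and_capture` is hypothetical, and a stand-in child process is used so the example runs without Ghostscript installed; with gs on the PATH the command would instead be "gs", "-Z:" and the usual arguments.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of strings); return (returncode, stdout, stderr) as text."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.returncode, proc.stdout, proc.stderr


# Stand-in child that writes one line to stderr, imitating the kind of
# message -Z: produces. Replace with a real gs command when available.
demo = [sys.executable, "-c",
        "import sys; sys.stderr.write('% Init started\\n')"]
code, out, err = run_and_capture(demo)
print(code, repr(out), repr(err))
```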
kent3116 | Hi. I'm having an issue when trying to use 'pdf_insert_page' - that's causing a 'catch' to occur and then is causing an Assertion fail in ../source/fitz/context.c | 18:36.39 |
| Expression: ctx->error->top = -1 | 18:36.58 |
Robin_Watts | kent3116: So you're falling pdf_insert_page without an enclosing fz_try(ctx) ... fz_catch(ctx) ... ? | 18:37.56 |
| s/falling/calling/ | 18:38.01 |
kent3116 | Question: For my testing, I'm opening an existing PDF document of 4 pages. I'm opening the same PDF document, looping through each page and calling 'pdf_insert_page' to append it onto the end. Can I use the same 'ctx' for both documents ? or do I need a separate 'ctx' for each document ? | 18:39.25 |
| fz_try(pdf_ctx) { int lastpage = pdf_count_pages(pdf_ctx, pdf_doc); debugAppend("End Page", (long)lastpage); pdf_insert_page(pdf_ctx, pdf_doc, page, lastpage); // Add page onto the end ??? debugAppend("Page Added"); numPagesAdded--; } fz_catch(pdf_ctx) { | 18:39.41 |
Robin_Watts | kent3116: If you're running in a single thread, a single context is fine. | 18:40.10 |
kent3116 | fz_try(pdf_ctx) { int lastpage = pdf_count_pages(pdf_ctx, pdf_doc); pdf_insert_page(pdf_ctx, pdf_doc, page, lastpage); numPagesAdded--; } fz_catch(pdf_ctx) { | 18:40.13 |
| Yes - single thread | 18:40.18 |
Robin_Watts | kent3116: Use pastebin, please :) | 18:40.47 |
kent3116 | How ? | 18:40.57 |
mvrhel_laptop | bbiab | 18:41.20 |
Robin_Watts | Go to pastebin.com in your browser. Drop in your text there, hit submit. Drop the resultant URL in here. | 18:41.39 |
| kent3116: That exception suggests that fz_throw() is being called somewhere that's not in an fz_try() block. | 18:43.38 |
kent3116 | http://pastebin.com/raw/8nAEYZj2 | 18:43.52 |
| http://pastebin.com/8nAEYZj2 | 18:44.28 |
Robin_Watts | Ok, at what point are you seeing the assert? | 18:45.39 |
kent3116 | Between the 'End Page' and 'Page Added' according to my debug logs - so I'm thinking on the 'pdf_insert_page' line. | 18:47.33 |
Robin_Watts | Can you not see the callstack from the assert? | 18:47.51 |
kent3116 | I'm getting mixed results in that sometimes it adds 2 pages and fails on the 3rd. Other times, it fails on the first | 18:48.56 |
Robin_Watts | Can you not see the callstack from the assert? | 18:49.13 |
kent3116 | No - I don't get anything until I quit the app - then I just get a MS Runtime Error with the info above | 18:50.32 |
Robin_Watts | I don't understand. | 18:51.38 |
| Is this code only called on quitting? | 18:52.10 |
kent3116 | Sorry - getting kids ready for school, etc. | 19:00.48 |
| I'm getting the assert most times I try to call the logic. However, it's not until I quit the app that I'm getting the MS Runtime Error. | 19:02.06 |
| Okay - work day is calling. If you can see anything wrong with my code - appreciate letting me know and I'll check back later today. | 19:08.52 |
| Thanks | 19:08.53 |
| Kent. | 19:08.54 |
Robin_Watts | kent3116: My work day is over (UK time). Can't see anything obvious, and I don't have time to dig into it. I think you need to sort your environment so that it stops in a debugger when the assert goes off. | 19:09.51 |
tkamppeter | chrisl, hi | 19:15.04 |
Hinnerk | Hi. I'm just trying out pdfsandwich, which essentially uses some preprocessing, then calls tesseract and finally gs to produce a PDF including the OCR text. Now I found that the text contains spaces between each letter. When I checked the tesseract output, there were no (extra) spaces present, so apparently they were introduced in the final step calling gs. | 22:08.37 |
| The options in the call are: | 22:10.00 |
| -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dDEVICEWIDTHPOINTS=612 -dDEVICEHEIGHTPOINTS=842 -dPDFFitPage -o (file) (file) | 22:10.02 |
| what can I do to keep gs from adding spaces? I looked at the options of pdfwrite, but can't make sense of it... | 22:10.56 |
| nothing seems to relate. | 22:11.07 |
marcosw | Hinnerk: can you email a copy of the tesseract output to me at support (at) artifex.com? | 22:27.33 |
Hinnerk | I just found something that does relate: http://bugs.ghostscript.com/show_bug.cgi?id=696116 | 22:28.41 |
| However, this is NOT just an issue of Acrobat Reader, I just checked with Sumatra Read, same issue. Is there some reader that does not have issues with the output? | 22:28.43 |
| Supposedly it is not a bug - at least it's WONTFIX. But it still seems odd to me, that the output is essentially unusable ... | 22:29.48 |
marcosw | I don't know of a reader that doesn't have this issue, but otoh, I haven't looked. | 22:30.58 |
| my reading of the final comment of the bug is that it only affects files that use a certain type of CIDFont. | 22:31.49 |
Hinnerk | that is of the input file? | 22:32.17 |
| I have to admit, I only understand 10% of the comments. | 22:32.18 |
| If so, can I somehow change the font? | 22:32.41 |
marcosw | yes, the tesseract generated file that is input to Ghostscript. | 22:32.47 |
| why is tesseract calling ghostscript? The pdf file attached to the bug you reference seems perfectly reasonable. | 22:33.57 |
Hinnerk | I don't know. | 22:40.47 |
marcosw | It looks like there won't be much I can do to help you. | 22:43.30 |
| sorry | 22:43.33 |
Hinnerk | I just split the commands pdfsandwich is using and left out the call to gs at the end. looks perfect to me. Copy paste leads to some very stupid font and size in Word, but I could care less. At least I will be able to search documents. | 22:49.58 |
| well, you helped. I couldn't even make much sense of the bug's comments. So, thank you. | 22:51.53 |
| So long... | 22:59.18 |