Log of #ghostscript at irc.freenode.net.

20220725
artifexirc-bot <jonesmeier> hi, i was processing a lot of files with the output device "ps2write", it was going fine for a while but then an error came up "cannot open device", i don't have the exact error message anymore09:54.05 
  <jonesmeier> should I give it some rest every few calls or something ?09:54.23 
  <jonesmeier> no sorry, it seems it's because /tmp is filling up with tempfiles, i thought only once the error occurs but apparently it does right away, i need to check that10:00.55 
  <jonesmeier> i was piping the console output to grep and it didn't like that it seems and never removed its temp files18:03.19 
  <jonesmeier> (i'm trying to use it to detect corrupt PDF files)18:04.22 
  <jonesmeier> looks like it's working fine now.. thx18:05.33 
  <KenSharp> OK well if you do find a problem feel free to file a bug report.18:06.21 
  <KenSharp> temp files ought to be cleaned up, they should be unlinked after creation (on Linux) and so should disappear when closed.18:06.41 
  <KenSharp> I'm slightly surprised that you are getting temp files at all, if all you want to do is detect corrupt files.18:07.33 
  <KenSharp> If it were me I would run with -sDEVICE=nullpage -dPDFSTOPONERROR and check the return code18:07.52 
  <jonesmeier> ok thanks, maybe I should investigate the thing with grep again when it's done18:10.35 
  <jonesmeier> your suggestion does sound much smarter than what I'm doing for sure!18:10.54 
  <KenSharp> I don't know about smarter, but it might be quicker!18:11.15 
  <jonesmeier> because I'm converting them to PostScript and then look at its text output, it never seemed to change the exit code even if it had errors there18:11.36 
  <KenSharp> Oh wow, I really wouldn't convert to PostScript.18:11.49 
  <KenSharp> That is going to use temp files for certain, and it'll be potentially slow.18:12.02 
  <jonesmeier> but I didn't know about that option you mentioned there18:12.09 
  <KenSharp> The pdfwrite family (which includes ps2write) try **really** hard not to stop, so they note errors, and carry on.18:12.29 
  <KenSharp> Then give you a report at the end and return success.18:12.44 
  <jonesmeier> hm yeah I'm not sure, it is quite slow, but that's not a big concern for me.. I will try your suggestion definitely!18:13.04 
  <KenSharp> But if you use -dPDFSTOPONERROR they should stop when they hit an error, and return a non-zero status18:13.06 
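[Editor's note] A minimal sketch of the check suggested above: run Ghostscript with the nullpage device and -dPDFSTOPONERROR, then look only at the exit status. The function names, the `gs` binary name, and the injectable `run` parameter are assumptions made for illustration; -q, -dNOPAUSE and -dBATCH are standard Ghostscript switches added here for non-interactive use and were not mentioned in the conversation.

```python
import subprocess


def build_check_command(pdf_path, gs_binary="gs"):
    """Build a Ghostscript command that interprets a PDF and discards the output.

    The function name and the default binary name "gs" are illustrative choices.
    """
    return [
        gs_binary,
        "-q",                 # quiet startup, keeps the output short
        "-dNOPAUSE",          # do not wait for a keypress after each page
        "-dBATCH",            # exit once the input has been processed
        "-sDEVICE=nullpage",  # interpret the pages but write nothing
        "-dPDFSTOPONERROR",   # stop on a PDF error instead of carrying on
        pdf_path,
    ]


def pdf_is_clean(pdf_path, run=subprocess.run):
    """Return True when Ghostscript exits with status 0 for this file.

    `run` is injectable so the logic can be exercised without Ghostscript
    being installed; by default it is subprocess.run.
    """
    result = run(build_check_command(pdf_path), capture_output=True, text=True)
    return result.returncode == 0
```

A caller would loop over the files and collect those for which `pdf_is_clean(path)` is false. Whether a given damaged file actually trips -dPDFSTOPONERROR depends on the Ghostscript version and on the kind of damage, so flagged files still deserve a manual look.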
  <jonesmeier> yes it would always return successfully, even when noting multiple errors and not even being able to read a single page18:13.24 
  <KenSharp> We're trying to mimic Acrobat 🙂18:13.40 
  <jonesmeier> very nice, thank you! I'm going to make the changes now and see how that works :)18:14.05 
  <KenSharp> Basically people tend to grumble when we throw errors on PDF input, because "Acrobat can open it"18:14.07 
  <KenSharp> Which is true, but only because Acrobat silently ignores/fixes many problems.18:14.23 
  <KenSharp> Well if you do find problems, do please open a bug report so we can look into it18:14.47 
  <KenSharp> Best of luck 🙂18:14.51 
  <jonesmeier> hehe ok I see, so in order to basically do exactly what Acrobat does, it can't treat many things as an error18:15.59 
  <KenSharp> Yep, the PDF interpreter tries to ignore errors and carry on. Unlike Acrobat we do report at the end if we found anything concerning though18:16.31 
  <jonesmeier> I guess that's good if ultimately it could really read the file.. in my case it can't read some, but returns successfully.. so I guess it can read some of it, it has a correct header and stuff, but it says "no pages will be processed" or something18:16.49 
  <jonesmeier> on those broken files18:17.09 
  <KenSharp> Yeah if the trailer can't be read to find the Pages array then basically you're out of luck18:17.20 
  <KenSharp> But we have a number of badly broken files (many produced by the OSS-fuzz fuzzing tool) which we can get 'some' content out of, even though even Acrobat won't open them.18:17.53 
  <jonesmeier> yeah that must be it, the files were truncated on copying18:17.58 
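[Editor's note] Since the trailer sits at the end of a PDF, a file truncated during copying usually loses its closing `%%EOF` marker. The sketch below is a cheap pre-filter based on that fact; it is not part of Ghostscript and not something proposed in the conversation. The function names and the 1024-byte window are assumptions for illustration. A missing marker strongly suggests truncation, but a present marker does not prove the file is intact.

```python
import os


def has_eof_marker(data: bytes, tail_window: int = 1024) -> bool:
    """Return True if %%EOF occurs within the last `tail_window` bytes."""
    return b"%%EOF" in data[-tail_window:]


def file_has_eof_marker(path, tail_window: int = 1024) -> bool:
    """Read only the tail of the file and look for the %%EOF marker."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)            # jump to the end to learn the size
        size = fh.tell()
        fh.seek(max(0, size - tail_window))  # back up by at most tail_window
        return has_eof_marker(fh.read(), tail_window)
```

Files that fail this test can be set aside immediately; the rest still need a real interpreter run, because damage in the middle of a file leaves the end marker untouched.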
  <jonesmeier> nice, then it can still help in these situations! I always do try it18:18.40 
  <jonesmeier> I ended up buying "Recovery Toolbox for PDF" for 40 bucks, because it really fixed all of them.. so I was happy to pay that18:19.11 
  <KenSharp> It can often extract 'some' content, but not always everything. But at least it tries to tell you if it thinks there was a problem. I've not heard of Recovery Toolbox though, new one to me18:19.42 
  <KenSharp> If you use the pdfwrite device to write a new PDF file GS will 'fix' a lot of broken files.18:20.07 
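[Editor's note] A minimal sketch of rewriting a file through the pdfwrite device, as described above. The function names, the `gs` binary name, and the injectable `run` parameter are assumptions for illustration. Because the pdfwrite family notes errors and still returns success, the sketch hands back the captured output alongside the exit status so the end-of-run report can be inspected.

```python
import subprocess


def build_rewrite_command(src_pdf, dst_pdf, gs_binary="gs"):
    """Build a Ghostscript command that writes `src_pdf` out as a new PDF."""
    return [
        gs_binary,
        "-q",                 # quiet startup
        "-o", dst_pdf,        # output file; -o also implies -dBATCH -dNOPAUSE
        "-sDEVICE=pdfwrite",  # produce a freshly written PDF
        src_pdf,
    ]


def rewrite_pdf(src_pdf, dst_pdf, run=subprocess.run):
    """Run the rewrite and return (exit_status, combined_output).

    A zero status does not mean the input was clean: the caller should
    read the returned text for any warnings Ghostscript reported.
    """
    result = run(
        build_rewrite_command(src_pdf, dst_pdf),
        capture_output=True,
        text=True,
    )
    return result.returncode, (result.stdout or "") + (result.stderr or "")
```

The rewritten file contains whatever Ghostscript could recover, which may be less than the original held, so comparing page counts before and after is a sensible follow-up.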
  <jonesmeier> ok, will continue to try on Linux first of course, this one was a Windows tool18:20.16 
  <jonesmeier> alright, will try that again too18:20.42 
  <KenSharp> If you do have files which the recovery thing can fix and we can't I'd be interested to see them. Maybe we can do a better job.18:21.33 
  <jonesmeier> okay... well unfortunately these are files from HR like contracts and stuff with personal data18:23.21 
  <KenSharp> Hmm well that's not going to work, can't be having those.18:23.39 
  <jonesmeier> maybe I can simulate what happened to these files18:23.58 
  <jonesmeier> because I have yet to find out why they got truncated during copying...18:24.32 
  <KenSharp> Don't worry too much, if you happen to find a file or two you can share that would be great, but we do have a lot of broken files 🙂18:25.04 
  <jonesmeier> copied from a LAN samba share to a VPN samba share.. i want to do some testing, so maybe i can reproduce it, with some testfiles too18:25.18 
  <jonesmeier> hehe ok :) well maybe, i will report back if i can reproduce the errors, and then with some harmless files18:25.50 
  <KenSharp> I imagine it would be useful for you to figure out why they are being truncated, I'd certainly want to know....18:25.54 
  <jonesmeier> yeah that's what I need to do next!18:26.06 
  <jonesmeier> it's really not great..18:26.52 
  <KenSharp> Worrying I would think18:27.06 
  <jonesmeier> yes18:27.11 
  <jonesmeier> ok the PDFSTOPONERROR did throw errors on a couple more files now, I need to check afterwards if these were broken too.. I'm also only going with "opens in Adobe Reader" of course18:46.38 
  <jonesmeier> but I will do another run afterwards, my version right now looks for "Processing pages" in the output18:47.07 
  <KenSharp> Hmm well Acrobat will open all kinds of broken files, and it won't usually tell you they were broken.18:47.17 
  <jonesmeier> thank you very much for the time so far Ken18:47.18 
  <KenSharp> NP18:47.22 
  <KenSharp> I'm off now anyway, well past quitting time18:47.34 
  <jonesmeier> hmm ok will have to see!18:47.57 
  <jonesmeier> Have a good one! :)18:48.09 
  <KenSharp> If, when you close a PDF file Acrobat offers to 'save the changes' and you didn't change anything, then it silently fixed a broken file for you. That's often a good clue18:48.28 
  <KenSharp> You have a good day 🙂18:48.37 
  <jonesmeier> ah that's a good clue! I'll remember that for some manual tests, thx18:49.15 
  <jonesmeier> you too18:49.19 