Log of #ghostscript at irc.freenode.net.

20201125 
GitGud64 Hello!14:12.12 
  I don't know if this is a correct place to ask, but I have an interesting problem14:12.48 
  in a PDF I want to compress there are DSC %% comments at the beginning and my script gives errors, and I had no luck with ProcessDSCComment. If anyone could be so kind as to help me :)14:13.53 
artifexirc-bot <KenSharp> Firstly, are you sure it's a PDF file? Secondly, why would this be a problem? You are allowed up to (IIRC) 1KB of data before the PDF header, and % is the comment character in a PDF file, so provided the xref table is correct and all the offsets are valid, having % comments in a PDF file should not be a problem14:15.03 
  <KenSharp> Note that Ghostscript (and the pdfwrite device) do not compress PDF files.14:15.30 
GitGud64 Yes, the file is .pdf and opens normally in chrome or any other PDF viewer. The comments look like this "%%DOC_TYPE: CAF01"14:17.36 
artifexirc-bot <KenSharp> Then I don't understand what your problem is14:17.52 
GitGud64 and i'm using -dPDFSETTINGS=/screen to lower the size of PDFs14:18.05 
  I know it is technically not compressing14:18.16 
artifexirc-bot <KenSharp> You're going to have to tell me what version of Ghostscript you are using, the complete command line, and probably share the PDF file too. Also, you haven't given any indication of the nature of the problem. Are you getting errors? If so, what is on the back channel?14:19.08 
GitGud64 when I want to process a file that has these comments I get this error Error: /undefined in obj14:19.28 
  1989 1 3 %oparray_pop 1977 1 3 %oparray_pop 1833 1 3 %oparray_pop --nostringval-- %errorexec_pop .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval--14:19.29 
  Current file position is 16014:19.30 
artifexirc-bot <KenSharp> Sounds like the file is not being processed as a PDF file14:19.46 
GitGud64 the beginning of the file looks like this14:20.45 
  %%DOC_TYPE: CAF01%%GEFCO_REF: T9232036990%%EXTERNAL_REF: T9232036990%%CREATED_DATE: 03/11/2020 16:49%%COMPANY_ID: 0838%%SITE_ID: %PDF-1.76 0 obj<< /Creator (OpenText Exstream Version 9.5.302 64-bit (DBCS))/CreationDate (11/3/2020 16:49:27)/Author (Registered to: GEFCO )/Title (INES_CAF01)14:20.47 
  and if I remove the %% it works normally14:21.10 
artifexirc-bot <KenSharp> Well I'd have to see the whole file to try it, but if you put PostScript in front of a PDF file header you might reasonably expect that Ghostscript will treat it as a PostScript file rather than a PDF file.14:21.51 
GitGud64 -sDEVICE=pdfwrite -dCompatibilityLevel=1.7 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile="C:\exerror\%%~nxf" "%%~f"14:22.24 
  This is the whole script14:22.30 
artifexirc-bot <KenSharp> Those aren't DSC comments either, unless someone has produced a new version of DSC; they are just comments, presumably important to the OpenText product.14:22.51 
  <KenSharp> Which version of Ghostscript are you using ?14:23.00 
chrisl You can't just prepend random stuff to a PDF file - if you do, you end up with a broken PDF file14:23.08 
GitGud64 9.53.314:23.21 
artifexirc-bot <KenSharp> chrisl you can have random garbage in front of the PDF header, we strip it off14:23.29 
chrisl If you just prepend any old stuff to an otherwise correct PDF, you break the offsets in the xref table14:24.08 
artifexirc-bot <KenSharp> Well I'm going to need to see the whole file to be able to tell if there is a solution, my suspicion is that there is not. If you put PostScript in front of the header of a PDF file then I'm not surprised Ghostscript thinks it's a PostScript program14:24.17 
  <KenSharp> chrisl I'm not sure that's entirely true, I thought (I'd have to try it out to check) that we strip off garbage up to the PDF header, so as long as the xref doesn't include the rubbish at the start it should work. Also we can repair files broken in that way14:25.19 
chrisl "%PDF-1.76" ??14:25.36 
artifexirc-bot <KenSharp> My suspicion is that the PostScript at the start of the file is causing Ghostscript to treat the file as PostScript not PDF.14:25.40 
  <KenSharp> Oh well that would break the PDF scanning14:25.56 
  <KenSharp> Possibly14:26.02 
  <KenSharp> Like I said, I'd need to see the file and run it locally14:26.13 
GitGud64 https://filebin.net/54fyimrbzw2u0wir14:26.55 
  Here is my script and the PDF file14:27.03 
chrisl The fact that just stripping off the initial "%%" comments results in working normally, rather suggests the xref does *not* account for those comments14:27.10 
artifexirc-bot <KenSharp> Possibly, or it's a scanning error failing to derive the correct type14:27.30 
  <KenSharp> I'll get the file and look14:27.34 
GitGud64 It is my first time trying to do this so I'm sorry for these stupid questions14:27.46 
artifexirc-bot <KenSharp> It's not a stupid question, but we do need quite a lot of information to answer it I'm afraid.14:28.08 
  <KenSharp> Well if I open it with Acrobat and close it, it offers to 'save the changes' which suggests that the xref table includes the stuff at the front, though that's not 100%14:29.08 
GitGud64 Thank you! If you need any other files feel free to ask14:29.08 
chrisl If it was just misidentifying the file type, then we'd give warnings about rescanning the xref etc, which would not be "normal"14:29.11 
artifexirc-bot <KenSharp> Not if it thought the PDF file was PostScript surely ?14:29.32 
  <KenSharp> We'd get an undefined error on 'obj'14:29.33 
GitGud64 ^ 14:29.47 
artifexirc-bot <KenSharp> Well gswin64c CAF01_INES_136721308_923.pdf works fine for me14:30.23 
  <chrisl> @KenSharp If you strip off the errant comments, and the file runs normally, then the xref has not been updated to account for the extra data at the start of the file14:30.28 
  <KenSharp> It runs fine **without** stripping off the comments for me14:30.40 
  <KenSharp> and no warnings about repairing or rebuilding14:30.55 
GitGud64 hmm14:31.28 
artifexirc-bot <KenSharp> The file has no comments in the start14:31.37 
  <chrisl> Not for me... "/undefined in obj"14:31.43 
GitGud64 Can you send me your script so I can test it on my end14:31.45 
artifexirc-bot <KenSharp> The one you posted begins %PDF-1.714:31.52 
  <KenSharp> GitGud64 I posted the command I used above14:32.16 
  <KenSharp> gswin64c CAF01_INES_136721308_923.pdf14:32.26 
  <chrisl> And stripping the comments off *does* end up with an error about reading the xref14:32.42 
  <KenSharp> Well that's odd because the file I have here doesn't have any comments at the start.14:32.58 
  <KenSharp> I am now confused14:33.01 
  <KenSharp> Damn I bet Acrobat resaved it14:33.33 
  <KenSharp> Even though I told it not to14:33.38 
  <chrisl> So we are failing to spot that it's a PDF, I think14:33.54 
GitGud64 can you test with the new one I upload14:33.57 
artifexirc-bot <KenSharp> <sigh> Yes that was it14:33.58 
  <chrisl> And stripping the comments doesn't end up running "normally"14:34.27 
  <KenSharp> chrisl try turning the %% into &&14:35.19 
  <KenSharp> The file will run then14:35.25 
  <chrisl> That makes sense, yes14:35.35 
GitGud64 from my test, if I removed those double % it warned that there is some gibberish and processed it normally14:35.44 
  yea14:35.48 
artifexirc-bot <KenSharp> It then gives me warnings about garbage preceding the %PDF and that the xref is incorrect14:35.50 
  <KenSharp> So it looks to me like the presence of the PostScript comments is making the scanner think the file is a PostScript file, not a PDF file, and in addition the file's xref table is incorrect. Possibly it has been through a CR/LF conversion process14:37.19 
GitGud64 I don't even know what this means :P14:38.09 
artifexirc-bot <chrisl> Yes, we specifically check for "%%"14:38.19 
  <KenSharp> GitGud64 your PDF file is broken14:38.21 
GitGud64 :(14:38.46 
artifexirc-bot <KenSharp> You can have Ghostscript repair it but to do so you need to remove the %% comments. Their presence is making Ghostscript (which is a PostScript interpreter) treat the file as a PostScript program instead of a PDF file.14:39.03 
GitGud64 Is there any way to fix that?14:39.04 
artifexirc-bot <KenSharp> Like I said, remove the %% comment lines14:39.41 
  <KenSharp> And then Ghostscript will be able to recognise that it's a PDF file not a PostScript program14:39.56 
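[Editor's sketch] The type detection described in this exchange can be modelled roughly as follows. This is a simplified Python illustration of the behaviour discussed in the channel, not Ghostscript's actual code: the function name `sniff_type` is hypothetical, and the 1024-byte search window is an assumption taken from the "(IIRC) 1KB" remark earlier in the log.

```python
def sniff_type(data: bytes) -> str:
    """Simplified model: a leading %! or %% wins before any PDF header search."""
    # The first two bytes are peeked; %! or %% means "treat as PostScript".
    if data[:2] in (b"%!", b"%%"):
        return "postscript"
    # Otherwise look for a PDF header near the start (assumed 1KB window).
    if b"%PDF-" in data[:1024]:
        return "pdf"
    return "unknown"


if __name__ == "__main__":
    print(sniff_type(b"%%DOC_TYPE: CAF01\n%PDF-1.7\n"))
    print(sniff_type(b"&&DOC_TYPE: CAF01\n%PDF-1.7\n"))
```

Under this model, a file that begins with %% never reaches the PDF header search, which matches the observation that turning the %% into && lets the file run.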
GitGud64 Aha, is there any way to do this by a program?14:39.57 
artifexirc-bot <KenSharp> Well a decent text editor will do the job14:40.09 
GitGud64 I know that :P14:40.21 
  I have quite a lot of those files14:40.28 
  I meant if it is possible to do that via a script or something like that?14:41.29 
artifexirc-bot <KenSharp> Well you could write something reasonably simple to read the file until it finds the %PDF and then send the remaining content to a new file.14:41.31 
  <KenSharp> I'm not a Python programmer but I imagine you could use it, or Perl, or something like that to do the job14:41.56 
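[Editor's sketch] A minimal Python version of the idea KenSharp describes: read the file as bytes, find the first %PDF- marker, and write everything from there to a new file. The function names and the example file names are hypothetical. Note that this only removes the leading data; as discussed in this log, the xref offsets in such a file may still be wrong, so Ghostscript may still warn and rebuild the xref.

```python
from pathlib import Path

PDF_MARKER = b"%PDF-"


def strip_leading_junk(data: bytes) -> bytes:
    """Return data starting at the first %PDF- marker; raise if there is none."""
    start = data.find(PDF_MARKER)
    if start < 0:
        raise ValueError("no %PDF- header found")
    return data[start:]


def clean_file(src: Path, dst: Path) -> None:
    """Copy src to dst without anything that precedes the PDF header."""
    # Bytes, not text: a PDF contains binary data that a text-mode
    # read or write could corrupt.
    dst.write_bytes(strip_leading_junk(src.read_bytes()))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    clean_file(Path("input.pdf"), Path("cleaned.pdf"))
```

Working on bytes rather than in a text editor avoids line-ending or encoding changes, which would damage the binary parts of the file.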
GitGud64 Well, neither am I :/14:42.23 
  anyway, thank you both for the help!14:42.50 
artifexirc-bot <KenSharp> No problem14:43.01 
chrisl I don't feel we've been that much help, but for what there was, you're welcome....14:43.21 
  FWIW, on Unix type systems, something like sed, or tr would probably help filter the offending lines14:44.13 
  GitGud64: I assume you're running on Windows?14:45.44 
GitGud64 Yea14:45.47 
chrisl There is a hack you could use which might work for these files, but I'd be hesitant using it for the general case14:46.36 
GitGud64 Can you tell me how please?14:47.06 
chrisl Do you know where your Ghostscript install is (program files.... etc)14:47.35 
GitGud64 yea14:47.43 
chrisl Somewhere in there, there should be a directory called "Resource"14:48.09 
GitGud64 yes14:48.31 
chrisl Agh, one sec - my phone just rang.....14:48.44 
GitGud64 take your time14:48.51 
chrisl It's okay, nothing important!14:50.02 
  Right, that Resource directory, make a copy of it somewhere in your user directory tree14:50.47 
  For simplicity, maybe something like "c:\Users\<username>\gs\Resource"... or similar14:52.04 
GitGud64 ok14:52.52 
  its done14:53.30 
chrisl Now, using my example path above, in your batch file for running Ghostscript, add "-Ic:/users/<username>/gs/Resource/Init" so it looks like: %ghostscript% -Ic:/users/<username>/gs/Resource/Init -sDEVICE=pdfwrite -dCompatibilityLevel=1.7 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile="%%~dpf%filepre%%%~nxf" "%%~f"14:55.51 
  (BTW, I'm not in front of a Windows machine just now, so I'm working from memory on this!)14:58.23 
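[Editor's sketch] For readers who would rather drive Ghostscript from Python than from a batch file, the same command line can be assembled as an argument list. The flags are the ones quoted in this log; the helper name `build_gs_command`, the `me` user directory, and the file names are hypothetical, and `gswin64c` is assumed to be on the PATH.

```python
from pathlib import Path
from typing import List


def build_gs_command(gs_exe: str, init_dir: str, src: Path, dst: Path) -> List[str]:
    """Assemble the pdfwrite command quoted in the log as an argv list."""
    return [
        gs_exe,
        f"-I{init_dir}",             # search the copied Init directory
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.7",
        "-dPDFSETTINGS=/screen",
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        f"-sOutputFile={dst}",
        str(src),
    ]


if __name__ == "__main__":
    cmd = build_gs_command(
        "gswin64c",
        "c:/users/me/gs/Resource/Init",  # hypothetical user directory
        Path("input.pdf"),
        Path("output.pdf"),
    )
    print(cmd)
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single string sidesteps the quoting and %% escaping that the batch file needs.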
GitGud64 Ok, going to try it15:07.01 
chrisl That won't (yet) fix your problem, but it's setting things up so we can15:07.25 
GitGud64 Chris, do you maybe have a discord or something like that? It would be easier for me :)15:08.01 
artifexirc-bot <KenSharp> This channel is on discord15:08.24 
  <KenSharp> The server is Artifex and the channel is public #ghostscript15:08.43 
  <KenSharp> Public text channel that is15:09.00 
chrisl Ah, I was kind of hoping there was a way to post a URL-type thing...15:09.16 
artifexirc-bot <KenSharp> I'm not familiar enough with Discord to know if that's possible 🙂15:10.00 
GitGud64 can ya send me link to that server, I can't find it15:10.31 
artifexirc-bot <KenSharp> There's a thing for inviting people in the channel15:10.39 
  <KenSharp> Try this: https://discord.gg/rEneq37J15:11.00 
GitGud64 Awesome, thanks :)15:11.27 
artifexirc-bot <KenSharp> Don't thank me till you find out if it works 😄15:11.41 
  <get_gud> Welp, at least im on discord 😛15:11.57 
  <KenSharp> Ah so you are, I see you now15:12.05 
  <chrisl> Cool, okay15:12.09 
  <chrisl> @get_gud So when you run your batch file now, the thing to check is that there are no messages like "*** Warning: GenericResourceDir doesn't point to a valid resource directory." coming from Ghostscript15:13.27 
  <get_gud> will try that later15:13.54 
  <get_gud> need to redownload my PDFs15:14.06 
  <get_gud> tried to remove %% in notepad++, but after that I get tons of other errors15:14.24 
  <get_gud> and when I want to copy them I have seen, that I did that on my copy15:14.42 
  <get_gud> 😛15:14.45 
  <chrisl> Well, you can always use the one from where you put them for us earlier: https://filebin.net/54fyimrbzw2u0wir15:15.32 
  <chrisl> But for the above test, it doesn't matter, we just need to make sure that Ghostscript is obeying the "-I" directive, and still finding the files it needs to run15:16.35 
  <get_gud> ok15:18.13 
  <get_gud> give me a minute15:18.17 
  <get_gud> @chrisl I get the same error15:23.28 
  <get_gud> Error: /undefined in obj15:23.43 
  <get_gud> Operand stack:15:23.44 
  <get_gud> 6 015:23.45 
  <chrisl> But *not* an additional warning about "*** Warning: GenericResourceDir doesn't point to a valid resource directory." coming from Ghostscript15:23.51 
  <get_gud> https://cdn.discordapp.com/attachments/773567375458828329/781178425623969812/unknown.png15:24.10 
  <get_gud> That is all I get15:24.12 
  <chrisl> Cool, that is what I expect.15:24.27 
  <chrisl> Now, in that copied directory, have a look in Resource\Init for a file called "pdf_main.ps" and open it in a text editor15:25.12 
  <get_gud> got it15:25.54 
  <chrisl> Okay, now search for (%%) in that file15:26.39 
  <chrisl> It should be in a line that looks like "dup 2 string .peekstring pop dup (%!) eq exch (%%) eq or {"15:26.57 
  <get_gud> dup 2 string .peekstring pop dup (%!) eq exch (%%) eq or {15:27.27 
  <get_gud> this right15:27.30 
  <chrisl> Yes, indeed. Now, replace the "(%%) " with "()" (without the quotes!) and save the file15:28.15 
  <get_gud> so delete these %%15:28.35 
  <get_gud> done15:28.37 
  <chrisl> Now try running your batch file - it will spew a warning, but should run to completion15:29.03 
  <get_gud> WORKS!15:30.33 
  <get_gud> 😄15:30.39 
  <get_gud> yea, it really gives tons of errors, but the output looks alright 🙂15:31.04 
  <chrisl> Yeh, the problem is, PDF uses a list of byte offsets into the file to find stuff it needs. The comments added to the start of those files break those offsets, which leaves the file technically invalid15:32.30 
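[Editor's sketch] The offset problem chrisl describes can be shown with a few bytes. This Python toy is not a real PDF parser: it uses a made-up minimal body and a hypothetical helper, `offset_shift`, to show that a byte offset recorded before data is prepended no longer points at the object afterwards.

```python
def offset_shift(prefix: bytes, body: bytes, token: bytes) -> int:
    """Return how far token moves when prefix is prepended to body."""
    recorded = body.find(token)            # what a writer would store in the xref
    actual = (prefix + body).find(token)   # where the token really is afterwards
    return actual - recorded


if __name__ == "__main__":
    body = b"%PDF-1.7\n1 0 obj\n<< >>\nendobj\n"  # toy content, not a valid PDF
    prefix = b"%%DOC_TYPE: CAF01\n"
    shift = offset_shift(prefix, body, b"1 0 obj")
    print(shift == len(prefix))
```

Every recorded offset ends up short by the length of the prepended data, which is why a reader has to rescan the file and rebuild the table.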
  <get_gud> aha15:33.00 
  <get_gud> I don't know how they got corrupted like that tho15:33.12 
  <KenSharp> It's 'stuff' that's been added by someone's workflow system.15:34.06 
  <KenSharp> I imagine that their system normally removes the extraneous stuff15:34.27 
  <get_gud> yea15:34.48 
  <KenSharp> GEFCO appears to be GEFCO Logistics, possibly15:35.02 
  <get_gud> I had no clue that there was anything wrong with those files as they open normally everywhere else15:35.11 
  <get_gud> yea, it is a logistics company15:35.22 
  <KenSharp> If you open the file with Acrobat, and then close it (without doing anything) it will offer to 'save the changes' which is normally an indication it has silently fixed the file.15:35.48 
  <KenSharp> We prefer to tell people there is something wrong with it (and then fix it)15:36.03 
  <chrisl> Luckily, Ghostscript and most other PDF consumers encounter broken files often enough that we implement recovery code to try (and usually succeed) in dealing with stuff like this15:36.15 
  <get_gud> You guys are awesome 😉15:36.32 
  <get_gud> may I ask, if I run the same script on normal files, will it work or will I get errors on normal ones?15:37.00 
  <chrisl> It'll work just fine on PDF files15:37.23 
  <KenSharp> It will work, the problem is that if you try to run PostScript programs they may get mis-identified as PDF files. Or at least, not identified as PostScript15:37.42 
  <KenSharp> I suspect it's 'probably' OK on the majority of PostScript too, but obviously we don't test what you have now15:38.18 
  <get_gud> Understandable15:38.29 
  <get_gud> will test it with other files I have to "compress" and will report back if I get any strange errors15:39.10 
  <chrisl> Since the batch file explicitly only handles files with a .pdf extension, there shouldn't be a clash with Postscript jobs15:39.34 
  <get_gud> yea, it was only meant for PDFs15:39.42 
  <chrisl> If you find the need for something similar for Postscript, just don't use the "-I" option in that batch file15:40.09 
  <get_gud> Ok15:40.27 
  <RayJohnston> that way it will use the standard (built into the %rom% file system in the executable) pdf_main.ps15:56.01 
  <RayJohnston> @chrisl I'm wondering if we should have an option to force recognition as PDF? This comes up with customers sometimes, but it would be useful to others (such as @get_gud )15:58.01 
  <get_gud> sounds like a neat feature 🙂17:40.19 
  <chrisl> We'll consider it.... but it's already sparked some discussion on our private channel, so I wouldn't hold your breath for it to drop real soon!17:42.11 
  <get_gud> Glad I could be of service17:43.13 