Log of #ghostscript at irc.freenode.net.

20210216 
mnestorov Hello gentlemen :) . A quick follow-up to our conversation yesterday.14:25.17 
  1. Using PDFA_def.ps and removing useless command line arguments solved 6.2.3 (DeviceRGB may be used only if the file has a PDF/A-1 OutputIntent that uses an RGB colour space).14:25.17 
  2. I compiled and ran 9.54.0 (commit:master 1efe1f702) locally. Running with parameter `-dPDFACompatibilityPolicy=2` against the "problematic" PDF gives:14:25.18 
  `GPL Ghostscript GIT PRERELEASE 9.54.0: Text string detected in DOCINFO cannot be represented in XMP for PDF/A1, aborting conversion.`14:25.18 
  `GPL Ghostscript GIT PRERELEASE 9.54.0: Unrecoverable error, exit code 255`14:25.19 
  2.1 However, if the command is run with `-dPDFACompatibilityPolicy=1`, the warning is still printed, but no abort is made. Successful command completion plus a completely valid PDF/A document, as validated by veraPDF.14:25.19 
  I mean, logically the compatibility policy should act like that; maybe it's just helpful for you to know what the logging of 9.54.0 looks like in this particular case14:26.07 
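
For readers following along, the two policy values compared above can be sketched as a small command builder. This is only an illustration: the log does not show the exact command line that was used, so the set of switches, the file names, and the helper name `build_pdfa_command` are assumptions. The switches themselves (`-dPDFA`, `-sDEVICE=pdfwrite`, `-sColorConversionStrategy`, `-dPDFACompatibilityPolicy`) are documented Ghostscript options.

```python
def build_pdfa_command(input_pdf, output_pdf, policy=1, gs="gs"):
    """Build an argument list for a PDF/A-1 conversion with Ghostscript.

    Hypothetical helper; the exact switches used in the chat are not shown.
    policy: 0 = keep the offending feature and give up on PDF/A,
            1 = drop the offending feature and still write PDF/A,
            2 = abort the conversion with an error.
    """
    if policy not in (0, 1, 2):
        raise ValueError("PDFACompatibilityPolicy must be 0, 1 or 2")
    return [
        gs,
        "-dPDFA=1",
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=RGB",
        f"-dPDFACompatibilityPolicy={policy}",
        f"-sOutputFile={output_pdf}",
        # Definition file; its ICC profile path normally needs editing first.
        "PDFA_def.ps",
        input_pdf,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so no Ghostscript install is needed.
    print(" ".join(build_pdfa_command("input.pdf", "output.pdf", policy=1)))
```

The resulting list could be passed to `subprocess.run`; with `policy=2` the same input would be expected to abort as in the log lines quoted above.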
artifexirc-bot <KenSharp> mnestorov yes that behaviour is what I would expect if the Title can't be handled14:28.40 
  <KenSharp> Thanks for reporting back!14:28.48 
mnestorov I think that would confirm that your fix works :)14:29.04 
artifexirc-bot <KenSharp> Yeah it sounds like it is the same problem I fixed a week or so back, something weird about the way recent versions of ImageMagick are creating their PDF files14:29.35 
  <KenSharp> It's not illegal, but I really don't think it's what they intend14:29.52 
mnestorov Is the DOCINFO referring to a title?14:30.52 
artifexirc-bot <KenSharp> DOCINFO is the pdfmark for the Document Information Dictionary, which is where the /Title is located in the PDF. There is a duplicate (in XML obviously) in the XMP metadata block14:32.12 
  <KenSharp> For PDF/A they have to be byte by byte identical14:32.28 
  <KenSharp> Which only works if the /Title is in PDFDocEncoding and limited to the values representable in one byte by UTF-8 14:32.50 
mnestorov Yes, that is what I understood from the PDF/A standard14:32.57 
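
The byte-for-byte constraint described here can be illustrated with a deliberately conservative check. It is a sketch, not what Ghostscript does internally: the function name `title_is_pdfa_safe` is hypothetical, the title bytes are assumed to have been extracted already, and the test is restricted to printable ASCII (0x20 to 0x7E), the range where PDFDocEncoding and single-byte UTF-8 are known to agree.

```python
def title_is_pdfa_safe(title: bytes) -> bool:
    """Return True if every byte is printable ASCII (0x20..0x7E).

    Hypothetical helper: in that range the same bytes mean the same
    characters in PDFDocEncoding and in UTF-8, so the /Title and its
    XMP copy can be byte-identical.
    """
    return all(0x20 <= b <= 0x7E for b in title)


if __name__ == "__main__":
    print(title_is_pdfa_safe(b"Quarterly report"))   # plain ASCII title
    print(title_is_pdfa_safe(b"\x00T\x00i\x00t"))    # contains NUL bytes
```

A title with NUL bytes or any byte above 0x7E fails the check, which corresponds to the situation where the only options are dropping the /Title or not producing PDF/A.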
artifexirc-bot <KenSharp> The /Title from ImageMagick contains NUL characters14:33.16 
mnestorov aha14:33.24 
artifexirc-bot <KenSharp> Which basically means we can't handle them14:33.35 
mnestorov Should it contain NUL?14:33.45 
artifexirc-bot <KenSharp> So the only alternatives are 1) drop the /Title or 2) don't make a PDF/A14:33.51 
  <KenSharp> I don't believe that it should contain a NUL. (I'm not certain about your PDF obviously, this relates to the ImageMagick ones)14:34.20 
  <KenSharp> Each character is written as 2 bytes14:34.34 
  <KenSharp> Each initial byte is a 0x00 14:34.42 
  <KenSharp> That looks awfully like it's UTF-16 14:34.49 
  <KenSharp> But there is no byte order mark14:34.58 
  <KenSharp> Additionally the string is terminated with 0x00 0x00 14:35.09 
  <KenSharp> Which looks like someone read a C string, including the terminator14:35.21 
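
The byte pattern described above (two bytes per character, each first byte 0x00, no byte order mark, trailing 0x00 0x00) can be recognised with a short heuristic. This is a sketch under stated assumptions: both function names are hypothetical, the raw /Title bytes are assumed to be available already, and the check only covers titles whose characters all fit in the low byte, as in the case discussed here.

```python
def looks_like_bomless_utf16be(raw: bytes) -> bool:
    """Heuristic: even length, no BOM, and every first byte of a pair is 0x00."""
    if len(raw) < 2 or len(raw) % 2 != 0:
        return False
    if raw[:2] in (b"\xfe\xff", b"\xff\xfe"):
        return False  # a byte order mark is present, so this is not the broken case
    return all(b == 0 for b in raw[0::2])


def recover_title(raw: bytes) -> str:
    """Decode a BOM-less UTF-16BE title, dropping a trailing 0x00 0x00 terminator."""
    if not looks_like_bomless_utf16be(raw):
        raise ValueError("bytes do not match the BOM-less UTF-16BE pattern")
    if raw.endswith(b"\x00\x00"):
        raw = raw[:-2]  # strip what looks like a copied C string terminator
    return raw.decode("utf-16-be")


if __name__ == "__main__":
    broken = b"\x00T\x00e\x00s\x00t\x00\x00"
    print(looks_like_bomless_utf16be(broken))
    print(recover_title(broken))
```

A correctly written UTF-16 title would start with the byte order mark 0xFE 0xFF, which is why the heuristic rejects such input rather than treating it as broken.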
mnestorov Hah... I don't know if that can be considered as a bug on IM side14:36.21 
artifexirc-bot <KenSharp> It is possible that this is deliberate, it is legal in PDFDocEncoding to use the NUL character (it turns into a /.notdef glyph) but it seems highly unlikely14:36.23 
mnestorov I see14:36.33 
artifexirc-bot <KenSharp> If you look at the document properties of such a file using Acrobat it displays an empty string for the /Title14:36.58 
mnestorov So they drop it as well?14:38.05 
  I mean, the iText guys14:38.18 
artifexirc-bot <KenSharp> Acrobat doesn't display it in the original PDF. I can't comment about iText, I don't use it.14:38.35 
mnestorov I don't either, but I thought that Adobe used iText underneath14:39.03 
artifexirc-bot <KenSharp> Heck no!14:39.09 
  <KenSharp> Adobe invented PDF 🙂14:39.14 
mnestorov hah....misinformation on my part then14:39.35 
  sorry for being silly14:39.41 
artifexirc-bot <KenSharp> Not a problem14:39.50 
  <KenSharp> I do suspect the broken /Title from ImageMagick is a bug, but I'm not inclined to sign up and report it, not least because I'd have to find a way to reproduce it14:40.19 
mnestorov I understand that. Maybe someone else got around to it.14:41.14 
artifexirc-bot <KenSharp> I did ask the person who reported the bug to us to report the problem to IM, but I have no idea if they did14:41.44 
mnestorov When was this initially reported here?14:42.06 
artifexirc-bot <KenSharp> Give me a minute and I'll look14:42.18 
  <KenSharp> someone pinging me on another channel14:42.25 
mnestorov no worries14:42.29 
artifexirc-bot <KenSharp> OK the bug report is here14:43.36 
  <KenSharp> https://bugs.ghostscript.com/show_bug.cgi?id=703486 14:43.37 
  <KenSharp> From the 6th February14:43.47 
  <KenSharp> That was actually their second attempt to report a bug to us I think, the first time round I couldn't reproduce a problem, because the file they sent had been produced by an earlier version of IM, which properly included the byte order mark14:45.16 
mnestorov I might just dig around the bug tracker in IM and see if someone reported it. But other than that, I truly appreciate your help with my situation and your fixes in the latest gs! :) If my feedback is of any use, the gs documentation, namely these files here https://ghostscript.com/doc/current/Psfiles.htm and the compilation docs here14:49.08 
  https://www.ghostscript.com/doc/9.50/Make.htm were very useful. The only thing I couldn't get is why your canonical repo at https://git.ghostscript.com/?p=ghostpdl.git;a=summary is closed to the public, in terms of getting snapshots or pulling. I had to use the mirror at GitHub.14:49.09 
artifexirc-bot <KenSharp> I think that's due to restricting bandwidth14:49.51 
  <KenSharp> We were serving too many requests from our own servers14:50.02 
  <KenSharp> But maybe I'm wrong about that, it ought to be possible to clone our own repo14:50.37 
artifexirc-bot <KenSharp> is not a Git expert14:50.41 
mnestorov It didn't cause a problem, of course, I managed to find you guys on GH. :)14:51.22 
artifexirc-bot <KenSharp> I think we prefer people to pick up the code from GitHub, but I believe it ought to work from our own server.....14:51.59 
chrisl The snapshot feature was a magnet for (D)DOS attacks, so we had to disable it14:52.48 
artifexirc-bot <KenSharp> Ah, that would be it14:53.11 
mnestorov Hah, I'm hoping it's not the competition ddos-ing you :)14:54.25 
chrisl Given other experiences, it was *probably* "security researchers"14:55.02 
artifexirc-bot <KenSharp> It may be that because we run a bug bounty program for our **products** we get an awful lot of script kiddies saying 'your website has the following problems.... where's my bounty?'14:55.11 
mnestorov Ugh14:55.45 
  I don't know how you guys keep up with all of the work, especially when you have to deal with such things...it doesn't sound like it's your first time external forces keep you busy (from the fun of programming)14:57.14 
  first time dealing with*14:57.24 
artifexirc-bot <KenSharp> Oh we get a few questions, it's not a huge number14:57.36 
  <KenSharp> And the bug reports keep the product nice and tight14:57.49 
  <KenSharp> Which is good for our commercial customers14:57.57 
  <KenSharp> Everyone needs a break from programming now and then14:58.23 
mnestorov I hope you (and all of the nice people from gs and friends) get it :)14:59.56 
artifexirc-bot <KenSharp> Well ordinarily we'd travel to our company meetings, but that's been out of the question for the last 12 months 😦15:00.31 
  <KenSharp> A short weekend break in San Francisco is always nice15:00.48 