| <<<Back 1 day (to 2020/01/15) | Fwd 1 day (to 2020/01/17) >>> | 20200116 |
I_desire_pdfa | Hello all. I wish to convert PDF to PDF/A-2b using the command line. I have noticed that if I convert the PDF to PS first, it solves most PDF/A validation problems. The only drawback is that I lose the OCR text and the ability to search within the PDF. Does anyone know how to keep this functionality? | 14:01.05 |
| Here is the exact command line I am using: pdf2ps -r150 original.pdf original.ps && gs -dPDFA=2 -dBATCH -dNOPAUSE -dNOSAFER -sColorConversionStrategy=RGB -sDEVICE=pdfwrite -dPDFACompatibilityPolicy=1 -dPDFSETTINGS=/ebook -sOutputFile=fichier_converti_PDFA.pdf /Library/PDFA/PDFA_def.ps original.ps | 14:01.20 |
chrisl | I_desire_pdfa: There shouldn't be any reason to convert to PS first | 14:08.56 |
| I_desire_pdfa: You definitely shouldn't be using -dPDFSETTINGS=/ebook | 14:11.32 |
I_desire_pdfa | Thank you for your answer. I didn't convert to ps first and removed -dPDFSETTINGS=/ebook. I receive those errors when validating the pdf | 14:23.57 |
| Validating file "fichier_converti_PDFA.pdf" for conformance level pdf1.7 XML line 10:25: xmlParseCharRef: invalid xmlChar value 0. | 14:23.58 |
chrisl | I_desire_pdfa: what version of gs are you using? | 14:24.20 |
I_desire_pdfa | GPL Ghostscript 9.50 (2019-10-15) | 14:24.34 |
chrisl | I_desire_pdfa: Hmm, it's possible that it's something that's been fixed since 9.50 was released. Unfortunately, the person who would know is on vacation this week | 14:26.39 |
| It's equally possible you have found a bug: either with our PDF/A output, *or* with the PDF/A validator | 14:27.36 |
I_desire_pdfa | I am using this validator https://www.pdf-online.com/osa/validate.aspx | 14:29.47 |
| I can come back next week if you want. In case here is a link to the original file: https://www.mycloud.ch/s/S008AE38725EC27590B6E2466B00B41435DD5B18471 | 14:31.06 |
chrisl | It's not my area, but I do know there have been issues with some validators failing valid files. | 14:31.14 |
| Next week, during UK "office hours", would be best; the person you'll want uses the nick "kens" | 14:31.47 |
| FWIW, I know he tends to recommend verapdf for validation | 14:32.32 |
I_desire_pdfa | Ok I will try, thanks for the tip and information | 14:32.59 |
chrisl | No problem, sorry I couldn't be more help | 14:33.12 |
I_desire_pdfa | You are right, with veraPDF it validates correctly !! Why shouldn't I use -dPDFSETTINGS=/ebook ? It validates equally well whether I use it or not ! | 14:37.35 |
chrisl | The PDFSETTINGS thing sets a *lot* of parameters in one go, so there would be a chance that it would set one to a value not compatible with the standard you want to output to | 14:38.51 |
| It just happens that, in this case, the value you're using doesn't clash with the PDF/A-2 you want out | 14:40.21 |
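[Editor's note] The advice so far (skip the PS round trip, drop -dPDFSETTINGS) can be sketched as a single direct invocation. This is only a sketch assembled from the flags already quoted in this log; `pdfa_cmd` is a hypothetical helper name, the PDFA_def.ps path is the one from the original command and will differ on other systems, and the function only prints the command (a dry run) so it can be inspected before being executed.

```shell
#!/bin/sh
# Dry-run sketch: print a direct PDF -> PDF/A-2 pdfwrite command.
# pdfa_cmd is a hypothetical helper, not part of Ghostscript.
pdfa_cmd() {
    in_pdf=$1
    out_pdf=$2
    # Assumption: PDFA_def.ps lives where the original command had it.
    def_ps=${3:-/Library/PDFA/PDFA_def.ps}
    # No pdf2ps step and no -dPDFSETTINGS: the input PDF goes straight to
    # pdfwrite, so existing text (e.g. an OCR layer) is not flattened by a
    # PostScript round trip.
    printf '%s ' gs -dPDFA=2 -dBATCH -dNOPAUSE -dNOSAFER \
        -sColorConversionStrategy=RGB -sDEVICE=pdfwrite \
        -dPDFACompatibilityPolicy=1 \
        "-sOutputFile=$out_pdf" "$def_ps" "$in_pdf"
    printf '\n'
}

# Show the command for the files named in this conversation.
pdfa_cmd original.pdf fichier_converti_PDFA.pdf
```

The printed line can be pasted into a shell once the paths look right. It does not quote arguments, so file names containing spaces would need quoting by hand.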
I_desire_pdfa | Ok thank you. Would it still be useful for you (and me) if I came back next week to report the problem? | 14:40.34 |
chrisl | It's probably worth mentioning it to kens, at the very least, it will mean he's aware of it | 14:41.45 |
| (although, he'll probably not thank me for suggesting that!) :-) | 14:42.08 |
I_desire_pdfa | :-) | 14:43.03 |
chrisl | I_desire_pdfa: is this something you're likely to be doing on more than one occasion? | 14:44.05 |
I_desire_pdfa | Sorry for late answer. Yes why ? | 14:57.36 |
chrisl | I_desire_pdfa: If you haven't done so, it's probably worthwhile spending a bit of time reading: https://ghostscript.com/doc/9.50/VectorDevices.htm | 15:00.11 |
| It details a lot of the limitations/implications of using the pdfwrite and related devices | 15:00.33 |
I_desire_pdfa | Ok I will try to understand all thanks for the suggestion. | 15:02.59 |
chrisl | I_desire_pdfa: You probably don't have to understand *all* of it. But if you have other questions in the future, it might save some confusion (for both parties) if you have a feel for what's going on | 15:05.41 |
I_desire_pdfa | Yes thanks, and have a nice evening/morning ? | 15:06.32 |