| <<<Back 1 day (to 2022/04/27) | Fwd 1 day (to 2022/04/29) >>> | 20220428 |
artifexirc-bot | <Chaul-Jhin-Kim> Howdy | 12:42.29 |
| <qwertynik> Used this command from ghostscript: | 14:19.38 |
| <qwertynik> `gs -o output.pdf -sDEVICE=pdfwrite -dFILTERIMAGE -dFILTERVECTOR -dBlackText input.pdf` to generate a PDF with no drawings and with all text in black. This works in most cases. However, I came across this PDF where the white text towards the bottom of the page is NOT being converted to black. | 14:19.39 |
| <qwertynik> Any ideas why? | 14:19.40 |
| <qwertynik> https://cdn.discordapp.com/attachments/773567375458828329/969241606546915398/MIAA-emailer6.pdf | 14:20.06 |
| <qwertynik> @here | 14:20.11 |
| <KenSharp> There's quite a lot of white text, which bit in particular ? | 14:20.54 |
| <qwertynik> Ghostscript version: `9.55.0` | 14:21.15 |
| <qwertynik> OS: `Ubuntu 18.04.6 LTS` | 14:21.16 |
| <qwertynik> https://cdn.discordapp.com/attachments/773567375458828329/969242090867408957/unknown.png | 14:22.02 |
| <qwertynik> Thanks for asking. Do let me know if more information is required. | 14:22.29 |
| <KenSharp> Well to be honest, I'm not seeing any of the text coming out black, except the text that was already black | 14:23.38 |
| <qwertynik> Yes. Any suggestions on how to resolve the same? | 14:24.01 |
| <KenSharp> Well BlackText is really a rendering time switch, and since pdfwrite isn't rendering, I'm not certain it should have any effect | 14:24.26 |
| <KenSharp> OK so I had a typo | 14:24.59 |
| <KenSharp> I do see the blue text turning black | 14:25.04 |
| <qwertynik> Yes. Text does turn to black. | 14:25.26 |
| <KenSharp> Well the short answer is no, not a clue | 14:29.35 |
| <KenSharp> I'd suggest you open a bug report, the file is too large and complex to investigate in a few minutes and will need to be reduced before it can be looked at. | 14:30.03 |
| <KenSharp> Ah, could be because the font is not embedded in the PDF file | 14:31.03 |
| <KenSharp> Hmm, no | 14:31.21 |
| <chrisl> Sure it's text? | 14:31.54 |
| <KenSharp> Yeah it's text in a CIDFont Arial | 14:32.08 |
| <KenSharp> And it's drawn with 1 1 1 rg | 14:32.17 |
| <KenSharp> Ah, it does have a blend mode | 14:32.49 |
| <KenSharp> Could be that. | 14:32.52 |
| <KenSharp> Nope, not that. Well one for Michael I guess. | 14:35.02 |
| <KenSharp> @qwertynik I suggest you open a bug report; the relevant developer is not online currently | 14:35.31 |
| <KenSharp> Actually, now I think about it (and this is not my code) IIRC the BlackText code doesn't change pure white. | 14:40.11 |
| <KenSharp> Because if it did, white text on a black background would get mapped to black text on a black background and you wouldn't be able to see it | 14:40.51 |
| <KenSharp> So yeah, I think this is how it works by design | 14:41.02 |
| <KenSharp> But I could be mistaken, it's not my code. | 14:41.13 |
| <KenSharp> OK final answer. The font is a Type 3 uncoloured font. | 15:03.42 |
| <KenSharp> That means it takes on the colour which is in force at the time the text is drawn. | 15:03.54 |
| <KenSharp> For rendering that will be black, for pdfwrite, however, it will be whatever the current colour is, because the pdfwrite family of devices don't write colours immediately | 15:04.24 |
| <KenSharp> So basically this is an example of a file for which -dBlackText won't work with pdfwrite. | 15:04.43 |
| <mvrhel> Type 3 fonts do not work with -dBlackText | 15:44.24 |
| <mvrhel> and white text will remain white even if it did | 15:44.49 |
| <qwertynik> Thanks @KenSharp and @mvrhel for taking the time and responding. Any workarounds that can be employed to generate a PDF with only text in black color? | 17:26.12 |
| <qwertynik> Can confirm that white text is also converted to white irrespective of the background. | 17:26.48 |
| <mvrhel> @qwertynik Yes. Right now there is a threshold value based upon luminance. If the color value is above that level the color is mapped to white. If it is below that level it is mapped to black. I am thinking about changing this to include information about chrominance also though. Pure yellow for example has a very high luminance and is often mapped to white. Right now the threshold is hard coded at compile time | 17:29.29 |
| <mvrhel> @qwertynik So right now, there is nothing you can do to generate only black text | 17:30.21 |
| <mvrhel> sorry | 17:30.23 |
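A minimal sketch of the luminance threshold mvrhel describes above. This is not Ghostscript's actual code: the Rec. 709 weights, the 0.5 cutoff, and the function names are all assumptions chosen for illustration. It shows why pure yellow ends up on the white side of the cut.

```python
# Illustrative only: weights, cutoff, and names are assumptions,
# not taken from the Ghostscript source.
REC709_WEIGHTS = (0.2126, 0.7152, 0.0722)
THRESHOLD = 0.5  # hypothetical stand-in for the compile-time constant


def luminance(rgb):
    """Weighted sum of the R, G, B components, each in [0, 1]."""
    return sum(w * c for w, c in zip(REC709_WEIGHTS, rgb))


def map_text_colour(rgb, threshold=THRESHOLD):
    """Return 'white' if luminance is above the threshold, else 'black'."""
    return "white" if luminance(rgb) > threshold else "black"


samples = [("blue", (0, 0, 1)), ("yellow", (1, 1, 0)), ("white", (1, 1, 1))]
for name, rgb in samples:
    print(name, round(luminance(rgb), 4), map_text_colour(rgb))
```

With these assumed weights, blue has low luminance and maps to black, while yellow has a luminance close to white's and stays on the white side, which is the problem mvrhel mentions.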
| <Robin_Watts> @mvrhel So a threshold of 0 would mean 'everything goes to black' ? Or 'everything other than pure white goes to black' ? | 17:30.35 |
| <mvrhel> A threshold of zero would make everything white | 17:31.01 |
| <mvrhel> oh == | 17:31.21 |
| <mvrhel> case | 17:31.22 |
| <mvrhel> I have to look at the code | 17:31.27 |
| <mvrhel> And right now it is compile time set | 17:31.42 |
| <mvrhel> But the question is moot | 17:32.04 |
| <mvrhel> because if someone is going to change the level, they can change > to >= | 17:32.19 |
| <mvrhel> easy enough | 17:32.23 |
| <Robin_Watts> right, but when it becomes runtime set... | 17:32.38 |
| <mvrhel> it is in the same file | 17:32.39 |
| <mvrhel> right, when we go to runtime, I will need to give this some thought | 17:32.50 |
| <mvrhel> which will be next week at the earliest | 17:32.58 |
| <mvrhel> if I do it | 17:33.04 |
| <mvrhel> probably need to have some way to set that on the command line | 17:33.36 |
| <mvrhel> @Robin_Watts any suggestions | 17:33.40 |
| <qwertynik> Makes sense @mvrhel. However, I did not notice this when running commands. Will create a sample PDF using Foxit Editor Pro and test the behavior to deepen my understanding. | 17:33.49 |
| <qwertynik> `Right now the threshold is hard coded at compile time. I am finding that everyone has their own opinion (need?) for a different cutoff so I may end up making this a command line parameter` | 17:33.51 |
| <qwertynik> Certainly, having more control would help here. Looking forward to the feature. | 17:33.53 |
| <mvrhel> we could do a string | 17:33.55 |
| <mvrhel> >X, >=X, etc | 17:34.09 |
| <mvrhel> we could have an inversion even | 17:34.24 |
| <mvrhel> then | 17:34.27 |
| <Robin_Watts> 0, negative, positive ? | 17:34.34 |
| <mvrhel> <X | 17:34.35 |
| <mvrhel> 0-, 0+ | 17:34.45 |
| <mvrhel> 100+, 100- | 17:34.53 |
| <Robin_Watts> `diff = |luminance - white|` | 17:35.35 |
| <mvrhel> well I am going to change that | 17:35.46 |
| <mvrhel> to be more of a Delta E. Either 1 norm or 2 norm | 17:35.59 |
| <mvrhel> due to the issue with yellow | 17:36.06 |
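A rough sketch of why a distance-style measure separates yellow from white where a luminance difference does not. Two loud assumptions: a real Delta E is computed in a perceptual space such as CIELAB, whereas this uses plain RGB for brevity, and the Rec. 709 weights and all names are illustrative rather than anything from the Ghostscript source.

```python
import math

# Illustrative only: plain RGB stands in for a perceptual colour space,
# and the weights and names are assumptions, not Ghostscript's code.
WHITE = (1.0, 1.0, 1.0)
REC709_WEIGHTS = (0.2126, 0.7152, 0.0722)


def luminance_diff(rgb):
    """Absolute luminance difference from white."""
    lum = sum(w * c for w, c in zip(REC709_WEIGHTS, rgb))
    return abs(1.0 - lum)


def distance_1norm(rgb):
    """Sum of absolute component differences from white."""
    return sum(abs(w - c) for w, c in zip(WHITE, rgb))


def distance_2norm(rgb):
    """Euclidean distance from white."""
    return math.dist(WHITE, rgb)


yellow = (1.0, 1.0, 0.0)
print(luminance_diff(yellow), distance_1norm(yellow), distance_2norm(yellow))
```

Under these assumptions yellow sits very close to white by luminance alone, but a full unit away under either norm, so a distance-based cutoff would no longer lump it in with white.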
| <mvrhel> I suppose that could be an option too | 17:36.13 |
| <Robin_Watts> Then check `if (diff > option_value) -> black` | 17:36.18 |
| <Robin_Watts> Then option_value being 0 would leave pure white as white. | 17:36.36 |
| <Robin_Watts> but option_value of -1 would send everything to black. | 17:36.47 |
| <mvrhel> right. I will think about it | 17:37.13 |
| <Robin_Watts> and an option_value of 0.1 (or whatever an appropriate scale is) would be a sane 'almost black' thing. | 17:37.20 |
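Robin's proposed rule can be sketched as follows. This is only a reading of the suggestion in the chat, not an implemented Ghostscript option: `option_value` is the name used above, while the function name and the 0 to 1 luminance scale are assumptions.

```python
# Illustrative only: a sketch of the proposed runtime rule,
# assuming luminance is a float in [0, 1] with white at 1.0.
WHITE_LUMINANCE = 1.0


def maps_to_black(luminance, option_value):
    """Apply `if (diff > option_value) -> black` with diff = |luminance - white|."""
    diff = abs(luminance - WHITE_LUMINANCE)
    return diff > option_value


for option_value in (0, -1, 0.1):
    results = {lum: maps_to_black(lum, option_value) for lum in (1.0, 0.95, 0.5, 0.0)}
    print(option_value, results)
```

Because `diff` is never negative, an `option_value` of 0 leaves only pure white untouched, -1 sends everything to black, and a small positive value such as 0.1 also spares near-white colours.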
| <qwertynik> @mvrhel Created this sample to test conversion for different colors. Added yellow and white. Both converted to black with this command: | 17:41.47 |
| <qwertynik> `gs -o op.pdf -sDEVICE=pdfwrite -dFILTERIMAGE -dFILTERVECTOR -dBlackText -f BlackTextGenerationTesting.pdf` | 17:41.48 |
| <qwertynik> https://cdn.discordapp.com/attachments/773567375458828329/969292360112566332/BlackTextGenerationTesting.pdf | 17:41.49 |
| <qwertynik> Would this work irrespective of the font being a Type 3 font? | 17:43.41 |
| <qwertynik> Can confirm that white text is also converted to *black irrespective of the background. | 17:45.17 |
| <qwertynik> Is there a different way to accomplish this @KenSharp and others? I have no experience using PostScript, but could using it help here? Would be curious to know how. | 17:48.35 |
| <mvrhel> Type 3 fonts are never going to work. It is possible that -dBlackVector might make them get drawn black | 18:06.27 |
| <mvrhel> There are just too many corner cases that we can trip on with this stuff | 18:06.58 |
| <KenSharp> There is no practical way to know what the background is, so text is always converted (in the current case, white text is rendered to black; in the new code it has to pass the luminance test referred to by Michael, this is a change in behaviour) | 18:15.39 |
| <KenSharp> Basically, no. There are several tricks you can pull which might work with the old PostScript-based PDF interpreter, but the new PDF interpreter is written in C, so there is nothing you can do in PostScript to affect it. The only way to get what you are trying for would be some fairly involved surgery in the pdfwrite device. | 18:18.10 |