| <<<Back 1 day (to 2020/10/14) | Fwd 1 day (to 2020/10/16) >>> | 20201015 |
Antonio81 | Hello everyone | 11:04.39 |
 | I am trying to download thirdparty-leptonica.tar.gz but I am not able to | 11:05.05 |
kens | What is the nature of the problem ? | 11:05.38 |
Antonio81 | https://ghostscript.com/ocr.html | 11:05.49 |
 | In step 2 I am not able to download the file | 11:06.08 |
| where it says "For the Ghostscript 9.53 release, you can download a snapshot of this source here." | 11:06.25 |
kens | Ah, that link right, I was using the Git link | 11:06.46 |
Antonio81 | neither by git | 11:06.57 |
kens | Hmm, I get a failed network error | 11:07.08 |
Antonio81 | yup | 11:07.12 |
| me too | 11:07.13 |
kens | Just a minute while I poke one of my colleagues | 11:07.20 |
Antonio81 | and Step 1 was also with that same problem, but somehow it downloaded earlier | 11:07.35 |
kens | It may be that the server is simply overloaded | 11:07.54 |
Antonio81 | hum ok | 11:08.16 |
kens | I've asked Robin_Watts to read the conversation | 11:08.31 |
Antonio81 | btw, where should I unpack leptonica to? | 11:08.31 |
kens | He knows more about this than I do and can also look at the server | 11:08.44 |
Robin_Watts | Antonio81: Hi. | 11:08.48 |
| Let me make an archive of the leptonica/tesseract source for you and update that page. give me 5 mins. | 11:09.35 |
Antonio81 | thanks a lot man | 11:09.47 |
Robin_Watts | OK, so I've updated ocr.html in our git repo. It should go live within 5 mins or so. | 11:17.20 |
| That includes pointers to where to download the updated source for gs/tesseract/leptonica. | 11:17.51 |
Antonio81 | thanks | 11:17.58 |
| I'll wait 5/10 mins | 11:18.06 |
Robin_Watts | Thanks. Having someone follow the instructions is a good check for them, thanks. | 11:19.15 |
Antonio81 | Where should I unpack leptonica? | 11:21.00 |
Robin_Watts | Into "leptonica" within ghostpdl. | 11:45.39 |
| I updated that in the instructions a few minutes after the first fix :) | 11:46.01 |
Antonio81 | kens, are you there? | 14:31.28 |
kens | Yep | 14:31.34 |
Antonio81 | I'm having an issue | 14:31.42 |
kens | OK.... | 14:31.49 |
Antonio81 | with tesseract | 14:31.50 |
kens | OK what's the problem ? | 14:32.21 |
Antonio81 | i've set the environment variable TESSDATA_PREFIX to "C:\Program Files\gs\gs9.53.3\tesseract\tessdata" | 14:32.29 |
| but i'm getting this error | 14:32.37 |
| Error opening data file C:\Program Files\gs\gs9.53.3\tesseract\tessdata/eng.traineddata | 14:32.37 |
kens | Hmm, well that certainly looks all right | 14:32.57 |
Antonio81 | I think it is because of the forward slash | 14:32.57 |
| and the files are there, i've confirmed | 14:33.13 |
kens | Maybe, but in general Ghostscript treats those as equivalent on Windows | 14:33.23 |
| I often use / | 14:33.27 |
Antonio81 | the only way for the ocr to work is pasting tessdata inside "C:\Program Files\gs\gs9.53.3\bin\" | 14:33.45 |
kens | Yeah it will look in the current working directory first I think | 14:34.08 |
| which will be the bin directory where Ghostscript resides I guess | 14:34.21 |
| This is stuff which has recently changed (problems on MacOS) so I'm not totally up to date with it | 14:34.55 |
| Let me see if Robin is around still | 14:35.05 |
Antonio81 | thanks | 14:35.10 |
Robin_Watts | Antonio81: What version of gs are you using? | 14:35.56 |
 | The release, or the "post-release" ? | 14:36.04 |
| with the release, you'll need to use --permit-file-read="C:\...." | 14:36.52 |
| with the updated code post release you shouldn't need to. | 14:37.06 |
Antonio81 | 9.53.3 | 14:37.37 |
| i've downloaded the one on the ocr documentation which is ocr-beta | 14:37.59 |
| i guess | 14:38.00 |
| i've downloaded this one gs9533w64-ocr_beta.exe | 14:38.25 |
Robin_Watts | That's the release version. | 14:38.39 |
| All the binaries there are for the release version. | 14:38.51 |
Antonio81 | and then downloaded this snapshot ghostpdl-9.53.x-ocr-fixes | 14:38.56 |
| and i've replaced the files inside the install dir | 14:39.11 |
 | (because i dont have MSVC to rebuild ghostscript) | 14:39.27 |
Robin_Watts | Ah, so you're trying to use a frankenstein. | 14:39.30 |
Antonio81 | true :D | 14:39.35 |
Robin_Watts | Your choices are: | 14:39.45 |
| 1) Use the vanilla version. Use --permit-file-read=... to tell gs it's OK to read that file. | 14:40.02 |
| 2) Get MSVC, use the new sources, rebuild, don't need to use --permit-file-read=... | 14:40.30 |
| There is no middle way. | 14:40.42 |
Antonio81 | but i've tried first only after installing gs9533w64-ocr_beta.exe | 14:41.22 |
 | and an error popped up saying that ocr was not a valid sDEVICE | 14:41.40 |
| that's why i've downloaded the snapshot and overwritten the content | 14:42.00 |
Robin_Watts | The snapshot will not help you unless you rebuild. | 14:42.16 |
Antonio81 | but it is working actually lol | 14:42.35 |
| I can get the content from a PDF | 14:42.42 |
| without rebuilding | 14:42.51 |
Robin_Watts | I'm sorry, you're absolutely confusing me here. | 14:43.18 |
Antonio81 | sorry | 14:43.26 |
| 1) I've not rebuilt gs | 14:43.41 |
| actually | 14:43.52 |
| -1) i've downloaded gs9533w64-ocr_beta.exe and installed | 14:44.18 |
| 0) downloaded the snapshot (and tesseract and leptonica) | 14:44.58 |
| and then it worked | 14:45.05 |
| without rebuilding gs | 14:45.10 |
Robin_Watts | My car wouldn't start this morning. So I punched the dog. Then it started right up. | 14:45.36 |
Antonio81 | thats it :D | 14:45.45 |
| damn dog was having a negative thought | 14:46.04 |
Robin_Watts | So, I am at a loss to understand what you want from me here. | 14:47.00 |
Antonio81 | my problem is that I'm getting an error with the environment variable TESSDATA_PREFIX | 14:48.37 |
| it is set but im still getting an error | 14:48.47 |
| i've set the environment variable TESSDATA_PREFIX to "C:\Program Files\gs\gs9.53.3\tesseract\tessdata" | 14:48.58 |
kens | Antonio81 I believe that's because the binary you are using is the release version. That binary will require you to tell Ghostscript that the training data is safe to read. | 14:49.27 |
Antonio81 | Error opening data file C:\Program Files\gs\gs9.53.3\tesseract\tessdata/eng.traineddata | 14:49.32 |
| ok ok | 14:49.51 |
Robin_Watts | Antonio81: what kens said. | 14:50.03 |
kens | So you need to add --permit-file-read="c:/Program Files/gs/..../eng.traineddata" to the command line | 14:50.06 |
Antonio81 | it works if I paste the tessdata inside "C:\Program Files\gs\gs9.53.3\bin\" | 14:50.27 |
| but I think that there's no problem right? | 14:50.36 |
kens | The current code (which is not available as a pre-built binary) has the permission added for you as a convenience | 14:50.39 |
Antonio81 | ok, I will try with that command | 14:50.53 |
kens | As you say there's no real problem with having the training data in the binary directory | 14:51.00 |
Antonio81 | nope, its not working, but its ok | 14:53.00 |
| it'll have to be inside bin folder | 14:53.07 |
kens | For now that's the simplest solution. | 14:53.20 |
Antonio81 | thanks a lot again ;) | 14:53.29 |
kens | No problem, it's useful to have feedback so thanks for trying it out | 14:53.52 |
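
Editor's sketch of the command line discussed above, for readers following along. It only assembles the arguments: `--permit-file-read`, the `ocr` device and `TESSDATA_PREFIX` come from the conversation, while the function names, the `gswin64c.exe` executable name and the input/output file names are assumptions chosen for illustration. Note that Antonio81 reported this approach did not work with his mixed install, so treat it as a starting point under those assumptions, not a confirmed fix.

```python
import os


def build_ocr_command(gs_exe, tessdata_dir, input_pdf, output_txt, lang="eng"):
    """Assemble a Ghostscript OCR command line (hypothetical helper).

    The training-data path is written with forward slashes, which kens
    says Ghostscript treats as equivalent to backslashes on Windows.
    """
    data_dir = tessdata_dir.replace("\\", "/").rstrip("/")
    trained = data_dir + "/" + lang + ".traineddata"
    return [
        gs_exe,
        "--permit-file-read=" + trained,  # tell gs the file is safe to read
        "-sDEVICE=ocr",                   # the OCR device named in the chat
        "-o", output_txt,
        input_pdf,
    ]


def build_env(tessdata_dir, base_env=None):
    """Return a copy of the environment with TESSDATA_PREFIX set."""
    env = dict(os.environ if base_env is None else base_env)
    env["TESSDATA_PREFIX"] = tessdata_dir
    return env


if __name__ == "__main__":
    # Directory taken from the chat; file names are placeholders.
    tessdata = r"C:\Program Files\gs\gs9.53.3\tesseract\tessdata"
    cmd = build_ocr_command("gswin64c.exe", tessdata, "input.pdf", "output.txt")
    print(cmd)
    # To actually run it, pass cmd and build_env(tessdata) to subprocess.run.
```

The list form avoids shell quoting problems with the space in "Program Files"; the simpler workaround agreed in the chat remains copying tessdata into the bin directory.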
Antonio81 | Your program is very complete | 15:18.23 |
| I am using pdf version change, pcl generation and now the OCR | 15:18.47 |
kens | Ghostscript ? It's a 'mature' application | 15:18.49 |
Antonio81 | Im also using pcl scale commands to refit the content | 15:19.26 |
kens | Then you know more about PCL than I do :-) | 15:19.42 |