| 2014/09/11 |
rayjj | kens (for the logs): are you still playing with mooscript ? It would sure be nice to be able to scrap all of the SMask and transparency group nonsense in pdf_draw.ps and work in C. | 00:02.06 |
| when I told my son (15 yr old) about the new mupdf customer it was a ROFL for him (and a high five for us) | 00:03.26 |
kens | henrys (logs) rayjj, chrisl, robin and I probably ought to talk about compressed image data pass-through at the next meeting. Probably worth sticking on the agenda as a reminder (not for discussion at the meeting probably). | 07:10.39 |
| rayjj I'd *like* to get back to Mooscript, but I've done nothing with it since the last staff meeting. Long-haul flights seem to be the only occasion when I have time for projects these days. | 07:11.28 |
| And yes, the new MuPDF customer is definitely LMAO time | 07:12.02 |
casper366 | hi, I've got a problem. I've got a PDF file (made from LaTeX) which presumably doesn't have a CropBox. I'd like to crop it so that the massive borders disappear | 10:46.47 |
kens | What exactly do you mean by crop in this case ? Do you mean you want to add a CropBox, or do you want to redefine the media and translate the content onto the reduced media ? | 10:47.42 |
casper366 | my end goal is to have the pdf in 800x600 or A5 on my kindle. Now it's A4 and has borders which I don't need | 10:48.32 |
kens | So do you want to add a CropBox or alter the MediaBox and translate the content ? | 10:49.23 |
casper366 | I guess (if I understood the definitions) i want to alter the Mediabox | 10:50.49 |
kens | Tricky. | 10:51.09 |
casper366 | is it easier to add a cropbox and then just output that? | 10:51.54 |
kens | You will need to set a fixed media size, and apply a BeginPage procedure to translate the content by an appropriate amount. | 10:52.09 |
| I think pdftk will let you add a CropBox, otherwise you can do it with Ghostscript and a pdfmark I think. | 10:52.41 |
casper366 | media size in points, correct? | 10:52.42 |
kens | MediaBox is given in PostScript/PDF units, which are 72 to the inch unless a UserUnit is applied | 10:53.11 |
| There's a partial answer to adding a CropBox here: | 10:54.26 |
| http://stackoverflow.com/questions/25505001/ghostscript-issues-with-a-cropbox | 10:54.27 |
| And there's a couple more suggestions here: | 10:55.08 |
| http://stackoverflow.com/questions/9532652/cropbox-and-mediabox-in-ghostscript | 10:55.09 |
casper366 | thanks I'll process these links and be right back | 10:57.10 |
kens | And another one | 10:58.25 |
| http://stackoverflow.com/questions/6183479/cropping-a-pdf-using-ghostscript-9-01?rq=1 | 10:58.26 |
| me lunches | 11:10.42 |
casper366 | thanks. I know the last one already, but when I use that the output is the same as the input-file | 11:23.41 |
| Hi, I've found a prog that does just the trick (http://briss.sourceforge.net/). Unfortunately I can't read java, so I have no clue how it works. | 11:46.06 |
| Thanks for your help :-) | 11:46.15 |
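The arithmetic behind kens's suggestion (a fixed, reduced media size plus a translation of the content) can be sketched in Python. This is only an illustration: the helper names and the 1 inch border are assumptions, not values from the log, and the units are the default 72 per inch with no UserUnit applied.

```python
# Illustrative only: the helper names and the example border are assumptions,
# not taken from the log. Units are default PDF units (72 per inch, no UserUnit).

POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def mm_to_points(mm):
    """Convert millimetres to default PDF/PostScript units."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def crop_geometry(llx, lly, urx, ury):
    """Return (width, height, dx, dy) for cropping to the given rectangle.

    width/height is the reduced media size; dx/dy is the translation that
    moves the rectangle's lower-left corner to the origin of the new media.
    """
    if urx <= llx or ury <= lly:
        raise ValueError("crop rectangle must have positive width and height")
    return (urx - llx, ury - lly, -llx, -lly)


# A4 and A5 page sizes in points.
a4 = (mm_to_points(210), mm_to_points(297))
a5 = (mm_to_points(148), mm_to_points(210))

# Hypothetical example: trim a 1 inch border from every side of an A4 page.
border = 72.0
width, height, dx, dy = crop_geometry(border, border, a4[0] - border, a4[1] - border)

print("A4: %.1f x %.1f pt" % a4)
print("A5: %.1f x %.1f pt" % a5)
print("cropped media: %.1f x %.1f pt, translate by (%.1f, %.1f)" % (width, height, dx, dy))
```

The width and height would become the fixed media size, and the (dx, dy) pair is the amount a BeginPage procedure would translate the content by.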
dddddddd | hi | 11:49.45 |
ghostbot | Welcome to #ghostscript, the channel for Ghostscript and MuPDF. If you have a question, please ask it, don't ask to ask it. Do be prepared to wait for a reply as devs will check the logs and reply when they come on line. | 11:49.45 |
dddddddd | is this the right place to ask a question about mupdf project building in android sdk, on eclipse? | 11:50.41 |
Robin_Watts | The MuPDF devs are here, yes. | 11:50.53 |
| I am one of them, but I don't use eclipse. | 11:51.07 |
dddddddd | hmm, ok | 11:52.23 |
paulgardiner | I have used eclipse with MuPDF | 11:52.42 |
dddddddd | So I followed the steps in http://www.mupdf.com/docs/how-to-build-mupdf-for-android | 11:54.07 |
| I have the NDK installed, and I can successfully run the HelloJNI project | 11:54.31 |
Robin_Watts | including step 10? everyone skips step 10 and then wonders why it doesn't work. | 11:54.35 |
dddddddd | what is step 10, sorry? | 11:54.48 |
Robin_Watts | oh, those instructions. I was thinking of platform/android/ReadMe.txt which has more detail. | 11:55.17 |
dddddddd | I installed cygwin, I am on Windows | 11:55.29 |
| ant and make also | 11:55.34 |
Robin_Watts | but the important step that everyone skips is the "make generate". | 11:55.36 |
dddddddd | ohh, I haven't checked it | 11:55.54 |
| right | 11:56.09 |
kens | It's in those instructions | 11:56.11 |
Robin_Watts | It is, yes, it's just not "step 10" :) | 11:56.22 |
kens | last step in 'prepare the source' | 11:56.26 |
dddddddd | Could it be that ignoring that step makes the project crash when loading the native library, in openfile()? | 11:57.03 |
Robin_Watts | dddddddd: No. It wouldn't have built. | 11:57.19 |
dddddddd | Ok | 11:57.23 |
Robin_Watts | If it's crashing when loading the native library, it's probably Google's fault. | 11:57.33 |
dddddddd | It is building, so let's put that option down | 11:57.39 |
Robin_Watts | They broke the latest ndk by missing out a function. | 11:57.42 |
dddddddd | it is running | 11:57.45 |
Robin_Watts | strtod (or strtof, one of them) | 11:57.52 |
dddddddd | but when I choose a PDF and open it, it just crashes | 11:57.56 |
| I can show the Exception message, if you want | 11:58.07 |
Robin_Watts | When the shared lib is loaded, it can't resolve that function, and then it dies. | 11:58.15 |
| What version of the ndk are you using? | 11:58.36 |
dddddddd | let me check | 11:58.43 |
| android-ndk-r10 | 11:59.16 |
Robin_Watts | Try using r8e | 11:59.29 |
kens | I thought you fixed that in the MuPDF source.... | 11:59.43 |
dddddddd | ok, I'll try | 12:00.16 |
Robin_Watts | kens: So did I, but we've had one report that the latest version still doesn't work. | 12:01.40 |
dddddddd | Just to let you know, it is crashing on line 14 of MuPDFCore class | 12:01.49 |
kens | :-( | 12:01.49 |
dddddddd | System.loadLibrary("mupdf"); | 12:01.50 |
Robin_Watts | dddddddd: Yes, that's an unhelpful error, alas, as it gives no clue as to why. | 12:02.12 |
dddddddd | in LogCat, I get: Caused by: java.lang.UnsatisfiedLinkError: Cannot load library: reloc_library[1307]: 1386 cannot locate '__isnanf'... | 12:02.33 |
kens | Similar, but that's a new one to me | 12:02.53 |
Robin_Watts | Ah, a new one. thanks. | 12:03.04 |
kens | Sounds like they've really messed up the NDK this time | 12:03.09 |
dddddddd | When I download the MuPDF app from the playstore, everything works fine | 12:04.17 |
| That is why it must be something in the building process | 12:04.33 |
Robin_Watts | yeah. It's the ndk stub lib, I suspect. | 12:04.43 |
dddddddd | Ok | 12:06.24 |
| If you need some more info about my install, I can provide it to you | 12:06.39 |
| For now I will try with ndk r8e | 12:07.00 |
Robin_Watts | dddddddd: Thanks. | 12:07.12 |
dddddddd | Thanks a lot, I didn't expect to have quick help like this | 12:07.37 |
Robin_Watts | dddddddd: Yeah, apparently the NDK is broken :( | 12:10.55 |
| got time to try a test for me? | 12:11.21 |
dddddddd | yes | 12:16.10 |
Robin_Watts | In include/mupdf/fitz/system.h | 12:18.16 |
| Look for a #ifdef __ANDROID__ line (line 125ish?) | 12:18.46 |
| In that block do #undef isnan | 12:18.59 |
| and see if that solves it. | 12:19.13 |
| bbia mo | 12:19.20 |
dddddddd | ok, let me try | 12:19.56 |
| where do i put the #undef isnan? | 12:21.51 |
| the below line? | 12:21.55 |
| before #include <android/log.h>? | 12:22.10 |
kens | Somewhere between #ifded __ANDROID__ and the corresponding #endif | 12:22.12 |
| #ifdef that should be, sorry | 12:22.20 |
dddddddd | ok | 12:22.44 |
| shoul I do ant debug and ant debug install again? | 12:23.19 |
| should* | 12:23.26 |
kens | You'll need to rebuild, I'm not an Android developer so I don't really know I'm afraid | 12:23.47 |
| OK you should just need to build the native libraries, so ndk-build | 12:24.22 |
dddddddd | ok | 12:24.37 |
kens | I'd start from that point and do all the steps | 12:24.47 |
dddddddd | When running ndk-build, I get this warning: Android NDK: WARNING:jni/Android.mk:mupdfcore: LOCAL_LDLIBS is always ignored for static libraries | 12:25.16 |
kens | so ndk-build then ant debug, ant debug install | 12:25.18 |
dddddddd | Is this relevant or not? | 12:25.23 |
kens | Again, I don't know sorry. Hopefully Robin will be back shortly and he (or paulgardiner) can tell you | 12:25.45 |
| But it sounds like you can ignore it to me | 12:25.54 |
dddddddd | Right | 12:26.44 |
| Now I get this error: | 12:26.50 |
| android-ndk-r10/build/core/build-binary.mk:447: recipe for target 'obj/local/armeabi-v7a/objs/mupdfcore/__/__/__/source/fitz/stext-device.o' failed make: *** [obj/local/armeabi-v7a/objs/mupdfcore/__/__/__/source/fitz/stext-device.o] Interrupt | 12:26.51 |
| when I do ndk-build | 12:27.12 |
kens | OK *waaay* outside my competence now.... | 12:27.15 |
tor8 | dddddddd: that warning has been present (and harmless) for quite some time | 12:27.28 |
dddddddd | ok, good | 12:28.06 |
kens | Any idea about the build error tor8 ? | 12:28.37 |
tor8 | kens: no, sorry. | 12:28.46 |
| I guess I could try to install the latest ndk and see if I can get that to build. | 12:28.58 |
kens | Ah well, we need someone who speaks to droids | 12:29.02 |
| C3P0 :-) | 12:29.12 |
dddddddd | I guess running ant clean will erase my change to the system.h file, right? | 12:29.43 |
kens | I wouldn't think so no | 12:29.59 |
| I would expect that to clean the intermediate object files | 12:30.12 |
| and the final binaries, but not change the source | 12:30.27 |
| Worth a try I would think | 12:30.41 |
dddddddd | ok, because ndk-build terminated successfully after that | 12:31.24 |
| doing ant debug now.... | 12:31.39 |
tor8 | dddddddd: okay, I've got ndk-r9d on linux and ndk-build succeeds just fine with that | 12:31.53 |
kens | OK sounds like one of the object files was out of date compared to the source. So, carry on :-) | 12:31.57 |
tor8 | updating my sdk now, then I'll test ant debug and then the emulator | 12:32.14 |
kens | He's using 10 I think tor8 | 12:32.19 |
tor8 | kens: right. I guess I could download that too. | 12:32.34 |
| just looking at what I had installed already | 12:32.39 |
kens | Oh fair enough, didn't realise you had that already | 12:32.55 |
dddddddd | If some of you could try it on Windows I would appreciate it | 12:33.11 |
kens | runs away screaming | 12:33.22 |
tor8 | dddddddd: I don't develop on Windows, so I'm afraid I can't help you there. | 12:33.25 |
dddddddd | :D | 12:33.30 |
| ok | 12:33.37 |
tor8 | dddddddd: ndk-r10b? | 12:33.51 |
dddddddd | yes, that's the version I am using | 12:34.07 |
| it's the latest | 12:34.13 |
tor8 | dddddddd: okay, I'll try that next | 12:34.14 |
kens | I'm kind of astonished this keeps happening.... | 12:34.47 |
tor8 | kens: google has fumbled with android native development since forever... | 12:35.09 |
| ...and nobody notices because everybody is hypnotized by Apple's shinies | 12:35.21 |
dddddddd | I am getting a slightly different Exception message, now | 12:35.27 |
| Caused by: java.lang.UnsatisfiedLinkError: Cannot load library: reloc_library[1307]: 1386 cannot locate 'rand'... | 12:35.34 |
kens | Sure, but to keep on dropping bits of the libraries, and different bits on every release.... | 12:35.35 |
dddddddd | on the same line | 12:35.37 |
| as before | 12:35.40 |
kens | Oh dear, rand is gone too by the sound of it | 12:35.54 |
| No random number generator :-( | 12:36.06 |
dddddddd | I'll have lunch now | 12:39.13 |
kens | OK thanks for trying that | 12:39.19 |
Robin_Watts | dddddddd: Try adding #define rand(x) ((int)lrand48()) | 12:39.23 |
| dddddddd: Try adding #define rand (int)lrand48 | 12:39.41 |
| The latter one, sorry. | 12:39.59 |
dddddddd | where, exactly? | 12:40.01 |
Robin_Watts | same place as you added #undef isnan | 12:40.24 |
dddddddd | ok | 12:40.43 |
| My block is like this, now: | 12:40.50 |
| #ifdef __ANDROID__ #undef isnan #define rand (int)lrand48 #include <android/log.h> #define LOG_TAG "libmupdf" #define LOGI(...) __android_log_print(ANDROID_LOG_INFO,LOG_TAG,__VA_ARGS__) #define LOGE(...) __android_log_print(ANDROID_LOG_ERROR,LOG_TAG,__VA_ARGS__) #else #define LOGI(...) do {} while(0) #define LOGE(...) do {} while(0) #endif | 12:40.55 |
| Sorry, the format is messy | 12:41.04 |
Robin_Watts | perfect. | 12:41.07 |
dddddddd | let me try | 12:41.13 |
| Caused by: java.lang.UnsatisfiedLinkError: Cannot load library: reloc_library[1307]: 1386 cannot locate 'rand'... | 12:49.25 |
| It didn't help | 12:49.40 |
kens | Might be best to wait for tor8 to download and try the latest NDK ? | 12:50.02 |
dddddddd | I just finished downloading r8e | 12:50.28 |
| I'll try | 12:50.31 |
| But first, I got to eat something | 12:50.42 |
| So, I'll be back in one and half hour | 12:51.01 |
tor8 | kens: bah. I need to update my SDK because it's apparently too old as well... | 12:51.02 |
dddddddd | thanks for helping me | 12:51.11 |
tor8 | did I mention I hate software that has to be updated constantly? | 12:51.21 |
dddddddd | Yeah, the usual | 12:51.23 |
| Long downloads all the way | 12:51.41 |
Robin_Watts | Urm... we don't use rand... | 12:52.39 |
tor8 | all smartphones drive me nuts with their incessant nagging about pointless updates I couldn't care less about... | 12:53.02 |
| Robin_Watts: we do, in the test buffer and in mujs Math.random() implementation | 12:53.20 |
Robin_Watts | test buffer is disabled. | 12:53.34 |
| ah, Math.random is not showing up for me when I search... | 12:53.52 |
tor8 | grep -R 'rand()' source thirdparty | 12:54.29 |
Robin_Watts | yeah, I see it with grep, but not in the solution, for some reason. | 12:54.50 |
kens | chrisl I believe I understand the problem with the corrupted glyphs. It's due to using tiny numbers with FreeType. I solved this before for type 3 input fonts by multiplying the CTM by 100, looks like I need to do the same for the new code too. | 12:55.28 |
| Which means hackery with the CTM at various places, but it works for type 3's so I'm sure I can solve it eventually for fallback outline type 3 fonts too. | 12:56.29 |
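A rough Python illustration of why tiny numbers can lose precision and why scaling by 100 helps. It assumes, which the log does not confirm, that the values end up rounded to a 16.16 fixed-point format such as FreeType's FT_Fixed; the sample value 0.001 and the function names are made up for the example.

```python
# Sketch under assumptions: values are rounded to 16.16 fixed point
# (16 integer bits, 16 fractional bits). The sample value is made up.

ONE_16_16 = 1 << 16  # 65536 represents 1.0 in 16.16 fixed point


def to_fixed_16_16(value):
    """Round a float to the nearest 16.16 fixed-point integer."""
    return int(round(value * ONE_16_16))


def relative_error(value):
    """Relative error introduced by a round trip through 16.16."""
    recovered = to_fixed_16_16(value) / ONE_16_16
    return abs(recovered - value) / abs(value)


tiny = 0.001          # e.g. one entry of a very small matrix
scaled = tiny * 100   # the same entry after multiplying the matrix by 100

print("error for %g: %.4f%%" % (tiny, relative_error(tiny) * 100))
print("error for %g: %.4f%%" % (scaled, relative_error(scaled) * 100))
```

Under that assumption the scaled value keeps roughly two more decimal digits of precision, at the cost of having to undo the factor of 100 elsewhere, which fits the "hackery with the CTM" kens describes.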
tor8 | *sigh* so parts of the android sdk are 32-bit, even on a 64-bit machine ... which means you need to use multi-arch | 13:19.34 |
| android dev really is a noisome pile of manure... | 13:20.16 |
kens | chrisl, working text from my new code! Outlines are good, positioning is fine, kerning works. Slightly different to what Distiller produces, but it's a matter of pixels here and there really. | 13:53.42 |
henrys | kens: sure I'll add it to the agenda - I haven't seen a customer ask for it though, have you? | 13:54.33 |
kens | henrys, yes, on occasion. The problem is that a DCT encoded image in the input either ends up larger in the pdfwrite output, (because we don't use DCT) or with ugly artefacts (because we do) | 13:55.17 |
| It's not a hugely important point though, more a reminder that me, ray, chris and robin should probably talk about it | 13:55.55 |
| So on the Friday if we have time after Joseph finishes | 13:56.32 |
henrys | kens: okay | 13:58.52 |
chrisl | kens: that's very cool. It will be interesting to see how much difference it makes to the ps2write output.... | 13:59.51 |
kens | Hmm, Acrobat renders Arial MT with distinct flattening of curves. My new output actually looks slightly better (IMO) than the Distiller code. | 13:59.58 |
| chrisl just kicked off a cluster run to see what breaks :-( | 14:00.33 |
chrisl | Could it be relying on the default flatness setting? | 14:00.50 |
kens | Code isn't finished yet, I haven't added checks for existing CharProcs, which is (partly) where the win will be | 14:01.00 |
| chrisl Well, my outlines are just outlines, so the flatness should be the same.... | 14:01.20 |
| But when I view both at 600% I can see the 'o' looks rather polygonal in the Distiller file. It must be something to do with rendering the TT outline | 14:01.59 |
| Of course, this is with smoothing turned on, it may be better without, but I'm interested in what a regular user would see | 14:02.23 |
chrisl | Oh, maybe it doesn't convert the curves, and just flattens them? | 14:02.44 |
kens | That's distinctly possible, though it does look better (not polygonal) when I turn off smoothing | 14:03.14 |
| There's not really anything to choose between the Distiller font and my type 3 with smoothing off | 14:04.00 |
chrisl | I'm fairly sure Acrobat uses some library for TTF rendering - if Distiller does too, there may only be limited options about the form of the output | 14:06.29 |
kens | well, it's Acrobat doing the rendering, Distiller just embeds the TT font (I think). | 14:06.52 |
| Acrobat says it's using an embedded subset TT font | 14:07.17 |
chrisl | Oh, I thought you had convinced Distiller to output Type 3 outlines | 14:07.49 |
kens | Nope, I just wanted a comparison for my type 3 font | 14:08.06 |
| To see if it looked right, was positioned correctly, sized appropriately and with sensible widths | 14:08.28 |
| I'm surprised to see so many differences, I must be kicking into this code when I shouldn't be | 14:09.04 |
chrisl | *Lots* of PDFs with pointless CIDFonts around..... | 14:09.39 |
kens | That shouldn't normally cause this code to activate, it's only supposed to be active for the fallback (bitmap) case. | 14:10.23 |
chrisl | ps2write..... | 14:10.47 |
kens | CIDFonts should just go through as CIDFonts | 14:10.48 |
| There's 1800+ diffs with pdfwrite | 14:11.09 |
| Including some PCL cases | 14:11.31 |
Robin_Watts | git cluster bmpcmp -w3 | 14:11.32 |
kens | No, no, I really do want to see them | 14:11.45 |
chrisl | PCL might be stick font? | 14:11.59 |
kens | If the code is activating inappropriately I need to stop it doing that | 14:12.04 |
| chrisl yes, but the stick font is handled the same as a PS type 3 font, so no bitmap fallback | 14:12.29 |
| The bitmap fallback is supposed to be rare (for pdfwrite at least) | 14:12.40 |
| I suspect I'm not testing properly to see if I'm accumulating a charproc | 14:13.13 |
chrisl | I'll stop making pointless suggestions, then :-) | 14:13.29 |
kens | Hopefully it'll be easier to fix than this code was to write......... | 14:13.41 |
| bmpcmp is just coming to an end, so that'll give me something to look at, from that I should be able to see why it's happening | 14:13.59 |
| Well, first dumb mistake located | 14:31.06 |
dddddddd | Hi again, guys | 14:35.15 |
| Is it normal that, when executing make generate, I get as output: make: Nothing to be done for 'generate'.? | 14:35.52 |
kens | I would think that's possible, all the files are already present | 14:36.15 |
Robin_Watts | dddddddd: Sure, if you've done that step already, yes. | 14:51.51 |
dddddddd | Thank you all | 15:10.58 |
| I managed to get it working with ndk-r8e | 15:11.17 |
| finally | 15:11.22 |
kens | congrats :-) | 15:11.37 |
dddddddd | Just one more thing | 15:13.36 |
| Is there some way to bookmark a page | 15:13.44 |
| or to register comments in the PDF file? | 15:13.57 |
Robin_Watts | nope. | 15:14.52 |
dddddddd | ok, is it planned for the future or is MuPDF only for rendering/reading? | 15:15.47 |
Robin_Watts | If you mean a way of bookmarking a page where the bookmark information is stored in the app, then it would be easy to add to the app. | 15:15.58 |
| and could be done entirely in java. | 15:16.04 |
| If you mean a way of bookmarking a page where the bookmark information is stored in the PDF, then that wouldn't be that hard, but it would require C level programming. | 15:16.31 |
dddddddd | Yes, I just wanted to know if that feature was built-in in the muPDF or not | 15:16.56 |
| Thank you once again | 15:17.02 |
| Bye | 15:18.55 |
| and good luck trying to solve Google's mess in the NDK updates | 15:19.14 |
Robin_Watts | bye | 15:19.15 |
dddddddd | :D | 15:19.16 |
rayjj | mvrhel_laptop: can you glance at my fix for cust 532's issue? It looks fine for the regression run except for tests_private/pdf/sumatra/586_-_missing_images_gs_SMask_not_applied.pdf which was wrong before, and is still wrong, but different. | 16:25.48 |
| mvrhel_laptop: the commit is at http://git.ghostscript.com/?p=user/ray/ghostpdl.git;a=commitdiff;h=ea6290b302598f13e7fb4c29aff73657989e693d;hp=3b2d19e8a459610ba50560ee004eae8736644dc1 | 16:25.58 |
mvrhel_laptop | rayjj: sure | 16:26.04 |
rayjj | mvrhel_laptop: it uses the method that you (sort of) suggested to me | 16:26.39 |
| mvrhel_laptop: at least I didn't have to add any new compositor operations -- just (ab)used the .begintransparencymaskgroup | 16:27.37 |
| mvrhel_laptop: sorry about all of the whitespace changes in pdf_draw.ps (I trimmed trailing blanks) The only real change to that file is the first part of the diff | 16:30.05 |
mvrhel_laptop | np. making my way through it | 16:30.27 |
| rayjj: looks like a reasonable way to do it. | 16:32.57 |
rayjj | mvrhel_laptop: thanks. | 16:38.40 |
henrys | rayjj: curious about the company m board, we should be able to get help on that one. | 17:52.43 |
| rayjj: you said I'd do it and I did ... can you edit the logs? | 17:53.28 |
Robin_Watts | I will. | 17:54.48 |
rayjj | henrys: sorry -- I wasn't paying attention. I have been swamped with fixes for cust 532, but I will pulse company C again, and in the meantime (once 532 is happy) get on to the company M board (which should be a lot more fun) | 18:11.05 |
henrys | rayjj: if you can give 532 stuff to chrisl_away feel free, miles really wants the board stuff done. | 18:11.58 |
aksr | guys, any way to detect/extract italic/bold words from pdf or ps? | 18:12.47 |
Robin_Watts | aksr: In general, no. | 18:13.01 |
| But there are heuristics you can apply. | 18:13.14 |
aksr | Yes? | 18:13.19 |
Robin_Watts | If you use MuPDF to do text extraction, you get details of the font sizes/font names used for each char. | 18:13.48 |
| You may be able to spot bold/italics from that. | 18:14.05 |
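One possible form of the heuristic Robin_Watts mentions, sketched in Python: guess bold/italic from the font name reported for each character. The keyword lists, function names and sample font names are assumptions for illustration and are not part of MuPDF; only the convention that subset fonts carry a six-letter tag followed by "+" is standard.

```python
# Heuristic sketch: guess bold/italic from a font name. The keyword lists and
# the sample names are assumptions for illustration; real PDFs may use names
# that carry no style information at all.
import re

BOLD_WORDS = ("bold", "black", "heavy", "demi")
ITALIC_WORDS = ("italic", "oblique")


def strip_subset_prefix(font_name):
    """Remove a subset tag such as 'ABCDEF+' from the front of a font name."""
    return re.sub(r"^[A-Z]{6}\+", "", font_name)


def guess_style(font_name):
    """Return (is_bold, is_italic) guessed from the font name alone."""
    name = strip_subset_prefix(font_name).lower()
    is_bold = any(word in name for word in BOLD_WORDS)
    is_italic = any(word in name for word in ITALIC_WORDS)
    return is_bold, is_italic


for sample in ("ABCDEF+Arial-BoldMT", "Times-Italic", "Helvetica-BoldOblique", "CMR10"):
    print(sample, guess_style(sample))
```

A name-based guess like this will miss fonts whose names encode style differently (for example CMBX10, the Computer Modern bold extended face), so it can only ever be a heuristic.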
aksr | Ok, anything else? | 18:15.07 |
Robin_Watts | gs also does text extraction, which will help with PS, as MuPDF doesn't do PS. Dunno quite what the gs output looks like though. | 18:15.50 |
rayjj | henrys: I just shipped off one fix -- it was the transparency related one, and that one would have probably needed mvrhel, not chris to dig into (without a big learning curve) | 18:16.38 |
| henrys: I have one more (6th gen) one from 532 to look at. If I can foist that onto chrisl, I will. | 18:17.22 |
| Robin_Watts: I think that the txtwrite device on gs has output similar to mudraw's | 18:18.41 |
Robin_Watts | rayjj: mudraw has at least 3 different output modes. Not sure how many txtwrite has. | 18:19.09 |
| but yes, I am aware they are (superficially at least) the same. | 18:19.33 |
rayjj | Robin_Watts: -dTextFormat=0 | 1 | 2 | 3 (default is 3) http://www.ghostscript.com/doc/9.14/Devices.htm#TXT | 18:22.34 |
Robin_Watts | ah, great. | 18:22.46 |
rayjj | Robin_Watts: it even mentions MuPDF :-) | 18:23.15 |
Robin_Watts | MuPDF attempts to reorder text to get better output, which I don't believe gs does, but if your input files are sane, that shouldn't matter | 18:23.52 |
rayjj | Robin_Watts: I think that's what Format 1 does, but as it says: "attempts similar processing to MuPDF, and will output blocks of text. Note the algorithm used is not the same as the MuPDF code, and so the results will not be identical." | 18:24.39 |
Robin_Watts | rayjj: MuPDF does that processing in all cases. | 18:27.56 |
| I'm not claiming that MuPDF is better than gs here, just highlighting the fact that there are differences. | 18:28.18 |
rayjj | Robin_Watts right. | 18:28.19 |
| I suppose it is expecting too much to share the heuristics code :-/ | 18:29.29 |
| since, presumably, we are starting with the same font, position size and strings | 18:30.18 |
Robin_Watts | MuPDF pulls stuff out to a well defined structure of chars/spans/lines/blocks. | 18:35.22 |
| and then operates on that. | 18:35.27 |
| If gs pulled out to the same format, the same code could be used. | 18:35.45 |
| but then the first version of txtwrite was written before MuPDF, so it would have required ken to do lots of rewrites. | 18:36.12 |
rayjj | Robin_Watts: We'll have to see if kens can be convinced :-) | 18:36.20 |
Robin_Watts | I hate the heuristic code anyway. | 18:36.39 |
rayjj | heuristics are rarely "clean" | 18:37.09 |
| at least you didn't start with a neural network -- those are *TOTALLY* unmaintainable | 18:38.21 |
rayjj | hasn't looked at the mupdf code | 18:39.19 |
aksr | Robin_Watts: I need to use mudraw, right? | 18:41.57 |
Robin_Watts | Yes. | 18:42.23 |
| Unless you want to work at the C level yourself. | 18:42.30 |
aksr | mudraw -t -t input.pdf | 18:42.37 |
| Not atm. | 18:42.48 |
| Robin_Watts: It works. | 18:50.41 |
| :) | 18:50.45 |
Robin_Watts | fab. | 18:50.55 |
aksr | Thank you. | 18:51.22 |
| ;) | 18:51.24 |
| Robin_Watts: Still it bugs me: how come there isn't any need for something like this? | 18:51.57 |
Robin_Watts | PDF was not designed for this to be an easy thing to do. | 18:53.09 |
aksr | Still, if there's a need... | 18:54.34 |
Robin_Watts | Is there a need? | 18:55.06 |
| If the information is not there in the file, we can't return it (reliably) | 18:55.29 |
aksr | That's understandable. | 18:56.03 |
| 'Was wondering in general. | 18:56.19 |
mvrhel_laptop | bbiaw | 19:18.04 |
rayjj | hurray! Hit the ball back to cust 532. Now I can play with the Company M board :-) | 20:40.30 |