| | 20180313 |
sh4rm4^bnc | could i request a release tarball for mujs 1.0.2 being added to the mujs download page ? thanks | 05:18.32 |
velix | Can mupdf keep/fix the bookmarks when splitting a file? Ghostscript can do this, but (as I discussed with kens on #ghostscript) GS isn't the right tool for this. | 14:52.53 |
kens | velix : we've got a meeting going on on another channel right now, hold on for a bit | 14:58.31 |
velix | Nooo, now :))) | 15:00.16 |
| kens: no problem, much fun. | 15:00.24 |
kens | Think we're done now, you'll need to ask tor8 about that stuff | 15:00.44 |
| I don't *think* MuPDF does anything about it, but I could easily be wrong | 15:01.00 |
tor8 | velix: I can't promise, easiest to just try it and see if it does. | 15:05.16 |
| it's been a while since I worked in that area of the code | 15:05.39 |
velix | Okay, I'll have a look. If it does, it would be nice to turn it on and off ;) | 15:06.28 |
| Splitting PDFs with bookmarks is nice sometimes, but takes ages. | 15:06.42 |
| I'll try it soon on a huge mailmerge. | 15:06.52 |
| 8650 pages. | 15:06.55 |
Hufokus | Hello! How should I scale pdf "on fly"? The only way I see is perform pix = fz_new_pixmap_from_page_number, then idev = fz_new_draw_device( pix ), then fz_run_page( idev, &ctm ) , but that way looks kinda slow | 16:40.38 |
tor8 | Hufokus: define "on fly" please :) | 16:43.29 |
sh4rm4^bnc | thanks for the new tarballs! | 16:43.47 |
tor8 | fz_new_pixmap_from_page_number is just a convenient wrapper for fz_new_draw_device and fz_run_page | 16:43.59 |
| sh4rm4^bnc: no worries. | 16:44.03 |
| Hufokus: if you want to redraw the same page multiple times efficiently, it is better to do it in two steps. | 16:44.42 |
| 1) fz_new_display_list_from_page_number, and then fz_new_pixmap_from_display_list. | 16:44.57 |
| where you can set the CTM you pass to fz_new_pixmap_from_display_list to the zoom value you want | 16:45.29 |
Hufokus | tor8, thanks. But what should I do if I want to previously populate pixmap with background values. I can't use pixmap, cos it's not created with new, while fz_new_pixmap_from_display_list immediately fills it with image data | 17:03.33 |
tor8 | Hufokus: then you need to create a draw device with your pixmap | 17:17.06 |
| Hufokus: you can create a new pixmap with fz_new_pixmap without drawing into it | 17:17.57 |
| Hufokus: fz_bound_page will get the page dimensions, so you can create one of the appropriate size | 17:18.22 |
| just take a look at the implementation of fz_new_pixmap_from_page in source/fitz/util.c | 17:19.17 |
| or fz_new_pixmap_from_display_list even | 17:19.49 |
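The sizing step tor8 describes (get the page bounds, then create a pixmap of the appropriate size for the zoom you want) comes down to a little arithmetic. The following is a minimal sketch in plain Python, not MuPDF code: `pixmap_size` is a hypothetical helper, the page rectangle is assumed to be given in points as (x0, y0, x1, y1), and rounding outward with floor/ceil is an assumption about how partially covered pixels should be handled.

```python
import math


def pixmap_size(page_bounds, zoom):
    """Return (width, height) in pixels for a page rectangle scaled by zoom."""
    x0, y0, x1, y1 = page_bounds
    # Scale every coordinate by the zoom factor (a pure scaling CTM).
    sx0, sy0, sx1, sy1 = (v * zoom for v in (x0, y0, x1, y1))
    # Round outward so partially covered pixels are still included.
    left, top = math.floor(sx0), math.floor(sy0)
    right, bottom = math.ceil(sx1), math.ceil(sy1)
    return right - left, bottom - top


if __name__ == "__main__":
    a4 = (0.0, 0.0, 595.0, 842.0)  # an A4 page in points
    print(pixmap_size(a4, 2.0))
```

A pixmap of that size can then be filled with whatever background is wanted before a draw device renders the page into it, which is the order of operations discussed above.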
sebras | tor8: the jbig2dec git is autoextracted from gs, right? | 17:21.41 |
| tor8: once a day? | 17:22.00 |
tor8 | sebras: yes. the script is run by cron once a day | 17:22.03 |
| the scripts are in /home/tor/bin/mirror-thirdparty.sh | 17:22.28 |
sebras | tor8: ok, then I'll wait until tomorrow to update jbig2dec so as to fix 699083 originating with oss-fuzz. | 17:22.31 |
tor8 | it's safer to wait, but it is possible to force running them early | 17:23.08 |
sebras | tor8: no rush. | 17:23.23 |
| tor8: Henry ACKed the patch so I figured I should at least get it in. | 17:24.15 |
Hufokus | tor8, so, no need of fz_run_page to populate pixmap, right? | 17:25.34 |
sebras | tor8: how do you envision form-filling to work? the Contents field in the annotation panel is used to edit the contents? I'm thinking of the PDF in 699026 that paul mentioned. | 17:27.03 |
tor8 | Hufokus: well, you need to call either fz_run_page or fz_run_display_list to get the rendering of the page into the pixmap | 17:27.15 |
| but if you just want a blank pixmap, there's no need to do that | 17:27.25 |
sebras | Hufokus: what are you building with mupdf? a new PDF viewer UI? | 17:28.20 |
tor8 | sebras: eventually some form of inline page editing | 17:28.54 |
| but first I need to reinstate the appearance stream updating for widgets and sort out the dirty states and event passing | 17:29.17 |
Hufokus | sebras, not exactly. A tool to form articles from PDF/A | 17:31.40 |
sebras | Hufokus: aha, are you working on it publicly somewhere? github? | 17:33.26 |
| tor8: so then the contents field is only meant for annotations proper (and not widgets)? | 17:35.18 |
tor8 | sebras: not sure yet. but I don't anticipate using the 'a' panel for form filling at all. | 17:35.42 |
Hufokus | Not yet, may be later. I'm right at the start. Just obtaining text from pdf text segments, later I wanna add some heuristics to autodefine segments order and so | 17:36.05 |
tor8 | Hufokus: you might want to take a look at 'mutool run' | 17:36.33 |
| https://mupdf.com/docs/manual-mutool-run.html | 17:36.46 |
| depending on how much control you need | 17:36.58 |
sebras | tor8: the bottom patch on sebras/ui is something you need. | 17:44.00 |
Hufokus | Thanks, I'll check this doc now | 17:44.06 |
sebras | needed to compile desktop java. | 17:44.15 |
tor8 | sebras: okay. | 17:44.29 |
Robin_Watts | tor8, sebras: http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=c081ca2008ecf7d503ce7710f09d19043bac5951 | 18:02.31 |
| mvrhel_laptop: So, do you have a proposed API for the halftone stuff? | 18:02.53 |
| (and possibly the GS changes to use it?) | 18:03.01 |
| Damn, wrong group. | 18:03.23 |
sebras | Robin_Watts: depending on zoom level acroread seems to draw these missing dots fatter than the other ones... | 18:11.49 |
Robin_Watts | sebras: yeah. They are all the 'leading' dots. | 18:12.09 |
| You can see the results in my bmpcmp area. | 18:12.18 |
sebras | mupdf draws them with equal size after your patch though. | 18:12.44 |
Robin_Watts | sebras: Yes. We are right, acrobat wrong :) | 18:13.03 |
sebras | Robin_Watts: why is it the case that this is only true for round line caps? | 18:29.46 |
tor8 | other line caps need an orientation to be drawn, so are not shown for zero-length lines | 18:30.08 |
Robin_Watts | how do you put not round caps on zero length lines? | 18:30.10 |
| s/not/non/ | 18:30.17 |
sebras | tor8: ok, makes sense. | 18:31.00 |
tor8 | Robin_Watts: though I think we could let the fz_stroke_lineto case handle the non-round caps and zero length lines | 18:31.42 |
| as it already does for non-dashed cases | 18:31.48 |
Robin_Watts | tor8: I'm open to neater ways of doing it. | 18:32.21 |
tor8 | what about other zero length dashes in the middle of the dash pattern? | 18:32.54 |
| I think we ought to be able to handle those too, not just initial ones | 18:33.04 |
Robin_Watts | The key thing here is that the first entry in the dash array is 0 on a move. | 18:33.10 |
| cos the move phase processing skips it, so it never gets done elsewhere. | 18:33.34 |
sebras | #if 1 | 18:34.04 |
| if (0 && dx == 0) | 18:34.05 |
| that was a neat piece of code. | 18:34.09 |
Robin_Watts | If you can construct an example that goes wrong with a different dash array, we should fix that too. | 18:34.32 |
tor8 | Robin_Watts: we get it wrong (compared to gs) with non-round line caps too with your patch | 18:38.25 |
Robin_Watts | we do? | 18:38.38 |
| Ah. maybe I see why. | 18:39.11 |
tor8 | but we do properly handle zero-width dashes except in the initial moveto | 18:39.35 |
| initial = (s->dash_list[s->offset] == 0); fixes it for non-round line caps | 18:41.02 |
Robin_Watts | I may have a nicer fix. | 18:41.14 |
tor8 | $ cat dash.txt | 18:41.21 |
| 0 J [0 10] 0 d 10 10 m 110 100 l s | 18:41.21 |
| 1 J [0 10] 0 d 20 10 m 120 100 l s | 18:41.21 |
| 1 J [] 0 d 30 10 m 130 100 l s | 18:41.21 |
| $ mutool create -o out.pdf dash.txt && mupdf out.pdf | 18:41.29 |
| ah, those should be capital S at the ends there | 18:42.14 |
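To see why a dash array such as `[0 10]` produces zero-length dashes, including one at the very start of the path, it can help to walk the pattern by hand. The following is a rough Python sketch of that walk and is not taken from the MuPDF sources: `dash_segments` is a hypothetical helper, dash entries are assumed to be non-negative, and the path is treated as a single straight run of the given length.

```python
import math


def dash_segments(length, pattern, phase=0.0):
    """Return (start, end) distances of the 'on' dashes along a path."""
    if not pattern or sum(pattern) <= 0:
        # No usable dash pattern: the whole path is one solid segment.
        return [(0.0, float(length))]
    segments = []
    pos = 0.0 - phase  # where the current pattern entry starts
    index = 0          # even entries are 'on', odd entries are 'off'
    while pos <= length:
        span = pattern[index % len(pattern)]
        start, end = pos, pos + span
        is_on = index % 2 == 0
        # Keep 'on' entries that reach into the path, including
        # zero-length ones that sit exactly on it.
        if is_on and (end > 0 or start >= 0):
            segments.append((max(start, 0.0), min(end, float(length))))
        pos = end
        index += 1
    return segments


if __name__ == "__main__":
    # Length of the first test line, from (10, 10) to (110, 100).
    line_length = math.hypot(100, 90)
    dots = dash_segments(line_length, [0, 10])
    print(len(dots), dots[:3])
```

Every entry returned for `[0 10]` has equal start and end, so whether anything is painted there depends entirely on how the stroker treats a zero-length dash with the current line cap, and the first one sits at distance 0, right where the initial move happens.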
Robin_Watts | tor8: yeah, my simpler fix matches acrobat. Just a tick... | 18:43.44 |
| tor8: Ok, so my new fix is: http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=3afca2bb825017a55aadb67beb1d3318ce5f5c56 | 19:08.17 |
| That gives 107 cluster diffs, where the last fix gave 126. | 19:08.19 |
velix | Can mupdf flatten a PDF's layers? | 19:44.30 |
kens | velix you need to define what you mean by 'layers' and 'flatten'. But this sounds more like something Ghostscript+pdfwrite would do, currently | 20:08.45 |
velix | kens: Remove the layers and make it on the base document only. | 20:09.09 |
| sorry, VERY bad English today. | 20:09.20 |
kens | Define 'layer' | 20:09.22 |
Robin_Watts | velix: There is no concept of "layer" in the PDF spec. | 20:09.33 |
| There is a concept of "Optional Content Groups", which are sometimes exposed as "layers". | 20:09.51 |
velix | Oh really? I can select them in Adobe Reader. It's called "Ebenen" in German, let me check the translation in English Reader, 1 sec | 20:09.57 |
Robin_Watts | velix: Right, that's OCGs. | 20:10.12 |
kens | Right but you can't convert optional content to the PDF, because the content depends which bits are 'on' | 20:10.22 |
Robin_Watts | You can (conceptually at least) "flatten" the OCGs to be normal content. | 20:10.49 |
kens | For a given set of current conditions, yes | 20:11.03 |
velix | Acrobat also calls it "layers": http://www.adobepress.com/articles/article.asp?p=1271791&seqNum=3 | 20:11.15 |
kens | But anything which is not currently rendered will be lost | 20:11.21 |
velix | Actually, "Acrobat Layers". | 20:11.26 |
kens | That's Adobe press puff, not a specification | 20:11.37 |
Robin_Watts | And mupdf contains the basis for support for doing that using the content filtering stuff. | 20:11.54 |
velix | kens: Yeah, but that's what end-users like me call it ;) | 20:12.00 |
Robin_Watts | but it will require some C coding on your part. | 20:12.04 |
velix | Oh :( | 20:12.09 |
kens | Which is confusing for technical people, because end-users call several different things 'layers' | 20:12.23 |
Robin_Watts | Or, as kens says, you can use gs, which is a more destructive approach. | 20:12.35 |
kens | I can't recall what GS+pdfwrite does with optional content right at the moment | 20:12.44 |
velix | Robin_Watts: I know, we discussed that already ;) | 20:12.46 |
kens | But if you want it 'flattened' then you are entering into a destructive operation anyway | 20:13.11 |
velix | Actually, I want to burn all the objects to the "base OCG" (?) which are on by default. | 20:13.19 |
| kens: Sure, but there might be the problems we talked about some hours ago (rounding issues etc.) | 20:13.37 |
kens | The rounding won't be a problem, except that some 're' operations might turn into paths. | 20:14.00 |
| There are no practical problems there | 20:14.13 |
| More to the point is the loss of things like optional content, I now recall that pdfwrite does not preserve optional content as OCGs | 20:14.43 |
| So if you run a PDF file with OCG through pdfwrite, the output will be what you want, 'flattened' | 20:15.05 |
velix | The main problem is, latest CorelDraw PDF export converts all Corel Layers into Acrobat Layers (OCGs). I don't want that. Sure, it's a Corel problem, but there's no way around (copying them to one layer is too much work with many PDFs in Corel). Sure, another work-around is a macro, but I'd love to do it on PDF level. | 20:15.34 |
kens | Well, like I said, pdfwrite will do that for you right now. In the future I intend to change that so that we can preserve optional content | 20:16.10 |
| But its a big job and there's not been a lot of call for it | 20:16.24 |
velix | okay ;) | 20:16.24 |
| Can I also *create* OCGs? ;) | 20:16.35 |
| wrong channel... | 20:16.45 |
kens | With pdfmark operations, I believe so | 20:16.46 |
velix | But I can't overlay one PDF to be on top of the other OCG ? | 20:17.13 |
kens | Currently I don't think you can do that with MuPDF. Maybe in the future | 20:17.16 |
| 'other' OCG ? | 20:17.31 |
velix | Sorry, I don't know OCGs ... what do you call the basic document and the first layer? | 20:17.51 |
kens | You can't process 2 PDF files with GS at all, I'm not so sure about MuPDF | 20:17.57 |
| There's the PDF file, and the PDF file contains optional content. | 20:18.17 |
| Some of that content will be displayed under different conditions | 20:18.29 |
| For example, some of it might be rendered only when printed | 20:18.40 |
velix | I've got an idea: I'll create a PDF in Corel, export and uncompress it. Then I can understand, what's going on :) | 20:19.06 |
kens | Hmm, I'd read the spec personally | 20:19.24 |
velix | kens = dev, velix = user | 20:19.35 |
| ;) | 20:19.35 |
kens | But by all means look at a simple example too | 20:19.38 |
| But do make sure its a *simple* example | 20:19.52 |
| These things rapidly get too complicated to understand easily | 20:20.06 |
velix | kens: I like rectangles, but Corel converts the rectangle into a closed path on PDF export :( | 20:20.14 |
kens | Well I expect pdfwrite will turn them into 're' operations, but in practice it doesn't make any difference | 20:20.46 |
velix | yeah, I know. | 20:21.03 |
| let me create a simple example | 20:21.09 |
| Where does PDF start counting coordinates? bottom left or upper right or upper left or ... ? | 20:22.00 |
kens | bottom left | 20:22.12 |
velix | okay | 20:22.21 |
kens | But.. you can translate the origin by modifyinTransformaion Matrixg the Current | 20:22.49 |
velix | kens: I'm coming from geodata. OGC (Open Geospatial Consortium) does some standardization in there. And sometimes it's upper right or upper left ;) | 20:23.02 |
kens | And change the direction of increasing x and y | 20:23.07 |
| I see my trackpad messed up my insertion point again | 20:24.28 |
| All co-ordinates in PDF and PostScript are converted from user space to device space using the Current Transformation Matrix | 20:25.08 |
| That can have the effect of translating the origin, rotating the axes, or flipping them | 20:25.33 |
| As well as altering scale | 20:25.41 |
| So it's not a simple matter to determine where on the page a given object lies. | 20:26.12 |
| You have to keep track of the CTM until you reach the point where it is rendered | 20:26.30 |
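The bookkeeping kens describes can be sketched in a few lines of Python. This is an illustration only, using the six-number `[a b c d e f]` matrix form from the PDF specification; the function names `concat` and `transform` and the sample numbers are made up for the example.

```python
def concat(m, n):
    """Return the matrix that applies m first, then n ([a b c d e f] form)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2)


def transform(m, x, y):
    """Map a user space point through matrix m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


if __name__ == "__main__":
    # Flip the y axis on an 842 point tall page: origin moves to the top left.
    ctm = (1, 0, 0, -1, 0, 842)
    # A later "1 0 0 1 100 50 cm" is applied before the existing CTM.
    ctm = concat((1, 0, 0, 1, 100, 50), ctm)
    print(transform(ctm, 0, 0))
```

With the flipped matrix in effect, the point (0, 0) in the content stream ends up near the top of the page rather than at the bottom left, which is why the coordinates alone do not say where an object lies.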
velix | here we go (uncompressed): http://ge.tt/6d9L8zo2 | 20:26.32 |
| eeeh sorry, directly out of corel | 20:26.43 |
kens | content is still ASCII85 encoded | 20:28.42 |
velix | yeah, sorry :D | 20:29.03 |
kens | But you can see the three OCGs | 20:29.10 |
| And the OCProperties dictionary | 20:29.23 |
velix | Yes. now I understand how it works. | 20:29.36 |
| I've unpacked it with qpdf (didn't try it with mutool now): http://ge.tt/6d9L8zo2 | 20:29.52 |
| Filename is layers_qpdf.pdf | 20:29.58 |
| It's so easy... Didn't know that. | 20:30.27 |
| Ahh, flattening is easy: remove "OCProperties" and fix PDF. | 20:31.19 |
| Is there a PDF version, which doesn't support OCGs? | 20:32.05 |
kens | Yes, but I can't remember | 20:32.18 |
velix | kens: no problem ;) | 20:32.37 |
| Corel also has rounding issues... the rectangles were 150 pt. | 20:32.48 |
| Corel makes 149.9998 out of it. | 20:32.52 |
| 150.0001 49.9997 l | 20:32.58 |
| naaarf ... | 20:33.00 |
| Let me print it through Ghostscript's PS printer | 20:33.34 |
kens | Probably why its emitting paths rather than re then | 20:33.40 |
| Why would you want to run it through ps2write ? | 20:33.57 |
velix | kens: just playing around | 20:34.23 |
kens | Oh.... | 20:34.38 |
velix | No, same problem. | 20:34.44 |
| 150.00009 49.99975 L | 20:34.52 |
kens | Well we do try to preserve the input as much as possible | 20:34.58 |
velix | Then it's Corel... | 20:35.07 |
kens | Obviously our accuracy is higher :-) | 20:35.09 |
| Our rounding only takes place if there's an actual calculation involved | 20:35.33 |
velix | Inkscape is much more accurate (and can read Corel Files). | 20:35.37 |
| kens: Yeah, like geographic coordinate calculations with Proj ;) | 20:35.48 |
| kens: Sorry, I'm a Geographer. | 20:35.52 |
kens | knows nothing about geospatial stuff | 20:36.04 |
| Normally we will put whatever was in the input in the output | 20:36.20 |
velix | kens: geospatial PDFs are nice. | 20:36.23 |
| But it's under patent, I think. | 20:36.46 |
kens | If we have to perform a calculation (patterns and forms perhaps I think) then we might end up hitting a rounding error. | 20:36.54 |
| I think we use doubles to do the math, but still | 20:37.04 |
velix | Corel can export EPS only, but: 150.00009 49.99975 | 20:37.21 |
kens | PDF has geospatial 'things' in it in PDF 2.0 | 20:37.26 |
| Yes, the error there is clearly in Corel itself | 20:37.39 |
velix | I like it. Adobe even has a coordinate and map measuring tool. | 20:37.44 |
| Hmm. I just realized, using gs to convert the EPS to PDF creates a wrong document size. | 20:40.35 |
| Shall we switch to #ghostscript ? | 20:40.52 |
kens | EPS doesn't include a media size request | 20:40.53 |
velix | ohhh | 20:40.58 |
| You're filling my notebook pretty fast ;) | 20:41.16 |
kens | It has comments which GS can process, use -dEPSCrop | 20:41.17 |
| EPS is intended to be included 'as is' in another PostScript program, so you can't ask for media. But the enclosing program needs to know how large the content is, so the information is included in specially crafted comments | 20:42.22 |
velix | hmm, works top, left, right, but has a white border at the bottom. | 20:42.24 |
| Yeah, I'm using EPS to place vectors in MS Word. | 20:42.38 |
kens | DSC comments (Document Structure Convention) | 20:42.39 |
velix | It works perfectly. | 20:42.41 |
kens | DSC parsers (and GS includes one) can use that information | 20:42.57 |
velix | Word has a beautiful EPS parser. They've licensed it from an external corporation. | 20:43.09 |
kens | But obviously you have to tell GS to use it, or it would try to process EPS files included in PS programs, with disastrous results | 20:43.28 |
velix | ahhh, it works now! | 20:43.57 |
kens | seriously doubts Word has an EPS parser | 20:44.02 |
| DSC parser possibly | 20:44.07 |
velix | is a user and I call it EPS parser ;) | 20:44.17 |
| I had to set Corel to "page limits" on export PDF and now -dEPSCrop gives perfect results. | 20:44.37 |
| export EPS, sorry. | 20:44.51 |
kens | Oh right. | 20:44.58 |
| An EPS parser would read EPS programs, and would need to be a full PostScript interpreter. DSC comments, however, are very easy to read | 20:45.56 |
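As a rough illustration of how little is needed to read the size information from the DSC comments, here is a small Python sketch that looks for an integer `%%BoundingBox:` line. The helper name `find_bounding_box` and the sample EPS text are invented for the example, and real files can carry further variants (such as `%%HiResBoundingBox`) that this sketch does not handle.

```python
import re

# Matches "%%BoundingBox: llx lly urx ury" with integer coordinates.
_BBOX = re.compile(
    r"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s*$"
)


def find_bounding_box(eps_text):
    """Return (llx, lly, urx, ury) from the first usable comment, or None."""
    for line in eps_text.splitlines():
        match = _BBOX.match(line)
        if match:
            return tuple(int(group) for group in match.groups())
    # A "%%BoundingBox: (atend)" line never matches, so the scan simply
    # continues to the real values given later in the file.
    return None


if __name__ == "__main__":
    sample = "\n".join([
        "%!PS-Adobe-3.0 EPSF-3.0",
        "%%BoundingBox: 0 0 150 50",
        "%%EndComments",
        "0 0 150 50 rectfill",
        "%%EOF",
    ])
    llx, lly, urx, ury = find_bounding_box(sample)
    print(urx - llx, ury - lly)
```

Nothing here executes any PostScript; the program body is skipped entirely, which is the difference between reading the comments and interpreting the file.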
velix | https://support.office.com/en-us/article/support-for-eps-images-has-been-turned-off-in-office-a069d664-4bcf-415e-a1b5-cbb0c334a840 | 20:46.53 |
| naaaaaarf | 20:46.56 |
| Instead of fixing, they disable it. | 20:47.29 |
kens | Well, that's pretty crap | 20:47.42 |
velix | I bet the license expired or anything. | 20:47.44 |
| EMF isn't an alternative. | 20:47.48 |
kens | But about what I would expect from MS | 20:47.52 |
velix | "Similarly, you can use an online conversion tool such as CloudConvert.com or Convertio.co to convert an EPS file to EMF or SVG." :) | 20:48.04 |
kens | No you can't. | 20:48.15 |
velix | Sure, I'll upload my images in copyright to a cloud service. | 20:48.16 |
kens | You can make something not quite entirely unlike the EPS you started with | 20:48.35 |
| SVG can | 20:49.01 |
| SVG doesn't handle text and fonts well | 20:49.13 |
velix | kens: It was even possible to put CMYK EPS into a word document. When printing it with ghostscript and distilling it with CMYK settings, the EPS (I mean the drawing) still had the correct CMYK values! | 20:49.15 |
kens | Yes, of course, that's how EPS is meant to work | 20:49.33 |
| Its a 'black box' | 20:49.42 |
velix | Oh: "If you perform the change to the registry, you will be able to insert EPS files in the application on which you have applied the registry change. The EPS files will be automatically converted to EMF, saved, and visible in the saved document, even by people who haven't performed the registry change." | 20:50.19 |
kens | I can't really see what vulnerabilities there could be in Word. I can understand that the interpreter can have a problem, but Word shouldn't be reading anything except the comments | 20:50.29 |
velix | "In the case of EPS files, this message means that Office has turned off the ability to insert EPS files, because we think the vulnerability to malicious attacks is too great." | 20:50.52 |
| That sounds like: "We don't want to pay the import DLL anymore". | 20:51.04 |
kens | Well, if Word is converting it to an EMF, they must have a PostScript interpreter | 20:51.06 |
| No way to do it otherwise | 20:51.24 |
velix | kens: I'm still on Office 2003, the best office around (in my eyes). | 20:51.38 |
| I wrote some hundred research pages the last years with it. | 20:51.52 |
| Let me start Office 2010 on business laptop. 1 sec | 20:52.13 |
kens | So either they have a full PostScript interpreter in there still, or they will do a *really* crap job on converting it | 20:52.19 |
| OTOH Word was never really a professional layout application | 20:52.40 |
| Microsoft has never understood publishing | 20:52.52 |
velix | Microsoft has "Publisher ";) | 20:53.00 |
kens | Yes, a more ineptly named product its hard to think of | 20:53.21 |
velix | hehe | 20:53.29 |
| Microsoft TeX ;) | 20:53.35 |
kens | Its going to be interesting hearing the wails of complaint from people whose EPS logos will no longer print in the corporate colour after they convert it to EMF | 20:54.38 |
velix | Yeah, I can't drag and drop EPS anymore. | 20:54.40 |
| let me hack registry. | 20:54.53 |
| IT gave me local administrator :D | 20:54.59 |
kens | is my own administrator | 20:55.47 |
velix | Ah, here is some background: https://support.microsoft.com/en-us/help/2479871/security-settings-for-graphic-filters-for-microsoft-office-365 | 20:56.04 |
| Wait what? bmp, gif, jpg, pict and png is activated by default? what about TIFF ? | 20:56.56 |
kens | Doesn't say much | 20:57.00 |
| Obviously you should stop trying to use professional formats, just stick with the consumer ones | 20:58.00 |
| Right off to watch TV | 20:59.00 |
velix | ok | 21:01.31 |
| What's the use of "mutool show"? Here, it shows only 1 object. I thought, it should show them all? | 22:53.43 |
| Interesting: mutool clean -s wipes the content off the OCG layers to the "base layer", but still keeps the layers (which are empty now). | 23:06.16 |
tor8 | velix: by default mutool show shows the trailer object | 23:07.22 |
| give it some object numbers as arguments to show other things | 23:07.33 |
velix | tor8: ok | 23:07.44 |
| ahh | 23:07.48 |
| tor8: What does -s "Rewrite content streams." do? | 23:09.49 |
| Hmm, why are those empty objects 5, 6, 7 not cleaned? https://bpaste.net/show/a6b3d7eab96c | 23:16.52 |
| ah okay, it gets wiped with -s | 23:18.44 |
| but this destroys my layers. | 23:18.56 |
| I think this is a bug. | 23:20.49 |
| Anyone in here to talk about the problem? | 23:27.26 |
| I've found several bugs :( | 23:48.13 |