Log of #mupdf at irc.freenode.net.

20180313
sh4rm4^bnc could I request that a release tarball for mujs 1.0.2 be added to the mujs download page? thanks05:18.32 
velix Can mupdf keep/fix the bookmarks when splitting a file? Ghostscript can do this, but (as I discussed with kens on #ghostscript) GS isn't the right tool for this.14:52.53 
kens velix : we've got a meeting going on on another channel right now, hold on for a bit14:58.31 
velix Nooo, now :)))15:00.16 
  kens: no problem, much fun.15:00.24 
kens Think we're done now, you'll need to ask tor8 about that stuff15:00.44 
  I don't *think* MuPDF does anything about it, but I could easily be wrong15:01.00 
tor8 velix: I can't promise, easiest to just try it and see if it does.15:05.16 
  it's been a while since I worked in that area of the code15:05.39 
velix Okay, I'll have a look. If it does, it would be nice to turn it on and off ;)15:06.28 
  Splitting PDFs with bookmarks is nice sometimes, but takes ages.15:06.42 
  I'll try it soon on a huge mailmerge.15:06.52 
  8650 pages.15:06.55 
Hufokus Hello! How should I scale pdf "on fly"? The only way I see is to perform pix = fz_new_pixmap_from_page_number, then idev = fz_new_draw_device( pix ), then fz_run_page( idev, &ctm ), but that way looks kinda slow16:40.38 
tor8 Hufokus: define "on fly" please :)16:43.29 
sh4rm4^bnc thanks for the new tarballs!16:43.47 
tor8 fz_new_pixmap_from_page_number is just a convenient wrapper for fz_new_draw_device and fz_run_page16:43.59 
  sh4rm4^bnc: no worries.16:44.03 
  Hufokus: if you want to redraw the same page multiple times efficiently, it is better to do it in two steps.16:44.42 
  1) fz_new_display_list_from_page_number, and then fz_new_pixmap_from_display_list.16:44.57 
  where you can change the CTM passed to fz_new_pixmap_from_display_list to the zoom value you want16:45.29 
Hufokus tor8, thanks. But what should I do if I want to previously populate pixmap with background values. I can't use pixmap, cos it's not created with new, while fz_new_pixmap_from_display_list immediately fills it with image data17:03.33 
tor8 Hufokus: then you need to create a draw device with your pixmap17:17.06 
  Hufokus: you can create a new pixmap with fz_new_pixmap without drawing into it17:17.57 
  Hufokus: fz_bound_page will get the page dimensions, so you can create one of the appropriate size17:18.22 
  just take a look at the implementation of fz_new_pixmap_from_page in source/fitz/util.c17:19.17 
  or fz_new_pixmap_from_display_list even17:19.49 
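A minimal Python sketch of the sizing step tor8 describes: take the page bounds (the rectangle fz_bound_page would report), apply a zoom-only CTM, and round outward to whole pixels so the blank pixmap is large enough. The helper names and the plain floor/ceil rounding are assumptions for illustration, not MuPDF's actual code; source/fitz/util.c remains the reference.

```python
import math

def transform_point(matrix, x, y):
    """Apply a PDF-style matrix (a, b, c, d, e, f) to the point (x, y)."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)

def pixmap_size(page_rect, zoom):
    """Return (width, height) in pixels for page_rect = (x0, y0, x1, y1) at zoom."""
    x0, y0, x1, y1 = page_rect
    ctm = (zoom, 0, 0, zoom, 0, 0)  # pure scale, no rotation or translation
    corners = [transform_point(ctm, x, y) for x in (x0, x1) for y in (y0, y1)]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    # Round outward so the whole transformed page fits inside the pixmap.
    width = math.ceil(max(xs)) - math.floor(min(xs))
    height = math.ceil(max(ys)) - math.floor(min(ys))
    return width, height

print(pixmap_size((0, 0, 612, 792), 2.0))  # US Letter at 200% -> (1224, 1584)
```

A pixmap of that size can be filled with background values first and then handed to a draw device, as described above.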
sebras tor8: the jbig2dec git is autoextracted from gs, right?17:21.41 
  tor8: once a day?17:22.00 
tor8 sebras: yes. the script is run by cron once a day17:22.03 
  the scripts are in /home/tor/bin/mirror-thirdparty.sh17:22.28 
sebras tor8: ok, then I'll wait until tomorrow to update jbig2dec so as to fix 699083 originating with oss-fuzz.17:22.31 
tor8 it's safer to wait, but it is possible to force running them early17:23.08 
sebras tor8: no rush.17:23.23 
  tor8: Henry ACKed the patch so I figured I should at least get it in.17:24.15 
Hufokus tor8, so, no need of fz_run_page to populate pixmap, right?17:25.34 
sebras tor8: how do you envision form-filling to work? the Contents field in the annotation panel is used to edit the contents? I'm thinking of the PDF in 699026 that paul mentioned.17:27.03 
tor8 Hufokus: well, you need to call either fz_run_page or fz_run_display_list to get the rendering of the page into the pixmap17:27.15 
  but if you just want a blank pixmap, there's no need to do that17:27.25 
sebras Hufokus: what are you building with mupdf? a new PDF viewer UI?17:28.20 
tor8 sebras: eventually some form of inline page editing17:28.54 
  but first I need to reinstate the appearance stream updating for widgets and sort out the dirty states and event passing17:29.17 
Hufokus sebras, not exactly. A tool to form articles from PDF/A17:31.40 
sebras Hufokus: aha, are you working on it publicly somewhere? github?17:33.26 
  tor8: so then the contents field is only meant for annotations proper (and not widgets)?17:35.18 
tor8 sebras: not sure yet. but I don't anticipate using the 'a' panel for form filling at all.17:35.42 
Hufokus Not yet, maybe later. I'm right at the start. Just obtaining text from pdf text segments, later I wanna add some heuristics to autodefine segments order and so on17:36.05 
tor8 Hufokus: you might want to take a look at 'mutool run'17:36.33 
  https://mupdf.com/docs/manual-mutool-run.html17:36.46 
  depending on how much control you need17:36.58 
sebras tor8: the bottom patch on sebras/ui is something you need.17:44.00 
Hufokus Thanks, I'll check this doc now17:44.06 
sebras needed to compile desktop java.17:44.15 
tor8 sebras: okay.17:44.29 
Robin_Watts tor8, sebras: http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=c081ca2008ecf7d503ce7710f09d19043bac595118:02.31 
  mvrhel_laptop: So, do you have a proposed API for the halftone stuff?18:02.53 
  (and possibly the GS changes to use it?)18:03.01 
  Damn, wrong group.18:03.23 
sebras Robin_Watts: depending on zoom level acroread seems to draw these missing dots fatter than the other ones...18:11.49 
Robin_Watts sebras: yeah. They are all the 'leading' dots.18:12.09 
  You can see the results in my bmpcmp area.18:12.18 
sebras mupdf draws them with equal size after your patch though.18:12.44 
Robin_Watts sebras: Yes. We are right, acrobat wrong :)18:13.03 
sebras Robin_Watts: why is it the case that this is only true for round line caps?18:29.46 
tor8 other line caps need an orientation to be drawn, so are not shown for zero-length lines18:30.08 
Robin_Watts how do you put not round caps on zero length lines?18:30.10 
  s/not/non/18:30.17 
sebras tor8: ok, makes sense.18:31.00 
tor8 Robin_Watts: though I think we could let the fz_stroke_lineto case handle the non-round caps and zero length lines18:31.42 
  as it already does for non-dashed cases18:31.48 
Robin_Watts tor8: I'm open to neater ways of doing it.18:32.21 
tor8 what about other zero length dashes in the middle of the dash pattern?18:32.54 
  I think we ought to be able to handle those too, not just initial ones18:33.04 
Robin_Watts The key thing here is that the first entry in the dash array is 0 on a move.18:33.10 
  cos the move phase processing skips it, so it never gets done elsewhere.18:33.34 
sebras #if 118:34.04 
  if (0 && dx == 0)18:34.05 
  that was a neat piece of code.18:34.09 
Robin_Watts If you can construct an example that goes wrong with a different dash array, we should fix that too.18:34.32 
tor8 Robin_Watts: we get it wrong (compared to gs) with non-round line caps too with your patch18:38.25 
Robin_Watts we do?18:38.38 
  Ah. maybe I see why.18:39.11 
tor8 but we do properly handle zero-width dashes except in the initial moveto18:39.35 
  initial = (s->dash_list[s->offset] == 0); fixes it for non-round line caps18:41.02 
Robin_Watts I may have a nicer fix.18:41.14 
tor8 $ cat dash.txt18:41.21 
  0 J [0 10] 0 d 10 10 m 110 100 l s18:41.21 
  1 J [0 10] 0 d 20 10 m 120 100 l s18:41.21 
  1 J [] 0 d 30 10 m 130 100 l s18:41.21 
  $ mutool create -o out.pdf dash.txt && mupdf out.pdf18:41.29 
  ah, those should be capital S at the ends there18:42.14 
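To see why a dash array such as [0 10] matters, here is a small Python sketch that walks a line of a given length and lists the "on" intervals of the dash pattern. It is an illustration under assumptions (the function name and interval representation are made up here) and is not MuPDF's stroker. The point is that a leading zero entry yields zero-length dashes, starting at position 0, which only become visible dots when the line cap gives them some extent.

```python
def dash_segments(dash_array, phase, length):
    """Return the (start, end) 'on' intervals along a line of the given length."""
    if not dash_array or sum(dash_array) <= 0:
        return [(0.0, float(length))]  # no usable pattern: solid line
    length = float(length)
    phase = float(phase)
    index, on = 0, True
    remaining = float(dash_array[0])
    # Consume the phase. Note the 'phase > 0' guard: with phase 0 a leading
    # zero-length entry must not be skipped, or the first dot is lost.
    while phase > 0 and phase >= remaining:
        phase -= remaining
        index = (index + 1) % len(dash_array)
        on = not on
        remaining = float(dash_array[index])
    remaining -= phase
    segments = []
    pos = 0.0
    while pos <= length:
        end = min(pos + remaining, length)
        # Keep real dashes and genuine zero-length dashes (dots).
        if on and (end > pos or remaining == 0):
            segments.append((pos, end))
        if pos + remaining > length:
            break
        pos += remaining
        index = (index + 1) % len(dash_array)
        on = not on
        remaining = float(dash_array[index])
    return segments

print(dash_segments([0, 10], 0, 25))  # [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
```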
Robin_Watts tor8: yeah, my simpler fix matches acrobat. Just a tick...18:43.44 
  tor8: Ok, so my new fix is: http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=3afca2bb825017a55aadb67beb1d3318ce5f5c5619:08.17 
  That gives 107 cluster diffs, where the last fix gave 126.19:08.19 
velix Can mupdf flatten a PDF's layers?19:44.30 
kens velix you need to define what you mean by 'layers' and 'flatten'. But this sounds more like something Ghostscript+pdfwrite would do, currently20:08.45 
velix kens: Remove the layers and make it on the base document only.20:09.09 
  sorry, VERY bad English today.20:09.20 
kens Define 'layer'20:09.22 
Robin_Watts velix: There is no concept of "layer" in the PDF spec.20:09.33 
  There is a concept of "Optional Content Groups", which are sometimes exposed as "layers".20:09.51 
velix Oh really? I can select them in Adobe Reader. It's called "Ebenen" in German, let me check the translation in English Reader, 1 sec20:09.57 
Robin_Watts velix: Right, that's OCGs.20:10.12 
kens Right but you can't convert optional content to the PDF, because the content depends which bits are 'on'20:10.22 
Robin_Watts You can (conceptually at least) "flatten" the OCGs to be normal content.20:10.49 
kens For a given set of current conditions, yes20:11.03 
velix Acrobat also calls it "layers": http://www.adobepress.com/articles/article.asp?p=1271791&seqNum=320:11.15 
kens But anything which is not currently rendered will be lost20:11.21 
velix Actually, "Acrobat Layers".20:11.26 
kens That's Adobe press puff, not a specification20:11.37 
Robin_Watts And mupdf contains the basis for support for doing that using the content filtering stuff.20:11.54 
velix kens: Yeah, but that's what end-users like me call it ;)20:12.00 
Robin_Watts but it will require some C coding on your part.20:12.04 
velix Oh :(20:12.09 
kens Which is confusing for technical people, because end-users call several different things 'layers'20:12.23 
Robin_Watts Or, as kens says, you can use gs, which is a more destructive approach.20:12.35 
kens I can't recall what GS+pdfwrite does with optional content right at the moment20:12.44 
velix Robin_Watts: I know, we discussed that already ;)20:12.46 
kens But if you want it 'flattened' then you are entering into a destructive operation anyway20:13.11 
velix Actually, I want to burn all the objects to the "base OCG" (?) which are on by default.20:13.19 
  kens: Sure, but there might be the problems we talked about some hours ago (rounding issues etc.)20:13.37 
kens The rounding won't be a problem, except that some 're' operations might turn into paths.20:14.00 
  There are no practical problems there20:14.13 
  More to the point is the loss of things like optional content, I now recall that pdfwrite does not preserve optional content as OCGs20:14.43 
  So if you run a PDF file with OCG through pdfwrite, the output will be what you want, 'flattened'20:15.05 
velix The main problem is, latest CorelDraw PDF export converts all Corel Layers into Acrobat Layers (OCGs). I don't want that. Sure, it's a Corel problem, but there's no way around (copying them to one layer is too much work with many PDFs in Corel). Sure, another work-around is a macro, but I'd love to do it on PDF level.20:15.34 
kens Well, like I said, pdfwrite will do that for you right now. In the future I intend to change that so that we can preserve optional content20:16.10 
  But its a big job and there's not been a lot of call for it20:16.24 
velix okay ;)20:16.24 
  Can I also *create* OCGs? ;)20:16.35 
  wrong channel...20:16.45 
kens With pdfmark operations, I believe so20:16.46 
velix But I can't overlay one PDF to be on top of the other OCG ?20:17.13 
kens Currently I don't think you can do that with MuPDF. Maybe in the future20:17.16 
  'other' OCG ?20:17.31 
velix Sorry, I don't know OCGs ... what do you call the basic document and the first layer?20:17.51 
kens You can't process 2 PDF files with GS at all, I'm not so sure about MuPDF20:17.57 
  There's the PDF file, and the PDF file contains optional content.20:18.17 
  Some of that content will be displayed under different conditions20:18.29 
  For example, some of it might be rendered only when printed20:18.40 
velix I've got an idea: I'll create a PDF in Corel, export and uncompress it. Then I can understand, what's going on :)20:19.06 
kens Hmm, I'd read the spec personally20:19.24 
velix kens = dev, velix = user20:19.35 
  ;)20:19.35 
kens But by all means look at a simple example too20:19.38 
  But do make sure its a *simple* example20:19.52 
  These things rapidly get too complicated to understand easily20:20.06 
velix kens: I like rectangles, but Corel converts the rectangle into a closed path on PDF export :(20:20.14 
kens Well I expect pdfwrite will turn them into 're' operations, but in practice it doesn't make any difference20:20.46 
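For reference, the PDF `re` operator is shorthand for a closed four-sided path, which is why the two forms make no practical difference when rendered. A small Python sketch of the expansion follows; the helper name and the number formatting are hypothetical choices made for this example.

```python
def re_to_path(x, y, w, h):
    """Expand 'x y w h re' into the equivalent m/l/h path operators."""
    def fmt(value):
        # Up to 10 significant digits, without trailing zeros.
        return format(value, ".10g")
    return [
        f"{fmt(x)} {fmt(y)} m",          # move to the lower-left corner
        f"{fmt(x + w)} {fmt(y)} l",      # bottom edge
        f"{fmt(x + w)} {fmt(y + h)} l",  # right edge
        f"{fmt(x)} {fmt(y + h)} l",      # top edge
        "h",                             # close the path (left edge)
    ]

print("\n".join(re_to_path(0, 0, 150, 50)))
```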
velix yeah, I know.20:21.03 
  let me create a simple example20:21.09 
  Where does PDF start counting coordinates? bottom left or upper right or upper left or ... ?20:22.00 
kens bottom left20:22.12 
velix okay20:22.21 
kens But.. you can translate the origin by modifyinTransformaion Matrixg the Current20:22.49 
velix kens: I'm coming from geodata. OGC (Open Geospatial Consortium) does some standarization in there. And sometimes it's upper right or upper left ;)20:23.02 
kens And change the direction of increasing x and y20:23.07 
  I see my trackpad messed up my insertion point again20:24.28 
  All co-ordinates in PDF and PostScript are converted from user space to device space using the Current Transformation Matrix20:25.08 
  That can have the effect of translating the origin, rotating the axes, or flipping them20:25.33 
  As well as altering scale20:25.41 
  So it's not a simple matter to determine where on the page a given object lies.20:26.12 
  You have to keep track of the CTM until you reach the point where it is rendered20:26.30 
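A minimal Python sketch of the bookkeeping kens describes, assuming the usual six-number PDF matrix [a b c d e f]: each `cm` operator premultiplies the CTM, and a point is mapped through whatever CTM is in force when it is painted. The function names are made up for this example, and the 792 pt page height is just a sample value.

```python
def concat(m, ctm):
    """Return m x ctm: m is applied first, then the existing CTM (as 'cm' does)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )

def transform_point(matrix, x, y):
    """Map (x, y) through the matrix (a, b, c, d, e, f)."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)

# Start from a CTM that flips the y axis for a 792 pt tall page,
# so the origin effectively moves from bottom left to top left.
flip = (1, 0, 0, -1, 0, 792)
print(transform_point(flip, 10, 10))  # (10, 782)

# A later '2 0 0 2 0 0 cm' scales user space before the flip is applied.
ctm = concat((2, 0, 0, 2, 0, 0), flip)
print(transform_point(ctm, 10, 10))   # (20, 772)
```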
velix here we go (uncompressed): http://ge.tt/6d9L8zo220:26.32 
  eeeh sorry, directly out of corel20:26.43 
kens content is still ASCII85 encoded20:28.42 
velix yeah, sorry :D20:29.03 
kens But you can see the three OCGs20:29.10 
  And the OCproperties dictionary20:29.23 
velix Yes. now I understand how it works.20:29.36 
  I've unpacked it with qpdf (didn't try it with mutools now): http://ge.tt/6d9L8zo220:29.52 
  Filename is layers_qpdf.pdf20:29.58 
  It's so easy... Didn't know that.20:30.27 
  Ahh, flattening is easy: remove "OCProperties" and fix PDF.20:31.19 
  Is there a PDF version, which doesn't support OCGs?20:32.05 
kens Yes, but I can't remember20:32.18 
velix kens: no problem ;)20:32.37 
  Corel also has rounding issues... the rectangles were 150 pt.20:32.48 
  Corel makes 149.9998 out of it.20:32.52 
  150.0001 49.9997 l20:32.58 
  naaarf ...20:33.00 
  Let me print it through Ghostscript's PS printer20:33.34 
kens Probably why its emitting paths rather than re then20:33.40 
  Why would you want to run it through ps2write ?20:33.57 
velix kens: just playing around20:34.23 
kens Oh....20:34.38 
velix No, same problem.20:34.44 
  150.00009 49.99975 L20:34.52 
kens Well we do try to preserve the input as much as possible20:34.58 
velix Then it's Corel...20:35.07 
kens Obviously our accuracy is higher :-)20:35.09 
  Our rounding only takes place if there's an actual calculation involved20:35.33 
velix Inkscape is much more accurate (and can read Corel Files).20:35.37 
  kens: Yeah, like geographic coordinate calculations with Proj ;)20:35.48 
  kens: Sorry, I'm a Geographer.20:35.52 
kens knows nothing about geospatial stuff20:36.04 
  Normally we will put whatever was in the input in the output20:36.20 
velix kens: geospatial PDFs are nice.20:36.23 
  But it's under patent, I think.20:36.46 
kens If we have to perform a calculation (patterns and forms perhaps I think) then we might end up hitting a rounding error.20:36.54 
  I think we use doubles to do the math, but still20:37.04 
velix Corel can export EPS only, but: 150.00009 49.9997520:37.21 
kens PDF has geospatial 'things' in it in PDF 2.020:37.26 
  Yes, the error there is clearly in Corel itself20:37.39 
velix I like it. Adobe even has a coordinate and map measuring tool.20:37.44 
  Hmm. I just realized, using gs to convert the EPS to PDF creates a wrong document size.20:40.35 
  Shall we switch to #ghostscript ?20:40.52 
kens EPS doesn't include a media size request20:40.53 
velix ohhh20:40.58 
  You're filling my notebook pretty fast ;)20:41.16 
kens It has comments which GS can process, use -dEPSCrop20:41.17 
  EPS is intended to be included 'as is' in another PostScript program, so you can't ask for media. But the enclosing program needs to know how large the content is, so the information is included in specially crafted comments20:42.22 
velix hmm, works top, left, right, but has a white border at the bottom.20:42.24 
  Yeah, I'm using EPS to place vectors in MS Word.20:42.38 
kens DSC comments (Document Structure Convention)20:42.39 
velix It works perfectly.20:42.41 
kens DSC parsers (and GS includes one) can use that information20:42.57 
velix Word has a beautiful EPS parser. They've licensed it from an external corporation.20:43.09 
kens But obviously you have to tell GS to use it, or it would try to process EPS files included in PS programs, with disastrous results20:43.28 
velix ahhh, it works now!20:43.57 
kens seriously doubts Word has an EPS parser20:44.02 
  DSC parser possibly20:44.07 
velix is a user and I call it EPS parser ;)20:44.17 
  I had to set Corel to "page limits" on export PDF and now -dEPSCrop gives perfect results.20:44.37 
  export EPS, sorry.20:44.51 
kens Oh right.20:44.58 
  An EPS parser would read EPS programs, and would need to be a full PostScript interpreter. DSC comments, however, are very easy to read20:45.56 
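As a rough illustration of how little is needed to read the DSC size information, here is a Python sketch that pulls `%%BoundingBox` out of an EPS file's text without interpreting any PostScript. The function name and the sample data are invented for this example; it handles the `(atend)` form by simply continuing to the trailer comment, and it ignores `%%HiResBoundingBox`.

```python
def eps_bounding_box(text):
    """Return (llx, lly, urx, ury) from the %%BoundingBox comment, or None."""
    prefix = "%%BoundingBox:"
    for line in text.splitlines():
        if not line.startswith(prefix):
            continue
        value = line[len(prefix):].strip()
        if value == "(atend)":
            continue  # the real values appear in the trailer comments
        parts = value.split()
        if len(parts) != 4:
            continue
        try:
            return tuple(int(p) for p in parts)
        except ValueError:
            continue  # malformed numbers: keep looking
    return None

sample = "%!PS-Adobe-3.0 EPSF-3.0\n%%BoundingBox: 0 0 150 50\n%%EndComments\n"
print(eps_bounding_box(sample))  # (0, 0, 150, 50)
```

Actually drawing the EPS, or converting it to another format, still needs a full PostScript interpreter.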
velix https://support.office.com/en-us/article/support-for-eps-images-has-been-turned-off-in-office-a069d664-4bcf-415e-a1b5-cbb0c334a84020:46.53 
  naaaaaarf20:46.56 
  Instead of fixing, they disable it.20:47.29 
kens Well, that's pretty crap20:47.42 
velix I bet the license expired or anything.20:47.44 
  EMF isn't an alternative.20:47.48 
kens But about what I would expect from MS20:47.52 
velix "Similarly, you can use an online conversion tool such as CloudConvert.com or Convertio.co to convert an EPS file to EMF or SVG." :)20:48.04 
kens No you can't.20:48.15 
velix Sure, I'll upload my images in copyright to a cloud service.20:48.16 
kens You can make something not quite entirely unlike the EPS you started with20:48.35 
  SVG can20:49.01 
  SVG doesn't handle text and fonts well20:49.13 
velix kens: It was even possible to put CMYK EPS into a word document. When printing it with ghostscript and distilling it with CMYK settings, the EPS (I mean the drawing) still had the correct CMYK values!20:49.15 
kens Yes, of course, that's how EPS is meant to work20:49.33 
  Its a 'black box'20:49.42 
velix Oh: "If you perform the change to the registry, you will be able to insert EPS files in the application on which you have applied the registry change. The EPS files will be automatically converted to EMF, saved, and visible in the saved document, even by people who haven't performed the registry change."20:50.19 
kens I can't really see what vulnerabilities there could be in Word. I can understand that the interpreter can have a problem, but Word shouldn't be reading anything except the comments20:50.29 
velix "In the case of EPS files, this message means that Office has turned off the ability to insert EPS files, because we think the vulnerability to malicious attacks is too great."20:50.52 
  That sounds like: "We don't want to pay the import DLL anymore".20:51.04 
kens Well, if Word is converting it to an EMF, they must have a PostScript interpreter20:51.06 
  No way to do it otherwise20:51.24 
velix kens: I'm still on Office 2003, the best office around (in my eyes).20:51.38 
  I wrote some hundred research pages the last years with it.20:51.52 
  Let me start Office 2010 on business laptop. 1 sec20:52.13 
kens So either they have a full PostScript interpreter in there still, or they will do a *really* crap job on converting it20:52.19 
  OTOH Word was never really a professional layout application20:52.40 
  Microsoft has never understood publishing20:52.52 
velix Microsoft has "Publisher ";)20:53.00 
kens Yes, a more ineptly named product its hard to think of20:53.21 
velix hehe20:53.29 
  Microsoft TeX ;)20:53.35 
kens Its going to be interesting hearing the wails of complaint from people whose EPS logos will no longer print in the corporate colour after they convert it to EMF20:54.38 
velix Yeah, I can't drag and drop EPS anymore.20:54.40 
  let me hack registry.20:54.53 
  IT gave me local administrator :D20:54.59 
kens is my own administrator20:55.47 
velix Ah, here is some background: https://support.microsoft.com/en-us/help/2479871/security-settings-for-graphic-filters-for-microsoft-office-36520:56.04 
  Wait what? bmp, gif, jpg, pict and png is activated by default? what about TIFF ?20:56.56 
kens Doesn't say much20:57.00 
  Obviously you should stop trying to use professional formats, just stick with the consumer ones20:58.00 
  Right off to watch TV20:59.00 
velix ok21:01.31 
  What's the use of "mutool show"? Here, it shows only 1 object. I thought, it should show them all?22:53.43 
  Interesting: mutool clean -s wipes the content off the OCG layers to the "base layer", but still keeps the layers (which are empty now).23:06.16 
tor8 velix: by default mutool show shows the trailer object23:07.22 
  give it some object numbers as arguments to show other things23:07.33 
velix tor8: ok23:07.44 
  ahh23:07.48 
  tor8: What does -s "Rewrite content streams." do?23:09.49 
  Hmm, why are those empty objects 5, 6, 7 not cleaned? https://bpaste.net/show/a6b3d7eab96c23:16.52 
  ah okay, it gets wiped with -s23:18.44 
  but this destroys my layers.23:18.56 
  I think this is a bug.23:20.49 
  Anyone in here to talk about the problem?23:27.26 
  I've found several bugs :(23:48.13 