| <<<Back 1 day (to 2020/10/18) | Fwd 1 day (to 2020/10/20)>>> | 20201019 |
user38 | Greetings everyone. I've ported MuPDF to Solaris 10, and while it runs without issues on the i86pc platform, on sparc it crashes in fz_keep_path at source/fitz/path.c:101. | 08:31.39 |
| Specifically, it will eventually crash at this expression: | 08:31.54 |
| if (path->refs == 1 && path->packed == FZ_PATH_UNPACKED) | 08:32.27 |
| when this occurs, both path->refs and path->packed are 1. The instruction where the crash occurs is: | 08:33.25 |
| ldx [ %i1 ], %g4 | 08:33.44 |
| and %i1 is not NULL. | 08:34.01 |
| I'm at a loss why this causes a crash. Any insights or pointers? | 08:34.19 |
sebras | user38: so you're saying your port to x86 solaris works without problems but sparc-based solaris is problematic? | 08:34.25 |
user38 | That is correct. | 08:34.35 |
sebras | I wasn't sure if you meant that x86 on say linux worked without issues. | 08:34.55 |
user38 | No Linux at all. Strictly Solaris 10. | 08:35.15 |
sebras | what value does %i1 have? is it unaligned? | 08:35.49 |
user38 | print $i1 | 08:36.12 |
| $1 = 4337015348 | 08:36.24 |
| Seems to be aligned to me? | 08:36.51 |
sebras | user38: agreed. how does it complain? | 08:39.14 |
user38 | Program received signal SIGSEGV, Segmentation fault. | 08:39.48 |
sebras | oh, both refs and packed are (u)int8_t ok. | 08:39.52 |
user38 | [Switching to Thread 1 (LWP 1)] | 08:40.04 |
sebras | user38: but path might be a completely bogus value. | 08:40.11 |
| user38: are you able to run "make sanitize" and rerun using the sanitized binary? | 08:40.30 |
| or run "make debug" and run that binary in valgrind. | 08:40.58 |
user38 | in fz_keep_path (ctx-0x102590800, pathc=0x102819a43) at source/fitz/path.c:101 | 08:41.06 |
| s/ctx-/ctx=/ | 08:41.19 |
| There is no valgrind on Solaris. | 08:41.34 |
sebras | I'm not sure if any of these tools work properly in solaris, but if they are available and do work, they might be helpful. | 08:41.44 |
user38 | Valgrind is bound to the Linux kernel. | 08:41.46 |
| The version I am running has been compiled with -O0 -g -g3. | 08:42.31 |
sebras | do you have a backtrace for where it happens? | 08:42.31 |
user38 | I do. | 08:42.35 |
sebras | user38: pastebin it please. | 08:42.45 |
| (they tend to be long) | 08:42.52 |
user38 | #0 is fz_keep_path | 08:42.54 |
| #1 is fz_append_display_node | 08:43.04 |
| #2 is fz_list_fill_path | 08:43.25 |
| unfortunely I cannot cut and paste because I'm working in a highly secured environment and cutting & pasting has been disabled on purpose. | 08:43.59 |
sebras | user38: ok. | 08:44.20 |
user38 | so being that both path->refs and path->packed are some form of an integer, why would that cause a crash? | 08:45.35 |
| And what is path->packed used for? | 08:45.59 |
sebras | user38: in fz_append_display_node() the relevant line is writer->path = fz_keep_path(ctx, my_path) | 08:46.12 |
user38 | source/fitz/list-device.c? | 08:47.09 |
sebras | yes | 08:47.13 |
| are you running mutool or mupdf-gl? | 08:47.23 |
| or mupdf-x11? | 08:47.27 |
user38 | mupdf-x11. | 08:47.31 |
sebras | can you try "mutool draw -F png -o out.png document.pdf" as well as "mutool draw -D -F png -o out.png document.pdf" ? | 08:48.28 |
| note the -D | 08:48.33 |
| I'm thinking that there might be an issue with our display list, so disabling and see if you get a reasonable out.png is useful info. | 08:49.27 |
user38 | mutool draw -F png -o out.png document.pdf caused a Bus Error. | 08:49.53 |
| Now let's try the other one... | 08:50.01 |
sebras | that's expected. | 08:50.02 |
| I'm expecting the other one to work. | 08:50.56 |
user38 | The other appears to have rendered a 116 page document into a 9.3 MB "out.png" file. | 08:50.57 |
sebras | only the first page, right? | 08:51.28 |
| but it didn't bus error on you, that's good to know. | 08:51.55 |
user38 | No it did not. Presumably that implies that the core engine is working on the sparc platform, or am I to interpret that differently? | 08:52.37 |
malc_` | user38: (genuine non-sarcastic question) isn't solaris 10 ancient? i think i still have an installation dvd gifted to me by a coworker more than a decade ago... | 08:54.24 |
user38 | Oracle still patches it and offers support until 2023. This particular Solaris 10 runs half the continent, which is why it cannot be replaced. | 08:55.12 |
sebras | user38: ok, so our display list implementation has issues on sparc. likely becuase something is unaligned. | 08:55.45 |
| user38: do you mind reporting a bug so this is not forgotten? | 08:56.06 |
user38 | I would be happy to do so. | 08:56.18 |
sebras | I'm heading out for dinner for a bit. | 08:56.18 |
user38 | Where do I report it? | 08:56.34 |
malc_` | bloody taiwan tz | 08:56.34 |
sebras | user38: bugs.ghostscript.com | 08:56.42 |
user38 | Thank you. I will do that. | 08:56.58 |
| Which information would you like in there? | 08:57.11 |
sebras | user38: if you have a simple document that shows this every time, please attach that document. | 08:57.15 |
| user38: everything you can provide. mention the tests with mutool you just did. | 08:57.36 |
user38 | Any PDF document blows up. MuPDF isn't picky or choosy in this regard. | 08:57.40 |
| I will do that. | 08:57.53 |
sebras | user38: regardless, it is good to have a document where it necessarily fails. | 08:58.30 |
| user38: a document that has only text might not fail e.g. | 08:58.45 |
| while a document that has some kind of line art probalby does. | 08:59.04 |
user38 | I will try that as well, with a PDF which only has text. | 08:59.07 |
| One other thing, the second invocation of mupdf-tool only appears to have rendered one page, when I look at "out.png"; is that expected? | 09:01.31 |
malc_` | sebras, user38: both are uint8_t yet the reported faulting instruction is an ldx which loads and extends a word.... | 09:02.19 |
| kinda bizzare | 09:02.24 |
user38 | And I've compiled the binary 64-bit. | 09:02.48 |
malc_` | user38: why? | 09:03.10 |
user38 | Because I wanted to see whether the crash is specific to 32-bit (which as it turns out it isn't, since 64-bit crashes all the same). | 09:04.18 |
| Was just going down the list of things I could think of to narrow the issue down. | 09:05.05 |
malc_ | user38: ah.. ok.. both SIGBUS and SIGSEGV were mentioned, so which is it? | 09:06.41 |
sebras | malc_: mutool draw SIGBUS:Ed | 09:07.00 |
vtorri__ | Uploaded file: https://uploads.kiwiirc.com/files/1f03565724f533653fb0d0f72f97b1e7/pasted.txt | 09:07.50 |
malc_ | sebras: funky... | 09:08.00 |
vtorri__ | valgrind does work on solaris | 09:08.01 |
sebras | hm. so the fz_path structure must be aligned according to its largest members alignment. | 09:08.38 |
| so refs ought to be 8 byte aligned if built as 64-bit, maybe? anyway, need to go. | 09:09.48 |
user38 | Both SIGBUS and SIGSEGV occur. When mupdf-x11 is run from the command line, a Bus error is reported; when run from gdb, a SIGSEGV and sometimes a bus error occur. | 09:19.05 |
| SIGSEGV occurs much more often than SIGBUS though. | 09:21.04 |
| But always the crash is reported in fz_keep_path(). | 09:22.18 |
ator | __H__: (for the logs) we changed the license for MuJS from AGPL to ISC a couple of years ago. we can do that, because we hold the copyright. | 09:33.09 |
| user38: Robin_Watts: could this be a big-endian issue? | 09:34.05 |
malc_ | ator: SIGBUS hints that it is an alignment issue, intermittency otoh... | 09:38.25 |
ator | does optimization level change the instructions used? | 09:47.44 |
kens | ator I hesitate to mention it but Chris has a Sparc Solaris machine | 09:50.55 |
Robin_Watts | ator: MY first guess would be not big-endian, but 'weirdass alignment' | 10:01.11 |
kens | is inclined to agree | 10:01.26 |
| Things like that weird bug Chris fixed recently for Ghostscript | 10:01.42 |
Robin_Watts | user38: I have another fire to fight at the moment. Are you going to be around for a few hours ? | 10:01.43 |
user38 | I will. | 10:17.05 |
Robin_Watts | fab. | 10:26.55 |
| ator: You about? | 10:28.03 |
ator | Robin_Watts: yes | 10:29.27 |
Robin_Watts | ator: So, I put a fix in for the makefile at the end of last week, and it's tripped stuff. | 10:29.47 |
ator | I just pushed some Makefile tweaks to tor/master for the leptonica stuff | 10:29.52 |
| Robin_Watts: let me guess, system presence of leptonica/tesseract forces it to try to build the thirdparty module? | 10:30.16 |
Robin_Watts | Basically, it would only build leptonica and tesseract if HAVE_LEPTONICA... | 10:30.24 |
| yes. | 10:30.25 |
| ator: Is that something you've fixed? | 10:30.34 |
ator | I have a fix for that on tor/master (just before the fix for the submodules) | 10:30.36 |
Robin_Watts | ator: Ah, fab. | 10:30.42 |
| Let me look... | 10:30.45 |
ator | we could squish all 4 commits into one and push that if you're happy with the results | 10:30.58 |
| basically if you do "make USE_TESSERACT=yes" it'll set the output path to "build/$(build)-ocr" (and likewise for other configs, like extract, and I added "tofu" as well) | 10:31.40 |
Robin_Watts | I dislike USE_TESSERACT mapping to ocr. | 10:32.29 |
| Can we not use ocr=yes ? | 10:32.36 |
| We use 'build=release' which means we've plumped for lower case things already. | 10:33.41 |
ator | it sets a global variable, it feels dangerous to me | 10:33.42 |
| given how short it is | 10:33.49 |
Robin_Watts | tesseract=yes giving build-tesseract would work for me too. | 10:34.04 |
| Your fix is slightly different to mine. I had the HAVE_TESSERACT pkgconfig stuff wrapped with ifeq ($(USE_SYSTEM_TESSERACT),yes) | 10:35.33 |
ator | the main change to fix the stuff that broke was adding HAVE_SYS_TESSERACT := $(shell pkg-config ... | 10:36.09 |
Robin_Watts | which meant I didn't need to have a separate HAVE_SYS_TESSERACT thing. | 10:36.10 |
ator | rather than set HAVE_TESSERACT immediately | 10:36.27 |
Robin_Watts | right. | 10:36.39 |
| I'd argue my fix is simpler. | 10:36.50 |
| If we have $(USE_SYSTEM_TESSERACT) then we set HAVE_TESSERACT according to pkgconfig. | 10:37.09 |
| If we don't, then we set HAVE_TESSERACT according to the presence/absence of the submodules. | 10:37.37 |
ator | I like keeping the pkg-config stuff together | 10:37.42 |
Robin_Watts | It is still kept together. | 10:37.51 |
ator | which meant adding another variable for it. it's the same way we do for CURL. | 10:37.53 |
Robin_Watts | spider alert. brb. | 10:38.12 |
ator | USE_SYSTEM_TESSERACT is set in Makethird, which is included after Makerules that defines the pkg-config stuff | 10:39.04 |
| based on USE_SYSTEM_LIBS | 10:39.18 |
Robin_Watts | ator: Ok, I think that's more complex than it needs to be, but if it's consistent with the way CURL works, lets go with it. | 10:42.05 |
ator | Robin_Watts: I've tweaked the variable names for turning on/off the variables in a fixup commit | 10:42.27 |
| I mostly went with this way because it's what we did with curl, symmetry is more important here because these makefiles can easily get confusing | 10:43.18 |
| so doing things the same way is important, IMO | 10:43.32 |
Robin_Watts | I agree, consistency wins. | 10:43.43 |
| Can we change 'ocr' to 'tesseract' ? | 10:43.53 |
| cos, as you say, consistency is important :) | 10:44.18 |
ator | I, oh, yes | 10:44.21 |
| new fixup pushed | 10:45.01 |
Robin_Watts | perfect. | 10:45.17 |
| Should we push your first fixup back into golden to the penultimate commit there to fix the cluster breakages? | 10:45.50 |
| (I vote yes) | 10:45.54 |
ator | sure | 10:46.02 |
Robin_Watts | Shall I do that? | 10:46.08 |
| or would you like to? | 10:46.14 |
ator | go ahead | 10:46.16 |
Robin_Watts | Ta. I'll push your commits til just before the push of the submodules. | 10:46.44 |
| My fixes went into upstream tesseract over the weekend. | 10:47.02 |
| I now have one tiny fix remaining. | 10:47.10 |
ator | ok | 10:47.24 |
Robin_Watts | so there is a modification to the submodules pending. I'll test that and get a review up. | 10:47.33 |
| ator: OK, cluster surgery done, commits pushed. | 11:27.16 |
| the updated submodules commit is on my repo now. | 11:27.27 |
| I'll cluster it once the existing cluster jobs finish. | 11:27.36 |
| user38: ok, let's have a think about the solaris issue. | 11:29.18 |
| You have a particular file that goes wrong on solaris with a crash in the displaylist stuff? | 11:29.45 |
sebras | Robin_Watts: it is with any file. | 11:29.52 |
| Robin_Watts: I suspect anything that involves a path. | 11:30.07 |
ator | Robin_Watts: thanks. looking good so far. | 11:30.33 |
sebras | fz_keep_path() that crashes is only called once from fz_append_display_node(). | 11:30.35 |
Robin_Watts | sebras: It'd be nice to know if it's genuinely any file, or whether it's any file with a path. | 11:30.52 |
sebras | Robin_Watts: agreed, user38 said he'd test with a file that only contained text. not sure if he did in the end. | 11:31.24 |
Robin_Watts | Any file suggests the problem might be in the padding of display list nodes. | 11:31.37 |
| Any file with paths suggests the packing of paths. | 11:31.51 |
sebras | Robin_Watts: what surprises me is that the address he listed ended on a 4, which implies that the address is 4 bytes aligned. | 11:32.15 |
Robin_Watts | sebras: solaris is/can be 64 bits. | 11:32.35 |
| fz_display_node's are 32 bits. | 11:32.55 |
sebras | he said he wanted to know if the issue was specific to 32 bits, so he compiled for 64-bits too. | 11:33.25 |
Robin_Watts | so it was failing for 32bit stuff too. | 11:33.41 |
sebras | but I don't know anything about what system he is running this on (except that it is solaris 10). | 11:33.46 |
| Robin_Watts: mmm, that's what his messages imply. | 11:33.55 |
| Robin_Watts: my_path is set like this my_path = (void *)(&node_ptr[path_off]); | 11:34.18 |
| and 20 or so lines later passed to fz_keep_path(). | 11:34.31 |
Robin_Watts | I confess that I've always kinda glossed over the technical details of solaris issues in the past as chrisl deals with them so efficiently. | 11:34.34 |
| but ISTR that there is more weirdness than you'd expect. | 11:35.14 |
sebras | and node_ptr is declared fz_display_node *, so if the original node_ptr value is reasonable then the assignment of my_path ought to be something that fz_keep_path() can accept..? | 11:35.17 |
| and node_str appears to come from list->list[whateverindex] | 11:36.42 |
| so that ought to be sufficiently aligned too..? | 11:36.53 |
Robin_Watts | Yes. | 11:37.14 |
sebras | the issue appears inside fz_keep_path() when dereferncing either ->refs or path->packed. | 11:38.02 |
| but one is int8_t and the other uint8_t. so if the pointer to the struct is sufficiently aligned I'd expect these two accesses to be non-problematic. | 11:38.55 |
user38 | sebras, I did test it with a very small PDF file which only contained text, and it crashed. I've opened https://bugs.ghostscript.com/show_bug.cgi?id=703004, which includes the stack trace and the test PDF file as we agreed. | 11:41.29 |
| If you think a core file would be helpful as well, I will add one tonight. | 11:42.32 |
| @Robin_Watts any PDF file will cause a crash. | 11:43.55 |
| Yes, the code was failing when compiled 32-bit as well. | 11:45.07 |
| "but I don't know anything about what system he is running this on (except that it is solaris 10)" | 11:45.55 |
Robin_Watts | user38: Can you do a debug build? | 11:46.54 |
user38 | I've tested on a Sun T2000 and on an Oracle (Fujitsu) M3000; the hardware appears to make no difference, which works as designed, as Solaris abstracts the hardware away. | 11:47.06 |
Robin_Watts | make build=debug | 11:47.14 |
| Are they all SPARC based machines? | 11:47.26 |
user38 | I did, plus I added -O0 -g -g3. | 11:47.35 |
| Yes, they are. | 11:47.46 |
Robin_Watts | It would be very strange if that PDF file really did contain only text, and you were still hitting problems in path.c like that :)( | 11:48.38 |
| Quite probably there is a 'blank the page' rectangle first. | 11:48.49 |
| Can you tell me value of 'path' at the point it crashes please? | 11:49.20 |
user38 | Different processor generations (UltraSPARC T1, SPARC64-X+), very different processor designs (for example, the UltraSPARC-T1 only partially implements the SPARC V9 VIS instructions), so there appears to be no correlation between specific SPARC hardware and the crash. The only common denominator is the sparc platform in general. | 11:52.36 |
| Even the Solaris 10 patch levels are different, so there appears to be no correlation between the versions of the Solaris 10 operating system or even the patch levels. | 11:54.08 |
Robin_Watts | user38: AIUI, Sparc as a family imposes certain 'unusual' restrictions on alignment. | 11:54.48 |
| We've tripped over it before with gs. | 11:54.58 |
| Those unusual restrictions probably apply to the entire family of processors. | 11:55.28 |
user38 | That is correct: in order not to stall the instruction pipeline, the SPARC architecture only allows memory access at even, aligned memory addresses. Any access to a non-aligned memory address will cause the memory management unit to intervene. | 11:56.25 |
Robin_Watts | user38: All the information I can find online suggests that it wants "natural" alignments for things. | 11:57.53 |
user38 | However this normally poses no issue as long as the members of data structures are ordered from largest to smallest. | 11:57.58 |
Robin_Watts | i.e. 8 bit stuff at any address, 16 bit stuff at even alignments, 32 bit stuff at %4 alignment etc. | 11:58.32 |
user38 | It can be at any address, so long as that address is even. | 11:58.56 |
Robin_Watts | user38: Right, that's specifically NOT what I said. | 11:59.19 |
| At the moment of a crash, what is 'path' ? | 11:59.48 |
user38 | Let me see... | 12:00.07 |
| Ah, that's interesting! When I ran it with the simple test PDF with no graphics, it crashed in thread 1 in fz_run_display_list | 12:01.59 |
| So, 'path' is fz_path *, and it contains: | 12:04.59 |
| $1 = {refs = 1 '\001', packed = 1 '\001', cmd_len = 1128409334, cmd_cap = 1141611397, cmds = 0x4240ccc852000000 <error: Cannot access memory at address 0x4240ccc852000000>, coord_len = 0, coord_cap = 0, coords = 0x1, current = {x = 1.90429152e-37, y= 1.40129864e-45}, begin = {x = 1.91206167e-37, y = 0}} | 12:08.04 |
Robin_Watts | user38: Please. What is the value of path. The address, not the contents thereof. | 12:09.49 |
| The fact that path->packed = 1 means that it's not really an fz_path*, it's actually an fz_packed_path * | 12:10.40 |
user38 | 0x102819a43 | 12:10.41 |
| Sorry | 12:10.44 |
| 0x1028119a34 | 12:10.57 |
Robin_Watts | So this is a 64bit platform? | 12:11.13 |
user38 | Correct. | 12:11.27 |
Robin_Watts | so that explains the crash. | 12:11.50 |
| Maybe. | 12:12.24 |
user38 | What is your hypothesis? | 12:12.32 |
Robin_Watts | Nah, maybe not. | 12:14.04 |
| I need to speak to chrisl really. | 12:14.09 |
user38 | I also messed it up (again), unfortunately since I cannot paste: fz_path is 0x102819a34. | 12:15.11 |
| One thing which is unclear to me is why when I print the contents of path, I get that error about not being able to access memory. Is that because that pointer has not been initialized yet? | 12:16.54 |
Robin_Watts | user38: No, it's because you're printing the pointer as if it's an fz_path *. | 12:19.26 |
| But actually, it's an fz_packed_path *. | 12:19.35 |
| Basically, we have a standard representation for paths, an fz_path *. | 12:19.52 |
| Paths are constructed in that form. | 12:20.01 |
| But when we make a display list from the paths, we want to reduce the amount of memory we use. | 12:20.22 |
| So, we 'pack' the paths into a display list. | 12:20.39 |
| We can only 'pack' paths that have less than 256 coords etc in. | 12:21.04 |
| so we look at path->pack to see whether it's a packed path or not. | 12:21.22 |
| If it is, then we cast it to be a fz_packed_path * instead and use that. | 12:21.43 |
| So if instead of printing *path, you printed *(fz_packed_path *)path you'd see more sensible values. | 12:22.42 |
| My worry is that the process of packing is ending up putting the fz_packed_path structure on a 32bit alignment when somehow it needs more. | 12:26.11 |
| can you print sizeof(fz_packed_path) please? | 12:26.40 |
user38 | Indeed the values look much more sensible. | 12:26.58 |
| $2 = {refs = 1 '\001', packed = 1 '\001', coord_len = 4 '\004', cmd_len = 1 '\001'} | 12:27.45 |
| $3 = 4 | 12:28.26 |
Robin_Watts | sizeof(fz_packed_path) == 4? | 12:29.13 |
user38 | Correct. | 12:29.19 |
Robin_Watts | I've just spoken to our expert... | 12:37.16 |
| fz_path contains pointers. | 12:37.50 |
| Therefore it will always need to be aligned to at least 64 bit. | 12:38.03 |
| which is a bugger, cos fz_packed_path doesn't, deliberately, so we pack it to just an alignment of 32bits. | 12:38.38 |
| I'm going to grab some lunch. When I come back, I'll give you a quick hack to try. | 12:41.26 |
| Thanks. | 12:41.30 |
user38 | If I should be gone by the time you get back, please update the bug with the patch, which I will try out the first chance I get. | 13:12.09 |
Robin_Watts | user38: back. | 13:42.31 |
| This is my attempted fix: https://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=235f70d41043b82c3a66de977faca79c4c0791f4 | 13:42.59 |
| IF you'd rather get it as a patch I can arrange that. | 13:43.19 |
| Right. The ldx instruction loads a 64 bit chunk of memory, so that fails due to alignment. | 13:47.02 |
| My fix *should* solve that. | 13:47.09 |
user38 | That looks like a unified diff which I should be able to cut and paste into a patch. Unfortunately I don't have access to my build server at the moment, but I will be happy to try applying your changes tonight. | 13:54.00 |
malc_ | user38: i thought you said that you experience problems with both 32 _and_ 64bit builds | 13:55.39 |
user38 | That is correct. | 13:56.05 |
Robin_Watts | user38: Thanks. | 14:04.43 |
| malc_: In solaris, all pointers are 64bit, regardless of 32 or 64bit-ness, apparently. (Or at least they have to be 64bit aligned). | 14:05.26 |
| because the kernel, apparently, is always 64bit, and so it loads pointers and blanks the upper 32bits for a 32bit userspace. | 14:05.56 |
| or so I am informed. | 14:06.01 |
| On sparc, it's "well known" that any struct must be aligned to the alignment of the largest data type within it. | 14:06.51 |
| So an fz_path that contains pointers, therefore MUST be 64 bit aligned. | 14:07.08 |
| hmm. I think I've just convinced myself that that patch isn't enough :( | 14:10.01 |
| Will need to ponder some more. | 14:10.06 |
user38 | Yes, sadly the 32-bit kernel support has been ripped out a few years back across all supported platforms, which left perfectly good 3 GHz Pentium 4 servers unable to be upgraded further. | 14:10.06 |
| However the OS although 64-bit runs 32-bit applications without modification. | 14:10.34 |
malc_ | Robin_Watts: that makes no sense, if i have 32bit sparc that is | 14:16.59 |
Robin_Watts | Well, whatever the reason, the compiler is attempting to load the 1st byte of an fz_path by loading 64 bytes from it. | 14:18.38 |
user38 | I have to leave you soon, but I will try the patch tonight and report back. | 14:19.32 |
malc_ | sounds like something alpha would do... not sparc | 14:19.36 |
Robin_Watts | Which is fine for a 'real' fz_path, cos the compiler will have determined that that's guaranteed to be 64 bit aligned (presumably because of the pointers in it). | 14:19.36 |
| user38: There should be another patch later. I'll link it from the bug. | 14:19.54 |
user38 | I will update the bug report with the results, if that is alright by you? | 14:19.59 |
Robin_Watts | Ta. | 14:22.01 |
| user38: Before you go, what is sizeof(fz_path) please? | 14:22.13 |
user38 | print sizeof(fz_path) | 14:27.21 |
| $1 = 56 | 14:27.24 |
| is it what you expected? | 14:27.48 |
| q | 14:29.22 |
| Apologies, wrong window... I will have access to the build system later on, and can do additional debugging if required. | 14:31.05 |
| Bye for now... | 14:31.13 |
Robin_Watts | 56 is absolutely not what I've had expected. | 14:52.06 |
| 41 struct fz_path | 14:53.18 |
| 42 { | 14:53.18 |
| 43 int8_t refs; | 14:53.18 |
| 44 uint8_t packed; | 14:53.18 |
| 45 int cmd_len, cmd_cap; | 14:53.18 |
| 46 unsigned char *cmds; | 14:53.19 |
| 47 int coord_len, coord_cap; | 14:53.21 |
| 48 float *coords; | 14:53.23 |
| 49 fz_point current; | 14:53.25 |
| 50 fz_point begin; | 14:53.27 |
| 51 }; | 14:53.29 |
| Oh, wait... | 14:54.02 |
| so a sane 32bit mode would use 44 bytes. | 14:54.15 |
| and 64 would use 56. so it's padding as if it's 64bits. cool. | 14:54.55 |
malc_ | the struct is horribly laid out (alignment abi wise) | 15:05.24 |
Robin_Watts | malc_: Yes, but there are reasons. | 15:06.34 |
malc_ | Robin_Watts: okay | 15:06.50 |
Robin_Watts | mvrhel_laptop: You here? #artifex is calling. | 16:17.38 |
User38 | Greetings again, everyone. | 17:33.14 |
| So I just applied the jumbo patch from earlier, and so far we have: | 17:37.27 |
| > gpatch -uNp1 < jumbo.patchpatching file source/fitz/path.cHunk #3 succeeded at 77 (offset -13 lines).Hunk #5 FAILED at 165.Hunk #6 FAILED at 180.Hunk #7 succeeded at 171 (offset -30 lines).Hunk #9 succeeded at 219 (offset -30 lines).Hunk #11 succeeded at 304 (offset -30 lines).Hunk #13 succeeded at 412 (offset -30 lines).Hunk #15 succeeded at 483 | 17:37.54 |
| (offset -30 lines).Hunk #17 succeeded at 585 (offset -30 lines).Hunk #19 succeeded at 972 (offset -30 lines).Hunk #21 succeeded at 1483 (offset -30 lines).2 out of 21 hunks FAILED -- saving rejects to file source/fitz/path.c.rej | 17:37.54 |
| Oh great, pasting didn't retain newlines... | 17:38.13 |
| source/fitz/path.c: In function 'fz_pack_path':source/fitz/path.c:158:10: error: 'fz_path' {aka 'const struct fz_path'} has no member named 'packed' 158 | if (path->packed) | ^~gmake: *** [Makefile:101: build/release/source/fitz/path.o] Error 1 | 17:41.58 |
| I've updated https://bugs.ghostscript.com/show_bug.cgi?id=703004 with all of this information. | 17:58.19 |
malc_ | User38: your patch application failed, you have to fix that first before trying to build things | 18:04.34 |
User38 | I know, but it's late and it will require me to go through the rejected chunks, something better left for tomorrow morning when I'm rested. A cursory look over the rejected chunks seems to point to the actual code being patched missing in my copy of the source, but I'll verify that tomorrow. | 18:06.26 |
malc_ | User38: yeah it looks as if Robins path is against different branch or something | 18:08.18 |
User38 | I'm working with 1.17.0, for what it's worth. | 18:08.58 |
malc_ | that is definitely wrong | 18:10.16 |
Robin_Watts | User38: Hi. | 18:11.07 |
| I've just put some comments on the bug. | 18:11.18 |
| First and foremost is a patch to list-device.c that fixes this "better" so the path changes shouldn't be required. | 18:11.48 |
| I am (obviously) working on master, which is just after 1.18. | 18:12.04 |
| but I'm just fixing my fix :) | 18:12.43 |
| User38: ok, the version there tests out on our cluster with no regressions. | 18:25.11 |
User38 | 1.18.0 or 1.17.0 which I was troubleshooting up to now? | 18:25.56 |
Robin_Watts | 1.18.0 | 18:26.40 |
| The file in question has barely changed between 1.17 and 1.18 though. | 18:27.48 |
User38 | Then before I even attempt that, I'll have to re-apply all my patches first to get it to build on Solaris... | 18:27.58 |
| So looks like tomorrow there'll be pah-lenty to do. | 18:28.22 |
Robin_Watts | As I say, it should work on 1.17 too. | 18:29.15 |
User38 | Okay then if you don't mind, I'd like to continue with 1.17.0 first... Just the last patch in the bug report should be applied and the rest ignored, is that correct? | 18:30.12 |
Robin_Watts | yes. | 18:30.37 |
Robin_Watts | walks dogs. bbiab. | 18:30.44 |
User38 | Splendid! | 18:30.51 |
Robin_Watts | Fab! | 19:29.19 |
User38 | mupdf-1.17.0]> gpatch -uNp1 < ../sparc-jumbo.patch patching file source/fitz/list-device.cHunk #6 succeeded at 1450 (offset -1 lines).Hunk #8 succeeded at 1492 (offset -1 lines).Hunk #10 succeeded at 1713 (offset -1 lines).Hunk #12 succeeded at 1853 (offset -1 lines).Hunk #14 succeeded at 1884 (offset -1 lines). | 19:54.46 |
| ...so far so good! | 19:54.53 |
| Well, we have partial success: 64-bit version now appears to work, while the 32-bit version of mupdf-x11 only works when a PDF without graphics is provided as an argument. The core's stack trace is different, too. I've updated the bug report with the additional information. | 21:28.38 |
| And with that, I bid you all a good night until tomorrow. | 21:29.06 |
Robin_Watts | User38: For the logs: Thanks. That sounds like we've solved one problem at least. | 22:58.11 |
| <<<Back 1 day (to 2020/10/18) | Forward 1 day (to 2020/10/20)>>> | |