| 2013/02/08 |
ray_laptop | henrys: if you get a chance, can you verify that the patch posted on bug 693621 also works for you. I was slightly surprised that there were almost no differences in the regression testing. | 00:41.14 |
| henrys: I just want a 'second pair of eyes'. The "theory" is that it should be an improvement. Note I will look at the bmpcmp (running now) and see if other things are working as I expected. | 00:42.36 |
JakeSays | so i'm looking at the code in pdfclean as a sample of how to add/remove pages from a pdf. if i move a page from one pdf to another, do i need to also copy the embedded fonts, or will mupdf do that automatically? | 01:54.27 |
henrys | ray_work:looks like I'm too late to review | 05:27.58 |
| sorry I took off a little early today | 05:29.11 |
| see you in the morning | 05:31.51 |
mvrhel_laptop | kens: for the logs. I will make sure to be on line in the morning for you | 07:14.22 |
saper | quick question: why http://sourceforge.net/p/cdesktopenv/code/ci/0ec1d6b692e246249849342f326bd20655d999d6/tree/cde/historical/ReleaseNotes.ps?format=raw does render white pages on x11 output but pdfwrite works and produces correct PDF file? white text on white background or sth? | 10:14.46 |
kens | Hmm it renders 'something' to the display device, but it looks like the job doesn't request a media size. | 10:16.04 |
saper | {/currentpagedevice wh {p currentpagedevice dp /HWResolution kn {/HWResolution get al p}{p 300 300}ie}{300 300}ie}bd/C | 10:16.42 |
| looks suspicious? | 10:16.46 |
kens | Not really, it's setting the resolution to 300 dpi | 10:16.58 |
saper | does anyone have a postscript de-aliaser... to un-alias those funny shortcuts?:) | 10:17.17 |
kens | Those aren't shortcuts, they are PostScript program routines. | 10:17.39 |
| PostScript is a programming language, so you need an interpreter to read it | 10:17.52 |
saper | there is also /pWd 612 d /pHt 792 d | 10:18.34 |
| {db /Duplex t d /Tumble f d /Orientation 0 /HWResolution [ 600 600 ] d/PageSize [pWd pHt]d de spd}stp p | 10:18.34 |
kens | It looks like it's trying to scale the page based on a resolution of 300 dpi, but I think it's setting a resolution of 600 dpi | 10:18.51 |
| pdfwrite has a default resolution of 720 dpi.... | 10:19.13 |
saper | oh ye | 10:19.32 |
| yes | 10:19.33 |
| letters are there just HUGE | 10:19.39 |
kens | And if I use the display device at 600 dpi it looks OK | 10:19.48 |
| So it's a PostScript program which is totally not device-independent, which is really bad practice. | 10:20.26 |
saper | it's old stuff, we keep it there only for historical reasons | 10:20.57 |
kens | Run it through ps2write and you'll get PostScript which works better | 10:21.12 |
| Run it with ps2write and -r600 and the resulting PostScript looks to work fine. | 10:22.18 |
saper | yep it does, thanks | 10:22.18 |
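The ps2write run kens describes can be scripted. A minimal sketch, assuming a `gs` binary on the PATH; the function name and the file names are hypothetical, while `-sDEVICE=ps2write`, `-r`, `-dBATCH`, `-dNOPAUSE` and `-sOutputFile=` are standard Ghostscript switches:

```python
import subprocess


def ps2write_command(src, dst, dpi=600):
    """Build the argv for re-distilling a PostScript file with ps2write.

    The resolution is passed explicitly because the job discussed above
    only lays out correctly at the resolution it was written for.
    """
    return [
        "gs",
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=ps2write",
        f"-r{dpi}",
        f"-sOutputFile={dst}",
        src,
    ]


def run_ps2write(src, dst, dpi=600):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.run(ps2write_command(src, dst, dpi), check=True)
```

Calling `run_ps2write("ReleaseNotes.ps", "ReleaseNotes-600.ps")` would correspond to the `-r600` run mentioned above.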
mvrhel_laptop | Hi kens. Sorry I missed you yesterday | 14:20.43 |
kens | np Michael | 14:20.52 |
| Simple question, the existing image code is using pcs->type->remap_color so I assume this is using the CMS ? | 14:21.13 |
mvrhel_laptop | yes | 14:21.25 |
| well hold on | 14:21.44 |
kens | :-) | 14:21.55 |
mvrhel_laptop | as long as you don't have the pcs type DeviceGray, DeviceRGB or DeviceCMYK | 14:22.21 |
kens | pcs in this case is the pointer to the original colour space | 14:22.24 |
mvrhel_laptop | kens: it has to be ICC or have a base space that is ICC to work | 14:23.16 |
kens | There is some trickery with my new code which now preserves (e.g.) Indexed DeviceN, which used to be converted to RGB because the ICCBased space was our replacement for CMYK | 14:23.23 |
mvrhel_laptop | kens, so the interpreter will hand you only colorspaces that will be ICC based. so unless you are creating your own color spaces, pcs->type->remap will be ICC based | 14:25.09 |
kens | Yes I believe it is, I just wanted to check | 14:25.31 |
mvrhel_laptop | I was just worried about my statement above, if you were creating your own color spaces | 14:25.59 |
kens | This means a bunch of code I need I can just reuse | 14:26.01 |
mvrhel_laptop | that is good news | 14:26.10 |
kens | I do create my own 'device spaces' for faking up conversions with images, but I don't use them for conversion | 14:26.41 |
Robin_Watts | http://www.slate.com/blogs/future_tense/2013/02/07/fox_news_expert_on_solar_energy_germany_gets_a_lot_more_sun_than_we_do_video.html | 14:27.14 |
kens | Anyway, this saves me a bunch of work, thanks Michael | 14:28.32 |
| At least all my seg faults are now gone, and a number of 'differences' are progressions, so all I need to do is fix the ones which are definitely broken ;-) | 14:29.49 |
mvrhel_laptop | great | 14:30.23 |
| oh Robin_Watts quick question for you | 14:30.30 |
| so I was generating some examples for Max from his 8 channel 16 bit CMYKcmkk source file with ETS | 14:31.23 |
Robin_Watts | ok. | 14:31.34 |
mvrhel_laptop | and I had thought the ETS code was doing serpentine but looking at the output and the way that the dots are coming on at very low ink levels it looks like it is not | 14:32.23 |
Robin_Watts | indeed it is not. | 14:32.45 |
mvrhel_laptop | I had thought ray told me it was | 14:32.50 |
Robin_Watts | I added serpentine, but it looked worse, I think. | 14:33.02 |
mvrhel_laptop | ok so that is likely needed | 14:33.02 |
| oh | 14:33.05 |
| where is your code for that | 14:33.09 |
| well this file is a crazy chart file and dot structures show up quite noticeably as you can imagine. Of course FS is terrible | 14:34.32 |
Robin_Watts | I don't know :( | 14:35.02 |
| I'm sure I tried it. | 14:35.11 |
mvrhel_laptop | the offset of when the error reaches a threshold to put a dot is the only issue I see with our stuff for the very low ink levels (like 1 percent). It may not be noticeable on paper but on screen it is. I just wanted to double check with you on this. If you do run across the code where you did serpentine in this stuff let me know. I am not going to spend time adding it now until I show him... | 14:36.57 |
| ...what we currently do compared to FS. | 14:36.59 |
| also, it looks like to me that coupling weights we had for CMYK would have been wrong | 14:37.20 |
Robin_Watts | mvrhel_laptop: It's possible that I just hacked it in by reversing the errors and the line data for every other line. | 14:38.07 |
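The "reverse the errors and the line data for every other line" hack can be sketched as below. This is an illustration only: the diffusion kernel is a plain two-weight one chosen for brevity, not the ETS algorithm, and every name here is hypothetical.

```python
def diffuse_line(line, err_in):
    """Left-to-right error diffusion of one row of 0..255 values.

    err_in is the error handed down from the previous row. Half of each
    pixel's error goes to the right-hand neighbour, half to the pixel below.
    Returns (bits, err_out).
    """
    n = len(line)
    bits = [0] * n
    err_out = [0.0] * n
    carry = 0.0
    for x in range(n):
        value = line[x] + err_in[x] + carry
        bits[x] = 1 if value >= 128 else 0
        error = value - 255 * bits[x]
        carry = error / 2
        err_out[x] = error / 2
    return bits, err_out


def halftone(rows, serpentine=True):
    """Halftone a list of rows; odd rows are processed right-to-left
    by flipping both the row data and the incoming error line."""
    err = [0.0] * len(rows[0])
    out = []
    for y, row in enumerate(rows):
        flip = serpentine and (y % 2 == 1)
        if flip:
            row, err = row[::-1], err[::-1]
        bits, err = diffuse_line(row, err)
        if flip:
            # Flip back so output and error line are in left-to-right order.
            bits, err = bits[::-1], err[::-1]
        out.append(bits)
    return out
```

Because the inner loop never changes, the flip can be toggled with the `serpentine` argument to compare both scan orders on the same input.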
mvrhel_laptop | Robin_Watts: in the code, it had the strengths as { 128, 51, 51, 13 }, // KCMY | 14:38.13 |
Robin_Watts | yes. | 14:38.24 |
mvrhel_laptop | which means that CMYK had things mixed up | 14:38.30 |
| Robin_Watts: ok about serpentine. I may revisit it if they become interested based upon what they see | 14:39.03 |
JakeSays | so i'm looking at the code in pdfclean as a sample of how to add/remove pages from a pdf. if i move a page from one pdf to another, do i need to also copy the embedded fonts, or will mupdf do that automatically? | 14:39.16 |
Robin_Watts | mvrhel_laptop: For cmyk, we send an ordering thing in don't we? | 14:39.38 |
| so the planes get read KCMY | 14:39.46 |
mvrhel_laptop | hmm not that I see or saw | 14:39.56 |
tor8 | JakeSays: as long as you copy along all the reference pdf objects, fonts etc will also tag along. you'll have to renumber the objects and references though, or you'll probably get collisions. | 14:40.43 |
| referenced* | 14:40.55 |
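What tor8 describes (copy everything the page references, then renumber) can be sketched on a toy object table. This is not the MuPDF API: `Ref`, `collect`, `rewrite` and `graft` are hypothetical names, and objects are modelled as plain dicts, lists and strings keyed by object number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """An indirect reference to object number `num`."""
    num: int


def collect(objs, value, seen):
    """Append to `seen` every object number reachable from `value`."""
    if isinstance(value, Ref):
        if value.num not in seen:
            seen.append(value.num)
            collect(objs, objs[value.num], seen)
    elif isinstance(value, dict):
        for v in value.values():
            collect(objs, v, seen)
    elif isinstance(value, list):
        for v in value:
            collect(objs, v, seen)


def rewrite(value, mapping):
    """Return a copy of `value` with every Ref renumbered via `mapping`."""
    if isinstance(value, Ref):
        return Ref(mapping[value.num])
    if isinstance(value, dict):
        return {k: rewrite(v, mapping) for k, v in value.items()}
    if isinstance(value, list):
        return [rewrite(v, mapping) for v in value]
    return value


def graft(src, page_num, dst):
    """Copy page `page_num` and all objects it references from src to dst.

    New numbers start above the highest number in dst, so nothing collides.
    Returns the page's new object number.
    """
    seen = []
    collect(src, Ref(page_num), seen)
    next_num = max(dst, default=0) + 1
    mapping = {old: next_num + i for i, old in enumerate(seen)}
    for old, new in mapping.items():
        dst[new] = rewrite(src[old], mapping)
    return mapping[page_num]
```

In a real PDF a page also carries a /Parent reference, which would drag in the whole source page tree if followed blindly; a merger would have to drop or re-point that key. The toy above leaves that out.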
mvrhel_laptop | Robin_Watts: I will double check this | 14:41.08 |
Robin_Watts | mvrhel_laptop: In the read_pam_line code we read CMYK and output that as 1,2,3,0 | 14:41.28 |
mvrhel_laptop | ah. | 14:41.36 |
Robin_Watts | so we feed in K,C,M,Y | 14:41.50 |
mvrhel_laptop | ok. that makes sense. I am thinking that I need something a bit better defined for this for the multichannel case | 14:42.19 |
Robin_Watts | JakeSays: No one has done PDF merging with our code yet, AFAIK. | 14:42.32 |
| mvrhel_laptop: Possibly. I hadn't really changed it much from what was in the code originally. | 14:43.17 |
mvrhel_laptop | Robin_Watts: likely some permutation array that is used by the reader and writer along with a set of strength values | 14:43.33 |
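A permutation array shared by reader and writer might look like the sketch below. The `1,2,3,0` mapping and the `{128, 51, 51, 13}` KCMY strengths are the values quoted above; the function names and the idea of keeping both tables side by side are assumptions for illustration.

```python
# File order is C, M, Y, K. Entry i says which internal plane slot
# file channel i is written to: C->1, M->2, Y->3, K->0.
CMYK_TO_PLANE = [1, 2, 3, 0]

# Strengths indexed by internal plane slot, i.e. K, C, M, Y.
STRENGTHS = [128, 51, 51, 13]


def to_internal(pixel, perm):
    """Reorder one file-order pixel into internal plane order."""
    out = [None] * len(perm)
    for src, dst in enumerate(perm):
        out[dst] = pixel[src]
    return out


def from_internal(planes, perm):
    """Inverse of to_internal: back to file order for the writer."""
    return [planes[dst] for dst in perm]


def strength_for_channel(channel, perm, strengths):
    """Look up the strength that applies to file channel `channel`."""
    return strengths[perm[channel]]
```

With both tables defined in one place, adding extra channels means extending the two lists together, so strengths cannot drift out of step with plane order.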
Robin_Watts | mvrhel_laptop: Did I mention that gimpprint/gutenprint contains an implementation of ETS ? | 14:43.41 |
mvrhel_laptop | Robin_Watts: ok. I just wanted to check with you | 14:43.48 |
| Robin_Watts: yes you did mention that | 14:44.00 |
Robin_Watts | ok. | 14:44.05 |
mvrhel_laptop | Robin_Watts: thanks for the help. Need to get the kids out the door now. I will be back later and may bug you a bit more | 14:44.58 |
Robin_Watts | no worries. | 14:45.06 |
mvrhel_laptop | by the way I have my windoze viewer scrolling through pages pretty well now. | 14:45.22 |
| need to add in zooming and search | 14:45.32 |
JakeSays | Robin_Watts: are you aware of any code out there that does do merging? | 14:45.40 |
mvrhel_laptop | bbiaw | 14:46.55 |
Robin_Watts | mvrhel_laptop: Nice. | 14:48.17 |
| JakeSays: I am not, offhand. | 14:48.24 |
| It should be possible to do with our code (it's something I've wanted to look at for a while). | 14:48.40 |
JakeSays | Robin_Watts: ok. maybe i'll give it a try today and see how far i get. | 14:49.34 |
Robin_Watts | but there are complexities, like we won't merge outlines, or sort out links etc. | 14:50.15 |
JakeSays | these are really simple pdfs | 14:50.36 |
| the only complexity i'm aware of is embedded fonts | 14:50.51 |
| the fonts are all subsets of the same thing (tahoma) | 14:52.55 |
Robin_Watts | tor8: still fighting ios? | 15:06.15 |
tor8 | Robin_Watts: oddly enough I can't reproduce the crash today :( | 15:06.38 |
Robin_Watts | oh. | 15:06.59 |
| this was the magic reference counting crap? | 15:07.10 |
| I have text extraction up and working with the new structures. | 15:09.03 |
| bboxes aren't calculated yet though. | 15:09.08 |
tor8 | Robin_Watts: yeah. | 15:10.14 |
| so I'm inclined to just let it slide. | 15:10.24 |
Robin_Watts | If it works, it works. | 15:10.32 |
| I'd put it down to an xcode build skew thing, and move on. | 15:10.42 |
tor8 | bah. and now I finally did manage to make it crash again... | 15:10.45 |
Robin_Watts | oh :( | 15:10.50 |
| So at the moment I have blocks of lines of spans of chars. | 15:11.27 |
| chars have styles in. | 15:12.10 |
| spans have transforms in. | 15:12.33 |
| lines are sets of spans that share the same baseline (but have bigger than expected horizontal gaps in) | 15:13.06 |
| How would you feel about lines also holding the distance from the previous line in the block ? | 15:13.32 |
tor8 | if it makes algorithms easier to understand, go ahead | 15:14.04 |
Robin_Watts | I calculate that as part of the 'do I insert this in the same block or not' code. | 15:14.05 |
| and I need it again in the paragraph analysis stuff. | 15:14.25 |
| OK. | 15:14.27 |
tor8 | but it is a bit fragile if we're shuffling things around as it's duplicated info | 15:14.28 |
| ... that needs to be kept in sync | 15:14.44 |
Robin_Watts | yes. | 15:14.53 |
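The blocks/lines/spans/chars nesting, and tor8's point about the cached distance needing to stay in sync, can be sketched as follows. All field and method names are hypothetical, and storing a `baseline` on each line is an assumption made so the distance can be recomputed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Char:
    c: str
    style: str  # chars carry their style


@dataclass
class Span:
    transform: Tuple[float, float, float, float, float, float]  # spans carry a transform
    chars: List[Char] = field(default_factory=list)


@dataclass
class Line:
    baseline: float  # assumed: y position shared by all spans on the line
    spans: List[Span] = field(default_factory=list)
    distance: float = 0.0  # cached gap to the previous line in the block


@dataclass
class Block:
    lines: List[Line] = field(default_factory=list)

    def resync(self):
        """Recompute every cached distance; call after reordering lines."""
        prev = None
        for line in self.lines:
            line.distance = 0.0 if prev is None else line.baseline - prev.baseline
            prev = line
```

The cached field saves recomputing the gap in the paragraph analysis, at the cost that any code which shuffles lines has to call `resync()` afterwards.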
henrys | mvrhel_laptop:are you on the east coast or just getting up early? | 15:49.59 |
kens | I think he got up early to answer my question, which is very generous | 15:50.30 |
henrys | ah there is supposed to be a terrible storm out east - might affect alexcher, I know lots of flights are canceled. | 15:51.23 |
alexcher | henrys: according to the forecast, we will have an inch or so of snow. | 16:06.57 |
henrys | well that's hardly anything. | 16:07.55 |
| alexcher:north of you http://articles.marketwatch.com/2013-02-07/general/36969471_1_massive-storm-british-newspaper-cnn | 16:09.29 |
| kens:sorry I'm late on the xpsdriver I used a fairly complicated memory structure for the device, linked lists etc and I've fouled up the enum_ptr stuff. Every time I trip over this I wish we had a collector transparent to the code, I've always thought it would be interesting to try the boehm collector, but a lot of work. | 16:39.36 |
kens | henrys, no problem from my point of view. | 16:40.04 |
henrys | right you look occupied | 16:40.35 |
kens | Still doing image colour stuff | 16:44.29 |
Robin_Watts | kens: I thought about this in depth when I started for Artifex. | 16:44.58 |
| Swapping to the boehm collector would be bad for a number of reasons. | 16:45.12 |
| Just don't ask me what they are now :) | 16:45.19 |
kens | Its not mt idea Robin_Watts | 16:45.52 |
| my* | 16:45.58 |
ray_work | hmm... even though my chatzilla was running, it didn't capture what I see on the logs :-( | 16:57.52 |
henrys | hi ray_work I thought -dFirstPage didn't work for postscript? | 16:58.29 |
| for some reason I can't recall | 16:58.41 |
ray_work | henrys: -dFirstPage -dLastPage don't work for PS | 17:00.31 |
| they are implemented only in pdf_main.ps | 17:00.49 |
henrys | Robin_Watts:I just wonder how many contributors we lose when they realize their work requires memory allocation - they look at that enum and reloc hell and run scared. | 17:01.15 |
| ray_work:right you said when working on the last problem you had reproduced the problem in postscript and I didn't understand that. | 17:01.52 |
ray_work | henrys: BTW, on that 3page hpgl file, I noticed (when debugging) that it returned e_ExitLanguage, but that apparently gets ignored and it keeps processing the file | 17:01.58 |
Robin_Watts | henrys: For device authors, it's irrelevant. | 17:02.02 |
| It only affects people doing stuff at the interpreter level. | 17:02.19 |
| Even the graphics library is pretty much independent of it. | 17:02.31 |
ray_work | henrys: PS has a definition that any marks on the page before 'setpagedevice' are _supposed_ to be lost. So I just marked a part of the page, did setpagedevice and marked some more. It failed with the clist mode | 17:02.56 |
| I was somewhat surprised that some CET didn't show up a progression. I thought sure they would test that. | 17:03.57 |
henrys | Robin_Watts:I don't know what you mean anytime you add a pointer to a structure, device or otherwise you have to decrypt that stuff, well if you want to understand what you are doing. | 17:04.49 |
kens | Only if you use GC memory | 17:05.19 |
Robin_Watts | what kens said. | 17:05.26 |
| We agreed ages ago that we were going to migrate stuff away from gc memory as much as possible. | 17:05.46 |
ray_work | henrys: you don't have to (and _shouldn't_ ) declare pointers to non_gc_memory. So a simple thing to do is put stuff in non_gc_memory | 17:05.58 |
Robin_Watts | Thus reducing the amount of stuff that needs to be enumerated/marked. | 17:06.06 |
ray_work | Robin_Watts: yes, and that works, but we also will need a chunk manager for the non_gc_memory to be efficient for frequent alloc/free cycles (so we don't bang on the heap allocator) | 17:07.03 |
| Robin_Watts: we can do that easily enough when we set up the pointer to the non_gc_memory in the GC allocator. | 17:08.05 |
henrys | will do | 17:10.10 |
ray_work | ooh. I just saw the email from Mateusz about the results of the fuzz testing of ghostscript. | 17:10.13 |
| I didn't know they were going to do PostScript. This should be fun (NOT) | 17:11.00 |
henrys | they sent us mail saying they would | 17:11.21 |
ray_work | well, I guess fixing stuff can only help our stability. Bound to be some _really_ screwy things to track down, given what fuzzing probably does to perfectly good PS | 17:12.47 |
henrys | ray_work:I'll look at e_ExitLanguage, I wish he wouldn't run the interpreter that way it gets so little testing | 17:14.22 |
ray_work | henrys: I agree. I'm not sure why he is doing that. Do you know ? | 17:15.17 |
henrys | I don't understand his explanation - my batting average talking him out of his ways is 0 so I'm not going to pursue it. | 17:16.00 |
| the non_gc stuff doesn't help at all with contributors who have to look at the code and follow suit, nobody would happen upon non_gc_mem | 17:21.38 |
| we probably need some documentation in the Develop.htm | 17:22.26 |
ray_work | henrys: are you going to contact cust 190 about the fix ? Maybe you (or Marcos) can ask WHY they are doing this. Is it to do pages out of order, or do a subset of the pages in order, or what ? | 17:22.37 |
henrys | I asked him once and he gave me a non answer, support was copied in. | 17:25.05 |
ray_work | henrys: the one thing I like about the fix is that it gets rid of multiple 'fillpage' actions on clist playback. | 17:25.15 |
henrys | ray_work:from my experience with him if I respond again to his email he will dig his heels in deeper and I'll never get him to change but I'll give it a go. | 17:29.14 |
mvrhel_laptop | Robin_Watts: one more question for you with -m 0 -e 0 -r 0 I should not really see any difference where a dot gets placed with different permutations of the planes | 17:38.00 |
| is that not true? | 17:38.13 |
Robin_Watts | urm... | 17:39.00 |
mvrhel_laptop | my question might not be so clear Robin_Watts | 17:39.11 |
| sorry | 17:39.13 |
| so I have a case | 17:39.18 |
| where C and K have a very low ink level | 17:39.26 |
Robin_Watts | No multiplane, no error diffusion, no random noise. | 17:39.31 |
mvrhel_laptop | right | 17:39.34 |
Robin_Watts | With no multiplane, each plane will be completely separate. | 17:39.56 |
mvrhel_laptop | and if I do C first or second, I would expect to see the same dot placement | 17:40.17 |
Robin_Watts | hence the same data in to each plane will produce the same data out. | 17:40.19 |
henrys | ray_work:okay mail sent | 17:40.20 |
Robin_Watts | yes. | 17:40.23 |
mvrhel_laptop | -e 0 means ets style is off | 17:40.29 |
| Robin_Watts: ok. that is what I thought. I am seeing something odd that I will need to dig into a bit. | 17:40.53 |
| likely my goof up someplace | 17:40.57 |
| thanks Robin_Watts | 17:41.07 |
Robin_Watts | no worries. | 17:42.04 |
mvrhel_laptop | hmm. ok it is def. something going on in the ETS code. | 17:48.02 |
| see if I can track this down | 17:48.09 |
Robin_Watts | mvrhel_laptop: Crap test: change the image loader to send the same values into each plane. Then look at the output and toggle planes on/off. That may be what you're doing of course.. | 17:51.23 |
mvrhel_laptop | Robin_Watts: that is basically what is happening now with the image data that I have as the C and K plane have the same data to start out | 17:52.19 |
Robin_Watts | mvrhel_laptop: Fair enough. | 17:52.43 |
mvrhel_laptop | which is why I caught this. It may be that there is some minor perturbation or initialization in there to keep from placing dots on dots with the same level but I want to understand | 17:53.46 |
Robin_Watts | mvrhel_laptop: I can remember no such thing. | 17:55.14 |
mvrhel_laptop | me either. which is why I was surprised | 17:55.38 |
Robin_Watts | If you force the errors to zero, does it go away ? | 17:55.46 |
mvrhel_laptop | Robin_Watts: not sure. I just broke something in the 8 bit case, need to fix that first... | 17:58.40 |
ray_work | henrys: thanks. I saw the email. I was hoping you would mention that we want to know so we can see if we can help come up with a more efficient way to do what they need. Maybe they will infer that. | 17:59.42 |
kens | Goodnight all, have a good weekend | 18:03.58 |
mvrhel_laptop | I need to do more frequent commits so I can see where I broke this :( | 18:04.29 |
ray_work | henrys: I just saw the reply from Guilaume -- nesting pages on a single sheet is what can be done with clist 'saved_pages'. About time to do an example of that, I guess. | 18:09.05 |
mvrhel_laptop | ok. fixed that. a problem introduced while cleaning up a few things | 18:11.29 |
Robin_Watts | mvrhel_laptop: Should we have cluster testing for ets? | 18:11.49 |
mvrhel_laptop | :) | 18:12.23 |
Robin_Watts | semi-serious question. | 18:12.36 |
mvrhel_laptop | maybe eventually. | 18:12.52 |
Robin_Watts | We could set it up without too much trouble. | 18:12.58 |
ray_laptop | Robin_Watts: right. It would depend on deterministic random noise generation. | 18:13.06 |
Robin_Watts | Basically to give a smoke test. | 18:13.10 |
| ray_laptop: We can cluster test with -r0 :) | 18:13.19 |
mvrhel_laptop | right | 18:13.22 |
| perhaps that might not be a bad idea | 18:13.38 |
| Robin_Watts: if you want to set this up, it would be helpful | 18:13.56 |
ray_laptop | Robin_Watts: mvrhel_laptop: I wouldn't do very many tests, however | 18:14.12 |
mvrhel_laptop | no | 18:14.15 |
Robin_Watts | mvrhel_laptop: I'll ponder on it for a bit. | 18:14.24 |
mvrhel_laptop | I would do a 16bit, 8 bit CMYK and a CMYK + a few planes | 18:14.27 |
| each with a few options | 18:14.55 |
ray_laptop | the other thing is that are so many 'tuning' modes, that we can only pick one and test that, but a change might break a mode we don't test | 18:15.12 |
Robin_Watts | ray_laptop: For mupdf and the javascript tests, I have a series of .mjs files that the cluster tests. | 18:16.23 |
| Each of those says "load this file, run these tests". | 18:16.41 |
| For ets we could have .ets files that get tested. They would say "load this file, and run it with these params". | 18:17.06 |
halko | Hey, I just have this one question: I have PDF source code from a webpage that has been made with GPL Ghostscript 8.15 and I'm trying to get it to an at least partly readable state because I need to check if it's empty or not. | 18:17.08 |
| any ideas? | 18:17.10 |
Robin_Watts | So we can just check in as many .ets files as we want tested. We can run the same actual bitmap input several different ways. | 18:17.29 |
| halko: Sorry. I don't understand the question. | 18:17.57 |
mvrhel_laptop | Robin_Watts: that sounds good. need to head to kids school for a bit | 18:18.07 |
| bbiab | 18:18.12 |
halko | yeah, I mean that I have a scrambled source code from a pdf file | 18:18.30 |
| and trying to check if it's just empty or has some content | 18:18.45 |
Robin_Watts | I still don't follow that. | 18:18.49 |
| What do you mean by 'scrambled source code' ? | 18:19.11 |
ray_laptop | halko: If you use mutool clean -d you can get a human readable PDF file out | 18:20.08 |
Robin_Watts | If you have a PDF file, and you want to know if it's got anything in then load it into a viewer and look. | 18:20.26 |
| If you want to automate that process (so you can check many files systematically), then there are tools you could use to look for text etc in them. | 18:21.00 |
| but I don't see how GPL Ghostscript comes into this. You need to define the problem more clearly, sorry. | 18:21.20 |
ray_laptop | halko: also you can open the file using ghostscript with the -dPDFDEBUG flag and it will print out debug messages as it processes the PDF | 18:21.28 |
| halko: if gs 8.15 is saying that the PDF is corrupted and it's not managing to automatically 'repair' it, then newer Ghostscript (9.06) or mupdf / mutool can do a better job than 8.15 | 18:22.49 |
halko | technically the problem is that I don't have a file | 18:22.55 |
| only the code ripped from a web pages source code | 18:23.08 |
Robin_Watts | Web pages are written in HTML. I don't see how that relates. | 18:23.43 |
| Do you mean that you have a web app that takes some input and uses GPL Ghostscript to spit out a PDF ? | 18:24.02 |
ray_laptop | halko: if all you have is a rendered image (that some web site rendered using gs 8.15) then we can't help | 18:24.04 |
halko | and I really don't know if ghostscript has anything to do with this, I just see from the code that the pdf has been made with ghostscript | 18:24.06 |
Robin_Watts | halko: what pdf? You just said that you don't have a pdf! | 18:24.24 |
halko | I have the code of the PDF | 18:24.39 |
| :D | 18:24.39 |
Robin_Watts | "the code of the PDF" ? | 18:24.49 |
| Either you have a PDF, or you don't. There is no "code of the PDF" other than the PDF itself. | 18:25.18 |
halko | %PDF-1.4 %Ãì¢ 5 0 obj | 18:25.18 |
| stuff like that | 18:25.22 |
Robin_Watts | Right, that's a PDF. | 18:25.24 |
sebras | Robin_Watts: sounds like someone has quoted (some of) the contents of a pdf file on a webpage, and then halko reads that webpage and tries to reconstruct the pdf-file. | 18:25.29 |
Robin_Watts | sebras: As if someone has pastebinned a PDF? | 18:25.59 |
ray_laptop | halko: does the PDF file start off with %PDF-1.x (where x is 2 to 9) | 18:26.10 |
halko | yeah I'm trying to reconstruct something atleast partly readable from this | 18:26.10 |
sebras | Robin_Watts: something along those lines. I have seen it be done before.... | 18:26.13 |
halko | yeah | 18:26.17 |
| %PDF-1.4 | 18:26.27 |
ray_laptop | halko: post the PDF or give us the link to the site | 18:26.36 |
Robin_Watts | Well, if you have the whole thing, then it's conceivable that you could reverse it, but if there is binary data in there, I'd fear it was doomed to fail. | 18:26.39 |
ray_laptop | Robin_Watts: what do you mean 'reverse it' | 18:27.05 |
Robin_Watts | reverse the conversion from the raw bytes of a file to the HTMLized output. | 18:27.28 |
| Like < becomes &lt; etc. | 18:27.35 |
sebras | ray_laptop: even if you would take the webpage contents and paste into a file, offsets might be off... | 18:27.44 |
ray_laptop | Robin_Watts: why would you want to do that ? | 18:27.56 |
halko | http://pastebin.com/SzhttT7T | 18:28.10 |
sebras | ray_laptop: you should ask halko... ;) | 18:28.17 |
halko | that's what I have | 18:28.20 |
| there's probably some content in there but :D | 18:29.57 |
Robin_Watts | The offsets seem intact. | 18:32.11 |
ray_laptop | It looks like gs 9.07 gets errors trying to "repair" the PDF. Also it gets filter errors when trying to decode things, so the binary has been damaged | 18:32.31 |
Robin_Watts | yeah. | 18:32.58 |
sebras | same here. | 18:33.09 |
| in mupdf. | 18:33.13 |
Robin_Watts | Well, there is an image there, and a font, so there is probably some text. | 18:33.23 |
sebras | Robin_Watts: the /Length of a stream, is that the encoded or unencoded length? | 18:35.14 |
ray_laptop | looking at the hex for the first part of object 5's stream data (corresponding to line 6 of the .txt file) I see: | 18:35.19 |
| 00000040: 65 3E 3E 0D 0A 73 74 72 65 61 6D 0D 0A 78 C5 93 |e>>..stream..x..| | 18:35.20 |
| 00000050: 75 51 C3 89 4E C3 84 30 0C 15 C3 BB 34 C3 B0 11 |uQ..N..0....4...| | 18:35.22 |
| 00000060: 3E 26 E2 80 A1 CB 9C 38 C2 BB C2 AF 08 34 12 C2 |>&.....8.....4..| | 18:35.23 |
| 00000070: B7 19 C3 B5 06 C5 93 46 62 4E 05 15 C3 BE 5F 22 |.......FbN...._"| | 18:35.25 |
halko | could the binary be damaged because of char coding? | 18:35.26 |
ray_laptop | 00000080: 69 29 0D 0A 12 24 52 C3 BC C3 A2 C3 A7 C3 A7 25 |i)...$R........%| | 18:35.27 |
| the 69 29 0D 0A is a hint to me that something changed the (probably 0A) line ending to 0D 0A | 18:36.06 |
halko | I changed it before paste to ISO from UTF-8 | 18:36.25 |
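The hex dump above is consistent with two kinds of text-mode damage: bytes of 0x80 and above re-encoded as multi-byte UTF-8, and LF turned into CR LF. A minimal sketch, assuming the original bytes were treated as Windows-1252 text (that code page is a guess that happens to fit the `78 C5 93` seen right after `stream`, where a zlib stream would normally begin `78 9C`):

```python
ZLIB_HEADER = bytes([0x78, 0x9C])  # common start of a FlateDecode stream


def mangle_charset(raw: bytes) -> bytes:
    """Treat binary data as cp1252 text and save it as UTF-8.

    Every byte >= 0x80 grows to two or three bytes, which corrupts
    compressed streams. Bytes undefined in cp1252 would raise here.
    """
    return raw.decode("cp1252").encode("utf-8")


def mangle_newlines(raw: bytes) -> bytes:
    """Text-mode newline conversion: every LF becomes CR LF."""
    return raw.replace(b"\n", b"\r\n")


def hexdump(raw: bytes) -> str:
    """Space-separated upper-case hex, like the dump above."""
    return " ".join(f"{b:02X}" for b in raw)
```

Running `hexdump(mangle_charset(ZLIB_HEADER))` gives the same `78 C5 93` that appears in the dump, which is why the filters fail: the compressed data is no longer the byte sequence that was written.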
sebras | halko: so you pastebinned this yourself? do you have the original URL? | 18:37.19 |
halko | http://www.ylioppilastutkinto.fi/hyvan_vastauksen_piirteita/fi/2013_K_BAH_sabl.pdf | 18:37.42 |
| well it's that, but I think that it's not the same pdf file from the source code | 18:38.05 |
| and actually that's the main thing that I want to check | 18:38.14 |
sebras | halko: the latter file contains the text "Tätä tietoa ei ole vielä saatavilla. " which translates to "This information is not yet available." bu you probably read finnish... | 18:39.08 |
| bu -> but | 18:39.14 |
halko | yeah I do | 18:39.25 |
| sebras: you from finland? | 18:39.53 |
sebras | halko: no, but google translate is... ;) | 18:40.04 |
halko | yeah, well it seems to have worked quite well this time | 18:40.28 |
| hmm, it's probably the same file I think, just checked some other ones with more text and they are much bigger | 18:41.29 |
sebras | halko: so the link above... are you not able to open that in a pdf-viewer? | 18:42.50 |
halko | yeah I can | 18:43.03 |
sebras | halko: and how does the link relate to mangled pdf-file on pastebin that you sent earlier? | 18:43.56 |
halko | I was just trying to look for another pdf file from that site, and kinda hoped that it would be "behind" that one | 18:44.00 |
| and buried somewhere beneath the one I posted | 18:44.29 |
ray_laptop | henrys: you can assure Guilaume that the PS file I used to trip over the problem was specially designed to help me debug and is not something we ever expect to see in "real" PS. Also it cannot ever happen from PDF. | 18:44.31 |
| henrys: I'll add that comment to the bug | 18:45.15 |
| henrys: done | 18:47.10 |
halko | hey, but thanks to you all for the help | 18:48.00 |
| and sorry for a bit of a stupid question :) | 18:48.10 |
| (and also really nice to see a channel where people actually offer some help! Thanks really!) | 18:48.45 |
ray_laptop | bbiab. Have to go pick up contacts. | 18:48.52 |
| halko: Thanks for your appreciation. Note that some of us also respond to questions on stackoverflow | 18:49.31 |
| (kens is the most prolific there) | 18:49.45 |
henrys | paulgardiner: miles needs your miami itinerary so he can make hotel reservations | 19:20.06 |
Robin_Watts | henrys: He's flying in/out on the same flights as me. | 19:25.35 |
| so same hotel stay as for me. | 19:25.48 |
henrys | Miles prefers to get everyones itinerary, he can add in the email that is the same as you. | 19:26.51 |
| okay I told miles he's on the same flight as you, but tell him to send his itinerary | 19:31.32 |
| okay belay that miles doesn't need it… I hate being the messenger ;-( | 19:35.38 |
Robin_Watts | henrys: Sorry. | 19:43.41 |
henrys | np just whining | 19:47.08 |
mvrhel_laptop | Robin_Watts: you still there? | 20:47.49 |
| I found what is going on, and it is as I suspected | 20:48.03 |
| so the error_line for each plane gets initialized to a random initial value | 20:48.32 |
| so all is well (i.e the initial error going in, is not zero) | 20:48.50 |
| this happens in ets_plane_new | 20:49.01 |
| and is good to keep from having dots on dots exactly for the cases when we have the same levels amongst planes | 20:49.33 |
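Why a non-zero starting error keeps identical planes from stacking dots can be shown with a one-dimensional toy. This is not the ETS code: it is a bare running-error threshold, and the function names and the 0..127 range for the starting error are assumptions.

```python
import random


def dot_positions(level, width, initial_error):
    """Return the x positions where a constant ink `level` (0..255)
    produces a dot, given the error carried in at the start."""
    carry = initial_error
    positions = []
    for x in range(width):
        value = level + carry
        dot = 1 if value >= 128 else 0
        carry = value - 255 * dot
        if dot:
            positions.append(x)
    return positions


def initial_errors(n_planes, seed):
    """One pseudo-random starting error per plane, reproducible by seed."""
    rng = random.Random(seed)
    return [rng.randrange(128) for _ in range(n_planes)]
```

With `level=3` (about 1% ink) a plane starting at error 0 first fires at x=42, while one starting at error 60 first fires at x=22, so two planes carrying the same data no longer land on the same pixels.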
| at least now I know. I am going to wrap this stuff up and get it to max now | 20:49.58 |
Robin_Watts | mvrhel_laptop: Ah! Excellent. | 21:36.08 |
henrys | gp_get_realtime() is different on different platforms, why not return the same value on all platforms - windows since 1980 and unix 1970 - crud | 21:36.10 |
| I guess that's the way adobe did it. | 21:39.48 |
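The size of the mismatch henrys mentions is easy to pin down. A minimal sketch, assuming both epochs are midnight UTC on 1 January of the year named above; the helper names are hypothetical and say nothing about how `gp_get_realtime()` is actually implemented:

```python
from datetime import datetime, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
EPOCH_1980 = datetime(1980, 1, 1, tzinfo=timezone.utc)

# Seconds between the two epochs (ten years, two of them leap years).
EPOCH_OFFSET = int((EPOCH_1980 - UNIX_EPOCH).total_seconds())


def since_1980_to_unix(seconds):
    """Convert a seconds-since-1980 value to seconds-since-1970."""
    return seconds + EPOCH_OFFSET


def unix_to_since_1980(seconds):
    """Convert a seconds-since-1970 value to seconds-since-1980."""
    return seconds - EPOCH_OFFSET
```

Normalising to one epoch at the platform boundary would let callers compare values across platforms without caring which convention produced them.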
mvrhel_laptop | hmm that is odd. I can't seem to push the ETS changes | 21:51.45 |
| need to head out for a bit right now. bbiab | 21:52.21 |