| | 20160730 |
roca | hello there, i'm trying to print the tiger.ps file with my pixma ip7250 but nothing happens. i use gs 9.19 and gsview 6.0 | 15:38.11 |
| do i need a special ps printer driver or similar? | 15:39.41 |
sebras | Robin_Watts: sorry to pester you on a saturday, but maybe you can answer whether you reviewed the two patches on sebras/master? | 16:23.29 |
| Robin_Watts: I think you did, but I can't find any answer confirming it. | 16:23.45 |
| if LGTY I will push. | 16:23.54 |
| fredross-perry: I haven't seen any complaints about the JNI-stuff, so I'm guessing it works well for you? | 18:39.46 |
| fredross-perry: I did see you mentioning the StructuredText thing to Robin though, but as I understand it this is "resolved" by saying that this is now similar to how the old app does it..? | 18:40.50 |
fredross-perry | sebras - so far so good. Right now I am working with StructuredText, which seems fine. | 18:41.03 |
| I need to look at how the old app works. | 18:41.39 |
sebras | fredross-perry: ok. even if I'm not online I do read the logs so just write something here and I'll catch it. | 18:41.55 |
fredross-perry | When you select text using the old app, it's somehow able to create a selection that has some whole lines and some partial lines. | 18:42.46 |
| In GSView we do this by actually walking thru all the chars, words, and lines ourselves and doing our own analysis. Structured Text just gives you the rects that are inside a larger Rect. | 18:43.45 |
sebras | fredross-perry: mmm, I think StructuredText aims to do this analysis for you. | 18:44.18 |
fredross-perry | So an example of that would be helpful. | 18:44.38 |
sebras | fredross-perry: at this point in time I know as much about this as you do. :) | 18:45.19 |
| fredross-perry: but I can spend some time figuring out how the old app does it. | 18:45.32 |
fredross-perry | Also I'd like to be able to find the word underneath a location. Like if I tap a page, I can highlight that word. What I tried so far is to get all the wrecks for the whole page, and then see which one contains my point. When I do that, the rect I get is for the whole line. | 18:46.05 |
| s/wrecks/rects | 18:46.18 |
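
The "word under a tap" lookup described above can be sketched independently of MuPDF. This is only an illustration under stated assumptions: per-character bounding boxes for a line are available, words are delimited by whitespace, and the names `Char`, `words_in_line`, `word_at` and `make_line` are hypothetical helpers, not part of MuPDF or its JNI bindings.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class Char:
    c: str      # the character itself
    bbox: Rect  # its bounding box on the page


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def words_in_line(chars: List[Char]) -> List[Tuple[str, Rect]]:
    """Group the characters of one line into (word, bbox) pairs, splitting on whitespace."""
    words: List[Tuple[str, Rect]] = []
    text = ""
    box: Optional[Rect] = None
    for ch in chars:
        if ch.c.isspace():
            if text:
                words.append((text, box))
            text, box = "", None
        else:
            text += ch.c
            box = ch.bbox if box is None else union(box, ch.bbox)
    if text:
        words.append((text, box))
    return words


def word_at(chars: List[Char], x: float, y: float) -> Optional[Tuple[str, Rect]]:
    """Return the (word, bbox) containing the point, or None if the point hits no word."""
    for text, (x0, y0, x1, y1) in words_in_line(chars):
        if x0 <= x <= x1 and y0 <= y <= y1:
            return text, (x0, y0, x1, y1)
    return None


def make_line(text: str, width: float = 10.0, height: float = 12.0) -> List[Char]:
    """Build a fake line of fixed-width characters starting at the origin (test data only)."""
    return [Char(c, (i * width, 0.0, (i + 1) * width, height)) for i, c in enumerate(text)]


line = make_line("tap here")
print(word_at(line, 55.0, 6.0))  # the point lies inside the second word
```

Building word boxes from character boxes, rather than testing against the rects returned for a whole region, is what keeps the hit from expanding to the full line.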
| And one more thing. In fz_copy_selection, there's code that substitutes "?" for chars that are less than 32. I have a doc where everything that's a space seems to result in "?". | 18:50.08 |
sebras | fredross-perry: I'm a little confused about how the old app does this but in platform/android/viewer/src/com/artifex/mupdfdemo/PageView.java you have TextSelector which seems to be involved. | 18:52.14 |
| fredross-perry: but spaces == 32... ok, maybe just provide the document and I can have a look. | 18:53.23 |
| fredross-perry: just put it on casper and I can get it from there. | 18:56.14 |
fredross-perry | http://ghostscript.com/~fred/building.pdf | 18:56.54 |
| if it's just <32 vs ==32, that's simple | 18:57.10 |
sebras | oh, the same one as before? | 18:57.11 |
fredross-perry | yes, same | 18:57.25 |
sebras | fredross-perry: I think the current code is right though. if c < 32 (32 being space) then c = '?'; | 18:57.50 |
| fredross-perry: so if you have a space you _should_ end up with ' ', not '?' | 18:58.02 |
fredross-perry | yes. | 18:59.16 |
| the old app JNI has an interface like this: private native TextChar[][][][] text(); | 18:59.33 |
sebras | fredross-perry: when running mutool draw -F stext building.pdf I see a lot of | 19:00.11 |
| <char bbox="333.7647 71.9733 336.8494 88.29259" x="333.7647" y="85.20001" c="	"/> | 19:00.13 |
| which seems to be tab characters. | 19:00.20 |
| these would indeed be converted to '?' | 19:00.30 |
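
To make the substitution rule concrete, here is a minimal sketch of the behaviour as described in this conversation: code points below 32 become '?', so a tab (9) is replaced while a space (32) is kept. It is not the MuPDF source, and `replace_control_chars` is a hypothetical name used only for illustration.

```python
def replace_control_chars(text: str, placeholder: str = "?") -> str:
    """Replace every character whose code point is below 32 with `placeholder`."""
    return "".join(placeholder if ord(ch) < 32 else ch for ch in text)


# A tab between two words is replaced; the ordinary space is left alone.
sample = "Main\tStreet 12"
print(replace_control_chars(sample))
```

This matches what is seen with building.pdf: if the document stores tabs where spaces would be expected, every gap between words shows up as '?'.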
fredross-perry | yes, Robin and I were chatting about this. Looks like MS Word is putting those in when converting to PDF. | 19:00.51 |
| Not sure how those should be handled. | 19:01.28 |
sebras | fredross-perry: mmm, I'm guessing one way is to ignore them. in the stext you don't see the spaces... | 19:02.04 |
fredross-perry | on the text selection front. the old app has this native interface: private native TextChar[][][][] text(); | 19:03.31 |
| from that it can derive where words and lines are. Similar to what's in GSView. | 19:04.17 |
sebras | fredross-perry: right, but then it is the app doing the interpretation of the concept of textblocks and lines and spans of characters. | 19:04.51 |
| fredross-perry: I believe that tor8's intent with StructuredText is that this analysis should be provided by the library, not the app. | 19:05.09 |
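
As a rough sketch of the app-side interpretation being discussed, the following walks a four-level nesting and derives lines and words from it. It assumes the four dimensions of `TextChar[][][][]` correspond to blocks, lines, spans and characters, which the conversation suggests but does not confirm; plain strings stand in for the characters, and `lines_of` and `words_of` are hypothetical helpers.

```python
from typing import List

# Assumed layout: page[block][line][span] is a list of single characters.
Page = List[List[List[List[str]]]]


def lines_of(page: Page) -> List[str]:
    """Join the characters of every span in a line into one string per line."""
    out: List[str] = []
    for block in page:
        for line in block:
            out.append("".join(ch for span in line for ch in span))
    return out


def words_of(page: Page) -> List[str]:
    """Split each derived line on whitespace to get the words in reading order."""
    return [word for line in lines_of(page) for word in line.split()]


# One block with two lines; the first line is made of two spans.
page: Page = [
    [
        [list("Hello "), list("world")],
        [list("second line")],
    ]
]
print(lines_of(page))
print(words_of(page))
```

Whether this walk lives in the app or in the library is exactly the design question raised here; the sketch only shows that the nested data is sufficient to recover lines and words.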
fredross-perry | Sure. But JNI could provide the structured data. Take a look at JNI_FN(MuPDFCore_text)(JNIEnv * env, jobject thiz) | 19:05.13 |
sebras | fredross-perry: mmm, that is probably the intent I believe. | 19:06.55 |
fredross-perry | It might be simple to add _text and its associated classes to the new JNI, then I can go to town. | 19:07.36 |
| one more thing: fz_copy_selection might be giving me an extra char at the beginning of the string, the char right before the actual selection. | 19:11.29 |
sebras | fredross-perry: ok, so in another pdf mutool draw -F stext actually returns spaces as well. I'd forgotten how these things work. | 19:16.55 |
| fredross-perry: the question is if the tab should be converted to a space somewhere along the way. that may be what ought to happen (as spaces can have varying width). | 19:18.13 |
| fredross-perry: there is an stext-related patch over at sebras/master. Using that I no longer see tab characters. I believe this is the correct approach. | 19:30.37 |
fredross-perry | looking ... | 19:30.53 |
sebras | so you can use that for testing at least, but I feel like the approach needs to be vetted by robin at least. | 19:31.11 |
sebras | sleeps | 19:31.25 |
fredross-perry | good night | 19:33.02 |
Robin_Watts | sebras: First 2 commits are good. | 22:43.16 |
| I disagree with mapping tab to space. | 22:43.26 |
| and other than the typo on the commit message, the top commit looks good. | 22:43.59 |
| The stext structure returned from the raw text extraction should contain the raw text, IMAO. | 22:44.40 |