MuPDF IRC logs

20180516
moolc	is it somehow possible to convince mutool to subset fonts used in a pdf?	05:59.36
tor8	Robin_Watts: I thought fz_image was supposed to hold the image data in its compressed form?	09:22.25
	looking at pdf_load_image ... I see we always decompress and store the decompressed data.	09:22.45
Robin_Watts	It does in most cases, why?	09:22.50
tor8	Bug 699345 has a file with a lot of jbig2 images	09:23.38
	we're decompressing all of them even if we don't render	09:23.49
Robin_Watts	jbig2 gets decompressed, cos jbig2 is special.	09:24.09
	jbig2 was always done in a "special" way in MuPDF for as long as I can remember.	09:24.46
	I'd love for it not to be.	09:24.51
tor8	pdf_load_image calls pdf_load_compressed_stream calls pdf_load_image_stream calls pdf_open_image_stream and fz_read_all	09:25.01
	pdf_open_image_stream calls pdf_open_filter which always builds the full decompression chain	09:25.37
Robin_Watts	Well, then how do we end up with anything in the fz_compressed_data case?	09:26.24
tor8	I have no idea	09:26.34
	this code is a twisty maze of functions, all alike	09:26.43
	I tried with a JPEG and that doesn't seem to hit the same code path	09:28.16
Robin_Watts	I've just tried with lion.pdf (a jpeg)	09:29.12
	and we get to pdf_load_compressed_stream.	09:29.20
	and that returns with jpeg data in the buffer.	09:30.10
tor8	ah, spotted it.	09:31.07
	hidden in build_filter we stop short if we have a compression_params object	09:31.27
Robin_Watts	/* If we were using params we were passed in, and we successfully	09:31.36
	* recognised the image type, we can use the existing filter and	09:31.38
	* shortstop here. */	09:31.39
	if (params != &local_params && params->type != FZ_IMAGE_RAW)	09:31.41
	return fz_keep_stream(ctx, chain); /* nothing to do */	09:31.42
	yeah.	09:31.44
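The short stop quoted above can be modelled in isolation. This is a minimal sketch, not MuPDF code: `toy_build_filter`, `toy_params` and the two enums are hypothetical names, and the model assumes a single filter per stream. It only shows why a caller that passes its own params for a recognised type gets the compressed bytes back, while everyone else gets decoded data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; these are not MuPDF's types. */
enum toy_image_type { TOY_IMAGE_RAW, TOY_IMAGE_JPEG };
enum toy_filter { TOY_FILTER_OTHER, TOY_FILTER_DCT };

struct toy_params { enum toy_image_type type; };

/*
 * Returns 1 if a decoder would be pushed onto the chain (caller reads
 * decompressed bytes), or 0 if the chain is left untouched (caller reads
 * the still-compressed bytes).
 */
int toy_build_filter(enum toy_filter filter, struct toy_params *params)
{
	struct toy_params local_params = { TOY_IMAGE_RAW };

	assert(filter == TOY_FILTER_OTHER || filter == TOY_FILTER_DCT);

	/* Callers that do not care about the compressed form pass NULL. */
	if (params == NULL)
		params = &local_params;

	/* Record the image type when the filter is one we recognise. */
	params->type = (filter == TOY_FILTER_DCT) ? TOY_IMAGE_JPEG : TOY_IMAGE_RAW;

	/* Short stop: caller supplied params and the type was recognised. */
	if (params != &local_params && params->type != TOY_IMAGE_RAW)
		return 0;

	return 1; /* otherwise decode as usual */
}
```

In this model, a filter type that the recognition step does not know about always falls through to full decoding, whether or not the caller passed params.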
tor8	so why can't we do it for jbig2dec?	09:32.08
Robin_Watts	It's the same trick I did in the PDF agent for Picsel about 20 years ago :)	09:32.10
	tor8: I don't know.	09:32.14
tor8	is it because it's awkward to stow away the jbig2 'globals' stream data as well?	09:32.39
Robin_Watts	As I say, jbig2dec has always been a special case.	09:32.43
	tor8: Not a clue, but if that was it, I'd have thought we could cope.	09:33.02
tor8	JPXDecode is the one I remember as being really special (because of how it was shoehorned into the spec)	09:33.24
Robin_Watts	oh, crap.	09:34.18
	Then jbig2dec is NOT special.	09:34.24
	Sorry, I always get jbig2 and jp2k confused, cos I am a fool.	09:34.39
tor8	whoever named these formats is the bigger fool ...	09:35.00
	jpeg, jbig2, jpeg2000, jpeg-xr ... give it a rest already!	09:35.16
inflex	jErks.	09:53.20
tor8	Robin_Watts: some commits on tor/master for review (including one that puts jbig2 images in compressed buffers)	10:39.50
	I think you've already approved the first 4 but I'm not 100% sure	10:40.27
Robin_Watts	looking	10:41.43
	The resources one...	10:46.38
	Ok, I'm happy, I think.	10:47.21
tor8	fab. I'll just let the cluster finish before I push though.	10:48.23
Robin_Watts	ok, all look good.	10:48.35
razi	Hi there, is there sample code that shows how to copy the text contained in a rectangle?	15:44.03
inflex	razi, the mupdf-gl (or others) viewer has that in there, stext_search or something I think.	15:50.41
	stext-search.c: fz_highlight_selection()	15:52.16
	sorry, I meant fz_copy_selection()	15:52.56
razi	inflex: Indeed, I used fz_copy_selection() to extract text, but sometimes it returns the whole line when I just want to select a single word. So I think maybe I used it the wrong way.	15:54.58
	inflex: maybe you can check this code, please? https://beepaste.io/paste/view/Nj3642	15:59.44
inflex	I just did a word-copy variation the other day, relies on a single-click (right) and it selects the whole word around that	16:04.12
	it's likely exceedingly inefficient, but it does work atm	16:04.25
	razi, might be of help to you - https://github.com/inflex/mupdf/blob/master/source/fitz/stext-search.c#L223	16:08.05
razi	inflex: thanks I'll check it.	16:08.55
inflex	I don't know how it'll work on selecting a word within a selection; my code relies just on pagea = pageb	16:10.31
	for all I know, there's probably a vastly more succinct way to do what I did, I just tried to stumble through the dark with this one.	16:11.22
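The idea of selecting the whole word around a single click can be sketched on a plain string. This is not the code from the linked repository and it uses no MuPDF calls: `toy_word_bounds` is a hypothetical helper, and it assumes the click has already been mapped to a character index within one line of text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: expand a clicked character index to the bounds of
 * the surrounding word. On success returns 1 and sets [*start, *end) to
 * the word's half-open index range. Returns 0 if the click is past the
 * end of the line or lands on whitespace.
 */
int toy_word_bounds(const char *line, size_t click, size_t *start, size_t *end)
{
	size_t len, s, e;

	assert(line != NULL && start != NULL && end != NULL);

	len = strlen(line);
	if (click >= len || isspace((unsigned char)line[click]))
		return 0;

	/* Walk left until the previous character is whitespace. */
	s = click;
	while (s > 0 && !isspace((unsigned char)line[s - 1]))
		s--;

	/* Walk right until whitespace or the end of the line. */
	e = click + 1;
	while (e < len && !isspace((unsigned char)line[e]))
		e++;

	*start = s;
	*end = e;
	return 1;
}
```

A real viewer would work on positioned glyphs rather than a string, but the left and right scans would have the same shape.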
razi	inflex: Well, you didn't use fz_copy_selection(). I will also try to extract text by traversing text blocks/lines to see if it works.	16:16.12
inflex	razi, the enumerate function gets used by the fz_copy_selection()	16:18.21
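Copying the text inside a rectangle by walking the characters directly can be sketched as below. The structs are toy stand-ins, not MuPDF's structured-text types, and the rule used here (keep a character when the centre of its box lies inside the selection) is an assumption chosen for illustration, not something stated in the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for positioned text; not MuPDF's types. */
struct toy_rect { float x0, y0, x1, y1; };
struct toy_char { int c; struct toy_rect bbox; };

/* True if the centre of box b lies inside the selection r. */
static int toy_center_in_rect(struct toy_rect b, struct toy_rect r)
{
	float cx = (b.x0 + b.x1) / 2;
	float cy = (b.y0 + b.y1) / 2;
	return cx >= r.x0 && cx < r.x1 && cy >= r.y0 && cy < r.y1;
}

/*
 * Copy every character whose centre falls inside sel into out, in array
 * order. Always NUL-terminates and silently drops characters that do not
 * fit in cap bytes. Returns the number of characters copied.
 */
size_t toy_copy_rect(const struct toy_char *chars, size_t n,
	struct toy_rect sel, char *out, size_t cap)
{
	size_t i, len = 0;

	assert(out != NULL && cap > 0);

	for (i = 0; i < n; i++)
		if (toy_center_in_rect(chars[i].bbox, sel) && len + 1 < cap)
			out[len++] = (char)chars[i].c;

	out[len] = '\0';
	return len;
}
```

Testing per character rather than per line is what stops a narrow rectangle from pulling in the whole line.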
razi	inflex: yeah, you are right. Is your application using this single-click feature public?	16:32.34
inflex	The code is there on github. It's been done for a specific task, choosing to optimise workflow at the expense of other things. I was considering changing it from right-click to double-left.	16:39.10
razi	inflex: Ok, thanks for your time.	16:45.57
inflex	you're welcome, best of luck.	16:47.04

Log of #mupdf at irc.freenode.net.