[gs-code-review] Fix for 688675 Reading / Converting PDF File
created with Adobe Lifecycle Designer 7.0
Dan Coby
dan.coby at artifex.com
Mon May 15 14:00:00 PDT 2006
I thought that I would send this out for comments. The basic problem
is that the file is broken. It has xref tables with entries that have
incorrect generation number values.
Message:
Fix for 688675 Reading / Converting PDF File created with Adobe
Lifecycle Designer 7.0.
DETAILS:
The test file has xref tables with incorrect generation numbers for
several entries. The xref verification logic finds the mismatch
between the generation numbers specified in the xref and the generation
numbers in the objects. It then attempts to rebuild the xref table.
This fails since the xref rebuild logic does not handle objects stored
in compressed object streams. As a result, the file cannot be displayed.
The fix consists of three parts.
1. Change the xref verification logic to exit if compressed object
streams are detected. I am not attempting to recover objects in
compressed object streams since anything that damages a file will
generally trash any compressed data.
2. Change the logic which checks the generation number to print a
warning and then continue. Previously this logic would print a
warning and then reject the object. Rejecting an object generally
causes an error to occur as the PDF interpreter attempts to use the
object. The warning message was also enhanced to print both the
expected and actual generation numbers. The given test file
produces about 400 warning messages.
3. Enhance the comments for print_xref.
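
The behavior change in part 2 can be sketched in standalone PostScript as
follows. This is only an illustration and not the patched code: the names
xrefgens and checkgen and the sample table are hypothetical, a plain array
stands in for the Generations larray, and the table is assumed to hold
plain generation numbers.

```postscript
% Hypothetical sketch of a tolerant generation check.
% xrefgens: assumed xref generation numbers, indexed by object number.
/xrefgens [ 0 0 1 0 ] def

/checkgen {                         % <obj#> <gen#> checkgen <bool>
  1 index xrefgens exch get         % stack: obj gen xrefgen
  2 copy eq {
    pop pop pop true                % generations match: accept the object
  } {
    (   **** Warning: wrong generation: ) print
    3 -1 roll 20 string cvs print   % print obj #, stack: gen xrefgen
    ( ) print
    exch 20 string cvs print        % print gen #, stack: xrefgen
    (, xref gen#: ) print
    20 string cvs print             % print xref gen #
    (\n) print
    true                            % warn but still accept the object
  } ifelse
} bind def

2 0 checkgen pop                    % mismatch: prints a warning, leaves true
1 0 checkgen pop                    % match: prints nothing, leaves true
```

Returning true in both branches is the point of the change: a mismatch is
reported, but the object is still handed to the interpreter instead of
being rejected.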
-------------- next part --------------
Index: lib/pdf_base.ps
===================================================================
--- lib/pdf_base.ps (revision 6725)
+++ lib/pdf_base.ps (working copy)
@@ -418,15 +418,23 @@
QUIET not { % Create warning message if not QUIET
Generations 2 index lget 0 eq { % Check if object is free ...
( **** Warning: reference to free object: )
+ 2 index =string cvs concatstrings ( ) concatstrings % put obj #
+ exch =string cvs concatstrings ( R\n) concatstrings % put gen #
} {
( **** Warning: wrong generation: )
+ 2 index =string cvs concatstrings ( ) concatstrings % put obj #
+ exch =string cvs concatstrings % put gen #
+ (, xref gen#: ) concatstrings 1 index Generations % put xref gen #
+ exch lget 1 sub =string cvs concatstrings (\n) concatstrings
} ifelse
- 2 index =string cvs concatstrings ( ) concatstrings % put obj #
- exch =string cvs concatstrings ( R\n) concatstrings % put gen #
pdfformaterror % Output warning message
} { % Else QUIET ...
- pop % Pop generation umber
- } ifelse false % Return false if gen # not match
+ pop % Pop generation number
+ } ifelse
+ % We should return false for an incorrect generation number, however
+ % we are simply printing a warning and then returning true. This makes
+ % Ghostscript tolerant of bad generation numbers.
+ true
} ifelse
} bind def
/R { % <object#> <generation#> R <object>
Index: lib/pdf_main.ps
===================================================================
--- lib/pdf_main.ps (revision 6725)
+++ lib/pdf_main.ps (working copy)
@@ -497,7 +497,12 @@
search_objects
exit
} if % If the entry is invalid
- } if % If not in an object stream
+ } {
+ % The object is in an object stream. We currently do not rebuild
+ % objects in an object stream. So if we find one, then abort the
+ % verification of the xref table entries.
+ pop exit % Pop object number and then exit loop
+ } ifelse % If not in an object stream
} if % If object entry is not free
pop % Remove object number
} for
Index: lib/pdf_rbld.ps
===================================================================
--- lib/pdf_rbld.ps (revision 6725)
+++ lib/pdf_rbld.ps (working copy)
@@ -73,10 +73,10 @@
} ifelse
} bind def
-% Print the contents of the xref array. This actually consists of two
-% arrays (Objects and Generations). Both are larrays. larrays are a
-% special Ghostscript object which can be arrays with more than 64k
-% elements.
+% Print the contents of the xref array. This actually consists of three
+% arrays (Objects, Generations, and ObjectStream). All three are larrays.
+% larrays are a special Ghostscript object which can be arrays with more
+% than 64k elements.
/print_xref % - print_xref -
{ 0 1 Objects llength 1 sub % stack: 0 1 <number of objects - 1>
{ dup =only % print object number