Good afternoon,

Unfortunately, output from pdfdraw is not parsable by LibXML2.

= How to reproduce error =

[From console]
> wget http://archive.org/download/lawofthehayes00ewinrich/lawofthehayes00ewinrich.pdf && pdfdraw -ttt lawofthehayes00ewinrich/lawofthehayes00ewinrich.pdf > law.xml
> wget http://pastebin.ca/raw/2089594 -O main.c
// compile and link to the LibXML2 libraries with -o parser
> parser law.xml
law.xml:9: parser error : Opening and ending tag mismatch: char line 0 and span
</span>
^
law.xml : failed to parse

Please patch the library to fix this bug.

Thanks,
Alec Taylor
Created attachment 7991
Proposed fix for the problem...
After patch, here is my output:

> parser law5.xml 1> log1.txt 2> log2.txt
log1.txt = http://pastebin.com/ZtPVdbEq
log2.txt = http://pastebin.com/aBKW1x3k

Please fix this as soon as you can. Thanks for the effort!

Alec Taylor
The output for this file passes http://www.xmlvalidation.com just fine. If the latest version from git still goes wrong, please reopen this bug and attach the .xml and the errors again. If it's possible to restrict it to a single page (or a smaller page range) that goes wrong, that would help too. Thanks.
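
For anyone who wants to check a generated file locally before reopening, here is a minimal sketch of a well-formedness check that reports where the first error occurs. It is an assumption-laden stand-in, not the reporter's main.c: it uses Python's standard xml.etree.ElementTree (expat) rather than LibXML2, so the error wording will differ from the messages quoted above, and the function name first_xml_error is hypothetical.

```python
import sys
import xml.etree.ElementTree as ET


def first_xml_error(data):
    """Return None if data is well-formed XML, else (line, column, message).

    Hypothetical helper for illustration; it uses expat via ElementTree,
    not LibXML2, so messages differ from the LibXML2 parser output.
    """
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        line, column = err.position  # position of the first reported error
        return line, column, str(err)
    return None


if __name__ == "__main__":
    # Check every file named on the command line, e.g. law.xml.
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            result = first_xml_error(handle.read())
        if result is None:
            print(f"{path}: well-formed")
        else:
            line, column, message = result
            print(f"{path}:{line}:{column}: {message}")
```

The reported line number can help narrow the failure down to a single page of output before attaching the .xml to the bug.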