Hi Mark,

Thanks for your feedback.

> Thanks for the quick response. I applied that patch directly to
> /usr/bin/jw and it sorta-kinda fixed the problem. Still, it is a
> kludge rather than a proper bugfix. docbook2html still can't be used
> as a proper filter, for example:
>
> | docbook2html ... | tidy ... | ...

Well then all the other backends are 'broken', if you take that
attitude. I think a more useful approach is to have consistent
behaviour across all the backends: that of generating one or more
output files in the current (or a specified) directory. That's what
the man page says it does.

> This is un*x. Filters should be able to take input on standard in
> and send output to standard out with errors to standard error.

If jw were to output to stdout, it would (in general) need to send a
tar file!

> The blow chunks mode is also probably a serious security
> hole in many situations (it creates files on the host system with
> names based on text supplied by the untrustworthy remote user who
> supplied the file). Don't believe me? Try this
>

Yes, this is an interesting attack. The docbook-dsssl package by
default makes up its own names for output files when chunking; the
Red Hat Linux docbook-utils package comes with a default custom
stylesheet which turns on a feature to use IDs as filenames. We'll be
correcting that shortly.

> Denial of service attack: Let's suppose that on a system with
> a 65536 inode limit, I process a malicious file which has 65536
> 's.

I can say the same thing about tar files (for example).

> On a related note, Docbook2html files actually need to be tidy'ed so
> badly that you might consider making a call to tidy (with
> configurable options), a built option (or better yet, fix the
> generator - but that is probably jade). The output is technically
> legal HTML but the formatting violates the spirit of HTML.

The output is determined by the stylesheets.
They are the way they are because of technical details---significant
whitespace is the reason for '>' being separate to the rest of the
element, for example. I'm sure that Norm would welcome patches that
make the HTML output nicer to read. How's your DSSSL? ;-)

(On the other hand, who is it that is editing generated output rather
than editing the source?)

> Another question: does either 0.6.9 or the upcoming release fix
> the "URL not supported" problem? docbook2html chokes on the DOCTYPE
> in files generated by abiword:
> "http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd"

For a long time the Red Hat Linux openjade package came with HTTP
support disabled. It is enabled in the current package (in Skipjack).

But you might want to consider using an XSL processor for DocBook XML.
Take a look at the xmlto package for a way to start.

> Now, this appears to be at least two bugs:
> - URL in DOCTYPE is unimplemented feature

(Actually a feature that defaults to 'disabled'.)

> - failure to use a good catch-all document type where an exact
> stylesheet match is not found.

This is an unreasonable requirement and would just generate bogus bug
reports. People should install the DTD for the document they are
processing.

Tim.