public inbox for docbook-tools-discuss@sourceware.org
From: Tim Waugh <twaugh@redhat.com>
To: Mark Whitis <whitis@freelabs.com>
Cc: "Éric Bischoff" <e.bischoff@noos.fr>,
	docbook-tools-discuss@sources.redhat.com
Subject: Re: multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks)
Date: Fri, 20 Dec 2002 19:23:00 -0000	[thread overview]
Message-ID: <20020411141357.J1633@redhat.com> (raw)
In-Reply-To: <Pine.LNX.4.33.0204101911230.18156-100000@cervantes.freelabs.com>; from whitis@freelabs.com on Wed, Apr 10, 2002 at 10:40:59PM -0400

[-- Attachment #1: Type: text/plain, Size: 3426 bytes --]

Hi Mark,

Thanks for your feedback.

> Thanks for the quick response.  I applied that patch directly to
> /usr/bin/jw and it sorta-kinda fixed the problem.  Still, it is a
> kludge rather than a proper bugfix. docbook2html still can't be used
> as a proper filter, for example:
> 
>    <generate_docbook> | docbook2html ... | tidy ... | ...

Well then all the other backends are 'broken', if you take that
attitude.  I think a more useful approach is to have consistent
behaviour across all the backends: that of generating one or more
output files in the current (or a specified) directory.  That's what
the man page says it does.

> This is un*x.  Filters should be able to take input on standard in
> and send output to standard out with errors to standard error.

If jw were to output to stdout, it would (in general) need to send a
tar file!
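
To make the point concrete: a backend that writes several chunk files could only act as a stdout filter by bundling them into one stream. The following is a minimal Python sketch of that idea, not something jw does. The function name `write_tar_stream` and the in-memory `{name: bytes}` mapping are assumptions made for illustration.

```python
import io
import tarfile

def write_tar_stream(files, stream):
    """Pack a {name: bytes} mapping into an uncompressed tar archive on stream.

    Hypothetical helper for illustration only; it is not part of jw.
    """
    # Mode "w|" writes a non-seekable tar stream, which suits a pipe.
    with tarfile.open(fileobj=stream, mode="w|") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
```

A caller would pass `sys.stdout.buffer` as `stream`, and the next stage of the pipeline would then have to unpack the archive before it could run something like tidy on each file.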

> The blow chunks mode is also probably a serious security
> hole in many situations (it creates files on the host system with
> names based on text supplied by the untrustworthy remote user who
> supplied the file).   Don't believe me?  Try this
>      <chapter id="/etc/youarescrewed">

Yes, this is an interesting attack.  The docbook-dsssl package by
default makes up its own names for output files when chunking; the Red
Hat Linux docbook-utils package comes with a default custom stylesheet
which turns on a feature to use IDs as filenames.  We'll be correcting
that shortly.
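
As an illustration of the general class of fix (not the actual change made to the docbook-utils stylesheet), here is a minimal Python sketch that only uses an ID as a filename when it is a plain token, and otherwise falls back to a generated name. The helper name `chunk_filename`, the allowed character set, and the `chunkNNNN` fallback scheme are all assumptions.

```python
import os
import re

# Hypothetical whitelist: letters, digits, dot, underscore, hyphen,
# and the first character must be a letter or digit.
_SAFE_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")

def chunk_filename(element_id, index, out_dir, ext=".html"):
    """Return a path inside out_dir for one chunk.

    Hypothetical helper; not part of jw or the DSSSL stylesheets.
    """
    if element_id and _SAFE_ID.fullmatch(element_id) and ".." not in element_id:
        name = element_id + ext
    else:
        # Untrusted or odd ID: use a generated name (scheme invented here).
        name = "chunk%04d%s" % (index, ext)
    out_root = os.path.realpath(out_dir)
    path = os.path.realpath(os.path.join(out_root, name))
    # Defence in depth: the resolved path must stay under out_root.
    if os.path.commonpath([out_root, path]) != out_root:
        raise ValueError("chunk path escapes output directory: %r" % element_id)
    return path
```

With this sketch, an ID such as `/etc/youarescrewed` never reaches the filesystem as a path; the chunk gets a generated name in the output directory instead.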

> Denial of service attack:  Let's suppose that on a system with
> a 65536 inode limit, I process a malicious file which has 65536
> <chapter>'s.

I can say the same thing about tar files (for example).

> On a related note, Docbook2html files actually need to be tidy'ed so
> badly that you might consider making a call to tidy (with
> configurable options), a built-in option (or better yet, fix the
> generator - but that is probably jade).  The output is technically
> legal HTML but the formatting violates the spirit of HTML.

The output is determined by the stylesheets.  They are the way they
are because of technical details---significant whitespace is the
reason for '>' being separate to the rest of the element, for example.

I'm sure that Norm would welcome patches that make the HTML output
nicer to read.  How's your DSSSL? ;-)

(On the other hand, who is it that is editing generated output rather
than editing the source?)

> Another question: does either 0.6.9 or the upcoming release fix
> the "URL not supported" problem?   docbook2html chokes on the DOCTYPE
> in files generated by abiword:
> <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
> 	"http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd"

For a long time the Red Hat Linux openjade package came with HTTP
support disabled.  It is enabled in the current package (in
Skipjack).

But you might want to consider using an XSL processor for DocBook
XML.  Take a look at the xmlto package for a way to start.

> Now, this appears to be at least two bugs:
>    - URL in DOCTYPE is unimplemented feature

(Actually a feature that defaults to 'disabled'.)

>    - failure to use a good catch-all document type where an exact
>      stylesheet match is not found.

This is an unreasonable requirement and would just generate bogus bug
reports.  People should install the DTD for the document they are
processing.

Tim.
*/

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

WARNING: multiple messages have this Message-ID
From: Tim Waugh <twaugh@redhat.com>
To: Mark Whitis <whitis@freelabs.com>
Cc: "Éric Bischoff" <e.bischoff@noos.fr>,
	docbook-tools-discuss@sources.redhat.com
Subject: Re: multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks)
Date: Thu, 11 Apr 2002 07:17:00 -0000	[thread overview]
Message-ID: <20020411141357.J1633@redhat.com> (raw)
Message-ID: <20020411071700.UqIAvdb-sDdespHprh0kFLf-g56oqeCOYozvmgSQCrQ@z> (raw)
In-Reply-To: <Pine.LNX.4.33.0204101911230.18156-100000@cervantes.freelabs.com>; from whitis@freelabs.com on Wed, Apr 10, 2002 at 10:40:59PM -0400


  parent reply	other threads:[~2002-04-11 14:17 UTC|newest]

Thread overview: 12+ messages
2002-12-20 19:23 BUG: docbook2html --nochunks Mark Whitis
2002-04-09 17:57 ` Mark Whitis
2002-12-20 19:23 ` Éric Bischoff
2002-04-09 23:57   ` Éric Bischoff
2002-12-20 19:23   ` Tim Waugh
2002-04-10  0:03     ` Tim Waugh
2002-12-20 19:23     ` multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks) Mark Whitis
2002-04-10 19:40       ` Mark Whitis
2002-12-20 19:23       ` Tim Waugh [this message]
2002-04-11  7:17         ` Tim Waugh
2002-12-20 19:23       ` New location of the "Crash Course.to DocBook" Éric Bischoff
2002-04-11  3:25         ` Éric Bischoff
