* BUG: docbook2html --nochunks
@ 2002-12-20 19:23 Mark Whitis
  2002-04-09 17:57 ` Mark Whitis
  2002-12-20 19:23 ` Éric Bischoff
  0 siblings, 2 replies; 12+ messages in thread
From: Mark Whitis @ 2002-12-20 19:23 UTC
To: docbook-tools-discuss

If you tell docbook2html not to blow chunks, it writes the output
html document to standard out.  This is fine except for the fact
that it also outputs status messages to standard out (instead
of sending them to stderr where they belong).  So you end up
with an invalid document because the status messages are mixed in.

So, you get stuff like this:
   Using catalogs: /etc/sgml/sgml-docbook-4.0.cat
   Using stylesheet: /usr/share/sgml/docbook/utils-0.6/docbook-utils.dsl#html
   Working on: /home/whitis/docbook/sample.docbook
   [...]
   Done.

--
Mark Whitis   http://www.freelabs.com/~whitis/   NO SPAM
Author of many open source software packages.
Coauthor: Linux Programming Unleashed (1st Edition)
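The separation asked for above can be sketched in a few lines of shell.
This is only a minimal illustration, not the docbook-utils code: the
function names status and convert are hypothetical, and the single HTML
line merely stands in for whatever the real backend would emit.

```shell
#!/bin/sh
# Hypothetical sketch: keep the document on stdout, diagnostics on stderr.

# status: print a progress message on stderr (file descriptor 2).
status() {
    printf '%s\n' "$*" >&2
}

# convert: stand-in for a converter; only the document reaches stdout.
convert() {
    status "Working on: $1"
    printf '<html><body>%s</body></html>\n' "$1"
    status "Done."
}
```

With this layout, "convert sample.docbook | tidy" would hand only the
HTML to the next stage of the pipeline, while the progress messages
would still show up on the terminal.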
* Re: BUG: docbook2html --nochunks
  2002-12-20 19:23 BUG: docbook2html --nochunks Mark Whitis
  2002-04-09 17:57 ` Mark Whitis
@ 2002-12-20 19:23 ` Éric Bischoff
  2002-04-09 23:57   ` Éric Bischoff
  2002-12-20 19:23   ` Tim Waugh
  1 sibling, 2 replies; 12+ messages in thread
From: Éric Bischoff @ 2002-12-20 19:23 UTC
To: Mark Whitis, docbook-tools-discuss, twaugh

On Wednesday 10 April 2002 02:58, Mark Whitis wrote:
> If you tell docbook2html not to blow chunks, it writes the output
> html document to standard out.  This is fine except for the fact
> that it also outputs status messages to standard out (instead
> of sending them to stderr where they belong).  So you end up
> with an invalid document because the status messages are mixed in.
>
> So, you get stuff like this:
>    Using catalogs: /etc/sgml/sgml-docbook-4.0.cat
>    Using stylesheet: /usr/share/sgml/docbook/utils-0.6/docbook-utils.dsl#html
>    Working on: /home/whitis/docbook/sample.docbook
>    [...]
>    Done.

I think this has already been fixed. Tim ?
* Re: BUG: docbook2html --nochunks
  2002-12-20 19:23 ` Éric Bischoff
  2002-04-09 23:57   ` Éric Bischoff
@ 2002-12-20 19:23   ` Tim Waugh
  2002-04-10  0:03     ` Tim Waugh
  2002-12-20 19:23     ` multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks) Mark Whitis
  1 sibling, 2 replies; 12+ messages in thread
From: Tim Waugh @ 2002-12-20 19:23 UTC
To: Éric Bischoff; +Cc: Mark Whitis, docbook-tools-discuss

On Wed, Apr 10, 2002 at 08:56:48AM +0200, Éric Bischoff wrote:

> I think this has already been fixed. Tim ?

Indeed.  Here is the patch I used:

--- docbook-utils-0.6.9/bin/jw.in.nochunks	Tue Jul  3 14:57:32 2001
+++ docbook-utils-0.6.9/bin/jw.in	Tue Jul  3 14:59:52 2001
@@ -369,7 +369,12 @@
 cd $SGML_OUTPUT_DIRECTORY
 export SGML_JADE SGML_FILE_NAME SGML_ARGUMENTS
 export SGML_CATALOG_FILES SGML_BASE_DIR SGML_FILE SGML_STYLESHEET
-sh $SGML_BACKEND
+if [ -z "$SGML_NOCHUNKS" ]
+then
+  sh $SGML_BACKEND
+else
+  sh $SGML_BACKEND >$SGML_FILE_NAME.html
+fi
 SGML_RETURN=$?
 cd $SGML_CURRENT_DIRECTORY

I hope to have time to look at making a new release in the next few
weeks.

Tim.
*/
* Re: multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks)
  2002-12-20 19:23   ` Tim Waugh
  2002-04-10  0:03     ` Tim Waugh
@ 2002-12-20 19:23     ` Mark Whitis
  2002-04-10 19:40       ` Mark Whitis
                         ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Mark Whitis @ 2002-12-20 19:23 UTC
To: Tim Waugh; +Cc: Éric Bischoff, docbook-tools-discuss

Sorry if this message seems overly negative.  I only have time to
mention what is broken, not to talk about what works or to contribute
patches for what is broken.  Thanks to everyone who is making Docbook,
SGML, and HTML document preparation feasible.  I coauthored an 800 page
linux book; docbook is MUCH better than the publisher's dreadful
MS-Word style sheet based monstrosity.

Versions in use (Red Hat RPMs):
   openjade-1.3-13
   docbook-utils-0.6-13
(I'd need to download half of skipjack to install newer versions;
this is only a little bit of an exaggeration if preliminary
examination and past experience are any indication.)

On Wed, 10 Apr 2002, Tim Waugh wrote:
> Indeed.  Here is the patch I used:

Thanks for the quick response.  I applied that patch directly to
/usr/bin/jw and it sorta-kinda fixed the problem.  Still, it is a
kludge rather than a proper bugfix.  docbook2html still can't be used
as a proper filter, for example:

   <generate_docbook> | docbook2html ... | tidy ... | ...

This is un*x.  Filters should be able to take input on standard in
and send output to standard out with errors to standard error.
Obviously, you can't do that if you have it set to blow chunks,
but that is a special mode of operation suitable only for
documents that are not only very large but also written by someone
you trust.

The blow chunks mode is also probably a serious security
hole in many situations (it creates files on the host system with
names based on text supplied by the untrustworthy remote user who
supplied the file).  Don't believe me?
Try this
   <chapter id="/etc/youarescrewed">
Jade will complain about "/" in an id but will still happily create
or overwrite /etc/youarescrewed if you run it as root.  Even as a
non-root user, there are plenty of files which can be compromised,
such as ".ssh/authorized_keys".

Yes, there will probably be a ".html" tacked onto the end of the
file name.  That DOES NOT mean you are safe.  Not only might there
be another hole that lets an attacker get rid of the ".html" (perhaps
as simple as a buffer overflow), there are places where a file can
do serious damage even with a ".html" extension, like in the
directories "/etc/rc.d/rc3.d/" (executable flag probably required)
and /etc/xinetd.d/ (executable flag not required).  And if
docbook2html is not run as root, there may still be dot file
directories whose contents will be executed even with a .html
extension.  Not to mention an HTML file itself may be the target of
an attack; here, let me just trojan this copy of the Nimda worm
(which is carried in .html files) into ~/public_html/index.html.
Or maybe I will distribute a trojan docbook document which
overwrites /www/index.html with
   7|-|15 5173 15 0\/\/|\|3D BY 7|-|3 D00D
   (THIS SITE IS OWNED BY THE DUDE)
If the person who reads that document has access to /www, there
will be trouble.

So, no document should ever be processed in blow chunks mode unless
you personally wrote it.  Which also means no document should
be distributed that needs blow chunks mode to be processed.
And blow chunks mode should definitely NOT be the default for
docbook2html.  And no document should ever be processed by
docbook2html in a cgi-bin unless the document was written by a
user on that server.  So, you can forget about:
 - An HTML translation service that allows web users who do not
   have docbook translation software on their computer to enter
   a URL of a docbook or upload it and view the results.
 - Any server that publishes documents from anyone who isn't
   absolutely trustworthy and accepts those documents in docbook
   format and generates html.  So much for the LDP, linux.org, etc.
   One clown writes a trojan "Post-It note mini-HOWTO" and the
   server is compromised.  It is not like the people administering
   those servers are likely to have time to validate every revision
   of every document.
 - A secure non-shell ISP which requires people to create their
   web pages in XML/SGML instead of using Frontpage.
 - A web browser which supports docbook by using docbook2html as
   an external filter.

At least using docbook2html as a standard unix filter, if that were
actually allowed, would be inherently more secure.  Of course, you
still can't allow sloppy code leading to buffer overflows in
openjade, jw, etc.

Lest some naive person suggest that fixing jade so it will not
accept "/" in a chapter "id" is the fix for the problem, history
has repeatedly demonstrated that deny-known-bad is NOT an effective
security procedure.  Someone will, using a hypothetical example,
find out that the kernel also accepts ASCII character 254 as a
synonym for "/".  Or that id="%2Fetc%2Fpasswd" gets through.
Yes, jade needs to be fixed to allow only an extremely limited
character set to appear in filenames based on document supplied
text; for example, every single character other than A-Za-z0-9
should be translated to an underscore, multiple consecutive
underscores should be compressed to one, leading and trailing
underscores should be deleted, the total length should be limited,
and if appropriate a number should be added to the end to prevent
two identifiers from mapping to the same file name.

Denial of service attack: Let's suppose that on a system with
a 65536 inode limit, I process a malicious file which has 65536
<chapter>'s.  Again, don't blow chunks unless you seriously trust
the document author.
Even if it was a 300 page book, I would run as a pipe producing a
single document unless I had a good reason to trust the author.

On a related note, Docbook2html files actually need to be tidy'ed so
badly that you might consider making a call to tidy (with
configurable options) a built-in option (or better yet, fix the
generator - but that is probably jade).  The output is technically
legal HTML but the formatting violates the spirit of HTML.

> I hope to have time to look at making a new release in the next few
> weeks.

Another question: does either 0.6.9 or the upcoming release fix
the "URL not supported" problem?  docbook2html chokes on the DOCTYPE
in files generated by abiword:
   <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
   "http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd"
Results in
   URL not supported by this version
   DTD did not contain element declaration for document type name
   element "BOOK" undefined
   element "CHAPTER" undefined
   element "SECTION" undefined
   element "TITLE" undefined
   element "PARA" undefined
   ...

Now, this appears to be at least two bugs:
  - URL in DOCTYPE is an unimplemented feature
  - failure to use a good catch-all document type where an exact
    stylesheet match is not found.  If someone ran docbook2html,
    they have already said the document is in some form of docbook.
    The program should not puke because someone omitted a doctype or
    uses a doctype different from or newer than the style sheets on
    my system.

works:
   <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.0//EN">
   <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
pukes:
   <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN">

Now it is very likely that at some point, a docbook user is going
to receive a document in a newer version of docbook than their
system recognizes.
One of the primary reasons for using XML/SGML markup in the
first place instead of horrible proprietary formats is that new
features can be added in the future but old programs can still read
the document although they may not be able to render any new
constructs.  If you process this document with a default stylesheet,
you will probably get 95% of the content or more.  You may even get
100% since a document might well be labeled "DocBook 4.2" but
actually only use "Docbook 4.1" features.

[Definition: blow chunks: (blo chungks) v. intr.
  1. (slang) To regurgitate
  2. An extremely disdainful term for running docbook2html without
     --nochunks, or a similar program in a similar mode, such that
     you generate multiple tiny output files from a single source
     file (or a single source file with other files included by
     reference).  As would be done by a webmaster with poor judgement
     on small documents, causing grief for users loading, reading,
     printing, searching, archiving, and making offline use of web
     pages.  Or as might legitimately be done by the author of a VERY
     large document such as a multihundred page book (which should
     still be available as a single file if the user wants it that
     way) that might legitimately be viewed as a number of smaller
     _but still substantial_ documents.  Dividing documents up into
     screenful sized chunks of text is something only a propeller
     head would do; it might appeal to other propeller heads who are
     easily amused by having buttons to push but it really pisses off
     people who have useful work to do.
]

--
Mark Whitis   http://www.freelabs.com/~whitis/   NO SPAM
Author of many open source software packages.
Coauthor: Linux Programming Unleashed (1st Edition)
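The allowlist rules proposed in the message above (map everything
outside A-Za-z0-9 to an underscore, squeeze runs, trim the ends, limit
the length, add a number on collision) can be sketched in shell.  This
is a hedged illustration only, not how jade or the stylesheets behave:
the names sanitize_id and unique_name, the default limit of 64, and the
fallback stem "chunk" are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the allowlist rules proposed above; not jade code.

# sanitize_id ID [MAXLEN]: reduce an untrusted id to a safe file name stem.
sanitize_id() {
    max=${2:-64}    # length limit; 64 is an assumption
    # Replace every byte outside A-Za-z0-9 with "_" and squeeze runs,
    # drop a leading "_", cap the length, then drop a trailing "_".
    clean=$(printf '%s' "$1" |
        LC_ALL=C tr -cs 'A-Za-z0-9' '_' |
        sed 's/^_//' |
        cut -c "1-$max" |
        sed 's/_$//')
    # An id with no usable characters falls back to a fixed stem.
    printf '%s\n' "${clean:-chunk}"
}

# unique_name DIR STEM: append _2, _3, ... until DIR/STEM.html is unused.
unique_name() {
    name=$2
    n=1
    while [ -e "$1/$name.html" ]; do
        n=$((n + 1))
        name="${2}_$n"
    done
    printf '%s\n' "$name"
}
```

For example, sanitize_id '/etc/youarescrewed' prints etc_youarescrewed,
and sanitize_id '%2Fetc%2Fpasswd' prints 2Fetc_2Fpasswd, so neither id
can name a path outside the output directory.  The existence check in
unique_name is not atomic; a real implementation would also have to
create the file in a way that refuses to overwrite.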
* New location of the "Crash Course.to DocBook"
  2002-12-20 19:23     ` multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks) Mark Whitis
  2002-04-10 19:40       ` Mark Whitis
@ 2002-12-20 19:23       ` Éric Bischoff
  2002-04-11  3:25         ` Éric Bischoff
  2002-12-20 19:23       ` multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks) Tim Waugh
  2 siblings, 1 reply; 12+ messages in thread
From: Éric Bischoff @ 2002-12-20 19:23 UTC
To: Docbook tools

Hi all,

As I'm leaving Caldera to start my own company, I had to move the
location of the "Crash Course to DocBook".

Former address was:
http://www.caldera.de/~eric/crash-course/HTML/index.html

New address is:
http://www.bureau-cornavin.com/opensource/crash-course/index.html

Sorry for the inconvenience.

<spam mode="for-those-interested">
The "Bureau Cornavin" is a new company specialized in:
- translation of technical documents (English to German, French,
  Italian, Spanish, Portuguese, Brazilian, Czech, Romanian, Turkish,
  Hungarian and Polish; German to French).
- documentation writing
- documentation and XML expertise
</spam>
* Re: multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks)
  2002-12-20 19:23     ` multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks) Mark Whitis
  2002-04-10 19:40       ` Mark Whitis
  2002-12-20 19:23       ` New location of the "Crash Course.to DocBook" Éric Bischoff
@ 2002-12-20 19:23       ` Tim Waugh
  2002-04-11  7:17         ` Tim Waugh
  2 siblings, 1 reply; 12+ messages in thread
From: Tim Waugh @ 2002-12-20 19:23 UTC
To: Mark Whitis; +Cc: Éric Bischoff, docbook-tools-discuss

Hi Mark,

Thanks for your feedback.

> Thanks for the quick response.  I applied that patch directly to
> /usr/bin/jw and it sorta-kinda fixed the problem.  Still, it is a
> kludge rather than a proper bugfix.  docbook2html still can't be used
> as a proper filter, for example:
>
>    <generate_docbook> | docbook2html ... | tidy ... | ...

Well then all the other backends are 'broken', if you take that
attitude.  I think a more useful approach is to have consistent
behaviour across all the backends: that of generating one or more
output files in the current (or a specified) directory.  That's what
the man page says it does.

> This is un*x.  Filters should be able to take input on standard in
> and send output to standard out with errors to standard error.

If jw were to output to stdout, it would (in general) need to send a
tar file!

> The blow chunks mode is also probably also a serious security
> hole in many situations (it creates files on the host system with
> names based on text supplied by the untrustworthy remote user who
> supplied the file).  Don't believe me?  Try this
>    <chapter id="/etc/youarescrewed">

Yes, this is an interesting attack.  The docbook-dsssl package by
default makes up its own names for output files when chunking; the Red
Hat Linux docbook-utils package comes with a default custom stylesheet
which turns on a feature to use IDs as filenames.  We'll be correcting
that shortly.

> Denial of service attack: Let's suppose that on a system with
> a 65536 inode limit, I process a malicious file which has 65536
> <chapter>'s.

I can say the same thing about tar files (for example).

> On a related note, Docbook2html files actually need to be tidy'ed so
> badly that you might consider making a call to tidy (with
> configurable options) a built-in option (or better yet, fix the
> generator - but that is probably jade).  The output is technically
> legal HTML but the formatting violates the spirit of HTML.

The output is determined by the stylesheets.  They are the way they
are because of technical details---significant whitespace is the
reason for '>' being separate to the rest of the element, for example.

I'm sure that Norm would welcome patches that make the HTML output
nicer to read.  How's your DSSSL? ;-)

(On the other hand, who is it that is editing generated output rather
than editing the source?)

> Another question: does either 0.6.9 or the upcoming release fix
> the "URL not supported" problem?  docbook2html chokes on the DOCTYPE
> in files generated by abiword:
>    <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
>    "http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd"

For a long time the Red Hat Linux openjade package came with HTTP
support disabled.  It is enabled in the current package (in Skipjack).

But you might want to consider using an XSL processor for DocBook
XML.  Take a look at the xmlto package for a way to start.

> Now, this appears to be at least two bugs:
>   - URL in DOCTYPE is unimplemented feature

(Actually a feature that defaults to 'disabled'.)

>   - failure to use a good catch-all document type where an exact
>     stylesheet match is not found.

This is an unreasonable requirement and would just generate bogus bug
reports.  People should install the DTD for the document they are
processing.

Tim.
*/
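The denial-of-service scenario quoted above (one document producing
65536 output files) could, as one possible mitigation, be caught by a
pre-flight count before a document is processed in chunked mode.  The
sketch below is a rough heuristic under stated assumptions, not part of
jw or the stylesheets: the function names are hypothetical, the list of
chunk-starting elements (chapter, appendix, sect1) is a guess at what a
stylesheet would chunk on, and a grep over the source is no substitute
for a real SGML parse.

```shell
#!/bin/sh
# Hypothetical pre-flight check; element list and limit are assumptions.

# count_chunks FILE: count opening tags assumed to start a new output file.
count_chunks() {
    grep -o -i -E '<(chapter|appendix|sect1)([[:space:]>]|$)' "$1" |
        wc -l | tr -d '[:space:]'
}

# check_chunk_limit FILE LIMIT: fail if FILE would exceed LIMIT chunks.
check_chunk_limit() {
    n=$(count_chunks "$1")
    if [ "$n" -gt "$2" ]; then
        printf 'refusing %s: %s chunks exceeds limit %s\n' "$1" "$n" "$2" >&2
        return 1
    fi
}
```

A wrapper could run check_chunk_limit on an untrusted document first
and fall back to single-file output when the check fails.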
* Re: multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks)
  2002-12-20 19:23 ` multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks) Tim Waugh
@ 2002-04-11 7:17 ` Tim Waugh
  0 siblings, 0 replies; 12+ messages in thread
From: Tim Waugh @ 2002-04-11 7:17 UTC
To: Mark Whitis; +Cc: Éric Bischoff, docbook-tools-discuss

Hi Mark,

Thanks for your feedback.

> Thanks for the quick response. I applied that patch directly to
> /usr/bin/jw and it sorta-kinda fixed the problem. Still, it is a
> kludge rather than a proper bugfix. docbook2html still can't be used
> as a proper filter, for example:
>
> <generate_docbook> | docbook2html ... | tidy ... | ...

Well then all the other backends are 'broken', if you take that
attitude. I think a more useful approach is to have consistent
behaviour across all the backends: that of generating one or more
output files in the current (or a specified) directory. That's what
the man page says it does.

> This is un*x. Filters should be able to take input on standard in
> and send output to standard out with errors to standard error.

If jw were to output to stdout, it would (in general) need to send a
tar file!

> The blow chunks mode is also probably a serious security
> hole in many situations (it creates files on the host system with
> names based on text supplied by the untrustworthy remote user who
> supplied the file). Don't believe me? Try this
> <chapter id="/etc/youarescrewed">

Yes, this is an interesting attack. The docbook-dsssl package by
default makes up its own names for output files when chunking; the
Red Hat Linux docbook-utils package comes with a default custom
stylesheet which turns on a feature to use IDs as filenames. We'll be
correcting that shortly.

> Denial of service attack: Let's suppose that on a system with
> a 65536 inode limit, I process a malicious file which has 65536
> <chapter>'s.

I can say the same thing about tar files (for example).

> On a related note, Docbook2html files actually need to be tidy'ed so
> badly that you might consider making a call to tidy (with
> configurable options), a built-in option (or better yet, fix the
> generator - but that is probably jade). The output is technically
> legal HTML but the formatting violates the spirit of HTML.

The output is determined by the stylesheets. They are the way they
are because of technical details---significant whitespace is the
reason for '>' being separate to the rest of the element, for example.

I'm sure that Norm would welcome patches that make the HTML output
nicer to read. How's your DSSSL? ;-)

(On the other hand, who is it that is editing generated output rather
than editing the source?)

> Another question: does either 0.6.9 or the upcoming release fix
> the "URL not supported" problem? docbook2html chokes on the DOCTYPE
> in files generated by abiword:
> <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
> "http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd"

For a long time the Red Hat Linux openjade package came with HTTP
support disabled. It is enabled in the current package (in Skipjack).

But you might want to consider using an XSL processor for DocBook
XML. Take a look at the xmlto package for a way to start.

> Now, this appears to be at least two bugs:
> - URL in DOCTYPE is unimplemented feature

(Actually a feature that defaults to 'disabled'.)

> - failure to use a good catch-all document type where an exact
> stylesheet match is not found.

This is an unreasonable requirement and would just generate bogus bug
reports. People should install the DTD for the document they are
processing.

Tim.
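The ID-as-filename attack discussed in this message can be illustrated with a minimal Python sketch. It is not taken from docbook-utils or the DSSSL stylesheets; the function name safe_chunk_name and its parameters are hypothetical, and it only shows one plausible way a chunking tool might map an untrusted element ID to an output path that stays inside the chosen output directory.

```python
import os
import re


def safe_chunk_name(element_id: str, out_dir: str, ext: str = ".html") -> str:
    """Map an untrusted element ID to a path inside out_dir (hypothetical helper)."""
    # Replace every character outside a conservative whitelist, so path
    # separators and "." (used in "..") cannot survive into the filename.
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", element_id).strip("_")
    if not cleaned:
        # Fall back to a fixed name when nothing usable is left.
        cleaned = "chunk"

    root = os.path.abspath(out_dir)
    candidate = os.path.abspath(os.path.join(root, cleaned + ext))

    # Defence in depth: refuse any path that resolves outside out_dir.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("chunk path escapes output directory: %r" % element_id)
    return candidate


if __name__ == "__main__":
    # The ID from the example above no longer points at /etc.
    print(safe_chunk_name("/etc/youarescrewed", "out"))
```

A sketch like this only addresses the filename problem. It does nothing about the inode-exhaustion case, which would need a separate limit on the number of chunks written.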
end of thread

Thread overview: 12+ messages
2002-12-20 19:23 BUG: docbook2html --nochunks Mark Whitis
2002-04-09 17:57 ` Mark Whitis
2002-12-20 19:23 ` Éric Bischoff
2002-04-09 23:57 ` Éric Bischoff
2002-12-20 19:23 ` Tim Waugh
2002-04-10 0:03 ` Tim Waugh
2002-12-20 19:23 ` multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks) Mark Whitis
2002-04-10 19:40 ` Mark Whitis
2002-12-20 19:23 ` New location of the "Crash Course.to DocBook" Éric Bischoff
2002-04-11 3:25 ` Éric Bischoff
2002-12-20 19:23 ` multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks) Tim Waugh
2002-04-11 7:17 ` Tim Waugh