From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 32164 invoked by alias); 16 Nov 2007 14:25:57 -0000
Received: (qmail 32155 invoked by uid 22791); 16 Nov 2007 14:25:56 -0000
X-Spam-Check-By: sourceware.org
Received: from fmmailgate01.web.de (HELO fmmailgate01.web.de) (217.72.192.221)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 16 Nov 2007 14:25:51 +0000
Received: from smtp07.web.de (fmsmtp07.dlan.cinetic.de [172.20.5.215])
    by fmmailgate01.web.de (Postfix) with ESMTP id 936F2B19DA5E
    for ; Fri, 16 Nov 2007 15:25:48 +0100 (CET)
Received: from [139.30.1.27] (helo=[139.30.1.27])
    by smtp07.web.de with asmtp (TLSv1:AES256-SHA:256)
    (WEB.DE 4.108 #208) id 1It28i-0002ZI-00
    for docbook-tools-discuss@sources.redhat.com; Fri, 16 Nov 2007 15:25:48 +0100
Subject: bug in docbook-tools text backend using w3m
From: Christian Bünnig
To: docbook-tools-discuss@sources.redhat.com
Content-Type: text/plain
Date: Fri, 16 Nov 2007 14:25:00 -0000
Message-Id: <1195223148.10704.22.camel@ume>
Mime-Version: 1.0
X-Mailer: Evolution 2.12.0
Content-Transfer-Encoding: 7bit
X-Sender: masala@web.de
X-IsSubscribed: yes
Mailing-List: contact docbook-tools-discuss-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help:
Sender: docbook-tools-discuss-owner@sourceware.org
X-SW-Source: 2007/txt/msg00009.txt.bz2

Hey,

I've run into a bug in the text backend of the docbook tools. It affects
the conversion from HTML to text with w3m.

The backend creates a temporary HTML file and converts that file to text.
However, the temporary file has no '.html' suffix, and in that case w3m
does not convert it to plain text: the resulting .txt file still contains
HTML.

Below is a workaround. I just appended '.html' to the variable HTML
(line 22). Now it works.

---------- snip -----------
# Backend to convert something into ASCII text
# Send any comments to Eric Bischoff
# This program is under GPL license. See LICENSE file for details.

if [ -x /usr/bin/lynx ]
then
  CONVERT=/usr/bin/lynx
  ARGS="-force_html -dump -nolist -width=72"
elif [ -x /usr/bin/links ]
then
  CONVERT=/usr/bin/links
  ARGS="-dump"
elif [ -x /usr/bin/w3m ]
then
  CONVERT=/usr/bin/w3m
  ARGS="-dump"
else
  echo >&2 "No way to convert HTML to text found."
  exit 1
fi

HTML=$(mktemp /tmp/html-XXXXXX).html || exit 1
trap 'rm -f "$HTML"; exit' 0 1 2 3 7 13 15

# Convert to HTML
$SGML_JADE -V nochunks -t sgml ${SGML_ARGUMENTS} >${HTML}
if [ $? -ne 0 ]
then exit 1
fi

# Convert from HTML to ASCII
${CONVERT} ${ARGS} ${HTML} > "${SGML_FILE_NAME}.txt"
if [ $? -ne 0 ]
then exit 2
fi

exit 0
---------- snip -----------

Here is some information about my configuration:

- DocBook-utils version 0.6.14 (jw version 1.1)
- w3m version 0.5.1+cvs-1.968
- Ubuntu 7.10

Btw, the e-mail address no longer seems to be valid.

Regards,
Christian
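
A possible refinement of the workaround, as a rough sketch only: with `$(mktemp /tmp/html-XXXXXX).html`, mktemp creates the suffix-less file, the script then writes to a different name, and the original temporary file is left behind in /tmp. One way around this is to create a private temporary directory and put a file with an '.html' name inside it. The helper name `make_html_tmp` and the file name `out.html` are made up for illustration and are not part of docbook-utils.

```shell
#!/bin/sh
# Sketch: get a temporary file name ending in ".html" without leaving
# the suffix-less mktemp file behind.

# make_html_tmp is a hypothetical helper, not part of docbook-utils.
make_html_tmp() {
    # Create a private temporary directory and name the file inside it.
    tmpdir=$(mktemp -d /tmp/html-XXXXXX) || return 1
    printf '%s\n' "$tmpdir/out.html"
}

HTML=$(make_html_tmp) || exit 1

# Remove the whole temporary directory when the script exits;
# turn common signals into a normal exit so the cleanup runs.
trap 'rm -rf "$(dirname "$HTML")"' 0
trap 'exit 1' 1 2 3 13 15

# Alternative for w3m only (untested with this w3m version): keep the
# original file name and state the input type explicitly with -T.
# ARGS="-T text/html -dump"
```

The rest of the backend could then keep using ${HTML} as before. The commented-out alternative relies on w3m's -T option for specifying the content type of the input, which would make the file suffix irrelevant, similar to what -force_html does for lynx.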