public inbox for newlib@sourceware.org
* [PATCH] makedocbook: Fix false report of unhandled texinfo command
From: Jon Turney @ 2022-10-31  9:59 UTC (permalink / raw)
  To: newlib; +Cc: Jon Turney

During 'make man', makedocbook falsely reports "texinfo command
'@modifier' remains in output" while processing the setlocale(3) manpage,
which contains that literal string.

Move the check for unrecognized texinfo commands to before processing
'@@' (an escaped '@') in the texinfo source, and teach it to ignore
them.

Improve that check slightly, so it catches non-alphabetic texinfo
commands, of which there are few.

Now that we no longer have false positives, we can make unrecognized
texinfo commands fatal to manpage generation, rather than leaving them
verbatim in the generated manpage.
---
 newlib/doc/makedocbook.py | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/newlib/doc/makedocbook.py b/newlib/doc/makedocbook.py
index 4e83ab63a..00b3fc140 100755
--- a/newlib/doc/makedocbook.py
+++ b/newlib/doc/makedocbook.py
@@ -454,9 +454,6 @@ command_dispatch_dict = {
 def line_markup_convert(p):
     s = p;
 
-    # process the texinfo escape for an @
-    s = s.replace('@@', '@')
-
     # escape characters not allowed in XML
     s = s.replace('&','&amp;')
     s = s.replace('<','&lt;')
@@ -486,6 +483,15 @@ def line_markup_convert(p):
     # very hacky way of dealing with @* to force a newline
     s = s.replace('@*', '</para><para>')
 
+    # fail if there are unhandled texinfo commands
+    match = re.search('(?<!@)@[^@\s]+', s)
+    if match:
+        print("texinfo command '%s' remains in output" % match.group(), file=sys.stderr)
+        exit(1)
+
+    # process the texinfo escape for an @
+    s = s.replace('@@', '@')
+
     if (verbose > 3) and (s != p):
         print('%s-> line_markup_convert ->\n%s' % (p, s), file=sys.stderr)
 
@@ -827,10 +833,6 @@ def main(file):
 
     print(s)
 
-    # warn about texinfo commands which didn't get processed
-    match = re.search('@[a-z*]+', s)
-    if match:
-        print('texinfo command %s remains in output' % match.group(), file=sys.stderr)
 
 #
 #
-- 
2.38.1
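
The effect of the new check can be tried in isolation. The following is
a minimal sketch, not part of the patch: the helper find_unhandled() and
both input lines are made up for illustration, and the two patterns are
copied from the diff above but written as raw strings. It suggests that
the new pattern skips an escaped '@@' yet still reports a leftover
command, while the old pattern, applied after '@@' has been collapsed,
reports '@modifier'.

```python
import re

# Pattern from the patch (written as a raw string): an '@' not preceded by
# another '@', followed by one or more characters that are neither '@' nor
# whitespace.
UNHANDLED = re.compile(r'(?<!@)@[^@\s]+')

# Pattern from the old check in main(), which ran after '@@' had already
# been collapsed to '@'.
OLD_CHECK = re.compile(r'@[a-z*]+')


def find_unhandled(line):
    """Return the first unhandled texinfo command in line, or None."""
    match = UNHANDLED.search(line)
    return match.group(0) if match else None


# Hypothetical input lines, for illustration only.
escaped = 'locale names may end in [@@modifier]'
leftover = 'see @xyzzy{foo} for details'

print(find_unhandled(escaped))    # None: '@@' is an escape, not a command
print(find_unhandled(leftover))   # @xyzzy{foo}

# The old ordering: collapse '@@' first, then search.
print(OLD_CHECK.search(escaped.replace('@@', '@')).group(0))   # @modifier
```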


* Re: [PATCH] makedocbook: Fix false report of unhandled texinfo command
From: Mike Frysinger @ 2022-10-31 12:28 UTC (permalink / raw)
  To: Jon Turney; +Cc: newlib

On 31 Oct 2022 09:59, Jon Turney wrote:

i guess the feedback below isn't exactly about new code even if it's relevant ...

> +    # fail if there are unhandled texinfo commands
> +    match = re.search('(?<!@)@[^@\s]+', s)

regexes should use raw strings (i.e., this is missing the r prefix)

> +    if match:
> +        print("texinfo command '%s' remains in output" % match.group(), file=sys.stderr)

this is a little dangerous in general as match.group() could return a tuple,
and this would fail.  if you want to use %, you should force a tuple.
  print("..." % (match.group(),), ...)

> +        exit(1)

scripts should never use exit(), only sys.exit().  although i see the current
script gets this wrong in a lot of places.

also you can simplify this -- sys.exit accepts a string that it'll print to
stderr and then exit non-zero.
  sys.exit(".....")
-mike
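
The sys.exit() suggestion above can be sketched as follows. This is an
illustration only: the helper fail() and the message text are
hypothetical, and SystemExit is caught here just so the behaviour can be
inspected without ending the process. The builtin exit() is provided by
the site module for interactive use, which is why sys.exit() is
preferred in scripts.

```python
import sys


def fail(message):
    # sys.exit() with a string argument raises SystemExit(message); when
    # nothing catches it, the interpreter prints the message to stderr and
    # exits with status 1.
    sys.exit(message)


# Catch SystemExit only so this sketch can look at it; a real script
# would simply let it propagate.
try:
    fail("texinfo command '@xyzzy' remains in output")
except SystemExit as e:
    caught = e.code

print(caught)
```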

* Re: [PATCH] makedocbook: Fix false report of unhandled texinfo command
  2022-10-31 12:28 ` Mike Frysinger
@ 2022-11-04 13:50   ` Jon Turney
  2022-11-04 15:13     ` Mike Frysinger
  0 siblings, 1 reply; 4+ messages in thread
From: Jon Turney @ 2022-11-04 13:50 UTC (permalink / raw)
  To: Mike Frysinger, newlib

On 31/10/2022 12:28, Mike Frysinger wrote:
> On 31 Oct 2022 09:59, Jon Turney wrote:
> 
> i guess the feedback below isn't exactly about new code even if it's relevant ...
> 
>> +    # fail if there are unhandled texinfo commands
>> +    match = re.search('(?<!@)@[^@\s]+', s)
> 
> regexes should use raw strings (i.e., this is missing the r prefix)
> 
>> +    if match:
>> +        print("texinfo command '%s' remains in output" % match.group(), file=sys.stderr)
> 
> this is a little dangerous in general as match.group() could return a tuple,
> and this would fail.  if you want to use %, you should force a tuple.
>    print("..." % (match.group(),), ...)

I must be missing something here, because I'm not sure in what sense 
'could' is being used. I think match.group() is defined to be the same 
as match.group(0), which always returns a single string. (I guess this 
could be written explicitly).

So the danger pointed out seems to be that if parts of the code were 
changed, without making other necessary changes, it would be wrong. But 
I'm not sure the suggested change just to avoid that future possibility 
makes things better or clearer.

> 
>> +        exit(1)
> 
> scripts should never use exit(), only sys.exit().  although i see the current
> script gets this wrong in a lot of places.
> 
> also you can simplify this -- sys.exit accepts a string that it'll print to
> stderr and then exit non-zero.
>    sys.exit(".....")

Yes, this is one of the first things I wrote in python, and 
unfortunately, it shows.

I've posted a revised patch-set.


* Re: [PATCH] makedocbook: Fix false report of unhandled texinfo command
From: Mike Frysinger @ 2022-11-04 15:13 UTC (permalink / raw)
  To: Jon Turney; +Cc: newlib

On 04 Nov 2022 13:50, Jon Turney wrote:
> On 31/10/2022 12:28, Mike Frysinger wrote:
> > On 31 Oct 2022 09:59, Jon Turney wrote:
> >> +    if match:
> >> +        print("texinfo command '%s' remains in output" % match.group(), file=sys.stderr)
> > 
> > this is a little dangerous in general as match.group() could return a tuple,
> > and this would fail.  if you want to use %, you should force a tuple.
> >    print("..." % (match.group(),), ...)
> 
> I must be missing something here, because I'm not sure in what sense 
> 'could' is being used. I think match.group() is defined to be the same 
> as match.group(0), which always returns a single string. (I guess this 
> could be written explicitly).

you are correct that match.group() is match.group(0) which guarantees to
return a string.  when i wrote this, i didn't have access to the python
manual, so i checked against help() in the python repl.  the docs there
are much more terse and don't include the situations as to when a str or
a tuple would be returned.

> So the danger pointed out seems to be that if parts of the code were 
> changed, without making other necessary changes, it would be wrong. But 
> I'm not sure the suggested change just to avoid that future possibility 
> makes things better or clearer.

i've been bitten by refactors breaking % usage that lacks test coverage
(which, practically speaking, i think you could say most of the python
code here lacks unittests, especially in error code paths), so i have an
allergy to code that uses % without guaranteeing somehow that a tuple is
being passed like this code is doing now.

i grok that `% foo` vs `% (foo,)` looks a bit funky when you aren't used
to it, but i think this is more of a muscle memory thing -- this is the
only way to write a single element tuple.

i would have suggested using a f-string, but i don't know what versions
of python you want to support as f-strings require Python 3.6+.

of course if you really want to use % with a single element, i think it's
wrong, but i'm not going to make a continued stink about it :p.
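
The failure mode discussed here can be reproduced with a short sketch.
It is an illustration under assumptions, not code from makedocbook.py:
the pattern with two capture groups and the sample input are
hypothetical, chosen only so that group() can return either a str or a
tuple.

```python
import re

# Hypothetical pattern with two capture groups, for illustration only.
match = re.search(r'@(\w+)\{(\w+)\}', 'see @code{foo}')

whole = match.group()        # same as match.group(0): always a str
parts = match.group(1, 2)    # several groups at once: a tuple

# A bare str operand is fine.
print("command '%s'" % whole)

# A bare tuple operand is unpacked as the argument list, so a single %s
# with a two-element tuple raises TypeError.
try:
    "command '%s'" % parts
    error = None
except TypeError as e:
    error = e
print(type(error).__name__)

# Wrapping the operand in a one-element tuple works for either type.
print("command '%s'" % (whole,))
print("command '%s'" % (parts,))
```

So a refactor that changes group() to group(1, 2) would break the bare
form only at runtime, on the error path, whereas the (x,) form keeps
working.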

> >> +        exit(1)
> > 
> > scripts should never use exit(), only sys.exit().  although i see the current
> > script gets this wrong in a lot of places.
> > 
> > also you can simplify this -- sys.exit accepts a string that it'll print to
> > stderr and then exit non-zero.
> >    sys.exit(".....")
> 
> Yes, this is one of the first things I wrote in python, and 
> unfortunately, it shows.

i wrote python like a C programmer for quite a long time,
and it took a while to reform my brain for python idioms
-mike
