public inbox for gcc@gcc.gnu.org
From: Gabriel Dos Reis <gdr@integrable-solutions.net>
To: gcc@gcc.gnu.org
Subject: Warning for trigraphs in comment?
Date: Sun, 18 May 2003 13:37:00 -0000
Message-ID: <m34r3sxw7j.fsf@uniton.integrable-solutions.net>

[-- Attachment #1: Type: text/plain, Size: 1552 bytes --]


This 

<quote>

   Second, consider the comment line.  Did you notice that it ends oddly,
   with a "/"?

       // What will the next line do? Increment???????????/
                                                          ^

   Nikolai Smirnov writes:

      "Probably, what's happened in the program is obvious for you but I
      lost a couple of days debugging a big program where I made a
      similar error.  I put a comment line ending with a lot of question
      marks accidentally releasing the 'Shift' key at the end.  The
      result is unexpected trigraph sequence '??/' which was converted to
      '\' (phase 1) which was annihilated with the following '\n' (phase
      2)."  [1]

   The "??/" sequence is converted to '\' which, at the end of a line, is
   a line-splicing directive (surprise!).  In this case, it splices the
   following line "++x;" to the end of the comment line and thus makes
   the increment part of the comment.  The increment is never executed.

   Interestingly, if you look at the GNU g++ documentation for the
   -Wtrigraphs command-line switch, you will encounter the following
   statement:

      "Warnings are not given for trigraphs within comments, as they do
      not affect the meaning of the program."  [2]

   That may be true most of the time, but here we have a case in point --
   from real-world code, no less -- where this expectation does not hold.

</quote>

is extracted from Herb's GotW #86 -- full message appended below.
I would suggest we warn for trigraphs in comments.

-- Gaby



[-- Attachment #2: Type: message/rfc822, Size: 6408 bytes --]

From: Herb Sutter <hsutter@gotw.ca>
Subject: Guru of the Week #86: Solution
Date: 17 May 2003 17:04:31 -0400
Message-ID: <a8mccvcai5tpq5pogb6hluh0cp970ukrgj@4ax.com>


 -------------------------------------------------------------------
   Guru of the Week problems and solutions are posted regularly on
    news:comp.lang.c++.moderated. For past problems and solutions
      see the GotW archive at www.GotW.ca. (c) 2003 H.P.Sutter
            News archives may keep copies of this article.
 -------------------------------------------------------------------

______________________________________________________________________

GotW #86:   Slight Typos? Graphic Language and Other Curiosities

Difficulty: 5 / 10
______________________________________________________________________



>Answer the following questions without using a compiler.
>
>1. What is the output of the following program on a
>   standards-conforming C++ compiler?
>
>    #include <iostream>
>
>    int main()
>    {
>      int x = 1;
>      for( int i = 0; i < 100; ++i );
>        // What will the next line do? Increment???????????/
>        ++x;
>      std::cout << x;
>    }

Assuming that there is no invisible whitespace at the end of the
comment line, the output is "1".

There are two tricks here, one obvious and one less so.

First, consider the for loop line:

  for( int i = 0; i < 100; ++i );
                                ^

There's a semicolon at the end, a "curiously recurring typo pattern"
that (usually accidentally) makes the body of the for loop just the
empty statement.  Even though the following lines may be indented, and
may even have braces around them, they are not part of the body of the
for loop.  This was a deliberate red herring: because of the next
point, there is no increment statement left to repeat at all (even
though there appears to be one), so it doesn't matter that the loop
body is empty.  This brings us to the second point:
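
As an aside on the first point, here is a minimal sketch of the
empty-statement loop on its own, with the trigraph comment left out so
that the increment really is compiled.  The function names are
hypothetical and not part of the original puzzle.

```cpp
#include <cassert>

// Hypothetical helper: same shape as the puzzle's loop, but without the
// trigraph comment, so the increment is ordinary code.
int value_after_empty_loop() {
    int x = 1;
    for (int i = 0; i < 100; ++i)
        ;       // the lone semicolon is the entire loop body
    ++x;        // indented in the puzzle, but outside the loop: runs once
    return x;
}

// Hypothetical self-check: x goes from 1 to 2, not to 101.
void check_empty_loop() {
    assert(value_after_empty_loop() == 2);
}
```

Without the trigraph, the program would therefore print "2" rather
than "101".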

Second, consider the comment line.  Did you notice that it ends oddly,
with a "/"?

    // What will the next line do? Increment???????????/
                                                       ^

Nikolai Smirnov writes:

   "Probably, what's happened in the program is obvious for you but I
   lost a couple of days debugging a big program where I made a
   similar error.  I put a comment line ending with a lot of question
   marks accidentally releasing the 'Shift' key at the end.  The
   result is unexpected trigraph sequence '??/' which was converted to
   '\' (phase 1) which was annihilated with the following '\n' (phase
   2)."  [1]

The "??/" sequence is converted to '\' which, at the end of a line, is
a line-splicing directive (surprise!).  In this case, it splices the
following line "++x;" to the end of the comment line and thus makes
the increment part of the comment.  The increment is never executed.
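
The two phases can be imitated on a string to watch the splice
happen.  This is only a rough sketch, not how any real preprocessor is
written: it handles just the one trigraph that matters here, and all
function names are made up for illustration.  The source string is
assembled from pieces so that the sketch itself contains no trigraph.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Phase 1, simplified: replace only the question-mark, question-mark,
// slash trigraph with a backslash.  The other eight trigraphs are omitted.
std::string replace_backslash_trigraph(const std::string& src) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (i + 2 < src.size() && src[i] == '?' && src[i + 1] == '?'
            && src[i + 2] == '/') {
            out += '\\';
            i += 2;  // skip the rest of the trigraph
        } else {
            out += src[i];
        }
    }
    return out;
}

// Phase 2: delete every backslash that is immediately followed by a newline.
std::string splice_lines(const std::string& src) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;  // drop both the backslash and the newline
        } else {
            out += src[i];
        }
    }
    return out;
}

// Hypothetical demo: a shortened version of the comment line and the
// statement that follows it.
std::string spliced_comment_demo() {
    const std::string marks(2, '?');
    const std::string source = "// Increment" + marks + "/\n++x;\n";
    return splice_lines(replace_backslash_trigraph(source));
}

// Hypothetical self-check: the statement ends up on the comment line.
void check_splice() {
    assert(spliced_comment_demo() == "// Increment++x;\n");
}
```

After both phases the text is a single line, "// Increment++x;", so
the increment is comment text by the time comments are recognized.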

Interestingly, if you look at the GNU g++ documentation for the
-Wtrigraphs command-line switch, you will encounter the following
statement:

   "Warnings are not given for trigraphs within comments, as they do
   not affect the meaning of the program."  [2]

That may be true most of the time, but here we have a case in point --
from real-world code, no less -- where this expectation does not hold.


>2. How many distinct errors should be reported when compiling the
>   following code on a conforming C++ compiler?
>
>    struct X {
>      static bool f( int* p )
>      {
>        return p && 0[p] and not p[1:>>p[2];
>      };
>    };
>


The short answer is:  Zero.  This code is perfectly legal and
standards-conforming (whether the author might have wanted it to be or
not).

Let's consider in turn each of the expressions that might be
questionable, and see why they're really okay:

 - 0[p] is legal and is defined to have the same meaning as "p[0]".
   In C (and C++), an expression of the form x[y], where one of x and
   y is a pointer type and the other is an integer value, always means
   *(x+y).  In this case, 0[p] and p[0] have the same meaning because
   they mean *(0+p) and *(p+0), respectively, which comes out to the
   same thing.  For more details, see clause 6.5.2.1 in the C99
   standard [3].

 - "and" and "not" are valid keywords that are alternative spellings
   of && and !, respectively.

 - :> is legal.  It is a digraph for the "]" character, not a smiley
   (smileys are unsupported in the C++ language outside comment
   blocks, which is rather a shame).  This turns the final part of the
   expression into "p[1]>p[2]".

 - The "extra" semicolon is allowed after a member function definition
   inside a class body.
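
To make the points above concrete, here is a sketch that puts the
struct from the question next to a spelled-out version and compares
the two on a few inputs.  The names Y and same_results are
hypothetical.  Note that the unary "not" binds more tightly than ">",
so it applies to p[1] alone.

```cpp
#include <cassert>

// The struct from question 2, as posted.
struct X {
  static bool f( int* p )
  {
    return p && 0[p] and not p[1:>>p[2];
  };
};

// Hypothetical spelled-out equivalent (not from the original post).
// Unary "not" binds tighter than ">", so the last operand is (!p[1]) > p[2].
struct Y {
  static bool f( int* p )
  {
    return p && p[0] && ((!p[1]) > p[2]);
  }
};

// Returns true when both versions agree on a few sample inputs,
// including a null pointer (both short-circuit on it).
bool same_results() {
  int a[] = {1, 0, 0};
  int b[] = {1, 5, 0};
  int c[] = {0, 0, 0};
  int d[] = {1, 0, 1};
  return X::f(a) == Y::f(a) && X::f(b) == Y::f(b)
      && X::f(c) == Y::f(c) && X::f(d) == Y::f(d)
      && X::f(0) == Y::f(0);
}
```

Some compilers need a conformance switch before they accept "and" and
"not" without an extra header, so treat this as a sketch for a
standards-conforming setup.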

Of course, it could well be that the colon ":"  was a typo and the
author really meant "p[1]>>p[2]", but even if it was a typo it's still
(unfortunately, in that case) perfectly legal code.


Acknowledgements
----------------

Thanks to Nikolai Smirnov for contributing part of the Example 1 code;
I added the for loop line.


References
----------

[1] N. Smirnov, private communication.

[2] A Google search for "trigraphs within comments" yields this and
several other interesting and/or amusing hits.

[3] ISO/IEC 9899:1999 (E), International Standard, Programming
Languages -- C.


---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee)     (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal         (www.gotw.ca/cuj)
Visual C++ program manager, Microsoft      (www.gotw.ca/microsoft)

      [ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
      [ about comp.lang.c++.moderated. First time posters: do this! ]

[-- Attachment #3: Type: text/plain, Size: 53 bytes --]



-- 
Gabriel Dos Reis,	gdr@integrable-solutions.net
