public inbox for gcc@gcc.gnu.org
* Warning for trigraphs in comment?
@ 2003-05-18 13:37 Gabriel Dos Reis
  2003-05-18 18:54 ` Zack Weinberg
  2003-05-18 19:36 ` Neil Booth
  0 siblings, 2 replies; 15+ messages in thread
From: Gabriel Dos Reis @ 2003-05-18 13:37 UTC
  To: gcc

[-- Attachment #1: Type: text/plain, Size: 1552 bytes --]


This 

<quote>

   Second, consider the comment line.  Did you notice that it ends oddly,
   with a "/"?

       // What will the next line do? Increment???????????/
                                                          ^

   Nikolai Smirnov writes:

      "Probably, what's happened in the program is obvious for you but I
      lost a couple of days debugging a big program where I made a
      similar error.  I put a comment line ending with a lot of question
      marks accidentally releasing the 'Shift' key at the end.  The
      result is unexpected trigraph sequence '??/' which was converted to
      '\' (phase 1) which was annihilated with the following '\n' (phase
      2)."  [1]

   The "??/" sequence is converted to '\' which, at the end of a line, is
   a line-splicing directive (surprise!).  In this case, it splices the
   following line "++x;" to the end of the comment line and thus makes
   the increment part of the comment.  The increment is never executed.

   Interestingly, if you look at the Gnu g++ documentation for the
   -Wtrigraphs command-line switch, you will encounter the following
   statement:

      "Warnings are not given for trigraphs within comments, as they do
      not affect the meaning of the program."  [2]

   That may be true most of the time, but here we have a case in point --
   from real-world code, no less -- where this expectation does not hold.

</quote>

is extracted from Herb's GotW #86 -- full message appended below.
I would suggest we warn for trigraphs in comments.

-- Gaby



[-- Attachment #2: Type: message/rfc822, Size: 6408 bytes --]

From: Herb Sutter <hsutter@gotw.ca>
Subject: Guru of the Week #86: Solution
Date: 17 May 2003 17:04:31 -0400
Message-ID: <a8mccvcai5tpq5pogb6hluh0cp970ukrgj@4ax.com>


 -------------------------------------------------------------------
   Guru of the Week problems and solutions are posted regularly on
    news:comp.lang.c++.moderated. For past problems and solutions
      see the GotW archive at www.GotW.ca. (c) 2003 H.P.Sutter
            News archives may keep copies of this article.
 -------------------------------------------------------------------

______________________________________________________________________

GotW #86:   Slight Typos? Graphic Language and Other Curiosities

Difficulty: 5 / 10
______________________________________________________________________



>Answer the following questions without using a compiler.
>
>1. What is the output of the following program on a
>   standards-conforming C++ compiler?
>
>    #include <iostream>
>
>    int main()
>    {
>      int x = 1;
>      for( int i = 0; i < 100; ++i );
>        // What will the next line do? Increment???????????/
>        ++x;
>      std::cout << x;
>    }

Assuming that there is no invisible whitespace at the end of the
comment line, the output is "1".

There are two tricks here, one obvious and one less so.

First, consider the for loop line:

  for( int i = 0; i < 100; ++i );
                                ^

There's a semicolon at the end, a "curiously recurring typo pattern"
that (usually accidentally) makes the body of the for loop just the
empty statement.  Even though the following lines may be indented, and
may even have braces around them, they are not part of the body of the
for loop.  This was a deliberate red herring -- in this case, because
of the next point, it doesn't matter that the for loop never repeats
any statements because there's no increment statement to be repeated
at all (even though there appears to be one).  This brings us to the
second point:

Second, consider the comment line.  Did you notice that it ends oddly,
with a "/"?

    // What will the next line do? Increment???????????/
                                                       ^

Nikolai Smirnov writes:

   "Probably, what's happened in the program is obvious for you but I
   lost a couple of days debugging a big program where I made a
   similar error.  I put a comment line ending with a lot of question
   marks accidentally releasing the 'Shift' key at the end.  The
   result is unexpected trigraph sequence '??/' which was converted to
   '\' (phase 1) which was annihilated with the following '\n' (phase
   2)."  [1]

The "??/" sequence is converted to '\' which, at the end of a line, is
a line-splicing directive (surprise!).  In this case, it splices the
following line "++x;" to the end of the comment line and thus makes
the increment part of the comment.  The increment is never executed.

Interestingly, if you look at the Gnu g++ documentation for the
-Wtrigraphs command-line switch, you will encounter the following
statement:

   "Warnings are not given for trigraphs within comments, as they do
   not affect the meaning of the program."  [2]

That may be true most of the time, but here we have a case in point --
from real-world code, no less -- where this expectation does not hold.
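
As a rough illustration of the two phases described above, the sketch
below imitates trigraph replacement (phase 1) and line splicing
(phase 2) on an ordinary string.  The function names
replace_trigraphs and splice_lines are made up for this example and
are not part of any compiler or library.  The test input would be
written as "?\?/" inside a string literal, so that the example itself
is not affected by trigraph processing.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: imitates phase 1 by replacing each of the nine
// trigraphs with the single character it stands for, scanning left to right.
std::string replace_trigraphs(const std::string& src)
{
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (i + 2 < src.size() && src[i] == '?' && src[i + 1] == '?') {
            char repl = 0;
            switch (src[i + 2]) {
                case '=':  repl = '#';  break;
                case '/':  repl = '\\'; break;
                case '\'': repl = '^';  break;
                case '(':  repl = '[';  break;
                case ')':  repl = ']';  break;
                case '!':  repl = '|';  break;
                case '<':  repl = '{';  break;
                case '>':  repl = '}';  break;
                case '-':  repl = '~';  break;
            }
            if (repl != 0) {
                out += repl;
                i += 2;  // the loop increment skips the third character
                continue;
            }
        }
        out += src[i];
    }
    return out;
}

// Hypothetical helper: imitates phase 2 by deleting every backslash
// that is immediately followed by a newline, together with that newline.
std::string splice_lines(const std::string& src)
{
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;  // drop the backslash here; the loop increment drops the newline
            continue;
        }
        out += src[i];
    }
    return out;
}
```

Applying splice_lines(replace_trigraphs(text)) to a comment line that
ends in the trigraph, followed by a line containing "++x;", yields a
single line in which "++x;" sits after the "//" and is therefore part
of the comment.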


>2. How many distinct errors should be reported when compiling the
>   following code on a conforming C++ compiler?
>
>    struct X {
>      static bool f( int* p )
>      {
>        return p && 0[p] and not p[1:>>p[2];
>      };
>    };
>


The short answer is:  Zero.  This code is perfectly legal and
standards-conforming (whether the author might have wanted it to be or
not).

Let's consider in turn each of the expressions that might be
questionable, and see why they're really okay:

 - 0[p] is legal and is defined to have the same meaning as "p[0]".
   In C (and C++), an expression of the form x[y], where one of x and
   y is a pointer type and the other is an integer value, always means
   *(x+y).  In this case, 0[p] and p[0] have the same meaning because
   they mean *(0+p) and *(p+0), respectively, which comes out to the
   same thing.  For more details, see clause 6.5.2.1 in the C99
   standard [3].

 - "and" and "not" are valid keywords that are alternative spellings
   of && and !, respectively.

 - :> is legal.  It is a digraph for the "]" character, not a smiley
   (smileys are unsupported in the C++ language outside comment
   blocks, which is rather a shame).  This turns the final part of the
   expression into "p[1]>p[2]".

 - The "extra" semicolon is allowed after a member function
   definition inside a class body.

Of course, it could well be that the colon ":"  was a typo and the
author really meant "p[1]>>p[2]", but even if it was a typo it's still
(unfortunately, in that case) perfectly legal code.
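
To make the points above concrete, here is a small sketch that places
the puzzle code next to the same expression written with the usual
tokens.  The function f_spelled_out is a hypothetical name introduced
only for this comparison.  The parentheses show how the expression
groups: "not" applies to p[1] alone, and the result is then compared
with p[2].

```cpp
// The struct from the question, unchanged.
struct X {
  static bool f( int* p )
  {
    return p && 0[p] and not p[1:>>p[2];
  };  // a semicolon after the in-class definition is permitted
};

// Hypothetical helper: the same expression with conventional spelling
// and explicit grouping.
inline bool f_spelled_out( int* p )
{
  return p && p[0] && ( (!p[1]) > p[2] );
}
```

Both functions should agree for every input.  For instance, with the
array { 5, 0, 1 } both return false, because !p[1] is true (1) and
1 > 1 does not hold.  A compiler may warn about applying "not" to the
left operand of a comparison, which is a fair hint that the original
line deserves a second look.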


Acknowledgements
----------------

Thanks to Nikolai Smirnov for contributing part of the Example 1 code;
I added the for loop line.


References
----------

[1] N. Smirnov, private communication.

[2] A Google search for "trigraphs within comments" yields this and
several other interesting and/or amusing hits.

[3] ISO/IEC 9899:1999 (E), International Standard, Programming
Languages -- C.


---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee)     (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal         (www.gotw.ca/cuj)
Visual C++ program manager, Microsoft      (www.gotw.ca/microsoft)

      [ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
      [ about comp.lang.c++.moderated. First time posters: do this! ]

[-- Attachment #3: Type: text/plain, Size: 53 bytes --]



-- 
Gabriel Dos Reis,	gdr@integrable-solutions.net

Thread overview: 15+ messages
2003-05-18 13:37 Warning for trigraphs in comment? Gabriel Dos Reis
2003-05-18 18:54 ` Zack Weinberg
2003-05-18 20:01   ` Neil Booth
2003-05-18 20:14     ` Zack Weinberg
2003-05-19  6:57   ` Gabriel Dos Reis
2003-05-18 19:36 ` Neil Booth
2003-05-19  6:51   ` Gabriel Dos Reis
2003-05-19 13:22     ` Paul Koning
2003-05-19 14:35       ` Zack Weinberg
2003-05-19 14:37         ` Paul Koning
2003-05-19 14:51           ` Gabriel Dos Reis
2003-05-19 14:55             ` Paul Koning
2003-05-19 18:11             ` Joe Buck
2003-05-19 20:04               ` Neil Booth
2003-05-19 15:10           ` Zack Weinberg
