public inbox for gcc-bugs@sourceware.org
* [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit
@ 2021-01-29 10:58 boris at kolpackov dot net
  2021-01-29 11:34 ` [Bug c++/98882] " jakub at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: boris at kolpackov dot net @ 2021-01-29 10:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

            Bug ID: 98882
           Summary: ICE in in cpp_directive_only_process on empty
                    translation unit
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: boris at kolpackov dot net
  Target Milestone: ---

I get an internal compiler error when preprocessing an empty translation unit
with -fdirectives-only:

touch test.cxx
g++ -std=c++2a -E -fdirectives-only -o test.ii test.cxx

test.cxx:1: internal compiler error: in cpp_directive_only_process, at libcpp/lex.c:4323
0x2c09263 cpp_directive_only_process(cpp_reader*, void*, void (*)(cpp_reader*, CPP_DO_task, void*, ...))
        ../../gcc/libcpp/lex.c:4323
0xf11c8e scan_translation_unit_directives_only
        ../../gcc/gcc/c-family/c-ppoutput.c:402
0xf11158 preprocess_file(cpp_reader*)
        ../../gcc/gcc/c-family/c-ppoutput.c:100
0xf0b6e4 c_common_init()
        ../../gcc/gcc/c-family/c-opts.c:1188
0xbe4d7b cxx_init()
        ../../gcc/gcc/cp/lex.c:332
0x1805e47 lang_dependent_init
        ../../gcc/gcc/toplev.c:1881
0x18066f9 do_compile
        ../../gcc/gcc/toplev.c:2178

This is new in GCC 11.


* [Bug c++/98882] ICE in in cpp_directive_only_process on empty translation unit
  2021-01-29 10:58 [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit boris at kolpackov dot net
@ 2021-01-29 11:34 ` jakub at gcc dot gnu.org
  2021-01-29 12:01 ` [Bug c++/98882] [11 Regression] " jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-29 11:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
                 CC|                            |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |11.0
   Last reconfirmed|                            |2021-01-29

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with r11-206-gb224c3763e018e8bdd0047b3eb283992fb655ce0


* [Bug c++/98882] [11 Regression] ICE in in cpp_directive_only_process on empty translation unit
  2021-01-29 10:58 [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit boris at kolpackov dot net
  2021-01-29 11:34 ` [Bug c++/98882] " jakub at gcc dot gnu.org
@ 2021-01-29 12:01 ` jakub at gcc dot gnu.org
  2021-01-29 12:32 ` [Bug preprocessor/98882] " jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-29 12:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
It isn't just empty files; it happens with any file that doesn't end with a newline.

I believe the
      /* Files always end in a newline.  We rely on this for
         character peeking safety.  */
      gcc_assert (buffer->rlimit[-1] == '\n');
assert is just wrong.
It isn't buffer->rlimit[-1] that should be '\n'; it is buffer->rlimit[0] that
should be the end-of-line terminator.  buffer->rlimit[-1] is the last character
of the file contents, so it can be anything.  Empty files have
buffer->cur == buffer->rlimit, so there is nothing in the buffer at all and the
rlimit[-1] access is invalid.

Another problem, though, is that even buffer->rlimit[0] == '\n' is not
guaranteed.

libcpp/charset.c has:

  /* If the file is using old-school Mac line endings (\r only),
     terminate with another \r, not an \n, so that we do not mistake
     the \r\n sequence for a single DOS line ending and erroneously
     issue the "No newline at end of file" diagnostic.  */
  if (to.len && to.text[to.len - 1] == '\r')
    to.text[to.len] = '\r';
  else
    to.text[to.len] = '\n';

So, I think we can assert that buffer->rlimit[0] == '\n' || buffer->rlimit[0]
== '\r', and the
            case '\r': /* MAC line ending, or Windows \r\n  */
              if (*pos == '\n')
                pos++;
              /* FALLTHROUGH */
and
              else if (*pos == '\r')
                {
                  if (pos[1] == '\n')
                    pos++;
                  pos++;
                  goto next_line;
                }
              goto dflt;
and
                      case '\r':
                        if (*pos == '\n')
                          pos++;
(twice) cases unfortunately need to compare pos against limit.
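
To make the buffer layout concrete, here is a minimal sketch, not libcpp code: struct sketch_buffer, sketch_load and sketch_terminated are hypothetical names, and only the terminator rule is taken from the charset.c snippet quoted above. It shows that the terminator lands at rlimit[0], while rlimit[-1] is whatever the file happened to end with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two buffer fields discussed above.  */
struct sketch_buffer
{
  unsigned char *cur;    /* first character of the file contents */
  unsigned char *rlimit; /* one past the last character of the file */
};

/* Copy LEN bytes of TEXT and append a terminator using the rule from
   the charset.c snippet: '\r' after a trailing '\r', otherwise '\n'.
   The terminator is stored at rlimit[0], outside the file contents.
   The caller frees the returned cur pointer.  */
static struct sketch_buffer
sketch_load (const char *text, size_t len)
{
  struct sketch_buffer b;
  b.cur = malloc (len + 1);
  assert (b.cur != NULL);
  if (len)
    memcpy (b.cur, text, len);
  b.cur[len] = (len && b.cur[len - 1] == '\r') ? '\r' : '\n';
  b.rlimit = b.cur + len;
  return b;
}

/* The check proposed above: rlimit[0] is a line terminator.  Unlike
   rlimit[-1], this location exists even for an empty file.  */
static int
sketch_terminated (const struct sketch_buffer *b)
{
  return b->rlimit[0] == '\n' || b->rlimit[0] == '\r';
}
```

For an empty input, cur == rlimit and the only valid byte is the terminator at rlimit[0], which is why the rlimit[-1] assertion cannot hold in general.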


* [Bug preprocessor/98882] [11 Regression] ICE in in cpp_directive_only_process on empty translation unit
  2021-01-29 10:58 [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit boris at kolpackov dot net
  2021-01-29 11:34 ` [Bug c++/98882] " jakub at gcc dot gnu.org
  2021-01-29 12:01 ` [Bug c++/98882] [11 Regression] " jakub at gcc dot gnu.org
@ 2021-01-29 12:32 ` jakub at gcc dot gnu.org
  2021-02-03 22:18 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-29 12:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50086
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50086&action=edit
gcc11-pr98882.patch

Actually, I think it is only the assert that is problematic.
If the code tolerates pos occasionally being bumped to buffer->rlimit + 1 (which
can happen even with '\n'-terminated buffers, e.g. if the file ends with a
backslash), then all but one of the \r handling cases involve a \r that is still
before the limit.  In those cases rlimit[0] is either '\n', which we can safely
read, or '\r', which we can also safely read (it just won't be '\n').  The
harder case is a backslash, after which we check for a newline (ok), a cr, or
cr + newline.  But if rlimit > cur && rlimit[-1] == '\\', then rlimit[0] must be
'\n', not '\r' (as '\r' is only added if rlimit[-1] == '\r'), so we never access
rlimit[1].
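
The backslash argument can be checked with a small sketch. This is an illustration under assumptions, not the code in lex.c: sketch_terminator and sketch_splice_len are hypothetical helpers, and the terminator rule is the one from libcpp/charset.c quoted earlier in this thread. The helper records the furthest byte it dereferences so the claim "we never access rlimit[1]" can be tested directly.

```c
#include <assert.h>
#include <stddef.h>

/* Terminator that would be appended after LEN bytes of TEXT:
   '\r' after a trailing '\r', otherwise '\n'.  */
static unsigned char
sketch_terminator (const unsigned char *text, size_t len)
{
  return (len && text[len - 1] == '\r') ? '\r' : '\n';
}

/* POS points just past a backslash.  Return how many bytes of line
   ending follow it (0 if none), and store in *FURTHEST the furthest
   address that was dereferenced while deciding.  */
static size_t
sketch_splice_len (const unsigned char *pos, const unsigned char **furthest)
{
  assert (furthest != NULL);
  *furthest = pos;
  if (*pos == '\n')
    return 1;
  if (*pos == '\r')
    {
      /* Only in this branch is a second byte inspected.  */
      *furthest = pos + 1;
      return pos[1] == '\n' ? 2 : 1;
    }
  return 0;
}
```

If the backslash is the last character of the file, the terminator is '\n' and only rlimit[0] is read.  If the file ends with a backslash followed by '\r', the '\r' is still before the limit, so the second byte read is the terminator at rlimit[0].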


* [Bug preprocessor/98882] [11 Regression] ICE in in cpp_directive_only_process on empty translation unit
  2021-01-29 10:58 [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit boris at kolpackov dot net
                   ` (2 preceding siblings ...)
  2021-01-29 12:32 ` [Bug preprocessor/98882] " jakub at gcc dot gnu.org
@ 2021-02-03 22:18 ` cvs-commit at gcc dot gnu.org
  2021-02-03 22:19 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-02-03 22:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:ac16f4327fef5dfc288409371a61649253353ef7

commit r11-7093-gac16f4327fef5dfc288409371a61649253353ef7
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Feb 3 23:18:05 2021 +0100

    libcpp: Fix up -fdirectives-only preprocessing [PR98882]

    GCC 11 ICEs on all -fdirectives-only preprocessing when the files don't end
    with a newline.

    The problem is in the assertion: for empty TUs, buffer->cur == buffer->rlimit
    and so the buffer->rlimit[-1] access triggers UB in the preprocessor; for
    non-empty TUs it refers to the last character in the file, which can be
    anything.
    The preprocessor adds a '\n' character (or '\r'; in particular, if the
    user file ends with '\r' then it adds another '\r' rather than '\n'), but
    that is added after the limit, i.e. at buffer->rlimit[0].

    Now, if the routine handles occasional bumping of pos to buffer->rlimit + 1,
    I think it is just the assert that needs changing.  Usually we read from *pos
    if pos < limit and then, e.g. if it is '\r', look at the following character
    (which could be one of those '\n' or '\r' at buffer->rlimit[0]).  There is
    also the case where, for '\\' before the limit, we read the following
    character and if it is '\n', do one thing, and if it is '\r', read another
    character.  But in that case, if '\\' was the last char in the TU, the limit
    char will be '\n', so we are ok.

    2021-02-03  Jakub Jelinek  <jakub@redhat.com>

            PR preprocessor/98882
            * lex.c (cpp_directive_only_process): Don't assert that rlimit[-1]
            is a newline, instead assert that rlimit[0] is either newline or
            carriage return.  When seeing '\\' followed by '\r', check limit
            before accessing pos[1].

            * gcc.dg/cpp/pr98882.c: New test.
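
The last ChangeLog item ("check limit before accessing pos[1]") can be sketched as follows. This is a hedged illustration of the guard, not the committed change: sketch_skip_splice is a hypothetical function, and the exact form of the condition in lex.c may differ.

```c
#include <assert.h>
#include <stddef.h>

/* POS points just past a backslash and LIMIT is the address of the
   terminator byte (buffer->rlimit in the discussion above).  Return
   the position after the line ending that follows the backslash, or
   NULL if the backslash is not followed by a line ending.  */
static const unsigned char *
sketch_skip_splice (const unsigned char *pos, const unsigned char *limit)
{
  assert (pos <= limit);
  if (*pos == '\n')
    return pos + 1;
  if (*pos == '\r')
    {
      /* Look at pos[1] only while pos is before the limit, so
         nothing beyond the terminator at limit[0] is ever read.  */
      if (pos < limit && pos[1] == '\n')
        pos++;
      return pos + 1;
    }
  return NULL;
}
```

With the guard, a '\r' sitting exactly at the limit is consumed as a one-byte line ending without touching limit[1], and the returned position may be limit + 1, which matches the "bumping of pos to buffer->rlimit + 1" that the commit message says the routine has to tolerate.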


* [Bug preprocessor/98882] [11 Regression] ICE in in cpp_directive_only_process on empty translation unit
  2021-01-29 10:58 [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit boris at kolpackov dot net
                   ` (3 preceding siblings ...)
  2021-02-03 22:18 ` cvs-commit at gcc dot gnu.org
@ 2021-02-03 22:19 ` jakub at gcc dot gnu.org
  2021-02-08 10:42 ` boris at kolpackov dot net
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-03 22:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.


* [Bug preprocessor/98882] [11 Regression] ICE in in cpp_directive_only_process on empty translation unit
  2021-01-29 10:58 [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit boris at kolpackov dot net
                   ` (4 preceding siblings ...)
  2021-02-03 22:19 ` jakub at gcc dot gnu.org
@ 2021-02-08 10:42 ` boris at kolpackov dot net
  2021-02-09 16:40 ` jakub at gcc dot gnu.org
  2021-02-10  5:21 ` boris at kolpackov dot net
  7 siblings, 0 replies; 9+ messages in thread
From: boris at kolpackov dot net @ 2021-02-08 10:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

--- Comment #6 from Boris Kolpackov <boris at kolpackov dot net> ---
After this change I now get an ICE (that same assert) when partially
preprocessing (-E -fdirectives-only) a TU that has an include directive that is
translated to an import:

cat <<EOF >hello.hxx
void f ();
EOF

cat <<EOF >main.cxx
#include "hello.hxx"
//import "hello.hxx";

int main () {}
EOF

cat <<'EOF' >mapper
#!/usr/bin/env bash

function info () { echo "$*" 1>&2; }
function resp () { info "  < $*"; echo "$*"; }

while read l; do
  info "  > $l"

  case "$l" in
    HELLO*)
      resp "HELLO 1 test ;"
      ;;
    MODULE-REPO)
      resp "PATHNAME /"
      ;;
    INCLUDE-TRANSLATE*hello.hxx)
      resp "PATHNAME $(pwd)/hello.gcm"
      ;;
    INCLUDE-TRANSLATE*)
      resp "BOOL FALSE"
      ;;
    MODULE-EXPORT*|MODULE-IMPORT*)
      resp "PATHNAME $(pwd)/hello.gcm"
      ;;
    MODULE-COMPILED*)
      resp "OK"
      ;;
    *)
      resp "ERROR 'unexpected request: $l'"
      ;;
  esac
done
EOF

chmod ugo+x ./mapper

g++ -std=c++2a -fmodules-ts -fmodule-mapper='|./mapper' -fmodule-header -x c++-header hello.hxx
g++ -std=c++2a -fmodules-ts -fmodule-mapper='|./mapper' -E -fdirectives-only -x c++ -o main.ii main.cxx

main.cxx:1: internal compiler error: in cpp_directive_only_process, at libcpp/lex.c:4323

0x2c26a7a cpp_directive_only_process(cpp_reader*, void*, void (*)(cpp_reader*, CPP_DO_task, void*, ...))
        ../../gcc/libcpp/lex.c:4323
0xf1745e scan_translation_unit_directives_only
        ../../gcc/gcc/c-family/c-ppoutput.c:402
0xf16928 preprocess_file(cpp_reader*)
        ../../gcc/gcc/c-family/c-ppoutput.c:100
0xf10e2b c_common_init()
        ../../gcc/gcc/c-family/c-opts.c:1195
0xbe8ea7 cxx_init()
        ../../gcc/gcc/cp/lex.c:332
0x180c1af lang_dependent_init
        ../../gcc/gcc/toplev.c:1881
0x180ca61 do_compile
        ../../gcc/gcc/toplev.c:2178

Note that the assert does not fire if we use import directly, without include
translation.


* [Bug preprocessor/98882] [11 Regression] ICE in in cpp_directive_only_process on empty translation unit
  2021-01-29 10:58 [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit boris at kolpackov dot net
                   ` (5 preceding siblings ...)
  2021-02-08 10:42 ` boris at kolpackov dot net
@ 2021-02-09 16:40 ` jakub at gcc dot gnu.org
  2021-02-10  5:21 ` boris at kolpackov dot net
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-09 16:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Please file that as a different PR.  That is something for Nathan to look at;
I don't know anything about that.


* [Bug preprocessor/98882] [11 Regression] ICE in in cpp_directive_only_process on empty translation unit
  2021-01-29 10:58 [Bug c++/98882] New: ICE in in cpp_directive_only_process on empty translation unit boris at kolpackov dot net
                   ` (6 preceding siblings ...)
  2021-02-09 16:40 ` jakub at gcc dot gnu.org
@ 2021-02-10  5:21 ` boris at kolpackov dot net
  7 siblings, 0 replies; 9+ messages in thread
From: boris at kolpackov dot net @ 2021-02-10  5:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98882

--- Comment #8 from Boris Kolpackov <boris at kolpackov dot net> ---
Done, bug 99050.

