public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens
@ 2022-01-20 17:30 pavel.morozkin at gmail dot com
2022-01-20 20:59 ` [Bug preprocessor/104147] [9/10/11/12 Regression] " jsm28 at gcc dot gnu.org
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: pavel.morozkin at gmail dot com @ 2022-01-20 17:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
Bug ID: 104147
Summary: C preprocessor may remove the standard required
whitespace between the preprocessing tokens
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: pavel.morozkin at gmail dot com
Target Milestone: ---
Sample code:
#define X(x,y) x y
#define STR_(x) #x
#define STR(x) STR_(x)
STR(X(Y,Y))
Invocations:
$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P
"Y Y"
$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()"
"YY"
Also 1: Somehow gcc takes into account the whitespace between `,` and `Y`:
$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()" -D"Z=STR(X(Y,Y))"
"YY"
$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()" -D"Z=STR(X(Y, Y))"
"Y Y"
Also 2: This:
STR(X(Y,
Y))
leads to:
$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()"
"Y Y"
However, this:
STR(X(Y
,Y))
leads to:
$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()"
"YY"
The whitespace is required by the standard.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10/11/12 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
@ 2022-01-20 20:59 ` jsm28 at gcc dot gnu.org
2022-01-21 7:55 ` rguenth at gcc dot gnu.org
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jsm28 at gcc dot gnu.org @ 2022-01-20 20:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
Joseph S. Myers <jsm28 at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail| |3.2.3
Known to work| |3.0.4
Target Milestone|--- |9.5
Status|UNCONFIRMED |NEW
Last reconfirmed| |2022-01-20
Ever confirmed|0 |1
Summary|C preprocessor may remove |[9/10/11/12 Regression] C
|the standard required |preprocessor may remove the
|whitespace between the |standard required
|preprocessing tokens |whitespace between the
| |preprocessing tokens
--- Comment #1 from Joseph S. Myers <jsm28 at gcc dot gnu.org> ---
This appears to be a regression between GCC 3.0 and 3.2.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10/11/12 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
2022-01-20 20:59 ` [Bug preprocessor/104147] [9/10/11/12 Regression] " jsm28 at gcc dot gnu.org
@ 2022-01-21 7:55 ` rguenth at gcc dot gnu.org
2022-01-31 14:07 ` jakub at gcc dot gnu.org
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-21 7:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10/11/12 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
2022-01-20 20:59 ` [Bug preprocessor/104147] [9/10/11/12 Regression] " jsm28 at gcc dot gnu.org
2022-01-21 7:55 ` rguenth at gcc dot gnu.org
@ 2022-01-31 14:07 ` jakub at gcc dot gnu.org
2022-01-31 14:50 ` jakub at gcc dot gnu.org
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-01-31 14:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, from what I can see on
#define X(x,y) x y
#define STR_(x) #x
#define STR(x) STR_(x)
STR(X(Y,Y))
vs.
#define X(x,y) x y
#define STR_(x) #x
#define STR(x) STR_(x)
#define Y()
STR(X(Y,Y))
is that on the third call to funlike_invocation_p for the Y macro, the code
reads 2 padding tokens before finding a non-padding non-CPP_OPEN_PAREN one:
1365 static _cpp_buff *
1366 funlike_invocation_p (cpp_reader *pfile, cpp_hashnode *node,
1367 _cpp_buff **pragma_buff, unsigned *num_args)
1368 {
1369 const cpp_token *token, *padding = NULL;
1370
1371 for (;;)
1372 {
1373 token = cpp_get_token (pfile);
1374 if (token->type != CPP_PADDING)
1375 break;
1376 if (padding == NULL
1377 || (!(padding->flags & PREV_WHITE) && token->val.source ==
NULL))
1378 padding = token;
1379 }
The first padding token is pfile->avoid_paste and the second padding token is
one created by padding_token:
(gdb) p *padding
$128 = {src_loc = 0, type = CPP_PADDING, flags = 0, val = {node = {node = 0x0,
spelling = 0x0}, source = 0x0, str = {len = 0, text = 0x0}, macro_arg = {arg_no
= 0, spelling = 0x0},
token_no = 0, pragma = 0}}
(gdb) p *token
$129 = {src_loc = 258816, type = CPP_PADDING, flags = 0, val = {node = {node =
0x7fffea249db0, spelling = 0x0}, source = 0x7fffea249db0, str = {len =
3928268208, text = 0x0},
macro_arg = {arg_no = 3928268208, spelling = 0x0}, token_no = 3928268208,
pragma = 3928268208}}
(gdb) p *token->val.source
$130 = {src_loc = 242688, type = CPP_MACRO_ARG, flags = 1, val = {node = {node
= 0x7fff00000002, spelling = 0x7fffea38e2e8}, source = 0x7fff00000002, str =
{len = 2,
text = 0x7fffea38e2e8 "\260\260\070\352\377\177"}, macro_arg = {arg_no =
2, spelling = 0x7fffea38e2e8}, token_no = 2, pragma = 2}}
So, both CPP_PADDING tokens have flags of 0, but the second padding has
non-NULL val.source and that one has PREV_WHITE set on it.
Now, because of the above condition, padding is in the end the first of the 2
padding tokens, i.e. pfile->avoid_paste.
The code later will do:
1393 if (token->type != CPP_EOF || token == &pfile->endarg)
1394 {
1395 _cpp_backup_tokens (pfile, 1);
1396 if (padding)
1397 _cpp_push_token_context (pfile, NULL, padding, 1);
1398 }
so backup a single token (a CPP_NAME) and push a new context containing just
the avoid_paste token.
Later on cpp_get_token_1 is called and first hits:
2930 else if (!reached_end_of_context (context))
2931 {
2932 consume_next_token_from_context (pfile, &result,
2933 &virt_loc);
and so the avoid_paste token is read and next time reached_end_of_context is
true and so it will:
2945 if (pfile->context->c.macro)
2946 ++num_expanded_macros_counter;
2947 _cpp_pop_context (pfile);
2948 if (pfile->state.in_directive)
2949 continue;
2950 result = &pfile->avoid_paste;
and read avoid_paste again.
Now, if Y is not a macro, the tokens which will be read are the original ones,
i.e.
one avoid_paste followed by the padding_token one.
Now, stringify_arg is called later on, and that one ignores CPP_PADDING tokens,
except when they have val.source set on it:
if (token->type == CPP_PADDING)
{
if (source == NULL
|| (!(source->flags & PREV_WHITE)
&& token->val.source == NULL))
source = token->val.source;
continue;
}
...
/* Leading white space? */
if (dest - 1 != BUFF_FRONT (pfile->u_buff))
{
if (source == NULL)
source = token;
if (source->flags & PREV_WHITE)
*dest++ = ' ';
}
source = NULL;
So, when Y is not a macro, we set source to the CPP_MACRO_ARG with PREV_WHITE
set on it and so emit a space in between,
while when Y is a function-like macro, source isn't set from CPP_PADDING tokens
but instead from the actual CPP_NAME
token after it which doesn't have PREV_WHITE and so we don't emit a space in
there.
Now, if the order of the CPP_PADDING tokens doesn't matter, perhaps we want
funlike_invocation_p use the
padding token with non-NULL val.source in preference of the avoid_paste token
rather than the other way around,
but not sure what that would break. Also, not really sure how PREV_WHITE flags
on the CPP_PADDING tokens come to play with this.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10/11/12 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
` (2 preceding siblings ...)
2022-01-31 14:07 ` jakub at gcc dot gnu.org
@ 2022-01-31 14:50 ` jakub at gcc dot gnu.org
2022-02-01 19:50 ` cvs-commit at gcc dot gnu.org
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-01-31 14:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Looking at the PREV_WHITE uses, they look like:
if (token->type == CPP_PADDING)
{
avoid_paste = true;
if (print.source == NULL
|| (!(print.source->flags & PREV_WHITE)
&& token->val.source == NULL))
print.source = token->val.source;
return;
}
...
if (avoid_paste)
{
...
else if (print.source->flags & PREV_WHITE
...
}
else if (token->flags & PREV_WHITE)
and stringify_arg has something similar.
So, makes me wonder if funlike_invocation_p isn't just a copy & paste
of those conditions, with the important difference that padding (what we
remember) is not the source, but the token itself.
I'd expect something like:
if (padding == NULL
|| padding->val.source == NULL
|| (!(padding->val.source->flags & PREV_WHITE)
&& token->val.source == NULL))
padding = token;
instead.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10/11/12 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
` (3 preceding siblings ...)
2022-01-31 14:50 ` jakub at gcc dot gnu.org
@ 2022-02-01 19:50 ` cvs-commit at gcc dot gnu.org
2022-02-01 19:51 ` [Bug preprocessor/104147] [9/10/11 " jakub at gcc dot gnu.org
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-01 19:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:95ac5635409606386259d2ff21fb61738858ca4a
commit r12-6976-g95ac5635409606386259d2ff21fb61738858ca4a
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue Feb 1 20:48:03 2022 +0100
libcpp: Fix up padding handling in funlike_invocation_p [PR104147]
As mentioned in the PR, in some cases we preprocess incorrectly when we
encounter an identifier which is defined as function-like macro, followed
by at least 2 CPP_PADDING tokens and then some other identifier.
On the following testcase, the problem is in the 3rd funlike_invocation_p,
the tokens are CPP_NAME Y, CPP_PADDING (the pfile->avoid_paste shared
token),
CPP_PADDING (one created with padding_token, val.source is non-NULL and
val.source->flags & PREV_WHITE is non-zero) and then another CPP_NAME.
funlike_invocation_p remembers there was a padding token, but remembers the
first one because of its condition, then the next token is the CPP_NAME,
which is not CPP_OPEN_PAREN, so the CPP_NAME token is backed up, but as we
can't easily backup more tokens, it pushes into a new context the padding
token (the pfile->avoid_paste one). The net effect is that when Y is not
defined as fun-like macro, we read Y, avoid_paste, padding_token, Y,
while if Y is fun-like macro, we read Y, avoid_paste, avoid_paste, Y
(the second avoid_paste is because that is how we handle end of a context).
Now, for stringify_arg that is unfortunately a significant difference,
which handles CPP_PADDING tokens with:
if (token->type == CPP_PADDING)
{
if (source == NULL
|| (!(source->flags & PREV_WHITE)
&& token->val.source == NULL))
source = token->val.source;
continue;
}
and later on
/* Leading white space? */
if (dest - 1 != BUFF_FRONT (pfile->u_buff))
{
if (source == NULL)
source = token;
if (source->flags & PREV_WHITE)
*dest++ = ' ';
}
source = NULL;
(and c-ppoutput.cc has similar code).
So, when Y is not fun-like macro, ' ' is added because padding_token's
val.source->flags & PREV_WHITE is non-zero, while when it is fun-like
macro, we don't add ' ' in between, because source is NULL and so
used from the next token (CPP_NAME Y), which doesn't have PREV_WHITE set.
Now, the funlike_invocation_p condition
if (padding == NULL
|| (!(padding->flags & PREV_WHITE) && token->val.source ==
NULL))
padding = token;
looks very similar to that in stringify_arg/c-ppoutput.cc, so I assume
the intent was to prefer do the same thing and pick the right padding.
But there are significant differences. Both stringify_arg and
c-ppoutput.cc
don't remember the CPP_PADDING token, but its val.source instead, while
in funlike_invocation_p we want to remember the padding token that has the
significant information for stringify_arg/c-ppoutput.cc.
So, IMHO we want to overwrite padding if:
1) padding == NULL (remember that there was any padding at all)
2) padding->val.source == NULL (this matches the source == NULL
case in stringify_arg)
3) !(padding->val.source->flags & PREV_WHITE) && token->val.source == NULL
(this matches the !(source->flags & PREV_WHITE) && token->val.source ==
NULL
case in stringify_arg)
2022-02-01 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/104147
* macro.cc (funlike_invocation_p): For padding prefer a token
with val.source non-NULL especially if it has PREV_WHITE set
on val.source->flags. Add gcc_assert that CPP_PADDING tokens
don't have PREV_WHITE set in flags.
* c-c++-common/cpp/pr104147.c: New test.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10/11 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
` (4 preceding siblings ...)
2022-02-01 19:50 ` cvs-commit at gcc dot gnu.org
@ 2022-02-01 19:51 ` jakub at gcc dot gnu.org
2022-02-19 8:02 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-01 19:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[9/10/11/12 Regression] C |[9/10/11 Regression] C
|preprocessor may remove the |preprocessor may remove the
|standard required |standard required
|whitespace between the |whitespace between the
|preprocessing tokens |preprocessing tokens
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk so far.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10/11 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
` (5 preceding siblings ...)
2022-02-01 19:51 ` [Bug preprocessor/104147] [9/10/11 " jakub at gcc dot gnu.org
@ 2022-02-19 8:02 ` cvs-commit at gcc dot gnu.org
2022-02-19 8:08 ` [Bug preprocessor/104147] [9/10 " jakub at gcc dot gnu.org
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-19 8:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:7c442c8897163888e1d9e367b85565f61e0d3136
commit r11-9598-g7c442c8897163888e1d9e367b85565f61e0d3136
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue Feb 1 20:48:03 2022 +0100
libcpp: Fix up padding handling in funlike_invocation_p [PR104147]
As mentioned in the PR, in some cases we preprocess incorrectly when we
encounter an identifier which is defined as function-like macro, followed
by at least 2 CPP_PADDING tokens and then some other identifier.
On the following testcase, the problem is in the 3rd funlike_invocation_p,
the tokens are CPP_NAME Y, CPP_PADDING (the pfile->avoid_paste shared
token),
CPP_PADDING (one created with padding_token, val.source is non-NULL and
val.source->flags & PREV_WHITE is non-zero) and then another CPP_NAME.
funlike_invocation_p remembers there was a padding token, but remembers the
first one because of its condition, then the next token is the CPP_NAME,
which is not CPP_OPEN_PAREN, so the CPP_NAME token is backed up, but as we
can't easily backup more tokens, it pushes into a new context the padding
token (the pfile->avoid_paste one). The net effect is that when Y is not
defined as fun-like macro, we read Y, avoid_paste, padding_token, Y,
while if Y is fun-like macro, we read Y, avoid_paste, avoid_paste, Y
(the second avoid_paste is because that is how we handle end of a context).
Now, for stringify_arg that is unfortunately a significant difference,
which handles CPP_PADDING tokens with:
if (token->type == CPP_PADDING)
{
if (source == NULL
|| (!(source->flags & PREV_WHITE)
&& token->val.source == NULL))
source = token->val.source;
continue;
}
and later on
/* Leading white space? */
if (dest - 1 != BUFF_FRONT (pfile->u_buff))
{
if (source == NULL)
source = token;
if (source->flags & PREV_WHITE)
*dest++ = ' ';
}
source = NULL;
(and c-ppoutput.cc has similar code).
So, when Y is not fun-like macro, ' ' is added because padding_token's
val.source->flags & PREV_WHITE is non-zero, while when it is fun-like
macro, we don't add ' ' in between, because source is NULL and so
used from the next token (CPP_NAME Y), which doesn't have PREV_WHITE set.
Now, the funlike_invocation_p condition
if (padding == NULL
|| (!(padding->flags & PREV_WHITE) && token->val.source ==
NULL))
padding = token;
looks very similar to that in stringify_arg/c-ppoutput.cc, so I assume
the intent was to prefer do the same thing and pick the right padding.
But there are significant differences. Both stringify_arg and
c-ppoutput.cc
don't remember the CPP_PADDING token, but its val.source instead, while
in funlike_invocation_p we want to remember the padding token that has the
significant information for stringify_arg/c-ppoutput.cc.
So, IMHO we want to overwrite padding if:
1) padding == NULL (remember that there was any padding at all)
2) padding->val.source == NULL (this matches the source == NULL
case in stringify_arg)
3) !(padding->val.source->flags & PREV_WHITE) && token->val.source == NULL
(this matches the !(source->flags & PREV_WHITE) && token->val.source ==
NULL
case in stringify_arg)
2022-02-01 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/104147
* macro.c (funlike_invocation_p): For padding prefer a token
with val.source non-NULL especially if it has PREV_WHITE set
on val.source->flags. Add gcc_assert that CPP_PADDING tokens
don't have PREV_WHITE set in flags.
* c-c++-common/cpp/pr104147.c: New test.
(cherry picked from commit 95ac5635409606386259d2ff21fb61738858ca4a)
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
` (6 preceding siblings ...)
2022-02-19 8:02 ` cvs-commit at gcc dot gnu.org
@ 2022-02-19 8:08 ` jakub at gcc dot gnu.org
2022-05-10 8:23 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-19 8:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
Status|NEW |ASSIGNED
Summary|[9/10/11 Regression] C |[9/10 Regression] C
|preprocessor may remove the |preprocessor may remove the
|standard required |standard required
|whitespace between the |whitespace between the
|preprocessing tokens |preprocessing tokens
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed for 11.3+ too.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
` (7 preceding siblings ...)
2022-02-19 8:08 ` [Bug preprocessor/104147] [9/10 " jakub at gcc dot gnu.org
@ 2022-05-10 8:23 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:24 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:36 ` jakub at gcc dot gnu.org
10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-10 8:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:fb1792e4c96cf0e969d5fde2a857da3fb4b2a5aa
commit r10-10673-gfb1792e4c96cf0e969d5fde2a857da3fb4b2a5aa
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue Feb 1 20:48:03 2022 +0100
libcpp: Fix up padding handling in funlike_invocation_p [PR104147]
As mentioned in the PR, in some cases we preprocess incorrectly when we
encounter an identifier which is defined as function-like macro, followed
by at least 2 CPP_PADDING tokens and then some other identifier.
On the following testcase, the problem is in the 3rd funlike_invocation_p,
the tokens are CPP_NAME Y, CPP_PADDING (the pfile->avoid_paste shared
token),
CPP_PADDING (one created with padding_token, val.source is non-NULL and
val.source->flags & PREV_WHITE is non-zero) and then another CPP_NAME.
funlike_invocation_p remembers there was a padding token, but remembers the
first one because of its condition, then the next token is the CPP_NAME,
which is not CPP_OPEN_PAREN, so the CPP_NAME token is backed up, but as we
can't easily backup more tokens, it pushes into a new context the padding
token (the pfile->avoid_paste one). The net effect is that when Y is not
defined as fun-like macro, we read Y, avoid_paste, padding_token, Y,
while if Y is fun-like macro, we read Y, avoid_paste, avoid_paste, Y
(the second avoid_paste is because that is how we handle end of a context).
Now, for stringify_arg that is unfortunately a significant difference,
which handles CPP_PADDING tokens with:
if (token->type == CPP_PADDING)
{
if (source == NULL
|| (!(source->flags & PREV_WHITE)
&& token->val.source == NULL))
source = token->val.source;
continue;
}
and later on
/* Leading white space? */
if (dest - 1 != BUFF_FRONT (pfile->u_buff))
{
if (source == NULL)
source = token;
if (source->flags & PREV_WHITE)
*dest++ = ' ';
}
source = NULL;
(and c-ppoutput.cc has similar code).
So, when Y is not fun-like macro, ' ' is added because padding_token's
val.source->flags & PREV_WHITE is non-zero, while when it is fun-like
macro, we don't add ' ' in between, because source is NULL and so
used from the next token (CPP_NAME Y), which doesn't have PREV_WHITE set.
Now, the funlike_invocation_p condition
if (padding == NULL
|| (!(padding->flags & PREV_WHITE) && token->val.source ==
NULL))
padding = token;
looks very similar to that in stringify_arg/c-ppoutput.cc, so I assume
the intent was to prefer do the same thing and pick the right padding.
But there are significant differences. Both stringify_arg and
c-ppoutput.cc
don't remember the CPP_PADDING token, but its val.source instead, while
in funlike_invocation_p we want to remember the padding token that has the
significant information for stringify_arg/c-ppoutput.cc.
So, IMHO we want to overwrite padding if:
1) padding == NULL (remember that there was any padding at all)
2) padding->val.source == NULL (this matches the source == NULL
case in stringify_arg)
3) !(padding->val.source->flags & PREV_WHITE) && token->val.source == NULL
(this matches the !(source->flags & PREV_WHITE) && token->val.source ==
NULL
case in stringify_arg)
2022-02-01 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/104147
* macro.c (funlike_invocation_p): For padding prefer a token
with val.source non-NULL especially if it has PREV_WHITE set
on val.source->flags. Add gcc_assert that CPP_PADDING tokens
don't have PREV_WHITE set in flags.
* c-c++-common/cpp/pr104147.c: New test.
(cherry picked from commit 95ac5635409606386259d2ff21fb61738858ca4a)
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
` (8 preceding siblings ...)
2022-05-10 8:23 ` cvs-commit at gcc dot gnu.org
@ 2022-05-11 6:24 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:36 ` jakub at gcc dot gnu.org
10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-11 6:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:7157e074472435489963cf6961cd4c2d06804f4d
commit r9-10121-g7157e074472435489963cf6961cd4c2d06804f4d
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue Feb 1 20:48:03 2022 +0100
libcpp: Fix up padding handling in funlike_invocation_p [PR104147]
As mentioned in the PR, in some cases we preprocess incorrectly when we
encounter an identifier which is defined as function-like macro, followed
by at least 2 CPP_PADDING tokens and then some other identifier.
On the following testcase, the problem is in the 3rd funlike_invocation_p,
the tokens are CPP_NAME Y, CPP_PADDING (the pfile->avoid_paste shared
token),
CPP_PADDING (one created with padding_token, val.source is non-NULL and
val.source->flags & PREV_WHITE is non-zero) and then another CPP_NAME.
funlike_invocation_p remembers there was a padding token, but remembers the
first one because of its condition, then the next token is the CPP_NAME,
which is not CPP_OPEN_PAREN, so the CPP_NAME token is backed up, but as we
can't easily backup more tokens, it pushes into a new context the padding
token (the pfile->avoid_paste one). The net effect is that when Y is not
defined as fun-like macro, we read Y, avoid_paste, padding_token, Y,
while if Y is fun-like macro, we read Y, avoid_paste, avoid_paste, Y
(the second avoid_paste is because that is how we handle end of a context).
Now, for stringify_arg that is unfortunately a significant difference,
which handles CPP_PADDING tokens with:
if (token->type == CPP_PADDING)
{
if (source == NULL
|| (!(source->flags & PREV_WHITE)
&& token->val.source == NULL))
source = token->val.source;
continue;
}
and later on
/* Leading white space? */
if (dest - 1 != BUFF_FRONT (pfile->u_buff))
{
if (source == NULL)
source = token;
if (source->flags & PREV_WHITE)
*dest++ = ' ';
}
source = NULL;
(and c-ppoutput.cc has similar code).
So, when Y is not fun-like macro, ' ' is added because padding_token's
val.source->flags & PREV_WHITE is non-zero, while when it is fun-like
macro, we don't add ' ' in between, because source is NULL and so
used from the next token (CPP_NAME Y), which doesn't have PREV_WHITE set.
Now, the funlike_invocation_p condition
if (padding == NULL
|| (!(padding->flags & PREV_WHITE) && token->val.source ==
NULL))
padding = token;
looks very similar to that in stringify_arg/c-ppoutput.cc, so I assume
the intent was to prefer do the same thing and pick the right padding.
But there are significant differences. Both stringify_arg and
c-ppoutput.cc
don't remember the CPP_PADDING token, but its val.source instead, while
in funlike_invocation_p we want to remember the padding token that has the
significant information for stringify_arg/c-ppoutput.cc.
So, IMHO we want to overwrite padding if:
1) padding == NULL (remember that there was any padding at all)
2) padding->val.source == NULL (this matches the source == NULL
case in stringify_arg)
3) !(padding->val.source->flags & PREV_WHITE) && token->val.source == NULL
(this matches the !(source->flags & PREV_WHITE) && token->val.source ==
NULL
case in stringify_arg)
2022-02-01 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/104147
* macro.c (funlike_invocation_p): For padding prefer a token
with val.source non-NULL especially if it has PREV_WHITE set
on val.source->flags. Add gcc_assert that CPP_PADDING tokens
don't have PREV_WHITE set in flags.
* c-c++-common/cpp/pr104147.c: New test.
(cherry picked from commit 95ac5635409606386259d2ff21fb61738858ca4a)
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug preprocessor/104147] [9/10 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
` (9 preceding siblings ...)
2022-05-11 6:24 ` cvs-commit at gcc dot gnu.org
@ 2022-05-11 6:36 ` jakub at gcc dot gnu.org
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-05-11 6:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2022-05-11 6:36 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-20 17:30 [Bug c/104147] New: C preprocessor may remove the standard required whitespace between the preprocessing tokens pavel.morozkin at gmail dot com
2022-01-20 20:59 ` [Bug preprocessor/104147] [9/10/11/12 Regression] " jsm28 at gcc dot gnu.org
2022-01-21 7:55 ` rguenth at gcc dot gnu.org
2022-01-31 14:07 ` jakub at gcc dot gnu.org
2022-01-31 14:50 ` jakub at gcc dot gnu.org
2022-02-01 19:50 ` cvs-commit at gcc dot gnu.org
2022-02-01 19:51 ` [Bug preprocessor/104147] [9/10/11 " jakub at gcc dot gnu.org
2022-02-19 8:02 ` cvs-commit at gcc dot gnu.org
2022-02-19 8:08 ` [Bug preprocessor/104147] [9/10 " jakub at gcc dot gnu.org
2022-05-10 8:23 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:24 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:36 ` jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).