From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug preprocessor/104147] [9/10/11/12 Regression] C preprocessor may remove the standard required whitespace between the preprocessing tokens
Date: Tue, 01 Feb 2022 19:50:19 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: preprocessor
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 9.5

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147

--- Comment #4 from CVS Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:95ac5635409606386259d2ff21fb61738858ca4a

commit r12-6976-g95ac5635409606386259d2ff21fb61738858ca4a
Author: Jakub Jelinek
Date:   Tue Feb 1 20:48:03 2022 +0100

libcpp: Fix up padding handling in funlike_invocation_p [PR104147]

As mentioned in the PR, in some cases we preprocess incorrectly when we
encounter an identifier which is defined as a function-like macro, followed
by at least 2 CPP_PADDING tokens and then some other
identifier.

On the following testcase, the problem is in the 3rd funlike_invocation_p
call: the tokens are CPP_NAME Y, CPP_PADDING (the pfile->avoid_paste shared
token), CPP_PADDING (one created with padding_token, val.source is non-NULL
and val.source->flags & PREV_WHITE is non-zero) and then another CPP_NAME.
funlike_invocation_p remembers there was a padding token, but remembers the
first one because of its condition. The next token is the CPP_NAME, which is
not CPP_OPEN_PAREN, so the CPP_NAME token is backed up, but as we can't
easily back up more tokens, it pushes the padding token (the
pfile->avoid_paste one) into a new context. The net effect is that when Y is
not defined as a function-like macro, we read
  Y, avoid_paste, padding_token, Y
while if Y is a function-like macro, we read
  Y, avoid_paste, avoid_paste, Y
(the second avoid_paste is because that is how we handle end of a context).

Now, for stringify_arg that is unfortunately a significant difference.
It handles CPP_PADDING tokens with:
  if (token->type == CPP_PADDING)
    {
      if (source == NULL
          || (!(source->flags & PREV_WHITE)
              && token->val.source == NULL))
        source = token->val.source;
      continue;
    }
and later on
  /* Leading white space? */
  if (dest - 1 != BUFF_FRONT (pfile->u_buff))
    {
      if (source == NULL)
        source = token;
      if (source->flags & PREV_WHITE)
        *dest++ = ' ';
    }
  source = NULL;
(and c-ppoutput.cc has similar code).

So, when Y is not a function-like macro, ' ' is added because
padding_token's val.source->flags & PREV_WHITE is non-zero, while when it is
a function-like macro, we don't add ' ' in between, because source is NULL
and so is taken from the next token (CPP_NAME Y), which doesn't have
PREV_WHITE set.
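To make the difference concrete, here is a minimal standalone sketch of the
whitespace decision quoted from stringify_arg. It is not libcpp code: struct
tok, stringify_model and the demo_* helpers are hypothetical names invented
for illustration, val.source is flattened to a plain source pointer, and the
output buffer has no leading quote (so the check is dest != out rather than
dest - 1 != BUFF_FRONT). Only the two token sequences described above are
modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PREV_WHITE 1u

enum tok_type { TOK_NAME, TOK_PADDING };

/* Hypothetical, simplified stand-in for a preprocessing token.  */
struct tok {
  enum tok_type type;
  unsigned flags;
  const struct tok *source;   /* models val.source of a padding token */
  const char *spelling;       /* only used for TOK_NAME */
};

/* Mirrors the quoted padding/whitespace logic; writes the result to out.  */
void stringify_model (const struct tok *const *toks, size_t n, char *out)
{
  char *dest = out;
  const struct tok *source = NULL;

  for (size_t i = 0; i < n; i++)
    {
      const struct tok *token = toks[i];
      if (token->type == TOK_PADDING)
        {
          if (source == NULL
              || (!(source->flags & PREV_WHITE) && token->source == NULL))
            source = token->source;
          continue;
        }
      assert (token->spelling != NULL);
      /* Leading white space?  */
      if (dest != out)
        {
          if (source == NULL)
            source = token;
          if (source->flags & PREV_WHITE)
            *dest++ = ' ';
        }
      source = NULL;
      size_t len = strlen (token->spelling);
      memcpy (dest, token->spelling, len);
      dest += len;
    }
  *dest = '\0';
}

static const struct tok name_y      = { TOK_NAME, 0, NULL, "Y" };
static const struct tok white_src   = { TOK_NAME, PREV_WHITE, NULL, "Y" };
static const struct tok avoid_paste = { TOK_PADDING, 0, NULL, NULL };
static const struct tok padding_tok = { TOK_PADDING, 0, &white_src, NULL };

/* Y is not a function-like macro: Y, avoid_paste, padding_token, Y.  */
void demo_not_funlike (char *out)
{
  const struct tok *const toks[] =
    { &name_y, &avoid_paste, &padding_tok, &name_y };
  stringify_model (toks, 4, out);
}

/* Y is a function-like macro (old behaviour): Y, avoid_paste, avoid_paste, Y.  */
void demo_funlike_before_fix (char *out)
{
  const struct tok *const toks[] =
    { &name_y, &avoid_paste, &avoid_paste, &name_y };
  stringify_model (toks, 4, out);
}
```

Calling demo_not_funlike on a small char buffer keeps the space between the
two names, while demo_funlike_before_fix drops it, which is the discrepancy
the message describes.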
Now, the funlike_invocation_p condition
  if (padding == NULL
      || (!(padding->flags & PREV_WHITE) && token->val.source == NULL))
    padding = token;
looks very similar to that in stringify_arg/c-ppoutput.cc, so I assume the
intent was to do the same thing and pick the right padding.
But there are significant differences. Both stringify_arg and c-ppoutput.cc
don't remember the CPP_PADDING token, but its val.source instead, while in
funlike_invocation_p we want to remember the padding token that has the
significant information for stringify_arg/c-ppoutput.cc.
So, IMHO we want to overwrite padding if:
1) padding == NULL (remember that there was any padding at all)
2) padding->val.source == NULL (this matches the source == NULL
   case in stringify_arg)
3) !(padding->val.source->flags & PREV_WHITE) && token->val.source == NULL
   (this matches the
   !(source->flags & PREV_WHITE) && token->val.source == NULL
   case in stringify_arg)

2022-02-01  Jakub Jelinek

	PR preprocessor/104147
	* macro.cc (funlike_invocation_p): For padding prefer a token
	with val.source non-NULL especially if it has PREV_WHITE set
	on val.source->flags.  Add gcc_assert that CPP_PADDING tokens
	don't have PREV_WHITE set in flags.

	* c-c++-common/cpp/pr104147.c: New test.
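
The three overwrite cases listed in the commit message can be compared with
the old condition in a small standalone sketch. This is an illustration
under assumptions, not the macro.cc change itself: struct tok,
old_should_overwrite, new_should_overwrite, pick_padding and the demo_*
objects are hypothetical names, val.source is modelled as a plain source
pointer, and a standard assert stands in for the gcc_assert mentioned in the
ChangeLog.

```c
#include <assert.h>
#include <stddef.h>

#define PREV_WHITE 1u

/* Hypothetical, simplified stand-in for a CPP_PADDING token.  */
struct tok {
  unsigned flags;
  const struct tok *source;   /* models val.source */
};

/* The condition as quoted before the fix.  */
int old_should_overwrite (const struct tok *padding, const struct tok *token)
{
  return padding == NULL
         || (!(padding->flags & PREV_WHITE) && token->source == NULL);
}

/* Cases 1) to 3) from the commit message.  */
int new_should_overwrite (const struct tok *padding, const struct tok *token)
{
  /* Padding tokens are expected not to carry PREV_WHITE themselves.  */
  assert ((token->flags & PREV_WHITE) == 0);
  return padding == NULL
         || padding->source == NULL
         || (!(padding->source->flags & PREV_WHITE) && token->source == NULL);
}

/* Walk consecutive padding tokens and return the one that is remembered.  */
const struct tok *pick_padding (const struct tok *const *pads, size_t n,
                                int (*should) (const struct tok *,
                                               const struct tok *))
{
  const struct tok *padding = NULL;
  for (size_t i = 0; i < n; i++)
    if (should (padding, pads[i]))
      padding = pads[i];
  return padding;
}

/* The sequence from the message: avoid_paste, then a padding_token whose
   source has PREV_WHITE set.  */
const struct tok demo_white_src     = { PREV_WHITE, NULL };
const struct tok demo_avoid_paste   = { 0, NULL };
const struct tok demo_padding_token = { 0, &demo_white_src };
const struct tok *const demo_pads[] =
  { &demo_avoid_paste, &demo_padding_token };
```

With demo_pads, pick_padding keeps demo_avoid_paste under the old condition
and demo_padding_token under the new one, so the token carrying the
PREV_WHITE information is the one that gets pushed back.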