From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 9F37B394B02A for ; Mon, 31 Jan 2022 22:12:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9F37B394B02A Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-166-jhAIJ1xnMLiLNmALVZjeFA-1; Mon, 31 Jan 2022 17:12:26 -0500 X-MC-Unique: jhAIJ1xnMLiLNmALVZjeFA-1 Received: by mail-qt1-f200.google.com with SMTP id y22-20020ac87c96000000b002d1bfdbd86dso11450219qtv.20 for ; Mon, 31 Jan 2022 14:12:26 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=whdVjb+qf1ePPtT52XVvnmjaCuk/43SJ4CRWtVuFBag=; b=1UMBmIMvNFyId0RVjNGs14VJjBEC9mkFiEKuFp99H6em/yE4Qo/9zLRQv+FMDd+DtO JeUwVbSGdlVqswr6THlYIKM6YvRniaGlQMjKVq2IweWek+Kd9ibEpOAwEl27WybLjHvm Ng19Vr6eZ596qGnny9CxPKYBhFQtycfqeoSp0D7X3wi9kdW1sonodYpYHroiSyzWeHdT 7SqxGyK92D7OqLkGK0V+xdDWsNWfkyGpdatZqTqe/0s8j+fRW1/gKsxOcNbBzjbtNRVx NXoGW62l6FHrhb0qZKHs6nrAOTnXkP5J2YdaB+3GeugAW5B4n2Dr7U1/V3JKhskoIdF6 EIFg== X-Gm-Message-State: AOAM533kjqicHYL9iNA/PPrrk6llZmimD3nAUkQiu74gGb4UiBzlY7Ng TOImpOgFd3ItbU8cg9o4ZHmIZd1kiV23SrgscXprmzqJAcowUW2/vo9x2J6XoHcRnJPxQSEz2Bv Sy7kYgJSvKEM5DKAMRA== X-Received: by 2002:a05:6214:21a4:: with SMTP id t4mr19866718qvc.89.1643667145571; Mon, 31 Jan 2022 14:12:25 -0800 (PST) X-Google-Smtp-Source: ABdhPJwL8KdvL9btmNcAc8e2rSM20z0/FeFx1e+q0EVkEL9Wz3DQRSfwjgvis8jn4jX84xJNOeo2iw== X-Received: by 2002:a05:6214:21a4:: with SMTP id t4mr19866695qvc.89.1643667145196; Mon, 31 Jan 2022 14:12:25 -0800 (PST) Received: from [192.168.1.149] (130-44-159-43.s15913.c3-0.arl-cbr1.sbo-arl.ma.cable.rcncustomer.com. [130.44.159.43]) by smtp.gmail.com with ESMTPSA id v22sm7454108qtc.96.2022.01.31.14.12.24 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 31 Jan 2022 14:12:24 -0800 (PST) Message-ID: Date: Mon, 31 Jan 2022 17:12:23 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.1 Subject: Re: [PATCH] libcpp: Fix up padding handling in funlike_invocation_p [PR104147] To: Jakub Jelinek , "Joseph S. Myers" , Marek Polacek Cc: gcc-patches@gcc.gnu.org References: <20220131190349.GM2646553@tucnak> From: Jason Merrill In-Reply-To: <20220131190349.GM2646553@tucnak> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Jan 2022 22:12:30 -0000 On 1/31/22 14:03, Jakub Jelinek wrote: > Hi! > > As mentioned in the PR, in some cases we preprocess incorrectly when we > encounter an identifier which is defined as function-like macro, followed > by at least 2 CPP_PADDING tokens and then some other identifier. > On the following testcase, the problem is in the 3rd funlike_invocation_p, > the tokens are CPP_NAME Y, CPP_PADDING (the pfile->avoid_paste shared token), > CPP_PADDING (one created with padding_token, val.source is non-NULL and > val.source->flags & PREV_WHITE is non-zero) and then another CPP_NAME. > funlike_invocation_p remembers there was a padding token, but remembers the > first one because of its condition, then the next token is the CPP_NAME, > which is not CPP_OPEN_PAREN, so the CPP_NAME token is backed up, but as we > can't easily backup more tokens, it pushes into a new context the padding > token (the pfile->avoid_paste one). The net effect is that when Y is not > defined as fun-like macro, we read Y, avoid_paste, padding_token, Y, > while if Y is fun-like macro, we read Y, avoid_paste, avoid_paste, Y > (the second avoid_paste is because that is how we handle end of a context). > Now, for stringify_arg that is unfortunately a significant difference, > which handles CPP_PADDING tokens with: > if (token->type == CPP_PADDING) > { > if (source == NULL > || (!(source->flags & PREV_WHITE) > && token->val.source == NULL)) > source = token->val.source; > continue; > } > and later on > /* Leading white space? */ > if (dest - 1 != BUFF_FRONT (pfile->u_buff)) > { > if (source == NULL) > source = token; > if (source->flags & PREV_WHITE) > *dest++ = ' '; > } > source = NULL; > (and c-ppoutput.cc has similar code). > So, when Y is not fun-like macro, ' ' is added because padding_token's > val.source->flags & PREV_WHITE is non-zero, while when it is fun-like > macro, we don't add ' ' in between, because source is NULL and so > used from the next token (CPP_NAME Y), which doesn't have PREV_WHITE set. > > Now, the funlike_invocation_p condition > if (padding == NULL > || (!(padding->flags & PREV_WHITE) && token->val.source == NULL)) > padding = token; > looks very similar to that in stringify_arg/c-ppoutput.cc, so I assume > the intent was to prefer do the same thing and pick the right padding. > But there are significant differences. Both stringify_arg and c-ppoutput.cc > don't remember the CPP_PADDING token, but its val.source instead, while > in funlike_invocation_p we want to remember the padding token that has the > significant information for stringify_arg/c-ppoutput.cc. > So, IMHO we want to overwrite padding if: > 1) padding == NULL (remember that there was any padding at all) > 2) padding->val.source == NULL (this matches the source == NULL > case in stringify_arg) > 3) !(padding->val.source->flags & PREV_WHITE) && token->val.source == NULL > (this matches the !(source->flags & PREV_WHITE) && token->val.source == NULL > case in stringify_arg) > > Bootstrapped/regtested on powerpc64le-linux, ok for trunk > (and after a while for some release branches)? > > 2022-01-31 Jakub Jelinek > > PR preprocessor/104147 > * macro.cc (funlike_invocation_p): For padding prefer a token > with val.source non-NULL especially if it has PREV_WHITE set > on val.source->flags. > > * c-c++-common/cpp/pr104147.c: New test. > > --- libcpp/macro.cc.jj 2022-01-18 11:59:00.277972128 +0100 > +++ libcpp/macro.cc 2022-01-31 15:44:49.208847524 +0100 > @@ -1374,7 +1374,9 @@ funlike_invocation_p (cpp_reader *pfile, > if (token->type != CPP_PADDING) > break; > if (padding == NULL > - || (!(padding->flags & PREV_WHITE) && token->val.source == NULL)) > + || padding->val.source == NULL > + || (!(padding->val.source->flags & PREV_WHITE) > + && token->val.source == NULL)) What if padding->val.source == NULL and padding->flags & PREV_WHITE? Have you tried asserting that's not possible? > padding = token; > } > > --- gcc/testsuite/c-c++-common/cpp/pr104147.c.jj 2022-01-31 15:54:18.113847934 +0100 > +++ gcc/testsuite/c-c++-common/cpp/pr104147.c 2022-01-31 15:54:04.530038941 +0100 > @@ -0,0 +1,27 @@ > +/* PR preprocessor/104147 */ > +/* { dg-do run } */ > + > +#define X(x,y) x y > +#define STR_(x) #x > +#define STR(x) STR_(x) > +const char *str = > +STR(X(Y,Y)) > +#define Y() > +STR(X(Y,Y)) > +#undef Y > +STR(X(Y,Y)) > +#define Y() > +STR(X(Y,Y)) > +STR(X(Y, > +Y)) > +STR(X(Y > +,Y)) > +; > + > +int > +main () > +{ > + if (__builtin_strcmp (str, "Y YY YY YY YY YY Y") != 0) > + __builtin_abort (); > + return 0; > +} > > Jakub >