From: Joern Rennecke
Date: Wed, 11 Oct 2023 08:12:16 +0100
Subject: Re: committed [RISC-V]: Harden test scan patterns
To: Jeff Law
Cc: Vineet Gupta, GCC Patches

On Wed, 11 Oct 2023 at 05:48, Joern Rennecke wrote:

> So I propose we look at the first character of the regexp, and if it's neither
> ^ nor \ (neither caret nor backslash), we consider the regexp un-anchored,
> and prepend ^[^"]* , so it won't allow a match after a double quote.

Looking at the sources for dg-scan / scan-assembler-times / scan-assembler-dem / scan-assembler-dem-not, and the tcl regexp command and re_syntax manual pages, I realise it won't work like that.

The backslash-escaped characters in the source file end up just as single characters if enclosed merely with double quotes, so "\t" is a single character, although {\t} and {\m} will be passed as two characters to regexp (and "\m" is just an m).

And ^ , by default, matches only the beginning of the text, which for the aforementioned scan-assembler* procs means the entire (demangled for *-dem) output file. (The manual is a bit muddled about start of string or start of line, but a test with tclsh shows the default is indeed start of string.)
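To illustrate why a bare ^[^"]* prefix falls short, here is a minimal sketch using Python's re module rather than Tcl. It assumes that Python's (?m) flag is a reasonable stand-in for line-sensitive ^ matching in Tcl; the sample assembler text is made up for the demonstration and does not come from any real test.

```python
import re

# Made-up stand-in for compiler output: the mnemonic "add" appears once
# inside a quoted string and once as a real instruction.
asm = '\t.string\t"add a0"\n\tadd\ta0,a0,a1\n'

# Un-anchored pattern: also counts the occurrence inside the string literal.
plain = re.findall(r'add', asm)

# Prefix with default ^: ^ only matches at the very start of the whole text,
# and [^"]* cannot cross the quote on the first line, so nothing matches.
anchored_default = re.findall(r'^[^"]*add', asm)

# Prefix with line-sensitive ^: only the real instruction is found.
anchored_multiline = re.findall(r'(?m)^[^"]*?add', asm)

print(len(plain), len(anchored_default), len(anchored_multiline))
```

With the default ^ the real instruction on the second line is missed entirely, which is the problem described above.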

We can make use of embedded options to make a prepended string work, i.e. (?w)^[^"]*?

Although I'm not sure what that'd do on macOS - would the compiler output contain lines terminated only with \r, and would these be invisible to ^ ? I see that we have a number of scan patterns that start with \n in g++.dg, so I would hope that we can rely on lines ending with \n . (\n\r or \r\n are OK for this purpose.)

Incidentally, these patterns should also work with (?w)^[^"]*? prepended, as a line that ends should also have a start, but it could get a low count for scan-assembler-times. There are a number of tests in gcc.target/s390 that have directives starting with:
scan-assembler-times {\n\t
which are perfectly anchored, but we might depress the count if we prepend a pattern that matches the start of the line that has the newline.

We'd also have to make an exception for regexps that start with a parenthesis to avoid disabling REs with embedded options.

So it seems we have to except patterns starting with any of:
\\ \t \n (
Maybe we should also add [ to that list, for "[\n\r]" ?
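
The exception list above can be sketched as a small decision function. This is only an illustration in Python; the actual change would live in the Tcl scan-assembler* procs. The name harden and both constants are hypothetical. It assumes that \t and \n in the list mean the literal tab and newline characters that double-quoted Tcl strings produce, that a leading ^ stays exempt as in the earlier proposal, and it includes [ even though that is still an open question.

```python
# Hypothetical helper; names and the Python setting are illustrative only.
LINE_START_PREFIX = '(?w)^[^"]*?'

# First characters for which the pattern is left alone: a backslash, a
# literal tab or newline, an opening parenthesis (possible embedded
# options), and, tentatively, an opening bracket.
EXEMPT_FIRST_CHARS = ("\\", "\t", "\n", "(", "[")


def harden(pattern: str) -> str:
    """Prepend the line-start prefix unless the pattern is exempt."""
    if not pattern:
        return pattern
    first = pattern[0]
    # An explicit ^ is treated as already anchored (assumption carried
    # over from the earlier proposal).
    if first == "^" or first in EXEMPT_FIRST_CHARS:
        return pattern
    return LINE_START_PREFIX + pattern


# Made-up example patterns, not taken from the testsuite.
for pat in ["vsetvli", "\\n\\tlr", "(?n)foo", "[\n\r]bar"]:
    print(repr(pat), "->", repr(harden(pat)))
```

Only the first example pattern gets the prefix; the others start with an exempt character and pass through unchanged.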