From: Joern Rennecke
Date: Wed, 11 Oct 2023 05:48:18 +0100
Subject: Re: committed [RISC-V]: Harden test scan patterns
To: Jeff Law
Cc: Vineet Gupta, GCC Patches
References: <2501e6a4-6f02-429f-8497-226a6b22403c@gmail.com>

On Sat, 30 Sept 2023 at 22:12, Joern Rennecke wrote:

> Also, we might have different directives for not scanning in LTO sections -
> or just ignoring .ascii .  Or maybe the other way round - you have to do
> something special if you want to scan inside strings, and by default we
> don't look inside strings?
> LTO information uses ascii, and ISTR sometimes also a zero-terminated
> variant (asciiz?); there might also be some string constant outputs, or stabs
> information.
> One possible rule I think might work is: if the RE doesn't mention a quote,
> don't scan what's quoted inside double quotes.  Although we might have
> to look out for backslash-escaped quotes to find the proper end of a quoted
> string.

I've thought about this some more, and we need something that's simple for
dejagnu and simple to describe.
So I propose we look at the first character of the regexp, and if it's
neither ^ nor \ (neither caret nor backslash), we consider the regexp
un-anchored and prepend ^[^"]* so that it won't allow a match after a
double quote.

Then document this in sourcebuild.texi, with some mention of LTO information
and stabs, and also mention that if you really want to match irrespective
of a leading quote, you can prepend ^.* to your regexp.

There are good reasons to be more specific with your regexps in general, but
the matches in LTO are particularly damaging because they appear semi-random,
so they often escape a regression test when the test is made, only to surface
during somebody else's regression test.
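
To make the proposed rule concrete, here is a rough sketch in Python rather
than in the Tcl that dejagnu actually uses. The names harden and scan are
hypothetical, the sample assembly lines are made up, and the sketch assumes
the pattern is matched one line at a time; Python's re syntax also differs
from Tcl's in places, so treat this as an illustration only.

```python
import re

def harden(pattern: str) -> str:
    """Hypothetical helper: apply the proposed rule to a scan pattern.

    A pattern that starts with ^ or \\ is left alone; anything else is
    treated as un-anchored and gets ^[^"]* prepended, so it cannot match
    after a double quote on the same line.
    """
    if pattern[:1] in ("^", "\\"):
        return pattern
    return '^[^"]*' + pattern

def scan(pattern: str, asm_text: str) -> list:
    """Return the lines of asm_text that the hardened pattern matches.

    Assumption: matching is done line by line, so ^ means start of line.
    """
    hardened = re.compile(harden(pattern))
    return [line for line in asm_text.splitlines() if hardened.search(line)]

# Made-up assembler output: one real instruction, and one string constant
# (standing in for LTO data) that happens to contain the same mnemonic.
asm = "\n".join([
    "\tvsetvli\ta5,zero,e32,m1,ta,ma",
    "\t.ascii\t\"xvsetvliq\\000\"",
    "\t.string\t\"hello\"",
])

# Un-anchored pattern: only the instruction line is reported.
instruction_only = scan("vsetvli", asm)

# Explicit opt-in with ^.* : the match inside the string is reported too.
with_strings = scan("^.*vsetvli", asm)
```

With this rule a pattern that itself begins with a double quote can still
match the first string on a line, since the prepended part stops right
before that quote.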