From: Joern Rennecke
Date: Sat, 30 Sep 2023 22:12:09 +0100
Subject: Re: committed [RISC-V]: Harden test scan patterns
To: Jeff Law
Cc: Vineet Gupta, GCC Patches

On Fri, 29 Sept 2023 at 14:54, Jeff Law wrote:

> So I recommend we go forward with Joern's approach (so consider that an
> ACK for the trunk). Joern can you post a follow-up manual twiddle so
> that other ports can follow your example and avoid this problem?

The manual... so not in the general web pages, but the stuff under
gcc/doc? I see that we have a description of the scan-assembler*
directives in sourcebuild.texi, so I suppose that it should go there.
I suppose the advice should also apply to scan-assembler-dem(-not),
but not to scan-symbol-section.

The more I think about this, the more it feels like we are providing
the wrong tools and then telling users they're using them incorrectly
(like "You're holding it wrong."). Quoting dots with \. is not much of
an issue, but prepending \t or \m impairs legibility. I like the
obsoleted word-start/end markers \< / \> much better, as they don't
blend in with text.
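To make the legibility trade-off concrete, here is a minimal sketch in
Python rather than the Tcl that DejaGnu actually uses. The assembler
lines, the mnemonic and both patterns are made up for illustration and
are not taken from the committed patch. It compares an unhardened
pattern with one that quotes the dot and delimits the mnemonic with
tabs.

```python
import re

# Made-up assembler output: one real instruction, plus an LTO-style
# string whose payload happens to look similar to the mnemonic.
asm = "\n".join([
    "\tadd.uw\ta0,a1,a2",
    '\t.ascii\t"xaddQuw in some serialized data"',
])

# Unhardened: '.' matches any character and nothing anchors the mnemonic.
loose = re.compile(r"add.uw")
# Hardened: literal dot, and the mnemonic must sit between two tabs.
hardened = re.compile(r"\tadd\.uw\t")


def count_lines(pattern, text):
    """Count the lines of text in which pattern matches at least once."""
    return sum(1 for line in text.splitlines() if pattern.search(line))


loose_hits = count_lines(loose, asm)        # also hits the .ascii line
hardened_hits = count_lines(hardened, asm)  # only the instruction line
```

The hardened pattern is the more robust one, but it is also the one
where the mnemonic is harder to pick out by eye.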
^ as start-of-line marker is nice for legibility, but it will generally
not work with common semantics, as it'll be thrown off by white space,
and even more, by labels.

Also, we might have different directives for not scanning in LTO
sections - or just ignoring .ascii. Or maybe the other way round - you
have to do something special if you want to scan inside strings, and by
default we don't look inside strings? LTO information uses ascii, and
ISTR sometimes also a zero-terminated variant (asciiz?); there might
also be some string constant outputs, or stabs information.

One possible rule I think might work is: if the RE doesn't mention a
quote, don't scan what's quoted inside double quotes. Although we might
have to look out for backslash-escaped quotes to find the proper end of
a quoted string.

Or should we instead make assembly scans specific to sections in which
assembly output goes, like text sections? The danger is that we might
miss a text section by another name. We can give an error if we find no
text section, but there might be a recognizable text section which is a
red herring besides the one that's hidden by some unusual name.
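The quote rule above could be sketched roughly as follows. This is
Python rather than DejaGnu's Tcl, and the helper name scan_asm and the
sample assembly are hypothetical. It only illustrates the idea and is
not a proposed implementation. Quoted strings are blanked out before
matching unless the RE itself contains a double quote, and
backslash-escaped quotes are skipped when looking for the end of a
string.

```python
import re

# A double-quoted string: any run of non-quote, non-backslash characters
# or backslash-escaped pairs (so \" does not end the string).
QUOTED = re.compile(r'"(?:[^"\\]|\\.)*"')


def scan_asm(pattern, text):
    """Count matches of pattern in text.

    Hypothetical rule: if the pattern mentions no double quote, the
    contents of double-quoted strings are not scanned.
    """
    if '"' not in pattern:
        text = QUOTED.sub('""', text)  # keep the quotes, drop the payload
    return sum(1 for _ in re.finditer(pattern, text))


# Made-up output: a call instruction and a string that mentions "call".
asm = '\tcall\tfoo\n\t.ascii\t"call \\"foo\\" call"\n'

naive_hits = len(re.findall(r"call", asm))  # looks inside the string too
default_hits = scan_asm(r"call", asm)       # string contents ignored
quoted_hits = scan_asm(r'"call', asm)       # RE has a quote: scan it all
```

A real version would also have to decide what to do about a pattern
that mentions a quote only inside a bracket expression such as [^"].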