From: Jeff Law
Date: Sun, 19 Nov 2023 19:46:53 -0700
Subject: Re: [RFA] New pass for sign/zero extension elimination
To: gcc-patches@gcc.gnu.org
References: <6d5f8ba7-0c60-4789-87ae-68617ce6ac2c@ventanamicro.com>

On 11/19/23 19:23, Xi Ruoyao wrote:
> On Sun, 2023-11-19 at 17:47 -0700, Jeff Law wrote:
>> This is work originally started by Joern @ Embecosm.
>>
>> There's been a long standing sense that we're generating too many
>> sign/zero extensions on the RISC-V port.  REE is useful, but it's really
>> focused on a relatively narrow part of the extension problem.
>>
>> What Joern's patch does is introduce a new pass which tracks liveness of
>> chunks of pseudo regs.  Specifically it tracks bits 0..7, 8..15, 16..31
>> and 32..63.
>>
>> If it encounters a sign/zero extend that sets bits that are never read,
>> then it replaces the sign/zero extension with a narrowing subreg.  The
>> narrowing subreg usually gets eliminated by subsequent passes (it's just
>> a copy after all).
>>
>> Jivan has done some analysis and found that it eliminates roughly 1% of
>> the dynamic instruction stream for x264 as well as some redundant
>> extensions in the coremark benchmark (both on rv64).  In my own testing
>> as I worked through issues on other architectures I clearly saw it
>> helping in various places within GCC itself or in the testsuite.
>>
>> The basic structure is to first do a fairly standard liveness analysis
>> on the chunks, seeding original state with the liveness data from DF.
>> Once that's stable, we do a final pass to identify the useless
>> extensions and transform them into narrowing subregs.
>>
>> A few key points to remember.
>>
>> For destination processing it is always safe to ignore a destination.
>> Ignoring a destination merely means that whatever was live after the
>> given insn will continue to be live before the insn.
>> What is not safe is to clear a bit in the LIVENOW bitmap for a
>> destination chunk that is not set.  This comes into play with things
>> like STRICT_LOW_PART.
>>
>> For source processing the safe thing to do is to set all the chunks in a
>> register as live.  It is never safe to fail to process a source operand.
>>
>> When a destination object is not fully live, we try to transfer that
>> limited liveness to the source operands.  So for example if bits 16..63
>> are dead in a destination of a PLUS, we need not mark bits 16..63 as
>> live for the source operands.  We have to be careful -- consider a shift
>> count on a target without SHIFT_COUNT_TRUNCATED set.  So we have both a
>> list of RTL codes where we can transfer liveness and a few codes where
>> one of the operands may need to be fully live (ex, a shift count) while
>> the other input may not need to be fully live (value left shifted).
>>
>> Locally we have had this enabled at -O1 and above to encourage testing,
>> but I'm thinking that for the trunk enabling at -O2 and above is the
>> right thing to do.
>>
>> This has (of course) been tested on rv64.  It's also been bootstrapped
>> and regression tested on x86.  Bootstrap and regression tested (C only)
>> for m68k, sh4, sh4eb, alpha.  Earlier versions were also bootstrapped
>> and regression tested on ppc, hppa and s390x (C only for those as well).
>> It's also been tested on the various crosses in my tester.  So we've
>> got reasonable coverage of 16, 32 and 64 bit targets, big and little
>> endian, with and without SHIFT_COUNT_TRUNCATED and all kinds of other
>> oddities.
>>
>> The included tests are for RISC-V only because not all targets are going
>> to have extraneous extensions.  There's tests from coremark, x264 and
>> GCC's bz database.  It probably wouldn't be hard to add aarch64
>> testcases.  The BZs listed are improved by this patch for aarch64.
>>
>> Given the amount of work Jivan and I have done, I'm not comfortable
>> self-approving at this time.  I'd much rather have another set of eyes
>> on the code.  Hopefully the code is documented well enough for that to
>> be a useful exercise.
>>
>> So, no need to work from Pago Pago for this patch.  I may make another
>> attempt at the eswin conditional move work while working virtually in
>> Pago Pago though.
>>
>> Thoughts, comments, recommendations?
>
> Unfortunately, I get some ICE building stage 1 libgcc with this patch on
> loongarch64-linux-gnu:
>
> during RTL pass: ext_dce
> ../../../gcc/libgcc/libgcc2.c: In function ‘__absvdi2’:
> ../../../gcc/libgcc/libgcc2.c:224:1: internal compiler error: Segmentation fault
>   224 | }
>       | ^
> 0x120baa477 crash_signal
>         ../../gcc/gcc/toplev.cc:316
> 0x1216aeeb4 ext_dce_process_sets
>         ../../gcc/gcc/ext-dce.cc:128
> 0x1216afbaf ext_dce_process_bb
>         ../../gcc/gcc/ext-dce.cc:647
> 0x1216afbaf ext_dce
>         ../../gcc/gcc/ext-dce.cc:802
> 0x1216afbaf execute
>         ../../gcc/gcc/ext-dce.cc:868
> Please submit a full bug report, with preprocessed source (by using -freport-bug).
> Please include the complete backtrace with any bug report.
> See for instructions.

Can you pass along the .i file?  I do regularly build loongarch
(including earlier today) so this is totally unexpected.

jeff
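
For anyone following the thread, the chunk-liveness idea in the quoted description can be sketched roughly as below.  This is a toy Python model, not the code in ext-dce.cc.  The chunk boundaries come from the description; the names CHUNKS, live_chunks, low_closure, source_liveness and extension_is_removable are hypothetical and exist only for illustration, and the rules for "plus" and "ashift" are a simplified assumption about how limited liveness might be transferred.

```python
# Toy model of chunk-granular liveness. NOT the code in ext-dce.cc.
# Chunk boundaries follow the description: 0..7, 8..15, 16..31, 32..63.
CHUNKS = [(0, 7), (8, 15), (16, 31), (32, 63)]
ALL_LIVE = frozenset(range(len(CHUNKS)))


def live_chunks(bit_mask):
    """Map a 64-bit mask of live bits to the set of chunk indices it touches."""
    live = set()
    for idx, (lo, hi) in enumerate(CHUNKS):
        chunk_mask = ((1 << (hi - lo + 1)) - 1) << lo
        if bit_mask & chunk_mask:
            live.add(idx)
    return frozenset(live)


def low_closure(dest_live):
    """All chunks at or below the highest live chunk (carries only move upward)."""
    if not dest_live:
        return frozenset()
    return frozenset(range(max(dest_live) + 1))


def source_liveness(code, dest_live):
    """Return (live chunks of operand 0, live chunks of operand 1).

    Hypothetical, simplified rules:
      "plus":   result bit i depends only on source bits 0..i, so limited
                liveness can be transferred to both operands.
      "ashift": the shifted value gets limited liveness; the shift count is
                conservatively kept fully live.
      other:    mark every chunk live, the always-safe choice for sources.
    """
    if code == "plus":
        need = low_closure(dest_live)
        return need, need
    if code == "ashift":
        return low_closure(dest_live), ALL_LIVE
    return ALL_LIVE, ALL_LIVE


def extension_is_removable(src_width, dest_live):
    """True if no live chunk contains a bit at or above src_width.

    The only effect of an extension from src_width bits is on bits
    src_width..63, so it does no useful work when those chunks are all dead.
    """
    return all(CHUNKS[idx][1] < src_width for idx in dest_live)
```

In this model, extension_is_removable(32, live_chunks(0xFFFF)) is true: only bits 0..15 are read afterwards, so a 32-to-64 bit extension sets nothing that is live.  Treating unknown codes as "everything live" mirrors the point that it is never safe to fail to process a source operand.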