From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4c3588b1-f05c-4e7f-ad7a-e7050cf45859@ventanamicro.com>
Date: Sun, 19 Nov 2023 19:52:29 -0700
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [RFA] New pass for sign/zero extension elimination
Content-Language: en-US
To: Xi Ruoyao, "gcc-patches@gcc.gnu.org"
Cc: Jivan Hakobyan
References: <6d5f8ba7-0c60-4789-87ae-68617ce6ac2c@ventanamicro.com>
From: Jeff Law
Content-Type: text/plain; charset=UTF-8;
 format=flowed
Content-Transfer-Encoding: 8bit

On 11/19/23 19:23, Xi Ruoyao wrote:
> On Sun, 2023-11-19 at 17:47 -0700, Jeff Law wrote:
>> This is work originally started by Joern @ Embecosm.
>>
>> There's been a long standing sense that we're generating too many
>> sign/zero extensions on the RISC-V port.  REE is useful, but it's really
>> focused on a relatively narrow part of the extension problem.
>>
>> What Joern's patch does is introduce a new pass which tracks liveness of
>> chunks of pseudo regs.  Specifically it tracks bits 0..7, 8..15, 16..31
>> and 32..63.
>>
>> If it encounters a sign/zero extend that sets bits that are never read,
>> then it replaces the sign/zero extension with a narrowing subreg.  The
>> narrowing subreg usually gets eliminated by subsequent passes (it's just
>> a copy after all).
>>
>> Jivan has done some analysis and found that it eliminates roughly 1% of
>> the dynamic instruction stream for x264 as well as some redundant
>> extensions in the coremark benchmark (both on rv64).  In my own testing
>> as I worked through issues on other architectures I clearly saw it
>> helping in various places within GCC itself or in the testsuite.
>>
>> The basic structure is to first do a fairly standard liveness analysis
>> on the chunks, seeding original state with the liveness data from DF.
>> Once that's stable, we do a final pass to identify the useless
>> extensions and transform them into narrowing subregs.
>>
>> A few key points to remember.
>>
>> For destination processing it is always safe to ignore a destination.
>> Ignoring a destination merely means that whatever was live after the
>> given insn will continue to be live before the insn.  What is not safe
>> is to clear a bit in the LIVENOW bitmap for a destination chunk that is
>> not set.  This comes into play with things like STRICT_LOW_PART.
>>
>> For source processing the safe thing to do is to set all the chunks in a
>> register as live.  It is never safe to fail to process a source operand.
>>
>> When a destination object is not fully live, we try to transfer that
>> limited liveness to the source operands.  So for example if bits 16..63
>> are dead in a destination of a PLUS, we need not mark bits 16..63 as
>> live for the source operands.  We have to be careful -- consider a shift
>> count on a target without SHIFT_COUNT_TRUNCATED set.  So we have both a
>> list of RTL codes where we can transfer liveness and a few codes where
>> one of the operands may need to be fully live (e.g., a shift count) while
>> the other input may not need to be fully live (value left shifted).
>>
>> Locally we have had this enabled at -O1 and above to encourage testing,
>> but I'm thinking that for the trunk enabling at -O2 and above is the
>> right thing to do.
>>
>> This has (of course) been tested on rv64.  It's also been bootstrapped
>> and regression tested on x86.  Bootstrap and regression tested (C only)
>> for m68k, sh4, sh4eb, alpha.  Earlier versions were also bootstrapped
>> and regression tested on ppc, hppa and s390x (C only for those as well).
>> It's also been tested on the various crosses in my tester.  So we've
>> got reasonable coverage of 16, 32 and 64 bit targets, big and little
>> endian, with and without SHIFT_COUNT_TRUNCATED and all kinds of other
>> oddities.
>>
>> The included tests are for RISC-V only because not all targets are going
>> to have extraneous extensions.  There are tests from coremark, x264 and
>> GCC's bz database.  It probably wouldn't be hard to add aarch64
>> testcases.
>> The BZs listed are improved by this patch for aarch64.
>>
>> Given the amount of work Jivan and I have done, I'm not comfortable
>> self-approving at this time.  I'd much rather have another set of eyes
>> on the code.  Hopefully the code is documented well enough for that to
>> be a useful exercise.
>>
>> So, no need to work from Pago Pago for this patch.  I may make another
>> attempt at the eswin conditional move work while working virtually in
>> Pago Pago though.
>>
>> Thoughts, comments, recommendations?
>
> Unfortunately, I get an ICE building stage 1 libgcc with this patch on
> loongarch64-linux-gnu:
>
> during RTL pass: ext_dce
> ../../../gcc/libgcc/libgcc2.c: In function ‘__absvdi2’:
> ../../../gcc/libgcc/libgcc2.c:224:1: internal compiler error: Segmentation fault
>   224 | }
>       | ^
> 0x120baa477 crash_signal
> 	../../gcc/gcc/toplev.cc:316
> 0x1216aeeb4 ext_dce_process_sets
> 	../../gcc/gcc/ext-dce.cc:128
> 0x1216afbaf ext_dce_process_bb
> 	../../gcc/gcc/ext-dce.cc:647
> 0x1216afbaf ext_dce
> 	../../gcc/gcc/ext-dce.cc:802
> 0x1216afbaf execute
> 	../../gcc/gcc/ext-dce.cc:868
> Please submit a full bug report, with preprocessed source (by using -freport-bug).
> Please include the complete backtrace with any bug report.
> See for instructions.

I think I know what's going on here.

jeff
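
For anyone following the thread, the chunk-liveness rules in the quoted
description can be sketched roughly as below.  This is a standalone
illustration only, not code from ext-dce.cc: the names CHUNK_*,
chunks_for_width, extension_is_dead and source_liveness are hypothetical,
and a 64-bit register with exactly the four chunks named above is assumed.

```cpp
#include <cassert>
#include <cstdint>

// One bit per tracked chunk of a 64-bit pseudo register, matching the
// groups named in the thread: bits 0..7, 8..15, 16..31 and 32..63.
// (Hypothetical names; the real pass keeps its state in bitmaps.)
enum : std::uint8_t {
  CHUNK_0_7   = 1u << 0,
  CHUNK_8_15  = 1u << 1,
  CHUNK_16_31 = 1u << 2,
  CHUNK_32_63 = 1u << 3,
  CHUNK_ALL   = 0xF
};

// Chunks covered by the low `bits` bits of a register.
constexpr std::uint8_t chunks_for_width(unsigned bits) {
  return static_cast<std::uint8_t>(
      bits <= 8    ? CHUNK_0_7
      : bits <= 16 ? (CHUNK_0_7 | CHUNK_8_15)
      : bits <= 32 ? (CHUNK_0_7 | CHUNK_8_15 | CHUNK_16_31)
                   : CHUNK_ALL);
}

// A sign/zero extension from `src_bits` only adds information in the
// chunks above the source width.  If none of those chunks is live after
// the insn, the extension sets bits that are never read and could be
// replaced by a narrowing subreg.
bool extension_is_dead(unsigned src_bits, std::uint8_t live_after) {
  assert(src_bits == 8 || src_bits == 16 || src_bits == 32);
  const std::uint8_t upper =
      static_cast<std::uint8_t>(CHUNK_ALL & ~chunks_for_width(src_bits));
  return (live_after & upper) == 0;
}

// Source processing: for codes where limited liveness can be transferred
// (the thread's PLUS example), the sources need only the chunks that are
// live in the destination.  Otherwise the safe answer is "all chunks live".
std::uint8_t source_liveness(bool can_transfer, std::uint8_t dest_live) {
  return can_transfer ? dest_live : static_cast<std::uint8_t>(CHUNK_ALL);
}
```

With these definitions, an extension from 32 bits is dead when only chunks
0..31 are live afterwards, and a shift count on a target without
SHIFT_COUNT_TRUNCATED would go through source_liveness with can_transfer
set to false, so every chunk of it stays live.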