From: Jeff Law
Date: Fri, 21 Apr 2023 18:08:30 -0600
Subject: Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
To: Feng Wang, gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com
References: <20230324015324.13616-1-wangfeng@eswincomputing.com>
In-Reply-To: <20230324015324.13616-1-wangfeng@eswincomputing.com>

On 3/23/23 19:53, Feng Wang wrote:
> This patch optimize the combine processing for sext.b/h in rv64.
> Please refer to the following test case,
> int sextb32(int x)
> { return (x << 24) >> 24; }
>
> The rtl expression is as follows,
> (insn 6 3 7 2 (set (reg:SI 138)
>         (ashift:SI (subreg/s/u:SI (reg/v:DI 136 [ xD.2271 ]) 0)
>             (const_int 24 [0x18]))) "sextb.c":2:13 195 {ashlsi3}
>      (expr_list:REG_DEAD (reg/v:DI 136 [ xD.2271 ])
>         (nil)))
> (insn 7 6 8 2 (set (reg:SI 137)
>         (ashiftrt:SI (reg:SI 138)
>             (const_int 24 [0x18]))) "sextb.c":2:20 196 {ashrsi3}
>      (expr_list:REG_DEAD (reg:SI 138)
>         (nil)))
>
> During the combine phase, they will combine into
> (set (reg:SI 137)
>     (ashiftrt:SI (subreg:SI (ashift:DI (reg:DI 140)
>                 (const_int 24 [0x18])) 0)
>         (const_int 24 [0x18])))
>
> The optimal combine result is
> (set (reg:SI 137)
>     (sign_extend:SI (subreg:QI (reg:DI 140) 0)))
> This can be converted to the sext ins.
>
> Due to the influence of subreg,the current processing
> can't obtain the imm of left shifts. Need to peel off
> another layer of rtl to obtain it.
>
> gcc/ChangeLog:
>
>     * combine.cc (extract_left_shift): Add SUBREG case.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/riscv/zbb-sext-rv64.c: New test.

SUBREGs have painful semantics and we should be very careful about just
stripping them. For example, you might have a subreg that extracts the
*high* part. Or you might have (subreg (mem)) or a paradoxical subreg, etc.

At the *least* this case would need verification that you're getting the
lowpart. However, I suspect there are other conditions that need to be
checked to make this valid.

But I would suggest we look elsewhere. It could be that combine is
reassociating the subreg in ways that are undesirable and that ultimately
make our job harder. Additionally, if we can fix this in a generic
simplification/folder routine, then multiple passes can benefit.

For example, in simplify_context::simplify_binary_operation we get a form
more amenable to optimization:

> #0  simplify_context::simplify_binary_operation (this=0x7fffffffda68, code=ASHIFTRT, mode=E_SImode,
>     op0=0x7fffea11eb40, op1=0x7fffea009610) at /home/jlaw/riscv-persist/ventana/gcc/gcc/simplify-rtx.cc:2558
> 2558      gcc_assert (GET_RTX_CLASS (code) != RTX_COMPARE);
> (gdb) p code
> $24 = ASHIFTRT
> (gdb) p mode
> $25 = E_SImode
> (gdb) p debug_rtx (op0)
> (ashift:SI (subreg/s/u:SI (reg/v:DI 74 [ x ]) 0)
>     (const_int 24 [0x18]))
> $26 = void
> (gdb) p debug_rtx (op1)
> (const_int 24 [0x18])
> $27 = void

So that's (ashiftrt (ashift (object) 24) 24), i.e. sign extension. In other
words, we really don't have to think about the fact that the underlying
object is a SUBREG, because the outer operations are very clearly a sign
extension regardless of the object they're operating on.

With that in mind, I would suggest you look at adding a case to detect
zero/sign extension in simplify_context::simplify_binary_operation_1.

Thanks,
Jeff
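
For reference, the shift-pair reading above can be checked at the source
level. The following is only a minimal C sketch, not GCC code: the function
names are hypothetical, and it assumes GCC's implementation-defined behavior
for converting out-of-range values to signed types and for right-shifting
negative values (arithmetic shift). It compares the (x << 24) >> 24 form
from the test case with a direct sign extension of the low byte, which is
what (sign_extend:SI (subreg:QI ...)) describes.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-pair form from the test case: (x << 24) >> 24.
   The left shift is done on uint32_t to avoid signed-overflow UB.
   Converting back to int32_t and right-shifting a negative value are
   implementation-defined in ISO C; GCC wraps and sign-fills. */
static int32_t sext_b_by_shifts(int32_t x)
{
    return (int32_t)((uint32_t)x << 24) >> 24;
}

/* Direct form: sign-extend the low 8 bits of x. */
static int32_t sext_b_direct(int32_t x)
{
    return (int32_t)(int8_t)(uint8_t)x;
}

/* Hypothetical helper: assert that both forms agree for every possible
   low byte, with arbitrary non-zero upper bits that must be ignored. */
static void check_sext_b_forms(void)
{
    for (uint32_t lo = 0; lo < 256; lo++) {
        int32_t x = (int32_t)(0xABCDEF00u | lo);
        assert(sext_b_by_shifts(x) == sext_b_direct(x));
    }
}
```

Calling check_sext_b_forms() exercises all 256 low-byte values. The result
depends only on the low 8 bits of the operand, which is why the outer
operations can be recognized as a sign extension without looking at what
the operand is.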