From: Jeff Law
Date: Thu, 29 Jun 2023 08:09:27 -0600
Subject: Re: [PATCH 10/11] riscv: thead: Add support for the XTheadMemIdx ISA extension
To: Christoph Müllner
Cc: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt, Andrew Waterman, Philipp Tomsich, Cooper Qu, Lifang Xia, Yunhai Shang, Zhiwei Liu
References: <20230428062314.2995571-1-christoph.muellner@vrull.eu> <0e13e932-64c4-fe33-e0f8-21380809a6ba@gmail.com>

On 6/29/23 01:39, Christoph Müllner wrote:
> On Wed, Jun 28, 2023 at 8:23 PM Jeff Law wrote:
>>
>> On 6/28/23 06:39, Christoph Müllner wrote:
>>
>>>>> +;; XTheadMemIdx overview:
>>>>> +;; All peephole passes attempt to improve the operand utilization of
>>>>> +;; XTheadMemIdx instructions, where one sign or zero extended
>>>>> +;; register-index-operand can be shifted left by a 2-bit immediate.
>>>>> +;;
>>>>> +;; The basic idea is the following optimization:
>>>>> +;; (set (reg 0) (op (reg 1) (imm 2)))
>>>>> +;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
>>>>> +;; ==>
>>>>> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
>>>>> +;; This optimization only valid if (reg 0) has no further uses.
>>>> Couldn't this be done by combine if you created define_insn patterns
>>>> rather than define_peephole2 patterns? Similarly for the other cases
>>>> handled here.
>>>
>>> I was inspired by XTheadMemPair, which merges two memory accesses
>>> into a mem-pair instruction (and which got inspiration from
>>> gcc/config/aarch64/aarch64-ldpstp.md).
>> Right. I'm pretty familiar with those. They cover a different case,
>> specifically the two insns being optimized don't have a true data
>> dependency between them. ie, the first instruction does not produce a
>> result used in the second insn.
>>
>> In the case above there is a data dependency on reg0. ie, the first
>> instruction generates a result used in the second instruction. combine
>> is usually the best place to handle the data dependency case.
>
> Ok, understood.
>
> It is a bit of a special case here, because the peephole is restricted
> to those cases, where reg0 is not used elsewhere (peep2_reg_dead_p()).
> I have not seen how to do this for combiner optimizations.

If the value is used elsewhere, then the combiner will generate a
parallel with two sets. If the value dies, then the combiner generates
the one set.

ie given

  (set (t) (op0 (a) (b)))
  (set (r) (op1 (c) (t)))

If "t" is dead, then combine will present you with:

  (set (r) (op1 (c) (op0 (a) (b))))

If "t" is used elsewhere, then combine will present you with:

  (parallel
    [(set (r) (op1 (c) (op0 (a) (b))))
     (set (t) (op0 (a) (b)))])

Which makes perfect sense if you think about it for a while.
If you still need "t", then the first sequence simply isn't valid, as it
doesn't preserve that side effect. Hence combine tries to produce a
sequence with the combined operation, but with the side effect of the
first statement included as well.

Jeff