From: Christoph Müllner
Date: Thu, 6 Jul 2023 08:48:32 +0200
Subject: Re: [PATCH 10/11] riscv: thead: Add support for the XTheadMemIdx ISA extension
To: Jeff Law
Cc: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt, Andrew Waterman, Philipp Tomsich, Cooper Qu, Lifang Xia, Yunhai Shang, Zhiwei Liu
References: <20230428062314.2995571-1-christoph.muellner@vrull.eu> <0e13e932-64c4-fe33-e0f8-21380809a6ba@gmail.com>

On Thu, Jun 29, 2023 at 4:09 PM Jeff Law wrote:
>
>
>
> On 6/29/23 01:39, Christoph Müllner wrote:
> > On Wed, Jun 28, 2023 at 8:23 PM Jeff Law wrote:
> >>
> >>
> >>
> >> On 6/28/23 06:39, Christoph Müllner wrote:
> >>
> >>>>> +;; XTheadMemIdx overview:
> >>>>> +;; All peephole passes attempt to improve the operand utilization of
> >>>>> +;; XTheadMemIdx instructions, where one sign or zero extended
> >>>>> +;; register-index-operand can be shifted left by a 2-bit immediate.
> >>>>> +;;
> >>>>> +;; The basic idea is the following optimization:
> >>>>> +;; (set (reg 0) (op (reg 1) (imm 2)))
> >>>>> +;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
> >>>>> +;; ==>
> >>>>> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
> >>>>> +;; This optimization only valid if (reg 0) has no further uses.
> >>>> Couldn't this be done by combine if you created define_insn patterns
> >>>> rather than define_peephole2 patterns?  Similarly for the other cases
> >>>> handled here.
> >>>
> >>> I was inspired by XTheadMemPair, which merges two memory accesses
> >>> into a mem-pair instruction (and which got inspiration from
> >>> gcc/config/aarch64/aarch64-ldpstp.md).
> >> Right.  I'm pretty familiar with those.  They cover a different case,
> >> specifically the two insns being optimized don't have a true data
> >> dependency between them.  ie, the first instruction does not produce a
> >> result used in the second insn.
> >>
> >>
> >> In the case above there is a data dependency on reg0.  ie, the first
> >> instruction generates a result used in the second instruction.  combine
> >> is usually the best place to handle the data dependency case.
> >
> > Ok, understood.
> >
> > It is a bit of a special case here, because the peephole is restricted
> > to those cases, where reg0 is not used elsewhere (peep2_reg_dead_p()).
> > I have not seen how to do this for combiner optimizations.
> If the value is used elsewhere, then the combiner will generate a
> parallel with two sets.  If the value dies, then the combiner generates
> the one set.  ie given
>
> (set (t) (op0 (a) (b)))
> (set (r) (op1 (c) (t)))
>
> If "t" is dead, then combine will present you with:
>
> (set (r) (op1 (c) (op0 (a) (b))))
>
> If "t" is used elsewhere, then combine will present you with:
>
> (parallel
>   [(set (r) (op1 (c) (op0 (a) (b))))
>    (set (t) (op0 (a) (b)))])
>
> Which makes perfect sense if you think about it for a while.
> If you still need "t", then the first sequence simply isn't valid as it
> doesn't preserve that side effect.  Hence it tries to produce a sequence
> with the combined operation, but with the side effect of the first
> statement included as well.

Thanks for this!

Of course I was "lucky" and ran into the issue that the patterns did not
match, because combine produced MULTs where I expected ASHIFTs.
After reading enough of combine.cc I understood that this is on purpose
(for addresses) and that I have to adjust my INSNs accordingly.

I've changed the patches for XTheadMemIdx and XTheadFMemIdx and will
send out a new series.

Thanks,
Christoph
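
P.S. For anyone following along, here is a rough sketch of the MULT vs.
ASHIFT point. It is a toy model in Python, not code from the patch or from
GCC: the function names and the tuple representation are made up for
illustration. It assumes only what is stated above (the index register can
be shifted left by a 2-bit immediate) plus the documented RTL
canonicalization that a left shift inside a memory address is written as a
multiplication by a power of two.

```python
# Toy model only: these names do not exist in GCC or in the patch.

MAX_SHIFT = 3  # a 2-bit shift immediate can encode 0..3


def canonical_index(reg, shift, in_address):
    """Model the canonical RTL form of `reg << shift`.

    Inside a MEM address the shift is expressed as MULT by 2**shift;
    outside an address it stays an ASHIFT.
    """
    if in_address:
        return ("mult", reg, 1 << shift)
    return ("ashift", reg, shift)


def mult_to_shift(scale):
    """Map a MULT scale back to a shift amount.

    Returns None if `scale` is not a power of two or if the resulting
    shift does not fit into the 2-bit immediate.
    """
    if scale <= 0 or scale & (scale - 1):
        return None  # not a power of two
    shift = scale.bit_length() - 1
    return shift if shift <= MAX_SHIFT else None
```

So a pattern that wants to match the combined address has to accept the
MULT form and recover the shift from the scale, e.g.
mult_to_shift(canonical_index("r1", 2, True)[2]) gives back 2, while a
scale of 16 or 6 is rejected.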