From: "H.J. Lu"
Date: Tue, 20 Jul 2021 11:52:05 -0700
Subject: Re: [PATCH] Add QI vector mode support to by-pieces for memset
To: "H.J. Lu via Gcc-patches", Jeff Law, Richard Sandiford
References: <20210713214956.2010942-1-hjl.tools@gmail.com>
List-Id: Gcc-patches mailing list

On Tue, Jul 20, 2021 at 8:12 AM Richard Sandiford wrote:
>
> Richard Sandiford via Gcc-patches writes:
> > "H.J. Lu via Gcc-patches" writes:
> >> On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford
> >> wrote:
> >>>
> >>> "H.J. Lu via Gcc-patches" writes:
> >>> >> > +       {
> >>> >> > +         /* First generate subreg of word mode if the previous mode is
> >>> >> > +            wider than word mode and word mode is wider than MODE.  */
> >>> >> > +         prev_rtx = simplify_gen_subreg (word_mode, prev_rtx,
> >>> >> > +                                         prev_mode, 0);
> >>> >> > +         prev_mode = word_mode;
> >>> >> > +       }
> >>> >> > +      if (prev_rtx != nullptr)
> >>> >> > +       target = simplify_gen_subreg (mode, prev_rtx, prev_mode, 0);
> >>> >>
> >>> >> This should be lowpart_subreg, since 0 isn't the right offset for
> >>> >> big-endian targets.  Using lowpart_subreg should also avoid the need
> >>> >> for the word_size “if” above: lowpart_subreg can handle lowpart subword
> >>> >> subregs of multiword values.
> >>> >
> >>> > I tried it.  It didn't work since it caused the LRA failure.  I replaced
> >>> > simplify_gen_subreg with lowpart_subreg instead.
> >>>
> >>> What specifically went wrong?
> >>
> >> With vector broadcast, for
> >> ---
> >> extern void *ops;
> >>
> >> void
> >> foo (int c)
> >> {
> >>   __builtin_memset (ops, c, 18);
> >> }
> >> ---
> >> we generate HI from V16QI.  With a single lowpart_subreg, I get
> >>
> >> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
> >>                 (const_int 16 [0x10])) [0 MEM [(void *)ops.0_1]+16 S2 A8])
> >>         (subreg:HI (reg:V16QI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal}
> >>      (nil))
> >>
> >> instead of
> >>
> >> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
> >>                 (const_int 16 [0x10])) [0 MEM [(void *)ops.0_1]+16 S2 A8])
> >>         (subreg:HI (reg:DI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal}
> >>      (nil))
> >>
> >> IRA and LRA fail to reload:
> >>
> >> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
> >>                 (const_int 16 [0x10])) [0 MEM [(void *)ops.0_1]+16 S2 A8])
> >>         (subreg:HI (reg:V16QI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal}
> >>      (nil))
> >>
> >> since ix86_can_change_mode_class has
> >>
> >>   if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
> >>     {
> >>       /* Vector registers do not support QI or HImode loads.  If we don't
> >>          disallow a change to these modes, reload will assume it's ok to
> >>          drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
> >>          the vec_dupv4hi pattern.  */
> >>       if (GET_MODE_SIZE (from) < 4)
> >>         return false;
> >>     }
> >
> > Ah!  OK.  In that case, maybe we should have something like:
> >
> >   if (REG_P (prev_rtx)
> >       && HARD_REGISTER_P (prev_rtx)
> >       && REG_CAN_CHANGE_MODE_P (REGNO (prev_rtx), prev->mode, mode))
>
> Sorry, make that last line:
>
>       && lowpart_subreg_regno (REGNO (prev_rtx), prev->mode, mode) < 0
>
> where lowpart_subreg_regno is a new wrapper around simplify_subreg_regno
> that uses subreg_lowpart_offset (mode, prev->mode) as the offset.

Fixed.

I submitted the v3 patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575670.html

Thanks.

> Thanks,
> Richard
>
> >     prev_rtx = copy_to_reg (prev_rtx);
> >
> > and then just have the single lowpart_subreg after that.
> >
> > Thanks,
> > Richard

-- 
H.J.
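
The review point that 0 is not the right subreg offset on big-endian targets can be illustrated with a small model. This is only a sketch, not GCC's subreg_lowpart_offset: the name model_lowpart_offset and its parameters are hypothetical, and it assumes byte and word endianness agree (the real helper also handles targets where they differ).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model, NOT GCC's implementation: return the byte offset
   of the low-order OUTER_SIZE bytes inside an INNER_SIZE-byte value.
   Assumes byte order and word order are the same.  */
unsigned
model_lowpart_offset (unsigned outer_size, unsigned inner_size,
                      bool big_endian)
{
  /* A lowpart cannot be wider than the value it is taken from.  */
  assert (outer_size <= inner_size);

  /* Little-endian: the least significant bytes are stored first, so
     the lowpart starts at offset 0.  Big-endian: they are stored last,
     so the lowpart starts INNER_SIZE - OUTER_SIZE bytes in.  */
  return big_endian ? inner_size - outer_size : 0;
}
```

Under these assumptions, taking a 2-byte HI lowpart of a 16-byte V16QI value gives offset 0 on a little-endian target but offset 14 on a big-endian one, which is why a hard-coded 0 only happens to work on little-endian targets such as x86.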