From: "H.J. Lu"
Date: Mon, 31 May 2021 11:20:34 -0700
Subject: Re: [PATCH v2 01/11] Add TARGET_READ_MEMSET_VALUE/TARGET_GEN_MEMSET_VALUE
To: Jeff Law
Cc: GCC Patches, Richard Sandiford
References: <20210511233535.4448-1-hjl.tools@gmail.com> <20210511233535.4448-2-hjl.tools@gmail.com> <5df9487f-e9e1-f653-bef0-779de9214f8f@gmail.com>

On Mon, May 31, 2021 at 11:13 AM H.J. Lu wrote:
>
> On Mon, May 31, 2021 at 11:07 AM Jeff Law wrote:
> >
> > On 5/31/2021 6:04 AM, H.J. Lu wrote:
> > > On Sun, May 30, 2021 at 11:49 AM Jeff Law wrote:
> > >>
> > >> On 5/11/2021 5:35 PM, H.J. Lu via Gcc-patches wrote:
> > >>> Add TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE to support
> > >>> target instructions to duplicate QImode value to TImode/OImode/XImode
> > >>> value for memset.  Define SCRATCH_SSE_REG as a scratch register for
> > >>> ix86_gen_memset_value.
> > >>>
> > >>> gcc/
> > >>>
> > >>>       PR middle-end/90773
> > >>>       * builtins.c (builtin_memset_read_str): Call
> > >>>       targetm.read_memset_value.
> > >>>       (builtin_memset_gen_str): Call targetm.gen_memset_value.
> > >>>       * target.def (read_memset_value): New hook.
> > >>>       (gen_memset_value): Likewise.
> > >>>       * targhooks.c: Include "builtins.h".
> > >>>       (default_read_memset_value): New function.
> > >>>       (default_gen_memset_value): Likewise.
> > >>>       * targhooks.h (default_read_memset_value): New prototype.
> > >>>       (default_gen_memset_value): Likewise.
> > >>>       * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
> > >>>       Make it global.
> > >>>       * config/i386/i386-protos.h (ix86_minimum_incoming_stack_boundary):
> > >>>       New.
> > >>>       (ix86_expand_vector_init_duplicate): Likewise.
> > >>>       * config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Add
> > >>>       an argument to ignore stack_alignment_estimated.  It is passed
> > >>>       as false by default.
> > >>>       (ix86_gen_memset_value_from_prev): New function.
> > >>>       (ix86_gen_memset_value): Likewise.
> > >>>       (ix86_read_memset_value): Likewise.
> > >>>       (TARGET_GEN_MEMSET_VALUE): New.
> > >>>       (TARGET_READ_MEMSET_VALUE): Likewise.
> > >>>       * config/i386/i386.h (SCRATCH_SSE_REG): New.
> > >>>       * doc/tm.texi.in: Add TARGET_READ_MEMSET_VALUE and
> > >>>       TARGET_GEN_MEMSET_VALUE hooks.
> > >>>       * doc/tm.texi: Regenerated.
> > >>>
> > >>> gcc/testsuite/
> > >>>
> > >>>       PR middle-end/90773
> > >>>       * gcc.target/i386/pr90773-15.c: New test.
> > >>>       * gcc.target/i386/pr90773-16.c: Likewise.
> > >>>       * gcc.target/i386/pr90773-17.c: Likewise.
> > >>>       * gcc.target/i386/pr90773-18.c: Likewise.
> > >>>       * gcc.target/i386/pr90773-19.c: Likewise.
> > >> Why does this need target hooks?  ISTM the right way to go here is to
> > >> just emit the constant load to the target register and let the target
> > >> figure out how best to construct the constant into the register.  If
> > >> that means load it via QImode and broadcast, that's fine, but I'm not
> > >> sure why that's not all implemented in the target files.
> > >>
> > > I will submit a patch to add optabs instead.
> > I may be missing something, but I'm not even sure why we need special
> > optabs.
> >
> > Aren't you just trying to efficiently get a constant element broadcast
> > across an entire vector?
>
> Since vec_duplicate must not fail and for broadcast from a constant QImode
> value, vec_duplicate may not be faster than a compile-time constant, I am
> adding vec_const_duplicate.  If vec_duplicate can fail, I don't need
> vec_const_duplicate.
>
> --
> H.J.

For

extern void *ops;

void
foo (int c)
{
  __builtin_memset (ops, 4, 32);
}

without vec_const_duplicate, I got

        movl    $4, %eax
        movq    ops(%rip), %rdx
        movd    %eax, %xmm0
        punpcklbw       %xmm0, %xmm0
        punpcklwd       %xmm0, %xmm0
        pshufd  $0, %xmm0, %xmm0
        movups  %xmm0, (%rdx)
        movups  %xmm0, 16(%rdx)
        ret

with vec_const_duplicate, I got

        movq    ops(%rip), %rax
        movdqa  .LC0(%rip), %xmm0
        movups  %xmm0, (%rax)
        movups  %xmm0, 16(%rax)
        ret

-- 
H.J.
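
To make the comparison between the two listings concrete, here is a minimal portable C sketch of what both of them compute. It is not part of the patch: splat16 and memset32_via_splat are hypothetical names used only for illustration, and the assumption that .LC0 holds sixteen bytes of 0x04 is inferred from the memset arguments rather than shown in the output above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: replicate one byte across a 16-byte block.
   This is the value the SSE register holds before the two stores,
   whether it is built at run time (punpcklbw/punpcklwd/pshufd) or
   loaded from a precomputed constant.  */
void
splat16 (unsigned char value, unsigned char out[16])
{
  /* Multiplying by 0x0101010101010101 copies the byte into every
     byte lane of a 64-bit word.  */
  uint64_t word = (uint64_t) value * UINT64_C (0x0101010101010101);
  memcpy (out, &word, sizeof word);
  memcpy (out + 8, &word, sizeof word);
}

/* Hypothetical helper: fill 32 bytes with two 16-byte copies,
   mirroring the two movups stores in the listings.  */
void
memset32_via_splat (void *dst, unsigned char value)
{
  unsigned char block[16];

  assert (dst != NULL);
  splat16 (value, block);
  memcpy (dst, block, 16);
  memcpy ((unsigned char *) dst + 16, block, 16);
}
```

Calling memset32_via_splat (buf, 4) on a 32-byte buffer leaves it identical to memset (buf, 4, 32). When the byte is a compile-time constant, the 16-byte block can be computed ahead of time, which is the work the second listing appears to avoid doing at run time.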