From: "crazylht at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/98442] New: [X86] suboptimal for memset with CLEAR_BY_PIECES
Date: Fri, 25 Dec 2020 01:38:35 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98442

            Bug ID: 98442
           Summary: [X86] suboptimal for memset with CLEAR_BY_PIECES
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
                CC: hjl.tools at gmail dot com, wei3.xiao at intel dot com,
                    wwwhhhyyy333 at gmail dot com
  Target Milestone: ---
            Target: x86_64-*-* i?86-*-*

cat test.c
--------
char Tab[64];

void foo(int n)
{
  for (int i = 0; i != 64; i++)
    Tab[i] = 0;
}

---- GCC generates ------
foo(int):
        vpxor   xmm0, xmm0, xmm0
        vmovdqa XMMWORD PTR Tab[rip], xmm0
        vmovdqa XMMWORD PTR Tab[rip+16], xmm0
        vmovdqa XMMWORD PTR Tab[rip+32], xmm0
        vmovdqa XMMWORD PTR Tab[rip+48], xmm0
        ret
Tab:
        .zero   64

--------- Could be better ----
foo(int):
        vpxor     ymm0, ymm0, ymm0                    #4.5
        vmovdqu   YMMWORD PTR Tab[rip], ymm0          #4.5
        vmovdqu   YMMWORD PTR 32+Tab[rip], ymm0       #4.5
        vzeroupper                                    #6.1
        ret                                           #6.1
Tab:

----- GCC uses 128-bit pieces by default ----
bool
default_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
                                        unsigned int alignment,
                                        enum by_pieces_operation op,
                                        bool speed_p)
{
  unsigned int max_size = 0;
  unsigned int ratio = 0;

  switch (op)
    {
    case CLEAR_BY_PIECES:
      max_size = STORE_MAX_PIECES;
      ratio = CLEAR_RATIO (speed_p);

----
Define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P for i386?
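
To make the store-count difference concrete, here is a minimal sketch of how the number of stores depends on the widest allowed piece. The helper name count_clear_stores and its greedy power-of-two strategy are assumptions made for illustration; this is a simplified model, not GCC's by-pieces code, and it ignores alignment and the CLEAR_RATIO check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: count the stores needed to clear
   `size` bytes when the widest single store is `max_piece` bytes.
   Greedy strategy: use as many of the widest pieces as fit, then halve
   the piece width for the remainder.  Alignment is ignored.  */
unsigned int
count_clear_stores (size_t size, size_t max_piece)
{
  /* max_piece must be a nonzero power of two.  */
  assert (max_piece != 0 && (max_piece & (max_piece - 1)) == 0);

  unsigned int stores = 0;
  for (size_t piece = max_piece; piece != 0; piece >>= 1)
    {
      stores += (unsigned int) (size / piece);
      size %= piece;
    }
  return stores;
}
```

For the 64-byte Tab in test.c, count_clear_stores (64, 16) gives 4 and count_clear_stores (64, 32) gives 2, which matches the four xmm stores GCC emits and the two ymm stores in the suggested output.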