From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109048] [13 regression] redundant mask compare generated by vectorizer.
Date: Thu, 13 Apr 2023 07:30:58 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109048

--- Comment #11 from Richard Biener ---
The recent patch improved this to avoid some of the compares.  We still have
the three-argument PHI and thus three VEC_CONDs.

.L10:
        vmovups (%rdi,%rdx), %ymm0
        vcmpltps        %ymm6, %ymm0, %ymm3
        vcmpltps        %ymm2, %ymm0, %ymm1
        vpandn  %ymm1, %ymm3, %ymm1
        vblendvps       %ymm1, %ymm5, %ymm4, %ymm1
        vblendvps       %ymm3, %ymm7, %ymm1, %ymm1
        vaddps  %ymm1, %ymm0, %ymm0
        vaddps  (%rax,%rdx), %ymm0, %ymm0
        vmovups %ymm0, (%rax,%rdx)
        addq    $32, %rdx
        cmpq    $1024, %rdx
        jne     .L10

vs. GCC 12

.L6:
        vmovups (%rdi,%rdx), %ymm1
        vcmpltps        %ymm5, %ymm1, %ymm0
        vcmpltps        %ymm6, %ymm1, %ymm4
        vblendvps       %ymm0, %ymm3, %ymm2, %ymm0
        vandps  %ymm3, %ymm4, %ymm4
        vaddps  %ymm4, %ymm0, %ymm0
        vaddps  %ymm1, %ymm0, %ymm0
        vaddps  (%rax,%rdx), %ymm0, %ymm0
        vmovups %ymm0, (%rax,%rdx)
        addq    $32, %rdx
        cmpq    $1024, %rdx
        jne     .L6

which at least overall looks comparable.
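
For readers without the testcase at hand, a loop of roughly the following shape produces a three-way merge like the one discussed. This is a hypothetical sketch, not the PR's testcase: the function names, the constants 1.0f/2.0f/3.0f and the thresholds lo/hi are all assumptions, and N is only inferred from the cmpq $1024 bound assuming 4-byte floats. The second function spells the selection out as a combined mask plus two chained selects, which appears to correspond to the vpandn and the two vblendvps in the GCC 13 loop.

```c
#include <assert.h>

#define N 256 /* assumed: 1024 bytes / sizeof(float) */

/* Hypothetical kernel. The if / else-if / else chain merges three
   values into t, i.e. a three-argument PHI after the branches join. */
void three_way(const float *a, float *b, float lo, float hi)
{
    for (int i = 0; i < N; i++) {
        float x = a[i];
        float t;
        if (x < lo)
            t = 1.0f;
        else if (x < hi)
            t = 2.0f;
        else
            t = 3.0f;
        b[i] += x + t;
    }
}

/* Same selection as chained selects: the inner select uses the mask
   !(x < lo) & (x < hi), the outer select uses (x < lo). */
void three_way_selects(const float *a, float *b, float lo, float hi)
{
    for (int i = 0; i < N; i++) {
        float x = a[i];
        int m_lo = x < lo;
        int m_mid = !m_lo & (x < hi);
        float t = m_mid ? 2.0f : 3.0f;
        t = m_lo ? 1.0f : t;
        b[i] += x + t;
    }
}

/* Run both forms on identical input and check that they agree. */
void check_equivalent(void)
{
    float a[N], b1[N], b2[N];
    for (int i = 0; i < N; i++) {
        a[i] = (float)i;
        b1[i] = b2[i] = 0.5f;
    }
    three_way(a, b1, 10.0f, 100.0f);
    three_way_selects(a, b2, 10.0f, 100.0f);
    for (int i = 0; i < N; i++)
        assert(b1[i] == b2[i]);
}
```

Compiling three_way with something like -O3 -mavx2 and comparing the inner loop across compiler versions is one way to look at the difference; the exact options used in the PR are not stated in this comment.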