From: "liuhongt at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/114429] New: [x86] (neg a) ashifrt>> 31 can be optimized to a > 0.
Date: Fri, 22 Mar 2024 05:39:26 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114429

            Bug ID: 114429
           Summary: [x86] (neg a) ashifrt>> 31 can be optimized to a > 0.
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

typedef unsigned char uint8_t;

uint8_t x264_clip_uint8( int x )
{
  return x&(~255) ? (-x)>>31 : x;
}

void
foo (int* a, int* __restrict b, int n)
{
  for (int i = 0; i != 8; i++)
    b[i] = x264_clip_uint8 (a[i]);
}

gcc -O2 -march=x86-64-v3 -S

foo(int*, int*, int):
..
        mov     eax, 255
        vpxor   xmm0, xmm0, xmm0
        vmovd   xmm1, eax
        vpbroadcastd    ymm1, xmm1
        vmovdqu ymm2, YMMWORD PTR [rdi]
        vpminud ymm3, ymm2, ymm1
        vpsubd  ymm0, ymm0, ymm2
        vmovdqa YMMWORD PTR [rsp-32], ymm3
        vpsrad  ymm0, ymm0, 31
        vpcmpeqd        ymm3, ymm2, YMMWORD PTR [rsp-32]
        vpblendvb       ymm0, ymm0, ymm2, ymm3
        vpand   ymm1, ymm1, ymm0
        vmovdqu YMMWORD PTR [rsi], ymm1

It can be better with

        mov     eax, 255
        vmovd   xmm1, eax
        vpxor   xmm0, xmm0, xmm0
        vpbroadcastd    ymm1, xmm1
        vmovdqu ymm2, YMMWORD PTR [rdi]
        vpminud ymm3, ymm2, ymm1
        vmovdqa YMMWORD PTR [rsp-32], ymm3
        vcmpgtps        ymm0, ymm2, ymm0
        vpcmpeqd        ymm3, ymm2, YMMWORD PTR [rsp-32]
        vpblendvb       ymm0, ymm0, ymm2, ymm3
        vpand   ymm1, ymm1, ymm0
        vmovdqu YMMWORD PTR [rsi], ymm1
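The transformation requested in the title rests on a scalar identity: for a 32-bit int x other than INT_MIN, (-x) >> 31 is all ones exactly when x > 0 and zero otherwise, so the negate-and-shift pair (vpsubd + vpsrad above) can become a single compare against zero. The following is a minimal sketch of that identity, not the compiler's implementation. It assumes a 32-bit int and an arithmetic right shift on negative values, which GCC documents but ISO C leaves implementation-defined. The helper names neg_ashr31, gt_zero_mask, clip_uint8_cmp and check_forms are hypothetical and do not appear in the report. Note also that the suggested sequence spells the compare as vcmpgtps, a floating-point compare; for 32-bit integer lanes the corresponding AVX2 instruction would presumably be vpcmpgtd.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Form from the report: negate, then arithmetic shift right by 31.
   Assumes 32-bit int and arithmetic >> on negative values (GCC behavior).
   x == INT_MIN must be excluded because -x overflows there. */
static int neg_ashr31(int x)
{
    return (-x) >> 31;
}

/* Proposed form: all-ones mask (-1) when x > 0, otherwise 0. */
static int gt_zero_mask(int x)
{
    return -(x > 0);
}

/* The clip function from the report, with the shift replaced by the compare. */
static uint8_t clip_uint8_cmp(int x)
{
    return (x & ~255) ? (uint8_t)gt_zero_mask(x) : (uint8_t)x;
}

/* Hypothetical checker: asserts both forms agree for every x in [lo, hi].
   The caller must keep lo above INT_MIN. */
static void check_forms(int lo, int hi)
{
    for (long long x = lo; x <= hi; x++)
        assert(neg_ashr31((int)x) == gt_zero_mask((int)x));
}
```

Calling check_forms(-100000, 100000) exercises the identity over a range around zero. The INT_MIN case is left out deliberately, since negating it is undefined behavior in C and the compiler may assume it does not occur.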