From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 7D06F385842C; Mon, 3 Jan 2022 13:17:50 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7D06F385842C From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug target/98737] Atomic operation on x86 no optimized to use flags Date: Mon, 03 Jan 2022 13:17:50 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: target X-Bugzilla-Version: 11.0 X-Bugzilla-Keywords: missed-optimization X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: ASSIGNED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: jakub at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Jan 2022 13:17:50 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D98737 --- Comment #10 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6362627b27f395b054f359244fcfcb15ac0ac2ab commit r12-6190-g6362627b27f395b054f359244fcfcb15ac0ac2ab Author: Jakub Jelinek Date: Mon Jan 3 14:02:23 2022 +0100 i386, fab: Optimize __atomic_{add,sub,and,or,xor}_fetch (x, y, z) {=3D=3D,!=3D,<,<=3D,>,>=3D} 0 [PR98737] On Wed, Jan 27, 2021 at 12:27:13PM +0100, Ulrich Drepper via Gcc-patches wrote: > On 1/27/21 11:37 AM, Jakub Jelinek wrote: > > Would equality comparison against 0 handle the most common cases. > > > > The user can write it as > > __atomic_sub_fetch (x, y, z) =3D=3D 0 > > or > > __atomic_fetch_sub (x, y, z) - y =3D=3D 0 > > thouch, so the expansion code would need to be able to cope with bo= th. > > Please also keep !=3D0, <0, <=3D0, >0, and >=3D0 in mind. They all c= an be > useful and can be handled with the flags. <=3D 0 and > 0 don't really work well with lock {add,sub,inc,dec}, x86 doesn't have comparisons that would look solely at both SF and ZF and not at ot= her flags (and emitting two separate conditional jumps or two setcc insns a= nd oring them together looks awful). But the rest can work. Here is a patch that adds internal functions and optabs for these, recognizes them at the same spot as e.g. .ATOMIC_BIT_TEST_AND* internal functions (fold all builtins pass) and expands them appropriately (or f= or the <=3D 0 and > 0 cases of +/- FAILs and let's middle-end fall back). So far I have handled just the op_fetch builtins, IMHO instead of handl= ing also __atomic_fetch_sub (x, y, z) - y =3D=3D 0 etc. we should canonical= ize __atomic_fetch_sub (x, y, z) - y to __atomic_sub_fetch (x, y, z) (and v= ice versa). 2022-01-03 Jakub Jelinek PR target/98737 * internal-fn.def (ATOMIC_ADD_FETCH_CMP_0, ATOMIC_SUB_FETCH_CMP= _0, ATOMIC_AND_FETCH_CMP_0, ATOMIC_OR_FETCH_CMP_0, ATOMIC_XOR_FETCH_CMP_0): New internal fns. * internal-fn.h (ATOMIC_OP_FETCH_CMP_0_EQ, ATOMIC_OP_FETCH_CMP_0_NE, ATOMIC_OP_FETCH_CMP_0_LT, ATOMIC_OP_FETCH_CMP_0_LE, ATOMIC_OP_FETCH_CMP_0_GT, ATOMIC_OP_FETCH_CMP_0_GE): New enumerators. * internal-fn.c (expand_ATOMIC_ADD_FETCH_CMP_0, expand_ATOMIC_SUB_FETCH_CMP_0, expand_ATOMIC_AND_FETCH_CMP_0, expand_ATOMIC_OR_FETCH_CMP_0, expand_ATOMIC_XOR_FETCH_CMP_0): N= ew functions. * optabs.def (atomic_add_fetch_cmp_0_optab, atomic_sub_fetch_cmp_0_optab, atomic_and_fetch_cmp_0_optab, atomic_or_fetch_cmp_0_optab, atomic_xor_fetch_cmp_0_optab): New direct optabs. * builtins.h (expand_ifn_atomic_op_fetch_cmp_0): Declare. * builtins.c (expand_ifn_atomic_op_fetch_cmp_0): New function. * tree-ssa-ccp.c: Include internal-fn.h. (optimize_atomic_bit_test_and): Add . before internal fn call in function comment. Change return type from void to bool and return true only if successfully replaced. (optimize_atomic_op_fetch_cmp_0): New function. (pass_fold_builtins::execute): Use optimize_atomic_op_fetch_cmp= _0 for BUILT_IN_ATOMIC_{ADD,SUB,AND,OR,XOR}_FETCH_{1,2,4,8,16} and BUILT_IN_SYNC_{ADD,SUB,AND,OR,XOR}_AND_FETCH_{1,2,4,8,16}, for *XOR* ones only if optimize_atomic_bit_test_and failed. * config/i386/sync.md (atomic__fetch_cmp_0, atomic__fetch_cmp_0): New define_expand patterns. (atomic_add_fetch_cmp_0_1, atomic_sub_fetch_cmp_0_1, atomic__fetch_cmp_0_1): New define_insn patterns. * doc/md.texi (atomic_add_fetch_cmp_0, atomic_sub_fetch_cmp_0, atomic_and_fetch_cmp_0, atomic_or_fetch_cmp_0, atomic_xor_fetch_cmp_0): Document new named patterns. * gcc.target/i386/pr98737-1.c: New test. * gcc.target/i386/pr98737-2.c: New test. * gcc.target/i386/pr98737-3.c: New test. * gcc.target/i386/pr98737-4.c: New test. * gcc.target/i386/pr98737-5.c: New test. * gcc.target/i386/pr98737-6.c: New test. * gcc.target/i386/pr98737-7.c: New test.=