On Tue, Oct 5, 2021 at 9:40 AM H.J. Lu wrote:
>
> On Tue, Oct 5, 2021 at 3:07 AM Richard Biener wrote:
> >
> > On Mon, 4 Oct 2021, H.J. Lu wrote:
> >
> > > commit adedd5c173388ae505470df152b9cb3947339566
> > > Author: Jakub Jelinek
> > > Date: Tue May 3 13:37:25 2016 +0200
> > >
> > >     re PR target/49244 (__sync or __atomic builtins will not emit 'lock bts/btr/btc')
> > >
> > > optimized bit test on atomic builtin return with lock bts/btr/btc. But
> > > it works only for unsigned integers since atomic builtins operate on the
> > > 'uintptr_t' type. It fails on bool:
> > >
> > >   _1 = atomic builtin;
> > >   _4 = (_Bool) _1;
> > >
> > > and signed integers:
> > >
> > >   _1 = atomic builtin;
> > >   _2 = (int) _1;
> > >   _5 = _2 & (1 << N);
> > >
> > > Improve bit test on atomic builtin return by converting:
> > >
> > >   _1 = atomic builtin;
> > >   _4 = (_Bool) _1;
> > >
> > > to
> > >
> > >   _1 = atomic builtin;
> > >   _5 = _1 & (1 << 0);
> > >   _4 = (_Bool) _5;
> > >
> > > and converting:
> > >
> > >   _1 = atomic builtin;
> > >   _2 = (int) _1;
> > >   _5 = _2 & (1 << N);
> > >
> > > to
> > >   _1 = atomic builtin;
> > >   _6 = _1 & (1 << N);
> > >   _5 = (int) _6;
> >
> > Why not do this last bit with match.pd patterns (and independent on
> > whether _1 is defined by an atomic builtin)? For the first suggested
>
> The full picture is
>
>   _1 = __atomic_fetch_or_* (ptr_6, mask, _3);
>   _2 = (int) _1;
>   _5 = _2 & mask;
>
> to
>
>   _1 = __atomic_fetch_or_* (ptr_6, mask, _3);
>   _6 = _1 & mask;
>   _5 = (int) _6;
>
> It is useful only if 2 masks are the same.
>
> > transform that's likely going to be undone by folding, no?
> >
>
> The bool case is
>
>   _1 = __atomic_fetch_or_* (ptr_6, 1, _3);
>   _4 = (_Bool) _1;
>
> to
>
>   _1 = __atomic_fetch_or_* (ptr_6, 1, _3);
>   _5 = _1 & 1;
>   _4 = (_Bool) _5;
>
> Without __atomic_fetch_or_*, the conversion isn't needed.
> After the conversion, optimize_atomic_bit_test_and will
> immediately optimize the code sequence to
>
>   _6 = .ATOMIC_BIT_TEST_AND_SET (&v, 0, 0, 0);
>   _4 = (_Bool) _6;
>
> and there is nothing to fold after it.
>

Here is the v2 patch to handle more cases.

-- 
H.J.
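
For reference, a minimal C sketch of the kind of source the sequences above are about: a fetch-or whose return value is tested against the same mask that was just set. The function names are hypothetical and not taken from the patch or its testsuite; only the __atomic_fetch_or builtin and __ATOMIC_SEQ_CST are GCC's. Whether a given compiler version turns these into lock bts is not something this sketch claims. The last two helpers only illustrate that masking before or after the conversion to int yields the same value.

```c
#include <assert.h>
#include <stdbool.h>

/* Signed case: set bit n and report whether it was already set.
   The same mask is used for the fetch-or and for the test.
   Assumes 0 <= n < 31 so that 1 << n is well defined for int.  */
int fetch_or_test_bit(int *p, int n)
{
    int mask = 1 << n;
    return __atomic_fetch_or(p, mask, __ATOMIC_SEQ_CST) & mask;
}

/* Bool case: set bit 0 and return its previous state as a bool.  */
bool fetch_or_test_bit0(unsigned int *p)
{
    return (__atomic_fetch_or(p, 1u, __ATOMIC_SEQ_CST) & 1u) != 0;
}

/* Convert first, then mask (the shape before the proposed rewrite).
   Converting an out-of-range unsigned value to int is
   implementation-defined in ISO C; GCC defines it as modular.  */
int mask_after_convert(unsigned int old, int mask)
{
    return (int) old & mask;
}

/* Mask first, then convert (the shape after the proposed rewrite).  */
int mask_before_convert(unsigned int old, int mask)
{
    return (int) (old & (unsigned int) mask);
}

/* Small usage example: the first call sets the bit and sees it clear,
   the second call sees it already set.  */
void check_examples(void)
{
    int x = 0;
    assert(fetch_or_test_bit(&x, 3) == 0);
    assert(fetch_or_test_bit(&x, 3) == (1 << 3));
}
```

Both helpers return the old state of the tested bit, which is the value the bit-test-and-set form also produces.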