public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libgomp/102571] New: FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test
@ 2021-10-02 22:56 hjl.tools at gmail dot com
  2021-10-04 19:19 ` [Bug tree-optimization/102571] " jakub at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-02 22:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102571

            Bug ID: 102571
           Summary: FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c
                    execution test
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---
            Target: i686,x86-64

libgomp.c-c++-common/atomic-21.c always fails with -m32 -mfpmath=sse

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug tree-optimization/102571] FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test
  2021-10-02 22:56 [Bug libgomp/102571] New: FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test hjl.tools at gmail dot com
@ 2021-10-04 19:19 ` jakub at gcc dot gnu.org
  2021-10-05 14:15 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-10-04 19:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102571

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|libgomp                     |tree-optimization

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems like tree DSE bug to me.
We have before dse1:
  D.2119 = 0.0;
  MEM <char[6]> [(long double *)&D.2119 + 10B] = {};
  __builtin_GOMP_atomic_start ();
  _58 = MEM[(long double * {ref-all})&ld];
  D.2118 = _58;
  MEM <char[6]> [(long double *)&D.2118 + 10B] = {};
  _13 = __builtin_memcmp (&D.2118, &D.2119, 16);
where D.2119 and D.2118 have long double type.
But dse1 says:
  Deleted redundant store: MEM <char[6]> [(long double *)&D.2119 + 10B] = {};
and removes that store, but that isn't really redundant, because long double
has padding bits.
For OpenMP, perhaps I could get away around this and if I prove the non-padding
bits in the type are contiguous followed by padding bits the boundary between
the two is on a byte boundary, I could optimize the
  __builtin_clear_padding (&arg1); __builtin_clear_padding (&arg2); x =
__builtin_memcmp (&arg1, &arg2, 16); // or 12 for ia32 into just
x = __builtin_memcmp (&arg1, &arg2, 10);
(and I'll probably implement it tomorrow).
But it won't really help libstdc++, which will once the
https://gcc.gnu.org/pipermail/libstdc++/2021-September/053210.html patch is in,
use __builtin_clear_padding + __atomic_*, most likely in different inline
functions and will almost certainly suffer from this.
In
  val = 0;
  MEM ... [&val + ...] = {};
the second stmt is only redundant if the type doesn't have any padding bits...
I guess gimple-fold.c could provide an even cheaper helper whether the type has
padding bits than what is currently provided.
Then there is the question on unions, __builtin_clear_padding needs to be
conservative for unions and only clear bits that are proven to be padding in
all union members.  But to disable the DSE optimization, we probably need to
ask a different question, instead of "does the type have any padding bits that
would be cleared by __builtin_clear_padding?" ask "can the type have any
padding bits?" where for unions the answer would be true if any member has them
rather than if all the members have them in the same positions.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug tree-optimization/102571] FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test
  2021-10-02 22:56 [Bug libgomp/102571] New: FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test hjl.tools at gmail dot com
  2021-10-04 19:19 ` [Bug tree-optimization/102571] " jakub at gcc dot gnu.org
@ 2021-10-05 14:15 ` jakub at gcc dot gnu.org
  2021-10-06  8:45 ` cvs-commit at gcc dot gnu.org
  2021-10-06 10:20 ` jakub at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-10-05 14:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102571

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 51554
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51554&action=edit
gcc12-pr102571.patch

Untested patch for the OpenMP optimization that will result in this bug not
triggering for OpenMP, but it will still trigger for when the user uses
__builtin_clear_padding and __builtin_memcmp by hand (or eventually through
<atomic>).

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug tree-optimization/102571] FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test
  2021-10-02 22:56 [Bug libgomp/102571] New: FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test hjl.tools at gmail dot com
  2021-10-04 19:19 ` [Bug tree-optimization/102571] " jakub at gcc dot gnu.org
  2021-10-05 14:15 ` jakub at gcc dot gnu.org
@ 2021-10-06  8:45 ` cvs-commit at gcc dot gnu.org
  2021-10-06 10:20 ` jakub at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-06  8:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102571

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:ba837323dbda2bca5a1c8a4c78092a88241dcfa3

commit r12-4207-gba837323dbda2bca5a1c8a4c78092a88241dcfa3
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Oct 6 10:40:12 2021 +0200

    openmp: Optimize for OpenMP atomics
2x__builtin_clear_padding+__builtin_memcmp if possible

    For the few long double types that do have padding bits, e.g. on x86
    the clear_type_padding_in_mask computed mask is
    ff ff ff ff ff ff ff ff ff ff 00 00 for 32-bit and
    ff ff ff ff ff ff ff ff ff ff 00 00 00 00 00 00 for 64-bit.
    Instead of doing __builtin_clear_padding on both operands that will clear
the
    last 2 or 6 bytes and then memcmp on the whole 12/16 bytes, we can just
    memcmp 10 bytes.  The code also handles if the padding would be at the
start
    or both at the start and end, but everything on byte boundaries only and
    non-padding bits being contiguous.
    This works around a tree-ssa-dse.c bug (but we need to fix it anyway,
    as libstdc++ won't do this and as it can deal with arbitrary types, it even
    can't do that generally).

    2021-10-06  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/102571
            * c-omp.c (c_finish_omp_atomic): Optimize the case where type has
            padding, but the non-padding bits are contiguous set of bytes
            by adjusting the memcmp call arguments instead of emitting
            __builtin_clear_padding and then comparing all the type's bytes.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug tree-optimization/102571] FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test
  2021-10-02 22:56 [Bug libgomp/102571] New: FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test hjl.tools at gmail dot com
                   ` (2 preceding siblings ...)
  2021-10-06  8:45 ` cvs-commit at gcc dot gnu.org
@ 2021-10-06 10:20 ` jakub at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-10-06 10:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102571

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Workaround committed, so for OpenMP it shouldn't trigger anymore.
But for
int
foo (char *p)
{
  long double l = 0.0;
  __builtin_clear_padding (&l);
  return __builtin_memcpy (&l, p, sizeof (l));
}
it still can, similarly for <atomic> APIs on long double, _Complex long double
etc.
Related to PR93270 which was also about DSE, but the other order, something
clearing the memory and then writing a long double over it (which due to hw
leaves the padding bits unchanged) and DSE removing the clearing before it.
Here it is a long double store of initializer_zerop followed by clearing of the
padding bits and removes the clearing of the padding bits.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-10-06 10:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-02 22:56 [Bug libgomp/102571] New: FAIL: libgomp.c/../libgomp.c-c++-common/atomic-21.c execution test hjl.tools at gmail dot com
2021-10-04 19:19 ` [Bug tree-optimization/102571] " jakub at gcc dot gnu.org
2021-10-05 14:15 ` jakub at gcc dot gnu.org
2021-10-06  8:45 ` cvs-commit at gcc dot gnu.org
2021-10-06 10:20 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).