public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2
@ 2024-01-29 23:11 christoph at muppetnet dot net
  2024-01-29 23:14 ` [Bug c++/113665] " christoph at muppetnet dot net
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: christoph at muppetnet dot net @ 2024-01-29 23:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

            Bug ID: 113665
           Summary: Regular for Loop results in Endless Loop with -O2
           Product: gcc
           Version: 11.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: christoph at muppetnet dot net
  Target Milestone: ---

Compiling the example code below with GCC 11.4.0 (I actually encountered this
bug in all versions >= 11 which I tried but not in older ones) and -O2 (other
optimization levels seem to work) results in an endless loop and finally a
segmentation fault due to an out-of-bounds access.

We can clearly argue about issues in the implementation of the bitfield class
(it is based on a real implementation of such a class I encountered in existing
codebase) but the main problem here is that the simple for loop in bug() is
generated into an endless loop. I think regardless of how the test method is
implemented, this loop should always terminate correctly.

#include <cstdint>
#include <cstddef>
#include <array>
#include <climits>

template<size_t S>
class bitfield
{
private:
    using Element = uint8_t;

    static constexpr uint32_t C = (S - 1u) / 8u;

private:
    std::array<Element, C + 1> m_Array;

public:
    bool test(size_t i) const
    {   
        if (i >= S){
            return false;
        }

        const size_t index = static_cast<uint32_t>(i) / 8u;
        const Element bitmask = 1u << (i % 8u);
        return (m_Array[index] & bitmask) != 0u;
    }
};

void bug2(const bitfield<0x120u> &b)
{
    if (!b.test(1u))
    {
        volatile int test = 1;
    }
}

void bug(const bitfield<0x250u> &b)
{
    for (uint16_t i = 0u; i < 0x250u; i++)
    {
        // this loop seems to not properly terminate
        if (!b.test(i))
        {
            volatile int test = i;
        }
    }
}

int main()
{
    bitfield<0x250u> b;
    bug(b);
    return 0;
}

Looking at the generated assembler code (in this example ARM64 generated in
godbolt.org, the same issue is also present on my x86-64 machine) we can see,
that the check for i < 0x250 is completely lost. Actually, at .L9 we do the
increment of the loop variable and then unconditionally jump back to .L10 for
the next iteration.

bug(bitfield<592ul> const&):
        sub     sp, sp, #16
        mov     w2, 0
        mov     w4, 1
.L10:
        lsr     w3, w2, 3
        and     w1, w2, 7
        lsl     w1, w4, w1
        ldrb    w3, [x0, w3, uxtw]
        and     w1, w1, w3
        tst     w1, 255
        bne     .L9
        str     w2, [sp, 12]
.L9:
        add     w2, w2, 1
        b       .L10

During testing of different variants of the code I encountered that there seem
to be different (but totally unexpected) ways to solve the issue:

- adding -fno-tree-vrp or -fno-guess-branch-probability
- deleting method bug2
- changing the type of i to size_t

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug c++/113665] Regular for Loop results in Endless Loop with -O2
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
@ 2024-01-29 23:14 ` christoph at muppetnet dot net
  2024-01-30  0:25 ` [Bug ipa/113665] " pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: christoph at muppetnet dot net @ 2024-01-29 23:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

--- Comment #1 from Christoph Rüthing <christoph at muppetnet dot net> ---
Note, that the volatile variables are just there to provide a minimal example
and not have the entire loop optimized away. In the origian implementation I
deduced this minimal example from, there was actually a useful implementation.
But putting e.g. a std::cout at this place also solved the issue.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] Regular for Loop results in Endless Loop with -O2
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
  2024-01-29 23:14 ` [Bug c++/113665] " christoph at muppetnet dot net
@ 2024-01-30  0:25 ` pinskia at gcc dot gnu.org
  2024-01-30  6:01 ` [Bug ipa/113665] [11/12/13/14 regression] " sjames at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-30  0:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |ipa
           Keywords|                            |needs-bisection,
                   |                            |needs-reduction
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-01-30
     Ever confirmed|0                           |1
   Target Milestone|---                         |11.5

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

The issue is IPA ICF  figures out that bitfield<288>::test and
bitfield<592>::test are the same but really they are not due to the size being
different.

/app/example.cpp:18:10: optimized: Semantic equality hit:bool
bitfield<S>::test(size_t) const [with long unsigned int S = 288]/60->bool
bitfield<S>::test(size_t) const [with long unsigned int S = 592]/61
/app/example.cpp:18:10: optimized: Assembler symbol
names:_ZNK8bitfieldILm288EE4testEm.part.0/60->_ZNK8bitfieldILm592EE4testEm.part.0/61

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
  2024-01-29 23:14 ` [Bug c++/113665] " christoph at muppetnet dot net
  2024-01-30  0:25 ` [Bug ipa/113665] " pinskia at gcc dot gnu.org
@ 2024-01-30  6:01 ` sjames at gcc dot gnu.org
  2024-01-30  6:04 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: sjames at gcc dot gnu.org @ 2024-01-30  6:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

Sam James <sjames at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sjames at gcc dot gnu.org
            Summary|Regular for Loop results in |[11/12/13/14 regression]
                   |Endless Loop with -O2       |Regular for Loop results in
                   |                            |Endless Loop with -O2

--- Comment #3 from Sam James <sjames at gcc dot gnu.org> ---
I will try bisect.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
                   ` (2 preceding siblings ...)
  2024-01-30  6:01 ` [Bug ipa/113665] [11/12/13/14 regression] " sjames at gcc dot gnu.org
@ 2024-01-30  6:04 ` pinskia at gcc dot gnu.org
  2024-01-30  8:11 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-30  6:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Sam James from comment #3)
> I will try bisect.

Most likely r11-5094-gafa6adbd6c83ee or r11-4987-g602c6cfc79ce4a or
r11-4986-ga1fdc16da34118 .

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
                   ` (3 preceding siblings ...)
  2024-01-30  6:04 ` pinskia at gcc dot gnu.org
@ 2024-01-30  8:11 ` rguenth at gcc dot gnu.org
  2024-01-30  8:16 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-30  8:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, ICF figures out the other part of the partial inlined test() are equal
and I think they are.  The

        if (i >= S){
            return false;
        }

tests are inlined and eliminated (I think correctly so).  -fno-partial-inlining
also avoids the issue.

The issue is that ICF doesn't wipe (or compare) range info so we get after
inlining:

  <bb 2> [local count: 10737416]:
  goto <bb 6>; [100.00%]

  <bb 3> [local count: 1063004409]:
  # RANGE [irange] long unsigned int [0, 591] NONZERO 0x3ff
  _5 = (long unsigned int) i_2;
  # RANGE [irange] unsigned int [0, 287] NONZERO 0x1ff
  _11 = (unsigned int) _5;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
                   ` (4 preceding siblings ...)
  2024-01-30  8:11 ` rguenth at gcc dot gnu.org
@ 2024-01-30  8:16 ` rguenth at gcc dot gnu.org
  2024-01-30  9:12 ` [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a sjames at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-30  8:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
           Priority|P3                          |P2

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Honza - ICF seems to fixup points-to sets when merging variables, so there
should be a way to kill off flow-sensitive info inside prevailing bodies
as well.  But would that happen before inlining the body?  Can you work
on that?  I think comparing ranges would weaken ICF unnecessarily?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
                   ` (5 preceding siblings ...)
  2024-01-30  8:16 ` rguenth at gcc dot gnu.org
@ 2024-01-30  9:12 ` sjames at gcc dot gnu.org
  2024-01-30 17:21 ` hubicka at ucw dot cz
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: sjames at gcc dot gnu.org @ 2024-01-30  9:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

Sam James <sjames at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12/13/14 regression]    |[11/12/13/14 regression]
                   |Regular for Loop results in |Regular for Loop results in
                   |Endless Loop with -O2       |Endless Loop with -O2 since
                   |                            |r11-4987-g602c6cfc79ce4a

--- Comment #7 from Sam James <sjames at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #4)
> (In reply to Sam James from comment #3)
> > I will try bisect.
> 
> Most likely r11-5094-gafa6adbd6c83ee or r11-4987-g602c6cfc79ce4a or
> r11-4986-ga1fdc16da34118 .

r11-4987-g602c6cfc79ce4a is the first bad commit
commit r11-4987-g602c6cfc79ce4a
Author: Jan Hubicka <jh@suse.cz>
Date:   Fri Nov 13 15:58:41 2020 +0100

    Improve handling of memory operands in ipa-icf 2/4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
                   ` (6 preceding siblings ...)
  2024-01-30  9:12 ` [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a sjames at gcc dot gnu.org
@ 2024-01-30 17:21 ` hubicka at ucw dot cz
  2024-01-31  7:35 ` rguenth at gcc dot gnu.org
  2024-03-09 17:10 ` law at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at ucw dot cz @ 2024-01-30 17:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

--- Comment #8 from Jan Hubicka <hubicka at ucw dot cz> ---
> Honza - ICF seems to fixup points-to sets when merging variables, so there
> should be a way to kill off flow-sensitive info inside prevailing bodies
> as well.  But would that happen before inlining the body?  Can you work
> on that?  I think comparing ranges would weaken ICF unnecessarily?

AFAIK ICF does no changes to winning function body. It basically relies
on the fact that early optimizations are local and thus arrive to same
solutions for most of metadata. So only really easy fix is to make it
match value ranges, too.  I will check how much that fire in practice -
I can only think of split funtions to diverge, which is probably not
that bad in practice.

IPA-prop and IPA-PTA is only done after ICF.

We indeed discussed clearing possibility of merging alias sets which is
relatively important in practice (increasing TBAA precision on LTO
slowly degraded ICF effectivity significantly), but got into glory
details of inventing representation which would make inliner to pick
right body (without alias sets cleared). This was never made to fly
(Martin got scared by the details and I got it on my ever overflowing
TODO list).

Honza

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
                   ` (7 preceding siblings ...)
  2024-01-30 17:21 ` hubicka at ucw dot cz
@ 2024-01-31  7:35 ` rguenth at gcc dot gnu.org
  2024-03-09 17:10 ` law at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-31  7:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #8)
> > Honza - ICF seems to fixup points-to sets when merging variables, so there
> > should be a way to kill off flow-sensitive info inside prevailing bodies
> > as well.  But would that happen before inlining the body?  Can you work
> > on that?  I think comparing ranges would weaken ICF unnecessarily?
> 
> AFAIK ICF does no changes to winning function body. It basically relies
> on the fact that early optimizations are local and thus arrive to same
> solutions for most of metadata. So only really easy fix is to make it
> match value ranges, too.  I will check how much that fire in practice -
> I can only think of split funtions to diverge, which is probably not
> that bad in practice.

But is it possible to add a local transform stage and would that also affect
which body we inline?  But yes, inlining the original body would be so
much better ...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a
  2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
                   ` (8 preceding siblings ...)
  2024-01-31  7:35 ` rguenth at gcc dot gnu.org
@ 2024-03-09 17:10 ` law at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-09 17:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #10 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Tracking in 113907 as that one has more detail about the ICF vs Ranges issue,
possible short and longer term solutions, etc.

*** This bug has been marked as a duplicate of bug 113907 ***

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-03-09 17:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-29 23:11 [Bug c++/113665] New: Regular for Loop results in Endless Loop with -O2 christoph at muppetnet dot net
2024-01-29 23:14 ` [Bug c++/113665] " christoph at muppetnet dot net
2024-01-30  0:25 ` [Bug ipa/113665] " pinskia at gcc dot gnu.org
2024-01-30  6:01 ` [Bug ipa/113665] [11/12/13/14 regression] " sjames at gcc dot gnu.org
2024-01-30  6:04 ` pinskia at gcc dot gnu.org
2024-01-30  8:11 ` rguenth at gcc dot gnu.org
2024-01-30  8:16 ` rguenth at gcc dot gnu.org
2024-01-30  9:12 ` [Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a sjames at gcc dot gnu.org
2024-01-30 17:21 ` hubicka at ucw dot cz
2024-01-31  7:35 ` rguenth at gcc dot gnu.org
2024-03-09 17:10 ` law at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).