public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument
@ 2020-11-24 22:08 peter at int19h dot net
  2020-11-24 22:14 ` [Bug c++/97976] " pinskia at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: peter at int19h dot net @ 2020-11-24 22:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

            Bug ID: 97976
           Summary: Optimization regression in 10.1 for lambda passed as
                    template argument
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: peter at int19h dot net
  Target Milestone: ---

The following C++11 and above code:

////////////////////
extern const int* data;

template<typename T> bool func(T callback)
{
    for (const int* pi = data; pi; ++pi)
    {
        if (callback(*pi))
        {
            return false;
        }
    }
    return true;
}

bool f0(int i)
{
    return func([i](const int j){ return i == j; });
}
////////////////////


With GCC 10.1 with "-std=c++11 -O2" flags generates the following:
////////////////////
f0(int):
        cmp     QWORD PTR data[rip], 0
        sete    al
        ret
////////////////////

While GCC 9.3 with the same command line flags generates the following:
////////////////////
f0(int):
        mov     rax, QWORD PTR data[rip]
        test    rax, rax
        jne     .L3
        jmp     .L4
.L7:
        add     rax, 4
.L3:
        cmp     edi, DWORD PTR [rax]
        jne     .L7
        xor     eax, eax
        ret
.L4:
        mov     eax, 1
        ret
////////////////////

It looks like this regression started with GCC 10 and starts at -02
optimization level for C++11 and above. I have tested this with clang and msvc,
and they generate code similar to what is generated by gcc 9.3.

This behavior can also be seen in the Compiler Explorer here:

https://godbolt.org/z/r4zMnc

Thank you!
--peter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization regression in 10.1 for lambda passed as template argument
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
@ 2020-11-24 22:14 ` pinskia at gcc dot gnu.org
  2020-11-24 22:24 ` peter at int19h dot net
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2020-11-24 22:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I don't think there is a bug here.

This loop here:
for (const int* pi = data; pi; ++pi)
invokes undefined behavior as pi can never become null after doing the
increment.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization regression in 10.1 for lambda passed as template argument
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
  2020-11-24 22:14 ` [Bug c++/97976] " pinskia at gcc dot gnu.org
@ 2020-11-24 22:24 ` peter at int19h dot net
  2020-11-25  4:40 ` peter at int19h dot net
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: peter at int19h dot net @ 2020-11-24 22:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

Peter Bisroev <peter at int19h dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Peter Bisroev <peter at int19h dot net> ---
Hi Andrew,

You are 100% correct. I realized that as soon as I hit submit button. I was
trying to create a simple test case of the problem that I saw and a bit
oversimplified it. GCC actually optimized this really well.

I apologize for a false report.

Regards,
--peter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization regression in 10.1 for lambda passed as template argument
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
  2020-11-24 22:14 ` [Bug c++/97976] " pinskia at gcc dot gnu.org
  2020-11-24 22:24 ` peter at int19h dot net
@ 2020-11-25  4:40 ` peter at int19h dot net
  2020-11-25  8:25 ` [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1 rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: peter at int19h dot net @ 2020-11-25  4:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

Peter Bisroev <peter at int19h dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |---
             Status|RESOLVED                    |UNCONFIRMED

--- Comment #3 from Peter Bisroev <peter at int19h dot net> ---
Hi Andrew,

I was thinking about this a bit more and decided to try the loop in reverse in
a more simplified test case. I know this test case demonstrates a corner case
that no one will probably implement. However I still think it merits some
further investigation just in case this affects some other parts of the
optimizer. You can see this code below:

////////////////////
int containsBackwards(const uint8_t* p, uint8_t target)
{
    for (; p; --p)
    {
        if (*p == target)
        {
            return 1;
        }
    }
    return 0;
}

const uint8_t* findBackwards(const uint8_t* p, uint8_t target)
{
    for (; p; --p)
    {
        if (*p == target)
        {
            break;
        }
    }
    return p;
}

////////////////////

Function containsBackwards(), while searching backwards, should return 1 if
target byte is found and 0 if it was not and p points to address 0. Function
findBackwards() is similar but returns the address of the first byte that
matched the target or pointer to address 0 if a match was not found. Unless I
am mistaken, the sample code above is not hitting any undefined behavior such
as dereferencing a NULL pointer and there is a well defined loop terminating
condition.

This is the code that is generated with gcc trunk and gcc 9.1:
////////////////////
containsBackwards(unsigned char const*, unsigned char):
        xor     eax, eax
        test    rdi, rdi
        setne   al
        ret
findBackwards(unsigned char const*, unsigned char):
        test    rdi, rdi
        jne     .L5
        jmp     .L6
.L8:
        sub     rdi, 1
.L5:
        cmp     BYTE PTR [rdi], sil
        jne     .L8
        mov     rax, rdi
        ret
.L6:
        xor     eax, eax
        ret
////////////////////

I would have expected both functions to be compiled to nearly the same code,
but the looping is missing in containsBackwards() function. And unless I am
mistaken gcc 8.3 generates the output that we would expect to see here.

You can see this example in Compiler Explorer here:

https://godbolt.org/z/hWE4xs

What is also interesting, if we replace uint8_t by uint32_t in
containsBackwards() function it will work with gcc 9.3 but with gcc 10.1 it
will behave in exactly the same way as above returning the result based on the
validity of the p pointer.

Additionally, thinking about my first test case. I know it was technically in
the undefined territory, but just for my personal understanding, is the
compiler allowed to assume that pi can never become null like you have
suggested? In theory it can overflow and become 0 and the loop will terminate
but unfortunately I am not 100% certain of what the standard's view on this is.
In addition, can the compiler assume that callback(*pi) will never return true?

What are your thoughts?

Thank you!
--peter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (2 preceding siblings ...)
  2020-11-25  4:40 ` peter at int19h dot net
@ 2020-11-25  8:25 ` rguenth at gcc dot gnu.org
  2020-11-25 16:43 ` peter at int19h dot net
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-11-25  8:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
For containsBackwards we know it must return 1 unless p is NULL at the
beginning, for findBackwards we still need to compute where and for that the
loop has to be retained.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (3 preceding siblings ...)
  2020-11-25  8:25 ` [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1 rguenth at gcc dot gnu.org
@ 2020-11-25 16:43 ` peter at int19h dot net
  2020-11-25 18:18 ` redi at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: peter at int19h dot net @ 2020-11-25 16:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

Peter Bisroev <peter at int19h dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |FIXED

--- Comment #5 from Peter Bisroev <peter at int19h dot net> ---
Thank you for your explanation Richard, it makes perfect sense to me now. GCC
is being insanely clever here!

However I still think there is something not 100% right. If I change
containsBackwards() function to look like this:

////////////////////
int containsBackwards(const uint8_t* p, uint8_t target)
{
    for (;;)
    {
        if (*p == target)
        {
            return 1;
        }

        --p;
        if (!p)
        {
            return 0;
        }
    }
}
////////////////////

I apologize in advance as I know the code is far from ideal and is definitely a
corner case, but unless I am mistaken there is no undefined behavior here
assuming p is always valid at the time of the call and it is just to illustrate
the example. So for the above gcc 10.1 generates the following:


////////////////////
containsBackwards(unsigned char const*, unsigned char):
        mov     eax, 1
        ret
////////////////////

So my question is, is gcc allowed to assume that the target will always match?

Additionally, if I make this function a bit safer and unless I am mistaken
without any undefined behavior:

////////////////////
int containsBackwardsSafe(const uint8_t* p, uint8_t target)
{
    if (!p)
    {
        return 0;
    }

    for (;;)
    {
        if (*p == target)
        {
            return 1;
        }

        --p;
        if (!p)
        {
            return -1;
        }
    }
}
////////////////////

Then gcc 10.1 generates the following:

////////////////////
containsBackwardsSafe(unsigned char const*, unsigned char):
        xor     eax, eax
        test    rdi, rdi
        setne   al
        ret
////////////////////

Which matches the generated output of the original function. However unless I
am mistaken the result here is supposed to be different to the original
function:
* If p == NULL, return 0.
* If target matched while p was valid, return 1.
* If target did not match while p was valid return -1.

You can see these above examples at Compiler Explorer here:

https://godbolt.org/z/ha1c1h


I apologize if there is something critical I am missing here so I hope I am not
wasting your time.

Thank you!
--peter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (4 preceding siblings ...)
  2020-11-25 16:43 ` peter at int19h dot net
@ 2020-11-25 18:18 ` redi at gcc dot gnu.org
  2020-11-25 19:46 ` peter at int19h dot net
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: redi at gcc dot gnu.org @ 2020-11-25 18:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |INVALID

--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Peter Bisroev from comment #5)
> So my question is, is gcc allowed to assume that the target will always
> match?

Yes. If p is not null then it points to a value object. It is undefined to
decrement a pointer "before" the start of an object (you can only increment to
one position past the end). That means the compiler can assume that a non-null
pointer never becomes null, because it would have to point before the start of
the object to do so. So if it can't become null, the !p check is always false,
and it must return true.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (5 preceding siblings ...)
  2020-11-25 18:18 ` redi at gcc dot gnu.org
@ 2020-11-25 19:46 ` peter at int19h dot net
  2020-11-25 23:55 ` redi at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: peter at int19h dot net @ 2020-11-25 19:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

--- Comment #7 from Peter Bisroev <peter at int19h dot net> ---
Thank you for your response Jonathan.

If you have a minute, could you please clarify a few things. I have been
talking about this behavior with a few colleagues and we are all slightly
confused by the same issue. So I think the answers here can definitely help a
few people besides myself.

>It is undefined to decrement a pointer "before" the start of an object

I am sorry, but I am not sure where I am doing this? For example, lets say I am
accessing raw memory on an embedded system and I have bytes 0, 1, 1, 1 at
addresses 0, 1, 2 and 3 respectively. I know this is a trivial example and
makes no sense on x86 arch (almost, in real mode maybe). So if I call
containsBackwardsSafe(p, 2) with p == 3, shouldn't I get back -1? I guess it
all depends on what is the "object" in this context. Conceivably some other
function could have mapped that memory at that address on that system and
passed the pointer to containsBackwardsSafe() function. In that case wouldn't
the responsibility of "object" be up to the system and not the compiler?

I have also just tried going through C11 draft
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) to try to find
relevant standard sections that would describe this behavior. I have read
through sections:
* 6.2.5 Types (specifically paragraphs 14 and 20)
* 6.3.2.3 Pointers
* 6.5.3.1 Prefix increment and decrement operators
* 6.5.6 Additive operators (specifically paragraphs 7, 8 and 9)
* 6.5.16.2 Compound assignment

And I cannot seem to find relevant information that forbids pointer decrement
as  shown in containsBackwardsSafe() or containsBackwards() from the last
comment. If you could point me to the right section of the standard it would be
incredibly helpful. I am sure I must be missing something obvious.

Once again, thank you for your time and help on this.

Regards,
--peter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (6 preceding siblings ...)
  2020-11-25 19:46 ` peter at int19h dot net
@ 2020-11-25 23:55 ` redi at gcc dot gnu.org
  2020-11-26  0:20 ` redi at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: redi at gcc dot gnu.org @ 2020-11-25 23:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

--- Comment #8 from Jonathan Wakely <redi at gcc dot gnu.org> ---
6.5.6 p8 covers it. You cannot increment or decrement a pointer past the end of
an array (except the one past the end position). C++ has similar rules. GCC
assumes there is no object at address zero, see the documentation for
-fdelete-null-pointer-checks

Since there is no object at address zero, the first possible byte that can be
part of an array of bytes is address 1, and so decrementing a pointer to that
byte would point to an element at index -1, which doesn't exist (this violates
the "if the array is large enough" wording of 6.5.6 p8, because you've gone out
of bounds of the array).

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (7 preceding siblings ...)
  2020-11-25 23:55 ` redi at gcc dot gnu.org
@ 2020-11-26  0:20 ` redi at gcc dot gnu.org
  2020-11-26  1:37 ` peter at int19h dot net
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: redi at gcc dot gnu.org @ 2020-11-26  0:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

--- Comment #9 from Jonathan Wakely <redi at gcc dot gnu.org> ---

6.3.2.3 p3 says a null pointer compares unequal to any pointer to an object, so
that means no object can ever be at address 0 in a valid C program. If you're
not using the -fno-delete-null-pointer-checks option then GCC is perfectly
justified to assume no object is at address 0 (because the standards say that's
not possible).

The footnote on 6.5.3.2 clarifies that a null pointer is not a valid operand
for the unary * operator. 

For C++, [basic.compound] says a pointer can point to an object or be null, not
both.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (8 preceding siblings ...)
  2020-11-26  0:20 ` redi at gcc dot gnu.org
@ 2020-11-26  1:37 ` peter at int19h dot net
  2020-11-26 10:08 ` redi at gcc dot gnu.org
  2020-11-26 10:09 ` redi at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: peter at int19h dot net @ 2020-11-26  1:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

--- Comment #10 from Peter Bisroev <peter at int19h dot net> ---
Jonathan, thank you so much for your explanation. As soon as I read it, it all
started to make sense.

>You cannot increment or decrement a pointer past the end of an array (except the one past the end position).

Yes of course, that is perfectly reasonable. And unless I am mistaken, the
examples that I have provided do not do that directly because as the "array" in
those examples starts at address 0 as can be possible in some embedded systems.
But with the knowledge about how by default gcc treats a pointer to address 0
everything can be perfectly explained.

>GCC assumes there is no object at address zero, see the documentation for -fdelete-null-pointer-checks

Thank you for pointing this flag out to me. After reading its description and
searching a bit I can see that I am not the only one that got caught off guard
with this as shown in these links:

* https://gcc.gnu.org/legacy-ml/gcc-patches/2015-04/msg00790.html
* https://gcc.gnu.org/legacy-ml/gcc-patches/2007-03/msg01968.html
* https://lkml.org/lkml/2018/4/4/601
* https://reviews.llvm.org/D47894

I will definitely go through gcc man page again to see if there is anything
else that we should pay attention to.

Once again, thank you for taking the time to explain all of this!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (9 preceding siblings ...)
  2020-11-26  1:37 ` peter at int19h dot net
@ 2020-11-26 10:08 ` redi at gcc dot gnu.org
  2020-11-26 10:09 ` redi at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: redi at gcc dot gnu.org @ 2020-11-26 10:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

--- Comment #11 from Jonathan Wakely <redi at gcc dot gnu.org> ---
The fact that objects cannot live at address zero is not just a GCC quirk. As I
said, it's required by the C and C++ standards. A null pointer cannot be
dereferenced, and cannot point to any object. I'm surprised that anybody finds
that surprising.

Embedded/kernel systems that violate that rule are the exception, not the norm.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1
  2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
                   ` (10 preceding siblings ...)
  2020-11-26 10:08 ` redi at gcc dot gnu.org
@ 2020-11-26 10:09 ` redi at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: redi at gcc dot gnu.org @ 2020-11-26 10:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97976

--- Comment #12 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(And for the linux kernel I think it's that the don't want the data flow
analysis done by that option, rather than allowing objects to live at address
zero)

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-11-26 10:09 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-24 22:08 [Bug c++/97976] New: Optimization regression in 10.1 for lambda passed as template argument peter at int19h dot net
2020-11-24 22:14 ` [Bug c++/97976] " pinskia at gcc dot gnu.org
2020-11-24 22:24 ` peter at int19h dot net
2020-11-25  4:40 ` peter at int19h dot net
2020-11-25  8:25 ` [Bug c++/97976] Optimization relating to NULL pointer assumptions in gcc 9.1 rguenth at gcc dot gnu.org
2020-11-25 16:43 ` peter at int19h dot net
2020-11-25 18:18 ` redi at gcc dot gnu.org
2020-11-25 19:46 ` peter at int19h dot net
2020-11-25 23:55 ` redi at gcc dot gnu.org
2020-11-26  0:20 ` redi at gcc dot gnu.org
2020-11-26  1:37 ` peter at int19h dot net
2020-11-26 10:08 ` redi at gcc dot gnu.org
2020-11-26 10:09 ` redi at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).