* [Bug c++/109127] New: More advanced constexpr value compile time evaluation
From: dmitriy.ovdienko at gmail dot com @ 2023-03-14 11:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109127

            Bug ID: 109127
           Summary: More advanced constexpr value compile time evaluation
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dmitriy.ovdienko at gmail dot com
  Target Milestone: ---

Hello,

I'd like to propose an idea that could improve application performance.

The idea concerns `constexpr` math, which can be performed at compile time.
To some degree the C++ compiler already performs this optimization, but in my
more realistic example it does not.

Let's start with a simple example that explains the idea and works as expected.
The following function serializes a `constexpr` unsigned into a string. The
digits come out in reverse order, but we will get to that later.

```cpp
// The expected output is "543\0"
void foo1(char* ptr)
{
    constexpr unsigned Tag = 345;

    auto v = Tag;

    do
    {
        *ptr++ = (v % 10) + '0';
        v /= 10;
    }
    while(v);

    *ptr = 0;
}
```


The generated assembly is as follows:


```asm
foo1(char*):
        mov     eax, DWORD PTR .LC0[rip]
        mov     DWORD PTR [rdi], eax
        ret

.LC0:
        .byte   53
        .byte   52
        .byte   51
        .byte   0
```

That is good enough. I would, however, replace the load from `.LC0` with an
immediate operand, so the CPU does not have to access another memory
location:

```asm
        mov     eax, 0x00333435
        ; instead of
        mov     eax, DWORD PTR .LC0[rip]
```
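
As a quick check of the immediate value: x86-64 is little-endian, so the byte at the lowest address of `.LC0` becomes the least significant byte of the loaded dword. The sketch below packs the four bytes accordingly. `pack_le` is a hypothetical helper introduced only for this illustration, and ASCII character codes are assumed.

```cpp
#include <cstdint>

// Hypothetical helper: packs four bytes the way a little-endian 32-bit load
// sees them (b0 is the byte at the lowest address).
constexpr std::uint32_t pack_le(unsigned char b0, unsigned char b1,
                                unsigned char b2, unsigned char b3)
{
    return static_cast<std::uint32_t>(b0)
         | (static_cast<std::uint32_t>(b1) << 8)
         | (static_cast<std::uint32_t>(b2) << 16)
         | (static_cast<std::uint32_t>(b3) << 24);
}

// Assuming ASCII: '5' = 53 = 0x35, '4' = 52 = 0x34, '3' = 51 = 0x33.
static_assert(pack_le('5', '4', '3', '\0') == 0x00333435u,
              "dword stored at .LC0 as seen by a little-endian load");
```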

Now I change the code a bit to use base-16 math. This is an intermediate step
toward the real code:

```cpp
void foo2(char* ptr)
{
    constexpr unsigned Tag = 0xF345;

    auto v = Tag;

    while(v != 0xF)
    {
        *ptr++ = (v % 16) + '0';
        v /= 16;
    }

    *ptr = 0;
}
```

The assembly is the same as above, which is good.

What does not work is reversing the output bytes: in that case the compiler
does not perform all of the `constexpr` math at compile time:


```cpp
void foo3(char* ptr)
{
    constexpr unsigned Tag = 0x543;

    // Convert 0x543 -> 0xF345
    auto v = Tag;
    auto reversed = 0xFu; // 0xF is a stop value
    while(v)
    {
        reversed <<= 4;
        reversed |= v & 0xFu;
        v >>= 4;
    }

    // Now serialize 0xF345 into "543\0"
    while(reversed != 0xF)
    {
        *ptr++ = (reversed % 16) + '0';
        reversed /= 16;
    }

    *ptr = 0;
}

```

The assembly output is as follows:

```asm
foo3(char*):
        mov     eax, 62277
.L2:
        mov     edx, eax
        add     rdi, 1
        shr     eax, 4
        and     edx, 15
        add     edx, 48
        mov     BYTE PTR [rdi-1], dl
        cmp     eax, 15
        jne     .L2
        mov     BYTE PTR [rdi], 0
        ret
```

In the assembly above, the reversal has already been folded into the constant
62277 (0xF345), but the `.L2` serialization loop remains, even though it could
also be evaluated at compile time.

The workaround is to force the compiler to calculate the reversed value and
store it as `constexpr`:

```cpp
constexpr unsigned reverse(unsigned v)
{
    auto reversed = 0xFu;
    while(v)
    {
        reversed <<= 4;
        reversed |= v & 0xFu;
        v >>= 4;
    }

    return reversed;
}

void foo3(char* ptr)
{
    constexpr unsigned Tag = 0x543;
    constexpr unsigned ReversedTag = reverse(Tag);

    auto reversed = ReversedTag;
    while(reversed != 0xF)
    {
        *ptr++ = (reversed % 16) + '0';
        reversed /= 16;
    }

    *ptr = 0;
}

```

The assembly is back to normal:

```asm
foo3(char*):
        mov     eax, DWORD PTR .LC0[rip]
        mov     DWORD PTR [rdi], eax
        ret
.LC0:
        .byte   53
        .byte   52
        .byte   51
        .byte   0
```
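
Going one step further than the workaround, the whole serialization can be moved into a `constexpr` function, so that the text itself is a compile-time constant and only a copy remains at run time. This is a minimal sketch assuming C++14 or later. `TagText`, `serialize` and `foo4` are hypothetical names introduced for illustration, and no claim is made here about the assembly GCC generates for it.

```cpp
#include <cstring>

// Hypothetical helper type: a fixed-size buffer that can be returned from a
// constexpr function (up to 8 hex digits plus the terminating zero).
struct TagText
{
    char data[9];
    unsigned size; // bytes used, including the terminator
};

// Serializes the hex digits of v, most significant digit first.
// Like the code in the report, each nibble is mapped with + '0', so the
// result is only meaningful for nibbles in the range 0-9.
constexpr TagText serialize(unsigned v)
{
    TagText out{};

    // Count the digits first so they can be written in forward order.
    unsigned digits = 0;
    for (unsigned t = v; t != 0; t >>= 4)
        ++digits;

    for (unsigned i = 0; i < digits; ++i)
    {
        out.data[digits - 1 - i] = static_cast<char>((v & 0xFu) + '0');
        v >>= 4;
    }

    out.data[digits] = 0;
    out.size = digits + 1;
    return out;
}

void foo4(char* ptr)
{
    // Forced compile-time evaluation, as with ReversedTag in the workaround.
    constexpr TagText Text = serialize(0x543);
    static_assert(Text.size == 4, "three digits plus the terminator");

    std::memcpy(ptr, Text.data, Text.size);
}
```

With this shape no stop value is needed, because the digit count is known before the first byte is written.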
