public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/114556] New: weird loop unrolling when there's attribute aligned in side the loop
@ 2024-04-02  8:27 liuhongt at gcc dot gnu.org
  2024-04-03  0:35 ` [Bug middle-end/114556] weird loop unrolling when there's attribute aligned inside " pinskia at gcc dot gnu.org
  2024-04-03  0:40 ` pinskia at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: liuhongt at gcc dot gnu.org @ 2024-04-02  8:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114556

            Bug ID: 114556
           Summary: weird loop unrolling when there's attribute aligned
                    inside the loop
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

/* Vector type assumed by the testcase: 32 x char (GCC vector extension).  */
typedef char v32qi __attribute__((vector_size (32)));

v32qi
zzzzz (void* pa, void* pb, void* pc)
{
    v32qi __attribute__((aligned(64))) a;
    v32qi __attribute__((aligned(64))) b;
    v32qi __attribute__((aligned(64))) c;
    __builtin_memcpy (&a, pa, sizeof (a));
    __builtin_memcpy (&b, pb, sizeof (a));
    __builtin_memcpy (&c, pc, sizeof (a));
    #pragma GCC unroll 8
    for (int i = 0; i != 2048; i++)
      a += b;
    return a;
}

With -O2 -mavx2, we have:

zzzzz:
        vmovdqu (%rsi), %ymm1
        vpaddb  (%rdi), %ymm1, %ymm0
        movl    $2041, %eax
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        jmp     .L2
.L3:
        vpaddb  %ymm0, %ymm1, %ymm0
        subl    $8, %eax
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
        vpaddb  %ymm0, %ymm1, %ymm0
.L2:
        vpaddb  %ymm0, %ymm1, %ymm0
        cmpl    $1, %eax
        jne     .L3
        ret

But wouldn't it be better with:

zzzzz:
        vmovdqu (%rsi), %ymm1
        vmovdqu (%rdi), %ymm0
        movl    $2048, %eax
.L2:
        vpaddb  %ymm1, %ymm0, %ymm0
        vpaddb  %ymm1, %ymm0, %ymm0
        vpaddb  %ymm1, %ymm0, %ymm0
        vpaddb  %ymm1, %ymm0, %ymm0
        vpaddb  %ymm1, %ymm0, %ymm0
        vpaddb  %ymm1, %ymm0, %ymm0
        vpaddb  %ymm1, %ymm0, %ymm0
        vpaddb  %ymm1, %ymm0, %ymm0
        subl    $8, %eax
        jne     .L2
        ret
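
As a sanity check on the two listings above, here is a small scalar model in C that only counts how many times vpaddb executes in each shape. The function names are made up for illustration, and the model mirrors the control flow of the assembly by hand; it is not derived from any GCC output. Both shapes perform 2048 additions; the difference is that the current code peels most of the first unrolled iteration and enters the loop in the middle.

```c
#include <assert.h>

/* Models the code GCC currently emits: 7 adds before the jump into the
   loop, then a rotated loop whose counter starts at 2041 and exits at 1.  */
int count_adds_current(void)
{
    int adds = 7;          /* vpaddb (%rdi) plus six more before jmp .L2 */
    int eax = 2041;        /* movl $2041, %eax */
    for (;;) {
        adds += 1;         /* .L2: the single vpaddb before the compare */
        if (eax == 1)      /* cmpl $1, %eax; jne .L3 falls through */
            break;
        adds += 1;         /* .L3: first vpaddb */
        eax -= 8;          /* subl $8, %eax */
        adds += 6;         /* remaining six vpaddb before .L2 */
    }
    return adds;
}

/* Models the suggested code: a plain loop with eight adds per iteration.  */
int count_adds_suggested(void)
{
    int adds = 0;
    int eax = 2048;        /* movl $2048, %eax */
    do {
        adds += 8;         /* eight vpaddb */
        eax -= 8;          /* subl $8, %eax; jne .L2 */
    } while (eax != 0);
    return adds;
}
```

Both functions return 2048, so the complaint is about code size and loop shape rather than about the amount of work done.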


* [Bug middle-end/114556] weird loop unrolling when there's attribute aligned inside the loop
  2024-04-02  8:27 [Bug rtl-optimization/114556] New: weird loop unrolling when there's attribute aligned in side the loop liuhongt at gcc dot gnu.org
@ 2024-04-03  0:35 ` pinskia at gcc dot gnu.org
  2024-04-03  0:40 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-03  0:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114556

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |middle-end

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The first difference (with/without aligned) comes from expand, so moving this
to the middle-end.


* [Bug middle-end/114556] weird loop unrolling when there's attribute aligned inside the loop
  2024-04-02  8:27 [Bug rtl-optimization/114556] New: weird loop unrolling when there's attribute aligned in side the loop liuhongt at gcc dot gnu.org
  2024-04-03  0:35 ` [Bug middle-end/114556] weird loop unrolling when there's attribute aligned inside " pinskia at gcc dot gnu.org
@ 2024-04-03  0:40 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-03  0:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114556

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-04-03

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Without the aligned attribute:

Inserting a partition copy on edge BB2->BB3 : PART.0 = PART.3
Inserting a value copy on edge BB2->BB3 : PART.1 = 2048


vs. with it:

Inserting a partition copy on edge BB3->BB3 : PART.7 = PART.0
Inserting a partition copy on edge BB2->BB3 : PART.7 = PART.3
Inserting a value copy on edge BB2->BB3 : PART.1 = 2048


Basically, out-of-SSA inserts an extra move because it could not "merge" the
loop induction variable for the (vector) addition, due to the different
alignment requirements.

And then things go downhill from there.
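
A minimal sketch for reproducing the with/without comparison, assuming a simplified two-operand version of the testcase (the unused third operand is dropped, and the names with_align, without_align and variants_agree are hypothetical). The "Inserting a partition copy" messages should show up in the expand dump, e.g. with -O2 -mavx2 -fdump-rtl-expand-details; that flag choice is an assumption based on out-of-SSA running as part of expand.

```c
#include <assert.h>
#include <string.h>

/* 32 x char vector, as in the original report (GCC vector extension).  */
typedef char v32qi __attribute__((vector_size(32)));

/* Variant from the report: locals carry an explicit 64-byte alignment.  */
v32qi with_align(void *pa, void *pb)
{
    v32qi __attribute__((aligned(64))) a;
    v32qi __attribute__((aligned(64))) b;
    __builtin_memcpy(&a, pa, sizeof(a));
    __builtin_memcpy(&b, pb, sizeof(b));
    #pragma GCC unroll 8
    for (int i = 0; i != 2048; i++)
        a += b;
    return a;
}

/* Same function with the aligned attribute removed.  */
v32qi without_align(void *pa, void *pb)
{
    v32qi a;
    v32qi b;
    __builtin_memcpy(&a, pa, sizeof(a));
    __builtin_memcpy(&b, pb, sizeof(b));
    #pragma GCC unroll 8
    for (int i = 0; i != 2048; i++)
        a += b;
    return a;
}

/* Returns 1 when both variants compute the same vector, which they must:
   only the generated code differs, not the semantics.  */
int variants_agree(void)
{
    char a[32], b[32];
    for (int i = 0; i < 32; i++) {
        a[i] = (char)i;
        b[i] = (char)(3 * i + 1);
    }
    v32qi x = with_align(a, b);
    v32qi y = without_align(a, b);
    return memcmp(&x, &y, sizeof(x)) == 0;
}
```

Comparing the two functions in the same translation unit keeps everything else identical, so any difference in the dumps or in the final assembly can be attributed to the attribute alone.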
