* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
@ 2011-05-14 20:48 ` npozar at quick dot cz
  2011-05-15 21:27 ` ubizjak at gmail dot com
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: npozar at quick dot cz @ 2011-05-14 20:48 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

Norbert Pozar <npozar at quick dot cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|major                       |critical



* [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
@ 2011-05-14 21:01 npozar at quick dot cz
  2011-05-14 20:48 ` [Bug target/49001] " npozar at quick dot cz
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: npozar at quick dot cz @ 2011-05-14 21:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

           Summary: GCC uses VMOVAPS/PD AVX instructions to access stack
                    variables that are not 32-byte aligned
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: npozar@quick.cz


I'm using a custom mingw64 build of GCC 4.6.1. My target is 64-bit Windows. I
compile with g++ -O3 -march=corei7-avx -mtune=corei7-avx -mavx.

GCC uses the aligned moves VMOVAPS/PD from the new AVX instruction set to
access local variables of type __m256/__m256d on the stack. But the stack
pointer is only guaranteed to be 16-byte aligned on Win64, so this causes a
segmentation fault whenever the stack pointer is not 32-byte aligned, as in:

__m256 dummy_ps256;
void test_stackalign32() {
    __m256 x = dummy_ps256;
    dummy_ps256 = sin256_ps_avx(x);
}

which compiles to:

    vmovaps    dummy_ps256(%rip), %ymm0
    leaq    32(%rsp), %rdx
    vmovaps    %ymm0, 32(%rsp)  // possible SEGFAULT
    leaq    64(%rsp), %rcx
    vzeroupper
    call    _Z13sin256_ps_avxDv8_f
    vmovaps    64(%rsp), %ymm0  // possible SEGFAULT

I couldn't figure out how to get GCC to realign the stack with -mstackrealign.
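
For illustration only (not part of the original report): a 16-byte aligned
address is also 32-byte aligned only when bit 4 happens to be clear, so a slot
such as 32(%rsp) is safe for a 256-bit VMOVAPS for some incoming stack pointers
and faults for others. A minimal sketch of that check in C; the helper names
is_aligned and slot_ok_for_vmovaps256 are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of alignment.
   alignment must be a nonzero power of two. */
static int is_aligned(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Models the stack slot 32(%rsp) from the listing above: can a
   32-byte aligned move use it for a given stack pointer value? */
static int slot_ok_for_vmovaps256(uintptr_t rsp)
{
    return is_aligned(rsp + 32, 32);
}
```

With a stack pointer that is a multiple of 32 the slot passes the check; with
one that is only a multiple of 16 (for example 0x1010) it does not, which
matches the "possible SEGFAULT" annotations above.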



* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
  2011-05-14 20:48 ` [Bug target/49001] " npozar at quick dot cz
@ 2011-05-15 21:27 ` ubizjak at gmail dot com
  2011-05-15 22:26 ` hjl.tools at gmail dot com
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: ubizjak at gmail dot com @ 2011-05-15 21:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktietz at gcc dot gnu.org
           Severity|critical                    |normal

--- Comment #1 from Uros Bizjak <ubizjak at gmail dot com> 2011-05-15 18:49:44 UTC ---
(In reply to comment #0)
> I'm using a custom mingw64 build of GCC 4.6.1. My target is 64-bit Windows. I
> compile with g++ -O3 -march=corei7-avx -mtune=corei7-avx -mavx.

Please provide a testcase that can be compiled without changes. See [1].

FWIW, I have tested the following testcase on x86_64-pc-linux-gnu:

--cut here--
#include <x86intrin.h>

__m256 sin256_ps_avx (__m256);

__m256 dummy_ps256;
void test_stackalign32() {
    volatile __m256 x = dummy_ps256;
    dummy_ps256 = sin256_ps_avx(x);
}
--cut here--

And got the expected code (gcc-4.6.1):

test_stackalign32:
.LFB828:
    .cfi_startproc
    pushq    %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    andq    $-32, %rsp
    subq    $32, %rsp
    vmovaps    dummy_ps256(%rip), %ymm0
    vmovaps    %ymm0, (%rsp)
    vmovaps    (%rsp), %ymm0
    call    sin256_ps_avx
    vmovaps    %ymm0, dummy_ps256(%rip)
    leave
    .cfi_def_cfa 7, 8
    vzeroupper
    ret

Probably a mingw64-specific problem... CC added.

[1] http://gcc.gnu.org/bugs/#report
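
As an aside, the prologue and epilogue in the listing above can be modeled in
plain C to show why the realignment is safe: the entry value of %rsp is kept in
%rbp, so the leave instruction restores it no matter how many bytes andq
dropped. This is only an arithmetic sketch; frame_model, model_prologue and
model_leave are hypothetical names, not GCC internals.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the registers touched by the prologue/epilogue. */
typedef struct {
    uintptr_t rsp;        /* stack pointer */
    uintptr_t rbp;        /* frame pointer */
    uintptr_t saved_rbp;  /* caller's %rbp, as pushed on the stack */
} frame_model;

static frame_model model_prologue(uintptr_t rsp_on_entry, uintptr_t caller_rbp)
{
    frame_model f;
    f.saved_rbp = caller_rbp;      /* pushq %rbp */
    f.rsp = rsp_on_entry - 8;
    f.rbp = f.rsp;                 /* movq  %rsp, %rbp */
    f.rsp &= ~(uintptr_t)31;       /* andq  $-32, %rsp */
    f.rsp -= 32;                   /* subq  $32, %rsp  */
    return f;
}

static frame_model model_leave(frame_model f)
{
    f.rsp = f.rbp;                 /* leave: movq %rbp, %rsp */
    f.rbp = f.saved_rbp;           /*        popq %rbp       */
    f.rsp += 8;
    return f;
}
```

Whatever the incoming stack pointer is, the modeled %rsp after the prologue is
a multiple of 32, and model_leave returns it to its entry value.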



* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
  2011-05-14 20:48 ` [Bug target/49001] " npozar at quick dot cz
  2011-05-15 21:27 ` ubizjak at gmail dot com
@ 2011-05-15 22:26 ` hjl.tools at gmail dot com
  2011-05-16  7:22 ` npozar at quick dot cz
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: hjl.tools at gmail dot com @ 2011-05-15 22:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> 2011-05-15 22:10:00 UTC ---
Stack alignment isn't supported on Windows.



* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
                   ` (2 preceding siblings ...)
  2011-05-15 22:26 ` hjl.tools at gmail dot com
@ 2011-05-16  7:22 ` npozar at quick dot cz
  2014-09-03 21:18 ` roland at rschulz dot eu
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: npozar at quick dot cz @ 2011-05-16  7:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

--- Comment #3 from Norbert Pozar <npozar at quick dot cz> 2011-05-16 06:05:37 UTC ---
(In reply to comment #1)
> Please provide a testcase that can be compiled without changes. See [1].

I'm sorry about this.

> Probably a mingw64-specific problem... CC added.

Thank you for taking the time to test the code on Linux. I was worried that
this might be mingw64-specific.

(In reply to comment #2)
> Stack alignment isn't supported on Windows.

Since this bug effectively prevents using 256-bit AVX instructions when
compiling for Windows with GCC, I was wondering if there are any plans to
support stack realignment there. It seems that simply adding

andq    $-32, %rsp

to the function prologue would fix this. Or would it be feasible to replace
VMOVAPS with the unaligned VMOVUPS when accessing the stack?
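
For reference, andq $-32 just clears the low five bits of %rsp, rounding it
down to the nearest multiple of 32. A sketch of the arithmetic in C, assuming a
hypothetical helper align_down. Note that a real prologue also has to preserve
the original %rsp (the Linux listing in comment #1 keeps it in %rbp) so that
the epilogue can restore it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rounds addr down to a multiple of alignment,
   which must be a nonzero power of two. For alignment == 32 the mask
   ~(alignment - 1) has the same bit pattern as the immediate -32. */
static uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}
```

For example, align_down(0x7ffc10, 32) yields 0x7ffc00, while an address that
is already a multiple of 32 is returned unchanged.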



* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
                   ` (3 preceding siblings ...)
  2011-05-16  7:22 ` npozar at quick dot cz
@ 2014-09-03 21:18 ` roland at rschulz dot eu
  2021-08-22 18:35 ` arthur200126 at gmail dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: roland at rschulz dot eu @ 2014-09-03 21:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

--- Comment #4 from Roland Schulz <roland at rschulz dot eu> ---
*** Bug 61730 has been marked as a duplicate of this bug. ***



* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
                   ` (4 preceding siblings ...)
  2014-09-03 21:18 ` roland at rschulz dot eu
@ 2021-08-22 18:35 ` arthur200126 at gmail dot com
  2021-08-22 18:39 ` arthur200126 at gmail dot com
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: arthur200126 at gmail dot com @ 2021-08-22 18:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

Mingye Wang <arthur200126 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |arthur200126 at gmail dot com

--- Comment #5 from Mingye Wang <arthur200126 at gmail dot com> ---
I think I am bumping into the same bug with GCC 10.3.0 in a MinGW64
environment, in a SIMD library at [1].
  [1]: https://github.com/google/highway/issues/332

There was a related bug at [2] showing another small (not quite minimal) test
case.
  [2]: https://osdn.net/projects/mingw/ticket/39565

The VMOVUPS idea seems cool -- can we do it?


* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
                   ` (5 preceding siblings ...)
  2021-08-22 18:35 ` arthur200126 at gmail dot com
@ 2021-08-22 18:39 ` arthur200126 at gmail dot com
  2021-12-21 12:35 ` thiago at kde dot org
  2024-02-19 17:00 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: arthur200126 at gmail dot com @ 2021-08-22 18:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

--- Comment #6 from Mingye Wang <arthur200126 at gmail dot com> ---
FWIW, the ticket about aligning the stack in the prologue is bug 54412.
Apologies for the noisy emails, but I can't set the See Also field on this bug
myself.


* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
                   ` (6 preceding siblings ...)
  2021-08-22 18:39 ` arthur200126 at gmail dot com
@ 2021-12-21 12:35 ` thiago at kde dot org
  2024-02-19 17:00 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: thiago at kde dot org @ 2021-12-21 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

--- Comment #7 from Thiago Macieira <thiago at kde dot org> ---
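Hack to work around this: redefine the aligned move mnemonics as assembler
macros that expand to their unaligned counterparts: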

asm(
    ".macro vmovapd args:vararg\n"
    "    vmovupd \\args\n"
    ".endm\n"
    ".macro vmovaps args:vararg\n"
    "    vmovups \\args\n"
    ".endm\n"
    ".macro vmovdqa args:vararg\n"
    "    vmovdqu \\args\n"
    ".endm\n"
    ".macro vmovdqa32 args:vararg\n"
    "    vmovdqu32 \\args\n"
    ".endm\n"
    ".macro vmovdqa64 args:vararg\n"
    "    vmovdqu64 \\args\n"
    ".endm\n"
);

See:
https://github.com/opendcdiag/opendcdiag/blob/main/framework/sysdeps/windows/win32_stdlib.h#L11-L34


* [Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned
  2011-05-14 21:01 [Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned npozar at quick dot cz
                   ` (7 preceding siblings ...)
  2021-12-21 12:35 ` thiago at kde dot org
@ 2024-02-19 17:00 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-02-19 17:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xjkp2283572185 at gmail dot com

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 113989 has been marked as a duplicate of this bug. ***


