public inbox for gcc-bugs@sourceware.org
* [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4
@ 2024-04-02 18:19 yshuiv7 at gmail dot com
  2024-04-02 18:28 ` [Bug target/114566] " pinskia at gcc dot gnu.org
                   ` (20 more replies)
  0 siblings, 21 replies; 22+ messages in thread
From: yshuiv7 at gmail dot com @ 2024-04-02 18:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

            Bug ID: 114566
           Summary: Misaligned vmovaps when compiling libvorbis for znver4
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yshuiv7 at gmail dot com
  Target Milestone: ---

Haven't tried to minimize it yet, but here is how to reproduce it:

1. Download libvorbis 1.3.7 source: https://github.com/xiph/vorbis/tree/v1.3.7
2. Configure it:
       cmake -B build -DCMAKE_C_FLAGS="-march=znver4 -mtune=znver4" -DCMAKE_CXX_FLAGS="-march=znver4 -mtune=znver4" -DCMAKE_BUILD_TYPE=Release .
3. Run tests: 
       make -C build test

Stack trace:
#0  0x0000000000410d2e in setup_tone_curves (curveatt_dB=curveatt_dB@entry=0x4e1834, binHz=binHz@entry=86.1328125,
    n=n@entry=256, center_boost=-1.00000203, center_decay_rate=<optimized out>) at /tmp/vorbis/lib/psy.c:129
#1  0x0000000000413b24 in _vp_psy_init (p=0x4fe8c0, vi=<optimized out>, gi=gi@entry=0x4e0be0, n=256,
    rate=<optimized out>) at /tmp/vorbis/lib/psy.c:326
#2  0x000000000040a7b5 in _vds_shared_init (v=v@entry=0x7fffffffb370, vi=vi@entry=0x7fffffffb330, encp=encp@entry=1)
    at /tmp/vorbis/lib/block.c:225
#3  0x000000000040a93f in vorbis_analysis_init (v=v@entry=0x7fffffffb370, vi=vi@entry=0x7fffffffb330)
    at /tmp/vorbis/lib/block.c:298
#4  0x0000000000404ad2 in write_vorbis_data_or_die (
    filename=filename@entry=0x7fffffffb700 "vorbis_1ch_q-0.5_44100.ogg", srate=srate@entry=44100,
    q=q@entry=-0.0500000007, data=data@entry=0x4dc080 <data_out>, count=count@entry=2048, ch=ch@entry=1)
    at /tmp/vorbis/test/write_read.c:61
#5  0x000000000040456d in main () at /tmp/vorbis/test/test.c:58

Relevant part of the code:

   0x0000000000410cee <+1854>:  add    $0xe0,%rdx
   0x0000000000410cf5 <+1861>:  vmovups %zmm17,-0xe0(%rdx)
   0x0000000000410cff <+1871>:  vaddps -0xa0(%rdx),%zmm7,%zmm17
   0x0000000000410d09 <+1881>:  vmovups %zmm17,-0xa0(%rdx)
   0x0000000000410d13 <+1891>:  vaddps -0x60(%rdx),%zmm6,%zmm17
   0x0000000000410d1d <+1901>:  vmovups %zmm17,-0x60(%rdx)
   0x0000000000410d27 <+1911>:  vaddps -0x20(%rdx),%ymm0,%ymm17
=> 0x0000000000410d2e <+1918>:  vmovaps %ymm17,-0x20(%rdx)

$rdx is 0x7fffffff3a10

* [Bug target/114566] Misaligned vmovaps when compiling libvorbis for znver4
From: pinskia at gcc dot gnu.org @ 2024-04-02 18:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2024-04-02
             Target|x86_64-linux-gnu            |x86_64
     Ever confirmed|0                           |1
           Keywords|                            |wrong-code

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Can you provide a bit more information, such as the preprocessed source for psy.c
and the exact command line used to invoke GCC?

Please also provide the full output of `gcc -v`.

All of this is requested by https://gcc.gnu.org/bugs/

* [Bug target/114566] Misaligned vmovaps when compiling libvorbis for znver4
From: yshuiv7 at gmail dot com @ 2024-04-02 19:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #2 from Yuxuan Shui <yshuiv7 at gmail dot com> ---
/nix/store/qp3k692bxjhlzvsdqpq7mdylfyr7i1ln-gcc-wrapper-13.2.0/bin/gcc 
-I/tmp/vorbis/include -I/tmp/vorbis/lib -O3 -march=znver4 -mtune=znver4 -g -o
psy.c.o -c /tmp/vorbis/lib/psy.c -v
Using built-in specs.
COLLECT_GCC=/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/bin/gcc
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-13.2.0/configure
--prefix=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-13.2.0
--with-gmp-include=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gmp-6.3.0-dev/include
--with-gmp-lib=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gmp-6.3.0/lib
--with-mpfr-include=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-mpfr-4.2.1-dev/include
--with-mpfr-lib=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-mpfr-4.2.1/lib
--with-mpc=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-libmpc-1.3.1
--with-native-system-header-dir=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.38-44-dev/include
--with-build-sysroot=/
--with-gxx-include-dir=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-13.2.0/include/c++/13.2.0/
--program-prefix= --enable-lto --disable-libstdcxx-pch
--without-included-gettext --with-system-zlib --enable-static
--enable-languages=c,c++ --disable-multilib --enable-plugin --disable-libcc1
--with-isl=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-isl-0.20
--disable-bootstrap --build=x86_64-unknown-linux-gnu
--host=x86_64-unknown-linux-gnu --target=x86_64-unknown-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.2.0 (GCC)
COLLECT_GCC_OPTIONS='-fPIC' '-O2' '-U' '_FORTIFY_SOURCE' '-Wformat=1'
'-Wformat-security' '-Werror=format-security' '-fstack-protector-strong'
'--param=ssp-buffer-size=4' '-fno-strict-overflow' '-I' '/tmp/vorbis/include'
'-I' '/tmp/vorbis/lib' '-O3' '-march=znver4' '-mtune=znver4' '-g' '-o'
'psy.c.o' '-c' '-v' '-U' '_FORTIFY_SOURCE' '-D' '_FORTIFY_SOURCE=3' '-B'
'/nix/store/bzjyfnr8g585gvxjgiabn28qdm32b02n-glibc-2.38-44/lib/' '-idirafter'
'/nix/store/j79rphhc2vmb7rrxvx0aymhkw8bpkckf-glibc-2.38-44-dev/include'
'-idirafter'
'/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/include-fixed'
'-B' '/nix/store/7ngrcd0a7q460gyg8grx6pipwzpgy0vq-gcc-13.2.0-lib/lib' '-B'
'/nix/store/qp3k692bxjhlzvsdqpq7mdylfyr7i1ln-gcc-wrapper-13.2.0/bin/'
'-frandom-seed=wbmyj7xk8s' '-isystem'
'/nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include'
'-isystem'
'/nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include'

/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/libexec/gcc/x86_64-unknown-linux-gnu/13.2.0/cc1
-quiet -v -I /tmp/vorbis/include -I /tmp/vorbis/lib -U _FORTIFY_SOURCE -U
_FORTIFY_SOURCE -D _FORTIFY_SOURCE=3 -idirafter
/nix/store/j79rphhc2vmb7rrxvx0aymhkw8bpkckf-glibc-2.38-44-dev/include
-idirafter
/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/include-fixed
-isystem /nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include
-isystem /nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include
/tmp/vorbis/lib/psy.c -quiet -dumpbase psy.c.c -dumpbase-ext .c -march=znver4
-mtune=znver4 -g -O2 -O3 -Wformat=1 -Wformat-security -Werror=format-security
-version -fPIC -fstack-protector-strong -fno-strict-overflow
-frandom-seed=wbmyj7xk8s --param=ssp-buffer-size=4 -o
/tmp/nix-shell.Yn7YW0/ccu1mT2u.s
GNU C17 (GCC) version 13.2.0 (x86_64-unknown-linux-gnu)
        compiled by GNU C version 13.2.0, GMP version 6.3.0, MPFR version
4.2.1, MPC version 1.3.1, isl version isl-0.20-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring duplicate directory
"/nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include"
ignoring nonexistent directory
"/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/../../../../x86_64-unknown-linux-gnu/include"
ignoring duplicate directory
"/nix/store/j79rphhc2vmb7rrxvx0aymhkw8bpkckf-glibc-2.38-44-dev/include"
ignoring duplicate directory
"/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/include-fixed"
#include "..." search starts here:
#include <...> search starts here:
 /tmp/vorbis/include
 /tmp/vorbis/lib
 /nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include

/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/include
 /nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/include

/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/include-fixed
 /nix/store/j79rphhc2vmb7rrxvx0aymhkw8bpkckf-glibc-2.38-44-dev/include
End of search list.
Compiler executable checksum: b33e0c81578694d9e35e19d87dacd083
COLLECT_GCC_OPTIONS='-fPIC' '-O2' '-U' '_FORTIFY_SOURCE' '-Wformat=1'
'-Wformat-security' '-Werror=format-security' '-fstack-protector-strong'
'--param=ssp-buffer-size=4' '-fno-strict-overflow' '-I' '/tmp/vorbis/include'
'-I' '/tmp/vorbis/lib' '-O3' '-march=znver4' '-mtune=znver4' '-g' '-o'
'psy.c.o' '-c' '-v' '-U' '_FORTIFY_SOURCE' '-D' '_FORTIFY_SOURCE=3' '-B'
'/nix/store/bzjyfnr8g585gvxjgiabn28qdm32b02n-glibc-2.38-44/lib/' '-idirafter'
'/nix/store/j79rphhc2vmb7rrxvx0aymhkw8bpkckf-glibc-2.38-44-dev/include'
'-idirafter'
'/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/include-fixed'
'-B' '/nix/store/7ngrcd0a7q460gyg8grx6pipwzpgy0vq-gcc-13.2.0-lib/lib' '-B'
'/nix/store/qp3k692bxjhlzvsdqpq7mdylfyr7i1ln-gcc-wrapper-13.2.0/bin/'
'-frandom-seed=wbmyj7xk8s' '-isystem'
'/nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include'
'-isystem'
'/nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include'
 /nix/store/qp3k692bxjhlzvsdqpq7mdylfyr7i1ln-gcc-wrapper-13.2.0/bin/as -v -I
/tmp/vorbis/include -I /tmp/vorbis/lib --gdwarf-5 --64 -o psy.c.o
/tmp/nix-shell.Yn7YW0/ccu1mT2u.s
GNU assembler version 2.41 (x86_64-unknown-linux-gnu) using BFD version (GNU
Binutils) 2.41
COMPILER_PATH=/nix/store/bzjyfnr8g585gvxjgiabn28qdm32b02n-glibc-2.38-44/lib/:/nix/store/7ngrcd0a7q460gyg8grx6pipwzpgy0vq-gcc-13.2.0-lib/lib/:/nix/store/qp3k692bxjhlzvsdqpq7mdylfyr7i1ln-gcc-wrapper-13.2.0/bin/:/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/libexec/gcc/x86_64-unknown-linux-gnu/13.2.0/:/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/libexec/gcc/x86_64-unknown-linux-gnu/13.2.0/:/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/libexec/gcc/x86_64-unknown-linux-gnu/:/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/:/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/
LIBRARY_PATH=/nix/store/bzjyfnr8g585gvxjgiabn28qdm32b02n-glibc-2.38-44/lib/:/nix/store/7ngrcd0a7q460gyg8grx6pipwzpgy0vq-gcc-13.2.0-lib/lib/:/nix/store/qp3k692bxjhlzvsdqpq7mdylfyr7i1ln-gcc-wrapper-13.2.0/bin/:/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/:/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/../../../../lib64/:/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/../../../
COLLECT_GCC_OPTIONS='-fPIC' '-O2' '-U' '_FORTIFY_SOURCE' '-Wformat=1'
'-Wformat-security' '-Werror=format-security' '-fstack-protector-strong'
'--param=ssp-buffer-size=4' '-fno-strict-overflow' '-I' '/tmp/vorbis/include'
'-I' '/tmp/vorbis/lib' '-O3' '-march=znver4' '-mtune=znver4' '-g' '-o'
'psy.c.o' '-c' '-v' '-U' '_FORTIFY_SOURCE' '-D' '_FORTIFY_SOURCE=3' '-B'
'/nix/store/bzjyfnr8g585gvxjgiabn28qdm32b02n-glibc-2.38-44/lib/' '-idirafter'
'/nix/store/j79rphhc2vmb7rrxvx0aymhkw8bpkckf-glibc-2.38-44-dev/include'
'-idirafter'
'/nix/store/iwwfrvi5b20irhl0vz8zdqzjf5i7vil2-gcc-13.2.0/lib/gcc/x86_64-unknown-linux-gnu/13.2.0/include-fixed'
'-B' '/nix/store/7ngrcd0a7q460gyg8grx6pipwzpgy0vq-gcc-13.2.0-lib/lib' '-B'
'/nix/store/qp3k692bxjhlzvsdqpq7mdylfyr7i1ln-gcc-wrapper-13.2.0/bin/'
'-frandom-seed=wbmyj7xk8s' '-isystem'
'/nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include'
'-isystem'
'/nix/store/5zr21xnk4h8pdi1s8n20y31y1x5hn8i0-libogg-1.3.5-dev/include'
'-dumpdir' 'psy.c.'

* [Bug target/114566] Misaligned vmovaps when compiling libvorbis for znver4
From: yshuiv7 at gmail dot com @ 2024-04-02 20:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #3 from Yuxuan Shui <yshuiv7 at gmail dot com> ---
Roughly reduced example:

#include <math.h>
#include <string.h>

#define toOC(n)     (log(n)*1.442695f-5.965784f)

float *setup_tone_curves(float binHz,
                         float center_decay_rate) {
  float workc[8][56];
  float *ret = NULL;

  memset(workc, 0, sizeof(workc));

  for (int j = 0; j < 8; j++) {
    for (int k = 0; k < 56; k++) {
      float adj =  k * center_decay_rate;
      if (adj < 0.)
        adj = 0.;
      if (adj > 0.)
        adj = 0.;
      workc[j][k] += adj;
    }
  }

  int lo_curve, bin = 0;
  lo_curve = ceil(toOC(bin * binHz + 1) * 2);

  return (ret);
}

int main() {
  setup_tone_curves(86.1328125, 0.625001);
  return 0;
}

* [Bug target/114566] Misaligned vmovaps when compiling libvorbis for znver4
From: yshuiv7 at gmail dot com @ 2024-04-02 20:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #4 from Yuxuan Shui <yshuiv7 at gmail dot com> ---
Reduced a bit further:

#include <math.h>
#include <string.h>

void setup_tone_curves(float binHz, float center_decay_rate) {
  float workc[8][56];
  memset(workc, 0, sizeof(workc));

  for (int j = 0; j < 8; j++) {
    for (int k = 0; k < 56; k++) {
      float adj =  k * center_decay_rate;
      if (adj < 0.)
        adj = 0.;
      if (adj > 0.)
        adj = 0.;
      workc[j][k] += adj;
    }
  }

  int lo_curve = log(binHz);
}

Some observations:

1. lo_curve, although it is dead code, is required to trigger the bug.
2. The two ifs in the loop are also required.
3. workc has to be initialized with memset; an aggregate initializer doesn't
trigger the bug.

* [Bug target/114566] Misaligned vmovaps when compiling libvorbis for znver4
From: yshuiv7 at gmail dot com @ 2024-04-02 20:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #5 from Yuxuan Shui <yshuiv7 at gmail dot com> ---
And -fstack-protector-strong is needed.

* [Bug target/114566] Misaligned vmovaps when compiling libvorbis for znver4
From: pinskia at gcc dot gnu.org @ 2024-04-02 20:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=110621
             Status|WAITING                     |UNCONFIRMED
     Ever confirmed|1                           |0

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Yuxuan Shui from comment #5)
> And -fstack-protector-strong is needed.

Oh ...

* [Bug target/114566] Misaligned vmovaps when compiling libvorbis for znver4
From: yshuiv7 at gmail dot com @ 2024-04-02 20:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #7 from Yuxuan Shui <yshuiv7 at gmail dot com> ---
Looks similar to Bug 110027, but ASAN is not involved here.

* [Bug target/114566] Misaligned vmovaps when compiling with stack-protector-strong for znver4
From: pinskia at gcc dot gnu.org @ 2024-04-02 20:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Yuxuan Shui from comment #7)
> Looks similar to Bug 110027, but ASAN is not involved here.

Right. Someone will need to debug it; I looked at the patch for PR 110027,
though, and it is ASAN-specific.

* [Bug target/114566] Misaligned vmovaps when compiling with stack-protector-strong for znver4
From: jakub at gcc dot gnu.org @ 2024-04-04 15:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |jakub at gcc dot gnu.org
   Last reconfirmed|2024-04-02 00:00:00         |2024-04-04

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
With -O3 -mavx512{bw,dq,vl} -fstack-protector-strong on
void
foo (float x, float y)
{
  float a[8][56];
  __builtin_memset (a, 0, sizeof (a));

  for (int j = 0; j < 8; j++)
    {
      for (int k = 0; k < 56; k++)
        {
          float b = k * y;
          if (b < 0.)
            b = 0.;
          if (b > 0.)
            b = 0.;
          a[j][k] += b;
        }
    }

  int c = __builtin_log (x);
}

int
main ()
{
  foo (86.1328125, 0.625001);
  return 0;
}
this started with r10-4263-g1297712fb4af6c6bfd827e0f0a9695b14669f87d
and went away on the trunk with
r14-2487-g574a1ea4406dd1dbf14e149a9b5d142f6cbdf32a

* [Bug target/114566] [11/12/13 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
From: jakub at gcc dot gnu.org @ 2024-04-04 15:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
            Summary|Misaligned vmovaps when     |[11/12/13 Regression]
                   |compiling with              |Misaligned vmovaps when
                   |stack-protector-strong for  |compiling with
                   |znver4                      |stack-protector-strong for
                   |                            |znver4
   Target Milestone|---                         |11.5

* [Bug target/114566] [11/12/13 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
From: jakub at gcc dot gnu.org @ 2024-04-04 15:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note that the array a, which is the object the misaligned store writes into,
has align:128, so assuming 256-bit alignment for an access into it seems wrong:
(insn 57 56 58 4 (set (reg:V8SF 135 [ vect__33.37 ])
        (plus:V8SF (reg:V8SF 101 [ vect_b_6.33 ])
            (mem:V8SF (plus:DI (reg/f:DI 83 [ _42 ])
                    (const_int 192 [0xc0])) [1 MEM <vector(8) float> [(float *)_42 + 192B]+0 S32 A256]))) "pr114566.c":16:12 2077 {*addv8sf3}
     (nil))
(insn 58 57 59 4 (set (mem:V8SF (plus:DI (reg/f:DI 83 [ _42 ])
                (const_int 192 [0xc0])) [1 MEM <vector(8) float> [(float *)_42 + 192B]+0 S32 A256])
        (reg:V8SF 135 [ vect__33.37 ])) "pr114566.c":16:12 1847 {movv8sf_internal}
     (nil))
It should have been A128, not A256...
For the vaddps we get away with that, because AVX allows misaligned memory
operands in arithmetic instructions, but not for the vmovaps store.
The neighbouring loads/stores, however, use the correct alignment:
(insn 54 53 55 4 (set (reg:V16SF 133 [ vect__47.22 ])
        (mem:V16SF (plus:DI (reg/f:DI 83 [ _42 ])
                (const_int 128 [0x80])) [1 MEM <vector(16) float> [(float *)_42 + 128B]+0 S64 A128])) "pr114566.c":16:8 1846 {movv16sf_internal}
     (nil))
(insn 55 54 56 4 (set (reg:V16SF 134 [ vect__48.23 ])
        (plus:V16SF (reg:V16SF 133 [ vect__47.22 ])
            (reg:V16SF 96 [ vect_b_46.19 ]))) "pr114566.c":16:12 2069 {*addv16sf3}
     (nil))
(insn 56 55 57 4 (set (mem:V16SF (plus:DI (reg/f:DI 83 [ _42 ])
                (const_int 128 [0x80])) [1 MEM <vector(16) float> [(float *)_42 + 128B]+0 S64 A128])
        (reg:V16SF 134 [ vect__48.23 ])) "pr114566.c":16:12 1846 {movv16sf_internal}
     (nil))

* [Bug target/114566] [11/12/13 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
From: jakub at gcc dot gnu.org @ 2024-04-04 15:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems like a vectorizer bug to me.  The _42 + 128 store is to
MEM <vector(16) float> [(float *)_42 + 128B];
aka:
 <target_mem_ref 0x7fffea146580
    type <vector_type 0x7fffea038930
        type <real_type 0x7fffea1532a0 float sizes-gimplified SF
            size <integer_cst 0x7fffea12cfd8 constant 32>
            unit-size <integer_cst 0x7fffea14f000 constant 4>
            align:32 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 0x7fffea1532a0 precision:32
            pointer_to_this <pointer_type 0x7fffea153930>>
        V16SF
        size <integer_cst 0x7fffea14f480 constant 512>
        unit-size <integer_cst 0x7fffea299960 constant 64>
        user align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffea2ac2a0 nunits:16
        pointer_to_this <pointer_type 0x7fffea038888>>

    arg:0 <ssa_name 0x7fffea03fee8
        type <pointer_type 0x7fffea153000 type <void_type 0x7fffea14bf18 void>
            public unsigned DI
            size <integer_cst 0x7fffea12cd98 constant 64>
            unit-size <integer_cst 0x7fffea12cdb0 constant 8>
            align:64 warn_if_not_align:0 symtab:0 alias-set 2 canonical-type 0x7fffea153000
            pointer_to_this <pointer_type 0x7fffea15b9d8>>

        def_stmt _42 = (void *) ivtmp.49_58;
        version:42
        ptr-info 0x7fffea068540>
    arg:1 <integer_cst 0x7fffea047c30 type <pointer_type 0x7fffea153930> constant 128>>
so it has only 32-bit alignment there, and therefore uses the movmisalign optab.
The _42 + 192 store is
MEM <vector(8) float> [(float *)_42 + 192B];
aka
 <target_mem_ref 0x7fffea146600
    type <vector_type 0x7fffea2a4f18
        type <real_type 0x7fffea1532a0 float sizes-gimplified SF
            size <integer_cst 0x7fffea12cfd8 constant 32>
            unit-size <integer_cst 0x7fffea14f000 constant 4>
            align:32 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 0x7fffea1532a0 precision:32
            pointer_to_this <pointer_type 0x7fffea153930>>
        V8SF
        size <integer_cst 0x7fffea14f108 constant 256>
        unit-size <integer_cst 0x7fffea14f1f8 constant 32>
        align:256 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 0x7fffea2a4f18 nunits:8
        pointer_to_this <pointer_type 0x7fffea2a7e70>>

    arg:0 <ssa_name 0x7fffea03fee8
        type <pointer_type 0x7fffea153000 type <void_type 0x7fffea14bf18 void>
            public unsigned DI
            size <integer_cst 0x7fffea12cd98 constant 64>
            unit-size <integer_cst 0x7fffea12cdb0 constant 8>
            align:64 warn_if_not_align:0 symtab:0 alias-set 2 canonical-type 0x7fffea153000
            pointer_to_this <pointer_type 0x7fffea15b9d8>>

        def_stmt _42 = (void *) ivtmp.49_58;
        version:42
        ptr-info 0x7fffea068540>
    arg:1 <integer_cst 0x7fffea0682e8 type <pointer_type 0x7fffea153930> constant 192>>
and so it expects 256-bit alignment (despite only 128-bit being guaranteed).

* [Bug target/114566] [11/12/13 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
From: jakub at gcc dot gnu.org @ 2024-04-04 16:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The user align:32 MEM_REF comes from
(gdb) p debug (dr_info->dr)
#(Data Ref: 
#  bb: 21 
#  stmt: a[j_38][k_41] = _48;
#  ref: a[j_38][k_41];
#  base_object: a;
#  Access function 0: {0, +, 1}_3
#  Access function 1: j_38
#)
$34 = void
(gdb) p *dr_info
$35 = {dr = 0x3a1be60, stmt = 0x3a1ef10, group = 21, misalignment = -1,
  target_alignment = {<poly_int_pod<1, unsigned long>> = {coeffs = {64}}, <No data fields>},
  base_misaligned = false, base_decl = <tree 0x0>, offset = <tree 0x0>}
(gdb) p *dr_info->dr
$36 = {stmt = <gimple_assign 0x7fffea04e320>, ref = <array_ref 0x7fffea139bd0>, aux = 0x0, is_read = false,
  is_conditional_in_stmt = false, alias = {ptr_info = 0x0}, innermost = {
    base_address = <addr_expr 0x7fffea051160>, offset = <nop_expr 0x7fffea051180>, init = <integer_cst 0x7fffea0474f8>,
    step = <integer_cst 0x7fffea047528>, base_alignment = 16,
    base_misalignment = 0, offset_alignment = 32, step_alignment = 4},
  indices = {base_object = <mem_ref 0x7fffea058ac8>, access_fns = {m_vec = 0x383ebb0 = {0x7fffea058618, 0x7fffea03fdc8}},
    unconstrained_base = false},
  alt_indices = {base_object = <tree 0x0>, access_fns = {m_vec = 0x0}, unconstrained_base = false}}
dr in vectorizable_store; alignment_support_scheme is dr_unaligned_supported,
and so build_aligned_type is done:
                      data_ref = fold_build2 (MEM_REF, vectype,
                                              dataref_ptr,
                                              dataref_offset
                                              ? dataref_offset
                                              : build_int_cst (ref_type, 0));
                      if (alignment_support_scheme == dr_aligned)
                        ;
                      else
                        TREE_TYPE (data_ref)
                          = build_aligned_type (TREE_TYPE (data_ref),
                                                align * BITS_PER_UNIT);
while the incorrect one is
(gdb) p debug (dr_info->dr)
#(Data Ref: 
#  bb: 51 
#  stmt: a[j_38][k_63] = _33;
#  ref: a[j_38][k_41];
#  base_object: a;
#  Access function 0: {0, +, 1}_3
#  Access function 1: j_38
#)
$37 = void
(gdb) p *dr_info
$38 = {dr = 0x3a1be60, stmt = 0x3a21f20, group = 21, misalignment = 0,
  target_alignment = {<poly_int_pod<1, unsigned long>> = {coeffs = {32}},
    <No data fields>},
  base_misaligned = false, base_decl = <var_decl 0x7fffea13bcf0 a>,
  offset = <integer_cst 0x7fffea0477e0>}
(gdb) p *dr_info->dr
$39 = {stmt = <gimple_assign 0x7fffea04e9b0>, ref = <array_ref 0x7fffea139bd0>,
  aux = 0x0, is_read = false, is_conditional_in_stmt = false,
  alias = {ptr_info = 0x0},
  innermost = {base_address = <addr_expr 0x7fffea051160>,
    offset = <nop_expr 0x7fffea051180>, init = <integer_cst 0x7fffea0474f8>,
    step = <integer_cst 0x7fffea047528>, base_alignment = 16,
    base_misalignment = 0, offset_alignment = 32, step_alignment = 4},
  indices = {base_object = <mem_ref 0x7fffea058ac8>,
    access_fns = {m_vec = 0x383ebb0 = {0x7fffea058618, 0x7fffea03fdc8}},
    unconstrained_base = false},
  alt_indices = {base_object = <tree 0x0>, access_fns = {m_vec = 0x0},
    unconstrained_base = false}}

base_alignment in both cases is 16, so I wonder why it would think that
V8SFmode access on it is aligned.
Ah, the first case is actually with V16SFmode access, perhaps something
considers just the offset_alignment which is good enough for 32-byte alignment
but not 64-byte alignment and disregards base_alignment?
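
The question above can be illustrated with a small model. This is only a
sketch, not GCC code: guaranteed_alignment and access_known_aligned are
hypothetical helpers, and the model assumes that an address is formed as
base + offset, where only the alignment of each part is known. Under that
assumption the address is guaranteed to be aligned only to the smaller of the
two values, so base_alignment 16 with offset_alignment 32 justifies neither a
32-byte nor a 64-byte aligned access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a GCC function: the alignment in bytes that is
   guaranteed for base + offset when only the alignment of the base and the
   alignment of the offset are known.  Both arguments must be powers of two.  */
unsigned
guaranteed_alignment (unsigned base_align, unsigned offset_align)
{
  assert (base_align != 0 && (base_align & (base_align - 1)) == 0);
  assert (offset_align != 0 && (offset_align & (offset_align - 1)) == 0);
  /* For powers of two the smaller value divides the larger one, so the sum
     is always a multiple of the smaller value and nothing more is known.  */
  return base_align < offset_align ? base_align : offset_align;
}

/* True if an access that needs VECT_ALIGN bytes is known to be aligned.  */
bool
access_known_aligned (unsigned base_align, unsigned offset_align,
                      unsigned vect_align)
{
  return guaranteed_alignment (base_align, offset_align) >= vect_align;
}
```

With the values from the dumps, access_known_aligned (16, 32, 32) and
access_known_aligned (16, 32, 64) are both false; the 32-byte access only
becomes safe if the base alignment is actually raised to 32.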


* [Bug target/114566] [11/12/13 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
  2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
                   ` (12 preceding siblings ...)
  2024-04-04 16:33 ` jakub at gcc dot gnu.org
@ 2024-04-04 16:57 ` jakub at gcc dot gnu.org
  2024-04-04 17:08 ` [Bug target/114566] [11/12/13/14 " jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-04 16:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Ah, vect_analyze_data_refs_alignment -> vect_compute_data_ref_alignment
actually checks for this case:
          if (max_alignment < vect_align_c
              || !vect_can_force_dr_alignment_p (base,
                                                 vect_align_c * BITS_PER_UNIT))
and vect_can_force_dr_alignment_p returns true, i.e. it says the base (aka
VAR_DECL a) can be forced to have the dr alignment.
But then nothing actually increases the alignment of the VAR_DECL.
For static VAR_DECLs that pass vect_can_force_dr_alignment_p there is the
increase_alignment IPA pass (though just for -fsection-anchors?), and for
automatic vars there is ensure_base_align, but for some reason that doesn't
trigger.
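
As a source-level illustration of what raising the alignment of an automatic
variable amounts to, here is a minimal sketch. It is not taken from the bug's
testcase: addr_is_aligned and forced_local_is_aligned are hypothetical names,
and _Alignas (32) stands in for the alignment increase the vectorizer would
have to apply to the decl itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if ADDR is a multiple of ALIGN, which must be a
   power of two.  */
bool
addr_is_aligned (uintptr_t addr, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr & (align - 1)) == 0;
}

/* An automatic array whose alignment is raised explicitly to 32 bytes.
   Only with such a guarantee on the variable itself is a 32-byte aligned
   vector store to its start valid.  */
bool
forced_local_is_aligned (void)
{
  _Alignas (32) float a[8][8];
  a[0][0] = 0.0f;
  return addr_is_aligned ((uintptr_t) &a[0][0], 32);
}
```

Without the _Alignas, nothing in the language guarantees more than the
default alignment of the array, which matches the situation described above
where the alignment could be forced but never is.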


* [Bug target/114566] [11/12/13/14 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
  2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
                   ` (13 preceding siblings ...)
  2024-04-04 16:57 ` jakub at gcc dot gnu.org
@ 2024-04-04 17:08 ` jakub at gcc dot gnu.org
  2024-04-04 17:14 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-04 17:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |avieira at gcc dot gnu.org
            Summary|[11/12/13 Regression]       |[11/12/13/14 Regression]
                   |Misaligned vmovaps when     |Misaligned vmovaps when
                   |compiling with              |compiling with
                   |stack-protector-strong for  |stack-protector-strong for
                   |znver4                      |znver4

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Ah, it is the
      /* The vector size of the epilogue is smaller than that of the main loop
         so the alignment is either the same or lower. This means the dr will
         thus by definition be aligned.  */
      STMT_VINFO_DR_INFO (stmt_vinfo)->base_misaligned = false;
that clears base_misaligned, but somehow nothing forced the higher alignment on
the var before.
And the assumption is just wrong.
The main loop uses 512-bit vectors and we have base_alignment 16,
offset_alignment 32.  For the V16SFmode accesses (those in the main vectorized
loop as well as the earlier one in the vectorized epilogue),
vect_compute_data_ref_alignment gave up already earlier:
  if (drb->offset_alignment < vect_align_c
      || !step_preserves_misalignment_p
      /* We need to know whether the step wrt the vectorized loop is
         negative when computing the starting misalignment below.  */
      || TREE_CODE (drb->step) != INTEGER_CST)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "Unknown alignment for access: %T\n", ref);
      return;
    }
Only in the V8SFmode case in the epilogue, where vect_align_c is 32 rather
than 64, does it go further and trigger
  if (base_alignment < vect_align_c)
    {
      unsigned int max_alignment;
      tree base = get_base_for_alignment (drb->base_address, &max_alignment);
      if (max_alignment < vect_align_c
          || !vect_can_force_dr_alignment_p (base,
                                             vect_align_c * BITS_PER_UNIT))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_NOTE, vect_location,
                             "can't force alignment of ref: %T\n", ref);
          return;
        }

      /* Force the alignment of the decl.
         NOTE: This is the only change to the code we make during
         the analysis phase, before deciding to vectorize the loop.  */
      if (dump_enabled_p ())
        dump_printf_loc (MSG_NOTE, vect_location,
                         "force alignment of %T\n", ref);

      dr_info->base_decl = base;
      dr_info->base_misaligned = true;
      base_misalignment = 0;
    }

So, if we don't want to force higher base alignment just because of some
accesses in vectorizable epilogue, I think we need to recompute the
alignment/misalignment there as well.
Marking for 14 as well because I believe the trunk commit just made it latent
there rather than fixed.
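
The sequence described in this comment can be modelled in a few lines. This is
a simplified sketch under stated assumptions, not the vectorizer's code: the
structs and the functions compute_alignment_model, ensure_base_align_model and
access_wrongly_aligned are hypothetical stand-ins that only borrow the field
names quoted above, alignments are in bytes, and the model assumes the decl's
alignment can always be forced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a VAR_DECL: only its alignment in bytes.  */
struct decl_model { unsigned align_bytes; };

/* Simplified stand-in for dr_vec_info.  */
struct dr_model
{
  struct decl_model *base_decl;
  bool base_misaligned;  /* Decl alignment still has to be raised.  */
  int misalignment;      /* -1 means unknown, 0 means aligned.  */
};

/* Analysis phase: give up if the offset alignment is too small, otherwise
   promise to raise the decl alignment later and treat the access as
   aligned.  */
void
compute_alignment_model (struct dr_model *dr, struct decl_model *base,
                         unsigned base_alignment, unsigned offset_alignment,
                         unsigned vect_align)
{
  dr->base_decl = NULL;
  dr->base_misaligned = false;
  dr->misalignment = -1;
  if (offset_alignment < vect_align)
    return;
  if (base_alignment < vect_align)
    {
      dr->base_decl = base;
      dr->base_misaligned = true;
    }
  dr->misalignment = 0;
}

/* Transform phase: raise the decl alignment, but only if the flag set
   during analysis is still there.  */
void
ensure_base_align_model (struct dr_model *dr, unsigned vect_align)
{
  if (!dr->base_misaligned)
    return;
  if (dr->base_decl->align_bytes < vect_align)
    dr->base_decl->align_bytes = vect_align;
  dr->base_misaligned = false;
}

/* Returns true if the access ends up treated as aligned although the decl
   never received the required alignment.  */
bool
access_wrongly_aligned (bool epilogue_clears_flag)
{
  struct decl_model a = { 16 };
  struct dr_model dr;
  compute_alignment_model (&dr, &a, 16, 32, 32);
  if (epilogue_clears_flag)
    dr.base_misaligned = false;  /* The problematic clearing.  */
  ensure_base_align_model (&dr, 32);
  return dr.misalignment == 0 && a.align_bytes < 32;
}
```

In this model access_wrongly_aligned (true) returns true and
access_wrongly_aligned (false) returns false: once the flag is cleared between
analysis and transform, the promise to raise the alignment is silently dropped
while the access is still considered aligned.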


* [Bug target/114566] [11/12/13/14 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
  2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
                   ` (14 preceding siblings ...)
  2024-04-04 17:08 ` [Bug target/114566] [11/12/13/14 " jakub at gcc dot gnu.org
@ 2024-04-04 17:14 ` pinskia at gcc dot gnu.org
  2024-04-05 10:30 ` [Bug tree-optimization/114566] " jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-04 17:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #14)
> Marking for 14 as well because I believe the trunk commit just made it
> latent there rather than fixed.

You might be able to reproduce it on the trunk with -fno-vect-cost-model
because the patch which made it latent was to the cost model.


* [Bug tree-optimization/114566] [11/12/13/14 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
  2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
                   ` (15 preceding siblings ...)
  2024-04-04 17:14 ` pinskia at gcc dot gnu.org
@ 2024-04-05 10:30 ` jakub at gcc dot gnu.org
  2024-04-05 12:56 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-05 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #15)
> (In reply to Jakub Jelinek from comment #14)
> > Marking for 14 as well because I believe the trunk commit just made it
> > latent there rather than fixed.
> 
> You might be able to reproduce it on the trunk with -fno-vect-cost-model
> because the patch which made it latent was to the cost model.

It doesn't reproduce even with -fno-vect-cost-model on the trunk.

That said, I've posted
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648850.html


* [Bug tree-optimization/114566] [11/12/13/14 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
  2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
                   ` (16 preceding siblings ...)
  2024-04-05 10:30 ` [Bug tree-optimization/114566] " jakub at gcc dot gnu.org
@ 2024-04-05 12:56 ` cvs-commit at gcc dot gnu.org
  2024-04-05 12:57 ` [Bug tree-optimization/114566] [11/12/13 " jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-04-05 12:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #17 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a844095e17c1a5aada1364c6f6eaade87ead463c

commit r14-9808-ga844095e17c1a5aada1364c6f6eaade87ead463c
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Apr 5 14:56:14 2024 +0200

    vect: Don't clear base_misaligned in update_epilogue_loop_vinfo [PR114566]

    The following testcase is miscompiled, because in the vectorized
    epilogue the vectorizer assumes it can use aligned loads/stores
    (if the base decl gets alignment increased), but it actually doesn't
    increase that.
    This is because r10-4203-g97c1460367 added the hunk that the following
    patch removes.  The explanation feels reasonable, but actually it
    is not true as the testcase proves.
    The thing is, we vectorize the main loop with 64-byte vectors
    and the corresponding data refs have base_alignment 16 (the
    a array has DECL_ALIGN 128) and offset_alignment 32.  Now, because
    of the offset_alignment 32 rather than 64, we need to use unaligned
    loads/stores in the main loop (and ditto in the first load/store
    in vectorized epilogue).  But the second load/store in the vectorized
    epilogue uses only 32-byte vectors and because it is a multiple
    of offset_alignment, it checks if we could increase alignment of the
    a VAR_DECL, the function returns true, sets base_misaligned = true
    and says the access is then aligned.
    But when update_epilogue_loop_vinfo clears base_misaligned with the
    assumption that the var had to have the alignment increased already,
    the update of DECL_ALIGN doesn't happen anymore.

    Now, I'd think this base_misaligned = false was needed before the
    r10-4030-gd2db7f7901 change was committed, when the code incorrectly
    overwrote DECL_ALIGN even if it was already larger, rather than
    only ever increasing it.  But with that change in, it doesn't
    make sense to me anymore.

    Note, the testcase is latent on the trunk, but reproduces on the 13
    branch.

    2024-04-05  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/114566
            * tree-vect-loop.cc (update_epilogue_loop_vinfo): Don't clear
            base_misaligned.

            * gcc.target/i386/avx512f-pr114566.c: New test.


* [Bug tree-optimization/114566] [11/12/13 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
  2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
                   ` (17 preceding siblings ...)
  2024-04-05 12:56 ` cvs-commit at gcc dot gnu.org
@ 2024-04-05 12:57 ` jakub at gcc dot gnu.org
  2024-04-21  4:08 ` cvs-commit at gcc dot gnu.org
  2024-04-23  6:45 ` [Bug tree-optimization/114566] [11/12 " jakub at gcc dot gnu.org
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-05 12:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12/13/14 Regression]    |[11/12/13 Regression]
                   |Misaligned vmovaps when     |Misaligned vmovaps when
                   |compiling with              |compiling with
                   |stack-protector-strong for  |stack-protector-strong for
                   |znver4                      |znver4
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot       |jakub at gcc dot gnu.org
                   |gnu.org                     |


* [Bug tree-optimization/114566] [11/12/13 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
  2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
                   ` (18 preceding siblings ...)
  2024-04-05 12:57 ` [Bug tree-optimization/114566] [11/12/13 " jakub at gcc dot gnu.org
@ 2024-04-21  4:08 ` cvs-commit at gcc dot gnu.org
  2024-04-23  6:45 ` [Bug tree-optimization/114566] [11/12 " jakub at gcc dot gnu.org
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-04-21  4:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #18 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:38af0d59043da4cc07cd62c17da599e43668e3be

commit r13-8628-g38af0d59043da4cc07cd62c17da599e43668e3be
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Apr 5 14:56:14 2024 +0200

    vect: Don't clear base_misaligned in update_epilogue_loop_vinfo [PR114566]

    The following testcase is miscompiled, because in the vectorized
    epilogue the vectorizer assumes it can use aligned loads/stores
    (if the base decl gets alignment increased), but it actually doesn't
    increase that.
    This is because r10-4203-g97c1460367 added the hunk that the following
    patch removes.  The explanation feels reasonable, but actually it
    is not true as the testcase proves.
    The thing is, we vectorize the main loop with 64-byte vectors
    and the corresponding data refs have base_alignment 16 (the
    a array has DECL_ALIGN 128) and offset_alignment 32.  Now, because
    of the offset_alignment 32 rather than 64, we need to use unaligned
    loads/stores in the main loop (and ditto in the first load/store
    in vectorized epilogue).  But the second load/store in the vectorized
    epilogue uses only 32-byte vectors and because it is a multiple
    of offset_alignment, it checks if we could increase alignment of the
    a VAR_DECL, the function returns true, sets base_misaligned = true
    and says the access is then aligned.
    But when update_epilogue_loop_vinfo clears base_misaligned with the
    assumption that the var had to have the alignment increased already,
    the update of DECL_ALIGN doesn't happen anymore.

    Now, I'd think this base_misaligned = false was needed before the
    r10-4030-gd2db7f7901 change was committed, when the code incorrectly
    overwrote DECL_ALIGN even if it was already larger, rather than
    only ever increasing it.  But with that change in, it doesn't
    make sense to me anymore.

    Note, the testcase is latent on the trunk, but reproduces on the 13
    branch.

    2024-04-05  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/114566
            * tree-vect-loop.cc (update_epilogue_loop_vinfo): Don't clear
            base_misaligned.

            * gcc.target/i386/avx512f-pr114566.c: New test.

    (cherry picked from commit a844095e17c1a5aada1364c6f6eaade87ead463c)


* [Bug tree-optimization/114566] [11/12 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
  2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
                   ` (19 preceding siblings ...)
  2024-04-21  4:08 ` cvs-commit at gcc dot gnu.org
@ 2024-04-23  6:45 ` jakub at gcc dot gnu.org
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-23  6:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12/13 Regression]       |[11/12 Regression]
                   |Misaligned vmovaps when     |Misaligned vmovaps when
                   |compiling with              |compiling with
                   |stack-protector-strong for  |stack-protector-strong for
                   |znver4                      |znver4

--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed for 13.3 too.


end of thread, other threads:[~2024-04-23  6:45 UTC | newest]

Thread overview: 22+ messages
2024-04-02 18:19 [Bug c/114566] New: Misaligned vmovaps when compiling libvorbis for znver4 yshuiv7 at gmail dot com
2024-04-02 18:28 ` [Bug target/114566] " pinskia at gcc dot gnu.org
2024-04-02 19:33 ` yshuiv7 at gmail dot com
2024-04-02 20:15 ` yshuiv7 at gmail dot com
2024-04-02 20:22 ` yshuiv7 at gmail dot com
2024-04-02 20:28 ` yshuiv7 at gmail dot com
2024-04-02 20:37 ` pinskia at gcc dot gnu.org
2024-04-02 20:40 ` yshuiv7 at gmail dot com
2024-04-02 20:44 ` [Bug target/114566] Misaligned vmovaps when compiling with stack-protector-strong " pinskia at gcc dot gnu.org
2024-04-04 15:22 ` jakub at gcc dot gnu.org
2024-04-04 15:23 ` [Bug target/114566] [11/12/13 Regression] " jakub at gcc dot gnu.org
2024-04-04 15:40 ` jakub at gcc dot gnu.org
2024-04-04 15:46 ` jakub at gcc dot gnu.org
2024-04-04 16:33 ` jakub at gcc dot gnu.org
2024-04-04 16:57 ` jakub at gcc dot gnu.org
2024-04-04 17:08 ` [Bug target/114566] [11/12/13/14 " jakub at gcc dot gnu.org
2024-04-04 17:14 ` pinskia at gcc dot gnu.org
2024-04-05 10:30 ` [Bug tree-optimization/114566] " jakub at gcc dot gnu.org
2024-04-05 12:56 ` cvs-commit at gcc dot gnu.org
2024-04-05 12:57 ` [Bug tree-optimization/114566] [11/12/13 " jakub at gcc dot gnu.org
2024-04-21  4:08 ` cvs-commit at gcc dot gnu.org
2024-04-23  6:45 ` [Bug tree-optimization/114566] [11/12 " jakub at gcc dot gnu.org
