public inbox for gcc-bugs@sourceware.org
* [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
@ 2023-01-26  8:00 feng.tang at intel dot com
  2023-01-26  8:01 ` [Bug c/108552] " feng.tang at intel dot com
                   ` (46 more replies)
  0 siblings, 47 replies; 48+ messages in thread
From: feng.tang at intel dot com @ 2023-01-26  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

            Bug ID: 108552
           Summary: Linux i386 kernel 5.14 memory corruption for
                    pre_compound_page() when gcov is enabled
           Product: gcc
           Version: 11.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: feng.tang at intel dot com
  Target Milestone: ---

Created attachment 54345
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54345&action=edit
objdump of  prep_compound_page()

0Day found an i386 Linux kernel boot issue, and bisection shows that the first
bad commit is 7118fc2906e29 ("hugetlb: address ref count racing in
prep_compound_gigantic_page"). It happens in 94 out of 999 runs. Details and
some debug analysis from Linus, Vlastimil and us can be found at the following
link:
https://lore.kernel.org/lkml/202301170941.49728982-oliver.sang@intel.com/t/


Debugging shows it is related to one function, prep_compound_page(), in
mm/page_alloc.c:

* If we use '#pragma GCC optimize ("O1")' for that function (the kernel
normally uses O2), the issue goes away
* If we disable GCOV for page_alloc.c, it can't be reproduced
* If we disable UBSAN for page_alloc.c, it can't be reproduced
* It is not reproducible with an x86_64 build
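
The first bullet refers to a per-function optimization override. A minimal
sketch of that pattern follows. `struct fake_page` and `prep_tail_pages()` are
hypothetical stand-ins for illustration, not the kernel's code, and the
push_options/pop_options pair is one possible way to limit the pragma to a
single function:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct page. */
struct fake_page {
    unsigned long compound_head;
    void *mapping;
};

#pragma GCC push_options
#pragma GCC optimize ("O1")   /* in effect only until pop_options below */

/* Hypothetical stand-in for the tail-page loop in prep_compound_page(). */
void prep_tail_pages(struct fake_page *page, size_t nr_pages)
{
    size_t i;

    assert(page != NULL);
    for (i = 1; i < nr_pages; i++) {
        page[i].mapping = (void *)0x400;                  /* illustrative marker value */
        page[i].compound_head = (unsigned long)page + 1;  /* head address with bit 0 set */
    }
}

#pragma GCC pop_options
```

Functions defined after pop_options are compiled with the command-line
optimization level again.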

It seems to be a loop corruption. The pseudocode is:

for (i = 1; i < nr_pages; i++)
   set_meta_data(page[i]);

It should run for page[1]...page[nr_pages - 1], but the memory dump suggests
that set_meta_data() is also called for one more page, page[nr_pages].
https://lore.kernel.org/all/202212312021.bc1efe86-oliver.sang@intel.com/t/
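
The intended loop bounds, and one way to detect a write past them, can be
sketched as follows. `struct meta`, `prep_tail_pages()` and `count_overrun()`
are hypothetical names for illustration; the idea is simply to allocate extra
guard entries after the array and check whether any of them were touched:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-page metadata; not the kernel's struct page. */
struct meta {
    int touched;
};

/* Hypothetical helper mirroring set_meta_data() from the pseudocode. */
static void set_meta_data(struct meta *m)
{
    m->touched++;
}

/* Intended behaviour: visit page[1] .. page[nr_pages - 1] only. */
void prep_tail_pages(struct meta *page, size_t nr_pages)
{
    size_t i;

    assert(page != NULL);
    for (i = 1; i < nr_pages; i++)
        set_meta_data(&page[i]);
}

/*
 * Returns the number of entries at index >= nr_pages that were touched.
 * 'total' is the full array length, including the extra guard entries.
 */
size_t count_overrun(const struct meta *page, size_t nr_pages, size_t total)
{
    size_t i, n = 0;

    for (i = nr_pages; i < total; i++)
        if (page[i].touched)
            n++;
    return n;
}
```

With correctly generated code count_overrun() returns 0; the symptom described
above would correspond to the guard entry at index nr_pages being touched.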

The kernel log, i386 config and the objdump of prep_compound_page() for the
first bad commit are attached. Please let me know if you need more info, thanks!

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Bug c/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
@ 2023-01-26  8:01 ` feng.tang at intel dot com
  2023-01-26  8:02 ` [Bug target/108552] " pinskia at gcc dot gnu.org
                   ` (45 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: feng.tang at intel dot com @ 2023-01-26  8:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #1 from Tang, Feng <feng.tang at intel dot com> ---
Created attachment 54346
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54346&action=edit
kernel log with error message


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
  2023-01-26  8:01 ` [Bug c/108552] " feng.tang at intel dot com
@ 2023-01-26  8:02 ` pinskia at gcc dot gnu.org
  2023-01-26  8:05 ` pinskia at gcc dot gnu.org
                   ` (44 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-01-26  8:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |target

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This could be a bug in mcount that the kernel provides.


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
  2023-01-26  8:01 ` [Bug c/108552] " feng.tang at intel dot com
  2023-01-26  8:02 ` [Bug target/108552] " pinskia at gcc dot gnu.org
@ 2023-01-26  8:05 ` pinskia at gcc dot gnu.org
  2023-01-26  8:13 ` feng.tang at intel dot com
                   ` (43 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-01-26  8:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-01-26
             Status|UNCONFIRMED                 |WAITING

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Do you have the preprocessed source that is used to generate the bad object file?
How about the exact command line?


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (2 preceding siblings ...)
  2023-01-26  8:05 ` pinskia at gcc dot gnu.org
@ 2023-01-26  8:13 ` feng.tang at intel dot com
  2023-01-26  8:19 ` pinskia at gcc dot gnu.org
                   ` (42 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: feng.tang at intel dot com @ 2023-01-26  8:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #4 from Tang, Feng <feng.tang at intel dot com> ---
(In reply to Andrew Pinski from comment #3)
> Do you have the preprocessed source that is used to generate the bad object
> file?
> How about the exact command line?

Thanks for the prompt response!

The error was originally reported by 0Day (a kernel test automation robot),
and I can reproduce it locally with slight differences.

Sorry for my limited knowledge of gcc. Do you want me to give the output of
"make ARCH=i386 mm/page_alloc.s"? Or can you give me the command to generate
it? Thanks


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (3 preceding siblings ...)
  2023-01-26  8:13 ` feng.tang at intel dot com
@ 2023-01-26  8:19 ` pinskia at gcc dot gnu.org
  2023-01-26 11:35 ` feng.tang at intel dot com
                   ` (41 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-01-26  8:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Everything we need is listed at https://gcc.gnu.org/bugs/


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (4 preceding siblings ...)
  2023-01-26  8:19 ` pinskia at gcc dot gnu.org
@ 2023-01-26 11:35 ` feng.tang at intel dot com
  2023-01-26 11:37 ` feng.tang at intel dot com
                   ` (40 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: feng.tang at intel dot com @ 2023-01-26 11:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

Tang, Feng <feng.tang at intel dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #54345|0                           |1
        is obsolete|                            |

--- Comment #6 from Tang, Feng <feng.tang at intel dot com> ---
Created attachment 54348
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54348&action=edit
objdump of  prep_compound_page()


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (5 preceding siblings ...)
  2023-01-26 11:35 ` feng.tang at intel dot com
@ 2023-01-26 11:37 ` feng.tang at intel dot com
  2023-01-26 11:39 ` feng.tang at intel dot com
                   ` (39 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: feng.tang at intel dot com @ 2023-01-26 11:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #7 from Tang, Feng <feng.tang at intel dot com> ---
Created attachment 54349
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54349&action=edit
original job-script from Oliver (0Day)


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (6 preceding siblings ...)
  2023-01-26 11:37 ` feng.tang at intel dot com
@ 2023-01-26 11:39 ` feng.tang at intel dot com
  2023-01-26 16:03 ` feng.tang at intel dot com
                   ` (38 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: feng.tang at intel dot com @ 2023-01-26 11:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #8 from Tang, Feng <feng.tang at intel dot com> ---
Created attachment 54350
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54350&action=edit
i386 kernel config

In https://lore.kernel.org/lkml/202301170941.49728982-oliver.sang@intel.com/t/
Oliver Sang provided a reproducer:

To reproduce:

        # build kernel
        cd linux
        cp config-5.13.0-00219-g7118fc2906e2 .config
        make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
        make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
        cd <mod-install-dir>
        find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (7 preceding siblings ...)
  2023-01-26 11:39 ` feng.tang at intel dot com
@ 2023-01-26 16:03 ` feng.tang at intel dot com
  2023-01-26 16:07 ` feng.tang at intel dot com
                   ` (37 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: feng.tang at intel dot com @ 2023-01-26 16:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #9 from Tang, Feng <feng.tang at intel dot com> ---

The original report,
https://lore.kernel.org/lkml/202301170941.49728982-oliver.sang@intel.com/t/,
came from Oliver Sang of the 0Day team, but I failed to add him to Cc
(probably because he is not registered in this bugzilla system), so I will try
to gather the info myself (some from Oliver's report, and some from my local
system where it can't be found in Oliver's report).

gcc version: gcc-11 (Debian 11.3.0-8) 11.3.0
             gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0

Platform: QEMU

Preprocessed file: page_alloc.i (attached)

gcc options: from page_alloc.s (obtained with 'make ARCH=i386 mm/page_alloc.s')

# GNU C89 (Ubuntu 11.3.0-1ubuntu1~22.04) version 11.3.0 (x86_64-linux-gnu)
#       compiled by GNU C version 11.3.0, GMP version 6.2.1, MPFR version
4.1.0, MPC version 1.2.1, isl version isl-0.24-GMP

# GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
# options passed: -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m32
-msoft-float -mregparm=3 -mpreferred-stack-boundary=2 -march=i686
-mstack-protector-guard-reg=fs
-mstack-protector-guard-symbol=__stack_chk_guard
-mindirect-branch=thunk-extern -mindirect-branch-register -O2 -std=gnu90
-fno-strict-aliasing -fno-common -fshort-wchar -fcf-protection=none
-freg-struct-return -fno-pic -ffreestanding -fno-asynchronous-unwind-tables
-fno-jump-tables -fno-delete-null-pointer-checks -fno-allow-store-data-races
-fno-reorder-blocks -fno-ipa-cp-clone -fno-partial-inlining
-fstack-protector-strong -fno-omit-frame-pointer -fno-optimize-sibling-calls
-fno-stack-clash-protection -fno-inline-functions-called-once
-fno-strict-overflow -fstack-check=no -fconserve-stack -fprofile-arcs
-ftest-coverage -fno-tree-loop-im -fsanitize=bounds -fsanitize=shift
-fsanitize=unreachable


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (8 preceding siblings ...)
  2023-01-26 16:03 ` feng.tang at intel dot com
@ 2023-01-26 16:07 ` feng.tang at intel dot com
  2023-01-26 19:06 ` pinskia at gcc dot gnu.org
                   ` (36 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: feng.tang at intel dot com @ 2023-01-26 16:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #10 from Tang, Feng <feng.tang at intel dot com> ---
Created attachment 54352
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54352&action=edit
page_alloc.i.xz


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (9 preceding siblings ...)
  2023-01-26 16:07 ` feng.tang at intel dot com
@ 2023-01-26 19:06 ` pinskia at gcc dot gnu.org
  2023-01-26 19:22 ` torvalds@linux-foundation.org
                   ` (35 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-01-26 19:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|1                           |0
             Status|WAITING                     |UNCONFIRMED

--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> ---


The generated IR from the trunk:
  <bb 7> [local count: 118111600]:
  PROF_edge_counter_113 = __gcov0.set_compound_page_dtor[1];
  PROF_edge_counter_114 = PROF_edge_counter_113 + 1;
  __gcov0.set_compound_page_dtor[1] = PROF_edge_counter_114;
  MEM[(struct page *)page_12(D) + 40B].D.14083.D.14061.compound_dtor = 1;
  PROF_edge_counter_47 = __gcov0.prep_compound_page[8];
  PROF_edge_counter_48 = PROF_edge_counter_47 + 1;
  __gcov0.prep_compound_page[8] = PROF_edge_counter_48;
  PROF_edge_counter_96 = __gcov0.set_compound_order[0];
  PROF_edge_counter_97 = PROF_edge_counter_96 + 1;
  __gcov0.set_compound_order[0] = PROF_edge_counter_97;
  _98 = (unsigned char) order_8(D);
  MEM[(struct page *)page_12(D) + 40B].D.14083.D.14061.compound_order = _98;
  if (order_8(D) > 31)
    goto <bb 10>; [0.00%]
  else
    goto <bb 11>; [100.00%]

  <bb 8> [local count: 105119324]:
  _176 = (long unsigned int) page_12(D);
  _95 = _176 + 1;
  pretmp_94 = __gcov0.prep_compound_page[7];
  _179 = pretmp_94 + 1;
  ivtmp.1725_211 = (unsigned long long) _179;
  _155 = page_12(D) + 40;
  ivtmp.1730_157 = (unsigned int) _155;
  _135 = (unsigned int) nr_pages_11;
  _134 = _135 + 4294967294;
  _132 = (unsigned long long) _134;
  _89 = (unsigned long long) pretmp_94;
  _76 = _89 + 2;
  _19 = _76 + _132;

  <bb 9> [local count: 955630225]:
  # ivtmp.1725_77 = PHI <ivtmp.1725_69(9), ivtmp.1725_211(8)>
  # ivtmp.1730_178 = PHI <ivtmp.1730_168(9), ivtmp.1730_157(8)>
  p_16 = (struct page *) ivtmp.1730_178;
  MEM <struct address_space *> [(union  *)p_16 + 12B] = 1024B;
  MEM[(volatile long unsigned int *)p_16 + 4B] ={v} _95;
  PROF_edge_counter_46 = (long long int) ivtmp.1725_77;
  __gcov0.prep_compound_page[7] = PROF_edge_counter_46;
  ivtmp.1725_69 = ivtmp.1725_77 + 1;
  ivtmp.1730_168 = ivtmp.1730_178 + 40;
  if (_19 != ivtmp.1725_69)
    goto <bb 9>; [89.00%]
  else
    goto <bb 7>; [11.00%]
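
Read as plain C, blocks <bb 8> and <bb 9> use the 64-bit gcov counter itself
as the loop's induction variable. The sketch below is a hypothetical model of
that structure, not GCC output: `model_trip_count()` and its parameters are
illustration names, `_134 = _135 + 4294967294` is treated as nr_pages - 2
modulo 2^32, and it assumes nr_pages >= 2 because the guard before <bb 8> is
not part of the excerpt:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical C model of <bb 8>/<bb 9>. 'counter' stands in for
 * __gcov0.prep_compound_page[7]; the return value is the number of loop
 * body executions, i.e. the number of tail pages written.
 */
uint64_t model_trip_count(uint64_t *counter, uint32_t nr_pages)
{
    uint64_t pretmp, iv, end, trips = 0;

    assert(counter != 0);
    assert(nr_pages >= 2);   /* assumption: the loop is skipped otherwise */

    pretmp = *counter;                                       /* pretmp_94 */
    iv = pretmp + 1;                                         /* ivtmp.1725_211 */
    end = pretmp + 2 + (uint64_t)(uint32_t)(nr_pages - 2u);  /* _19 */

    do {
        trips++;         /* stands for the two stores through p_16 */
        *counter = iv;   /* __gcov0.prep_compound_page[7] = ... */
        iv = iv + 1;     /* ivtmp.1725_69 */
    } while (end != iv);

    return trips;
}
```

When both uses of the counter see the same value, the model runs
nr_pages - 1 times and advances the counter by the same amount, which is what
the source loop intends.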


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (10 preceding siblings ...)
  2023-01-26 19:06 ` pinskia at gcc dot gnu.org
@ 2023-01-26 19:22 ` torvalds@linux-foundation.org
  2023-01-27  9:52 ` ubizjak at gmail dot com
                   ` (34 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: torvalds@linux-foundation.org @ 2023-01-26 19:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #12 from Linus Torvalds <torvalds@linux-foundation.org> ---
So it might be worth pointing explicitly to Vlastimil's email at

  https://lore.kernel.org/all/2b857e20-5e3a-13ec-a0b0-1f69d2d047a5@suse.cz/

which has annotated objdump output and seems to point to the actual bug (or at
least part of it), which seems to show how the page counting (in register %ebx)
is corrupted by the coverage counts (Vlastimil calls the coverage counts "crap"
- it's real data, but from an algorithmic standpoint it obviously has no
bearing on the output).

That would mesh with "on 32-bit x86, the 64-bit coverage counts require a lot
more effort, and we have few registers, and something gets confused and uses
register %rax for two things".

The bug apparently only happens with -O2, and I think it has only been reported
with gcc-11, which is what the Intel test robots happened to use.
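
To illustrate why 64-bit counters cost more on 32-bit x86: each counter
occupies two 32-bit words, and an increment becomes an add on the low word
followed by an add-with-carry on the high word (the addl/adcl instruction
pair). A minimal sketch, with `struct counter64` and its helpers as
hypothetical illustration names:

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit counter held as two 32-bit words, as on i386. */
struct counter64 {
    uint32_t lo;
    uint32_t hi;
};

/* Models "addl $1, lo; adcl $0, hi": add 1, then propagate the carry. */
void counter64_inc(struct counter64 *c)
{
    uint32_t old_lo;

    assert(c != 0);
    old_lo = c->lo;
    c->lo = old_lo + 1u;         /* addl $1 */
    c->hi += (c->lo < old_lo);   /* adcl $0: carry only if the low word wrapped */
}

/* Reassembles the two halves into the 64-bit value. */
uint64_t counter64_value(const struct counter64 *c)
{
    return ((uint64_t)c->hi << 32) | c->lo;
}
```

Any 64-bit value kept live across a loop therefore needs two of the few
general-purpose registers available on i386, or a stack slot.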


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (11 preceding siblings ...)
  2023-01-26 19:22 ` torvalds@linux-foundation.org
@ 2023-01-27  9:52 ` ubizjak at gmail dot com
  2023-01-27 10:47 ` ubizjak at gmail dot com
                   ` (33 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: ubizjak at gmail dot com @ 2023-01-27  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #13 from Uroš Bizjak <ubizjak at gmail dot com> ---
-fverbose-asm annotated assembly:

prep_compound_page:
        pushl   %ebp    #
        movl    %esp, %ebp      #,
        pushl   %edi    #
        movl    %eax, %edi      # tmp356, page
        pushl   %esi    #
        pushl   %ebx    #
        subl    $20, %esp       #,
        cmpl    $31, %edx       #, order
        movl    %edx, -28(%ebp) # order, %sfp
        ja      .L1483  #,
.L1464:
        movzbl  -28(%ebp), %ecx # %sfp, tmp365
        movl    $1, %ebx        #, tmp182
        sall    %cl, %ebx       # tmp365, nr_pages
        cmpl    $-1, (%edi)     #, MEM[(const struct page *)page_12(D)].flags
        je      .L1486  #,
        addl    $1, __gcov0.prep_compound_page+16       #, __gcov0.prep_compound_page[2]
        adcl    $0, __gcov0.prep_compound_page+20       #, __gcov0.prep_compound_page[2]
#APP
# 68 "./arch/x86/include/asm/bitops.h" 1
         btsl  $16,(%edi)       #, MEM[(volatile long int *)_19]
# 0 "" 2
#NO_APP
        addl    $1, __gcov0.prep_compound_page+48       #, __gcov0.prep_compound_page[6]
        adcl    $0, __gcov0.prep_compound_page+52       #, __gcov0.prep_compound_page[6]
        cmpl    $1, %ebx        #, nr_pages
        jle     .L1470  #,
        leal    1(%edi), %eax   #, _159
        movl    __gcov0.prep_compound_page+60, %edx     # __gcov0.prep_compound_page[7], ivtmp.1714
        movl    %eax, -24(%ebp) # _159, %sfp
        movl    __gcov0.prep_compound_page+56, %eax     # __gcov0.prep_compound_page[7], ivtmp.1714
        leal    40(%edi), %ecx  #, ivtmp.1720
        movl    %edi, -32(%ebp) # page, %sfp
        addl    $1, %eax        #, ivtmp.1714
        movl    %eax, -20(%ebp) # ivtmp.1714, %sfp
        adcl    $0, %edx        #, ivtmp.1714
        movl    __gcov0.prep_compound_page+56, %eax     # __gcov0.prep_compound_page[7], tmp228
        movl    %edx, -16(%ebp) # ivtmp.1714, %sfp
        movl    __gcov0.prep_compound_page+60, %edx     # __gcov0.prep_compound_page[7],
        subl    $2, %ebx        #, tmp226
        xorl    %esi, %esi      #
        addl    $2, %eax        #, tmp228
        adcl    $0, %edx        #,
        addl    %eax, %ebx      # tmp228, tmp227
        movl    -20(%ebp), %eax # %sfp, ivtmp.1714
        adcl    %edx, %esi      #,
        movl    -16(%ebp), %edx # %sfp, ivtmp.1714
        movl    %esi, %edi      # _45, _45
        movl    %ebx, %esi      # _45, _45
        .p2align 4
        .p2align 3
.L1469:
        movl    %eax, __gcov0.prep_compound_page+56     # ivtmp.1714, __gcov0.prep_compound_page[7]
        movl    -24(%ebp), %ebx # %sfp, _159
        addl    $1, %eax        #, ivtmp.1714
        movl    %edx, __gcov0.prep_compound_page+60     # ivtmp.1714, __gcov0.prep_compound_page[7]
        adcl    $0, %edx        #, ivtmp.1714
        addl    $40, %ecx       #, ivtmp.1720
        movl    $1024, -28(%ecx)        #, MEM <struct address_space *> [(union  *)p_15 + 12B]
        movl    %ebx, -36(%ecx) # _159, MEM[(volatile long unsigned int *)p_15 + 4B]
        movl    %edi, %ebx      # _45, tmp230
        xorl    %edx, %ebx      # ivtmp.1714, tmp230
        movl    %ebx, -20(%ebp) # tmp230, %sfp
        movl    %esi, %ebx      # _45, tmp231
        xorl    %eax, %ebx      # ivtmp.1714, tmp231
        orl     -20(%ebp), %ebx # %sfp, tmp358
        jne     .L1469  #,
        movl    -32(%ebp), %edi # %sfp, page
.L1470:
        movb    $1, 48(%edi)    #, MEM[(struct page *)page_12(D) + 40B].D.13727.D.13705.compound_dtor
        ...


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (12 preceding siblings ...)
  2023-01-27  9:52 ` ubizjak at gmail dot com
@ 2023-01-27 10:47 ` ubizjak at gmail dot com
  2023-01-27 10:56 ` ubizjak at gmail dot com
                   ` (32 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: ubizjak at gmail dot com @ 2023-01-27 10:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #14 from Uroš Bizjak <ubizjak at gmail dot com> ---
The loop is actually pretty simple, please see the interpretation below

-24(%ebp): some value previously saved to stack frame
%ecx: address to write to
%eax/%edx: loop iterator
%edi/%esi: termination value

.L1469:
        movl    %eax, __gcov0.prep_compound_page+56
        movl    -24(%ebp), %ebx
        addl    $1, %eax           <- increase loop iterator (low word)...
        movl    %edx, __gcov0.prep_compound_page+60
        adcl    $0, %edx           <- ... and high word
        addl    $40, %ecx          <- increase address pointer
        movl    $1024, -28(%ecx)   <- write to address
        movl    %ebx, -36(%ecx)    <

        movl    %edi, %ebx         <- loop exit test: %eax/%edx == %edi/%esi
        xorl    %edx, %ebx         <
        movl    %ebx, -20(%ebp)    <
        movl    %esi, %ebx         <
        xorl    %eax, %ebx         <
        orl     -20(%ebp), %ebx    <
        jne     .L1469             <

So, are the loop iterator and termination value correct at the beginning of the
loop?


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (13 preceding siblings ...)
  2023-01-27 10:47 ` ubizjak at gmail dot com
@ 2023-01-27 10:56 ` ubizjak at gmail dot com
  2023-01-27 12:23 ` ubizjak at gmail dot com
                   ` (31 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: ubizjak at gmail dot com @ 2023-01-27 10:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #15 from Uroš Bizjak <ubizjak at gmail dot com> ---
Sorry, %esi/%edi is the correct order.

-24(%ebp): some value previously saved to stack frame
%ecx: address to write to
%eax/%edx: loop iterator
%esi/%edi: termination value

.L1469:
        movl    %eax, __gcov0.prep_compound_page+56
        movl    -24(%ebp), %ebx
        addl    $1, %eax           <- increase loop iterator (low word)...
        movl    %edx, __gcov0.prep_compound_page+60
        adcl    $0, %edx           <- ... and high word
        addl    $40, %ecx          <- increase address pointer
        movl    $1024, -28(%ecx)   <- write to address
        movl    %ebx, -36(%ecx)    <

        movl    %edi, %ebx         <- loop exit test: %eax/%edx == %esi/%edi
        xorl    %edx, %ebx         <
        movl    %ebx, -20(%ebp)    <
        movl    %esi, %ebx         <
        xorl    %eax, %ebx         <
        orl     -20(%ebp), %ebx    <
        jne     .L1469             <
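
The exit test at the end of the listing is a 64-bit inequality built from
32-bit operations: XOR each half of the iterator with the matching half of the
termination value, OR the two results, and loop while the result is nonzero.
A hypothetical C model of that sequence (`loop_continues()` and its parameter
names are illustration only):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns nonzero while the 64-bit iterator (it_lo/it_hi, i.e. %eax/%edx)
 * differs from the termination value (end_lo/end_hi, i.e. %esi/%edi),
 * which is the case in which "jne .L1469" is taken.
 */
int loop_continues(uint32_t it_lo, uint32_t it_hi,
                   uint32_t end_lo, uint32_t end_hi)
{
    uint32_t hi_diff = end_hi ^ it_hi;   /* movl %edi,%ebx; xorl %edx,%ebx */
    uint32_t lo_diff = end_lo ^ it_lo;   /* movl %esi,%ebx; xorl %eax,%ebx */
    int result = (lo_diff | hi_diff) != 0;   /* orl -20(%ebp),%ebx; jne */

    /* Sanity check: same answer as a plain 64-bit comparison. */
    assert(result == ((((uint64_t)it_hi << 32) | it_lo) !=
                      (((uint64_t)end_hi << 32) | end_lo)));
    return result;
}
```

Because the test is "not equal" rather than "less than", the loop only stops
when the iterator hits the termination value exactly.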


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (14 preceding siblings ...)
  2023-01-27 10:56 ` ubizjak at gmail dot com
@ 2023-01-27 12:23 ` ubizjak at gmail dot com
  2023-01-27 12:29 ` ubizjak at gmail dot com
                   ` (30 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: ubizjak at gmail dot com @ 2023-01-27 12:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #16 from Uroš Bizjak <ubizjak at gmail dot com> ---
        addl    $1, __gcov0.prep_compound_page+48
        adcl    $0, __gcov0.prep_compound_page+52
        cmpl    $1, %ebx
        jle     .L1470
        leal    1(%edi), %eax
        movl    __gcov0.prep_compound_page+60, %edx  <- load %eax/%edx from $
        movl    %eax, -24(%ebp)
        movl    __gcov0.prep_compound_page+56, %eax
        leal    40(%edi), %ecx
        movl    %edi, -32(%ebp)
        addl    $1, %eax         <- add $1 to %eax/%edx
        movl    %eax, -20(%ebp)  <- save to stack frame loc 20
        adcl    $0, %edx
        movl    __gcov0.prep_compound_page+56, %eax  <- load again %eax/%edx from $
        movl    %edx, -16(%ebp)
        movl    __gcov0.prep_compound_page+60, %edx
        subl    $2, %ebx         <- subtract $2 from %ebx, zext to %ebx/%esi
        xorl    %esi, %esi
        addl    $2, %eax         <- add $2 to %eax/%edx
        adcl    $0, %edx
        addl    %eax, %ebx       <- add %eax/%edx to %ebx/%esi
        movl    -20(%ebp), %eax  <- load %eax/%edx from stack frame loc 20
        adcl    %edx, %esi
        movl    -16(%ebp), %edx
        movl    %esi, %edi       <- move %ebx/%esi to %esi/%edi
        movl    %ebx, %esi


* [Bug target/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (15 preceding siblings ...)
  2023-01-27 12:23 ` ubizjak at gmail dot com
@ 2023-01-27 12:29 ` ubizjak at gmail dot com
  2023-01-27 12:31 ` [Bug tree-optimization/108552] " ubizjak at gmail dot com
                   ` (29 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: ubizjak at gmail dot com @ 2023-01-27 12:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #17 from Uroš Bizjak <ubizjak at gmail dot com> ---
The assembly is just mirroring what tree optimizers prepare:

  pretmp_94 = __gcov0.prep_compound_page[7];
  _179 = pretmp_94 + 1;
  ivtmp.1725_211 = (unsigned long long) _179;

  ...


  <bb 9> [local count: 955630225]:
  # ivtmp.1725_77 = PHI <ivtmp.1725_69(9), ivtmp.1725_211(8)>
  # ivtmp.1730_178 = PHI <ivtmp.1730_168(9), ivtmp.1730_157(8)>
  p_16 = (struct page *) ivtmp.1730_178;
  MEM <struct address_space *> [(union  *)p_16 + 12B] = 1024B;
  MEM[(volatile long unsigned int *)p_16 + 4B] ={v} _95;
  PROF_edge_counter_46 = (long long int) ivtmp.1725_77;
  __gcov0.prep_compound_page[7] = PROF_edge_counter_46;
  ivtmp.1725_69 = ivtmp.1725_77 + 1;
  ivtmp.1730_168 = ivtmp.1730_178 + 40;
  if (_19 != ivtmp.1725_69)
    goto <bb 9>; [89.00%]
  else
    goto <bb 7>; [11.00%]


So, loop variable is initialized to __gcov0.prep_compound_page[7] ???


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (16 preceding siblings ...)
  2023-01-27 12:29 ` ubizjak at gmail dot com
@ 2023-01-27 12:31 ` ubizjak at gmail dot com
  2023-01-27 12:51 ` ubizjak at gmail dot com
                   ` (28 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: ubizjak at gmail dot com @ 2023-01-27 12:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|2023-01-26 00:00:00         |2023-01-27
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #18 from Uroš Bizjak <ubizjak at gmail dot com> ---
Confirmed, the trail goes into the tree optimization area.

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: ubizjak at gmail dot com @ 2023-01-27 12:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #19 from Uroš Bizjak <ubizjak at gmail dot com> ---
Some further analysis:

  pretmp_94 = __gcov0.prep_compound_page[7];  <--
  _179 = pretmp_94 + 1;                       <--
  ivtmp.1725_211 = (unsigned long long) _179;

  _135 = (unsigned int) nr_pages_11;
  _134 = _135 + 4294967294;             <--
  _132 = (unsigned long long) _134;     <--
  _89 = (unsigned long long) pretmp_94; <--
  _76 = _89 + 2;       <-
  _19 = _76 + _132;    <-


The loop IV is:

  # ivtmp.1725_77 = PHI <ivtmp.1725_69(9), ivtmp.1725_211(8)>
  ...
  ivtmp.1725_69 = ivtmp.1725_77 + 1;

And the loop exit condition is:

  if (_19 != ivtmp.1725_69)


So, both ivtmp and _19 are calculated from the value at
__gcov0.prep_compound_page. But as shown in Comment #15, we have two separate
reads from the location; the compiler assumes that the value there is
invariant, which is probably not the case.
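
To make the arithmetic concrete, here is a minimal C sketch (not GCC output;
the function name and parameters are made up for illustration) that models
the trip count when the start value and the exit bound come from two separate
reads of the counter.  It assumes nr_pages >= 2, as the loop is only entered
in that case, and that the second read is not smaller than the first.

```c
#include <stdint.h>

/* Hypothetical model of the GIMPLE above: first_read stands for the
   load that feeds ivtmp.1725_211, second_read for the load that feeds
   _19.  Returns how many times the loop body executes.  */
uint64_t
model_trip_count (uint64_t first_read, uint64_t second_read,
                  uint32_t nr_pages)
{
  uint64_t iv = first_read + 1;                     /* ivtmp.1725_211 */
  uint32_t n = nr_pages + 4294967294u;              /* _134: nr_pages - 2 */
  uint64_t bound = second_read + 2 + (uint64_t) n;  /* _19 */
  uint64_t trips = 0;

  /* Same shape as <bb 9>: body, increment, then the exit test.  */
  do
    {
      trips++;
      iv++;                                         /* ivtmp.1725_69 */
    }
  while (bound != iv);

  return trips;
}
```

With identical reads, model_trip_count (100, 100, 8) gives 7, which matches
page[1]...page[7].  If the counter is bumped by 3 between the two reads,
model_trip_count (100, 103, 8) gives 10, so the body runs three times too
often.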

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: ubizjak at gmail dot com @ 2023-01-27 12:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #20 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #19)
> __gcov0.prep_compound_page. But as shown in Comment #15, we have two

Comment #16, actually.

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: jakub at gcc dot gnu.org @ 2023-01-27 13:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org

--- Comment #21 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I'd say using -fprofile-update=single (the default unless -pthread is used)
is wrong for the kernel; it can't work correctly in the multi-threaded case,
which is what the kernel is.
In -fprofile-update=single mode (as opposed to -fprofile-update=atomic) the
updates to the counters aren't atomic and the arrays aren't marked volatile or
anything similar; it is really meant for single-threaded coverage.

Anyway, before ivopts we have:
  pretmp_93 = __gcov0.prep_compound_page[7];

  <bb 9> [local count: 955630225]:
  # i_66 = PHI <i_17(26), 1(8)>
  # prephitmp_92 = PHI <PROF_edge_counter_46(26), pretmp_93(8)>
  i.144_1 = (unsigned int) i_66;
  _2 = i.144_1 * 40;
  p_15 = page_12(D) + _2;
  p_15->D.13727.D.13672.mapping = 1024B;
  MEM[(volatile long unsigned int *)p_15 + 4B] ={v} _159;
  i_17 = i_66 + 1;
  PROF_edge_counter_46 = prephitmp_92 + 1;
  __gcov0.prep_compound_page[7] = PROF_edge_counter_46;
  if (nr_pages_11 > i_17)
    goto <bb 26>; [89.00%]
  else
    goto <bb 7>; [11.00%]

  <bb 26> [local count: 850510901]:
  goto <bb 9>; [100.00%]
which given the non-volatile non-atomically updated arrays is to be expected,
instead of re-reading __gcov0.prep_compound_page[7] in every iteration it just
reads it once and stores in each iteration, which is possible because another
thread changing it concurrently would mean a data race anyway.
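
As a rough source-level picture of that transformation, here is a minimal C
sketch.  It is hand-written, not compiler output, and edge_counter and both
function names are hypothetical stand-ins for a __gcov0.* slot and the
instrumented loop.

```c
#include <stdint.h>

/* Hypothetical stand-in for one slot of a __gcov0.* array.  */
int64_t edge_counter;

/* What the instrumentation conceptually asks for: a plain, non-atomic
   load/add/store of the counter on every iteration.  */
void
loop_reload_each_iteration (int nr_pages)
{
  for (int i = 1; i < nr_pages; i++)
    edge_counter = edge_counter + 1;
}

/* What the optimizers may turn it into when the counter is neither
   volatile nor atomic: one load before the loop, then only stores
   inside it.  */
void
loop_read_once (int nr_pages)
{
  int64_t tmp = edge_counter;           /* like pretmp_93 */

  for (int i = 1; i < nr_pages; i++)
    {
      tmp = tmp + 1;                    /* like PROF_edge_counter_46 */
      edge_counter = tmp;
    }
}
```

In a single thread both functions leave the same value in edge_counter; they
only differ if something else writes the counter while the loop is running.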

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: ubizjak at gmail dot com @ 2023-01-27 13:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #22 from Uroš Bizjak <ubizjak at gmail dot com> ---
BTW: It is the reload pass that duplicates the read from
__gcov0.prep_compound_page[7].

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: jakub at gcc dot gnu.org @ 2023-01-27 14:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #23 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
We could mark the __gcov* artificial vars with some flag (unless they are
already) and try to avoid using IVs loaded from those in IVOPTs, but as can be
seen above, the chosen IV really isn't that memory but an SSA_NAME that is
initialized with something loaded from that and in other cases it could be even
not that simple (say multiple copies of the same loop in sequence with a load
from __gcov* only at the beginning and then the loops just using a PRE IV
temporary for all the stores).  I bet the RA does it from similar reasons, var
isn't volatile, updated many times without any atomic barriers in between, so
if some other thread modifies it in between, it would be a data race.

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: rguenth at gcc dot gnu.org @ 2023-01-27 14:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #24 from Richard Biener <rguenth at gcc dot gnu.org> ---
Does

diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index 0dd47910f97..f780c0ce08c 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -2241,7 +2241,7 @@ may_be_nonaddressable_p (tree expr)
     {
     case VAR_DECL:
       /* Check if it's a register variable.  */
-      return DECL_HARD_REGISTER (expr);
+      return DECL_HARD_REGISTER (expr) || DECL_NONALIASED (expr);

     case TARGET_MEM_REF:
       /* TARGET_MEM_REFs are translated directly to valid MEMs on the

fix it?

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: rguenth at gcc dot gnu.org @ 2023-01-27 15:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #25 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #24)
> Does
> 
> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> index 0dd47910f97..f780c0ce08c 100644
> --- a/gcc/tree-ssa-loop-ivopts.cc
> +++ b/gcc/tree-ssa-loop-ivopts.cc
> @@ -2241,7 +2241,7 @@ may_be_nonaddressable_p (tree expr)
>      {
>      case VAR_DECL:
>        /* Check if it's a register variable.  */
> -      return DECL_HARD_REGISTER (expr);
> +      return DECL_HARD_REGISTER (expr) || DECL_NONALIASED (expr);
>  
>      case TARGET_MEM_REF:
>        /* TARGET_MEM_REFs are translated directly to valid MEMs on the
> 
> fix it?

Ah, reading more comments, no - it probably doesn't.  Jakub correctly says
that there seems to be a data race necessary to trigger this, so it doesn't
seem to be a GCC issue?

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: rguenth at gcc dot gnu.org @ 2023-01-27 15:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #26 from Richard Biener <rguenth at gcc dot gnu.org> ---
And yes, to IV optimization the gcov counter for the loop body is just another
IV candidate that can be used, and in this case it allows to elide the
otherwise unused original IV.

Now, in principle we should have applied store-motion and not only PRE which
would have avoided the issue, not tricking the RA into reloading the value
from where we store it in the loop, but the kernel uses -fno-tree-loop-im,
preventing that.  If you enable that you'd get

  <bb 7> [local count: 105119324]:
  __gcov0.prep_compound_page_I_lsm.1755_4 = __gcov0.prep_compound_page[7];
  _92 = (long unsigned int) page_12(D);
  _57 = _92 + 1;
  _119 = page_12(D) + 40;
  ivtmp.1762_136 = (unsigned int) _119;

  <bb 8> [local count: 955630225]:
  # i_66 = PHI <i_17(8), 1(7)>
  # ivtmp.1762_6 = PHI <ivtmp.1762_46(8), ivtmp.1762_136(7)>
  p_15 = (struct page *) ivtmp.1762_6;
  MEM <struct address_space *> [(union  *)p_15 + 12B] = 1024B;
  MEM[(volatile long unsigned int *)p_15 + 4B] ={v} _57;
  i_17 = i_66 + 1;
  ivtmp.1762_46 = ivtmp.1762_6 + 40;
  if (nr_pages_11 != i_17)
    goto <bb 8>; [89.00%]
  else
    goto <bb 9>; [11.00%]

  <bb 9> [local count: 105119324]:
  _73 = (unsigned int) nr_pages_11;
  _163 = _73 + 4294967294;
  _159 = (long long int) _163;
  _1 = __gcov0.prep_compound_page_I_lsm.1755_4 + 1;
  PROF_edge_counter_74 = _1 + _159;
  __gcov0.prep_compound_page[7] = PROF_edge_counter_74;

which is the desired optimization, handling the counter in the loop like
an induction variable instead of going through memory.
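
Written back as C, the store-motion form looks roughly like the sketch below.
This is a hand-made illustration rather than a decompilation; edge_counter and
the function name are hypothetical, and the per-page stores are reduced to a
simple count of body executions.

```c
#include <stdint.h>

/* Hypothetical stand-in for __gcov0.prep_compound_page[7].  */
int64_t edge_counter;

/* Returns how many times the loop body ran.  */
int64_t
loop_with_store_motion (int nr_pages)
{
  int64_t body_runs = 0;

  if (nr_pages <= 1)
    return 0;                           /* loop not entered */

  int64_t saved = edge_counter;         /* like the ..._I_lsm temporary */
  int i = 1;

  do
    {
      body_runs++;                      /* stands in for the page stores */
      i++;
    }
  while (nr_pages != i);

  /* Single store after the loop: saved + 1 + (nr_pages - 2).  */
  uint32_t n = (uint32_t) nr_pages + 4294967294u;
  edge_counter = saved + 1 + (int64_t) n;

  return body_runs;
}
```

The loop bound here depends only on nr_pages, so a concurrent write to the
counter can at worst lose a count; it cannot change how many pages are
touched.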

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: jakub at gcc dot gnu.org @ 2023-01-27 15:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #27 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #25)
> Ah, reading more comments, no - it probably doesn't.  Jakub correctly says
> that there seems to be a data race necessary to trigger this, so it doesn't
> seem to be a GCC issue?

Well, we could in -fprofile-update=single (or perhaps in a new single-like
mode) mark the gcov artificial vars volatile or with some flag that would at
least cause reload not to reread values from memory.  The profiling would
still be racy, but slightly less so (as it would e.g. prevent the compiler
from avoiding the rereads), at the expense of somewhat slower code (more so
with volatile, less so with a special flag).

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: rguenth at gcc dot gnu.org @ 2023-01-27 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #28 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #27)
> (In reply to Richard Biener from comment #25)
> > Ah, reading more comments, no - it probably doesn't.  Jakub correctly says
> > that there seems to be a data race necessary to trigger this, so it doesn't
> > seem to be a GCC issue?
> 
> Well, we could in -fprofile-update=single (or perhaps in a new single-like
> mode) mark the gcov artificial vars volatile or with some flag that would at
> least cause reload not to reread values from memory.  The profiling would be
> still racy, but at the expense of somewhat slower code (with volatile more,
> with special flag less so) slightly less so (as it would e.g. prevent the
> compiler from avoiding the rereads).

-fprofile-update=volatile?  Huh, sure, we could do that.

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: jakub at gcc dot gnu.org @ 2023-01-27 15:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #29 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #27)
> Well, we could in -fprofile-update=single (or perhaps in a new single-like
> mode) mark the gcov artificial vars volatile or with some flag that would at
> least cause reload not to reread values from memory.  The profiling would be
> still racy, but at the expense of somewhat slower code (with volatile more,
> with special flag less so) slightly less so (as it would e.g. prevent the
> compiler from avoiding the rereads).

Though volatile would then prevent, e.g. on x86, using inc directly on the
memory location, which is what the -fprofile-update=atomic mode uses (with an
additional lock prefix).
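
For reference, a volatile counter at the source level would look like the
minimal C sketch below (volatile_counter and the function name are
hypothetical, and this is not how GCC declares its gcov arrays today).  Every
iteration has to perform its own load and its own store, so the value cannot
be cached in a register across iterations; whether the compiler then emits a
single memory-destination instruction for the pair is a separate code
generation question.

```c
#include <stdint.h>

/* Hypothetical volatile stand-in for a gcov counter.  */
volatile int64_t volatile_counter;

void
loop_volatile_counter (int nr_pages)
{
  for (int i = 1; i < nr_pages; i++)
    /* One volatile load and one volatile store per iteration.  */
    volatile_counter = volatile_counter + 1;
}
```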

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: torvalds@linux-foundation.org @ 2023-01-27 17:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #30 from Linus Torvalds <torvalds@linux-foundation.org> ---
(In reply to Richard Biener from comment #26)
> And yes, to IV optimization the gcov counter for the loop body is just
> another IV candidate that can be used, and in this case it allows to elide
> the otherwise
> unused original IV.

Ouch.

So we really don't mind the data race - the gcov data is obviously not primary
- but I don't think anybody expected the data race on the gcov data that isn't
"semantically visible" to then affect actual semantics.

And yeah, atomic updates would be too expensive even on 64-bit architectures,
so we pretty much *depend* on the data race being there. And on 32-bit
architectures (at least i386), atomic 64-bit ones go from "expensive" to
"ludicrously complicated" (ie to get a 64-bit atomic update you'd need to start
doing cmpxchg8b loops or something).
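
To illustrate what such a loop amounts to, here is a minimal C sketch of a
64-bit atomic add built from a compare-and-exchange retry loop, using GCC's
__atomic builtins.  The function name is made up, and how the builtins are
lowered on i386 (cmpxchg8b, a libatomic call, or something else) is up to the
compiler and target options, so treat this as a picture of the cost and not
as what -fprofile-update=atomic actually emits.

```c
#include <stdint.h>

/* Hypothetical helper: atomically add DELTA to *COUNTER by retrying a
   compare-and-exchange until no other writer got in between.  */
void
counter_add_cas (int64_t *counter, int64_t delta)
{
  int64_t old = __atomic_load_n (counter, __ATOMIC_RELAXED);

  /* On failure the builtin refreshes OLD with the current value, so
     OLD + DELTA is recomputed on every retry.  */
  while (!__atomic_compare_exchange_n (counter, &old, old + delta,
                                       1 /* weak */,
                                       __ATOMIC_RELAXED, __ATOMIC_RELAXED))
    ;
}
```

Compared with a plain "counter = counter + 1", every update is a load plus at
least one locked read-modify-write, inside a loop that can retry under
contention.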

So I think the data race is not just what we expected, it's fundamental. Just
the "mix it with semantics" ends up being less than optimal. 

Having the gcov data be treated as 'volatile' would be one option, but would
probably cause horrendous code generation issues, as Jakub says.

Although I have several times hit that "I want to just update a volatile in
memory, I wish gcc would just be happy to combine a 'read-modify-update' to a
single instruction". So in a perfect world, that would be fixed too.

I guess from a kernel perspective, we might need to really document that GCOV
has these issues, and you can't use it for any real work. We have just been
lucky this hasn't hit us (admittedly because it's fairly odd that an expected
end gcov value would end up being used in that secondary way as a loop
variable).

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: torvalds@linux-foundation.org @ 2023-01-27 17:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #31 from Linus Torvalds <torvalds@linux-foundation.org> ---
(In reply to Richard Biener from comment #26)
> 
> Now, in principle we should have applied store-motion and not only PRE which
> would have avoided the issue, not tricking the RA into reloading the value
> from where we store it in the loop, but the kernel uses -fno-tree-loop-im,
> preventing that.  If you enable that you'd get

Note that we use -fno-tree-loop-im only for the GCOV case, and because of
another problem with code generation with gcov. See

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702

and the fix for the excessive stack use was to disable that compiler option.
See

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c87bf431448b404a6ef5fbabd74c0e3e42157a7f

for the kernel commit message.

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: torvalds@linux-foundation.org @ 2023-01-27 17:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #32 from Linus Torvalds <torvalds@linux-foundation.org> ---
Btw, where does the -fprofile-update=single/atomic come from?

The kernel just uses 

  CFLAGS_GCOV    := -fprofile-arcs -ftest-coverage

for this case. So I guess 'single' is just the default value?

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: jakub at gcc dot gnu.org @ 2023-01-27 17:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #33 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
It is the default unless -pthread is specified:
 %{fprofile-arcs|fprofile-generate*|coverage:\
   %{!fprofile-update=single:\
     %{pthread:-fprofile-update=prefer-atomic}}}
So, when one uses -fprofile-arcs, -fprofile-generate* or -coverage together
with -pthread and doesn't use -fprofile-update=single, then
-fprofile-update=prefer-atomic is added (which is -fprofile-update=atomic if
the architecture supports it).

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: jakub at gcc dot gnu.org @ 2023-01-27 17:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #34 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems right now DECL_NONALIASED is only used on these coverage vars and on
Fortran caf tokens, so perhaps a quick workaround would be on the LRA side
never reread stuff from MEMs with VAR_P && DECL_NONALIASED MEM_EXPRs.  CCing
Vlad on that.

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: vmakarov at gcc dot gnu.org @ 2023-01-27 22:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #35 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #34)
> Seems right now DECL_NONALIASED is only used on these coverage vars and on
> Fortran caf tokens, so perhaps a quick workaround would be on the LRA side
> never reread stuff from MEMs with VAR_P && DECL_NONALIASED MEM_EXPRs.  CCing
> Vlad on that.

The following patch can do this:

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 7bffbc07ee2..d80a6a9f41d 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -515,6 +515,7 @@ get_equiv (rtx x)
 {
   int regno;
   rtx res;
+  tree expr;

   if (! REG_P (x) || (regno = REGNO (x)) < FIRST_PSEUDO_REGISTER
       || ! ira_reg_equiv[regno].defined_p
@@ -525,6 +526,10 @@ get_equiv (rtx x)
     {
       if (targetm.cannot_substitute_mem_equiv_p (res))
        return x;
+      if ((expr = MEM_EXPR (res)) != NULL
+         && (expr = get_base_address (expr)) != NULL
+         && VAR_P (expr) && DECL_NONALIASED (expr))
+       return x;
       return res;
     }
   if ((res = ira_reg_equiv[regno].constant) != NULL_RTX)

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: feng.tang at intel dot com @ 2023-01-28 14:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #36 from Tang, Feng <feng.tang at intel dot com> ---
(In reply to Vladimir Makarov from comment #35)
> (In reply to Jakub Jelinek from comment #34)
> > Seems right now DECL_NONALIASED is only used on these coverage vars and on
> > Fortran caf tokens, so perhaps a quick workaround would be on the LRA side
> > never reread stuff from MEMs with VAR_P && DECL_NONALIASED MEM_EXPRs.  CCing
> > Vlad on that.
> 
> The following patch can do this:
> 
> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc                

Thanks for the patch!

As the bug is against 11.3, I cloned the gcc git repository, checked out the
origin/releases/gcc-11 branch, and compiled gcc (TBH, it's my first time).

* built gcc-11, compiled the i386 kernel, and ran my local reproducer (QEMU
loop-booting that kernel); the error was reproduced at a rate of about once
every 20 boots.

* manually applied Vladimir's patch (the original patch seems to be against
the 'master' branch)

* rebuilt gcc, ran make clean and recompiled the i386 kernel; the error was
NOT seen in 350 runs so far

I will also attach the page_alloc.i and the objdump of prep_compound_page()
built with the newly patched gcc-11.

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: feng.tang at intel dot com @ 2023-01-28 14:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #37 from Tang, Feng <feng.tang at intel dot com> ---
Created attachment 54367
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54367&action=edit
page_alloc.i with patch in comment 35

* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
From: feng.tang at intel dot com @ 2023-01-28 14:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #38 from Tang, Feng <feng.tang at intel dot com> ---
Created attachment 54368
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54368&action=edit
objdump of  prep_compound_page() with patch in comment 35


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (37 preceding siblings ...)
  2023-01-28 14:29 ` feng.tang at intel dot com
@ 2023-01-28 23:40 ` hubicka at ucw dot cz
  2023-01-29 10:08 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: hubicka at ucw dot cz @ 2023-01-28 23:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #39 from Jan Hubicka <hubicka at ucw dot cz> ---
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552
> 
> --- Comment #35 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
> (In reply to Jakub Jelinek from comment #34)
> > Seems right now DECL_NONALIASED is only used on these coverage vars and on
> > Fortran caf tokens, so perhaps a quick workaround would be on the LRA side
> > never reread stuff from MEMs with VAR_P && DECL_NONALIASED MEM_EXPRs.  CCing
> > Vlad on that.
> 
> The following patch can do this:

Note that with threads we often get large profile mismatches when the
load/stores are hoisted out of the loop.  I.e. 

for (....)
  gcov_count++;

to

i = gcov_count
for (.....)
  i++
gcov_count = i

If the second loop is run in parallel a lot of increments may be lost.

I was wondering if we should not provide a flag to turn all counts
volatile.   That way we will still have race conditions on their updates
(and it would be cheaper than atomic) but we won't run into such wrong
code issues nor large profile mismatches.
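
To make the lost-increment effect above concrete, here is a minimal
single-threaded sketch in C. It does not use real threads; the interleaving
of two "workers" is fixed by hand so the result is deterministic. All names
(demo_counter, struct worker, run_hoisted_interleaved, ...) are hypothetical
and only illustrate the hoisted form shown in the pseudo code.

```c
#include <assert.h>

/* Hypothetical stand-in for a gcov counter. */
static long long demo_counter;

/* Each worker keeps a private copy, like the register 'i' after hoisting. */
struct worker { long long reg; };

/* i = gcov_count */
static void worker_load(struct worker *w) { w->reg = demo_counter; }

/* for (...) i++  (only the private copy is updated) */
static void worker_loop(struct worker *w, int iterations)
{
    assert(iterations >= 0);
    for (int k = 0; k < iterations; k++)
        w->reg++;
}

/* gcov_count = i */
static void worker_store(const struct worker *w) { demo_counter = w->reg; }

/* Two workers run the hoisted loop; both load before either stores. */
long long run_hoisted_interleaved(int iterations)
{
    struct worker a, b;

    demo_counter = 0;
    worker_load(&a);
    worker_load(&b);
    worker_loop(&a, iterations);
    worker_loop(&b, iterations);
    worker_store(&a);
    worker_store(&b);   /* overwrites a's result: a's increments are lost */
    return demo_counter;
}

/* Reference: the same work done on the counter itself, back to back. */
long long run_unhoisted_sequential(int iterations)
{
    assert(iterations >= 0);
    demo_counter = 0;
    for (int w = 0; w < 2; w++)
        for (int k = 0; k < iterations; k++)
            demo_counter++;
    return demo_counter;
}
```

With this fixed interleaving, one worker's whole loop is lost, whereas an
unhoisted racy update would typically lose only individual increments.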


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (38 preceding siblings ...)
  2023-01-28 23:40 ` hubicka at ucw dot cz
@ 2023-01-29 10:08 ` jakub at gcc dot gnu.org
  2023-01-30  7:05 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-01-29 10:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #40 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #39)
> I was wondering if we should not provide a flag to turn all counts
> volatile.   That way we will still have race conditions on their updates
> (and it would be cheaper than atomic) but we won't run into such wrong
> code issues nor large profile mismatches.

Yes, see above.  Or a mode in which we would just avoid hoisting and sinking
the gcov vars but keep them non-volatile.  Or both.
But I guess it would be nice to get Vlad's patch into trunk and release
branches for now (perhaps with an extra check for startswith "__gcov" on
DECL_NAME, so that we don't do it for the Fortran tokens).

As for the patch, just small nits, I think get_base_address returns always
non-NULL, so it could be
      if (tree expr = MEM_EXPR (res))
        {
          expr = get_base_address (expr);
          if (VAR_P (expr)
              && DECL_NONALIASED (expr)
              && DECL_NAME (expr))
            {
              const char *name = IDENTIFIER_POINTER (DECL_NAME (expr));
              /* Don't reread coverage counters from memory, if single
                 update model is used in threaded code, other threads
                 could change the counters concurrently.  See PR108552.  */
              if (startswith (name, "__gcov"))
                return x;
            }
        }


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (39 preceding siblings ...)
  2023-01-29 10:08 ` jakub at gcc dot gnu.org
@ 2023-01-30  7:05 ` rguenth at gcc dot gnu.org
  2023-01-30  7:09 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-30  7:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #41 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Linus Torvalds from comment #31)
> (In reply to Richard Biener from comment #26)
> > 
> > Now, in principle we should have applied store-motion and not only PRE which
> > would have avoided the issue, not tricking the RA into reloading the value
> > from where we store it in the loop, but the kernel uses -fno-tree-loop-im,
> > preventing that.  If you enable that you'd get
> 
> Note that we use -fno-tree-loop-im only for the GCOV case, and because of
> another problem with code generation with gcov. See
> 
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702
> 
> and the fix for the excessive stack use was to disable that compiler option.
> See
> 
>  
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/
> ?id=c87bf431448b404a6ef5fbabd74c0e3e42157a7f
> 
> for the kernel commit message.

Yes, I remember.  So another option would be to add -fno-tree-pre to that
mix which should avoid hoisting the load out of the loop.


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (40 preceding siblings ...)
  2023-01-30  7:05 ` rguenth at gcc dot gnu.org
@ 2023-01-30  7:09 ` rguenth at gcc dot gnu.org
  2023-01-30  8:06 ` torvalds@linux-foundation.org
                   ` (4 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-30  7:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #42 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #40)
> (In reply to Jan Hubicka from comment #39)
> > I was wondering if we should not provide a flag to turn all counts
> > volatile.   That way we will still have race conditions on their updates
> > (and it would be cheaper than atomic) but we won't run into such wrong
> > code issues nor large profile mismatches.
> 
> Yes, see above.  Or a mode in which we would just avoid hoisting and sinking
> the gcov vars but keep them non-volatile.  Or both.
> But I guess it would be nice to get Vlad's patch into trunk and release
> branches for now (perhaps with an extra check for startswith "__gcov" on
> DECL_NAME, so that we don't do it for the Fortran tokens).
> 
> As for the patch, just small nits, I think get_base_address returns always
> non-NULL, so it could be
>       if (tree expr = MEM_EXPR (res))
>         {
>           expr = get_base_address (expr);
>           if (VAR_P (expr)
>               && DECL_NONALIASED (expr)
>               && DECL_NAME (expr))
>             {
>               const char *name = IDENTIFIER_POINTER (DECL_NAME (expr));
>               /* Don't reread coverage counters from memory, if single
>                  update model is used in threaded code, other threads
>                  could change the counters concurrently.  See PR108552.  */
>               if (startswith (name, "__gcov"))
>                 return x;
>             }
>         }

Note that this isn't exactly reliable but a heuristic workaround since
MEM_EXPRs are optional and dropping them is valid (and done in some places).

I think if we want to avoid doing optimizations on gcov counters we should
make them volatile.  I suppose kernel folks would have a way to assess
any "catastrophic consequences" on optimization?  (I have a hard time
imagining them, sure that RMW will not allow add with memory operand,
but that's it?)


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (41 preceding siblings ...)
  2023-01-30  7:09 ` rguenth at gcc dot gnu.org
@ 2023-01-30  8:06 ` torvalds@linux-foundation.org
  2023-01-30  8:30 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: torvalds@linux-foundation.org @ 2023-01-30  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #43 from Linus Torvalds <torvalds@linux-foundation.org> ---
(In reply to Richard Biener from comment #42)
> 
> I think if we want to avoid doing optimizations on gcov counters we should
> make them volatile. 

Honestly, that sounds like the cleanest and safest option to me.

That said, with the gcov counters apparently also being 64-bit, I suspect it
will create some truly horrid code generation.

Presumably you'd end up getting a lot of load-load-add-adc-store-store
instruction patterns, which is not just six instructions when just two should
do - it also uses up two registers.

So while it sounds like the simplest and safest model, maybe it just makes code
generation too unbearably bad?

Maybe nobody who uses gcov would care. But I suspect it might be quite the big
performance regression, to the point where even people who thought they don't
care will go "that's a bit much".

I wonder if there is some half-way solution that would allow at least a
load-add-store-load-adc-store instruction sequence, which would then mean (a)
one less register wasted and (b) potentially allow some peephole optimization
turning it into just a addmem-adcmem instruction pair.

Turning just one of the memops into a volatile access might be enough (eg
just the load, but not the store?)


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (42 preceding siblings ...)
  2023-01-30  8:06 ` torvalds@linux-foundation.org
@ 2023-01-30  8:30 ` jakub at gcc dot gnu.org
  2023-01-30  8:44 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  46 siblings, 0 replies; 48+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-01-30  8:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #44 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess we should try and see.
For volatile,
--- gcc/coverage.cc     2023-01-02 09:32:37.078072992 +0100
+++ gcc/coverage.cc     2023-01-30 09:24:45.219951352 +0100
@@ -774,6 +774,7 @@ build_var (tree fn_decl, tree type, int
   TREE_STATIC (var) = 1;
   TREE_ADDRESSABLE (var) = 1;
   DECL_NONALIASED (var) = 1;
+  TREE_THIS_VOLATILE (var) = 1;
   SET_DECL_ALIGN (var, TYPE_ALIGN (type));

   return var;

would do it I think (but it should be conditional on new -fupdate-profile
modes, single-volatile and prefer-atomic-volatile or something similar).
Or perhaps insert asm volatile ("" : "+g" (tmp)); in between the load and store
and see how that compares to the volatile vars? Or adding another flag on the
gcov vars next to DECL_NONALIASED and just avoid specific optimizations on it
that somebody runs into (not as reliable but could be faster) - for now
hoisting in LIM and sinking.
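
As a rough user-level illustration of the empty-asm idea above (not the
coverage.cc change itself), the following C sketch puts an empty asm
statement between the load and the store of a counter update. It assumes
GCC or Clang extended asm; demo_counter, counter_inc_with_barrier and
count_iterations are hypothetical names. Whether this is enough to stop the
problematic hoisting or sinking, and how it compares with volatile counters,
is exactly what would need to be measured.

```c
#include <assert.h>

/* Hypothetical stand-in for a gcov counter. */
static long long demo_counter;

static inline void counter_inc_with_barrier(long long *counter)
{
    long long tmp = *counter;              /* load */

    /* Empty asm with tmp as an in/out operand: the compiler has to
       assume tmp may have been changed at this point. */
    __asm__ __volatile__("" : "+g"(tmp));

    *counter = tmp + 1;                    /* store */
}

long long count_iterations(int iterations)
{
    assert(iterations >= 0);
    demo_counter = 0;
    for (int i = 0; i < iterations; i++)
        counter_inc_with_barrier(&demo_counter);
    return demo_counter;
}
```

Comparing the assembly of count_iterations() with and without the asm
statement (for example with -O2 -m32) would be one way to judge the cost.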


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (43 preceding siblings ...)
  2023-01-30  8:30 ` jakub at gcc dot gnu.org
@ 2023-01-30  8:44 ` rguenth at gcc dot gnu.org
  2023-01-30  8:46 ` rguenther at suse dot de
  2023-01-30 18:54 ` torvalds@linux-foundation.org
  46 siblings, 0 replies; 48+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-30  8:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #45 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Linus Torvalds from comment #43)
> (In reply to Richard Biener from comment #42)
> > 
> > I think if we want to avoid doing optimizations on gcov counters we should
> > make them volatile. 
> 
> Honestly, that sounds like the cleanest and safest option to me.
> 
> That said, with the gcov counters apparently also being 64-bit, I suspect it
> will create some truly horrid code generation.
> 
> Presumably you'd end up getting a lot of load-load-add-adc-store-store
> instruction patterns, which is not just six instructions when just two
> should do - it also uses up two registers.
> 
> So while it sounds like the simplest and safest model, maybe it just makes
> code generation too unbearably bad?
> 
> Maybe nobody who uses gcov would care. But I suspect it might be quite the
> big performance regression, to the point where even people who thought they
> don't care will go "that's a bit much".
> 
> I wonder if there is some half-way solution that would allow at least a
> load-add-store-load-adc-store instruction sequence, which would then mean
> (a) one less register wasted and (b) potentially allow some peephole
> optimization turning it into just a addmem-adcmem instruction pair.
> 
> > Turning just one of the memops into a volatile access might be enough
> > (eg just the load, but not the store?)

It might be possible to introduce something like a __volatile_inc () which
implements a somewhat relaxed "volatile".

For user code

volatile long long x;
void foo () { x++; }

emitting inc + adc with memory operands is only "incorrect" in re-ordering
the subword reads with the subword writes, the reads and writes still happen
architecturally ...

That said, the coverage code could make this re-ordering explicit for
32bit with some conditional code (add-with-overflow) that eventually
combines back nicely even with volatile ...
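
One possible reading of the "add-with-overflow" idea, sketched at user level
in C: the 64-bit counter is modelled as two 32-bit volatile halves and the
carry is propagated explicitly, so the access order becomes
load-add-store on the low half followed by load-adc-store on the high half.
The struct layout and function names are hypothetical, and
__builtin_add_overflow is the GCC/Clang builtin, not something the coverage
code is known to use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit counter on a 32-bit target:
   low word first, as on little-endian i386. */
struct split_counter {
    volatile uint32_t lo;
    volatile uint32_t hi;
};

void split_counter_inc(struct split_counter *c)
{
    uint32_t lo;

    assert(c != 0);

    /* Low half: load, add 1, remember whether the 32-bit add wrapped. */
    int carry = __builtin_add_overflow(c->lo, 1u, &lo);
    c->lo = lo;                          /* store low half */

    /* High half: load, add the carry (0 or 1), store. */
    c->hi = c->hi + (uint32_t)carry;
}

uint64_t split_counter_read(const struct split_counter *c)
{
    return ((uint64_t)c->hi << 32) | c->lo;
}
```

Each half is still accessed through a volatile lvalue, so the per-word
loads and stores are not dropped; only the 64-bit grouping is given up.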


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (44 preceding siblings ...)
  2023-01-30  8:44 ` rguenth at gcc dot gnu.org
@ 2023-01-30  8:46 ` rguenther at suse dot de
  2023-01-30 18:54 ` torvalds@linux-foundation.org
  46 siblings, 0 replies; 48+ messages in thread
From: rguenther at suse dot de @ 2023-01-30  8:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #46 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 30 Jan 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552
> 
> --- Comment #44 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> I guess we should try and see.
> For volatile,
> --- gcc/coverage.cc     2023-01-02 09:32:37.078072992 +0100
> +++ gcc/coverage.cc     2023-01-30 09:24:45.219951352 +0100
> @@ -774,6 +774,7 @@ build_var (tree fn_decl, tree type, int
>    TREE_STATIC (var) = 1;
>    TREE_ADDRESSABLE (var) = 1;
>    DECL_NONALIASED (var) = 1;
> +  TREE_THIS_VOLATILE (var) = 1;
>    SET_DECL_ALIGN (var, TYPE_ALIGN (type));
> 
>    return var;
> 
> would do it I think (but it should be conditional on new -fupdate-profile
> modes, single-volatile and prefer-atomic-volatile or something similar).
> Or perhaps insert asm volatile ("" : "+g" (tmp)); in between the load and store
> and see how that compares to the volatile vars? Or adding another flag on the
> gcov vars next to DECL_NONALIASED and just avoid specific optimizations on it
> that somebody runs into (not as reliable but could be faster) - for now
> hoisting in LIM and sinking.

We could put an __attribute__(("semi atomic")) on them ... it all somewhat
feels like a hack.  We could make half of the update volatile only,
like only make the store volatile, not the read?


* [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
  2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
                   ` (45 preceding siblings ...)
  2023-01-30  8:46 ` rguenther at suse dot de
@ 2023-01-30 18:54 ` torvalds@linux-foundation.org
  46 siblings, 0 replies; 48+ messages in thread
From: torvalds@linux-foundation.org @ 2023-01-30 18:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #47 from Linus Torvalds <torvalds@linux-foundation.org> ---
(In reply to Richard Biener from comment #45)
> For user code
> 
> volatile long long x;
> void foo () { x++; }
> 
> emitting inc + adc with memory operands is only "incorrect" in re-ordering
> the subword reads with the subword writes, the reads and writes still happen
> architecturally ...

But the thing is, the ordering *is* very much defined for volatile accesses.
"volatile" is not a "the access happens architecturally", it's very much
defined "the access is _visible_ architecturally, and ordering matters".

So with the "volatile long long x" code, I think any language lawyer will say
that generating it as

    add $1,mem
    adc $0,mem+4

is unquestionably a compiler bug.

It may be what the user *wants* (and it's obviously what the gcov code would
like), but it's simply not a valid volatile access to 'x'.

So the gcov code would really want something slightly weaker than 'volatile'.
Something that just does 'guaranteed access' and disallows combining stores or
doing re-loads, without the ordering constraints.

Side note: we would use such a "weaker volatile" in the kernel too. We already
have that concept in the form of READ_ONCE() and WRITE_ONCE(), and it uses
"volatile" internally, and it works fine for us. But if we had another way to
just describe "guaranteed access", that could be useful.

I suspect the memory ordering primitives would be a better model than
'volatile' for this.  What are the rules for doing it as load/store with
'memory_order_relaxed'? That should at least guarantee that the load is never
re-done (getting two different values for anybody who does a load), but maybe
the stores can be combined?

And gcc should already have all that infrastructure in place. Hmm?
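
For reference, a load/store pair with 'memory_order_relaxed' looks like this
in C11. This is only a sketch of the user-level form, with hypothetical
names (demo_counter, relaxed_counter_inc, relaxed_counter_run); it is not an
atomic read-modify-write, so concurrent increments can still be lost, and on
some 32-bit targets a 64-bit atomic object may need extra instructions or
library support.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a gcov counter. */
static _Atomic long long demo_counter;

/* Separate relaxed load and relaxed store: each access is performed
   once, but the pair as a whole is not atomic. */
void relaxed_counter_inc(void)
{
    long long v = atomic_load_explicit(&demo_counter, memory_order_relaxed);
    atomic_store_explicit(&demo_counter, v + 1, memory_order_relaxed);
}

long long relaxed_counter_run(int iterations)
{
    assert(iterations >= 0);
    atomic_store_explicit(&demo_counter, 0, memory_order_relaxed);
    for (int i = 0; i < iterations; i++)
        relaxed_counter_inc();
    return atomic_load_explicit(&demo_counter, memory_order_relaxed);
}
```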


end of thread, other threads:[~2023-01-30 18:54 UTC | newest]

Thread overview: 48+ messages
2023-01-26  8:00 [Bug c/108552] New: Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled feng.tang at intel dot com
2023-01-26  8:01 ` [Bug c/108552] " feng.tang at intel dot com
2023-01-26  8:02 ` [Bug target/108552] " pinskia at gcc dot gnu.org
2023-01-26  8:05 ` pinskia at gcc dot gnu.org
2023-01-26  8:13 ` feng.tang at intel dot com
2023-01-26  8:19 ` pinskia at gcc dot gnu.org
2023-01-26 11:35 ` feng.tang at intel dot com
2023-01-26 11:37 ` feng.tang at intel dot com
2023-01-26 11:39 ` feng.tang at intel dot com
2023-01-26 16:03 ` feng.tang at intel dot com
2023-01-26 16:07 ` feng.tang at intel dot com
2023-01-26 19:06 ` pinskia at gcc dot gnu.org
2023-01-26 19:22 ` torvalds@linux-foundation.org
2023-01-27  9:52 ` ubizjak at gmail dot com
2023-01-27 10:47 ` ubizjak at gmail dot com
2023-01-27 10:56 ` ubizjak at gmail dot com
2023-01-27 12:23 ` ubizjak at gmail dot com
2023-01-27 12:29 ` ubizjak at gmail dot com
2023-01-27 12:31 ` [Bug tree-optimization/108552] " ubizjak at gmail dot com
2023-01-27 12:51 ` ubizjak at gmail dot com
2023-01-27 12:52 ` ubizjak at gmail dot com
2023-01-27 13:17 ` jakub at gcc dot gnu.org
2023-01-27 13:40 ` ubizjak at gmail dot com
2023-01-27 14:14 ` jakub at gcc dot gnu.org
2023-01-27 14:59 ` rguenth at gcc dot gnu.org
2023-01-27 15:01 ` rguenth at gcc dot gnu.org
2023-01-27 15:13 ` rguenth at gcc dot gnu.org
2023-01-27 15:15 ` jakub at gcc dot gnu.org
2023-01-27 15:18 ` rguenth at gcc dot gnu.org
2023-01-27 15:20 ` jakub at gcc dot gnu.org
2023-01-27 17:00 ` torvalds@linux-foundation.org
2023-01-27 17:05 ` torvalds@linux-foundation.org
2023-01-27 17:15 ` torvalds@linux-foundation.org
2023-01-27 17:19 ` jakub at gcc dot gnu.org
2023-01-27 17:29 ` jakub at gcc dot gnu.org
2023-01-27 22:30 ` vmakarov at gcc dot gnu.org
2023-01-28 14:20 ` feng.tang at intel dot com
2023-01-28 14:27 ` feng.tang at intel dot com
2023-01-28 14:29 ` feng.tang at intel dot com
2023-01-28 23:40 ` hubicka at ucw dot cz
2023-01-29 10:08 ` jakub at gcc dot gnu.org
2023-01-30  7:05 ` rguenth at gcc dot gnu.org
2023-01-30  7:09 ` rguenth at gcc dot gnu.org
2023-01-30  8:06 ` torvalds@linux-foundation.org
2023-01-30  8:30 ` jakub at gcc dot gnu.org
2023-01-30  8:44 ` rguenth at gcc dot gnu.org
2023-01-30  8:46 ` rguenther at suse dot de
2023-01-30 18:54 ` torvalds@linux-foundation.org
