public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded
@ 2024-02-14 17:19 seanjc at google dot com
  2024-02-14 18:00 ` [Bug middle-end/113921] " jakub at gcc dot gnu.org
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: seanjc at google dot com @ 2024-02-14 17:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

            Bug ID: 113921
           Summary: Output register of an "asm volatile goto" is
                    incorrectly clobbered/discarded
           Product: gcc
           Version: 11.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: seanjc at google dot com
                CC: jakub at redhat dot com, ndesaulniers at google dot com,
                    torvalds@linux-foundation.org, ubizjak at gmail dot com
  Target Milestone: ---

Created attachment 57428
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57428&action=edit
Intermediate output of the miscompiled file

gcc-11 appears to have a bug that results in gcc incorrectly clobbering the
output register of an "asm volatile goto".

The failing asm blob is a sequence of VMREADs in the Linux kernel, with the
outputs stored into a dynamically allocated structure whose lifecycle is far
beyond the scope of the code in question:

  vmcs12->guest_pdptr0 = vmcs_read64(GUEST_PDPTR0);
  vmcs12->guest_pdptr1 = vmcs_read64(GUEST_PDPTR1);
  vmcs12->guest_pdptr2 = vmcs_read64(GUEST_PDPTR2);
  vmcs12->guest_pdptr3 = vmcs_read64(GUEST_PDPTR3);

where vmcs_read64() eventually becomes:

        asm volatile goto("1: vmread %[field], %[output]\n\t"
                          "jna %l[do_fail]\n\t"

                          _ASM_EXTABLE(1b, %l[do_exception])

                          : [output] "=r" (value)
                          : [field] "r" (field)
                          : "cc"
                          : do_fail, do_exception);

        return value;

  do_fail:
        instrumentation_begin();
        vmread_error(field);
        instrumentation_end();
        return 0;

  do_exception:
        kvm_spurious_fault();
        return 0;


The first three PDPTR VMREADs generate correctly, but the fourth effectively
gets ignored, and '0' is written to vmcs12->guest_pdptr3.

    3597:       mov    $0x280a,%r13d
    359d:       vmread %r13,%r13
    35a1:       jbe    3724 <sync_vmcs02_to_vmcs12+0x7c4>
    35a7:       mov    %r13,0xd8(%rbx)
    35ae:       jmp    396b <sync_vmcs02_to_vmcs12+0xa0b>
    35b3:       mov    $0x280c,%r13d
    35b9:       vmread %r13,%r13
    35bd:       jbe    3705 <sync_vmcs02_to_vmcs12+0x7a5>
    35c3:       mov    %r13,0xe0(%rbx)
    35ca:       jmp    393a <sync_vmcs02_to_vmcs12+0x9da>
    35cf:       mov    $0x280e,%r13d
    35d5:       vmread %r13,%r13
    35d9:       jbe    36e6 <sync_vmcs02_to_vmcs12+0x786>
    35df:       mov    %r13,0xe8(%rbx)
    35e6:       jmp    3909 <sync_vmcs02_to_vmcs12+0x9a9>
    35eb:       mov    $0x2810,%eax
    35f0:       vmread %rax,%rax        <= VMREAD to nowhere   
    35f3:       jbe    36ca <sync_vmcs02_to_vmcs12+0x76a>
    35f9:       xor    %r12d,%r12d      <= zeroing of output
    35fc:       mov    %r12,0xf0(%rbx)  <= store to vmcs12->guest_pdptr3

Replacing "asm volatile goto" with the following macro

  #define asm_goto(x...) \
          do { asm volatile goto(x); asm (""); } while (0)

to force a second barrier generates functional code, although the attempt to
miscompile the sequence is still evident, as the output of the affected VMREAD
is unnecessarily bounced through an extra register:

    35f8:       mov    $0x280a,%r13d
    35fe:       vmread %r13,%r13
    3602:       jbe    36b2 <sync_vmcs02_to_vmcs12+0x762>
    3608:       mov    %r13,0xd8(%rbx)
    360f:       jmp    3925 <sync_vmcs02_to_vmcs12+0x9d5>
    3614:       mov    $0x280c,%r13d
    361a:       vmread %r13,%r13
    361e:       jbe    3693 <sync_vmcs02_to_vmcs12+0x743>
    3620:       mov    %r13,0xe0(%rbx)
    3627:       jmp    38f4 <sync_vmcs02_to_vmcs12+0x9a4>
    362c:       mov    $0x280e,%r13d
    3632:       vmread %r13,%r13
    3636:       jbe    367a <sync_vmcs02_to_vmcs12+0x72a>
    3638:       mov    %r13,0xe8(%rbx)
    363f:       jmp    38c3 <sync_vmcs02_to_vmcs12+0x973>
    3644:       mov    $0x2810,%eax
    3649:       vmread %rax,%rax
    364c:       jbe    3664 <sync_vmcs02_to_vmcs12+0x714>
    364e:       mov    %rax,%r12
    3651:       mov    %r12,0xf0(%rbx)

The bug reproduces with two different 11.4.0 builds, on three different systems
(Intel i7-9850H, Intel i7-13700K, AMD EPYC 7B12), all running Debian-based
Linux.

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
11.4.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-11
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
--with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

$ gcc-11 -v
Using built-in specs.
COLLECT_GCC=gcc-11
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 11.4.0-4'
--with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-11
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-11-IBEKnH/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-IBEKnH/gcc-11-11.4.0/debian/tmp-gcn/usr
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
--with-build-config=bootstrap-lto-lean --enable-link-serialization=28
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Debian 11.4.0-4)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
@ 2024-02-14 18:00 ` jakub at gcc dot gnu.org
  2024-02-14 18:13 ` torvalds@linux-foundation.org
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-14 18:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Bisection points to r12-5301-g045206450386bcd774db3bde0c696828402361c6 making
the problem go away,
    3abc:       b8 10 28 00 00          mov    $0x2810,%eax
    3ac1:       0f 78 c0                vmread %rax,%rax
    3ac4:       0f 86 1e 01 00 00       jbe    3be8
<sync_vmcs02_to_vmcs12+0x858>
    3aca:       45 31 e4                xor    %r12d,%r12d
    3acd:       4c 89 a3 f0 00 00 00    mov    %r12,0xf0(%rbx)
to
    3a4f:       b8 10 28 00 00          mov    $0x2810,%eax
    3a54:       0f 78 c0                vmread %rax,%rax
    3a57:       0f 86 1b 01 00 00       jbe    3b78
<sync_vmcs02_to_vmcs12+0x808>
    3a5d:       48 89 83 f0 00 00 00    mov    %rax,0xf0(%rbx)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
  2024-02-14 18:00 ` [Bug middle-end/113921] " jakub at gcc dot gnu.org
@ 2024-02-14 18:13 ` torvalds@linux-foundation.org
  2024-02-14 18:16 ` torvalds@linux-foundation.org
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: torvalds@linux-foundation.org @ 2024-02-14 18:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #2 from Linus Torvalds <torvalds@linux-foundation.org> ---
(In reply to Jakub Jelinek from comment #1)
> Bisection points to r12-5301-g045206450386bcd774db3bde0c696828402361c6
> making the problem go away,

Well, that certainly explains why I can't see the problem with my gcc 13.2.1.

It looks like that commit is in gcc-12.1.o and later:

   git log --oneline --all --grep="tree-optimization/102880 - improve CD-DCE"

only returns that one commit, and 

   git name-rev --refs '*releases*' 045206450386b

says "gcc-12.1.0~3038".

So we could make our workaround option be something like

   config GCC_ASM_GOTO_WORKAROUND
     def_bool y
     depends on CC_IS_GCC && GCC_VERSION < 120100

but maybe there is some backporting policy with gcc that my quick git check
missed?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
  2024-02-14 18:00 ` [Bug middle-end/113921] " jakub at gcc dot gnu.org
  2024-02-14 18:13 ` torvalds@linux-foundation.org
@ 2024-02-14 18:16 ` torvalds@linux-foundation.org
  2024-02-14 18:21 ` jakub at gcc dot gnu.org
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: torvalds@linux-foundation.org @ 2024-02-14 18:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #3 from Linus Torvalds <torvalds@linux-foundation.org> ---
(In reply to Linus Torvalds from comment #2)
>
> So we could make our workaround option be something like
> 
>    config GCC_ASM_GOTO_WORKAROUND
>      def_bool y
>      depends on CC_IS_GCC && GCC_VERSION < 120100

Actually, rather than add a new kernel config option, I'll just make the
workaround conditional in our <linux/compiler-gcc.h> header file.

But I'll wait for confirmation from gcc people that Jakub's bisection makes
sense, and is the real fix, rather than just a change that just happens to hide
the issue.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (2 preceding siblings ...)
  2024-02-14 18:16 ` torvalds@linux-foundation.org
@ 2024-02-14 18:21 ` jakub at gcc dot gnu.org
  2024-02-14 18:25 ` torvalds@linux-foundation.org
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-14 18:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Bisection in the other direction doesn't make much sense, since asm goto with
output operands is only supported in GCC 11 and later.

Anyway, with gcc 11, I can see something fishy already during expansion:
(jump_insn 927 926 1285 191 (parallel [
            (set (reg:DI 385 [ value ])
                (asm_operands/v:DI ("1: vmread %1, %0
        jna %l2
         .pushsection "__ex_table","a"
 .balign 4
 .long (1b) - .
 .long (%l3) - .
 .long 1
 .popsection
") ("=r") 0 [
                        (reg:DI 386)
                    ]
                     [
                        (asm_input:DI ("r")
/home/sean/go/src/kernel.org/linux/arch/x86/kvm/vmx/vmx_ops.h:97)
                    ]
                     [
                        (label_ref:DI 959)
                        (label_ref:DI 965)
                    ]
/home/sean/go/src/kernel.org/linux/arch/x86/kvm/vmx/vmx_ops.h:97))
            (clobber (reg:CC 17 flags))
        ]) "/home/sean/go/src/kernel.org/linux/arch/x86/kvm/vmx/vmx_ops.h":97:2
-1
     (insn_list:REG_LABEL_TARGET 1253 (insn_list:REG_LABEL_TARGET 959 (nil)))
 -> 965)             
;;  succ:       197 count:99052688 (estimated locally)
;;              198 count:99052688 (estimated locally)
;;              192 count:99052688 (estimated locally) (FALLTHRU)

;; basic block 192, loop depth 0, count 99052688 (estimated locally), maybe hot
;;  prev block 191, next block 193, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       191 count:99052688 (estimated locally) (FALLTHRU)
(note 1285 927 931 192 [bb 192] NOTE_INSN_BASIC_BLOCK)
(jump_insn 931 1285 932 192 (set (pc)
        (label_ref:DI 1253))
"/home/sean/go/src/kernel.org/linux/arch/x86/kvm/vmx/vmx_ops.h":97:2 807 {jump}
     (nil)
 -> 1253)
;;  succ:       199 [always]  count:99052688 (estimated locally)
...
(code_label 1253 1251 1252 199 1127 (nil) [1 uses])
(note 1252 1253 49 199 [bb 199] NOTE_INSN_BASIC_BLOCK)
(insn 49 1252 930 199 (set (reg:DI 152 [ _241 ])
        (reg/v:DI 151 [ value ]))
"/home/sean/go/src/kernel.org/linux/arch/x86/kvm/vmx/vmx_ops.h":107:9 -1
     (nil))
(insn 930 49 968 199 (set (reg/v:DI 151 [ value ])
        (reg:DI 385 [ value ]))
"/home/sean/go/src/kernel.org/linux/arch/x86/kvm/vmx/vmx_ops.h":97:2 -1
     (nil))
;;  succ:       200 [always]  count:16508781 (estimated locally) (FALLTHRU)

(code_label 968 930 969 200 1083 (nil) [5 uses])
(note 969 968 970 200 [bb 200] NOTE_INSN_BASIC_BLOCK)
(insn 970 969 971 200 (set (mem:DI (plus:DI (reg/v/f:DI 283 [ vmcs12 ])
                (const_int 240 [0xf0])) [19 vmcs12_30(D)->guest_pdptr3+0 S8
A8])
        (reg:DI 152 [ _241 ]))
"/home/sean/go/src/kernel.org/linux/arch/x86/kvm/vmx/nested.c":4422:25 -1
     (nil))
;;  succ:       201 [always]  count:55029271 (estimated locally) (FALLTHRU)

So, the asm goto sets pseudo 385 and in case it doesn't jump anywhere, it then
goes
into the 2 pseudo moves but they'd need to be reversed in order to store the
asm goto
output into gues_pdptr3.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (3 preceding siblings ...)
  2024-02-14 18:21 ` jakub at gcc dot gnu.org
@ 2024-02-14 18:25 ` torvalds@linux-foundation.org
  2024-02-14 18:40 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: torvalds@linux-foundation.org @ 2024-02-14 18:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #5 from Linus Torvalds <torvalds@linux-foundation.org> ---
(In reply to Linus Torvalds from comment #2)
>
> So we could make our workaround option be something like
> 
>    config GCC_ASM_GOTO_WORKAROUND
>      def_bool y
>      depends on CC_IS_GCC && GCC_VERSION < 120100

Replying to myself some more, since that kernel side workaround is actually in
two parts: it does the extra empty inline asm as an extra barrier, but it
*also* adds the 'volatile' to the asm with outputs to work around the other gcc
bug.

And that other fix ("Mark asm goto with outputs as volatile") is *not* in
gcc-12.1.0. It has only made it into gcc-13.2.0 (and it's pending in the gcc-12
release branch if I read things right).

I suspect that other bug doesn't affect Sean's kvm case, because the outputs
are always used, but it could affect other kernel code.

So we'd want to have at least gcc-13.2 to be safe.

Hmm.

We could make the "add volatile manually" be the default workaround, though,
since it shouldn't matter.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (4 preceding siblings ...)
  2024-02-14 18:25 ` torvalds@linux-foundation.org
@ 2024-02-14 18:40 ` jakub at gcc dot gnu.org
  2024-02-14 19:07 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-14 18:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
insn 49 above is inserted in
#1  0x0000000000d419db in emit_insn (x=0x7fffe6e921f8) at
../../gcc/emit-rtl.c:5106
#2  0x0000000001895ab7 in ix86_expand_move (mode=E_DImode,
operands=0x7fffffffd600) at ../../gcc/config/i386/i386-expand.c:368
#3  0x0000000001de4226 in gen_movdi (operand0=0x7fffe6ec8108,
operand1=0x7fffe6ec80f0) at ../../gcc/config/i386/i386.md:1935
#4  0x0000000000db804f in insn_gen_fn::operator()<rtx_def*, rtx_def*>
(this=0x2c6cd48 <insn_data+242536>) at ../../gcc/recog.h:407
#5  0x0000000000d8f394 in emit_move_insn_1 (x=0x7fffe6ec8108, y=0x7fffe6ec80f0)
at ../../gcc/expr.c:3766
#6  0x0000000000d8fe7a in emit_move_insn (x=0x7fffe6ec8108, y=0x7fffe6ec80f0)
at ../../gcc/expr.c:3936
#7  0x00000000013b8a75 in emit_partition_copy (dest=0x7fffe6ec8108,
src=0x7fffe6ec80f0, unsignedsrcp=1, sizeexp=<ssa_name 0x7fffe9cfc6c0 240>) at
../../gcc/tree-outof-ssa.c:259
#8  0x00000000013b8c5b in insert_partition_copy_on_edge (e=<edge 0x7fffe7314000
(138 -> 145)>, dest=72, src=71, locus=2147592841) at
../../gcc/tree-outof-ssa.c:293
#9  0x00000000013b9f0b in elim_create (g=0x7fffffffd8e0, T=72) at
../../gcc/tree-outof-ssa.c:729
#10 0x00000000013ba0ab in eliminate_phi (e=<edge 0x7fffe7314000 (138 -> 145)>,
g=0x7fffffffd8e0) at ../../gcc/tree-outof-ssa.c:771
#11 0x00000000013baaa5 in expand_phi_nodes (sa=0x3169360 <SA>) at
../../gcc/tree-outof-ssa.c:1024
later on insert_partition_copy_on_edge will insert_insn_on_edge on the 138 ->
145 edge, but the insertion remains uncommitted.
Later on the asm goto expansion copies after_rtl_seq which contains (set
(reg:DI 151) (reg:DI 385)) and calls insert_insn_on_edge on it too.
But insert_insn_on_edge appends to the e->insns.r rather than prepends it which
presumably is what we'd want for the asm goto case.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (5 preceding siblings ...)
  2024-02-14 18:40 ` jakub at gcc dot gnu.org
@ 2024-02-14 19:07 ` jakub at gcc dot gnu.org
  2024-02-14 20:31 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-14 19:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-14

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, GCC 11 version:
--- gcc/cfgexpand.c.jj  2023-05-09 12:59:04.381738365 +0200
+++ gcc/cfgexpand.c     2024-02-14 19:56:08.733150432 +0100
@@ -3639,7 +3639,16 @@ expand_asm_stmt (gasm *stmt)
                emit_insn (copy_insn (PATTERN (curr)));
              rtx_insn *copy = get_insns ();
              end_sequence ();
-             insert_insn_on_edge (copy, e);
+             if (rtx_insn *prev = e->insns.r)
+               {
+                 /* Prepend copy before any other previously
+                    inserted insns on the edge rather than append.  */
+                 e->insns.r = NULL;
+                 insert_insn_on_edge (copy, e);
+                 insert_insn_on_edge (prev, e);
+               }
+             else
+               insert_insn_on_edge (copy, e);
            }
        }
     }
changes the emitted assembler:
@@ -7328,7 +7328,7 @@ sync_vmcs02_to_vmcs12:
 # 0 "" 2
 #NO_APP
 .L1127:
-       xorl    %r12d, %r12d
+       movq    %rax, %r12
 .L1083:
        movq    %r12, 240(%rbx)
        jmp     .L1047
@@ -29897,7 +29897,7 @@ nested_vmx_vmexit:
 # 0 "" 2
 #NO_APP
 .L5187:
-       xorl    %r12d, %r12d
+       movq    %rax, %r12
 .L5113:
        movq    %r12, %rdx
        movl    $7, %esi
which is I believe exactly what we want.

For GCC trunk the patch would be
--- gcc/cfgexpand.cc.jj 2024-02-10 11:25:09.995474027 +0100
+++ gcc/cfgexpand.cc    2024-02-14 19:54:30.811505882 +0100
@@ -3687,7 +3687,16 @@ expand_asm_stmt (gasm *stmt)
                  copy = get_insns ();
                  end_sequence ();
                }
-             insert_insn_on_edge (copy, e);
+             if (rtx_insn *prev = e->insns.r)
+               {
+                 /* Prepend copy before any other previously
+                    inserted insns on the edge rather than append.  */
+                 e->insns.r = NULL;
+                 insert_insn_on_edge (copy, e);
+                 insert_insn_on_edge (prev, e);
+               }
+             else
+               insert_insn_on_edge (copy, e);
            }
        }
     }
and with trunk it triggers (I mean the prev != NULL case) only on the
nested_vmx_vmexit
function and not the other one.
Guess one could try to build whole kernel with instrumented gcc (just add
FILE *f = fopen ("/tmp/asmgoto", "a");
fprintf (f, "%s %s\n", main_input_filename ? main_input_filename : "-",
current_function_name ());
fclose (f);
next to the e->insns.r = NULL; in the patch or so) to find out what else it
affects.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (6 preceding siblings ...)
  2024-02-14 19:07 ` jakub at gcc dot gnu.org
@ 2024-02-14 20:31 ` jakub at gcc dot gnu.org
  2024-02-15  8:00 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-14 20:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 57430
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57430&action=edit
gcc14-pr113921.patch

Full untested trunk patch.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (7 preceding siblings ...)
  2024-02-14 20:31 ` jakub at gcc dot gnu.org
@ 2024-02-15  8:00 ` rguenth at gcc dot gnu.org
  2024-02-15  8:21 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-15  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
So just to say all versions seem to be affected and the issue is just latent
because of the rev we bisected to.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (8 preceding siblings ...)
  2024-02-15  8:00 ` rguenth at gcc dot gnu.org
@ 2024-02-15  8:21 ` jakub at gcc dot gnu.org
  2024-02-15 14:56 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-15  8:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
*** Bug 107385 has been marked as a duplicate of this bug. ***

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (9 preceding siblings ...)
  2024-02-15  8:21 ` jakub at gcc dot gnu.org
@ 2024-02-15 14:56 ` cvs-commit at gcc dot gnu.org
  2024-02-15 15:14 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-02-15 14:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #11 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:2b4efc5db2aedb59196987300e14951d08cd7106

commit r14-9012-g2b4efc5db2aedb59196987300e14951d08cd7106
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 15 15:53:01 2024 +0100

    expand: Fix handling of asm goto outputs vs. PHI argument adjustments
[PR113921]

    The Linux kernel and the following testcase distilled from it is
    miscompiled, because tree-outof-ssa.cc (eliminate_phi) emits some
    fixups on some of the edges (but doesn't commit edge insertions).
    Later expand_asm_stmt emits further instructions on the same edge.
    Now the problem is that expand_asm_stmt uses insert_insn_on_edge
    to add its own fixups, but that function appends to the existing
    sequence on the edge if any.  And the bug triggers when the
    fixup sequence emitted by eliminate_phi uses a pseudo which the
    fixup sequence emitted by expand_asm_stmt later on sets.
    So, we end up with
      (set (reg A) (asm_operands ...))
    and on one of the edges queued sequence
      (set (reg C) (reg B)) // added by eliminate_phi
      (set (reg B) (reg A)) // added by expand_asm_stmt
    That is wrong, what we emit by expand_asm_stmt needs to be as close
    to the asm_operands as possible (they aren't known until expand_asm_stmt
    is called, the PHI fixup code assumes it is reg B which holds the right
    value) and the PHI adjustments need to be done after it.

    So, the following patch introduces a prepend_insn_to_edge function and
    uses it from expand_asm_stmt, so that we queue
      (set (reg B) (reg A)) // added by expand_asm_stmt
      (set (reg C) (reg B)) // added by eliminate_phi
    instead and so the value from the asm_operands output propagates correctly
    to the PHI result.

    2024-02-15  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/113921
            * cfgrtl.h (prepend_insn_to_edge): New declaration.
            * cfgrtl.cc (insert_insn_on_edge): Clarify behavior in function
            comment.
            (prepend_insn_to_edge): New function.
            * cfgexpand.cc (expand_asm_stmt): Use prepend_insn_to_edge instead
of
            insert_insn_on_edge.

            * gcc.target/i386/pr113921.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (10 preceding siblings ...)
  2024-02-15 14:56 ` cvs-commit at gcc dot gnu.org
@ 2024-02-15 15:14 ` cvs-commit at gcc dot gnu.org
  2024-02-15 15:17 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-02-15 15:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #12 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:4040d472825f203660371331c9e86cd75e30f8d2

commit r13-8328-g4040d472825f203660371331c9e86cd75e30f8d2
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 15 15:53:01 2024 +0100

    expand: Fix handling of asm goto outputs vs. PHI argument adjustments
[PR113921]

    The Linux kernel and the following testcase distilled from it is
    miscompiled, because tree-outof-ssa.cc (eliminate_phi) emits some
    fixups on some of the edges (but doesn't commit edge insertions).
    Later expand_asm_stmt emits further instructions on the same edge.
    Now the problem is that expand_asm_stmt uses insert_insn_on_edge
    to add its own fixups, but that function appends to the existing
    sequence on the edge if any.  And the bug triggers when the
    fixup sequence emitted by eliminate_phi uses a pseudo which the
    fixup sequence emitted by expand_asm_stmt later on sets.
    So, we end up with
      (set (reg A) (asm_operands ...))
    and on one of the edges queued sequence
      (set (reg C) (reg B)) // added by eliminate_phi
      (set (reg B) (reg A)) // added by expand_asm_stmt
    That is wrong, what we emit by expand_asm_stmt needs to be as close
    to the asm_operands as possible (they aren't known until expand_asm_stmt
    is called, the PHI fixup code assumes it is reg B which holds the right
    value) and the PHI adjustments need to be done after it.

    So, the following patch introduces a prepend_insn_to_edge function and
    uses it from expand_asm_stmt, so that we queue
      (set (reg B) (reg A)) // added by expand_asm_stmt
      (set (reg C) (reg B)) // added by eliminate_phi
    instead and so the value from the asm_operands output propagates correctly
    to the PHI result.

    2024-02-15  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/113921
            * cfgrtl.h (prepend_insn_to_edge): New declaration.
            * cfgrtl.cc (insert_insn_on_edge): Clarify behavior in function
            comment.
            (prepend_insn_to_edge): New function.
            * cfgexpand.cc (expand_asm_stmt): Use prepend_insn_to_edge instead
of
            insert_insn_on_edge.

            * gcc.target/i386/pr113921.c: New test.

    (cherry picked from commit 2b4efc5db2aedb59196987300e14951d08cd7106)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (11 preceding siblings ...)
  2024-02-15 15:14 ` cvs-commit at gcc dot gnu.org
@ 2024-02-15 15:17 ` cvs-commit at gcc dot gnu.org
  2024-02-15 15:20 ` cvs-commit at gcc dot gnu.org
  2024-02-15 15:22 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-02-15 15:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #13 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:ba09da9787e8db8903b2b0f7c647c0d7af68bb74

commit r12-10158-gba09da9787e8db8903b2b0f7c647c0d7af68bb74
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 15 15:53:01 2024 +0100

    expand: Fix handling of asm goto outputs vs. PHI argument adjustments
[PR113921]

    The Linux kernel and the following testcase distilled from it is
    miscompiled, because tree-outof-ssa.cc (eliminate_phi) emits some
    fixups on some of the edges (but doesn't commit edge insertions).
    Later expand_asm_stmt emits further instructions on the same edge.
    Now the problem is that expand_asm_stmt uses insert_insn_on_edge
    to add its own fixups, but that function appends to the existing
    sequence on the edge if any.  And the bug triggers when the
    fixup sequence emitted by eliminate_phi uses a pseudo which the
    fixup sequence emitted by expand_asm_stmt later on sets.
    So, we end up with
      (set (reg A) (asm_operands ...))
    and on one of the edges queued sequence
      (set (reg C) (reg B)) // added by eliminate_phi
      (set (reg B) (reg A)) // added by expand_asm_stmt
    That is wrong, what we emit by expand_asm_stmt needs to be as close
    to the asm_operands as possible (they aren't known until expand_asm_stmt
    is called, the PHI fixup code assumes it is reg B which holds the right
    value) and the PHI adjustments need to be done after it.

    So, the following patch introduces a prepend_insn_to_edge function and
    uses it from expand_asm_stmt, so that we queue
      (set (reg B) (reg A)) // added by expand_asm_stmt
      (set (reg C) (reg B)) // added by eliminate_phi
    instead and so the value from the asm_operands output propagates correctly
    to the PHI result.

    2024-02-15  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/113921
            * cfgrtl.h (prepend_insn_to_edge): New declaration.
            * cfgrtl.cc (insert_insn_on_edge): Clarify behavior in function
            comment.
            (prepend_insn_to_edge): New function.
            * cfgexpand.cc (expand_asm_stmt): Use prepend_insn_to_edge instead
of
            insert_insn_on_edge.

            * gcc.target/i386/pr113921.c: New test.

    (cherry picked from commit 2b4efc5db2aedb59196987300e14951d08cd7106)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (12 preceding siblings ...)
  2024-02-15 15:17 ` cvs-commit at gcc dot gnu.org
@ 2024-02-15 15:20 ` cvs-commit at gcc dot gnu.org
  2024-02-15 15:22 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-02-15 15:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

--- Comment #14 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:7a6e9e70ea88061981c5565c043985d8cde9ecc8

commit r11-11239-g7a6e9e70ea88061981c5565c043985d8cde9ecc8
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 15 15:53:01 2024 +0100

    expand: Fix handling of asm goto outputs vs. PHI argument adjustments
[PR113921]

    The Linux kernel and the following testcase distilled from it is
    miscompiled, because tree-outof-ssa.cc (eliminate_phi) emits some
    fixups on some of the edges (but doesn't commit edge insertions).
    Later expand_asm_stmt emits further instructions on the same edge.
    Now the problem is that expand_asm_stmt uses insert_insn_on_edge
    to add its own fixups, but that function appends to the existing
    sequence on the edge if any.  And the bug triggers when the
    fixup sequence emitted by eliminate_phi uses a pseudo which the
    fixup sequence emitted by expand_asm_stmt later on sets.
    So, we end up with
      (set (reg A) (asm_operands ...))
    and on one of the edges queued sequence
      (set (reg C) (reg B)) // added by eliminate_phi
      (set (reg B) (reg A)) // added by expand_asm_stmt
    That is wrong, what we emit by expand_asm_stmt needs to be as close
    to the asm_operands as possible (they aren't known until expand_asm_stmt
    is called, the PHI fixup code assumes it is reg B which holds the right
    value) and the PHI adjustments need to be done after it.

    So, the following patch introduces a prepend_insn_to_edge function and
    uses it from expand_asm_stmt, so that we queue
      (set (reg B) (reg A)) // added by expand_asm_stmt
      (set (reg C) (reg B)) // added by eliminate_phi
    instead and so the value from the asm_operands output propagates correctly
    to the PHI result.

    2024-02-15  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/113921
            * cfgrtl.h (prepend_insn_to_edge): New declaration.
            * cfgrtl.c (insert_insn_on_edge): Clarify behavior in function
            comment.
            (prepend_insn_to_edge): New function.
            * cfgexpand.c (expand_asm_stmt): Use prepend_insn_to_edge instead
of
            insert_insn_on_edge.

            * gcc.target/i386/pr113921.c: New test.

    (cherry picked from commit 2b4efc5db2aedb59196987300e14951d08cd7106)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/113921] Output register of an "asm volatile goto" is incorrectly clobbered/discarded
  2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
                   ` (13 preceding siblings ...)
  2024-02-15 15:20 ` cvs-commit at gcc dot gnu.org
@ 2024-02-15 15:22 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-15 15:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed now for GCC 14.1+, 13.3+, 12.4+ and 11.5.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-02-15 15:22 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-14 17:19 [Bug c/113921] New: Output register of an "asm volatile goto" is incorrectly clobbered/discarded seanjc at google dot com
2024-02-14 18:00 ` [Bug middle-end/113921] " jakub at gcc dot gnu.org
2024-02-14 18:13 ` torvalds@linux-foundation.org
2024-02-14 18:16 ` torvalds@linux-foundation.org
2024-02-14 18:21 ` jakub at gcc dot gnu.org
2024-02-14 18:25 ` torvalds@linux-foundation.org
2024-02-14 18:40 ` jakub at gcc dot gnu.org
2024-02-14 19:07 ` jakub at gcc dot gnu.org
2024-02-14 20:31 ` jakub at gcc dot gnu.org
2024-02-15  8:00 ` rguenth at gcc dot gnu.org
2024-02-15  8:21 ` jakub at gcc dot gnu.org
2024-02-15 14:56 ` cvs-commit at gcc dot gnu.org
2024-02-15 15:14 ` cvs-commit at gcc dot gnu.org
2024-02-15 15:17 ` cvs-commit at gcc dot gnu.org
2024-02-15 15:20 ` cvs-commit at gcc dot gnu.org
2024-02-15 15:22 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).