On 2/6/20 6:07 AM, Jakub Jelinek wrote:
> On Thu, Feb 06, 2020 at 01:00:36AM +0000, JonY wrote:
>> On 2/4/20 11:42 AM, Jakub Jelinek wrote:
>>> Hi!
>>>
>>> On Tue, Feb 04, 2020 at 11:16:06AM +0100, Uros Bizjak wrote:
>>>> I guess that Comment #9 patch from the PR should be trivially correct,
>>>> but although it looks obvious, I don't want to propose the patch since
>>>> I have no means of testing it.
>>>
>>> I don't have means of testing it either.
>>> https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019
>>> is quite explicit that [xyz]mm16-31 are call clobbered and only xmm6-15 (low
>>> 128-bits only) are call preserved.
>>>
>>> Jonathan, could you please test this if it is sufficient to just change
>>> CALL_USED_REGISTERS or if e.g. something in the pro/epilogue needs tweaking
>>> too? Thanks.
>>
>> Is this patch testing still required? I just got back from traveling.
>
> Yes, our reading of the MS ABI docs shows that xmm16-31 are to be call used
> (not preserved over calls), while in gcc they are currently handled as
> preserved across the calls.
>
> Jakub
>

--- original.s	2020-02-06 09:00:02.014638069 +0000
+++ new.s	2020-02-07 10:28:55.678317667 +0000
@@ -7,23 +7,23 @@
 qux:
 	subq	$72, %rsp
 	.seh_stackalloc	72
-	vmovaps	%xmm18, 48(%rsp)
-	.seh_savexmm	%xmm18, 48
+	vmovaps	%xmm6, 48(%rsp)
+	.seh_savexmm	%xmm6, 48
 	.seh_endprologue
 	call	bar
 	vmovapd	%xmm0, %xmm1
-	vmovapd	%xmm1, %xmm18
+	vmovapd	%xmm1, %xmm6
 	call	foo
 	leaq	32(%rsp), %rcx
-	vmovapd	%xmm18, %xmm0
-	vmovaps	%xmm0, 32(%rsp)
+	vmovapd	%xmm6, %xmm0
+	vmovapd	%xmm0, 32(%rsp)
 	call	baz
 	nop
-	vmovaps	48(%rsp), %xmm18
+	vmovaps	48(%rsp), %xmm6
 	addq	$72, %rsp
 	ret
 	.seh_endproc
-	.ident	"GCC: (GNU) 10.0.0 20191024 (experimental)"
+	.ident	"GCC: (GNU) 10.0.1 20200206 (experimental)"
 	.def	bar;	.scl	2;	.type	32;	.endef
 	.def	foo;	.scl	2;	.type	32;	.endef
 	.def	baz;	.scl	2;	.type	32;	.endef

With the patch, GCC now seems to put the variables in xmm6. Unfortunately,
I don't know enough about AVX or stack setups to know whether that is all
that is needed.
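
For reference, here is the rule from the MS docs quoted above, written out
as a small C sketch. The function name is made up for illustration and is
not anything in GCC; it only restates which XMM register numbers the x64
calling convention treats as call preserved.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC.  Returns true if XMM register
   number REGNO is call preserved under the MS x64 calling convention,
   as described in the docs linked above: only xmm6-xmm15 qualify, and
   only their low 128 bits.  xmm0-xmm5 and xmm16-xmm31 are call
   clobbered; out-of-range numbers are reported as not preserved.  */
bool
ms_x64_xmm_is_call_preserved (int regno)
{
  return regno >= 6 && regno <= 15;
}
```

By this rule, ms_x64_xmm_is_call_preserved (18) is false and
ms_x64_xmm_is_call_preserved (6) is true, which matches the switch from
xmm18 to xmm6 for the value kept live across the calls in the diff above.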