* [Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE
[not found] <bug-69493-4@http.gcc.gnu.org/bugzilla/>
@ 2020-05-21 6:47 ` luoxhu at gcc dot gnu.org
2020-05-26 6:31 ` luoxhu at gcc dot gnu.org
2020-05-26 16:28 ` segher at gcc dot gnu.org
2 siblings, 0 replies; 3+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2020-05-21 6:47 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493
luoxhu at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luoxhu at gcc dot gnu.org
--- Comment #9 from luoxhu at gcc dot gnu.org ---
On Power9 the generated code contains no load/store:
cat pr69493.s
        .file   "pr69493.c"
        .abiversion 2
        .section        ".text"
        .align 2
        .p2align 4,,15
        .globl test_big_double
        .type   test_big_double, @function
test_big_double:
.LFB0:
        .cfi_startproc
        mfvsrd 7,1
        mfvsrd 10,2
        mfvsrd 8,3
        mfvsrd 9,4
        mtvsrdd 34,10,7
        mtvsrdd 35,9,8
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .cfi_endproc
.LFE0:
        .size   test_big_double,.-test_big_double
        .ident  "GCC: (GNU) 9.2.1 20191023 (Advance-Toolchain 13.0-1) [aba1f4e8b6ac]"
        .gnu_attribute 4, 5
        .section        .note.GNU-stack,"",@progbits
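The register-only sequence above can be mimicked in portable C to see which double ends up in which vector element. This is only a rough model, not the test case from the bug: the `model_*` names and the `vsr_t` type are hypothetical. It assumes the Power ISA 3.0 semantics of `mfvsrd` (copy doubleword 0 of a VSR to a GPR) and `mtvsrdd` (doubleword 0 from RA, doubleword 1 from RB), and that on little-endian element i of a V2DF value lives in doubleword 1 - i.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of a 128-bit VSX register; dw[0] is doubleword 0 in Power ISA
   numbering (the most significant half). Hypothetical type. */
typedef struct { uint64_t dw[2]; } vsr_t;

/* mfvsrd RA,XS: copy doubleword 0 of a VSR (where a scalar double lives)
   into a GPR. Modeled here as a bit copy from a double. */
static uint64_t model_mfvsrd(double fpr)
{
    uint64_t gpr;
    memcpy(&gpr, &fpr, sizeof gpr);
    return gpr;
}

/* mtvsrdd XT,RA,RB: doubleword 0 = RA, doubleword 1 = RB. */
static vsr_t model_mtvsrdd(uint64_t ra, uint64_t rb)
{
    vsr_t v = { { ra, rb } };
    return v;
}

/* Assumption: on little-endian, element i of V2DF is doubleword 1 - i. */
static double model_le_element(vsr_t v, int i)
{
    double d;
    assert(i == 0 || i == 1);
    memcpy(&d, &v.dw[1 - i], sizeof d);
    return d;
}

/* Mirrors "mfvsrd 7,1; mfvsrd 10,2; mtvsrdd 34,10,7". */
static vsr_t model_pack_pair(double f1, double f2)
{
    uint64_t r7 = model_mfvsrd(f1);
    uint64_t r10 = model_mfvsrd(f2);
    return model_mtvsrdd(r10, r7);
}
```

Under these assumptions, `model_pack_pair(f1, f2)` yields a vector whose element 0 is `f1` and element 1 is `f2`, with no trip through memory.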
From: luoxhu at gcc dot gnu.org @ 2020-05-26 6:31 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493
--- Comment #10 from luoxhu at gcc dot gnu.org ---
During expand, rs6000_emit_le_vsx_move makes Power8 emit two register permute
instructions to byte-swap the contents.
P9:
5: NOTE_INSN_BASIC_BLOCK 2
2: r129:TF=%1:TF
3: r130:TF=%3:TF
4: NOTE_INSN_FUNCTION_BEG
7: r117:DF=unspec[r129:TF,0] 70
8: r131:V2DF=r121:V2DF
9: r133:DF=vec_select(r131:V2DF,parallel)
10: r131:V2DF=vec_concat(r117:DF,r133:DF)
11: r122:V2DF=r131:V2DF
12: r118:DF=unspec[r129:TF,0x1] 70
13: r119:DF=unspec[r130:TF,0] 70
14: r134:V2DF=r124:V2DF
15: r136:DF=vec_select(r134:V2DF,parallel)
16: r134:V2DF=vec_concat(r119:DF,r136:DF)
17: r125:V2DF=r134:V2DF
18: r120:DF=unspec[r130:TF,0x1] 70
19: r137:V2DF=r122:V2DF
20: r139:DF=vec_select(r137:V2DF,parallel)
21: r137:V2DF=vec_concat(r139:DF,r118:DF)
22: [r112:DI]=r137:V2DF
23: r140:V2DF=r125:V2DF
24: r142:DF=vec_select(r140:V2DF,parallel)
25: r140:V2DF=vec_concat(r142:DF,r120:DF)
26: [r112:DI+0x10]=r140:V2DF
27: r143:V4SI=[r112:DI]
28: r144:V4SI=[r112:DI+0x10]
29: r127:V4SI=r143:V4SI
30: r128:V4SI=r144:V4SI
34: %2:V4SI=r127:V4SI
35: %3:V4SI=r128:V4SI
36: use %2:V4SI
37: use %3:V4SI
P8:
5: NOTE_INSN_BASIC_BLOCK 2
2: r129:TF=%1:TF
3: r130:TF=%3:TF
4: NOTE_INSN_FUNCTION_BEG
7: r117:DF=unspec[r129:TF,0] 70
8: r131:V2DF=r121:V2DF
9: r133:DF=vec_select(r131:V2DF,parallel)
10: r131:V2DF=vec_concat(r117:DF,r133:DF)
11: r122:V2DF=r131:V2DF
12: r118:DF=unspec[r129:TF,0x1] 70
13: r119:DF=unspec[r130:TF,0] 70
14: r134:V2DF=r124:V2DF
15: r136:DF=vec_select(r134:V2DF,parallel)
16: r134:V2DF=vec_concat(r119:DF,r136:DF)
17: r125:V2DF=r134:V2DF
18: r120:DF=unspec[r130:TF,0x1] 70
19: r137:V2DF=r122:V2DF
20: r139:DF=vec_select(r137:V2DF,parallel)
21: r137:V2DF=vec_concat(r139:DF,r118:DF)
22: r140:V2DF=vec_select(r137:V2DF,parallel)
23: [r112:DI]=vec_select(r140:V2DF,parallel)
24: r141:V2DF=r125:V2DF
25: r143:DF=vec_select(r141:V2DF,parallel)
26: r141:V2DF=vec_concat(r143:DF,r120:DF)
27: r144:V2DF=vec_select(r141:V2DF,parallel)
28: [r112:DI+0x10]=vec_select(r144:V2DF,parallel)
29: r146:V4SI=vec_select([r112:DI],parallel)
30: r145:V4SI=vec_select(r146:V4SI,parallel)
31: r148:V4SI=vec_select([r112:DI+0x10],parallel)
32: r147:V4SI=vec_select(r148:V4SI,parallel)
33: r127:V4SI=r145:V4SI
34: r128:V4SI=r147:V4SI
38: %2:V4SI=r127:V4SI
39: %3:V4SI=r128:V4SI
40: use %2:V4SI
41: use %3:V4SI
The difference starts at insn 22. Power8 emits two vec_select instructions for
each stack store/load operation, but Power9 needs only a single insn for each.
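To make the cost of the extra pair concrete, here is a minimal portable C model of the two store shapes. It is a sketch under stated assumptions, not GCC code: the `model_*` names and `v2di_t` are hypothetical, the `parallel` operand elided in the dumps is assumed to be the doubleword swap [1 0], and the second vec_select on Power8 is assumed to correspond to a permuting store such as stxvd2x.

```c
#include <assert.h>
#include <stdint.h>

/* Two doublewords of a 128-bit vector value. Hypothetical type. */
typedef struct { uint64_t dw[2]; } v2di_t;

/* Model of vec_select (x, parallel [1 0]): swap the two doublewords. */
static v2di_t model_swap_dw(v2di_t x)
{
    v2di_t r = { { x.dw[1], x.dw[0] } };
    return r;
}

/* Power8-style store: a register permute followed by a store that
   permutes again (insns 22 and 23 in the P8 dump). */
static void model_p8_store(v2di_t *mem, v2di_t reg)
{
    v2di_t tmp = model_swap_dw(reg);
    *mem = model_swap_dw(tmp);
}

/* Power9-style store: one plain move (insn 22 in the P9 dump). */
static void model_p9_store(v2di_t *mem, v2di_t reg)
{
    *mem = reg;
}

/* Both shapes must leave the same image in memory. */
static void model_check_same_image(v2di_t reg)
{
    v2di_t m8, m9;
    model_p8_store(&m8, reg);
    model_p9_store(&m9, reg);
    assert(m8.dw[0] == m9.dw[0] && m8.dw[1] == m9.dw[1]);
}
```

In this model the two swaps cancel, so the Power8 shape produces the same memory image as the single Power9 move while spending one extra permute per store, and likewise per load.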