[Bug c/42596] New: Integer/Floating point vector casts generate XMM register moves from and to the same register

* [Bug c/42596]  New: Integer/Floating point vector casts generate XMM register moves from and to the same register
From: adam at consulting dot net dot nz @ 2010-01-03 14:18 UTC
  To: gcc-bugs

/*
  GCC does not permit a vector of a union type.

  To dynamically store both integer and float data in an XMM register define an
  integer vector and cast to a float vector whenever a floating point operation
  is required upon the data (or use a union to perform the type conversion).

  Instead of treating these casts as a no-op GCC copies the register using a
  floating point XMM move instruction, performs the calculation, then copies
  the register back using an integer XMM move instruction.
*/

typedef double xmm_2f64_t __attribute__((vector_size (16)));
typedef long long xmm_2i64_t __attribute__((vector_size (16)));

register xmm_2f64_t xmm_a __asm__("xmm4");
register xmm_2f64_t xmm_b __asm__("xmm5");

register xmm_2i64_t xmm_c __asm__("xmm6");
register xmm_2i64_t xmm_d __asm__("xmm7");

typedef union {
  xmm_2f64_t xmm_2f64;
  xmm_2i64_t xmm_2i64;
} xmm_u;

//"data type of ‘xmm_e’ isn’t suitable for a register"
//register xmm_u xmm_e __asm__("xmm8");

typedef union {
  long long i64;
  double f64;
} r64_u;

//Note that the union above is suitable for a 64-bit register
register r64_u r64 __asm__("r15");


void test_fp_vectors_containing_fp_data() {
  xmm_a+=xmm_b;
}

void test_int_vectors_containing_fp_data() {
  xmm_c=(xmm_2i64_t) ((xmm_2f64_t) xmm_c + (xmm_2f64_t) xmm_d);
}

void test_int_vectors_containing_fp_data_using_a_union() {
  xmm_u u_c, u_d;
  u_c.xmm_2i64=xmm_c;
  u_d.xmm_2i64=xmm_d;
  u_c.xmm_2f64+=u_d.xmm_2f64;
  xmm_c=u_c.xmm_2i64;
}

int main() {
}


Relevant code generation:
$ gcc -O3 dynamic_vectors.c && objdump -d -m i386:x86-64:intel a.out |less

00000000004004a0 <test_fp_vectors_containing_fp_data>:
  4004a0:       66 0f 58 e5             addpd  xmm4,xmm5
  4004a4:       c3                      ret    
  4004a5:       66 66 2e 0f 1f 84 00    nop    WORD PTR cs:[rax+rax*1+0x0]
  4004ac:       00 00 00 00 

00000000004004b0 <test_int_vectors_containing_fp_data>:
  4004b0:       66 0f 28 c6             movapd xmm0,xmm6
  4004b4:       66 0f 58 c7             addpd  xmm0,xmm7
  4004b8:       66 0f 6f f0             movdqa xmm6,xmm0
  4004bc:       c3                      ret    
  4004bd:       0f 1f 00                nop    DWORD PTR [rax]

00000000004004c0 <test_int_vectors_containing_fp_data_using_a_union>:
  4004c0:       66 0f 28 c6             movapd xmm0,xmm6
  4004c4:       66 0f 58 c7             addpd  xmm0,xmm7
  4004c8:       66 0f 6f f0             movdqa xmm6,xmm0
  4004cc:       c3                      ret    
  4004cd:       0f 1f 00                nop    DWORD PTR [rax]


The last two functions should generate addpd xmm6,xmm7 instead of first copying
xmm6 to xmm0, performing the calculation, and then copying xmm0 back to xmm6.
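
As a small illustration of why these casts could in principle be a no-op, the
sketch below checks that a cast between same-sized GCC vector types
reinterprets the bits and does not convert the values. The helper names
(bits_of, doubles_of, cast_is_bit_reinterpretation) are hypothetical and not
part of the original test case; it assumes a GCC-compatible compiler with
vector subscripting and IEEE 754 doubles.

```c
#include <assert.h>

typedef double xmm_2f64_t __attribute__((vector_size (16)));
typedef long long xmm_2i64_t __attribute__((vector_size (16)));

/* Hypothetical helper: view the 128 bits of a double vector as two
   64-bit integers. No value conversion takes place. */
static xmm_2i64_t bits_of(xmm_2f64_t v) {
  return (xmm_2i64_t) v;
}

/* Hypothetical helper: view the same 128 bits as two doubles again. */
static xmm_2f64_t doubles_of(xmm_2i64_t v) {
  return (xmm_2f64_t) v;
}

static void cast_is_bit_reinterpretation(void) {
  xmm_2f64_t f = {1.0, 2.0};
  xmm_2i64_t i = bits_of(f);
  /* These are the IEEE 754 binary64 encodings of 1.0 and 2.0,
     not the integers 1 and 2. */
  assert(i[0] == 0x3FF0000000000000LL);
  assert(i[1] == 0x4000000000000000LL);
  /* The round trip reproduces the original lanes exactly. */
  xmm_2f64_t back = doubles_of(i);
  assert(back[0] == 1.0);
  assert(back[1] == 2.0);
}
```

Since the bit pattern is unchanged by the cast, no data movement is logically
required for it; whether the compiler actually emits a move is the subject of
this report.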


-- 
           Summary: Integer/Floating point vector casts generate XMM
                    register moves from and to the same register
           Product: gcc
           Version: 4.4.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: adam at consulting dot net dot nz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42596



* [Bug c/42596] Integer/Floating point vector casts generate XMM register moves from and to the same register
From: rguenth at gcc dot gnu dot org @ 2010-01-03 14:46 UTC
  To: gcc-bugs



------- Comment #1 from rguenth at gcc dot gnu dot org  2010-01-03 14:46 -------
*** Bug 42595 has been marked as a duplicate of this bug. ***



* [Bug c/42596] Integer/Floating point vector casts generate XMM register moves from and to the same register
From: rguenth at gcc dot gnu dot org @ 2010-01-03 14:50 UTC
  To: gcc-bugs



------- Comment #2 from rguenth at gcc dot gnu dot org  2010-01-03 14:50 -------
Likely because you are using hardregs.  Re-do the testcase with intrinsics
please.
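
One possible shape for an intrinsics version of the test case is sketched
below. It is an assumption about what such a rewrite could look like, not
code from the thread: the function names are hypothetical, the values are
passed as ordinary arguments instead of global register variables, and it
needs an x86 target with SSE2 and <emmintrin.h>.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: add two integer-typed vectors as if they held
   packed doubles. The _mm_cast* intrinsics only change the type. */
static __m128i add_pd_in_int_vectors(__m128i c, __m128i d) {
  __m128d sum = _mm_add_pd(_mm_castsi128_pd(c), _mm_castsi128_pd(d));
  return _mm_castpd_si128(sum);
}

static void intrinsics_example(void) {
  /* _mm_set_pd takes the high lane first: low = 1.0, high = 2.0. */
  __m128i c = _mm_castpd_si128(_mm_set_pd(2.0, 1.0));
  __m128i d = _mm_castpd_si128(_mm_set_pd(20.0, 10.0));
  double out[2];
  _mm_storeu_pd(out, _mm_castsi128_pd(add_pd_in_int_vectors(c, d)));
  assert(out[0] == 11.0);  /* low lane:  1.0 + 10.0 */
  assert(out[1] == 22.0);  /* high lane: 2.0 + 20.0 */
}
```

Comparing the disassembly of add_pd_in_int_vectors with the functions in the
report would show whether the extra moves are tied to the hard registers.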



* [Bug c/42596] Integer/Floating point vector casts generate XMM register moves from and to the same register
From: adam at consulting dot net dot nz @ 2010-01-03 23:20 UTC
  To: gcc-bugs



------- Comment #3 from adam at consulting dot net dot nz  2010-01-03 23:20 -------
This is a demo of poor code generation with XMM global register variables
("hardregs") and the vector extensions. Intrinsics are too low level. The
vector extensions can continue to work on a platform without x86 intrinsics
(just replace register variable definitions with, e.g., __thread variable
definitions on a foreign architecture).

test_fp_vectors_containing_fp_data() shows that code generation can be
optimal with XMM global register variables and the vector extensions, but only
if the register variables are declared with the same type as the calculation.
That is not a reasonable restriction for an XMM global register variable that
has to hold integer and/or floating point data over the lifetime of a program.
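
The portability point can be sketched as follows: the same cast-based
calculation, with the register variable definitions swapped for __thread
variables. The names tls_c, tls_d and the two functions are hypothetical
stand-ins for xmm_c and xmm_d and do not appear in the original test case;
the sketch assumes a GCC-compatible compiler with __thread support and
vector subscripting.

```c
#include <assert.h>

typedef double xmm_2f64_t __attribute__((vector_size (16)));
typedef long long xmm_2i64_t __attribute__((vector_size (16)));

/* Thread-local stand-ins for the global register variables. */
static __thread xmm_2i64_t tls_c;
static __thread xmm_2i64_t tls_d;

/* Same expression as test_int_vectors_containing_fp_data(), but with no
   x86-specific register names or intrinsics. */
static void add_fp_data_held_in_int_vectors(void) {
  tls_c = (xmm_2i64_t) ((xmm_2f64_t) tls_c + (xmm_2f64_t) tls_d);
}

static void tls_example(void) {
  xmm_2f64_t c = {1.0, 2.0};
  xmm_2f64_t d = {10.0, 20.0};
  tls_c = (xmm_2i64_t) c;  /* store floating point data in integer vectors */
  tls_d = (xmm_2i64_t) d;
  add_fp_data_held_in_int_vectors();
  xmm_2f64_t sum = (xmm_2f64_t) tls_c;
  assert(sum[0] == 11.0);
  assert(sum[1] == 22.0);
}
```

Only the two variable definitions differ from the register-variable version;
the calculation itself is unchanged.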



* [Bug c/42596] Integer/Floating point vector casts generate XMM register moves from and to the same register
From: adam at consulting dot net dot nz @ 2010-01-05  4:17 UTC
  To: gcc-bugs



------- Comment #4 from adam at consulting dot net dot nz  2010-01-05 04:17 -------
/* Workaround discovered! */
void test_int_vectors_containing_fp_data_using_local_reg_var_overlay() {
  //create local register variables of the required floating point type
  //(for the same global register variables)
  register xmm_2f64_t local_xmm_c __asm__("xmm6");
  register xmm_2f64_t local_xmm_d __asm__("xmm7");
  //same calculation upon the local register variables. No casts are required.
  local_xmm_c = local_xmm_c + local_xmm_d;
  //the local changes above will be optimised away unless the global register
  //variables are updated. The casts below should be a no-op as the local
  //register variables are aliased to the global register variables.
  xmm_c=(xmm_2i64_t) local_xmm_c;
  xmm_d=(xmm_2i64_t) local_xmm_d;
}


With this workaround, the generated code is optimal even when the global
register variables have an integer vector type:

0000000000400550 <test_int_vectors_containing_fp_data_using_local_reg_var_overlay>:
  400550:       66 0f 58 f7             addpd  xmm6,xmm7
  400554:       c3                      ret    



* [Bug c/42596] Integer/Floating point vector casts generate XMM register moves from and to the same register
From: rguenth at gcc dot gnu dot org @ 2010-01-05 10:47 UTC
  To: gcc-bugs



------- Comment #5 from rguenth at gcc dot gnu dot org  2010-01-05 10:47 -------
Uh, of course.

_Don't use global register variables_

They are not supposed to be used for this kind of thing, and nobody spends
a single second optimizing code generation for them. Instead, the most
difficult thing is to prevent the compiler from miscompiling things.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WONTFIX


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42596

