public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/19235] New: GCC generates SSE2 instructions for AthlonXP which doesn't support them.
@ 2005-01-03  2:04 drab at kepler dot fjfi dot cvut dot cz
  2005-01-03  7:46 ` [Bug rtl-optimization/19235] [4.0 regression] " aj at gcc dot gnu dot org
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: drab at kepler dot fjfi dot cvut dot cz @ 2005-01-03  2:04 UTC (permalink / raw)
  To: gcc-bugs

When compiling with the following CVS HEAD snapshot from 30.12.2004

---------------------------
# gcc -v
Using built-in specs.
Configured with: ../../../gcc-CVS-20041230/gcc-CVS-20041230/configure
--host=i686-pc-linux-gnu --prefix=/usr/local/opt/gcc-4.0
--exec-prefix=/usr/local/opt/gcc-4.0 --sysconfdir=/etc
--libdir=/usr/local/opt/gcc-4.0/lib --libexecdir=/usr/local/opt/gcc-4.0/libexec
--sharedstatedir=/var --localstatedir=/var --program-suffix=-4.0
--with-x-includes=/usr/X11R6/include --with-x-libraries=/usr/X11R6/lib
--enable-shared --enable-static --with-gnu-as --with-gnu-ld --enable-libada
--with-stabs --enable-threads=posix --enable-version-specific-runtime-libs
--disable-coverage --enable-gather-detailed-mem-stats --disable-libgcj
--disable-checking --enable-multilib --with-x --enable-cmath
--enable-libstdcxx-debug --enable-fast-character --enable-hash-synchronization
--enable-languages=c,c++,f95,objc,ada --with-system-zlib --with-libbanshee
--with-demangler-in-ld --with-arch=athlon-xp
Thread model: posix
gcc version 4.0.0 20041230 (experimental)
---------------------------

the following test code

-- test.c -----------------
typedef struct t1 { double a,b,c,d; } t1_t;
typedef struct t2 { double e; t1_t g[4]; } t2_t;
t2_t *H;

void f (t2_t *h)
{
        int i;

        for (i=0; i<4; i++)
        {
                h->g[i].a=h->e;
                h->g[i].b=h->e;
                h->g[i].c=h->e;
                h->g[i].d=h->e;
        }
}

int main (void)
{
        f(&H);
        return 0;
}
--------------------------------

compiled with

--------------------------------
gcc -O3 -march=athlon-xp -msse -mfpmath=sse -c test.c -o test.o
--------------------------------

produces the following code (listed by objdump, because it also shows the hex codes).

--------------------------------
test.o:     file format elf32-i386

Disassembly of section .text:

00000000 <f>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 55 08                mov    0x8(%ebp),%edx
   6:   8d 8a 80 00 00 00       lea    0x80(%edx),%ecx
   c:   89 d0                   mov    %edx,%eax
   e:   89 f6                   mov    %esi,%esi
  10:   f3 0f 7e 02             movq   (%edx),%xmm0
  14:   66 0f d6 40 08          movq   %xmm0,0x8(%eax)
  19:   f3 0f 7e 02             movq   (%edx),%xmm0
  1d:   66 0f d6 40 10          movq   %xmm0,0x10(%eax)
  22:   f3 0f 7e 02             movq   (%edx),%xmm0
  26:   66 0f d6 40 18          movq   %xmm0,0x18(%eax)
  2b:   f3 0f 7e 02             movq   (%edx),%xmm0
  2f:   66 0f d6 40 20          movq   %xmm0,0x20(%eax)
  34:   83 c0 20                add    $0x20,%eax
  37:   39 c8                   cmp    %ecx,%eax
  39:   75 d5                   jne    10 <f+0x10>
  3b:   c9                      leave
  3c:   c3                      ret
  3d:   8d 76 00                lea    0x0(%esi),%esi

00000040 <main>:
  40:   55                      push   %ebp
  41:   b8 00 00 00 00          mov    $0x0,%eax
  46:   89 e5                   mov    %esp,%ebp
  48:   83 ec 08                sub    $0x8,%esp
  4b:   83 e4 f0                and    $0xfffffff0,%esp
  4e:   83 ec 10                sub    $0x10,%esp
  51:   f3 0f 7e 05 00 00 00    movq   0x0,%xmm0
  58:   00
  59:   66 0f d6 40 08          movq   %xmm0,0x8(%eax)
  5e:   f3 0f 7e 05 00 00 00    movq   0x0,%xmm0
  65:   00
  66:   66 0f d6 40 10          movq   %xmm0,0x10(%eax)
  6b:   f3 0f 7e 05 00 00 00    movq   0x0,%xmm0
  72:   00
  73:   66 0f d6 40 18          movq   %xmm0,0x18(%eax)
  78:   f3 0f 7e 05 00 00 00    movq   0x0,%xmm0
  7f:   00
  80:   66 0f d6 40 20          movq   %xmm0,0x20(%eax)
  85:   83 c0 20                add    $0x20,%eax
  88:   3d 80 00 00 00          cmp    $0x80,%eax
  8d:   75 c2                   jne    51 <main+0x11>
  8f:   c9                      leave
  90:   31 c0                   xor    %eax,%eax
  92:   c3                      ret
------------------------------------

All instructions at offsets 10-2f and 51-80 belong to the SSE2 instruction set
(at least according to the AMD documentation) and are therefore not supported
by the AthlonXP, which fails to run this code with an illegal instruction error.
I also tried adding -mno-sse2, but with no luck.
GCC 3.4.1 doesn't seem to have this problem.
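
For anyone reproducing this: whether a CPU advertises SSE2 can be checked at
run time through CPUID leaf 1, where EDX bit 25 reports SSE and EDX bit 26
reports SSE2. The following is only a minimal sketch, assuming a GCC-compatible
compiler that ships <cpuid.h>; the helper names edx_has_sse, edx_has_sse2 and
read_feature_edx are hypothetical and not part of the original report.

```c
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid() */
#define HAVE_CPUID_H 1
#endif

/* CPUID leaf 1 feature flags in EDX (hypothetical macro names). */
#define EDX_SSE  (1u << 25)
#define EDX_SSE2 (1u << 26)

/* Decode the SSE bit from a leaf-1 EDX value. */
int edx_has_sse(unsigned edx)
{
    return (edx & EDX_SSE) != 0;
}

/* Decode the SSE2 bit from a leaf-1 EDX value. */
int edx_has_sse2(unsigned edx)
{
    return (edx & EDX_SSE2) != 0;
}

/* Read leaf-1 EDX. Returns 1 on success, 0 if CPUID leaf 1 cannot be
   queried (including on non-x86 targets, where this is a stub). */
int read_feature_edx(unsigned *edx)
{
#ifdef HAVE_CPUID_H
    unsigned eax, ebx, ecx;
    return __get_cpuid(1, &eax, &ebx, &ecx, edx);
#else
    (void)edx;
    return 0;
#endif
}
```

On an AthlonXP one would expect edx_has_sse() to report 1 and edx_has_sse2()
to report 0 for the value filled in by read_feature_edx(), which matches the
illegal instruction fault described above.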

-- 
           Summary: GCC generates SSE2 instructions for AthlonXP which
                    doesn't support them.
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: critical
          Priority: P1
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: drab at kepler dot fjfi dot cvut dot cz
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug rtl-optimization/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: aj at gcc dot gnu dot org @ 2005-01-03  7:46 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From aj at gcc dot gnu dot org  2005-01-03 07:46 -------
Confirmed with gcc 4.0.0 20050102. 
 
Adding Uros since he made recent patches in this area, and also Honza as 
another expert. 

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uros at kss-loka dot si,
                   |                            |hubicka at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
           Keywords|                            |wrong-code
   Last reconfirmed|0000-00-00 00:00:00         |2005-01-03 07:46:28
               date|                            |
            Summary|GCC generates SSE2          |[4.0 regression] GCC
                   |instructions for AthlonXP   |generates SSE2 instructions
                   |which doesn't support them. |for AthlonXP which doesn't
                   |                            |support them.
   Target Milestone|---                         |4.0.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: uros at kss-loka dot si @ 2005-01-03 10:30 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2005-01-03 10:30 -------
This bug could be caused by:

2004-12-17  Richard Henderson  <rth@redhat.com>

	* config/i386/i386.c (x86_64_reg_class_name): Re-indent.
	...
	(movsi_1): Use 'x' instead of 'Y' constraints.
	(movsi_1_nointernunit, movdi_2, movdi_1_rex64): Likewise.
	(movdi_1_rex64_nointerunit): Likewise.
	(movdf_nointeger, movdf_integer): Likewise.  Handle SSE1.       <<<
	...

There is a problem with the register constraints in the movdf_nointeger pattern.
It looks like 'x' should be replaced by 'Y'. A patch is in testing.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rth at gcc dot gnu dot org
         AssignedTo|unassigned at gcc dot gnu   |uros at kss-loka dot si
                   |dot org                     |
             Status|NEW                         |ASSIGNED
          Component|rtl-optimization            |target
   Last reconfirmed|2005-01-03 07:46:28         |2005-01-03 10:30:10
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: uros at kss-loka dot si @ 2005-01-03 10:40 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2005-01-03 10:40 -------
Patch here: http://gcc.gnu.org/ml/gcc-patches/2005-01/msg00063.html

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: uros at kss-loka dot si @ 2005-01-03 12:44 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2005-01-03 12:44 -------
I don't have a PIII to test, but http://gcc.gnu.org/ml/gcc/2005-01/msg00114.html
could be related to this bug.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: andersca at gnome dot org @ 2005-01-03 12:52 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From andersca at gnome dot org  2005-01-03 12:52 -------
I'm doing a make bootstrap on this right now; I'll report the test results as soon as it finishes.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: andersca at gnome dot org @ 2005-01-03 15:21 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From andersca at gnome dot org  2005-01-03 15:21 -------
The patch does not fix the problem; the xmm registers are used anyway...

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: drab at kepler dot fjfi dot cvut dot cz @ 2005-01-03 15:34 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From drab at kepler dot fjfi dot cvut dot cz  2005-01-03 15:34 -------
(In reply to comment #6)
> The patch does not fix the problem; the xmm registers are used anyway...

SSE instructions also use the xmm registers! The problem isn't the use of the xmm
registers (frankly, I hope using SSE to copy will speed things up a bit); the
problem is that those "f3 0f 7e ..." and "66 0f d6 ..." instructions (that's why
I listed them with objdump) belong to the SSE2 instruction set. If it used only
SSE instructions instead (also with xmm regs, of course), that would be fine.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: drab at kepler dot fjfi dot cvut dot cz @ 2005-01-03 15:44 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From drab at kepler dot fjfi dot cvut dot cz  2005-01-03 15:44 -------
For a list of the SSE instructions available on the AthlonXP (there they are
called the "3DNow! Professional" instruction set), see for instance the document
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf
on pages 301-303.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: cvs-commit at gcc dot gnu dot org @ 2005-01-03 15:56 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From cvs-commit at gcc dot gnu dot org  2005-01-03 15:56 -------
Subject: Bug 19235

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	uros@gcc.gnu.org	2005-01-03 15:56:17

Modified files:
	gcc/testsuite  : ChangeLog 
Added files:
	gcc/testsuite/gcc.dg: pr19236-1.c 

Log message:
	PR target/19235
	* gcc.dg/pr19236-1.c: New test case.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&r1=1.4837&r2=1.4838
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.dg/pr19236-1.c.diff?cvsroot=gcc&r1=NONE&r2=1.1



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: andersca at gnome dot org @ 2005-01-03 19:47 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From andersca at gnome dot org  2005-01-03 19:46 -------
Looking at the Intel reference documentation available from
ftp://download.intel.com/design/Pentium4/manuals/25366614.pdf, MOVQ has the
following opcodes:

0F 6F /r MOVQ mm, mm/m64 Move quadword from mm/m64 to mm. 
0F 7F /r MOVQ mm/m64, mm Move quadword from mm to mm/m64. 
F3 0F 7E MOVQ xmm1, xmm2/m64 Move quadword from xmm2/mem64 to xmm1.
66 0F D6 MOVQ xmm2/m64, xmm1 Move quadword from xmm1 to xmm2/mem64.

and since the latter two instructions are unsupported on the AthlonXP and
Pentium III, you would need some other way to move data between the xmm
registers and memory.
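
To make the distinction between the four encodings above concrete, here is a
small sketch that looks at the first bytes of an instruction and reports
whether it is one of the MMX or one of the SSE2 forms of MOVQ quoted in the
table. It is not a disassembler: it only matches the four byte patterns listed
above, and the names movq_kind and classify_movq are hypothetical.

```c
#include <stddef.h>

enum movq_kind { MOVQ_NONE, MOVQ_MMX, MOVQ_SSE2 };

/* Classify the bytes at code[0..len) against the four MOVQ encodings
   quoted above. Anything else is reported as MOVQ_NONE. */
enum movq_kind classify_movq(const unsigned char *code, size_t len)
{
    /* SSE2 forms: a mandatory prefix (F3 or 66) followed by 0F xx. */
    if (len >= 3 && code[1] == 0x0F) {
        if (code[0] == 0xF3 && code[2] == 0x7E)
            return MOVQ_SSE2;          /* F3 0F 7E: MOVQ xmm1, xmm2/m64 */
        if (code[0] == 0x66 && code[2] == 0xD6)
            return MOVQ_SSE2;          /* 66 0F D6: MOVQ xmm2/m64, xmm1 */
    }
    /* MMX forms: 0F 6F and 0F 7F without a prefix. */
    if (len >= 2 && code[0] == 0x0F &&
        (code[1] == 0x6F || code[1] == 0x7F))
        return MOVQ_MMX;
    return MOVQ_NONE;
}
```

Fed with the bytes from the objdump listing in the original report
(f3 0f 7e 02 and 66 0f d6 40 08), this reports MOVQ_SSE2 for both.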


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: kcook at gcc dot gnu dot org @ 2005-01-03 22:18 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From kcook at gcc dot gnu dot org  2005-01-03 22:18 -------
(In reply to comment #4)
> I don't have a PIII to test, but http://gcc.gnu.org/ml/gcc/2005-01/msg00114.html
> could be related to this bug.

Anders,
Your bug, though similar, is not the same and will need a new PR.
I'm pretty sure Uros's fix is correct for this PR19235.

Your bug comes about because the movq instruction is attempting to use XMM
registers, but that capability was not added until SSE2. The movq instruction
is being generated from both *movdi_2 and *movv2si_internal in your example.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: rth at gcc dot gnu dot org @ 2005-01-04  0:03 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rth at gcc dot gnu dot org  2005-01-04 00:03 -------
*** Bug 19174 has been marked as a duplicate of this bug. ***

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andre dot maute at gmx dot
                   |                            |de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: cvs-commit at gcc dot gnu dot org @ 2005-01-04 10:01 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From cvs-commit at gcc dot gnu dot org  2005-01-04 10:01 -------
Subject: Bug 19235

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	rth@gcc.gnu.org	2005-01-04 10:00:57

Modified files:
	gcc            : ChangeLog 
	gcc/config/i386: i386.md 

Log message:
	PR target/19235
	* config/i386/i386.md (movdi_2): Separate SSE1 and SSE2 alternatives.
	(mov<MMXMODEI>_internal): Likewise.
	(movdf_nointeger): Prefer Y while not preferring, but allowing, x.
	Add V2SF case; use it for SSE1; don't use TI.
	(movdf_integer): Likewise.
	(mov<SSEMODEI>_internal, movti_internal): Force V4SF for SSE1.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.7014&r2=2.7015
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&r1=1.597&r2=1.598



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: rth at gcc dot gnu dot org @ 2005-01-04 10:04 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rth at gcc dot gnu dot org  2005-01-04 10:04 -------
Fixed.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: drab at kepler dot fjfi dot cvut dot cz @ 2005-01-04 13:51 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From drab at kepler dot fjfi dot cvut dot cz  2005-01-04 13:51 -------
(In reply to comment #10)
> Looking at the Intel reference documentation available from
> ftp://download.intel.com/design/Pentium4/manuals/25366614.pdf MOVQ has the
> following opcodes:
> 
> 0F 6F /r MOVQ mm, mm/m64 Move quadword from mm/m64 to mm. 
> 0F 7F /r MOVQ mm/m64, mm Move quadword from mm to mm/m64. 
> F3 0F 7E MOVQ xmm1, xmm2/m64 Move quadword from xmm2/mem64 to xmm1.
> 66 0F D6 MOVQ xmm2/m64, xmm1 Move quadword from xmm1 to xmm2/mem64.
> 
> and since the two latter instructions are unsupported on AMD and Pentium III
> you would need some other way to move data between the xmm registers and
> memory.

Those 0F 6F and 0F 7F are, however, standard MMX instructions. So when you use,
for instance, -msse -mfpmath=sse -mno-mmx, those shouldn't be used either (I
don't know why anybody would want to do that, but...). However, when it is used
only for copying (as in the example that I proposed), there are other ways, such
as using the following instructions:

0F 12 /r MOVLPS xmm, mem64
0F 13 /r MOVLPS mem64, xmm

and even more

0F 16 /r MOVHPS xmm, mem64
0F 17 /r MOVHPS mem64, xmm

It's true that those are meant for moving two single-precision floats (into the
lower or upper half of the xmm reg.), but since it's only a move, that doesn't
matter: it just copies those 64 bits into either the lower or the upper 64 bits
of the xmm register. These could come in quite handy, since they leave the
mmx/st registers available for other uses, and when we consider only 64-bit
memory accesses, they effectively provide double the number of xmm registers as
additional 64-bit registers. I think that might be worth considering, don't you?
And it is SSE only, so AthlonXP, PIII and others might benefit from it.
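
As an illustration of the idea, the copy can be sketched in C with the SSE1
intrinsics from <xmmintrin.h>: _mm_loadl_pi and _mm_storel_pi correspond to the
load and store forms of MOVLPS. This is only a sketch under the assumption that
the compiler defines __SSE__ when SSE is enabled (as GCC does); the function
name copy_double_bits is hypothetical, and the compiler remains free to pick a
different instruction for the intrinsics.

```c
#include <string.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE1 intrinsics: _mm_loadl_pi, _mm_storel_pi */
#endif

/* Copy the 64 bits of one double from src to dst through the low half of
   an xmm register, without ever interpreting them as a floating-point
   value. Falls back to memcpy when SSE is not enabled at compile time. */
void copy_double_bits(double *dst, const double *src)
{
#if defined(__SSE__)
    __m128 tmp = _mm_setzero_ps();
    tmp = _mm_loadl_pi(tmp, (const __m64 *)src);  /* MOVLPS xmm, mem64 */
    _mm_storel_pi((__m64 *)dst, tmp);             /* MOVLPS mem64, xmm */
#else
    memcpy(dst, src, sizeof *dst);                /* portable fallback */
#endif
}
```

Because the data only passes through the register and is never used in an
arithmetic instruction, the result is a bit-for-bit copy even though the
register is nominally holding single-precision values.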


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: kcook at gcc dot gnu dot org @ 2005-01-04 15:06 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From kcook at gcc dot gnu dot org  2005-01-04 15:06 -------
> (In reply to comment #10)
> Those 0F 6F and 0F 7F are, however, standard MMX instructions. So when you
> use for instance -msse -mfpmath=sse -mno-mmx those shouldn't be used as well
> (don't know why would anybody want to do that, but...). However when it is
> used only for copying (as in the example, that I proposed), there are other
> ways, such as using the following instructions:
> 
> 0F 12 /r MOVLPS xmm, mem64
> 0F 13 /r MOVLPS mem64, xmm
> 0F 16 /r MOVHPS xmm, mem64
> 0F 17 /r MOVHPS mem64, xmm

If you look at Richard's patch, the compiler will use MOVLPS to load into an XMM
register when only SSE1 is available.

Anders' test case now looks like:
        mov     eax, 19088743    # 54   *movsi_1/1      [length = 5]
        mov     edx, -1985229329         # 56   *movsi_1/1      [length = 5]
        mov     DWORD PTR _e64+4, eax    # 55   *movsi_1/2      [length = 6]
        mov     DWORD PTR _e64, edx      # 57   *movsi_1/2      [length = 6]
        movlps  xmm0, QWORD PTR _e64     # 27   *movv2si_internal/11    [length = 7]
        movlps  QWORD PTR _m64_64, xmm0  # 28   *movv2si_internal/12    [length = 7]

Your test case now won't use SSE at all (reverting to x87 instructions) on an
athlon-xp/pentium3, which I believe is the correct behavior, as SSE1 doesn't
handle double-precision floating point.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: drab at kepler dot fjfi dot cvut dot cz @ 2005-01-04 15:48 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From drab at kepler dot fjfi dot cvut dot cz  2005-01-04 15:47 -------
(In reply to comment #16)
> > (In reply to comment #10)
> If you look at Richard's patch, the compiler will use MOVLPS into XMM register
> when only SSE1 is available.

Yes, I noticed. That's good. Thanks. :) 
 
> Your test case now won't use SSE at all (reverting to x87 instruction) on an
> athlon-xp/pentium3, which I believe is the correct behavior as SSE1 doesn't do
> floating point doubles.

It doesn't, but as I said earlier, it can still be used for copying ANY 64-bit
piece of memory (even doubles). It may not be necessary in the case of my test
code, but I can imagine a situation where the mmx or st regs. (which are mapped
to the same place, AFAIK) are all occupied. Then SSE1 can be used for copying
doubles, even though the data in the xmm reg. is nominally interpreted
differently. That's just what I wanted to say.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: andersca at gnome dot org @ 2005-01-04 18:09 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From andersca at gnome dot org  2005-01-04 18:09 -------
Confirming that this does fix the error for me. Thanks a lot!

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: belyshev at depni dot sinp dot msu dot ru @ 2005-01-04 21:03 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From belyshev at depni dot sinp dot msu dot ru  2005-01-04 21:03 -------
*** Bug 19107 has been marked as a duplicate of this bug. ***

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
OtherBugsDependingO|19107                       |
              nThis|                            |
                 CC|                            |belyshev at depni dot sinp
                   |                            |dot msu dot ru


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235


* [Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.
From: pinskia at gcc dot gnu dot org @ 2005-06-05 22:54 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-06-05 22:54 -------
*** Bug 19107 has been marked as a duplicate of this bug. ***

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19235

