public inbox for gcc-bugs@sourceware.org
* [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed
@ 2004-03-12 13:15 michaelni at gmx dot at
  2004-03-12 13:20 ` [Bug optimization/14552] " michaelni at gmx dot at
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: michaelni at gmx dot at @ 2004-03-12 13:15 UTC (permalink / raw)
  To: gcc-bugs

See attached source, gcc  -O3 -mtune=pentium3 -march=pentium3 -S 
generates: 
test: 
        movq    w, %mm1 
        pushl   %ebp 
        movl    %esp, %ebp 
        popl    %ebp 
        psllw   $1, %mm1 
        movq    %mm1, w 
        movq    w, %mm0 
        movq    %mm0, dw 
        ret 
 
human generates: 
movq w, %mm1 
paddw %mm1,%mm1 
movq %mm1, w 
movq %mm1,dw 
ret
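
The attached source is not reproduced in this archive. A minimal sketch of what such a test case might look like, assuming GCC's generic vector extensions and two 8-byte globals named `w` and `dw` as in the listings above. The typedef name `v4hi`, the function name `double_w` (the listing's label is `test`), and the exact body are guesses, not the reporter's code:

```c
/* Hypothetical reconstruction; the real attachment is not shown here. */
typedef short v4hi __attribute__((vector_size(8)));  /* four 16-bit lanes */

v4hi w, dw;

/* Double every lane of w, then copy the result to dw. */
void double_w(void)
{
    w = w + w;
    dw = w;
}
```

Compiling a file like this with the flags from the report and `-S` is one way to compare the output against the hand-written sequence.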

-- 
           Summary: compiled trivial vector intrinsic code contains nearly
                    twice as many instructions as needed
           Product: gcc
           Version: 3.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: michaelni at gmx dot at
                CC: gcc-bugs at gcc dot gnu dot org
  GCC host triplet: pentium3-debian-linux
GCC target triplet: pentium3-debian-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug optimization/14552] compiled trivial vector intrinsic code contains nearly twice as many instructions as needed
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
@ 2004-03-12 13:20 ` michaelni at gmx dot at
  2004-03-12 15:47 ` pinskia at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: michaelni at gmx dot at @ 2004-03-12 13:20 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From michaelni at gmx dot at  2004-03-12 13:20 -------
Created an attachment (id=5906)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=5906&action=view)
source to generate the well optimized code


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug optimization/14552] compiled trivial vector intrinsic code contains nearly twice as many instructions as needed
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
  2004-03-12 13:20 ` [Bug optimization/14552] " michaelni at gmx dot at
@ 2004-03-12 15:47 ` pinskia at gcc dot gnu dot org
  2004-03-12 16:26 ` michaelni at gmx dot at
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-12 15:47 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-12 15:46 -------
The problem is that you also need -fomit-frame-pointer to get the same code as the human-generated
code:
test:
        movq    w, %mm1
        psllw   $1, %mm1
        movq    %mm1, w
        movq    w, %mm0
        movq    %mm0, dw
        ret

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug optimization/14552] compiled trivial vector intrinsic code contains nearly twice as many instructions as needed
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
  2004-03-12 13:20 ` [Bug optimization/14552] " michaelni at gmx dot at
  2004-03-12 15:47 ` pinskia at gcc dot gnu dot org
@ 2004-03-12 16:26 ` michaelni at gmx dot at
  2004-03-12 16:30 ` [Bug optimization/14552] compiled trivial vector intrinsic code is ineffiencent pinskia at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: michaelni at gmx dot at @ 2004-03-12 16:26 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From michaelni at gmx dot at  2004-03-12 16:26 -------
Sorry, no, that's not the same code: it has one more instruction, uses a shift
instead of an addition, and writes the value to memory and reads it back
immediately afterwards. Anyway, I am not surprised that the bug report got
closed immediately.
 
gcc: 
movq    w, %mm1 
psllw   $1, %mm1     <------- 
movq    %mm1, w 
movq    w, %mm0     <------ 
movq    %mm0, dw 
 
human: 
movq w, %mm1  
paddw %mm1,%mm1  
movq %mm1, w  
movq %mm1,dw  
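
The two sequences differ in instruction choice but not in result: shifting a 16-bit lane left by one produces the same bits as adding the lane to itself. A small scalar check of that identity, with helper names made up for illustration:

```c
#include <stdint.h>

/* One 16-bit lane of psllw $1: shift left by one, keep the low 16 bits. */
static uint16_t lane_shift(uint16_t x) { return (uint16_t)(x << 1); }

/* One 16-bit lane of paddw x,x: wrapping 16-bit addition. */
static uint16_t lane_add(uint16_t x) { return (uint16_t)(x + x); }

/* Returns 1 if both forms agree for every 16-bit input, else 0. */
static int shift_equals_add(void)
{
    for (uint32_t x = 0; x <= 0xFFFF; x++)
        if (lane_shift((uint16_t)x) != lane_add((uint16_t)x))
            return 0;
    return 1;
}
```

Since the results are identical, the choice between the two is purely a matter of cost on the target CPU.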

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug optimization/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (2 preceding siblings ...)
  2004-03-12 16:26 ` michaelni at gmx dot at
@ 2004-03-12 16:30 ` pinskia at gcc dot gnu dot org
  2004-03-12 16:38 ` pinskia at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-12 16:30 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-12 16:30 -------
Okay so reopening it.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
           Keywords|                            |pessimizes-code
         Resolution|INVALID                     |
            Summary|compiled trivial vector     |compiled trivial vector
                   |intrinsic code contains     |intrinsic code is
                   |nearly twice as many        |ineffiencent
                   |instructions as needed      |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug optimization/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (3 preceding siblings ...)
  2004-03-12 16:30 ` [Bug optimization/14552] compiled trivial vector intrinsic code is ineffiencent pinskia at gcc dot gnu dot org
@ 2004-03-12 16:38 ` pinskia at gcc dot gnu dot org
  2004-03-12 17:11 ` michaelni at gmx dot at
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-12 16:38 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-12 16:38 -------
Using a temporary variable I can get it down to 5 instructions (including the return):
        movq    w, %mm0
        psllw   $1, %mm0
        movq    %mm0, dw
        movq    %mm0, w
        ret

The problem is that global variables cause the pessimized code, so this is a dup of bug 12395.
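
The rewritten source is not shown in this comment. One way it might look, assuming the same GCC vector-extension globals `w` and `dw`; the typedef and function names are hypothetical:

```c
/* Hypothetical sketch of the "temporary variable" variant. */
typedef short v4hi __attribute__((vector_size(8)));  /* four 16-bit lanes */

v4hi w, dw;

void double_w_via_temp(void)
{
    v4hi t = w + w;  /* compute the doubled value once, in a local */
    dw = t;          /* both stores read the local, so w is never re-read */
    w = t;
}
```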

*** This bug has been marked as a duplicate of 12395 ***

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug optimization/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (4 preceding siblings ...)
  2004-03-12 16:38 ` pinskia at gcc dot gnu dot org
@ 2004-03-12 17:11 ` michaelni at gmx dot at
  2004-03-12 17:15 ` pinskia at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: michaelni at gmx dot at @ 2004-03-12 17:11 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From michaelni at gmx dot at  2004-03-12 17:11 -------
And the addition vs. shift issue? On the P3, MMX additions can execute
on port 0 or 1, while MMX shifts can only execute on port 1.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug optimization/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (5 preceding siblings ...)
  2004-03-12 17:11 ` michaelni at gmx dot at
@ 2004-03-12 17:15 ` pinskia at gcc dot gnu dot org
  2004-04-07  3:00 ` pinskia at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-12 17:15 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-12 17:15 -------
That is a tuning issue.  The problem is that CSE selects the shift.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug optimization/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (6 preceding siblings ...)
  2004-03-12 17:15 ` pinskia at gcc dot gnu dot org
@ 2004-04-07  3:00 ` pinskia at gcc dot gnu dot org
  2004-05-31  3:01 ` [Bug target/14552] " pinskia at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-04-07  3:00 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-04-07 03:00 -------
I already confirmed this.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2004-04-07 03:00:15
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (7 preceding siblings ...)
  2004-04-07  3:00 ` pinskia at gcc dot gnu dot org
@ 2004-05-31  3:01 ` pinskia at gcc dot gnu dot org
  2005-01-12  6:26 ` pinskia at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-05-31  3:01 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
          Component|rtl-optimization            |target


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (8 preceding siblings ...)
  2004-05-31  3:01 ` [Bug target/14552] " pinskia at gcc dot gnu dot org
@ 2005-01-12  6:26 ` pinskia at gcc dot gnu dot org
  2005-01-12  6:32 ` pinskia at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-12  6:26 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-01-12 06:26 -------
I will have to file a new bug for this, as we now produce much worse code on the mainline. That is
because we expand the + into four separate scalar additions instead of using the SSE/MMX unit, which
is just plainly wrong.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |minor


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (9 preceding siblings ...)
  2005-01-12  6:26 ` pinskia at gcc dot gnu dot org
@ 2005-01-12  6:32 ` pinskia at gcc dot gnu dot org
  2005-01-12 15:31 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-12  6:32 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |19391


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (10 preceding siblings ...)
  2005-01-12  6:32 ` pinskia at gcc dot gnu dot org
@ 2005-01-12 15:31 ` pinskia at gcc dot gnu dot org
  2005-01-18 11:34 ` rth at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-12 15:31 UTC (permalink / raw)
  To: gcc-bugs



-- 
Bug 14552 depends on bug 19391, which changed state.

Bug 19391 Summary: [4.0 Regression] missed optimization with size of 8 vectors
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19391

           What    |Old Value                   |New Value
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WONTFIX

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (11 preceding siblings ...)
  2005-01-12 15:31 ` pinskia at gcc dot gnu dot org
@ 2005-01-18 11:34 ` rth at gcc dot gnu dot org
  2005-04-05  1:52 ` pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rth at gcc dot gnu dot org @ 2005-01-18 11:34 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rth at gcc dot gnu dot org  2005-01-18 11:34 -------
No, Andrew, mainline is not plainly wrong.  We are correctly not using the 
MMX unit when <mmintrin.h> is not in use.  The instruction selection thing
can still be seen with the SSE unit though, if you widen the vectors to 16
bytes.

The problem is that ix86_rtx_costs has no idea about the cost of vector
operations.  For what little it's worth, K8 thinks paddw and psllw are
equivalent -- both can be issued to fadd or fmul pipelines.
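
Widening the vectors to 16 bytes, as suggested above, might look like the sketch below. The names `v8hi`, `w16`, `dw16` and `double_w16` are made up; compiling with something like `-O2 -msse2 -S` on an x86 target is one way to see whether a shift or an add is selected:

```c
/* Hypothetical 16-byte variant of the test case, using GCC vector extensions. */
typedef short v8hi __attribute__((vector_size(16)));  /* eight 16-bit lanes */

v8hi w16, dw16;

/* Double every lane of w16, then copy the result to dw16. */
void double_w16(void)
{
    w16 = w16 + w16;
    dw16 = w16;
}
```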

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.1.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (12 preceding siblings ...)
  2005-01-18 11:34 ` rth at gcc dot gnu dot org
@ 2005-04-05  1:52 ` pinskia at gcc dot gnu dot org
  2005-06-22 10:14 ` uros at kss-loka dot si
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-04-05  1:52 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.1.0                       |---


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (13 preceding siblings ...)
  2005-04-05  1:52 ` pinskia at gcc dot gnu dot org
@ 2005-06-22 10:14 ` uros at kss-loka dot si
  2005-07-21  8:47 ` uros at kss-loka dot si
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: uros at kss-loka dot si @ 2005-06-22 10:14 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2005-06-22 10:14 -------
Just for fun, I have compiled the testcase with the MMX/x87 mode switching patch
included, to check MMX vector extensions. This little patch is needed to enable
MMX vector extensions (only the MMX vector add expander is shown):

diff -upr /export/home/uros/gcc-back/gcc/config/i386/i386.h i386/i386.h
--- /export/home/uros/gcc-back/gcc/config/i386/i386.h	2005-06-08 07:05:22.000000000 +0200
+++ i386/i386.h	2005-06-22 10:41:31.000000000 +0200
@@ -843,7 +845,8 @@ do {									\
 
 /* ??? No autovectorization into MMX or 3DNOW until we can reliably
    place emms and femms instructions.  */
-#define UNITS_PER_SIMD_WORD (TARGET_SSE ? 16 : UNITS_PER_WORD)
+#define UNITS_PER_SIMD_WORD						\
+    (TARGET_SSE ? 16 : TARGET_MMX ? 8 : UNITS_PER_WORD)
 
 #define VALID_FP_MODE_P(MODE)						\
     ((MODE) == SFmode || (MODE) == DFmode || (MODE) == XFmode		\
diff -upr /export/home/uros/gcc-back/gcc/config/i386/mmx.md i386/mmx.md
--- /export/home/uros/gcc-back/gcc/config/i386/mmx.md	2005-04-20 21:56:15.000000000 +0200
+++ i386/mmx.md	2005-06-22 11:00:35.000000000 +0200
@@ -553,6 +553,13 @@
 ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
+(define_expand "add<mode>3"
+  [(set (match_operand:MMXMODEI 0 "register_operand" "")
+	(plus:MMXMODEI (match_operand:MMXMODEI 1 "nonimmediate_operand" "")
+		       (match_operand:MMXMODEI 2 "nonimmediate_operand" "")))]
+  "TARGET_MMX"
+  "ix86_fixup_binary_operands_no_copy (PLUS, <MODE>mode, operands);")
+
 (define_insn "mmx_add<mode>3"
   [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
         (plus:MMXMODEI

After that, the testcase from the description is compiled to (with
-fomit-frame-pointer):

test:
	movq	w, %mm0
	paddw	%mm0, %mm0
	movq	%mm0, w
	movq	%mm0, dw
	emms
	ret
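
The trailing `emms` in that listing clears the MMX state so that later x87 floating-point code can use the shared registers. When MMX code is written by hand with `<mmintrin.h>`, the same step is done with `_mm_empty()`. A sketch of the hand-written equivalent, assuming plain arrays named `w` and `dw` and a made-up function name; the scalar fallback is only there so the file still compiles where MMX is unavailable:

```c
#include <string.h>

#if defined(__MMX__)
#include <mmintrin.h>
#endif

short w[4], dw[4];

/* Double the four 16-bit lanes of w and copy the result to dw. */
void double_w_mmx(void)
{
#if defined(__MMX__)
    __m64 v;
    memcpy(&v, w, sizeof v);   /* load 8 bytes, like movq w, %mm0 */
    v = _mm_add_pi16(v, v);    /* paddw: lane-wise 16-bit add */
    memcpy(w, &v, sizeof v);   /* store back to w */
    memcpy(dw, &v, sizeof v);  /* store the same register value to dw */
    _mm_empty();               /* emms: release MMX state for x87 code */
#else
    /* Portable fallback with the same lane-wise result for small values. */
    for (int i = 0; i < 4; i++) {
        w[i] = (short)(w[i] + w[i]);
        dw[i] = w[i];
    }
#endif
}
```

With intrinsics the programmer decides where `_mm_empty()` goes; with vector extensions the compiler has to place `emms` itself, which is what the mode switching patch is for.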



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uros at kss-loka dot si


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (14 preceding siblings ...)
  2005-06-22 10:14 ` uros at kss-loka dot si
@ 2005-07-21  8:47 ` uros at kss-loka dot si
  2005-09-13 21:09 ` fjahanian at apple dot com
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: uros at kss-loka dot si @ 2005-07-21  8:47 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2005-07-21 08:42 -------
You can patch the mainline 4.1 compiler with the patch at
http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01128.html. The patch (which is
currently awaiting review) makes gcc produce optimal code:

'gcc -O2 -mmmx -fomit-frame-pointer'

test:
	movq	w, %mm0
	paddw	%mm0, %mm0
	movq	%mm0, w
	movq	%mm0, dw
	emms
	ret




-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |uros at kss-loka dot si
                   |dot org                     |
                URL|                            |http://gcc.gnu.org/ml/gcc-
                   |                            |patches/2005-
                   |                            |07/msg01128.html
             Status|NEW                         |ASSIGNED
           Keywords|                            |patch
   Last reconfirmed|2004-11-22 03:37:09         |2005-07-21 08:42:15
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (15 preceding siblings ...)
  2005-07-21  8:47 ` uros at kss-loka dot si
@ 2005-09-13 21:09 ` fjahanian at apple dot com
  2005-09-13 21:13 ` pinskia at gcc dot gnu dot org
  2005-09-15 11:39 ` uros at kss-loka dot si
  18 siblings, 0 replies; 20+ messages in thread
From: fjahanian at apple dot com @ 2005-09-13 21:09 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From fjahanian at apple dot com  2005-09-13 21:09 -------
Hello,

What is the status of Uros's patches in:

http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01128.html

Looks like they did not make it to FSF mainline? Are there remaining issues with them?



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (16 preceding siblings ...)
  2005-09-13 21:09 ` fjahanian at apple dot com
@ 2005-09-13 21:13 ` pinskia at gcc dot gnu dot org
  2005-09-15 11:39 ` uros at kss-loka dot si
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-09-13 21:13 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-13 21:13 -------
(In reply to comment #13)
> Are there remaining issues with them?

Yes, it does not work when configuring gcc with --with-cpu=pentium4 see PR 19161.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |19161


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552



* [Bug target/14552] compiled trivial vector intrinsic code is ineffiencent
  2004-03-12 13:15 [Bug optimization/14552] New: compiled trivial vector intrinsic code contains nearly twice as many instructions as needed michaelni at gmx dot at
                   ` (17 preceding siblings ...)
  2005-09-13 21:13 ` pinskia at gcc dot gnu dot org
@ 2005-09-15 11:39 ` uros at kss-loka dot si
  18 siblings, 0 replies; 20+ messages in thread
From: uros at kss-loka dot si @ 2005-09-15 11:39 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2005-09-15 11:39 -------
(In reply to comment #14)

> Yes, it does not work when configuring gcc with --with-cpu=pentium4 see PR 19161.

No, the patch works OK for pentium4. The remaining problem is in the
optimize_mode_switching() function. For a certain loop layout, o_m_s could
insert emms and efpu insns in such a way that both register sets are blocked.

Because emms/efpu insertion depends heavily on o_m_s functionality, this 
infrastructure should be upgraded as explained in PR 19161.

(BTW: One of the design goals was to ICE instead of generating wrong code. It
looks like this goal was achieved :)

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552


