public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/53190] New: CSE (?) CSEs asms
@ 2012-05-02 12:47 rguenth at gcc dot gnu.org
  2012-05-02 12:48 ` [Bug rtl-optimization/53190] " rguenth at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-05-02 12:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

             Bug #: 53190
           Summary: CSE (?) CSEs asms
    Classification: Unclassified
           Product: gcc
           Version: 4.7.1
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: rguenth@gcc.gnu.org


For the attached testcase CSE ends up CSEing

(insn 2556 2555 2557 128 (parallel [
            (set (reg:DI 542 [ __res ])
                (asm_operands:DI ("cvtss2si %1, %0") ("=r") 0 [
                        (reg:SF 543)
                    ]
                     [
                        (asm_input:SF ("xm") (null):0)
                    ]
                     [] /home/aj/build/glibc/testing/math/libm-test.c:8699))
            (clobber (reg:QI 18 fpsr))
            (clobber (reg:QI 17 flags))
        ]) ../sysdeps/x86_64/fpu/bits/mathinline.h:82 -1
     (nil))

(insn 2557 2556 2558 128 (set (reg/v:DI 403 [ __res ])
        (reg:DI 542 [ __res ])) ../sysdeps/x86_64/fpu/bits/mathinline.h:82 62
{*movdi_internal_rex64}
     (nil))

to

(insn 2556 2555 2557 128 (set (reg:DI 542 [ __res ])
        (reg:DI 525 [ __res ])) ../sysdeps/x86_64/fpu/bits/mathinline.h:82 62
{*movdi_internal_rex64}
     (nil))

(insn 2557 2556 2558 128 (set (reg/v:DI 403 [ __res ])
        (reg:DI 525 [ __res ])) ../sysdeps/x86_64/fpu/bits/mathinline.h:82 62
{*movdi_internal_rex64}
     (nil))

where reg:DI 525 is computed from an asm() in a different floating-point
status area:

(insn 2479 2478 2480 127 (parallel [
            (set (reg:DI 525 [ __res ])
                (asm_operands:DI ("cvtss2si %1, %0") ("=r") 0 [
                        (reg:SF 526)
                    ]
                     [
                        (asm_input:SF ("xm") (null):0)
                    ]
                     [] /home/aj/build/glibc/testing/math/libm-test.c:8699))
            (clobber (reg:QI 18 fpsr))
            (clobber (reg:QI 17 flags))
        ]) ../sysdeps/x86_64/fpu/bits/mathinline.h:82 -1
     (nil))

(insn 2480 2479 2481 127 (set (reg/v:DI 391 [ __res ])
        (reg:DI 525 [ __res ])) ../sysdeps/x86_64/fpu/bits/mathinline.h:82 62
{*movdi_internal_rex64}
     (nil))

the bug is sensitive to inlinign decisions and -fno-ipa-sra manages to
hide it.

Compile flags for the above is

-std=gnu99 -fgnu89-inline -O2 -fmerge-all-constants -fno-builtin

adding -frounding-math does not change anything, even if techincalli required.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
@ 2012-05-02 12:48 ` rguenth at gcc dot gnu.org
  2012-05-02 17:09 ` pinskia at gcc dot gnu.org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-05-02 12:48 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-02 12:47:58 UTC ---
Created attachment 27287
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27287
preprocessed source

Preprocessed source from glibc test-ifloat testcase.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
  2012-05-02 12:48 ` [Bug rtl-optimization/53190] " rguenth at gcc dot gnu.org
@ 2012-05-02 17:09 ` pinskia at gcc dot gnu.org
  2012-05-02 17:28 ` ubizjak at gmail dot com
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-05-02 17:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-05-02 17:09:02 UTC ---
            (clobber (reg:QI 18 fpsr))
            (clobber (reg:QI 17 flags))


It might clobber those but that is not considered an use.  So CSE is doing the
correct thing as there is no use based on fpsr.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
  2012-05-02 12:48 ` [Bug rtl-optimization/53190] " rguenth at gcc dot gnu.org
  2012-05-02 17:09 ` pinskia at gcc dot gnu.org
@ 2012-05-02 17:28 ` ubizjak at gmail dot com
  2012-05-03  8:00 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: ubizjak at gmail dot com @ 2012-05-02 17:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #3 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-02 17:25:22 UTC ---
(In reply to comment #2)
>             (clobber (reg:QI 18 fpsr))
>             (clobber (reg:QI 17 flags))
> 
> 
> It might clobber those but that is not considered an use.  So CSE is doing the
> correct thing as there is no use based on fpsr.

The asm that depends on some unknown-to-gcc global processor state should be
marked as volatile.

This is a bug in glibc.

As a side note, mathinline.h really needs some serious TLC, there are many
functions that are much better implemented with gcc builtin functions, not to
mention that x87 asm in 32bit case interferes *badly* with -mfpmath=sse.

Fortunately, there is -D__NO_MATH_INLINES, used in my projects from day one.

Really, there is no justification for mathinline.h to live, as far as gcc is
concerned.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2012-05-02 17:28 ` ubizjak at gmail dot com
@ 2012-05-03  8:00 ` rguenth at gcc dot gnu.org
  2012-05-03  8:16 ` ubizjak at gmail dot com
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-05-03  8:00 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-03 07:59:40 UTC ---
(In reply to comment #3)
> (In reply to comment #2)
> >             (clobber (reg:QI 18 fpsr))
> >             (clobber (reg:QI 17 flags))
> > 
> > 
> > It might clobber those but that is not considered an use.  So CSE is doing the
> > correct thing as there is no use based on fpsr.
> 
> The asm that depends on some unknown-to-gcc global processor state should be
> marked as volatile.
> 
> This is a bug in glibc.
> 
> As a side note, mathinline.h really needs some serious TLC, there are many
> functions that are much better implemented with gcc builtin functions, not to
> mention that x87 asm in 32bit case interferes *badly* with -mfpmath=sse.
> 
> Fortunately, there is -D__NO_MATH_INLINES, used in my projects from day one.
> 
> Really, there is no justification for mathinline.h to live, as far as gcc is
> concerned.

So you say that it's correct for CSE to CSE asm()s if they have the same
asm string and the same asm operands in case the asm is not volatile?  I was
not aware we would do that ;)  (but yes, it sounds like a reasonable thing)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2012-05-03  8:00 ` rguenth at gcc dot gnu.org
@ 2012-05-03  8:16 ` ubizjak at gmail dot com
  2012-05-03  8:20 ` rguenther at suse dot de
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: ubizjak at gmail dot com @ 2012-05-03  8:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #5 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-03 08:15:00 UTC ---
(In reply to comment #4)

> So you say that it's correct for CSE to CSE asm()s if they have the same
> asm string and the same asm operands in case the asm is not volatile?  I was
> not aware we would do that ;)  (but yes, it sounds like a reasonable thing)

There is nothing that prevents gcc from doing this...

BTW: fpsr is FP *status* register, an x87 CC-like internal register. This asm
should probably depend on fpcr, a FP *control* register.

In fact, there are two FP control registers, x87 and SSE one. The later is not
handled at all.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2012-05-03  8:16 ` ubizjak at gmail dot com
@ 2012-05-03  8:20 ` rguenther at suse dot de
  2012-05-03  8:39 ` ubizjak at gmail dot com
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenther at suse dot de @ 2012-05-03  8:20 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> 2012-05-03 08:19:17 UTC ---
On Thu, 3 May 2012, ubizjak at gmail dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190
> 
> --- Comment #5 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-03 08:15:00 UTC ---
> (In reply to comment #4)
> 
> > So you say that it's correct for CSE to CSE asm()s if they have the same
> > asm string and the same asm operands in case the asm is not volatile?  I was
> > not aware we would do that ;)  (but yes, it sounds like a reasonable thing)
> 
> There is nothing that prevents gcc from doing this...
> 
> BTW: fpsr is FP *status* register, an x87 CC-like internal register. This asm
> should probably depend on fpcr, a FP *control* register.
> 
> In fact, there are two FP control registers, x87 and SSE one. The later is not
> handled at all.

Well, the asm in glibc is simply

extern __inline __attribute__ ((__always_inline__)) long int
__attribute__ ((__nothrow__ )) lrintf (float __x)
{
  long int __res;
  __asm ("cvtss2si %1, %0" : "=r" (__res) : "xm" (__x));
  return __res;
}

thus the CLOBBERs somehow magically appear anyways it seems.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2012-05-03  8:20 ` rguenther at suse dot de
@ 2012-05-03  8:39 ` ubizjak at gmail dot com
  2012-05-03  9:20 ` aj at gcc dot gnu.org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: ubizjak at gmail dot com @ 2012-05-03  8:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #7 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-03 08:39:19 UTC ---
(In reply to comment #6)

> Well, the asm in glibc is simply
> 
> extern __inline __attribute__ ((__always_inline__)) long int
> __attribute__ ((__nothrow__ )) lrintf (float __x)
> {
>   long int __res;
>   __asm ("cvtss2si %1, %0" : "=r" (__res) : "xm" (__x));
>   return __res;
> }
> 
> thus the CLOBBERs somehow magically appear anyways it seems.

Yes, this is done in ix86_md_asm_clobbers. The reasoning is the same as for
flags reg, most asms "probably" touch both status regs anyway. This prevents
asms from being scheduled in between CC-like producer/consumer chain.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2012-05-03  8:39 ` ubizjak at gmail dot com
@ 2012-05-03  9:20 ` aj at gcc dot gnu.org
  2012-05-03  9:40 ` ubizjak at gmail dot com
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: aj at gcc dot gnu.org @ 2012-05-03  9:20 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #8 from Andreas Jaeger <aj at gcc dot gnu.org> 2012-05-03 09:19:16 UTC ---
So, how should the inline rewritten?

Adding volatile is one option:
extern __inline __attribute__ ((__always_inline__)) long int
__attribute__ ((__nothrow__ )) lrintf (float __x)
{
  long int __res;
  __asm __volatile__ ("cvtss2si %1, %0" : "=r" (__res) : "xm" (__x));
  return __res;
}

Since this is SSE code: Is there any way to clobber the SSE control register?
Any better way to write the above?

Btw. I'd like to make two changes for glibc:
* Fix the inline
* Only declare the inline when lrintf is not available as builtin. lrint was
introduce in 2003-08-28, so should be part of gcc 3.4 and therefore for GCC 3.4
and newer my patch will not use the above anymore.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2012-05-03  9:20 ` aj at gcc dot gnu.org
@ 2012-05-03  9:40 ` ubizjak at gmail dot com
  2012-05-03 11:03 ` aj at gcc dot gnu.org
  2012-05-03 11:43 ` ubizjak at gmail dot com
  10 siblings, 0 replies; 12+ messages in thread
From: ubizjak at gmail dot com @ 2012-05-03  9:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #9 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-03 09:38:59 UTC ---
(In reply to comment #8)
> So, how should the inline rewritten?
> 
> Adding volatile is one option:
> extern __inline __attribute__ ((__always_inline__)) long int
> __attribute__ ((__nothrow__ )) lrintf (float __x)
> {
>   long int __res;
>   __asm __volatile__ ("cvtss2si %1, %0" : "=r" (__res) : "xm" (__x));
>   return __res;
> }
> 
> Since this is SSE code: Is there any way to clobber the SSE control register?
> Any better way to write the above?

The insn above doesn't clobber SSE CR, but it uses it. But there is no MXCSR
reg listed in gcc ATM. I see no other way to prevent CSE or moves of these
ASMs.

> Btw. I'd like to make two changes for glibc:
> * Fix the inline
> * Only declare the inline when lrintf is not available as builtin. lrint was
> introduce in 2003-08-28, so should be part of gcc 3.4 and therefore for GCC 3.4
> and newer my patch will not use the above anymore.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2012-05-03  9:40 ` ubizjak at gmail dot com
@ 2012-05-03 11:03 ` aj at gcc dot gnu.org
  2012-05-03 11:43 ` ubizjak at gmail dot com
  10 siblings, 0 replies; 12+ messages in thread
From: aj at gcc dot gnu.org @ 2012-05-03 11:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #10 from Andreas Jaeger <aj at gcc dot gnu.org> 2012-05-03 11:02:43 UTC ---
So, should somebody file a bug against GCC that to make the SSE cr available?


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug rtl-optimization/53190] CSE (?) CSEs asms
  2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2012-05-03 11:03 ` aj at gcc dot gnu.org
@ 2012-05-03 11:43 ` ubizjak at gmail dot com
  10 siblings, 0 replies; 12+ messages in thread
From: ubizjak at gmail dot com @ 2012-05-03 11:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53190

--- Comment #11 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-03 11:43:16 UTC ---
(In reply to comment #10)
> So, should somebody file a bug against GCC that to make the SSE cr available?

Er, gcc doesn't need it, all SSE conversion builtins are implemented without
touching MXCSR. Please see lfloor and lceil implementations via
x86_expand_lfloorceil.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-05-03 11:43 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-02 12:47 [Bug rtl-optimization/53190] New: CSE (?) CSEs asms rguenth at gcc dot gnu.org
2012-05-02 12:48 ` [Bug rtl-optimization/53190] " rguenth at gcc dot gnu.org
2012-05-02 17:09 ` pinskia at gcc dot gnu.org
2012-05-02 17:28 ` ubizjak at gmail dot com
2012-05-03  8:00 ` rguenth at gcc dot gnu.org
2012-05-03  8:16 ` ubizjak at gmail dot com
2012-05-03  8:20 ` rguenther at suse dot de
2012-05-03  8:39 ` ubizjak at gmail dot com
2012-05-03  9:20 ` aj at gcc dot gnu.org
2012-05-03  9:40 ` ubizjak at gmail dot com
2012-05-03 11:03 ` aj at gcc dot gnu.org
2012-05-03 11:43 ` ubizjak at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).