public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: Profiling on S390
@ 2003-04-22 22:34 Ulrich Weigand
  2003-04-22 23:13 ` Richard Henderson
  0 siblings, 1 reply; 14+ messages in thread
From: Ulrich Weigand @ 2003-04-22 22:34 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: rth, gcc, aj


Jan Hubicka wrote:

>It looks very unsafe for me - GCC assumes at many places that CC is not
>live accross boundaries.

This is annoying.  Is there any fundamental reason for this restriction?
Allowing CC to stay live across bb boundaries would enable further
optimizations, e.g. avoid one comparison instruction in code like this:

  if (x > 5)
    ...
  else if (x == 5)
    ...

>What happens is that profiler insert adddi on
>the begining of BB clobberring the flags.

Can't the profiler use special code to save/restore CC?

>But in general gcc will
>happily reverse the conditional when reordering basic blocks and do
>other transformations that can break such construct.

Can you elaborate why reversing the conditional would break this
construct?

>What would be the best sollution?

The quick fix would be to fall back on CLCL in all cases, like
below.  I'll see if I can think of something better ...


Index: s390.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/s390/s390.c,v
retrieving revision 1.70.2.7
diff -c -p -r1.70.2.7 s390.c
*** s390.c      10 Apr 2003 16:41:31 -0000      1.70.2.7
--- s390.c      22 Apr 2003 20:41:22 -0000
*************** s390_expand_cmpstr (target, op0, op1, le
*** 2794,2800 ****
          emit_move_insn (target, const0_rtx);
      }

!   else if (TARGET_MVCLE)
      {
        enum machine_mode double_mode = TARGET_64BIT ? TImode : DImode;
        enum machine_mode single_mode = TARGET_64BIT ? DImode : SImode;
--- 2794,2800 ----
          emit_move_insn (target, const0_rtx);
      }

!   else if (1 || TARGET_MVCLE)
      {
        enum machine_mode double_mode = TARGET_64BIT ? TImode : DImode;
        enum machine_mode single_mode = TARGET_64BIT ? DImode : SImode;


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
  Dr. Ulrich Weigand
  Linux for S/390 Design & Development
  IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, 71032 Boeblingen
  Phone: +49-7031/16-3727   ---   Email: Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-22 22:34 Profiling on S390 Ulrich Weigand
@ 2003-04-22 23:13 ` Richard Henderson
  2003-04-23 10:59   ` Jan Hubicka
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2003-04-22 23:13 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: Jan Hubicka, gcc, aj

On Tue, Apr 22, 2003 at 10:43:37PM +0200, Ulrich Weigand wrote:
> >It looks very unsafe for me - GCC assumes at many places that CC is not
> >live accross boundaries.
> 
> This is annoying.  Is there any fundamental reason for this restriction?

No, and it's not actually true.

What is true is that you can't generally move the thing around
if most of your instructions clobber the flags.  So e.g. for x86
this would be a Really Bad Idea, but for Sparc it can work out.

It's also true that some of the simplifiers (cse, combine) won't
DTRT unless the compare and use are in the same block.

> Can't the profiler use special code to save/restore CC?

It could, but if you don't have add/load/store insns that don't
clobber CC, then you'll get in trouble with reload too.  Of
course, that brings up the issue of getting the right pattern
emitted from the profiler.  Could be done with a special named
add pattern, but that seems like a lot of overhead for very
little return.

This is all about a movstrsi or cmpstrsi pattern?  Perhaps it
would be better to keep this as one unit until after reload?


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-22 23:13 ` Richard Henderson
@ 2003-04-23 10:59   ` Jan Hubicka
  2003-04-23 18:26     ` Richard Henderson
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Hubicka @ 2003-04-23 10:59 UTC (permalink / raw)
  To: Richard Henderson, Ulrich Weigand, Jan Hubicka, gcc, aj

> On Tue, Apr 22, 2003 at 10:43:37PM +0200, Ulrich Weigand wrote:
> > >It looks very unsafe for me - GCC assumes at many places that CC is not
> > >live accross boundaries.
> > 
> > This is annoying.  Is there any fundamental reason for this restriction?
> 
> No, and it's not actually true.
> 
> What is true is that you can't generally move the thing around
> if most of your instructions clobber the flags.  So e.g. for x86
> this would be a Really Bad Idea, but for Sparc it can work out.

My impression is that s390 is x86-like when dealing with flags.
> 
> It's also true that some of the simplifiers (cse, combine) won't
> DTRT unless the compare and use are in the same block.
> 
> > Can't the profiler use special code to save/restore CC?
> 
> It could, but if you don't have add/load/store insns that don't
> clobber CC, then you'll get in trouble with reload too.  Of
We can conclude to use some named patterns.  For instance
"save_state" "restore_state" to fold profiler code in.  Does this sound
plausible?

Honza
> course, that brings up the issue of getting the right pattern
> emitted from the profiler.  Could be done with a special named
> add pattern, but that seems like a lot of overhead for very
> little return.
> 
> This is all about a movstrsi or cmpstrsi pattern?  Perhaps it
> would be better to keep this as one unit until after reload?
> 
> 
> r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-23 10:59   ` Jan Hubicka
@ 2003-04-23 18:26     ` Richard Henderson
  2003-04-23 19:18       ` Jan Hubicka
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2003-04-23 18:26 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Ulrich Weigand, gcc, aj

On Wed, Apr 23, 2003 at 09:45:37AM +0200, Jan Hubicka wrote:
> > It could, but if you don't have add/load/store insns that don't
> > clobber CC, then you'll get in trouble with reload too.  Of
> We can conclude to use some named patterns.  For instance
> "save_state" "restore_state" to fold profiler code in.  Does this sound
> plausible?

No.  How much state are you planning to save?  I think it'd
make more sense to have an add_nocc pattern instead.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-23 18:26     ` Richard Henderson
@ 2003-04-23 19:18       ` Jan Hubicka
  2003-04-23 19:56         ` Richard Henderson
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Hubicka @ 2003-04-23 19:18 UTC (permalink / raw)
  To: Richard Henderson, Jan Hubicka, Ulrich Weigand, gcc, aj

> On Wed, Apr 23, 2003 at 09:45:37AM +0200, Jan Hubicka wrote:
> > > It could, but if you don't have add/load/store insns that don't
> > > clobber CC, then you'll get in trouble with reload too.  Of
> > We can conclude to use some named patterns.  For instance
> > "save_state" "restore_state" to fold profiler code in.  Does this sound
> > plausible?
> 
> No.  How much state are you planning to save?  I think it'd
> make more sense to have an add_nocc pattern instead.
Perhaps, but Zdenek already has come code to measure extra stuff (like
value ranges) that do other thinks than plain add.  How would one
accomplish that?

Honza
> 
> 
> r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-23 19:18       ` Jan Hubicka
@ 2003-04-23 19:56         ` Richard Henderson
  2003-04-23 21:29           ` Jan Hubicka
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2003-04-23 19:56 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Ulrich Weigand, gcc, aj

On Wed, Apr 23, 2003 at 08:50:15PM +0200, Jan Hubicka wrote:
> Perhaps, but Zdenek already has come code to measure extra stuff (like
> value ranges) that do other thinks than plain add.  How would one
> accomplish that?

Hum.  Perhaps

  (1) Do life analysis to verify something's live across here.  Here we
      only have to check hard registers, and then we only have to care
      about those with

	GET_MODE_CLASS (reg_raw_mode[i]) == MODE_CC.

      In the vast majority of cases, no such will be live, so we can
      avoid extra work.

  (2) Generate the code sequence for the edge.  Examine it to see if
      the flags register is actually killed.  If not, do nothing.
      Should help Sparc-like targets that have fine-grained control
      over the setting of the flags register.

  (3) Locate the (a) set that causes the flags register(s) to be live.
      Remember the mode of the register, and we'll use gen_move_insn to
      save/restore the value to a pseudo.  If AVOID_CCMODE_COPIES, abort;
      either the machine description or an optimizer is incorrect.

      There are probably targets for which either AVOID_CCMODE_COPIES
      or CCmode move support is missing.

      Document that AVOID_CCMODE_COPIES implies that the flags register
      cannot be live across basic blocks.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-23 19:56         ` Richard Henderson
@ 2003-04-23 21:29           ` Jan Hubicka
  2003-04-23 22:33             ` Richard Henderson
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Hubicka @ 2003-04-23 21:29 UTC (permalink / raw)
  To: Richard Henderson, Jan Hubicka, Ulrich Weigand, gcc, aj

> On Wed, Apr 23, 2003 at 08:50:15PM +0200, Jan Hubicka wrote:
> > Perhaps, but Zdenek already has come code to measure extra stuff (like
> > value ranges) that do other thinks than plain add.  How would one
> > accomplish that?
> 
> Hum.  Perhaps
> 
>   (1) Do life analysis to verify something's live across here.  Here we
>       only have to check hard registers, and then we only have to care
>       about those with
> 
> 	GET_MODE_CLASS (reg_raw_mode[i]) == MODE_CC.
> 
>       In the vast majority of cases, no such will be live, so we can
>       avoid extra work.
This is interesting idea, we can do it more generally because ...
> 
>   (2) Generate the code sequence for the edge.  Examine it to see if
>       the flags register is actually killed.  If not, do nothing.

This work should in general be doable using my code hoisting functions I
made for GCSE some time ago (BTW these are still unmerged and we still
don't do parallels.  What do you think if I re-continued patches that
direction? (moving GCSE to code hoisting beasts, later doing pre-GCSE
liveness pass and even later making GCSE to actually use it and update)

In case we want to make an invariant that machine description implied
clobbers and hard reg sets are deadly only for MODE_CC as suggested in
(1), we could add it here.
>       Should help Sparc-like targets that have fine-grained control
>       over the setting of the flags register.
> 
>   (3) Locate the (a) set that causes the flags register(s) to be live.
>       Remember the mode of the register, and we'll use gen_move_insn to
>       save/restore the value to a pseudo.  If AVOID_CCMODE_COPIES, abort;
>       either the machine description or an optimizer is incorrect.
> 
>       There are probably targets for which either AVOID_CCMODE_COPIES
>       or CCmode move support is missing.
> 
>       Document that AVOID_CCMODE_COPIES implies that the flags register
>       cannot be live across basic blocks.
This all sounds fine.  It is bit expensive but I guess not deadly.

Honza
> 
> 
> r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-23 21:29           ` Jan Hubicka
@ 2003-04-23 22:33             ` Richard Henderson
  2003-04-25 17:01               ` Jan Hubicka
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2003-04-23 22:33 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Ulrich Weigand, gcc, aj

On Wed, Apr 23, 2003 at 09:57:01PM +0200, Jan Hubicka wrote:
>  What do you think if I re-continued patches that
> direction? (moving GCSE to code hoisting beasts, later doing pre-GCSE
> liveness pass and even later making GCSE to actually use it and update)

It'd be nice to have this completed.  It'll need testing on a wide
selection of targets before it can go in though.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-23 22:33             ` Richard Henderson
@ 2003-04-25 17:01               ` Jan Hubicka
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Hubicka @ 2003-04-25 17:01 UTC (permalink / raw)
  To: Richard Henderson, Jan Hubicka, Ulrich Weigand, gcc, aj

> On Wed, Apr 23, 2003 at 09:57:01PM +0200, Jan Hubicka wrote:
> >  What do you think if I re-continued patches that
> > direction? (moving GCSE to code hoisting beasts, later doing pre-GCSE
> > liveness pass and even later making GCSE to actually use it and update)
> 
> It'd be nice to have this completed.  It'll need testing on a wide
> selection of targets before it can go in though.
OK, I will try what I can do.  What platforms would you like to see
tested?  I can easilly run tests on i386/sparc/mips and simulators
here..

Honza
> 
> 
> r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
  2003-04-23 21:50 Ulrich Weigand
@ 2003-04-23 22:08 ` Richard Henderson
  0 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2003-04-23 22:08 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: Jan Hubicka, gcc, aj

On Wed, Apr 23, 2003 at 10:23:24PM +0200, Ulrich Weigand wrote:
> However, in the case of the profiler where saving/restoring CC
> is vital for *correctness*, shouldn't the fact that CCmode copies
> are *possible* (i.e. movcc works) be enough?

Well... if movcc exists, then fine.  But if it doesn't, and
movsi doesn't support the flags register, then we're going
to generate an unreloadable insn.

OTOH, I suppose it's sufficient to abort in reload.

> If I were to introduce that capability, does the common code make
> any assumptions on just how those values are represented by integers?
> Does (subreg:SI (reg:CC ...) 0) or vice versa work?)

If a movcc pattern doesn't exist, then emit_move_insn_1 will subreg
to word_mode and try that move pattern.  But no assumptions are made
wrt the contents of the bits.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
@ 2003-04-23 21:50 Ulrich Weigand
  2003-04-23 22:08 ` Richard Henderson
  0 siblings, 1 reply; 14+ messages in thread
From: Ulrich Weigand @ 2003-04-23 21:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Jan Hubicka, gcc, aj


Richard Henderson wrote:

>Document that AVOID_CCMODE_COPIES implies that the flags register
>cannot be live across basic blocks.

Why?  I'd probably *want* to define AVOID_CCMODE_COPIES on s390,
because while we can save/restore the condition code to/from a
general purpose register, we really have just the single CC that
is actually usable for conditional branches, so it usually wouldn't
make sense for the optimizer to copy CCmode values around.

However, in the case of the profiler where saving/restoring CC
is vital for *correctness*, shouldn't the fact that CCmode copies
are *possible* (i.e. movcc works) be enough?

(B.t.w. we currently don't accept CCmode values in GPRs at all.
If I were to introduce that capability, does the common code make
any assumptions on just how those values are represented by integers?
Does (subreg:SI (reg:CC ...) 0) or vice versa work?)


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
  Dr. Ulrich Weigand
  Linux for S/390 Design & Development
  IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, 71032 Boeblingen
  Phone: +49-7031/16-3727   ---   Email: Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
@ 2003-04-23 18:18 Ulrich Weigand
  0 siblings, 0 replies; 14+ messages in thread
From: Ulrich Weigand @ 2003-04-23 18:18 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: rth, gcc, aj


I wrote:

>The quick fix would be to fall back on CLCL in all cases, like
>below.  I'll see if I can think of something better ...

I've disabled the CLC loop for now and always emit CLCL.  This is
not really a big deal performance-wise, and it actually was the
behaviour of all previous GCC versions, so it should be acceptable
for 3.3.  Maybe we can do something better for 3.4 ...


Bootstrapped/regtested on s390-ibm-linux and s390x-ibm-linux on
3.3 and head; applied to both.


ChangeLog:

      * config/s390/s390.c (s390_expand_cmpstr): Disable CLC loop.


Index: gcc/config/s390/s390.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/s390/s390.c,v
retrieving revision 1.70.2.7
diff -c -p -r1.70.2.7 s390.c
*** gcc/config/s390/s390.c    10 Apr 2003 16:41:31 -0000    1.70.2.7
--- gcc/config/s390/s390.c    23 Apr 2003 12:55:31 -0000
*************** s390_expand_cmpstr (target, op0, op1, le
*** 2794,2800 ****
          emit_move_insn (target, const0_rtx);
      }

!   else if (TARGET_MVCLE)
      {
        enum machine_mode double_mode = TARGET_64BIT ? TImode : DImode;
        enum machine_mode single_mode = TARGET_64BIT ? DImode : SImode;
--- 2794,2800 ----
          emit_move_insn (target, const0_rtx);
      }

!   else /* if (TARGET_MVCLE) */
      {
        enum machine_mode double_mode = TARGET_64BIT ? TImode : DImode;
        enum machine_mode single_mode = TARGET_64BIT ? DImode : SImode;
*************** s390_expand_cmpstr (target, op0, op1, le
*** 2813,2818 ****
--- 2813,2821 ----
        emit_insn ((*gen_result) (target));
      }

+ #if 0
+   /* Deactivate for now as profile code cannot cope with
+      CC being live across basic block boundaries.  */
    else
      {
        rtx addr0, addr1, count, blocks, temp;
*************** s390_expand_cmpstr (target, op0, op1, le
*** 2878,2883 ****
--- 2881,2887 ----

        emit_insn ((*gen_result) (target));
      }
+ #endif
  }

  /* In the name of slightly smaller debug output, and to cater to


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
  Dr. Ulrich Weigand
  Linux for S/390 Design & Development
  IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, 71032 Boeblingen
  Phone: +49-7031/16-3727   ---   Email: Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Profiling on S390
@ 2003-04-23 12:19 Ulrich Weigand
  0 siblings, 0 replies; 14+ messages in thread
From: Ulrich Weigand @ 2003-04-23 12:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Jan Hubicka, gcc, aj


Richard Henderson wrote:

>It could, but if you don't have add/load/store insns that don't
>clobber CC, then you'll get in trouble with reload too.

We can load/store without clobbering CC; while we don't have
a general addition operation that does not clobber CC, we do
have a load-address operation, which is all reload actually
needs (if things are set up correctly).  However, LA is a 31-bit
addition and cannot be used to implement 32-bit or even 64-bit
integer arithmetic.

However, it is certainly true that the use of gen_add2_insn in
reload is really broken as it doesn't care at all whether CC
gets clobbered.  It is only by setting things up so that reload
actually never tries to use gen_add2_insn that things work out ...

>This is all about a movstrsi or cmpstrsi pattern?  Perhaps it
>would be better to keep this as one unit until after reload?

It's only about cmpstrsi.  The idea behind exposing the loop
early was to potentially benefit from loop unrolling ...


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
  Dr. Ulrich Weigand
  Linux for S/390 Design & Development
  IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, 71032 Boeblingen
  Phone: +49-7031/16-3727   ---   Email: Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Profiling on S390
@ 2003-04-18 12:11 Jan Hubicka
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Hubicka @ 2003-04-18 12:11 UTC (permalink / raw)
  To: Ulrich.Weigand, rth, gcc, aj

Hi,
arc profiling on s390 is currently broken because of strcmp expander.
What it does is to emit:
      expand_end_loop ();

      emit_insn ((*gen_short) (op0, op1, convert_to_mode
	    (word_mode, count, 1)));
      emit_label (end_label);

      emit_insn ((*gen_result) (target));

Where gen_result expands into:
(insn 43 42 44 (nil) (set (reg:SI 44)
     (compare:SI (reg:CCS 33 %cc)
            (const_int 0 [0x0]))) -1 (nil)
	        (nil))
Where CC is intialized via the conditional jumps into the label.

It looks very unsafe for me - GCC assumes at many places that CC is not
live accross boundaries.  What happens is that profiler insert adddi on
the begining of BB clobberring the flags.  But in general gcc will
happily reverse the conditional when reordering basic blocks and do
other transformations that can break such construct.

What would be the best sollution?

Honza

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2003-04-25 14:55 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-22 22:34 Profiling on S390 Ulrich Weigand
2003-04-22 23:13 ` Richard Henderson
2003-04-23 10:59   ` Jan Hubicka
2003-04-23 18:26     ` Richard Henderson
2003-04-23 19:18       ` Jan Hubicka
2003-04-23 19:56         ` Richard Henderson
2003-04-23 21:29           ` Jan Hubicka
2003-04-23 22:33             ` Richard Henderson
2003-04-25 17:01               ` Jan Hubicka
  -- strict thread matches above, loose matches on Subject: below --
2003-04-23 21:50 Ulrich Weigand
2003-04-23 22:08 ` Richard Henderson
2003-04-23 18:18 Ulrich Weigand
2003-04-23 12:19 Ulrich Weigand
2003-04-18 12:11 Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).