Re: [PATCH 2/2] s390: Convert from sync to atomic optabs

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "Ulrich Weigand" <uweigand@de.ibm.com>
To: rth@redhat.com (Richard Henderson)
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 2/2] s390: Convert from sync to atomic optabs
Date: Mon, 06 Aug 2012 18:34:00 -0000	[thread overview]
Message-ID: <201208061834.q76IY8HS013445@d06av02.portsmouth.uk.ibm.com> (raw)
In-Reply-To: <1343687574-3244-3-git-send-email-rth@redhat.com> from "Richard Henderson" at Jul 30, 2012 03:32:54 PM

Richard Henderson wrote:


Some more comments on this patch.

> +; Different from movdi_31 in that we have no splitters.
> +(define_insn "atomic_loaddi_1"
> +  [(set (match_operand:DI 0 "register_operand" "=d,d,!*f,!*f")
> +	(unspec:DI [(match_operand:DI 1 "s_operand" "Q,S,Q,m")]

The constraint line doesn't look right.  ld(y) accept base+index,
so they should get R and T constraints, respectively.

In fact, I wouldn't use "s_operand" as a predicate, since this
excludes valid addresses for those cases, and since in general
I've found it to be preferable to have reload fix up addresses.
So I'd just use "memory_operand" here, and actually in all the
other patterns as well ...  (This also simplifes the expander.)

> +		   UNSPEC_MOVA))]
> +  "!TARGET_ZARCH"
> +  "@
> +   lm\t%0,%M0,%S1
> +   lmy\t%0,%M0,%S1
> +   ld\t%0,%1
> +   ldy\t%0,%1"
> +  [(set_attr "op_type" "RS,RSY,RS,RSY")
> +   (set_attr "type" "lm,lm,floaddf,floaddf")])
> +
> +(define_insn "atomic_loadti_1"
> +  [(set (match_operand:TI 0 "register_operand" "=r")
> +	(unspec:TI [(match_operand:TI 1 "memory_operand" "m")]

"m" is wrong here (and basically everywhere), since it includes
SYMBOL_REFs for lrl etc. on z10.  For lpq we need "RT".

> +  "lpq\t%0,%1"
> +  [(set_attr "op_type" "RXY")])

LPQ and STPQ probably should get a "type" of "other".  This may not
model the pipeline exactly, but it's better than just assuming
"integer" by default.  These instructions are special (and slow).

> +(define_expand "atomic_compare_and_swap<mode>"
> +  [(match_operand:SI 0 "register_operand")	;; bool success output
> +   (match_operand:DGPR 1 "register_operand")	;; oldval output
> +   (match_operand:DGPR 2 "s_operand")		;; memory

memory_operand probably better again.

> +   (match_operand:DGPR 3 "register_operand")	;; expected intput
> +   (match_operand:DGPR 4 "register_operand")	;; newval intput
> +   (match_operand:SI 5 "const_int_operand")	;; is_weak
> +   (match_operand:SI 6 "const_int_operand")	;; success model
> +   (match_operand:SI 7 "const_int_operand")]	;; failure model
> +  ""
> +{
> +  rtx cc, cmp;
> +  emit_insn (gen_atomic_compare_and_swap<mode>_internal
> +	     (operands[1], operands[2], operands[3], operands[4]));
> +  cc = gen_rtx_REG (CCZ1mode, CC_REGNUM);
> +  cmp = gen_rtx_NE (SImode, cc, const0_rtx);

As already pointed out, this needs to be EQ.

> -  ""
> -  "cds<tg>\t%0,%3,%S1"
> -  [(set_attr "op_type" "RS<TE>")
> +  "TARGET_ZARCH"
> +  "c<td>sg\t%0,%3,%S1"
> +  [(set_attr "op_type" "RXY")

Should be "RSY"

> +  "@
> +   cds\t%0,%3,%S1
> +   cdsy\t%0,%3,%S1"
> +  [(set_attr "op_type" "RX,RXY")

Should be "RS, RSY"

> +  "@
> +   cs\t%0,%3,%S1
> +   csy\t%0,%3,%S1"
> +  [(set_attr "op_type" "RX,RXY")

Likewise.

> +(define_insn "atomic_fetch_<atomic><mode>_iaf"
> +  [(set (match_operand:GPR 0 "register_operand" "=d")
> +	(match_operand:GPR 1 "s_operand" "+S"))
> +   (set (match_dup 1)
> +	(unspec_volatile:GPR
> +	 [(ATOMIC_Z196:GPR (match_dup 1)
> +			   (match_operand:GPR 2 "general_operand" "d"))]
> +	 UNSPECV_ATOMIC_OP))
> +   (clobber (reg:CC CC_REGNUM))]
>    "TARGET_Z196"
> -  "la<noxa><g>\t%0,%2,%1")
> +  "la<noxa><g>\t%0,%2,%1"
> +  [(set_attr "op_type" "RXY")

Again RSY.



There is one particular inefficiency I have noticed.  This code:

  if (!__atomic_compare_exchange_n (&v, &expected, max, 0 , 0, 0))
    abort ();

from atomic-compare-exchange-3.c gets compiled into:

        l       %r3,0(%r2)
        larl    %r1,v
        cs      %r3,%r4,0(%r1)
        ipm     %r1
        sra     %r1,28
        st      %r3,0(%r2)
        ltr     %r1,%r1
        jne     .L3

which is extremely inefficient; it converts the condition code into
an integer using the slow ipm, sra sequence, just so that it can
convert the integer back into a condition code via ltr and branch
on it ...

Now, with the old code this sequence used to be optimized away by
combine and we'd just have the cs followed by a branch.  This is
not done any more because we have the store to "expected" in between.

This is because of this code in combine:

      /* Make sure that the value that is to be substituted for the register
         does not use any registers whose values alter in between.  However,
         If the insns are adjacent, a use can't cross a set even though we
         think it might (this can happen for a sequence of insns each setting
         the same destination; last_set of that register might point to
         a NOTE).  If INSN has a REG_EQUIV note, the register is always
         equivalent to the memory so the substitution is valid even if there
         are intervening stores.  Also, don't move a volatile asm or
         UNSPEC_VOLATILE across any other insns.  */
      || (! all_adjacent
          && (((!MEM_P (src)
                || ! find_reg_note (insn, REG_EQUIV, src))
               && use_crosses_set_p (src, DF_INSN_LUID (insn)))
              || (GET_CODE (src) == ASM_OPERANDS && MEM_VOLATILE_P (src))
              || GET_CODE (src) == UNSPEC_VOLATILE))

Note that we have exactly the case mentioned, where a series of instructions
to be combined all set the same (CC) register.  If they're all adjacent,
this still gets optimized away -- but they no longer are due to the store.

Is there a way of structuring the atomic optabs/expander so that the store
gets done either before or after all the CC operations?


Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

next prev parent reply	other threads:[~2012-08-06 18:34 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-29 21:32 [CFT] " Richard Henderson
2012-07-30 14:19 ` Ulrich Weigand
2012-07-30 15:12   ` Richard Henderson
2012-07-30 15:51     ` Ulrich Weigand
2012-07-30 18:53       ` Richard Henderson
2012-07-30 22:33         ` [PATCH 0/2] Convert s390 to atomic optabs, v2 Richard Henderson
2012-07-30 22:33           ` [PATCH 1/2] s390: Reorg s390_expand_insv Richard Henderson
2012-07-30 22:36           ` [PATCH 2/2] s390: Convert from sync to atomic optabs Richard Henderson
2012-08-06 18:34             ` Ulrich Weigand [this message]
2012-08-06 18:51               ` Richard Henderson
2012-08-06 19:45                 ` Richard Henderson
2012-08-06 22:40               ` s390: Avoid CAS boolean output inefficiency Richard Henderson
2012-08-07 17:02                 ` Ulrich Weigand
2012-08-07 22:13                   ` Richard Henderson
2012-08-08 18:05                     ` Ulrich Weigand
2012-08-09 16:55                 ` Eric Botcazou
2012-07-31  9:11           ` [PATCH 0/2] Convert s390 to atomic optabs, v2 Richard Guenther
2012-07-31 15:27             ` Andrew MacLeod
2012-07-31 16:07             ` Richard Henderson
2012-08-01  8:41               ` Richard Guenther
2012-08-01 15:59                 ` Richard Henderson
2012-08-01 17:14                   ` Richard Guenther
2012-08-01 19:42                     ` Richard Henderson
2012-07-31 18:36           ` Ulrich Weigand
2012-07-31 19:54             ` Richard Henderson
2012-08-01 23:23             ` Richard Henderson
2012-08-03 12:20               ` Ulrich Weigand
2012-08-03 14:21                 ` Ulrich Weigand
2012-08-06 16:44               ` Ulrich Weigand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201208061834.q76IY8HS013445@d06av02.portsmouth.uk.ibm.com \
    --to=uweigand@de.ibm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).