From: Richard Guenther <rguenther@suse.de>
To: Richard Henderson <rth@redhat.com>
Cc: uweigand@de.ibm.com, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 0/2] Convert s390 to atomic optabs, v2
Date: Tue, 31 Jul 2012 09:11:00 -0000 [thread overview]
Message-ID: <alpine.LNX.2.00.1207311053340.17088@jbgna.fhfr.qr> (raw)
In-Reply-To: <1343687574-3244-1-git-send-email-rth@redhat.com>
On Mon, 30 Jul 2012, Richard Henderson wrote:
> The atomic_load/storedi_1 patterns are fixed to use LM, STM.
>
> I've had a go at generating better code in the HQImode CAS
> loop for aligned memory, but I don't know that I'd call it
> the most efficient thing ever. Some of this is due to
> deficiencies in other parts of the compiler (including the
> s390 backend):
>
> (1) MEM_ALIGN can't pass down full align+ofs data that we had
> during cfgexpand. This means the opportunities for using
> the "aligned" path are less than they ought to be.
>
> (2) In get_pointer_alignment (used by get_builtin_sync_mem),
> we don't consider an ADDR_EXPR to return the full alignment
> that the type is due. I'm sure this is to work around some
> other sort of usage via the <string.h> builtins, but it's
> less-than-handy in this case.
>
> I wonder if in get_builtin_sync_mem we ought to be using
> get_object_alignment (build_fold_indirect_ref (addr)) instead?
>
> Consider
>
> struct S { int x; unsigned short y; } g_s;
> unsigned short o, n;
> void good() {
> __builtin_compare_exchange (&g_s.y, &o, n, 0, 0, 0);
> }
> void bad(S *p_s) {
> __builtin_compare_exchange (&p_s->y, &o, n, 0, 0, 0);
> }
>
> where GOOD produces the aligned MEM that we need, and BAD doesn't.
You cannot generally use get_object_alignment here. Once we have
an address in the middle-end we treat it as generic pointer, happily
not caring about the actual pointer types. But it seems we do
/* The alignment needs to be at least according to that of the mode. */
set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode),
get_pointer_alignment (loc)));
anyway? What do we expect __builtin_compare_exchange to do for
unaligned inputs? Like
typedef int uint __attribute__((aligned((8))));
unsigned short o, n;
void very_bad (uint *p) {
__builtin_compare_exchange (p, &o, n, 0, 0, 0);
}
? Doesn't the above set a wrong alignment?
Back to using get_object_alignment - you cannot blindly use
build_fold_indirect_ref at least, you could use get_object_alignment
on the operand of an ADDR_EXPR address but I am sure we can construct
a testcase where that would give a wrong answer, too (maybe not
easily without violating the C standards rule that pointers have
to be aligned according to their type ... but nobody in practice
follows this and the middle-end does not require this either).
Thus, the bad news is that it's hard for the middle-end to
recover alignment of a memory access that is represented as
a builtin function call that takes addresses as parameters
(which also makes them address-taken and thus possibly aliased).
Didn't Andrew have some patches to introduce a GIMPLE_ATOMIC
eventually side-stepping this issue (maybe that used addresses, too)?
Richard.
> (3) Support for IC, and ICM via the insv pattern is lacking.
> I've added a tiny bit of support here, in the form of using
> the existing strict_low_part patterns, but most definitely we
> could do better.
>
> (4) The *sethighpartsi and *sethighpartdi_64 patterns ought to be
> more different. As is, we can't insert into bits 48-56 of a
> DImode quantity, because we don't generate ICM for DImode,
> only ICMH.
>
> (5) Missing support for RISBGZ in the form of an extv/z expander.
> The existing *extv/z splitters probably ought to be conditionalized
> on !Z10.
>
> (6) The strict_low_part patterns should allow registers for at
> least Z10. The SImode strict_low_part can use LR everywhere.
>
> (7) RISBGZ could be used for a 3-address constant lshrsi3 before
> srlk is available.
>
> For the GOOD function above, and this patch set, for -O3 -march=z10:
>
> larl %r3,s+4
> lhrl %r0,o
> lhi %r2,1
> l %r1,0(%r3)
> nilh %r1,0
> .L2:
> lr %r5,%r1
> larl %r12,n
> lr %r4,%r1
> risbg %r4,%r0,32,47,16
> icm %r5,3,0(%r12)
> cs %r4,%r5,0(%r3)
> je .L3
> lr %r5,%r4
> nilh %r5,0
> cr %r5,%r1
> lr %r1,%r5
> jne .L2
> lhi %r2,0
> .L3:
> srl %r4,16
> sthrl %r4,o
>
> Odd things:
>
> * O is forced into a register before reaching the expander, so we
> get the RISBG for that. N is left in a memory and so we commit
> to using ICM for that. Further, because of how strict_low_part
> is implemented we're committed to leaving that in memory.
>
> * We don't optimize the loop and hoist the LARL of N outside the loop.
>
> * Given that we're having to zap the mask in %r1 for the second
> compare anyway, I wonder if RISBG is really beneficial over OR.
> Is RISBG (or ICM for that matter) any faster (or even smaller)?
>
>
> r~
>
>
> Richard Henderson (2):
> s390: Reorg s390_expand_insv
> s390: Convert from sync to atomic optabs
>
> gcc/config/s390/s390-protos.h | 3 +-
> gcc/config/s390/s390.c | 270 ++++++++++++++++++----------
> gcc/config/s390/s390.md | 401 +++++++++++++++++++++++++++++------------
> 3 files changed, 465 insertions(+), 209 deletions(-)
>
>
--
Richard Guenther <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
next prev parent reply other threads:[~2012-07-31 9:09 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-29 21:32 [CFT] s390: Convert from sync to atomic optabs Richard Henderson
2012-07-30 14:19 ` Ulrich Weigand
2012-07-30 15:12 ` Richard Henderson
2012-07-30 15:51 ` Ulrich Weigand
2012-07-30 18:53 ` Richard Henderson
2012-07-30 22:33 ` [PATCH 0/2] Convert s390 to atomic optabs, v2 Richard Henderson
2012-07-30 22:33 ` [PATCH 1/2] s390: Reorg s390_expand_insv Richard Henderson
2012-07-30 22:36 ` [PATCH 2/2] s390: Convert from sync to atomic optabs Richard Henderson
2012-08-06 18:34 ` Ulrich Weigand
2012-08-06 18:51 ` Richard Henderson
2012-08-06 19:45 ` Richard Henderson
2012-08-06 22:40 ` s390: Avoid CAS boolean output inefficiency Richard Henderson
2012-08-07 17:02 ` Ulrich Weigand
2012-08-07 22:13 ` Richard Henderson
2012-08-08 18:05 ` Ulrich Weigand
2012-08-09 16:55 ` Eric Botcazou
2012-07-31 9:11 ` Richard Guenther [this message]
2012-07-31 15:27 ` [PATCH 0/2] Convert s390 to atomic optabs, v2 Andrew MacLeod
2012-07-31 16:07 ` Richard Henderson
2012-08-01 8:41 ` Richard Guenther
2012-08-01 15:59 ` Richard Henderson
2012-08-01 17:14 ` Richard Guenther
2012-08-01 19:42 ` Richard Henderson
2012-07-31 18:36 ` Ulrich Weigand
2012-07-31 19:54 ` Richard Henderson
2012-08-01 23:23 ` Richard Henderson
2012-08-03 12:20 ` Ulrich Weigand
2012-08-03 14:21 ` Ulrich Weigand
2012-08-06 16:44 ` Ulrich Weigand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LNX.2.00.1207311053340.17088@jbgna.fhfr.qr \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=rth@redhat.com \
--cc=uweigand@de.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).