public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, alpha]: Wrap {un,}aligned_store sequence with memory blockages.
@ 2014-06-27 17:05 Uros Bizjak
  2014-06-27 20:04 ` Richard Henderson
  0 siblings, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2014-06-27 17:05 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 2352 bytes --]

Hello!

The motivation for this patch was exposed miscompilation of gfortran
on alpha that resulted in:

FAIL: gfortran.dg/assumed_rank_3.f90:15.20:

    print *, ubound(x,dim=3)  ! << wrong dim
                    1
Error: Assumed-rank variable x at (1) may only be used as actual argument

The problem was in the miscompilation of resolve_actual_arglist from
resolve.c.  This function initializes two nearby global bool variables
with the following sequence:

  actual_arg = true;
  first_actual_arg = true;

but due to the miscompilation, the actual_arg was never set to true.
This happened due to the way stores to QImode and HImode locations are
implemented on non-BWX targets. The sequence reads full word, does its
magic to the part and stores the full word with changed part back to
the memory. However - the scheduler mixed two sequences, violating the
atomicity of RMW sequence.

This can be illustrated with following test:

--cut here--
static char *a;
static char *b;

void foo (void)
{
  a[1] = 1;
  b[2] = 1;
}

int bar (void)
{
  return a && b;
}
--cut here--

$foo..ng:
        .prologue 1
        lda $1,1($31)
        lda $2,a
        ldq $3,0($2)   <-- load a
        lda $2,b
        lda $7,1($3)
        ldq_u $5,1($3)   <-- load a
        insbl $1,$7,$4
        ldq $2,0($2)   <-- load b
        mskbl $5,$7,$5
        lda $6,2($2)
        bis $4,$5,$4
        stq_u $4,1($3)  <-- store a
        insbl $1,$6,$1
        ldq_u $3,2($2)   <-- load b
        mskbl $3,$6,$3
        cpys $f31,$f31,$f31
        bis $1,$3,$1
        stq_u $1,2($2)   <-- store b
        ret $31,($26),1
        .end foo

if a and b alias to the same wide memory location, then "b"  RMW
sequence corrupts "a".

The solution is to wrap the sequences with memory blockages to prevent
scheduler to move read and write around. These blockages thus ensure
atomicity of the sequence.

2014-06-27  Uros Bizjak  <ubizjak@gmail.com>

    * config/alpha/alpha.md (unspec): Add UNSPEC_MEMORY_BLOCKAGE.
    (*memory_blockage): New.
    (aligned_store): Wrap the expanded sequence with memory blockages
    (unaligned_store<mode>): Ditto.

The patch was bootstrapped and regression tested on alpha-linux-gnu,
with the compiler, configured with "--target=alpha-linux-gnu
--host=alpha-linux-gnu --build=alpha-linux-gnu".

OK for mainline and release branches?

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 2886 bytes --]

Index: config/alpha/alpha.md
===================================================================
--- config/alpha/alpha.md	(revision 212063)
+++ config/alpha/alpha.md	(working copy)
@@ -58,6 +58,9 @@
   UNSPEC_ATOMIC
   UNSPEC_CMPXCHG
   UNSPEC_XCHG
+
+  ;; Scheduling
+  UNSPEC_MEMORY_BLOCKAGE
 ])
 
 ;; UNSPEC_VOLATILE:
@@ -3689,6 +3692,16 @@
   [(set_attr "length" "0")
    (set_attr "type" "none")])
 
+;; Do not schedule instructions accessing memory across this point.
+
+(define_insn "*memory_blockage"
+  [(set (match_operand:BLK 0)
+	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))]
+  ""
+  ""
+  [(set_attr "length" "0")
+   (set_attr "type" "none")])
+
 (define_insn "jump"
   [(set (pc)
 	(label_ref (match_operand 0)))]
@@ -4269,7 +4282,9 @@
 ;; the value should be placed.  Operands 3 and 4 are SImode temporaries.
 
 (define_expand "aligned_store"
-  [(set (match_operand:SI 3 "register_operand")
+  [(set (match_dup 6)
+	(unspec:BLK [(match_dup 6)] UNSPEC_MEMORY_BLOCKAGE))
+   (set (match_operand:SI 3 "register_operand")
 	(match_operand:SI 0 "memory_operand"))
    (set (subreg:DI (match_dup 3) 0)
 	(and:DI (subreg:DI (match_dup 3) 0) (match_dup 5)))
@@ -4278,11 +4293,15 @@
 		   (match_operand:DI 2 "const_int_operand")))
    (set (subreg:DI (match_dup 4) 0)
 	(ior:DI (subreg:DI (match_dup 4) 0) (subreg:DI (match_dup 3) 0)))
-   (set (match_dup 0) (match_dup 4))]
+   (set (match_dup 0) (match_dup 4))
+   (set (match_dup 6)
+	(unspec:BLK [(match_dup 6)] UNSPEC_MEMORY_BLOCKAGE))]
   ""
 {
   operands[5] = GEN_INT (~ (GET_MODE_MASK (GET_MODE (operands[1]))
 			    << INTVAL (operands[2])));
+  operands[6] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+  MEM_VOLATILE_P (operands[6]) = 1;
 })
 
 ;; For the unaligned byte and halfword cases, we use code similar to that
@@ -4293,7 +4312,9 @@
 ;; operand 2 can be that register.
 
 (define_expand "unaligned_store<mode>"
-  [(set (match_operand:DI 3 "register_operand")
+  [(set (match_dup 6)
+	(unspec:BLK [(match_dup 6)] UNSPEC_MEMORY_BLOCKAGE))
+   (set (match_operand:DI 3 "register_operand")
 	(mem:DI (and:DI (match_operand:DI 0 "address_operand")
 			(const_int -8))))
    (set (match_operand:DI 2 "register_operand")
@@ -4308,9 +4329,15 @@
 		   (ashift:DI (match_dup 2) (const_int 3))))
    (set (match_dup 4) (ior:DI (match_dup 4) (match_dup 3)))
    (set (mem:DI (and:DI (match_dup 0) (const_int -8)))
-	(match_dup 4))]
+	(match_dup 4))
+   (set (match_dup 6)
+	(unspec:BLK [(match_dup 6)] UNSPEC_MEMORY_BLOCKAGE))]
   ""
-  "operands[5] = GEN_INT (GET_MODE_MASK (<MODE>mode));")
+{
+  operands[5] = GEN_INT (GET_MODE_MASK (<MODE>mode));
+  operands[6] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+  MEM_VOLATILE_P (operands[6]) = 1;
+})
 
 ;; Here are the define_expand's for QI and HI moves that use the above
 ;; patterns.  We have the normal sets, plus the ones that need scratch

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-07-07 17:17 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-27 17:05 [PATCH, alpha]: Wrap {un,}aligned_store sequence with memory blockages Uros Bizjak
2014-06-27 20:04 ` Richard Henderson
2014-06-29 18:14   ` Uros Bizjak
2014-06-30 15:54     ` Richard Henderson
2014-06-30 20:58       ` Uros Bizjak
2014-07-07  9:10       ` Richard Biener
2014-07-07 14:21         ` Richard Henderson
2014-07-07 14:35           ` Richard Biener
2014-07-07 14:48             ` Richard Biener
2014-07-07 15:01             ` Richard Henderson
2014-07-07 16:35               ` Uros Bizjak
2014-07-07 17:17                 ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).