[Bug target/52908] New: xop-mul-1:f9 miscompiled on bulldozer (-mxop)

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

From: "matz at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/52908] New: xop-mul-1:f9 miscompiled on bulldozer (-mxop)
Date: Sun, 08 Apr 2012 22:48:00 -0000	[thread overview]
Message-ID: <bug-52908-4@http.gcc.gnu.org/bugzilla/> (raw)

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52908

             Bug #: 52908
           Summary: xop-mul-1:f9 miscompiled on bulldozer (-mxop)
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: matz@gcc.gnu.org


FAIL: gcc.target/i386/xop-mul-1.c execution test

This is because f9 is miscompiled with -O3 -mxop (f13 is as well,
but this report is only about f9).  f9 looks like so:

__attribute__((noinline, noclone)) void
f9 (void)
{
  int i;
  for (i = 0; i < 512; ++i)
    e1[i] = (long long) c2[i] * (long long) c3[i];
}

This is compiled into this asm:

f9:
        movl    $0, %eax
.L29:
        vpshufd $216, c2(%rax), %xmm1
        vpshufd $216, c3(%rax), %xmm0
        vpxor   %xmm2, %xmm2, %xmm2
        vpmacsdql       %xmm2, %xmm0, %xmm1, %xmm2
        vmovdqa %xmm2, e1(%rax,%rax)
        vpxor   %xmm0, %xmm0, %xmm0         <<< overwrite xmm0
        vpmacsdqh       %xmm0, %xmm0, %xmm1, %xmm0
        vmovdqa %xmm0, e1+16(%rax,%rax)
        addq    $16, %rax
        cmpq    $2048, %rax
        jne     .L29
        rep
        ret

The marked instruction zeroing xmm0 before the second vpmacsdqh overwrites
xmm0 although it's still used as input.  That insn is generated by post-reload
splitting and is caused by a conflict between two pattern in sse.md:

(define_insn "*sse4_1_mulv2siv2di3"
  [(set (match_operand:V2DI 0 "register_operand" "=x,x")
        (mult:V2DI
          (sign_extend:V2DI
            (vec_select:V2SI
              (match_operand:V4SI 1 "nonimmediate_operand" "%0,x")
              (parallel [(const_int 0) (const_int 2)])))
          (sign_extend:V2DI
            (vec_select:V2SI
              (match_operand:V4SI 2 "nonimmediate_operand" "xm,xm")
              (parallel [(const_int 0) (const_int 2)])))))]
  "TARGET_SSE4_1 && ix86_binary_operator_ok (MULT, V4SImode, operands)"
...

(note how operand 0 is not marked early-clobber) and the next xop pattern:

;; We don't have a straight 32-bit parallel multiply and extend on XOP, so
;; fake it with a multiply/add.  In general, we expect the define_split to
;; occur before register allocation, so we have to handle the corner case where
;; the target is the same as either operands[1] or operands[2]
(define_insn_and_split "xop_mulv2div2di3_high"
  [(set (match_operand:V2DI 0 "register_operand" "=&x")
        (mult:V2DI
          (sign_extend:V2DI
            (vec_select:V2SI
              (match_operand:V4SI 1 "register_operand" "%x")
              (parallel [(const_int 0)
                         (const_int 2)])))
          (sign_extend:V2DI
            (vec_select:V2SI
              (match_operand:V4SI 2 "nonimmediate_operand" "xm")
              (parallel [(const_int 0)
                         (const_int 2)])))))]
  "TARGET_XOP"
  "#"
  "&& reload_completed"
  [(set (match_dup 0)
        (match_dup 3))
  ...

Note in particular how the comment about being careful about dest == op0/op1
is taken care of by the early-clobber of op0.

Now the problem is, the first (sse 4.1) pattern matches the insn in question:

(insn 31 27 32 3 (set (reg:V2DI 84)
        (mult:V2DI (sign_extend:V2DI (vec_select:V2SI (reg:V4SI 81)
                    (parallel [
                            (const_int 0 [0])
                            (const_int 2 [0x2])
                        ])))
            (sign_extend:V2DI (vec_select:V2SI (reg:V4SI 82)
                    (parallel [
                            (const_int 0 [0])
                            (const_int 2 [0x2])
                        ])))))
/matz/trunk/gcc/gcc/testsuite/gcc.target/i386/sse
2-mul-1.c:99 1484 {*sse4_1_mulv2siv2di3}
     (expr_list:REG_DEAD (reg:V4SI 82)
        (expr_list:REG_DEAD (reg:V4SI 81)
            (nil))))

(sse4_1_mulv2siv2di3 matched).  The reg allocator hence doesn't see any
early-clobber, allocates pseudo 84 and 81 to xmm0, and when the pattern
then is split (using the xop_mulv2div2di3_high splitter, used because
the sse4_1_mulv2siv2di3 pattern is no splitter) op1 is overwritten.

I've no idea how best to solve this, either deactivating the sse4.1 pattern
when TARGET_XOP, or rewriting the latter to not early-clobber result, or
something else entirely.  I've tested that the first way fixes this problem.

next             reply	other threads:[~2012-04-08 22:48 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-08 22:48 matz at gcc dot gnu.org [this message]
2012-04-09  8:17 ` [Bug target/52908] " ubizjak at gmail dot com
2012-04-09 11:48 ` ubizjak at gmail dot com
2012-04-10 13:49 ` ubizjak at gmail dot com
2012-05-04 10:43 ` venkataramanan.kumar at amd dot com
2012-05-04 12:52 ` ubizjak at gmail dot com
2012-05-09  3:44 ` venkataramanan.kumar at amd dot com
2012-05-09 20:46 ` uros at gcc dot gnu.org
2012-05-09 20:57 ` ubizjak at gmail dot com
2012-05-10  6:22 ` ubizjak at gmail dot com
2012-06-18 15:11 ` vekumar at gcc dot gnu.org
2012-07-02 12:42 ` rguenth at gcc dot gnu.org
2024-03-17  2:33 ` sjames at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-52908-4@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).