[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64


From: "irar at il dot ibm dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution,  -O3 -g with -m64
Date: Sun, 10 Jan 2010 13:43:00 -0000
Message-ID: <20100110134329.5744.qmail@sourceware.org>
In-Reply-To: <bug-41082-12313@http.gcc.gnu.org/bugzilla/>

------- Comment #43 from irar at il dot ibm dot com  2010-01-10 13:43 -------
Since -O2 -ftree-vectorize doesn't cause bad code, it has to be some other
optimization on top of vectorized code that causes the problem.

Bad code is generated when the alignment of 'reduce' is forced and the
reduction 'sum(reduce)' is vectorized. However, the result of the reduction is
correct, and the vector stores do not corrupt anything (as far as I can see in
the debugger).
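
To illustrate what a vectorized reduction with a scalar epilogue computes,
here is a minimal Python model. The vectorization factor of 4 (four 32-bit
integers per vector) and the name vectorized_sum are assumptions made for
illustration; the input is the set of values the test assigns to reduce. This
models only the arithmetic, not the code GCC actually generates.

```python
# Model of a lane-wise vector reduction followed by a scalar epilogue.
# VF = 4 is an assumption (four 32-bit integers per 128-bit vector).
VF = 4


def vectorized_sum(values, vf=VF):
    lanes = [0] * vf                          # vector accumulator, one partial sum per lane
    n_vec = len(values) - len(values) % vf    # elements covered by full vectors
    for i in range(0, n_vec, vf):             # vector loop body
        for lane in range(vf):
            lanes[lane] += values[i + lane]
    total = sum(lanes)                        # reduce the lanes to one scalar
    for x in values[n_vec:]:                  # scalar epilogue for the leftover elements
        total += x
    return total


# Values assigned to reduce in the test (10 elements).
reduce = [-1, -1, -1, 0, 0, 0, 5, 5, 10, 10]

if __name__ == "__main__":
    print(vectorized_sum(reduce), sum(reduce))
```

With 10 elements and an assumed factor of 4, the vector loop covers the first
eight elements and the epilogue handles the last two; the model returns the
same value as a plain scalar sum.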

The part that goes wrong is the scalar code that decides whether to add the
(correctly computed) reduction value to temp[9] and temp[10]. The code that
sets the condition (which, by the way, does not use any vectorized code) does
not use the values of reduce[9] and reduce[10], even though the value of the
condition depends on them:

reduce(1:3) = -1
reduce(4:6) = 0
reduce(7:8) = 5
reduce(9:10) = 10
...
WHERE (reduce > 6) temp = temp + sum(reduce) 
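
As a reference for what the masked assignment above should compute, here is a
minimal Python model. The function name where_add_sum and the initial
contents of temp are placeholders (the excerpt does not show how temp is
initialized); only the values of reduce come from the excerpt.

```python
# Model of: WHERE (reduce > 6) temp = temp + sum(reduce)
def where_add_sum(reduce, temp, threshold=6):
    total = sum(reduce)                        # sum(reduce), evaluated once
    mask = [r > threshold for r in reduce]     # reduce > 6, element by element
    # Only elements whose mask is true receive the sum; the rest are unchanged.
    return [t + total if m else t for t, m in zip(temp, mask)]


reduce = [-1] * 3 + [0] * 3 + [5] * 2 + [10] * 2   # values from the excerpt
temp = [0] * 10                                    # placeholder initial values

if __name__ == "__main__":
    print(where_add_sum(reduce, temp))
```

Only the last two elements satisfy reduce > 6, so only temp[9] and temp[10]
(1-based) should receive sum(reduce), which is 27 for these values.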

Here is the code for adding the result of the "sum(reduce)" to temp[9]:

L29:
        lbz r11,152(r1)  # **
        cmpwi cr7,r11,0  # reduce > 6 ?
        beq cr7,L30
        lwz r11,240(r1)  # load temp[9]
        add r11,r11,r9   # temp[9] + sum(reduce) 
        stw r11,240(r1)  # store temp[9]

** - The calculation of 152(r1) is based only on the value of reduce[8]! The
values of reduce[9] and reduce[10] are only used in the reduction calculation
and not compared to 6 at all.
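
The effect can be modelled with a short Python sketch of the L29 sequence.
How the byte at 152(r1) is actually computed is not shown in the listing; the
sketch assumes it is the result of comparing a single element of reduce with
6, and the parameter flag_source_index (1-based, hypothetical) selects which
element that is.

```python
# Model of the L29 sequence that updates temp[9] (1-based indices, as in the comment).
def update_temp9(reduce, temp, flag_source_index, threshold=6):
    # Assumption: the byte loaded by lbz from 152(r1) is 1 when the chosen
    # element of reduce is greater than 6, and 0 otherwise.
    flag = 1 if reduce[flag_source_index - 1] > threshold else 0
    total = sum(reduce)              # r9 holds sum(reduce)
    result = list(temp)
    if flag != 0:                    # cmpwi cr7,r11,0 ; beq cr7,L30
        result[9 - 1] += total       # lwz / add / stw on 240(r1)
    return result


reduce = [-1, -1, -1, 0, 0, 0, 5, 5, 10, 10]
temp = [0] * 10                      # placeholder initial values

if __name__ == "__main__":
    print(update_temp9(reduce, temp, flag_source_index=9))  # flag from reduce[9]
    print(update_temp9(reduce, temp, flag_source_index=8))  # flag from reduce[8]
```

When the flag is derived from reduce[9] (value 10), temp[9] receives the sum;
when it is derived from reduce[8] (value 5), the branch is taken and temp[9]
is left unchanged, which matches the failure described here.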

If we don't vectorize (but still force the alignment), there is a
cmpwi cr7,r29,6 instruction, where r29 holds reduce[9] (and the code is
correct). The same happens when the alignment of reduce is not forced and the
reduction is vectorized using peeling.

I.e., as far as I can see, in the bad code the comparisons of reduce[9] and
reduce[10] with 6 do not exist. I wonder which optimization could be
responsible for that?

Also, some values of reduce are copied to a temporary array and later
compared with 6. In the version with peeling, the copied values are
reduce[4:8]: there is no need to keep the first three, and the last two are
kept in registers and compared with 6 (and also used in the reduction
epilogue). In the bad version, the kept values are reduce[3:8], and reduce[8]
is placed before reduce[3:7] (reduce[3:7] are in 276(r1) to 292(r1), and
reduce[8] is in 272(r1)). In the bad code, the last two values, reduce[9] and
reduce[10], are used only in the reduction epilogue.
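
The stack layout described for the bad version can be tabulated with a small
Python sketch. The 4-byte slot size is an assumption (it is consistent with
the lwz/stw word accesses in the listing), and bad_layout is a hypothetical
helper name.

```python
WORD = 4  # assumed size in bytes of one integer slot


def bad_layout(base=272):
    """Map r1-relative offsets to the copied elements in the bad version."""
    layout = {base: "reduce[8]"}                  # reduce[8] comes first, at 272(r1)
    for k, idx in enumerate(range(3, 8), start=1):
        layout[base + k * WORD] = f"reduce[{idx}]"  # reduce[3:7] follow it
    return layout


if __name__ == "__main__":
    for offset, name in sorted(bad_layout().items()):
        print(f"{offset}(r1): {name}")
```

Under the 4-byte assumption, the five values reduce[3:7] occupy exactly the
offsets 276 through 292, which agrees with the range given above.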

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082


