public inbox for gcc-bugs@sourceware.org
* [Bug libgomp/58482] New: gomp4: user defined reduction produce wrong result
@ 2013-09-20 10:52 vincenzo.innocente at cern dot ch
  2013-09-20 11:16 ` [Bug libgomp/58482] " jakub at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: vincenzo.innocente at cern dot ch @ 2013-09-20 10:52 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58482

            Bug ID: 58482
           Summary: gomp4: user defined reduction produce wrong result
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vincenzo.innocente at cern dot ch
                CC: jakub at gcc dot gnu.org

I acknowledge that my understanding of "omp declare" is still limited.
Still, the example below produces different results with and without -fopenmp.

gcc version 4.9.0 20130919 (experimental) [gomp-4_0-branch revision 202766]
(GCC) 
pb-d-128-141-131-94:vectorize innocent$ c++ -std=c++11  ured_omp4.cpp -O
-ftree-vectorizer-verbose=1; ./a.out
523776,-523776
pb-d-128-141-131-94:vectorize innocent$ c++ -std=c++11  ured_omp4.cpp -O
-ftree-vectorizer-verbose=1 -fopenmp; ./a.out
ured_omp4.cpp:26:8: note: loop turned into non-loop; it never loops
ured_omp4.cpp:26:8: note: loop turned into non-loop; it never loops
523776,523776


cat ured_omp4.cpp
#define Type float

struct TwoInt {
  Type a=0;
  Type b=0;

#pragma omp declare simd
  TwoInt & operator+=(TwoInt rh) {
    a+=rh.a;
    b-=rh.b;
    return *this;
  }

#pragma omp declare simd
  TwoInt & add(TwoInt rh) {
    a+=rh.a;
    b-=rh.b;
    return *this;
  }


};

#pragma omp declare reduction (foo:struct TwoInt: omp_out.add(omp_in))


TwoInt sum(Type const * q, int NN) {
  TwoInt s;
#pragma omp simd reduction(foo:s)
  for (int i=0;i<NN;++i) {
    TwoInt l; l.a=q[i]; l.b = q[i];
    s.add(l);
  }
  return s;
}

#include<iostream>
int main() {
  constexpr int NN=1024;
  Type q[NN];
  Type a=0;
  for (auto & e: q) e=a++;

  auto s = sum(q,NN);
  std::cout << s.a << "," << s.b << std::endl;


  return 0;
}



* [Bug libgomp/58482] gomp4: user defined reduction produce wrong result
  2013-09-20 10:52 [Bug libgomp/58482] New: gomp4: user defined reduction produce wrong result vincenzo.innocente at cern dot ch
@ 2013-09-20 11:16 ` jakub at gcc dot gnu.org
  2013-09-20 13:28 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-09-20 11:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58482

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Yeah, the testcase is just wrong.  The way OpenMP reductions work is that the
reduction variable(s) are privatized in the loop: each thread (e.g. for
#pragma omp for) or each SIMD lane (for #pragma omp simd) gets its own copy of
the variable.  That copy is either default constructed (C++98 wording; for
UDRs with a missing initializer clause) or initialized otherwise (see the
standard for details).  Inside the loop everything is done on the privatized
variable.  At the end, the reduction operation is performed on the original
variable: the combiner expression is called once for each privatized variable,
with omp_out being the original variable and omp_in being the privatized one.

Your testcase obviously can't work properly in that case, because your
reduction operation can't cope with being performed more than once.  If you
subtract the array element values from the counter in the original variable
and they are all positive, you get a negative number.  If you subtract them
from the privatized vars instead, all those privatized vars end up with a
negative counter, but then the combiner subtracts those negative numbers from
the original var's counter and you get a positive number.  There is a reason
why reduction (-:int_var) is actually implemented as addition rather than
subtraction...

In your testcase, you'd want a different operation to be used in the reduction
combiner, one that adds things rather than subtracts.
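
The sign flip described above can be reproduced without OpenMP by emulating
the privatization serially.  This is only a minimal sketch: the lane count
kLanes, the round-robin assignment of iterations to lanes, and the names
merge and emulate are assumptions made for illustration; they are not taken
from the testcase or from GCC.

```cpp
#include <cassert>

struct TwoInt {
  float a = 0;
  float b = 0;

  // The testcase's operation: adds to a, subtracts from b.
  TwoInt& add(const TwoInt& rh) { a += rh.a; b -= rh.b; return *this; }

  // A combiner that only sums partial results (hypothetical name).
  TwoInt& merge(const TwoInt& rh) { a += rh.a; b += rh.b; return *this; }
};

constexpr int kLanes = 4;  // assumed lane count, for illustration only

// Emulates a privatized reduction: every lane accumulates into its own
// default-constructed copy, then each copy is combined into the original.
TwoInt emulate(const float* q, int n, bool combine_with_add) {
  assert(n >= 0);
  TwoInt priv[kLanes];
  for (int i = 0; i < n; ++i) {
    TwoInt l;
    l.a = q[i];
    l.b = q[i];
    priv[i % kLanes].add(l);  // the loop body works on a private copy
  }
  TwoInt s;  // stands in for the original reduction variable
  for (const TwoInt& p : priv) {
    if (combine_with_add) {
      s.add(p);    // subtracts a negative partial, so the sign of b flips
    } else {
      s.merge(p);  // sums the partials, so the sign of b is preserved
    }
  }
  return s;
}
```

With q holding 0..1023 as in the testcase, combining with add yields a
positive b, while combining with merge yields the same b as the serial
build.  In the testcase this would correspond to a combiner along the lines
of omp_out.merge(omp_in), with the loop body still calling add.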



* [Bug libgomp/58482] gomp4: user defined reduction produce wrong result
  2013-09-20 10:52 [Bug libgomp/58482] New: gomp4: user defined reduction produce wrong result vincenzo.innocente at cern dot ch
  2013-09-20 11:16 ` [Bug libgomp/58482] " jakub at gcc dot gnu.org
@ 2013-09-20 13:28 ` jakub at gcc dot gnu.org
  2013-09-21 15:47 ` vincenzo.innocente at cern dot ch
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-09-20 13:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58482

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
It is a known limitation that we don't vectorize this.  Right now we only
handle accesses to the SIMD lane privatized vars that cover the whole size of
those vars, while in your testcase each access is half the size of the var.
The reason why -Ofast vectorizes it is likely that SRA manages to scalarize
those, but the scalarizer can't easily do anything with the magic arrays
indexed by the SIMD_LANE internal fn that we use to represent the privatized
variables (and I couldn't find a better representation for those yet).
So, either the SRA pass would need to handle those (split the single array
of { float; float } pairs into two arrays of plain float type), or the
vectorizer would need to do some ugly magic for selected cases (e.g. handle the
pair case by having two vector vars and always using either the odd or the even
entries from them).  I'm hoping that most people will actually use scalars or
single data member classes for reductions etc.
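
The split of one array of { float; float } pairs into two float arrays is
the usual array-of-structs to struct-of-arrays change.  The sketch below
only illustrates the two memory layouts; Pair and split are hypothetical
names, and this is not how GCC represents the privatized variables
internally.

```cpp
#include <cassert>

// One element per SIMD lane; the two fields of consecutive lanes are
// interleaved in memory (a0 b0 a1 b1 ...).
struct Pair {
  float a;
  float b;
};

// Splits n interleaved pairs into two per-field arrays, so that each field
// is contiguous across lanes (a0 a1 ... and b0 b1 ...).
void split(const Pair* in, int n, float* a, float* b) {
  assert(in != nullptr && a != nullptr && b != nullptr && n >= 0);
  for (int i = 0; i < n; ++i) {
    a[i] = in[i].a;
    b[i] = in[i].b;
  }
}
```

With the fields in separate arrays, every access covers a whole element,
which is the case the comment above says is currently handled.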



* [Bug libgomp/58482] gomp4: user defined reduction produce wrong result
  2013-09-20 10:52 [Bug libgomp/58482] New: gomp4: user defined reduction produce wrong result vincenzo.innocente at cern dot ch
  2013-09-20 11:16 ` [Bug libgomp/58482] " jakub at gcc dot gnu.org
  2013-09-20 13:28 ` jakub at gcc dot gnu.org
@ 2013-09-21 15:47 ` vincenzo.innocente at cern dot ch
  2013-09-21 19:04 ` jakub at gcc dot gnu.org
  2013-09-26  7:58 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: vincenzo.innocente at cern dot ch @ 2013-09-21 15:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58482

--- Comment #4 from vincenzo Innocente <vincenzo.innocente at cern dot ch> ---
I see.
I have several use cases in which the reduction requires access to two
variables (minloc, for instance: the minimum and its location).
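
For reference, a minloc reduction of this kind could be written as a
user-defined reduction roughly as follows.  This is a sketch under
assumptions: MinLoc, min_loc and find_min are hypothetical names, and it
uses parallel for rather than simd.  The combiner is safe to apply any
number of times, and ties go to the lower index so the result does not
depend on the order in which partial results are combined.

```cpp
#include <cassert>
#include <limits>

struct MinLoc {
  // Default construction yields the identity element: no location yet.
  float val = std::numeric_limits<float>::max();
  int idx = -1;
};

// Returns the entry with the smaller value; on equal values the lower
// index wins.  An entry with idx < 0 means "nothing seen yet".
inline MinLoc min_loc(const MinLoc& x, const MinLoc& y) {
  if (y.idx < 0) return x;
  if (x.idx < 0) return y;
  if (y.val < x.val || (y.val == x.val && y.idx < x.idx)) return y;
  return x;
}

// No initializer clause: private copies are default constructed.
#pragma omp declare reduction(minloc : MinLoc : omp_out = min_loc(omp_out, omp_in))

MinLoc find_min(const float* q, int n) {
  assert(n >= 0);
  MinLoc m;
#pragma omp parallel for reduction(minloc : m)
  for (int i = 0; i < n; ++i) {
    MinLoc cur;
    cur.val = q[i];
    cur.idx = i;
    m = min_loc(m, cur);
  }
  return m;
}
```

Without -fopenmp the pragmas are ignored and the loop runs serially, giving
the same result.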


BTW, I tried
omp parallel for simd
and got an ICE:

 c++ -std=c++11 ured_omp4.cpp -O -ftree-vectorizer-verbose=1 -fopenmp
ured_omp4.cpp: In function ‘TwoInt sum(const int*, int)’:
ured_omp4.cpp:38:63: internal compiler error: Segmentation fault: 11
 #pragma omp parallel for simd  aligned(q: 16) reduction(foo:s)
                                                               ^

ured_omp4.cpp:38:63: internal compiler error: Abort trap: 6
c++: internal compiler error: Abort trap: 6 (program cc1plus)
Abort trap: 6

From: ishiura-compiler at ml dot kwansei.ac.jp
To: gcc-bugs
Subject: [Bug tree-optimization/58494] New: ICE (verify_ssa failed)
Date: Sat, 21 Sep 2013 16:07:00 -0000

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58494

            Bug ID: 58494
           Summary: ICE (verify_ssa failed)
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ishiura-compiler at ml dot kwansei.ac.jp
            Target: x86_64-pc-linux-gnu

GCC 4.9.0 ICEs on the following code. (x86_64)


  $ cat test.c

  int  g0 = 1;
  long g1 = 0;

  int main (void)
  {
    int x0 = 1;
    int x1 = 1;

    int a = g0 != 1;  /* a = 0 */
    int t = x0 - g1;  /* t = 1 */
    int b = x1 & t;   /* b = 1 */
    int c = a  & b;   /* c = 0 */
    int s = g0 * 1;   /* s = 1 */
    int d = s && 1;   /* d = 1 */
    int e = c  & d;   /* e = 0 */

    if (e != 0) __builtin_abort();

    return 0;
  }


  $ x86_64-unknown-linux-gnu-gcc-4.9.0 test.c -O1
  test.c: In function 'main':
  test.c:4:5: error: definition in block 2 follows the use
   int main (void)
     ^
  for SSA_NAME: _16 in statement:
  c_10 = _16 & 1;
  test.c:4:5: internal compiler error: verify_ssa failed
  0xaa4779 verify_ssa(bool)
    /home/hassy/gcc/gcc/tree-ssa.c:1046
  0x87f3c1 execute_function_todo
    /home/hassy/gcc/gcc/passes.c:1834
  0x87fb17 execute_todo
    /home/hassy/gcc/gcc/passes.c:1866
  Please submit a full bug report,
  with preprocessed source if appropriate.
  Please include the complete backtrace with any bug report.
  See <http://gcc.gnu.org/bugs.html> for instructions.


  $ x86_64-unknown-linux-gnu-gcc-4.9.0 -v
  Using built-in specs.
  COLLECT_GCC=x86_64-unknown-linux-gnu-gcc-4.9.0

COLLECT_LTO_WRAPPER=/usr/local/x86_64-tools/gcc-4.9.0/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/lto-wrapper
  Target: x86_64-unknown-linux-gnu
  Configured with: /home/hassy/gcc/configure
--prefix=/usr/local/x86_64-tools/gcc-4.9.0/
--with-gmp=/usr/local/gmp-5.1.1/ --with-mpfr=/usr/local/mpfr-3.1.2/
--with-mpc=/usr/local/mpc-1.0.1/ --disable-multilib --disable-nls
--enable-languages=c
  Thread model: posix
  gcc version 4.9.0 20130919 (experimental) (GCC)



* [Bug libgomp/58482] gomp4: user defined reduction produce wrong result
  2013-09-20 10:52 [Bug libgomp/58482] New: gomp4: user defined reduction produce wrong result vincenzo.innocente at cern dot ch
                   ` (2 preceding siblings ...)
  2013-09-21 15:47 ` vincenzo.innocente at cern dot ch
@ 2013-09-21 19:04 ` jakub at gcc dot gnu.org
  2013-09-26  7:58 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-09-21 19:04 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58482

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
There is no problem with having as many reductions as you need, if they are
separate variables; the only case that will prevent vectorization is if you
have a struct/class with multiple data members as reduction.

I'll see if I can reproduce the ICE on Monday.
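
As an illustration of the "separate variables" approach, the testcase's
two sums could be kept in two scalar reduction variables.  This is a
minimal sketch, not code from the thread; Sums and sum_scalars are
hypothetical names.  Both variables use the + reduction: the private
copies of b hold negative partial sums, and adding them together gives the
correct total.

```cpp
#include <cassert>

struct Sums {
  float a;
  float b;
};

Sums sum_scalars(const float* q, int n) {
  assert(n >= 0);
  float a = 0;
  float b = 0;
  // Two independent scalar reductions instead of one two-member struct.
#pragma omp simd reduction(+ : a, b)
  for (int i = 0; i < n; ++i) {
    a += q[i];
    b -= q[i];
  }
  Sums r = {a, b};
  return r;
}
```

For q holding 0..1023 this returns 523776 and -523776 both with and
without -fopenmp.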



* [Bug libgomp/58482] gomp4: user defined reduction produce wrong result
  2013-09-20 10:52 [Bug libgomp/58482] New: gomp4: user defined reduction produce wrong result vincenzo.innocente at cern dot ch
                   ` (3 preceding siblings ...)
  2013-09-21 19:04 ` jakub at gcc dot gnu.org
@ 2013-09-26  7:58 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-09-26  7:58 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58482

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Author: jakub
Date: Thu Sep 26 07:58:02 2013
New Revision: 202937

URL: http://gcc.gnu.org/viewcvs?rev=202937&root=gcc&view=rev
Log:
    PR libgomp/58482
    * c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_REDUCTION>: Copy
    also OMP_CLAUSE_REDUCTION_PLACEHOLDER.

    * testsuite/libgomp.c/simd-6.c: New test.
    * testsuite/libgomp.c++/simd-8.C: New test.

Added:
    branches/gomp-4_0-branch/libgomp/testsuite/libgomp.c++/simd-8.C
    branches/gomp-4_0-branch/libgomp/testsuite/libgomp.c/simd-6.c
Modified:
    branches/gomp-4_0-branch/gcc/c-family/ChangeLog.gomp
    branches/gomp-4_0-branch/gcc/c-family/c-omp.c
    branches/gomp-4_0-branch/libgomp/ChangeLog.gomp



