public inbox for gcc-bugs@sourceware.org
* [Bug c/53991] New: _mm_popcnt_u64 fails with -O3 -fgnu-tm
@ 2012-07-17  7:32 hakan at debian dot org
  2013-05-16 17:18 ` [Bug tree-optimization/53991] " ubizjak at gmail dot com
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: hakan at debian dot org @ 2012-07-17  7:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53991

             Bug #: 53991
           Summary: _mm_popcnt_u64 fails with -O3 -fgnu-tm
    Classification: Unclassified
           Product: gcc
           Version: 4.7.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: hakan@debian.org


Created attachment 27807
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27807
preprocessed file that triggers the bug

Hi,
a call to _mm_popcnt_u64 fails to compile when both -O3 and -fgnu-tm are
enabled. If either is disabled, it compiles fine. It can be triggered by:

  #include <smmintrin.h>

  int main(void) {
      int res = _mm_popcnt_u64(0);
      printf("Result res should be 0: %d\n", res);
  }

With this command line and output:

$ gcc -v -save-temps -march=native -O3 -fgnu-tm popcnt.c 
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.1-2'
--with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs
--enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.7 --enable-shared --enable-linker-build-id
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object
--enable-plugin --enable-objc-gc --with-arch-32=i586 --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.1 (Debian 4.7.1-2) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-march=native' '-O3' '-fgnu-tm'
'-pthread'
 /usr/lib/gcc/x86_64-linux-gnu/4.7/cc1 -E -quiet -v -imultiarch
x86_64-linux-gnu -D_REENTRANT popcnt.c -march=corei7 -mcx16 -msahf -mno-movbe
-mno-aes -mno-pclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop
-mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt
-mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=32 --param
l1-cache-line-size=64 --param l2-cache-size=8192 -mtune=corei7 -fgnu-tm -O3
-fpch-preprocess -o popcnt.i
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/4.7/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/4.7/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-march=native' '-O3' '-fgnu-tm'
'-pthread'
 /usr/lib/gcc/x86_64-linux-gnu/4.7/cc1 -fpreprocessed popcnt.i -march=corei7
-mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mno-abm -mno-lwp
-mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2
-msse4.2 -msse4.1 -mno-lzcnt -mno-rdrnd -mno-f16c -mno-fsgsbase --param
l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=8192
-mtune=corei7 -quiet -dumpbase popcnt.c -auxbase popcnt -O3 -version -fgnu-tm
-o popcnt.s
GNU C (Debian 4.7.1-2) version 4.7.1 (x86_64-linux-gnu)
    compiled by GNU C version 4.7.1, GMP version 5.0.5, MPFR version 3.1.0-p10,
MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C (Debian 4.7.1-2) version 4.7.1 (x86_64-linux-gnu)
    compiled by GNU C version 4.7.1, GMP version 5.0.5, MPFR version 3.1.0-p10,
MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 9e08a85d4bf68460be9df04a431a25b0
popcnt.c: In function ‘main’:
popcnt.c:5:5: warning: incompatible implicit declaration of built-in function
‘printf’ [enabled by default]
In file included from
/usr/lib/gcc/x86_64-linux-gnu/4.7/include/smmintrin.h:796:0,
                 from popcnt.c:1:
/usr/lib/gcc/x86_64-linux-gnu/4.7/include/popcntintrin.h:40:1: error: inlining
failed in call to always_inline ‘_mm_popcnt_u64’: 
popcnt.c:4:29: error: called from here
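
A possible source-level sidestep, sketched here under assumptions: if only the
bit count is needed, the generic GCC builtin __builtin_popcountll can be called
directly instead of the always_inline wrapper from popcntintrin.h. The helper
name popcount64 is hypothetical, and this has not been checked against 4.7.1
with -O3 -fgnu-tm, so treat it as something to try rather than a confirmed fix.

```c
#include <stdint.h>

/* Hypothetical helper (not from the report): counts set bits through the
   generic GCC builtin rather than the always_inline _mm_popcnt_u64 wrapper. */
static int popcount64(uint64_t x)
{
    return __builtin_popcountll((unsigned long long)x);
}
```

With -mpopcnt in effect, GCC normally expands this builtin to the popcnt
instruction, so the generated code should be comparable to the intrinsic.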



* [Bug tree-optimization/53991] _mm_popcnt_u64 fails with -O3 -fgnu-tm
  2012-07-17  7:32 [Bug c/53991] New: _mm_popcnt_u64 fails with -O3 -fgnu-tm hakan at debian dot org
@ 2013-05-16 17:18 ` ubizjak at gmail dot com
  2013-05-20 12:33 ` hubicka at ucw dot cz
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: ubizjak at gmail dot com @ 2013-05-16 17:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53991

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |tree-optimization

--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
For some reason the ccp1 pass doesn't fully propagate _mm_popcnt_u64 when
-fgnu-tm is in effect, leaving:

  res_1 = _mm_popcnt_u64 (0);
  printf ("Result res should be 0: %d\n", res_1);
  return 0;

Without -fgnu-tm, the ccp1 tree dump reads:

  printf ("Result res should be 0: %d\n", 0);
  return 0;
From: "jamborm at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug fortran/57297] FAIL: gfortran.dg/select_type_4.f90 -O2  execution test
Date: Thu, 16 May 2013 17:23:00 -0000

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57297

--- Comment #4 from Martin Jambor <jamborm at gcc dot gnu.org> ---
So far I have not attempted to reproduce this myself and so do not
quite follow all the previous comments but...

(In reply to Richard Biener from comment #2)
>
> build_ref_for_offset ends up creating a MEM_REF with different alias pointer
> type than "the original" (note that there may be multiple originals and
> AFAIK we don't make any attempt to "merge" them conservatively).

...it looks like the second arguments in
    generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
                 gsi, false, false, loc);
and
    generate_subtree_copies (lacc->first_child, lacc->base, 0, 0, 0,
                 gsi, true, true, loc);

should be the rhs and lhs respectively, rather than [rl]acc->base.
Using rhs/lhs, the generated statements would have the same alias
pointer type as the respective sides of the original statement.  Can
you try that?  We should probably also change accordingly (almost all)
other calls to generate_subtree_copies (this code generally predates
generating MEM_REFs in build_ref_for_offset and this did not matter).

>
> Re-materializing the original variable will always be hard.  I believe that
>
> Index: gcc/tree-sra.c
> ===================================================================
> --- gcc/tree-sra.c      (revision 198420)
> +++ gcc/tree-sra.c      (working copy)
> @@ -3158,7 +3158,7 @@ sra_modify_assign (gimple *stmt, gimple_
>
>    if (modify_this_stmt
>        || gimple_has_volatile_ops (*stmt)
> -      || contains_vce_or_bfcref_p (rhs)
> +      || contains_bitfld_comp_ref_p (rhs)
>        || contains_vce_or_bfcref_p (lhs))
>      {
>        if (access_has_children_p (racc))
>
> should work.

Even though relaxing this condition might be a good idea to try to
generate better code, I'd be against fixing bugs this way, if we can
avoid it.  This should be the safe path capable of handling everything
that the latter more sophisticated approaches might choke on.  It
would make the already complex code more difficult to maintain and
sooner or later we'd hit the same problem again (I think it should be
possible to use structures with a single field to create a testcase with
a similar problem and modify_this_stmt set to true, for example).



* [Bug tree-optimization/53991] _mm_popcnt_u64 fails with -O3 -fgnu-tm
  2012-07-17  7:32 [Bug c/53991] New: _mm_popcnt_u64 fails with -O3 -fgnu-tm hakan at debian dot org
  2013-05-16 17:18 ` [Bug tree-optimization/53991] " ubizjak at gmail dot com
@ 2013-05-20 12:33 ` hubicka at ucw dot cz
  2013-05-20 13:05 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: hubicka at ucw dot cz @ 2013-05-20 12:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53991

--- Comment #5 from Jan Hubicka <hubicka at ucw dot cz> ---
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53991
> 
> --- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
> The inlining fails in ipa-inline.c, around line 294:
> 
>   /* TM pure functions should not be inlined into non-TM_pure
>      functions.  */
>   else if (is_tm_pure (callee->symbol.decl)
>        && !is_tm_pure (e->caller->symbol.decl))
>     {
>       e->inline_failed = CIF_UNSPECIFIED;
>       inlinable = false;
>     }
Instead of CIF_UNSPECIFIED we should have a warning/error for this case.
Jakub, what should be the behaviour when a !tm_pure always_inline function
is called from a tm_pure function?

Honza
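
To make the suggestion above concrete, here is a toy model of the quoted check
that records a dedicated failure reason instead of an unspecified one. It is
only a sketch: struct fn_model, check_tm_inline, verdict_message and the enum
values are all hypothetical names and do not exist in GCC.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in GCC. */
enum inline_verdict {
    INLINE_OK,
    INLINE_FAILED_UNSPECIFIED,  /* analogue of CIF_UNSPECIFIED: nothing to report */
    INLINE_FAILED_TM_PURE       /* hypothetical dedicated reason */
};

struct fn_model {
    const char *name;
    bool tm_pure;
    bool always_inline;
};

/* Same shape as the quoted check: a tm_pure callee is refused when the
   caller is not tm_pure, but a specific reason is returned. */
static enum inline_verdict
check_tm_inline(const struct fn_model *caller, const struct fn_model *callee)
{
    if (callee->tm_pure && !caller->tm_pure)
        return INLINE_FAILED_TM_PURE;
    return INLINE_OK;
}

/* Text a diagnostic could carry when an always_inline callee is refused;
   returns NULL when inlining is allowed. */
static const char *
verdict_message(enum inline_verdict v)
{
    switch (v) {
    case INLINE_OK:
        return NULL;
    case INLINE_FAILED_TM_PURE:
        return "tm_pure callee cannot be inlined into non-tm_pure caller";
    case INLINE_FAILED_UNSPECIFIED:
        break;
    }
    return "";
}
```

With a reason like this, the "inlining failed in call to always_inline" error
in the original report would no longer end in an empty explanation.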
From: "dimhen at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/57038] Latest libreoffice compilation fails with enabled LTO
Date: Mon, 20 May 2013 12:44:00 -0000

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57038

Dmitry G. Dyachenko <dimhen at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dimhen at gmail dot com

--- Comment #22 from Dmitry G. Dyachenko <dimhen at gmail dot com> ---
*** Bug 57267 has been marked as a duplicate of this bug. ***



* [Bug tree-optimization/53991] _mm_popcnt_u64 fails with -O3 -fgnu-tm
  2012-07-17  7:32 [Bug c/53991] New: _mm_popcnt_u64 fails with -O3 -fgnu-tm hakan at debian dot org
  2013-05-16 17:18 ` [Bug tree-optimization/53991] " ubizjak at gmail dot com
  2013-05-20 12:33 ` hubicka at ucw dot cz
@ 2013-05-20 13:05 ` jakub at gcc dot gnu.org
  2013-05-21 14:08 ` torvald at gcc dot gnu.org
  2014-08-01  9:46 ` andysem at mail dot ru
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-05-20 13:05 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53991

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rth at gcc dot gnu.org,
                   |                            |torvald at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I know next to nothing about tm_pure, CCing those that do know.



* [Bug tree-optimization/53991] _mm_popcnt_u64 fails with -O3 -fgnu-tm
  2012-07-17  7:32 [Bug c/53991] New: _mm_popcnt_u64 fails with -O3 -fgnu-tm hakan at debian dot org
                   ` (2 preceding siblings ...)
  2013-05-20 13:05 ` jakub at gcc dot gnu.org
@ 2013-05-21 14:08 ` torvald at gcc dot gnu.org
  2014-08-01  9:46 ` andysem at mail dot ru
  4 siblings, 0 replies; 6+ messages in thread
From: torvald at gcc dot gnu.org @ 2013-05-21 14:08 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53991

--- Comment #7 from torvald at gcc dot gnu.org ---
A piece of code is tm_pure if, roughly, it doesn't need any instrumentation
(e.g., in contrast to memory loads/stores).  In the test case, I suppose that
the compiler detects that it is tm_pure, but we also allow programmers to
declare it.

Ideally, tm_pure should be a property of a region of code that is preserved
across optimizations (but where we don't move code into or out of tm_pure
regions).  That may require too much implementation effort (but perhaps we
could reuse the TM regions for that, as a "no-TM" region?)

Alternatively, we could refrain from automatically marking always_inline
functions as tm_pure, and warn if an always_inline function is also annotated
as tm_pure by the programmer.

Other thoughts?



* [Bug tree-optimization/53991] _mm_popcnt_u64 fails with -O3 -fgnu-tm
  2012-07-17  7:32 [Bug c/53991] New: _mm_popcnt_u64 fails with -O3 -fgnu-tm hakan at debian dot org
                   ` (3 preceding siblings ...)
  2013-05-21 14:08 ` torvald at gcc dot gnu.org
@ 2014-08-01  9:46 ` andysem at mail dot ru
  4 siblings, 0 replies; 6+ messages in thread
From: andysem at mail dot ru @ 2014-08-01  9:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53991

--- Comment #8 from andysem at mail dot ru ---
We have a similar problem in Boost.Atomic:

https://svn.boost.org/trac/boost/ticket/10204

There we mark all boost::atomic<> functions as always_inline to make sure the
compiler sees the memory order arguments as constants as opposed to runtime
values (otherwise the compiler just ignores memory order arguments and acts as
if seq_cst was specified).

As I understand it, atomic intrinsics are transaction_unsafe, so Boost.Atomic
functions should be as well, yet we still want them always_inline. Given
this, I don't quite understand why a transaction_unsafe function
can't be inlined into the caller; the caller is unsafe anyway, isn't it?

Is there a solution for this problem on the source code level, except removing
always_inline?
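
For readers unfamiliar with the pattern described above, this is roughly what
it looks like in plain C11. It is an illustrative sketch, not Boost.Atomic
code; load_with_order, read_relaxed and read_acquire are hypothetical names.

```c
#include <stdatomic.h>

/* The wrapper is forced inline so that, at each call site, the memory order
   reaches the atomic operation as a compile-time constant. */
static inline __attribute__((always_inline))
int load_with_order(atomic_int *p, memory_order order)
{
    return atomic_load_explicit(p, order);
}

/* Call sites pass a constant order; after inlining the compiler can pick
   the matching instruction sequence for each one. */
static int read_relaxed(atomic_int *p)
{
    return load_with_order(p, memory_order_relaxed);
}

static int read_acquire(atomic_int *p)
{
    return load_with_order(p, memory_order_acquire);
}
```

If load_with_order were not inlined, the order would arrive as a runtime
value, which is the case this message says gets treated as seq_cst.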

