From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 22505 invoked by alias); 5 Dec 2014 10:50:43 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 22460 invoked by uid 48); 5 Dec 2014 10:50:40 -0000
From: "ubizjak at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/64003] valgrind complains about get_attr_length_nobnd in insn-attrtab.c from i386.md
Date: Fri, 05 Dec 2014 10:50:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: ubizjak at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: enkovich.gnu at gmail dot com
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-12/txt/msg00538.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003

--- Comment #24 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #23)
> (In reply to Ilya Enkovich from comment #21)
>
> > Then we have three problematic patterns and the easiest way to handle it is
> > to get rid of ix86_bnd_prefixed_insn_p call in length computation for them.
> > I think the easiest way to do it is to have separate bnd and nobnd patterns
> > for these instructions.  Attached patch helps me to resolve valgrind error.
> > Is such approach fine?
>
> Maybe "enabled" attribute can help here to avoid unnecessary duplication.

No, please disregard the above sentence. The approach with two patterns looks
OK AFAICS.
>>From gcc-bugs-return-469532-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org Fri Dec 05 11:03:40 2014
Return-Path:
Delivered-To: listarch-gcc-bugs@gcc.gnu.org
Received: (qmail 28591 invoked by alias); 5 Dec 2014 11:03:40 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Delivered-To: mailing list gcc-bugs@gcc.gnu.org
Received: (qmail 28553 invoked by uid 48); 5 Dec 2014 11:03:34 -0000
From: "petschy at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/64191] New: -march=native messes up dead code elimination in loop calling dtor
Date: Fri, 05 Dec 2014 11:03:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: petschy at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-12/txt/msg00539.txt.bz2
Content-length: 6269

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64191

            Bug ID: 64191
           Summary: -march=native messes up dead code elimination in loop
                    calling dtor
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: petschy at gmail dot com

Without -march=native, the loops in the 3 fns are eliminated as expected,
resulting in single retq's. With -march=native, the loop that calls the
defined but empty dtor is compiled into something rather weird.
However, the other empty Nop() call is optimized away as expected.

g++-5.0.0 -g -O3 -Wall -Wextra -c 20141205-dtor_loop.cpp
g++-5.0.0 -g -O3 -Wall -Wextra -o 20141205-dtor_loop 20141205-dtor_loop.o

Dump of assembler code for function foo_dtor_loop(Foo*, unsigned int):
   0x0000000000400570 <+0>:     repz retq

Dump of assembler code for function bar_dtor_loop(Bar*, unsigned int):
   0x0000000000400580 <+0>:     repz retq

Dump of assembler code for function bar_nop_loop(Bar*, unsigned int):
   0x0000000000400590 <+0>:     repz retq

So far so good.

g++-5.0.0 -g -O3 -march=native -Wall -Wextra -c 20141205-dtor_loop.cpp
g++-5.0.0 -g -O3 -march=native -Wall -Wextra -o 20141205-dtor_loop 20141205-dtor_loop.o

Dump of assembler code for function foo_dtor_loop(Foo*, unsigned int):
   0x0000000000400570 <+0>:     retq

Dump of assembler code for function bar_dtor_loop(Bar*, unsigned int):
   0x0000000000400578 <+0>:     test   %rdi,%rdi
   0x000000000040057b <+3>:     je     0x4005b8
   0x000000000040057d <+5>:     mov    %esi,%esi
   0x000000000040057f <+7>:     lea    (%rdi,%rsi,4),%rax
   0x0000000000400583 <+11>:    cmp    %rax,%rdi
   0x0000000000400586 <+14>:    jae    0x4005b8
   0x0000000000400588 <+16>:    mov    $0x3,%edx
   0x000000000040058d <+21>:    lea    -0x4(%rax),%rsi
   0x0000000000400591 <+25>:    sub    %rdi,%rdx
   0x0000000000400594 <+28>:    add    %rsi,%rdx
   0x0000000000400597 <+31>:    mov    %rdx,%rcx
   0x000000000040059a <+34>:    shr    $0x2,%rcx
   0x000000000040059e <+38>:    lea    0x1(%rcx),%r8
   0x00000000004005a2 <+42>:    dec    %rcx
   0x00000000004005a5 <+45>:    shr    %rcx
   0x00000000004005a8 <+48>:    lea    0x2(%rcx,%rcx,1),%rcx
   0x00000000004005ad <+53>:    cmp    $0x2f,%rdx
   0x00000000004005b1 <+57>:    jbe    0x4005b8
   0x00000000004005b3 <+59>:    cmp    %rcx,%r8
   0x00000000004005b6 <+62>:    je     0x4005b8
   0x00000000004005b8 <+64>:    retq

Dump of assembler code for function bar_nop_loop(Bar*, unsigned int):
   0x00000000004005c0 <+0>:     retq

The bar_dtor_loop() fn is clearly a mess; unfortunately, I can't follow the
computation.
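
For anyone else trying to read the bar_dtor_loop() listing above, here is a line-by-line transcription of its arithmetic into C++. It is a reading aid only: the names Residue and trace_bar_dtor_loop are made up for this sketch and come from neither GCC nor the test case, and it models the registers as plain 64-bit unsigned integers. Every jump in the listing targets 0x4005b8, which is the same retq the fall-through reaches, so none of the computed values is ever used.

```cpp
#include <cassert>
#include <cstdint>

// Register values the -march=native listing computes before returning.
// All names here are hypothetical and exist only for this sketch.
struct Residue {
    bool early_exit;    // true if the p == 0 or p >= e test returned first
    std::uint64_t rdx;  // e - p - 1, i.e. 4*n - 1
    std::uint64_t r8;   // (rdx >> 2) + 1, i.e. n
    std::uint64_t rcx;  // n rounded down to an even number
};

// p stands for the value in %rdi (the Bar* argument), n for %esi.
Residue trace_bar_dtor_loop(std::uint64_t p, std::uint32_t n)
{
    Residue r{true, 0, 0, 0};
    if (p == 0) return r;                        // test %rdi,%rdi ; je
    std::uint64_t e = p + std::uint64_t{n} * 4;  // mov %esi,%esi ; lea (%rdi,%rsi,4),%rax
    if (p >= e) return r;                        // cmp %rax,%rdi ; jae
    r.early_exit = false;
    r.rdx = 3 - p + (e - 4);                     // mov $0x3,%edx ; lea -0x4(%rax),%rsi ; sub ; add
    r.rcx = r.rdx >> 2;                          // mov %rdx,%rcx ; shr $0x2,%rcx
    r.r8  = r.rcx + 1;                           // lea 0x1(%rcx),%r8
    r.rcx = r.rcx - 1;                           // dec %rcx
    r.rcx >>= 1;                                 // shr %rcx
    r.rcx = 2 + r.rcx + r.rcx;                   // lea 0x2(%rcx,%rcx,1),%rcx
    // cmp $0x2f,%rdx ; jbe  and  cmp %rcx,%r8 ; je  follow in the listing,
    // but both jump to the retq that comes next anyway.
    return r;
}
```

With n = 13 this gives rdx = 51, r8 = 13 and rcx = 12. So r8 is the element count and rcx is that count rounded down to an even number, and the first compare is against 0x2f, i.e. n <= 12. That looks like trip-count bookkeeping left behind after the loop body itself was deleted, but that interpretation is a guess, not something the listing proves.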
bar_inc_loop() does a single int increment on each object, to show what loop
code is generated when non-empty fns are called. The result is as expected: the
loop is unrolled 16 times, and the residual part is executed in a tight loop:

   0x0000000000400648 <+120>:   sub    $0x4,%rdx
   0x000000000040064c <+124>:   incl   (%rdx)
   0x000000000040064e <+126>:   cmp    %rdx,%rdi
   0x0000000000400651 <+129>:   jb     0x400648

g++-5.0.0 -v
Using built-in specs.
COLLECT_GCC=g++-5.0.0
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --enable-languages=c,c++ --disable-multilib --program-suffix=-5.0.0
Thread model: posix
gcc version 5.0.0 20141027 (experimental) (GCC)

cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 1
model name      : AMD FX(tm)-8150 Eight-Core Processor
stepping        : 2
microcode       : 0x6000626
cpu MHz         : 1400.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 16
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs            : fxsave_leak
bogomips        : 7624.63
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb

Unfortunately, I couldn't test with the latest version since the build has been
failing with
../.././libcc1/findcomp.cc:20:20: fatal error: config.h: No such file or
directory
for a while now, even after deleting everything and doing a git reset --hard
HEAD.

----8<----8<----8<----
struct Foo
{
    int i;
};

void foo_dtor_loop(Foo* p, unsigned int n)
{
    if (p) {
        Foo* e = p + n;
        while (e > p) {
            --e;
            e->~Foo();
        }
    }
}

struct Bar
{
    int i;
    ~Bar() { }
    void Nop() { }
    void Inc() { ++i; }
};

void bar_dtor_loop(Bar* p, unsigned int n)
{
    if (p) {
        Bar* e = p + n;
        while (e > p) {
            --e;
            e->~Bar();
        }
    }
}

void bar_nop_loop(Bar* p, unsigned int n)
{
    if (p) {
        Bar* e = p + n;
        while (e > p) {
            --e;
            e->Nop();
        }
    }
}

void bar_inc_loop(Bar* p, unsigned int n)
{
    if (p) {
        Bar* e = p + n;
        while (e > p) {
            --e;
            e->Inc();
        }
    }
}

int main()
{
}
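
Since main() in the test case is empty, nothing above is ever executed. Below is a sketch of a small runtime check that could go alongside it, to confirm that a given build still behaves correctly (only Inc() has a visible effect) regardless of how odd the generated code looks. Bar and the three loops are copied from the test case; loops_behave() is a hypothetical helper added here and is not part of the original report. The objects are built in raw storage so that running bar_dtor_loop() on them at the end does not destroy anything twice.

```cpp
#include <cassert>
#include <new>

// Copied from the test case above.
struct Bar { int i; ~Bar() { } void Nop() { } void Inc() { ++i; } };

void bar_dtor_loop(Bar* p, unsigned int n)
{
    if (p) { Bar* e = p + n; while (e > p) { --e; e->~Bar(); } }
}

void bar_nop_loop(Bar* p, unsigned int n)
{
    if (p) { Bar* e = p + n; while (e > p) { --e; e->Nop(); } }
}

void bar_inc_loop(Bar* p, unsigned int n)
{
    if (p) { Bar* e = p + n; while (e > p) { --e; e->Inc(); } }
}

// Hypothetical helper, not part of the report: returns true if the loops
// have exactly the effects the source says they should have.
bool loops_behave(unsigned int n)
{
    // Raw storage, so the only destructor calls are the ones in bar_dtor_loop.
    void* raw = ::operator new(sizeof(Bar) * (n ? n : 1));
    Bar* p = static_cast<Bar*>(raw);
    for (unsigned int k = 0; k < n; ++k) {
        Bar* b = ::new (static_cast<void*>(p + k)) Bar;
        b->i = static_cast<int>(k);
    }

    bool ok = true;
    bar_inc_loop(p, n);                      // every i should become k + 1
    for (unsigned int k = 0; k < n; ++k)
        ok = ok && p[k].i == static_cast<int>(k) + 1;

    bar_nop_loop(p, n);                      // must leave every i unchanged
    for (unsigned int k = 0; k < n; ++k)
        ok = ok && p[k].i == static_cast<int>(k) + 1;

    bar_dtor_loop(p, n);                     // ends the lifetime of every Bar
    ::operator delete(raw);
    return ok;
}
```

Calling it as assert(loops_behave(37)); from main() exercises both the unrolled part of bar_inc_loop() and its residual loop; 37 is an arbitrary count chosen to be larger than, and not a multiple of, 16.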