public inbox for gcc-bugs@sourceware.org
* [Bug c++/64704] New: software crashed when using vectorizing optimization
@ 2015-01-21 2:37 zhangyajie_koy at 126 dot com
2015-01-21 2:44 ` [Bug c++/64704] " pinskia at gcc dot gnu.org
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: zhangyajie_koy at 126 dot com @ 2015-01-21 2:37 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
Bug ID: 64704
Summary: software crashed when using vectorizing optimization
Product: gcc
Version: 4.8.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: zhangyajie_koy at 126 dot com
When executing the following for() loop, the program crashes.
uint16 MessageBuffer::icmp6Checksum(int update)
{
    TRACE_FUNCTION_ENTRY("");
    register uint32 sum = 0xffff;
    struct icmp6_hdr *icmp6Ptr = NULL;
    uint8 type = findPayloadType((void**)&icmp6Ptr);
    register int i;
    uint16 len = getLength();
    register uint16 *ptr = (uint16 *)icmp6Ptr;
    for (i = 0; i < len - 1; i += 2)
    {
        sum += *ptr++;
    }
    return (sum);
}
This code runs fine when compiled with GCC 4.4.1 on Ubuntu 9.10. When compiled
with GCC 4.8.2 on Ubuntu 14.04, it crashes. I checked the assembly code of this
for() loop: with 4.8.2 it is optimized in two ways. First, loop unrolling: it
is pre-unrolled 10 times. Second, auto-vectorization.
After several tests, I found that when the actual number of iterations is less
than 10, it runs fine, while if it is greater than 10, it crashes. So something
must be wrong with the auto-vectorization.
When I modify the makefile to turn off auto-vectorization using
-O3 -fno-tree-vectorize, it works. The assembly code for the for() loop is
shown below.
for loop unrolling optimization begin:
13081bc: 45 8d 4d ff lea -0x1(%r13),%r9d
13081c0: 45 85 c9 test %r9d,%r9d
13081c3: 0f 8e 9e 02 00 00 jle 1308467
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3a7>
13081c9: 41 8d 75 fe lea -0x2(%r13),%esi
13081cd: 48 89 da mov %rbx,%rdx
13081d0: 83 e2 0f and $0xf,%edx
13081d3: d1 ee shr %esi
13081d5: 48 d1 ea shr %rdx
13081d8: 8d 7e 01 lea 0x1(%rsi),%edi
13081db: 48 f7 da neg %rdx
13081de: 83 e2 07 and $0x7,%edx
13081e1: 39 d7 cmp %edx,%edi
13081e3: 89 f9 mov %edi,%ecx
13081e5: 0f 46 d7 cmovbe %edi,%edx
13081e8: 83 ff 0a cmp $0xa,%edi
13081eb: 0f 87 0f 02 00 00 ja 1308400
<_ZN13MessageBuffer13icmp6ChecksumEi+0x340>
13081f1: 44 0f b7 03 movzwl (%rbx),%r8d
13081f5: 48 8d 53 02 lea 0x2(%rbx),%rdx
13081f9: 44 01 c0 add %r8d,%eax
13081fc: 83 f9 01 cmp $0x1,%ecx
13081ff: 0f 86 95 02 00 00 jbe 130849a
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3da>
1308205: 44 0f b7 43 02 movzwl 0x2(%rbx),%r8d
130820a: 48 8d 53 04 lea 0x4(%rbx),%rdx
130820e: 44 01 c0 add %r8d,%eax
1308211: 83 f9 02 cmp $0x2,%ecx
1308214: 0f 86 75 02 00 00 jbe 130848f
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3cf>
130821a: 44 0f b7 43 04 movzwl 0x4(%rbx),%r8d
130821f: 48 8d 53 06 lea 0x6(%rbx),%rdx
1308223: 44 01 c0 add %r8d,%eax
1308226: 83 f9 03 cmp $0x3,%ecx
1308229: 0f 86 97 02 00 00 jbe 13084c6
<_ZN13MessageBuffer13icmp6ChecksumEi+0x406>
130822f: 44 0f b7 43 06 movzwl 0x6(%rbx),%r8d
1308234: 48 8d 53 08 lea 0x8(%rbx),%rdx
1308238: 44 01 c0 add %r8d,%eax
130823b: 83 f9 04 cmp $0x4,%ecx
130823e: 0f 86 77 02 00 00 jbe 13084bb
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3fb>
1308244: 44 0f b7 43 08 movzwl 0x8(%rbx),%r8d
1308249: 48 8d 53 0a lea 0xa(%rbx),%rdx
130824d: 44 01 c0 add %r8d,%eax
1308250: 83 f9 05 cmp $0x5,%ecx
1308253: 0f 86 57 02 00 00 jbe 13084b0
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3f0>
1308259: 44 0f b7 43 0a movzwl 0xa(%rbx),%r8d
130825e: 48 8d 53 0c lea 0xc(%rbx),%rdx
1308262: 44 01 c0 add %r8d,%eax
1308265: 83 f9 06 cmp $0x6,%ecx
1308268: 0f 86 37 02 00 00 jbe 13084a5
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3e5>
130826e: 44 0f b7 43 0c movzwl 0xc(%rbx),%r8d
1308273: 48 8d 53 0e lea 0xe(%rbx),%rdx
1308277: 44 01 c0 add %r8d,%eax
130827a: 83 f9 07 cmp $0x7,%ecx
130827d: 0f 86 f6 01 00 00 jbe 1308479
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3b9>
1308283: 44 0f b7 43 0e movzwl 0xe(%rbx),%r8d
1308288: 48 8d 53 10 lea 0x10(%rbx),%rdx
130828c: 44 01 c0 add %r8d,%eax
130828f: 83 f9 08 cmp $0x8,%ecx
1308292: 0f 86 d6 01 00 00 jbe 130846e
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3ae>
1308298: 44 0f b7 43 10 movzwl 0x10(%rbx),%r8d
130829d: 48 8d 53 12 lea 0x12(%rbx),%rdx
13082a1: 44 01 c0 add %r8d,%eax
13082a4: 83 f9 09 cmp $0x9,%ecx
13082a7: 0f 86 d7 01 00 00 jbe 1308484
<_ZN13MessageBuffer13icmp6ChecksumEi+0x3c4>
13082ad: 44 0f b7 43 12 movzwl 0x12(%rbx),%r8d
13082b2: 48 8d 53 14 lea 0x14(%rbx),%rdx
13082b6: 44 01 c0 add %r8d,%eax
13082b9: 41 b8 14 00 00 00 mov $0x14,%r8d
13082bf: 39 f9 cmp %edi,%ecx
13082c1: 0f 84 e2 00 00 00 je 13083a9
<_ZN13MessageBuffer13icmp6ChecksumEi+0x2e9>
for loop unrolling optimization end:
for loop auto vectorizing optimization begin:
13082c7: 41 89 fe mov %edi,%r14d
13082ca: 41 89 ca mov %ecx,%r10d
13082cd: 41 29 ce sub %ecx,%r14d
13082d0: 44 89 f5 mov %r14d,%ebp
13082d3: c1 ed 03 shr $0x3,%ebp
13082d6: 8d 0c ed 00 00 00 00 lea 0x0(,%rbp,8),%ecx
13082dd: 85 c9 test %ecx,%ecx
13082df: 74 69 je 130834a
<_ZN13MessageBuffer13icmp6ChecksumEi+0x28a>
13082e1: 66 0f ef c0 pxor %xmm0,%xmm0
13082e5: 4e 8d 1c 53 lea (%rbx,%r10,2),%r11
13082e9: 66 0f ef d2 pxor %xmm2,%xmm2
13082ed: 45 31 d2 xor %r10d,%r10d
13082f0: 66 41 0f 6f 0b movdqa (%r11),%xmm1
13082f5: 41 83 c2 01 add $0x1,%r10d
13082f9: 49 83 c3 10 add $0x10,%r11
13082fd: 44 39 d5 cmp %r10d,%ebp
1308300: 66 0f 6f e1 movdqa %xmm1,%xmm4
1308304: 66 0f 69 ca punpckhwd %xmm2,%xmm1
1308308: 66 0f 61 e2 punpcklwd %xmm2,%xmm4
130830c: 66 0f fe c4 paddd %xmm4,%xmm0
1308310: 66 0f fe c1 paddd %xmm1,%xmm0
1308314: 77 da ja 13082f0
<_ZN13MessageBuffer13icmp6ChecksumEi+0x230>
1308316: 66 0f 6f e8 movdqa %xmm0,%xmm5
130831a: 41 89 ca mov %ecx,%r10d
130831d: 45 8d 04 48 lea (%r8,%rcx,2),%r8d
1308321: 4a 8d 14 52 lea (%rdx,%r10,2),%rdx
1308325: 66 0f 73 dd 08 psrldq $0x8,%xmm5
130832a: 66 0f fe c5 paddd %xmm5,%xmm0
130832e: 66 0f 6f f0 movdqa %xmm0,%xmm6
1308332: 66 0f 73 de 04 psrldq $0x4,%xmm6
1308337: 66 0f fe c6 paddd %xmm6,%xmm0
130833b: 66 0f 7e 44 24 0c movd %xmm0,0xc(%rsp)
1308341: 03 44 24 0c add 0xc(%rsp),%eax
1308345: 41 39 ce cmp %ecx,%r14d
1308348: 74 5f je 13083a9
<_ZN13MessageBuffer13icmp6ChecksumEi+0x2e9>
130834a: 0f b7 0a movzwl (%rdx),%ecx
130834d: 01 c8 add %ecx,%eax
130834f: 41 8d 48 02 lea 0x2(%r8),%ecx
1308353: 44 39 c9 cmp %r9d,%ecx
1308356: 7d 51 jge 13083a9
<_ZN13MessageBuffer13icmp6ChecksumEi+0x2e9>
1308358: 0f b7 4a 02 movzwl 0x2(%rdx),%ecx
130835c: 01 c8 add %ecx,%eax
130835e: 41 8d 48 04 lea 0x4(%r8),%ecx
1308362: 41 39 c9 cmp %ecx,%r9d
1308365: 7e 42 jle 13083a9
<_ZN13MessageBuffer13icmp6ChecksumEi+0x2e9>
1308367: 0f b7 4a 04 movzwl 0x4(%rdx),%ecx
130836b: 01 c8 add %ecx,%eax
130836d: 41 8d 48 06 lea 0x6(%r8),%ecx
1308371: 41 39 c9 cmp %ecx,%r9d
1308374: 7e 33 jle 13083a9
<_ZN13MessageBuffer13icmp6ChecksumEi+0x2e9>
1308376: 0f b7 4a 06 movzwl 0x6(%rdx),%ecx
130837a: 01 c8 add %ecx,%eax
130837c: 41 8d 48 08 lea 0x8(%r8),%ecx
1308380: 41 39 c9 cmp %ecx,%r9d
1308383: 7e 24 jle 13083a9
<_ZN13MessageBuffer13icmp6ChecksumEi+0x2e9>
1308385: 0f b7 4a 08 movzwl 0x8(%rdx),%ecx
1308389: 01 c8 add %ecx,%eax
130838b: 41 8d 48 0a lea 0xa(%r8),%ecx
130838f: 41 39 c9 cmp %ecx,%r9d
1308392: 7e 15 jle 13083a9
<_ZN13MessageBuffer13icmp6ChecksumEi+0x2e9>
1308394: 0f b7 4a 0a movzwl 0xa(%rdx),%ecx
1308398: 41 83 c0 0c add $0xc,%r8d
130839c: 01 c8 add %ecx,%eax
130839e: 45 39 c1 cmp %r8d,%r9d
13083a1: 7e 06 jle 13083a9
<_ZN13MessageBuffer13icmp6ChecksumEi+0x2e9>
13083a3: 0f b7 52 0c movzwl 0xc(%rdx),%edx
13083a7: 01 d0 add %edx,%eax
13083a9: 48 8d 5c 73 02 lea 0x2(%rbx,%rsi,2),%rbx
13083ae: 01 ff add %edi,%edi
for loop auto vectorizing optimization end:
Our CPU info is:
Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
I have 2 CPUs with 8 cores each.
* [Bug c++/64704] software crashed when using vectorizing optimization
From: zhangyajie_koy at 126 dot com @ 2015-01-21 2:39 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
kathy <zhangyajie_koy at 126 dot com> changed:
What     |Removed      |Added
----------------------------------------------------------------------------
Severity |normal       |major
--- Comment #1 from kathy <zhangyajie_koy at 126 dot com> ---
Is this a GCC bug? I look forward to your response. Sincerely,
* [Bug c++/64704] software crashed when using vectorizing optimization
From: pinskia at gcc dot gnu.org @ 2015-01-21 2:44 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Most likely because icmp6Ptr is not aligned to 16 bits like you say it is by
doing:
register uint16 *ptr = (uint16 *)icmp6Ptr;
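A minimal sketch of how the alignment in question could be checked before the cast. The helper name isAlignedForU16 is hypothetical and is not part of the reported code; it only tests whether an address is a multiple of the alignment that uint16_t requires.

```cpp
#include <cstdint>

// Hypothetical helper (not from the report): returns true if p may be
// dereferenced as a uint16_t without violating its alignment requirement.
static bool isAlignedForU16(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(std::uint16_t) == 0;
}
```

Calling such a check on the pointer returned through findPayloadType, just before the loop, would show whether the cast to uint16 * is valid for a given packet.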
* [Bug c++/64704] software crashed when using vectorizing optimization
From: zhangyajie_koy at 126 dot com @ 2015-01-21 3:03 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
--- Comment #3 from kathy <zhangyajie_koy at 126 dot com> ---
(In reply to Andrew Pinski from comment #2)
> Most likely because icmp6Ptr is not aligned to 16 bits like you say it is by
> doing:
> register uint16 *ptr = (uint16 *)icmp6Ptr;
I do not fully understand the assembly code, but it seems that if the pointer
is not aligned to 16 bits, the compiler does something to make ptr aligned
to 16 and then executes the vectorized loop.
I think the code from address 13082c7 to 1308348 is doing something with
alignment?
* [Bug c++/64704] software crashed when using vectorizing optimization
From: jakub at gcc dot gnu.org @ 2015-01-21 8:40 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What     |Removed      |Added
----------------------------------------------------------------------------
CC       |             |jakub at gcc dot gnu.org
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
If the pointer is not 16-bit aligned and you dereference it, you invoke
undefined behavior and anything can happen.
* [Bug c++/64704] software crashed when using vectorizing optimization
From: maltsevm at gmail dot com @ 2015-01-21 20:11 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
Mikhail Maltsev <maltsevm at gmail dot com> changed:
What     |Removed      |Added
----------------------------------------------------------------------------
CC       |             |maltsevm at gmail dot com
--- Comment #5 from Mikhail Maltsev <maltsevm at gmail dot com> ---
(In reply to kathy from comment #3)
> I think the code from address 13082c7 to 1308348 is doing something with
> alignment?
Yes
13082f0: 66 41 0f 6f 0b movdqa (%r11),%xmm1
The address in %r11 is expected to be aligned on a 16-byte boundary.
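The and $0xf, shr, neg, and $0x7 sequence applied to %rdx near 13081d0 looks like the usual prologue computation: the number of 2-byte elements to process one at a time before the address reaches a 16-byte boundary. The following is a sketch of that arithmetic under this reading of the disassembly; peelCount and alignedAfterPeel are hypothetical names, not anything emitted by GCC.

```cpp
#include <cstdint>

// Hypothetical model of the prologue arithmetic: elements are 2 bytes wide
// and one vector holds 8 of them (16 bytes).
static unsigned peelCount(std::uintptr_t addr)
{
    std::uintptr_t halfwords = (addr & 0xf) >> 1;       // and $0xf ; shr
    return static_cast<unsigned>((0 - halfwords) & 7);  // neg ; and $0x7
}

// True if stepping over peelCount(addr) elements lands on a 16-byte boundary.
static bool alignedAfterPeel(std::uintptr_t addr)
{
    return (addr + 2u * peelCount(addr)) % 16 == 0;
}
```

For any even start address this reaches a 16-byte boundary, but for an odd address it cannot, since every step advances by 2 bytes. In that case movdqa, which faults on a misaligned memory operand, would see an address that is still odd.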
* [Bug c++/64704] software crashed when using vectorizing optimization
From: zhangyajie_koy at 126 dot com @ 2015-01-23 1:22 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
--- Comment #6 from kathy <zhangyajie_koy at 126 dot com> ---
(In reply to Mikhail Maltsev from comment #5)
> (In reply to kathy from comment #3)
> > I think the code from address 13082c7 to 1308348 is doing something with
> > alignment?
> Yes
> 13082f0: 66 41 0f 6f 0b movdqa (%r11),%xmm1
> The address in %r11 is expected to be aligned on a 16-byte boundary.
What can I do to make ptr 16-byte aligned?
* [Bug c++/64704] software crashed when using vectorizing optimization
From: zhangyajie_koy at 126 dot com @ 2015-01-23 1:28 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
--- Comment #7 from kathy <zhangyajie_koy at 126 dot com> ---
(In reply to Mikhail Maltsev from comment #5)
> (In reply to kathy from comment #3)
> > I think the code from address 13082c7 to 1308348 is doing something with
> > alignment?
> Yes
> 13082f0: 66 41 0f 6f 0b movdqa (%r11),%xmm1
> The address in %r11 is expected to be aligned on a 16-byte boundary.
What can I do to make ptr 16-byte aligned?
* [Bug c++/64704] software crashed when using vectorizing optimization
From: zhangyajie_koy at 126 dot com @ 2015-01-23 7:04 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
--- Comment #8 from kathy <zhangyajie_koy at 126 dot com> ---
(In reply to Mikhail Maltsev from comment #5)
> (In reply to kathy from comment #3)
> > I think the code from address 13082c7 to 1308348 is doing something with
> > alignment?
> Yes
> 13082f0: 66 41 0f 6f 0b movdqa (%r11),%xmm1
> The address in %r11 is expected to be aligned on a 16-byte boundary.
I have heard that 16-byte alignment is not necessary on x86.
* [Bug c++/64704] software crashed when using vectorizing optimization
From: maltsevm at gmail dot com @ 2015-01-24 3:31 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
--- Comment #9 from Mikhail Maltsev <maltsevm at gmail dot com> ---
> What can I do to make ptr 16-byte aligned?
Well, you may skip the first few bytes (of course, not just discard them, but
process them one by one).
Fortunately, you don't need to do it manually; it can be done by the compiler.
The problem is that when you use a pointer to uint16, GCC assumes that it is
already aligned on a 2-byte boundary (if that is not true, the behavior is
undefined). Consider this program:
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <linux/icmpv6.h>

typedef uint8_t uint8;
typedef uint16_t uint16;
typedef uint32_t uint32;

uint8 buf[1024] = { 0xFF, 0x01, 0x00, 0x02, 0x00 };

class MessageBuffer
{
public:
    MessageBuffer(uint8 *data, uint16 len) :
        data_(data), len_(len) { }
    uint16 getLength() { return len_ - 1; }
    uint16 __attribute__((noinline)) icmp6Checksum_ub (int update);
    uint16 __attribute__((noinline)) icmp6Checksum_naive (int update);
    uint8 __attribute__((noinline)) findPayloadType (void **payloadStart)
    {
        uint8 *p;
        /* p = data_ + 1, written as inline asm so that GCC will not use this
           information during tree optimization. %0 is the output (p) and
           %1 is the input (data_). */
        asm volatile ("leaq 1(%1), %0" : "=r"(p) : "r"(data_) : );
        *payloadStart = p;
        return ICMPV6_ECHO_REQUEST;
    }
private:
    uint8 *data_;
    uint16 len_;
};

/* Dereferences an odd address as uint16: undefined behavior. */
uint16 MessageBuffer::icmp6Checksum_ub(int)
{
    register uint32 sum = 0xffff;
    struct icmp6_hdr *icmp6Ptr = NULL;
    uint8 type = findPayloadType((void**)&icmp6Ptr);
    (void)type; /* inhibit warning */
    register int i;
    uint16 len = getLength();
    register uint16 *ptr = (uint16 *)icmp6Ptr;
    for (i = 0; i < len - 1; i += 2) {
        sum += *ptr++;
    }
    return (sum);
}

/* Assembles each 16-bit word from two byte loads: no alignment assumption. */
uint16 MessageBuffer::icmp6Checksum_naive(int)
{
    register uint32 sum = 0xffff;
    uint8 *data;
    findPayloadType((void**)&data);
    uint16 len = getLength();
    for (int i = 0; i < len - 1; i += 2) {
        sum += data[i] | (data[i + 1] << 8);
    }
    return (sum);
}

int main()
{
    MessageBuffer buffer(buf, 1000);
    printf("0x%.4x\n", buffer.icmp6Checksum_naive(0));
    printf("0x%.4x\n", buffer.icmp6Checksum_ub(0));
}
icmp6Checksum_naive calculates the checksum (I hope, at least), and
icmp6Checksum_ub causes a segfault (I tried with g++ -O3 -funroll-loops -msse2,
GCC 4.8.2).
> I have heard that 16-byte alignment is not necessary on x86.
Maybe you are confusing movdqa with movdqu (or some other instruction)?
Here is a generic implementation from the Linux kernel (there are also
platform-specific versions):
http://lxr.free-electrons.com/source/lib/checksum.c
Notice that the case where the address is odd is handled separately (especially
in platform-specific code).
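As a sketch of letting the compiler deal with alignment, the 16-bit loads can be expressed with memcpy so that no misaligned uint16 pointer is ever formed. This is an illustration only, not the kernel's implementation and not the reporter's code; loadU16 and sumWords are hypothetical names, and words are read in host byte order, as the original *ptr++ loop does.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a 16-bit value in host byte order from any
// address, aligned or not.
static std::uint16_t loadU16(const std::uint8_t *p)
{
    std::uint16_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Sums 16-bit words over len bytes starting at data, beginning from 0xffff
// like the loop in the report. A trailing odd byte is ignored, as in the
// original loop condition i < len - 1.
static std::uint32_t sumWords(const std::uint8_t *data, std::size_t len)
{
    std::uint32_t sum = 0xffff;
    for (std::size_t i = 0; i + 1 < len; i += 2)
        sum += loadU16(data + i);
    return sum;
}
```

Because the loop only ever reads through a byte pointer, the compiler is free to vectorize it but has to pick loads that tolerate any alignment.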
* [Bug c++/64704] software crashed when using vectorizing optimization
From: mpolacek at gcc dot gnu.org @ 2015-04-08 9:14 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64704
Marek Polacek <mpolacek at gcc dot gnu.org> changed:
What       |Removed      |Added
----------------------------------------------------------------------------
Status     |UNCONFIRMED  |RESOLVED
CC         |             |mpolacek at gcc dot gnu.org
Resolution |---          |INVALID
--- Comment #10 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
Doesn't look like a GCC bug.