public inbox for gcc-bugs@sourceware.org
* [Bug c/63678] New: __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit)
@ 2014-10-29 15:18 peter.bumbulis at ianywhere dot com
2014-10-29 18:01 ` [Bug c/63678] " jakub at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: peter.bumbulis at ianywhere dot com @ 2014-10-29 15:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
Bug ID: 63678
Summary: __mm256_blend_epi16 only accepts 8-bit masks (should
accept 16-bit)
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: peter.bumbulis at ianywhere dot com
Created attachment 33844
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33844&action=edit
.i file for repro.
__mm256_blend_epi16 only accepts 8-bit masks as the 3rd parameter, not 16. The
Intel and Microsoft compilers handle this properly.
$ gcc -c -mavx2 -save-temps foo.c
foo.c: In function ‘blend’:
foo.c:4:46: error: the last argument must be an 8-bit immediate
return _mm256_blend_epi16(a, b, 0xabcd);
^
where foo.c is
#include <immintrin.h>
__m256i blend(__m256i a, __m256i b) {
return _mm256_blend_epi16(a, b, 0xabcd);
}
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.2-19ubuntu1'
--with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs
--enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.8 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls
--with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap
--enable-plugin --with-system-zlib --disable-browser-plugin
--enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
From: "marxin at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/63587] [5 Regression] ICE : tree check: expected var_decl, have result_decl in add_local_variables, at tree-inline.c:4112
Date: Wed, 29 Oct 2014 15:19:00 -0000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63587
--- Comment #14 from Martin Liška <marxin at gcc dot gnu.org> ---
Author: marxin
Date: Wed Oct 29 15:17:42 2014
New Revision: 216841
URL: https://gcc.gnu.org/viewcvs?rev=216841&root=gcc&view=rev
Log:
PR ipa/63587
* g++.dg/ipa/pr63587-1.C: New test
* g++.dg/ipa/pr63587-2.C: New test.
* cgraphunit.c (cgraph_node::expand_thunk): Only VAR_DECLs are put
to local declarations.
* function.c (add_local_decl): Implementation moved from header
file, assert introduced for tree type.
* function.h: Likewise.
Added:
trunk/gcc/testsuite/g++.dg/ipa/pr63587-1.C
trunk/gcc/testsuite/g++.dg/ipa/pr63587-2.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cgraphunit.c
trunk/gcc/function.c
trunk/gcc/function.h
trunk/gcc/testsuite/ChangeLog
From: "marxin at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/63587] [5 Regression] ICE : tree check: expected var_decl, have result_decl in add_local_variables, at tree-inline.c:4112
Date: Wed, 29 Oct 2014 15:44:00 -0000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63587
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |FIXED
--- Comment #15 from Martin Liška <marxin at gcc dot gnu.org> ---
Resolved.
From: "belagod at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/63679] New: [4.9 Regression][AArch64] Failure to constant fold.
Date: Wed, 29 Oct 2014 16:40:00 -0000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679
Bug ID: 63679
Summary: [4.9 Regression][AArch64] Failure to constant fold.
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: belagod at gcc dot gnu.org
When this piece of code is compiled with -O3 -mgeneral-regs-only
int __attribute__ ((noinline))
foo ()
{
const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
int i, sum;
sum = 0;
for (i = 0; i < sizeof (a) / sizeof (*a); i++)
sum += a[i];
return sum;
}
4.9 gcc generates:
foo:
sub sp, sp, #32
mov w0, 28
add sp, sp, 32
ret
.size foo, .-foo
.ident "GCC: (unknown) 4.9.2 20140930 (prerelease)"
5.0 generates:
foo:
adrp x0, .LANCHOR0
sub sp, sp, #32
add x0, x0, :lo12:.LANCHOR0
ldr x7, [x0]
ldr x6, [x0, 16]
ldr x1, [x0, 8]
sbfx x5, x7, 32, 32
ldr x0, [x0, 24]
add w2, w6, w7
str x0, [sp, 24]
mov x4, x1
str x1, [sp, 8]
sbfx x1, x6, 32, 32
ldr x3, [sp, 24]
add w1, w1, w5
add w1, w1, w2
str x7, [sp]
add w0, w3, w4
sbfx x4, x4, 32, 32
sbfx x3, x3, 32, 32
add w0, w0, w1
add w3, w4, w3
str x6, [sp, 16]
add w0, w3, w0
add sp, sp, 32
ret
.size foo, .-foo
.section .rodata
.align 3
.LANCHOR0 = . + 0
.LC0:
.word 0
.word 1
.word 2
.word 3
.word 4
.word 5
.word 6
.word 7
.ident "GCC: (unknown) 5.0.0 20141023 (experimental)"
Constant-folding seems to have got a bit messed up. I've observed this only on
aarch64-none-elf-gcc. 5.0 x86_64 seems to work fine.
foo:
.LFB0:
.cfi_startproc
movl $28, %eax
ret
.cfi_endproc
.LFE0:
.size foo, .-foo
.section .text.unlikely
.LCOLDE0:
.text
.LHOTE0:
.ident "GCC: (GNU) 5.0.0 20141023 (experimental)"
.section .note.GNU-stack,"",@progbits
Looks like an aarch64-specific backend issue.
$ aarch64-none-elf-gcc -v
Target: aarch64-none-elf
Configured with: /work/dev/arm/src/gcc/configure --target=aarch64-none-elf
--prefix=/work/dev/arm/bin//install --with-gmp=/work/dev/arm/bin//host-tools
--with-mpfr=/work/dev/arm/bin//host-tools
--with-mpc=/work/dev/arm/bin//host-tools
--with-cloog=/work/dev/arm/bin//host-tools
--with-isl=/work/dev/arm/bin//host-tools --with-pkgversion=unknown
--disable-shared --disable-nls --disable-threads --disable-tls
--enable-checking=yes --enable-languages=c,c++ --with-newlib
Thread model: single
gcc version 5.0.0 20141023 (experimental) (unknown)
* [Bug c/63678] __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit)
2014-10-29 15:18 [Bug c/63678] New: __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit) peter.bumbulis at ianywhere dot com
@ 2014-10-29 18:01 ` jakub at gcc dot gnu.org
2014-10-29 18:22 ` [Bug target/63678] " peter.bumbulis at ianywhere dot com
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-10-29 18:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org,
| |kyukhin at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
What does "properly" mean? The underlying instruction (256-bit VPBLENDW) certainly
accepts only an 8-bit mask, and e.g.
https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-5369B2B5-B1E1-4D96-85AB-2019982667B4.htm
says nothing about what the upper bits would mean; it also says that the mask is
an 8-bit immediate. Perhaps icc just doesn't diagnose incorrect masks?
Or do you see that for 16-bit masks _mm256_blend_epi16 would actually emit more
than one insn (say, separate blends with the low 8-bit mask and the high 8-bit
mask, followed by a blend of the two results)?
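A rough sketch of what such a multi-instruction expansion could look like with
AVX2 intrinsics. This is only an assumption about one possible emulation, not
something GCC or icc is known to emit. blend16_abcd is a hypothetical helper,
the mask 0xabcd is hard-coded because the immediates must be compile-time
constants, and the target attribute is a GCC/Clang extension used here so the
file builds without -mavx2.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC or Intel API): blend 16 x 16-bit elements
   of a and b under the 16-bit mask 0xabcd, where bit i selects b[i]. */
__attribute__((target("avx2")))
void blend16_abcd(const uint16_t a[16], const uint16_t b[16], uint16_t dst[16])
{
    assert(a != NULL && b != NULL && dst != NULL);

    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);

    /* VPBLENDW applies the same imm8 to both 128-bit lanes, so blend twice:
       once with the low mask byte, once with the high mask byte. */
    __m256i lo = _mm256_blend_epi16(va, vb, 0xcd);
    __m256i hi = _mm256_blend_epi16(va, vb, 0xab);

    /* Keep the low lane of lo (dwords 0-3) and the high lane of hi
       (dwords 4-7); VPBLENDD selects per 32-bit element. */
    __m256i r = _mm256_blend_epi32(lo, hi, 0xf0);

    _mm256_storeu_si256((__m256i *)dst, r);
}
```

The caller still has to make sure the CPU supports AVX2 before calling this,
for example with __builtin_cpu_supports("avx2") on GCC.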
* [Bug target/63678] __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit)
2014-10-29 15:18 [Bug c/63678] New: __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit) peter.bumbulis at ianywhere dot com
2014-10-29 18:01 ` [Bug c/63678] " jakub at gcc dot gnu.org
@ 2014-10-29 18:22 ` peter.bumbulis at ianywhere dot com
2014-10-29 18:34 ` jakub at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: peter.bumbulis at ianywhere dot com @ 2014-10-29 18:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
--- Comment #2 from Peter Bumbulis <peter.bumbulis at ianywhere dot com> ---
The referenced web page is incorrect. Look in the instruction set reference
manual
(https://software.intel.com/sites/default/files/managed/c6/a9/319433-020.pdf,
search for VPBLENDMW) or the intrinsics guide
(https://software.intel.com/sites/landingpage/IntrinsicsGuide/).
These instructions blend 16 bit quantities: you can fit 16 of these in a 256
bit register. For AVX512 it's a 32-bit constant.
* [Bug target/63678] __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit)
2014-10-29 15:18 [Bug c/63678] New: __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit) peter.bumbulis at ianywhere dot com
2014-10-29 18:01 ` [Bug c/63678] " jakub at gcc dot gnu.org
2014-10-29 18:22 ` [Bug target/63678] " peter.bumbulis at ianywhere dot com
@ 2014-10-29 18:34 ` jakub at gcc dot gnu.org
2014-10-29 18:37 ` peter.bumbulis at ianywhere dot com
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-10-29 18:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Peter Bumbulis from comment #2)
> The referenced web page is incorrect. Look in the instruction set reference
> manual
> (https://software.intel.com/sites/default/files/managed/c6/a9/319433-020.pdf,
> search for VPBLENDMW) or the intrinsics guide
> (https://software.intel.com/sites/landingpage/IntrinsicsGuide/).
>
> These instructions blend 16 bit quantities: you can fit 16 of these in a
> 256 bit register. For AVX512 it's a 32-bit constant.
Your first reference is the AVX512 documentation; _mm256_blend_epi16 is not
_mm256_mask_blend_epi16. _mm256_blend_epi16 maps to the VPBLENDW instruction, and
https://software.intel.com/sites/landingpage/IntrinsicsGuide/ looks incorrect,
because it doesn't describe what the VPBLENDW instruction does. In particular,
the instruction only takes an 8-bit immediate, and both 128-bit lanes are
blended the same way using that mask:
IF (imm8[0] == 1) THEN DEST[15:0] <- SRC2[15:0]
ELSE DEST[15:0] <- SRC1[15:0]
IF (imm8[1] == 1) THEN DEST[31:16] <- SRC2[31:16]
ELSE DEST[31:16] <- SRC1[31:16]
IF (imm8[2] == 1) THEN DEST[47:32] <- SRC2[47:32]
ELSE DEST[47:32] <- SRC1[47:32]
IF (imm8[3] == 1) THEN DEST[63:48] <- SRC2[63:48]
ELSE DEST[63:48] <- SRC1[63:48]
IF (imm8[4] == 1) THEN DEST[79:64] <- SRC2[79:64]
ELSE DEST[79:64] <- SRC1[79:64]
IF (imm8[5] == 1) THEN DEST[95:80] <- SRC2[95:80]
ELSE DEST[95:80] <- SRC1[95:80]
IF (imm8[6] == 1) THEN DEST[111:96] <- SRC2[111:96]
ELSE DEST[111:96] <- SRC1[111:96]
IF (imm8[7] == 1) THEN DEST[127:112] <- SRC2[127:112]
ELSE DEST[127:112] <- SRC1[127:112]
IF (imm8[0] == 1) THEN DEST[143:128] <- SRC2[143:128]
ELSE DEST[143:128] <- SRC1[143:128]
IF (imm8[1] == 1) THEN DEST[159:144] <- SRC2[159:144]
ELSE DEST[159:144] <- SRC1[159:144]
IF (imm8[2] == 1) THEN DEST[175:160] <- SRC2[175:160]
ELSE DEST[175:160] <- SRC1[175:160]
IF (imm8[3] == 1) THEN DEST[191:176] <- SRC2[191:176]
ELSE DEST[191:176] <- SRC1[191:176]
IF (imm8[4] == 1) THEN DEST[207:192] <- SRC2[207:192]
ELSE DEST[207:192] <- SRC1[207:192]
IF (imm8[5] == 1) THEN DEST[223:208] <- SRC2[223:208]
ELSE DEST[223:208] <- SRC1[223:208]
IF (imm8[6] == 1) THEN DEST[239:224] <- SRC2[239:224]
ELSE DEST[239:224] <- SRC1[239:224]
IF (imm8[7] == 1) THEN DEST[255:240] <- SRC2[255:240]
ELSE DEST[255:240] <- SRC1[255:240]
* [Bug target/63678] __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit)
2014-10-29 15:18 [Bug c/63678] New: __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit) peter.bumbulis at ianywhere dot com
` (2 preceding siblings ...)
2014-10-29 18:34 ` jakub at gcc dot gnu.org
@ 2014-10-29 18:37 ` peter.bumbulis at ianywhere dot com
2014-10-29 18:38 ` jakub at gcc dot gnu.org
2014-10-29 19:19 ` jakub at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: peter.bumbulis at ianywhere dot com @ 2014-10-29 18:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
--- Comment #4 from Peter Bumbulis <peter.bumbulis at ianywhere dot com> ---
(In reply to Peter Bumbulis from comment #2)
> The referenced web page is incorrect. Look in the instruction set reference
> manual
> (https://software.intel.com/sites/default/files/managed/c6/a9/319433-020.pdf,
> search for VPBLENDMW) or the intrinsics guide
> (https://software.intel.com/sites/landingpage/IntrinsicsGuide/).
>
> These instructions blend 16 bit quantities: you can fit 16 of these in a
> 256 bit register. For AVX512 it's a 32-bit constant.
My mistake: it looks like the generated code only uses the low 8 bits of the
mask. Sorry for any wasted bandwidth.
* [Bug target/63678] __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit)
2014-10-29 15:18 [Bug c/63678] New: __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit) peter.bumbulis at ianywhere dot com
` (3 preceding siblings ...)
2014-10-29 18:37 ` peter.bumbulis at ianywhere dot com
@ 2014-10-29 18:38 ` jakub at gcc dot gnu.org
2014-10-29 19:19 ` jakub at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-10-29 18:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Trying icc 14.0.2.144 Build 2014012, I see that
a) it indeed fails to report the bug in your source
b) when using -c, it silently discards the upper 8 bits of the immediate, so
you end up with:
0: c4 e3 7d 0e c1 cd vpblendw $0xcd,%ymm1,%ymm0,%ymm0
c) when using -S, it generates invalid assembly:
vpblendw $43981, %ymm1, %ymm0, %ymm0 #4.16
which doesn't assemble at least with gas.
So, I believe erroring out on this is significantly better than what icc does
with it.
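The arithmetic behind b) and c) can be checked with a small C sketch: 0xabcd
is 43981 in decimal, and keeping only its low 8 bits gives 0xcd. The helper
names below are hypothetical, and the range check is a simplified model that
assumes an unsigned 8-bit immediate. GCC's actual check may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when value is representable as an unsigned 8-bit immediate
   (simplified model, assumed for illustration). */
bool fits_imm8(long value)
{
    return value >= 0 && value <= 0xff;
}

/* Models the silent truncation described in b): keep only bits 7..0. */
uint8_t truncate_to_imm8(long value)
{
    return (uint8_t)(value & 0xff);
}

/* Models the stricter behaviour: refuse anything that does not fit. */
uint8_t checked_imm8(long value)
{
    assert(fits_imm8(value));
    return (uint8_t)value;
}
```

Here truncate_to_imm8(0xabcd) yields 0xcd, matching the encoded instruction in
b), while fits_imm8(0xabcd) is false, which corresponds to rejecting the call.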
* [Bug target/63678] __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit)
2014-10-29 15:18 [Bug c/63678] New: __mm256_blend_epi16 only accepts 8-bit masks (should accept 16-bit) peter.bumbulis at ianywhere dot com
` (4 preceding siblings ...)
2014-10-29 18:38 ` jakub at gcc dot gnu.org
@ 2014-10-29 19:19 ` jakub at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-10-29 19:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |INVALID
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
.