public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts
@ 2017-04-24  9:17 Allan Sandfeld Jensen
From: Allan Sandfeld Jensen @ 2017-04-24  9:17 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Monday 24 April 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 10:34:58AM +0200, Allan Sandfeld Jensen wrote:
> > That is a different instruction. That is the vpsllw not vpsllwi
> > 
> > The intrinsic I changed is the immediate version; I didn't change the
> > non-immediate version. It is probably a bug if you can give
> > non-immediate values to the immediate-only intrinsic. At least both
> > versions handle it, if in different ways, but these are illegal arguments.
> 
> The documentation is unclear on that and I've only recently fixed up some
> cases where these intrinsics weren't able to handle non-constant arguments
> in some cases, while both ICC and clang coped with that fine.
> So it is clearly allowed and handled by all the compilers and needs to be
> supported, people use that in real-world code.
> 
Undoubtedly it happens. I just made a mistake myself that created that case.
But it is rather unfortunate, and it means we currently generate wrong code for
corner-case values.

Note the difference in definition between the two intrinsics:
_mm_sll_epi16:
FOR j := 0 to 7
	i := j*16
	IF count[63:0] > 15
		dst[i+15:i] := 0
	ELSE
		dst[i+15:i] := ZeroExtend(a[i+15:i] << count[63:0])
	FI
ENDFOR

_mm_slli_epi16:
FOR j := 0 to 7
	i := j*16
	IF imm8[7:0] > 15
		dst[i+15:i] := 0
	ELSE
		dst[i+15:i] := ZeroExtend(a[i+15:i] << imm8[7:0])
	FI
ENDFOR

For a value such as 257, the immediate version does a 1-bit shift (only the low
8 bits of the immediate are used, and 257 & 0xff == 1), while the non-immediate
version returns a zero vector. A simple function using the immediate intrinsic
therefore needs an if-statement if it is transformed to use the non-immediate
instruction.
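
As a rough illustration of the two definitions, one 16-bit lane of each can be
modeled in plain C. This is only a sketch that follows the pseudocode quoted
above, not the GCC implementation; the helper names sll_lane and slli_lane are
made up for this example.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane of the non-immediate form:
   the full 64-bit count is compared against 15. */
static uint16_t sll_lane(uint16_t a, uint64_t count)
{
    return count > 15 ? 0 : (uint16_t)((uint32_t)a << count);
}

/* Hypothetical scalar model of one lane of the immediate form:
   only imm8[7:0] is looked at, so larger values wrap modulo 256. */
static uint16_t slli_lane(uint16_t a, unsigned imm)
{
    unsigned imm8 = imm & 0xffu;
    return imm8 > 15 ? 0 : (uint16_t)((uint32_t)a << imm8);
}
```

Under this model, slli_lane(1, 257) gives 2 while sll_lane(1, 257) gives 0,
which is the corner case described above.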

`Allan


