public inbox for gcc@gcc.gnu.org
* autovectorization in gcc
@ 2019-01-09  8:29 Kay F. Jahnke
From: Kay F. Jahnke @ 2019-01-09  8:29 UTC
  To: gcc

Hi there!

I am developing software which deliberately tries to exploit the
compiler's autovectorization facilities by processing data in
autovectorization-friendly loops. I'm currently using both g++ and
clang++ to see how well this approach works. Using simple arithmetic, I
often get good results. To widen the scope of my work, I was looking for
documentation on which constructs are recognized by the
autovectorization stage, and found

https://www.gnu.org/software/gcc/projects/tree-ssa/vectorization.html

By the looks of it, this document has not seen any changes for several 
years. Has development on the autovectorization stage stopped, or is 
there simply no documentation?

In my experience, vectorization is essential to speed up arithmetic on 
the CPU, and reliable recognition of vectorization opportunities by the 
compiler can provide vectorization to programs which don't bother to 
code it explicitly. I feel the topic is being neglected - at least the 
documentation I found suggests this. To demonstrate what I mean, I have 
two concrete scenarios which I'd like to be handled by the 
autovectorization stage:

- gather/scatter with arbitrary indexes

In C, this would be loops like

// gather from B to A using gather indexes

for ( int i = 0 ; i < vsz ; i++ )
   A [ i ] = B [ indexes [ i ] ] ;

From the AVX2 ISA onwards, there are hardware gather operations (scatter
arrives with AVX-512), which can speed things up a good deal.
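To make the gather case concrete, here is a minimal, self-contained sketch of the loop above as a C function. The name gather_f32, the float element type and the use of 'restrict' are my assumptions for illustration; they are not taken from any existing code base.

```c
#include <stddef.h>

/* Gather: A[i] = B[indexes[i]] for i in [0, n).
   'restrict' tells the compiler that A does not overlap B or indexes,
   which removes one common obstacle to vectorization.
   gather_f32 is a hypothetical name used only for this sketch. */
void gather_f32(float *restrict A, const float *restrict B,
                const int *restrict indexes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        A[i] = B[indexes[i]];
}
```

Compiling with something like `gcc -O3 -mavx2 -fopt-info-vec -c gather.c` should make GCC report which loops it vectorized. Whether a given GCC version actually emits a hardware gather here is exactly the kind of thing that has to be checked in the report or the generated assembly.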

- repeated use of vectorizable functions

for ( int i = 0 ; i < vsz ; i++ )
   A [ i ] = sqrt ( B [ i ] ) ;

Here, replacing the repeated call of sqrt with the vectorized equivalent
gives a dramatic speedup (ca. 4X).

If the compiler were to provide the autovectorization facilities, and if 
the patterns it recognizes were well-documented, users could rely on 
certain code patterns being recognized and autovectorized - sort of a 
contract between the user and the compiler. With a well-chosen spectrum 
of patterns, this would make it unnecessary to have to rely on explicit 
vectorization in many cases. My hope is that such an interface would 
help vectorization to become more frequently used - as I understand the 
status quo, this is still a niche topic, even though many processors 
provide suitable hardware nowadays.

Can you point me to where 'the action is' in this regard?

With regards

Kay F. Jahnke


Thread overview: 16+ messages
2019-01-09  8:29 autovectorization in gcc Kay F. Jahnke
2019-01-09  9:46 ` Kyrill Tkachov
2019-01-09  9:50   ` Andrew Haley
2019-01-09  9:56     ` Jonathan Wakely
2019-01-09 16:10       ` David Malcolm
2019-01-09 16:25         ` Jakub Jelinek
2019-01-10  8:19           ` Richard Biener
2019-01-10 11:11             ` Szabolcs Nagy
2019-01-09 16:26         ` David Malcolm
2019-01-09 10:47     ` Ramana Radhakrishnan
2019-01-10  9:24     ` Kay F. Jahnke
2019-01-10 11:18       ` Jonathan Wakely
2019-08-18 10:59         ` [wwwdocs PATCH] for " Gerald Pfeifer
2019-01-09 10:56   ` Kay F. Jahnke
2019-01-09 11:03     ` Jakub Jelinek
2019-01-09 11:21       ` Jakub Jelinek
