public inbox for gcc-patches@gcc.gnu.org
* AVX generic mode tuning discussion.
From: harsha.jagasia @ 2011-07-12 22:26 UTC
  To: gcc-patches, hubicka, ubizjak, hjl.tools, Changpeng.Fang, rth
  Cc: harsha.jagasia

We would like to propose changing AVX generic mode tuning to generate 128-bit
AVX instead of 256-bit AVX. As per H.J's suggestion, we have reviewed the
various tuning choices made for generic mode with respect to AMD's upcoming
Bulldozer processor. At this moment, this is the most significant change we
have to propose. While we are willing to re-engineer generic mode, this
feature needs immediate discussion since the performance impact on Bulldozer
is significant.

Here is the relative CPU2006 performance data we have gathered using gcc on AMD
Bulldozer (BD) and Intel Sandybridge (SB) machines with "-Ofast -mtune=generic
-mavx".

% gain/loss, avx256 vs avx128
(negative % indicates loss, positive % indicates gain)

Benchmark        AMD BD   Intel SB
410.bwaves        -2.34      -1.52
416.gamess        -1.11      -0.30
433.milc           0.47      -1.75
434.zeusmp        -3.61       0.68
435.gromacs       -0.54      -0.38
436.cactusADM    -23.56      21.49
437.leslie3d      -0.44       1.56
444.namd           0.00       0.00
447.dealII        -0.36      -0.23
450.soplex        -0.43      -0.29
453.povray         0.50       3.63
454.calculix      -8.29       1.38
459.GemsFDTD       2.37      -1.54
465.tonto          0.00       0.00
470.lbm            0.00       0.21
481.wrf           -4.80       0.00
482.sphinx3      -10.20      -3.65
SpecFP            -3.29       1.01

400.perlbench      0.93       1.47
401.bzip2          0.60       0.00
403.gcc            0.00       0.00
429.mcf            0.00      -0.36
445.gobmk         -1.03       0.37
456.hmmer         -0.64       0.38
458.sjeng          1.74       0.00
462.libquantum     0.31       0.00
464.h264ref        0.00       0.00
471.omnetpp       -1.27       0.00
473.astar          0.00       0.46
483.xalancbmk      0.51       0.00
SpecINT            0.09       0.19

Per the data, the roughly 1% SpecFP gain (1.01%) that 256-bit AVX gives on
Intel Sandybridge is outweighed by a roughly 3% SpecFP degradation (3.29%) on
AMD Bulldozer.
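
To make the per-benchmark picture behind those summary numbers easier to see,
here is a small Python sketch over the first group of rows in the table above
(410.bwaves through 482.sphinx3). The helper names and the 1% cutoff for
calling something a loss are illustrative assumptions, not part of the
measurements.

```python
# Rows copied from the table above: benchmark -> (AMD BD, Intel SB),
# each value being the % change of avx256 relative to avx128.
FP_DELTAS = {
    "410.bwaves": (-2.34, -1.52),
    "416.gamess": (-1.11, -0.30),
    "433.milc": (0.47, -1.75),
    "434.zeusmp": (-3.61, 0.68),
    "435.gromacs": (-0.54, -0.38),
    "436.cactusADM": (-23.56, 21.49),
    "437.leslie3d": (-0.44, 1.56),
    "444.namd": (0.00, 0.00),
    "447.dealII": (-0.36, -0.23),
    "450.soplex": (-0.43, -0.29),
    "453.povray": (0.50, 3.63),
    "454.calculix": (-8.29, 1.38),
    "459.GemsFDTD": (2.37, -1.54),
    "465.tonto": (0.00, 0.00),
    "470.lbm": (0.00, 0.21),
    "481.wrf": (-4.80, 0.00),
    "482.sphinx3": (-10.20, -3.65),
}

BD, SB = 0, 1  # column indices into each row


def extremes(deltas, column):
    """Return (worst, best) benchmark names for one machine column."""
    worst = min(deltas, key=lambda name: deltas[name][column])
    best = max(deltas, key=lambda name: deltas[name][column])
    return worst, best


def losers(deltas, column, threshold=-1.0):
    """Benchmarks whose avx256 result is below `threshold` percent.

    The default of -1.0 is an arbitrary cutoff chosen for illustration.
    """
    return sorted(n for n, row in deltas.items() if row[column] < threshold)


for label, col in (("AMD BD", BD), ("Intel SB", SB)):
    worst, best = extremes(FP_DELTAS, col)
    print(f"{label}: worst={worst} best={best} "
          f"losses beyond 1%: {len(losers(FP_DELTAS, col))}")
```

Run on these rows, it shows that 436.cactusADM is both the largest loss on
Bulldozer and the largest gain on Sandybridge, which is why a single generic
setting is hard to pick.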

For the data above, generic mode splits both 256-bit misaligned loads and
stores, as is currently the case in trunk. 

Even if we disable 256-bit misaligned load splitting, AVX 256-bit performance
on SpecFP improves by only ~1.4% on AMD Bulldozer, while AVX 256-bit
performance drops by 0.12% on Intel Sandybridge. With AVX 256 load splitting
disabled, comparing AVX 256 to AVX 128 therefore gives a net 0.9% gain for
Intel Sandybridge versus a net 1.9% loss for AMD Bulldozer, so AVX 256 is
still not a fair choice for generic mode.
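
The net figures in the paragraph above can be checked with a few lines of
Python. This is only a sketch of the arithmetic: it assumes the baseline is
the -3.29 / 1.01 summary row of the table and that percentage changes can
simply be added, which is an approximation. The function name is hypothetical.

```python
def net_change(with_splitting, adjustment):
    """Add a percentage-point adjustment to a baseline % change."""
    return round(with_splitting + adjustment, 2)


# Summary row (avx256 vs avx128) with load and store splitting enabled.
BD_BASE, SB_BASE = -3.29, 1.01
# Reported effect on avx256 of disabling misaligned load splitting.
# The Bulldozer figure is quoted only approximately ("~1.4").
BD_ADJ, SB_ADJ = 1.4, -0.12

bd_net = net_change(BD_BASE, BD_ADJ)
sb_net = net_change(SB_BASE, SB_ADJ)
print(f"AMD BD:   {bd_net:+.2f}%")
print(f"Intel SB: {sb_net:+.2f}%")
```

The results round to the 1.9% loss and 0.9% gain quoted above.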

Please share your thoughts. It would be great if H.J. could verify the Intel
Sandybridge data.

Thanks,
Harsha



Thread overview: 12+ messages
2011-07-12 22:26 AVX generic mode tuning discussion harsha.jagasia
2011-07-12 22:29 ` Richard Henderson
2011-07-13  8:49   ` Richard Guenther
2011-07-13  9:07     ` Jakub Jelinek
2011-07-21 21:37     ` Jagasia, Harsha
2013-01-07 18:24     ` FW: " Jagasia, Harsha
     [not found]     ` <873A3B0C5474B84F92B91855BCB4FE1625438297@sausexdag01.amd.com>
2013-01-08 11:22       ` Richard Biener
2011-07-21 21:18   ` Jagasia, Harsha
     [not found]   ` <63EE40A00BA43F49B85FACBB03F078B60821086630@sausexmbp02.amd.com>
2011-10-31 21:21     ` Jagasia, Harsha
2011-11-01  9:47       ` Richard Guenther
2011-11-02 17:17         ` Jagasia, Harsha
2011-11-02 20:50           ` Richard Guenther
