* [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
@ 2011-06-23 20:48 Fang, Changpeng
2011-06-23 21:33 ` Jakub Jelinek
2011-06-24 0:14 ` Jan Hubicka
0 siblings, 2 replies; 8+ messages in thread
From: Fang, Changpeng @ 2011-06-23 20:48 UTC (permalink / raw)
To: Uros Bizjak, gcc-patches; +Cc: hubicka, rguenther
[-- Attachment #1: Type: text/plain, Size: 397 bytes --]
Hi,
This patch enables 128-bit AVX instruction generation in the auto-vectorizer for AMD Bulldozer
machines. It gives an additional ~3% improvement on Polyhedron 2005 and CPU2006
floating-point programs.
The patch passed bootstrap on an x86_64-unknown-linux-gnu system with Bulldozer cores.
Is it OK to commit to trunk and backport to the 4.6 branch?
Thanks,
Changpeng
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Auto-vectorizer-generates-128-bit-AVX-insns-by-defau.patch --]
[-- Type: text/x-patch; name="0001-Auto-vectorizer-generates-128-bit-AVX-insns-by-defau.patch", Size: 3539 bytes --]
From b5015593b0b30b14783866ac68c2c5f2e014d206 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@huainan.(none)>
Date: Wed, 22 Jun 2011 15:03:05 -0700
Subject: [PATCH] Auto-vectorizer generates 128-bit AVX insns by default for bdver1
* config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.
* config/i386/i386.c (x86_prefer_avx128): New tune option definition.
(ix86_option_override_internal): Enable the generation of the 128-bit
instructions when x86_prefer_avx128 is set.
(ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
(ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
---
gcc/config/i386/i386.c | 13 ++++++++++---
gcc/config/i386/i386.opt | 2 +-
2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 014401b..1f5113f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2128,6 +2128,9 @@ static const unsigned int x86_avx256_split_unaligned_load
static const unsigned int x86_avx256_split_unaligned_store
= m_COREI7 | m_BDVER1 | m_GENERIC;
+static const unsigned int x86_prefer_avx128
+ = m_BDVER1;
+
/* In case the average insn count for single function invocation is
lower than this constant, emit fast (but longer) prologue and
epilogue code. */
@@ -2623,6 +2626,7 @@ ix86_target_string (int isa, int flags, const char *arch, const char *tune,
{ "-mvzeroupper", MASK_VZEROUPPER },
{ "-mavx256-split-unaligned-load", MASK_AVX256_SPLIT_UNALIGNED_LOAD},
{ "-mavx256-split-unaligned-store", MASK_AVX256_SPLIT_UNALIGNED_STORE},
+ { "-mprefer-avx128", MASK_PREFER_AVX128},
};
const char *opts[ARRAY_SIZE (isa_opts) + ARRAY_SIZE (flag_opts) + 6][2];
@@ -3672,6 +3676,9 @@ ix86_option_override_internal (bool main_args_p)
if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
&& !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
+ if ((x86_prefer_avx128 & ix86_tune_mask)
+ && !(target_flags_explicit & MASK_PREFER_AVX128))
+ target_flags |= MASK_PREFER_AVX128;
}
}
else
@@ -34614,7 +34621,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
return V2DImode;
case SFmode:
- if (TARGET_AVX && !flag_prefer_avx128)
+ if (TARGET_AVX && !TARGET_PREFER_AVX128)
return V8SFmode;
else
return V4SFmode;
@@ -34622,7 +34629,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
case DFmode:
if (!TARGET_VECTORIZE_DOUBLE)
return word_mode;
- else if (TARGET_AVX && !flag_prefer_avx128)
+ else if (TARGET_AVX && !TARGET_PREFER_AVX128)
return V4DFmode;
else if (TARGET_SSE2)
return V2DFmode;
@@ -34639,7 +34646,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
static unsigned int
ix86_autovectorize_vector_sizes (void)
{
- return (TARGET_AVX && !flag_prefer_avx128) ? 32 | 16 : 0;
+ return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
}
/* Initialize the GCC target structure. */
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 21e0def..9886b7b 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -388,7 +388,7 @@ Do dispatch scheduling if processor is bdver1 and Haifa scheduling
is selected.
mprefer-avx128
-Target Report Var(flag_prefer_avx128) Init(0)
+Target Report Mask(PREFER_AVX128) SAVE
Use 128-bit AVX instructions instead of 256-bit AVX instructions in the auto-vectorizer.
;; ISA support
--
1.7.0.4
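The defaulting rule this patch adds to ix86_option_override_internal can be read as: turn the option on when the selected tuning target is in the x86_prefer_avx128 mask, unless the user mentioned the option explicitly. Below is a minimal standalone sketch of that rule, not the GCC code itself. The bit positions chosen for m_BDVER1, m_COREI7 and MASK_PREFER_AVX128, and the helper name apply_tune_default, are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC internals; the bit values are illustrative. */
#define MASK_PREFER_AVX128 (1u << 0)
#define m_COREI7           (1u << 1)
#define m_BDVER1           (1u << 2)

/* Tuning targets for which 128-bit AVX is preferred (as in the patch). */
static const unsigned int x86_prefer_avx128 = m_BDVER1;

/* Return the target flags after applying the tuning default.
   tune_mask:      single bit identifying the selected -mtune target.
   flags:          option bits currently set.
   explicit_flags: option bits the user set or cleared on the command line.  */
static unsigned int
apply_tune_default (unsigned int tune_mask, unsigned int flags,
                    unsigned int explicit_flags)
{
  /* Exactly one tuning target is selected at a time.  */
  assert (tune_mask != 0 && (tune_mask & (tune_mask - 1)) == 0);

  if ((x86_prefer_avx128 & tune_mask)
      && !(explicit_flags & MASK_PREFER_AVX128))
    flags |= MASK_PREFER_AVX128;

  return flags;
}
```

With these assumptions, apply_tune_default (m_BDVER1, 0, 0) sets MASK_PREFER_AVX128, while passing MASK_PREFER_AVX128 in explicit_flags (the user wrote -mprefer-avx128 or -mno-prefer-avx128) leaves flags untouched, and so does tuning for m_COREI7.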
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
2011-06-23 20:48 [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer Fang, Changpeng
@ 2011-06-23 21:33 ` Jakub Jelinek
2011-06-24 0:14 ` Jan Hubicka
1 sibling, 0 replies; 8+ messages in thread
From: Jakub Jelinek @ 2011-06-23 21:33 UTC (permalink / raw)
To: Fang, Changpeng; +Cc: Uros Bizjak, gcc-patches, hubicka, rguenther
On Thu, Jun 23, 2011 at 03:41:01PM -0500, Fang, Changpeng wrote:
> This patch enables 128-bit AVX instruction generation in the auto-vectorizer for AMD Bulldozer
> machines. It gives an additional ~3% improvement on Polyhedron 2005 and CPU2006
> floating-point programs.
>
> The patch passed bootstrap on an x86_64-unknown-linux-gnu system with Bulldozer cores.
>
> Is it OK to commit to trunk and backport to 4.6 branch?
For the 4.6 branch, if it is approved for trunk, please wait until after 4.6.1 is
released.
Jakub
* Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
2011-06-23 20:48 [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer Fang, Changpeng
2011-06-23 21:33 ` Jakub Jelinek
@ 2011-06-24 0:14 ` Jan Hubicka
2011-06-25 6:41 ` Fang, Changpeng
1 sibling, 1 reply; 8+ messages in thread
From: Jan Hubicka @ 2011-06-24 0:14 UTC (permalink / raw)
To: Fang, Changpeng; +Cc: Uros Bizjak, gcc-patches, hubicka, rguenther
Hi,
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2128,6 +2128,9 @@ static const unsigned int x86_avx256_split_unaligned_load
> static const unsigned int x86_avx256_split_unaligned_store
> = m_COREI7 | m_BDVER1 | m_GENERIC;
>
> +static const unsigned int x86_prefer_avx128
> + = m_BDVER1;
What is the reason for things like this not going into initial_ix86_tune_features?
I sort of liked them better when they were individual flags, but having the target
tuning flags spread across multiple places seems unnecessary.
Honza
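For context on the suggestion above: a tune-features table keeps one processor mask per feature, indexed by an enum, and collapses it to a per-feature boolean once the tuning target is known. The following is a simplified model of that layout, not the actual i386.c code. The enum names, the processor list and the mask given for TUNE_VECTORIZE_DOUBLE are invented for illustration; only the m_BDVER1 entry for the AVX128 feature comes from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical processor list; real GCC has many more entries.  */
enum processor { PROC_GENERIC, PROC_COREI7, PROC_BDVER1, PROC_LAST };

#define m_GENERIC (1u << PROC_GENERIC)
#define m_COREI7  (1u << PROC_COREI7)
#define m_BDVER1  (1u << PROC_BDVER1)

/* One index per tuning feature; TUNE_LAST sizes the tables.  */
enum tune_index { TUNE_VECTORIZE_DOUBLE, TUNE_AVX128_OPTIMAL, TUNE_LAST };

/* For each feature, the set of processors that want it enabled.  */
static const unsigned int initial_tune_features[TUNE_LAST] = {
  /* TUNE_VECTORIZE_DOUBLE (mask is illustrative only).  */
  m_GENERIC | m_COREI7 | m_BDVER1,
  /* TUNE_AVX128_OPTIMAL: prefer 128-bit AVX in the auto-vectorizer.  */
  m_BDVER1
};

/* Per-feature booleans for the currently selected tuning target.  */
static unsigned char tune_features[TUNE_LAST];

/* Collapse each processor mask to a 0/1 flag for the chosen target.  */
static void
init_tune_features (enum processor tune)
{
  assert (tune >= PROC_GENERIC && tune < PROC_LAST);
  unsigned int tune_mask = 1u << tune;

  for (size_t i = 0; i < TUNE_LAST; i++)
    tune_features[i] = !!(initial_tune_features[i] & tune_mask);
}
```

Keeping every tuning decision in one table means adding a feature touches an enum entry, a table entry and an accessor macro, rather than introducing a separate static variable next to the option-handling code.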
* RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
2011-06-24 0:14 ` Jan Hubicka
@ 2011-06-25 6:41 ` Fang, Changpeng
2011-06-27 23:25 ` Fang, Changpeng
0 siblings, 1 reply; 8+ messages in thread
From: Fang, Changpeng @ 2011-06-25 6:41 UTC (permalink / raw)
To: Jan Hubicka; +Cc: Uros Bizjak, gcc-patches, rguenther
[-- Attachment #1: Type: text/plain, Size: 1096 bytes --]
Hi,
I have no preference about how the tune feature is coded, but I agree with you that it's better to
put similar things together. I modified the code following your suggestion.
Is it OK to commit this modified patch?
Thanks,
Changpeng
[-- Attachment #2: 0001-Auto-vectorizer-generates-128-bit-AVX-insns-by-defau.patch --]
[-- Type: text/x-patch; name="0001-Auto-vectorizer-generates-128-bit-AVX-insns-by-defau.patch", Size: 4685 bytes --]
From a325395439a314f87b3c79a5b9ce79a6a976a710 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@huainan.(none)>
Date: Wed, 22 Jun 2011 15:03:05 -0700
Subject: [PATCH] Auto-vectorizer generates 128-bit AVX insns by default for bdver1
* config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.
* config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
(TARGET_AVX128_OPTIMAL): New definition.
* config/i386/i386.c (initial_ix86_tune_features): Initialize
X86_TUNE_AVX128_OPTIMAL entry.
(ix86_option_override_internal): Enable the generation
of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
(ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
(ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
---
gcc/config/i386/i386.c | 16 ++++++++++++----
gcc/config/i386/i386.h | 4 +++-
gcc/config/i386/i386.opt | 2 +-
3 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 014401b..b3434dd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2089,7 +2089,11 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
/* X86_SOFTARE_PREFETCHING_BENEFICIAL: Enable software prefetching
at -O3. For the moment, the prefetching seems badly tuned for Intel
chips. */
- m_K6_GEODE | m_AMD_MULTIPLE
+ m_K6_GEODE | m_AMD_MULTIPLE,
+
+ /* X86_TUNE_AVX128_OPTIMAL: Enable 128-bit AVX instruction generation for
+ the auto-vectorizer. */
+ m_BDVER1
};
/* Feature tests against the various architecture variations. */
@@ -2623,6 +2627,7 @@ ix86_target_string (int isa, int flags, const char *arch, const char *tune,
{ "-mvzeroupper", MASK_VZEROUPPER },
{ "-mavx256-split-unaligned-load", MASK_AVX256_SPLIT_UNALIGNED_LOAD},
{ "-mavx256-split-unaligned-store", MASK_AVX256_SPLIT_UNALIGNED_STORE},
+ { "-mprefer-avx128", MASK_PREFER_AVX128},
};
const char *opts[ARRAY_SIZE (isa_opts) + ARRAY_SIZE (flag_opts) + 6][2];
@@ -3672,6 +3677,9 @@ ix86_option_override_internal (bool main_args_p)
if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
&& !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
+ /* Enable 128-bit AVX instruction generation for the auto-vectorizer. */
+ if (TARGET_AVX128_OPTIMAL && !(target_flags_explicit & MASK_PREFER_AVX128))
+ target_flags |= MASK_PREFER_AVX128;
}
}
else
@@ -34614,7 +34622,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
return V2DImode;
case SFmode:
- if (TARGET_AVX && !flag_prefer_avx128)
+ if (TARGET_AVX && !TARGET_PREFER_AVX128)
return V8SFmode;
else
return V4SFmode;
@@ -34622,7 +34630,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
case DFmode:
if (!TARGET_VECTORIZE_DOUBLE)
return word_mode;
- else if (TARGET_AVX && !flag_prefer_avx128)
+ else if (TARGET_AVX && !TARGET_PREFER_AVX128)
return V4DFmode;
else if (TARGET_SSE2)
return V2DFmode;
@@ -34639,7 +34647,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
static unsigned int
ix86_autovectorize_vector_sizes (void)
{
- return (TARGET_AVX && !flag_prefer_avx128) ? 32 | 16 : 0;
+ return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
}
/* Initialize the GCC target structure. */
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 8badcbb..d9317ed 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -312,6 +312,7 @@ enum ix86_tune_indices {
X86_TUNE_OPT_AGU,
X86_TUNE_VECTORIZE_DOUBLE,
X86_TUNE_SOFTWARE_PREFETCHING_BENEFICIAL,
+ X86_TUNE_AVX128_OPTIMAL,
X86_TUNE_LAST
};
@@ -410,7 +411,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
ix86_tune_features[X86_TUNE_VECTORIZE_DOUBLE]
#define TARGET_SOFTWARE_PREFETCHING_BENEFICIAL \
ix86_tune_features[X86_TUNE_SOFTWARE_PREFETCHING_BENEFICIAL]
-
+#define TARGET_AVX128_OPTIMAL \
+ ix86_tune_features[X86_TUNE_AVX128_OPTIMAL]
/* Feature tests against the various architecture variations. */
enum ix86_arch_indices {
X86_ARCH_CMOVE, /* || TARGET_SSE */
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 21e0def..9886b7b 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -388,7 +388,7 @@ Do dispatch scheduling if processor is bdver1 and Haifa scheduling
is selected.
mprefer-avx128
-Target Report Var(flag_prefer_avx128) Init(0)
+Target Report Mask(PREFER_AVX128) SAVE
Use 128-bit AVX instructions instead of 256-bit AVX instructions in the auto-vectorizer.
;; ISA support
--
1.7.0.4
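The effect of the two vectorizer hooks touched by this patch can be summarized as a choice of vector width. The sketch below restates that choice in standalone C under simplifying assumptions: modes are represented by their size in bytes (32 for V8SF/V4DF, 16 for V4SF/V2DF), 0 stands for "no vector mode", the TARGET_VECTORIZE_DOUBLE check is left out, and the function and enum names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SFmode/DFmode cases in the patch.  */
enum scalar_kind { SCALAR_FLOAT, SCALAR_DOUBLE };

/* Preferred vector size in bytes for one scalar kind.
   Mirrors the shape of ix86_preferred_simd_mode for SFmode and DFmode.  */
static unsigned int
preferred_vector_bytes (enum scalar_kind kind, bool have_avx,
                        bool have_sse2, bool prefer_avx128)
{
  bool use_256 = have_avx && !prefer_avx128;

  switch (kind)
    {
    case SCALAR_FLOAT:
      return use_256 ? 32 : 16;   /* V8SF vs. V4SF */
    case SCALAR_DOUBLE:
      if (use_256)
        return 32;                /* V4DF */
      if (have_sse2)
        return 16;                /* V2DF */
      return 0;                   /* no vector mode */
    }

  assert (!"unhandled scalar kind");
  return 0;
}

/* Bitmask of vector sizes the vectorizer may try in turn; 0 means
   "only the preferred size".  Same expression as the patched hook.  */
static unsigned int
autovectorize_vector_sizes (bool have_avx, bool prefer_avx128)
{
  return (have_avx && !prefer_avx128) ? (32 | 16) : 0;
}
```

Under these assumptions, an AVX target tuned for bdver1 ends up with 16-byte vectors for both float and double and a size mask of 0, whereas the same target with the preference cleared gets 32-byte vectors and a mask of 32 | 16.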
* RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
2011-06-25 6:41 ` Fang, Changpeng
@ 2011-06-27 23:25 ` Fang, Changpeng
2011-06-28 22:43 ` Fang, Changpeng
0 siblings, 1 reply; 8+ messages in thread
From: Fang, Changpeng @ 2011-06-27 23:25 UTC (permalink / raw)
To: Fang, Changpeng, Jan Hubicka; +Cc: Uros Bizjak, gcc-patches, rguenther
Is this patch OK to commit to trunk?
Also, I would like to backport this patch to the GCC 4.6 branch. Do I need to send a separate
request, or can I use this one?
Thanks,
Changpeng
* RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
2011-06-27 23:25 ` Fang, Changpeng
@ 2011-06-28 22:43 ` Fang, Changpeng
2011-06-29 8:14 ` Jan Hubicka
0 siblings, 1 reply; 8+ messages in thread
From: Fang, Changpeng @ 2011-06-28 22:43 UTC (permalink / raw)
To: Fang, Changpeng, Jan Hubicka; +Cc: Uros Bizjak, gcc-patches, rguenther
[-- Attachment #1: Type: text/plain, Size: 1969 bytes --]
Hi,
I have re-attached the patch here. Can someone review it?
We would like to commit it to trunk as well as to the 4.6 branch.
Thanks,
Changpeng
[-- Attachment #2: 0001-Auto-vectorizer-generates-128-bit-AVX-insns-by-defau.patch --]
[-- Type: text/x-patch; name="0001-Auto-vectorizer-generates-128-bit-AVX-insns-by-defau.patch", Size: 4685 bytes --]
* Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
2011-06-28 22:43 ` Fang, Changpeng
@ 2011-06-29 8:14 ` Jan Hubicka
2011-06-29 8:18 ` Jakub Jelinek
0 siblings, 1 reply; 8+ messages in thread
From: Jan Hubicka @ 2011-06-29 8:14 UTC (permalink / raw)
To: Fang, Changpeng; +Cc: Jan Hubicka, Uros Bizjak, gcc-patches, rguenther
> * config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.
>
> * config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
> (TARGET_AVX128_OPTIMAL): New definition.
>
> * config/i386/i386.c (initial_ix86_tune_features): Initialize
> X86_TUNE_AVX128_OPTIMAL entry.
> (ix86_option_override_internal): Enable the generation
> of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
> (ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
> (ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
OK for mainline. For 4.6 it is RM's call.
Thanks,
Honza
* Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
2011-06-29 8:14 ` Jan Hubicka
@ 2011-06-29 8:18 ` Jakub Jelinek
0 siblings, 0 replies; 8+ messages in thread
From: Jakub Jelinek @ 2011-06-29 8:18 UTC (permalink / raw)
To: Jan Hubicka; +Cc: Fang, Changpeng, Uros Bizjak, gcc-patches, rguenther
On Wed, Jun 29, 2011 at 09:49:52AM +0200, Jan Hubicka wrote:
> > * config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.
> >
> > * config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
> > (TARGET_AVX128_OPTIMAL): New definition.
> >
> > * config/i386/i386.c (initial_ix86_tune_features): Initialize
> > X86_TUNE_AVX128_OPTIMAL entry.
> > (ix86_option_override_internal): Enable the generation
> > of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
> > (ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
> > (ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
>
> OK for mainline. For 4.6 it is RM's call.
For 4.6 it is fine as well.
Jakub