public inbox for gcc-help@gcc.gnu.org
* _mm_store_pd/s translated to movntpd/s
From: Matthias Kretz @ 2011-03-21 14:23 UTC
  To: gcc-help

Hi,

I tested the GCC 4.6.0 RC on Intel systems with good success so far. Now I 
tested on an AMD Magny-Cours using the -march=barcelona flag and gcc 
translated _mm_store_pd/s calls in the code to streaming stores in the 
resulting binary.

Where does this "optimization" come from and how can I disable it? This 
doesn't make much sense on a working set that fits into the cache...

Is this intended behavior or a bug?

Cheers,
	Matthias

-- 
Dipl.-Phys. Matthias Kretz
http://compeng.uni-frankfurt.de/?mkretz


* Re: _mm_store_pd/s translated to movntpd/s
From: Matthias Kretz @ 2011-03-21 14:49 UTC
  To: gcc-help

Hi,

On Monday 21 March 2011 15:23:02 Matthias Kretz wrote:
> I tested the GCC 4.6.0 RC on Intel systems with good success so far. Now I
> tested on an AMD Magny-Cours using the -march=barcelona flag and gcc
> translated _mm_store_pd/s calls in the code to streaming stores in the
> resulting binary.
> 
> Where does this "optimization" come from and how can I disable it? This
> doesn't make much sense on a working set that fits into the cache...
> 
> Is this intended behavior or a bug?

Additional info: If I add -fno-prefetch-loop-arrays I get normal stores as 
expected. I don't consider this a solution, though.

Regards,
	Matthias

-- 
Dipl.-Phys. Matthias Kretz
http://compeng.uni-frankfurt.de/?mkretz


* Re: _mm_store_pd/s translated to movntpd/s
From: Brian Budge @ 2011-03-21 17:00 UTC
  To: Matthias Kretz; +Cc: gcc-help

On Mon, Mar 21, 2011 at 7:49 AM, Matthias Kretz
<kretz@compeng.uni-frankfurt.de> wrote:
> Hi,
>
> On Monday 21 March 2011 15:23:02 Matthias Kretz wrote:
>> I tested the GCC 4.6.0 RC on Intel systems with good success so far. Now I
>> tested on an AMD Magny-Cours using the -march=barcelona flag and gcc
>> translated _mm_store_pd/s calls in the code to streaming stores in the
>> resulting binary.
>>
>> Where does this "optimization" come from and how can I disable it? This
>> doesn't make much sense on a working set that fits into the cache...
>>
>> Is this intended behavior or a bug?
>
> Additional info: If I add -fno-prefetch-loop-arrays I get normal stores as
> expected. I don't consider this a solution, though.
>
> Regards,
>        Matthias
>
> --
> Dipl.-Phys. Matthias Kretz
> http://compeng.uni-frankfurt.de/?mkretz
>

Do you mean _mm_stream_pd/s?  I think store will still take your
values to cache...

  Brian


* Re: _mm_store_pd/s translated to movntpd/s
From: Matthias Kretz @ 2011-03-21 17:56 UTC
  To: gcc-help

Hi,

On Monday 21 March 2011 18:00:38 Brian Budge wrote:
> On Mon, Mar 21, 2011 at 7:49 AM, Matthias Kretz wrote:
> > On Monday 21 March 2011 15:23:02 Matthias Kretz wrote:
> >> I tested the GCC 4.6.0 RC on Intel systems with good success so far. Now
> >> I tested on an AMD Magny-Cours using the -march=barcelona flag and gcc
> >> translated _mm_store_pd/s calls in the code to streaming stores in the
> >> resulting binary.
> >> 
> >> Where does this "optimization" come from and how can I disable it? This
> >> doesn't make much sense on a working set that fits into the cache...
> >> 
> >> Is this intended behavior or a bug?
> > 
> > Additional info: If I add -fno-prefetch-loop-arrays I get normal stores
> > as expected. I don't consider this a solution, though.
> 
> Do you mean _mm_stream_pd/s?  I think store will still take your
> values to cache...

I mean that I wrote _mm_store_pd/s in my code, but the compiler emitted
streaming stores (movntpd/s), as if I had written _mm_stream_pd/s. Only
when I compile with -fno-prefetch-loop-arrays do I actually get
non-streaming stores.

Regards,
	Matthias

-- 
Dipl.-Phys. Matthias Kretz
http://compeng.uni-frankfurt.de/?mkretz


* Re: _mm_store_pd/s translated to movntpd/s
From: Ian Lance Taylor @ 2011-03-21 20:43 UTC
  To: Matthias Kretz; +Cc: gcc-help

Matthias Kretz <kretz@compeng.uni-frankfurt.de> writes:

> On Monday 21 March 2011 15:23:02 Matthias Kretz wrote:
>> I tested the GCC 4.6.0 RC on Intel systems with good success so far. Now I
>> tested on an AMD Magny-Cours using the -march=barcelona flag and gcc
>> translated _mm_store_pd/s calls in the code to streaming stores in the
>> resulting binary.
>> 
>> Where does this "optimization" come from and how can I disable it? This
>> doesn't make much sense on a working set that fits into the cache...
>> 
>> Is this intended behavior or a bug?
>
> Additional info: If I add -fno-prefetch-loop-arrays I get normal stores as 
> expected. I don't consider this a solution, though.

That is precisely where this optimization is coming from.  The
vectorizer pretty much assumes that the working set doesn't fit in the
cache.  I think it would be reasonable to have an option to control
this.  Please consider filing a bug report as described at
http://gcc.gnu.org/bugs/ , ideally with a test case.

Ian
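
A reduced test case for such a report might look like the sketch below.
It is not the code from the original poster: the function name `scale`,
its signature, and the file name `scale.c` are hypothetical and only
illustrate a loop that explicitly requests a regular store through
_mm_store_pd. The thread does not state the optimization level used, so
the -O3 in the commands is an assumption, and the sketch is not
guaranteed to reproduce the behavior on every GCC version.

```c
/* Hypothetical reduced test case (illustrative, not from the thread).
 *
 * Assumed commands for comparing the generated assembly:
 *   gcc -O3 -march=barcelona -S scale.c -o with-prefetch.s
 *   gcc -O3 -march=barcelona -fno-prefetch-loop-arrays -S scale.c \
 *       -o without-prefetch.s
 * Then search both files for movntpd (streaming store) and
 * movapd (regular aligned store).
 */
#include <emmintrin.h>  /* SSE2: _mm_load_pd, _mm_mul_pd, _mm_store_pd */
#include <stddef.h>

/* Multiplies n doubles from src by factor and writes them to dst.
 * Both pointers must be 16-byte aligned and n must be even. */
void scale(double *dst, const double *src, double factor, size_t n)
{
    const __m128d f = _mm_set1_pd(factor);
    size_t i;

    for (i = 0; i < n; i += 2) {
        __m128d v = _mm_load_pd(src + i);
        /* A regular (cached) store is requested here; the report is
         * that it can come out as a non-temporal store instead. */
        _mm_store_pd(dst + i, _mm_mul_pd(v, f));
    }
}
```

Attaching both assembly outputs together with the exact command lines
would let the maintainers see whether the store in the loop body changes
between the two builds.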
