public inbox for gcc-prs@sourceware.org
From: Kevin J Bowers <kbowers@lanl.gov>
To: hubicka@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org,
Subject: Re: target/10691: Invalid assembly emitted when using _m128 datatypes on x86
Date: Mon, 12 May 2003 23:16:00 -0000
Message-ID: <20030512231601.18429.qmail@sources.redhat.com>

The following reply was made to PR target/10691; it has been noted by GNATS.

From: Kevin J Bowers <kbowers@lanl.gov>
To: gcc-prs@gcc.gnu.org, hubicka@gcc.gnu.org, kbowers@lanl.gov,
   gcc-bugs@gcc.gnu.org, gcc-gnats@gcc.gnu.org
Cc:  
Subject: Re: target/10691: Invalid assembly emitted when using _m128 datatypes
 on x86
Date: Mon, 12 May 2003 17:13:51 -0600

 http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=10691
 
 Brief followup:
 
 I've had the problem crop up in a couple of other situations. Here is 
 some information that might help in isolating the problem.
 
 Consider _mm_storel_pi in "xmmintrin.h":
 
 static __inline void
 _mm_storel_pi (__m64 *__P, __m128 __A)
 {
    __builtin_ia32_storelps ((__v2si *)__P, (__v4sf)__A);
 }
 
 In a -g compile, it appears that the function is not expanded inline. 
 For the function call, the compiler puts the arguments on the stack as:
 
 (%esp)  -> __P
 4(%esp) -> __A
 
 To store the caller's __A at 4(%esp), the compiler emits something along 
 the lines of:
 
 movaps mem128, %xmm_reg
 movaps %xmm_reg, 4(%esp) ==> faults
 ...
 call _mm_storel_pi
 
 At higher optimization levels, sometimes the faulting move gets 
 optimized away and sometimes it doesn't.
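
 A minimal sketch of the kind of call involved, assuming an x86 target 
 with SSE enabled (for example -msse on 32-bit). The wrapper name 
 store_low_pair is hypothetical and not taken from the original test 
 case; whether the fault reproduces depends on the compiler version and 
 on flags such as -g and -O0.

```c
#include <assert.h>
#include <string.h>
#include <xmmintrin.h>

/* Hypothetical wrapper (not from the report): copies the low two floats
   of a four-float vector to out[] by way of _mm_storel_pi. */
static void store_low_pair(float out[2], const float in[4])
{
    __m128 v = _mm_loadu_ps(in);   /* unaligned load: no alignment assumed */
    __m64 low;                     /* 8-byte destination for the low half */

    assert(sizeof low == 2 * sizeof(float));

    /* If this call is not inlined, v is passed as a __m128 argument,
       which is the situation described above. */
    _mm_storel_pi(&low, v);

    memcpy(out, &low, sizeof low); /* copy out without pointer casts */
}
```

 With a compiler that handles the argument correctly, passing 
 {1, 2, 3, 4} as in[] should leave 1 and 2 in out[].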
 
 I suspect that the faults will go away if __m128 arguments in a function 
 call are placed such that they fall on 16-byte boundaries. In the above 
 example, this would mean switching the order of __P and __A arguments. 
 In any case, the faults seem to go away if I override the various 
 xmm store functions with macros that bypass the inline intrinsic functions.
 
 That is:
 
 #define _mm_storel_pi(__P,__A) \
    __builtin_ia32_storelps((__v2si *)(__P), (__v4sf)(__A))
 
 The preferred solution is that __m128 arguments put onto the stack be 
 placed on 16-byte boundaries. A solution that would work (but that would 
 defeat the point of using SSE instructions) would be to use a "movups" 
 to put the __m128 values onto the stack.
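
 The alignment argument can be checked with plain address arithmetic. 
 The sketch below assumes the stack pointer is itself 16-byte aligned at 
 the point of the call, which is consistent with the fault at 4(%esp) 
 but is not something I have verified. The helper names is_aligned16, 
 align_up16 and check_layouts are hypothetical and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if addr is a multiple of 16, which is what
   movaps requires of its memory operand. */
static int is_aligned16(uintptr_t addr)
{
    return (addr & (uintptr_t)15) == 0;
}

/* Hypothetical helper: round an offset up to the next multiple of 16, as
   an alignment-aware argument layout might do for a __m128 slot. */
static uintptr_t align_up16(uintptr_t offset)
{
    return (offset + 15) & ~(uintptr_t)15;
}

/* Checks the layouts discussed above for an assumed 16-byte-aligned
   stack pointer value esp. */
static void check_layouts(uintptr_t esp)
{
    assert(is_aligned16(esp));                 /* assumption about the caller */
    assert(!is_aligned16(esp + 4));            /* __A right after a 4-byte __P */
    assert(is_aligned16(esp + 0));             /* __A placed first, __P after */
    assert(is_aligned16(esp + align_up16(4))); /* __A padded up to offset 16 */
}
```

 Calling check_layouts with an address such as 0x1000 passes every 
 assertion: only the slot at offset 4 is misaligned, so either swapping 
 the arguments or padding the __m128 slot would satisfy movaps.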
 
 A previous response says that this problem might have been fixed. Does 
 this mean fixed in gcc-3.3? I've had this problem in gcc-3.2.x.
 
 -- 
 Kevin J Bowers, Ph.D.
 Plasma Physics Group (X-1)
 Applied Physics Division
 Los Alamos National Lab
 

