public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Uros Bizjak <ubizjak@gmail.com>
To: Eric Botcazou <ebotcazou@adacore.com>
Cc: gcc-patches@gcc.gnu.org, vbyakovl23@gmail.com,
		Kaz Kojima <kkojima@rr.iij4u.or.jp>
Subject: Re: [PATCH, middle-end]: Fix mode-switching MODE_EXIT check with __builtin_apply/__builtin_return
Date: Mon, 05 Nov 2012 18:34:00 -0000	[thread overview]
Message-ID: <CAFULd4ZtkLtw3rzoPGaAGCiH3z8K7dGdvq8iZqsyN5-ZTUFY+g@mail.gmail.com> (raw)
In-Reply-To: <1411479.F0z64l7D3N@polaris>

[-- Attachment #1: Type: text/plain, Size: 2347 bytes --]

On Mon, Nov 5, 2012 at 7:05 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> This sequence breaks assumption in mode-switching.c, that final
>> function return register load operates in MODE_EXIT mode and triggere
>> following code:
>>
>>                   for (j = n_entities - 1; j >= 0; j--)
>>                     {
>>                       int e = entity_map[j];
>>                       int mode = MODE_NEEDED (e, return_copy);
>>
>>                       if (mode != num_modes[e] && mode != MODE_EXIT (e))
>>                         break;
>>                     }
>>
>> As discussed above, modes of loads, generated from __builtin_apply
>> have no connection to function return mode. mode-switching.c does
>> detect __builtin_apply situation and raises maybe_builtin_apply flag,
>> but doesn't use it to short-circuit wrong check. In proposed patch, we
>> detect this situation and raise force_late_switch in the same way, as
>> SH4 does for its "late" fpscr emission.
>
> If I understand correctly, we need to insert the vzeroupper because the
> function returns double in SSE registers but we generate an OImode load
> instead of a DFmode load because of the __builtin_return.  So we're in the
> forced_late_switch case but we fail to recognize the tweaked return value load
> since the number of registers doesn't match.
>
> If so, I'd rather add another special case, like for the SH4, instead of a
> generic bypass for maybe_builtin_apply, something along the lines of:
>
>         /* For the x86 with AVX, we might be using a larger load for a value
>            returned in SSE registers and we want to put the final mode switch
>            after this return value copy.  */
>         if (copy_start == ret_start
>             && nregs == hard_regno_nregs[ret_start][GET_MODE (ret_reg)]
>             && copy_num >= nregs
>             && OBJECT_P (SET_SRC (return_copy_pat)))
>           forced_late_switch = 1;

Yes, this approach also works.

I assume it is OK to commit attached patch?

2012-11-05  Eric Botcazou  <ebotcazou@adacore.com>
	    Uros Bizjak  <ubizjak@gmail.com>

	* mode-switching.c (create_pre_exit): Force late switch for
	__builtin_return case, when value, returned in SSE register,
	was loaded using OImode load.

Tested on x86_64-pc-linux-gnu, also with to-be-committed avx-vzeroupper-27.c

Thanks,
Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 1110 bytes --]

Index: mode-switching.c
===================================================================
--- mode-switching.c	(revision 193174)
+++ mode-switching.c	(working copy)
@@ -1,6 +1,6 @@
 /* CPU mode switching
    Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007, 2008,
-   2009, 2010 Free Software Foundation, Inc.
+   2009, 2010, 2012  Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -342,6 +342,17 @@ create_pre_exit (int n_entities, int *entity_map,
 		      }
 		    if (j >= 0)
 		      {
+			/* For the x86 with AVX, we might be using a larger
+			   load for a value returned in SSE registers and we
+			   want to put the final mode switch after this
+			   return value copy.  */
+			if (copy_start == ret_start
+			    && nregs
+			       == hard_regno_nregs[ret_start][GET_MODE (ret_reg)]
+			    && copy_num >= nregs
+			    && OBJECT_P (SET_SRC (return_copy_pat)))
+			  forced_late_switch = 1;
+
 			/* For the SH4, floating point loads depend on fpscr,
 			   thus we might need to put the final mode switch
 			   after the return value copy.  That is still OK,

  reply	other threads:[~2012-11-05 18:34 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-04 13:52 Uros Bizjak
2012-11-04 14:58 ` Oleg Endo
2012-11-04 15:03   ` Uros Bizjak
2012-11-04 15:23     ` Uros Bizjak
2012-11-04 18:06       ` Uros Bizjak
2012-11-05  9:10 ` Kaz Kojima
2012-11-05  9:55   ` Vladimir Yakovlev
2012-11-05 18:07 ` Eric Botcazou
2012-11-05 18:34   ` Uros Bizjak [this message]
2012-11-05 18:43     ` Uros Bizjak
2012-11-05 19:07       ` Uros Bizjak
2012-11-05 19:32       ` Eric Botcazou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFULd4ZtkLtw3rzoPGaAGCiH3z8K7dGdvq8iZqsyN5-ZTUFY+g@mail.gmail.com \
    --to=ubizjak@gmail.com \
    --cc=ebotcazou@adacore.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=kkojima@rr.iij4u.or.jp \
    --cc=vbyakovl23@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).