public inbox for gcc-patches@gcc.gnu.org
From: Uros Bizjak <ubizjak@gmail.com>
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	Jan Hubicka <hubicka@ucw.cz>
Subject: Re: [04/32] [x86] Robustify vzeroupper handling across calls
Date: Tue, 01 Oct 2019 10:14:00 -0000
Message-ID: <CAFULd4ZvEmn5tAYW_Ud--8j-V+908NnEZ8MnPU9BSVREf4GzYA@mail.gmail.com>
In-Reply-To: <mpt36gkjs1o.fsf@arm.com>

On Wed, Sep 25, 2019 at 5:48 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:

> > The comment suggests that this code is only needed for Win64 and that
> > not testing for Win64 is just a simplification.  But in practice it was
> > needed for correctness on GNU/Linux and other targets too, since without
> > it the RA would be able to keep 256-bit and 512-bit values in SSE
> > registers across calls that are known not to clobber them.
> >
> > This patch conservatively treats calls as AVX_U128_ANY if the RA can see
> > that some SSE registers are not touched by a call.  There are then no
> > regressions if the ix86_hard_regno_call_part_clobbered check is disabled
> > for GNU/Linux (not something we should do, was just for testing).

If the RA can see that some SSE regs are not touched by the call, then we
are sure that the called function is part of the current TU. In this
case, the called function will be compiled using VEX instructions,
where there is no AVX-SSE transition penalty. So, skipping VZEROUPPER
is beneficial here.

Uros.

> > If in fact we want -fipa-ra to pretend that all functions clobber
> > SSE registers above 128 bits, it'd certainly be possible to arrange
> > that.  But IMO that would be an optimisation decision, whereas what
> > the patch is fixing is a correctness decision.  So I think we should
> > have this check even so.
>
> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * config/i386/i386.c: Include function-abi.h.
>         (ix86_avx_u128_mode_needed): Treat function calls as AVX_U128_ANY
>         if they preserve some 256-bit or 512-bit SSE registers.
>
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c      2019-09-25 16:47:48.000000000 +0100
> +++ gcc/config/i386/i386.c      2019-09-25 16:47:49.089962608 +0100
> @@ -95,6 +95,7 @@ #define IN_TARGET_CODE 1
>  #include "i386-builtins.h"
>  #include "i386-expand.h"
>  #include "i386-features.h"
> +#include "function-abi.h"
>
>  /* This file should be included last.  */
>  #include "target-def.h"
> @@ -13511,6 +13512,15 @@ ix86_avx_u128_mode_needed (rtx_insn *ins
>             }
>         }
>
> +      /* If the function is known to preserve some SSE registers,
> +        RA and previous passes can legitimately rely on that for
> +        modes wider than 256 bits.  It's only safe to issue a
> +        vzeroupper if all SSE registers are clobbered.  */
> +      const function_abi &abi = insn_callee_abi (insn);
> +      if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
> +                                 abi.mode_clobbers (V4DImode)))
> +       return AVX_U128_ANY;
> +
>        return AVX_U128_CLEAN;
>      }
>
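
For readers following the hunk above, here is a rough standalone sketch of
the decision it adds. It is not GCC code: RegSet, kNumRegs, subset_p and
mode_needed_after_call are hypothetical stand-ins for HARD_REG_SET,
hard_reg_set_subset_p and the tail of ix86_avx_u128_mode_needed. The only
thing it tries to show is the rule "vzeroupper is safe only if the callee
clobbers every SSE register in a 256-bit mode".

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical model of a hard register set; GCC's HARD_REG_SET differs.
constexpr std::size_t kNumRegs = 32;
using RegSet = std::bitset<kNumRegs>;

// Mirrors the two mode names used in the patch.
enum Avx128Mode { AVX_U128_CLEAN, AVX_U128_ANY };

// True if every register in `a` is also in `b`
// (models what hard_reg_set_subset_p checks).
bool subset_p(const RegSet &a, const RegSet &b) {
  return (a & ~b).none();
}

// `all_sse` plays the role of reg_class_contents[ALL_SSE_REGS];
// `clobbered_256` plays the role of abi.mode_clobbers (V4DImode).
Avx128Mode mode_needed_after_call(const RegSet &all_sse,
                                  const RegSet &clobbered_256) {
  // If some SSE register survives the call in a 256-bit mode, its upper
  // half may be live, so the upper halves must not be assumed clean.
  if (!subset_p(all_sse, clobbered_256))
    return AVX_U128_ANY;
  return AVX_U128_CLEAN;
}
```

With this model, a callee that clobbers all SSE registers yields
AVX_U128_CLEAN, while one that preserves even a single register yields
AVX_U128_ANY.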
