public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Undelivered Mail Returned to Sender
@ 2021-08-10  8:40 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10  8:40 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx601.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 7886 bytes --]

From: Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Jason Merrill <jason@redhat.com>
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] c++: Fix ICE on defaulted spaceship with pointer return type [PR94162]
Date: Tue, 10 Aug 2021 10:39:19 +0200
Message-ID: <20210810083919.GH2380545@tucnak>

Hi!

The spaceship-synth-neg6.C testcase ICEs because we call cat_tag_for
on the explicit return type, but pointer types don't have
TYPE_LINKAGE_IDENTIFIER.  The patch fixes that.
Or should I be checking for if (!CLASS_TYPE_P (type)) return cc_last;
instead (are class type guaranteed to have TYPE_LINKAGE_IDENTIFIER?)?
I also wonder if after finding a match we shouldn't verify if is
a class type in std namespace (i.e. that
TYPE_NAME (TYPE_MAIN_VARIANT (type)) is a TYPE_DECL and
decl_in_std_namespace_p (TYPE_NAME (TYPE_MAIN_VARIANT (type)))
because it seems nothing prevents it from returning non-cc_last say on
namespace N {
  struct partial_ordering {};
}
etc.

The g++.dg/cpp2a/spaceship-synth11.C testcase is from a PR that has been
fixed with r12-619-gfc178519771db508c03611cff4a1466cf67fce1d (but
not backported to 11).

Bootstrapped/regtested on x86_64-linux and i686-linux.

2021-08-10  Jakub Jelinek  <jakub@redhat.com>

gcc/cp/
	PR c++/94162
	* method.c (cat_tag_for): Return cc_last for types with no
	linkage identifier.
gcc/testsuite/
	PR c++/99429
	* g++.dg/cpp2a/spaceship-synth11.C: New test.

	PR c++/94162
	* g++.dg/cpp2a/spaceship-synth-neg6.C: New test.

--- gcc/cp/method.c.jj	2021-06-25 10:36:22.169019953 +0200
+++ gcc/cp/method.c	2021-08-09 12:26:38.590166006 +0200
@@ -1029,6 +1029,8 @@ is_cat (tree type, comp_cat_tag tag)
 static comp_cat_tag
 cat_tag_for (tree type)
 {
+  if (!TYPE_LINKAGE_IDENTIFIER (type))
+    return cc_last;
   for (int i = 0; i < cc_last; ++i)
     {
       comp_cat_tag tag = (comp_cat_tag)i;
--- gcc/testsuite/g++.dg/cpp2a/spaceship-synth11.C.jj	2021-08-09 12:28:58.748240310 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-synth11.C	2021-08-09 12:29:44.023618250 +0200
@@ -0,0 +1,29 @@
+// PR c++/99429
+// { dg-do compile { target c++20 } }
+
+namespace std {
+struct strong_ordering {
+  int _v;
+  constexpr strong_ordering (int v) :_v(v) {}
+  constexpr operator int (void) const { return _v; }
+  static const strong_ordering less;
+  static const strong_ordering equal;
+  static const strong_ordering greater;
+};
+constexpr strong_ordering strong_ordering::less = -1;
+constexpr strong_ordering strong_ordering::equal = 0;
+constexpr strong_ordering strong_ordering::greater = 1;
+}
+
+template <unsigned long N>
+struct duration {
+  static constexpr const long period = N;
+  constexpr duration (void) = default;
+  constexpr duration (const duration& d) = default;
+  constexpr bool operator== (const duration& d) const = default;
+  constexpr bool operator<=> (const duration& d) const = default;
+  long _d;
+};
+
+using nanoseconds = duration<1>;
+using microseconds = duration<nanoseconds::period * 1000>;
--- gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg6.C.jj	2021-08-09 12:31:47.411922957 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg6.C	2021-08-09 12:35:26.995906403 +0200
@@ -0,0 +1,11 @@
+// PR c++/94162
+// { dg-do compile { target c++20 } }
+
+#include <compare>
+
+struct S {
+  int a;			// { dg-error "three-way comparison of 'S::a' has type 'std::strong_ordering', which does not convert to 'int\\*'" }
+  int *operator<=>(const S&) const = default;
+};
+
+bool b = S{} < S{};		// { dg-error "use of deleted function 'constexpr int\\* S::operator<=>\\\(const S&\\\) const'" }

	Jakub



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=1a4c.61123b70.e4e73.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=%5BPATCH%5D+c%2B%2B%3A+Fix+ICE+on+defaulted+spaceship+with+pointer+return+type+%5BPR94162%5D&verdict=C&c=1a81d13ef5acc37fb1148ec14fcd34221032f1c5

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 15:03 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 15:03 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx308.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 15858 bytes --]

From: Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Richard Biener <rguenther@suse.de>
Cc: Jakub Jelinek <jakub@redhat.com>, Nick Alcock via Gcc-patches <gcc-patches@gcc.gnu.org>, Kees Cook <keescook@chromium.org>
Subject: Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc
Date: Tue, 10 Aug 2021 15:02:33 +0000
Message-ID: <B3B236CC-A596-4A2E-A5B8-F0DE70D265D6@oracle.com>

Hi,

> On Aug 10, 2021, at 9:16 AM, Richard Biener <rguenther@suse.de> wrote:
> 
> On Tue, 10 Aug 2021, Qing Zhao wrote:
> 
>>>>> 
>>>>> +static void
>>>>> +expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>>>>> +{
>>>>> +  tree var = gimple_call_lhs (stmt);
>>>>> +  tree size_of_var = gimple_call_arg (stmt, 0);
>>>>> +  tree vlaaddr = NULL_TREE;
>>>>> +  tree var_type = TREE_TYPE (var);
>>>>> +  bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2));
>>>>> +  enum auto_init_type init_type
>>>>> +    = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
>>>>> +
>>>>> +  gcc_assert (init_type > AUTO_INIT_UNINITIALIZED);
>>>>> +
>>>>> +  /* if this variable is a VLA, get its SIZE and ADDR first.  */
>>>>> +  if (is_vla)
>>>>> +    {
>>>>> +      /* The temporary address variable for this vla should have been
>>>>> +        created during gimplification phase.  Refer to gimplify_vla_decl
>>>>> +        for details.  */
>>>>> +      tree var_decl = (TREE_CODE (var) == SSA_NAME) ?
>>>>> +                      SSA_NAME_VAR (var) : var;
>>>>> +      gcc_assert (DECL_HAS_VALUE_EXPR_P (var_decl));
>>>>> +      gcc_assert (TREE_CODE (DECL_VALUE_EXPR (var_decl)) == 
>>>>> INDIRECT_REF);
>>>>> +      /* Get the address of this vla variable.  */
>>>>> +      vlaaddr = TREE_OPERAND (DECL_VALUE_EXPR (var_decl), 0);
>>>>> 
>>>>> err - isn't the address of the decl represented by the LHS 
>>>>> regardless whether this is a VLA or not?
>>>> 
>>>> The LHS of the call to .DEFERRED_INIT is the DECL itself whatever it’s a VLA or not. 
>>>> 
>>>> In order to create a memset call, we need the Address of this DECL as the first argument. 
>>>> If the DECL is not a VLA, we just simply apply “build_fold_addr_expr” on this DECL to get its address,
>>>> However, for VLA, during gimplification phase “gimplify_vla_decl”, we have already created a temporary
>>>> address variable for this DECL, and recorded this address variable with “DECL_VALUE_EXPR(DECL), 
>>>> We should use this already created address variable  for VLAs. 
>>> 
>>> So the issue is that the LHS of the .DEFERRED_INIT call is not properly
>>> gimplified.  We should not have such decl there but I see we do not
>>> have IL verification that covers this.
>> 
>> Don’t quite understand here:  do you mean all the LHS of .DEFERRED_INIT call are not properly gimplified, or
>> Only the LHS of .DEFERRED_INIT call for VLA are not properly gimplified?
> 
> Especially in the VLA case but likely also in general (though unlikely
> since usually the receiver of initializations are simple enough).  I'd
> expect the VLA case end up as
> 
> *ptr_to_decl = .DEFERRED_INIT (...);
> 
> where *ptr_to_decl is the DECL_VALUE_EXPR of the decl.

So, for the following small testing case:

====
extern void bar (int);

void foo(int n)
{
  int arr[n];
  bar (arr[2]);
  return;
}
=====

If I compile it with -ftrivial-auto-var-init=zero -fdump-tree-gimple -S -o auto-init-11.s -fdump-rtl-expand, the *.gimple dump is:

=====
void foo (int n)
{
  int n.0;
  sizetype D.1950;
  bitsizetype D.1951;
  sizetype D.1952;
  bitsizetype D.1953;
  sizetype D.1954;
  int[0:D.1950] * arr.1;
  void * saved_stack.2;
  int arr[0:D.1950] [value-expr: *arr.1];

  saved_stack.2 = __builtin_stack_save ();
  try
    {
      n.0 = n;
      _1 = (long int) n.0;
      _2 = _1 + -1;
      _3 = (sizetype) _2;
      D.1950 = _3;
      _4 = (sizetype) n.0;
      _5 = (bitsizetype) _4;
      _6 = _5 * 32;
      D.1951 = _6;
      _7 = (sizetype) n.0;
      _8 = _7 * 4;
      D.1952 = _8;
      _9 = (sizetype) n.0;
      _10 = (bitsizetype) _9;
      _11 = _10 * 32;
      D.1953 = _11;
      _12 = (sizetype) n.0;
      _13 = _12 * 4;
      D.1954 = _13;
      arr.1 = __builtin_alloca_with_align (D.1954, 32);
      arr = .DEFERRED_INIT (D.1952, 2, 1);
      _14 = (*arr.1)[2];
      bar (_14);
      return;
    }
  finally
    {
      __builtin_stack_restore (saved_stack.2);
    }
}

====

You think that the above .DEFEERED_INIT is not correct?
It should be:

*arr.1 = .DEFERRED_INIT (D.1952. 2, 1);

?

> 
>> What do you mean by “such” decl? A decl whole “DECL_VALUE_EXPR(DECL)” is valid?
> 
> A 'decl' that has a DECL_VALUE_EXPR should not appear in the IL, it should
> always be refered to as its DECL_VALUE_EXPR.

Okay.

Qing
> 
> Richard.
> 
>> Qing
>>> 



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=3aeb.6112954c.96d85.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5Bpatch%5D%5Bversion+6%5D+add+-ftrivial-auto-var-init+and+variable+attribute+%22uninitialized%22+to+gcc&verdict=C&c=ca241e39373f82aaccceb76b5bae87c7715bd9b7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 14:48 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 14:48 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx408.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 475 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 13109 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Xionghu Luo <luoxhu@linux.ibm.com>
Cc: segher@kernel.crashing.org, wschmidt@linux.ibm.com, linkw@gcc.gnu.org, gcc-patches@gcc.gnu.org, hubicka@ucw.cz
Subject: Re: [PATCH] Fix loop split incorrect count and probability
Date: Tue, 10 Aug 2021 16:47:41 +0200 (CEST)
Message-ID: <nycvar.YFH.7.76.2108101634250.11781@zhemvz.fhfr.qr>

On Mon, 9 Aug 2021, Xionghu Luo wrote:

> Thanks,
> 
> On 2021/8/6 19:46, Richard Biener wrote:
> > On Tue, 3 Aug 2021, Xionghu Luo wrote:
> > 
> >> loop split condition is moved between loop1 and loop2, the split bb's
> >> count and probability should also be duplicated instead of (100% vs INV),
> >> secondly, the original loop1 and loop2 count need be propotional from the
> >> original loop.
> >>
> >>
> >> diff base/loop-cond-split-1.c.151t.lsplit  patched/loop-cond-split-1.c.151t.lsplit:
> >> ...
> >>     int prephitmp_16;
> >>     int prephitmp_25;
> >>
> >>     <bb 2> [local count: 118111600]:
> >>     if (n_7(D) > 0)
> >>       goto <bb 4>; [89.00%]
> >>     else
> >>       goto <bb 3>; [11.00%]
> >>
> >>     <bb 3> [local count: 118111600]:
> >>     return;
> >>
> >>     <bb 4> [local count: 105119324]:
> >>     pretmp_3 = ga;
> >>
> >> -  <bb 5> [local count: 955630225]:
> >> +  <bb 5> [local count: 315357973]:
> >>     # i_13 = PHI <i_10(20), 0(4)>
> >>     # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)>
> >>     if (prephitmp_12 != 0)
> >>       goto <bb 6>; [33.00%]
> >>     else
> >>       goto <bb 7>; [67.00%]
> >>
> >> -  <bb 6> [local count: 315357972]:
> >> +  <bb 6> [local count: 104068130]:
> >>     _2 = do_something ();
> >>     ga = _2;
> >>
> >> -  <bb 7> [local count: 955630225]:
> >> +  <bb 7> [local count: 315357973]:
> >>     # prephitmp_5 = PHI <prephitmp_12(5), _2(6)>
> >>     i_10 = inc (i_13);
> >>     if (n_7(D) > i_10)
> >>       goto <bb 21>; [89.00%]
> >>     else
> >>       goto <bb 11>; [11.00%]
> >>
> >>     <bb 11> [local count: 105119324]:
> >>     goto <bb 3>; [100.00%]
> >>
> >> -  <bb 21> [local count: 850510901]:
> >> +  <bb 21> [local count: 280668596]:
> >>     if (prephitmp_12 != 0)
> >> -    goto <bb 20>; [100.00%]
> >> +    goto <bb 20>; [33.00%]
> >>     else
> >> -    goto <bb 19>; [INV]
> >> +    goto <bb 19>; [67.00%]
> >>
> >> -  <bb 20> [local count: 850510901]:
> >> +  <bb 20> [local count: 280668596]:
> >>     goto <bb 5>; [100.00%]
> >>
> >> -  <bb 19> [count: 0]:
> >> +  <bb 19> [local count: 70429947]:
> >>     # i_23 = PHI <i_10(21)>
> >>     # prephitmp_25 = PHI <prephitmp_5(21)>
> >>
> >> -  <bb 12> [local count: 955630225]:
> >> +  <bb 12> [local count: 640272252]:
> >>     # i_15 = PHI <i_23(19), i_22(16)>
> >>     # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)>
> >>     i_22 = inc (i_15);
> >>     if (n_7(D) > i_22)
> >>       goto <bb 16>; [89.00%]
> >>     else
> >>       goto <bb 11>; [11.00%]
> >>
> >> -  <bb 16> [local count: 850510901]:
> >> +  <bb 16> [local count: 569842305]:
> >>     goto <bb 12>; [100.00%]
> >>
> >>   }
> >>
> >> gcc/ChangeLog:
> >>
> >> 	* tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
> >> 	(do_split_loop_on_cond): Likewise.
> >> ---
> >>   gcc/tree-ssa-loop-split.c | 16 ++++++++--------
> >>   1 file changed, 8 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> >> index 3a09bbc39e5..8e5a7ded0f7 100644
> >> --- a/gcc/tree-ssa-loop-split.c
> >> +++ b/gcc/tree-ssa-loop-split.c
> >> @@ -583,10 +583,10 @@ split_loop (class loop *loop1)
> >>   	basic_block cond_bb;
> 
>  	if (!initial_true)
> -	  cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); 
> +	  cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
> +
> +	edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE
> +			   ? EDGE_SUCC (bbs[i], 0)
> +			   : EDGE_SUCC (bbs[i], 1);
> 
> >>   
> >>   	class loop *loop2 = loop_version (loop1, cond, &cond_bb,
> >> -					   profile_probability::always (),
> >> -					   profile_probability::always (),
> >> -					   profile_probability::always (),
> >> -					   profile_probability::always (),
> >> +					   true_edge->probability,
> >> +					   true_edge->probability.invert (),
> >> +					   true_edge->probability,
> >> +					   true_edge->probability.invert (),
> >>   					   true);
> > 
> > there is no 'true_edge' variable at this point.
> 
> Sorry, missed the above hunk when split the patch. 
> 
> > 
> >>   	gcc_assert (loop2);
> >>   
> >> @@ -1486,10 +1486,10 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch)
> >>     initialize_original_copy_tables ();
> >>   
> >>     struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
> >> -				     profile_probability::always (),
> >> -				     profile_probability::never (),
> >> -				     profile_probability::always (),
> >> -				     profile_probability::always (),
> >> +				     invar_branch->probability.invert (),
> >> +				     invar_branch->probability,
> >> +				     invar_branch->probability.invert (),
> >> +				     invar_branch->probability,
> >>   				     true);
> >>     if (!loop2)
> >>       {
> > 
> > The patch introduction seems to talk about do_split_loop_on_cond only.
> 
> split_loop faces similar issue though it sets the two branches to 100% vs 100%
> and no scaling which seems also incorrect.
> 
> > Since loop versioning inserts a condition with the passed probabilities
> > but in this case a 'boolean_true_node' condition the then and else
> > probabilities passed look correct.  It's just the scaling arguments
> > that look wrong?  This loop_version call should get a comment as to
> > why we are passing probabilities the way we do.
> 
> This optimization is transforming from:
> 
> for (i = 0; i < n; i = inc (i))
>     {
>       if (ga)
>         ga = do_something ();
> }
> 
> to: 
> 
>   for (i = 0; i < x; i = inc (i))
> {
>     if (true)
>          ga = do_something ();
>         if (!ga)
>           break;
> }
>   for (; i < n; i = inc (i))
> {
>     if (false)
>          ga = do_something ();
> }
> 
> 
> `boolean_true_node` is passed in to make the first loop's condition
> statement to be 'true', after returning from loop_version, there is a
> piece of code forcing the condition in second loop to be false,
> and the original condition is moved from *in loop* to *exit edge*
> between loop1 and loop2. 

Yes, one complication is that we use loop_version but do not retain
the CFG it creates.  Something like the vectorizers
slpeel_tree_duplicate_loop_to_edge_cfg would be a better "fit"
but then that code doesn't do any scaling yet.  But then
loop_version uses cfg_hook_duplicate_loop_to_header_edge and I suppose
we could write a variant that simply doesn't mangle the CFG
with a new condition switching between both loops but keeps them
executed after each other ...

As said, this adds to the confusion and some awkwardness.

> @fxue may have inputs about this since he contributed it years ago.
> 
> > 
> > It does seem that scaling the loop by the invar_branch probability
> > is correct.  Since this does similar things to unswitching, I see
> > that unswitching does
> > 
> > prob_true = edge_true->probability;
> > loop_version (loop, unshare_expr (cond),
> >                         NULL, prob_true,
> >                         prob_true.invert (),
> >                         prob_true, prob_true.invert (),
> >                         false);
> > 
> > which maybe suggests that your invar_branch based passing should
> > depend on 'true_invar'?
> > 
> > Also compared to unswitching the first loop is always entered,
> > so I wonder if the scaling is correct with respect to that
> > given unswitching where only ever one loop is entered?
> 
> 
> Scaling is based on the probability calculated on "if (ga)", if it is
> 33% vs 67% before split, then it is reasonable to scale loop1 to 33%
> and loop2 to 67% suppose the branch probability is correct enough?
> 
> unswitch also scaled the two loops based on prob_true, if prob_true
> is 50%, it decreases X's count to HALF unexpectedly since it should be
> executed on both branches?  while loop split still kept execute both first
> loop and second loop, it seems even more accurate than loop unswitching?

I just was saying that both doing exactly the same looks wrong (on
either side).

> tree-ssa-loop-unswitch.c:
> 
>    while (A)
>      {
>        if (inv)
>          B;
> 
>        X;
> 
>        if (!inv)
> 	 C;
>      }
> 
>    where inv is the loop invariant, into
> 
>    if (inv)
>      {
>        while (A)
> 	 {
>            B;
> 	   X;
> 	 }
>      }
>    else
>      {
>        while (A)
> 	 {
> 	   X;
> 	   C;
> 	 }
>      }

Yes, here scaling based on the if (inv) probability looks obviously
100% correct to me.  But I might be wrong.  And thus the
splitting case must be still off (so I conclude).  Hmm, in fact
I think for the loop unswitching case the scaling of the B and
C blocks is off?  Those should be considered always executed.
Note the loop unswitching pass is altering the conditions
condition to static true/false but I don't think it performs
further adjustments.

That said, likely the profile update cannot be done uniformly
for all blocks of a loop?

Richard.


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=4755.611291c9.bbe20.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH%5D+Fix+loop+split+incorrect+count+and+probability&verdict=C&c=5b848bab159a481116cc4a7f585423cde878a78a

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 14:26 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 14:26 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx408.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 475 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 6191 bytes --]

From: Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [COMMITTED] PR tree-optimization/101741 - Ensure toupper and tolower follow the expected pattern.
Date: Tue, 10 Aug 2021 16:25:37 +0200
Message-ID: <20210810142537.GL2380545@tucnak>

On Tue, Aug 10, 2021 at 04:14:15PM +0200, Richard Biener via Gcc-patches wrote:
> OK, so gimple_builtin_call_types_compatible_p only checks that the
> call is consistent with the fndecl type - iff the declaration is incompatible
> with the declaration as specified by builtins.def then that's of course
> not detected by that check (this check is to catch cases where a
> formerly indirect call through an incompatible type is now known to
> be to a builtin).
> 
> IIRC that is a recurring issue and indeed my opinion is that frontends
> should not mark function decls as BUILT_IN if the definition/declaration
> signature is not compatible.

Different FEs use different strictness for what is or is not compatible.
And we can't be too strict, because e.g. of FILE * arguments where the
FILE type isn't pre-declared for the builtins.
On severe mismatches I think the FEs already don't mark it built in
(say double vs. int etc.), and e.g. const char * vs. char * differences
should be allowed (e.g. consider C++ vs. C strchr).
But perhaps we should consider as incompatible somethng that doesn't
pass useless_type_conversion_p too...

	Jakub



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=2002.61128ca5.85f80.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BCOMMITTED%5D+PR+tree-optimization%2F101741+-+Ensure+toupper+and+tolower+follow+the+expected+pattern.&verdict=C&c=faf1b8beb7208e3d8bcbc95892f060f0216217d6

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 14:18 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 14:18 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx306.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 18125 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Qing Zhao <qing.zhao@oracle.com>
Cc: Jakub Jelinek <jakub@redhat.com>, Nick Alcock via Gcc-patches <gcc-patches@gcc.gnu.org>, Kees Cook <keescook@chromium.org>
Subject: Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc
Date: Tue, 10 Aug 2021 16:16:58 +0200 (CEST)
Message-ID: <nycvar.YFH.7.76.2108101614340.11781@zhemvz.fhfr.qr>

On Tue, 10 Aug 2021, Qing Zhao wrote:

> 
> 
> > On Aug 10, 2021, at 2:36 AM, Richard Biener <rguenther@suse.de> wrote:
> > 
> > On Mon, 9 Aug 2021, Qing Zhao wrote:
> > 
> >> Hi, Richard,
> >> 
> >> Thanks a lot for you review.
> >> 
> >> Although these comments are not made on the latest patch (7th version) :-), all the comments are valid since the parts you commented
> >> are not changed in the 7th version.
> >> 
> >> 
> >>> On Aug 9, 2021, at 9:09 AM, Richard Biener <rguenther@suse.de> wrote:
> >>> 
> >>> On Tue, 27 Jul 2021, Qing Zhao wrote:
> >>> 
> >>>> Hi,
> >>>> 
> >>>> This is the 6th version of the patch for the new security feature for GCC.
> >>>> 
> >>>> I have tested it with bootstrap on both x86 and aarch64, regression testing on both x86 and aarch64.
> >>>> Also compile CPU2017 (running is ongoing), without any issue. (With the fix to bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101586).
> >>>> 
> >>>> Please take a look and let me know any issue.
> >>> 
> >>> +/* Handle an "uninitialized" attribute; arguments as in
> >>> +   struct attribute_spec.handler.  */
> >>> +
> >>> +static tree
> >>> +handle_uninitialized_attribute (tree *node, tree name, tree ARG_UNUSED 
> >>> (args),
> >>> +                               int ARG_UNUSED (flags), bool 
> >>> *no_add_attrs)
> >>> +{
> >>> +  if (!VAR_P (*node))
> >>> +    {
> >>> +      warning (OPT_Wattributes, "%qE attribute ignored", name);
> >>> +      *no_add_attrs = true;
> >>> +    }
> >>> 
> >>> you are documenting this attribute for automatic variables but
> >>> here you allow placement on globals as well (not sure if at this
> >>> point TREE_STATIC / DECL_EXTERNAL are set correctly).
> >> 
> >> Right, I should warn when the attribute is placed for globals or static variables. 
> >> I will try TREE_STATIC/DECL_EXTERNAL to see whether it’s work or not.
> >> 
> >>> 
> >>> +  /* for languages that do not support BUILT_IN_CLEAR_PADDING, create the
> >>> +     function node for padding initialization.  */
> >>> +  if (!fn)
> >>> +    {
> >>> +      tree ftype = build_function_type_list (void_type_node,
> >>> +                                            ptr_type_node,
> >>> 
> >>> the "appropriate" place to do this would be 
> >>> tree.c:build_common_builtin_nodes
> >> 
> >> Sure, will move the creation of  function node of BUILT_IN_CLEAR_PADDING for Fortran etc. to tree.c:build_common_builtin_nodes.
> >> 
> >>> 
> >>> You seem to marshall the is_vla argument as for_auto_init when
> >>> expanding/folding the builtin and there it's used to suppress
> >>> diagnostics (and make covered pieces not initialized?).
> >> 
> >> Yes, I added an extra argument “for_auto_init” for “BUILT_IN_CLEAR_PADDING”, this argument is added to suppress errors emitted during folding
> >> BUILT_IN_CLEAR_PADDING for flexible array member . Such errors should Not be emitted when “BUILT_IN_CLEAR_PADDING” is called with compiler automatic initialization.
> >> Please see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101586, comment #6 from Jakub Jelinek.
> >> 
> >>> I suggest
> >>> to re-name is_vla/for_auto_init to something more descriptive.
> >> 
> >> Okay, I will. 
> >>> 
> >>> +   gimple_fold_builtin_clear_padding. If FOR_AUTO_INIT,
> >>> +   not emit some of the error messages since doing that
> >>> +   might confuse the end user.  */
> >>> 
> >>> doesn't explain to me whether errors still might be raised or
> >>> what the actual behavior is.
> >> 
> >> Okay, will make this more clear in the comments.
> >> 
> >>> 
> >>> +static gimple *
> >>> +build_deferred_init (tree decl,
> >>> +                    enum auto_init_type init_type,
> >>> +                    bool is_vla)
> >>> +{
> >>> +  gcc_assert ((is_vla && TREE_CODE (decl) == WITH_SIZE_EXPR)
> >>> +             || (!is_vla && TREE_CODE (decl) != WITH_SIZE_EXPR));
> >>> 
> >>> so the is_vla parameter looks redundant (and the assert dangerous?).
> >>> Either the caller knows it deals with a VLA, then that should be
> >>> passed through - constant sizes can also later appear during
> >>> optimization after all - or is_vla should be determined here
> >>> based on whether the size at gimplification time is constant.
> >> 
> >> The routine “build_deferred_init” is ONLY called during gimplification phase by the routine “gimple_add_init_for_auto_var", at this place,
> >> Is_vla should be determined by the caller to check the size of the DECL. If it’s a vla, the “maybe_with_size_expr” will be applied for
> >> DECL to make it to a WITH_SIZE_EXPR.  So, the assertion is purely to make sure this at gimplification phase.
> >> 
> >> Yes, the size of the VLA decl might become a constant later due to constant propagation, etc.  but during the gimplification phase, the assertion should be true.
> >>> 
> >>> +         /* If the user requests to initialize automatic variables, we
> >>> +            should initialize paddings inside the variable. Add a call to
> >>> +            __BUILTIN_CLEAR_PADDING (&object, 0, for_auto_init = true) to
> >>> +            initialize paddings of object always to zero regardless of
> >>> +            INIT_TYPE.  */
> >>> +         if (opt_for_fn (current_function_decl, flag_auto_var_init)
> >>> +               > AUTO_INIT_UNINITIALIZED
> >>> +             && VAR_P (object)
> >>> +             && !DECL_EXTERNAL (object)
> >>> +             && !TREE_STATIC (object))
> >>> +           gimple_add_padding_init_for_auto_var (object, false, pre_p);
> >>> +         return ret;
> >>> 
> >>> I think you want to use either auto_var_p (object) or
> >>> auto_var_in_fn_p (object, current_function_decl).  Don't you also
> >>> want to check for the 'uninitialized' attribute here?  I suggest
> >>> to abstract the check on whether 'object' should be subject
> >>> to autoinit to a helper function.
> >> 
> >> Thanks for the suggestion, I will do this.
> >> 
> >> 
> >>> 
> >>> There's another path above this calling gimplify_init_constructor
> >>> for the case of
> >>> 
> >>> const struct S x = { ... };
> >>> struct S y = x;
> >>> 
> >>> where it will try to init 'y' from the CTOR directly, it seems you
> >>> do not cover this case.
> >> 
> >> Yes, you are right, this case was not covered right now, and this should be covered.
> >> 
> >> Looks like that I need to move the “gimple_add_padding_init_for_auto_var” inside the routine “gimplify_init_constructor” to
> >> Cover all the cases. 
> >> 
> >>> I also think that the above place applies
> >>> to all aggregate assignment statements, not only to INIT_EXPRs?
> >> 
> >>> So don't you want to restrict clear-padding emit here?
> >> 
> >> You are right, I might need to restrict it Only to INIT_EXPR. 
> >> Will update.
> >> 
> >>> 
> >>> +static void
> >>> +expand_DEFERRED_INIT (internal_fn, gcall *stmt)
> >>> +{
> >>> +  tree var = gimple_call_lhs (stmt);
> >>> +  tree size_of_var = gimple_call_arg (stmt, 0);
> >>> +  tree vlaaddr = NULL_TREE;
> >>> +  tree var_type = TREE_TYPE (var);
> >>> +  bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2));
> >>> +  enum auto_init_type init_type
> >>> +    = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
> >>> +
> >>> +  gcc_assert (init_type > AUTO_INIT_UNINITIALIZED);
> >>> +
> >>> +  /* if this variable is a VLA, get its SIZE and ADDR first.  */
> >>> +  if (is_vla)
> >>> +    {
> >>> +      /* The temporary address variable for this vla should have been
> >>> +        created during gimplification phase.  Refer to gimplify_vla_decl
> >>> +        for details.  */
> >>> +      tree var_decl = (TREE_CODE (var) == SSA_NAME) ?
> >>> +                      SSA_NAME_VAR (var) : var;
> >>> +      gcc_assert (DECL_HAS_VALUE_EXPR_P (var_decl));
> >>> +      gcc_assert (TREE_CODE (DECL_VALUE_EXPR (var_decl)) == 
> >>> INDIRECT_REF);
> >>> +      /* Get the address of this vla variable.  */
> >>> +      vlaaddr = TREE_OPERAND (DECL_VALUE_EXPR (var_decl), 0);
> >>> 
> >>> err - isn't the address of the decl represented by the LHS 
> >>> regardless whether this is a VLA or not?
> >> 
> >> The LHS of the call to .DEFERRED_INIT is the DECL itself whatever it’s a VLA or not. 
> >> 
> >> In order to create a memset call, we need the Address of this DECL as the first argument. 
> >> If the DECL is not a VLA, we just simply apply “build_fold_addr_expr” on this DECL to get its address,
> >> However, for VLA, during gimplification phase “gimplify_vla_decl”, we have already created a temporary
> >> address variable for this DECL, and recorded this address variable with “DECL_VALUE_EXPR(DECL), 
> >> We should use this already created address variable  for VLAs. 
> > 
> > So the issue is that the LHS of the .DEFERRED_INIT call is not properly
> > gimplified.  We should not have such decl there but I see we do not
> > have IL verification that covers this.
> 
> Don’t quite understand here:  do you mean all the LHS of .DEFERRED_INIT call are not properly gimplified, or
> Only the LHS of .DEFERRED_INIT call for VLA are not properly gimplified?

Especially in the VLA case but likely also in general (though unlikely
since usually the receiver of initializations are simple enough).  I'd
expect the VLA case end up as

 *ptr_to_decl = .DEFERRED_INIT (...);

where *ptr_to_decl is the DECL_VALUE_EXPR of the decl.

> What do you mean by “such” decl? A decl whole “DECL_VALUE_EXPR(DECL)” is valid?

A 'decl' that has a DECL_VALUE_EXPR should not appear in the IL, it should
always be refered to as its DECL_VALUE_EXPR.

Richard.

> Qing
> > 
> > The gimplifier usually does this in gimplify_var_or_parm_decl,
> > but you can of course substitute DECL_VALUE_EXPR yourself if the
> > decl was already gimplified (was it?)
> > 
> >> 
> >>> Looking at DECL_VALUE_EXPR
> >>> looks quite fragile since that's not sth data dependence honors.
> >>> It looks you only partly gimplify the build init here?  All
> >>> DECL_VALUE_EXPRs should have been resolved.
> >> 
> >> Don’t quite understand here. you mean that all the “DECL_VALUE_EXPRs” have been resolved at the phase RTL expansion,
> >> So I cannot use this to get the address variable of the VLA?
> >> 
> >> (However, my unit testing cases for VLAs are all looks fine).
> >> 
> >>> 
> >>> +  if (is_vla || (!use_register_for_decl (var)))
> >>> ...
> >>> +  else
> >>> +    {
> >>> +    /* If this variable is in a register, use expand_assignment might
> >>> +       generate better code.  */
> >>> 
> >>> you compute the patter initializer even when not needing it,
> >>> that's wasteful.
> >> 
> >> Okay, I will restrict the pattern initializer computation when really needed. 
> >> 
> >>> It's also quite ugly, IMHO you should
> >>> use can_native_interpret_type_p (var_type) and native_interpret
> >>> a char [] array initialized to the pattern and if
> >>> !can_native_interpret_type_p () go the memset route.
> >> 
> >> Thanks for the suggestion. 
> >> 
> >> Will try this. 
> >> 
> >>> 
> >>> +  /* We will not verify the arguments for the calls to .DEFERRED_INIT.
> >>> +     Such call is not a real call, just a placeholder for a later
> >>> +     initialization during expand phase.
> >>> +     This is mainly to avoid assertion failure for the following
> >>> +     case:
> >>> +
> >>> +     uni_var = .DEFERRED_INIT (var_size, INIT_TYPE, is_vla);
> >>> +     foo (&uni_var);
> >>> +
> >>> +     in the above, the uninitialized auto variable "uni_var" is
> >>> +     addressable, therefore should not be in registers, resulting
> >>> +     the assertion failure in the following argument verification.  */
> >>> +  if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT))
> >>> +    return false;
> >>> +
> >>>  /* ???  The C frontend passes unpromoted arguments in case it
> >>>     didn't see a function declaration before the call.  So for now
> >>>     leave the call arguments mostly unverified.  Once we gimplify
> >>>     unit-at-a-time we have a chance to fix this.  */
> >>> 
> >>> -  for (i = 0; i < gimple_call_num_args (stmt); ++i)
> >>> 
> >>> isn't that from the time there was a decl argument to .DEFERRED_INIT?
> >> 
> >> You mean this issue is only there when the decl is the first argument (the old design for .DEFERRED_INIT).
> >> With the new design, this issue is not there anymore?
> > 
> > I think so, yes - the change should no longer be needed.
> > 
> > Ricahrd.
> > 
> >>> 
> >>> +  if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT))
> >>> +    {
> >>> +      tree size_of_arg0 = gimple_call_arg (stmt, 0);
> >>> +      tree size_of_lhs = TYPE_SIZE_UNIT (TREE_TYPE (lhs));
> >>> +      tree is_vla_node = gimple_call_arg (stmt, 2);
> >>> +      bool is_vla = (bool) TREE_INT_CST_LOW (is_vla_node);
> >>> +
> >>> +      if (TREE_CODE (lhs) == SSA_NAME)
> >>> +       lhs = SSA_NAME_VAR (lhs);
> >>> +
> >>> 
> >>> 'lhs' is not looked at after this, no need to look at SSA_NAME_VAR.
> >> 
> >> Okay, will update this.
> >> 
> >>> 
> >>> 
> >>> Thanks and sorry for the delay in reviewing this (again).
> >> 
> >> Thanks again for your detailed review and suggestions.
> >> 
> >> I will update the patch accordingly and send the updated patch soon.
> >> 
> >> Qing
> >>> 
> >>> Richard.
> >>> 
> >>> 
> >>>> Thanks
> >>>> 
> >> 
> >> 
> > 
> > -- 
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=5351.61128a96.8a679.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5Bpatch%5D%5Bversion+6%5D+add+-ftrivial-auto-var-init+and+variable+attribute+%22uninitialized%22+to+gcc&verdict=C&c=7133febf0f12e6fe353fbf385079f94e929770f7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 14:15 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 14:15 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx409.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 9188 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Andrew MacLeod <amacleod@redhat.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [COMMITTED] PR tree-optimization/101741 - Ensure toupper and tolower follow the expected pattern.
Date: Tue, 10 Aug 2021 16:14:15 +0200
Message-ID: <CAFiYyc3Beh8achiPWvS0Lnpx-ecfLZtM0326AV+fTbCnOrjSkA@mail.gmail.com>

On Tue, Aug 10, 2021 at 3:21 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 8/10/21 3:45 AM, Richard Biener wrote:
> > On Mon, Aug 9, 2021 at 10:31 PM Andrew MacLeod via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >> The user has overridden the function name "toupper" . Its marked as a
> >> builtin function, presumably because it matches the name.  In range
> >> folding, we were assuming the LHS and the parameter were compatible
> >> types...  but they are not in this case..
> >>
> >> I don't know if we should be marking such a thing as a builtin function,
> >> but regardless.. we shouldn't be trapping.  If the type of the argument
> >> is not compatible with  the LHS, then we'll simply assume nothing about
> >> the function.
> >>
> >> Bootstraps with no regression on x86_64-pc-linux-gnu.  pushed.
> > I wonder why the gimple_call_combined_fn verification
> > using gimple_builtin_call_types_compatible_p isn't enough here?
> > Yes, it's matching is a bit lazy, but only with respect to promotion
> > of arguments to 'int'.
> >
> > Richard.
> >
> I set a breakpoint on the return type field for the builtin... A quick
> check shows the return type of the builtin is being changed to "unsigned
> int" here:

OK, so gimple_builtin_call_types_compatible_p only checks that the
call is consistent with the fndecl type - iff the declaration is incompatible
with the declaration as specified by builtins.def then that's of course
not detected by that check (this check is to catch cases where a
formerly indirect call through an incompatible type is now known to
be to a builtin).

IIRC that is a recurring issue and indeed my opinion is that frontends
should not mark function decls as BUILT_IN if the definition/declaration
signature is not compatible.

> #0  merge_decls (newdecl=0x7fffe9f67100, olddecl=0x7fffe9ed2100,
> newtype=0x7fffe9e3ae70, oldtype=0x7fffe9e3ae70) at
> /gcc/master/gcc/gcc/c/c-decl.c:2598
> #1  0x0000000000a0628d in duplicate_decls (newdecl=0x7fffe9f67100,
> olddecl=0x7fffe9ed2100) at /gcc/master/gcc/gcc/c/c-decl.c:2963
> #2  0x0000000000a07464 in pushdecl (x=0x7fffe9f67100) at
> /gcc/master/gcc/gcc/c/c-decl.c:3256
> #3  0x0000000000a1d113 in start_function (declspecs=0x37728b0,
> declarator=0x3772ae0, attributes=0x0) at /gcc/master/gcc/gcc/c/c-decl.c:9644
> #4  0x0000000000a8667c in c_parser_declaration_or_fndef
> (parser=0x7ffff7ff7ab0, fndef_ok=true, static_assert_ok=true,
> empty_ok=true, nested=false, start_attr_ok=true,
> objc_foreach_object_declaration=0x0,
>      omp_declare_simd_clauses=0x0, have_attrs=false, attrs=0x0,
> oacc_routine_data=0x0, fallthru_attr_p=0x0) at
> /gcc/master/gcc/gcc/c/c-parser.c:2444
>
>
> It would appear that merge_decls is merging the olddecl (the bultin)
> with newdecl  and results in changing the LHS to unsigned int, so now it
> thinks the builtin matches.   eeeks.
>
> I don't know what the correct thing to do is, but a quick hack to check
> if old_decl is a builtin and return false from duplicate_decl also seems
> to resolve the problem:

Yeah, but that's of course too harsh - we do want correct user declarations of
say 'malloc' to be classified as built-in.

The odd thing is that we merge int foo() and unsigned foo () using
composite_type.  diagnose_mismatched_decls should be made
more strict here although it explicitly mentions

          /* Accept "harmless" mismatches in function types such
             as missing qualifiers or int vs long when they're the same
             size.  However, diagnose return and argument types that are
             incompatible according to language rules.  */

whatever 'language rules' means here.

Anyway, this function is where we should avoid making newdecl built-in
(and we should of course not adjust olddecl either).

Richard.

> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> index 221a67fe57b..27ab6ac9f1a 100644
> --- a/gcc/c/c-decl.c
> +++ b/gcc/c/c-decl.c
> @@ -2960,6 +2960,9 @@ duplicate_decls (tree newdecl, tree olddecl)
>         return false;
>       }
>
> +  if (DECL_BUILT_IN_CLASS (olddecl) != NOT_BUILT_IN)
> +    return false;
> +
>     merge_decls (newdecl, olddecl, newtype, oldtype);
>
>     /* The NEWDECL will no longer be needed.
>
>


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=14d8.611289f6.1f517.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BCOMMITTED%5D+PR+tree-optimization%2F101741+-+Ensure+toupper+and+tolower+follow+the+expected+pattern.&verdict=C&c=3114d549f7a496e61d413636ba42fe6103f489ac

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 13:51 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 13:51 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx306.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 8370 bytes --]

From: Bill Schmidt via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: gcc-patches@gcc.gnu.org, willschm@linux.ibm.com, dje.gcc@gmail.com
Subject: Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
Date: Tue, 10 Aug 2021 08:49:28 -0500
Message-ID: <8101fae8-058a-b27f-6abb-dab445fd6c0e@linux.ibm.com>


On 8/10/21 8:40 AM, Segher Boessenkool wrote:
> On Tue, Aug 10, 2021 at 08:02:24AM -0500, Bill Schmidt wrote:
>> The whole point is that this data type is only used for interfaces, as
>> shown in the example code.  Nobody wants to define const void as
>> anything.  The const serves only as a contract that the pointed-to
>> object, no matter what it is cast to, will not be modified.
> So it is just documentation, nothing to do with overloading?  Any cast
> (implicit as well!) will give new qualifiers, not just a new type.  So I
> still do not see the point here.
>
> I'll just read it as "void *" :-)


Largely documentational, yes.  The overloads must be defined with "const 
unsigned char *" and so forth.  It would be unexpected to define the 
built-in that this maps to as "void *" rather than "const void *".  
Normally passing a "const unsigned char *" to a function requiring a 
"const void *" can be done implicitly with no cast at all, and so this 
is what people expect to see.  "Under the covers" we can of course cast 
in any way that we see fit, but specifying "const void *" really 
reinforces what people should understand is going on.

If it makes you feel better to read it as "void *", I say go for it. 
:-)  I think most people will be less confused with "const" present in 
the signature in both the built-in definition and the overload 
definition, not just in one of them.

Bill

>
>
> Segher


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=762e.61128440.f4204.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH+03%2F34%5D+rs6000%3A+Add+the+rest+of+the+%5Baltivec%5D+stanza+to+the+builtins+file&verdict=C&c=a5b68e50745a40904759329912479562a5ab8c0b

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 13:41 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 13:41 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx601.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 23895 bytes --]

From: Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Richard Biener <rguenther@suse.de>
Cc: Jakub Jelinek <jakub@redhat.com>, Nick Alcock via Gcc-patches <gcc-patches@gcc.gnu.org>, Kees Cook <keescook@chromium.org>
Subject: Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc
Date: Tue, 10 Aug 2021 13:39:59 +0000
Message-ID: <F9BC037B-9B5B-4268-BB90-0901D285424C@oracle.com>



> On Aug 10, 2021, at 2:36 AM, Richard Biener <rguenther@suse.de> wrote:
> 
> On Mon, 9 Aug 2021, Qing Zhao wrote:
> 
>> Hi, Richard,
>> 
>> Thanks a lot for you review.
>> 
>> Although these comments are not made on the latest patch (7th version) :-), all the comments are valid since the parts you commented
>> are not changed in the 7th version.
>> 
>> 
>>> On Aug 9, 2021, at 9:09 AM, Richard Biener <rguenther@suse.de> wrote:
>>> 
>>> On Tue, 27 Jul 2021, Qing Zhao wrote:
>>> 
>>>> Hi,
>>>> 
>>>> This is the 6th version of the patch for the new security feature for GCC.
>>>> 
>>>> I have tested it with bootstrap on both x86 and aarch64, regression testing on both x86 and aarch64.
>>>> Also compile CPU2017 (running is ongoing), without any issue. (With the fix to bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101586).
>>>> 
>>>> Please take a look and let me know any issue.
>>> 
>>> +/* Handle an "uninitialized" attribute; arguments as in
>>> +   struct attribute_spec.handler.  */
>>> +
>>> +static tree
>>> +handle_uninitialized_attribute (tree *node, tree name, tree ARG_UNUSED 
>>> (args),
>>> +                               int ARG_UNUSED (flags), bool 
>>> *no_add_attrs)
>>> +{
>>> +  if (!VAR_P (*node))
>>> +    {
>>> +      warning (OPT_Wattributes, "%qE attribute ignored", name);
>>> +      *no_add_attrs = true;
>>> +    }
>>> 
>>> you are documenting this attribute for automatic variables but
>>> here you allow placement on globals as well (not sure if at this
>>> point TREE_STATIC / DECL_EXTERNAL are set correctly).
>> 
>> Right, I should warn when the attribute is placed for globals or static variables. 
>> I will try TREE_STATIC/DECL_EXTERNAL to see whether it’s work or not.
>> 
>>> 
>>> +  /* for languages that do not support BUILT_IN_CLEAR_PADDING, create the
>>> +     function node for padding initialization.  */
>>> +  if (!fn)
>>> +    {
>>> +      tree ftype = build_function_type_list (void_type_node,
>>> +                                            ptr_type_node,
>>> 
>>> the "appropriate" place to do this would be 
>>> tree.c:build_common_builtin_nodes
>> 
>> Sure, will move the creation of  function node of BUILT_IN_CLEAR_PADDING for Fortran etc. to tree.c:build_common_builtin_nodes.
>> 
>>> 
>>> You seem to marshall the is_vla argument as for_auto_init when
>>> expanding/folding the builtin and there it's used to suppress
>>> diagnostics (and make covered pieces not initialized?).
>> 
>> Yes, I added an extra argument “for_auto_init” for “BUILT_IN_CLEAR_PADDING”, this argument is added to suppress errors emitted during folding
>> BUILT_IN_CLEAR_PADDING for flexible array member . Such errors should Not be emitted when “BUILT_IN_CLEAR_PADDING” is called with compiler automatic initialization.
>> Please see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101586, comment #6 from Jakub Jelinek.
>> 
>>> I suggest
>>> to re-name is_vla/for_auto_init to something more descriptive.
>> 
>> Okay, I will. 
>>> 
>>> +   gimple_fold_builtin_clear_padding. If FOR_AUTO_INIT,
>>> +   not emit some of the error messages since doing that
>>> +   might confuse the end user.  */
>>> 
>>> doesn't explain to me whether errors still might be raised or
>>> what the actual behavior is.
>> 
>> Okay, will make this more clear in the comments.
>> 
>>> 
>>> +static gimple *
>>> +build_deferred_init (tree decl,
>>> +                    enum auto_init_type init_type,
>>> +                    bool is_vla)
>>> +{
>>> +  gcc_assert ((is_vla && TREE_CODE (decl) == WITH_SIZE_EXPR)
>>> +             || (!is_vla && TREE_CODE (decl) != WITH_SIZE_EXPR));
>>> 
>>> so the is_vla parameter looks redundant (and the assert dangerous?).
>>> Either the caller knows it deals with a VLA, then that should be
>>> passed through - constant sizes can also later appear during
>>> optimization after all - or is_vla should be determined here
>>> based on whether the size at gimplification time is constant.
>> 
>> The routine “build_deferred_init” is ONLY called during gimplification phase by the routine “gimple_add_init_for_auto_var", at this place,
>> Is_vla should be determined by the caller to check the size of the DECL. If it’s a vla, the “maybe_with_size_expr” will be applied for
>> DECL to make it to a WITH_SIZE_EXPR.  So, the assertion is purely to make sure this at gimplification phase.
>> 
>> Yes, the size of the VLA decl might become a constant later due to constant propagation, etc.  but during the gimplification phase, the assertion should be true.
>>> 
>>> +         /* If the user requests to initialize automatic variables, we
>>> +            should initialize paddings inside the variable. Add a call to
>>> +            __BUILTIN_CLEAR_PADDING (&object, 0, for_auto_init = true) to
>>> +            initialize paddings of object always to zero regardless of
>>> +            INIT_TYPE.  */
>>> +         if (opt_for_fn (current_function_decl, flag_auto_var_init)
>>> +               > AUTO_INIT_UNINITIALIZED
>>> +             && VAR_P (object)
>>> +             && !DECL_EXTERNAL (object)
>>> +             && !TREE_STATIC (object))
>>> +           gimple_add_padding_init_for_auto_var (object, false, pre_p);
>>> +         return ret;
>>> 
>>> I think you want to use either auto_var_p (object) or
>>> auto_var_in_fn_p (object, current_function_decl).  Don't you also
>>> want to check for the 'uninitialized' attribute here?  I suggest
>>> to abstract the check on whether 'object' should be subject
>>> to autoinit to a helper function.
>> 
>> Thanks for the suggestion, I will do this.
>> 
>> 
>>> 
>>> There's another path above this calling gimplify_init_constructor
>>> for the case of
>>> 
>>> const struct S x = { ... };
>>> struct S y = x;
>>> 
>>> where it will try to init 'y' from the CTOR directly, it seems you
>>> do not cover this case.
>> 
>> Yes, you are right, this case was not covered right now, and this should be covered.
>> 
>> Looks like that I need to move the “gimple_add_padding_init_for_auto_var” inside the routine “gimplify_init_constructor” to
>> Cover all the cases. 
>> 
>>> I also think that the above place applies
>>> to all aggregate assignment statements, not only to INIT_EXPRs?
>> 
>>> So don't you want to restrict clear-padding emit here?
>> 
>> You are right, I might need to restrict it Only to INIT_EXPR. 
>> Will update.
>> 
>>> 
>>> +static void
>>> +expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>>> +{
>>> +  tree var = gimple_call_lhs (stmt);
>>> +  tree size_of_var = gimple_call_arg (stmt, 0);
>>> +  tree vlaaddr = NULL_TREE;
>>> +  tree var_type = TREE_TYPE (var);
>>> +  bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2));
>>> +  enum auto_init_type init_type
>>> +    = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
>>> +
>>> +  gcc_assert (init_type > AUTO_INIT_UNINITIALIZED);
>>> +
>>> +  /* if this variable is a VLA, get its SIZE and ADDR first.  */
>>> +  if (is_vla)
>>> +    {
>>> +      /* The temporary address variable for this vla should have been
>>> +        created during gimplification phase.  Refer to gimplify_vla_decl
>>> +        for details.  */
>>> +      tree var_decl = (TREE_CODE (var) == SSA_NAME) ?
>>> +                      SSA_NAME_VAR (var) : var;
>>> +      gcc_assert (DECL_HAS_VALUE_EXPR_P (var_decl));
>>> +      gcc_assert (TREE_CODE (DECL_VALUE_EXPR (var_decl)) == 
>>> INDIRECT_REF);
>>> +      /* Get the address of this vla variable.  */
>>> +      vlaaddr = TREE_OPERAND (DECL_VALUE_EXPR (var_decl), 0);
>>> 
>>> err - isn't the address of the decl represented by the LHS 
>>> regardless whether this is a VLA or not?
>> 
>> The LHS of the call to .DEFERRED_INIT is the DECL itself whatever it’s a VLA or not. 
>> 
>> In order to create a memset call, we need the Address of this DECL as the first argument. 
>> If the DECL is not a VLA, we just simply apply “build_fold_addr_expr” on this DECL to get its address,
>> However, for VLA, during gimplification phase “gimplify_vla_decl”, we have already created a temporary
>> address variable for this DECL, and recorded this address variable with “DECL_VALUE_EXPR(DECL), 
>> We should use this already created address variable  for VLAs. 
> 
> So the issue is that the LHS of the .DEFERRED_INIT call is not properly
> gimplified.  We should not have such decl there but I see we do not
> have IL verification that covers this.

Don’t quite understand here:  do you mean all the LHS of .DEFERRED_INIT call are not properly gimplified, or
Only the LHS of .DEFERRED_INIT call for VLA are not properly gimplified?

What do you mean by “such” decl? A decl whole “DECL_VALUE_EXPR(DECL)” is valid?

Qing
> 
> The gimplifier usually does this in gimplify_var_or_parm_decl,
> but you can of course substitute DECL_VALUE_EXPR yourself if the
> decl was already gimplified (was it?)
> 
>> 
>>> Looking at DECL_VALUE_EXPR
>>> looks quite fragile since that's not sth data dependence honors.
>>> It looks you only partly gimplify the build init here?  All
>>> DECL_VALUE_EXPRs should have been resolved.
>> 
>> Don’t quite understand here. you mean that all the “DECL_VALUE_EXPRs” have been resolved at the phase RTL expansion,
>> So I cannot use this to get the address variable of the VLA?
>> 
>> (However, my unit testing cases for VLAs are all looks fine).
>> 
>>> 
>>> +  if (is_vla || (!use_register_for_decl (var)))
>>> ...
>>> +  else
>>> +    {
>>> +    /* If this variable is in a register, use expand_assignment might
>>> +       generate better code.  */
>>> 
>>> you compute the patter initializer even when not needing it,
>>> that's wasteful.
>> 
>> Okay, I will restrict the pattern initializer computation when really needed. 
>> 
>>> It's also quite ugly, IMHO you should
>>> use can_native_interpret_type_p (var_type) and native_interpret
>>> a char [] array initialized to the pattern and if
>>> !can_native_interpret_type_p () go the memset route.
>> 
>> Thanks for the suggestion. 
>> 
>> Will try this. 
>> 
>>> 
>>> +  /* We will not verify the arguments for the calls to .DEFERRED_INIT.
>>> +     Such call is not a real call, just a placeholder for a later
>>> +     initialization during expand phase.
>>> +     This is mainly to avoid assertion failure for the following
>>> +     case:
>>> +
>>> +     uni_var = .DEFERRED_INIT (var_size, INIT_TYPE, is_vla);
>>> +     foo (&uni_var);
>>> +
>>> +     in the above, the uninitialized auto variable "uni_var" is
>>> +     addressable, therefore should not be in registers, resulting
>>> +     the assertion failure in the following argument verification.  */
>>> +  if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT))
>>> +    return false;
>>> +
>>>  /* ???  The C frontend passes unpromoted arguments in case it
>>>     didn't see a function declaration before the call.  So for now
>>>     leave the call arguments mostly unverified.  Once we gimplify
>>>     unit-at-a-time we have a chance to fix this.  */
>>> 
>>> -  for (i = 0; i < gimple_call_num_args (stmt); ++i)
>>> 
>>> isn't that from the time there was a decl argument to .DEFERRED_INIT?
>> 
>> You mean this issue is only there when the decl is the first argument (the old design for .DEFERRED_INIT).
>> With the new design, this issue is not there anymore?
> 
> I think so, yes - the change should no longer be needed.
> 
> Ricahrd.
> 
>>> 
>>> +  if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT))
>>> +    {
>>> +      tree size_of_arg0 = gimple_call_arg (stmt, 0);
>>> +      tree size_of_lhs = TYPE_SIZE_UNIT (TREE_TYPE (lhs));
>>> +      tree is_vla_node = gimple_call_arg (stmt, 2);
>>> +      bool is_vla = (bool) TREE_INT_CST_LOW (is_vla_node);
>>> +
>>> +      if (TREE_CODE (lhs) == SSA_NAME)
>>> +       lhs = SSA_NAME_VAR (lhs);
>>> +
>>> 
>>> 'lhs' is not looked at after this, no need to look at SSA_NAME_VAR.
>> 
>> Okay, will update this.
>> 
>>> 
>>> 
>>> Thanks and sorry for the delay in reviewing this (again).
>> 
>> Thanks again for your detailed review and suggestions.
>> 
>> I will update the patch accordingly and send the updated patch soon.
>> 
>> Qing
>>> 
>>> Richard.
>>> 
>>> 
>>>> Thanks
>>>> 
>> 
>> 
> 
> -- 
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=e970.611281f2.451f7.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5Bpatch%5D%5Bversion+6%5D+add+-ftrivial-auto-var-init+and+variable+attribute+%22uninitialized%22+to+gcc&verdict=C&c=ff7e02c1d13060bd65f1917000dcdda9be2d25f8

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 13:22 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 13:22 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx302.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 475 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 8678 bytes --]

From: Andrew MacLeod via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [COMMITTED] PR tree-optimization/101741 - Ensure toupper and tolower follow the expected pattern.
Date: Tue, 10 Aug 2021 09:21:36 -0400
Message-ID: <448e095e-cce5-a29a-ce51-4053fda5e596@redhat.com>

On 8/10/21 3:45 AM, Richard Biener wrote:
> On Mon, Aug 9, 2021 at 10:31 PM Andrew MacLeod via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> The user has overridden the function name "toupper" . Its marked as a
>> builtin function, presumably because it matches the name.  In range
>> folding, we were assuming the LHS and the parameter were compatible
>> types...  but they are not in this case..
>>
>> I don't know if we should be marking such a thing as a builtin function,
>> but regardless.. we shouldn't be trapping.  If the type of the argument
>> is not compatible with  the LHS, then we'll simply assume nothing about
>> the function.
>>
>> Bootstraps with no regression on x86_64-pc-linux-gnu.  pushed.
> I wonder why the gimple_call_combined_fn verification
> using gimple_builtin_call_types_compatible_p isn't enough here?
> Yes, it's matching is a bit lazy, but only with respect to promotion
> of arguments to 'int'.
>
> Richard.
>
I set a breakpoint on the return type field for the builtin... A quick 
check shows the return type of the builtin is being changed to "unsigned 
int" here:

#0  merge_decls (newdecl=0x7fffe9f67100, olddecl=0x7fffe9ed2100, 
newtype=0x7fffe9e3ae70, oldtype=0x7fffe9e3ae70) at 
/gcc/master/gcc/gcc/c/c-decl.c:2598
#1  0x0000000000a0628d in duplicate_decls (newdecl=0x7fffe9f67100, 
olddecl=0x7fffe9ed2100) at /gcc/master/gcc/gcc/c/c-decl.c:2963
#2  0x0000000000a07464 in pushdecl (x=0x7fffe9f67100) at 
/gcc/master/gcc/gcc/c/c-decl.c:3256
#3  0x0000000000a1d113 in start_function (declspecs=0x37728b0, 
declarator=0x3772ae0, attributes=0x0) at /gcc/master/gcc/gcc/c/c-decl.c:9644
#4  0x0000000000a8667c in c_parser_declaration_or_fndef 
(parser=0x7ffff7ff7ab0, fndef_ok=true, static_assert_ok=true, 
empty_ok=true, nested=false, start_attr_ok=true, 
objc_foreach_object_declaration=0x0,
     omp_declare_simd_clauses=0x0, have_attrs=false, attrs=0x0, 
oacc_routine_data=0x0, fallthru_attr_p=0x0) at 
/gcc/master/gcc/gcc/c/c-parser.c:2444


It would appear that merge_decls is merging the olddecl (the bultin) 
with newdecl  and results in changing the LHS to unsigned int, so now it 
thinks the builtin matches.   eeeks.

I don't know what the correct thing to do is, but a quick hack to check 
if old_decl is a builtin and return false from duplicate_decl also seems 
to resolve the problem:

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 221a67fe57b..27ab6ac9f1a 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -2960,6 +2960,9 @@ duplicate_decls (tree newdecl, tree olddecl)
        return false;
      }

+  if (DECL_BUILT_IN_CLASS (olddecl) != NOT_BUILT_IN)
+    return false;
+
    merge_decls (newdecl, olddecl, newtype, oldtype);

    /* The NEWDECL will no longer be needed.




To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=14e68.61127dac.234a0.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BCOMMITTED%5D+PR+tree-optimization%2F101741+-+Ensure+toupper+and+tolower+follow+the+expected+pattern.&verdict=C&c=3fea457da484485891727e6a67eb7e2bee797a0c

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 13:04 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 13:04 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx302.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 475 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 8985 bytes --]

From: Bill Schmidt via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: gcc-patches@gcc.gnu.org, willschm@linux.ibm.com, dje.gcc@gmail.com
Subject: Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
Date: Tue, 10 Aug 2021 08:02:24 -0500
Message-ID: <031c85bf-a022-a2e2-ac76-4030c6241cc8@linux.ibm.com>

On 8/10/21 7:48 AM, Segher Boessenkool wrote:
> On Tue, Aug 10, 2021 at 07:17:54AM -0500, Bill Schmidt wrote:
>> On 8/9/21 6:44 PM, Segher Boessenkool wrote:
>>> This is not a documented GCC extension either, and it might even
>>> conflict with the existing void * extension (allowing arithmetic on it,
>>> by defining sizeof(void)).  In either case it is not currently defined.
>>>
>> I'm not sure how you get to this, but all we're doing here is standard C.
> Arithmetic on void* is the GCC extension.  sizeof(void) is 1 as GCC
> extension, instead of being undefined.  Pointer arithmetic is only
> defined for arrays of the type being pointed to, and you cannot have an
> array of void.  You can do this as GCC extension though, it behaves as
> if it was a char* instead.
>
>> x.c:
>>
>> char
>> foo (const void *x)
>> {
>>    const char *y = (const char *) x;
>>    return *y;
>> }
> And this behaves exactly the same if you do s/const void/void/ .  The
> const qualifier is meaningless on things of type void, since you cannot
> have an lvalue of that type anyway.  And all type qualifiers can be cast
> away (or cast into existence).
>
>> y.c:
>>
>> void
>> foo (const void *x, char c)
>> {
>>    const char *y = (const char *) x;
>>    *y = c;
>> }
>>
>> wschmidt@rain6p1:~/src$ gcc -c x.c
>> wschmidt@rain6p1:~/src$ gcc -c y.c
>> y.c: In function 'foo':
>> y.c:5:6: error: assignment of read-only location '*y'
>>     *y = c;
>>        ^
> Yes, *y is an lvalue.  *x is not: *x is an error.
>
>
> It *is* allowed to have a "const void", but it means exactly the same as
> just "void" (you cannot assign to either!)  And, they are compatible
> types, too, (they are the *same* type in fact!), so if you ever would
> treat them differently it would be mightily confusing :-)


The whole point is that this data type is only used for interfaces, as 
shown in the example code.  Nobody wants to define const void as 
anything.  The const serves only as a contract that the pointed-to 
object, no matter what it is cast to, will not be modified.

I think you're over-thinking this. :-)

Bill

>
>
> Segher


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=15eb8.61127946.1743d.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH+03%2F34%5D+rs6000%3A+Add+the+rest+of+the+%5Baltivec%5D+stanza+to+the+builtins+file&verdict=C&c=cf58a577e641bc42ccd2fb0be89ced1517e6257c

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 12:34 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 12:34 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx405.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 5700 bytes --]

From: "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Enable gcc.target/i386/pr88531-1a.c for all targets
Date: Tue, 10 Aug 2021 05:33:11 -0700
Message-ID: <20210810123311.1657066-1-hjl.tools@gmail.com>

	PR tree-optimization/101809
	* gcc.target/i386/pr88531-1a.c: Enable for all targets.
---
 gcc/testsuite/gcc.target/i386/pr88531-1a.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr88531-1a.c b/gcc/testsuite/gcc.target/i386/pr88531-1a.c
index d1c29b2c990..5e4f28e9b17 100644
--- a/gcc/testsuite/gcc.target/i386/pr88531-1a.c
+++ b/gcc/testsuite/gcc.target/i386/pr88531-1a.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target lp64 } } */
+/* { dg-do compile } */
 /* { dg-options "-O3 -march=x86-64 -mfpmath=sse" } */
 
 #include <stdint.h>
-- 
2.31.1



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=153a.61127260.35533.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=%5BPATCH%5D+Enable+gcc.target%2Fi386%2Fpr88531-1a.c+for+all+targets&verdict=C&c=cd4808a25716a2952ccf6a20ac79e9d529a95275

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 12:18 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 12:18 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx304.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 473 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 7507 bytes --]

From: Bill Schmidt via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: gcc-patches@gcc.gnu.org, willschm@linux.ibm.com, dje.gcc@gmail.com
Subject: Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
Date: Tue, 10 Aug 2021 07:17:54 -0500
Message-ID: <188d0a99-4382-39cd-fb4e-3ccacd63c010@linux.ibm.com>


On 8/9/21 6:44 PM, Segher Boessenkool wrote:
>
>
> This is not a documented GCC extension either, and it might even
> conflict with the existing void * extension (allowing arithmetic on it,
> by defining sizeof(void)).  In either case it is not currently defined.
>
>
I'm not sure how you get to this, but all we're doing here is standard C.

x.c:

char
foo (const void *x)
{
   const char *y = (const char *) x;
   return *y;
}

y.c:

void
foo (const void *x, char c)
{
   const char *y = (const char *) x;
   *y = c;
}

wschmidt@rain6p1:~/src$ gcc -c x.c
wschmidt@rain6p1:~/src$ gcc -c y.c
y.c: In function 'foo':
y.c:5:6: error: assignment of read-only location '*y'
    *y = c;
       ^
wschmidt@rain6p1:~/src$



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=12e60.61126ea8.8c7d6.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH+03%2F34%5D+rs6000%3A+Add+the+rest+of+the+%5Baltivec%5D+stanza+to+the+builtins+file&verdict=C&c=5e37a7c97b9bae99e1a38ee5b94ae38ddee5b217

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 12:14 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 12:14 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx601.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 8307 bytes --]

From: liuhongt via Gcc-patches <gcc-patches@gcc.gnu.org>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Extend ldexp{s, d}f3 to vscalefs{s, d} when TARGET_AVX512F and TARGET_SSE_MATH.
Date: Tue, 10 Aug 2021 20:13:15 +0800
Message-ID: <20210810121315.3409758-1-hongtao.liu@intel.com>

Hi:
  AVX512F supported vscalefs{s,d} which is the same as ldexp except the second operand should be floating point.
  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.

gcc/ChangeLog:

	PR target/98309
	* config/i386/i386.md (ldexp<mode>3): Extend to vscalefs[sd]
	when TARGET_AVX512F and TARGET_SSE_MATH.

gcc/testsuite/ChangeLog:

	PR target/98309
	* gcc.target/i386/pr98309-1.c: New test.
	* gcc.target/i386/pr98309-2.c: New test.
---
 gcc/config/i386/i386.md                   | 34 +++++++++++++++-----
 gcc/testsuite/gcc.target/i386/pr98309-1.c | 18 +++++++++++
 gcc/testsuite/gcc.target/i386/pr98309-2.c | 39 +++++++++++++++++++++++
 3 files changed, 83 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98309-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98309-2.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index bc1c30b77f4..56b09c566ed 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -17914,17 +17914,35 @@ (define_expand "ldexp<mode>3"
   [(use (match_operand:MODEF 0 "register_operand"))
    (use (match_operand:MODEF 1 "general_operand"))
    (use (match_operand:SI 2 "register_operand"))]
-  "TARGET_USE_FANCY_MATH_387
-   && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
-       || TARGET_MIX_SSE_I387)
+  "((TARGET_USE_FANCY_MATH_387
+     && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
+	 || TARGET_MIX_SSE_I387))
+    || (TARGET_AVX512F && TARGET_SSE_MATH))
    && flag_unsafe_math_optimizations"
 {
-  rtx op0 = gen_reg_rtx (XFmode);
-  rtx op1 = gen_reg_rtx (XFmode);
+  /* Prefer avx512f version.  */
+  if (TARGET_AVX512F && TARGET_SSE_MATH)
+   {
+     rtx op2 = gen_reg_rtx (<MODE>mode);
+     emit_insn (gen_floatsi<mode>2 (op2, operands[2]));
+     operands[0] = lowpart_subreg (<ssevecmodef>mode, operands[0], <MODE>mode);
+     if (MEM_P (operands[1]))
+       operands[1] = force_reg (<MODE>mode, operands[1]);
+     operands[1] = lowpart_subreg (<ssevecmodef>mode, operands[1], <MODE>mode);
+     op2 = lowpart_subreg (<ssevecmodef>mode, op2, <MODE>mode);
+     emit_insn (gen_avx512f_vmscalef<ssevecmodelower> (operands[0],
+						       operands[1],
+						       op2));
+   }
+  else
+    {
+      rtx op0 = gen_reg_rtx (XFmode);
+      rtx op1 = gen_reg_rtx (XFmode);
 
-  emit_insn (gen_extend<mode>xf2 (op1, operands[1]));
-  emit_insn (gen_ldexpxf3 (op0, op1, operands[2]));
-  emit_insn (gen_truncxf<mode>2 (operands[0], op0));
+      emit_insn (gen_extend<mode>xf2 (op1, operands[1]));
+      emit_insn (gen_ldexpxf3 (op0, op1, operands[2]));
+      emit_insn (gen_truncxf<mode>2 (operands[0], op0));
+  }
   DONE;
 })
 
diff --git a/gcc/testsuite/gcc.target/i386/pr98309-1.c b/gcc/testsuite/gcc.target/i386/pr98309-1.c
new file mode 100644
index 00000000000..3a7afb58971
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98309-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512f -O2 -mfpmath=sse -ffast-math" } */
+/* { dg-final { scan-assembler-times "vcvtsi2s\[sd\]" "2" } } */
+/* { dg-final { scan-assembler-times "vscalefs\[sd\]" "2" } } */
+
+double
+__attribute__((noipa))
+foo (double a, int b)
+{
+  return __builtin_ldexp (a, b);
+}
+
+float
+__attribute__((noipa))
+foo2 (float a, int b)
+{
+  return __builtin_ldexpf (a, b);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr98309-2.c b/gcc/testsuite/gcc.target/i386/pr98309-2.c
new file mode 100644
index 00000000000..ecfb9168b7d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98309-2.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-mavx512f -O2 -mfpmath=sse -ffast-math" } */
+/* { dg-require-effective-target avx512f } */
+
+#define AVX512F
+#ifndef CHECK
+#define CHECK "avx512f-helper.h"
+#endif
+
+#include CHECK
+
+#include "pr98309-1.c"
+
+double
+__attribute__((noipa, target("fpmath=387")))
+foo_i387 (double a, int b)
+{
+  return __builtin_ldexp (a, b);
+}
+
+float
+__attribute__((noipa, target("fpmath=387")))
+foo2_i387 (float a, int b)
+{
+  return __builtin_ldexpf (a, b);
+}
+
+static void
+test_512 (void)
+{
+  float fa = 14.5;
+  double da = 44.5;
+  int fb = 12;
+  int db = 8;
+  if (foo_i387 (da, db) != foo (da, db))
+    abort ();
+  if (foo2_i387 (fa, fb) != foo2 (fa, fb))
+    abort ();
+}
-- 
2.27.0



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=1c86.61126d9f.9f300.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=%5BPATCH%5D+Extend+ldexp%7Bs%2C+d%7Df3+to+vscalefs%7Bs%2C+d%7D+when+TARGET_AVX512F+and+TARGET_SSE_MATH.&verdict=C&c=c73a09cb6932b77b37051b3f521b4dbcf762b63e

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 12:02 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 12:02 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx408.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 475 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 12096 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Indu Bhagat <indu.bhagat@oracle.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH, V2 3/3] dwarf2out: Emit BTF in dwarf2out_finish for BPF CO-RE usecase
Date: Tue, 10 Aug 2021 14:00:40 +0200
Message-ID: <CAFiYyc29q65RM2oCckBw+a2trnZCa3N7J3bOx5Kv-09B6WXfWw@mail.gmail.com>

On Thu, Aug 5, 2021 at 2:53 AM Indu Bhagat via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> DWARF generation is split between early and late phases when LTO is in effect.
> This poses challenges for CTF/BTF generation especially if late debug info
> generation is desirable, as turns out to be the case for BPF CO-RE.
>
> In case of BPF CO-RE, the BPF backend adds information about CO-RE relocations
> to the CTF container. This information is what needs to be emitted as a
> separate .BTF.ext section when -more is in effect. Further, each CO-RE
> relocation record holds an offset to a string specifying the access to the
> structure's field. This means that .BTF string table needs to be modified
> "late" in the compilation process. In other words, this implies that the BTF
> sections cannot be finalized in dwarf2out_early_finish when -mcore for the BPF
> backend is in effect.

OK, it didn't really get any clearer as to why the late annotation
cannot be done
after the early info was output.  Something to do with the BTF string table,
but all structure field names must be already present there so I must be missing
something.

ISTR the CO-RE info is fully contained in a new section .BTF.ext and the
"early" .BTF section is not altered?

> Now, the emission of CTF/BTF debug info cannot be moved unconditionally to
> dwarf2out_finish because dwarf2out_finish is not invoked at all for the LTO
> compile phase for slim LTO objects, thus breaking CTF/BTF generation for other
> targets when used with LTO.
>
> The approach taken here in this patch is that -
>
> 1. LTO is disabled for BPF CO-RE
> The reason to disable LTO for BPF CO-RE is that if LTO is in effect, BPF CO-RE
> relocations need to be generated in the LTO link phase _after_ the optimizations
> are done. This means we need to devise way to combine early and late BTF. At
> this time, in absence of linker support for BTF sections, it makes sense to
> steer clear of LTO for BPF CO-RE and bypass the issue.
>
> 2. Use a target hook to allow BPF backend to cleanly convey the case when late
> finalization of the CTF container is desirable.
>
> So, in other words,
>
> dwarf2out_early_finish
>   - Always emit CTF here.
>   - if (BTF && ctfc_debuginfo_early_finish_p), emit BTF now.
>
> dwarf2out_finish
>   - if (BTF && !ctfc_debuginfo_early_finish_p && !in_lto_p) emit BTF now.
>   - Use of in_lto_p to make sure LTO link phase does not affect BTF sections
> for other targets.
>
> gcc/ChangeLog:
>
>         * dwarf2ctf.c (ctf_debug_finalize): Make it static.
>         (ctf_debug_early_finish): New definition.
>         (ctf_debug_finish): Likewise.
>         * dwarf2ctf.h (ctf_debug_finalize): Remove declaration.
>         (ctf_debug_early_finish): New declaration.
>         (ctf_debug_finish): Likewise.
>         * dwarf2out.c (dwarf2out_finish): Invoke ctf_debug_finish.
>         (dwarf2out_early_finish): Invoke ctf_debug_early_finish.
> ---
>  gcc/dwarf2ctf.c | 55 +++++++++++++++++++++++++++++++++++++++++++------------
>  gcc/dwarf2ctf.h |  4 +++-
>  gcc/dwarf2out.c |  9 +++++++--
>  3 files changed, 53 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/dwarf2ctf.c b/gcc/dwarf2ctf.c
> index 5e8a725..0fa429c 100644
> --- a/gcc/dwarf2ctf.c
> +++ b/gcc/dwarf2ctf.c
> @@ -917,6 +917,27 @@ gen_ctf_type (ctf_container_ref ctfc, dw_die_ref die)
>    return type_id;
>  }
>
> +/* Prepare for output and write out the CTF debug information.  */
> +
> +static void
> +ctf_debug_finalize (const char *filename, bool btf)
> +{
> +  if (btf)
> +    {
> +      btf_output (filename);
> +      btf_finalize ();
> +    }
> +
> +  else
> +    {
> +      /* Emit the collected CTF information.  */
> +      ctf_output (filename);
> +
> +      /* Reset the CTF state.  */
> +      ctf_finalize ();
> +    }
> +}
> +
>  bool
>  ctf_do_die (dw_die_ref die)
>  {
> @@ -966,25 +987,35 @@ ctf_debug_init_postprocess (bool btf)
>      btf_init_postprocess ();
>  }
>
> -/* Prepare for output and write out the CTF debug information.  */
> +/* Early finish CTF/BTF debug info.  */
>
>  void
> -ctf_debug_finalize (const char *filename, bool btf)
> +ctf_debug_early_finish (const char * filename)
>  {
> -  if (btf)
> +  /* Emit CTF debug info early always.  */
> +  if (ctf_debug_info_level > CTFINFO_LEVEL_NONE
> +      /* Emit BTF debug info early if the target does not require late
> +        emission.  */
> +       || (btf_debuginfo_p ()
> +          && targetm.ctfc_debuginfo_early_finish_p ()))
>      {
> -      btf_output (filename);
> -      btf_finalize ();
> +      /* Emit CTF/BTF debug info.  */
> +      ctf_debug_finalize (filename, btf_debuginfo_p ());
>      }
> +}
>
> -  else
> -    {
> -      /* Emit the collected CTF information.  */
> -      ctf_output (filename);
> +/* Finish CTF/BTF debug info emission.  */
>
> -      /* Reset the CTF state.  */
> -      ctf_finalize ();
> -    }
> +void
> +ctf_debug_finish (const char * filename)
> +{
> +  /* Emit BTF debug info here when the target needs to update the CTF container
> +     (ctfc) in the backend.  An example of this, at this time is the BPF CO-RE
> +     usecase.  */
> +  if (btf_debuginfo_p ()
> +      && (!in_lto_p && !targetm.ctfc_debuginfo_early_finish_p ()))
> +    /* Emit BTF debug info.  */
> +    ctf_debug_finalize (filename, btf_debuginfo_p ());
>  }
>
>  #include "gt-dwarf2ctf.h"
> diff --git a/gcc/dwarf2ctf.h b/gcc/dwarf2ctf.h
> index a3cf567..9edbde0 100644
> --- a/gcc/dwarf2ctf.h
> +++ b/gcc/dwarf2ctf.h
> @@ -24,13 +24,15 @@ along with GCC; see the file COPYING3.  If not see
>  #define GCC_DWARF2CTF_H 1
>
>  #include "dwarf2out.h"
> +#include "flags.h"
>
>  /* Debug Format Interface.  Used in dwarf2out.c.  */
>
>  extern void ctf_debug_init (void);
>  extern void ctf_debug_init_postprocess (bool);
>  extern bool ctf_do_die (dw_die_ref);
> -extern void ctf_debug_finalize (const char *, bool);
> +extern void ctf_debug_early_finish (const char *);
> +extern void ctf_debug_finish (const char *);
>
>  /* Wrappers for CTF/BTF to fetch information from GCC DWARF DIE.  Used in
>     ctfc.c.
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index b91a9b5..708cd1f 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -31895,6 +31895,11 @@ dwarf2out_finish (const char *filename)
>    unsigned char checksum[16];
>    char dl_section_ref[MAX_ARTIFICIAL_LABEL_BYTES];
>
> +  /* Generate CTF/BTF debug info.  */
> +  if ((ctf_debug_info_level > CTFINFO_LEVEL_NONE
> +       || btf_debuginfo_p ()) && lang_GNU_C ())
> +    ctf_debug_finish (filename);
> +
>    /* Skip emitting DWARF if not required.  */
>    if (!dwarf_debuginfo_p ())
>      return;
> @@ -32799,8 +32804,8 @@ dwarf2out_early_finish (const char *filename)
>         ctf_debug_do_cu (node->die);
>        /* Post process the debug data in the CTF container if necessary.  */
>        ctf_debug_init_postprocess (btf_debuginfo_p ());
> -      /* Emit CTF/BTF debug info.  */
> -      ctf_debug_finalize (filename, btf_debuginfo_p ());
> +
> +      ctf_debug_early_finish (filename);
>      }
>
>    /* Do not generate DWARF assembler now when not producing LTO bytecode.  */
> --
> 1.8.3.1
>


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=3bbb.61126ace.69948.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH%2C+V2+3%2F3%5D+dwarf2out%3A+Emit+BTF+in+dwarf2out_finish+for+BPF+CO-RE+usecase&verdict=C&c=9d35f940a066f745684173ff6cf52e498837d628

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 11:55 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 11:55 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx601.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 10313 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Indu Bhagat <indu.bhagat@oracle.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH, V2 2/3] targhooks: New target hook for CTF/BTF debug info emission
Date: Tue, 10 Aug 2021 13:54:27 +0200
Message-ID: <CAFiYyc3LoQQQwVE5QgGxD5TduV31JDbmd4Ek6Z+XzYsHPbSY8g@mail.gmail.com>

On Thu, Aug 5, 2021 at 2:52 AM Indu Bhagat via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This patch adds a new target hook to detect if the CTF container can allow the
> emission of CTF/BTF debug info at DWARF debug info early finish time. Some
> backends, e.g., BPF when generating code for CO-RE usecase, may need to emit
> the CTF/BTF debug info sections around the time when late DWARF debug is
> finalized (dwarf2out_finish).

Without looking at the dwarf2out.c usage in the next patch - I think
the CTF part
should be always emitted from dwarf2out_early_finish, the "hooks" should somehow
arrange for the alternate output specific data to be preserved until
dwarf2out_finish
time so the late BTF data can be emitted from there.

Lumping everything together now just makes it harder to see what info
is required
to persist and thus make LTO support more intrusive than necessary.

> gcc/ChangeLog:
>
>         * config/bpf/bpf.c (ctfc_debuginfo_early_finish_p): New definition.
>         (TARGET_CTFC_DEBUGINFO_EARLY_FINISH_P): Undefine and override.
>         * doc/tm.texi: Regenerated.
>         * doc/tm.texi.in: Document the new hook.
>         * target.def: Add a new hook.
>         * targhooks.c (default_ctfc_debuginfo_early_finish_p): Likewise.
>         * targhooks.h (default_ctfc_debuginfo_early_finish_p): Likewise.
> ---
>  gcc/config/bpf/bpf.c | 14 ++++++++++++++
>  gcc/doc/tm.texi      |  6 ++++++
>  gcc/doc/tm.texi.in   |  2 ++
>  gcc/target.def       | 10 ++++++++++
>  gcc/targhooks.c      |  6 ++++++
>  gcc/targhooks.h      |  2 ++
>  6 files changed, 40 insertions(+)
>
> diff --git a/gcc/config/bpf/bpf.c b/gcc/config/bpf/bpf.c
> index 028013e..85f6b76 100644
> --- a/gcc/config/bpf/bpf.c
> +++ b/gcc/config/bpf/bpf.c
> @@ -178,6 +178,20 @@ bpf_option_override (void)
>  #undef TARGET_OPTION_OVERRIDE
>  #define TARGET_OPTION_OVERRIDE bpf_option_override
>
> +/* Return FALSE iff -mcore has been specified.  */
> +
> +static bool
> +ctfc_debuginfo_early_finish_p (void)
> +{
> +  if (TARGET_BPF_CORE)
> +    return false;
> +  else
> +    return true;
> +}
> +
> +#undef TARGET_CTFC_DEBUGINFO_EARLY_FINISH_P
> +#define TARGET_CTFC_DEBUGINFO_EARLY_FINISH_P ctfc_debuginfo_early_finish_p
> +
>  /* Define target-specific CPP macros.  This function in used in the
>     definition of TARGET_CPU_CPP_BUILTINS in bpf.h */
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index cb01528..2d5ff05 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -10400,6 +10400,12 @@ Define this macro if GCC should produce debugging output in BTF debug
>  format in response to the @option{-gbtf} option.
>  @end defmac
>
> +@deftypefn {Target Hook} bool TARGET_CTFC_DEBUGINFO_EARLY_FINISH_P (void)
> +This target hook returns nonzero if the CTF Container can allow the
> + emission of the CTF/BTF debug info at the DWARF debuginfo early finish
> + time.
> +@end deftypefn
> +
>  @node Floating Point
>  @section Cross Compilation and Floating Point
>  @cindex cross compilation and floating point
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 4a522ae..05b3c2c 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -7020,6 +7020,8 @@ Define this macro if GCC should produce debugging output in BTF debug
>  format in response to the @option{-gbtf} option.
>  @end defmac
>
> +@hook TARGET_CTFC_DEBUGINFO_EARLY_FINISH_P
> +
>  @node Floating Point
>  @section Cross Compilation and Floating Point
>  @cindex cross compilation and floating point
> diff --git a/gcc/target.def b/gcc/target.def
> index 68a46aa..44e2251 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -4016,6 +4016,16 @@ clobbered parts of a register altering the frame register size",
>   machine_mode, (int regno),
>   default_dwarf_frame_reg_mode)
>
> +/* Return nonzero if CTF Container can finalize the CTF/BTF emission
> +   at DWARF debuginfo early finish time.  */
> +DEFHOOK
> +(ctfc_debuginfo_early_finish_p,
> + "This target hook returns nonzero if the CTF Container can allow the\n\
> + emission of the CTF/BTF debug info at the DWARF debuginfo early finish\n\
> + time.",
> + bool, (void),
> + default_ctfc_debuginfo_early_finish_p)
> +
>  /* If expand_builtin_init_dwarf_reg_sizes needs to fill in table
>     entries not corresponding directly to registers below
>     FIRST_PSEUDO_REGISTER, this hook should generate the necessary
> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> index eb51909..e38566c 100644
> --- a/gcc/targhooks.c
> +++ b/gcc/targhooks.c
> @@ -2112,6 +2112,12 @@ default_dwarf_frame_reg_mode (int regno)
>    return save_mode;
>  }
>
> +bool
> +default_ctfc_debuginfo_early_finish_p (void)
> +{
> +  return true;
> +}
> +
>  /* To be used by targets where reg_raw_mode doesn't return the right
>     mode for registers used in apply_builtin_return and apply_builtin_arg.  */
>
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index f92e102..55dc443 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -255,6 +255,8 @@ extern unsigned int default_dwarf_poly_indeterminate_value (unsigned int,
>                                                             unsigned int *,
>                                                             int *);
>  extern machine_mode default_dwarf_frame_reg_mode (int);
> +extern bool default_ctfc_debuginfo_early_finish_p (void);
> +
>  extern fixed_size_mode default_get_reg_raw_mode (int);
>  extern bool default_keep_leaf_when_profiled ();
>
> --
> 1.8.3.1
>


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=507b.6112693a.61520.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH%2C+V2+2%2F3%5D+targhooks%3A+New+target+hook+for+CTF%2FBTF+debug+info+emission&verdict=C&c=a36df4fb92b96a9b7c8cc1efb06de9a37b8da7a7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 11:52 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 11:52 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx302.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 475 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 8176 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Indu Bhagat <indu.bhagat@oracle.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH,V2 1/3] bpf: Add new -mcore option for BPF CO-RE
Date: Tue, 10 Aug 2021 13:51:32 +0200
Message-ID: <CAFiYyc3-oTk+c8FtKQkKWWZwLGrQgOTvsbHeKME9apE4uPQCdQ@mail.gmail.com>

On Thu, Aug 5, 2021 at 2:54 AM Indu Bhagat via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> -mcore in the BPF backend enables code generation for the CO-RE usecase. LTO is
> disabled for CO-RE compilations.

-mcore reads like "core", why not -mco-re?  Anyway, ...

> gcc/ChangeLog:
>
>         * config/bpf/bpf.c (bpf_option_override): For BPF backend, disable LTO
>         support when compiling for CO-RE.
>         * config/bpf/bpf.opt: Add new command line option -mcore.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/bpf/core-lto-1.c: New test.
> ---
>  gcc/config/bpf/bpf.c                      | 15 +++++++++++++++
>  gcc/config/bpf/bpf.opt                    |  4 ++++
>  gcc/testsuite/gcc.target/bpf/core-lto-1.c |  9 +++++++++
>  3 files changed, 28 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/bpf/core-lto-1.c
>
> diff --git a/gcc/config/bpf/bpf.c b/gcc/config/bpf/bpf.c
> index e635f9e..028013e 100644
> --- a/gcc/config/bpf/bpf.c
> +++ b/gcc/config/bpf/bpf.c
> @@ -158,6 +158,21 @@ bpf_option_override (void)
>  {
>    /* Set the initializer for the per-function status structure.  */
>    init_machine_status = bpf_init_machine_status;
> +
> +  /* To support the portability needs of BPF CO-RE approach, BTF debug
> +     information includes the BPF CO-RE relocations.  The information
> +     necessary for these relocations is added to the CTF container by the
> +     BPF backend.  Enabling LTO poses challenges in the generation of the BPF
> +     CO-RE relocations because if LTO is in effect, they need to be
> +     generated late in the LTO link phase.  This in turn means the compiler
> +     needs to provide means to combine the early and late BTF debug info,
> +     similar to DWARF debug info.
> +
> +     In any case, in absence of linker support for BTF sections at this time,
> +     it is acceptable to simply disallow LTO for BPF CO-RE compilations.  */
> +
> +  if (flag_lto && TARGET_BPF_CORE)
> +    error ("BPF CO-RE does not support LTO");

... these "errors" should use sorry ("...") which annotate places where the
compiler has to give up because sth is not implemented.

otherwise this patch needs BPF maintainer review of course.

Richard.

>  }
>
>  #undef TARGET_OPTION_OVERRIDE
> diff --git a/gcc/config/bpf/bpf.opt b/gcc/config/bpf/bpf.opt
> index 916b53c..e8926f5 100644
> --- a/gcc/config/bpf/bpf.opt
> +++ b/gcc/config/bpf/bpf.opt
> @@ -127,3 +127,7 @@ Generate little-endian eBPF.
>  mframe-limit=
>  Target Joined RejectNegative UInteger IntegerRange(0, 32767) Var(bpf_frame_limit) Init(512)
>  Set a hard limit for the size of each stack frame, in bytes.
> +
> +mcore
> +Target Mask(BPF_CORE)
> +Generate all necessary information for BPF Compile Once - Run Everywhere.
> diff --git a/gcc/testsuite/gcc.target/bpf/core-lto-1.c b/gcc/testsuite/gcc.target/bpf/core-lto-1.c
> new file mode 100644
> index 0000000..a90dc5b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/bpf/core-lto-1.c
> @@ -0,0 +1,9 @@
> +/* Test -mcore with -flto.
> +
> +   -mcore is used to generate information for BPF CO-RE usecase. To support
> +   the generataion of the .BTF and .BTF.ext sections in GCC, -flto is disabled
> +   with -mcore.  */
> +
> +/* { dg-do compile } */
> +/* { dg-error "BPF CO-RE does not support LTO" "" { target bpf-*-* } 0 } */
> +/* { dg-options "-gbtf -mcore -flto" } */
> --
> 1.8.3.1
>


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=fba2.6112688b.8382f.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH%2CV2+1%2F3%5D+bpf%3A+Add+new+-mcore+option+for+BPF+CO-RE&verdict=C&c=efcea6600b27d77b13aec32ad537dc019d3d10cd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 11:42 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 11:42 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx409.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 10258 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: gcc-patches@gcc.gnu.org
Cc: jakub@redhat.com, ebotcazou@adacore.com, fortran@gcc.gnu.org
Subject: [PATCH][v2] Adjust volatile handling of the operand scanner
Date: Tue, 10 Aug 2021 13:40:33 +0200 (CEST)
Message-ID: <50o5pr8r-q741-o8n0-9r49-q53o90o8675s@fhfr.qr>

The GIMPLE SSA operand scanner handles COMPONENT_REFs that are
not marked TREE_THIS_VOLATILE but have a TREE_THIS_VOLATILE
FIELD_DECL as volatile.  That's inconsistent in how TREE_THIS_VOLATILE
testing on GENERIC refs works which requires operand zero of
component references to mirror TREE_THIS_VOLATILE to the ref
so that testing TREE_THIS_VOLATILE on the outermost reference
is enough to determine the volatileness.

The following patch thus removes FIELD_DECL scanning from
the GIMPLE SSA operand scanner, possibly leaving fewer stmts
marked as gimple_has_volatile_ops.

It shows we miss at least one case in the fortran frontend, though
there's a suspicious amount of COMPONENT_REF creation compared
to little setting of TREE_THIS_VOLATILE.  This fixes the FAIL
of gfortran.dg/volatile11.f90 that would otherwise occur.

Visually inspecting fortran/ reveals a bunch of likely to fix
cases but I don't know the constraints of 'volatile' uses in
the fortran language to assess whether some of these are not
necessary.  The cases would be fixed with

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 0d013defdbb..5d708fe90aa 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -6983,9 +6983,10 @@ gfc_get_dataptr_offset (stmtblock_t *block, tree parm, tree desc, tree offset,
 	    case REF_COMPONENT:
 	      field = ref->u.c.component->backend_decl;
 	      gcc_assert (field && TREE_CODE (field) == FIELD_DECL);
-	      tmp = fold_build3_loc (input_location, COMPONENT_REF,
-				     TREE_TYPE (field),
-				     tmp, field, NULL_TREE);
+	      tmp = build3_loc (input_location, COMPONENT_REF,
+				TREE_TYPE (field), tmp, field, NULL_TREE);
+	      if (TREE_THIS_VOLATILE (field))
+		TREE_THIS_VOLATILE (tmp) = 1;
 	      break;

 	    case REF_SUBSTRING:
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index c4291cce079..e6dc79f8c1e 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -2244,9 +2244,11 @@ gfc_get_tree_for_caf_expr (gfc_expr *expr)

 	if (POINTER_TYPE_P (TREE_TYPE (caf_decl)))
 	  caf_decl = build_fold_indirect_ref_loc (input_location, caf_decl);
-	caf_decl = fold_build3_loc (input_location, COMPONENT_REF,
-				    TREE_TYPE (comp->backend_decl), caf_decl,
-				    comp->backend_decl, NULL_TREE);
+	caf_decl = build3_loc (input_location, COMPONENT_REF,
+			       TREE_TYPE (comp->backend_decl), caf_decl,
+			       comp->backend_decl, NULL_TREE);
+	if (TREE_THIS_VOLATILE (comp->backend_decl))
+	  TREE_THIS_VOLATILE (caf_decl) = 1;
 	if (comp->ts.type == BT_CLASS)
 	  {
 	    caf_decl = gfc_class_data_get (caf_decl);
@@ -2755,8 +2757,10 @@ gfc_conv_component_ref (gfc_se * se, gfc_ref * ref)
   else
     se->class_vptr = NULL_TREE;

-  tmp = fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field),
-			 decl, field, NULL_TREE);
+  tmp =build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field),
+		   decl, field, NULL_TREE);
+  if (TREE_THIS_VOLATILE (field))
+    TREE_THIS_VOLATILE (tmp) = 1;

   se->expr = tmp;

@@ -8792,8 +8796,10 @@ gfc_trans_structure_assign (tree dest, gfc_expr * expr, bool init, bool coarray)
 	}
       field = cm->backend_decl;
       gcc_assert(field);
-      tmp = fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field),
-			     dest, field, NULL_TREE);
+      tmp = build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field),
+			dest, field, NULL_TREE);
+      if (TREE_THIS_VOLATILE (field))
+	TREE_THIS_VOLATILE (tmp) = 1;
       if (!c->expr)
 	{
 	  gfc_expr *e = gfc_get_null_expr (NULL);

but I did not include them as they have no effect on the testsuite.

The question is whether we instead want to amend build3 to
set TREE_THIS_VOLATILE automatically when the FIELD_DECL has
it set.  At least for the Fortran FE cases the gimplifier
fails to see some volatile references and thus can generate
wrong code which is a latent issue.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2021-08-09  Richard Biener  <rguenther@suse.de>

gcc/
	* tree-ssa-operands.c (operands_scanner::get_expr_operands):
	Do not look at COMPONENT_REF FIELD_DECLs TREE_THIS_VOLATILE
	to determine has_volatile_ops.

gcc/fortran/
	* trans-common.c (create_common): Set TREE_THIS_VOLATILE on the
	COMPONENT_REF if the field is volatile.
---
 gcc/fortran/trans-common.c | 9 +++++----
 gcc/tree-ssa-operands.c    | 7 +------
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/gcc/fortran/trans-common.c b/gcc/fortran/trans-common.c
index a11cf4c839e..7bcf18dc475 100644
--- a/gcc/fortran/trans-common.c
+++ b/gcc/fortran/trans-common.c
@@ -759,10 +759,11 @@ create_common (gfc_common_head *com, segment_info *head, bool saw_equiv)
       else
 	gfc_add_decl_to_function (var_decl);
 
-      SET_DECL_VALUE_EXPR (var_decl,
-			   fold_build3_loc (input_location, COMPONENT_REF,
-					    TREE_TYPE (s->field),
-					    decl, s->field, NULL_TREE));
+      tree comp = build3_loc (input_location, COMPONENT_REF,
+			      TREE_TYPE (s->field), decl, s->field, NULL_TREE);
+      if (TREE_THIS_VOLATILE (s->field))
+	TREE_THIS_VOLATILE (comp) = 1;
+      SET_DECL_VALUE_EXPR (var_decl, comp);
       DECL_HAS_VALUE_EXPR_P (var_decl) = 1;
       GFC_DECL_COMMON_OR_EQUIV (var_decl) = 1;
 
diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
index c15575416dd..ebf7eea3b04 100644
--- a/gcc/tree-ssa-operands.c
+++ b/gcc/tree-ssa-operands.c
@@ -834,12 +834,7 @@ operands_scanner::get_expr_operands (tree *expr_p, int flags)
 	get_expr_operands (&TREE_OPERAND (expr, 0), flags);
 
 	if (code == COMPONENT_REF)
-	  {
-	    if (!(flags & opf_no_vops)
-		&& TREE_THIS_VOLATILE (TREE_OPERAND (expr, 1)))
-	      gimple_set_has_volatile_ops (stmt, true);
-	    get_expr_operands (&TREE_OPERAND (expr, 2), uflags);
-	  }
+	  get_expr_operands (&TREE_OPERAND (expr, 2), uflags);
 	else if (code == ARRAY_REF || code == ARRAY_RANGE_REF)
 	  {
 	    get_expr_operands (&TREE_OPERAND (expr, 1), uflags);
-- 
2.31.1


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=e9c2.611265ed.8e876.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=%5BPATCH%5D%5Bv2%5D+Adjust+volatile+handling+of+the+operand+scanner&verdict=C&c=6e6e41ce91c630eb8b61022e89dc0a89ce0902aa

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 11:04 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 11:04 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx409.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 8655 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Jojo R <rjiejie@linux.alibaba.com>, Richard Sandiford <richard.sandiford@arm.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Adding target hook allows to reject initialization of register
Date: Tue, 10 Aug 2021 13:03:09 +0200
Message-ID: <CAFiYyc2W5bL93-SsMXo3Yz_eYzHkyGVTR8GE24_p_n=zWVbW+g@mail.gmail.com>

On Tue, Aug 10, 2021 at 10:33 AM Jojo R via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Some target like RISC-V allow to group vector register as a whole,
> and only operate part of it in fact, but the 'init-regs' pass will add initialization
> for uninitialized registers. Add this hook to reject this action for reducing instruction.

Are these groups "visible"?  That is, are the pseudos multi-reg
pseudos?  I wonder
if there's a more generic way to tame down initregs w/o introducing a new target
hook.

Btw, initregs is a red herring - it ideally should go away.  See PR61810.

So instead of adding to it can you see whether disabling the pass for RISC-V
works w/o fallout (and add a comment to the PR)?  Maybe some more RTL
literate (in particular DF literate) can look at the remaining issue.
Richard, did you
ever have a look into the "issue" that initregs covers up (whatever
that exactly is)?

Thanks,
Richard.

>         gcc/
>         * init-regs.c (initialize_uninitialized_regs): Call register_reject_init_p.
>         * target.def (register_reject_init_p): New hook.
>         * doc/tm.texi.in: Add TARGET_REGISTER_REJECT_INIT_P.
>         * doc/tm.texi: Regenerated.
> ---
>  gcc/doc/tm.texi    | 6 ++++++
>  gcc/doc/tm.texi.in | 2 ++
>  gcc/init-regs.c    | 5 +++++
>  gcc/target.def     | 8 ++++++++
>  4 files changed, 21 insertions(+)
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index a30fdcbbf3d6..83fd5496ca3f 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -12588,3 +12588,9 @@ Return an RTX representing @var{tagged_pointer} with its tag set to zero.
>  Store the result in @var{target} if convenient.
>  The default clears the top byte of the original pointer.
>  @end deftypefn
> +
> +@deftypefn {Target Hook} bool TARGET_REGISTER_REJECT_INIT_P (rtx @var{reg})
> +This target hook should return @code{true} if reject initialization for a uninitialized @var{reg}.
> +
> +The default value of this hook is @code{NULL}.
> +@end deftypefn
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 611fc500ac86..13174ce66d59 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -8180,3 +8180,5 @@ maintainer is familiar with.
>  @hook TARGET_MEMTAG_EXTRACT_TAG
>
>  @hook TARGET_MEMTAG_UNTAGGED_POINTER
> +
> +@hook TARGET_REGISTER_REJECT_INIT_P
> diff --git a/gcc/init-regs.c b/gcc/init-regs.c
> index 72e898f3e334..51c0d669d30b 100644
> --- a/gcc/init-regs.c
> +++ b/gcc/init-regs.c
> @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "system.h"
>  #include "coretypes.h"
>  #include "backend.h"
> +#include "target.h"
>  #include "rtl.h"
>  #include "tree.h"
>  #include "df.h"
> @@ -101,6 +102,10 @@ initialize_uninitialized_regs (void)
>                   rtx_insn *move_insn;
>                   rtx reg = DF_REF_REAL_REG (use);
>
> +                 if (targetm.register_reject_init_p
> +                     && targetm.register_reject_init_p (reg))
> +                   continue;
> +
>                   bitmap_set_bit (already_genned, regno);
>
>                   start_sequence ();
> diff --git a/gcc/target.def b/gcc/target.def
> index 7676d5e626e3..c2b54421618d 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -4545,6 +4545,14 @@ by a subtarget.",
>   unsigned HOST_WIDE_INT, (void),
>   NULL)
>
> +/* Return true if reject initialization for a uninitialized register.  */
> +DEFHOOK
> +(register_reject_init_p,
> + "This target hook should return @code{true} if reject initialization for a uninitialized @var{reg}.\n\
> +\n\
> +The default value of this hook is @code{NULL}.",
> + bool, (rtx reg), NULL)
> +
>  /* Functions relating to calls - argument passing, returns, etc.  */
>  /* Members of struct call have no special macro prefix.  */
>  HOOK_VECTOR (TARGET_CALLS, calls)
> --
> 2.24.3 (Apple Git-128)
>


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=1dac.61125d35.60598.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH%5D+Adding+target+hook+allows+to+reject+initialization+of+register&verdict=C&c=531bdddedefc24be9cb5af751a905f00acebd76f

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 10:26 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 10:26 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx409.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 10001 bytes --]

From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/101809 - support emulated gather for double[int]
Date: Tue, 10 Aug 2021 12:25:04 +0200 (CEST)
Message-ID: <31q44sno-o7no-82nq-851o-52n67opp6o1@fhfr.qr>

This adds emulated gather support for index vectors with more
elements than the data vector.  The internal function gather
vectorization code doesn't currently handle this (but the builtin
decl code does).  This allows vectorization of double data gather
with int indexes on 32bit platforms where there isn't an implicit
widening to 64bit present.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-08-10  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101809
	* tree-vect-stmts.c (get_load_store_type): Allow emulated
	gathers with offset vector nunits being a constant multiple
	of the data vector nunits.
	(vect_get_gather_scatter_ops): Use the appropriate nunits
	for the offset vector defs.
	(vectorizable_store): Adjust call to
	vect_get_gather_scatter_ops.
	(vectorizable_load): Likewise.  Handle the case of less
	offset vectors than data vectors.
---
 gcc/tree-vect-stmts.c | 47 +++++++++++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 5a5a4dab3f2..ab402b57fb4 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2377,9 +2377,11 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
 	      return false;
 	    }
 	  else if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ()
-		   || !known_eq (TYPE_VECTOR_SUBPARTS (vectype),
-				 TYPE_VECTOR_SUBPARTS
-				   (gs_info->offset_vectype)))
+		   || !TYPE_VECTOR_SUBPARTS
+			 (gs_info->offset_vectype).is_constant ()
+		   || !constant_multiple_p (TYPE_VECTOR_SUBPARTS
+					      (gs_info->offset_vectype),
+					    TYPE_VECTOR_SUBPARTS (vectype)))
 	    {
 	      if (dump_enabled_p ())
 		dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -2928,11 +2930,10 @@ vect_build_gather_load_calls (vec_info *vinfo, stmt_vec_info stmt_info,
    containing loop.  */
 
 static void
-vect_get_gather_scatter_ops (vec_info *vinfo,
+vect_get_gather_scatter_ops (loop_vec_info loop_vinfo,
 			     class loop *loop, stmt_vec_info stmt_info,
 			     gather_scatter_info *gs_info,
-			     tree *dataref_ptr, vec<tree> *vec_offset,
-			     unsigned ncopies)
+			     tree *dataref_ptr, vec<tree> *vec_offset)
 {
   gimple_seq stmts = NULL;
   *dataref_ptr = force_gimple_operand (gs_info->base, &stmts, true, NULL_TREE);
@@ -2943,8 +2944,10 @@ vect_get_gather_scatter_ops (vec_info *vinfo,
       new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
       gcc_assert (!new_bb);
     }
-  vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies, gs_info->offset,
-				 vec_offset, gs_info->offset_vectype);
+  unsigned ncopies = vect_get_num_copies (loop_vinfo, gs_info->offset_vectype);
+  vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, ncopies,
+				 gs_info->offset, vec_offset,
+				 gs_info->offset_vectype);
 }
 
 /* Prepare to implement a grouped or strided load or store using
@@ -8072,8 +8075,9 @@ vectorizable_store (vec_info *vinfo,
 	    }
 	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
 	    {
-	      vect_get_gather_scatter_ops (vinfo, loop, stmt_info, &gs_info,
-					   &dataref_ptr, &vec_offsets, ncopies);
+	      vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
+					   &gs_info, &dataref_ptr,
+					   &vec_offsets);
 	      vec_offset = vec_offsets[0];
 	    }
 	  else
@@ -9376,9 +9380,9 @@ vectorizable_load (vec_info *vinfo,
 	    }
 	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
 	    {
-	      vect_get_gather_scatter_ops (vinfo, loop, stmt_info, &gs_info,
-					   &dataref_ptr, &vec_offsets, ncopies);
-	      vec_offset = vec_offsets[0];
+	      vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
+					   &gs_info, &dataref_ptr,
+					   &vec_offsets);
 	    }
 	  else
 	    dataref_ptr
@@ -9395,9 +9399,7 @@ vectorizable_load (vec_info *vinfo,
 	  if (dataref_offset)
 	    dataref_offset = int_const_binop (PLUS_EXPR, dataref_offset,
 					      bump);
-	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
-	    vec_offset = vec_offsets[j];
-	  else
+	  else if (!STMT_VINFO_GATHER_SCATTER_P (stmt_info))
 	    dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi,
 					   stmt_info, bump);
 	  if (mask)
@@ -9490,6 +9492,7 @@ vectorizable_load (vec_info *vinfo,
 		    if (memory_access_type == VMAT_GATHER_SCATTER
 			&& gs_info.ifn != IFN_LAST)
 		      {
+			vec_offset = vec_offsets[j];
 			tree zero = build_zero_cst (vectype);
 			tree scale = size_int (gs_info.scale);
 			gcall *call;
@@ -9512,9 +9515,18 @@ vectorizable_load (vec_info *vinfo,
 			gcc_assert (!final_mask);
 			unsigned HOST_WIDE_INT const_nunits
 			  = nunits.to_constant ();
+			unsigned HOST_WIDE_INT const_offset_nunits
+			  = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype)
+			      .to_constant ();
 			vec<constructor_elt, va_gc> *ctor_elts;
 			vec_alloc (ctor_elts, const_nunits);
 			gimple_seq stmts = NULL;
+			/* We support offset vectors with more elements
+			   than the data vector for now.  */
+			unsigned HOST_WIDE_INT factor
+			  = const_offset_nunits / const_nunits;
+			vec_offset = vec_offsets[j / factor];
+			unsigned elt_offset = (j % factor) * const_nunits;
 			tree idx_type = TREE_TYPE (TREE_TYPE (vec_offset));
 			tree scale = size_int (gs_info.scale);
 			align
@@ -9525,7 +9537,8 @@ vectorizable_load (vec_info *vinfo,
 			  {
 			    tree boff = size_binop (MULT_EXPR,
 						    TYPE_SIZE (idx_type),
-						    bitsize_int (k));
+						    bitsize_int
+						      (k + elt_offset));
 			    tree idx = gimple_build (&stmts, BIT_FIELD_REF,
 						     idx_type, vec_offset,
 						     TYPE_SIZE (idx_type),
-- 
2.31.1


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=7dfe.6112543e.f148.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=%5BPATCH%5D+tree-optimization%2F101809+-+support+emulated+gather+for+double%5Bint%5D&verdict=C&c=db07867a517b0a4ebd892d73cf5921ec9e358b2b

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10 10:09 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10 10:09 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx302.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 475 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 9086 bytes --]

From: Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Jakub Jelinek <jakub@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] i386: Allow some V32HImode and V64QImode permutations even without AVX512BW [PR80355]
Date: Tue, 10 Aug 2021 18:14:33 +0800
Message-ID: <CAMZc-by0wedVFDpi2rm2E8_3vo8ZEtSV-Co7LHueLL+HsdL8sg@mail.gmail.com>

On Tue, Aug 10, 2021 at 4:54 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> When working on the PR, I've noticed we generate terrible code for
> V32HImode or V64QImode permutations for -mavx512f -mno-avx512bw.
> Generally we can't do much with such permutations, but since PR68655
> we can handle at least some, those expressible using V16SImode or V8DImode
> permutations, but that wasn't reachable, because ix86_vectorize_vec_perm_const
> didn't even try, it said without TARGET_AVX512BW it can't do anything, and
> with it can do everything, no d.testing_p attempts.
>
> This patch makes it try it for TARGET_AVX512F && !TARGET_AVX512BW.
TARGET_AVX512{F,BW,CD,DQ,VL} will be the baseline for all
AVX512-enabled processors after(including)SKX.
But it's definitely good to have this, patch LGTM.
>
> The first hunk is to avoid ICE, expand_vec_perm_even_odd_1 asserts d->vmode
> isn't V32HImode because expand_vec_perm_1 for AVX512BW handles already
> all permutations, but when we let it through without !TARGET_AVX512BW,
> expand_vec_perm_1 doesn't handle it.
>
> If we want, that hunk can be dropped if we implement in
> expand_vec_perm_even_odd_1 and its helper the even permutation as
> vpmovdw + vpmovdw + vinserti64x4 and odd permutation as
> vpsrld $16 + vpsrld $16 + vpmovdw + vpmovdw + vinserti64x4.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-08-10  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/80355
>         * config/i386/i386-expand.c (expand_vec_perm_even_odd): Return false
>         for V32HImode if !TARGET_AVX512BW.
>         (ix86_vectorize_vec_perm_const) <case E_V32HImode, case E_V64QImode>:
>         If !TARGET_AVX512BW and TARGET_AVX512F and d.testing_p, don't fail
>         early, but actually check the permutation.
>
>         * gcc.target/i386/avx512f-pr80355-2.c: New test.
>
> --- gcc/config/i386/i386-expand.c.jj    2021-08-05 10:26:15.589555028 +0200
> +++ gcc/config/i386/i386-expand.c       2021-08-09 14:14:35.466268680 +0200
> @@ -20337,6 +20337,11 @@ expand_vec_perm_even_odd (struct expand_
>      if (d->perm[i] != 2 * i + odd)
>        return false;
>
> +  if (d->vmode == E_V32HImode
> +      && d->testing_p
> +      && !TARGET_AVX512BW)
> +    return false;
> +
>    return expand_vec_perm_even_odd_1 (d, odd);
>  }
>
> @@ -20877,16 +20882,16 @@ ix86_vectorize_vec_perm_const (machine_m
>         return true;
>        break;
>      case E_V32HImode:
> -      if (!TARGET_AVX512BW)
> +      if (!TARGET_AVX512F)
>         return false;
> -      if (d.testing_p)
> +      if (d.testing_p && TARGET_AVX512BW)
>         /* All implementable with a single vperm[it]2 insn.  */
>         return true;
>        break;
>      case E_V64QImode:
> -      if (!TARGET_AVX512BW)
> +      if (!TARGET_AVX512F)
>         return false;
> -      if (d.testing_p)
> +      if (d.testing_p && TARGET_AVX512BW)
>         /* Implementable with 2 vperm[it]2, 2 vpshufb and 1 or insn.  */
>         return true;
>        break;
> --- gcc/testsuite/gcc.target/i386/avx512f-pr80355-2.c.jj        2021-08-09 14:24:27.176142589 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-pr80355-2.c   2021-08-09 14:29:23.308074276 +0200
> @@ -0,0 +1,23 @@
> +/* PR target/80355 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512f -mno-avx512vl -mno-avx512dq -mno-avx512bw" } */
> +/* { dg-final { scan-assembler-times "\tvshufi(?:32x4|64x2)\t" 2 } } */
> +
> +typedef short V __attribute__((vector_size (64)));
> +typedef char W __attribute__((vector_size (64)));
> +
> +W
> +f0 (W x)
> +{
> +  return __builtin_shuffle (x, (W) { 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
> +                                    48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
> +                                    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
> +                                    17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 });
> +}
> +
> +V
> +f1 (V x)
> +{
> +  return __builtin_shuffle (x, (V) { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
> +                                    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 });
> +}
>
>         Jakub
>


-- 
BR,
Hongtao


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=b234.6112506e.30e23.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH%5D+i386%3A+Allow+some+V32HImode+and+V64QImode+permutations+even+without+AVX512BW+%5BPR80355%5D&verdict=C&c=8e377e0e566b38416325d128bec1b64e6df867ac

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10  9:35 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10  9:35 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx409.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: Undelivered Message Headers --]
[-- Type: text/rfc822-headers, Size: 4409 bytes --]

Return-Path: <gcc-patches@gcc.gnu.org>
Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) by
 fx409.security-mail.net (Postfix) with ESMTPS id F3866323659 for
 <marc.poulhies@kalray.eu>; Tue, 10 Aug 2021 11:35:12 +0200 (CEST)
Received: from server2.sourceware.org (localhost [IPv6:::1]) by
 sourceware.org (Postfix) with ESMTP id B6F2B385701D for
 <marc.poulhies@kalray.eu>; Tue, 10 Aug 2021 09:35:11 +0000 (GMT)
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by sourceware.org
 (Postfix) with ESMTP id BE86B385780A for <gcc-patches@gcc.gnu.org>; Tue, 10
 Aug 2021 09:34:14 +0000 (GMT)
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-267-G_Xo3-M5MmuNjX1L_mR37w-1; Tue, 10 Aug 2021 05:34:13 -0400
Received: from smtp.corp.redhat.com
 (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with
 cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by
 mimecast-mx01.redhat.com (Postfix) with ESMTPS id 30713801AE7 for
 <gcc-patches@gcc.gnu.org>; Tue, 10 Aug 2021 09:34:12 +0000 (UTC)
Received: from tucnak.zalov.cz (unknown [10.39.193.120]) by
 smtp.corp.redhat.com (Postfix) with ESMTPS id 8696A60854 for
 <gcc-patches@gcc.gnu.org>; Tue, 10 Aug 2021 09:34:11 +0000 (UTC)
Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz
 (8.16.1/8.16.1) with ESMTPS id 17A9Y4Cp1033420 (version=TLSv1.3
 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT) for
 <gcc-patches@gcc.gnu.org>; Tue, 10 Aug 2021 11:34:05 +0200
Received: (from jakub@localhost) by tucnak.zalov.cz (8.16.1/8.16.1/Submit)
 id 17A9Y3qn1033419 for gcc-patches@gcc.gnu.org; Tue, 10 Aug 2021 11:34:03
 +0200
X-Quarantine-ID: <pfTQHK5BmCDY>
X-Virus-Scanned: E-securemail, by Secumail
X-Spam-Status: No, score=0.379 tagged_above=-1000 required=7.5
 tests=[AB_ENVFROM_LONG_40=0.5, AB_LONG_SUBJ_30=0.001,
 DCC_REPUT_00_12=-0.002, DKIM_SIGNED=0.1, DKIM_VALID=-1, DKIM_VALID_AU=-0.1,
 FSL_HTML_ENT_LOTS_1=0.01, FSL_HTML_ENT_LOTS_2=0.01,
 FSL_HTML_ENT_LOTS_3=0.01, FSL_RCVD_EX_GT_5=1, FSL_RCVD_UT_GT_5=0.01,
 HEAD_NEWS=-0.5, MISSING_MID=0.14, MM_ENVFROM_BOUNCE=1,
 RCVD_IN_DNSWL_MED=-1.3, S_FROM_GREY_MINUS_2=-2, S_KW_BODY_EXPLICIT_=2.5]
 autolearn=disabled
Authentication-Results: fx409.security-mail.net (amavisd-new); dkim=pass
 (1024-bit key) header.d=gcc.gnu.org
Secumail-id: <b909.61124850.b636c.0>
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B6F2B385701D
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
 s=default; t=1628588111; bh=tzbRKtW8BXS1CQn+wgSaigrquTg45e3ecyRzvcn0FYM=;
 h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post:
 List-Help:List-Subscribe:From:Reply-To:From;
 b=m1IYcNVEW2hG3e/52ghShfxcz32xe1EBeH51vhGVSmVQj4FFtojpazNk4IUqhRSLD
 nW6UrLkcpZZjrmYPchWodiMWd+lK4HUY9PvNo3BGysN58F34INWcMZ1frxjwZbwLjY
 2dXwkFAgNHQgkbGIA7dEnGyYM+bo5gLajhliRa6A=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BE86B385780A
X-MC-Unique: G_Xo3-M5MmuNjX1L_mR37w-1
Date: Tue, 10 Aug 2021 11:34:03 +0200
To: gcc-patches@gcc.gnu.org
Subject: [committed] openmp: Add support for declare simd and declare
 variant in a attribute syntax
Message-ID: <20210810093403.GK2380545@tucnak>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
From: Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org>
Reply-To: Jakub Jelinek <jakub@redhat.com>
Errors-To: gcc-patches-bounces+marc.poulhies=kalray.eu@gcc.gnu.org
Sender: Gcc-patches
 <gcc-patches-bounces+marc.poulhies=kalray.eu@gcc.gnu.org>
Content-Type: multipart/alternative; boundary="=_WhqiTbHUGhAxDAymDSJ7Aev"
X-ALTERMIMEV2_in: done

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10  9:12 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10  9:12 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx306.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 11987 bytes --]

From: Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Jakub Jelinek <jakub@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] i386: Improve single operand AVX512F permutations [PR80355]
Date: Tue, 10 Aug 2021 17:16:47 +0800
Message-ID: <CAMZc-bwwccRst0F56SfFoaHR5BqrUL=amXXSDk_=o0GNYcSatQ@mail.gmail.com>

On Tue, Aug 10, 2021 at 4:44 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> On the following testcase we emit
>         vmovdqa32       .LC0(%rip), %zmm1
>         vpermd  %zmm0, %zmm1, %zmm0
> and
>         vmovdqa64       .LC1(%rip), %zmm1
>         vpermq  %zmm0, %zmm1, %zmm0
> instead of
>         vshufi32x4      $78, %zmm0, %zmm0, %zmm0
> and
>         vshufi64x2      $78, %zmm0, %zmm0, %zmm0
> we can emit with the patch.  We have patterns that match two argument
> permutations for vshuf[if]*, but for one argument it doesn't trigger.
> Either we can add two patterns for that, or we would need to add another
> routine to i386-expand.c that would transform under certain condition
> these cases to the two argument vshuf*, doing it in sse.md looked simpler.
Yes, we already have
<mask_codefor>avx512dq_shuf_<shuffletype>64x2_1<mask_name> and
avx512f_shuf_<shuffletype>32x4_1<mask_name>, it's simpler to add
another 2 patterns with similar logic but selected from only 1 vector
instead of 2 vectors.
Patch LGTM.
> We don't need this for 32-byte vectors, we already emit single insn
> permutation that doesn't need memory op there.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-08-10  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/80355
>         * config/i386/sse.md (*avx512f_shuf_<shuffletype>64x2_1<mask_name>_1,
>         *avx512f_shuf_<shuffletype>32x4_1<mask_name>_1): New define_insn
>         patterns.
>
>         * gcc.target/i386/avx512f-pr80355-1.c: New test.
>
> --- gcc/config/i386/sse.md.jj   2021-08-05 10:26:15.592554985 +0200
> +++ gcc/config/i386/sse.md      2021-08-09 13:31:49.025479889 +0200
> @@ -15320,6 +15320,42 @@ (define_insn "avx512f_shuf_<shuffletype>
>     (set_attr "prefix" "evex")
>     (set_attr "mode" "<sseinsnmode>")])
>
> +(define_insn "*avx512f_shuf_<shuffletype>64x2_1<mask_name>_1"
> +  [(set (match_operand:V8FI 0 "register_operand" "=v")
> +       (vec_select:V8FI
> +         (match_operand:V8FI 1 "register_operand" "v")
> +         (parallel [(match_operand 2 "const_0_to_7_operand")
> +                    (match_operand 3 "const_0_to_7_operand")
> +                    (match_operand 4 "const_0_to_7_operand")
> +                    (match_operand 5 "const_0_to_7_operand")
> +                    (match_operand 6 "const_0_to_7_operand")
> +                    (match_operand 7 "const_0_to_7_operand")
> +                    (match_operand 8 "const_0_to_7_operand")
> +                    (match_operand 9 "const_0_to_7_operand")])))]
> +  "TARGET_AVX512F
> +   && (INTVAL (operands[2]) & 1) == 0
> +   && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
> +   && (INTVAL (operands[4]) & 1) == 0
> +   && INTVAL (operands[4]) == INTVAL (operands[5]) - 1
> +   && (INTVAL (operands[6]) & 1) == 0
> +   && INTVAL (operands[6]) == INTVAL (operands[7]) - 1
> +   && (INTVAL (operands[8]) & 1) == 0
> +   && INTVAL (operands[8]) == INTVAL (operands[9]) - 1"
> +{
> +  int mask;
> +  mask = INTVAL (operands[2]) / 2;
> +  mask |= INTVAL (operands[4]) / 2 << 2;
> +  mask |= INTVAL (operands[6]) / 2 << 4;
> +  mask |= INTVAL (operands[8]) / 2 << 6;
> +  operands[2] = GEN_INT (mask);
> +
> +  return "vshuf<shuffletype>64x2\t{%2, %1, %1, %0<mask_operand10>|%0<mask_operand10>, %1, %1, %2}";
> +}
> +  [(set_attr "type" "sselog")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "<sseinsnmode>")])
> +
>  (define_expand "avx512vl_shuf_<shuffletype>32x4_mask"
>    [(match_operand:VI4F_256 0 "register_operand")
>     (match_operand:VI4F_256 1 "register_operand")
> @@ -15463,6 +15499,58 @@ (define_insn "avx512f_shuf_<shuffletype>
>  }
>    [(set_attr "type" "sselog")
>     (set_attr "length_immediate" "1")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "<sseinsnmode>")])
> +
> +(define_insn "*avx512f_shuf_<shuffletype>32x4_1<mask_name>_1"
> +  [(set (match_operand:V16FI 0 "register_operand" "=v")
> +       (vec_select:V16FI
> +         (match_operand:V16FI 1 "register_operand" "v")
> +         (parallel [(match_operand 2 "const_0_to_15_operand")
> +                    (match_operand 3 "const_0_to_15_operand")
> +                    (match_operand 4 "const_0_to_15_operand")
> +                    (match_operand 5 "const_0_to_15_operand")
> +                    (match_operand 6 "const_0_to_15_operand")
> +                    (match_operand 7 "const_0_to_15_operand")
> +                    (match_operand 8 "const_0_to_15_operand")
> +                    (match_operand 9 "const_0_to_15_operand")
> +                    (match_operand 10 "const_0_to_15_operand")
> +                    (match_operand 11 "const_0_to_15_operand")
> +                    (match_operand 12 "const_0_to_15_operand")
> +                    (match_operand 13 "const_0_to_15_operand")
> +                    (match_operand 14 "const_0_to_15_operand")
> +                    (match_operand 15 "const_0_to_15_operand")
> +                    (match_operand 16 "const_0_to_15_operand")
> +                    (match_operand 17 "const_0_to_15_operand")])))]
> +  "TARGET_AVX512F
> +   && (INTVAL (operands[2]) & 3) == 0
> +   && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
> +   && INTVAL (operands[2]) == INTVAL (operands[4]) - 2
> +   && INTVAL (operands[2]) == INTVAL (operands[5]) - 3
> +   && (INTVAL (operands[6]) & 3) == 0
> +   && INTVAL (operands[6]) == INTVAL (operands[7]) - 1
> +   && INTVAL (operands[6]) == INTVAL (operands[8]) - 2
> +   && INTVAL (operands[6]) == INTVAL (operands[9]) - 3
> +   && (INTVAL (operands[10]) & 3) == 0
> +   && INTVAL (operands[10]) == INTVAL (operands[11]) - 1
> +   && INTVAL (operands[10]) == INTVAL (operands[12]) - 2
> +   && INTVAL (operands[10]) == INTVAL (operands[13]) - 3
> +   && (INTVAL (operands[14]) & 3) == 0
> +   && INTVAL (operands[14]) == INTVAL (operands[15]) - 1
> +   && INTVAL (operands[14]) == INTVAL (operands[16]) - 2
> +   && INTVAL (operands[14]) == INTVAL (operands[17]) - 3"
> +{
> +  int mask;
> +  mask = INTVAL (operands[2]) / 4;
> +  mask |= INTVAL (operands[6]) / 4 << 2;
> +  mask |= INTVAL (operands[10]) / 4 << 4;
> +  mask |= INTVAL (operands[14]) / 4 << 6;
> +  operands[2] = GEN_INT (mask);
> +
> +  return "vshuf<shuffletype>32x4\t{%2, %1, %1, %0<mask_operand18>|%0<mask_operand18>, %1, %1, %2}";
> +}
> +  [(set_attr "type" "sselog")
> +   (set_attr "length_immediate" "1")
>     (set_attr "prefix" "evex")
>     (set_attr "mode" "<sseinsnmode>")])
>
> --- gcc/testsuite/gcc.target/i386/avx512f-pr80355-1.c.jj        2021-08-09 13:42:14.621904142 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-pr80355-1.c   2021-08-09 13:04:10.070249292 +0200
> @@ -0,0 +1,19 @@
> +/* PR target/80355 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512f -mno-avx512vl -mno-avx512dq" } */
> +/* { dg-final { scan-assembler "\tvshufi32x4\t" } } */
> +/* { dg-final { scan-assembler "\tvshufi64x2\t" } } */
> +
> +typedef long long V __attribute__((vector_size (64)));
> +typedef int W __attribute__((vector_size (64)));
> +
> +W
> +f0 (W x)
> +{
> +  return __builtin_shuffle (x, (W) { 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7 });
> +}
> +V
> +f1 (V x)
> +{
> +  return __builtin_shuffle (x, (V) { 4, 5, 6, 7, 0, 1, 2, 3 });
> +}
>
>         Jakub
>


-- 
BR,
Hongtao


To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=266c.611242e4.a6bac.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=Re%3A+%5BPATCH%5D+i386%3A+Improve+single+operand+AVX512F+permutations+%5BPR80355%5D&verdict=C&c=ea7d5238f66bcf8c067ccd709f1f021b4861a903

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10  9:08 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10  9:08 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx308.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 8233 bytes --]

From: Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Hongtao Liu <crazylht@gmail.com>
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] i386: Allow some V32HImode and V64QImode permutations even without AVX512BW [PR80355]
Date: Tue, 10 Aug 2021 10:54:26 +0200
Message-ID: <20210810085426.GJ2380545@tucnak>

Hi!

When working on the PR, I've noticed we generate terrible code for
V32HImode or V64QImode permutations for -mavx512f -mno-avx512bw.
Generally we can't do much with such permutations, but since PR68655
we can handle at least some, those expressible using V16SImode or V8DImode
permutations, but that wasn't reachable, because ix86_vectorize_vec_perm_const
didn't even try, it said without TARGET_AVX512BW it can't do anything, and
with it can do everything, no d.testing_p attempts.

This patch makes it try it for TARGET_AVX512F && !TARGET_AVX512BW.

The first hunk is to avoid ICE, expand_vec_perm_even_odd_1 asserts d->vmode
isn't V32HImode because expand_vec_perm_1 for AVX512BW handles already
all permutations, but when we let it through without !TARGET_AVX512BW,
expand_vec_perm_1 doesn't handle it.

If we want, that hunk can be dropped if we implement in
expand_vec_perm_even_odd_1 and its helper the even permutation as
vpmovdw + vpmovdw + vinserti64x4 and odd permutation as
vpsrld $16 + vpsrld $16 + vpmovdw + vpmovdw + vinserti64x4.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-08-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/80355
	* config/i386/i386-expand.c (expand_vec_perm_even_odd): Return false
	for V32HImode if !TARGET_AVX512BW.
	(ix86_vectorize_vec_perm_const) <case E_V32HImode, case E_V64QImode>:
	If !TARGET_AVX512BW and TARGET_AVX512F and d.testing_p, don't fail
	early, but actually check the permutation.

	* gcc.target/i386/avx512f-pr80355-2.c: New test.

--- gcc/config/i386/i386-expand.c.jj	2021-08-05 10:26:15.589555028 +0200
+++ gcc/config/i386/i386-expand.c	2021-08-09 14:14:35.466268680 +0200
@@ -20337,6 +20337,11 @@ expand_vec_perm_even_odd (struct expand_
     if (d->perm[i] != 2 * i + odd)
       return false;
 
+  if (d->vmode == E_V32HImode
+      && d->testing_p
+      && !TARGET_AVX512BW)
+    return false;
+
   return expand_vec_perm_even_odd_1 (d, odd);
 }
 
@@ -20877,16 +20882,16 @@ ix86_vectorize_vec_perm_const (machine_m
 	return true;
       break;
     case E_V32HImode:
-      if (!TARGET_AVX512BW)
+      if (!TARGET_AVX512F)
 	return false;
-      if (d.testing_p)
+      if (d.testing_p && TARGET_AVX512BW)
 	/* All implementable with a single vperm[it]2 insn.  */
 	return true;
       break;
     case E_V64QImode:
-      if (!TARGET_AVX512BW)
+      if (!TARGET_AVX512F)
 	return false;
-      if (d.testing_p)
+      if (d.testing_p && TARGET_AVX512BW)
 	/* Implementable with 2 vperm[it]2, 2 vpshufb and 1 or insn.  */
 	return true;
       break;
--- gcc/testsuite/gcc.target/i386/avx512f-pr80355-2.c.jj	2021-08-09 14:24:27.176142589 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-pr80355-2.c	2021-08-09 14:29:23.308074276 +0200
@@ -0,0 +1,23 @@
+/* PR target/80355 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mno-avx512vl -mno-avx512dq -mno-avx512bw" } */
+/* { dg-final { scan-assembler-times "\tvshufi(?:32x4|64x2)\t" 2 } } */
+
+typedef short V __attribute__((vector_size (64)));
+typedef char W __attribute__((vector_size (64)));
+
+W
+f0 (W x)
+{
+  return __builtin_shuffle (x, (W) { 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
+				     48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
+				     0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
+				     17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 });
+}
+
+V
+f1 (V x)
+{
+  return __builtin_shuffle (x, (V) { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
+				     0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 });
+}

	Jakub



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=bf1d.61123f06.99556.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=%5BPATCH%5D+i386%3A+Allow+some+V32HImode+and+V64QImode+permutations+even+without+AVX512BW+%5BPR80355%5D&verdict=C&c=454b4fd04703e9c385a54af0fdd95c58945fbb66

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Undelivered Mail Returned to Sender
@ 2021-08-10  8:45 MAILER-DAEMON
  0 siblings, 0 replies; 24+ messages in thread
From: MAILER-DAEMON @ 2021-08-10  8:45 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Notification --]
[-- Type: text/plain, Size: 601 bytes --]

This is the mail system at host fx308.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulhies@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulhies@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)

[-- Attachment #2: Delivery report --]
[-- Type: message/delivery-status, Size: 474 bytes --]

[-- Attachment #3: Undelivered Message --]
[-- Type: message/rfc822, Size: 10811 bytes --]

From: Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org>
To: Hongtao Liu <crazylht@gmail.com>
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] i386: Improve single operand AVX512F permutations [PR80355]
Date: Tue, 10 Aug 2021 10:44:46 +0200
Message-ID: <20210810084446.GI2380545@tucnak>

Hi!

On the following testcase we emit
        vmovdqa32       .LC0(%rip), %zmm1
        vpermd  %zmm0, %zmm1, %zmm0
and
        vmovdqa64       .LC1(%rip), %zmm1
        vpermq  %zmm0, %zmm1, %zmm0
instead of
        vshufi32x4      $78, %zmm0, %zmm0, %zmm0
and
        vshufi64x2      $78, %zmm0, %zmm0, %zmm0
we can emit with the patch.  We have patterns that match two argument
permutations for vshuf[if]*, but for one argument it doesn't trigger.
Either we can add two patterns for that, or we would need to add another
routine to i386-expand.c that would transform under certain condition
these cases to the two argument vshuf*, doing it in sse.md looked simpler.
We don't need this for 32-byte vectors, we already emit single insn
permutation that doesn't need memory op there.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-08-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/80355
	* config/i386/sse.md (*avx512f_shuf_<shuffletype>64x2_1<mask_name>_1,
	*avx512f_shuf_<shuffletype>32x4_1<mask_name>_1): New define_insn
	patterns.

	* gcc.target/i386/avx512f-pr80355-1.c: New test.

--- gcc/config/i386/sse.md.jj	2021-08-05 10:26:15.592554985 +0200
+++ gcc/config/i386/sse.md	2021-08-09 13:31:49.025479889 +0200
@@ -15320,6 +15320,42 @@ (define_insn "avx512f_shuf_<shuffletype>
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
+(define_insn "*avx512f_shuf_<shuffletype>64x2_1<mask_name>_1"
+  [(set (match_operand:V8FI 0 "register_operand" "=v")
+	(vec_select:V8FI
+	  (match_operand:V8FI 1 "register_operand" "v")
+	  (parallel [(match_operand 2 "const_0_to_7_operand")
+		     (match_operand 3 "const_0_to_7_operand")
+		     (match_operand 4 "const_0_to_7_operand")
+		     (match_operand 5 "const_0_to_7_operand")
+		     (match_operand 6 "const_0_to_7_operand")
+		     (match_operand 7 "const_0_to_7_operand")
+		     (match_operand 8 "const_0_to_7_operand")
+		     (match_operand 9 "const_0_to_7_operand")])))]
+  "TARGET_AVX512F
+   && (INTVAL (operands[2]) & 1) == 0
+   && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
+   && (INTVAL (operands[4]) & 1) == 0
+   && INTVAL (operands[4]) == INTVAL (operands[5]) - 1
+   && (INTVAL (operands[6]) & 1) == 0
+   && INTVAL (operands[6]) == INTVAL (operands[7]) - 1
+   && (INTVAL (operands[8]) & 1) == 0
+   && INTVAL (operands[8]) == INTVAL (operands[9]) - 1"
+{
+  int mask;
+  mask = INTVAL (operands[2]) / 2;
+  mask |= INTVAL (operands[4]) / 2 << 2;
+  mask |= INTVAL (operands[6]) / 2 << 4;
+  mask |= INTVAL (operands[8]) / 2 << 6;
+  operands[2] = GEN_INT (mask);
+
+  return "vshuf<shuffletype>64x2\t{%2, %1, %1, %0<mask_operand10>|%0<mask_operand10>, %1, %1, %2}";
+}
+  [(set_attr "type" "sselog")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "<sseinsnmode>")])
+
 (define_expand "avx512vl_shuf_<shuffletype>32x4_mask"
   [(match_operand:VI4F_256 0 "register_operand")
    (match_operand:VI4F_256 1 "register_operand")
@@ -15463,6 +15499,58 @@ (define_insn "avx512f_shuf_<shuffletype>
 }
   [(set_attr "type" "sselog")
    (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "<sseinsnmode>")])
+
+(define_insn "*avx512f_shuf_<shuffletype>32x4_1<mask_name>_1"
+  [(set (match_operand:V16FI 0 "register_operand" "=v")
+	(vec_select:V16FI
+	  (match_operand:V16FI 1 "register_operand" "v")
+	  (parallel [(match_operand 2 "const_0_to_15_operand")
+		     (match_operand 3 "const_0_to_15_operand")
+		     (match_operand 4 "const_0_to_15_operand")
+		     (match_operand 5 "const_0_to_15_operand")
+		     (match_operand 6 "const_0_to_15_operand")
+		     (match_operand 7 "const_0_to_15_operand")
+		     (match_operand 8 "const_0_to_15_operand")
+		     (match_operand 9 "const_0_to_15_operand")
+		     (match_operand 10 "const_0_to_15_operand")
+		     (match_operand 11 "const_0_to_15_operand")
+		     (match_operand 12 "const_0_to_15_operand")
+		     (match_operand 13 "const_0_to_15_operand")
+		     (match_operand 14 "const_0_to_15_operand")
+		     (match_operand 15 "const_0_to_15_operand")
+		     (match_operand 16 "const_0_to_15_operand")
+		     (match_operand 17 "const_0_to_15_operand")])))]
+  "TARGET_AVX512F
+   && (INTVAL (operands[2]) & 3) == 0
+   && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
+   && INTVAL (operands[2]) == INTVAL (operands[4]) - 2
+   && INTVAL (operands[2]) == INTVAL (operands[5]) - 3
+   && (INTVAL (operands[6]) & 3) == 0
+   && INTVAL (operands[6]) == INTVAL (operands[7]) - 1
+   && INTVAL (operands[6]) == INTVAL (operands[8]) - 2
+   && INTVAL (operands[6]) == INTVAL (operands[9]) - 3
+   && (INTVAL (operands[10]) & 3) == 0
+   && INTVAL (operands[10]) == INTVAL (operands[11]) - 1
+   && INTVAL (operands[10]) == INTVAL (operands[12]) - 2
+   && INTVAL (operands[10]) == INTVAL (operands[13]) - 3
+   && (INTVAL (operands[14]) & 3) == 0
+   && INTVAL (operands[14]) == INTVAL (operands[15]) - 1
+   && INTVAL (operands[14]) == INTVAL (operands[16]) - 2
+   && INTVAL (operands[14]) == INTVAL (operands[17]) - 3"
+{
+  int mask;
+  mask = INTVAL (operands[2]) / 4;
+  mask |= INTVAL (operands[6]) / 4 << 2;
+  mask |= INTVAL (operands[10]) / 4 << 4;
+  mask |= INTVAL (operands[14]) / 4 << 6;
+  operands[2] = GEN_INT (mask);
+
+  return "vshuf<shuffletype>32x4\t{%2, %1, %1, %0<mask_operand18>|%0<mask_operand18>, %1, %1, %2}";
+}
+  [(set_attr "type" "sselog")
+   (set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
--- gcc/testsuite/gcc.target/i386/avx512f-pr80355-1.c.jj	2021-08-09 13:42:14.621904142 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-pr80355-1.c	2021-08-09 13:04:10.070249292 +0200
@@ -0,0 +1,19 @@
+/* PR target/80355 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mno-avx512vl -mno-avx512dq" } */
+/* { dg-final { scan-assembler "\tvshufi32x4\t" } } */
+/* { dg-final { scan-assembler "\tvshufi64x2\t" } } */
+
+typedef long long V __attribute__((vector_size (64)));
+typedef int W __attribute__((vector_size (64)));
+
+W
+f0 (W x)
+{
+  return __builtin_shuffle (x, (W) { 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7 });
+}
+V
+f1 (V x)
+{
+  return __builtin_shuffle (x, (V) { 4, 5, 6, 7, 0, 1, 2, 3 });
+}

	Jakub



To declare a filtering error, please use the following link : https://www.security-mail.net/reporter.php?mid=4396.61123cc3.d5d9a.0&r=marc.poulhies%40kalray.eu&s=gcc-patches-bounces%2Bmarc.poulhies%3Dkalray.eu%40gcc.gnu.org&o=%5BPATCH%5D+i386%3A+Improve+single+operand+AVX512F+permutations+%5BPR80355%5D&verdict=C&c=fad81c81d65afd1829743f2dee200b4d084cffbb

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-08-10 15:03 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-10  8:40 Undelivered Mail Returned to Sender MAILER-DAEMON
2021-08-10  8:45 MAILER-DAEMON
2021-08-10  9:08 MAILER-DAEMON
2021-08-10  9:12 MAILER-DAEMON
2021-08-10  9:35 MAILER-DAEMON
2021-08-10 10:09 MAILER-DAEMON
2021-08-10 10:26 MAILER-DAEMON
2021-08-10 11:04 MAILER-DAEMON
2021-08-10 11:42 MAILER-DAEMON
2021-08-10 11:52 MAILER-DAEMON
2021-08-10 11:55 MAILER-DAEMON
2021-08-10 12:02 MAILER-DAEMON
2021-08-10 12:14 MAILER-DAEMON
2021-08-10 12:18 MAILER-DAEMON
2021-08-10 12:34 MAILER-DAEMON
2021-08-10 13:04 MAILER-DAEMON
2021-08-10 13:22 MAILER-DAEMON
2021-08-10 13:41 MAILER-DAEMON
2021-08-10 13:51 MAILER-DAEMON
2021-08-10 14:15 MAILER-DAEMON
2021-08-10 14:18 MAILER-DAEMON
2021-08-10 14:26 MAILER-DAEMON
2021-08-10 14:48 MAILER-DAEMON
2021-08-10 15:03 MAILER-DAEMON

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).