public inbox for gcc-help@gcc.gnu.org
From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
To: Benjamin Minguez <benjamin.minguez@huawei.com>,
	Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>,
	"gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
Subject: Re: Condition execution optimization with gcc 7.5
Date: Mon, 22 May 2023 17:12:07 +0100
Message-ID: <52514568-bf71-b750-b4cf-3c532271ba01@arm.com>
In-Reply-To: <b943a9b284b14fd996f548b9517b6de6@huawei.com>

On aarch64 this code cannot use conditional select.  An operation such as
	if (c) {
	  ...
	  r->lowcase_header[0] = c;
	  ...
	}

would be a conditional store to memory and can only happen if the 
guarding condition is true.  It's not safe to convert this into, say

	cmp c, #0
	...
	ldr w1, [ptr]
	csel w1, w1, c, eq
	str w1, [ptr]

because the store would introduce a possible race with any other thread 
that might be writing to the same location.  The compiler would also 
have to prove that ptr always contained a valid address when 'c' was 
false as well, something that might not be possible given the 
information available.
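
To make the distinction concrete, here is a minimal C sketch. The struct and the function names are hypothetical stand-ins, not nginx code; only the field name is taken from the example in this thread. It contrasts a conditional store, which the compiler has to leave guarded, with two forms where a select is legal: a pure value select, and a store that the source itself makes unconditional. Whether a given GCC version actually emits CSEL for the latter two depends on the options and the cost model, so treat this as an illustration rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the request structure; only the field used
   in the thread is modelled, and its size is an arbitrary assumption. */
struct request {
    unsigned char lowcase_header[32];
};

/* Conditional store: memory is written only when c != 0.  The compiler
   may not rewrite this as load/select/store, because that would invent
   a write (and a dereference of r) the source never performs when c == 0. */
void store_if_nonzero(struct request *r, unsigned char c)
{
    if (c) {
        r->lowcase_header[0] = c;
    }
}

/* Conditional select of a value: no memory is touched, so the compiler
   is free to use a conditional select instead of a branch. */
unsigned select_value(unsigned fallback, unsigned char c)
{
    return c ? (unsigned)c : fallback;
}

/* Unconditional store of a selected value.  Here the programmer, not the
   compiler, asserts that r is always valid and that writing the old value
   back is acceptable (i.e. no other thread writes this location). */
void store_selected(struct request *r, unsigned char c)
{
    assert(r != NULL);
    unsigned char old = r->lowcase_header[0];
    r->lowcase_header[0] = c ? c : old;
}
```

The single-threaded result of store_if_nonzero and store_selected is the same; the difference is that the second form always reads and writes the location, which is exactly the behaviour the compiler is not allowed to introduce on its own.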

The function arm_max_conditional_execute is only used for 32-bit arm 
targets.  It's not part of the aarch64 compiler.

R.

On 22/05/2023 16:43, Benjamin Minguez via Gcc-help wrote:
> Hello Richard,
> 
> I'm compiling for aarch64. Indeed, I was expecting conversion via conditional move or set.
> I understand that code such as the NGINX HTTP parser is suitable for such conversion. But I was expecting that, for example, this code could benefit from it (ngx_hash is an inline function and is a simple xor operation):
>>>                   if (c) {
>>>                       hash = ngx_hash(0, c);
>>>                       r->lowcase_header[0] = c;
>>>                       i = 1;
>>>                       break;
>>>                   }
> 
> Thank you for your help and your answers.
> 
> Best,
> Benjamin Minguez
> 
> -----Original Message-----
> From: Richard Earnshaw (lists) <Richard.Earnshaw@arm.com>
> Sent: Thursday, May 18, 2023 1:02 PM
> To: Benjamin Minguez <benjamin.minguez@huawei.com>; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; gcc-help@gcc.gnu.org
> Subject: Re: Condition execution optimization with gcc 7.5
> 
> On 17/05/2023 09:17, Benjamin Minguez via Gcc-help wrote:
>> Hello,
>>
>> I did add -march=armv8-a (and the other armv8.*-a variants) to the GCC command line, but it looks like the conditional execution optimization, the cond_exec_find_if_block function, is never called. I enabled all GCC dumps (-da option) and this function's debug messages are never printed.
> 
> Just to be certain, are you compiling for aarch32 (arm/thumb), or aarch64?  The latter does not support conditional execution, except via instructions such as CSEL.
> 
> [more comments lower down]
> 
>> In parallel, I also tried different versions of GCC, 9.5.0 and 11.3.0, and again I had the same results.
>>
>> Do you have any idea why this optimization step is not called?
>>
>> Thank you in advance for your help.
>>
>> Best,
>> Benjamin Minguez
>>
>> -----Original Message-----
>> From: Benjamin Minguez
>> Sent: Wednesday, May 10, 2023 8:43 AM
>> To: 'Kyrylo Tkachov' <Kyrylo.Tkachov@arm.com>; gcc-help@gcc.gnu.org
>> Subject: RE: Condition execution optimization with gcc 7.5
>>
>> Hi,
>>
>> Thank you for the answer.
>>
>> I had a look at the wrong function definition, gcc-7.5.0/gcc/target.def:
>> 	DEFHOOK
>> 	(have_conditional_execution,
>> 	 "This target hook returns true if the target supports conditional execution.\n\
>> 	This target hook is required only when the target has several different\n\
>> 	modes and they have different conditional execution capability, such as ARM.",
>> 	 bool, (void),
>> 	 default_have_conditional_execution)
>> and found this one, gcc-7.5.0/gcc/targhooks.c:
>> 	bool
>> 	default_have_conditional_execution (void)
>> 	{
>> 	  return HAVE_conditional_execution;
>> 	}
>> Finally, the macro HAVE_conditional_execution is defined here:
>> build-gcc/gcc/insn-config.h.
>>
>> I will investigate the -march or -mcpu option.
>>
>> Again, thanks a lot,
>>
>> Benjamin Minguez
>>
>> -----Original Message-----
>> From: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
>> Sent: Tuesday, May 9, 2023 11:50 AM
>> To: Benjamin Minguez <benjamin.minguez@huawei.com>;
>> gcc-help@gcc.gnu.org
>> Subject: RE: Condition execution optimization with gcc 7.5
>>
>> Hi Benjamin,
>>
>>> -----Original Message-----
>>> From: Gcc-help <gcc-help-bounces+kyrylo.tkachov=arm.com@gcc.gnu.org>
>>> On Behalf Of Benjamin Minguez via Gcc-help
>>> Sent: Tuesday, May 9, 2023 8:54 AM
>>> To: gcc-help@gcc.gnu.org
>>> Subject: Condition execution optimization with gcc 7.5
>>>
>>> Hello everyone,
>>>
>>> I'm trying to optimize an application that contains a lot of branches.
>>> I'm targeting armv8 processors and I'm using GCC 7.5.0 for compatibility reasons.
>>
>> Of course GCC 7.5 is quite old now but if you're forced to use it...
>>
>>> As the original application is similar to NGINX, I investigated on
>>> NGINX. I'm focusing on the HTTP header parsing. Basically, the
>>> algorithm parses byte by byte and, based on the value, stores some variables.
>>> Here is an example, /src/http/ngx_http_parse.c: ngx_http_parse_header_line
>>>                   if (c) {
>>>                       hash = ngx_hash(0, c);
>>>                       r->lowcase_header[0] = c;
>>>                       i = 1;
>>>                       break;
>>>                   }
>>>
>>>                   if (ch == '_') {
>>>                       if (allow_underscores) {
>>>                           hash = ngx_hash(0, ch);
>>>                           r->lowcase_header[0] = ch;
>>>                           i = 1;
>>>
>>>                       } else {
>>>                           r->invalid_header = 1;
>>>                       }
>>>
>>>                       break;
>>>                   }
> 
> Your example code isn't complete enough to do a full analysis, but I doubt code like this would generate conditional execution anyway.  There are several reasons:
> 
> 1) It's likely too long once machine instructions are generated
> 2) There are function calls (ngx_hash) in the body of the conditional blocks (calls cannot be conditionally executed); if they are inlined then see 1) above.
> 3) you have nested conditions (only the innermost block could be conditionally executed).
> 4) you wouldn't want to conditionally execute 'if (allow_underscores)'
> anyway as it's probably highly predictable as a branch.
> 
> R.
> 
>>> Also, most of the branches are not predictable because they compare
>>> against data coming from the network.
>>> From these observations, I looked at the conditional execution
>>> optimization step in GCC and found the function that should do the
>>> work: cond_exec_find_if_block. I also found how to customize the
>>> decision to use conditional instructions:
>>
>> ... This relates to the arm port i.e. the 32-bit target in Armv8-a, is that what you're targeting?
>> AArch64 has had more tuning work put into it over the years so may do better performance-wise if your processor and environment supports it.
>> If you're indeed looking at arm...
>>
>>>                   #define MAX_CONDITIONAL_EXECUTE arm_max_conditional_execute ()
>>>                   int
>>>                   arm_max_conditional_execute (void)
>>>                   {
>>>                     return max_insns_skipped;
>>>                   }
>>>                   static int max_insns_skipped = 5;
>>>
>>> I tried to compile NGINX with -O2 (which should enable if-conversion2)
>>> but I did not notice any change in the code. I enabled GCC debug dumps
>>> (-da), added some debug output to this function, and figured out that
>>> targetm.have_conditional_execution is set to false.
>>>
>>> First, do you know how to switch this variable to true? I guess it is an
>>> option during the configuration step of GCC.
>>
>> Its definition on that branch is:
>> /* Only thumb1 can't support conditional execution, so return true if
>>      the target is not thumb1.  */
>> static bool
>> arm_have_conditional_execution (void)
>> {
>>     return !TARGET_THUMB1;
>> }
>>
>> So it looks like you're maybe not setting the right -march or -mcpu option to enable the full armv8-a features?
>>
>> Thanks,
>> Kyrill
>>
>>> Then, I know that the decision to use conditional execution is based
>>> on the extra cost of computing both branches compared to the cost of a branch.
>>> In this specific case, branches are mispredicted and the cost is indeed high.
>>> Do you think that increasing max_insns_skipped will be enough to
>>> help GCC use conditional execution?
>>>
>>> Thank you in advance for your answers.
>>>
>>> Best,
>>> Benjamin Minguez
> 
> R.
> 
> 


