From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
To: Benjamin Minguez <benjamin.minguez@huawei.com>,
Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>,
"gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
Subject: Re: Condition execution optimization with gcc 7.5
Date: Mon, 22 May 2023 17:12:07 +0100
Message-ID: <52514568-bf71-b750-b4cf-3c532271ba01@arm.com>
In-Reply-To: <b943a9b284b14fd996f548b9517b6de6@huawei.com>
On aarch64 this code cannot use conditional select. An operation such as
    if (c) {
        ...
        r->lowcase_header[0] = c;
        ...
    }
would be a conditional store to memory and can only happen if the
guarding condition is true. It's not safe to convert this into, say
    cmp c, #0
    ...
    ldr w1, [ptr]
    csel w1, w1, c, eq
    str w1, [ptr]
because the store would introduce a possible race with any other thread
that might be writing to the same location. The compiler would also
have to prove that ptr always contained a valid address when 'c' was
false as well, something that might not be possible given the
information available.
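To make the distinction concrete, here is a minimal C sketch of the three cases. It is only an illustration: struct request, toy_hash and the function names are hypothetical stand-ins, not the NGINX definitions, and toy_hash simply follows the "simple xor" description given in this thread.

```c
/* Hypothetical stand-in for the NGINX request structure. */
struct request {
    unsigned char lowcase_header[32];
};

/* Hypothetical stand-in for ngx_hash, using the xor described in the thread. */
static unsigned toy_hash(unsigned key, unsigned char c)
{
    return key ^ c;
}

/* Original shape: memory is written only when c is non-zero,
   and r is never dereferenced when c is zero. */
void store_if_set(struct request *r, unsigned char c)
{
    if (c) {
        r->lowcase_header[0] = c;
    }
}

/* What the ldr/csel/str sequence would amount to: a load and a store on
   every call.  The store can race with another thread writing the same
   byte, and r must be a valid pointer even when c is zero. */
void store_always(struct request *r, unsigned char c)
{
    unsigned char old = r->lowcase_header[0];
    r->lowcase_header[0] = c ? c : old;
}

/* A value that lives only in registers has neither problem, so a
   conditional select is a legitimate choice for the compiler here. */
unsigned select_hash(unsigned hash, unsigned char c)
{
    return c ? toy_hash(0, c) : hash;
}
```

In a single thread store_if_set and store_always give the same result whenever r is valid, which is why the transformation looks tempting. The difference only shows up with concurrent writers or with an invalid r when c is zero, and the compiler generally cannot rule those out.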
The function arm_max_conditional_execute is only used for 32-bit arm
targets. It's not part of the aarch64 compiler.
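If there is any doubt about which back end a given command line selects, the compiler's predefined macros can be checked. The sketch below assumes the usual GCC macros __aarch64__, __arm__ and __thumb__; the function name target_isa is made up for the example.

```c
/* Report which instruction set this translation unit was compiled for.
   __thumb__ is tested before __arm__ because the 32-bit arm compiler
   defines __arm__ in Thumb state as well. */
const char *target_isa(void)
{
#if defined(__aarch64__)
    return "aarch64";   /* no cond_exec pass; CSEL-style if-conversion only */
#elif defined(__thumb__)
    return "thumb";     /* 32-bit arm port, Thumb state */
#elif defined(__arm__)
    return "arm";       /* 32-bit arm port, Arm state */
#else
    return "other";
#endif
}
```

The same macros can be inspected without compiling anything by running the compiler driver with -dM -E on an empty input and looking for these names in the output.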
R.
On 22/05/2023 16:43, Benjamin Minguez via Gcc-help wrote:
> Hello Richard,
>
> I'm compiling for aarch64. Indeed, I was expecting conversion via conditional move or set.
> I understand that code such as the NGINX HTTP parser is suitable for such conversion. But I was expecting that, for example, this code could benefit from it (ngx_hash is an inline function and is a simple xor operation):
>>> if (c) {
>>> hash = ngx_hash(0, c);
>>> r->lowcase_header[0] = c;
>>> i = 1;
>>> break;
>>> }
>
> Thank you for your help and your answers.
>
> Best,
> Benjamin Minguez
>
> -----Original Message-----
> From: Richard Earnshaw (lists) <Richard.Earnshaw@arm.com>
> Sent: Thursday, May 18, 2023 1:02 PM
> To: Benjamin Minguez <benjamin.minguez@huawei.com>; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; gcc-help@gcc.gnu.org
> Subject: Re: Condition execution optimization with gcc 7.5
>
> On 17/05/2023 09:17, Benjamin Minguez via Gcc-help wrote:
>> Hello,
>>
>> I did add -march=armv8-a (and the other armv8.*-a variants) to the GCC command line, but it looks like the conditional execution optimization, the cond_exec_find_if_block function, is never called. I enabled all GCC dumps (-da option) and this function's debug messages are never printed.
>
> Just to be certain, are you compiling for aarch32 (arm/thumb), or aarch64? The latter does not support conditional execution, except via instructions such as CSEL.
>
> [more comments lower down]
>
>> In parallel, I also tried different versions of GCC, 9.5.0 and 11.3.0, and again I had the same results.
>>
>> Do you have any idea why this optimization step is not called?
>>
>> Thank you in advance for your help.
>>
>> Best,
>> Benjamin Minguez
>>
>> -----Original Message-----
>> From: Benjamin Minguez
>> Sent: Wednesday, May 10, 2023 8:43 AM
>> To: 'Kyrylo Tkachov' <Kyrylo.Tkachov@arm.com>; gcc-help@gcc.gnu.org
>> Subject: RE: Condition execution optimization with gcc 7.5
>>
>> Hi,
>>
>> Thank you for the answer.
>>
>> I had a look at the wrong function definition, gcc-7.5.0/gcc/target.def:
>> DEFHOOK
>> (have_conditional_execution,
>> "This target hook returns true if the target supports conditional execution.\n\
>> This target hook is required only when the target has several different\n\
>> modes and they have different conditional execution capability, such as ARM.",
>> bool, (void),
>> default_have_conditional_execution)
>> and found this one, gcc-7.5.0/gcc/targhooks.c:
>> bool
>> default_have_conditional_execution (void)
>> {
>>   return HAVE_conditional_execution;
>> }
>> Finally, the macro HAVE_conditional_execution is defined here:
>> build-gcc/gcc/insn-config.h.
>>
>> I will investigate the -march or -mcpu option.
>>
>> Again, thanks a lot,
>>
>> Benjamin Minguez
>>
>> -----Original Message-----
>> From: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
>> Sent: Tuesday, May 9, 2023 11:50 AM
>> To: Benjamin Minguez <benjamin.minguez@huawei.com>;
>> gcc-help@gcc.gnu.org
>> Subject: RE: Condition execution optimization with gcc 7.5
>>
>> Hi Benjamin,
>>
>>> -----Original Message-----
>>> From: Gcc-help <gcc-help-bounces+kyrylo.tkachov=arm.com@gcc.gnu.org>
>>> On Behalf Of Benjamin Minguez via Gcc-help
>>> Sent: Tuesday, May 9, 2023 8:54 AM
>>> To: gcc-help@gcc.gnu.org
>>> Subject: Condition execution optimization with gcc 7.5
>>>
>>> Hello everyone,
>>>
>>> I'm trying to optimize an application that contains a lot of branches.
>>> I'm targeting armv8 processors and I'm using GCC 7.5.0 for compatibility reason.
>>
>> Of course GCC 7.5 is quite old now but if you're forced to use it...
>>
>>> As the original application is similar to NGINX, I investigated on
>>> NGINX. I'm focusing on the HTTP header parsing. Basically, the
>>> algorithm parses byte by byte and, based on the value, stores some variables.
>>> Here is an example, /src/http/ngx_http_parse.c: ngx_http_parse_header_line
>>>     if (c) {
>>>         hash = ngx_hash(0, c);
>>>         r->lowcase_header[0] = c;
>>>         i = 1;
>>>         break;
>>>     }
>>>
>>>     if (ch == '_') {
>>>         if (allow_underscores) {
>>>             hash = ngx_hash(0, ch);
>>>             r->lowcase_header[0] = ch;
>>>             i = 1;
>>>
>>>         } else {
>>>             r->invalid_header = 1;
>>>         }
>>>
>>>         break;
>>>     }
>
> Your example code isn't complete enough to do a full analysis, but I doubt code like this would generate conditional execution anyway. There are several reasons:
>
> 1) It's likely too long once machine instructions are generated
> 2) There are function calls (ngx_hash) in the body of the conditional blocks (calls cannot be conditionally executed); if they are inlined then see 1) above.
> 3) you have nested conditions (only the innermost block could be conditionally executed).
> 4) you wouldn't want to conditionally execute 'if (allow_underscores)'
> anyway as it's probably highly predictable as a branch.
>
> R.
>
>>> Also, most of the branches are not predictable because they compare against
>>> data coming from the network.
>>> From these observations, I looked at the conditional execution
>>> optimization step in GCC and I found this function that should do the work:
>>> cond_exec_find_if_block. And how to customize the decision to use
>>> conditional instructions:
>>
>> ... This relates to the arm port i.e. the 32-bit target in Armv8-a, is that what you're targeting?
>> AArch64 has had more tuning work put into it over the years so may do better performance-wise if your processor and environment supports it.
>> If you're indeed looking at arm...
>>
>>> #define MAX_CONDITIONAL_EXECUTE arm_max_conditional_execute ()
>>> int
>>> arm_max_conditional_execute (void)
>>> {
>>>   return max_insns_skipped;
>>> }
>>> static int max_insns_skipped = 5;
>>>
>>> I tried to compile NGINX at -O2 (that should enable if-conversion2)
>>> but I did not notice any change in the code. I enabled GCC debug (-da)
>>> and also added some debug output in this function, and I figured out that
>>> targetm.have_conditional_execution is set to false.
>>>
>>> First, do you know how to switch this variable to true? I guess it is an
>>> option during the configuration step of GCC.
>>
>> Its definition on that branch is:
>> /* Only thumb1 can't support conditional execution, so return true if
>>    the target is not thumb1.  */
>> static bool
>> arm_have_conditional_execution (void)
>> {
>>   return !TARGET_THUMB1;
>> }
>>
>> So it looks like you're maybe not setting the right -march or -mcpu option to enable the full armv8-a features?
>>
>> Thanks,
>> Kyrill
>>
>>> Then, I know that the decision to use conditional execution is based
>>> on the extra cost added to compute both branches compared to the cost of a branch.
>>> In this specific case, branches are mispredicted and the cost is, indeed, high.
>>> Do you think that increasing max_insns_skipped will be enough to
>>> help GCC use conditional execution?
>>>
>>> Thank you in advance for your answers.
>>>
>>> Best,
>>> Benjamin Minguez
>
> R.
>
>
Thread overview: 8+ messages
2023-05-09 7:54 Benjamin Minguez
2023-05-09 9:49 ` Kyrylo Tkachov
2023-05-10 6:42 ` Benjamin Minguez
2023-05-17 8:17 ` Benjamin Minguez
2023-05-18 11:02 ` Richard Earnshaw (lists)
2023-05-22 15:43 ` Benjamin Minguez
2023-05-22 16:12 ` Richard Earnshaw (lists) [this message]
2023-05-23 6:36 ` Benjamin Minguez