From: Christophe Lyon <christophe.lyon@linaro.org>
To: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
Cc: Ramana Radhakrishnan <ramana.gcc@googlemail.com>,
Jim Wilson <jim.wilson@linaro.org>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE
Date: Wed, 17 Feb 2016 10:20:00 -0000
Message-ID: <CAKdteOZZzE+aixcoiyqPRQ7qfwZF_QLM=eW2iVLuC39XYs6KDA@mail.gmail.com>
In-Reply-To: <56C445F6.6040004@foss.arm.com>
On 17 February 2016 at 11:05, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
> On 17/02/16 10:03, Christophe Lyon wrote:
>>
>> On 15 February 2016 at 12:32, Kyrill Tkachov
>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>
>>> On 04/02/16 08:58, Ramana Radhakrishnan wrote:
>>>>
>>>> On Tue, Jun 30, 2015 at 2:15 AM, Jim Wilson <jim.wilson@linaro.org>
>>>> wrote:
>>>>>
>>>>> This is my suggested fix for PR 65932, which is a linux kernel
>>>>> miscompile with gcc-5.1.
>>>>>
>>>>> The problem here is caused by a chain of events. The first is that
>>>>> the relatively new eipa_sra pass creates fake parameters that behave
>>>>> slightly differently than normal parameters. The second is that the
>>>>> optimizer creates phi nodes that copy local variables to fake
>>>>> parameters and/or vice versa. The third is that the out-of-ssa pass
>>>>> assumes that it can emit simple move instructions for these phi nodes.
>>>>> And the fourth is that the ARM port has a PROMOTE_MODE macro that
>>>>> forces QImode and HImode to unsigned, but a
>>>>> TARGET_PROMOTE_FUNCTION_MODE hook that does not. So signed char and
>>>>> short parameters have different in-register representations than local
>>>>> variables, and require a conversion when copying between them, a
>>>>> conversion that the out-of-ssa pass can't easily emit.
>>>>>
>>>>> Ultimately, I think this is a problem in the arm backend. It should
>>>>> not have a PROMOTE_MODE macro that is changing the sign of char and
>>>>> short local variables. I also think that we should merge the
>>>>> PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to
>>>>> prevent this from happening again.
>>>>>
>>>>> I see four general problems with the current ARM PROMOTE_MODE
>>>>> definition.
>>>>> 1) Unsigned char is only faster for armv5 and earlier, before the sxtb
>>>>> instruction was added. It is a loss for armv6 and later.
>>>>> 2) Unsigned short was only faster for targets that don't support
>>>>> unaligned accesses. Support for these targets was removed a while
>>>>> ago, and this PROMOTE_MODE hunk should have been removed at the same
>>>>> time. It was accidentally left behind.
>>>>> 3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook; when it was
>>>>> converted to a function, the PROMOTE_MODE code was copied without the
>>>>> UNSIGNEDP changes. Thus it is only an accident that
>>>>> TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree. Changing
>>>>> TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE
>>>>> changes to resolve the difference are safe.
>>>>> 4) There is a general principle that you should only change signedness
>>>>> in PROMOTE_MODE if the hardware forces it, as otherwise this results
>>>>> in extra conversion instructions that make code slower. The mips64
>>>>> hardware for instance requires that 32-bit values be sign-extended
>>>>> regardless of type, and instructions may trap if this is not true.
>>>>> However, it has a set of 32-bit instructions that operate on these
>>>>> values, and hence no conversions are required. There is no similar
>>>>> case on ARM. Thus the conversions are unnecessary and unwise. This
>>>>> can be seen in the testcases where gcc emits both a zero-extend and a
>>>>> sign-extend inside a loop, as the sign-extend is required for a
>>>>> compare, and the zero-extend is required by PROMOTE_MODE.
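
To make the representation mismatch described above concrete, here is a
minimal sketch in C. It only models 32-bit register contents; it is not
GCC code, and all function names are hypothetical. It assumes that signed
char parameters are sign-extended while signed char locals are
zero-extended under the old PROMOTE_MODE, as the text states.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed in-register form of a signed char parameter:
   sign-extended to 32 bits.  */
uint32_t promote_sign_extended(int8_t v)
{
  return (uint32_t)(int32_t)v;
}

/* Assumed in-register form of a signed char local under the old
   PROMOTE_MODE: forced to unsigned, i.e. zero-extended.  */
uint32_t promote_zero_extended(int8_t v)
{
  return (uint32_t)(uint8_t)v;
}

/* The conversion a plain register move does not perform:
   sign-extend the low byte, which is what sxtb does.  */
uint32_t sign_extend_low_byte(uint32_t reg)
{
  return ((reg & 0xFFu) ^ 0x80u) - 0x80u;
}

/* Hypothetical self-check showing where the two forms diverge.  */
void check_promotion_model(void)
{
  /* Non-negative values look the same in both representations.  */
  assert(promote_sign_extended(5) == promote_zero_extended(5));
  /* Negative values do not, so a simple move is not enough.  */
  assert(promote_sign_extended(-1) != promote_zero_extended(-1));
  /* An explicit extension reconciles the two.  */
  assert(sign_extend_low_byte(promote_zero_extended(-1))
         == promote_sign_extended(-1));
}
```

In this model a copy between a parameter and a local is only correct for
negative values if something like sign_extend_low_byte is applied, which
is the conversion the out-of-ssa pass cannot easily emit.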
>>>>
>>>> Given Kyrill's testing with the patch and the reasonably detailed
>>>> check of the effects of the code generation changes, the arm.h hunk is
>>>> ok. I do think we should make it explicit in the documentation that
>>>> PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should agree and,
>>>> better still, maybe put in a checking assert for the same in the
>>>> mid-end, but that could be the subject of a follow-up patch.
>>>>
>>>> Ok to apply just the arm.h hunk as I think Kyrill has taken care of
>>>> the testsuite fallout separately.
>>>
>>> Hi all,
>>>
>>> I'd like to backport the arm.h hunk from this (r233130) to the GCC 5
>>> branch. As the CSE patch from my series had some fallout on x86_64
>>> due to a deficiency in the AVX patterns that is too invasive to fix
>>> at this stage (and presumably backport), I'd like to just backport
>>> this arm.h fix and adjust the tests to XFAIL the fallout that comes
>>> with not applying the CSE patch. The attached patch does that.
>>>
>>> The code quality fallout on code outside the testsuite is not
>>> that great. The SPEC benchmarks are not affected by not applying
>>> the CSE change, and only a single sequence in a popular embedded
>>> benchmark shows some degradation for -mtune=cortex-a9 in the same
>>> way as the wmul-1.c and wmul-2.c tests.
>>>
>>> I think that's a fair tradeoff for fixing the wrong code bug on that
>>> branch.
>>>
>>> Ok to backport r233130 and the attached testsuite patch to the GCC 5
>>> branch?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2016-02-15 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>>
>>> PR target/65932
>>> * gcc.target/arm/wmul-1.c: Add -mtune=cortex-a9 to dg-options.
>>> xfail the scan-assembler test.
>>> * gcc.target/arm/wmul-2.c: Likewise.
>>> * gcc.target/arm/wmul-3.c: Simplify test to generate a single
>>> smulbb.
>>>
>>>
>> Hi Kyrill,
>>
>> I've noticed that wmul-3 still fails on the gcc-5 branch when forcing GCC
>> configuration to:
>> --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
>> (target arm-none-linux-gnueabihf)
>>
>> The generated code is:
>> sxth r0, r0
>> sxth r1, r1
>> mul r0, r1, r0
>> instead of
>> smulbb r0, r1, r0
>> on trunk.
>>
>> I guess we don't worry?
>
>
> Hi Christophe,
> Hmmm, I suspect we might want to backport
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01714.html
> to fix the backend costing logic for smulbb.
> Could you please try that patch to see if it helps?
>
Ha indeed, with the attached patch, we now generate smulbb.
I didn't run a full make check though.
OK with a suitable ChangeLog entry?
Christophe.
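
For reference, the operation at stake is a signed 16x16->32 multiply. The
sketch below is not the wmul-3.c test and not GCC code; the function names
are hypothetical. It models in plain C what sxth and smulbb compute on
32-bit register values, to show why the two sequences quoted above give
the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the bottom halfword of a register value (what sxth does).  */
int32_t sext16(uint32_t reg)
{
  return (int32_t)((reg & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

/* Source-level form: a signed 16-bit by 16-bit multiply widened to 32 bits.  */
int32_t mul16(int16_t a, int16_t b)
{
  return (int32_t)a * (int32_t)b;
}

/* Register-level model of smulbb: multiply the signed bottom halfwords of
   both operands and ignore the top halfwords.  The sxth/sxth/mul sequence
   computes the same value in three instructions.  */
int32_t smulbb_model(uint32_t rn, uint32_t rm)
{
  return sext16(rn) * sext16(rm);
}

/* Hypothetical self-check: the top halfwords do not affect the result.  */
void check_smulbb_model(void)
{
  /* 0xFFFD is -3 as a signed halfword; 0x0004 is 4.  */
  assert(smulbb_model(0xABCDFFFDu, 0x12340004u) == mul16(-3, 4));
}
```

Since both sequences are equivalent, which one the compiler picks comes
down to the backend cost of the multiply, which is what the attached patch
adjusts.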
> Thanks,
> Kyrill
>
>
>>>
>>>> regards
>>>> Ramana
>>>>
>>>>
>>>>
>>>>
>>>>> My change was tested with an arm bootstrap, make check, and SPEC
>>>>> CPU2000 run. The original poster verified that this gives a linux
>>>>> kernel that boots correctly.
>>>>>
>>>>> The PROMOTE_MODE change causes 3 testsuite testcases to fail. These
>>>>> are tests to verify that smulbb and/or smlabb are generated.
>>>>> Eliminating the unnecessary sign conversions causes us to get better
>>>>> code that doesn't include the smulbb and smlabb instructions. I had
>>>>> to modify the testcases to get them to emit the desired instructions.
>>>>> With the testcase changes there are no additional testsuite failures,
>>>>> though I'm concerned that these testcases with the changes may be
>>>>> fragile, and future changes may break them again.
>>>>
>>>>
>>>>
>>>>> If there are ARM parts where smulbb/smlabb are faster than mul/mla,
>>>>> then maybe we should try to add new patterns to get the instructions
>>>>> emitted again for the unmodified testcases.
>>>>>
>>>>> Jim
>>>
>>>
>
[-- Attachment #2: smulbb.patch --]
[-- Type: text/x-diff, Size: 626 bytes --]
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 233484)
+++ gcc/config/arm/arm.c (working copy)
@@ -10306,8 +10306,10 @@
/* SMUL[TB][TB]. */
if (speed_p)
*cost += extra_cost->mult[0].extend;
- *cost += (rtx_cost (XEXP (x, 0), SIGN_EXTEND, 0, speed_p)
- + rtx_cost (XEXP (x, 1), SIGN_EXTEND, 0, speed_p));
+ *cost += rtx_cost (XEXP (XEXP (x, 0), 0),
+ SIGN_EXTEND, 0, speed_p);
+ *cost += rtx_cost (XEXP (XEXP (x, 1), 0),
+ SIGN_EXTEND, 1, speed_p);
return true;
}
if (speed_p)