From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 79741 invoked by alias); 7 Mar 2016 12:55:31 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 79686 invoked by uid 89); 7 Mar 2016 12:55:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy=jimwilsonlinaroorg, jim.wilson@linaro.org, quality, fourth X-HELO: mail-qg0-f50.google.com Received: from mail-qg0-f50.google.com (HELO mail-qg0-f50.google.com) (209.85.192.50) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Mon, 07 Mar 2016 12:55:28 +0000 Received: by mail-qg0-f50.google.com with SMTP id y89so94632612qge.2 for ; Mon, 07 Mar 2016 04:55:28 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc; bh=plzVVI81lt/Ln45BxKjndNZ5qmC52h/dgpkT/ZlNHiI=; b=XKrMZOTEfItuEf1tjAlrj0c09JIvg3PATkLCf4DVygraoWmHpgS3Pja+vSJfu947IS yEEnNolO4DDqhhckOJgQvaeHbsFTB36vwLMrmY/AWOlb3jUuOvFe/M0NTAQS7wJKZxYq JzvhW9JkAk9amhJoXB9rP+QtE1YH/ujb7wCwRCLaVx+j8Ra4FcclB4R7KloFSNtLcLKK /Wb9/lnAOhY8Vmlm9hjdlUmq6lfMtlhplddKHIuIHegKcYriJ+ApVLalCQJha5PRS7o7 6dgE5AunzuMZ2V5gPZj/jgLwYZJt0iVBt2EqPoV63MZMPZ76QeXcGxtZxfHZX0rYm2Fp 0gmA== X-Gm-Message-State: AD7BkJL5rFr7qra1WqbpUfBWCNaOePxfIMEz1QCkwspZUEmy7FLHRNo/lkbboCkiyiptN8VhFur1igQz3u6Dw7op MIME-Version: 1.0 X-Received: by 10.140.234.11 with SMTP id f11mr30136567qhc.51.1457355326403; Mon, 07 Mar 2016 04:55:26 -0800 (PST) Received: by 10.140.22.164 with HTTP; Mon, 7 Mar 2016 04:55:26 -0800 (PST) In-Reply-To: References: <56C1B74D.4070009@foss.arm.com> <56C445F6.6040004@foss.arm.com> Date: Mon, 07 Mar 2016 12:55:00 -0000 Message-ID: Subject: Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE From: Christophe Lyon To: Ramana Radhakrishnan Cc: Kyrill Tkachov , Jim Wilson , "gcc-patches@gcc.gnu.org" Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2016-03/txt/msg00479.txt.bz2 On 7 March 2016 at 05:43, Ramana Radhakrishnan wrote: > On Wed, Feb 17, 2016 at 10:20 AM, Christophe Lyon > wrote: >> On 17 February 2016 at 11:05, Kyrill Tkachov >> wrote: >>> >>> On 17/02/16 10:03, Christophe Lyon wrote: >>>> >>>> On 15 February 2016 at 12:32, Kyrill Tkachov >>>> wrote: >>>>> >>>>> On 04/02/16 08:58, Ramana Radhakrishnan wrote: >>>>>> >>>>>> On Tue, Jun 30, 2015 at 2:15 AM, Jim Wilson >>>>>> wrote: >>>>>>> >>>>>>> This is my suggested fix for PR 65932, which is a linux kernel >>>>>>> miscompile with gcc-5.1. >>>>>>> >>>>>>> The problem here is caused by a chain of events. The first is that >>>>>>> the relatively new eipa_sra pass creates fake parameters that behave >>>>>>> slightly differently than normal parameters. The second is that the >>>>>>> optimizer creates phi nodes that copy local variables to fake >>>>>>> parameters and/or vice versa. The third is that the ouf-of-ssa pass >>>>>>> assumes that it can emit simple move instructions for these phi nodes. >>>>>>> And the fourth is that the ARM port has a PROMOTE_MODE macro that >>>>>>> forces QImode and HImode to unsigned, but a >>>>>>> TARGET_PROMOTE_FUNCTION_MODE hook that does not. So signed char and >>>>>>> short parameters have different in register representations than local >>>>>>> variables, and require a conversion when copying between them, a >>>>>>> conversion that the out-of-ssa pass can't easily emit. >>>>>>> >>>>>>> Ultimately, I think this is a problem in the arm backend. It should >>>>>>> not have a PROMOTE_MODE macro that is changing the sign of char and >>>>>>> short local variables. I also think that we should merge the >>>>>>> PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to >>>>>>> prevent this from happening again. >>>>>>> >>>>>>> I see four general problems with the current ARM PROMOTE_MODE >>>>>>> definition. >>>>>>> 1) Unsigned char is only faster for armv5 and earlier, before the sxtb >>>>>>> instruction was added. It is a lose for armv6 and later. >>>>>>> 2) Unsigned short was only faster for targets that don't support >>>>>>> unaligned accesses. Support for these targets was removed a while >>>>>>> ago, and this PROMODE_MODE hunk should have been removed at the same >>>>>>> time. It was accidentally left behind. >>>>>>> 3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook, when it was >>>>>>> converted to a function, the PROMOTE_MODE code was copied without the >>>>>>> UNSIGNEDP changes. Thus it is only an accident that >>>>>>> TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree. Changing >>>>>>> TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE >>>>>>> changes to resolve the difference are safe. >>>>>>> 4) There is a general principle that you should only change signedness >>>>>>> in PROMOTE_MODE if the hardware forces it, as otherwise this results >>>>>>> in extra conversion instructions that make code slower. The mips64 >>>>>>> hardware for instance requires that 32-bit values be sign-extended >>>>>>> regardless of type, and instructions may trap if this is not true. >>>>>>> However, it has a set of 32-bit instructions that operate on these >>>>>>> values, and hence no conversions are required. There is no similar >>>>>>> case on ARM. Thus the conversions are unnecessary and unwise. This >>>>>>> can be seen in the testcases where gcc emits both a zero-extend and a >>>>>>> sign-extend inside a loop, as the sign-extend is required for a >>>>>>> compare, and the zero-extend is required by PROMOTE_MODE. >>>>>> >>>>>> Given Kyrill's testing with the patch and the reasonably detailed >>>>>> check of the effects of code generation changes - The arm.h hunk is ok >>>>>> - I do think we should make this explicit in the documentation that >>>>>> TARGET_PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should agree and >>>>>> better still maybe put in a checking assert for the same in the >>>>>> mid-end but that could be the subject of a follow-up patch. >>>>>> >>>>>> Ok to apply just the arm.h hunk as I think Kyrill has taken care of >>>>>> the testsuite fallout separately. >>>>> >>>>> Hi all, >>>>> >>>>> I'd like to backport the arm.h from this ( r233130) to the GCC 5 >>>>> branch. As the CSE patch from my series had some fallout on x86_64 >>>>> due to a deficiency in the AVX patterns that is too invasive to fix >>>>> at this stage (and presumably backport), I'd like to just backport >>>>> this arm.h fix and adjust the tests to XFAIL the fallout that comes >>>>> with not applying the CSE patch. The attached patch does that. >>>>> >>>>> The code quality fallout on code outside the testsuite is not >>>>> that gread. The SPEC benchmarks are not affected by not applying >>>>> the CSE change, and only a single sequence in a popular embedded >>>>> benchmark >>>>> shows some degradation for -mtune=cortex-a9 in the same way as the >>>>> wmul-1.c and wmul-2.c tests. >>>>> >>>>> I think that's a fair tradeoff for fixing the wrong code bug on that >>>>> branch. >>>>> >>>>> Ok to backport r233130 and the attached testsuite patch to the GCC 5 >>>>> branch? >>>>> >>>>> Thanks, >>>>> Kyrill >>>>> >>>>> 2016-02-15 Kyrylo Tkachov >>>>> >>>>> PR target/65932 >>>>> * gcc.target/arm/wmul-1.c: Add -mtune=cortex-a9 to dg-options. >>>>> xfail the scan-assembler test. >>>>> * gcc.target/arm/wmul-2.c: Likewise. >>>>> * gcc.target/arm/wmul-3.c: Simplify test to generate a single >>>>> smulbb. >>>>> >>>>> >>>> Hi Kyrill, >>>> >>>> I've noticed that wmul-3 still fails on the gcc-5 branch when forcing GCC >>>> configuration to: >>>> --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16 >>>> (target arm-none-linux-gnueabihf) >>>> >>>> The generated code is: >>>> sxth r0, r0 >>>> sxth r1, r1 >>>> mul r0, r1, r0 >>>> instead of >>>> smulbb r0, r1, r0 >>>> on trunk. >>>> >>>> I guess we don't worry? >>> >>> >>> Hi Christophe, >>> Hmmm, I suspect we might want to backport >>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01714.html >>> to fix backend the costing logic of smulbb. >>> Could you please try that patch to see if it helps? >>> >> >> Ha indeed, with the attached patch, we now generate smulbb. >> I didn't run a full make check though. >> >> OK with a suitable ChangeLog entry? > > Ok with a suitable Changelog entry and if no regressions. > Thanks, committed as r2340169, with no regression: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/gcc-5-branch/234019/report-build-info.html > Ramana > >> >> Christophe. >> >>> Thanks, >>> Kyrill >>> >>> >>>>> >>>>>> regards >>>>>> Ramana >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> My change was tested with an arm bootstrap, make check, and SPEC >>>>>>> CPU2000 run. The original poster verified that this gives a linux >>>>>>> kernel that boots correctly. >>>>>>> >>>>>>> The PRMOTE_MODE change causes 3 testsuite testcases to fail. These >>>>>>> are tests to verify that smulbb and/or smlabb are generated. >>>>>>> Eliminating the unnecessary sign conversions causes us to get better >>>>>>> code that doesn't include the smulbb and smlabb instructions. I had >>>>>>> to modify the testcases to get them to emit the desired instructions. >>>>>>> With the testcase changes there are no additional testsuite failures, >>>>>>> though I'm concerned that these testcases with the changes may be >>>>>>> fragile, and future changes may break them again. >>>>>> >>>>>> >>>>>> >>>>>>> If there are ARM parts where smulbb/smlabb are faster than mul/mla, >>>>>>> then maybe we should try to add new patterns to get the instructions >>>>>>> emitted again for the unmodified testcases. >>>>>>> >>>>>>> Jim >>>>> >>>>> >>>