From: Maxim Kuvyrkov
To: Wilco Dijkstra
Cc: GCC Patches
Subject: Re: [PATCH][AArch64] Use conditional negate for abs expansion
Date: Wed, 04 Mar 2015 15:16:00 -0000
In-Reply-To: <000201d0568b$eb9ad4a0$c2d07de0$@com>
References: <000901d055cd$ce6ed110$6b4c7330$@com> <000101d05677$0f973890$2ec5a9b0$@com> <000201d0568b$eb9ad4a0$c2d07de0$@com>
X-SW-Source: 2015-03/txt/msg00224.txt.bz2

On Mar 4, 2015, at 6:00 PM, Wilco Dijkstra wrote:
>
>> Maxim Kuvyrkov wrote:
>>> On Mar 4, 2015, at 3:30 PM, Wilco Dijkstra wrote:
>>>
>>>> Maxim Kuvyrkov wrote:
>>>>
>>>> You are removing the 2nd alternative that generates "abs" with your
>>>> patch. While I agree that using "csneg" is faster on all
>>>> implementations, can you say the same for "abs"? Especially given
>>>> the fact that csneg requires 4 operands instead of abs'es 2?
>>>
>>> Yes, given that latencies of scalar SIMD instructions are typically
>>> worse than integer latencies. The number of operands is not an issue.
>>
>> One could make an argument that having an opportunity to use FP
>> registers in high-reg-pressure code is valuable. But ... I can take
>> your word for it. All-in-all, I don't have objections to your patch
>> (note, this is a review, not approval, since I'm not an ARM
>> maintainer).
>
> The idea of spilling between integer and FP register files sounds
> great but GCC doesn't support it unfortunately...
>
>>>> Wouldn't it be better to have (define_expand "abs2") that would
>>>> expand into either csneg3 or second alternative of current absdi2?
>>>
>>> How would that be possible? You'd have to delay expansion until
>>> after register allocation, which loses optimization opportunities.
>>
>> If abs2 define_expand is given arguments in DI mode that are FP
>> register -- emit absdi2_insn. Otherwise (which will be 95% of the
>> time) do what your patch does.
>
> I don't understand how that could ever happen - the new expansion
> happens before register allocation, so you'd never see FP registers
> as arguments.

A new instruction can be generated after reload by folding
optimizations, combine, or other late optimizations. It's rare, but it
happens.

>
>>> And I still don't see how it would ever make sense to execute
>>> integer operations as scalar SIMD.
>>>
>>
>> Many GCC ports exploit ability to use FP registers for storage of
>> integer values for the benefit of high-reg-pressure code. Ability to
>> do some basic operations (like abs) on such integer values is
>> beneficial for the same reason.
>
> I haven't seen any benefit - my previous patches severely discourage
> this kind of bad allocation and gave major performance gains. Despite
> that it still happens way too often.
>
>> However, almost always FP-alternatives are not the only ones in the
>> instruction pattern, and they are discouraged with "!". If you can't
>> use "!" to discourage them (which you can't in a single-alternative
>> case, like we got here) then your approach makes sense. [Yes, I
>> know, you could fit that FP alternative into csneg pattern, but,
>> probably, you would not want to spend your time on it, and I don't
>> think it's important enough to ask you to.]
>
> Did you mean '?' ? It appears '?' discourages incorrect allocations
> while '*' seems to make things worse. Besides that, there are various
> issues with the cost calculation in ira-costs.c that result in
> incorrect register preferences.

Yes, I meant "*" and "?".

Thanks,

--
Maxim Kuvyrkov
www.linaro.org
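
For readers following the thread, here is a minimal C sketch of what "abs as a conditional negate" means. It is an illustration only, not code from the patch: the function names csneg_model and abs_via_csneg are hypothetical, and the sketch models just the documented result of the A64 CSNEG instruction (Rd = cond ? Rn : -Rm), not GCC's actual expander or the register-allocation concerns discussed above.

```c
#include <stdint.h>

/* Hypothetical model of A64 "CSNEG Xd, Xn, Xm, cond":
   the result is Xn when the condition holds, otherwise -Xm. */
static int64_t csneg_model(int cond, int64_t n, int64_t m)
{
    /* Negate in unsigned arithmetic so INT64_MIN wraps the way the
       hardware does instead of overflowing a signed type.  Converting
       the out-of-range unsigned value back to int64_t is
       implementation-defined in C; GCC wraps modulo 2^64. */
    return cond ? n : (int64_t)(0 - (uint64_t)m);
}

/* abs(x) as compare-with-zero followed by a conditional negate,
   i.e. roughly "cmp x0, #0" then "csneg x0, x0, x0, ge". */
static int64_t abs_via_csneg(int64_t x)
{
    return csneg_model(x >= 0, x, x);
}
```

Both source operands of the conditional negate are the same value here, which is why the higher operand count of csneg compared to abs does not by itself cost an extra register in this use.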