Subject: Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point library.
To: Claudiu Zissulescu, "gcc-patches@gcc.gnu.org"
Cc: "Francois.Bedard@synopsys.com", "jeremy.bennett@embecosm.com"
From: Joern Wolfgang Rennecke
Date: Fri, 29 Apr 2016 10:37:00 -0000

On 29/04/16 11:31, Claudiu Zissulescu wrote:
> It should do the job, at least for EM where the jump takes 2 cycles, and by means of using delay slots we can make all the cycles count. HS has a branch prediction mechanism, hence, filling up the delay slot doesn't have such a big impact like in EM or even earlier CPUs.

No, the alternative is to hide the delay slot, so if the branch is predicted properly, the case with different high words should be faster without the .d suffix. I.e., eagerly filling the delay slot like this has a bigger - negative - impact on performance.