From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
MIME-Version: 1.0
In-Reply-To: <20160311224148.GA31239@ibm-tiger.the-meissners.org>
References: <20160311224148.GA31239@ibm-tiger.the-meissners.org>
Date:
 Fri, 11 Mar 2016 23:52:00 -0000
Subject: Re: [PATCH], Fix PR 70131, disable (double)(int) optimization for power8
From: David Edelsohn
To: Michael Meissner, GCC Patches
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2016-03/txt/msg00726.txt.bz2

On Fri, Mar 11, 2016 at 5:41 PM, Michael Meissner wrote:
> As I was auditing rs6000.md for power9 changes, I noticed that changes I had
> made in 2010 for power7 weren't as effective with power8.
>
> The FCTIWZ/FCTIWUZ instructions convert the scalar floating point value to a
> 32-bit signed/unsigned integer in bits 32-63 of the floating point or vector
> register.  Unfortunately, the hardware does not guarantee that bits 0-31 are
> copies of the sign, so that it can be used as a valid 64-bit integer.  There is
> no conversion from 32-bit int to floating point.  This meant in the power7
> days, if you wanted to round a floating point value to 32-bit integer, you
> would need to do:
>
>         convert to 32-bit integer
>         store 32-bit value on the stack
>         load 32-bit value to a GPR
>         sign/zero extend it
>         store 32-bit value to the stack
>         load 32-bit value to a FPR/vector register.
>
> The optimization does a store/load to sign/zero extend, rather than going
> through the GPRs.
>
> On power8, we have a direct move instruction that copies the value between the
> register sets, and the compiler will generate this if the above optimization is
> turned off (which is what this patch does).
>
> There are other ways to sign/zero extend a value in the vector registers
> without doing a move using multiple instructions, but in practice direct move
> seems to be as fast as the other instructions.
>
> I bootstrapped the compiler and there were no regressions with this patch.
>
> I rebuilt the Spec 2006 benchmark suite, and there were 7 of the benchmarks
> that used this sequence somewhere in the code.  I ran those benchmarks with
> this patch, and compared them to the original benchmarks.
> In 6 of the benchmarks, the run time was almost precisely the same.  The
> 416.gamess benchmark was about 2% faster, and there were no regressions.
>
> Is this patch ok to apply to the trunk?  I would like to apply it to the gcc 5
> branch as well.  Is this ok also?
>
> [gcc]
> 2016-03-11  Michael Meissner
>
>         PR target/70131
>         * config/rs6000/rs6000.md (round322_fprs): Do not do the
>         optimization if we have direct move.
>         (roundu322_fprs): Likewise.
>
> [gcc/testsuite]
> 2016-03-11  Michael Meissner
>
>         PR target/70131
>         * gcc.target/powerpc/ppc-round2.c: New test.

Okay for trunk and GCC 5.

Thanks, David
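
For readers following the thread, the source-level pattern named in the
subject line can be sketched roughly as below.  This is only a minimal
illustration of a (double)(int) round trip, not the contents of
ppc-round2.c; the function names are hypothetical and were chosen for this
sketch.

```c
/* Sketch of the (double)(int) pattern discussed in the thread.
   Function names are hypothetical, not taken from the GCC testsuite.  */

/* Truncate toward zero through a 32-bit signed integer, then convert
   the integer result back to double.  */
double
trunc_via_int (double x)
{
  return (double) (int) x;
}

/* Same round trip through a 32-bit unsigned integer.  Only meaningful
   for values that fit in unsigned int.  */
double
trunc_via_uint (double x)
{
  return (double) (unsigned int) x;
}
```

Building something like this with -O2 and comparing the assembly for
-mcpu=power7 against -mcpu=power8 should, under the assumptions in the
quoted mail, show the store/load sequence in one case and a direct move in
the other; the exact instructions depend on the compiler revision.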