From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 30240 invoked by alias); 13 Dec 2014 17:50:15 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 30213 invoked by uid 48); 13 Dec 2014 17:50:11 -0000
From: "olegendo at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/64299] New: [SH] improve FPSCR.PR mode switching by reordering insns
Date: Sat, 13 Dec 2014 17:50:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: olegendo at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter cf_gcctarget
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-12/txt/msg01549.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64299

            Bug ID: 64299
           Summary: [SH] improve FPSCR.PR mode switching by reordering insns
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
            Target: sh*-*-*

Compiling the following function with -O2 results in 4 FPSCR.PR mode switches:

double test_0 (const float* a, const float* b, const double* c, float x)
{
  float aa = a[0] * b[0];   // single
  double cc = c[0] + c[1];  // double
  aa += b[1] * b[2];        // single
  cc += c[2] + c[3];        // double
  aa += b[3] * b[4];        // single
  return aa / cc;           // double
}

Since the calculations are independent
and FPSCR flags are not read between the operations, the fp insns can be
reordered to reduce the number of mode switches.  The resulting code should be
the same as when doing the reordering manually:

double test_1 (const float* a, const float* b, const double* c, float x)
{
  float aa = a[0] * b[0] + b[1] * b[2] + b[3] * b[4];  // single
  double cc = c[0] + c[1] + c[2] + c[3];               // double
  return aa / cc;                                      // double
}

which results in only 2 FPSCR.PR mode switches.

Moreover, the following example

double test_2 (const float* x, const double* y, unsigned int c)
{
  float var0 = 0;
  double var1 = 0;

  while (c--)
    {
      float xx = x[0] * x[1] + x[2] + 123.0f;
      x += 3;

      double yy = y[0] + y[1];
      y += 2;

      var0 += xx;
      var1 += yy;
    }

  return var0 + var1;
}

is a good candidate for doing loop distribution.  Since var0 and var1 are
independent, the loop can be replaced with a single precision loop and a double
precision loop, eliminating the FPSCR.PR mode switches inside the loop.
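
For illustration, here is a rough sketch of what the distributed form of
test_2 could look like when the split is done by hand.  The name
test_2_distributed is made up for this example and does not come from the
report, and the sketch says nothing about what any GCC pass actually emits.
It assumes that x and y do not alias anything written inside the loop, which
holds here because both are only read.  test_2 is repeated unchanged so the
two versions can be compared directly.

```c
/* Original function from the report: single and double precision
   operations alternate inside one loop.  */
double test_2 (const float* x, const double* y, unsigned int c)
{
  float var0 = 0;
  double var1 = 0;

  while (c--)
    {
      float xx = x[0] * x[1] + x[2] + 123.0f;
      x += 3;

      double yy = y[0] + y[1];
      y += 2;

      var0 += xx;
      var1 += yy;
    }

  return var0 + var1;
}

/* Hypothetical hand-distributed variant (name is an assumption).
   Each accumulator sees the same operations in the same order as in
   test_2, so the result is expected to be the same.  */
double test_2_distributed (const float* x, const double* y, unsigned int c)
{
  float var0 = 0;
  double var1 = 0;
  unsigned int i;

  /* Single precision loop: only float operations.  */
  for (i = 0; i < c; i++)
    {
      float xx = x[0] * x[1] + x[2] + 123.0f;
      x += 3;
      var0 += xx;
    }

  /* Double precision loop: only double operations.  */
  for (i = 0; i < c; i++)
    {
      double yy = y[0] + y[1];
      y += 2;
      var1 += yy;
    }

  /* The float to double conversion and the final add happen once,
     after both loops.  */
  return var0 + var1;
}
```

Written this way, a mode switch would be needed between the two loops and
for the final addition, but not once per iteration.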