From: Feng Xue OS <fxue@os.amperecomputing.com>
To: "gcc-patches@gcc.gnu.org"
CC: Richard Biener, Thomas Schwinge, Jeff Law
Subject: [PATCH V5] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
Date: Tue, 04 Jun 2019 15:16:00 -0000

> Why wouldn't it be suitable for -O2? Normally, not suitable for -O2 could
> be because it is expensive (in compile time), because it increases the
> code size a lot, because it doesn't always actually improve the running
> time, etc. I don't see any of that here. There isn't supposed to be a
> semantic difference between -O2 and -O3. Do you consider it "dangerous" in
> a similar sense as -fstrict-aliasing? We enable that by default at -O2.

Yes, I did have such a concern. I have now changed the patch to enable the
option at -O2.

Feng
---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 37aab79..4fdc5c8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-06-04  Feng Xue  <fxue@os.amperecomputing.com>
+
+	PR tree-optimization/89713
+	* doc/invoke.texi (-ffinite-loop): Document new option.
+	* common.opt (-ffinite-loop): New option.
+	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
+	IFN_GOACC_LOOP calls as necessary.
+	* tree-ssa-loop-niter.c (finite_loop_p): Assume loop with an exit is
+	finite.
+	* omp-offload.c (oacc_xform_loop): Skip lowering if return value of
+	IFN_GOACC_LOOP call is not used.
+	* opts.c (default_options_table): Enable -ffinite-loop at -O2+.
+
 2019-06-04  Alan Modra
 
 	PR target/90689
diff --git a/gcc/common.opt b/gcc/common.opt
index 0e72fd0..f570815 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loop
+Common Report Var(flag_finite_loop) Optimization
+Assume that loops with an exit will terminate and not loop indefinitely.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 91c9bb8..8d3259d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loop @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -8316,7 +8317,8 @@ Optimize yet more.  @option{-O3} turns on all optimizations specified
 by @option{-O2} and also turns on the following optimization flags:
 
 @c Please keep the following list alphabetized!
-@gccoptlist{-fgcse-after-reload @gol
+@gccoptlist{-ffinite-loop @gol
+-fgcse-after-reload @gol
 -finline-functions @gol
 -fipa-cp-clone
 -floop-interchange @gol
@@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loop
+@opindex ffinite-loop
+@opindex fno-finite-loop
+Assume that a loop with an exit will eventually take the exit and not loop
+indefinitely.  This allows the compiler to remove loops that otherwise have
+no side-effects, not considering eventual endless looping as such.
+
+This option is enabled by default at @option{-O2}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 97ae47b..369122f 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
   tree chunk_size = NULL_TREE;
   unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
   tree lhs = gimple_call_lhs (call);
-  tree type = TREE_TYPE (lhs);
+  tree type = NULL_TREE;
   tree diff_type = TREE_TYPE (range);
   tree r = NULL_TREE;
   gimple_seq seq = NULL;
@@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
   unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
   unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
 
+  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used.  */
+  if (!lhs)
+    {
+      gsi_replace_with_seq (&gsi, seq, true);
+      return;
+    }
+
+  type = TREE_TYPE (lhs);
+
 #ifdef ACCEL_COMPILER
   chunk_size = gimple_call_arg (call, 4);
   if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index 64f94ac..0db9dda 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -494,6 +494,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_ffinite_loop, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 0000000..e374155
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+#include <string>
+#include <vector>
+#include <list>
+#include <set>
+#include <map>
+
+using namespace std;
+
+int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
+{
+  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
+    it->length();
+
+  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
+    it->length();
+
+  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
+    it->first + it->second.length();
+
+  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
+    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
+      {
+        it0->length();
+        it1->length();
+      }
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+
diff --git a/gcc/testsuite/gcc.dg/const-1.c b/gcc/testsuite/gcc.dg/const-1.c
index a5b2b16..13e3451 100644
--- a/gcc/testsuite/gcc.dg/const-1.c
+++ b/gcc/testsuite/gcc.dg/const-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target nonpic } } */
-/* { dg-options "-O2 -Wsuggest-attribute=const" } */
+/* { dg-options "-O2 -Wsuggest-attribute=const -fno-finite-loop" } */
 
 extern int extern_const(int a) __attribute__ ((const));
 
diff --git a/gcc/testsuite/gcc.dg/graphite/graphite.exp b/gcc/testsuite/gcc.dg/graphite/graphite.exp
index ea61446..b294b9c 100644
--- a/gcc/testsuite/gcc.dg/graphite/graphite.exp
+++ b/gcc/testsuite/gcc.dg/graphite/graphite.exp
@@ -56,7 +56,7 @@ set vect_files [lsort [glob -nocomplain $srcdir/$subdir/vect-*.c ] ]
 
 # Tests to be compiled.
 set dg-do-what-default compile
-dg-runtest $scop_files "" "-O2 -fgraphite -fdump-tree-graphite-all"
+dg-runtest $scop_files "" "-O2 -fgraphite -fdump-tree-graphite-all -fno-finite-loop"
 dg-runtest $id_files "" "-O2 -fgraphite-identity -ffast-math -fdump-tree-graphite-details"
 
 # Tests to be run.
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-1.c b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
index f6fc41d..735eeef 100644
--- a/gcc/testsuite/gcc.dg/loop-unswitch-1.c
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
@@ -1,6 +1,6 @@
 /* For PR rtl-optimization/27735 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fno-finite-loop" } */
 
 void set_color(void);
 void xml_colorize_line(unsigned int *p, int state)
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index 7e5ba08..3710eef 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loop" } */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/pure-2.c b/gcc/testsuite/gcc.dg/pure-2.c
index fe6e2bc..6ac372b 100644
--- a/gcc/testsuite/gcc.dg/pure-2.c
+++ b/gcc/testsuite/gcc.dg/pure-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wsuggest-attribute=pure" } */
+/* { dg-options "-O2 -Wsuggest-attribute=pure -fno-finite-loop" } */
 /* { dg-add-options bind_pic_locally } */
 
 extern int extern_const(int a) __attribute__ ((pure));
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
index d289e5d..e4d331e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2" } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -fno-finite-loop" } */
 
 struct rtx_def;
 typedef struct rtx_def *rtx;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
new file mode 100644
index 0000000..ffca49c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */
+
+typedef struct list {
+    char pad[15];
+    struct list *next;
+} list;
+
+int data;
+
+list *head, *tail;
+
+int __attribute__((pure)) pfn (int);
+
+int foo (unsigned u, int s)
+{
+  unsigned i;
+  list *p;
+  int j;
+
+  for (i = 0; i < u; i += 2)
+    ;
+
+  for (p = head; p; p = p->next)
+    ;
+
+  for (j = data; j & s; j = pfn (j + 3))
+    ;
+
+  for (p = head; p != tail; p = p->next)
+    for (j = data + 1; j > s; j = pfn (j + 2))
+      ;
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
index a29c9fb..c605005 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-optimized -fno-finite-loop" } */
 /* { dg-require-effective-target int32plus } */
 
 int bar (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
index e9b4f26..7c47906 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w -fno-finite-loop" } */
 
 struct __sFILE
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
index d829b04..b7a3d77 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details" } */
+/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loop" } */
 /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..179605e 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -245,6 +245,17 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
             mark_stmt_necessary (stmt, true);
             return;
           }
+        /* IFN_GOACC_LOOP calls are necessary in that they are used to
+           represent parameters (i.e. step, bound) of a lowered OpenACC
+           partitioned loop.  But such a partitioned loop might not
+           survive aggressive loop removal, because it has a loop exit and
+           is assumed to be finite.  Therefore, we need to explicitly mark
+           these calls.  (An example is libgomp.oacc-c-c++-common/pr84955.c.)  */
+        if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
+          {
+            mark_stmt_necessary (stmt, true);
+            return;
+          }
         if (!gimple_call_lhs (stmt))
           return;
         break;
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 470b6a2..c25cb1d 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2798,6 +2798,27 @@ finite_loop_p (struct loop *loop)
                  loop->num);
       return true;
     }
+
+  if (flag_finite_loop)
+    {
+      unsigned i;
+      vec<edge> exits = get_loop_exit_edges (loop);
+      edge ex;
+
+      /* If the loop has any non-EH exit, we can assume it will terminate.  */
+      FOR_EACH_VEC_ELT (exits, i, ex)
+        if (!(ex->flags & EDGE_EH))
+          {
+            exits.release ();
+            if (dump_file)
+              fprintf (dump_file, "Assume loop %i to be finite: it has an exit "
+                       "and -ffinite-loop is on.\n", loop->num);
+            return true;
+          }
+
+      exits.release ();
+    }
+
   return false;
 }
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
new file mode 100644
index 0000000..845268b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+int
+f1 (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+        ;
+
+  return i + j;
+}
+
+int
+f2 (void)
+{
+  int i, j, k;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (k = 1; k < 10; k++)
+        ;
+
+  return i + j;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
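
For readers skimming the thread, here is a small standalone sketch of the kind of loop the option is about. It is modelled loosely on the list walk in dce-2.c and is not part of the patch. The names node, walk_only, count_nodes and demo are made up for illustration. Whether a given compiler actually deletes the first loop depends on its version and flags, so treat the comments as a description of what the finiteness assumption permits, not as a guaranteed result.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative list node (hypothetical, loosely based on dce-2.c).  */
struct node
{
  struct node *next;
};

/* Walk the list and do nothing else.  The loop has an exit (p == NULL)
   and no side effects, so a compiler that assumes the loop terminates
   is free to delete it.  On a circular list the original loop would
   spin forever, while the loop-free version would return at once;
   that is the behaviour change the option accepts.  */
static int
walk_only (const struct node *head)
{
  const struct node *p;

  for (p = head; p; p = p->next)
    ;

  return 0;
}

/* Same walk, but the return value is computed by the loop, so the loop
   is not dead code and cannot be removed on that basis.  */
static int
count_nodes (const struct node *head)
{
  int n = 0;

  for (const struct node *p = head; p; p = p->next)
    n++;

  return n;
}

/* Build a three-node, properly terminated list and run both walks.  */
static int
demo (void)
{
  struct node c = { NULL };
  struct node b = { &c };
  struct node a = { &b };

  assert (walk_only (&a) == 0);
  return count_nodes (&a);
}
```

On a terminated list both functions behave the same with or without the assumption; only non-terminating inputs can tell the difference, which is why several existing tests in the patch add -fno-finite-loop to keep their loops.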