From: "slyfox at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106617] New: [13 Regression] gcc is very slow at ternary expressions,
Date: Sun, 14 Aug 2022 21:52:08 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

            Bug ID: 106617
           Summary: [13 Regression] gcc is very slow at ternary expressions,
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: slyfox at gcc dot gnu.org
  Target Milestone: ---

Created attachment 53454
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53454&action=edit
fc_exch.c.c.orig.gz

Initially I observed slowness on ia64 trying to compile linux-4.19 using this week's gcc
snapshot. x86_64 also seems to be affected.

Original (ia64-specific) example is attached as fc_exch.c.c.orig.gz. It shows
the origin of this construct.

I reduced the example with 4s timeout and got the following reproducer:

// $ cat fc_exch.c.c
int nr_cpu_ids;
void fc_setup_exch_mgr() {
  (((((((1UL << (((0, 0) ? ((1) ? (((nr_cpu_ids)) ? 0 :
    ((nr_cpu_ids)) & (21) ? 21 : ((nr_cpu_ids)) ? 20 :
    ((nr_cpu_ids)) & (19) ? 19 : ((nr_cpu_ids)) ? 18 :
    ((nr_cpu_ids)) & (17) ? 17 : ((nr_cpu_ids)) ? 16 :
    ((nr_cpu_ids)) & (15) ? 15 : ((nr_cpu_ids)) ? 14 :
    ((nr_cpu_ids)) & (13) ? 13 : ((nr_cpu_ids)) ? 12 :
    ((nr_cpu_ids)) & (11) ? 11 : ((nr_cpu_ids)) ? 10 :
    ((nr_cpu_ids)) & (9) ? 9 : ((nr_cpu_ids)) ? 8 :
    ((nr_cpu_ids)) & (7) ? 7 : ((nr_cpu_ids)) ? 6 :
    ((nr_cpu_ids)) & (5) ? 5 : ((nr_cpu_ids)) ? 4 :
    ((nr_cpu_ids)) & (3) ? 3 :
    ((nr_cpu_ids)-1) & 1) : 1) : 0) + 1))))) & (1ULL << 2) ? 2 : 1)) );
}

$ time gcc-13.0.0 -Wno-address-of-packed-member -c fc_exch.c.c -o bug.o -O1

real    0m4,821s
user    0m4,726s
sys     0m0,077s

Almost 5s!

$ time gcc-12.1.0 -Wno-address-of-packed-member -c fc_exch.c.c -o bug.o -O1

real    0m0,019s
user    0m0,013s
sys     0m0,006s

$ gcc -v
Using built-in specs.
COLLECT_GCC=/<>/gcc-13.0.0/bin/gcc
COLLECT_LTO_WRAPPER=/<>/gcc-13.0.0/libexec/gcc/x86_64-unknown-linux-gnu/13.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.0.0 20220807 (experimental) (GCC)

On this example perf shows the following output:

   8.63%  cc1  cc1        [.] operand_compare::operand_equal_p
   4.39%  cc1  cc1        [.] operand_compare::verify_hash_value
   4.29%  cc1  cc1        [.] wide_int_to_tree_1
   3.67%  cc1  cc1        [.] fold_binary_loc
   3.17%  cc1  cc1        [.] generic_simplify_NE_EXPR
   2.67%  cc1  cc1        [.] tree_strip_nop_conversions
   2.64%  cc1  cc1        [.] generic_simplify_EQ_EXPR
   2.47%  cc1  cc1        [.] tree_operand_check
   2.39%  cc1  cc1        [.] get_inner_reference
   2.37%  cc1  cc1        [.] integer_zerop
   2.25%  cc1  cc1        [.] wide_int_binop
   2.11%  cc1  cc1        [.] tree_nop_convert
   1.84%  cc1  cc1        [.] int_const_binop
   1.81%  cc1  cc1        [.] int_cst_hasher::hash
   1.78%  cc1  cc1        [.] ggc_internal_alloc
   1.70%  cc1  cc1        [.] element_precision
   1.44%  cc1  cc1        [.] wi::fits_to_tree_p > > >
   1.36%  cc1  cc1        [.] contains_struct_check
   1.08%  cc1  cc1        [.] build2
   1.05%  cc1  cc1        [.] hash_table::find_slot_with_hash
   1.04%  cc1  cc1        [.] generic_simplify
   1.03%  cc1  cc1        [.] cache_wide_int_in_type_cache
   0.97%  cc1  cc1        [.] get_int_cst_ext_nunits
   0.91%  cc1  cc1        [.] bitmask_inv_cst_vector_p
   0.87%  cc1  cc1        [.] tree_strip_sign_nop_conversions
   0.86%  cc1  libc.so.6  [.] __memset_avx2_unaligned_erms
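
To see how compile time changes with the length of the ternary chain, the reproducer above could be generated for different chain lengths. This is only a sketch: the helper name emit_reproducer and its `top` parameter are made up for illustration and are not part of the report. It also assumes that chain length is the relevant variable, which the report does not establish. With top = 21 it writes the same token sequence as fc_exch.c.c.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the report): write a translation unit
   shaped like the reduced reproducer to `out`.  `top` is the largest
   constant in the chain (21 in the report) and must be odd and >= 3.
   Returns the number of generated "cond ? value :" arms, or -1 if
   `top` is invalid (nothing is written in that case). */
static int emit_reproducer(FILE *out, int top)
{
    int arms = 0;

    assert(out != NULL);
    if (top < 3 || top % 2 == 0)
        return -1;

    fputs("int nr_cpu_ids;\nvoid fc_setup_exch_mgr() {\n", out);
    fputs("  (((((((1UL << (((0, 0) ? ((1) ? (((nr_cpu_ids)) ? 0 :\n", out);

    /* Odd constants get a masked test, even constants a plain test,
       mirroring the pattern in the reduced example. */
    for (int k = top; k >= 3; k -= 2) {
        fprintf(out, "    ((nr_cpu_ids)) & (%d) ? %d :\n", k, k);
        arms++;
        if (k > 3) {
            fprintf(out, "    ((nr_cpu_ids)) ? %d :\n", k - 1);
            arms++;
        }
    }

    fputs("    ((nr_cpu_ids)-1) & 1) : 1) : 0) + 1))))) "
          "& (1ULL << 2) ? 2 : 1)) );\n}\n", out);
    return arms;
}
```

Calling emit_reproducer(f, 21) on a file opened for writing, and then compiling that file with the same `-c ... -O1` command line under `time` for a few values of `top`, would show whether the time grows with the chain length.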