From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 34D6C3858C41; Fri, 19 May 2023 10:59:03 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 34D6C3858C41
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1684493943;
	bh=kJ+xyw5YaCCEO7/8nA6ysOyvY42cSOyp1rmuBwP9GUQ=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=NPv2StowCzjbVWc5diF2tOXG7WDRbWOw5lCVPFzRHhFKP52uGUvw4KgGBMHA7W9a4
	 vbKtEsx12zfSdE69+G2wUwugHI9iQX54KThBwac3C5kvjtOumzMu8b3G2iMSnmu8Ka
	 S/T66s9KO3UsfY5yz3fdlQvbZt/hEb8UQTIg7H9Y=
From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/101856] match_arith_overflow checks only mulv4_optab/umulv4_optab tables when smul_highpart_optab/umul_highpart_optab can produce decent code too
Date: Fri, 19 May 2023 10:59:02 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: jakub at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101856

--- Comment #3 from CVS Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:62d08a67c83b4a089866c6d19e82d70ee5b8aed1

commit r14-992-g62d08a67c83b4a089866c6d19e82d70ee5b8aed1
Author: Jakub Jelinek
Date:   Fri May 19 12:57:31 2023 +0200

    tree-ssa-math-opts: Pattern recognize hand written __builtin_mul_overflow_p with same unsigned types even when target just has highpart umul [PR101856]

    As can be seen on the following testcase, we pattern recognize it on
    i?86/x86_64 as return __builtin_mul_overflow_p (x, y, 0UL) and avoid
    that way the extra division, but don't do it e.g. on aarch64 or ppc64le,
    even when return __builtin_mul_overflow_p (x, y, 0UL); actually produces
    there better code.

    The reason for testing the presence of the optab handler is to make sure
    the generated code for it is short to ensure we don't actually pessimize
    code instead of optimizing it.

    But, we have one case that the internal-fn.cc .MUL_OVERFLOW expansion
    handles nicely, and that is when arguments/result is the same mode
    TYPE_UNSIGNED type, we only use IMAGPART_EXPR of it (i.e.
    __builtin_mul_overflow_p rather than __builtin_mul_overflow) and
    umul_highpart_optab supports the particular mode, in that case we emit
    comparison of the highpart umul result against zero.

    So, the following patch matches what we do in internal-fn.cc and
    also pattern matches __builtin_mul_overflow_p if
    1) we only need the flag whether it overflowed (i.e. !use_seen)
    2) it is unsigned (i.e. !cast_stmt)
    3) umul_highpart is supported for the mode

    2023-05-19  Jakub Jelinek

            PR tree-optimization/101856
            * tree-ssa-math-opts.cc (match_arith_overflow): Pattern detect
            unsigned __builtin_mul_overflow_p even when umulv4_optab doesn't
            support it but umul_highpart_optab does.

            * gcc.dg/tree-ssa/pr101856.c: New test.
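
The testcase the commit message refers to is not included in this notification. As a rough illustration only (not the contents of gcc.dg/tree-ssa/pr101856.c), the three equivalent forms discussed above could be sketched as follows. The function names are hypothetical, and uint32_t is used instead of the unsigned long implied by 0UL so that the high part can be computed portably with a 64-bit product. The sketch assumes GCC or a compatible compiler; where __builtin_mul_overflow_p is not available it falls back to __builtin_mul_overflow with a discarded result.

```c
#include <assert.h>
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0   /* older compilers: take the fallback path below */
#endif

/* Hand-written check: multiply with wraparound, then divide to detect
   the wrap.  This is the kind of source pattern the commit says is
   recognized as an overflow check.  */
static int overflow_by_division(uint32_t x, uint32_t y)
{
    uint32_t z = x * y;               /* wraps modulo 2^32 on overflow */
    return x != 0 && z / x != y;
}

/* Flag-only builtin form: only the overflow flag is used, never the product.  */
static int overflow_by_builtin(uint32_t x, uint32_t y)
{
#if __has_builtin(__builtin_mul_overflow_p)
    return __builtin_mul_overflow_p(x, y, (uint32_t) 0);
#else
    uint32_t unused;                  /* product is computed but discarded */
    return __builtin_mul_overflow(x, y, &unused);
#endif
}

/* What a highpart-multiply expansion amounts to: the upper half of the
   double-width product is nonzero exactly when the product overflows.  */
static int overflow_by_highpart(uint32_t x, uint32_t y)
{
    uint64_t wide = (uint64_t) x * y;
    return (uint32_t) (wide >> 32) != 0;
}

/* Hypothetical helper: checks that all three forms agree and returns
   the common answer.  */
static int mul_overflows(uint32_t x, uint32_t y)
{
    int d = overflow_by_division(x, y);
    assert(d == overflow_by_builtin(x, y));
    assert(d == overflow_by_highpart(x, y));
    return d;
}
```

For example, mul_overflows (65536, 65536) reports overflow because the product is 2^32, while mul_overflows (65535, 65537) does not because the product is 2^32 - 1. The division form costs a multiply plus a divide, which is the extra division the commit message says the transformation avoids.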