From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 10A993858416; Fri, 26 Apr 2024 12:54:59 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 10A993858416
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1714136100;
	bh=sq3ziZeapobFuRpIa1RGh6+lDUlnYgc3Au9gCSh7NaI=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=MM4zgJh3BwbsBuN44nL7LLwjAHjlfiaM2LMALkOPAX0J6MUM/uWgvPDUA7qRNYT8N
	 TG1bMngwDMZ6yvnqw3EFpepH+GYzjkJ494DJJ2ybIPrx8V3q6et+eO/tr1adIyihVp
	 298xaiI+cb9yoLWotLbhY9Y8pF5E9djokh0Nkqwc=
From: "roger at nextmovesoftware dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/97756] [11/12/13 Regression] Inefficient handling of 128-bit arguments
Date: Fri, 26 Apr 2024 12:54:58 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords: missed-optimization, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Who: roger at nextmovesoftware dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 11.5
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: cf_known_to_work short_desc
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756

Roger Sayle changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |14.0
            Summary|[11/12/13/14/15 Regression] |[11/12/13 Regression]
                   |Inefficient handling of     |Inefficient handling of
                   |128-bit arguments           |128-bit arguments

--- Comment #17 from Roger Sayle ---
I believe this issue is now fixed on mainline (i.e.
for both GCC 14 and GCC 15). Firstly, many thanks to Jakub for correcting the error in my patch. We now generate optimal code sequences for the code in comments #3 and #5, and generate fewer instructions than reported in the original description. The final remaining issue is that with -O3 GCC still uses more instructions than clang and icc (see Thomas' comments #12 and #13). The good news is that this is intentional: compiling with -Os (to optimize for size) generates the same number of instructions as clang and icc [in fact, icc -Os generates larger code!?]. So when optimizing for performance, GCC is taking the opportunity to use more (cheap) instructions to execute faster (or that's the theory).