From: "hpa at zytor dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/107006] New: Missing optimization: common idiom for external data
Date: Thu, 22 Sep 2022 02:27:43 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

            Bug ID: 107006
           Summary: Missing optimization: common idiom for external data
           Product: gcc
           Version: 12.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hpa at zytor dot com
  Target Milestone: ---

Created attachment 53602
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53602&action=edit
C test case source

The only *portable* way in C to deal with external data structures containing
data of specific endianness, possibly unaligned, is to operate on them as byte
(char) arrays.

At least on x86 (which supports arbitrarily aligned loads), gcc *sometimes*
recognizes these byte-wise accesses as single loads, but sometimes not.

The attached test cases contain a plain C implementation and an
implementation wrapped in a C++ class.

Compiling the former with:

  gcc -std=c2x -g -O3 -W -Wall -[cSE] -o bswap.[osi] bswap.c

... recognizes the load idiom for 16-bit numbers but not for 32- or 64-bit
numbers.

Compiling the latter with:

  gcc -std=c++20 -g -O3 -W -Wall -[cSE] -o bswapcc.[osi] bswapcc.cc

... *additionally* recognizes the 32-bit load, *but only in the big-endian
case* (that is, it generates a load and a bswap instruction); in the
little-endian (native) case, where a plain load would suffice, the idiom is
not recognized!

I am familiar with the use of packed arrays and __builtin_bswap*() for these
accesses, but unfortunately these are gcc-specific.
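
For readers without the attachment at hand, the byte-array idiom described
above can be sketched roughly as follows. This is not the attached test case:
the helper names (load_le32, load_be32, load_le64) are hypothetical and were
chosen only for illustration. Each function assembles an integer from
individual bytes, so it is independent of host endianness and alignment.

```c
#include <stdint.h>

/* Hypothetical helper names; NOT taken from attachment 53602. */

/* Read a 32-bit little-endian value from a possibly unaligned buffer. */
static inline uint32_t load_le32(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read a 32-bit big-endian value from a possibly unaligned buffer. */
static inline uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}

/* Read a 64-bit little-endian value, built from two 32-bit halves. */
static inline uint64_t load_le64(const unsigned char *p)
{
    return (uint64_t)load_le32(p) | ((uint64_t)load_le32(p + 4) << 32);
}
```

On x86, the hoped-for code is a single mov for the little-endian helpers and
a mov followed by bswap (or a movbe) for the big-endian one. Whether a given
gcc version actually produces that can be checked by compiling with -O3 -S
and reading the assembly output.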