From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Keith Packard"
To: Richard Earnshaw, newlib@sourceware.org
Subject: Re: [PATCH] arm/strlen-thumb2-Os.S: Correct assembly syntax for ldrb instruction
In-Reply-To: <79529c59-9b22-9b7b-1a18-2c3f52615bdd@foss.arm.com>
References: <20200512175830.1186422-1-keithp@keithp.com>
 <79529c59-9b22-9b7b-1a18-2c3f52615bdd@foss.arm.com>
Date: Fri, 15 May 2020 08:19:36 -0700
Message-ID: <87mu69usxj.fsf@keithp.com>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
 micalg=pgp-sha256; protocol="application/pgp-signature"
List-Id: Newlib mailing list

--=-=-=
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

Richard Earnshaw writes:

> IIRC the .w was deliberate to keep the alignment right for some
> subsequent instructions.

I'm not sure what you mean by 'alignment' here -- would the assembler
actually insert no-ops to ensure that the instruction were aligned
somehow? I can't find any mention of this additional meaning of the
'.w' qualifier and find that unlikely, as it would increase code size.

It sounds like gas and llvm have a different interpretation of the '.w'
qualifier for this instruction. The ARMv7-M Architecture Reference
Manual (ARM DDI 0403E.d (ID070218)) says the '.W' means:

    .W   Meaning wide, specifies that the assembler must select a
         32-bit encoding for the instruction. If this is not possible,
         an assembler error is produced.

LDRB offers three encodings, called T1, T2 and T3 in the specs. T1 and
T2 have equivalent flexibility: loading indirect from a register with
an optional offset (with T1 being 16-bit and T2 being 32-bit).
Selecting the 32-bit T2 form extends the offset from 5 to 12 bits, so I
imagine that an environment that couldn't replace instructions at link
time might use the T2 form to make room for larger relocation values in
case the offset weren't known at assembly time?

However, only the 32-bit T3 encoding offers the post-indexed mode
required by the code here, and for that, there is no need to clarify
which to select using the .N or .W qualifiers.

And, when reading the description of the three encodings, only T2
includes the '.W' qualifier in its syntax, although T1 doesn't include
'.N', which kinda indicates that the qualifiers should always be
accepted, even if they aren't necessary.

Hrm. For this instruction, perhaps the intent in the specification is
to use the .W to force a T2 encoding *over a T3 encoding*.
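A rough sketch of the encoding rules discussed above, written as a small
Python model. This is only an illustration under stated assumptions: the
function names, the mode strings and the "smallest encoding first" default
are invented for the example and are not taken from gas or llvm. The
operand ranges follow the LDRB (immediate) description in the ARMv7-M
manual; special cases (PC or SP operands, a '#-0' offset) are ignored.
One detail comes from the manual's T3 syntax rather than from the text
above: the plain offset form of T3 appears to take only a negative
offset, since the positive bit pattern is used by LDRBT.

```python
from typing import List, Optional


def ldrb_imm_encodings(rt: int, rn: int, offset: int, mode: str) -> List[str]:
    """List the Thumb LDRB (immediate) encodings that can express the operands.

    mode (hypothetical names): "offset" for [Rn, #imm],
    "pre" for [Rn, #imm]!, "post" for [Rn], #imm.
    """
    if mode not in ("offset", "pre", "post"):
        raise ValueError("unknown addressing mode: " + mode)
    found = []
    if mode == "offset":
        # T1: 16-bit, low registers only, 5-bit unsigned offset.
        if rt < 8 and rn < 8 and 0 <= offset <= 31:
            found.append("T1")
        # T2: 32-bit, 12-bit unsigned offset.
        if 0 <= offset <= 4095:
            found.append("T2")
        # T3 without writeback: modelled as negative 8-bit offsets only.
        if -255 <= offset <= -1:
            found.append("T3")
    else:
        # Pre- and post-indexed forms exist only in T3 (8-bit offset, either sign).
        if -255 <= offset <= 255:
            found.append("T3")
    return found


def select_encoding(candidates: List[str], qualifier: Optional[str] = None) -> str:
    """Apply an optional '.n' or '.w' qualifier, then take the first candidate."""
    if qualifier == ".n":
        usable = [e for e in candidates if e == "T1"]      # 16-bit only
    elif qualifier == ".w":
        usable = [e for e in candidates if e != "T1"]      # 32-bit only
    else:
        usable = list(candidates)
    if not usable:
        raise ValueError("no encoding satisfies the qualifier")
    return usable[0]


# Post-indexed load (registers are illustrative): only T3 fits, so '.w' changes nothing.
post = ldrb_imm_encodings(2, 3, 1, "post")
print(post, select_encoding(post), select_encoding(post, ".w"))

# Plain offset load: '.w' is what moves the choice from T1 to T2.
plain = ldrb_imm_encodings(2, 3, 1, "offset")
print(plain, select_encoding(plain), select_encoding(plain, ".w"))
```

In this model the qualifier only matters when more than one encoding is
available, which is never the case for the post-indexed form.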
The T3 encoding also supports register-plus-offset mode, but with only
an 8-bit offset, so using .W might indicate that even if the offset
*could* fit in the 8-bit T3 form, the assembler should instead select
the T2 encoding. I dunno.

> LLVM's assembler needs fixing if it doesn't accept '.w'.

Yes, it seems like that would be a good idea; having the '.w' in this
case is harmless as only the T3 encoding can possibly work. There is
already a discussion about this in the llvm world; I don't know if or
when that will result in a fix being applied.

I dug through the gas source and couldn't find any place where the '.w'
qualifier would affect the output in this case though, so removing it
should be harmless for gas.

I'm just responding to a bug report from a user trying to use clang to
build the library, and for that, fixing the source code to work with
both gas and clang in ways that don't appear to affect the output of
gas at all seemed like a reasonable option to me.

-- 
-keith

--=-=-=--