Re: [PATCH] arm/strlen-thumb2-Os.S: Correct assembly syntax for ldrb instruction

public inbox for newlib@sourceware.org
 help / color / mirror / Atom feed

From: "Keith Packard" <keithp@keithp.com>
To: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>, newlib@sourceware.org
Subject: Re: [PATCH] arm/strlen-thumb2-Os.S: Correct assembly syntax for ldrb instruction
Date: Fri, 15 May 2020 08:19:36 -0700	[thread overview]
Message-ID: <87mu69usxj.fsf@keithp.com> (raw)
In-Reply-To: <79529c59-9b22-9b7b-1a18-2c3f52615bdd@foss.arm.com>

[-- Attachment #1: Type: text/plain, Size: 2878 bytes --]

Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:

> IIRC the .w was deliberate to keep the alignment right for some
> subsequent instructions.

I'm not sure what you mean by 'alignment' here -- would the assembler
actually insert no-ops to ensure that the instruction were aligned
somehow? I can't find any mention of this additional meaning of the '.w'
qualifier and find that unlikely as it would increase code size?

It sounds like gas and llvm have a different interpretation of the '.w'
qualifier for this instruction. The ARMv7-M Architecture Reference
Manual (ARM DDI 0403E.d (ID070218), says the '.W' means:

.W      Meaning wide, specifies that the assembler must select a 32-bit
        encoding for the instruction. If this is not possible, an
        assembler error is produced.

LDRB offers three encodings, called T1, T2 and T3 in the specs. T1 and
T2 have equivalent flexibility: loading indirect from a register with an
optional offset (with T1 being 16 bit and T2 being 32 bit). Selecting
the 32-bit T2 form extends the offset from 5 to 12 bits, so I imagine
that an environment that couldn't replace instructions at link time
might use the T2 form to make room for larger relocation values in case
the offset weren't known at assembly time?

However, only the 32-bit T3 encoding offers the post-indexed mode
required by the code here, and for that, there is no need to clarify
which to select using the .N or .W qualifiers. And, when reading the
description of the three encodings, only T2 includes the '.W' qualifier
in its syntax, although T1 doesn't include '.N', which kinda indicates
that the qualifiers should always be accepted, even if they aren't
necessary.

Hrm. For this instruction, perhaps the intent in the specification is to
use the .W to force a T2 encoding *over a T3 encoding*. The T3 encoding
also supports register-indexed mode, but with only an 8-bit offset, so
using .W might indicate that even if the offset *could* fit in the 8-bit
T3 form, that the assembler should instead select the T2 encoding. I
dunno.

> LLVM's assembler needs fixing if it doesn't accept '.w'.

Yes, it seems like that would be a good idea; having the '.w' in this
case is harmless as only the T3 encoding can possibly work. There is
already a discussion about this in the llvm world; I don't know if or
when that will result in a fix being applied.

I dug through the gas source and couldn't find any place where the '.w'
qualifier would affect the output in this case though, so removing it
should be harmless for gas.

I'm just responding to a bug report from a user trying to use
clang to build the library, and for that, fixing the source code to work
with both gas and clang in ways that don't appear to affect the output
of gas at all seemed like a reasonable option to me.

-- 
-keith

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

next prev parent reply	other threads:[~2020-05-15 15:19 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-12 17:58 Keith Packard
2020-05-15 13:25 ` Richard Earnshaw
2020-05-15 14:38   ` Emmanuel Blot
2020-05-15 15:19   ` Keith Packard [this message]
2020-05-15 15:44     ` Richard Earnshaw

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87mu69usxj.fsf@keithp.com \
    --to=keithp@keithp.com \
    --cc=Richard.Earnshaw@foss.arm.com \
    --cc=newlib@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).