From: Fangrui Song <i@maskray.me>
To: LIU Hao <lh_mouse@126.com>
Cc: binutils@sourceware.org, GCC Development <gcc@gcc.gnu.org>
Subject: Re: RFC: Formalization of the Intel assembly syntax (PR53929)
Date: Thu, 18 Jan 2024 01:02:22 -0800
Message-ID: <MN0PR12MB5761F72F6FBE6F8B6C7ACEEFCB712@MN0PR12MB5761.namprd12.prod.outlook.com>
In-Reply-To: <d3cb6174-caa6-47f3-9b5e-d39222701458@126.com>
On Wed, Jan 17, 2024 at 9:34 PM LIU Hao <lh_mouse@126.com> wrote:
>
> Hello,
>
> There hasn't been a solution to https://gcc.gnu.org/PR53929 for almost a dozen years, mostly
> due to compatibility with MASM. I was told that the ambiguity of Intel syntax should be regarded
> as an inherent limitation of the syntax and a reason to recommend against using it.
>
> Notwithstanding, I am proposing a permanent solution to this issue, by banning constructions that
> cause ambiguity. This is likely to effect incompatibility with other assemblers, but it should make
> GAS parse the output of GCC flawlessly.
>
>
> PR53929 contains a known ambiguous construction
>
> lea rax, bx[rip]
>
> where `bx` could denote the BX register and causes confusion. The Intel Software Developer Manual
> also contains an ambiguous construction
>
> MOV EBX, RAM_START
>
> which would look like loading the offset of `RAM_START`. My proposal is that these two constructions
> are ambiguous and should be rejected. The compiler should generate assembly in the unambiguous
> subset, and we can start to implement the assembler to reject the ambiguous ones.
>
> They are formalized as
>
> lea rax, BYTE PTR bx[rip]
> mov EBX, DWORD PTR RAM_START
>
> Roughly speaking, anything after `PTR`/`BCST` (and before `[` if any) is considered a symbol even if
> it matches a keyword; any identifier between `[` and `]` is a register and not a symbol.
>
>
> My complete proposal can be found at
> <https://github.com/lhmouse/mcfgthread/wiki/Formalized-Intel-Syntax-for-x86>. Some ideas actually
> reflect the AT&T syntax. I hope it helps.
Thanks for the proposal. I hope that -masm=intel becomes more useful :)
Do you have a list of unambiguous constructs that gas fails to parse
today, filed as a gas PR?
For example,
% as -msyntax=intel -mnaked-reg <<< 'lea rax, BYTE PTR bxx[rip]' -o a.o \
    && objdump -d -M intel a.o | grep -A1 '>:'
0000000000000000 <.text>:
0: 48 8d 05 00 00 00 00 lea rax,[rip+0x0] # 0x7
% as -msyntax=intel -mnaked-reg <<< 'lea rax, BYTE PTR bx[rip]' -o a.o \
    && objdump -d -M intel a.o | grep -A1 '>:'
{standard input}: Assembler messages:
{standard input}:1: Error: invalid use of register
% as -msyntax=intel -mnaked-reg <<< 'mov EBX, DWORD PTR ebx' -o a.o
{standard input}: Assembler messages:
{standard input}:1: Error: invalid use of register
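
To make the expectation concrete, here is a minimal Python sketch of
the rule as quoted above ("anything after PTR/BCST and before [ is a
symbol; any identifier between [ and ] is a register"). The function
name `classify` and the labels are hypothetical and are not part of
the proposal or of gas; the sketch handles a single operand and
ignores everything the quoted sentence does not mention.

```python
import re

# Numbers, identifiers, and brackets; everything else (+, *, commas) is skipped.
TOKEN = re.compile(r"\d\w*|[A-Za-z_.$][\w.$]*|\[|\]")

def classify(operand):
    """Label identifiers in one Intel-syntax operand per the quoted rule."""
    labels = []
    after_ptr = False     # True once PTR/BCST was seen and no '[' yet
    in_brackets = False   # True between '[' and ']'
    for tok in TOKEN.findall(operand):
        if tok[0].isdigit():
            continue                            # numeric literal, not an identifier
        if tok == "[":
            in_brackets, after_ptr = True, False
        elif tok == "]":
            in_brackets = False
        elif in_brackets:
            labels.append((tok, "register"))    # inside [...]: always a register
        elif after_ptr:
            labels.append((tok, "symbol"))      # after PTR/BCST: symbol, even if it names a register
        elif tok.upper() in ("PTR", "BCST"):
            after_ptr = True
        else:
            labels.append((tok, "other"))       # size keyword etc.; rule is silent here
    return labels
```

Under this reading, `classify("BYTE PTR bx[rip]")` labels `bx` as a
symbol and `rip` as a register, and `classify("DWORD PTR ebx")` labels
`ebx` as a symbol, which are the two cases gas rejects above.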