public inbox for binutils@sourceware.org
* RFC: Formalization of the Intel assembly syntax (PR53929)
From: LIU Hao @ 2024-01-18  5:34 UTC (permalink / raw)
  To: binutils, GCC Development

Hello,

https://gcc.gnu.org/PR53929 has gone without a solution for almost a dozen years, mostly because of 
the need to stay compatible with MASM. I was told that the ambiguity of Intel syntax should be 
regarded as an inherent limitation of the syntax and a reason to recommend against using it.

Nevertheless, I am proposing a permanent solution to this issue: banning the constructions that 
cause ambiguity. This is likely to cause incompatibility with other assemblers, but it should let 
GAS parse the output of GCC flawlessly.


PR53929 contains a known ambiguous construction

    lea	rax, bx[rip]

where `bx` could denote either a symbol or the BX register. The Intel Software Developer's Manual 
also contains an ambiguous construction

    MOV EBX, RAM_START

which looks like it loads the offset of `RAM_START`. I propose that these two constructions be 
treated as ambiguous and rejected. The compiler should generate assembly only in the unambiguous 
subset, and we can start implementing the assembler to reject the ambiguous forms.

In the formalized syntax they are written as

    lea rax, BYTE PTR bx[rip]
    mov EBX, DWORD PTR RAM_START

Roughly speaking, anything after `PTR`/`BCST` (and before `[` if any) is considered a symbol even if 
it matches a keyword; any identifier between `[` and `]` is a register and not a symbol.
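
To make the rule above concrete, here is a rough Python sketch of how a single operand's 
identifiers could be classified. It is not GAS code and not taken from the proposal: the function 
name `classify_operand`, the tiny `REGISTERS` and `SIZE_KEYWORDS` tables, and the "ambiguous" 
label for a bare identifier without `PTR` are all assumptions made for illustration.

```python
import re

# Assumed, deliberately incomplete tables; a real assembler knows many more names.
REGISTERS = {"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rsp", "rbp", "rip",
             "eax", "ebx", "ecx", "edx", "ax", "bx", "cx", "dx"}
SIZE_KEYWORDS = {"byte", "word", "dword", "qword"}

# Numbers, identifiers and brackets; operators and whitespace are skipped.
TOKEN = re.compile(r"\d\w*|[A-Za-z_.$@][\w.$@]*|\[|\]")


def classify_operand(operand):
    """Return (identifier, kind) pairs for one operand.

    kind is "keyword", "symbol", "register" or "ambiguous".
    """
    result = []
    depth = 0          # nesting level of [ ... ]
    after_ptr = False  # True between PTR/BCST and the next '['
    for tok in TOKEN.findall(operand):
        low = tok.lower()
        if tok == "[":
            depth += 1
            after_ptr = False
        elif tok == "]":
            depth -= 1
        elif tok[0].isdigit():
            continue  # numeric literal, not an identifier
        elif depth > 0:
            # Between '[' and ']': always a register, never a symbol.
            # (Checking that the name is a valid register is omitted here.)
            result.append((tok, "register"))
        elif low in ("ptr", "bcst") and not after_ptr:
            result.append((tok, "keyword"))
            after_ptr = True
        elif after_ptr:
            # After PTR/BCST and before '[': a symbol, even if it
            # happens to match a register name or keyword.
            result.append((tok, "symbol"))
        elif low in REGISTERS:
            result.append((tok, "register"))
        elif low in SIZE_KEYWORDS:
            result.append((tok, "keyword"))
        else:
            # Bare identifier with no PTR: the case proposed for rejection.
            result.append((tok, "ambiguous"))
    return result


if __name__ == "__main__":
    for text in ("bx[rip]", "BYTE PTR bx[rip]", "RAM_START", "DWORD PTR RAM_START"):
        print(text, "->", classify_operand(text))
```

With these assumed tables, `bx` in `bx[rip]` comes out as a register, while the same `bx` in 
`BYTE PTR bx[rip]` comes out as a symbol, which is the distinction the rule is meant to enforce.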


My complete proposal can be found at 
<https://github.com/lhmouse/mcfgthread/wiki/Formalized-Intel-Syntax-for-x86>. Some ideas actually 
reflect the AT&T syntax. I hope it helps.


-- 
Best regards,
LIU Hao


Thread overview: 19+ messages
2024-01-18  5:34 RFC: Formalization of the Intel assembly syntax (PR53929) LIU Hao
2024-01-18  9:02 ` Fangrui Song
2024-01-18 12:54 ` Jan Beulich
2024-01-18 16:40   ` LIU Hao
2024-01-19  9:13     ` Jan Beulich
2024-01-20 12:40       ` LIU Hao
2024-01-22  8:39         ` Jan Beulich
2024-01-23  1:27           ` LIU Hao
2024-01-23  8:38             ` Jan Beulich
2024-01-23  9:00               ` LIU Hao
2024-01-23  9:03                 ` Jan Beulich
2024-01-23  9:21                   ` LIU Hao
2024-01-23  9:37                     ` Jan Beulich
2024-01-30  4:22     ` Hans-Peter Nilsson
2024-01-31 10:11       ` LIU Hao
     [not found] ` <DS7PR12MB5765DBF9500DE323DB4A8E29CB712@DS7PR12MB5765.namprd12.prod.outlook.com>
2024-01-19  1:42   ` LIU Hao
2024-01-19  7:41     ` Jan Beulich
2024-01-19  8:19     ` Fangrui Song
     [not found]     ` <DS7PR12MB5765654642BE3AD4C7F54E05CB702@DS7PR12MB5765.namprd12.prod.outlook.com>
2024-01-20 12:32       ` LIU Hao