[RFC] x86: proposal for a new .insn directive

* [RFC] x86: proposal for a new .insn directive
From: Jan Beulich @ 2023-01-13 11:58 UTC
  To: Binutils; +Cc: H.J. Lu

All,

certain other architectures (Arm, RISC-V) have such a directive, and x86
would imo benefit from one even more: It is notoriously difficult to encode
new insns with operands which a given version of gas doesn't support yet.
This is in particular related to the building of the ModR/M and SIB
bytes as well as the VEX/XOP/EVEX prefixes.

I would appreciate feedback on the proposal (in the form of an assembly
source file, providing examples at the same time). Besides pointing
out issues / oversights, thoughts on the various TBDs would be helpful.

Thanks, Jan

	.text
insn:

#	.insn [<prefix>] [<encoding>] <major-opcode>[+r|/<extension>] [,<operand>[,...]]

# Legacy encoding prefixes altering encoding space (0x0f, 0x0f38, 0x0f3a)
# have to be specified as high byte(s) of <major-opcode>. This also extends
# to certain FPU opcodes or sub-spaces like that of major opcode 0x0f01.

# Legacy encoding prefixes altering meaning (0x66, 0xF2, 0xF3) may be
# specified as high byte of <major-opcode> (perhaps already including an
# encoding space prefix). Other prefixes should be spelled out as usual
# ahead of <major-opcode> or, for segment overrides, with the memory
# operand.

# Operand order may not match that of the instruction actually being
# expressed: While for a memory operand (of which there can be only one) it
# is clear how to encode it in the resulting ModR/M byte, register operands
# are encoded strictly in the order
# - ModR/M.rm, ModR/M.reg for 2-operand insns,
# - ModR/M.rm, {E,}VEX.vvvv, ModR/M.reg for 3-operand insns, and
# - Imm{4,5}, ModR/M.rm, {E,}VEX.vvvv, ModR/M.reg for 4-operand insns,
# obviously with the ModR/M.rm slot skipped when there is a memory operand,
# and obviously with the ModR/M.reg slot skipped when there is an extension
# opcode. (For Intel syntax of course all in the opposite order.)

# Immediate operands (including immediate-like displacements, i.e. when not
# part of ModR/M addressing) should be specified by separate .byte / .word /
# .long / .quad (or similar) directives.
# TBD: How to deal with this for RIP-relative addressing?
# TBD: How to deal with this for 4-operand insns?

# When register operand size varies for an actual insn (like e.g. for MOVZX or
# VPMOVZX*), registers nevertheless need spelling out in a uniform manner, such
# that any of them could be used to derive operand size attributes (e.g.
# operand size prefix, REX.W, VEX.W, or VEX.L) as well as the EVEX Disp8
# scaling factor.
# TBD: Could also go from largest operand size, albeit that may end up confusing
#      in AT&T mode, where memory operands don't have size, yet the memory
#      operand may have larger size than the register one(s) (and would hence be
#      the one which the <len> attribute - see below - needs deriving from).

# For VEX / XOP / EVEX <encoding> is arranged like this:
# {VEX,XOP,EVEX}[.<len>][.<prefix>][.<space>][.<w>]
# where
# - <len> can be LIG, 128, 256, or (EVEX only) 512 as well as L0/L1 for
#   VEX / XOP and L0-L3 for EVEX,
# - <prefix> can be NP, 66, F3, or F2,
# - <space> can be
#   - 0f, 0f38, 0f3a, or M0...M31 for VEX,
#   - 08...3f (hex) for XOP,
#   - 0f, 0f38, 0f3a, or M0...M15 for EVEX,
# - <w> can be WIG, W0, or W1.
# Omitted <len> means "infer from operand size" if there is at least one
# sized operand, or LIG otherwise.
# Omitted <prefix> means NP.
# Omitted <space> implies encoding is taken from <major-opcode>.
# Omitted <w> means "infer from GPR operand size" if there is at least
# one GPR operand, or WIG otherwise.

# TBD: Is operand order being dependent on AT&T vs Intel syntax okay?

	.insn 0x90					# nop
	.insn 0xf390					# pause
	.insn rep 0x90					# pause
	.insn 0xd9c9					# fxch
	.insn 0xf30f01d9				# vmgexit

	.insn 0x89, %ecx, %eax				# mov %eax, %ecx
	.insn 0x89, %ax, %cx				# mov %cx, %ax

	.insn 0x8b, (%eax), %ecx			# mov (%eax), %ecx

	.insn 0x0fc8+r, %edx				# bswap %edx

	.insn lock 0x80/0, %fs:(%eax); .byte 1		# lock addb $1, %fs:(%eax)

1:
	.insn 0xe2; .byte 1b-.-1			# loop 1b
	.insn 0xc7f8; .long 1b-.-4			# xbegin 1b

	.insn 0x0fb6, %ax, %cx				# movzx %al, %cx
	.insn 0x0fb7, %eax, %ecx			# movzx %ax, %ecx

	.insn VEX.66.0F 0x58, %xmm0, %xmm1, %xmm2	# vaddpd %xmm0, %xmm1, %xmm2
	.insn VEX.66 0x0f58, %ymm0, %ymm1, %ymm2	# vaddpd %ymm0, %ymm1, %ymm2
	.insn VEX.LIG.F3.0F 0x58, %xmm0, %xmm1, %xmm2	# vaddss %xmm0, %xmm1, %xmm2

	.insn VEX.66.0F3A.W0 0x68, %xmm0, %xmm1, (%edx), %xmm3		# vfmaddps %xmm0, %xmm1, (%edx), %xmm3
	.insn VEX.66.0F3A.W1 0x68, %xmm0, %xmm1, (%edx), %xmm3		# vfmaddps %xmm0, %xmm1, %xmm3, (%edx)
	.insn VEX.66.0F3A.W1 0x68, %xmm0, %xmm1, %xmm2, (%ebx)		# vfmaddps %xmm0, %xmm1, %xmm2, (%ebx)

	.insn VEX.66.0F3A.W0 0x48, $0, %xmm0, %xmm1, (%edx), %xmm3	# vpermil2ps $0, %xmm0, %xmm1, (%edx), %xmm3
	.insn VEX.66.0F3A.W1 0x48, $1, %xmm0, %xmm1, (%edx), %xmm3	# vpermil2ps $1, %xmm0, %xmm1, %xmm3, (%edx)
	.insn VEX.66.0F3A.W1 0x48, $2, %xmm0, %xmm1, %xmm2, (%ebx)	# vpermil2ps $2, %xmm0, %xmm1, %xmm2, (%ebx)

	.insn VEX.L0.0F.W0 0x92, %eax, %k0		# kmovw %eax, %k0

	.insn VEX.256.0F.WIG 0x77			# vzeroall

	.insn EVEX.NP.0F.W0 0x58, {rn-sae}, %zmm0, %zmm1, %zmm2		# vaddps {rn-sae}, %zmm0, %zmm1, %zmm2
	.insn EVEX.66.0F.W1 0x58, 8(%eax){1to8}, %zmm1, %zmm2{%k2}{z}	# vaddpd 8(%eax){1to8}, %zmm1, %zmm2{%k2}{z}

# TBD: How to specify the Disp8 scaling factor here? (In Intel syntax we can simply
#      use memory operand size.)
	.insn EVEX.66.0F38.W0 0x88, 4(%eax), %ymm1	# vexpandps 4(%eax), %ymm1
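
To make the operand order rule concrete, here is a minimal Python sketch of
where the register operands of two of the examples above would end up. It is
not part of the proposal and not gas code: the helper names (modrm, legacy_rr,
vex2_rrr) are made up for illustration, and it assumes 32-bit mode, AT&T
operand order, registers 0-7 only, and the 2-byte VEX prefix.

```python
# Illustrative only: register numbers as used in ModR/M and VEX.vvvv.
GPR32 = {"%eax": 0, "%ecx": 1, "%edx": 2, "%ebx": 3}
XMM = {f"%xmm{i}": i for i in range(8)}

PP_66 = 1  # VEX.pp values: NP=0, 66=1, F3=2, F2=3


def modrm(mod, reg, rm):
    """Pack the three ModR/M fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def legacy_rr(opcode_bytes, rm_reg, reg_reg):
    """2-operand register form: first operand -> ModR/M.rm, second -> ModR/M.reg."""
    return bytes(opcode_bytes) + bytes([modrm(3, GPR32[reg_reg], GPR32[rm_reg])])


def vex2_rrr(pp, opcode, rm_reg, vvvv_reg, reg_reg, l=0):
    """3-operand register form in space 0F, using the 2-byte VEX prefix.

    Operands arrive in the order ModR/M.rm, VEX.vvvv, ModR/M.reg.
    R and vvvv are stored inverted; R is always 0 here (registers 0-7).
    """
    byte1 = (1 << 7) | ((~XMM[vvvv_reg] & 0xF) << 3) | (l << 2) | pp
    return bytes([0xC5, byte1, opcode, modrm(3, XMM[reg_reg], XMM[rm_reg])])


# .insn 0x0fb7, %eax, %ecx                     (movzx %ax, %ecx)
movzx = legacy_rr([0x0F, 0xB7], "%eax", "%ecx")
# .insn VEX.66.0F 0x58, %xmm0, %xmm1, %xmm2    (vaddpd %xmm0, %xmm1, %xmm2)
vaddpd = vex2_rrr(PP_66, 0x58, "%xmm0", "%xmm1", "%xmm2")
print(movzx.hex(), vaddpd.hex())
```

Memory operands, REX, and the 3-byte VEX / EVEX forms are deliberately left
out; the point is only which operand feeds which field.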


* Re: [RFC] x86: proposal for a new .insn directive
From: H.J. Lu @ 2023-01-17 15:56 UTC
  To: Jan Beulich; +Cc: Binutils

On Fri, Jan 13, 2023 at 3:58 AM Jan Beulich <jbeulich@suse.com> wrote:
> [...]

I think it is a nice feature.  But it will be very difficult to support
all complex x86 encoding schemes which change over time.  We can start
with the regular encoding schemes first.

-- 
H.J.


* Re: [RFC] x86: proposal for a new .insn directive
From: Jan Beulich @ 2023-01-17 16:16 UTC
  To: H.J. Lu; +Cc: Binutils

On 17.01.2023 16:56, H.J. Lu wrote:
> On Fri, Jan 13, 2023 at 3:58 AM Jan Beulich <jbeulich@suse.com> wrote:
>> [...]
> 
> I think it is a nice feature.  But it will be very difficult to support
> all complex x86 encoding schemes which change over time.

Of course, and all the more so if yet newer schemes appear. We can only
encode what we're aware of; but even that may help with unknown (or
further extended) schemes, compared to today's need to resort to .byte.

>  We can start with the regular encoding schemes first.

Well, at the very least immediates need dealing with in some sensible way.
So at least those two TBDs will need addressing up front (or maybe while
I'm starting with some initial work here, which I may have time for in
about two weeks).

Jan


* RE: [RFC] x86: proposal for a new .insn directive
From: Jiang, Haochen @ 2023-01-20  1:25 UTC
  To: Beulich, Jan, H.J. Lu; +Cc: Binutils

> >>
> >> # TBD: How to specify the Disp8 scaling factor here? (In Intel syntax we can simply
> >> #      use memory operand size.)
> >>         .insn EVEX.66.0F38.W0 0x88, 4(%eax), %ymm1      # vexpandps 4(%eax), %ymm1

One way, I think, is to add a field to the encoding:

{VEX,XOP,EVEX}[.<len>][.<prefix>][.<space>][.<w>][.<memory>]

<memory> could be x or y.
If <memory> is omitted, the size is either implied by the register size or
taken from the memory operand in Intel syntax.

But the potential problem is that if we have to add a field every time we meet
something special, the directive will turn out longer and longer and more
and more complicated. I don't know whether everyone would like this.

BRs,
Haochen



* Re: [RFC] x86: proposal for a new .insn directive
From: Jan Beulich @ 2023-01-20  9:07 UTC
  To: Jiang, Haochen; +Cc: Binutils, H.J. Lu

On 20.01.2023 02:25, Jiang, Haochen wrote:
>>>>
>>>> # TBD: How to specify the Disp8 scaling factor here? (In Intel syntax we can simply
>>>> #      use memory operand size.)
>>>>         .insn EVEX.66.0F38.W0 0x88, 4(%eax), %ymm1      # vexpandps 4(%eax), %ymm1
> 
> One way, I think, is to add a field to the encoding:
> 
> {VEX,XOP,EVEX}[.<len>][.<prefix>][.<space>][.<w>][.<memory>]
> 
> <memory> could be x or y.
> If <memory> is omitted, the size is either implied by the register size or
> taken from the memory operand in Intel syntax.

I don't see how x or y would apply here. The Disp8 scaling factor for this
and similar insns (e.g. also scatter/gather ones) is the element size, and
hence can't be derived from operand size (which I take x and y to be meant
to refer to). (Yes, a similar issue exists with insns where we have x/y/z
pseudo-suffixes in AT&T mode, but I specifically chose the example above
to point out that we need to go beyond x/y/z.)

While to address the issue with immediates I'm right now considering a
prefix (C cast like) notation, e.g. $(s16)0x12, for EVEX memory operand
displacement handling it likely needs to be something different, e.g.
(%rax):4, to avoid possible parsing ambiguities. The question then would
be what to do when there is a non-zero displacement but no such specifier:
We could then derive it from other (register) operands, but we could
also default to avoiding Disp8 in such cases. (This similarly affects
Intel syntax, wrt presence/absence of an operand size specifier on the
memory operand.)

Of course such a suffix notation could then also be used for immediates,
e.g. $0x12:u16 or $symbol:s32, which would overall end up looking a
little more uniform.
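
For reference, the reason the scaling factor has to be known at all: with
EVEX, an 8-bit displacement is stored divided by a factor N (Disp8*N
compression), so the same source displacement may or may not fit depending
on N. A small Python sketch of that check follows; choose_disp is a made-up
helper, not gas code, and N is simply passed in by the caller.

```python
def choose_disp(disp, n):
    """Pick the displacement form for an EVEX memory operand.

    disp is the byte displacement as written in the source; n is the
    Disp8 scaling factor (element size, vector size, ...). Returns
    ("disp8", stored_value) when Disp8*N applies, else ("disp32", disp).
    Zero displacements are not special-cased in this sketch.
    """
    if disp % n == 0 and -128 <= disp // n <= 127:
        return ("disp8", disp // n)
    return ("disp32", disp)


# vexpandps 4(%eax), %ymm1: the factor is the element size (4), not the
# 32-byte register size, so Disp8 is only usable if N == 4 is known.
print(choose_disp(4, 4))   # N taken from the element size
print(choose_disp(4, 32))  # N (wrongly) derived from %ymm1
```

Defaulting to "avoid Disp8" when no specifier is given corresponds to always
taking the second branch, which is never wrong, merely longer.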

> But the potential problem is that if we have to add a field every time we meet
> something special, the directive will turn out longer and longer and more
> and more complicated. I don't know whether everyone would like this.

I view this not so much as a problem because of the growth, but because
of my present goal being for this to largely match what the SDM uses
(with limited extensions, part of which are actually up for discussion).

Jan


* Re: [RFC] x86: proposal for a new .insn directive
From: Jan Beulich @ 2023-02-03 11:39 UTC
  To: Binutils; +Cc: H.J. Lu, Jiang, Haochen

On 13.01.2023 12:58, Jan Beulich via Binutils wrote:
> certain other architectures (Arm, RISC-V) have such a directive, and x86
> would imo benefit from one even more: It is notoriously difficult to encode
> new insns with operands which a given version of gas doesn't support yet.
> This is in particular related to the building of the ModR/M and SIB
> bytes as well as the VEX/XOP/EVEX prefixes.
> 
> I would appreciate feedback on the proposal (in the form of an assembly
> source file, providing examples at the same time). Besides pointing
> out issues / oversights, thoughts on the various TBDs would be helpful.

Some updates below, resulting from first steps taken. (There are other
more mechanical ones, which will be covered by the doc addition yet to
be written.)

> #	.insn [<prefix>] [<encoding>] <major-opcode>[+r|/<extension>] [,<operand>[,...]]
> 
> # Legacy encoding prefixes altering encoding space (0x0f, 0x0f38, 0x0f3a)
> # have to be specified as high byte(s) of <major-opcode>. This also extends
> # to certain FPU opcodes or sub-spaces like that of major opcode 0x0f01.
> 
> # Legacy encoding prefixes altering meaning (0x66, 0xF2, 0xF3) may be
> # specified as high byte of <major-opcode> (perhaps already including an
> # encoding space prefix). Other prefixes should be spelled out as usual
> # ahead of <major-opcode> or, for segment overrides, with the memory
> # operand.
> 
> # Operand order may not match that of the instruction actually being
> # expressed: While for a memory operand (of which there can be only one) it
> # is clear how to encode it in the resulting ModR/M byte, register operands
> # are encoded strictly in the order

# - {E,}VEX.vvvv for 1-register-operand VEX/XOP/EVEX insns,

> # - ModR/M.rm, ModR/M.reg for 2-operand insns,
> # - ModR/M.rm, {E,}VEX.vvvv, ModR/M.reg for 3-operand insns, and
> # - Imm{4,5}, ModR/M.rm, {E,}VEX.vvvv, ModR/M.reg for 4-operand insns,
> # obviously with the ModR/M.rm slot skipped when there is a memory operand,
> # and obviously with the ModR/M.reg slot skipped when there is an extension
> # opcode. (For Intel syntax of course all in the opposite order.)
> 
> # Immediate operands (including immediate-like displacements, i.e. when not
> # part of ModR/M addressing) should be specified by separate .byte / .word /
> # .long / .quad (or similar) directives.
> # TBD: How to deal with this for RIP-relative addressing?
> # TBD: How to deal with this for 4-operand insns?

The two earlier proposals for addressing these two issues were

# Proposal 1: $({u,s}<bits>)<number>
# Proposal 2: $<number>:{u,s}<bits>

Neither will easily fit within the way operands are currently parsed.
To avoid further fragility, I'm therefore considering extending what
we currently use for vector operations: Prefix or suffix the operand
with the size specifier enclosed in curly braces (with [] here instead
denoting alternatives):

# Proposal 3: ${[u,s]<bits>}<number>
# Proposal 4: $<number>{[u,s]<bits>}

The former would be easiest to deal with from what I can tell right
now.
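
As a rough illustration of what either brace form would have to convey, the
Python sketch below parses a Proposal 3 style operand and emits the
immediate's bytes. The regular expression and the encode_imm helper are
illustrative assumptions, not gas's operand parser: only plain decimal and
0x-prefixed numbers are handled (no symbols or expressions), and treating
out-of-range values as errors is likewise an assumption.

```python
import re

# Matches e.g. "${u16}0x12" or "${s8}-1" (Proposal 3: specifier as prefix).
IMM_RE = re.compile(r"^\$\{([us])(\d+)\}(-?(?:0x[0-9a-fA-F]+|\d+))$")


def encode_imm(operand):
    """Return the little-endian bytes for an immediate with an explicit size."""
    m = IMM_RE.match(operand)
    if m is None:
        raise ValueError(f"not a sized immediate: {operand!r}")
    signed = m.group(1) == "s"
    bits = int(m.group(2))
    if bits == 0 or bits % 8:
        raise ValueError("this sketch only handles whole-byte sizes")
    text = m.group(3)
    value = int(text, 16) if "x" in text else int(text, 10)
    # int.to_bytes raises OverflowError when the value does not fit.
    return value.to_bytes(bits // 8, "little", signed=signed)


print(encode_imm("${s16}0x12").hex())
print(encode_imm("${s8}-1").hex())
```

Proposal 4 would only move the brace group behind the number; the information
carried (signedness and width) is the same.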

> # When register operand size varies for an actual insn (like e.g. for MOVZX or
> # VPMOVZX*), registers nevertheless need spelling out in a uniform manner, such
> # that any of them could be used to derive operand size attributes (e.g.
> # operand size prefix, REX.W, VEX.W, or VEX.L) as well as the EVEX Disp8
> # scaling factor.
> # TBD: Could also go from largest operand size, albeit that may end up confusing
> #      in AT&T mode, where memory operands don't have size, yet the memory
> #      operand may have larger size than the register one(s) (and would hence be
> #      the one which the <len> attribute - see below - needs deriving from).

Using largest operand size has turned out to be preferable. The AT&T
syntax concern is easy to address: Respective attributes can simply
be specified explicitly in the {VEX,XOP,EVEX}... construct when
operands don't allow correctly deriving one or more of them.

> # For VEX / XOP / EVEX <encoding> is arranged like this:
> # {VEX,XOP,EVEX}[.<len>][.<prefix>][.<space>][.<w>]

I've changed this for XOP, as it is more natural this way:

# {,E}VEX[.<len>][.<prefix>][.<space>][.<w>]
# XOP<space>[.<len>][.<prefix>][.<w>]

> # where
> # - <len> can be LIG, 128, 256, or (EVEX only) 512 as well as L0/L1 for
> #   VEX / XOP and L0-L3 for EVEX,
> # - <prefix> can be NP, 66, F3, or F2,
> # - <space> can be
> #   - 0f, 0f38, 0f3a, or M0...M31 for VEX,
> #   - 08...3f (hex) for XOP,

This ranges only from 08 through to 1f.

> #   - 0f, 0f38, 0f3a, or M0...M15 for EVEX,
> # - <w> can be WIG, W0, or W1.
> # Omitted <len> means "infer from operand size" if there is at least one
> # sized operand, or LIG otherwise.
> # Omitted <prefix> means NP.
> # Omitted <space> implies encoding is taken from <major-opcode>.
> # Omitted <w> means "infer from GPR operand size" if there is at least
> # one GPR operand, or WIG otherwise.
>[...]
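
For illustration, the {,E}VEX form of the grammar, including the defaults
for omitted fields, could be parsed along these lines. This is a Python
sketch with made-up names, not the gas implementation; it ignores the XOP
form and does not check per-encoding limits (512 and L2/L3 being EVEX only,
or the M<n> ranges).

```python
import re

LEN = {"LIG", "128", "256", "512", "L0", "L1", "L2", "L3"}
PREFIX = {"NP", "66", "F3", "F2"}
W = {"WIG", "W0", "W1"}
SPACE_RE = re.compile(r"^(0F|0F38|0F3A|M\d+)$")
ORDER = ["len", "prefix", "space", "w"]


def classify(token):
    """Map one dot-separated token to the field it belongs to."""
    if token in LEN:
        return "len"
    if token in PREFIX:
        return "prefix"
    if SPACE_RE.match(token):
        return "space"
    if token in W:
        return "w"
    raise ValueError(f"unknown encoding token: {token!r}")


def parse_encoding(spec):
    """Parse '{,E}VEX[.<len>][.<prefix>][.<space>][.<w>]' into a dict.

    None means "infer" (len, w) or "take from <major-opcode>" (space);
    an omitted prefix defaults to NP.
    """
    kind, *tokens = spec.upper().split(".")
    if kind not in ("VEX", "EVEX"):
        raise ValueError(f"unsupported encoding kind: {kind!r}")
    fields = {"kind": kind, "len": None, "prefix": "NP", "space": None, "w": None}
    next_pos = 0
    for token in tokens:
        name = classify(token)
        pos = ORDER.index(name)
        if pos < next_pos:
            raise ValueError(f"{token!r} is duplicated or out of order")
        next_pos = pos + 1
        fields[name] = token
    return fields


print(parse_encoding("VEX.66.0F3A.W0"))
print(parse_encoding("EVEX.NP.0F.W0"))
```

The token sets are disjoint, which is what allows any subset of the fields to
be omitted without ambiguity as long as the order is kept.
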
> # TBD: How to specify the Disp8 scaling factor here? (In Intel syntax we can simply
> #      use memory operand size.) Proposal: 4(%eax):4 or 4(%eax):d4.

Like for immediates, the proposals present parsing challenges (and here
there's also a [mild] forward compatibility concern, as we don't know
what may further be added to the architecture). Hence, as there, I'm
now considering instead putting the size specifiers inside (potentially
already present) curly braces, e.g.

	.insn EVEX.M5.W0 0x5a, 16(%eax){:d16}, %zmm0	# vcvtph2pd 16(%eax), %zmm0
	.insn EVEX.M5.W0 0x5a, 2(%eax){1to8,:d2}, %zmm0	# vcvtph2pd 2(%eax){1to8}, %zmm0

I'd like to keep the colons to reduce the risk of issues which, as
said, might result from future additions to the spec. Whether the
comma as a separator is also wanted is secondary at this point. In
particular, if it turned out to cause problems for the parsing code, I
wouldn't mind dropping it. We could also follow the masking syntax
and use

	.insn EVEX.M5.W0 0x5a, 2(%eax){1to8}{:d2}, %zmm0 # vcvtph2pd 2(%eax){1to8}, %zmm0
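
Either spelling carries the same information, as the following Python sketch
tries to show by reducing both to the same (address, broadcast, Disp8 scale)
triple. The split_mem_operand helper and its regular expressions are
assumptions made for illustration only; masking groups such as {%k2} or {z}
are not handled, since they don't apply to memory source operands here.

```python
import re

OPERAND_RE = re.compile(r"^([^{}]+)((?:\{[^{}]*\})*)$")
GROUP_RE = re.compile(r"\{([^{}]*)\}")


def split_mem_operand(operand):
    """Split 'disp(base){...}' into (address, broadcast, disp8_scale)."""
    m = OPERAND_RE.match(operand)
    if m is None:
        raise ValueError(f"malformed operand: {operand!r}")
    broadcast = None
    scale = None
    for group in GROUP_RE.findall(m.group(2)):
        # A group may hold one item, or several separated by commas.
        for item in group.split(","):
            item = item.strip()
            size = re.fullmatch(r":d(\d+)", item)
            if re.fullmatch(r"1to\d+", item):
                broadcast = item
            elif size is not None:
                scale = int(size.group(1))
            else:
                raise ValueError(f"unknown specifier: {item!r}")
    return m.group(1), broadcast, scale


print(split_mem_operand("16(%eax){:d16}"))
print(split_mem_operand("2(%eax){1to8,:d2}"))
print(split_mem_operand("2(%eax){1to8}{:d2}"))
```

The leading colon keeps the size specifier distinguishable from anything else
that may appear inside braces, which is the forward compatibility point made
above.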

Once again - input appreciated especially on all still open aspects.

Jan
