[x86 inline asm]: width of register arguments

public inbox for gcc-help@gcc.gnu.org

* [x86 inline asm]: width of register arguments
@ 2019-06-24 12:19 Zdenek Sojka
  2019-06-29 16:10 ` Andrew Haley
  0 siblings, 1 reply; 7+ messages in thread
From: Zdenek Sojka @ 2019-06-24 12:19 UTC (permalink / raw)
  To: gcc-help

Hello,

how does gcc choose the register arguments of an inline assembler and what
can I assume about the "unused" bits?
My questions target the 64bit x86 architecture; I assume the behavior is the
same for all target triplets x86_64-*-*

1) does gcc always use register of size matching the size of the variable?

eg.
__asm__ ("mov %1, %0" : "=r"(a) : "r"(b));

will always use 8bit registers (eg. al, bl) for "int8_t / uint8_t a, b",
will always use 16bit registers (eg. ax, bx) for "int16_t / uint16_t a, b",
will always use 32bit registers (eg. eax, ebx) for "int32_t / uint32_t a, b",
will always use 64bit registers (eg. rax, rbx) for "int64_t / uint64_t a, b",
will always fail due to operand size mismatch for any other combination?

2) can I assume anything about the high-order bits of the register? can I 
overwrite them freely?

2a) does gcc use the "high" 8bit registers (ah, bh, ch, dh) for variable 
allocation?

2b) can gcc allocate different 8bit variables in the "low" and "high"
registers (eg. al/ah, bl/bh, ...)?

For variables of type:

uint8_t a8, b8;
uint16_t a16, b16;
...

Enforcing same-sized arguments:
a)
__asm__ ("movb %b1, %b0" : "=r"(a8) : "r"(b8));
or
__asm__ ("movq %q1, %q0" : "=r"(a8) : "r"(b8));
is always safe to do? (eg. moving 56bits of garbage won't hurt anything)
OR might gcc assume something about the high-order 56bits (eg. zero, sign-/
zero-extension of the lower 8 bits), which might get broken by the move?

b)
__asm__ ("movw %w1, %w0" : "=r"(a8) : "r"(b16));
or
__asm__ ("movq %q1, %q0" : "=r"(a8) : "r"(b16));

is always safe to do? (eg. moving 56bits of garbage won't hurt anything)

Assuming zero-extension:
__asm__ ("movw %w1, %w0" : "=r"(a16) : "r"((uint8_t)b16));
or
__asm__ ("movw %w1, %w0" : "=r"(a16) : "r"(b8));
does not seem to work (high-order 8 bits of a16 are garbage)

Thank you,
Zdenek Sojka


* Re: [x86 inline asm]: width of register arguments
  2019-06-24 12:19 [x86 inline asm]: width of register arguments Zdenek Sojka
@ 2019-06-29 16:10 ` Andrew Haley
  2019-07-02  5:37   ` Zdenek Sojka
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Haley @ 2019-06-29 16:10 UTC (permalink / raw)
  To: Zdenek Sojka, gcc-help

Hi,

On 6/24/19 1:19 PM, Zdenek Sojka wrote:
> how does gcc choose the register arguments of an inline assembler and what
> can I assume about the "unused" bits?

The choice is made by the register allocator.

You can't assume anything about the "unused" bits. The "r" register
constraint means you get a whole register to use, of the wordsize of
the machine.


> My questions target the 64bit x86 architecture; I assume the behavior is the
> same for all target triplets x86_64-*-*
> 
> 1) does gcc always use register of size matching the size of the variable?

No.

> 2) can I assume anything about the high-order bits of the register? can I 
> overwrite them freely?

No; yes.

> 2a) does gcc use the "high" 8bit registers (ah, bh, ch, dh) for variable 
> allocation?

No.

> 2b) can gcc allocate different 8bit variables in the "low" and "high"
> registers (eg. al/ah, bl/bh, ...)?
> 
> 
> For variables of type:
> 
> uint8_t a8, b8;
> uint16_t a16, b16;
> ...

I think not, but I'm unsure.

> Enforcing same-sized arguments:
> a)
> __asm__ ("movb %b1, %b0" : "=r"(a8) : "r"(b8));
> or
> __asm__ ("movq %q1, %q0" : "=r"(a8) : "r"(b8));
> is always safe to do? (eg. moving 56bits of garbage won't hurt anything)
> OR might gcc assume something about the high-order 56bits (eg. zero, sign-/
> zero-extension of the lower 8 bits), which might get broken by the move?

If you ask for a register with the "r" constraint, all of that
register is yours to use.
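
A minimal sketch of what this answer implies, assuming x86-64 and GCC (or
Clang) extended asm; the function names are made up for illustration. The asm
moves all 64 bits, but only the low 8 bits of the output register are read
back as a8, so whatever lands in the upper 56 bits is ignored:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: copy an 8-bit value with a full-width register move. */
uint8_t copy_via_full_register(uint8_t b8)
{
    uint8_t a8;
#if defined(__x86_64__)
    /* %q0 and %q1 print the 64-bit names of the registers the compiler
       picked for the two 8-bit operands (for example %rax and %rdx). */
    __asm__ ("movq %q1, %q0" : "=r"(a8) : "r"(b8));
#else
    a8 = b8;   /* plain C fallback so the sketch builds on other targets */
#endif
    return a8;
}

/* Only the low 8 bits of the output register are read back as a8. */
void check_copy_via_full_register(void)
{
    assert(copy_via_full_register(0x5A) == 0x5A);
    assert(copy_via_full_register(0xFF) == 0xFF);
}
```

The %q modifier only exists in 64-bit mode, which is why the asm path is
guarded by __x86_64__.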


> Assuming zero-extension:
> __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"((uint8_t)b16));
> or
> __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"(b8));
> does not seem to work (high-order 8 bits of a16 are garbage)

That's how x86 works.
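
Since the upper bits of a register holding a narrower input are unspecified,
the widening has to be done explicitly. A minimal sketch, assuming x86 and
GCC extended asm (the function names are hypothetical): movzbw reads only
the 8-bit register named by %b1 and writes a zero-extended 16-bit result to
the register named by %w0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen an 8-bit value to 16 bits inside the asm. */
uint16_t zero_extend_8_to_16(uint8_t b8)
{
    uint16_t a16;
#if defined(__x86_64__) || defined(__i386__)
    /* "q" asks for a register that has a low-byte name; in 32-bit mode
       that restricts the choice to a, b, c and d. */
    __asm__ ("movzbw %b1, %w0" : "=r"(a16) : "q"(b8));
#else
    a16 = b8;  /* plain C fallback so the sketch builds on other targets */
#endif
    return a16;
}

/* The high-order 8 bits of the result are now defined to be zero. */
void check_zero_extend_8_to_16(void)
{
    assert(zero_extend_8_to_16(0xAB) == 0x00AB);
    assert(zero_extend_8_to_16(0x00) == 0x0000);
}
```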

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


* Re: [x86 inline asm]: width of register arguments
  2019-06-29 16:10 ` Andrew Haley
@ 2019-07-02  5:37   ` Zdenek Sojka
  2019-07-02 10:41     ` Andrew Haley
  0 siblings, 1 reply; 7+ messages in thread
From: Zdenek Sojka @ 2019-07-02  5:37 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-help

Hello Andrew,

---------- Original e-mail ----------
From: Andrew Haley <aph@redhat.com>
To: Zdenek Sojka <zsojka@seznam.cz>, gcc-help@gcc.gnu.org
Date: 29. 6. 2019 18:11:20
Subject: Re: [x86 inline asm]: width of register arguments

> Hi,
>
> On 6/24/19 1:19 PM, Zdenek Sojka wrote:
> > how does gcc choose the register arguments of an inline assembler and what
> > can I assume about the "unused" bits?
>
> The choice is made by the register allocator.
>
> You can't assume anything about the "unused" bits. The "r" register
> constraint means you get a whole register to use, of the wordsize of
> the machine.

Ok, that is very important information!

I was a bit afraid that the compiler might assume the upper bits are e.g.
zeroed, if they were zeroed before the __asm__ statement (or that the
high-order bits might be a sign-extension of the narrower value).

> > My questions target the 64bit x86 architecture; I assume the behavior is the
> > same for all target triplets x86_64-*-*
> >
> > 1) does gcc always use register of size matching the size of the variable?
>
> No.

Ok, shame - it seems to behave so in my experiments:

#include <stdint.h>

void foo(void)
{
        uint8_t u8; uint16_t u16; uint32_t u32; uint64_t u64;
        __asm__ volatile ("# %0 %1 %2 %3" : "=r"(u8), "=r"(u16), "=r"(u32), "=r"(u64));
        __asm__ volatile ("# %0 %1 %2 %3" : "+r"(u8), "+r"(u16), "+r"(u32), "+r"(u64));
        __asm__ volatile ("# %0 %1 %2 %3" :: "r"(u8), "r"(u16), "r"(u32), "r"(u64));
}

This generates, at all optimization levels (-O0 to -O3), an 8bit reg for u8,
a 16bit reg for u16, a 32bit reg for u32 and a 64bit reg for u64:

# 9 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2
# 10 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2
# 11 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2





Similar for:

void bar(uint8_t u8, uint16_t u16, uint32_t u32, uint64_t u64)
{
        __asm__ volatile ("# %0 %1 %2 %3" : "=r"(u8), "=r"(u16), "=r"(u32), "=r"(u64));
        __asm__ volatile ("# %0 %1 %2 %3" : "+r"(u8), "+r"(u16), "+r"(u32), "+r"(u64));
        __asm__ volatile ("# %0 %1 %2 %3" :: "r"(u8), "r"(u16), "r"(u32), "r"(u64));
}

and for

void baz64(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
        __asm__ volatile ("# %0 %1 %2 %3" :: "r"((uint8_t)a), "r"((uint16_t)b), "r"((uint32_t)c), "r"((uint64_t)d));
}

void baz8(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
        __asm__ volatile ("# %0 %1 %2 %3" :: "r"((uint8_t)a), "r"((uint16_t)b), "r"((uint32_t)c), "r"((uint64_t)d));
}

which always use an 8bit register for a, a 16bit register for b, a 32bit
register for c and a 64bit register for d.

Do you happen to know of any counter-example?

> > 2) can I assume anything about the high-order bits of the register? can I
> > overwrite them freely?
>
> No; yes.
>
> > 2a) does gcc use the "high" 8bit registers (ah, bh, ch, dh) for variable
> > allocation?
>
> No.

Ok, thanks, another important piece of information.

> > 2b) can gcc allocate different 8bit variables in the "low" and "high"
> > registers (eg. al/ah, bl/bh, ...)?
> >
> > For variables of type:
> >
> > uint8_t a8, b8;
> > uint16_t a16, b16;
> > ...
>
> I think not, but I'm unsure.

According to 2a), ah/bh/... are not used for register allocation -> so "No."

> > Enforcing same-sized arguments:
> > a)
> > __asm__ ("movb %b1, %b0" : "=r"(a8) : "r"(b8));
> > or
> > __asm__ ("movq %q1, %q0" : "=r"(a8) : "r"(b8));
> > is always safe to do? (eg. moving 56bits of garbage won't hurt anything)
> > OR might gcc assume something about the high-order 56bits (eg. zero, sign-/
> > zero-extension of the lower 8 bits), which might get broken by the move?
>
> If you ask for a register with the "r" constraint, all of that
> register is yours to use.
>
> > Assuming zero-extension:
> > __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"((uint8_t)b16));
> > or
> > __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"(b8));
> > does not seem to work (high-order 8 bits of a16 are garbage)
>
> That's how x86 works.

Garbage in, garbage out. The high-order bits are undefined.

Best regards,
Zdenek Sojka



* Re: [x86 inline asm]: width of register arguments
  2019-07-02  5:37   ` Zdenek Sojka
@ 2019-07-02 10:41     ` Andrew Haley
  2019-07-02 15:19       ` Segher Boessenkool
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Haley @ 2019-07-02 10:41 UTC (permalink / raw)
  To: Zdenek Sojka; +Cc: gcc-help

On 7/2/19 6:37 AM, Zdenek Sojka wrote:
> Ok, shame - it seems to behave so in my experiments:

It's more complicated than that. Sometimes the size of the operand is
in the name of the instruction and sometimes you need to force the
size yourself.

For example, with int a, x:

  asm("mov %1, %0" : "=&r"(a) : "r"(x));

generates

	mov %edx, %eax

but

  asm("mov %b1, %b0" : "=&r"(a) : "r"(x));

generates

	mov %dl, %al

For a real-world example,

static __inline void
outb_p (unsigned char __value, unsigned short int __port)
{
  __asm__ __volatile__ ("outb %b0,%w1\noutb %%al,$0x80": :"a" (__value),
                  "Nd" (__port));
}

On x86, the special single-letter asm output modifiers that follow a '%'
are defined in i386.md.
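
To illustrate forcing the size with a modifier, here is a small sketch,
assuming x86 and GCC extended asm (the function names are made up). Both
operands are 32-bit ints, so without a modifier they would print as 32-bit
registers; %b prints the 8-bit names instead, and the "q" constraint keeps
32-bit builds to registers that actually have a low-byte name:

```c
#include <assert.h>

/* Hypothetical helper: overwrite only the low byte of a with that of x. */
unsigned replace_low_byte(unsigned a, unsigned x)
{
#if defined(__x86_64__) || defined(__i386__)
    /* a is an in/out operand ("+"), so its upper 24 bits are kept as they
       were; movb touches only the 8-bit sub-register named by %b0. */
    __asm__ ("movb %b1, %b0" : "+q"(a) : "q"(x));
#else
    a = (a & ~0xFFu) | (x & 0xFFu);  /* plain C fallback for other targets */
#endif
    return a;
}

void check_replace_low_byte(void)
{
    assert(replace_low_byte(0x11223344u, 0xAABBCCDDu) == 0x112233DDu);
    assert(replace_low_byte(0u, 0x1FFu) == 0xFFu);
}
```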

I'm a bit paranoid about this stuff because my memory of GCC's inline
asm goes back decades to when it was far more fragile than it is now.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


* Re: [x86 inline asm]: width of register arguments
  2019-07-02 10:41     ` Andrew Haley
@ 2019-07-02 15:19       ` Segher Boessenkool
  2019-07-03  7:32         ` Zdenek Sojka
  0 siblings, 1 reply; 7+ messages in thread
From: Segher Boessenkool @ 2019-07-02 15:19 UTC (permalink / raw)
  To: Andrew Haley; +Cc: Zdenek Sojka, gcc-help

On Tue, Jul 02, 2019 at 11:41:35AM +0100, Andrew Haley wrote:
> On 7/2/19 6:37 AM, Zdenek Sojka wrote:
> > Ok, shame - it seems to behave so in my experiments:
> 
> It's more complicated than that. Sometimes the size of the operand is
> in the name of the instruction and sometimes you need to force the
> size yourself.

You also need to consider the mode that is used for the variables.

> For example, with int a, x:
> 
>   asm("mov %1, %0" : "=&r"(a) : "r"(x));
> 
> generates
> 
> 	mov %edx, %eax

and with "short a, x" you get %ax and %dx, and with "char a, x" you get
%al and %dl without needing the %b output modifier.

> I'm a bit paranoid about this stuff because my memory of GCC's inline
> asm goes back decades to when it was far more fragile than it is now.

Oh it still is quite fragile / does unexpected things whenever you try
anything out of the ordinary.  For example, operands do not get the
usual integer promotions, as the example above shows: operands are
treated like lvalues instead, input operands as well.

If you know what mode is used for every operand you use in asm, and/or
you just use operands that are full register size, there aren't many
surprises.
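
One way to know the mode of every operand, sketched here under the
assumption of x86 and GCC extended asm (the function names are
hypothetical): widen the value in C with a cast, so that the operand itself
is 32 bits wide and the compiler performs the zero-extension before the asm
runs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: hand an 8-bit value to the asm as a 32-bit operand. */
uint32_t pass_widened(uint8_t b8)
{
    uint32_t out;
#if defined(__x86_64__) || defined(__i386__)
    /* The cast makes operand 1 a 32-bit value, so %1 prints a 32-bit
       register name and its upper 24 bits are known to be zero. */
    __asm__ ("movl %1, %0" : "=r"(out) : "r"((uint32_t)b8));
#else
    out = b8;  /* plain C fallback so the sketch builds on other targets */
#endif
    return out;
}

void check_pass_widened(void)
{
    assert(pass_widened(0xAB) == 0xABu);
    assert(pass_widened(0xFF) == 0xFFu);
}
```

Without the cast, the operand would be 8 bits wide and %1 would print an
8-bit register name, which movl would reject.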

Things get even more interesting if you use multi-register modes, like
DImode with -m32 on x86.  On x86 that says "warning: unsupported size
for integer register", but some other targets have to handle that.
Writing correct asm in such cases of course means that you have to know
what the compiler does.

Segher


* Re: [x86 inline asm]: width of register arguments
  2019-07-02 15:19       ` Segher Boessenkool
@ 2019-07-03  7:32         ` Zdenek Sojka
  2019-07-03 13:57           ` Segher Boessenkool
  0 siblings, 1 reply; 7+ messages in thread
From: Zdenek Sojka @ 2019-07-03  7:32 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-help, Andrew Haley

Hello Segher,
Hello Andrew,

---------- Original e-mail ----------
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Andrew Haley <aph@redhat.com>
Date: 2. 7. 2019 17:19:48
Subject: Re: [x86 inline asm]: width of register arguments

> On Tue, Jul 02, 2019 at 11:41:35AM +0100, Andrew Haley wrote:
> > On 7/2/19 6:37 AM, Zdenek Sojka wrote:
> > > Ok, shame - it seems to behave so in my experiments:
> >
> > It's more complicated than that. Sometimes the size of the operand is
> > in the name of the instruction and sometimes you need to force the
> > size yourself.
>
> You also need to consider the mode that is used for the variables.
>
> > For example, with int a, x:
> >
> >   asm("mov %1, %0" : "=&r"(a) : "r"(x));
> >
> > generates
> >
> > 	mov %edx, %eax
>
> and with "short a, x" you get %ax and %dx, and with "char a, x" you get
> %al and %dl without needing the %b output modifier.
>
> > I'm a bit paranoid about this stuff because my memory of GCC's inline
> > asm goes back decades to when it was far more fragile than it is now.
>
> Oh it still is quite fragile / does unexpected things whenever you try
> anything out of the ordinary. For example, operands do not get the
> usual integer promotions, as the example above shows: operands are
> treated like lvalues instead, input operands as well.
>
> If you know what mode is used for every operand you use in asm, and/or
> you just use operands that are full register size, there aren't many
> surprises.

Ok, thanks, that helps me a lot!

> Things get even more interesting if you use multi-register modes, like
> DImode with -m32 on x86. On x86 that says "warning: unsupported size
> for integer register", but some other targets have to handle that.
> Writing correct asm in such cases of course means that you have to know
> what the compiler does.

Like on arm and powerpc (32bit) targets, where there are the %L and/or %H
operand modifiers (correct?) (I can't find them documented anywhere atm); I
just found the documentation for x86 at
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers ,
but the other targets do not seem to be documented on that page.

x86 (32bit) has a DImode constraint "A" for the edx:eax register pair... but
it's not generic enough and out of scope for me, as I am using the x86_64
(amd64) target with at most DImode variables.

Best regards,
Zdenek


* Re: [x86 inline asm]: width of register arguments
  2019-07-03  7:32         ` Zdenek Sojka
@ 2019-07-03 13:57           ` Segher Boessenkool
  0 siblings, 0 replies; 7+ messages in thread
From: Segher Boessenkool @ 2019-07-03 13:57 UTC (permalink / raw)
  To: Zdenek Sojka; +Cc: gcc-help, Andrew Haley

Hi Zdenek,

On Wed, Jul 03, 2019 at 09:32:12AM +0200, Zdenek Sojka wrote:
> "
> Things get even more interesting if you use multi-register modes, like
> DImode with -m32 on x86. On x86 that says "warning: unsupported size
> for integer register", but some other targets have to handle that.
> Writing correct asm in such cases of course means that you have to know 
> what the compiler does.
> "
> Like on arm and powerpc (32bit) targets, where there are the %L and/or %H 
> operand modifiers (correct?)

Yup.  PowerPC's %L (not just 32-bit btw) just means "register number plus
one", which you can write as just "%0+1" with many assemblers (but not
with all).  arm and aarch64 %H are very similar.

> (I can't find them documented anywhere atm)

Yeah, working on it.  Known issue.

> x86 (32bit) has a DImode constraint "A" for the edx:eax register pair... but
> it's not generic enough and out of scope for me, as I am using the x86_amd64
> target with at most DImode variables.

You can have TImode variables in 64-bit mode, using __int128 or similar.

Anyway...  You hardly ever should have issues with this in actual inline
asm.  You should avoid these problems, avoid these constructs completely,
there usually are better ways to do these things (using pure C, or maybe
using some compiler intrinsics / builtin functions, for example).
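
As a sketch of that advice, assuming GCC or Clang on a 64-bit target that
provides unsigned __int128 (all function names here are hypothetical): a
64x64 -> 128-bit multiply written in pure C, next to an inline-asm version.
The asm version avoids a TImode operand by returning the two halves as
separate 64-bit operands; the C version lets the compiler pick the
instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Pure C: the compiler selects the widening multiply on its own. */
unsigned __int128 mul_wide_c(uint64_t a, uint64_t b)
{
    return (unsigned __int128)a * b;
}

/* Inline-asm variant, using only register-sized (64-bit) operands. */
unsigned __int128 mul_wide_asm(uint64_t a, uint64_t b)
{
#if defined(__x86_64__)
    uint64_t lo, hi;
    /* mulq multiplies %rax by its operand and leaves the 128-bit result
       in %rdx:%rax; "0" ties input a to the same register as output lo. */
    __asm__ ("mulq %3"
             : "=a"(lo), "=d"(hi)
             : "0"(a), "rm"(b)
             : "cc");
    return ((unsigned __int128)hi << 64) | lo;
#else
    return mul_wide_c(a, b);  /* fallback so the sketch builds elsewhere */
#endif
}

void check_mul_wide(void)
{
    uint64_t x = UINT64_MAX, y = 0x123456789ABCDEF0ULL;
    assert(mul_wide_asm(x, y) == mul_wide_c(x, y));
    assert(mul_wide_asm(3, 4) == 12);
}
```

Both functions return the same value; the pure C one is also visible to the
optimizer, which can fold constants or combine it with surrounding code.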

Segher
