public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* PATCH: Implement Intel Transactional Synchronization Extensions
@ 2012-02-08 18:25 H.J. Lu
  2012-02-21  8:57 ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2012-02-08 18:25 UTC (permalink / raw)
  To: binutils

[-- Attachment #1: Type: text/plain, Size: 2999 bytes --]

Hi,

I checked in this pach to implement Intel Transactional Synchronization
Extensions:

http://software.intel.com/en-us/avx/


H.J.
---
gas/

2012-02-08  H.J. Lu  <hongjiu.lu@intel.com>

	* config/tc-i386.c (HLE_PREFIX): New.
	(check_hle): Likewise.
	(_i386_insn): Add have_hle.
	(cpu_arch): Add .hle and .rtm.
	(md_assemble): Call check_hle if i.have_hle isn't zero.
	(parse_insn): Set i.have_hle to 1 for HLE prefix.
	(output_jump): Support up to 2 byte opcode.

	* doc/c-i386.texi: Document hle/.hle and rtm/.rtm.

gas/testsuite/

2012-02-08  H.J. Lu  <hongjiu.lu@intel.com>

	* gas/i386/hle-intel.d: New.
	* gas/i386/hle.d: Likewise.
	* gas/i386/hle.s: Likewise.
	* gas/i386/hlebad.l: Likewise.
	* gas/i386/hlebad.s: Likewise.
	* gas/i386/rtm-intel.d: Likewise.
	* gas/i386/rtm.d: Likewise.
	* gas/i386/rtm.s: Likewise.
	* gas/i386/x86-64-hle-intel.d: Likewise.
	* gas/i386/x86-64-hle.d: Likewise.
	* gas/i386/x86-64-hle.s: Likewise.
	* gas/i386/x86-64-hlebad.l: Likewise.
	* gas/i386/x86-64-hlebad.s: Likewise.
	* gas/i386/x86-64-rtm-intel.d: Likewise.
	* gas/i386/x86-64-rtm.d: Likewise.
	* gas/i386/x86-64-rtm.s: Likewise.

	* gas/i386/i386.exp: Run hle, hle-intel, hlebad x86-64-hle, rtm,
	rtm-intel, x86-64-hle-intel, x86-64-hlebad, x86-64-rtm and
	x86-64-rtm-intel.

include/opcode/

2012-02-08  H.J. Lu  <hongjiu.lu@intel.com>

	* i386.h (XACQUIRE_PREFIX_OPCODE): New.
	(XRELEASE_PREFIX_OPCODE): Likewise.

opcodes/

2012-02-08  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-dis.c (HLE_Fixup1): New.
	(HLE_Fixup2): Likewise.
	(HLE_Fixup3): Likewise.
	(Ebh1): Likewise.
	(Evh1): Likewise.
	(Ebh2): Likewise.
	(Evh2): Likewise.
	(Ebh3): Likewise.
	(Evh3): Likewise.
	(MOD_C6_REG_7): Likewise.
	(MOD_C7_REG_7): Likewise.
	(RM_C6_REG_7): Likewise.
	(RM_C7_REG_7): Likewise.
	(XACQUIRE_PREFIX): Likewise.
	(XRELEASE_PREFIX): Likewise.
	(dis386): Use Ebh1/Evh1 on add, adc, and, btc, btr, bts,
	cmpxchg, dec, inc, neg, not, or, sbb, sub, xor and xadd. Use
	Ebh2/Evh2 on xchg.  Use Ebh3/Evh3 on mov.
	(reg_table): Use Ebh1/Evh1 on add, adc, and, dec, inc, neg,
	not, or, sbb, sub and xor.  Use Ebh3/Evh3 on mov.  Use
	MOD_C6_REG_7 and MOD_C7_REG_7.
	(mod_table): Add MOD_C6_REG_7 and MOD_C7_REG_7.
	(rm_table): Add RM_C6_REG_7 and RM_C7_REG_7.  Add xend and
	xtest.
	(prefix_name): Handle XACQUIRE_PREFIX and XRELEASE_PREFIX.
	(CMPXCHG8B_Fixup): Handle HLE prefix on cmpxchg8b.

	* i386-gen.c (cpu_flag_init): Add CPU_HLE_FLAGS and
	CPU_RTM_FLAGS.
	(cpu_flags): Add CpuHLE and CpuRTM.
	(opcode_modifiers): Add HLEPrefixOk.

	* i386-opc.h (CpuHLE): New.
	(CpuRTM): Likewise.
	(HLEPrefixOk): Likewise.
	(i386_cpu_flags): Add cpuhle and cpurtm.
	(i386_opcode_modifier): Add hleprefixok.

	* i386-opc.tbl: Add HLEPrefixOk=3 to mov.  Add HLEPrefixOk to
	add, adc, and, btc, btr, bts, cmpxchg, dec, inc, neg, not, or,
	sbb, sub, xor and xadd.  Add HLEPrefixOk=2 to xchg with memory
	operand.  Add xacquire, xrelease, xabort, xbegin, xend and
	xtest.
	* i386-init.h: Regenerated.
	* i386-tbl.h: Likewise.

[-- Attachment #2: binutils-tsx.patch.bz2 --]
[-- Type: application/x-bzip2, Size: 31289 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: PATCH: Implement Intel Transactional Synchronization Extensions
  2012-02-08 18:25 PATCH: Implement Intel Transactional Synchronization Extensions H.J. Lu
@ 2012-02-21  8:57 ` Jan Beulich
  2012-02-21 18:18   ` H.J. Lu
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2012-02-21  8:57 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

>>> On 08.02.12 at 19:25, "H.J. Lu" <hongjiu.lu@intel.com> wrote:
>+static int
>+check_hle (void)
>+{
>+  switch (i.tm.opcode_modifier.hleprefixok)
>+    {
>+    default:
>+      abort ();
>+    case 0:

How about giving these literals proper names. Right now one has to
look up on what instructions they're being use to see what their
intended meaning is.

>+      if (i.prefix[HLE_PREFIX] == XACQUIRE_PREFIX_OPCODE)

With HLE_PREFIX == REP_PREFIX and XACQUIRE_PREFIX_OPCODE
== REPNE_PREFIX_OPCODE, how is this distinguished from the repne
case? Oh, I see, the *caller* checks i.have_hle - that's pretty bad to
be done this way imo.

+	as_bad (_("invalid instruction `%s' after `xacquire'"),
+		i.tm.name);
+      else
+	as_bad (_("invalid instruction `%s' after `xrelease'"),
+		i.tm.name);
+      return 0;
+    case 1:
+      if (i.prefix[LOCK_PREFIX])
+	return 1;
+      if (i.prefix[HLE_PREFIX] == XACQUIRE_PREFIX_OPCODE)
+	as_bad (_("missing `lock' with `xacquire'"));
+      else
+	as_bad (_("missing `lock' with `xrelease'"));

Wouldn't it make sense to insert missing LOCK prefixes silently
(optionally verbosely) rather than complaining about their absence?
(I'd also question the value of displaying both prefixes by default
in the disassembly - this just clutters things rather than helping
with anything.)

+      return 0;
+    case 2:
+      return 1;
+    case 3:
+      if (i.prefix[HLE_PREFIX] != XRELEASE_PREFIX_OPCODE)
+	{
+	  as_bad (_("instruction `%s' after `xacquire' not allowed"),
+		  i.tm.name);
+	  return 0;
+	}
+      if (i.mem_operands == 0
+	  || !operand_type_check (i.types[i.operands - 1], anymem))
+	{
+	  as_bad (_("memory destination needed for instruction `%s'"
+		    " after `xrelease'"), i.tm.name);
+	  return 0;
+	}
+      return 1;
+    }
+}

Further, while the patch deals with CMPXCHG8B, for some reason
it leaves out CMPXCHG16B (perhaps because the instruction, oddly
enough, is considered an SSE3 one).

Finally, albeit consistent with what is documented, I'm surprised that
MOV opcodes 0xA2 and 0xA3 aren't allowed with XRELEASE - is that
really the case?

Jan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: PATCH: Implement Intel Transactional Synchronization Extensions
  2012-02-21  8:57 ` Jan Beulich
@ 2012-02-21 18:18   ` H.J. Lu
  2012-02-22  7:53     ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2012-02-21 18:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Tue, Feb 21, 2012 at 12:56 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 08.02.12 at 19:25, "H.J. Lu" <hongjiu.lu@intel.com> wrote:
>>+static int
>>+check_hle (void)
>>+{
>>+  switch (i.tm.opcode_modifier.hleprefixok)
>>+    {
>>+    default:
>>+      abort ();
>>+    case 0:
>
> How about giving these literals proper names. Right now one has to
> look up on what instructions they're being use to see what their
> intended meaning is.

I checked in a patch to defineHLEPrefixNone, HLEPrefixLock,
HLEPrefixAny and HLEPrefixRelease.

>>+      if (i.prefix[HLE_PREFIX] == XACQUIRE_PREFIX_OPCODE)
>
> With HLE_PREFIX == REP_PREFIX and XACQUIRE_PREFIX_OPCODE
> == REPNE_PREFIX_OPCODE, how is this distinguished from the repne
> case? Oh, I see, the *caller* checks i.have_hle - that's pretty bad to
> be done this way imo.

Any suggestions are welcome.

> +       as_bad (_("invalid instruction `%s' after `xacquire'"),
> +               i.tm.name);
> +      else
> +       as_bad (_("invalid instruction `%s' after `xrelease'"),
> +               i.tm.name);
> +      return 0;
> +    case 1:
> +      if (i.prefix[LOCK_PREFIX])
> +       return 1;
> +      if (i.prefix[HLE_PREFIX] == XACQUIRE_PREFIX_OPCODE)
> +       as_bad (_("missing `lock' with `xacquire'"));
> +      else
> +       as_bad (_("missing `lock' with `xrelease'"));
>
> Wouldn't it make sense to insert missing LOCK prefixes silently
> (optionally verbosely) rather than complaining about their absence?
> (I'd also question the value of displaying both prefixes by default
> in the disassembly - this just clutters things rather than helping
> with anything.)

In some cases, it is very useful to know/control exactly how many
prefixes are generated.

> +      return 0;
> +    case 2:
> +      return 1;
> +    case 3:
> +      if (i.prefix[HLE_PREFIX] != XRELEASE_PREFIX_OPCODE)
> +       {
> +         as_bad (_("instruction `%s' after `xacquire' not allowed"),
> +                 i.tm.name);
> +         return 0;
> +       }
> +      if (i.mem_operands == 0
> +         || !operand_type_check (i.types[i.operands - 1], anymem))
> +       {
> +         as_bad (_("memory destination needed for instruction `%s'"
> +                   " after `xrelease'"), i.tm.name);
> +         return 0;
> +       }
> +      return 1;
> +    }
> +}
>
> Further, while the patch deals with CMPXCHG8B, for some reason
> it leaves out CMPXCHG16B (perhaps because the instruction, oddly
> enough, is considered an SSE3 one).
>
> Finally, albeit consistent with what is documented, I'm surprised that
> MOV opcodes 0xA2 and 0xA3 aren't allowed with XRELEASE - is that
> really the case?
>

Yes, they are implemented according to TSX spec.


-- 
H.J.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: PATCH: Implement Intel Transactional Synchronization Extensions
  2012-02-21 18:18   ` H.J. Lu
@ 2012-02-22  7:53     ` Jan Beulich
  2012-02-22 17:11       ` H.J. Lu
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2012-02-22  7:53 UTC (permalink / raw)
  To: hjl.tools; +Cc: binutils

>>> "H.J. Lu" <hjl.tools@gmail.com> 02/21/12 7:17 PM >>>
On Tue, Feb 21, 2012 at 12:56 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 08.02.12 at 19:25, "H.J. Lu" <hongjiu.lu@intel.com> wrote:
>>>+      if (i.prefix[HLE_PREFIX] == XACQUIRE_PREFIX_OPCODE)
>>
>> With HLE_PREFIX == REP_PREFIX and XACQUIRE_PREFIX_OPCODE
>> == REPNE_PREFIX_OPCODE, how is this distinguished from the repne
>> case? Oh, I see, the *caller* checks i.have_hle - that's pretty bad to
>> be done this way imo.
>
>Any suggestions are welcome.

At least move the check into the function, so that an eventual second caller
doesn't have to ope code the check a second time.

>> Wouldn't it make sense to insert missing LOCK prefixes silently
>> (optionally verbosely) rather than complaining about their absence?
>> (I'd also question the value of displaying both prefixes by default
>> in the disassembly - this just clutters things rather than helping
>> with anything.)
>
>In some cases, it is very useful to know/control exactly how many
>prefixes are generated.

But there's no alternative here - the code is wrong without the prefixes.
And as written, for those who do care an optional warning could be
implemented.

>> Further, while the patch deals with CMPXCHG8B, for some reason
>> it leaves out CMPXCHG16B (perhaps because the instruction, oddly
>> enough, is considered an SSE3 one).
>>
>> Finally, albeit consistent with what is documented, I'm surprised that
>> MOV opcodes 0xA2 and 0xA3 aren't allowed with XRELEASE - is that
>> really the case?
>>
>
>Yes, they are implemented according to TSX spec.

Assuming you meant "are not", did you check that this isn't just an omission
in the spec?

Jan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: PATCH: Implement Intel Transactional Synchronization Extensions
  2012-02-22  7:53     ` Jan Beulich
@ 2012-02-22 17:11       ` H.J. Lu
  2012-02-22 17:21         ` H.J. Lu
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2012-02-22 17:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Tue, Feb 21, 2012 at 11:52 PM, Jan Beulich <jbeulich@suse.com> wrote:
>>>> "H.J. Lu" <hjl.tools@gmail.com> 02/21/12 7:17 PM >>>
> On Tue, Feb 21, 2012 at 12:56 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 08.02.12 at 19:25, "H.J. Lu" <hongjiu.lu@intel.com> wrote:
>>>>+      if (i.prefix[HLE_PREFIX] == XACQUIRE_PREFIX_OPCODE)
>>>
>>> With HLE_PREFIX == REP_PREFIX and XACQUIRE_PREFIX_OPCODE
>>> == REPNE_PREFIX_OPCODE, how is this distinguished from the repne
>>> case? Oh, I see, the *caller* checks i.have_hle - that's pretty bad to
>>> be done this way imo.
>>
>>Any suggestions are welcome.
>
> At least move the check into the function, so that an eventual second caller
> doesn't have to ope code the check a second time.

Can you provide a patch?

>>> Wouldn't it make sense to insert missing LOCK prefixes silently
>>> (optionally verbosely) rather than complaining about their absence?
>>> (I'd also question the value of displaying both prefixes by default
>>> in the disassembly - this just clutters things rather than helping
>>> with anything.)
>>
>>In some cases, it is very useful to know/control exactly how many
>>prefixes are generated.
>
> But there's no alternative here - the code is wrong without the prefixes.
> And as written, for those who do care an optional warning could be
> implemented.

LOCK prefix is optional on some instructions.  Implicit LOCK prefix
means we can't tell how many prefix bytes are generated from instructions.
I will not go with implicit LOCK prefix.

>>> Further, while the patch deals with CMPXCHG8B, for some reason
>>> it leaves out CMPXCHG16B (perhaps because the instruction, oddly
>>> enough, is considered an SSE3 one).
>>>
>>> Finally, albeit consistent with what is documented, I'm surprised that
>>> MOV opcodes 0xA2 and 0xA3 aren't allowed with XRELEASE - is that
>>> really the case?
>>>
>>
>>Yes, they are implemented according to TSX spec.
>
> Assuming you meant "are not", did you check that this isn't just an omission
> in the spec?
>

I will let you know what I found out.

Thanks,

-- 
H.J.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: PATCH: Implement Intel Transactional Synchronization Extensions
  2012-02-22 17:11       ` H.J. Lu
@ 2012-02-22 17:21         ` H.J. Lu
  0 siblings, 0 replies; 6+ messages in thread
From: H.J. Lu @ 2012-02-22 17:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Feb 22, 2012 at 9:10 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> Further, while the patch deals with CMPXCHG8B, for some reason
>>>> it leaves out CMPXCHG16B (perhaps because the instruction, oddly
>>>> enough, is considered an SSE3 one).
>>>>
>>>> Finally, albeit consistent with what is documented, I'm surprised that
>>>> MOV opcodes 0xA2 and 0xA3 aren't allowed with XRELEASE - is that
>>>> really the case?
>>>>
>>>
>>>Yes, they are implemented according to TSX spec.
>>
>> Assuming you meant "are not", did you check that this isn't just an omission
>> in the spec?
>>
>
> I will let you know what I found out.
>

I was told that the TSX spec is correct.  HLE doesn't support
CMPXCHG16B nor MOV opcodes 0xA2 and 0xA3.


-- 
H.J.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-02-22 17:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-08 18:25 PATCH: Implement Intel Transactional Synchronization Extensions H.J. Lu
2012-02-21  8:57 ` Jan Beulich
2012-02-21 18:18   ` H.J. Lu
2012-02-22  7:53     ` Jan Beulich
2012-02-22 17:11       ` H.J. Lu
2012-02-22 17:21         ` H.J. Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).