public inbox for binutils@sourceware.org
* [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
@ 2012-06-21 21:36 Roland McGrath
       [not found] ` <CAMe9rOrZ0rJs13QBAgY+Tx4T=kN+1twUpC7y62MwyPJ47m=8nA@mail.gmail.com>
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Roland McGrath @ 2012-06-21 21:36 UTC (permalink / raw)
  To: binutils

'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt ...'.

When not doing -mbmi, GCC (trunk) likes to emit 'rep; bsf ...' on the
theory that since the two instructions have sufficiently similar
semantics for the purposes for which the compiler emits this,
BMI-capable hardware will decode it as 'tzcnt ...' and may execute that
faster than 'bsf ...', while older hardware will ignore the REP prefix
and decode it as 'bsf ...'.
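
To make "sufficiently similar semantics" concrete, here is a minimal C
sketch that models the two instructions in software. The function names
model_bsf32 and model_tzcnt32 are hypothetical and exist only for this
illustration; the models cover the destination value only and ignore the
flag results, which also differ between the two instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Software model of 32-bit BSF.  Returns false when src is zero; the
   hardware leaves the destination undefined in that case, so *dst is
   simply left untouched here.  */
static bool model_bsf32(uint32_t src, uint32_t *dst)
{
    if (src == 0)
        return false;
    uint32_t index = 0;
    while ((src & 1u) == 0) {
        src >>= 1;
        ++index;
    }
    *dst = index;
    return true;
}

/* Software model of 32-bit TZCNT.  Defined for every input; a zero
   source yields the operand size, 32.  */
static uint32_t model_tzcnt32(uint32_t src)
{
    uint32_t count = 0;
    while (count < 32 && ((src >> count) & 1u) == 0)
        ++count;
    return count;
}
```

For every nonzero source the two models agree, so code that never
depends on the zero-input case can run under either decoding.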

When using .bundle_align_mode, the assembler might decide to insert some
nop padding between any two instructions, so the ';' could become some
number of nop instructions and break the encoding intended.
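
A rough sketch of the failure mode, in C. It assumes 32-byte bundles and
single-byte 0x90 padding (the real assembler may pick different nop
sequences), and the helper names emit_insn, emit_split and emit_joined
are made up for this illustration. The instruction bytes are the ones
for 'bsf %ecx, %eax' with and without the f3 prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUNDLE_SIZE = 32 };  /* assumption: 32-byte bundles */

static const unsigned char REP[]             = { 0xf3 };
static const unsigned char BSF_ECX_EAX[]     = { 0x0f, 0xbc, 0xc1 };
static const unsigned char REP_BSF_ECX_EAX[] = { 0xf3, 0x0f, 0xbc, 0xc1 };

/* Append one instruction at *off.  If it would span a bundle boundary,
   pad with 0x90 up to the boundary first.  */
static void emit_insn(unsigned char *buf, size_t *off,
                      const unsigned char *insn, size_t len)
{
    size_t room = BUNDLE_SIZE - (*off % BUNDLE_SIZE);
    if (len > room) {
        memset(buf + *off, 0x90, room);
        *off += room;
    }
    memcpy(buf + *off, insn, len);
    *off += len;
}

/* 'rep; bsf %ecx, %eax': two separate instructions.  */
static size_t emit_split(unsigned char *buf, size_t off)
{
    emit_insn(buf, &off, REP, sizeof REP);
    emit_insn(buf, &off, BSF_ECX_EAX, sizeof BSF_ECX_EAX);
    return off;
}

/* 'rep bsf %ecx, %eax': one instruction, never split by padding.  */
static size_t emit_joined(unsigned char *buf, size_t off)
{
    emit_insn(buf, &off, REP_BSF_ECX_EAX, sizeof REP_BSF_ECX_EAX);
    return off;
}
```

Starting at offset 29 of a bundle, emit_split produces f3 90 90 0f bc c1,
so the prefix no longer applies to the bsf, while emit_joined pads first
and then emits f3 0f bc c1 intact at the start of the next bundle.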

This change makes the assembler accept 'rep bsf ...' or 'rep bsr ...'
without complaint.  The result is the same as using the 'tzcnt' or
'lzcnt' mnemonic, but the 'rep' forms are accepted even under
-march=i686 or the like where 'tzcnt' and 'lzcnt' would be refused.
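
As a reading aid for the gas/config/tc-i386.c hunk in this patch, here is
a stripped-down C sketch of the template narrowing it performs. The
struct and function names (insn_template, template_range,
narrow_for_rep) are simplified stand-ins invented for this sketch, not
the real gas types.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an opcode template: only the two flags that
   matter for the REP check.  */
struct insn_template {
    const char *name;
    unsigned isstring:1;
    unsigned repprefixok:1;
};

struct template_range {
    const struct insn_template *start;
    const struct insn_template *end;   /* one past the last template */
};

/* After a 'rep' prefix, narrow *range to the first run of templates that
   may carry it.  Returns 1 on success, 0 if no template qualifies (the
   case where the assembler reports an error).  */
static int narrow_for_rep(struct template_range *range)
{
    const struct insn_template *t;

    for (t = range->start; t < range->end; ++t)
        if (t->isstring || t->repprefixok)
            break;
    if (t >= range->end)
        return 0;

    range->start = t;
    for (; t < range->end; ++t)
        if (!t->isstring && !t->repprefixok)
            break;
    range->end = t;
    return 1;
}
```

With bsf and bsr marked repprefixok, their templates survive the
narrowing just as string-instruction templates do, while a mnemonic such
as bt still has no qualifying template and is rejected.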

With this, I can change the compiler to use this syntax (when configured
with a new assembler) and remove the possibility of running afoul of
.bundle_align_mode nop-insertion.

No testsuite failures on an x86_64-linux-gnu host.


Ok for trunk?

(I've omitted the large patch to the generated file opcodes/i386-tbl.h,
so use --enable-maintainer-mode to test the patch.)


Thanks,
Roland


gas/
2012-06-21  Roland McGrath  <mcgrathr@google.com>

	* config/tc-i386.c (parse_insn): Don't complain about REP prefix
	when the template has opcode_modifier.repprefixok set.
	* NEWS: Mention the change.

opcodes/
2012-06-21  Roland McGrath  <mcgrathr@google.com>

	* i386-opc.h (RepPrefixOk): New enum constant.
	(i386_opcode_modifier): New bitfield 'repprefixok'.
	* i386-gen.c (opcode_modifiers): Add RepPrefixOk.
	* i386-opc.tbl (bsf, bsr): Add RepPrefixOk.
	* i386-tbl.h: Regenerate.

diff --git a/gas/NEWS b/gas/NEWS
index 6b6dbba..6f62b93 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -13,6 +13,8 @@
 
 * Add support for the Adapteva EPIPHANY architecture.
 
+* For x86, allow 'rep bsf' or 'rep bsr' syntax.
+
 Changes in 2.22:
 
 * Add support for the Tilera TILEPro and TILE-Gx architectures.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index d2b4927..e3cdf71 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -3534,7 +3534,7 @@ skip:
       static templates override;
 
       for (t = current_templates->start; t < current_templates->end; ++t)
-	if (t->opcode_modifier.isstring)
+	if (t->opcode_modifier.isstring || t->opcode_modifier.repprefixok)
 	  break;
       if (t >= current_templates->end)
 	{
@@ -3543,7 +3543,7 @@ skip:
 	  return NULL;
 	}
       for (override.start = t; t < current_templates->end; ++t)
-	if (!t->opcode_modifier.isstring)
+	if (!t->opcode_modifier.isstring && !t->opcode_modifier.repprefixok)
 	  break;
       override.end = t;
       current_templates = &override;
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 21f600f..9386a97 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -1,4 +1,4 @@
-/* Copyright 2007, 2008, 2009, 2010, 2011
+/* Copyright 2007, 2008, 2009, 2010, 2011, 2012
    Free Software Foundation, Inc.
 
    This file is part of the GNU opcodes library.
@@ -394,6 +394,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (No_ldSuf),
   BITFIELD (FWait),
   BITFIELD (IsString),
+  BITFIELD (RepPrefixOk),
   BITFIELD (IsLockable),
   BITFIELD (RegKludge),
   BITFIELD (FirstXmm0),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index f130a96..f761770 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -1,5 +1,5 @@
 /* Declarations for Intel 80386 opcode table
-   Copyright 2007, 2008, 2009, 2010
+   Copyright 2007, 2008, 2009, 2010, 2012
    Free Software Foundation, Inc.
 
    This file is part of the GNU opcodes library.
@@ -290,6 +290,8 @@ enum
   FWait,
   /* quick test for string instructions */
   IsString,
+  /* additional instruction on which a "rep" prefix is acceptable */
+  RepPrefixOk,
   /* quick test for lockable instructions */
   IsLockable,
   /* fake an extra reg operand for clr, imul and special register
@@ -438,6 +440,7 @@ typedef struct i386_opcode_modifier
   unsigned int no_ldsuf:1;
   unsigned int fwait:1;
   unsigned int isstring:1;
+  unsigned int repprefixok:1;
   unsigned int islockable:1;
   unsigned int regkludge:1;
   unsigned int firstxmm0:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 8a43b51..8a25e15 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1,5 +1,5 @@
 // i386 opcode table.
-// Copyright 2007, 2008, 2009, 2010
+// Copyright 2007, 2008, 2009, 2010, 2012
 // Free Software Foundation, Inc.
 //
 // This file is part of the GNU opcodes library.
@@ -475,8 +475,8 @@ xlat, 0, 0xd7, None, 1, 0, No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, {
 xlat, 1, 0xd7, None, 1, 0, No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, { Byte|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
 
 // Bit manipulation.
-bsf, 2, 0xfbc, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, Reg16|Reg32|Reg64 }
-bsr, 2, 0xfbd, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, Reg16|Reg32|Reg64 }
+bsf, 2, 0xfbc, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|RepPrefixOk, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, Reg16|Reg32|Reg64 }
+bsr, 2, 0xfbd, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|RepPrefixOk, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, Reg16|Reg32|Reg64 }
 bt, 2, 0xfa3, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
 bt, 2, 0xfba, 0x4, 2, Cpu386, Modrm|No_bSuf|No_sSuf|No_ldSuf, { Imm8, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
 btc, 2, 0xfbb, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
       [not found] ` <CAMe9rOrZ0rJs13QBAgY+Tx4T=kN+1twUpC7y62MwyPJ47m=8nA@mail.gmail.com>
@ 2012-06-21 22:08   ` Roland McGrath
  0 siblings, 0 replies; 12+ messages in thread
From: Roland McGrath @ 2012-06-21 22:08 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On Thu, Jun 21, 2012 at 2:48 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> I thought the gcc patch was reverted since the assumption was wrong.

No, it's still there.  What you may be thinking of is that it was
originally doing "rep; bsf" for all situations other than -mbmi or -Os.
The next day it was changed so it uses "rep; bsf" only for -mtune=generic,
after Jakub pointed out that "rep; bsf" was never desirable for some
particular tuning for a chip known not to support tzcnt.  I haven't seen
anybody refute the claim that tzcnt is "usually faster than bsf", though
that claim was only made in passing on gcc-patches (by Uros, I think) and
I haven't seen any references to substantiate it.

If you can make the case that it's never disadvantageous to use bsf on a
machine that also supports tzcnt, and thus have this GCC change removed,
then I'll be happy to drop the assembler change.


Thanks,
Roland


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-06-21 21:36 [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions Roland McGrath
       [not found] ` <CAMe9rOrZ0rJs13QBAgY+Tx4T=kN+1twUpC7y62MwyPJ47m=8nA@mail.gmail.com>
@ 2012-06-21 23:01 ` H.J. Lu
  2012-06-22 14:29   ` H.J. Lu
  2012-07-06  9:51 ` Jan Beulich
  2 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2012-06-21 23:01 UTC (permalink / raw)
  To: Roland McGrath; +Cc: binutils

On Thu, Jun 21, 2012 at 2:36 PM, Roland McGrath <mcgrathr@google.com> wrote:
> 'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt ...'.
>
> When not doing -mbmi, GCC (trunk) like to emit 'rep; bsf ...' on the
> theory that since the two instructions have sufficiently similar
> semantics for the purposes for which the compiler emits this,
> BMI-capable hardware will decode it as 'tzcnt ...' and may execute that
> faster than 'bsf ...', while older hardware will ignore the REP prefix
> and decode it as 'bsf ...'.
>
> When using .bundle_align_mode, the assembler might decide to insert some
> nop padding between any two instructions, so the ';' could become some
> number of nop instructions and break the encoding intended.
>
> This change makes the assembler accept 'rep bsf ...' or 'rep bsr ...'
> without complaint.  The result is the same as using the 'tzcnt' or
> 'lzcnt' mnemonic, but the 'rep' forms are accepted even under
> -march=i686 or the like where 'tzcnt' and 'lzcnt' would be refused.
>
> With this, I can change the compiler to use this syntax (when configured
> with a new assembler) and remove the possibility of running afoul of
> .bundle_align_mode nop-insertion.
>
> No testsuite failures on an x86_64-linux-gnu host.
>
>
> Ok for trunk?
>
> (I've omitted the large patch to the generated file opcodes/i386-tbl.h,
> so use --enable-maintainer-mode to test the patch.)
>
>

Please make the following changes:

1. Move RepPrefixOk next to HLEPrefixOk.
2. Add RepPrefixOk to all insns with IsString
3. Check repprefixok instead of isstring for REP prefix in
gas.

OK with those changes.
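
Change 3 is only safe once change 2 is complete: if gas tests
repprefixok alone, every string instruction must carry that flag too. A
small C sketch of that invariant follows. The table_entry type, the MOD_*
constants and first_string_insn_without_rep are hypothetical names for
this illustration and do not exist in the opcodes sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical modifier bits standing in for IsString and RepPrefixOk.  */
enum {
    MOD_ISSTRING    = 1 << 0,
    MOD_REPPREFIXOK = 1 << 1
};

struct table_entry {
    const char *name;
    unsigned modifiers;
};

/* Return the index of the first entry that has IsString but lacks
   RepPrefixOk, or n if every string instruction also allows 'rep'.  */
static size_t first_string_insn_without_rep(const struct table_entry *table,
                                            size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        unsigned m = table[i].modifiers;
        if ((m & MOD_ISSTRING) && !(m & MOD_REPPREFIXOK))
            return i;
    }
    return n;
}
```

A table in which the function returns n can be checked with the single
flag without rejecting any 'rep' use that was accepted before.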

Thanks.


-- 
H.J.


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-06-21 23:01 ` H.J. Lu
@ 2012-06-22 14:29   ` H.J. Lu
  2012-06-22 16:43     ` Roland McGrath
  0 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2012-06-22 14:29 UTC (permalink / raw)
  To: Roland McGrath; +Cc: binutils

On Thu, Jun 21, 2012 at 4:01 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Jun 21, 2012 at 2:36 PM, Roland McGrath <mcgrathr@google.com> wrote:
>> 'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt ...'.
>>
>> When not doing -mbmi, GCC (trunk) like to emit 'rep; bsf ...' on the
>> theory that since the two instructions have sufficiently similar
>> semantics for the purposes for which the compiler emits this,
>> BMI-capable hardware will decode it as 'tzcnt ...' and may execute that
>> faster than 'bsf ...', while older hardware will ignore the REP prefix
>> and decode it as 'bsf ...'.
>>
>> When using .bundle_align_mode, the assembler might decide to insert some
>> nop padding between any two instructions, so the ';' could become some
>> number of nop instructions and break the encoding intended.
>>
>> This change makes the assembler accept 'rep bsf ...' or 'rep bsr ...'
>> without complaint.  The result is the same as using the 'tzcnt' or
>> 'lzcnt' mnemonic, but the 'rep' forms are accepted even under
>> -march=i686 or the like where 'tzcnt' and 'lzcnt' would be refused.
>>
>> With this, I can change the compiler to use this syntax (when configured
>> with a new assembler) and remove the possibility of running afoul of
>> .bundle_align_mode nop-insertion.
>>
>> No testsuite failures on an x86_64-linux-gnu host.
>>
>>
>> Ok for trunk?
>>
>> (I've omitted the large patch to the generated file opcodes/i386-tbl.h,
>> so use --enable-maintainer-mode to test the patch.)
>>
>>
>
> Please make following changes:
>
> 1. Move RepPrefixOk next to HLEPrefixOk.
> 2. Add RepPrefixOk to all insns with IsString
> 3. Check repprefixok instead of isstring for REP prefix in
> gas.
>
> OK with those changes.
>
> Thanks.

Please also add a testcase for each change.

Thanks.

-- 
H.J.


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-06-22 14:29   ` H.J. Lu
@ 2012-06-22 16:43     ` Roland McGrath
  2012-06-22 16:46       ` H.J. Lu
  0 siblings, 1 reply; 12+ messages in thread
From: Roland McGrath @ 2012-06-22 16:43 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

Here is what I've committed.


Thanks,
Roland


gas/
2012-06-22  Roland McGrath  <mcgrathr@google.com>

	* config/tc-i386.c (parse_insn): Don't complain about REP prefix
	when the template has opcode_modifier.repprefixok set.
	* NEWS: Mention the change.

gas/testsuite/
2012-06-22  Roland McGrath  <mcgrathr@google.com>

	* gas/i386/rep-bsf.d: New file.
	* gas/i386/rep-bsf.s: New file.
	* gas/i386/i386.exp: Add the new test.

opcodes/
2012-06-22  Roland McGrath  <mcgrathr@google.com>

	* i386-opc.h (RepPrefixOk): New enum constant.
	(i386_opcode_modifier): New bitfield 'repprefixok'.
	* i386-gen.c (opcode_modifiers): Add RepPrefixOk.
	* i386-opc.tbl: Add RepPrefixOk to bsf, bsr, and to all
	instructions that have IsString.
	* i386-tbl.h: Regenerate.


diff --git a/gas/NEWS b/gas/NEWS
index 6b6dbba..6f62b93 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -13,6 +13,8 @@

 * Add support for the Adapteva EPIPHANY architecture.

+* For x86, allow 'rep bsf' or 'rep bsr' syntax.
+
 Changes in 2.22:

 * Add support for the Tilera TILEPro and TILE-Gx architectures.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index d2b4927..4701e6d 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -3528,13 +3528,13 @@ skip:
       as_warn (_("use .code16 to ensure correct addressing mode"));
     }

-  /* Check for rep/repne without a string instruction.  */
+  /* Check for rep/repne without a string (or other allowed) instruction.  */
   if (expecting_string_instruction)
     {
       static templates override;

       for (t = current_templates->start; t < current_templates->end; ++t)
-	if (t->opcode_modifier.isstring)
+	if (t->opcode_modifier.repprefixok)
 	  break;
       if (t >= current_templates->end)
 	{
@@ -3543,7 +3543,7 @@ skip:
 	  return NULL;
 	}
       for (override.start = t; t < current_templates->end; ++t)
-	if (!t->opcode_modifier.isstring)
+	if (!t->opcode_modifier.repprefixok)
 	  break;
       override.end = t;
       current_templates = &override;
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 89adf69..2f05580 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -264,6 +264,7 @@ if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]

     # Miscellaneous tests.
     run_dump_test "pr12589-1"
+    run_dump_test "rep-bsf"

     set ASFLAGS "$old_ASFLAGS"
 }
diff --git a/gas/testsuite/gas/i386/rep-bsf.d b/gas/testsuite/gas/i386/rep-bsf.d
new file mode 100644
index 0000000..c0a47dd
--- /dev/null
+++ b/gas/testsuite/gas/i386/rep-bsf.d
@@ -0,0 +1,11 @@
+#objdump: -d
+#name: rep prefix on bsf/bsr
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+000 <bsf>:
+\s*[0-9]+:\s+f3 0f bc c1\s+tzcnt  %ecx,%eax
+0+004 <bsr>:
+\s*[0-9]+:\s+f3 0f bd c1\s+lzcnt  %ecx,%eax
diff --git a/gas/testsuite/gas/i386/rep-bsf.s b/gas/testsuite/gas/i386/rep-bsf.s
new file mode 100644
index 0000000..10a12aa
--- /dev/null
+++ b/gas/testsuite/gas/i386/rep-bsf.s
@@ -0,0 +1,3 @@
+	.text
+bsf:	rep bsf %ecx, %eax
+bsr:	rep bsr %ecx, %eax
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 21f600f..add410b 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -1,4 +1,4 @@
-/* Copyright 2007, 2008, 2009, 2010, 2011
+/* Copyright 2007, 2008, 2009, 2010, 2011, 2012
    Free Software Foundation, Inc.

    This file is part of the GNU opcodes library.
@@ -398,6 +398,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (RegKludge),
   BITFIELD (FirstXmm0),
   BITFIELD (Implicit1stXmm0),
+  BITFIELD (RepPrefixOk),
   BITFIELD (HLEPrefixOk),
   BITFIELD (ToDword),
   BITFIELD (ToQword),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index f130a96..96bdfb5 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -1,5 +1,5 @@
 /* Declarations for Intel 80386 opcode table
-   Copyright 2007, 2008, 2009, 2010
+   Copyright 2007, 2008, 2009, 2010, 2012
    Free Software Foundation, Inc.

    This file is part of the GNU opcodes library.
@@ -309,6 +309,8 @@ enum
 #define HLEPrefixAny		2
 #define HLEPrefixRelease	3
   HLEPrefixOk,
+  /* An instruction on which a "rep" prefix is acceptable.  */
+  RepPrefixOk,
   /* Convert to DWORD */
   ToDword,
   /* Convert to QWORD */
@@ -443,6 +445,7 @@ typedef struct i386_opcode_modifier
   unsigned int firstxmm0:1;
   unsigned int implicit1stxmm0:1;
   unsigned int hleprefixok:2;
+  unsigned int repprefixok:1;
   unsigned int todword:1;
   unsigned int toqword:1;
   unsigned int addrprefixop0:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 8a43b51..b545730 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1,5 +1,5 @@
 // i386 opcode table.
-// Copyright 2007, 2008, 2009, 2010
+// Copyright 2007, 2008, 2009, 2010, 2012
 // Free Software Foundation, Inc.
 //
 // This file is part of the GNU opcodes library.
@@ -441,42 +441,42 @@ setnle, 1, 0xf9f, 0x0, 2, Cpu386,
Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf
 setg, 1, 0xf9f, 0x0, 2, Cpu386,
Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, {
Reg8|Byte|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }

 // String manipulation.
-cmps, 0, 0xa6, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-cmps, 2, 0xa6, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S
}
-scmp, 0, 0xa6, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-scmp, 2, 0xa6, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S
}
-ins, 0, 0x6c, None, 1, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|IsString, { 0 }
-ins, 2, 0x6c, None, 1, Cpu186,
W|CheckRegSize|No_sSuf|No_qSuf|No_ldSuf|IsString, { InOutPortReg,
Byte|Word|Dword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-outs, 0, 0x6e, None, 1, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|IsString, { 0 }
-outs, 2, 0x6e, None, 1, Cpu186,
W|CheckRegSize|No_sSuf|No_qSuf|No_ldSuf|IsString, {
Byte|Word|Dword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
InOutPortReg }
-lods, 0, 0xac, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-lods, 1, 0xac, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S
}
-lods, 2, 0xac, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Acc|Byte|Word|Dword|Qword }
-slod, 0, 0xac, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-slod, 1, 0xac, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S
}
-slod, 2, 0xac, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Acc|Byte|Word|Dword|Qword }
-movs, 0, 0xa4, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-movs, 2, 0xa4, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-smov, 0, 0xa4, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-smov, 2, 0xa4, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-scas, 0, 0xae, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-scas, 1, 0xae, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-scas, 2, 0xae, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg,
Acc|Byte|Word|Dword|Qword }
-ssca, 0, 0xae, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-ssca, 1, 0xae, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-ssca, 2, 0xae, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg,
Acc|Byte|Word|Dword|Qword }
-stos, 0, 0xaa, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-stos, 1, 0xaa, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-stos, 2, 0xaa, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Acc|Byte|Word|Dword|Qword,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-ssto, 0, 0xaa, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, { 0 }
-ssto, 1, 0xaa, None, 1, 0, W|No_sSuf|No_ldSuf|IsString, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-ssto, 2, 0xaa, None, 1, 0, W|CheckRegSize|No_sSuf|No_ldSuf|IsString,
{ Acc|Byte|Word|Dword|Qword,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
-xlat, 0, 0xd7, None, 1, 0,
No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, { 0 }
-xlat, 1, 0xd7, None, 1, 0,
No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, {
Byte|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
+cmps, 0, 0xa6, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+cmps, 2, 0xa6, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S
}
+scmp, 0, 0xa6, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+scmp, 2, 0xa6, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S
}
+ins, 0, 0x6c, None, 1, Cpu186,
W|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+ins, 2, 0x6c, None, 1, Cpu186,
W|CheckRegSize|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {
InOutPortReg, Byte|Word|Dword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+outs, 0, 0x6e, None, 1, Cpu186,
W|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+outs, 2, 0x6e, None, 1, Cpu186,
W|CheckRegSize|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
InOutPortReg }
+lods, 0, 0xac, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+lods, 1, 0xac, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S
}
+lods, 2, 0xac, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Acc|Byte|Word|Dword|Qword }
+slod, 0, 0xac, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+slod, 1, 0xac, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S
}
+slod, 2, 0xac, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Acc|Byte|Word|Dword|Qword }
+movs, 0, 0xa4, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+movs, 2, 0xa4, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+smov, 0, 0xa4, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+smov, 2, 0xa4, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+scas, 0, 0xae, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+scas, 1, 0xae, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+scas, 2, 0xae, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg,
Acc|Byte|Word|Dword|Qword }
+ssca, 0, 0xae, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+ssca, 1, 0xae, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+ssca, 2, 0xae, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg,
Acc|Byte|Word|Dword|Qword }
+stos, 0, 0xaa, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+stos, 1, 0xaa, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+stos, 2, 0xaa, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Acc|Byte|Word|Dword|Qword,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+ssto, 0, 0xaa, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+ssto, 1, 0xaa, None, 1, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+ssto, 2, 0xaa, None, 1, 0,
W|CheckRegSize|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {
Acc|Byte|Word|Dword|Qword,
Byte|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg
}
+xlat, 0, 0xd7, None, 1, 0,
No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
+xlat, 1, 0xd7, None, 1, 0,
No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {
Byte|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }

 // Bit manipulation.
-bsf, 2, 0xfbc, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, Reg16|Reg32|Reg64 }
-bsr, 2, 0xfbd, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, Reg16|Reg32|Reg64 }
+bsf, 2, 0xfbc, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|RepPrefixOk, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, Reg16|Reg32|Reg64 }
+bsr, 2, 0xfbd, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|RepPrefixOk, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, Reg16|Reg32|Reg64 }
 bt, 2, 0xfa3, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
 bt, 2, 0xfba, 0x4, 2, Cpu386, Modrm|No_bSuf|No_sSuf|No_ldSuf, { Imm8, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
 btc, 2, 0xfbb, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
@@ -1371,8 +1371,8 @@ cmpunordsd, 2, 0xf20fc2, 0x3, 2, CpuSSE2,
Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lS
 cmppd, 3, 0x66c2, None, 1, CpuAVX,
Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX,
{ Imm8, Xmmword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|RegXMM,
RegXMM }
 cmppd, 3, 0x660fc2, None, 2, CpuSSE2,
Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, {
Imm8, Xmmword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|RegXMM,
RegXMM }
 // Intel mode string compare.
-cmpsd, 0, 0xa7, None, 1, 0,
Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, { 0
}
-cmpsd, 2, 0xa7, None, 1, 0,
Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, {
Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg }
+cmpsd, 0, 0xa7, None, 1, 0,
Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk,
{ 0 }
+cmpsd, 2, 0xa7, None, 1, 0,
Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk,
{ Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg }
 cmpsd, 3, 0xf2c2, None, 1, CpuAVX,
Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX,
{ Imm8, Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|RegXMM,
RegXMM }
 cmpsd, 3, 0xf20fc2, None, 2, CpuSSE2,
Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, {
Imm8, Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|RegXMM,
RegXMM }
 comisd, 2, 0x662f, None, 1, CpuAVX,
Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX,
{ Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|RegXMM,
RegXMM }
@@ -1411,8 +1411,8 @@ movmskpd, 2, 0x660f50, None, 2, CpuSSE2,
Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSu
 movntpd, 2, 0x662b, None, 1, CpuAVX,
Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX,
{ RegXMM, Xmmword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
 movntpd, 2, 0x660f2b, None, 2, CpuSSE2,
Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, {
RegXMM, Xmmword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
 // Intel mode string move.
-movsd, 0, 0xa5, None, 1, 0,
Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, { 0
}
-movsd, 2, 0xa5, None, 1, 0,
Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, {
Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg }
+movsd, 0, 0xa5, None, 1, 0,
Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk,
{ 0 }
+movsd, 2, 0xa5, None, 1, 0,
Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk,
{ Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|EsSeg }
 movsd, 2, 0xf211, None, 1, CpuAVX,
Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX,
{ RegXMM, Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
 movsd, 2, 0xf210, None, 1, CpuAVX,
Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX,
{ Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S, RegXMM }
 movsd, 2, 0xf210, None, 1, CpuAVX,
S|Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX,
{ RegXMM, RegXMM }
@@ -3035,21 +3035,21 @@ popcnt, 2, 0xf30fb8, None, 2,
CpuABM|CpuSSE4_2, Modrm|CheckRegSize|No_bSuf|No_sS
 lzcnt, 2, 0xf30fbd, None, 2, CpuABM|CpuLZCNT,
Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, {
Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S,
Reg16|Reg32|Reg64 }

 // VIA PadLock extensions.
-xstore-rng, 0, 0xfa7, 0xc0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcrypt-ecb, 0, 0xf30fa7, 0xc8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcrypt-cbc, 0, 0xf30fa7, 0xd0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcrypt-ctr, 0, 0xf30fa7, 0xd8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcrypt-cfb, 0, 0xf30fa7, 0xe0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcrypt-ofb, 0, 0xf30fa7, 0xe8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-montmul, 0, 0xf30fa6, 0xc0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xsha1, 0, 0xf30fa6, 0xc8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xsha256, 0, 0xf30fa6, 0xd0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
+xstore-rng, 0, 0xfa7, 0xc0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcrypt-ecb, 0, 0xf30fa7, 0xc8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcrypt-cbc, 0, 0xf30fa7, 0xd0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcrypt-ctr, 0, 0xf30fa7, 0xd8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcrypt-cfb, 0, 0xf30fa7, 0xe0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcrypt-ofb, 0, 0xf30fa7, 0xe8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+montmul, 0, 0xf30fa6, 0xc0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xsha1, 0, 0xf30fa6, 0xc8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xsha256, 0, 0xf30fa6, 0xd0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
 // Aliases without hyphens.
-xstorerng, 0, 0xfa7, 0xc0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcryptecb, 0, 0xf30fa7, 0xc8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcryptcbc, 0, 0xf30fa7, 0xd0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcryptctr, 0, 0xf30fa7, 0xd8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcryptcfb, 0, 0xf30fa7, 0xe0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
-xcryptofb, 0, 0xf30fa7, 0xe8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
+xstorerng, 0, 0xfa7, 0xc0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcryptecb, 0, 0xf30fa7, 0xc8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcryptcbc, 0, 0xf30fa7, 0xd0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcryptctr, 0, 0xf30fa7, 0xd8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcryptcfb, 0, 0xf30fa7, 0xe0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
+xcryptofb, 0, 0xf30fa7, 0xe8, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }
 // Alias for xstore-rng.
-xstore, 0, 0xfa7, 0xc0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|ImmExt, { 0
}
+xstore, 0, 0xfa7, 0xc0, 2, CpuPadLock,
No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk|ImmExt,
{ 0 }


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-06-22 16:43     ` Roland McGrath
@ 2012-06-22 16:46       ` H.J. Lu
  0 siblings, 0 replies; 12+ messages in thread
From: H.J. Lu @ 2012-06-22 16:46 UTC (permalink / raw)
  To: Roland McGrath; +Cc: binutils

On Fri, Jun 22, 2012 at 9:43 AM, Roland McGrath <mcgrathr@google.com> wrote:
> Here is what I've committed.
>
>
> Thanks,
> Roland
>
>
> gas/
> 2012-06-22  Roland McGrath  <mcgrathr@google.com>
>
>        * config/tc-i386.c (parse_insn): Don't complain about REP prefix
>        when the template has opcode_modifier.repprefixok set.
>        * NEWS: Mention the change.
>
> gas/testsuite/
> 2012-06-22  Roland McGrath  <mcgrathr@google.com>
>
>        * gas/i386/rep-bsf.d: New file.
>        * gas/i386/rep-bsf.s: New file.
>        * gas/i386/i386.exp: Add the new test.
>
> opcodes/
> 2012-06-22  Roland McGrath  <mcgrathr@google.com>
>
>        * i386-opc.h (RepPrefixOk): New enum constant.
>        (i386_opcode_modifier): New bitfield 'repprefixok'.
>        * i386-gen.c (opcode_modifiers): Add RepPrefixOk.
>        * i386-opc.tbl: Add RepPrefixOk to bsf, bsr, and to all
>        instructions that have IsString.
>        * i386-tbl.h: Regenerate.
>

Perfect.   Thanks.

-- 
H.J.


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-06-21 21:36 [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions Roland McGrath
       [not found] ` <CAMe9rOrZ0rJs13QBAgY+Tx4T=kN+1twUpC7y62MwyPJ47m=8nA@mail.gmail.com>
  2012-06-21 23:01 ` H.J. Lu
@ 2012-07-06  9:51 ` Jan Beulich
  2012-07-06 13:07   ` H.J. Lu
  2 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2012-07-06  9:51 UTC (permalink / raw)
  To: Roland McGrath; +Cc: binutils

>>> On 21.06.12 at 23:36, Roland McGrath <mcgrathr@google.com> wrote:
> 'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt 
> ...'.

While tzcnt really is an extension of bsf, lzcnt is not one of bsr,
so I don't really follow why it's useful to allow the prefix there.

Jan

> When not doing -mbmi, GCC (trunk) like to emit 'rep; bsf ...' on the
> theory that since the two instructions have sufficiently similar
> semantics for the purposes for which the compiler emits this,
> BMI-capable hardware will decode it as 'tzcnt ...' and may execute that
> faster than 'bsf ...', while older hardware will ignore the REP prefix
> and decode it as 'bsf ...'.
> 
> When using .bundle_align_mode, the assembler might decide to insert some
> nop padding between any two instructions, so the ';' could become some
> number of nop instructions and break the encoding intended.
> 
> This change makes the assembler accept 'rep bsf ...' or 'rep bsr ...'
> without complaint.  The result is the same as using the 'tzcnt' or
> 'lzcnt' mnemonic, but the 'rep' forms are accepted even under
> -march=i686 or the like where 'tzcnt' and 'lzcnt' would be refused.
> 
> With this, I can change the compiler to use this syntax (when configured
> with a new assembler) and remove the possibility of running afoul of
> .bundle_align_mode nop-insertion.
> 
> No testsuite failures on an x86_64-linux-gnu host.
> 
> 
> Ok for trunk?
> 
> (I've omitted the large patch to the generated file opcodes/i386-tbl.h,
> so use --enable-maintainer-mode to test the patch.)
> 
> 
> Thanks,
> Roland
> 
> 
> gas/
> 2012-06-21  Roland McGrath  <mcgrathr@google.com>
> 
> 	* config/tc-i386.c (parse_insn): Don't complain about REP prefix
> 	when the template has opcode_modifier.repprefixok set.
> 	* NEWS: Mention the change.
> 
> opcodes/
> 2012-06-21  Roland McGrath  <mcgrathr@google.com>
> 
> 	* i386-opc.h (RepPrefixOk): New enum constant.
> 	(i386_opcode_modifier): New bitfield 'repprefixok'.
> 	* i386-gen.c (opcode_modifiers): Add RepPrefixOk.
> 	* i386-opc.tbl (bsf, bsr): Add RepPrefixOk.
> 	* i386-tbl.h: Regenerate.
> 
> diff --git a/gas/NEWS b/gas/NEWS
> index 6b6dbba..6f62b93 100644
> --- a/gas/NEWS
> +++ b/gas/NEWS
> @@ -13,6 +13,8 @@
>  
>  * Add support for the Adapteva EPIPHANY architecture.
>  
> +* For x86, allow 'rep bsf' or 'rep bsr' syntax.
> +
>  Changes in 2.22:
>  
>  * Add support for the Tilera TILEPro and TILE-Gx architectures.
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index d2b4927..e3cdf71 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -3534,7 +3534,7 @@ skip:
>        static templates override;
>  
>        for (t = current_templates->start; t < current_templates->end; ++t)
> -	if (t->opcode_modifier.isstring)
> +	if (t->opcode_modifier.isstring || t->opcode_modifier.repprefixok)
>  	  break;
>        if (t >= current_templates->end)
>  	{
> @@ -3543,7 +3543,7 @@ skip:
>  	  return NULL;
>  	}
>        for (override.start = t; t < current_templates->end; ++t)
> -	if (!t->opcode_modifier.isstring)
> +	if (!t->opcode_modifier.isstring && !t->opcode_modifier.repprefixok)
>  	  break;
>        override.end = t;
>        current_templates = &override;
> diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> index 21f600f..9386a97 100644
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -1,4 +1,4 @@
> -/* Copyright 2007, 2008, 2009, 2010, 2011
> +/* Copyright 2007, 2008, 2009, 2010, 2011, 2012
>     Free Software Foundation, Inc.
>  
>     This file is part of the GNU opcodes library.
> @@ -394,6 +394,7 @@ static bitfield opcode_modifiers[] =
>    BITFIELD (No_ldSuf),
>    BITFIELD (FWait),
>    BITFIELD (IsString),
> +  BITFIELD (RepPrefixOk),
>    BITFIELD (IsLockable),
>    BITFIELD (RegKludge),
>    BITFIELD (FirstXmm0),
> diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> index f130a96..f761770 100644
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -1,5 +1,5 @@
>  /* Declarations for Intel 80386 opcode table
> -   Copyright 2007, 2008, 2009, 2010
> +   Copyright 2007, 2008, 2009, 2010, 2012
>     Free Software Foundation, Inc.
>  
>     This file is part of the GNU opcodes library.
> @@ -290,6 +290,8 @@ enum
>    FWait,
>    /* quick test for string instructions */
>    IsString,
> +  /* additional instruction on which a "rep" prefix is acceptable */
> +  RepPrefixOk,
>    /* quick test for lockable instructions */
>    IsLockable,
>    /* fake an extra reg operand for clr, imul and special register
> @@ -438,6 +440,7 @@ typedef struct i386_opcode_modifier
>    unsigned int no_ldsuf:1;
>    unsigned int fwait:1;
>    unsigned int isstring:1;
> +  unsigned int repprefixok:1;
>    unsigned int islockable:1;
>    unsigned int regkludge:1;
>    unsigned int firstxmm0:1;
> diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> index 8a43b51..8a25e15 100644
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -1,5 +1,5 @@
>  // i386 opcode table.
> -// Copyright 2007, 2008, 2009, 2010
> +// Copyright 2007, 2008, 2009, 2010, 2012
>  // Free Software Foundation, Inc.
>  //
>  // This file is part of the GNU opcodes library.
> @@ -475,8 +475,8 @@ xlat, 0, 0xd7, None, 1, 0, 
> No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, {
>  xlat, 1, 0xd7, None, 1, 0, 
> No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, { 
> Byte|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S }
>  
>  // Bit manipulation.
> -bsf, 2, 0xfbc, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, 
> { 
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|
> Disp32S, Reg16|Reg32|Reg64 }
> -bsr, 2, 0xfbd, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, 
> { 
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|
> Disp32S, Reg16|Reg32|Reg64 }
> +bsf, 2, 0xfbc, None, 2, Cpu386, 
> Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|RepPrefixOk, { 
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|
> Disp32S, Reg16|Reg32|Reg64 }
> +bsr, 2, 0xfbd, None, 2, Cpu386, 
> Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|RepPrefixOk, { 
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|
> Disp32S, Reg16|Reg32|Reg64 }
>  bt, 2, 0xfa3, None, 2, Cpu386, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, 
> { Reg16|Reg32|Reg64, 
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|
> Disp32S }
>  bt, 2, 0xfba, 0x4, 2, Cpu386, Modrm|No_bSuf|No_sSuf|No_ldSuf, { Imm8, 
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|
> Disp32S }
>  btc, 2, 0xfbb, None, 2, Cpu386, 
> Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { 
> Reg16|Reg32|Reg64, 
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|
> Disp32S }



* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-07-06  9:51 ` Jan Beulich
@ 2012-07-06 13:07   ` H.J. Lu
  2012-07-06 13:47     ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2012-07-06 13:07 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roland McGrath, binutils

On Fri, Jul 6, 2012 at 2:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 21.06.12 at 23:36, Roland McGrath <mcgrathr@google.com> wrote:
>> 'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt
>> ...'.
>
> While tzcnt really is an extension of bsf, lzcnt is not one of bsr,
> so I don't really follow why it's useful to allow the prefix there.
>

It can be used to encode lzcnt with the older assembler,
similar to pause.


-- 
H.J.


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-07-06 13:07   ` H.J. Lu
@ 2012-07-06 13:47     ` Jan Beulich
  2012-07-06 14:10       ` H.J. Lu
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2012-07-06 13:47 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Roland McGrath, binutils

>>> On 06.07.12 at 15:06, "H.J. Lu" <hjl.tools@gmail.com> wrote:
> On Fri, Jul 6, 2012 at 2:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 21.06.12 at 23:36, Roland McGrath <mcgrathr@google.com> wrote:
>>> 'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt
>>> ...'.
>>
>> While tzcnt really is an extension of bsf, lzcnt is not one of bsr,
>> so I don't really follow why it's useful to allow the prefix there.
>>
> 
> It can be used to encode lzcnt with the older assembler,
> similar to pause.

You didn't get my point: rep bsr, for non-zero input, is _not_
equivalent to bsr (this equivalence exists only for bsf/tzcnt,
no matter what the Intel SDM says). Hence allowing this
combination is potentially misleading, and not useful to gcc
(which was the supposed original goal of the change).

On an older assembler (which would then also not support
bundle align mode) rep;bsr could be used equally well.

Jan


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-07-06 13:47     ` Jan Beulich
@ 2012-07-06 14:10       ` H.J. Lu
  2012-07-06 14:25         ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2012-07-06 14:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roland McGrath, binutils

On Fri, Jul 6, 2012 at 6:47 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 06.07.12 at 15:06, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>> On Fri, Jul 6, 2012 at 2:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 21.06.12 at 23:36, Roland McGrath <mcgrathr@google.com> wrote:
>>>> 'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt
>>>> ...'.
>>>
>>> While tzcnt really is an extension of bsf, lzcnt is not one of bsr,
>>> so I don't really follow why it's useful to allow the prefix there.
>>>
>>
>> It can be used to encode lzcnt with the older assembler,
>> similar to pause.
>
> You didn't get my point: rep bsr, for non-zero input, is _not_
> equivalent to bsr (this equivalence exists only for bsf/tzcnt,
> no matter what the Intel SDM says). Hence allowing this
> combination is potentially misleading, and not useful to gcc
> (which was the supposed original goal of the change).
>

lzcnt is executed as bsr on the older processors.

-- 
H.J.


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-07-06 14:10       ` H.J. Lu
@ 2012-07-06 14:25         ` Jan Beulich
  2012-07-06 14:57           ` H.J. Lu
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2012-07-06 14:25 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Roland McGrath, binutils

>>> On 06.07.12 at 16:09, "H.J. Lu" <hjl.tools@gmail.com> wrote:
> On Fri, Jul 6, 2012 at 6:47 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 06.07.12 at 15:06, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>> On Fri, Jul 6, 2012 at 2:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 21.06.12 at 23:36, Roland McGrath <mcgrathr@google.com> wrote:
>>>>> 'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt
>>>>> ...'.
>>>>
>>>> While tzcnt really is an extension of bsf, lzcnt is not one of bsr,
>>>> so I don't really follow why it's useful to allow the prefix there.
>>>>
>>>
>>> It can be used to encode lzcnt with the older assembler,
>>> similar to pause.
>>
>> You didn't get my point: rep bsr, for non-zero input, is _not_
>> equivalent to bsr (this equivalence exists only for bsf/tzcnt,
>> no matter what the Intel SDM says). Hence allowing this
>> combination is potentially misleading, and not useful to gcc
>> (which was the supposed original goal of the change).
>>
> 
> lzcnt is executed as bsr on the older processors.

That is exactly my point: It will, even for non-zero inputs,
produce different results depending on whether it is being
run on an old or a new CPU. Hence it's not suitable for
trickery like can be played with the bsf/tzcnt pair.

And just to be clear - the fact that this being possible was
implicitly (but wrongly) suggested by the original description
of the patch here made me almost fall into that trap and try
to use tzcnt as (presumably better performing) replacement
for bsr somewhere else.

Jan


* Re: [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions
  2012-07-06 14:25         ` Jan Beulich
@ 2012-07-06 14:57           ` H.J. Lu
  0 siblings, 0 replies; 12+ messages in thread
From: H.J. Lu @ 2012-07-06 14:57 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roland McGrath, binutils

On Fri, Jul 6, 2012 at 7:25 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 06.07.12 at 16:09, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>> On Fri, Jul 6, 2012 at 6:47 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 06.07.12 at 15:06, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>>> On Fri, Jul 6, 2012 at 2:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 21.06.12 at 23:36, Roland McGrath <mcgrathr@google.com> wrote:
>>>>>> 'rep; bsf ...'/'rep; bsr ...' are encoded the same as 'tzcnt ...'/'lzcnt
>>>>>> ...'.
>>>>>
>>>>> While tzcnt really is an extension of bsf, lzcnt is not one of bsr,
>>>>> so I don't really follow why it's useful to allow the prefix there.
>>>>>
>>>>
>>>> It can be used to encode lzcnt with the older assembler,
>>>> similar to pause.
>>>
>>> You didn't get my point: rep bsr, for non-zero input, is _not_
>>> equivalent to bsr (this equivalence exists only for bsf/tzcnt,
>>> no matter what the Intel SDM says). Hence allowing this
>>> combination is potentially misleading, and not useful to gcc
>>> (which was the supposed original goal of the change).
>>>
>>
>> lzcnt is executed as bsr on the older processors.
>
> That is exactly my point: It will, even for non-zero inputs,
> produce different results depending on whether it is being
> run on an old or a new CPU. Hence it's not suitable for
> trickery like can be played with the bsf/tzcnt pair.
>
> And just to be clear - the fact that this being possible was
> implicitly (but wrongly) suggested by the original description
> of the patch here made me almost fall into that trap and try
> to use tzcnt as (presumably better performing) replacement
> for bsr somewhere else.
>

You can't draw this conclusion from rep XXX. Even tzcnt != bsf
on zero input.  We fixed a GCC bug recently.  I will leave it
to Roland.


-- 
H.J.


Thread overview: 12+ messages
2012-06-21 21:36 [PATCH]: x86 gas: allow 'rep' prefix on 'bsf' and 'bsr' instructions Roland McGrath
     [not found] ` <CAMe9rOrZ0rJs13QBAgY+Tx4T=kN+1twUpC7y62MwyPJ47m=8nA@mail.gmail.com>
2012-06-21 22:08   ` Roland McGrath
2012-06-21 23:01 ` H.J. Lu
2012-06-22 14:29   ` H.J. Lu
2012-06-22 16:43     ` Roland McGrath
2012-06-22 16:46       ` H.J. Lu
2012-07-06  9:51 ` Jan Beulich
2012-07-06 13:07   ` H.J. Lu
2012-07-06 13:47     ` Jan Beulich
2012-07-06 14:10       ` H.J. Lu
2012-07-06 14:25         ` Jan Beulich
2012-07-06 14:57           ` H.J. Lu
