public inbox for binutils@sourceware.org
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Binutils <binutils@sourceware.org>,
	"Lu, Hongjiu" <hongjiu.lu@intel.com>,
	 Hongtao Liu <crazylht@gmail.com>
Subject: Re: [PATCH 1/2] i386: Generate lfence with load/indirect branch/ret [CVE-2020-0551]
Date: Wed, 11 Mar 2020 09:17:01 -0700	[thread overview]
Message-ID: <CAMe9rOpJR34mmRp2nmMEZK8g12MvpErid4H1XLm55aev0wfVEA@mail.gmail.com> (raw)
In-Reply-To: <6dcb50a3-d3ad-6abc-8a5f-703373df95a1@suse.com>

[-- Attachment #1: Type: text/plain, Size: 13473 bytes --]

On Wed, Mar 11, 2020 at 3:55 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 10.03.2020 17:05, H.J. Lu wrote:
> > @@ -4311,6 +4333,291 @@ optimize_encoding (void)
> >      }
> >  }
> >
> > +/* Return non-zero for load instruction.  */
> > +
> > +static int
> > +load_insn_p (void)
> > +{
> > +  unsigned int dest;
> > +  int any_vex_p = is_any_vex_encoding (&i.tm);
> > +
> > +  if (!any_vex_p)
> > +    {
> > +      /* lea  */
> > +      if (i.tm.base_opcode == 0x8d)
> > +     return 0;
>
> Also include INVLPG, CLFLUSH etc, and maybe some prefetches here?
> (I'll mention the LEA-like MPX insns as well, but I think I can
> predict your reply.)

Hongtao, can you look into it?

> > +      /* pop  */
> > +      if ((i.tm.base_opcode & 0xfffffff8) == 0x58
>
> Mind using ~7 instead?

Changed.

> > +       || (i.tm.base_opcode == 0x8f && i.tm.extension_opcode == 0))
> > +     return 1;
>
> What about segment register POPs, POPF, POPA, ENTER, and LEAVE?

We decided that ENTER and LEAVE are safe.  Hongtao, can you look into others?

> > +      /* movs, cmps, lods, scas.  */
> > +      if ((i.tm.base_opcode >= 0xa4 && i.tm.base_opcode <= 0xa7)
> > +       || (i.tm.base_opcode >= 0xac && i.tm.base_opcode <= 0xaf))
>
> This can be had with a single comparison:
>
>       if ((i.tm.base_opcode | 0xb) == 0xaf)

Changed.

> > +     return 1;
> > +
> > +      /* outs */
> > +      if (i.tm.base_opcode == 0x6e || i.tm.base_opcode == 0x6f)
>
> And here:
>
>       if ((i.tm.base_opcode | 1) == 0x6f)
>
> Similar folding of comparisons may also be desirable further down.

Changed.
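
(For readers of the archive, a quick sketch of why the folded compares
are equivalent -- my own illustration, not part of the patch:

    0xa4 | 0xb == 0xaf    movs
    0xa6 | 0xb == 0xaf    cmps
    0xac | 0xb == 0xaf    lods
    0xae | 0xb == 0xaf    scas
    0xa8 | 0xb == 0xab    test $imm8, %al -- correctly excluded

Only the opcodes 0xa4-0xa7 and 0xac-0xaf OR to 0xaf, and likewise
(base_opcode | 1) == 0x6f matches exactly 0x6e/0x6f, the outs forms.)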

> Also, what about XLATB? What about implicit memory accesses done by

Hongtao, can you look into it?

> e.g. segment register loads? As to AMD-specific insns with implicit
> memory operands (often accessed through rAX), should the doc
> perhaps mention they're intentionally not covered?

Yes, AMD-specific insns are skipped.  Hongtao, can you look into it?

> > +     return 1;
> > +    }
> > +
> > +  /* No memory operand.  */
> > +  if (!i.mem_operands)
> > +    return 0;
> > +
> > +  if (any_vex_p)
> > +    {
> > +      /* vldmxcsr.  */
> > +      if (i.tm.base_opcode == 0xae
> > +       && i.tm.opcode_modifier.vex
> > +       && i.tm.opcode_modifier.vexopcode == VEX0F
> > +       && i.tm.extension_opcode == 2)
> > +     return 1;
> > +    }
> > +  else
> > +    {
> > +      /* test, not, neg, mul, imul, div, idiv.  */
> > +      if ((i.tm.base_opcode == 0xf6 || i.tm.base_opcode == 0xf7)
> > +       && i.tm.extension_opcode != 1)
> > +     return 1;
> > +
> > +      /* inc, dec.  */
> > +      if ((i.tm.base_opcode == 0xfe || i.tm.base_opcode == 0xff)
> > +       && i.tm.extension_opcode <= 1)
> > +     return 1;
> > +
> > +      /* add, or, adc, sbb, and, sub, xor, cmp.  */
> > +      if (i.tm.base_opcode >= 0x80 && i.tm.base_opcode <= 0x83)
> > +     return 1;
> > +
> > +      /* bt, bts, btr, btc.  */
> > +      if (i.tm.base_opcode == 0xfba
> > +       && (i.tm.extension_opcode >= 4 && i.tm.extension_opcode <= 7))
> > +     return 1;
> > +
> > +      /* rol, ror, rcl, rcr, shl/sal, shr, sar. */
> > +      if ((i.tm.base_opcode == 0xc0
> > +        || i.tm.base_opcode == 0xc1
> > +        || (i.tm.base_opcode >= 0xd0 && i.tm.base_opcode <= 0xd3))
> > +       && i.tm.extension_opcode != 6)
> > +     return 1;
> > +
> > +      /* cmpxchg8b, cmpxchg16b, xrstors.  */
> > +      if (i.tm.base_opcode == 0xfc7
> > +       && (i.tm.extension_opcode == 1 || i.tm.extension_opcode == 3))
> > +     return 1;
> > +
> > +      /* fxrstor, ldmxcsr, xrstor.  */
> > +      if (i.tm.base_opcode == 0xfae
> > +       && (i.tm.extension_opcode == 1
> > +           || i.tm.extension_opcode == 2
> > +           || i.tm.extension_opcode == 5))
> > +     return 1;
> > +
> > +      /* lgdt, lidt, lmsw.  */
> > +      if (i.tm.base_opcode == 0xf01
> > +       && (i.tm.extension_opcode == 2
> > +           || i.tm.extension_opcode == 3
> > +           || i.tm.extension_opcode == 6))
> > +     return 1;
> > +
> > +      /* vmptrld */
> > +      if (i.tm.base_opcode == 0xfc7
> > +       && i.tm.extension_opcode == 6)
> > +     return 1;
> > +
> > +      /* Check for x87 instructions.  */
> > +      if (i.tm.base_opcode >= 0xd8 && i.tm.base_opcode <= 0xdf)
> > +     {
> > +       /* Skip fst, fstp, fstenv, fstcw.  */
> > +       if (i.tm.base_opcode == 0xd9
> > +           && (i.tm.extension_opcode == 2
> > +               || i.tm.extension_opcode == 3
> > +               || i.tm.extension_opcode == 6
> > +               || i.tm.extension_opcode == 7))
> > +         return 0;
> > +
> > +       /* Skip fisttp, fist, fistp, fstp.  */
> > +       if (i.tm.base_opcode == 0xdb
> > +           && (i.tm.extension_opcode == 1
> > +               || i.tm.extension_opcode == 2
> > +               || i.tm.extension_opcode == 3
> > +               || i.tm.extension_opcode == 7))
> > +         return 0;
> > +
> > +       /* Skip fisttp, fst, fstp, fsave, fstsw.  */
> > +       if (i.tm.base_opcode == 0xdd
> > +           && (i.tm.extension_opcode == 1
> > +               || i.tm.extension_opcode == 2
> > +               || i.tm.extension_opcode == 3
> > +               || i.tm.extension_opcode == 6
> > +               || i.tm.extension_opcode == 7))
> > +         return 0;
> > +
> > +       /* Skip fisttp, fist, fistp, fbstp, fistp.  */
> > +       if (i.tm.base_opcode == 0xdf
> > +           && (i.tm.extension_opcode == 1
> > +               || i.tm.extension_opcode == 2
> > +               || i.tm.extension_opcode == 3
> > +               || i.tm.extension_opcode == 6
> > +               || i.tm.extension_opcode == 7))
> > +         return 0;
> > +
> > +       return 1;
> > +     }
> > +    }
> > +
> > +  dest = i.operands - 1;
> > +
> > +  /* Check fake imm8 operand and 3 source operands.  */
> > +  if ((i.tm.opcode_modifier.immext
> > +       || i.tm.opcode_modifier.vexsources == VEX3SOURCES)
> > +      && i.types[dest].bitfield.imm8)
> > +    dest--;
> > +
> > +  /* add, or, adc, sbb, and, sub, xor, cmp, test, xchg, xadd  */
> > +  if (!any_vex_p
> > +      && (i.tm.base_opcode == 0x0
> > +       || i.tm.base_opcode == 0x1
> > +       || i.tm.base_opcode == 0x8
> > +       || i.tm.base_opcode == 0x9
> > +       || i.tm.base_opcode == 0x10
> > +       || i.tm.base_opcode == 0x11
> > +       || i.tm.base_opcode == 0x18
> > +       || i.tm.base_opcode == 0x19
> > +       || i.tm.base_opcode == 0x20
> > +       || i.tm.base_opcode == 0x21
> > +       || i.tm.base_opcode == 0x28
> > +       || i.tm.base_opcode == 0x29
> > +       || i.tm.base_opcode == 0x30
> > +       || i.tm.base_opcode == 0x31
> > +       || i.tm.base_opcode == 0x38
> > +       || i.tm.base_opcode == 0x39
> > +       || (i.tm.base_opcode >= 0x84 && i.tm.base_opcode <= 0x87)
> > +       || i.tm.base_opcode == 0xfc0
> > +       || i.tm.base_opcode == 0xfc1))
> > +    return 1;
>
> Don't quite a few of these fit very well with ...
>

Changed.

> > +  /* Check for load instruction.  */
> > +  return (i.types[dest].bitfield.class != ClassNone
> > +       || i.types[dest].bitfield.instance == Accum);
>
> ... this generic expression? It would seem to me that only TEST
> and XCHG need special casing, for allowing either operand order.
> Same seems to apply to quite a few of the special cases in the
> big "else" block further up, and even its if() [vldmxcsr] part.

Hongtao, can you look into it?
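
(To make the point concrete for the archive -- illustrative operand
orders, not text from the patch:

    addl    (%rax), %ebx    # opcode 0x03, destination is a register, so the
                            # generic class/Accum check already flags the load
    addl    %ebx, (%rax)    # opcode 0x01, destination is memory, yet the
                            # read-modify-write still loads from (%rax)

The second form is why the 0x00/0x08/... opcode group needs the explicit
checks, and TEST/XCHG additionally accept either operand order.)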

> > +static void
> > +insert_lfence_before (void)
> > +{
> > +  char *p;
> > +
> > +  if (i.tm.base_opcode == 0xff
> > +      && (i.tm.extension_opcode == 2 || i.tm.extension_opcode == 4))
>
> Also exclude VEX- and alike encoded insn here again?

I changed it to:

static void
insert_lfence_before (void)
{
  char *p;

  if (is_any_vex_encoding (&i.tm))
    return;

> > +    {
> > +      /* Insert lfence before indirect branch if needed.  */
> > +
> > +      if (lfence_before_indirect_branch == lfence_branch_none)
> > +     return;
> > +
> > +      if (i.operands != 1)
> > +     abort ();
> > +
> > +      if (i.reg_operands == 1)
> > +     {
> > +       /* Indirect branch via register.  Don't insert lfence with
> > +          -mlfence-after-load=yes.  */
> > +       if (lfence_after_load
> > +           || lfence_before_indirect_branch == lfence_branch_memory)
> > +         return;
> > +     }
> > +      else if (i.mem_operands == 1
> > +            && lfence_before_indirect_branch != lfence_branch_register)
> > +     {
> > +       as_warn (_("indirect branch `%s` over memory should be avoided"),
> > +                i.tm.name);
>
> Perhaps drop "branch" and replace "over memory" by "with memory operand"?

Changed.

> > +       return;
> > +     }
> > +      else
> > +     return;
> > +
> > +      if (last_insn.kind != last_insn_other
> > +       && last_insn.seg == now_seg)
> > +     {
> > +       as_warn_where (last_insn.file, last_insn.line,
> > +                      _("`%s` skips -mlfence-before-indirect-branch on `%s`"),
> > +                      last_insn.name, i.tm.name);
> > +       return;
> > +     }
> > +
> > +      p = frag_more (3);
> > +      *p++ = 0xf;
> > +      *p++ = 0xae;
> > +      *p = 0xe8;
> > +      return;
> > +    }
> > +
> > +  /* Output orl/notl and lfence before ret.  */
>
> May I suggest to either drop the insn suffixes here (and below),
> or make them correctly reflect the code below (which may also
> produce q- or w-suffixed insns)?

Changed.

> > +  if (lfence_before_ret != lfence_before_ret_none
> > +      && (i.tm.base_opcode == 0xc2
> > +       || i.tm.base_opcode == 0xc3
> > +       || i.tm.base_opcode == 0xca
> > +       || i.tm.base_opcode == 0xcb))
> > +    {
> > +      if (last_insn.kind != last_insn_other
> > +       && last_insn.seg == now_seg)
> > +     {
> > +       as_warn_where (last_insn.file, last_insn.line,
> > +                      _("`%s` skips -mlfence-before-ret on `%s`"),
> > +                      last_insn.name, i.tm.name);
> > +       return;
> > +     }
> > +      if (lfence_before_ret == lfence_before_ret_or)
> > +     {
> > +       /* orl: 0x830c2400.  */
> > +       p = frag_more ((flag_code == CODE_64BIT ? 1 : 0) + 4 + 3);
> > +       if (flag_code == CODE_64BIT)
> > +         *p++ = 0x48;
>
> Shouldn't this depend on RET's operand size? Likewise wouldn't you
> also need to insert 0x66/0x67 in certain cases?

Hongtao, can you look into it?

> > +       *p++ = 0x83;
> > +       *p++ = 0xc;
> > +       *p++ = 0x24;
> > +       *p++ = 0x0;
> > +     }
> > +      else
> > +     {
> > +       p = frag_more ((flag_code == CODE_64BIT ? 2 : 0) + 6 + 3);
> > +       /* notl: 0xf71424.  */
> > +       if (flag_code == CODE_64BIT)
> > +         *p++ = 0x48;
> > +       *p++ = 0xf7;
> > +       *p++ = 0x14;
> > +       *p++ = 0x24;
> > +       if (flag_code == CODE_64BIT)
> > +         *p++ = 0x48;
> > +       /* notl: 0xf71424.  */
> > +       *p++ = 0xf7;
> > +       *p++ = 0x14;
> > +       *p++ = 0x24;
>
> When reading the description I was wondering about the use of NOT.
> I think the doc should mention that it's _two_ NOTs that get inserted,
> as this is even more growth of code size than the OR variant. Is
> there a performance reason for having this extra, more expensive (in
> terms of code size) variant? Or is it rather because of the OR
> variant clobbering EFLAGS (which ought to be called out in the doc)?
> In which case - was it considered to use e.g. SHL with an immediate
> of zero, thus having smaller code _and_ untouched EFLAGS (but of
> course requiring at least an 80186, albeit the addressing mode
> used requires a 386 anyway, which you don't seem to be checking
> anywhere)?

This is a very good suggestion.  I will talk to our people.  In the
meantime, I'd like to keep it as is, since this version has been tested
extensively.  We can change it to SHL 0 later.
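
For reference, the three candidate sequences look roughly like this in
64-bit code (my own sketch of the encodings, not assembler output):

    orq  $0, (%rsp)     # 48 83 0c 24 00 -- what =or emits; clobbers EFLAGS
    lfence

    notq (%rsp)         # 48 f7 14 24, emitted twice so the value at (%rsp)
    notq (%rsp)         # is restored; EFLAGS untouched, but 8 bytes
    lfence

    shlq $0, (%rsp)     # 48 c1 24 24 00 -- Jan's suggestion; same size as the
    lfence              # OR form, and a shift count of 0 leaves EFLAGS alone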

> Also I guess the last comment above would better move two lines up?

Changed.

> > @@ -12668,6 +12986,41 @@ md_parse_option (int c, const char *arg)
> >          as_fatal (_("invalid -mfence-as-lock-add= option: `%s'"), arg);
> >        break;
> >
> > +    case OPTION_MLFENCE_AFTER_LOAD:
> > +      if (strcasecmp (arg, "yes") == 0)
> > +     lfence_after_load = 1;
> > +      else if (strcasecmp (arg, "no") == 0)
> > +     lfence_after_load = 0;
> > +      else
> > +        as_fatal (_("invalid -mlfence-after-load= option: `%s'"), arg);
> > +      break;
> > +
> > +    case OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH:
> > +      if (strcasecmp (arg, "all") == 0)
> > +     lfence_before_indirect_branch = lfence_branch_all;
>
> I wonder whether this shouldn't also enable a safe lfence_before_ret
> mode (i.e. not the OR one), for RET also being an indirect branch. Of
> course care would need to be taken to avoid clobbering an already set
> lfence_before_ret mode.

Hongtao, can you look into it?

> > @@ -13254,6 +13616,10 @@ i386_cons_align (int ignore ATTRIBUTE_UNUSED)
> >        last_insn.kind = last_insn_directive;
> >        last_insn.name = "constant directive";
> >        last_insn.file = as_where (&last_insn.line);
> > +      if (lfence_before_ret != lfence_before_ret_none)
> > +     as_warn (_("constant directive skips -mlfence-before-ret"));
> > +      if (lfence_before_indirect_branch != lfence_branch_none)
> > +     as_warn (_("constant directive skips -mlfence-before-indirect-branch"));
>
> Could these be folded into a single warning, to avoid getting overly
> verbose?
>

Changed.

This is the patch I am checking in.

-- 
H.J.

[-- Attachment #2: 0001-i386-Generate-lfence-with-load-indirect-branch-ret-C.patch --]
[-- Type: text/x-patch, Size: 19440 bytes --]

From c4b77b8bd61dfdd17d51f4e1c9178234e1139fd6 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 10 Mar 2020 08:29:57 -0700
Subject: [PATCH] i386: Generate lfence with load/indirect branch/ret
 [CVE-2020-0551]

Add 3 command-line options to generate lfence for load, indirect near
branch and ret to help mitigate:

https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00334.html
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-0551

1. -mlfence-after-load=[no|yes]:
  -mlfence-after-load=yes generates lfence after load instructions.
2. -mlfence-before-indirect-branch=[none|all|memory|register]:
  a. -mlfence-before-indirect-branch=all generates lfence before indirect
  near branches via register and a warning before indirect near branches
  via memory.
  b. -mlfence-before-indirect-branch=memory issues a warning before
  indirect near branches via memory.
  c. -mlfence-before-indirect-branch=register generates lfence before
  indirect near branches via register.
Note that lfence won't be generated before indirect near branches via
register with -mlfence-after-load=yes since lfence will be generated
after loading branch target register.
3. -mlfence-before-ret=[none|or|not]
  a. -mlfence-before-ret=or generates or with lfence before ret.
  b. -mlfence-before-ret=not generates not with lfence before ret.

A warning will be issued and lfence won't be generated before indirect
near branch and ret if the previous item is a prefix or a constant
directive, which may be used to hardcode an instruction, since there
is no clear instruction boundary.
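
As a rough illustration of the intended effect (my own example, not part
of the testsuite; exact output depends on mode), assembling

    movq    (%rdi), %rax
    jmp     *%rax
    ret

with

    as -mlfence-after-load=yes -mlfence-before-indirect-branch=all \
       -mlfence-before-ret=not -o foo.o foo.s

conceptually becomes

    movq    (%rdi), %rax
    lfence                  # from -mlfence-after-load=yes
    jmp     *%rax           # no lfence inserted here: the branch target
                            # register was already fenced after the load
    notq    (%rsp)          # from -mlfence-before-ret=not
    notq    (%rsp)
    lfence
    ret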

	* config/tc-i386.c (lfence_after_load): New.
	(lfence_before_indirect_branch_kind): New.
	(lfence_before_indirect_branch): New.
	(lfence_before_ret_kind): New.
	(lfence_before_ret): New.
	(last_insn): New.
	(load_insn_p): New.
	(insert_lfence_after): New.
	(insert_lfence_before): New.
	(md_assemble): Call insert_lfence_before and insert_lfence_after.
	Set last_insn.
	(OPTION_MLFENCE_AFTER_LOAD): New.
	(OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH): New.
	(OPTION_MLFENCE_BEFORE_RET): New.
	(md_longopts): Add -mlfence-after-load=,
	-mlfence-before-indirect-branch= and -mlfence-before-ret=.
	(md_parse_option): Handle -mlfence-after-load=,
	-mlfence-before-indirect-branch= and -mlfence-before-ret=.
	(md_show_usage): Display -mlfence-after-load=,
	-mlfence-before-indirect-branch= and -mlfence-before-ret=.
	(i386_cons_align): New.
	* config/tc-i386.h (i386_cons_align): New.
	(md_cons_align): New.
	* doc/c-i386.texi: Document -mlfence-after-load=,
	-mlfence-before-indirect-branch= and -mlfence-before-ret=.
---
 gas/ChangeLog        |  28 ++++
 gas/config/tc-i386.c | 366 ++++++++++++++++++++++++++++++++++++++++++-
 gas/doc/c-i386.texi  |  43 +++++
 3 files changed, 436 insertions(+), 1 deletion(-)

diff --git a/gas/ChangeLog b/gas/ChangeLog
index 836cb5c6d9..d581cc3d47 100644
--- a/gas/ChangeLog
+++ b/gas/ChangeLog
@@ -1,3 +1,31 @@
+2020-03-10  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* config/tc-i386.c (lfence_after_load): New.
+	(lfence_before_indirect_branch_kind): New.
+	(lfence_before_indirect_branch): New.
+	(lfence_before_ret_kind): New.
+	(lfence_before_ret): New.
+	(last_insn): New.
+	(load_insn_p): New.
+	(insert_lfence_after): New.
+	(insert_lfence_before): New.
+	(md_assemble): Call insert_lfence_before and insert_lfence_after.
+	Set last_insn.
+	(OPTION_MLFENCE_AFTER_LOAD): New.
+	(OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH): New.
+	(OPTION_MLFENCE_BEFORE_RET): New.
+	(md_longopts): Add -mlfence-after-load=,
+	-mlfence-before-indirect-branch= and -mlfence-before-ret=.
+	(md_parse_option): Handle -mlfence-after-load=,
+	-mlfence-before-indirect-branch= and -mlfence-before-ret=.
+	(md_show_usage): Display -mlfence-after-load=,
+	-mlfence-before-indirect-branch= and -mlfence-before-ret=.
+	(i386_cons_align): New.
+	* config/tc-i386.h (i386_cons_align): New.
+	(md_cons_align): New.
+	* doc/c-i386.texi: Document -mlfence-after-load=,
+	-mlfence-before-indirect-branch= and -mlfence-before-ret=.
+
 2020-03-10  Alan Modra  <amodra@gmail.com>
 
 	* config/tc-csky.c (get_operand_value): Rewrite 1 << 31 expressions
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index b020f39c86..09063f784b 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -629,7 +629,29 @@ static int omit_lock_prefix = 0;
    "lock addl $0, (%{re}sp)".  */
 static int avoid_fence = 0;
 
-/* Type of the previous instruction.  */
+/* 1 if lfence should be inserted after every load.  */
+static int lfence_after_load = 0;
+
+/* Non-zero if lfence should be inserted before indirect branch.  */
+static enum lfence_before_indirect_branch_kind
+  {
+    lfence_branch_none = 0,
+    lfence_branch_register,
+    lfence_branch_memory,
+    lfence_branch_all
+  }
+lfence_before_indirect_branch;
+
+/* Non-zero if lfence should be inserted before ret.  */
+static enum lfence_before_ret_kind
+  {
+    lfence_before_ret_none = 0,
+    lfence_before_ret_not,
+    lfence_before_ret_or
+  }
+lfence_before_ret;
+
+/* Record whether the previous instruction was a .byte directive or a prefix.  */
 static struct
   {
     segT seg;
@@ -4311,6 +4333,283 @@ optimize_encoding (void)
     }
 }
 
+/* Return non-zero for load instruction.  */
+
+static int
+load_insn_p (void)
+{
+  unsigned int dest;
+  int any_vex_p = is_any_vex_encoding (&i.tm);
+  unsigned int base_opcode = i.tm.base_opcode | 1;
+
+  if (!any_vex_p)
+    {
+      /* lea  */
+      if (i.tm.base_opcode == 0x8d)
+	return 0;
+
+      /* pop  */
+      if ((i.tm.base_opcode & ~7) == 0x58
+	  || (i.tm.base_opcode == 0x8f && i.tm.extension_opcode == 0))
+	return 1;
+
+      /* movs, cmps, lods, scas.  */
+      if ((i.tm.base_opcode | 0xb) == 0xaf)
+	return 1;
+
+      /* outs */
+      if (base_opcode == 0x6f)
+	return 1;
+    }
+
+  /* No memory operand.  */
+  if (!i.mem_operands)
+    return 0;
+
+  if (any_vex_p)
+    {
+      /* vldmxcsr.  */
+      if (i.tm.base_opcode == 0xae
+	  && i.tm.opcode_modifier.vex
+	  && i.tm.opcode_modifier.vexopcode == VEX0F
+	  && i.tm.extension_opcode == 2)
+	return 1;
+    }
+  else
+    {
+      /* test, not, neg, mul, imul, div, idiv.  */
+      if ((i.tm.base_opcode == 0xf6 || i.tm.base_opcode == 0xf7)
+	  && i.tm.extension_opcode != 1)
+	return 1;
+
+      /* inc, dec.  */
+      if (base_opcode == 0xff && i.tm.extension_opcode <= 1)
+	return 1;
+
+      /* add, or, adc, sbb, and, sub, xor, cmp.  */
+      if (i.tm.base_opcode >= 0x80 && i.tm.base_opcode <= 0x83)
+	return 1;
+
+      /* bt, bts, btr, btc.  */
+      if (i.tm.base_opcode == 0xfba
+	  && (i.tm.extension_opcode >= 4 && i.tm.extension_opcode <= 7))
+	return 1;
+
+      /* rol, ror, rcl, rcr, shl/sal, shr, sar. */
+      if ((base_opcode == 0xc1
+	   || (i.tm.base_opcode >= 0xd0 && i.tm.base_opcode <= 0xd3))
+	  && i.tm.extension_opcode != 6)
+	return 1;
+
+      /* cmpxchg8b, cmpxchg16b, xrstors.  */
+      if (i.tm.base_opcode == 0xfc7
+	  && (i.tm.extension_opcode == 1 || i.tm.extension_opcode == 3))
+	return 1;
+
+      /* fxrstor, ldmxcsr, xrstor.  */
+      if (i.tm.base_opcode == 0xfae
+	  && (i.tm.extension_opcode == 1
+	      || i.tm.extension_opcode == 2
+	      || i.tm.extension_opcode == 5))
+	return 1;
+
+      /* lgdt, lidt, lmsw.  */
+      if (i.tm.base_opcode == 0xf01
+	  && (i.tm.extension_opcode == 2
+	      || i.tm.extension_opcode == 3
+	      || i.tm.extension_opcode == 6))
+	return 1;
+
+      /* vmptrld */
+      if (i.tm.base_opcode == 0xfc7
+	  && i.tm.extension_opcode == 6)
+	return 1;
+
+      /* Check for x87 instructions.  */
+      if (i.tm.base_opcode >= 0xd8 && i.tm.base_opcode <= 0xdf)
+	{
+	  /* Skip fst, fstp, fstenv, fstcw.  */
+	  if (i.tm.base_opcode == 0xd9
+	      && (i.tm.extension_opcode == 2
+		  || i.tm.extension_opcode == 3
+		  || i.tm.extension_opcode == 6
+		  || i.tm.extension_opcode == 7))
+	    return 0;
+
+	  /* Skip fisttp, fist, fistp, fstp.  */
+	  if (i.tm.base_opcode == 0xdb
+	      && (i.tm.extension_opcode == 1
+		  || i.tm.extension_opcode == 2
+		  || i.tm.extension_opcode == 3
+		  || i.tm.extension_opcode == 7))
+	    return 0;
+
+	  /* Skip fisttp, fst, fstp, fsave, fstsw.  */
+	  if (i.tm.base_opcode == 0xdd
+	      && (i.tm.extension_opcode == 1
+		  || i.tm.extension_opcode == 2
+		  || i.tm.extension_opcode == 3
+		  || i.tm.extension_opcode == 6
+		  || i.tm.extension_opcode == 7))
+	    return 0;
+
+	  /* Skip fisttp, fist, fistp, fbstp, fistp.  */
+	  if (i.tm.base_opcode == 0xdf
+	      && (i.tm.extension_opcode == 1
+		  || i.tm.extension_opcode == 2
+		  || i.tm.extension_opcode == 3
+		  || i.tm.extension_opcode == 6
+		  || i.tm.extension_opcode == 7))
+	    return 0;
+
+	  return 1;
+	}
+    }
+
+  dest = i.operands - 1;
+
+  /* Check fake imm8 operand and 3 source operands.  */
+  if ((i.tm.opcode_modifier.immext
+       || i.tm.opcode_modifier.vexsources == VEX3SOURCES)
+      && i.types[dest].bitfield.imm8)
+    dest--;
+
+  /* add, or, adc, sbb, and, sub, xor, cmp, test, xchg, xadd  */
+  if (!any_vex_p
+      && (base_opcode == 0x1
+	  || base_opcode == 0x9
+	  || base_opcode == 0x11
+	  || base_opcode == 0x19
+	  || base_opcode == 0x21
+	  || base_opcode == 0x29
+	  || base_opcode == 0x31
+	  || base_opcode == 0x39
+	  || (i.tm.base_opcode >= 0x84 && i.tm.base_opcode <= 0x87)
+	  || base_opcode == 0xfc1))
+    return 1;
+
+  /* Check for load instruction.  */
+  return (i.types[dest].bitfield.class != ClassNone
+	  || i.types[dest].bitfield.instance == Accum);
+}
+
+/* Output lfence, 0xfaee8, after instruction.  */
+
+static void
+insert_lfence_after (void)
+{
+  if (lfence_after_load && load_insn_p ())
+    {
+      char *p = frag_more (3);
+      *p++ = 0xf;
+      *p++ = 0xae;
+      *p = 0xe8;
+    }
+}
+
+/* Output lfence, 0xfaee8, before instruction.  */
+
+static void
+insert_lfence_before (void)
+{
+  char *p;
+
+  if (is_any_vex_encoding (&i.tm))
+    return;
+
+  if (i.tm.base_opcode == 0xff
+      && (i.tm.extension_opcode == 2 || i.tm.extension_opcode == 4))
+    {
+      /* Insert lfence before indirect branch if needed.  */
+
+      if (lfence_before_indirect_branch == lfence_branch_none)
+	return;
+
+      if (i.operands != 1)
+	abort ();
+
+      if (i.reg_operands == 1)
+	{
+	  /* Indirect branch via register.  Don't insert lfence with
+	     -mlfence-after-load=yes.  */
+	  if (lfence_after_load
+	      || lfence_before_indirect_branch == lfence_branch_memory)
+	    return;
+	}
+      else if (i.mem_operands == 1
+	       && lfence_before_indirect_branch != lfence_branch_register)
+	{
+	  as_warn (_("indirect `%s` with memory operand should be avoided"),
+		   i.tm.name);
+	  return;
+	}
+      else
+	return;
+
+      if (last_insn.kind != last_insn_other
+	  && last_insn.seg == now_seg)
+	{
+	  as_warn_where (last_insn.file, last_insn.line,
+			 _("`%s` skips -mlfence-before-indirect-branch on `%s`"),
+			 last_insn.name, i.tm.name);
+	  return;
+	}
+
+      p = frag_more (3);
+      *p++ = 0xf;
+      *p++ = 0xae;
+      *p = 0xe8;
+      return;
+    }
+
+  /* Output or/not and lfence before ret.  */
+  if (lfence_before_ret != lfence_before_ret_none
+      && (i.tm.base_opcode == 0xc2
+	  || i.tm.base_opcode == 0xc3
+	  || i.tm.base_opcode == 0xca
+	  || i.tm.base_opcode == 0xcb))
+    {
+      if (last_insn.kind != last_insn_other
+	  && last_insn.seg == now_seg)
+	{
+	  as_warn_where (last_insn.file, last_insn.line,
+			 _("`%s` skips -mlfence-before-ret on `%s`"),
+			 last_insn.name, i.tm.name);
+	  return;
+	}
+      if (lfence_before_ret == lfence_before_ret_or)
+	{
+	  /* orl: 0x830c2400.  */
+	  p = frag_more ((flag_code == CODE_64BIT ? 1 : 0) + 4 + 3);
+	  if (flag_code == CODE_64BIT)
+	    *p++ = 0x48;
+	  *p++ = 0x83;
+	  *p++ = 0xc;
+	  *p++ = 0x24;
+	  *p++ = 0x0;
+	}
+      else
+	{
+	  p = frag_more ((flag_code == CODE_64BIT ? 2 : 0) + 6 + 3);
+	  /* notl: 0xf71424.  */
+	  if (flag_code == CODE_64BIT)
+	    *p++ = 0x48;
+	  *p++ = 0xf7;
+	  *p++ = 0x14;
+	  *p++ = 0x24;
+	  /* notl: 0xf71424.  */
+	  if (flag_code == CODE_64BIT)
+	    *p++ = 0x48;
+	  *p++ = 0xf7;
+	  *p++ = 0x14;
+	  *p++ = 0x24;
+	}
+      *p++ = 0xf;
+      *p++ = 0xae;
+      *p = 0xe8;
+    }
+}
+
 /* This is the guts of the machine-dependent assembler.  LINE points to a
    machine dependent instruction.  This function is supposed to emit
    the frags/bytes it assembles to.  */
@@ -4628,9 +4927,13 @@ md_assemble (char *line)
   if (i.rex != 0)
     add_prefix (REX_OPCODE | i.rex);
 
+  insert_lfence_before ();
+
   /* We are ready to output the insn.  */
   output_insn ();
 
+  insert_lfence_after ();
+
   last_insn.seg = now_seg;
 
   if (i.tm.opcode_modifier.isprefix)
@@ -12250,6 +12553,9 @@ const char *md_shortopts = "qnO::";
 #define OPTION_MALIGN_BRANCH_PREFIX_SIZE (OPTION_MD_BASE + 28)
 #define OPTION_MALIGN_BRANCH (OPTION_MD_BASE + 29)
 #define OPTION_MBRANCHES_WITH_32B_BOUNDARIES (OPTION_MD_BASE + 30)
+#define OPTION_MLFENCE_AFTER_LOAD (OPTION_MD_BASE + 31)
+#define OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH (OPTION_MD_BASE + 32)
+#define OPTION_MLFENCE_BEFORE_RET (OPTION_MD_BASE + 33)
 
 struct option md_longopts[] =
 {
@@ -12289,6 +12595,10 @@ struct option md_longopts[] =
   {"malign-branch-prefix-size", required_argument, NULL, OPTION_MALIGN_BRANCH_PREFIX_SIZE},
   {"malign-branch", required_argument, NULL, OPTION_MALIGN_BRANCH},
   {"mbranches-within-32B-boundaries", no_argument, NULL, OPTION_MBRANCHES_WITH_32B_BOUNDARIES},
+  {"mlfence-after-load", required_argument, NULL, OPTION_MLFENCE_AFTER_LOAD},
+  {"mlfence-before-indirect-branch", required_argument, NULL,
+   OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH},
+  {"mlfence-before-ret", required_argument, NULL, OPTION_MLFENCE_BEFORE_RET},
   {"mamd64", no_argument, NULL, OPTION_MAMD64},
   {"mintel64", no_argument, NULL, OPTION_MINTEL64},
   {NULL, no_argument, NULL, 0}
@@ -12668,6 +12978,41 @@ md_parse_option (int c, const char *arg)
         as_fatal (_("invalid -mfence-as-lock-add= option: `%s'"), arg);
       break;
 
+    case OPTION_MLFENCE_AFTER_LOAD:
+      if (strcasecmp (arg, "yes") == 0)
+	lfence_after_load = 1;
+      else if (strcasecmp (arg, "no") == 0)
+	lfence_after_load = 0;
+      else
+        as_fatal (_("invalid -mlfence-after-load= option: `%s'"), arg);
+      break;
+
+    case OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH:
+      if (strcasecmp (arg, "all") == 0)
+	lfence_before_indirect_branch = lfence_branch_all;
+      else if (strcasecmp (arg, "memory") == 0)
+	lfence_before_indirect_branch = lfence_branch_memory;
+      else if (strcasecmp (arg, "register") == 0)
+	lfence_before_indirect_branch = lfence_branch_register;
+      else if (strcasecmp (arg, "none") == 0)
+	lfence_before_indirect_branch = lfence_branch_none;
+      else
+        as_fatal (_("invalid -mlfence-before-indirect-branch= option: `%s'"),
+		  arg);
+      break;
+
+    case OPTION_MLFENCE_BEFORE_RET:
+      if (strcasecmp (arg, "or") == 0)
+	lfence_before_ret = lfence_before_ret_or;
+      else if (strcasecmp (arg, "not") == 0)
+	lfence_before_ret = lfence_before_ret_not;
+      else if (strcasecmp (arg, "none") == 0)
+	lfence_before_ret = lfence_before_ret_none;
+      else
+        as_fatal (_("invalid -mlfence-before-ret= option: `%s'"),
+		  arg);
+      break;
+
     case OPTION_MRELAX_RELOCATIONS:
       if (strcasecmp (arg, "yes") == 0)
         generate_relax_relocations = 1;
@@ -13025,6 +13370,15 @@ md_show_usage (FILE *stream)
   -mbranches-within-32B-boundaries\n\
                           align branches within 32 byte boundary\n"));
   fprintf (stream, _("\
+  -mlfence-after-load=[no|yes] (default: no)\n\
+                          generate lfence after load\n"));
+  fprintf (stream, _("\
+  -mlfence-before-indirect-branch=[none|all|register|memory] (default: none)\n\
+                          generate lfence before indirect near branch\n"));
+  fprintf (stream, _("\
+  -mlfence-before-ret=[none|or|not] (default: none)\n\
+                          generate lfence before ret\n"));
+  fprintf (stream, _("\
   -mamd64                 accept only AMD64 ISA [default]\n"));
   fprintf (stream, _("\
   -mintel64               accept only Intel64 ISA\n"));
@@ -13254,6 +13608,16 @@ i386_cons_align (int ignore ATTRIBUTE_UNUSED)
       last_insn.kind = last_insn_directive;
       last_insn.name = "constant directive";
       last_insn.file = as_where (&last_insn.line);
+      if (lfence_before_ret != lfence_before_ret_none)
+	{
+	  if (lfence_before_indirect_branch != lfence_branch_none)
+	    as_warn (_("constant directive skips -mlfence-before-ret "
+		       "and -mlfence-before-indirect-branch"));
+	  else
+	    as_warn (_("constant directive skips -mlfence-before-ret"));
+	}
+      else if (lfence_before_indirect_branch != lfence_branch_none)
+	as_warn (_("constant directive skips -mlfence-before-indirect-branch"));
     }
 }
 
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index c536759cb3..1dd99f91bb 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -464,6 +464,49 @@ on an instruction.  It is equivalent to
 @option{-malign-branch-prefix-size=5}.
 The default doesn't align branches.
 
+@cindex @samp{-mlfence-after-load=} option, i386
+@cindex @samp{-mlfence-after-load=} option, x86-64
+@item -mlfence-after-load=@var{no}
+@itemx -mlfence-after-load=@var{yes}
+These options control whether the assembler should generate lfence
+after load instructions.  @option{-mlfence-after-load=@var{yes}} will
+generate lfence.  @option{-mlfence-after-load=@var{no}} will not generate
+lfence, which is the default.
+
+@cindex @samp{-mlfence-before-indirect-branch=} option, i386
+@cindex @samp{-mlfence-before-indirect-branch=} option, x86-64
+@item -mlfence-before-indirect-branch=@var{none}
+@item -mlfence-before-indirect-branch=@var{all}
+@item -mlfence-before-indirect-branch=@var{register}
+@itemx -mlfence-before-indirect-branch=@var{memory}
+These options control whether the assembler should generate lfence
+before indirect near branch instructions.
+@option{-mlfence-before-indirect-branch=@var{all}} will generate lfence
+before indirect near branch via register and issue a warning before
+indirect near branch via memory.
+@option{-mlfence-before-indirect-branch=@var{register}} will generate
+lfence before indirect near branch via register.
+@option{-mlfence-before-indirect-branch=@var{memory}} will issue a
+warning before indirect near branch via memory.
+@option{-mlfence-before-indirect-branch=@var{none}} will not generate
+lfence nor issue a warning, which is the default.  Note that lfence won't
+be generated before indirect near branch via register with
+@option{-mlfence-after-load=@var{yes}} since lfence will be generated
+after loading branch target register.
+
+@cindex @samp{-mlfence-before-ret=} option, i386
+@cindex @samp{-mlfence-before-ret=} option, x86-64
+@item -mlfence-before-ret=@var{none}
+@item -mlfence-before-ret=@var{or}
+@itemx -mlfence-before-ret=@var{not}
+These options control whether the assembler should generate lfence
+before ret.  @option{-mlfence-before-ret=@var{or}} will generate an
+or instruction with lfence.
+@option{-mlfence-before-ret=@var{not}} will generate a not instruction
+with lfence.
+@option{-mlfence-before-ret=@var{none}} will not generate lfence,
+which is the default.
+
 @cindex @samp{-mx86-used-note=} option, i386
 @cindex @samp{-mx86-used-note=} option, x86-64
 @item -mx86-used-note=@var{no}
-- 
2.24.1


  reply	other threads:[~2020-03-11 16:17 UTC|newest]

Thread overview: 30+ messages
2020-03-10 16:05 [PATCH 0/2] x86: Add assembler mitigation for CVE-2020-0551 H.J. Lu
2020-03-10 16:05 ` [PATCH 1/2] i386: Generate lfence with load/indirect branch/ret [CVE-2020-0551] H.J. Lu
2020-03-11 10:55   ` Jan Beulich
2020-03-11 16:17     ` H.J. Lu [this message]
2020-03-25  9:27       ` Hongtao Liu
2020-03-25 10:03         ` Jan Beulich
2020-03-26  2:23           ` Hongtao Liu
2020-03-26  9:12             ` Jan Beulich
2020-04-16  5:34               ` Hongtao Liu
2020-04-16  8:33                 ` Jan Beulich
2020-04-20  7:20                   ` Hongtao Liu
2020-04-20  7:34                     ` Jan Beulich
2020-04-21  2:24                       ` Hongtao Liu
2020-04-21  6:30                         ` Jan Beulich
2020-04-22  3:33                           ` Hongtao Liu
2020-04-22  8:47                             ` Jan Beulich
2020-04-23  2:53                               ` Hongtao Liu
2020-04-23  6:59                                 ` Jan Beulich
2020-04-23  8:53                                   ` Hongtao Liu
2020-04-23  9:15                                     ` Jan Beulich
2020-04-24  5:30                                     ` Hongtao Liu
2020-04-24  6:00                                       ` Jan Beulich
2020-04-24  7:29                                         ` Hongtao Liu
2020-04-24 13:00                                           ` H.J. Lu
2020-04-26  2:03                                             ` Hongtao Liu
2020-04-26  3:26                                               ` H.J. Lu
2020-03-10 16:05 ` [PATCH 2/2] i386: Add tests for lfence with load/indirect branch/ret H.J. Lu
2020-03-10 16:33 ` [PATCH 0/2] x86: Add assembler mitigation for CVE-2020-0551 Jan Beulich
2020-03-10 16:36   ` H.J. Lu
2020-03-12  0:32     ` Fangrui Song
