public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [gold][aarch64]: Erratum 843419 optimized fix
@ 2015-07-16 21:44 Han Shen
  2015-07-20 17:21 ` Cary Coutant
  2016-07-16  5:43 ` Andrew Pinski
  0 siblings, 2 replies; 13+ messages in thread
From: Han Shen @ 2015-07-16 21:44 UTC (permalink / raw)
  To: binutils, Cary Coutant; +Cc: Luis Lozano

[-- Attachment #1: Type: text/plain, Size: 1206 bytes --]

Hi Cary, this is the patch for erratum 843419 fix optimization.

Usually we apply branch-to-stub fix for all erratum. For 843419, under some
condition, instead of generating jumps, we re-write 'adrp' with 'adr' (only
applicable if adrp calculation result fits in adr range), thus break such
erratum sequence and eliminate performance penalty (2-jump/fix).

Test - build on x86_64 platform and aarch64 platform using opt and debug (-O0).
  Pass unit tests.  Pass gold local test suite.  Pass tests from arm.

Ok for trunk?

gold/ChangeLog

2015-07-15 Han Shen <shenhan@google.com>

        Optimize erratum 843419 fix.

        gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr): New
        method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New method.
        (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
        (E843419_stub): New sub-class of Erratum_stub.
        (AArch64_relobj::try_fix_erratum_843419_optimized): New method.
        (AArch64_relobj::section_needs_reloc_stub_scanning): Try optimized fix.
        (AArch64_relobj::create_erratum_stub): Add 1 argument.
        (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn offset.

-- 
Han Shen

[-- Attachment #2: gold-erratum-843419-opt.patch --]
[-- Type: text/x-patch, Size: 10353 bytes --]

diff --git a/gold/aarch64.cc b/gold/aarch64.cc
index bc86f68..5a180b5 100644
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -101,10 +101,14 @@ public:
   static unsigned int
   aarch64_ra(Insntype insn)
   { return aarch64_bits(insn, 10, 5); }
 
   static bool
+  is_adr(const Insntype insn)
+  { return (insn & 0x9F000000) == 0x10000000; }
+
+  static bool
   is_adrp(const Insntype insn)
   { return (insn & 0x9F000000) == 0x90000000; }
 
   static unsigned int
   aarch64_rm(const Insntype insn)
@@ -124,10 +128,43 @@ public:
 
   static unsigned int
   aarch64_rt2(const Insntype insn)
   { return aarch64_bits(insn, 10, 5); }
 
+  // Encode imm21 into adr. Signed imm21 is in the range of [-1M, 1M).
+  static Insntype
+  aarch64_adr_encode_imm(Insntype adr, int imm21)
+  {
+    gold_assert(is_adr(adr));
+    gold_assert(-(1 << 20) <= imm21 && imm21 < (1 << 20));
+    const int mask19 = (1 << 19) - 1;
+    const int mask2 = 3;
+    adr &= ~((mask19 << 5) | (mask2 << 29));
+    adr |= ((imm21 & mask2) << 29) | (((imm21 >> 2) & mask19) << 5);
+    return adr;
+  }
+
+  // Retrieve encoded adrp 33-bit signed imm value. This value is obtained by
+  // 21-bit signed imm encoded in the insn multiplied by 4k (page size) and
+  // 64-bit sign-extended, resulting in [-4G, 4G) with 12-lsb being 0.
+  static int64_t
+  aarch64_adrp_decode_imm(const Insntype adrp)
+  {
+    const int mask19 = (1 << 19) - 1;
+    const int mask2 = 3;
+    gold_assert(is_adrp(adrp));
+    // 21-bit imm encoded in adrp.
+    uint64_t imm = ((adrp >> 29) & mask2) | (((adrp >> 5) & mask19) << 2);
+    // Retrieve msb of 21-bit-signed imm for sign extension.
+    uint64_t msbt = (imm >> 20) & 1;
+    // Real value is imm multipled by 4k. Value now has 33-bit information.
+    int64_t value = imm << 12;
+    // Sign extend to 64-bit by repeating msbt 31 (64-33) times and merge it
+    // with value.
+    return ((((uint64_t)(1) << 32) - msbt) << 33) | value;
+  }
+
   static bool
   aarch64_b(const Insntype insn)
   { return (insn & 0xFC000000) == 0x14000000; }
 
   static bool
@@ -1017,10 +1054,39 @@ private:
   Insntype erratum_insn_;
   // The address of the above insn.
   AArch64_address erratum_address_;
 };  // End of "Erratum_stub".
 
+
+// Erratum sub class to wrap additional info needed by 843419.  In fixing this
+// erratum, we may choose to replace 'adrp' with 'adr', in this case, we need
+// adrp's code position (two or three insns before erratum insn itself).
+
+template<int size, bool big_endian>
+class E843419_stub : public Erratum_stub<size, big_endian>
+{
+public:
+  typedef typename AArch64_insn_utilities<big_endian>::Insntype Insntype;
+
+  E843419_stub(AArch64_relobj<size, big_endian>* relobj,
+		      unsigned int shndx, unsigned int sh_offset,
+		      unsigned int adrp_sh_offset)
+    : Erratum_stub<size, big_endian>(relobj, ST_E_843419, shndx, sh_offset),
+      adrp_sh_offset_(adrp_sh_offset)
+  {}
+
+  unsigned int
+  adrp_sh_offset() const
+  { return this->adrp_sh_offset_; }
+
+private:
+  // Section offset of "adrp". (We do not need a "adrp_shndx_" field, because we
+  // can can obtain it from its parent.)
+  const unsigned int adrp_sh_offset_;
+};
+
+
 template<int size, bool big_endian>
 const int Erratum_stub<size, big_endian>::STUB_ADDR_ALIGN = 4;
 
 // Comparator used in set definition.
 template<int size, bool big_endian>
@@ -1752,10 +1818,17 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
  private:
   // Fix all errata in the object.
   void
   fix_errata(typename Sized_relobj_file<size, big_endian>::Views* pviews);
 
+  // Try to fix erratum 843419 in an optimized way. Return true if patch is
+  // applied.
+  bool
+  try_fix_erratum_843419_optimized(
+      The_erratum_stub*,
+      typename Sized_relobj_file<size, big_endian>::View_size&);
+
   // Whether a section needs to be scanned for relocation stubs.
   bool
   section_needs_reloc_stub_scanning(const elfcpp::Shdr<size, big_endian>&,
 				    const Relobj::Output_sections&,
 				    const Symbol_table*, const unsigned char*);
@@ -1889,22 +1962,79 @@ AArch64_relobj<size, big_endian>::fix_errata(
 	  Insntype* ip =
 	    reinterpret_cast<Insntype*>(pview.view + stub->sh_offset());
 	  Insntype insn_to_fix = ip[0];
 	  stub->update_erratum_insn(insn_to_fix);
 
-	  // Replace the erratum insn with a branch-to-stub.
-	  AArch64_address stub_address =
-	    stub_table->erratum_stub_address(stub);
-	  unsigned int b_offset = stub_address - stub->erratum_address();
-	  AArch64_relocate_functions<size, big_endian>::construct_b(
-	    pview.view + stub->sh_offset(), b_offset & 0xfffffff);
+	  // First try to see if erratum is 843419 and if it can be fixed
+	  // without using branch-to-stub.
+	  if (!try_fix_erratum_843419_optimized(stub, pview))
+	    {
+	      // Replace the erratum insn with a branch-to-stub.
+	      AArch64_address stub_address =
+		stub_table->erratum_stub_address(stub);
+	      unsigned int b_offset = stub_address - stub->erratum_address();
+	      AArch64_relocate_functions<size, big_endian>::construct_b(
+		pview.view + stub->sh_offset(), b_offset & 0xfffffff);
+	    }
 	  ++p;
 	}
     }
 }
 
 
+// This is an optimization for 843419. This erratum requires the sequence begin
+// with 'adrp', when final value calculated by adrp fits in adr, we can just
+// replace 'adrp' with 'adr', so we save 2 jumps per occurrence. (Note, however,
+// in this case, we do not delete the erratum stub (too late to do so), it is
+// merely generated without ever been called.)
+
+template<int size, bool big_endian>
+bool
+AArch64_relobj<size, big_endian>::try_fix_erratum_843419_optimized(
+    The_erratum_stub* stub,
+    typename Sized_relobj_file<size, big_endian>::View_size& pview)
+{
+  if (stub->type() != ST_E_843419)
+    return false;
+
+  typedef AArch64_insn_utilities<big_endian> Insn_utilities;
+  typedef typename elfcpp::Swap<32,big_endian>::Valtype Insntype;
+  E843419_stub<size, big_endian>* e843419_stub =
+    reinterpret_cast<E843419_stub<size, big_endian>*>(stub);
+  AArch64_address pc = pview.address + e843419_stub->adrp_sh_offset();
+  Insntype* adrp_view = reinterpret_cast<Insntype*>(
+    pview.view + e843419_stub->adrp_sh_offset());
+  Insntype adrp_insn = adrp_view[0];
+  gold_assert(Insn_utilities::is_adrp(adrp_insn));
+  // Get adrp 33-bit signed imm value.
+  int64_t adrp_imm = Insn_utilities::
+    aarch64_adrp_decode_imm(adrp_insn);
+  // adrp - final value transferred to target register is calculated as:
+  //     PC[11:0] = Zeros(12)
+  //     adrp_dest_value = PC + adrp_imm;
+  int64_t adrp_dest_value = (pc & ~((1 << 12) - 1)) + adrp_imm;
+  // adr -final value transferred to target register is calucalted as:
+  //     PC + adr_imm
+  // So we have:
+  //     PC + adr_imm = adrp_dest_value
+  //   ==>
+  //     adr_imm = adrp_dest_value - PC
+  int64_t adr_imm = adrp_dest_value - pc;
+  // Check if imm fits in adr (21-bit signed).
+  if (-(1 << 20) <= adr_imm && adr_imm < (1 << 20))
+    {
+      // Convert 'adrp' into 'adr'.
+      Insntype adr_insn = adrp_insn & ((1 << 31) - 1);
+      adr_insn = Insn_utilities::
+	aarch64_adr_encode_imm(adr_insn, adr_imm);
+      elfcpp::Swap<32, big_endian>::writeval(adrp_view, adr_insn);
+      return true;
+    }
+  return false;
+}
+
+
 // Relocate sections.
 
 template<int size, bool big_endian>
 void
 AArch64_relobj<size, big_endian>::do_relocate_sections(
@@ -3164,18 +3294,20 @@ class Target_aarch64 : public Sized_target<size, big_endian>
   {
     gold_assert(this->plt_ != NULL);
     return this->plt_;
   }
 
-  // Helper method to create erratum stubs for ST_E_843419 and ST_E_835769.
+  // Helper method to create erratum stubs for ST_E_843419 and ST_E_835769. For
+  // ST_E_843419, we need an additional field for adrp offset.
   void create_erratum_stub(
     AArch64_relobj<size, big_endian>* relobj,
     unsigned int shndx,
     section_size_type erratum_insn_offset,
     Address erratum_address,
     typename Insn_utilities::Insntype erratum_insn,
-    int erratum_type);
+    int erratum_type,
+    unsigned int e843419_adrp_offset=0);
 
   // Return whether this is a 3-insn erratum sequence.
   bool is_erratum_843419_sequence(
       typename elfcpp::Swap<32,big_endian>::Valtype insn1,
       typename elfcpp::Swap<32,big_endian>::Valtype insn2,
@@ -7869,22 +8001,30 @@ Target_aarch64<size, big_endian>::create_erratum_stub(
     AArch64_relobj<size, big_endian>* relobj,
     unsigned int shndx,
     section_size_type erratum_insn_offset,
     Address erratum_address,
     typename Insn_utilities::Insntype erratum_insn,
-    int erratum_type)
+    int erratum_type,
+    unsigned int e843419_adrp_offset)
 {
   gold_assert(erratum_type == ST_E_843419 || erratum_type == ST_E_835769);
   The_stub_table* stub_table = relobj->stub_table(shndx);
   gold_assert(stub_table != NULL);
   if (stub_table->find_erratum_stub(relobj,
 				    shndx,
 				    erratum_insn_offset) == NULL)
     {
       const int BPI = AArch64_insn_utilities<big_endian>::BYTES_PER_INSN;
-      The_erratum_stub* stub = new The_erratum_stub(
-	relobj, erratum_type, shndx, erratum_insn_offset);
+      The_erratum_stub* stub;
+      if (erratum_type == ST_E_835769)
+	stub = new The_erratum_stub(relobj, erratum_type, shndx,
+				    erratum_insn_offset);
+      else if (erratum_type == ST_E_843419)
+	stub = new E843419_stub<size, big_endian>(
+	    relobj, shndx, erratum_insn_offset, e843419_adrp_offset);
+      else
+	gold_unreachable();
       stub->set_erratum_insn(erratum_insn);
       stub->set_erratum_address(erratum_address);
       // For erratum ST_E_843419 and ST_E_835769, the destination address is
       // always the next insn after erratum insn.
       stub->set_destination_address(erratum_address + BPI);
@@ -8025,11 +8165,12 @@ Target_aarch64<size, big_endian>::scan_erratum_843419_span(
 		span_start + offset + insn_offset;
 	      Address erratum_address =
 		output_address + offset + insn_offset;
 	      create_erratum_stub(relobj, shndx,
 				  erratum_insn_offset, erratum_address,
-				  erratum_insn, ST_E_843419);
+				  erratum_insn, ST_E_843419,
+				  span_start + offset);
 	    }
 	}
 
       // Advance to next candidate instruction. We only consider instruction
       // sequences starting at a page offset of 0xff8 or 0xffc.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [gold][aarch64]: Erratum 843419 optimized fix
  2015-07-16 21:44 [gold][aarch64]: Erratum 843419 optimized fix Han Shen
@ 2015-07-20 17:21 ` Cary Coutant
  2015-07-20 22:25   ` Han Shen
  2016-07-16  5:43 ` Andrew Pinski
  1 sibling, 1 reply; 13+ messages in thread
From: Cary Coutant @ 2015-07-20 17:21 UTC (permalink / raw)
  To: Han Shen; +Cc: binutils, Luis Lozano

> gold/ChangeLog
>
> 2015-07-15 Han Shen <shenhan@google.com>
>
>         Optimize erratum 843419 fix.
>
>         gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr): New
>         method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New method.
>         (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
>         (E843419_stub): New sub-class of Erratum_stub.
>         (AArch64_relobj::try_fix_erratum_843419_optimized): New method.
>         (AArch64_relobj::section_needs_reloc_stub_scanning): Try optimized fix.
>         (AArch64_relobj::create_erratum_stub): Add 1 argument.
>         (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn offset.


+// This is an optimization for 843419. This erratum requires the sequence begin
+// with 'adrp', when final value calculated by adrp fits in adr, we can just
+// replace 'adrp' with 'adr', so we save 2 jumps per occurrence.
(Note, however,
+// in this case, we do not delete the erratum stub (too late to do so), it is
+// merely generated without ever been called.)

s/been/being/

This is OK with that fix. Thanks!

-cary

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [gold][aarch64]: Erratum 843419 optimized fix
  2015-07-20 17:21 ` Cary Coutant
@ 2015-07-20 22:25   ` Han Shen
  0 siblings, 0 replies; 13+ messages in thread
From: Han Shen @ 2015-07-20 22:25 UTC (permalink / raw)
  To: Cary Coutant; +Cc: binutils, Luis Lozano

Hi Cary, comments addressed and CL submitted. Thanks,

Han

On Mon, Jul 20, 2015 at 10:21 AM, Cary Coutant <ccoutant@gmail.com> wrote:
>> gold/ChangeLog
>>
>> 2015-07-15 Han Shen <shenhan@google.com>
>>
>>         Optimize erratum 843419 fix.
>>
>>         gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr): New
>>         method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New method.
>>         (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
>>         (E843419_stub): New sub-class of Erratum_stub.
>>         (AArch64_relobj::try_fix_erratum_843419_optimized): New method.
>>         (AArch64_relobj::section_needs_reloc_stub_scanning): Try optimized fix.
>>         (AArch64_relobj::create_erratum_stub): Add 1 argument.
>>         (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn offset.
>
>
> +// This is an optimization for 843419. This erratum requires the sequence begin
> +// with 'adrp', when final value calculated by adrp fits in adr, we can just
> +// replace 'adrp' with 'adr', so we save 2 jumps per occurrence.
> (Note, however,
> +// in this case, we do not delete the erratum stub (too late to do so), it is
> +// merely generated without ever been called.)
>
> s/been/being/
>
> This is OK with that fix. Thanks!
>
> -cary



-- 
Han Shen |  Software Engineer |  shenhan@google.com |  +1-650-440-3330

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [gold][aarch64]: Erratum 843419 optimized fix
  2015-07-16 21:44 [gold][aarch64]: Erratum 843419 optimized fix Han Shen
  2015-07-20 17:21 ` Cary Coutant
@ 2016-07-16  5:43 ` Andrew Pinski
  2016-07-18 16:46   ` Han Shen
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Pinski @ 2016-07-16  5:43 UTC (permalink / raw)
  To: Han Shen; +Cc: binutils, Cary Coutant, Luis Lozano

On Thu, Jul 16, 2015 at 2:44 PM, Han Shen <shenhan@google.com> wrote:
> Hi Cary, this is the patch for erratum 843419 fix optimization.
>
> Usually we apply branch-to-stub fix for all erratum. For 843419, under some
> condition, instead of generating jumps, we re-write 'adrp' with 'adr' (only
> applicable if adrp calculation result fits in adr range), thus break such
> erratum sequence and eliminate performance penalty (2-jump/fix).
>
> Test - build on x86_64 platform and aarch64 platform using opt and debug (-O0).
>   Pass unit tests.  Pass gold local test suite.  Pass tests from arm.
>
> Ok for trunk?

Hi,
  I am getting an internal error some of the time when linking HHVM :
Erratum 843419 found and fixed at
"../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
0x0000022c.
Erratum 843419 found and fixed at
"../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
offset 0x00000218.
Erratum 843419 found and fixed at
"../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
0x0000022c.
Erratum 843419 found and fixed at
"../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
offset 0x00000218.
/usr/bin/ld.gold: internal error in try_fix_erratum_843419_optimized,
at ../../gold/aarch64.cc:2007
collect2: error: ld returned 1 exit status

Is there anything which you need to debug this issue?

Thanks,
Andrew

>
> gold/ChangeLog
>
> 2015-07-15 Han Shen <shenhan@google.com>
>
>         Optimize erratum 843419 fix.
>
>         gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr): New
>         method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New method.
>         (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
>         (E843419_stub): New sub-class of Erratum_stub.
>         (AArch64_relobj::try_fix_erratum_843419_optimized): New method.
>         (AArch64_relobj::section_needs_reloc_stub_scanning): Try optimized fix.
>         (AArch64_relobj::create_erratum_stub): Add 1 argument.
>         (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn offset.
>
> --
> Han Shen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [gold][aarch64]: Erratum 843419 optimized fix
  2016-07-16  5:43 ` Andrew Pinski
@ 2016-07-18 16:46   ` Han Shen
  2017-06-07  9:26     ` Jiong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Han Shen @ 2016-07-18 16:46 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: binutils, Cary Coutant, Luis Lozano

Hi Andrew, thanks for reporting this. Could you send me the objs and
the command line? (I tried to build hhvm on aarch64 machine, seemed to
me this needs a few third_packages that need to be installed through
apt-get  (mysql, for example), since I am not a superuser on the
machine, it is not easy for me to build the whole thing from scratch
...)

On Fri, Jul 15, 2016 at 10:42 PM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Thu, Jul 16, 2015 at 2:44 PM, Han Shen <shenhan@google.com> wrote:
>> Hi Cary, this is the patch for erratum 843419 fix optimization.
>>
>> Usually we apply branch-to-stub fix for all erratum. For 843419, under some
>> condition, instead of generating jumps, we re-write 'adrp' with 'adr' (only
>> applicable if adrp calculation result fits in adr range), thus break such
>> erratum sequence and eliminate performance penalty (2-jump/fix).
>>
>> Test - build on x86_64 platform and aarch64 platform using opt and debug (-O0).
>>   Pass unit tests.  Pass gold local test suite.  Pass tests from arm.
>>
>> Ok for trunk?
>
> Hi,
>   I am getting an internal error some of the time when linking HHVM :
> Erratum 843419 found and fixed at
> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
> 0x0000022c.
> Erratum 843419 found and fixed at
> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
> offset 0x00000218.
> Erratum 843419 found and fixed at
> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
> 0x0000022c.
> Erratum 843419 found and fixed at
> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
> offset 0x00000218.
> /usr/bin/ld.gold: internal error in try_fix_erratum_843419_optimized,
> at ../../gold/aarch64.cc:2007
> collect2: error: ld returned 1 exit status
>
> Is there anything which you need to debug this issue?
>
> Thanks,
> Andrew
>
>>
>> gold/ChangeLog
>>
>> 2015-07-15 Han Shen <shenhan@google.com>
>>
>>         Optimize erratum 843419 fix.
>>
>>         gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr): New
>>         method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New method.
>>         (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
>>         (E843419_stub): New sub-class of Erratum_stub.
>>         (AArch64_relobj::try_fix_erratum_843419_optimized): New method.
>>         (AArch64_relobj::section_needs_reloc_stub_scanning): Try optimized fix.
>>         (AArch64_relobj::create_erratum_stub): Add 1 argument.
>>         (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn offset.
>>
>> --
>> Han Shen



-- 
Han Shen |  Software Engineer |  shenhan@google.com |  +1-650-440-3330

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [gold][aarch64]: Erratum 843419 optimized fix
  2016-07-18 16:46   ` Han Shen
@ 2017-06-07  9:26     ` Jiong Wang
  2017-06-08 10:35       ` [PATCH][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization Jiong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Jiong Wang @ 2017-06-07  9:26 UTC (permalink / raw)
  To: Han Shen, Andrew Pinski; +Cc: binutils, Cary Coutant, Luis Lozano

On 18/07/16 17:46, Han Shen wrote:

> Hi Andrew, thanks for reporting this. Could you send me the objs and
> the command line? (I tried to build hhvm on aarch64 machine, seemed to
> me this needs a few third_packages that need to be installed through
> apt-get  (mysql, for example), since I am not a superuser on the
> machine, it is not easy for me to build the whole thing from scratch
> ...)
>
> On Fri, Jul 15, 2016 at 10:42 PM, Andrew Pinski <pinskia@gmail.com> wrote:
>> On Thu, Jul 16, 2015 at 2:44 PM, Han Shen <shenhan@google.com> wrote:
>>> Hi Cary, this is the patch for erratum 843419 fix optimization.
>>>
>>> Usually we apply branch-to-stub fix for all erratum. For 843419, under some
>>> condition, instead of generating jumps, we re-write 'adrp' with 'adr' (only
>>> applicable if adrp calculation result fits in adr range), thus break such
>>> erratum sequence and eliminate performance penalty (2-jump/fix).
>>>
>>> Test - build on x86_64 platform and aarch64 platform using opt and debug (-O0).
>>>    Pass unit tests.  Pass gold local test suite.  Pass tests from arm.
>>>
>>> Ok for trunk?
>> Hi,
>>    I am getting an internal error some of the time when linking HHVM :
>> Erratum 843419 found and fixed at
>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>> 0x0000022c.
>> Erratum 843419 found and fixed at
>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>> offset 0x00000218.
>> Erratum 843419 found and fixed at
>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>> 0x0000022c.
>> Erratum 843419 found and fixed at
>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>> offset 0x00000218.
>> /usr/bin/ld.gold: internal error in try_fix_erratum_843419_optimized,
>> at ../../gold/aarch64.cc:2007
>> collect2: error: ld returned 1 exit status
>>
>> Is there anything which you need to debug this issue?
>>
>> Thanks,
>> Andrew
>>
>>> gold/ChangeLog
>>>
>>> 2015-07-15 Han Shen <shenhan@google.com>
>>>
>>>          Optimize erratum 843419 fix.
>>>
>>>          gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr): New
>>>          method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New method.
>>>          (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
>>>          (E843419_stub): New sub-class of Erratum_stub.
>>>          (AArch64_relobj::try_fix_erratum_843419_optimized): New method.
>>>          (AArch64_relobj::section_needs_reloc_stub_scanning): Try optimized fix.
>>>          (AArch64_relobj::create_erratum_stub): Add 1 argument.
>>>          (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn offset.
>>>
>>> --
>>> Han Shen

I have encountered the same issue.

On latest GOLD, the following assertion triggered:

   gold/aarch64.cc +2016 gold_assert(Insn_utilities::is_adrp(adrp_insn));

A quick debug shows the offending instruction sequence is like the following:


    91000401 	add	x1, x0, #0x1
    d53bd042 	mrs	x2, tpidr_el0
    d2a00000 	movz	x0, #0x0, lsl #16  <- adrp_sh_offset
    f285cf00 	movk	x0, #0x2e78
    8b000040 	add	x0, x2, x0
    f9001401 	str	x1, [x0,#40]

This sequence looks like the sequence for TLS local executable mode, So, I am
wondering if this issue is caused by TLS relaxtaion?

Before relaxataion they will be adrp + add instructions and the stub was recorded
at that time.  Then the later relaxation change the instructions and break the
assumptions.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization
  2017-06-07  9:26     ` Jiong Wang
@ 2017-06-08 10:35       ` Jiong Wang
  2017-06-13 22:17         ` Han Shen via binutils
  0 siblings, 1 reply; 13+ messages in thread
From: Jiong Wang @ 2017-06-08 10:35 UTC (permalink / raw)
  To: Han Shen, Andrew Pinski; +Cc: binutils, Cary Coutant, Luis Lozano

[-- Attachment #1: Type: text/plain, Size: 4524 bytes --]

On 07/06/17 10:26, Jiong Wang wrote:
> On 18/07/16 17:46, Han Shen wrote:
>
>> Hi Andrew, thanks for reporting this. Could you send me the objs and
>> the command line? (I tried to build hhvm on aarch64 machine, seemed to
>> me this needs a few third_packages that need to be installed through
>> apt-get  (mysql, for example), since I am not a superuser on the
>> machine, it is not easy for me to build the whole thing from scratch
>> ...)
>>
>> On Fri, Jul 15, 2016 at 10:42 PM, Andrew Pinski <pinskia@gmail.com> wrote:
>>> On Thu, Jul 16, 2015 at 2:44 PM, Han Shen <shenhan@google.com> wrote:
>>>> Hi Cary, this is the patch for erratum 843419 fix optimization.
>>>>
>>>> Usually we apply branch-to-stub fix for all erratum. For 843419, under some
>>>> condition, instead of generating jumps, we re-write 'adrp' with 'adr' (only
>>>> applicable if adrp calculation result fits in adr range), thus break such
>>>> erratum sequence and eliminate performance penalty (2-jump/fix).
>>>>
>>>> Test - build on x86_64 platform and aarch64 platform using opt and debug (-O0).
>>>>    Pass unit tests.  Pass gold local test suite.  Pass tests from arm.
>>>>
>>>> Ok for trunk? 
>>> Hi,
>>>    I am getting an internal error some of the time when linking HHVM :
>>> Erratum 843419 found and fixed at
>>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>>> 0x0000022c.
>>> Erratum 843419 found and fixed at
>>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>>> offset 0x00000218.
>>> Erratum 843419 found and fixed at
>>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>>> 0x0000022c.
>>> Erratum 843419 found and fixed at
>>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>>> offset 0x00000218.
>>> /usr/bin/ld.gold: internal error in try_fix_erratum_843419_optimized,
>>> at ../../gold/aarch64.cc:2007
>>> collect2: error: ld returned 1 exit status
>>>
>>> Is there anything which you need to debug this issue?
>>>
>>> Thanks,
>>> Andrew
>>>
>>>> gold/ChangeLog
>>>>
>>>> 2015-07-15 Han Shen <shenhan@google.com>
>>>>
>>>>          Optimize erratum 843419 fix.
>>>>
>>>>          gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr): New
>>>>          method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New method.
>>>>          (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
>>>>          (E843419_stub): New sub-class of Erratum_stub.
>>>>          (AArch64_relobj::try_fix_erratum_843419_optimized): New method.
>>>>          (AArch64_relobj::section_needs_reloc_stub_scanning): Try optimized fix.
>>>>          (AArch64_relobj::create_erratum_stub): Add 1 argument.
>>>>          (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn offset.
>>>>
>>>> -- 
>>>> Han Shen 
>
> I have encountered the same issue.
>
> On latest GOLD, the following assertion triggered:
>
>   gold/aarch64.cc +2016 gold_assert(Insn_utilities::is_adrp(adrp_insn));
>
> A quick debug shows the offending instruction sequence is like the following:
>
>
>    91000401     add    x1, x0, #0x1
>    d53bd042     mrs    x2, tpidr_el0
>    d2a00000     movz    x0, #0x0, lsl #16  <- adrp_sh_offset
>    f285cf00     movk    x0, #0x2e78
>    8b000040     add    x0, x2, x0
>    f9001401     str    x1, [x0,#40]
>
> This sequence looks like the sequence for TLS local executable mode, So, I am
> wondering if this issue is caused by TLS relaxtaion?
>
> Before relaxataion they will be adrp + add instructions and the stub was recorded
> at that time.  Then the later relaxation change the instructions and break the
> assumptions.


Here is a proposed fix, please have a look, it fixed the HHVM build.

This patch simply returns false from the erratum fix function.  From my
understanding, this ignores that particular stub.  I feel this is reasonable as
after TLS relaxataion the original sequence does not contain adrp instruction
anymore.
  
This patch return false in to cases:
   * if the instruction pointed by adrp_sh_offset is "mrs R, tpidr_el0".
     IE -> LE relaxation etc. may generate this.
   * if the instruction pointed by adrp_sh_offset is not ADRP and the instruction
     before adrp_sh_offset is "mrs R, tpidr_el0", LD -> LE relaxation etc may
     generate this.


gold/
2017-06-08  Jiong Wang  <jiong.wang@arm.com>

         * aarch64.cc (Insn_utilities::is_mrs_tpidr_el0): New method.
         (AArch64_relobj<size, big_endian>::try_fix_erratum_843419_optimized):
         Skip if there is TLS relaxation.



[-- Attachment #2: 1.patch --]
[-- Type: text/x-patch, Size: 2186 bytes --]

diff --git a/gold/aarch64.cc b/gold/aarch64.cc
index 2470986..3bfb96d 100644
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -110,6 +110,10 @@ public:
   is_adrp(const Insntype insn)
   { return (insn & 0x9F000000) == 0x90000000; }
 
+  static bool
+  is_mrs_tpidr_el0(const Insntype insn)
+  { return (insn & 0xFFFFFFE0) == 0xd53bd040; }
+
   static unsigned int
   aarch64_rm(const Insntype insn)
   { return aarch64_bits(insn, 16, 5); }
@@ -2010,9 +2014,35 @@ AArch64_relobj<size, big_endian>::try_fix_erratum_843419_optimized(
   E843419_stub<size, big_endian>* e843419_stub =
     reinterpret_cast<E843419_stub<size, big_endian>*>(stub);
   AArch64_address pc = pview.address + e843419_stub->adrp_sh_offset();
-  Insntype* adrp_view = reinterpret_cast<Insntype*>(
-    pview.view + e843419_stub->adrp_sh_offset());
+  unsigned int adrp_offset = e843419_stub->adrp_sh_offset ();
+  Insntype* adrp_view = reinterpret_cast<Insntype*>(pview.view + adrp_offset);
   Insntype adrp_insn = adrp_view[0];
+  Insntype prev_insn;
+
+  if (adrp_offset)
+    {
+      Insntype* prev_view
+	= reinterpret_cast<Insntype*>(pview.view + adrp_offset - 4);
+      prev_insn = prev_view[0];
+    }
+
+  // TLS relaxation might change ADRP instruction into other instructions.
+  // The following check identifies the following cases:
+  //
+  //   1. If the instruction pointed to by adrp_sh_offset is "mrs R, tpidr_el0".
+  //	  This may come from IE -> LE relaxation etc.  ADRP has been turned
+  //	  into MRS as expected, we could safely skip the erratum fix.
+  //
+  //   2. If the instruction before adrp_sh_offset is "mrs R, tpidr_el0" and
+  //	  the instruction at adrp_sh_offset is not ADRP.  This may come from
+  //	  LD -> LE relaxation etc.  Similar as case 1, we can safely skip the
+  //	  erratum fix.
+  if (Insn_utilities::is_mrs_tpidr_el0(adrp_insn)
+      || (!Insn_utilities::is_adrp(adrp_insn)
+	  && adrp_offset && Insn_utilities::is_mrs_tpidr_el0(prev_insn)))
+    return false;
+
+  /* If we reach here, the first instruction must be ADRP.  */
   gold_assert(Insn_utilities::is_adrp(adrp_insn));
   // Get adrp 33-bit signed imm value.
   int64_t adrp_imm = Insn_utilities::

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization
  2017-06-08 10:35       ` [PATCH][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization Jiong Wang
@ 2017-06-13 22:17         ` Han Shen via binutils
  2017-06-14  8:42           ` Jiong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Han Shen via binutils @ 2017-06-13 22:17 UTC (permalink / raw)
  To: Jiong Wang; +Cc: Andrew Pinski, binutils, Cary Coutant, Luis Lozano

Hi Jiong, thanks for the fix.

One question - in your patch, when linker encounters 2 cases described
in your comments, "try_fix_erratum_843419_optimized" returns false,
which makes linker fall back to "replacing the erratum insn with a
branch-to-stub." Whereas the correct behavior should leave the code
untouched, because adrp is replaced by relaxation and shall never
trigger the erratum. What do you think?

Han

On Thu, Jun 8, 2017 at 3:35 AM, Jiong Wang <jiong.wang@foss.arm.com> wrote:
> On 07/06/17 10:26, Jiong Wang wrote:
>>
>> On 18/07/16 17:46, Han Shen wrote:
>>
>>> Hi Andrew, thanks for reporting this. Could you send me the objs and
>>> the command line? (I tried to build hhvm on aarch64 machine, seemed to
>>> me this needs a few third_packages that need to be installed through
>>> apt-get  (mysql, for example), since I am not a superuser on the
>>> machine, it is not easy for me to build the whole thing from scratch
>>> ...)
>>>
>>> On Fri, Jul 15, 2016 at 10:42 PM, Andrew Pinski <pinskia@gmail.com>
>>> wrote:
>>>>
>>>> On Thu, Jul 16, 2015 at 2:44 PM, Han Shen <shenhan@google.com> wrote:
>>>>>
>>>>> Hi Cary, this is the patch for erratum 843419 fix optimization.
>>>>>
>>>>> Usually we apply branch-to-stub fix for all erratum. For 843419, under
>>>>> some
>>>>> condition, instead of generating jumps, we re-write 'adrp' with 'adr'
>>>>> (only
>>>>> applicable if adrp calculation result fits in adr range), thus break
>>>>> such
>>>>> erratum sequence and eliminate performance penalty (2-jump/fix).
>>>>>
>>>>> Test - build on x86_64 platform and aarch64 platform using opt and
>>>>> debug (-O0).
>>>>>    Pass unit tests.  Pass gold local test suite.  Pass tests from arm.
>>>>>
>>>>> Ok for trunk?
>>>>
>>>> Hi,
>>>>    I am getting an internal error some of the time when linking HHVM :
>>>> Erratum 843419 found and fixed at
>>>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>>>> 0x0000022c.
>>>> Erratum 843419 found and fixed at
>>>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>>>> offset 0x00000218.
>>>> Erratum 843419 found and fixed at
>>>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>>>> 0x0000022c.
>>>> Erratum 843419 found and fixed at
>>>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>>>> offset 0x00000218.
>>>> /usr/bin/ld.gold: internal error in try_fix_erratum_843419_optimized,
>>>> at ../../gold/aarch64.cc:2007
>>>> collect2: error: ld returned 1 exit status
>>>>
>>>> Is there anything which you need to debug this issue?
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>>> gold/ChangeLog
>>>>>
>>>>> 2015-07-15 Han Shen <shenhan@google.com>
>>>>>
>>>>>          Optimize erratum 843419 fix.
>>>>>
>>>>>          gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr):
>>>>> New
>>>>>          method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New
>>>>> method.
>>>>>          (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
>>>>>          (E843419_stub): New sub-class of Erratum_stub.
>>>>>          (AArch64_relobj::try_fix_erratum_843419_optimized): New
>>>>> method.
>>>>>          (AArch64_relobj::section_needs_reloc_stub_scanning): Try
>>>>> optimized fix.
>>>>>          (AArch64_relobj::create_erratum_stub): Add 1 argument.
>>>>>          (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn
>>>>> offset.
>>>>>
>>>>> --
>>>>> Han Shen
>>
>>
>> I have encountered the same issue.
>>
>> On latest GOLD, the following assertion triggered:
>>
>>   gold/aarch64.cc +2016 gold_assert(Insn_utilities::is_adrp(adrp_insn));
>>
>> A quick debug shows the offending instruction sequence is like the
>> following:
>>
>>
>>    91000401     add    x1, x0, #0x1
>>    d53bd042     mrs    x2, tpidr_el0
>>    d2a00000     movz    x0, #0x0, lsl #16  <- adrp_sh_offset
>>    f285cf00     movk    x0, #0x2e78
>>    8b000040     add    x0, x2, x0
>>    f9001401     str    x1, [x0,#40]
>>
>> This sequence looks like the sequence for TLS local executable mode, So, I
>> am
>> wondering if this issue is caused by TLS relaxtaion?
>>
>> Before relaxataion they will be adrp + add instructions and the stub was
>> recorded
>> at that time.  Then the later relaxation change the instructions and break
>> the
>> assumptions.
>
>
>
> Here is a proposed fix, please have a look, it fixed the HHVM build.
>
> This patch simply returns false from the erratum fix function.  From my
> understanding, this ignores that particular stub.  I feel this is reasonable
> as
> after TLS relaxataion the original sequence does not contain adrp
> instruction
> anymore.
>  This patch return false in to cases:
>   * if the instruction pointed by adrp_sh_offset is "mrs R, tpidr_el0".
>     IE -> LE relaxation etc. may generate this.
>   * if the instruction pointed by adrp_sh_offset is not ADRP and the
> instruction
>     before adrp_sh_offset is "mrs R, tpidr_el0", LD -> LE relaxation etc may
>     generate this.
>
>
> gold/
> 2017-06-08  Jiong Wang  <jiong.wang@arm.com>
>
>         * aarch64.cc (Insn_utilities::is_mrs_tpidr_el0): New method.
>         (AArch64_relobj<size,
> big_endian>::try_fix_erratum_843419_optimized):
>         Skip if there is TLS relaxation.
>
>



-- 
Han Shen |  Software Engineer |  shenhan@google.com |  +1-650-440-3330

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization
  2017-06-13 22:17         ` Han Shen via binutils
@ 2017-06-14  8:42           ` Jiong Wang
  2017-06-14 18:06             ` Han Shen via binutils
  0 siblings, 1 reply; 13+ messages in thread
From: Jiong Wang @ 2017-06-14  8:42 UTC (permalink / raw)
  To: Han Shen; +Cc: Andrew Pinski, binutils, Cary Coutant, Luis Lozano

On 13/06/17 23:17, Han Shen via binutils wrote:
> Whereas the correct behavior should leave the code
> untouched, because adrp is replaced by relaxation and shall never
> trigger the erratum.

Hi Han,

   This is exactly what I want to do.

> What do you think?

   Look more on fix_errata + try_try_fix_erratum_843419_optimized, I think we
should return true as return true means this sequence becomes safe after either
instruction rewrite or double-check that we don't want to install the branch-to-stub,
is this looks correct to you?

>
> Han
>
> On Thu, Jun 8, 2017 at 3:35 AM, Jiong Wang <jiong.wang@foss.arm.com> wrote:
>> On 07/06/17 10:26, Jiong Wang wrote:
>>> On 18/07/16 17:46, Han Shen wrote:
>>>
>>>> Hi Andrew, thanks for reporting this. Could you send me the objs and
>>>> the command line? (I tried to build hhvm on aarch64 machine, seemed to
>>>> me this needs a few third_packages that need to be installed through
>>>> apt-get  (mysql, for example), since I am not a superuser on the
>>>> machine, it is not easy for me to build the whole thing from scratch
>>>> ...)
>>>>
>>>> On Fri, Jul 15, 2016 at 10:42 PM, Andrew Pinski <pinskia@gmail.com>
>>>> wrote:
>>>>> On Thu, Jul 16, 2015 at 2:44 PM, Han Shen <shenhan@google.com> wrote:
>>>>>> Hi Cary, this is the patch for erratum 843419 fix optimization.
>>>>>>
>>>>>> Usually we apply branch-to-stub fix for all erratum. For 843419, under
>>>>>> some
>>>>>> condition, instead of generating jumps, we re-write 'adrp' with 'adr'
>>>>>> (only
>>>>>> applicable if adrp calculation result fits in adr range), thus break
>>>>>> such
>>>>>> erratum sequence and eliminate performance penalty (2-jump/fix).
>>>>>>
>>>>>> Test - build on x86_64 platform and aarch64 platform using opt and
>>>>>> debug (-O0).
>>>>>>     Pass unit tests.  Pass gold local test suite.  Pass tests from arm.
>>>>>>
>>>>>> Ok for trunk?
>>>>> Hi,
>>>>>     I am getting an internal error some of the time when linking HHVM :
>>>>> Erratum 843419 found and fixed at
>>>>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>>>>> 0x0000022c.
>>>>> Erratum 843419 found and fixed at
>>>>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>>>>> offset 0x00000218.
>>>>> Erratum 843419 found and fixed at
>>>>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>>>>> 0x0000022c.
>>>>> Erratum 843419 found and fixed at
>>>>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>>>>> offset 0x00000218.
>>>>> /usr/bin/ld.gold: internal error in try_fix_erratum_843419_optimized,
>>>>> at ../../gold/aarch64.cc:2007
>>>>> collect2: error: ld returned 1 exit status
>>>>>
>>>>> Is there anything which you need to debug this issue?
>>>>>
>>>>> Thanks,
>>>>> Andrew
>>>>>
>>>>>> gold/ChangeLog
>>>>>>
>>>>>> 2015-07-15 Han Shen <shenhan@google.com>
>>>>>>
>>>>>>           Optimize erratum 843419 fix.
>>>>>>
>>>>>>           gold/ChangeLog: * aarch64.cc (AArch64_insn_utilities::is_adr):
>>>>>> New
>>>>>>           method.  (AArch64_insn_utilities::aarch64_adr_encode_imm): New
>>>>>> method.
>>>>>>           (AArch64_insn_utilities::aarch64_adrp_decode_imm): New method.
>>>>>>           (E843419_stub): New sub-class of Erratum_stub.
>>>>>>           (AArch64_relobj::try_fix_erratum_843419_optimized): New
>>>>>> method.
>>>>>>           (AArch64_relobj::section_needs_reloc_stub_scanning): Try
>>>>>> optimized fix.
>>>>>>           (AArch64_relobj::create_erratum_stub): Add 1 argument.
>>>>>>           (Target_aarch64::scan_erratum_843419_span): Pass in adrp insn
>>>>>> offset.
>>>>>>
>>>>>> --
>>>>>> Han Shen
>>>
>>> I have encountered the same issue.
>>>
>>> On latest GOLD, the following assertion triggered:
>>>
>>>    gold/aarch64.cc +2016 gold_assert(Insn_utilities::is_adrp(adrp_insn));
>>>
>>> A quick debug shows the offending instruction sequence is like the
>>> following:
>>>
>>>
>>>     91000401     add    x1, x0, #0x1
>>>     d53bd042     mrs    x2, tpidr_el0
>>>     d2a00000     movz    x0, #0x0, lsl #16  <- adrp_sh_offset
>>>     f285cf00     movk    x0, #0x2e78
>>>     8b000040     add    x0, x2, x0
>>>     f9001401     str    x1, [x0,#40]
>>>
>>> This sequence looks like the sequence for TLS local executable mode, So, I
>>> am
>>> wondering if this issue is caused by TLS relaxtaion?
>>>
>>> Before relaxataion they will be adrp + add instructions and the stub was
>>> recorded
>>> at that time.  Then the later relaxation change the instructions and break
>>> the
>>> assumptions.
>>
>>
>> Here is a proposed fix, please have a look, it fixed the HHVM build.
>>
>> This patch simply returns false from the erratum fix function.  From my
>> understanding, this ignores that particular stub.  I feel this is reasonable
>> as
>> after TLS relaxataion the original sequence does not contain adrp
>> instruction
>> anymore.
>>   This patch return false in to cases:
>>    * if the instruction pointed by adrp_sh_offset is "mrs R, tpidr_el0".
>>      IE -> LE relaxation etc. may generate this.
>>    * if the instruction pointed by adrp_sh_offset is not ADRP and the
>> instruction
>>      before adrp_sh_offset is "mrs R, tpidr_el0", LD -> LE relaxation etc may
>>      generate this.
>>
>>
>> gold/
>> 2017-06-08  Jiong Wang  <jiong.wang@arm.com>
>>
>>          * aarch64.cc (Insn_utilities::is_mrs_tpidr_el0): New method.
>>          (AArch64_relobj<size,
>> big_endian>::try_fix_erratum_843419_optimized):
>>          Skip if there is TLS relaxation.
>>
>>
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization
  2017-06-14  8:42           ` Jiong Wang
@ 2017-06-14 18:06             ` Han Shen via binutils
  2017-06-15  0:07               ` Cary Coutant
  0 siblings, 1 reply; 13+ messages in thread
From: Han Shen via binutils @ 2017-06-14 18:06 UTC (permalink / raw)
  To: Jiong Wang; +Cc: Andrew Pinski, binutils, Cary Coutant, Luis Lozano

Thanks,

On Wed, Jun 14, 2017 at 1:42 AM, Jiong Wang <jiong.wang@foss.arm.com> wrote:
> On 13/06/17 23:17, Han Shen via binutils wrote:
>>
>> Whereas the correct behavior should leave the code
>> untouched, because adrp is replaced by relaxation and shall never
>> trigger the erratum.
>
>
> Hi Han,
>
>   This is exactly what I want to do.
>
>> What do you think?
>
>
>   Look more on fix_errata + try_try_fix_erratum_843419_optimized, I think we
> should return true as return true means this sequence becomes safe after
> either
> instruction rewrite or double-check that we don't want to install the
> branch-to-stub,
> is this looks correct to you?

Yup, let's change "return  false;" -> "return true;" (and add proper comments).
Also, I assume this relaxation-changing-instructions behavior should
not break erratum 835769? Erratum 843419 must have adrp as its first
insn, whereas 835769 begins with LD/ST.

Otherwise, LGTM

We shall still wait for Cary's approval.

Han

>
>
>>
>> Han
>>
>> On Thu, Jun 8, 2017 at 3:35 AM, Jiong Wang <jiong.wang@foss.arm.com>
>> wrote:
>>>
>>> On 07/06/17 10:26, Jiong Wang wrote:
>>>>
>>>> On 18/07/16 17:46, Han Shen wrote:
>>>>
>>>>> Hi Andrew, thanks for reporting this. Could you send me the objs and
>>>>> the command line? (I tried to build hhvm on aarch64 machine, seemed to
>>>>> me this needs a few third_packages that need to be installed through
>>>>> apt-get  (mysql, for example), since I am not a superuser on the
>>>>> machine, it is not easy for me to build the whole thing from scratch
>>>>> ...)
>>>>>
>>>>> On Fri, Jul 15, 2016 at 10:42 PM, Andrew Pinski <pinskia@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> On Thu, Jul 16, 2015 at 2:44 PM, Han Shen <shenhan@google.com> wrote:
>>>>>>>
>>>>>>> Hi Cary, this is the patch for erratum 843419 fix optimization.
>>>>>>>
>>>>>>> Usually we apply branch-to-stub fix for all erratum. For 843419,
>>>>>>> under
>>>>>>> some
>>>>>>> condition, instead of generating jumps, we re-write 'adrp' with 'adr'
>>>>>>> (only
>>>>>>> applicable if adrp calculation result fits in adr range), thus break
>>>>>>> such
>>>>>>> erratum sequence and eliminate performance penalty (2-jump/fix).
>>>>>>>
>>>>>>> Test - build on x86_64 platform and aarch64 platform using opt and
>>>>>>> debug (-O0).
>>>>>>>     Pass unit tests.  Pass gold local test suite.  Pass tests from
>>>>>>> arm.
>>>>>>>
>>>>>>> Ok for trunk?
>>>>>>
>>>>>> Hi,
>>>>>>     I am getting an internal error some of the time when linking HHVM
>>>>>> :
>>>>>> Erratum 843419 found and fixed at
>>>>>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>>>>>> 0x0000022c.
>>>>>> Erratum 843419 found and fixed at
>>>>>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>>>>>> offset 0x00000218.
>>>>>> Erratum 843419 found and fixed at
>>>>>> "../runtime/libhphp_runtime.a(bytecode.cpp.o)", section 10882, offset
>>>>>> 0x0000022c.
>>>>>> Erratum 843419 found and fixed at
>>>>>> "../runtime/libhphp_runtime.a(unique-stubs.cpp.o)", section 7040,
>>>>>> offset 0x00000218.
>>>>>> /usr/bin/ld.gold: internal error in try_fix_erratum_843419_optimized,
>>>>>> at ../../gold/aarch64.cc:2007
>>>>>> collect2: error: ld returned 1 exit status
>>>>>>
>>>>>> Is there anything which you need to debug this issue?
>>>>>>
>>>>>> Thanks,
>>>>>> Andrew
>>>>>>
>>>>>>> gold/ChangeLog
>>>>>>>
>>>>>>> 2015-07-15 Han Shen <shenhan@google.com>
>>>>>>>
>>>>>>>           Optimize erratum 843419 fix.
>>>>>>>
>>>>>>>           gold/ChangeLog: * aarch64.cc
>>>>>>> (AArch64_insn_utilities::is_adr):
>>>>>>> New
>>>>>>>           method.  (AArch64_insn_utilities::aarch64_adr_encode_imm):
>>>>>>> New
>>>>>>> method.
>>>>>>>           (AArch64_insn_utilities::aarch64_adrp_decode_imm): New
>>>>>>> method.
>>>>>>>           (E843419_stub): New sub-class of Erratum_stub.
>>>>>>>           (AArch64_relobj::try_fix_erratum_843419_optimized): New
>>>>>>> method.
>>>>>>>           (AArch64_relobj::section_needs_reloc_stub_scanning): Try
>>>>>>> optimized fix.
>>>>>>>           (AArch64_relobj::create_erratum_stub): Add 1 argument.
>>>>>>>           (Target_aarch64::scan_erratum_843419_span): Pass in adrp
>>>>>>> insn
>>>>>>> offset.
>>>>>>>
>>>>>>> --
>>>>>>> Han Shen
>>>>
>>>>
>>>> I have encountered the same issue.
>>>>
>>>> On latest GOLD, the following assertion triggered:
>>>>
>>>>    gold/aarch64.cc +2016
>>>> gold_assert(Insn_utilities::is_adrp(adrp_insn));
>>>>
>>>> A quick debug shows the offending instruction sequence is like the
>>>> following:
>>>>
>>>>
>>>>     91000401     add    x1, x0, #0x1
>>>>     d53bd042     mrs    x2, tpidr_el0
>>>>     d2a00000     movz    x0, #0x0, lsl #16  <- adrp_sh_offset
>>>>     f285cf00     movk    x0, #0x2e78
>>>>     8b000040     add    x0, x2, x0
>>>>     f9001401     str    x1, [x0,#40]
>>>>
>>>> This sequence looks like the sequence for TLS local executable mode, So,
>>>> I
>>>> am
>>>> wondering if this issue is caused by TLS relaxtaion?
>>>>
>>>> Before relaxataion they will be adrp + add instructions and the stub was
>>>> recorded
>>>> at that time.  Then the later relaxation change the instructions and
>>>> break
>>>> the
>>>> assumptions.
>>>
>>>
>>>
>>> Here is a proposed fix, please have a look, it fixed the HHVM build.
>>>
>>> This patch simply returns false from the erratum fix function.  From my
>>> understanding, this ignores that particular stub.  I feel this is
>>> reasonable
>>> as
>>> after TLS relaxataion the original sequence does not contain adrp
>>> instruction
>>> anymore.
>>>   This patch return false in to cases:
>>>    * if the instruction pointed by adrp_sh_offset is "mrs R, tpidr_el0".
>>>      IE -> LE relaxation etc. may generate this.
>>>    * if the instruction pointed by adrp_sh_offset is not ADRP and the
>>> instruction
>>>      before adrp_sh_offset is "mrs R, tpidr_el0", LD -> LE relaxation etc
>>> may
>>>      generate this.
>>>
>>>
>>> gold/
>>> 2017-06-08  Jiong Wang  <jiong.wang@arm.com>
>>>
>>>          * aarch64.cc (Insn_utilities::is_mrs_tpidr_el0): New method.
>>>          (AArch64_relobj<size,
>>> big_endian>::try_fix_erratum_843419_optimized):
>>>          Skip if there is TLS relaxation.
>>>
>>>
>>
>>
>



-- 
Han Shen |  Software Engineer |  shenhan@google.com |  +1-650-440-3330

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization
  2017-06-14 18:06             ` Han Shen via binutils
@ 2017-06-15  0:07               ` Cary Coutant
  2017-06-15 10:27                 ` [COMMITTED][gold, " Jiong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Cary Coutant @ 2017-06-15  0:07 UTC (permalink / raw)
  To: Han Shen; +Cc: Jiong Wang, Andrew Pinski, binutils, Luis Lozano

> Yup, let's change "return  false;" -> "return true;" (and add proper comments).
> Also, I assume this relaxation-changing-instructions behavior should
> not break erratum 835769? Erratum 843419 must have adrp as its first
> insn, whereas 835769 begins with LD/ST.
>
> Otherwise, LGTM

This is OK with me (with Han's suggested changes). It doesn't look
like this is related to any outstanding PR, but if it is, please add
the PR number to the ChangeLog.

-cary

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [COMMITTED][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization
  2017-06-15  0:07               ` Cary Coutant
@ 2017-06-15 10:27                 ` Jiong Wang
  2017-06-15 15:25                   ` Han Shen via binutils
  0 siblings, 1 reply; 13+ messages in thread
From: Jiong Wang @ 2017-06-15 10:27 UTC (permalink / raw)
  To: Cary Coutant, Han Shen; +Cc: Andrew Pinski, binutils, Luis Lozano

[-- Attachment #1: Type: text/plain, Size: 1221 bytes --]

>> Yup, let's change "return  false;" -> "return true;" (and add proper comments).

Thanks, done.
   
>> Also, I assume this relaxation-changing-instructions behavior should
>> not break erratum 835769?

I think it won't break 835769.

For 835769, it's LD/ST followed by multiply accumulation, this won't exist
in TLS sequence if it's not allowed to be scheduled like in GCC.  In LLVM,
I am not sure.
     
Given 835769 doesn't have the optimized shortcut like 843419, and there is no
assertion inside it, so even if TLS sequence be scheduled that qualifies the
erratum 835769 and the ld instruction relaxed (this do happen in some TLS
model), the only consequence is we install unnecessary branch-to-stub.
       
> This is OK with me (with Han's suggested changes). It doesn't look
> like this is related to any outstanding PR, but if it is, please add
> the PR number to the ChangeLog.

Thanks.

I committed attached patch with updated comments and minor tweak.

gold/
2017-06-15  Jiong Wang  <jiong.wang@arm.com>

         * aarch64.cc (Insn_utilities::is_mrs_tpidr_el0): New method.
         (AArch64_relobj<size, big_endian>::try_fix_erratum_843419_optimized):
         Return ture for some TLS relaxed sequences.


[-- Attachment #2: new.patch --]
[-- Type: text/x-patch, Size: 2106 bytes --]

diff --git a/gold/aarch64.cc b/gold/aarch64.cc
index 2470986..7309ded 100644
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -110,6 +110,10 @@ public:
   is_adrp(const Insntype insn)
   { return (insn & 0x9F000000) == 0x90000000; }
 
+  static bool
+  is_mrs_tpidr_el0(const Insntype insn)
+  { return (insn & 0xFFFFFFE0) == 0xd53bd040; }
+
   static unsigned int
   aarch64_rm(const Insntype insn)
   { return aarch64_bits(insn, 16, 5); }
@@ -2010,9 +2014,32 @@ AArch64_relobj<size, big_endian>::try_fix_erratum_843419_optimized(
   E843419_stub<size, big_endian>* e843419_stub =
     reinterpret_cast<E843419_stub<size, big_endian>*>(stub);
   AArch64_address pc = pview.address + e843419_stub->adrp_sh_offset();
-  Insntype* adrp_view = reinterpret_cast<Insntype*>(
-    pview.view + e843419_stub->adrp_sh_offset());
+  unsigned int adrp_offset = e843419_stub->adrp_sh_offset ();
+  Insntype* adrp_view = reinterpret_cast<Insntype*>(pview.view + adrp_offset);
   Insntype adrp_insn = adrp_view[0];
+
+  // If the instruction at adrp_sh_offset is "mrs R, tpidr_el0", it may come
+  // from IE -> LE relaxation etc.  This is a side-effect of TLS relaxation that
+  // ADRP has been turned into MRS, there is no erratum risk anymore.
+  // Therefore, we return true to avoid doing unnecessary branch-to-stub.
+  if (Insn_utilities::is_mrs_tpidr_el0(adrp_insn))
+    return true;
+
+  // If the instruction at adrp_sh_offset is not ADRP and the instruction before
+  // it is "mrs R, tpidr_el0", it may come from LD -> LE relaxation etc.
+  // Like the above case, there is no erratum risk any more, we can safely
+  // return true.
+  if (!Insn_utilities::is_adrp(adrp_insn) && adrp_offset)
+    {
+      Insntype* prev_view
+	= reinterpret_cast<Insntype*>(pview.view + adrp_offset - 4);
+      Insntype prev_insn = prev_view[0];
+
+      if (Insn_utilities::is_mrs_tpidr_el0(prev_insn))
+	return true;
+    }
+
+  /* If we reach here, the first instruction must be ADRP.  */
   gold_assert(Insn_utilities::is_adrp(adrp_insn));
   // Get adrp 33-bit signed imm value.
   int64_t adrp_imm = Insn_utilities::

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [COMMITTED][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization
  2017-06-15 10:27                 ` [COMMITTED][gold, " Jiong Wang
@ 2017-06-15 15:25                   ` Han Shen via binutils
  0 siblings, 0 replies; 13+ messages in thread
From: Han Shen via binutils @ 2017-06-15 15:25 UTC (permalink / raw)
  To: Jiong Wang; +Cc: Cary Coutant, Andrew Pinski, binutils, Luis Lozano

On Thu, Jun 15, 2017 at 3:27 AM, Jiong Wang <jiong.wang@foss.arm.com> wrote:
>>> Yup, let's change "return  false;" -> "return true;" (and add proper
>>> comments).
>
>
> Thanks, done.
>
>>>
>>> Also, I assume this relaxation-changing-instructions behavior should
>>> not break erratum 835769?
>
>
> I think it won't break 835769.
>
> For 835769, it's LD/ST followed by multiply accumulation, this won't exist
> in TLS sequence if it's not allowed to be scheduled like in GCC.  In LLVM,
> I am not sure.
>     Given 835769 doesn't have the optimized shortcut like 843419, and there
> is no
> assertion inside it, so even if TLS sequence be scheduled that qualifies the
> erratum 835769 and the ld instruction relaxed (this do happen in some TLS
> model), the only consequence is we install unnecessary branch-to-stub.

Thanks for the explanation.

>
>>
>> This is OK with me (with Han's suggested changes). It doesn't look
>> like this is related to any outstanding PR, but if it is, please add
>> the PR number to the ChangeLog.
>
>
> Thanks.
>
> I committed attached patch with updated comments and minor tweak.

Thanks.
>
> gold/
> 2017-06-15  Jiong Wang  <jiong.wang@arm.com>
>
>         * aarch64.cc (Insn_utilities::is_mrs_tpidr_el0): New method.
>         (AArch64_relobj<size,
> big_endian>::try_fix_erratum_843419_optimized):
>         Return ture for some TLS relaxed sequences.
>

Han Shen

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-06-15 15:25 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-16 21:44 [gold][aarch64]: Erratum 843419 optimized fix Han Shen
2015-07-20 17:21 ` Cary Coutant
2015-07-20 22:25   ` Han Shen
2016-07-16  5:43 ` Andrew Pinski
2016-07-18 16:46   ` Han Shen
2017-06-07  9:26     ` Jiong Wang
2017-06-08 10:35       ` [PATCH][gold, aarch64] Skip ERRATUM 843419 fix if the sequences have been relaxed by TLS optimization Jiong Wang
2017-06-13 22:17         ` Han Shen via binutils
2017-06-14  8:42           ` Jiong Wang
2017-06-14 18:06             ` Han Shen via binutils
2017-06-15  0:07               ` Cary Coutant
2017-06-15 10:27                 ` [COMMITTED][gold, " Jiong Wang
2017-06-15 15:25                   ` Han Shen via binutils

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).