public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1"
@ 2013-12-04 18:29 Philipp Tomsich
  2013-12-04 18:30 ` [PATCH/AArch64 2/3] * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63 Philipp Tomsich
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Philipp Tomsich @ 2013-12-04 18:29 UTC (permalink / raw)
  To: binutils, binutils; +Cc: Philipp Tomsich

This adds support for the AppliedMicro X-Gene 1 processor to the assembler.
There's a dependency against this change for the GCC patches that I'm about to send out...

Best,
Philipp Tomsich.

    * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1"

---
 gas/config/tc-aarch64.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 14ffdad..4312bed 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -7123,6 +7123,7 @@ static const struct aarch64_cpu_option_table aarch64_cpus[] = {
   {"all", AARCH64_ANY, NULL},
   {"cortex-a53",	AARCH64_ARCH_V8, "Cortex-A53"},
   {"cortex-a57",	AARCH64_ARCH_V8, "Cortex-A57"},
+  {"xgene-1", AARCH64_ARCH_V8, "APM X-Gene 1"},
   {"generic", AARCH64_ARCH_V8, NULL},
 
   /* These two are example CPUs supported in GCC, once we have real
-- 
1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare.
  2013-12-04 18:29 [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1" Philipp Tomsich
  2013-12-04 18:30 ` [PATCH/AArch64 2/3] * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63 Philipp Tomsich
@ 2013-12-04 18:30 ` Philipp Tomsich
  2013-12-10 14:17   ` Marcus Shawcroft
  2013-12-10 14:11 ` [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1" Marcus Shawcroft
  2 siblings, 1 reply; 20+ messages in thread
From: Philipp Tomsich @ 2013-12-04 18:30 UTC (permalink / raw)
  To: binutils, binutils; +Cc: Philipp Tomsich

In v27 of the ARMv8 ISA various compare instructions were moved to new 
"Vector Compare" and "Scalar Compare" sections. However, some instructions
were lost in this move. 

GCC still generates some of these instructions at -O3 when compiling SPECfp.
This patch adds these aliases back to the assembler for consistency.

Tested on aarch64-apm-linux-gnu.

Thanks,
Philipp.


    * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost"
      instruction aliases for scalar compare and vector compare.

---
 opcodes/aarch64-tbl.h |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 7c77768..5718821 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -2112,6 +2112,24 @@ struct aarch64_opcode aarch64_opcode_table[] =
   {"blt", 0x5400000b, 0xff00001f, condbranch, 0, CORE, OP1 (ADDR_PCREL19), QL_PCREL_NIL, F_ALIAS | F_PSEUDO},
   {"bgt", 0x5400000c, 0xff00001f, condbranch, 0, CORE, OP1 (ADDR_PCREL19), QL_PCREL_NIL, F_ALIAS | F_PSEUDO},
   {"ble", 0x5400000d, 0xff00001f, condbranch, 0, CORE, OP1 (ADDR_PCREL19), QL_PCREL_NIL, F_ALIAS | F_PSEUDO},
+  /* Aliases for "Vector Compare" instructions. */
+  {"fcmlt", 0x2e20e400, 0xbfa0fc00, asimdsame, 0, SIMD, OP3 (Vd, Vm, Vn), QL_V3SAMESD, F_ALIAS | F_PSEUDO | F_SIZEQ},
+  {"fcmle", 0x2ea0e400, 0xbfa0fc00, asimdsame, 0, SIMD, OP3 (Vd, Vm, Vn), QL_V3SAMESD, F_ALIAS | F_PSEUDO | F_SIZEQ},
+  {"fcmlt", 0x7e20e400, 0xffa0fc00, asisdsame, 0, SIMD, OP3 (Sd, Sm, Sn), QL_FP3, F_ALIAS | F_PSEUDO | F_SSIZE},
+  {"fcmle", 0x7ea0e400, 0xffa0fc00, asisdsame, 0, SIMD, OP3 (Sd, Sm, Sn), QL_FP3, F_ALIAS | F_PSEUDO | F_SSIZE},
+  {"facle", 0x2ea0ec00, 0xbfa0fc00, asimdsame, 0, SIMD, OP3 (Vd, Vm, Vn), QL_V3SAMESD, F_ALIAS | F_PSEUDO | F_SIZEQ},
+  {"faclt", 0x2e20ec00, 0xbfa0fc00, asimdsame, 0, SIMD, OP3 (Vd, Vm, Vn), QL_V3SAMESD, F_ALIAS | F_PSEUDO | F_SIZEQ},
+  {"facle", 0x7ea0ec00, 0xffa0fc00, asisdsame, 0, SIMD, OP3 (Sd, Sm, Sn), QL_FP3, F_ALIAS | F_PSEUDO | F_SSIZE},
+  {"faclt", 0x7e20ec00, 0xffa0fc00, asisdsame, 0, SIMD, OP3 (Sd, Sm, Sn), QL_FP3, F_ALIAS | F_PSEUDO | F_SSIZE},
+  /* Aliases for "Scalar Compare" instructions. */
+  {"cmls", 0x2e203c00, 0xbf20fc00, asimdsame, 0, SIMD, OP3 (Vd, Vm, Vn), QL_V3SAME, F_ALIAS | F_PSEUDO | F_SIZEQ},
+  {"cmls", 0x7ee03c00, 0xffe0fc00, asisdsame, 0, SIMD, OP3 (Sd, Sm, Sn), QL_S_3SAMED, F_ALIAS | F_PSEUDO | F_SSIZE},
+  {"cmle", 0xe203400, 0xbf20fc00, asimdsame, 0, SIMD, OP3 (Vd, Vm, Vn), QL_V3SAME, F_ALIAS | F_PSEUDO | F_SIZEQ},
+  {"cmle", 0x5ee03400, 0xffe0fc00, asisdsame, 0, SIMD, OP3 (Sd, Sm, Sn), QL_S_3SAMED, F_ALIAS | F_PSEUDO | F_SSIZE},
+  {"cmlo", 0x2e203400, 0xbf20fc00, asimdsame, 0, SIMD, OP3 (Vd, Vm, Vn), QL_V3SAME, F_ALIAS | F_PSEUDO | F_SIZEQ},
+  {"cmlo", 0x7ee03400, 0xffe0fc00, asisdsame, 0, SIMD, OP3 (Sd, Sm, Sn), QL_S_3SAMED, F_ALIAS | F_PSEUDO | F_SSIZE},
+  {"cmlt", 0xe203c00, 0xbf20fc00, asimdsame, 0, SIMD, OP3 (Vd, Vm, Vn), QL_V3SAME, F_ALIAS | F_PSEUDO | F_SIZEQ},
+  {"cmlt", 0x5ee03c00, 0xffe0fc00, asisdsame, 0, SIMD, OP3 (Sd, Sm, Sn), QL_S_3SAMED, F_ALIAS | F_PSEUDO | F_SSIZE},
 
   {0, 0, 0, 0, 0, 0, {}, {}, 0},
 };
-- 
1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH/AArch64 2/3] * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63.
  2013-12-04 18:29 [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1" Philipp Tomsich
@ 2013-12-04 18:30 ` Philipp Tomsich
  2013-12-06  1:52   ` Andrew Pinski
  2013-12-04 18:30 ` [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare Philipp Tomsich
  2013-12-10 14:11 ` [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1" Marcus Shawcroft
  2 siblings, 1 reply; 20+ messages in thread
From: Philipp Tomsich @ 2013-12-04 18:30 UTC (permalink / raw)
  To: binutils, binutils; +Cc: Philipp Tomsich

While working on various benchmarks, we've seen improvements from aligning
function entries and hot loops to a cache-line boundary. For this reason we
need to increase the permissible maximum code alignment to 64 bytes.

Thanks,
Philipp Tomsich.

    * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63.

---
 gas/config/tc-aarch64.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gas/config/tc-aarch64.h b/gas/config/tc-aarch64.h
index 74a81d6..ee60525 100644
--- a/gas/config/tc-aarch64.h
+++ b/gas/config/tc-aarch64.h
@@ -116,8 +116,8 @@ void aarch64_copy_symbol_attributes (symbolS *, symbolS *);
 
 #define TC_CONS_FIX_NEW cons_fix_new_aarch64
 
-/* Max code alignment is 32 bytes */
-#define MAX_MEM_FOR_RS_ALIGN_CODE 31
+/* Max code alignment is 64 bytes */
+#define MAX_MEM_FOR_RS_ALIGN_CODE 63
 
 /* For frags in code sections we need to record whether they contain
    code or data.  */
-- 
1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 2/3] * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63.
  2013-12-04 18:30 ` [PATCH/AArch64 2/3] * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63 Philipp Tomsich
@ 2013-12-06  1:52   ` Andrew Pinski
  2013-12-06  1:53     ` Andrew Pinski
  2014-01-02 12:48     ` [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad Philipp Tomsich
  0 siblings, 2 replies; 20+ messages in thread
From: Andrew Pinski @ 2013-12-06  1:52 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: binutils

On Wed, Dec 4, 2013 at 10:29 AM, Philipp Tomsich
<philipp.tomsich@theobroma-systems.com> wrote:
> While working on various benchmarks, we've seen improvements from aligning
> function entries and hot loops to a cache-line boundary. For this reason we
> need to increase the permissible maximum code alignment to 64 bytes.

I think this should be increased to even 128 bytes as Cavium's thunder
has 128byte cache line sizes.
Also I think you should at least update this code to be closer to what
the arm code does as in
https://sourceware.org/ml/binutils/2009-06/msg00289.html .

Thanks,
Andrew

>
> Thanks,
> Philipp Tomsich.
>
>     * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63.
>
> ---
>  gas/config/tc-aarch64.h |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gas/config/tc-aarch64.h b/gas/config/tc-aarch64.h
> index 74a81d6..ee60525 100644
> --- a/gas/config/tc-aarch64.h
> +++ b/gas/config/tc-aarch64.h
> @@ -116,8 +116,8 @@ void aarch64_copy_symbol_attributes (symbolS *, symbolS *);
>
>  #define TC_CONS_FIX_NEW cons_fix_new_aarch64
>
> -/* Max code alignment is 32 bytes */
> -#define MAX_MEM_FOR_RS_ALIGN_CODE 31
> +/* Max code alignment is 64 bytes */
> +#define MAX_MEM_FOR_RS_ALIGN_CODE 63
>
>  /* For frags in code sections we need to record whether they contain
>     code or data.  */
> --
> 1.7.2.5
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 2/3] * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63.
  2013-12-06  1:52   ` Andrew Pinski
@ 2013-12-06  1:53     ` Andrew Pinski
  2014-01-02 12:48     ` [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad Philipp Tomsich
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Pinski @ 2013-12-06  1:53 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: binutils

On Thu, Dec 5, 2013 at 5:52 PM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Wed, Dec 4, 2013 at 10:29 AM, Philipp Tomsich
> <philipp.tomsich@theobroma-systems.com> wrote:
>> While working on various benchmarks, we've seen improvements from aligning
>> function entries and hot loops to a cache-line boundary. For this reason we
>> need to increase the permissible maximum code alignment to 64 bytes.
>
> I think this should be increased to even 128 bytes as Cavium's thunder
> has 128byte cache line sizes.
> Also I think you should at least update this code to be closer to what
> the arm code does as in
> https://sourceware.org/ml/binutils/2009-06/msg00289.html .

Also it might be better to remove this limit if possible.  See the
thread there on why there was a limit for ARM.

Thanks,
Andrew Pinski

>
> Thanks,
> Andrew
>
>>
>> Thanks,
>> Philipp Tomsich.
>>
>>     * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63.
>>
>> ---
>>  gas/config/tc-aarch64.h |    4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/gas/config/tc-aarch64.h b/gas/config/tc-aarch64.h
>> index 74a81d6..ee60525 100644
>> --- a/gas/config/tc-aarch64.h
>> +++ b/gas/config/tc-aarch64.h
>> @@ -116,8 +116,8 @@ void aarch64_copy_symbol_attributes (symbolS *, symbolS *);
>>
>>  #define TC_CONS_FIX_NEW cons_fix_new_aarch64
>>
>> -/* Max code alignment is 32 bytes */
>> -#define MAX_MEM_FOR_RS_ALIGN_CODE 31
>> +/* Max code alignment is 64 bytes */
>> +#define MAX_MEM_FOR_RS_ALIGN_CODE 63
>>
>>  /* For frags in code sections we need to record whether they contain
>>     code or data.  */
>> --
>> 1.7.2.5
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1"
  2013-12-04 18:29 [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1" Philipp Tomsich
  2013-12-04 18:30 ` [PATCH/AArch64 2/3] * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63 Philipp Tomsich
  2013-12-04 18:30 ` [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare Philipp Tomsich
@ 2013-12-10 14:11 ` Marcus Shawcroft
  2014-01-06 15:54   ` Marcus Shawcroft
  2 siblings, 1 reply; 20+ messages in thread
From: Marcus Shawcroft @ 2013-12-10 14:11 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: binutils

Hi Philipp,

On 4 December 2013 18:29, Philipp Tomsich
<philipp.tomsich@theobroma-systems.com> wrote:


>     * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1"
>
> ---
>  gas/config/tc-aarch64.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>

This patch looks fine, however before we can apply this patch we need
assignment papers in place with the fsf, my understanding is that you
are working to get the paperwork in place.

Thanks
/Marcus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare.
  2013-12-04 18:30 ` [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare Philipp Tomsich
@ 2013-12-10 14:17   ` Marcus Shawcroft
  2013-12-11  3:50     ` Dr. Philipp Tomsich
  0 siblings, 1 reply; 20+ messages in thread
From: Marcus Shawcroft @ 2013-12-10 14:17 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: binutils

On 4 December 2013 18:29, Philipp Tomsich
<philipp.tomsich@theobroma-systems.com> wrote:
> In v27 of the ARMv8 ISA various compare instructions were moved to new
> "Vector Compare" and "Scalar Compare" sections. However, some instructions
> were lost in this move.
>
> GCC still generates some of these instructions at -O3 when compiling SPECfp.
> This patch adds these aliases back to the assembler for consistency.

Hi, These mnemonics are no longer defined as architectural
instructions or aliases in the ISA.  There is work ongoing to define a
set of programmer convenience aliases that go beyond the
architecturally defined aliases and it is likely that these mnemonics
will be defined by that extension.

In the meantime binutils 2.23 and now the recently released 2.24 do
not support these mnemonics therefore gcc must not generate them and
will need fixing to emit the appropriate architecture instruction
instead.

Thanks
/Marcus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare.
  2013-12-10 14:17   ` Marcus Shawcroft
@ 2013-12-11  3:50     ` Dr. Philipp Tomsich
  2013-12-11  9:38       ` James Greenhalgh
  0 siblings, 1 reply; 20+ messages in thread
From: Dr. Philipp Tomsich @ 2013-12-11  3:50 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: binutils

Marcus,

> In the meantime binutils 2.23 and now the recently released 2.24 do
> not support these mnemonics therefore gcc must not generate them and
> will need fixing to emit the appropriate architecture instruction
> instead.

GCC 4.8.x and (the current top-of-trunk for) 4.9 will generate at least some of
these instructions. Here’s one relevant snippet from aarch64-simd.md:
> ;; fcm(eq|ge|gt|le|lt)
> 
> (define_insn "aarch64_cm<optab><mode>"
>   [(set (match_operand:<V_cmp_result> 0 "register_operand" "=w,w")
>         (neg:<V_cmp_result>
>           (COMPARISONS:<V_cmp_result>
>             (match_operand:VALLF 1 "register_operand" "w,w")
>             (match_operand:VALLF 2 "aarch64_simd_reg_or_zero" "w,YDz")
>           )))]
>   "TARGET_SIMD"
>   "@
>   fcm<n_optab>\t%<v>0<Vmtype>, %<v><cmp_1><Vmtype>, %<v><cmp_2><Vmtype>
>   fcm<optab>\t%<v>0<Vmtype>, %<v>1<Vmtype>, 0"
>   [(set_attr "type" "neon_fp_compare_<Vetype><q>")]
> )

Given that this is the behaviour for all release 4.8.x versions of the AArch64 
backend, I do believe that extending the assembler at least to the extent 
necessary to support the existing GCC versions would be advisable.

As this would require only the addition of all variants of the ‘fcm’-instruction,
I would be happy to revise the patch to limit the change to these instructions.

Thanks,
Philipp.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare.
  2013-12-11  3:50     ` Dr. Philipp Tomsich
@ 2013-12-11  9:38       ` James Greenhalgh
  2013-12-11 14:38         ` Dr. Philipp Tomsich
  0 siblings, 1 reply; 20+ messages in thread
From: James Greenhalgh @ 2013-12-11  9:38 UTC (permalink / raw)
  To: Dr. Philipp Tomsich; +Cc: Marcus Shawcroft, binutils

On Wed, Dec 11, 2013 at 03:50:01AM +0000, Dr. Philipp Tomsich wrote:
> GCC 4.8.x and (the current top-of-trunk for) 4.9 will generate at least some of
> these instructions. Here’s one relevant snippet from aarch64-simd.md:
> > ;; fcm(eq|ge|gt|le|lt)
> > 
> > (define_insn "aarch64_cm<optab><mode>"
> >   [(set (match_operand:<V_cmp_result> 0 "register_operand" "=w,w")
> >         (neg:<V_cmp_result>
> >           (COMPARISONS:<V_cmp_result>
> >             (match_operand:VALLF 1 "register_operand" "w,w")
> >             (match_operand:VALLF 2 "aarch64_simd_reg_or_zero" "w,YDz")
> >           )))]
> >   "TARGET_SIMD"
> >   "@
> >   fcm<n_optab>\t%<v>0<Vmtype>, %<v><cmp_1><Vmtype>, %<v><cmp_2><Vmtype>
> >   fcm<optab>\t%<v>0<Vmtype>, %<v>1<Vmtype>, 0"
> >   [(set_attr "type" "neon_fp_compare_<Vetype><q>")]
> > )
> 

Hi Philipp,

This pattern, through the <n_optab> code attribute, will only
generate instructions from the following set:

lt
  fcmgt %0, %2, %1
  fcmlt %0, %1, 0

le
  fcmge %0, %2, %1
  fcmle %0, %1, 0

eq
  fcmeq %0, %1, %2
  fcmeq %0, %1, 0

ge
  fcmge %0, %1, %1
  fcmge %0, %1, 0
  
gt
  fcmgt %0, %1, %2
  fcmgt %0, %1, 0

Each of these instructions appear in my copy of the ARMv8 ARMARM.

In fact, we have this explicit comment in aarch64/iterators.md calling
out the absence of the FCMLE and FCMLT 3 register variants and
describing the workaround.

;; For comparison operators we use the FCM* and CM* instructions.
;; As there are no CMLE or CMLT instructions which act on 3 vector
;; operands, we must use CMGE or CMGT and swap the order of the
;; source operands.

If you are seeing other instructions generated it is either a bug,
or some other pattern.

Thanks,
James

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare.
  2013-12-11  9:38       ` James Greenhalgh
@ 2013-12-11 14:38         ` Dr. Philipp Tomsich
  2014-01-07 11:08           ` Yufeng Zhang
  0 siblings, 1 reply; 20+ messages in thread
From: Dr. Philipp Tomsich @ 2013-12-11 14:38 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: Marcus Shawcroft, binutils

James,

thanks for the clarification and pointing out the change in iterators.md, which only 
applies to 4.9 (I didn’t check the sources carefully enough last night), but seems to
have never made it into the 4.8.2 tree.

On 11 Dec 2013, at 10:38 , James Greenhalgh <james.greenhalgh@arm.com> wrote:

> In fact, we have this explicit comment in aarch64/iterators.md calling
> out the absence of the FCMLE and FCMLT 3 register variants and
> describing the workaround.
> 
> ;; For comparison operators we use the FCM* and CM* instructions.
> ;; As there are no CMLE or CMLT instructions which act on 3 vector
> ;; operands, we must use CMGE or CMGT and swap the order of the
> ;; source operands.
> 
> If you are seeing other instructions generated it is either a bug,
> or some other pattern.

I just rechecked against a fresh checkout of 4.8.2 and still don't see the n_optab 
iterator there, so it appears as if that change didn’t make it onto the 4.8 tree.

The sources in 4.8.2 still show:
>   fcm<cmp>\t%<v>0<Vmtype>, %<v>1<Vmtype>, %<v>2<Vmtype>
>   fcm<cmp>\t%<v>0<Vmtype>, %<v>1<Vmtype>, 0”
with the iterator
> (define_int_attr cmp [(UNSPEC_CMGE "ge") (UNSPEC_CMGT "gt")
>                       (UNSPEC_CMLE "le") (UNSPEC_CMLT "lt")
>                       (UNSPEC_CMEQ "eq")
>                       (UNSPEC_CMHS "hs") (UNSPEC_CMHI "hi")
>                       (UNSPEC_CMTST "tst")])

Note, that this breaks 481.wrf from SPEC at -O3.

Best,
Philipp.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad.
  2013-12-06  1:52   ` Andrew Pinski
  2013-12-06  1:53     ` Andrew Pinski
@ 2014-01-02 12:48     ` Philipp Tomsich
  2014-01-07 12:49       ` Marcus Shawcroft
  1 sibling, 1 reply; 20+ messages in thread
From: Philipp Tomsich @ 2014-01-02 12:48 UTC (permalink / raw)
  To: binutils; +Cc: Andrew Pinski, Philipp Tomsich

* config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Define to 7.
* config/tc-aarch64.c (aarch64_handle_align): Rewrite to handle large
  alignments with a constant fragment size of MAX_MEM_FOR_RS_ALIGN_CODE.
---
 gas/config/tc-aarch64.c |   81 ++++++++++++++++++-----------------------------
 gas/config/tc-aarch64.h |    7 ++--
 2 files changed, 35 insertions(+), 53 deletions(-)

diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 4312bed..7a5ad5f 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -5754,10 +5754,28 @@ md_section_align (segT segment ATTRIBUTE_UNUSED, valueT size)
 }
 
 /* This is called from HANDLE_ALIGN in write.c.	 Fill in the contents
-   of an rs_align_code fragment.  */
+   of an rs_align_code fragment.  
+
+   Here we fill the frag with the appropriate info for padding the output
+   stream. The resulting frag will consist of a fixed (fr_fix) and of a 
+   repeating (fr_var) part. 
+
+   The fixed content is always emitted before the repeating content and 
+   these two parts are used as follows in constructing the output:
+   - the fixed part will be used to align to a valid instruction word 
+     boundary, in case that we start at a misaligned address; as no 
+     executable instruction can live at the misaligned location, we 
+     simply fill with zeros;
+   - the variable part will be used to cover the remaining padding and
+     we fill using the AArch64 NOP instruction.
+
+   Note that the size of a RS_ALIGN_CODE fragment is always 7 to provide
+   enough storage space for up to 3 bytes for padding the back to a valid
+   instruction alignment and exactly 4 bytes to store the NOP pattern.
+ */
 
 void
-aarch64_handle_align (fragS * fragP)
+aarch64_handle_align (fragS * frag)
 {
   /* NOP = d503201f */
   /* AArch64 instructions are always little-endian.  */
@@ -5765,69 +5783,32 @@ aarch64_handle_align (fragS * fragP)
 
   int bytes, fix, noop_size;
   char *p;
-  const char *noop;
 
-  if (fragP->fr_type != rs_align_code)
+  if (frag->fr_type != rs_align_code)
     return;
 
-  bytes = fragP->fr_next->fr_address - fragP->fr_address - fragP->fr_fix;
-  p = fragP->fr_literal + fragP->fr_fix;
-  fix = 0;
-
-  if (bytes > MAX_MEM_FOR_RS_ALIGN_CODE)
-    bytes &= MAX_MEM_FOR_RS_ALIGN_CODE;
+  bytes = frag->fr_next->fr_address - frag->fr_address - frag->fr_fix;
+  p = frag->fr_literal + frag->fr_fix;
 
 #ifdef OBJ_ELF
-  gas_assert (fragP->tc_frag_data.recorded);
+  gas_assert (frag->tc_frag_data.recorded);
 #endif
 
-  noop = aarch64_noop;
-  noop_size = sizeof (aarch64_noop);
-  fragP->fr_var = noop_size;
+  noop_size = sizeof(aarch64_noop);
 
-  if (bytes & (noop_size - 1))
+  fix = bytes & (noop_size - 1);
+  if (fix)
     {
-      fix = bytes & (noop_size - 1);
 #ifdef OBJ_ELF
-      insert_data_mapping_symbol (MAP_INSN, fragP->fr_fix, fragP, fix);
+      insert_data_mapping_symbol (MAP_INSN, frag->fr_fix, frag, fix);
 #endif
       memset (p, 0, fix);
       p += fix;
-      bytes -= fix;
-    }
-
-  while (bytes >= noop_size)
-    {
-      memcpy (p, noop, noop_size);
-      p += noop_size;
-      bytes -= noop_size;
-      fix += noop_size;
+      frag->fr_fix += fix;
     }
 
-  fragP->fr_fix += fix;
-}
-
-/* Called from md_do_align.  Used to create an alignment
-   frag in a code section.  */
-
-void
-aarch64_frag_align_code (int n, int max)
-{
-  char *p;
-
-  /* We assume that there will never be a requirement
-     to support alignments greater than x bytes.  */
-  if (max > MAX_MEM_FOR_RS_ALIGN_CODE)
-    as_fatal (_
-	      ("alignments greater than %d bytes not supported in .text sections"),
-	      MAX_MEM_FOR_RS_ALIGN_CODE + 1);
-
-  p = frag_var (rs_align_code,
-		MAX_MEM_FOR_RS_ALIGN_CODE,
-		1,
-		(relax_substateT) max,
-		(symbolS *) NULL, (offsetT) n, (char *) NULL);
-  *p = 0;
+  memcpy (p, aarch64_noop, noop_size);
+  frag->fr_var = noop_size;
 }
 
 /* Perform target specific initialisation of a frag.
diff --git a/gas/config/tc-aarch64.h b/gas/config/tc-aarch64.h
index 74a81d6..77e71d2 100644
--- a/gas/config/tc-aarch64.h
+++ b/gas/config/tc-aarch64.h
@@ -116,8 +116,9 @@ void aarch64_copy_symbol_attributes (symbolS *, symbolS *);
 
 #define TC_CONS_FIX_NEW cons_fix_new_aarch64
 
-/* Max code alignment is 32 bytes */
-#define MAX_MEM_FOR_RS_ALIGN_CODE 31
+/* Max space for a rs_align_code fragment is 3 unaligned bytes (fr_fix)
+   plus 4 bytes to contain the repeating NOP (fr_var) */
+#define MAX_MEM_FOR_RS_ALIGN_CODE 7
 
 /* For frags in code sections we need to record whether they contain
    code or data.  */
@@ -141,7 +142,7 @@ struct aarch64_frag_type
 #define md_do_align(N, FILL, LEN, MAX, LABEL)					\
   if (FILL == NULL && (N) != 0 && ! need_pass_2 && subseg_text_p (now_seg))	\
     {										\
-      aarch64_frag_align_code (N, MAX);						\
+      frag_align_code (N, MAX);						\
       goto LABEL;								\
     }
 
-- 
1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1"
  2013-12-10 14:11 ` [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1" Marcus Shawcroft
@ 2014-01-06 15:54   ` Marcus Shawcroft
  2014-01-07 10:36     ` nick clifton
  0 siblings, 1 reply; 20+ messages in thread
From: Marcus Shawcroft @ 2014-01-06 15:54 UTC (permalink / raw)
  To: nickc; +Cc: binutils, Philipp Tomsich

On 10 December 2013 14:11, Marcus Shawcroft <marcus.shawcroft@gmail.com> wrote:

> This patch looks fine, however before we can apply this patch we need
> assignment papers in place with the fsf, my understanding is that you
> are working to get the paperwork in place.

Hi Nick,

I believe Philipp now has papers in place for these binutils
contributions but I can't check if the relevant database/file has been
updated, are you able to check?

Thanks
/Marcus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1"
  2014-01-06 15:54   ` Marcus Shawcroft
@ 2014-01-07 10:36     ` nick clifton
  2014-01-07 12:29       ` Marcus Shawcroft
  0 siblings, 1 reply; 20+ messages in thread
From: nick clifton @ 2014-01-07 10:36 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: binutils, Philipp Tomsich

Hi Marcus,

>> This patch looks fine, however before we can apply this patch we need
>> assignment papers in place with the fsf, my understanding is that you

> I believe Philipp now has papers in place for these binutils
> contributions but I can't check if the relevant database/file has been
> updated, are you able to check?

Yes.  The latest copyright list contains this entry:

   BINUTILS	Dr. Philipp Tomsich	Austria, 1978	2013-12-20
   Assigns Past and Future Changes (temporary copy)
   philipp.tomsich@theobroma-systems.com	

So the patch can be applied now.  Do you want me to do that ?

Cheers
   Nick





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare.
  2013-12-11 14:38         ` Dr. Philipp Tomsich
@ 2014-01-07 11:08           ` Yufeng Zhang
  0 siblings, 0 replies; 20+ messages in thread
From: Yufeng Zhang @ 2014-01-07 11:08 UTC (permalink / raw)
  To: Dr. Philipp Tomsich; +Cc: James Greenhalgh, Marcus Shawcroft, binutils

Hi Philipp,

On 12/11/13 14:38, Dr. Philipp Tomsich wrote:
> James,
>
> thanks for the clarification and pointing out the change in iterators.md, which only
> applies to 4.9 (I didn’t check the sources carefully enough last night), but seems to
> have never made it into the 4.8.2 tree.
>
> On 11 Dec 2013, at 10:38 , James Greenhalgh<james.greenhalgh@arm.com>  wrote:
>
>> In fact, we have this explicit comment in aarch64/iterators.md calling
>> out the absence of the FCMLE and FCMLT 3 register variants and
>> describing the workaround.
>>
>> ;; For comparison operators we use the FCM* and CM* instructions.
>> ;; As there are no CMLE or CMLT instructions which act on 3 vector
>> ;; operands, we must use CMGE or CMGT and swap the order of the
>> ;; source operands.
>>
>> If you are seeing other instructions generated it is either a bug,
>> or some other pattern.
>
> I just rechecked against a fresh checkout of 4.8.2 and still don't see the n_optab
> iterator there, so it appears as if that change didn’t make it onto the 4.8 tree.

In case you haven't been aware, James has backported the change to 
gcc-4_8-branch@206133 before Christmas:

commit 0ae4ef5a14e5a50fa125142726bc6aa5fc0a05a8
Author: jgreenhalgh <jgreenhalgh@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Thu Dec 19 20:01:26 2013 +0000

     [AArch64 4.8-branch] Backport: Fix <F>CM instruction generation.


Thanks,
Yufeng

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1"
  2014-01-07 10:36     ` nick clifton
@ 2014-01-07 12:29       ` Marcus Shawcroft
  0 siblings, 0 replies; 20+ messages in thread
From: Marcus Shawcroft @ 2014-01-07 12:29 UTC (permalink / raw)
  To: nick clifton; +Cc: binutils, Philipp Tomsich

On 7 January 2014 10:33, nick clifton <nickc@redhat.com> wrote:

> Yes.  The latest copyright list contains this entry:

Thanks for checking.

> So the patch can be applied now.  Do you want me to do that ?

I've committed it.

Cheers
/Marcus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad.
  2014-01-02 12:48     ` [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad Philipp Tomsich
@ 2014-01-07 12:49       ` Marcus Shawcroft
  2014-10-28  8:55         ` Andrew Pinski
  0 siblings, 1 reply; 20+ messages in thread
From: Marcus Shawcroft @ 2014-01-07 12:49 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: binutils, Andrew Pinski

Hi,

A few minor issues....

On 2 January 2014 12:48, Philipp Tomsich
<philipp.tomsich@theobroma-systems.com> wrote:

> +   Here we fill the frag with the appropriate info for padding the output
> +   stream. The resulting frag will consist of a fixed (fr_fix) and of a

GNU style places two spaces after a period.  Same comment applies
through the patch.

> +   repeating (fr_var) part.
> +
> +   The fixed content is always emitted before the repeating content and
> +   these two parts are used as follows in constructing the output:
> +   - the fixed part will be used to align to a valid instruction word
> +     boundary, in case that we start at a misaligned address; as no
> +     executable instruction can live at the misaligned location, we
> +     simply fill with zeros;
> +   - the variable part will be used to cover the remaining padding and
> +     we fill using the AArch64 NOP instruction.
> +
> +   Note that the size of a RS_ALIGN_CODE fragment is always 7 to provide
> +   enough storage space for up to 3 bytes for padding the back to a valid
> +   instruction alignment and exactly 4 bytes to store the NOP pattern.
> + */

Remove the trailing white space throughout the patch please.

>
>  void
> -aarch64_handle_align (fragS * fragP)
> +aarch64_handle_align (fragS * frag)

Please don't mix rename /  reformat with functional changes, it makes
it harder to review the patch.  In this case fragP -> frag is an
unrelated change, either split the patch into two one with a rename
the other with the functional change, or drop the rename completely.

>  {
>    /* NOP = d503201f */
>    /* AArch64 instructions are always little-endian.  */

Double space after period.

> @@ -5765,69 +5783,32 @@ aarch64_handle_align (fragS * fragP)
>
>    int bytes, fix, noop_size;
>    char *p;
> -  const char *noop;
>
> -  if (fragP->fr_type != rs_align_code)
> +  if (frag->fr_type != rs_align_code)
>      return;
>
> -  bytes = fragP->fr_next->fr_address - fragP->fr_address - fragP->fr_fix;
> -  p = fragP->fr_literal + fragP->fr_fix;
> -  fix = 0;
> -
> -  if (bytes > MAX_MEM_FOR_RS_ALIGN_CODE)
> -    bytes &= MAX_MEM_FOR_RS_ALIGN_CODE;
> +  bytes = frag->fr_next->fr_address - frag->fr_address - frag->fr_fix;
> +  p = frag->fr_literal + frag->fr_fix;
>
>  #ifdef OBJ_ELF
> -  gas_assert (fragP->tc_frag_data.recorded);
> +  gas_assert (frag->tc_frag_data.recorded);
>  #endif
>
> -  noop = aarch64_noop;
> -  noop_size = sizeof (aarch64_noop);
> -  fragP->fr_var = noop_size;
> +  noop_size = sizeof(aarch64_noop);

Inappropriate removal of the white space before the opening parenthesis.

Thanks
/Marcus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad.
  2014-01-07 12:49       ` Marcus Shawcroft
@ 2014-10-28  8:55         ` Andrew Pinski
  2014-10-28 10:58           ` Dr. Philipp Tomsich
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Pinski @ 2014-10-28  8:55 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: Philipp Tomsich, binutils

On Tue, Jan 7, 2014 at 4:49 AM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
> Hi,
>
> A few minor issues....

Any news on this patch since it never went in?  Could you also remove
md_do_align too?

Right now even a simple:
start64:
 nop
 .align 6
 done:
  nop

is broken.

Thanks,
Andrew Pinski


>
> On 2 January 2014 12:48, Philipp Tomsich
> <philipp.tomsich@theobroma-systems.com> wrote:
>
>> +   Here we fill the frag with the appropriate info for padding the output
>> +   stream. The resulting frag will consist of a fixed (fr_fix) and of a
>
> GNU style places two spaces after a period.  Same comment applies
> through the patch.
>
>> +   repeating (fr_var) part.
>> +
>> +   The fixed content is always emitted before the repeating content and
>> +   these two parts are used as follows in constructing the output:
>> +   - the fixed part will be used to align to a valid instruction word
>> +     boundary, in case that we start at a misaligned address; as no
>> +     executable instruction can live at the misaligned location, we
>> +     simply fill with zeros;
>> +   - the variable part will be used to cover the remaining padding and
>> +     we fill using the AArch64 NOP instruction.
>> +
>> +   Note that the size of a RS_ALIGN_CODE fragment is always 7 to provide
>> +   enough storage space for up to 3 bytes for padding the back to a valid
>> +   instruction alignment and exactly 4 bytes to store the NOP pattern.
>> + */
>
> Remove the trailing white space throughout the patch please.
>
>>
>>  void
>> -aarch64_handle_align (fragS * fragP)
>> +aarch64_handle_align (fragS * frag)
>
> Please don't mix rename /  reformat with functional changes, it makes
> it harder to review the patch.  In this case fragP -> frag is an
> unrelated change, either split the patch into two one with a rename
> the other with the functional change, or drop the rename completely.
>
>>  {
>>    /* NOP = d503201f */
>>    /* AArch64 instructions are always little-endian.  */
>
> Double space after period.
>
>> @@ -5765,69 +5783,32 @@ aarch64_handle_align (fragS * fragP)
>>
>>    int bytes, fix, noop_size;
>>    char *p;
>> -  const char *noop;
>>
>> -  if (fragP->fr_type != rs_align_code)
>> +  if (frag->fr_type != rs_align_code)
>>      return;
>>
>> -  bytes = fragP->fr_next->fr_address - fragP->fr_address - fragP->fr_fix;
>> -  p = fragP->fr_literal + fragP->fr_fix;
>> -  fix = 0;
>> -
>> -  if (bytes > MAX_MEM_FOR_RS_ALIGN_CODE)
>> -    bytes &= MAX_MEM_FOR_RS_ALIGN_CODE;
>> +  bytes = frag->fr_next->fr_address - frag->fr_address - frag->fr_fix;
>> +  p = frag->fr_literal + frag->fr_fix;
>>
>>  #ifdef OBJ_ELF
>> -  gas_assert (fragP->tc_frag_data.recorded);
>> +  gas_assert (frag->tc_frag_data.recorded);
>>  #endif
>>
>> -  noop = aarch64_noop;
>> -  noop_size = sizeof (aarch64_noop);
>> -  fragP->fr_var = noop_size;
>> +  noop_size = sizeof(aarch64_noop);
>
> Inappropriate removal of the white space before the opening parenthesis.
>
> Thanks
> /Marcus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad.
  2014-10-28  8:55         ` Andrew Pinski
@ 2014-10-28 10:58           ` Dr. Philipp Tomsich
  2014-10-28 15:51             ` Andrew Pinski
  0 siblings, 1 reply; 20+ messages in thread
From: Dr. Philipp Tomsich @ 2014-10-28 10:58 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Marcus Shawcroft, binutils

Andrew,

I had sent a revised patch to the list on Feb 18th:
	https://sourceware.org/ml/binutils/2014-02/msg00108.html

We’re maintaining this change and use it on production.
So if need be, I can generate a new patch against the current master.

Best,
Philipp.

On 28 Oct 2014, at 09:55 , Andrew Pinski <pinskia@gmail.com> wrote:

> On Tue, Jan 7, 2014 at 4:49 AM, Marcus Shawcroft
> <marcus.shawcroft@gmail.com> wrote:
>> Hi,
>> 
>> A few minor issues....
> 
> Any news on this patch since it never went in?  Could you also remove
> md_do_align too?
> 
> Right now even a simple:
> start64:
> nop
> .align 6
> done:
>  nop
> 
> is broken.
> 
> Thanks,
> Andrew Pinski
> 
> 
>> 
>> On 2 January 2014 12:48, Philipp Tomsich
>> <philipp.tomsich@theobroma-systems.com> wrote:
>> 
>>> +   Here we fill the frag with the appropriate info for padding the output
>>> +   stream. The resulting frag will consist of a fixed (fr_fix) and of a
>> 
>> GNU style places two spaces after a period.  Same comment applies
>> through the patch.
>> 
>>> +   repeating (fr_var) part.
>>> +
>>> +   The fixed content is always emitted before the repeating content and
>>> +   these two parts are used as follows in constructing the output:
>>> +   - the fixed part will be used to align to a valid instruction word
>>> +     boundary, in case that we start at a misaligned address; as no
>>> +     executable instruction can live at the misaligned location, we
>>> +     simply fill with zeros;
>>> +   - the variable part will be used to cover the remaining padding and
>>> +     we fill using the AArch64 NOP instruction.
>>> +
>>> +   Note that the size of a RS_ALIGN_CODE fragment is always 7 to provide
>>> +   enough storage space for up to 3 bytes for padding the back to a valid
>>> +   instruction alignment and exactly 4 bytes to store the NOP pattern.
>>> + */
>> 
>> Remove the trailing white space throughout the patch please.
>> 
>>> 
>>> void
>>> -aarch64_handle_align (fragS * fragP)
>>> +aarch64_handle_align (fragS * frag)
>> 
>> Please don't mix rename /  reformat with functional changes, it makes
>> it harder to review the patch.  In this case fragP -> frag is an
>> unrelated change, either split the patch into two one with a rename
>> the other with the functional change, or drop the rename completely.
>> 
>>> {
>>>   /* NOP = d503201f */
>>>   /* AArch64 instructions are always little-endian.  */
>> 
>> Double space after period.
>> 
>>> @@ -5765,69 +5783,32 @@ aarch64_handle_align (fragS * fragP)
>>> 
>>>   int bytes, fix, noop_size;
>>>   char *p;
>>> -  const char *noop;
>>> 
>>> -  if (fragP->fr_type != rs_align_code)
>>> +  if (frag->fr_type != rs_align_code)
>>>     return;
>>> 
>>> -  bytes = fragP->fr_next->fr_address - fragP->fr_address - fragP->fr_fix;
>>> -  p = fragP->fr_literal + fragP->fr_fix;
>>> -  fix = 0;
>>> -
>>> -  if (bytes > MAX_MEM_FOR_RS_ALIGN_CODE)
>>> -    bytes &= MAX_MEM_FOR_RS_ALIGN_CODE;
>>> +  bytes = frag->fr_next->fr_address - frag->fr_address - frag->fr_fix;
>>> +  p = frag->fr_literal + frag->fr_fix;
>>> 
>>> #ifdef OBJ_ELF
>>> -  gas_assert (fragP->tc_frag_data.recorded);
>>> +  gas_assert (frag->tc_frag_data.recorded);
>>> #endif
>>> 
>>> -  noop = aarch64_noop;
>>> -  noop_size = sizeof (aarch64_noop);
>>> -  fragP->fr_var = noop_size;
>>> +  noop_size = sizeof(aarch64_noop);
>> 
>> Inappropriate removal of the white space before the opening parenthesis.
>> 
>> Thanks
>> /Marcus

Dr. Philipp Tomsich
Theobroma Systems Design und Consulting GmbH
Seestadtstrasse 27 (Aspern IQ), A-1220 Wien, Austria
Phone: +43 1 2369893-401, Fax: +43 1 2369893-9-401
Cell phone: +43 664 8346109
http://www.theobroma-systems.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad.
  2014-10-28 10:58           ` Dr. Philipp Tomsich
@ 2014-10-28 15:51             ` Andrew Pinski
  2014-10-30 10:54               ` Nicholas Clifton
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Pinski @ 2014-10-28 15:51 UTC (permalink / raw)
  To: Dr. Philipp Tomsich; +Cc: Marcus Shawcroft, binutils

On Tue, Oct 28, 2014 at 3:58 AM, Dr. Philipp Tomsich
<philipp.tomsich@theobroma-systems.com> wrote:
> Andrew,
>
> I had sent a revised patch to the list on Feb 18th:
>         https://sourceware.org/ml/binutils/2014-02/msg00108.html
>
> We’re maintaining this change and use it on production.
> So if need be, I can generate a new patch against the current master.

Yes that would be useful as it seems to fix
https://bugs.linaro.org/show_bug.cgi?id=815 .

Thanks,
Andrew Pinski


>
> Best,
> Philipp.
>
> On 28 Oct 2014, at 09:55 , Andrew Pinski <pinskia@gmail.com> wrote:
>
>> On Tue, Jan 7, 2014 at 4:49 AM, Marcus Shawcroft
>> <marcus.shawcroft@gmail.com> wrote:
>>> Hi,
>>>
>>> A few minor issues....
>>
>> Any news on this patch since it never went in?  Could you also remove
>> md_do_align too?
>>
>> Right now even a simple:
>> start64:
>> nop
>> .align 6
>> done:
>>  nop
>>
>> is broken.
>>
>> Thanks,
>> Andrew Pinski
>>
>>
>>>
>>> On 2 January 2014 12:48, Philipp Tomsich
>>> <philipp.tomsich@theobroma-systems.com> wrote:
>>>
>>>> +   Here we fill the frag with the appropriate info for padding the output
>>>> +   stream. The resulting frag will consist of a fixed (fr_fix) and of a
>>>
>>> GNU style places two spaces after a period.  Same comment applies
>>> through the patch.
>>>
>>>> +   repeating (fr_var) part.
>>>> +
>>>> +   The fixed content is always emitted before the repeating content and
>>>> +   these two parts are used as follows in constructing the output:
>>>> +   - the fixed part will be used to align to a valid instruction word
>>>> +     boundary, in case that we start at a misaligned address; as no
>>>> +     executable instruction can live at the misaligned location, we
>>>> +     simply fill with zeros;
>>>> +   - the variable part will be used to cover the remaining padding and
>>>> +     we fill using the AArch64 NOP instruction.
>>>> +
>>>> +   Note that the size of a RS_ALIGN_CODE fragment is always 7 to provide
>>>> +   enough storage space for up to 3 bytes for padding the back to a valid
>>>> +   instruction alignment and exactly 4 bytes to store the NOP pattern.
>>>> + */
>>>
>>> Remove the trailing white space throughout the patch please.
>>>
>>>>
>>>> void
>>>> -aarch64_handle_align (fragS * fragP)
>>>> +aarch64_handle_align (fragS * frag)
>>>
>>> Please don't mix rename /  reformat with functional changes, it makes
>>> it harder to review the patch.  In this case fragP -> frag is an
>>> unrelated change, either split the patch into two one with a rename
>>> the other with the functional change, or drop the rename completely.
>>>
>>>> {
>>>>   /* NOP = d503201f */
>>>>   /* AArch64 instructions are always little-endian.  */
>>>
>>> Double space after period.
>>>
>>>> @@ -5765,69 +5783,32 @@ aarch64_handle_align (fragS * fragP)
>>>>
>>>>   int bytes, fix, noop_size;
>>>>   char *p;
>>>> -  const char *noop;
>>>>
>>>> -  if (fragP->fr_type != rs_align_code)
>>>> +  if (frag->fr_type != rs_align_code)
>>>>     return;
>>>>
>>>> -  bytes = fragP->fr_next->fr_address - fragP->fr_address - fragP->fr_fix;
>>>> -  p = fragP->fr_literal + fragP->fr_fix;
>>>> -  fix = 0;
>>>> -
>>>> -  if (bytes > MAX_MEM_FOR_RS_ALIGN_CODE)
>>>> -    bytes &= MAX_MEM_FOR_RS_ALIGN_CODE;
>>>> +  bytes = frag->fr_next->fr_address - frag->fr_address - frag->fr_fix;
>>>> +  p = frag->fr_literal + frag->fr_fix;
>>>>
>>>> #ifdef OBJ_ELF
>>>> -  gas_assert (fragP->tc_frag_data.recorded);
>>>> +  gas_assert (frag->tc_frag_data.recorded);
>>>> #endif
>>>>
>>>> -  noop = aarch64_noop;
>>>> -  noop_size = sizeof (aarch64_noop);
>>>> -  fragP->fr_var = noop_size;
>>>> +  noop_size = sizeof(aarch64_noop);
>>>
>>> Inappropriate removal of the white space before the opening parenthesis.
>>>
>>> Thanks
>>> /Marcus
>
> Dr. Philipp Tomsich
> Theobroma Systems Design und Consulting GmbH
> Seestadtstrasse 27 (Aspern IQ), A-1220 Wien, Austria
> Phone: +43 1 2369893-401, Fax: +43 1 2369893-9-401
> Cell phone: +43 664 8346109
> http://www.theobroma-systems.com
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad.
  2014-10-28 15:51             ` Andrew Pinski
@ 2014-10-30 10:54               ` Nicholas Clifton
  0 siblings, 0 replies; 20+ messages in thread
From: Nicholas Clifton @ 2014-10-30 10:54 UTC (permalink / raw)
  To: Andrew Pinski, Dr. Philipp Tomsich; +Cc: Marcus Shawcroft, binutils

Hi Andrew, Hi Phillip,

   The patch has been approved and applied, to mainline and the 2.25 branch.

Cheers
   Nick

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-10-30 10:54 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-04 18:29 [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1" Philipp Tomsich
2013-12-04 18:30 ` [PATCH/AArch64 2/3] * config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Increase to 63 Philipp Tomsich
2013-12-06  1:52   ` Andrew Pinski
2013-12-06  1:53     ` Andrew Pinski
2014-01-02 12:48     ` [PATCH/AArch64 2/4] Remove the artificial limit on code alignment through the use of the fixed part of a fragment for output generation only, which required MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad Philipp Tomsich
2014-01-07 12:49       ` Marcus Shawcroft
2014-10-28  8:55         ` Andrew Pinski
2014-10-28 10:58           ` Dr. Philipp Tomsich
2014-10-28 15:51             ` Andrew Pinski
2014-10-30 10:54               ` Nicholas Clifton
2013-12-04 18:30 ` [PATCH/AArch64 3/3] * opcodes/aarch64-tbl.h (aarch64_opcode_table): Add back the "lost" instruction aliases for scalar compare and vector compare Philipp Tomsich
2013-12-10 14:17   ` Marcus Shawcroft
2013-12-11  3:50     ` Dr. Philipp Tomsich
2013-12-11  9:38       ` James Greenhalgh
2013-12-11 14:38         ` Dr. Philipp Tomsich
2014-01-07 11:08           ` Yufeng Zhang
2013-12-10 14:11 ` [PATCH/AArch64 1/3] * config/tc-aarch64.c (aarch64_cpus): Add entry for "xgene-1" Marcus Shawcroft
2014-01-06 15:54   ` Marcus Shawcroft
2014-01-07 10:36     ` nick clifton
2014-01-07 12:29       ` Marcus Shawcroft

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).