public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/54089] New: [SH] Refactor shift patterns
@ 2012-07-24 23:42 olegendo at gcc dot gnu.org
  2012-07-24 23:46 ` [Bug target/54089] " olegendo at gcc dot gnu.org
                   ` (100 more replies)
  0 siblings, 101 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-24 23:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

             Bug #: 54089
           Summary: [SH] Refactor shift patterns
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: olegendo@gcc.gnu.org
        ReportedBy: olegendo@gcc.gnu.org
            Target: sh*-*-*


The code related to shift patterns in sh.c / sh.md maybe could use some
improvements here and there.  In some places clobbers and scratch regs could be
avoided.
There is also a large part that deals with minimizing and-shift/shift-and
sequences, but there are no test cases to verify that those actually work.
It would also be interesting to see, whether some of the and-shift/shift-and
combine code could be reduced due to improvements in the middle-end.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
@ 2012-07-24 23:46 ` olegendo at gcc dot gnu.org
  2012-07-25 23:03 ` olegendo at gcc dot gnu.org
                   ` (99 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-24 23:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-07-24
     Ever Confirmed|0                           |1


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
  2012-07-24 23:46 ` [Bug target/54089] " olegendo at gcc dot gnu.org
@ 2012-07-25 23:03 ` olegendo at gcc dot gnu.org
  2012-07-27 17:36 ` olegendo at gcc dot gnu.org
                   ` (98 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-25 23:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-25 23:03:12 UTC ---
Author: olegendo
Date: Wed Jul 25 23:03:06 2012
New Revision: 189872

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189872
Log:
    PR target/54089
    * config/sh/predicates.md (shift_count_operand): Handle not-SHMEDIA
    case.
    (p27_shift_count_operand, not_p27_shift_count_operand): New predicates.
    * config/sh/sh.md (ashlsi3): Remove parallel and T_REG clobber
    from expander.  Do not emit shift insn for not-SHMEDIA case.
    (ashlsi3_std): Replace with ...
    (ashlsi3_k, ashlsi3_d): ... these new insns.
    * config/sh/sh.c (gen_ashift): Make static.  Add sanity checks.
    Emit ashlsi3_k insn instead of ashlsi3_std in ASHIFT case.
    (gen_ashift_hi): Make static.
    * config/sh/sh-protos.h (gen_ashift, gen_ashift_hi): Remove forward
    declaration.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
  2012-07-24 23:46 ` [Bug target/54089] " olegendo at gcc dot gnu.org
  2012-07-25 23:03 ` olegendo at gcc dot gnu.org
@ 2012-07-27 17:36 ` olegendo at gcc dot gnu.org
  2012-07-30  6:43 ` olegendo at gcc dot gnu.org
                   ` (97 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-27 17:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-27 17:36:30 UTC ---
Author: olegendo
Date: Fri Jul 27 17:36:20 2012
New Revision: 189917

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189917
Log:
    PR target/54089
    * config/sh/sh.c (shiftcosts): Remove case where first operand 
    is a const_int.  Move COSTS_N_INSNS usage into caller ...
    (sh_rtx_costs) ... here.  Return false when shiftcosts cannot be
    calculated instead of MAX_COST.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.c


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2012-07-27 17:36 ` olegendo at gcc dot gnu.org
@ 2012-07-30  6:43 ` olegendo at gcc dot gnu.org
  2012-08-09 21:55 ` olegendo at gcc dot gnu.org
                   ` (96 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-30  6:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-30 06:43:26 UTC ---
Author: olegendo
Date: Mon Jul 30 06:43:20 2012
New Revision: 189952

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189952
Log:
    PR target/54089
    * config/sh/sh.md (ashlsi3_d): Invoke gen_shifty_op directly instead
    of trying to emit ashlsi3_n.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2012-07-30  6:43 ` olegendo at gcc dot gnu.org
@ 2012-08-09 21:55 ` olegendo at gcc dot gnu.org
  2012-08-09 23:18 ` olegendo at gcc dot gnu.org
                   ` (95 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-09 21:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kkojima at gcc dot gnu.org

--- Comment #4 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-09 21:54:42 UTC ---
I'm currently playing around with the macro SHIFT_COUNT_TRUNCATED (sh.h) and
the target hook TARGET_SHIFT_TRUNCATION_MASK (which is not implemented yet).
Doing the following on rev 190259 (which is actually wrong):

sh.h:
-#define SHIFT_COUNT_TRUNCATED (! TARGET_SH3 && ! TARGET_SH2A)
+#define SHIFT_COUNT_TRUNCATED (1)

sh.c:
+/* Implement the TARGET_SHIFT_TRUNCATION_MASK target hook.  */
+
+#undef TARGET_SHIFT_TRUNCATION_MASK
+#define TARGET_SHIFT_TRUNCATION_MASK sh_shift_truncation_mask
+
+static unsigned HOST_WIDE_INT
+sh_shift_truncation_mask (enum machine_mode mode)
+{
+  int bitsize = GET_MODE_BIT_SIZE (mode);
+
+  if (TARGET_SHMEDIA)
+    return bitsize - 1;
+
+  return MAX (32, bitsize) - 1;
+}

... and looking at some CSiBE size results, I see the a couple of weird cases
similar to what happens to the set_bh_page function in
linux-2.4.23-pre3-testplatform/fs/buffer.c.

Without the changes:
   ...
.L592:
        mov.l   .L598,r1     !! 
        mov     r9,r3
        mov.l   @r1,r2       !! r2 = zone_table[0]
        add     #124,r2
        mov.l   @(36,r2),r1
        mov.l   @(32,r2),r2
        add     r10,r1
        mov.l   r9,@(56,r8)
        sub     r2,r3
        mov     r3,r2
        mov.l   .L599,r3
        shar    r2
        shar    r2
        mul.l   r3,r2
        mov     #12,r3
        sts     macl,r2
        shld    r3,r2
        add     r2,r1
        mov.l   r1,@(52,r8)

With the changes:
        ...
.L592:
        mov.l   @(24,r8),r0   !! 
        mov     r8,r3
        mov.l   .L598,r1
        shlr16  r0            !!
        shlr8   r0            !!
        shll2   r0            !!
        mov.l   @(r0,r1),r2   !! r2 = zone_table[page->flags >> ZONE_SHIFT]
        add     #124,r2
        mov.l   @(36,r2),r1
        mov.l   @(32,r2),r2
        add     r10,r1
        mov.l   r8,@(56,r9)
        sub     r2,r3
        mov     r3,r2
        mov.l   .L599,r3
        shar    r2
        shar    r2
        mul.l   r3,r2
        mov     #12,r3
        sts     macl,r2
        shld    r3,r2
        add     r2,r1
        mov.l   r1,@(52,r9)

It seems that without the (wrong) patch, the index in the inline function
'page_zone' is reduced from 'page->flags >> ZONE_SHIFT' to '0', and thus the
resulting code is wrong?!
I've tried to reproduce this in an isolated test case but couldn't get it to do
the same - the generated code seems always correct, with and without the
changes.

I'm confused... Kaz, do you have any idea what could be going on there?


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2012-08-09 21:55 ` olegendo at gcc dot gnu.org
@ 2012-08-09 23:18 ` olegendo at gcc dot gnu.org
  2012-08-09 23:28 ` olegendo at gcc dot gnu.org
                   ` (94 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-09 23:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #5 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-09 23:17:54 UTC ---
OK, I checking out the preprocessed file reveals the following relevant pieces:

typedef struct page {
 struct list_head list;
 struct address_space *mapping;
 unsigned long index;
 struct page *next_hash;

 atomic_t count;
 unsigned long flags;    // <<<< 

 struct list_head lru;

 struct page **pprev_hash;
 struct buffer_head * buffers;
} mem_map_t;


static inline zone_t *page_zone(struct page *page)
{
 return zone_table[page->flags >> (64 - 8)];  // = page->flags >> 56
}

Depending on whether SHIFT_COUNT_TRUNCATED evals to 1 or 0, the function above
returns either zone_table[0] (dyn shifts) or zone_table[page->flags >> 24] (no
dyn shifts).  Both should be OK to do, since that's undefined behavior.
In this case maybe fixing SHIFT_COUNT_TRUNCATED to '0' would be better, since
that would produce faster undefined behavior code ;)


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2012-08-09 23:18 ` olegendo at gcc dot gnu.org
@ 2012-08-09 23:28 ` olegendo at gcc dot gnu.org
  2012-08-09 23:36 ` olegendo at gcc dot gnu.org
                   ` (93 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-09 23:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #6 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-09 23:27:55 UTC ---
Author: olegendo
Date: Thu Aug  9 23:27:51 2012
New Revision: 190273

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190273
Log:
    PR target/54089
    * config/sh/sh-protos (shift_insns_rtx): Delete.
    (sh_ashlsi_clobbers_t_reg_p): Add.
    * config/sh/sh.c (shift_insns, shift_amounts, ext_shift_insns,
    ext_shift_amounts): Merge arrays of ints to array of structs.
    Adapt usage of arrays throughout the file.
    (shift_insns_rtx): Delete unused function.
    (sh_ashlsi_clobbers_t_reg_p): New function.
    * config/sh/sh.md (ashlsi3): Emit ashlsi3_n_clobbers_t insn if the
    final shift sequence will clobber T_REG.
    (ashlsi3_n): Split only if the final shift sequence will not
    clobber T_REG.
    (ashlsi3_n_clobbers_t): New insn_and_split.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2012-08-09 23:28 ` olegendo at gcc dot gnu.org
@ 2012-08-09 23:36 ` olegendo at gcc dot gnu.org
  2012-08-09 23:43 ` olegendo at gcc dot gnu.org
                   ` (92 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-09 23:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #7 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-09 23:36:10 UTC ---
Kaz, another thing I'm a bit unsure about ...

#define SH_DYNAMIC_SHIFT_COST \
  (TARGET_HARD_SH4 ? 1 : TARGET_DYNSHIFT ? (optimize_size ? 1 : 2) : 20)

Do you have any idea, why this is not simply
#define SH_DYNAMIC_SHIFT_COST (TARGET_DYNSHIFT) ?

I've checked the HW manuals for SH2A, SH3 and SH4 and all state that dynamic
shifts take one cycle, i.e. are as fast/slow as the other constant shift
insns...


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2012-08-09 23:36 ` olegendo at gcc dot gnu.org
@ 2012-08-09 23:43 ` olegendo at gcc dot gnu.org
  2012-08-10  0:40 ` kkojima at gcc dot gnu.org
                   ` (91 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-09 23:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #8 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-09 23:42:57 UTC ---
(In reply to comment #7)
> Kaz, another thing I'm a bit unsure about ...
> 
> #define SH_DYNAMIC_SHIFT_COST \
>   (TARGET_HARD_SH4 ? 1 : TARGET_DYNSHIFT ? (optimize_size ? 1 : 2) : 20)
> 
> Do you have any idea, why this is not simply
> #define SH_DYNAMIC_SHIFT_COST (TARGET_DYNSHIFT) ?
> 

Sorry, I meant:

#define SH_DYNAMIC_SHIFT_COST (TARGET_DYNSHIFT ? 1 : 20)


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2012-08-09 23:43 ` olegendo at gcc dot gnu.org
@ 2012-08-10  0:40 ` kkojima at gcc dot gnu.org
  2012-08-10 14:25 ` rmansfield at qnx dot com
                   ` (90 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-08-10  0:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #9 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-08-10 00:39:50 UTC ---
(In reply to comment #8)
> #define SH_DYNAMIC_SHIFT_COST (TARGET_DYNSHIFT ? 1 : 20)

Sounds reasonable.  Perhaps some historical reason for the original
one, though I don't know why.  Could you run SCiBE tests with it?


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2012-08-10  0:40 ` kkojima at gcc dot gnu.org
@ 2012-08-10 14:25 ` rmansfield at qnx dot com
  2012-08-10 15:40 ` olegendo at gcc dot gnu.org
                   ` (89 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: rmansfield at qnx dot com @ 2012-08-10 14:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #10 from Ryan Mansfield <rmansfield at qnx dot com> 2012-08-10 14:24:55 UTC ---
Created attachment 27985
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27985
preprocessed src

 ./xgcc -B. -w ~/ice2.i -c -Os 
/home/ryan/ice2.i: In function 'tg_extent':
/home/ryan/ice2.i:81:2: error: unrecognizable insn:
  }
  ^
(insn 189 163 190 9 (set (reg:SI 227)
        (ashift:SI (reg:SI 147 t)
            (const_int 1 [0x1]))) /home/ryan/ice2.i:75 -1
     (nil))
/home/ryan/ice2.i:81:2: internal compiler error: in extract_insn, at
recog.c:2125
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2012-08-10 14:25 ` rmansfield at qnx dot com
@ 2012-08-10 15:40 ` olegendo at gcc dot gnu.org
  2012-08-11 20:26 ` olegendo at gcc dot gnu.org
                   ` (88 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-10 15:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #11 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-10 15:40:07 UTC ---
(In reply to comment #10)
> Created attachment 27985 [details]
> preprocessed src
> 
>  ./xgcc -B. -w ~/ice2.i -c -Os 
> /home/ryan/ice2.i: In function 'tg_extent':
> /home/ryan/ice2.i:81:2: error: unrecognizable insn:
>   }
>   ^
> (insn 189 163 190 9 (set (reg:SI 227)
>         (ashift:SI (reg:SI 147 t)
>             (const_int 1 [0x1]))) /home/ryan/ice2.i:75 -1
>      (nil))
> /home/ryan/ice2.i:81:2: internal compiler error: in extract_insn, at
> recog.c:2125
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://gcc.gnu.org/bugs.html> for instructions.

Thanks for the test case!  It turns out that this is basically the same ICE as
triggered by your test case in PR 39423.  I will include this test case in the
next patch for PR 39423.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2012-08-10 15:40 ` olegendo at gcc dot gnu.org
@ 2012-08-11 20:26 ` olegendo at gcc dot gnu.org
  2012-08-16 23:13 ` olegendo at gcc dot gnu.org
                   ` (87 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-11 20:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #12 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-11 20:25:45 UTC ---
(In reply to comment #9)
> (In reply to comment #8)
> > #define SH_DYNAMIC_SHIFT_COST (TARGET_DYNSHIFT ? 1 : 20)
> 
> Sounds reasonable.  Perhaps some historical reason for the original
> one, though I don't know why.  Could you run SCiBE tests with it?

I've checked result-size for -m2a-single -mb -O2 ...
In total there's a decrease of 1728 bytes, with a few cases where there are
increases.  I've briefly checked out why code gets bigger in those cases.  As
far as I can see at the moment, one reason is because right shifts are
dynamicalized (converted to dynamic shift insns) although it's not really
beneficial.

I think I'll try to brush up the right shifts patterns first, then try again.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2012-08-11 20:26 ` olegendo at gcc dot gnu.org
@ 2012-08-16 23:13 ` olegendo at gcc dot gnu.org
  2012-08-20 21:29 ` olegendo at gcc dot gnu.org
                   ` (86 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-16 23:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #13 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-16 23:13:18 UTC ---
Author: olegendo
Date: Thu Aug 16 23:13:11 2012
New Revision: 190457

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190457
Log:
    PR target/54089
    * config/sh/sh.md (ashlsi3_d): Do not split if it would result
    in a T_REG clobber.  Correct comment.
    (ashlsi3_n): Correct comment.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2012-08-16 23:13 ` olegendo at gcc dot gnu.org
@ 2012-08-20 21:29 ` olegendo at gcc dot gnu.org
  2012-08-22 22:52 ` olegendo at gcc dot gnu.org
                   ` (85 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-20 21:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #14 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-20 21:29:26 UTC ---
For the record, I've committed the following under the wrong PR number by
accident.

Author: olegendo
Date: Mon Aug 20 20:54:20 2012
New Revision: 190545

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190545
Log:
    PR target/50489
    * config/sh/sh.md (rotcr, *rotcr, shar, shlr): New insns and splits.
    (ashrdi3_k, lshrdi3_k): Rewrite as insn_and_split.
    * config/sh/sh.c (sh_lshrsi_clobbers_t_reg_p): New function.
    * config/sh/sh-protos.h (sh_lshrsi_clobbers_t_reg_p): Declare it.

    PR target/50489
    * gcc.target/sh/pr54089-1.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr54089-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2012-08-20 21:29 ` olegendo at gcc dot gnu.org
@ 2012-08-22 22:52 ` olegendo at gcc dot gnu.org
  2012-09-10 20:35 ` olegendo at gcc dot gnu.org
                   ` (84 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-22 22:52 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #15 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-22 22:52:25 UTC ---
Author: olegendo
Date: Wed Aug 22 22:52:17 2012
New Revision: 190603

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190603
Log:
    PR target/54089
    * config/sh/predicates (p27_rshift_count_operand,
    not_p27_rshift_count_operand): New predicates.
    * config/sh/sh.c (sh_ashlsi_clobbers_t_reg_p,
    sh_lshrsi_clobbers_t_reg_p, sh_dynamicalize_shift_p): Handle special
    case when shift amount is 31.
    (gen_ashift): Emit gen_shlr instead of gen_lshrsi3_m.
    * config/sh/sh.md (ashlsi3_d): Set type to 'dyn_shift' instead
    of 'arith'.
    (ashlsi_c): Rename to shll.  Adapt calls to gen_ashlsi_c throughout
    the file.
    (lshrsi3): Remove clobber from expander.  Use shift_count_operand
    instead of nonmemory_operand predicate for second operand.  Add
    handling of case lshrsi3_n_clobbers_t.
    (lshrsi3_k): Use p27_rshift_count_operand for second operand.
    (lshrsi3_d): Make insn_and_split.  Split dynamic shift to constant
    shift sequences if beneficial.
    (lshrsi3_n): Make insn_and_split.  Split constant shift sequence to
    dynamic shift if beneficial.
    (lshrsi3_n_clobbers_t): New insn_and_split.
    (lshrsi3_m): Delete.

    PR target/54089
    * gcc.target/sh/pr54089-2.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr54089-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2012-08-22 22:52 ` olegendo at gcc dot gnu.org
@ 2012-09-10 20:35 ` olegendo at gcc dot gnu.org
  2012-09-10 21:27 ` olegendo at gcc dot gnu.org
                   ` (83 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-10 20:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #16 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-10 20:35:30 UTC ---
Author: olegendo
Date: Mon Sep 10 20:35:25 2012
New Revision: 191161

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191161
Log:
    PR target/54089
    * config/sh/sh.h (SH_DYNAMIC_SHIFT_COST): Set always to 1 if
    dynamic shifts are available.
    (SHIFT_COUNT_TRUNCATED): Always define to 0.  Correct comment.
    * config/sh/sh.c (ashl_lshr_seq, ext_ashl_lshr_seq): Add comments.
    * config/sh/predicates.md (shift_count_operand): Allow
    arith_reg_operand even if TARGET_DYNSHIFT is false.
    * config/sh/sh.md (ashlsi3, lshrsi3): Expand library call patterns
    if needed.
    (ashlsi3_d_call, lshrsi3_d_call): New insns.

    PR target/54089
    * config/sh/lib1funcs.S (ashlsi3): Reimplement as ashlsi3_r0.
    (lshrsi3): Reimplement as lshrsi3_r0.

    PR target/54089
    * gcc.target/sh/pr54089-3.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr54089-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.h
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog
    trunk/libgcc/ChangeLog
    trunk/libgcc/config/sh/lib1funcs.S


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2012-09-10 20:35 ` olegendo at gcc dot gnu.org
@ 2012-09-10 21:27 ` olegendo at gcc dot gnu.org
  2012-09-17 23:30 ` olegendo at gcc dot gnu.org
                   ` (82 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-10 21:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #17 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-10 21:27:30 UTC ---
Created attachment 28163
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28163
Alternative dropped software dynamic ashlsi3,lshrsi3 patch

(In reply to comment #16)
> Author: olegendo
> Date: Mon Sep 10 20:35:25 2012
> New Revision: 191161
> 
> URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191161
> Log:
>     PR target/54089
>     * config/sh/sh.h (SH_DYNAMIC_SHIFT_COST): Set always to 1 if
>     dynamic shifts are available.
>     (SHIFT_COUNT_TRUNCATED): Always define to 0.  Correct comment.
>     * config/sh/sh.c (ashl_lshr_seq, ext_ashl_lshr_seq): Add comments.
>     * config/sh/predicates.md (shift_count_operand): Allow
>     arith_reg_operand even if TARGET_DYNSHIFT is false.
>     * config/sh/sh.md (ashlsi3, lshrsi3): Expand library call patterns
>     if needed.
>     (ashlsi3_d_call, lshrsi3_d_call): New insns.
> 
>     PR target/54089
>     * config/sh/lib1funcs.S (ashlsi3): Reimplement as ashlsi3_r0.
>     (lshrsi3): Reimplement as lshrsi3_r0.
> 
>     PR target/54089
>     * gcc.target/sh/pr54089-3.c: New.
> 
> 
> Added:
>     trunk/gcc/testsuite/gcc.target/sh/pr54089-3.c
> Modified:
>     trunk/gcc/ChangeLog
>     trunk/gcc/config/sh/predicates.md
>     trunk/gcc/config/sh/sh.c
>     trunk/gcc/config/sh/sh.h
>     trunk/gcc/config/sh/sh.md
>     trunk/gcc/testsuite/ChangeLog
>     trunk/libgcc/ChangeLog
>     trunk/libgcc/config/sh/lib1funcs.S

This patch is just for the record/reference.  It is a little bit more complex
than the committed patch.  The main difference is the clobber list and the
input/output regs of the shift insns.  The committed version seemed more
beneficial.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2012-09-10 21:27 ` olegendo at gcc dot gnu.org
@ 2012-09-17 23:30 ` olegendo at gcc dot gnu.org
  2012-09-19 17:49 ` olegendo at gcc dot gnu.org
                   ` (81 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-17 23:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #18 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-17 23:29:52 UTC ---
Created attachment 28207
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28207
Arithmetic right shift rework

I have tried to apply the same strategies to the arithmetic right shift
patterns as for logical right/left shift patterns.
What happens then is that combine will see arithmetic right shift patterns (for
shift amounts > 1) and will fail to convert arithmetic right shifts to logical
right shifts where it would actually be possible.  This results in bigger code
since arithmetic shifts on SH are more expensive than logical shifts.

Unfortunately, combine will only do the shift type conversion if it can combine
the logical shift with some other insn -- which is not very likely on SH. 
Thus, the logical shift is discarded.  If however, no arithmetic shift pattern
is there that matches the shift amount (which is currently the case), combine
will split out the logical shift.

Not having arithmetic right shift patterns makes it difficult to write combine
patterns for things like ... 

char test (int x)
{
  return x >> 24;
}

which ends up as:
-m4:
        mov    #-24,r1
        shad    r1,r4
        rts
        exts.b    r4,r0

-m2:
        mov.l    .L3,r1
        sts.l    pr,@-r15
        jsr    @r1
        nop
        mov
        lds.l    @r15+,pr
        rts    
        nop
.L4:
    .align 2
.L3:
    .long    ___ashiftrt_r4_24


On the other hand, the following is handled almost as expected:

unsigned char test (int a)
{
  return a >> 24;
}

-m4:
        mov    r4,r0
        shlr16    r0
        rts
        shlr8    r0

-m2:
        mov.l    .L3,r1
        sts.l    pr,@-r15
        jsr    @r1
        nop
        extu.b    r4,r0
        lds.l    @r15+,p
        rts    
        nop
.L4:
        .align 2
.L3:
        .long    ___ashiftrt_r4_24


According to my understanding, to fix those issues something like the attached
patch would be required.  However, it would also require a separate arithmetic
right shift to logical right shift conversion pass.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2012-09-17 23:30 ` olegendo at gcc dot gnu.org
@ 2012-09-19 17:49 ` olegendo at gcc dot gnu.org
  2012-09-25 19:07 ` olegendo at gcc dot gnu.org
                   ` (80 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-19 17:49 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #19 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-19 17:48:31 UTC ---
Author: olegendo
Date: Wed Sep 19 17:48:25 2012
New Revision: 191490

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191490
Log:
    PR target/54089
    * config/sh/predicates.md (arith_reg_or_t_reg_operand): New predicate.
    * config/sh/sh.md (*rotcr): Use arith_reg_or_t_reg_operand predicate.
    Handle the case where one of the operands is T_REG.
    Add new pattern to handle MSB extraction.

    PR target/54089
    * gcc.target/sh/pr54089-1.c (test_11, test_12, test_13, test_14): New
    functions.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr54089-1.c


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (19 preceding siblings ...)
  2012-09-19 17:49 ` olegendo at gcc dot gnu.org
@ 2012-09-25 19:07 ` olegendo at gcc dot gnu.org
  2012-09-30 18:46 ` olegendo at gcc dot gnu.org
                   ` (79 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-25 19:07 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #20 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-25 19:06:34 UTC ---
Author: olegendo
Date: Tue Sep 25 19:06:28 2012
New Revision: 191743

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191743
Log:
    PR target/54089
    * config/sh/constraints.md (Jhb): New constraint.
    * config/sh/predicates.md (negt_reg_shl31_operand): New predicate.
    * config/sh/sh.md (rotrsi3): New expander.
    (rotrsi3_1, *rotrsi3_1, *rotlsi3_1): New insns.
    (rotlsi3, rotlhi3): Use const_int_operand predicate instead of
    immediate_operand and remove CONST_INT_P checks in expansion code.
    (*rotcr): Cleanup variable usage.  Handle preceding nott insn.  Add
    split with swapped operands.
    (*rotcr_neg_t, *movt_msb, *negt_msb): New insns and splits.

    PR target/54089
    * gcc.target/sh/pr54089-1.c (test_15, test_16, test_17, test_18,
    test_19, test_20, test_21, test_22, test_23): New functions.
    * gcc.target/sh/pr54089-4.c: New.
    * gcc.target/sh/pr54089-5.c: New.
    * gcc.target/sh/pr54089-6.c: New.
    * gcc.target/sh/pr54089-7.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr54089-4.c
    trunk/gcc/testsuite/gcc.target/sh/pr54089-5.c
    trunk/gcc/testsuite/gcc.target/sh/pr54089-6.c
    trunk/gcc/testsuite/gcc.target/sh/pr54089-7.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/constraints.md
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr54089-1.c


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (20 preceding siblings ...)
  2012-09-25 19:07 ` olegendo at gcc dot gnu.org
@ 2012-09-30 18:46 ` olegendo at gcc dot gnu.org
  2012-10-16 10:53 ` amylaar at gcc dot gnu.org
                   ` (78 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-30 18:46 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #21 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-30 18:45:56 UTC ---
I've noticed that there seems to be a problem with register allocation related
to shift insns.  For example in the CSiBE set, I've seen sequences such as

        mov.w   .L342,r2
        mov     #0,r7
        .align 2
.L340:
        mov     r7,r1
        add     r2,r1
        shar    r1
        mov     r1,r3      <<<
        shll2   r3         <<<
        mov     r3,r0      <<<
        mov.l   @(r0,r5),r3
        cmp/gt  r4,r3
        bf      0f

quite often.  Not sure why this happens.  Maybe because 'gen_shifty_op' (which
emits the shift sequence insns) is called after reload and not during the first
split pass after combine...


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (21 preceding siblings ...)
  2012-09-30 18:46 ` olegendo at gcc dot gnu.org
@ 2012-10-16 10:53 ` amylaar at gcc dot gnu.org
  2012-10-16 11:49 ` olegendo at gcc dot gnu.org
                   ` (77 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: amylaar at gcc dot gnu.org @ 2012-10-16 10:53 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amylaar at gcc dot gnu.org

--- Comment #22 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> 2012-10-16 10:52:46 UTC ---
(In reply to comment #0)
> The code related to shift patterns in sh.c / sh.md maybe could use some
> improvements here and there.  In some places clobbers and scratch regs could be
> avoided.
> There is also a large part that deals with minimizing and-shift/shift-and
> sequences, but there are no test cases to verify that those actually work.
> It would also be interesting to see, whether some of the and-shift/shift-and
> combine code could be reduced due to improvements in the middle-end.

Be careful with removing 'useless' clobbers, as they might be needed to
facilitate instruction combinations into patterns that have these clobbers.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (22 preceding siblings ...)
  2012-10-16 10:53 ` amylaar at gcc dot gnu.org
@ 2012-10-16 11:49 ` olegendo at gcc dot gnu.org
  2012-11-06 11:56 ` olegendo at gcc dot gnu.org
                   ` (76 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-10-16 11:49 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #23 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-10-16 11:49:14 UTC ---
(In reply to comment #22)
> (In reply to comment #0)
> > The code related to shift patterns in sh.c / sh.md maybe could use some
> > improvements here and there.  In some places clobbers and scratch regs could be
> > avoided.
> > There is also a large part that deals with minimizing and-shift/shift-and
> > sequences, but there are no test cases to verify that those actually work.
> > It would also be interesting to see, whether some of the and-shift/shift-and
> > combine code could be reduced due to improvements in the middle-end.
> 
> Be careful with removing 'useless' clobbers, as they might be needed to
> facilitate instruction combinations into patterns that have these clobbers.

Yeah, I've noticed ;)
Logical left/right shifts are now expanded into T-clobbering and
non-T-clobbering pattern variations.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (23 preceding siblings ...)
  2012-10-16 11:49 ` olegendo at gcc dot gnu.org
@ 2012-11-06 11:56 ` olegendo at gcc dot gnu.org
  2012-11-07 23:32 ` olegendo at gcc dot gnu.org
                   ` (75 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-11-06 11:56 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #24 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-06 11:55:47 UTC ---
Author: olegendo
Date: Tue Nov  6 11:55:43 2012
New Revision: 193236

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193236
Log:
    PR target/54089
    * config/sh/sh.c (and_xor_ior_costs, addsubcosts): Double the costs for
    ops larger than SImode.
    * config/sh/sh.md (rotcl, *rotcl): New insns and splits.
    (ashldi3_k): Convert to insn_and_split and use new rotcl insn.

    PR target/54089
    * gcc.target/sh/pr54089-8.c: New.
    * gcc.target/sh/pr54089-9.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr54089-8.c
    trunk/gcc/testsuite/gcc.target/sh/pr54089-9.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (24 preceding siblings ...)
  2012-11-06 11:56 ` olegendo at gcc dot gnu.org
@ 2012-11-07 23:32 ` olegendo at gcc dot gnu.org
  2012-11-09 10:48 ` olegendo at gcc dot gnu.org
                   ` (74 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-11-07 23:32 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #25 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-07 23:31:20 UTC ---
Created attachment 28633
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28633
Arithmetic right shift rework 2

This could be an alternative approach for the arith right shifts.
Applied to rev 193240:

CSiBE -m4-single -O2 -ml -mpretend-cmove
sum:  3159747 -> 3160047    +300 / +0.009494 %

CSiBE -m2 -O2 -ml
sum:  3314907 -> 3313319    -1588 / -0.047905 %

The idea here is to let arith shifts initially go through a slightly more
complex pattern 'ashrsi3_bridge_const', which allows the lib function address
or shad constant to be CSE'ed early on.
As mentioned in comment #18, the arith -> logical shift conversion is done by
combine only under certain circumstances.  After some trial and error I arrived
at the 'ashrsi3_bridge_const' pattern, which makes combine convert most of the
arith shifts into logical shifts.  In some cases it also finds more
opportunities, but regressions like this one still remain:

void MC_put_xy_16_c
(uint8_t * dest, const uint8_t * ref, const int stride, int height)
{
  // must see a shlr instead of 2x shar.
  // somehow the loop prevents this...
  do
  {
    dest[0] = (((ref[0]+ref[0 +1]+(ref+stride)[0]+(ref+stride)[0 +1]+2)>>2));
    ref += stride; dest += stride;
  } while (--height);
}

In addition to that, some unlucky register allocations appear, which cause
unneeded reg-reg copies (it seems something has trouble utilizing commutative
insns such as 'add').

The other problem is that patterns such as:

(define_insn_and_split "*lshrsi3"
  [(set (match_operand:SI 0 "arith_reg_dest")
    (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand")
             (match_operand:SI 2 "shift_count_operand")))
   (clobber (reg:SI T_REG))
   (use (match_operand:SI 3 "general_operand"))
   (use (match_operand:SI 4 "general_operand"))]
  "TARGET_SH1 && can_create_pseudo_p () && CONST_INT_P (operands[2])"
  "#"
  "&& 1"
  [(const_int 0)]
{
  emit_insn (gen_lshrsi3 (operands[0], operands[1], operands[2]));
  DONE;
})

have to be added for combine to convert more arith -> logical shifts.  In this
patch I've added only one such pattern.  However, if for example zero_extract
patterns are added, variations such as above would be required, too.  This
looks discomforting to me.  Maybe the better solution would indeed be to add a
arith -> logical shift conversion pass before combine, or try to convert arith
shifts in the split pass after combine by looking at the data flow of the
shifted values.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (25 preceding siblings ...)
  2012-11-07 23:32 ` olegendo at gcc dot gnu.org
@ 2012-11-09 10:48 ` olegendo at gcc dot gnu.org
  2012-11-09 13:29 ` amylaar at gcc dot gnu.org
                   ` (73 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-11-09 10:48 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #26 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-09 10:48:17 UTC ---
(In reply to comment #25)
> Maybe the better solution would indeed be to add a
> arith -> logical shift conversion pass before combine, or try to convert arith
> shifts in the split pass after combine by looking at the data flow of the
> shifted values.

Another idea would be to extend the combine pass, so that whenever it
internally converts arith -> logical shifts, it always checks the costs of the
shift op, and if the logical shift is cheaper, always split out the logical
shift.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (26 preceding siblings ...)
  2012-11-09 10:48 ` olegendo at gcc dot gnu.org
@ 2012-11-09 13:29 ` amylaar at gcc dot gnu.org
  2012-11-13  1:14 ` olegendo at gcc dot gnu.org
                   ` (72 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: amylaar at gcc dot gnu.org @ 2012-11-09 13:29 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #27 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> 2012-11-09 13:29:10 UTC ---
(In reply to comment #26)
> (In reply to comment #25)
> > Maybe the better solution would indeed be to add a
> > arith -> logical shift conversion pass before combine, or try to convert arith
> > shifts in the split pass after combine by looking at the data flow of the
> > shifted values.

FWIW, while the ARC600/700 have a full set up static and dynamic shifts,
the ARC 601 with the swap option has similar issues with shifts that should
be stitched together from multiple instructions, and preferences of unsigned
over signed shifts.

> Another idea would be to extend the combine pass, so that whenever it
> internally converts arith -> logical shifts, it always checks the costs of the
> shift op, and if the logical shift is cheaper, always split out the logical
> shift.

Indeed, combine's insistence that a computation has just one canonical
form, which depends on how much it known about the input/output data,
is in want of some correction.  We got define_split for things that
should become multiple instructions, but nothing for things that should
just be differently-shaped instructions, and there are no predicates available
to get to know about the (non)zero / signbits that combine knows about.

E.g. the following peephole in arc.md is only necessary because combine
does some unwanted 'simplifications' with known-zero bits:

;; -------------------------------------------------------------
;; Pattern 1 : r0 = r1 << {i}
;;             r3 = r4/INT + r0     ;;and commutative
;;                 ||
;;                 \/
;;             add{i} r3,r4/INT,r1
;; -------------------------------------------------------------
;; ??? This should be covered by combine, alas, at times combine gets
;; too clever for it's own good: when the shifted input is known to be
;; either 0 or 1, the operation will be made into an if-then-else, and
;; thus fail to match the add_n pattern.  Example: _mktm_r, line 85 in
;; newlib/libc/time/mktm_r.c .

(define_peephole2
  [(set (match_operand:SI 0 "dest_reg_operand" "")
        (ashift:SI (match_operand:SI 1 "register_operand" "")
                   (match_operand:SI 2 "const_int_operand" "")))
  (set (match_operand:SI 3 "dest_reg_operand" "")
       (plus:SI (match_operand:SI 4 "nonmemory_operand" "")
                (match_operand:SI 5 "nonmemory_operand" "")))]
  "(INTVAL (operands[2]) == 1
    || INTVAL (operands[2]) == 2
    || INTVAL (operands[2]) == 3)
   && (true_regnum (operands[4]) == true_regnum (operands[0])
       || true_regnum (operands[5]) == true_regnum (operands[0]))
   && (peep2_reg_dead_p (2, operands[0]) || (true_regnum (operands[3]) ==
true_regnum (operands[0])))"
 ;; the preparation statements take care to put proper operand in operands[4]
 ;; operands[4] will always contain the correct operand. This is added to
satisfy commutativity
  [(set (match_dup 3)
        (plus:SI (mult:SI (match_dup 1)
                          (match_dup 2))
                 (match_dup 4)))]
  "if (true_regnum (operands[4]) == true_regnum (operands[0]))
      operands[4] = operands[5];
   operands[2] = GEN_INT (1 << INTVAL (operands[2]));"
)


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (27 preceding siblings ...)
  2012-11-09 13:29 ` amylaar at gcc dot gnu.org
@ 2012-11-13  1:14 ` olegendo at gcc dot gnu.org
  2013-02-16 11:36 ` olegendo at gcc dot gnu.org
                   ` (71 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-11-13  1:14 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #28 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-13 01:13:55 UTC ---
(In reply to comment #27)

> FWIW, while the ARC600/700 have a full set up static and dynamic shifts,
> the ARC 601 with the swap option has similar issues with shifts that should
> be stitched together from multiple instructions, and preferences of unsigned
> over signed shifts.

Ah, good to know that SH is not the only case where it would make sense.
I've created another PR for this issue:  PR 55306


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (28 preceding siblings ...)
  2012-11-13  1:14 ` olegendo at gcc dot gnu.org
@ 2013-02-16 11:36 ` olegendo at gcc dot gnu.org
  2013-12-17 11:11 ` olegendo at gcc dot gnu.org
                   ` (70 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-02-16 11:36 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #29 from Oleg Endo <olegendo at gcc dot gnu.org> 2013-02-16 11:36:37 UTC ---
Another case taken from CSiBE / bzip2, where reusing the intermediate shift
result would be better:

void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
{
   n->b[7] = (UChar)((hi32 >> 24) & 0xFF);
   n->b[6] = (UChar)((hi32 >> 16) & 0xFF);
   n->b[5] = (UChar)((hi32 >> 8) & 0xFF);
   n->b[4] = (UChar) (hi32 & 0xFF);
/*
   n->b[3] = (UChar)((lo32 >> 24) & 0xFF);
   n->b[2] = (UChar)((lo32 >> 16) & 0xFF);
   n->b[1] = (UChar)((lo32 >> 8) & 0xFF);
   n->b[0] = (UChar) (lo32 & 0xFF);
*/
}

on rev 196091 with -O2 -m4 compiles to:

        mov     r6,r0
        shlr16  r0
        shlr8   r0
        mov.b   r0,@(7,r4)
        mov     r6,r0
        shlr16  r0
        mov.b   r0,@(6,r4)
        mov     r6,r0
        shlr8   r0
        mov.b   r0,@(5,r4)
        mov     r6,r0
    mov.b   r0,@(4,r4)

which would be better as:
        mov     r6,r0
        mov.b   r0,@(4,r4)
        shlr8   r0
        mov.b   r0,@(5,r4)
        shlr8   r0
        mov.b   r0,@(6,r4)
        shlr8   r0
        mov.b   r0,@(7,r4)

this would require reordering of the mem stores, which should be OK to do if
the mem is not volatile.  

Reordering the stores manually:

void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
{
   n->b[4] = (UChar) (hi32 & 0xFF);
   n->b[5] = (UChar)((hi32 >> 8) & 0xFF);
   n->b[6] = (UChar)((hi32 >> 16) & 0xFF);
   n->b[7] = (UChar)((hi32 >> 24) & 0xFF);
}

still results in:

        mov     r6,r0
        mov.b   r0,@(4,r4)
        mov     r6,r0
        shlr8   r0
        mov.b   r0,@(5,r4)
        mov     r6,r0
        shlr16  r0
        mov.b   r0,@(6,r4)
        mov     r6,r0
        shlr16  r0
        shlr8   r0
        mov.b   r0,@(7,r4)

... at least this case should be handled, I think.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (29 preceding siblings ...)
  2013-02-16 11:36 ` olegendo at gcc dot gnu.org
@ 2013-12-17 11:11 ` olegendo at gcc dot gnu.org
  2014-05-16 23:12 ` olegendo at gcc dot gnu.org
                   ` (69 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-12-17 11:11 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #30 from Oleg Endo <olegendo at gcc dot gnu.org> ---
A case from libmpeg2/slice.c:

        mov.b   @(1,r10),r0    // load of shift amount
        shld    r7,r6
        add     #1,r6
        extu.b  r0,r0          // zero extend shift amount
        shld    r0,r1          // r1 <<= r0
        mov     r1,r0

The zero extension of the shift amount variable could be omitted because shift
amounts > 31 are undefined behavior.  If the shift amount is in the valid range
of 0...31 the zero extension won't do anything.
A reduced test case:

int test33 (unsigned char* x, int y)
{
  return y << x[4];
}

results in:
        mov.b   @(4,r4),r0
        extu.b  r0,r0
        shld    r0,r5
        rts
        mov     r5,r0


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (30 preceding siblings ...)
  2013-12-17 11:11 ` olegendo at gcc dot gnu.org
@ 2014-05-16 23:12 ` olegendo at gcc dot gnu.org
  2014-12-16 21:38 ` olegendo at gcc dot gnu.org
                   ` (68 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-05-16 23:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #31 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Fri May 16 23:12:19 2014
New Revision: 210537

URL: http://gcc.gnu.org/viewcvs?rev=210537&root=gcc&view=rev
Log:
gcc/
    PR target/54089
    * config/sh/predicates.md (negt_reg_shl31_operand): Match additional
    patterns.
    * config/sh/sh.md (*negt_msb): Merge SH2A and non-SH2A variants.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (31 preceding siblings ...)
  2014-05-16 23:12 ` olegendo at gcc dot gnu.org
@ 2014-12-16 21:38 ` olegendo at gcc dot gnu.org
  2015-05-30 13:51 ` bugdal at aerifal dot cx
                   ` (67 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-16 21:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #32 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Tue Dec 16 21:37:42 2014
New Revision: 218795

URL: https://gcc.gnu.org/viewcvs?rev=218795&root=gcc&view=rev
Log:
gcc/testsuite/
    PR target/54089
    * gcc.target/sh/pr54089-1.c: Change optimization level from -O1 to -O2.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr54089-1.c


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (32 preceding siblings ...)
  2014-12-16 21:38 ` olegendo at gcc dot gnu.org
@ 2015-05-30 13:51 ` bugdal at aerifal dot cx
  2015-05-30 14:11 ` olegendo at gcc dot gnu.org
                   ` (66 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: bugdal at aerifal dot cx @ 2015-05-30 13:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Rich Felker <bugdal at aerifal dot cx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #33 from Rich Felker <bugdal at aerifal dot cx> ---
The commit in comment 16 broke Linux (the kernel) and nobody seems to have
noticed since 2012...


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (33 preceding siblings ...)
  2015-05-30 13:51 ` bugdal at aerifal dot cx
@ 2015-05-30 14:11 ` olegendo at gcc dot gnu.org
  2015-05-30 14:35 ` bugdal at aerifal dot cx
                   ` (65 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-05-30 14:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #34 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Thanks for spotting and pin pointing.  Do you have a bit more info, maybe a
reduced test case?


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (34 preceding siblings ...)
  2015-05-30 14:11 ` olegendo at gcc dot gnu.org
@ 2015-05-30 14:35 ` bugdal at aerifal dot cx
  2015-05-30 14:42 ` olegendo at gcc dot gnu.org
                   ` (64 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: bugdal at aerifal dot cx @ 2015-05-30 14:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #35 from Rich Felker <bugdal at aerifal dot cx> ---
No, but the cause is simple -- the kernel doesn't use libgcc but defines its
own versions of these functions, and has the old ones but not the new ones. I
tried just removing the kernel's copies and using libgcc.a but it produced a
non-bootable kernel and I didn't test further; my guess would be some
relocations in the libgcc.a versions are incompatible with the way the kernel
is built.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (35 preceding siblings ...)
  2015-05-30 14:35 ` bugdal at aerifal dot cx
@ 2015-05-30 14:42 ` olegendo at gcc dot gnu.org
  2015-05-30 14:46 ` bugdal at aerifal dot cx
                   ` (63 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-05-30 14:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #36 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Rich Felker from comment #35)
> No, but the cause is simple -- the kernel doesn't use libgcc but defines its
> own versions of these functions, and has the old ones but not the new ones.

Yeah, that's what happens when other SW relies on the implementation details. 
I don't think it's a problem of GCC, but rather the kernel.  You can try to
copy the new functions from the GCC source to the kernel source.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (36 preceding siblings ...)
  2015-05-30 14:42 ` olegendo at gcc dot gnu.org
@ 2015-05-30 14:46 ` bugdal at aerifal dot cx
  2015-05-30 14:57 ` olegendo at gcc dot gnu.org
                   ` (62 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: bugdal at aerifal dot cx @ 2015-05-30 14:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #37 from Rich Felker <bugdal at aerifal dot cx> ---
I expect this is going to be problematic from a license perspective unless they
can be licensed under GPLv2...


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (37 preceding siblings ...)
  2015-05-30 14:46 ` bugdal at aerifal dot cx
@ 2015-05-30 14:57 ` olegendo at gcc dot gnu.org
  2015-08-11  3:56 ` bugdal at aerifal dot cx
                   ` (61 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-05-30 14:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #38 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Rich Felker from comment #33)
> The commit in comment 16 broke Linux (the kernel) and nobody seems to have
> noticed since 2012...

There aren't that many (publicly known) users of Linux on SH2 or SH2A.  If they
don't speak up and stay in the underground, nobody will notice :)


(In reply to Rich Felker from comment #37)
> I expect this is going to be problematic from a license perspective unless
> they can be licensed under GPLv2...

In this case, adjust the functions in the kernel to match the interface of the
compiler.  However, maybe it's better to figure out how to have libgcc in the
kernel to avoid this kind of situation in the future.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (38 preceding siblings ...)
  2015-05-30 14:57 ` olegendo at gcc dot gnu.org
@ 2015-08-11  3:56 ` bugdal at aerifal dot cx
  2015-08-11 14:17 ` olegendo at gcc dot gnu.org
                   ` (60 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: bugdal at aerifal dot cx @ 2015-08-11  3:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #39 from Rich Felker <bugdal at aerifal dot cx> ---
Oleg, do you have rights under your copyright assignment agreement to
dual-license the libgcc part of the patch under GPLv2 so it could be included
in the kernel, and if so, would you be willing to do so? I agree it would be
cleaner for the kernel not to duplicate libgcc code, but getting in a simple
patch to update the code that's there would be a lot easier than the
build-architectural change of using libgcc itself in the kernel, which is an
issue I'd really rather not fight with the kernel developers on if they
disagree. :-)


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (39 preceding siblings ...)
  2015-08-11  3:56 ` bugdal at aerifal dot cx
@ 2015-08-11 14:17 ` olegendo at gcc dot gnu.org
  2015-08-11 21:05 ` segher at gcc dot gnu.org
                   ` (59 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-08-11 14:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #40 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Rich Felker from comment #39)
> Oleg, do you have rights under your copyright assignment agreement to
> dual-license the libgcc part of the patch under GPLv2 so it could be
> included in the kernel, and if so, would you be willing to do so?

If I understand it correctly, it should be OK.  Please contact me directly
about this.  If possible, please also add the other people who want this to the
CC list.

> I agree it
> would be cleaner for the kernel not to duplicate libgcc code, but getting in
> a simple patch to update the code that's there would be a lot easier than
> the build-architectural change of using libgcc itself in the kernel, which
> is an issue I'd really rather not fight with the kernel developers on if
> they disagree. :-)

A quick search for "libgcc linux kernel" tells the story about this. 
Unfortunately, you are right.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (40 preceding siblings ...)
  2015-08-11 14:17 ` olegendo at gcc dot gnu.org
@ 2015-08-11 21:05 ` segher at gcc dot gnu.org
  2015-08-12  1:30 ` segher at gcc dot gnu.org
                   ` (58 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: segher at gcc dot gnu.org @ 2015-08-11 21:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #41 from Segher Boessenkool <segher at gcc dot gnu.org> ---
On the other hand, there are quite a few archs (more modern ones
mostly) where the Linux kernel uses the real libgcc.  And the SH
kernel maintainers won't complain anyway (SH is orphaned).


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (41 preceding siblings ...)
  2015-08-11 21:05 ` segher at gcc dot gnu.org
@ 2015-08-12  1:30 ` segher at gcc dot gnu.org
  2023-06-01  7:32 ` klepikov.alex+bugs at gmail dot com
                   ` (57 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: segher at gcc dot gnu.org @ 2015-08-12  1:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #43 from Segher Boessenkool <segher at gcc dot gnu.org> ---
The archs that already use the real libgcc are arc, cris, hexagon,
m32r, nios2, openrisc, parisc, xtensa.

grep -w LIBGCC ~/src/kernel/arch/*/Makefile


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (42 preceding siblings ...)
  2015-08-12  1:30 ` segher at gcc dot gnu.org
@ 2023-06-01  7:32 ` klepikov.alex+bugs at gmail dot com
  2023-06-01  8:21 ` olegendo at gcc dot gnu.org
                   ` (56 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-01  7:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Alexander Klepikov <klepikov.alex+bugs at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |klepikov.alex+bugs at gmail dot co
                   |                            |m

--- Comment #48 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Let's continue discussion we started here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

I've found that my patch catches integer division. In short, it appears to work
unpredictable. It looks like there's no easy way to catch right shift.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (43 preceding siblings ...)
  2023-06-01  7:32 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-01  8:21 ` olegendo at gcc dot gnu.org
  2023-06-01 11:18 ` klepikov.alex+bugs at gmail dot com
                   ` (55 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-01  8:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #49 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #48)
> Let's continue discussion we started here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263
> 
> I've found that my patch catches integer division. In short, it appears to
> work unpredictable. It looks like there's no easy way to catch right shift.

What do you mean it catches integer division? There could be several places
during compilation, where the compiler will try to convert integer division to
right shifts.  But it will not do so unless it's wrong.  The patterns you have
added shouldn't match a division operation.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (44 preceding siblings ...)
  2023-06-01  8:21 ` olegendo at gcc dot gnu.org
@ 2023-06-01 11:18 ` klepikov.alex+bugs at gmail dot com
  2023-06-01 12:08 ` olegendo at gcc dot gnu.org
                   ` (54 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-01 11:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #50 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> > I've found that my patch catches integer division. In short, it appears to
> > work unpredictable. It looks like there's no easy way to catch right shift.
> 
> What do you mean it catches integer division? There could be several places
> during compilation, where the compiler will try to convert integer division
> to right shifts.  But it will not do so unless it's wrong.  The patterns you
> have added shouldn't match a division operation.

Ooh, my bad! You are absolutely right. A function is inlined and division is
converted to 4 'shar's which at combine pass are catched by 'define_insn
"ashrsi3_libcall_collapsed"' which is little unexpected to me. Of course, it is
expanded at 'split' pass to lib call. 5 and less right shifts should not
convert to a lib call, but that can be easily corrected.

But maybe there is a way to exclude particular insn from combine pass? (I guess
not).

Here are the files that confused me:

$ cat pr49880-3.c
/* Check that the option -mdiv=call-table works.  */
/* { dg-do link }  */
/* { dg-options "-mdiv=call-table" }  */

int
test00 (int a, int b)
{
  return a / b;
}

unsigned int
test01 (unsigned int a, unsigned b)
{
  return a / b;
}

int
main (int argc, char** argv)
{
  return test00 (argc, 123) + test01 (argc, 123);
}

Not-patched GCC

$ cat pr49880-3.s.clean
        .file   "pr49880-3.c"
        .text
        .text
        .align 1
        .align 2
        .global _test00
        .type   _test00, @function
_test00:
        mov.l   .L4,r0
        sts.l   pr,@-r15
        jsr     @r0
        nop
        lds.l   @r15+,pr
        rts
        nop
.L5:
        .align 2
.L4:
        .long   ___sdivsi3
        .size   _test00, .-_test00
        .align 1
        .align 2
        .global _test01
        .type   _test01, @function
_test01:
        mov.l   .L8,r0
        sts.l   pr,@-r15
        jsr     @r0
        nop
        lds.l   @r15+,pr
        rts
        nop
.L9:
        .align 2
.L8:
        .long   ___udivsi3
        .size   _test01, .-_test01
        .section        .text.startup,"ax",@progbits
        .align 1
        .align 2
        .global _main
        .type   _main, @function
_main:
        mov.l   .L11,r1
        dmuls.l r1,r4
        sts     mach,r0
        shar    r0
        shar    r0
        shar    r0
        shar    r0
        mov     r4,r1
        shll    r1
        subc    r1,r1
        sub     r1,r0
        mov.l   .L12,r1
        dmulu.l r1,r4
        sts     mach,r1
        sub     r1,r4
        shlr    r4
        add     r4,r1
        shlr2   r1
        shlr2   r1
        shlr2   r1
        rts
        add     r1,r0
.L13:
        .align 2
.L11:
        .long   558694933
.L12:
        .long   174592167
        .size   _main, .-_main
        .ident  "GCC: (GNU) 13.1.0"

Patched GCC

$ cat pr49880-3.s
        .file   "pr49880-3.c"
        .text
        .text
        .align 1
        .align 2
        .global _test00
        .type   _test00, @function
_test00:
        mov.l   .L4,r0
        sts.l   pr,@-r15
        jsr     @r0
        nop
        lds.l   @r15+,pr
        rts
        nop
.L5:
        .align 2
.L4:
        .long   ___sdivsi3
        .size   _test00, .-_test00
        .align 1
        .align 2
        .global _test01
        .type   _test01, @function
_test01:
        mov.l   .L8,r0
        sts.l   pr,@-r15
        jsr     @r0
        nop
        lds.l   @r15+,pr
        rts
        nop
.L9:
        .align 2
.L8:
        .long   ___udivsi3
        .size   _test01, .-_test01
        .section        .text.startup,"ax",@progbits
        .align 1
        .align 2
        .global _main
        .type   _main, @function
_main:
        mov.l   .L12,r2
        mov     r4,r1
        sts.l   pr,@-r15
        dmuls.l r2,r4
        mov.l   .L13,r2
        jsr     @r2
        sts     mach,r4
        mov     r1,r2
        shll    r2
        subc    r2,r2
        mov     r4,r0
        sub     r2,r0
        mov.l   .L14,r2
        dmulu.l r2,r1
        sts     mach,r2
        sub     r2,r1
        shlr    r1
        add     r1,r2
        shlr2   r2
        shlr2   r2
        shlr2   r2
        add     r2,r0
        lds.l   @r15+,pr
        rts
        nop
.L15:
        .align 2
.L12:
        .long   558694933
.L13:
        .long   ___ashiftrt_r4_4
.L14:
        .long   174592167
        .size   _main, .-_main
        .ident  "GCC: (GNU) 13.1.0"



Now concerning GCC testsuite. I ran it in such way:

make check RUNTESTFLAGS="CFLAGS='$CFLAGS -I/usr/local/sh-elf/include/'
--target_board=sh-sim\{-m2e,-m4\}\{-ml,-mb\}"

There are lots of failed tests on non-patched GCC. Even if I narrow tests list
to sh.exp, it still fails a lot of times. At last there's nothing in logs that
produce ICE in RTL. I'm not sure I could find a crash due to the patch at all,
even if it were there.

And finally, 'parallel' appears to be unnecessary, thank you for pointing.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (45 preceding siblings ...)
  2023-06-01 11:18 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-01 12:08 ` olegendo at gcc dot gnu.org
  2023-06-01 17:45 ` segher at gcc dot gnu.org
                   ` (53 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-01 12:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #51 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #50)
> 
> Ooh, my bad! You are absolutely right. A function is inlined and division is
> converted to 4 'shar's which at combine pass are catched by 'define_insn
> "ashrsi3_libcall_collapsed"' which is little unexpected to me. Of course, it
> is expanded at 'split' pass to lib call. 5 and less right shifts should not
> convert to a lib call, but that can be easily corrected.
> 
> But maybe there is a way to exclude particular insn from combine pass? (I
> guess not).
> 

I don't think there is a direct way to hide patterns from the combine pass,
although I could be wrong, since I haven't been tracking the combine pass
development closely for a while.

Perhaps you can look at how the left-shifts are done as a reference.


> Now concerning GCC testsuite. I ran it in such way:
> 
> make check RUNTESTFLAGS="CFLAGS='$CFLAGS -I/usr/local/sh-elf/include/'
> --target_board=sh-sim\{-m2e,-m4\}\{-ml,-mb\}"
> 
> There are lots of failed tests on non-patched GCC. Even if I narrow tests
> list to sh.exp, it still fails a lot of times. At last there's nothing in
> logs that produce ICE in RTL. I'm not sure I could find a crash due to the
> patch at all, even if it were there.

To test it, you first use a known good version of GCC without your patch, build
the whole toolchain from scratch and run it.  Then collect the test results.

Then apply your patch, and repeat the process.  Collect the test results again.

Then compare both test results.  If there are any new changes in the test
results your patch is hitting something.

Usually I'd run the testsuite as follows:

make -k check
RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

(running make with multiple jobs here is useful, and still it will take some
hours)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (46 preceding siblings ...)
  2023-06-01 12:08 ` olegendo at gcc dot gnu.org
@ 2023-06-01 17:45 ` segher at gcc dot gnu.org
  2023-06-01 20:46 ` olegendo at gcc dot gnu.org
                   ` (52 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: segher at gcc dot gnu.org @ 2023-06-01 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #52 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #50)
> But maybe there is a way to exclude particular insn from combine pass? (I
> guess not).

In general, it is best to let combine just work on everything.  It will not
replace instructions if the replacement is more expensive, and it will only
ever create instruction sequences with the same semantics as what it started
with.

There is TARGET_LEGITIMATE_COMBINED_INSN though, which is a workaround for if
you really do not want the instruction combiner to create particular
instruction patterns (but it does nothing to prevent other parts of the
compiler from doing the same!)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (47 preceding siblings ...)
  2023-06-01 17:45 ` segher at gcc dot gnu.org
@ 2023-06-01 20:46 ` olegendo at gcc dot gnu.org
  2023-06-03  7:24 ` klepikov.alex+bugs at gmail dot com
                   ` (51 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-01 20:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #53 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #52)
> 
> There is TARGET_LEGITIMATE_COMBINED_INSN though, which is a workaround for if
> you really do not want the instruction combiner to create particular
> instruction patterns (but it does nothing to prevent other parts of the
> compiler from doing the same!)

Thanks for pointing it out.  I knew I missed something recent ;)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (48 preceding siblings ...)
  2023-06-01 20:46 ` olegendo at gcc dot gnu.org
@ 2023-06-03  7:24 ` klepikov.alex+bugs at gmail dot com
  2023-06-03  8:50 ` olegendo at gcc dot gnu.org
                   ` (50 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-03  7:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #54 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Regarding testsuite. There's execute fails, but this is due to lack of
multilib. I'll rebuild and retest.

There's also fail in pr64345-1.c, in this function:

typedef signed char int8_t;

int test_int8_t__shift_7_8 (int8_t* x) {
 return ((*x >> 7) ^ 1) & 1;
}

Clean GCC:

        .file   "pr64345-1-test.c"
        .text
        .text
        .align 1
        .align 2
        .global _test_int8_t__shift_7_8
        .type   _test_int8_t__shift_7_8, @function
_test_int8_t__shift_7_8:
        mov.l   .L4,r1
        sts.l   pr,@-r15
        jsr     @r1
        mov.b   @r4,r4
        mov     r4,r0
        tst     #1,r0
        movt    r0
        lds.l   @r15+,pr
        rts
        nop
.L5:
        .align 2
.L4:
        .long   ___ashiftrt_r4_7
        .size   _test_int8_t__shift_7_8, .-_test_int8_t__shift_7_8
        .ident  "GCC: (GNU) 13.1.0"


Patched GCC:

        .file   "pr64345-1-test.c"
        .text
        .text
        .align 1
        .align 2
        .global _test_int8_t__shift_7_8
        .type   _test_int8_t__shift_7_8, @function
_test_int8_t__shift_7_8:
        mov.b   @r4,r1
        sts.l   pr,@-r15
        cmp/pz  r1
        movt    r0
        lds.l   @r15+,pr
        rts
        nop
        .size   _test_int8_t__shift_7_8, .-_test_int8_t__shift_7_8
        .ident  "GCC: (GNU) 13.1.0"

But it looks more like it's not a fail, but an optimization.

But also there's tests that pass on patched but fail on clean. I'll take a
closer look on them later after GCC and multilibs rebuild.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (49 preceding siblings ...)
  2023-06-03  7:24 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-03  8:50 ` olegendo at gcc dot gnu.org
  2023-06-03 15:43 ` klepikov.alex+bugs at gmail dot com
                   ` (49 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-03  8:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #55 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #54)
> Regarding testsuite. There's execute fails, but this is due to lack of
> multilib. I'll rebuild and retest.
> 
> There's also fail in pr64345-1.c, in this function:
> [...]
> 
> But it looks more like it's not a fail, but an optimization.

Yeah, that looks like an improvement.  There might be some SH specific tests
that scan for particular assembler outputs like that one.  Those tests would
need to be adjusted of course.

In that test you can see the unnecessary push/pop of PR.  This is because
initially it wanted to expand as a library call, but then your patterns decided
to change the insns. This can or can't be avoided, depending on the case.

> 
> But also there's tests that pass on patched but fail on clean. I'll take a
> closer look on them later after GCC and multilibs rebuild.

Yes, there are some (well ... quite a lot actually) tests that will also fail
on vanilla GCC on SH.  Hence the need to look at the test result delta
before/after patch.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (50 preceding siblings ...)
  2023-06-03  8:50 ` olegendo at gcc dot gnu.org
@ 2023-06-03 15:43 ` klepikov.alex+bugs at gmail dot com
  2023-06-03 16:09 ` olegendo at gcc dot gnu.org
                   ` (48 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-03 15:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #56 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> > Regarding testsuite. There's execute fails, but this is due to lack of
> > multilib. I'll rebuild and retest.
> > 
> > There's also fail in pr64345-1.c, in this function:
> > [...]
> > 
> > But it looks more like it's not a fail, but an optimization.
> 
> Yeah, that looks like an improvement.  There might be some SH specific tests
> that scan for particular assembler outputs like that one.  Those tests would
> need to be adjusted of course.

I checked and that one was the only one.

> In that test you can see the unnecessary push/pop of PR.  This is because
> initially it wanted to expand as a library call, but then your patterns
> decided to change the insns. This can or can't be avoided, depending on the
> case.

That's an valuable observation, thank you! I found why. Sometimes 'collapsed
libcall' instruction is emitted in combine pass and 'clobber PR' is set then. I
commented this out and checked that file and it seems to be OK. I will rerun
testsuite again to be sure.

Now regarding tests that fail on vanilla GCC and pass with patch. It looks like
something was changed in the middle of GCC and it started to produce other
instruction patterns that cannot be catched by existing isns and thus expanded
to library calls. The patch simply fixed it.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (51 preceding siblings ...)
  2023-06-03 15:43 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-03 16:09 ` olegendo at gcc dot gnu.org
  2023-06-04 18:35 ` klepikov.alex+bugs at gmail dot com
                   ` (47 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-03 16:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #57 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #56)
> 
> > In that test you can see the unnecessary push/pop of PR.  This is because
> > initially it wanted to expand as a library call, but then your patterns
> > decided to change the insns. This can or can't be avoided, depending on the
> > case.
> 
> That's an valuable observation, thank you! I found why. Sometimes 'collapsed
> libcall' instruction is emitted in combine pass and 'clobber PR' is set
> then. I commented this out and checked that file and it seems to be OK. I
> will rerun testsuite again to be sure.

Another thing ... the reason why it's desirable to expand into the libcall
earlier is to allow hoisting the function call address outside of loops and
things like that.  That happens only once at the RTL level during compilation
and that's before the combine pass.  Since there are no loops around
optimization passes in the compiler, it's always going to be a compromise.  But
it might be OK to repeat a particular optimization pass only for SH, if it
helps anything.

> 
> Now regarding tests that fail on vanilla GCC and pass with patch. It looks
> like something was changed in the middle of GCC and it started to produce
> other instruction patterns that cannot be catched by existing isns and thus
> expanded to library calls. The patch simply fixed it.

Yes, that happens every now and then, as folks work on target independent
optimizations in the middle-end.  SH backend hasn't been keeping up for a while
in this regard.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (52 preceding siblings ...)
  2023-06-03 16:09 ` olegendo at gcc dot gnu.org
@ 2023-06-04 18:35 ` klepikov.alex+bugs at gmail dot com
  2023-06-05  0:03 ` olegendo at gcc dot gnu.org
                   ` (46 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-04 18:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #58 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> Another thing ... the reason why it's desirable to expand into the libcall
> earlier is to allow hoisting the function call address outside of loops and
> things like that.

Ouch. That's a real problem. Short loops can become slower on about 10%. But is
it possible to detect a loop during expand pass? It looks like basic blocks
have loop depth info, but I still don't now how to access a basic block.

> That happens only once at the RTL level during
> compilation and that's before the combine pass.  Since there are no loops
> around optimization passes in the compiler, it's always going to be a
> compromise.  But it might be OK to repeat a particular optimization pass
> only for SH, if it helps anything.

I would try that. I think loop optimiztion pass should be repeated after split
pass. Do you know how to do it?

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (53 preceding siblings ...)
  2023-06-04 18:35 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-05  0:03 ` olegendo at gcc dot gnu.org
  2023-06-06 10:30 ` klepikov.alex+bugs at gmail dot com
                   ` (45 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-05  0:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #59 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #58)
> 
> Ouch. That's a real problem. Short loops can become slower on about 10%. But
> is it possible to detect a loop during expand pass? It looks like basic
> blocks have loop depth info, but I still don't now how to access a basic
> block.

There is 'BLOCK_FOR_INSN'

But I'm not sure it will be helpful during the initial expand pass.


> I would try that. I think loop optimiztion pass should be repeated after
> split pass. Do you know how to do it?

I don't know if we can simply repeat the loop optimizations.  I think I've
tried doing something like that before -- repeating some of the RTL passes, but
without any useful results.  It could also result in oscillations (pass A does
a transformation, pass B undoes it, then pass A would do it again ...).  Maybe
you can get better results.

There are already 2 SH specific passes 'sh_optimize_sett_clrt' and
'sh_treg_combine'.  You can look at how they are instantiated and inserted into
the RTL passes chain in 'register_sh_passes'.

Maybe it's easier to add some shift specific passes.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (54 preceding siblings ...)
  2023-06-05  0:03 ` olegendo at gcc dot gnu.org
@ 2023-06-06 10:30 ` klepikov.alex+bugs at gmail dot com
  2023-06-06 10:51 ` olegendo at gcc dot gnu.org
                   ` (44 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-06 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #60 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> Maybe it's easier to add some shift specific passes.

Well, I couldn't think of anything better and ported the loop optimization
pass. More precisely, all loop optimization passes, because they are tied to
each other. They run after split1 pass. And it worked!

I want to beleive that another loop optimization pass won't break anything
because loops are already optimized.

If that's redundant, I thought of expanding libcall if it's inside a loop
before loop optimization passes.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (55 preceding siblings ...)
  2023-06-06 10:30 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-06 10:51 ` olegendo at gcc dot gnu.org
  2023-06-06 12:17 ` klepikov.alex+bugs at gmail dot com
                   ` (43 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-06 10:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #61 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #60)
> > Maybe it's easier to add some shift specific passes.
> 
> Well, I couldn't think of anything better and ported the loop optimization
> pass. More precisely, all loop optimization passes, because they are tied to
> each other. They run after split1 pass. And it worked!
> 
> I want to beleive that another loop optimization pass won't break anything
> because loops are already optimized.

Theoretically it should't ... 

> 
> If that's redundant, I thought of expanding libcall if it's inside a loop
> before loop optimization passes.

I'm a bit concerned about the increased compile time.  Have you observed
anything (negative) in this regard?

Loop, hoist, constant propagation etc (+ all the good stuff) optimizations are
done before insn combine / split1.  We could add a simple SH specific pass that
goes over the RTL and does stuff to shifts before those other optimizations. 
However, it might miss insn combine opportunities later on.  I'm thinking about
your tst #imm,r0 case from before.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (56 preceding siblings ...)
  2023-06-06 10:51 ` olegendo at gcc dot gnu.org
@ 2023-06-06 12:17 ` klepikov.alex+bugs at gmail dot com
  2023-06-06 12:39 ` olegendo at gcc dot gnu.org
                   ` (42 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-06 12:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #62 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> I'm a bit concerned about the increased compile time.  Have you observed
> anything (negative) in this regard?

My project is small and it compiles in under 1 second on both clean and patched
GCC. sh.exp test suite mean run time is 117s on clean and 119s on patched. I
did 1 warm-up run and then 3 main one-threaded runs for each task. I'm thinking
of something else.

> 
> Loop, hoist, constant propagation etc (+ all the good stuff) optimizations
> are done before insn combine / split1.  We could add a simple SH specific
> pass that goes over the RTL and does stuff to shifts before those other
> optimizations.

That's a good idea! 

> However, it might miss insn combine opportunities later on. 

Implementing features not supported by hardware will always be a tradeoff.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (57 preceding siblings ...)
  2023-06-06 12:17 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-06 12:39 ` olegendo at gcc dot gnu.org
  2023-06-06 12:55 ` klepikov.alex+bugs at gmail dot com
                   ` (41 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-06 12:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #63 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #62)
> 
> My project is small and it compiles in under 1 second on both clean and
> patched GCC. sh.exp test suite mean run time is 117s on clean and 119s on
> patched. I did 1 warm-up run and then 3 main one-threaded runs for each
> task. I'm thinking of something else.

We have to consider that SH is also a linux target and it's also used to build
larger applications (and GCC itself ...).  It'd be good to not regress too much
in this regard.  One way to check it is the CSiBE test set.  I can help testing
your patch later.

> 
> Implementing features not supported by hardware will always be a tradeoff.

I'd say it's generally about how to find the best choice of
instructions/sequences.  With GCC's "waterfall" optimization it's impossible to
find the best solution because it doesn't have a way to compare the total cost
of one solution vs. another, for example.  We have to work around that ;)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (58 preceding siblings ...)
  2023-06-06 12:39 ` olegendo at gcc dot gnu.org
@ 2023-06-06 12:55 ` klepikov.alex+bugs at gmail dot com
  2023-06-06 16:10 ` klepikov.alex+bugs at gmail dot com
                   ` (40 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-06 12:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #64 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> We have to consider that SH is also a linux target and it's also used to
> build larger applications (and GCC itself ...).  It'd be good to not regress
> too much in this regard.  One way to check it is the CSiBE test set.  I can
> help testing your patch later.

I agree with you. Thank you!

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (59 preceding siblings ...)
  2023-06-06 12:55 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-06 16:10 ` klepikov.alex+bugs at gmail dot com
  2023-06-06 19:07 ` klepikov.alex+bugs at gmail dot com
                   ` (39 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-06 16:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #65 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> I'm thinking of something else.

During kernel compile I got few internal errors in different passes. That is,
additional loop optimization pass patch is no good at all.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (60 preceding siblings ...)
  2023-06-06 16:10 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-06 19:07 ` klepikov.alex+bugs at gmail dot com
  2023-06-06 23:11 ` olegendo at gcc dot gnu.org
                   ` (38 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-06 19:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #66 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
(In reply to Alexander Klepikov from comment #65)
> > I'm thinking of something else.
> 
> During kernel compile I got few internal errors in different passes. That
> is, additional loop optimization pass patch is no good at all.

I am sorry, I am, as always, panicking ahead of time. I think I found what's
wrong and do additional tests.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (61 preceding siblings ...)
  2023-06-06 19:07 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-06 23:11 ` olegendo at gcc dot gnu.org
  2023-06-08 12:07 ` klepikov.alex+bugs at gmail dot com
                   ` (37 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-06 23:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #67 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #66)
> (In reply to Alexander Klepikov from comment #65)
> > > I'm thinking of something else.
> > 
> > During kernel compile I got few internal errors in different passes. That
> > is, additional loop optimization pass patch is no good at all.
> 
> I am sorry, I am, as always, panicking ahead of time. I think I found what's
> wrong and do additional tests.

Don't worry.  I know what you're going through.  Been there myself ;)
Take your time.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (62 preceding siblings ...)
  2023-06-06 23:11 ` olegendo at gcc dot gnu.org
@ 2023-06-08 12:07 ` klepikov.alex+bugs at gmail dot com
  2023-06-08 12:09 ` klepikov.alex+bugs at gmail dot com
                   ` (36 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-08 12:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #68 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
OK, I finished my patch. Let me remind you briefly, what it does.

First, it does not emit library call for ashrsi3 during expand pass. Instead it
emits intermediate 'collapsed' libcall insn which is expanded later at split1
pass.

Consecutive right arithmetic shifts can be catched at combine pass
and converted to 'collapsed' libcall insn.

Then 'collapsed' insn splits whether to short sequnce of individual right
shifts
or to __ashiftrt_r4_N library call during split1 pass.

Finally, loop move invariants only pass executed right after split1 pass
to hoist library addresses that potentially were emitted during split pass.

I ran GCC testsuite and there were 1 new failed test that actually turned out
to be an optimization, and 3 tests that failed on clean GCC and passed on
patched. I could not set up my environment to test all little endian exec
tests, so for those I don't have results. Also I successfully compiled (but not
linked due to not fully set environment) Linux kenel 6.1. My homemade
performance tests (I timed linux kernel build) showed a slowdown of less then
3%.

To reduce slowdown it would be good to run loop move invariants pass only in
functions that have expanded libcall, but to do so it is necessary to mark such
funcnion when emitting insn and I'm not so sure if insn knows about in what
function it is located.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (63 preceding siblings ...)
  2023-06-08 12:07 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-08 12:09 ` klepikov.alex+bugs at gmail dot com
  2023-06-08 13:22 ` olegendo at gcc dot gnu.org
                   ` (35 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-08 12:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #69 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Created attachment 55284
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55284&action=edit
Collapsed libcall and additional loop move invariants patch

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (64 preceding siblings ...)
  2023-06-08 12:09 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-08 13:22 ` olegendo at gcc dot gnu.org
  2023-06-08 15:22 ` klepikov.alex+bugs at gmail dot com
                   ` (34 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-08 13:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #70 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #69)
> Created attachment 55284 [details]
> Collapsed libcall and additional loop move invariants patch

Thanks for sharing the patch.  I also don't have the GCC SH test environment
setup at the moment, so it will take me some time to get up to speed on that.

There are some formatting nits in your patch, please check again:
* don't add / remove empty lines for no reason
* '{' goes on new line (follow surrounding code style)
* in comments: two spaces after the period
* closing ')' and ']' in the RTL should go on the same line (follow surrounding
style)

But more interestingly:
* Do we really need to add that new source file sh_loop.cc with the wrapper
classes?  Can't we just instantiate the passes that are needed in-place when
they are registered?

* Can you add some tests to the SH specific tests?

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (65 preceding siblings ...)
  2023-06-08 13:22 ` olegendo at gcc dot gnu.org
@ 2023-06-08 15:22 ` klepikov.alex+bugs at gmail dot com
  2023-06-08 20:17 ` olegendo at gcc dot gnu.org
                   ` (33 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-08 15:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #71 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> There are some formatting nits in your patch, please check again:

Thanks for pointing that out, I'll check again.

> But more interestingly:
> * Do we really need to add that new source file sh_loop.cc with the wrapper
> classes?  Can't we just instantiate the passes that are needed in-place when
> they are registered?

Unfortunately, no.

First of all, classes are defined in loop-init.cc and thus cannot be accessed
directly.

There are wrappers called 'make_pass_rtl_blah_bla_blah' which create class
instances, but. Class .name field is hardcoded, there's no option to set it
when initializing, and pass manager checks if new pass name differs from any
already existing. Here's the code from passes.cc:

if (pass->type == new_pass_info->pass->type
          && pass->name
          && !strcmp (pass->name, new_pass_info->reference_pass_name) //Name
check
          ...

Moreover, pass_rtl_loop_init::execute method calls a function that do the
following check:

gcc_assert (current_ir_type () == IR_RTL_CFGLAYOUT);

And in our pass (that goes after split1 pass),
current_ir_type()!=IR_RTL_CFGLAYOUT, so we will get ICE. And even if we could
inherit the class, 'execute' method is final.

I'm not so good at C++, so may be there's another way. I know that the less
code is the better, but I'm afraid, not this time.

> 
> * Can you add some tests to the SH specific tests?

Yes, of course! I'll add some 'tst #imm,0' presence tests. Should I add hoist
test? I do not yet understand how hoisting check should be performed, but I
hope to find examples. Maybe some other tests? If I'm missing something, please
tell me.

And as I understand, I should name test file something like
pr54089-next-free-number.c, right?

Thank you.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (66 preceding siblings ...)
  2023-06-08 15:22 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-08 20:17 ` olegendo at gcc dot gnu.org
  2023-06-14  9:30 ` klepikov.alex+bugs at gmail dot com
                   ` (32 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-08 20:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #72 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #71)
> 
> > * Do we really need to add that new source file sh_loop.cc with the wrapper
> > classes?  Can't we just instantiate the passes that are needed in-place when
> > they are registered?
> 
> Unfortunately, no.
> [...]

That's too bad.  Thanks for digging.  I haven't checked myself recently, I
thought this infrastructure has been improved a little bit, but seems not so. 
Perhaps SH is the only backend that wants to do this kind of thing.  Let's
leave it at that for now.  Later, after everything else has been cleared out we
can ask other GCC developers' opinion on this.

> Yes, of course! I'll add some 'tst #imm,0' presence tests.

That'd be good.

> Should I add hoist test? I do not yet understand how hoisting check should be
> performed, but I hope to find examples.

Also not sure how to do that at the moment.  Maybe scanning RTL dumps of the
new extra passes?  Or perhaps just a simple example as a start:

void test_func (int* a_ptr, int* b_ptr, unsigned int count)
{
  for (unsigned int i = 0; i < count; ++i)
    a_ptr[i] >> 5;


  for (unsigned int i = 0; i < count; ++i)
    b_ptr[i] >> 5;
}

... and then just scan the asm output and check against the expected number of
insns.  In this case, mov.l insn to load the function call address.


> Maybe some other tests? If I'm missing something, please tell me.

Right now not that I can think of.


> And as I understand, I should name test file something like
> pr54089-next-free-number.c, right?

Yes, that should be fine.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (67 preceding siblings ...)
  2023-06-08 20:17 ` olegendo at gcc dot gnu.org
@ 2023-06-14  9:30 ` klepikov.alex+bugs at gmail dot com
  2023-06-14  9:31 ` klepikov.alex+bugs at gmail dot com
                   ` (31 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-14  9:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #73 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
I had to check everything a few more times because I found a bug in my patch
that caused the long shifts to expand incorrectly. I added a test that checks
correct shifts expansion in this regard.

I also managed to make a test that verifies hoisting. It can be done with
'scan-assembler' command and regexp.

If you're interested, there's more powerful command 'check-function-bodies',
but it looks like it works only when C++ cross compiler is built, so I decided
to leave the regexp.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (68 preceding siblings ...)
  2023-06-14  9:30 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-14  9:31 ` klepikov.alex+bugs at gmail dot com
  2023-06-16 13:57 ` klepikov.alex+bugs at gmail dot com
                   ` (30 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-14  9:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Alexander Klepikov <klepikov.alex+bugs at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55284|0                           |1
        is obsolete|                            |

--- Comment #74 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Created attachment 55321
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55321&action=edit
Collapsed libcall and additional loop move invariants patch v2

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (69 preceding siblings ...)
  2023-06-14  9:31 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-16 13:57 ` klepikov.alex+bugs at gmail dot com
  2023-06-16 21:58 ` olegendo at gcc dot gnu.org
                   ` (29 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-16 13:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #75 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
I found that patch incorrectly works when '-O0 -fmove-loop-invariants' flags
are set. Stock loop optimization passes do not run when '-O0' is specified
desipte '-fmove-loop-invariants' is set. I'll do the same and recheck again.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (70 preceding siblings ...)
  2023-06-16 13:57 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-16 21:58 ` olegendo at gcc dot gnu.org
  2023-06-17  6:08 ` klepikov.alex+bugs at gmail dot com
                   ` (28 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-16 21:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #76 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #75)
> I found that patch incorrectly works when '-O0 -fmove-loop-invariants' flags
> are set. Stock loop optimization passes do not run when '-O0' is specified
> desipte '-fmove-loop-invariants' is set. I'll do the same and recheck again.

It'd be good if the newly added passes are ran only with -O2 or higher.  Or
even better, if we can find a way to enable them only when actually needed. 
E.g. when it's splitting a shift insn that will potentially need the loop
optimizations again, set a flag in the function.

One way to do that could be adding SH specific function context class/struct to
store some per-function data for the compilation process.  Right now we'd have
only one entry, that is whether to run the additional passes again or not. 
This per-function 'SH context' can be stored as a global variable in sh.cc. 
Then override 'set_current_function' in the backend to reset the SH specific
per-function context.

However, to get better test coverage, it's better first let the additional loop
passes run all the time to uncover any potential issues.  Later the above can
be added as a supplementary optimization.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (71 preceding siblings ...)
  2023-06-16 21:58 ` olegendo at gcc dot gnu.org
@ 2023-06-17  6:08 ` klepikov.alex+bugs at gmail dot com
  2023-06-17  7:06 ` olegendo at gcc dot gnu.org
                   ` (27 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-17  6:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #77 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
> It'd be good if the newly added passes are ran only with -O2 or higher.

This can be confusing to users when they discover that not all invariants are
moved out of loops. Then we should inform them about that at least.

>  Or even better, if we can find a way to enable them only when actually needed. 
> E.g. when it's splitting a shift insn that will potentially need the loop
> optimizations again, set a flag in the function.

I'm thinking about this for some time. We know that we should potentially run
additional loop optimization pass when we're splitting libcall. I did not find
the way to know in what function we are splitting yet.

> However, to get better test coverage, it's better first let the additional
> loop passes run all the time to uncover any potential issues.  Later the
> above can be added as a supplementary optimization.

I see some strange new exec fails only at testsuite logs. Right now I'm trying
to find the cause.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (72 preceding siblings ...)
  2023-06-17  6:08 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-17  7:06 ` olegendo at gcc dot gnu.org
  2023-06-17  8:24 ` klepikov.alex+bugs at gmail dot com
                   ` (26 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-17  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #78 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #77)
> > It'd be good if the newly added passes are ran only with -O2 or higher.
> 
> This can be confusing to users when they discover that not all invariants
> are moved out of loops. Then we should inform them about that at least.

I don't think the compiler reports any optimizations not done to the user at
lower optimization levels?  It's normal not to do certain optimizations at a
lower level, that's why we have -O0 -O1 -O2 ... or do you mean something else
by that?

> I'm thinking about this for some time. We know that we should potentially
> run additional loop optimization pass when we're splitting libcall. I did
> not find the way to know in what function we are splitting yet.

The compiler processes one function at a time, all passes at once.  It doesn't
mix passes of different functions.  So I think using global variable in sh.cc +
override 'set_current_function' should get the job done.  When the insn split
code is executed, just set the global flag in the SH specific function context.


> I see some strange new exec fails only at testsuite logs. Right now I'm
> trying to find the cause.

What's the configure line of your GCC build?

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (73 preceding siblings ...)
  2023-06-17  7:06 ` olegendo at gcc dot gnu.org
@ 2023-06-17  8:24 ` klepikov.alex+bugs at gmail dot com
  2023-06-17  9:46 ` olegendo at gcc dot gnu.org
                   ` (25 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-17  8:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #79 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
(In reply to Oleg Endo from comment #78)
> (In reply to Alexander Klepikov from comment #77)
> > > It'd be good if the newly added passes are ran only with -O2 or higher.
> > 
> > This can be confusing to users when they discover that not all invariants
> > are moved out of loops. Then we should inform them about that at least.
> 
> I don't think the compiler reports any optimizations not done to the user at
> lower optimization levels?  It's normal not to do certain optimizations at a
> lower level, that's why we have -O0 -O1 -O2 ... or do you mean something
> else by that?
> 

I mean that if a user run GCC with -O1 and we don't do SH specific loop move
invariants pass at -O1, a user could look at binary (or at .S file) and find
that not all invariants are moved out of the loops, because library addresses
are hoisted after split1 pass. That would confuse a user even if RTL dump is
generated, and he can decide that it's a bug into loop2 pass. I suggest to
generate sh_loop dumps if RTL dump is requested and place there a message that
sh_loop hoisting is not done due to optimization level setting.

But, actually, it's not that important at the moment because I'm aiming to do
the second option - run sh_loop only when it's potentially needed.

> The compiler processes one function at a time, all passes at once.

That's what I missed! Thank you for clarification!

> > I see some strange new exec fails only at testsuite logs. Right now I'm
> > trying to find the cause.
> 
> What's the configure line of your GCC build?

../gcc-13.1.0/configure --target=sh-elf --with-endian=big,little
--disable-bootstrap --enable-languages=c,c++ --disable-nls --with-gnu-as
--with-gnu-ld --prefix=/usr/local/sh-toolchain/ --without-newlib
--without-headers

I checked several cases and they are because ld cannot link little endian
binary. When I run failed command taken from log file, it always fails for
linking little endian binary. But sometimes logs say that executing of little
endian binary is successful. I'm at a loss - how is that even possible?

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (74 preceding siblings ...)
  2023-06-17  8:24 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-17  9:46 ` olegendo at gcc dot gnu.org
  2023-06-17 11:00 ` klepikov.alex+bugs at gmail dot com
                   ` (24 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-17  9:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #80 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #79)
> 
> I mean that if a user run GCC with -O1 and we don't do SH specific loop move
> invariants pass at -O1, a user could look at binary (or at .S file) and find
> that not all invariants are moved out of the loops, because library
> addresses are hoisted after split1 pass. That would confuse a user even if
> RTL dump is generated, and he can decide that it's a bug into loop2 pass. I
> suggest to generate sh_loop dumps if RTL dump is requested and place there a
> message that sh_loop hoisting is not done due to optimization level setting.

I don't think users would get surprised or anything by lack of certain
optimization. There is no official set of guaranteed optimizations performed at
a particular -O level.  It's OK to output a message into the RTL dump of
course.  But everything else seems a bit overthinking the issue.

Actually the hoisting might not be always done.  E.g. when register pressure in
a loop is high, it might be better to not hoist the function address and keep
it inside of the loop to reduce register pressure.  But I'm not sure the loop
optimization passes are smart enough to do that (yet).

> 
> I checked several cases and they are because ld cannot link little endian
> binary. When I run failed command taken from log file, it always fails for
> linking little endian binary. But sometimes logs say that executing of
> little endian binary is successful. I'm at a loss - how is that even
> possible?

Ugh, sounds weird. Sometimes a particular binutils version is also no good. 
Have you tried using an older binutils?  Maybe it's more stable?

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (75 preceding siblings ...)
  2023-06-17  9:46 ` olegendo at gcc dot gnu.org
@ 2023-06-17 11:00 ` klepikov.alex+bugs at gmail dot com
  2023-06-17 17:57 ` klepikov.alex+bugs at gmail dot com
                   ` (23 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-17 11:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #81 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
(In reply to Oleg Endo from comment #78)
> The compiler processes one function at a time, all passes at once.  It
> doesn't mix passes of different functions.  So I think using global variable
> in sh.cc + override 'set_current_function' should get the job done.  When
> the insn split code is executed, just set the global flag in the SH specific
> function context.

I made a proof of concept using  a global variable only. It is defined in sh.cc
and it is defined as extern in sh_loop.cc. It is set when splitting is done and
it is cleared when sh_loop_done is called, i.e. at the end of loop
optimization. Do we really need 'set_current_function'? Because so far it seems
a little excessive.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (76 preceding siblings ...)
  2023-06-17 11:00 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-17 17:57 ` klepikov.alex+bugs at gmail dot com
  2023-06-20 11:28 ` klepikov.alex+bugs at gmail dot com
                   ` (22 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-17 17:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #82 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
I have read the docs and other targets' code, and the puzzle finally came
together. A struct with additional current function context is a really good
idea! I'll implement it in a couple of days.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (77 preceding siblings ...)
  2023-06-17 17:57 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-20 11:28 ` klepikov.alex+bugs at gmail dot com
  2023-06-20 17:03 ` klepikov.alex+bugs at gmail dot com
                   ` (21 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-20 11:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #83 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Created attachment 55367
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55367&action=edit
Collapsed libcall and additional loop move invariants patch v3

I digged other targets and I found that cfun->machine can point to user-defined
structure. So I implemented per-function data through it. Loop optimizer now
runs when -O2 or higher is specified and only for functions where ashrsi3
libcall was expanded. I also ran all tests I did before a few times.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (78 preceding siblings ...)
  2023-06-20 11:28 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-20 17:03 ` klepikov.alex+bugs at gmail dot com
  2023-06-20 23:46 ` olegendo at gcc dot gnu.org
                   ` (20 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-20 17:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #84 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
I've forgot to say that first I ran all tests with SH specific loop
optimization enabled when condition 'optimize && flag_move_loop_invariants' is
true. And only then I ran all tests with final (at the moment) version of patch
once more.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (79 preceding siblings ...)
  2023-06-20 17:03 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-20 23:46 ` olegendo at gcc dot gnu.org
  2023-06-21 11:51 ` klepikov.alex+bugs at gmail dot com
                   ` (19 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-20 23:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #85 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #83)
> Created attachment 55367 [details]
> Collapsed libcall and additional loop move invariants patch v3

Thanks for staying on it!  I've looked through the latest version of your
patch.

There are still formatting issues.  I will not point out one by one at this
time (but will so later in the end).  For now, can you try to run the style
check script on your changes?

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=contrib/check_GNU_style.sh

and also briefly cross check with https://gcc.gnu.org/codingconventions.html


As for the actual contents of the patch...


> bool
> expand_ashiftrt (rtx *operands)
> {
>   int value = INTVAL (operands[2]) & 31;
                    ^^^^^^

Missing check for 'const_int' operand.  See other uses of 'CONST_INT_P'.

> if (dump_file)
>    fprintf(dump_file, "ashrsi3: Emitting collapsed libcall\n");

This can be omitted.  It will be obvious in the RTL dump after the expand pass
because of the insn name, or not?


'expand_ashiftrt' is called only in the pattern 'define_expand "ashrsi3"', so
we can move all the code into the expander, like it's done in e.g.
'define_expand "lshrsi3"'.

Then the function 'expand_ashiftrt_individual' can be renamed back to
'expand_ashiftrt'.  

Actually, the 'define_expand "ashrsi3"' can just emit the
'ashrsi3_libcall_collapsed' directly.  In other words, change the original
'expand_ashiftrt' function tail:

>  wrk = gen_reg_rtx (Pmode);
>
> /* Load the value into an arg reg and call a helper.  */
>  emit_move_insn (gen_rtx_REG (SImode, 4), operands[1]);
>  sprintf (func, "__ashiftrt_r4_%d", value);
>  rtx lab = function_symbol (wrk, func, SFUNC_STATIC).lab;
>  emit_insn (gen_ashrsi3_n (GEN_INT (value), wrk, lab));
>  emit_move_insn (operands[0], gen_rtx_REG (SImode, 4));
>  return true;

to this:

>  emit_insn (gen_ashrsi3_libcall_collapsed (operands[0], operands[1], GEN_INT(value)));
>  return true;

... but of course only if operand[2] is actually a 'const_int'.  If operand[2]
is not a constant, then the whole expander should fail and not emit anything. 
Which is what the original code was doing.  In case of shifts by non-constant
amounts, the middle-end will then expand the generic libcall for it (if I
remember correctly).

So basically, all we're doing here is adding a proxy pattern for the existing
'ashrsi3_n', that has a simpler structure and can be used by other passes like
combine easier.  Sounds plausible, but looking through the other shift
patterns, they would all need that treatment?

I think the 'define_insn "ashrsi3_libcall_collapsed"' and
'define_insn_and_split "ashrsi3_libcall_expand"' can be merged into a single
pattern 'define_insn_and_split "ashrsi3_n_call" and the function
'expand_ashrsi3_libcall' (the tail of the original 'expand_ashiftrt') can be
just put into the splitter code in the new "ashrsi3_n_call".

The splitter condition should include "&& can_create_pseudo_p ()" to make sure
it's ran before register allocation.

I think this will reduce the change set a bit and make the intention of the
patch a bit clearer.

> +/* { dg-final { scan-assembler "_f_loop1_rshift:.*mov\.l\\t(\.L\[0-9\]+),(r\[0-9\]+).*sts.l\\tpr,@-r15.*(\.L\[0-9\]+):.*jsr\\t@\\2.*bf\.s\\t\\3.*\\1:\\n\\t\.long\\t___ashiftrt_r4_6.*_f_loop2_rshift:" { target { ! has_dyn_shift } } } }  */

Can you try to somehow write this in a simpler way?  Maybe omit some of the
register number matches, as they don't matter etc.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (80 preceding siblings ...)
  2023-06-20 23:46 ` olegendo at gcc dot gnu.org
@ 2023-06-21 11:51 ` klepikov.alex+bugs at gmail dot com
  2023-06-21 20:13 ` segher at gcc dot gnu.org
                   ` (18 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-21 11:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #86 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
(In reply to Oleg Endo from comment #85)
>For now, can you try to run the style check script on your changes?

Thank you, that's what I've been missing!

> > bool
> > expand_ashiftrt (rtx *operands)
> > {
> >   int value = INTVAL (operands[2]) & 31;
>                     ^^^^^^
> 
> Missing check for 'const_int' operand.  See other uses of 'CONST_INT_P'.

I was completely sure that condition '(match_operand:SI 2 "const_int_operand")'
in define_insn_and_split guarantees that 'CONST_INT_P (operands[2]) == true'.

> I think the 'define_insn "ashrsi3_libcall_collapsed"' and
> 'define_insn_and_split "ashrsi3_libcall_expand"' can be merged into a single
> pattern 'define_insn_and_split "ashrsi3_n_call" and the function
> 'expand_ashrsi3_libcall' (the tail of the original 'expand_ashiftrt') can be
> just put into the splitter code in the new "ashrsi3_n_call".
> 
> The splitter condition should include "&& can_create_pseudo_p ()" to make
> sure it's ran before register allocation.
> 
> I think this will reduce the change set a bit and make the intention of the
> patch a bit clearer.
> 

Yes, it looks like it works in general. But first tests show that all constant
dynamic shifts expand to library call for all targets, even for those with
dynshift instructions. That's because 'ashrsi3_n_call' and 'ashrsi3' should
check whether right shift could be expanded to individual shifts. They both
should execute the code that I separated into 'expand_ashiftrt_individual'
function to expand to individual shifts properly.

> > +/* { dg-final { scan-assembler "_f_loop1_rshift:.*mov\.l\\t(\.L\[0-9\]+),(r\[0-9\]+).*sts.l\\tpr,@-r15.*(\.L\[0-9\]+):.*jsr\\t@\\2.*bf\.s\\t\\3.*\\1:\\n\\t\.long\\t___ashiftrt_r4_6.*_f_loop2_rshift:" { target { ! has_dyn_shift } } } }  */
> 
> Can you try to somehow write this in a simpler way?  Maybe omit some of the
> register number matches, as they don't matter etc.

I took a note, I'll deal with it later.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (81 preceding siblings ...)
  2023-06-21 11:51 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-21 20:13 ` segher at gcc dot gnu.org
  2023-06-21 20:32 ` segher at gcc dot gnu.org
                   ` (17 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: segher at gcc dot gnu.org @ 2023-06-21 20:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #87 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #53)
> (In reply to Segher Boessenkool from comment #52)
> > There is TARGET_LEGITIMATE_COMBINED_INSN though, which is a workaround for if
> > you really do not want the instruction combiner to create particular
> > instruction patterns (but it does nothing to prevent other parts of the
> > compiler from doing the same!)
> 
> Thanks for pointing it out.  I knew I missed something recent ;)

g:78e4f1ad4e48, from 2012?  Well, fairly recent, okay :-)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (82 preceding siblings ...)
  2023-06-21 20:13 ` segher at gcc dot gnu.org
@ 2023-06-21 20:32 ` segher at gcc dot gnu.org
  2023-06-22 11:09 ` klepikov.alex+bugs at gmail dot com
                   ` (16 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: segher at gcc dot gnu.org @ 2023-06-21 20:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #88 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #85)
> > +/* { dg-final { scan-assembler "_f_loop1_rshift:.*mov\.l\\t(\.L\[0-9\]+),(r\[0-9\]+).*sts.l\\tpr,@-r15.*(\.L\[0-9\]+):.*jsr\\t@\\2.*bf\.s\\t\\3.*\\1:\\n\\t\.long\\t___ashiftrt_r4_6.*_f_loop2_rshift:" { target { ! has_dyn_shift } } } }  */
> 
> Can you try to somehow write this in a simpler way?  Maybe omit some of the
> register number matches, as they don't matter etc.

Do not use double-quoted strings unless you need interpolation?  If you use {}
around the string you do not need to backslash-quote (and double-quote) so much
at all.

[0-9] is \d

whitespace is \s

See the Tcl re_syntax manual page :-)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (83 preceding siblings ...)
  2023-06-21 20:32 ` segher at gcc dot gnu.org
@ 2023-06-22 11:09 ` klepikov.alex+bugs at gmail dot com
  2023-06-22 11:44 ` olegendo at gcc dot gnu.org
                   ` (15 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-22 11:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #89 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Here's what I did

sh.md:

(define_expand "ashrsi3"
  [(parallel [(set (match_operand:SI 0 "arith_reg_dest" "")
                   (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "")
                                (match_operand:SI 2 "nonmemory_operand" "")))
              (clobber (reg:SI T_REG))])]
  ""
{
  if (expand_ashiftrt (operands))
    DONE;
  if (!CONST_INT_P (operands[2]))
    FAIL;
  int value = INTVAL (operands[2]) & 31;
  emit_insn (gen_ashrsi3_n_call (operands[0], operands[1], GEN_INT(value)));
  DONE;
})

(define_insn_and_split "ashrsi3_n_call"
  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
        (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
                (match_operand:SI 2 "const_int_operand")))
        (clobber (reg:SI T_REG))]
  "TARGET_SH1"
  "#"
  "&& can_create_pseudo_p ()"
  [(const_int 0)]
{
    char func[18];
    int value;
    rtx wrk;

    cfun->machine->ashrsi3_libcall_expanded++;

    if (expand_ashiftrt (operands))
      DONE;
    if (!CONST_INT_P (operands[2]))
      FAIL;
    value = INTVAL (operands[2]) & 31;

    if (dump_file)
      fprintf(dump_file, "ashrsi3_n_call: Expanding ashrsi3 libcall\n");

    wrk = gen_reg_rtx (Pmode);
    /* Load the value into an arg reg and call a helper.  */
    emit_move_insn (gen_rtx_REG (SImode, 4), operands[1]);
    sprintf (func, "__ashiftrt_r4_%d", value);
    rtx lab = function_symbol (wrk, func, SFUNC_STATIC).lab;
    emit_insn (gen_ashrsi3_n (GEN_INT (value), wrk, lab));
    emit_move_insn (operands[0], gen_rtx_REG (SImode, 4));
    DONE;
})


sh.cc:

bool
expand_ashiftrt (rtx *operands)
{
  rtx wrk;
  int value;

  if (TARGET_DYNSHIFT)
    {
      if (!CONST_INT_P (operands[2]))
        {
          rtx count = copy_to_mode_reg (SImode, operands[2]);
          emit_insn (gen_negsi2 (count, count));
          emit_insn (gen_ashrsi3_d (operands[0], operands[1], count));
          return true;
        }
      else if (ashiftrt_insns[INTVAL (operands[2]) & 31]
               > 1 + SH_DYNAMIC_SHIFT_COST)
        {
          rtx count
            = force_reg (SImode, GEN_INT (- (INTVAL (operands[2]) & 31)));
          emit_insn (gen_ashrsi3_d (operands[0], operands[1], count));
          return true;
        }
    }
  if (!CONST_INT_P (operands[2]))
    return false;

  value = INTVAL (operands[2]) & 31;

  if (value == 31)
    {
      /* If we are called from abs expansion, arrange things so that we
         we can use a single MT instruction that doesn't clobber the source,
         if LICM can hoist out the load of the constant zero.  */
      if (currently_expanding_to_rtl)
        {
          emit_insn (gen_cmpgtsi_t (force_reg (SImode, CONST0_RTX (SImode)),
                                    operands[1]));
          emit_insn (gen_mov_neg_si_t (operands[0], get_t_reg_rtx ()));
          return true;
        }
      emit_insn (gen_ashrsi2_31 (operands[0], operands[1]));
      return true;
    }
  else if (value >= 16 && value <= 19)
    {
      wrk = gen_reg_rtx (SImode);
      emit_insn (gen_ashrsi2_16 (wrk, operands[1]));
      value -= 16;
      while (value--)
        gen_ashift (ASHIFTRT, 1, wrk);
      emit_move_insn (operands[0], wrk);
      return true;
    }
  /* Expand a short sequence inline, longer call a magic routine.  */
  else if (value <= 5)
    {
      wrk = gen_reg_rtx (SImode);
      emit_move_insn (wrk, operands[1]);
      while (value--)
        gen_ashift (ASHIFTRT, 1, wrk);
      emit_move_insn (operands[0], wrk);
      return true;
    }
  return false;
}


Now GCC falls into infinite loop when compiling this:

int f_short_rshift_signed(char v){
    return v >> 5;
}

It looks like it splits and then combines again.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (84 preceding siblings ...)
  2023-06-22 11:09 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-22 11:44 ` olegendo at gcc dot gnu.org
  2023-06-22 12:34 ` klepikov.alex+bugs at gmail dot com
                   ` (14 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-22 11:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #90 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #89)
> Here's what I did
> ...

Can you attach it here as a patch please?

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (85 preceding siblings ...)
  2023-06-22 11:44 ` olegendo at gcc dot gnu.org
@ 2023-06-22 12:34 ` klepikov.alex+bugs at gmail dot com
  2023-06-23  6:02 ` klepikov.alex+bugs at gmail dot com
                   ` (13 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-22 12:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Alexander Klepikov <klepikov.alex+bugs at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55367|0                           |1
        is obsolete|                            |

--- Comment #91 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Created attachment 55382
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55382&action=edit
Draft libcall/sh_loop patch, causes infinite loop

Please patch with '-p2'.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (86 preceding siblings ...)
  2023-06-22 12:34 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-23  6:02 ` klepikov.alex+bugs at gmail dot com
  2023-06-23  6:06 ` olegendo at gcc dot gnu.org
                   ` (12 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-06-23  6:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #92 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
I remembered why I used two different insns - first to eliminate infinite loop
with help of marking insn with attribute, and second because I could not set
attribute when emitting insn from C code. Whe have 'get_attr_*' functions but
we have not 'set_attr_*'.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (87 preceding siblings ...)
  2023-06-23  6:02 ` klepikov.alex+bugs at gmail dot com
@ 2023-06-23  6:06 ` olegendo at gcc dot gnu.org
  2023-06-23 14:11 ` segher at gcc dot gnu.org
                   ` (11 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-06-23  6:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #93 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #92)
> I remembered why I used two different insns - first to eliminate infinite
> loop with help of marking insn with attribute, and second because I could
> not set attribute when emitting insn from C code. Whe have 'get_attr_*'
> functions but we have not 'set_attr_*'.

Yes, I thought so.  I'll give it a try myself with your patch ... please hold
on.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (88 preceding siblings ...)
  2023-06-23  6:06 ` olegendo at gcc dot gnu.org
@ 2023-06-23 14:11 ` segher at gcc dot gnu.org
  2023-07-06 14:16 ` olegendo at gcc dot gnu.org
                   ` (10 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: segher at gcc dot gnu.org @ 2023-06-23 14:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #94 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #92)
> I remembered why I used two different insns - first to eliminate infinite
> loop with help of marking insn with attribute, and second because I could
> not set attribute when emitting insn from C code. Whe have 'get_attr_*'
> functions but we have not 'set_attr_*'.

An attribute is part of the instruction *definition*, the define_insn; it isn't
a property you put on one instance of it.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (89 preceding siblings ...)
  2023-06-23 14:11 ` segher at gcc dot gnu.org
@ 2023-07-06 14:16 ` olegendo at gcc dot gnu.org
  2023-07-07 11:10 ` klepikov.alex+bugs at gmail dot com
                   ` (9 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-07-06 14:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #95 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #93)
> (In reply to Alexander Klepikov from comment #92)
> > I remembered why I used two different insns - first to eliminate infinite
> > loop with help of marking insn with attribute, and second because I could
> > not set attribute when emitting insn from C code. Whe have 'get_attr_*'
> > functions but we have not 'set_attr_*'.
> 
> Yes, I thought so.  I'll give it a try myself with your patch ... please
> hold on.

Finally could have a look at it, sorry for the delay.

The infinite loop is in splitting of the 'ashrsi3_n_call' pattern with the
constant 1.  This is because 'ashrsi3_n_call' match overlaps with the 'shar'
pattern.

One quick fix is to put the condition into the first string.  A nicer way would
be to use a predicate which would constrain the operand[2] to "not 1".  But
it's the first and only use so quick version is OK.

In addition, that pattern (not only the split condition) should be matched only
before register allocation, so the 'can_create_pseudo_p' check is moved.

Another point is that you had the 'cfun->machine->ashrsi3_libcall_expanded++;'
in the wrong place.  It was counting up even if it wouldn't have emitted the
libcall.

This seems to work:

(define_insn_and_split "ashrsi3_n_call"
  [(set (match_operand:SI 0 "arith_reg_dest")
        (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand")
                     (match_operand:SI 2 "const_int_operand")))
        (clobber (reg:SI T_REG))]
  "TARGET_SH1 && can_create_pseudo_p () && operands[2] != const1_rtx"
  "#"
  "&& 1"
  [(const_int 0)]
{
  if (expand_ashiftrt (operands))
    DONE;

  if (dump_file)
    fprintf(dump_file, "ashrsi3_n_call: Expanding ashrsi3 libcall\n");

  cfun->machine->ashrsi3_libcall_expanded++;

  int value = INTVAL (operands[2]) & 31;

  char func[18];
  sprintf (func, "__ashiftrt_r4_%d", value);

  emit_move_insn (gen_rtx_REG (SImode, 4), operands[1]);

  rtx wrk = gen_reg_rtx (Pmode);
  rtx lab = function_symbol (wrk, func, SFUNC_STATIC).lab;
  emit_insn (gen_ashrsi3_n (GEN_INT (value), wrk, lab));
  emit_move_insn (operands[0], gen_rtx_REG (SImode, 4));
  DONE;
})


However, there is one case from  your previous posts in PR 49263:

int f_rshift(char v){
    return v >> S;
}

This will not work on SH2 and expand to a libcall.  On SH4 the combine pass
converts it into:

Successfully matched this instruction:
(set (reg:SI 166)
    (ashiftrt:SI (reg/v:SI 164 [ v ])
        (const_int 31 [0x1f])))

But it doesn't even try to do so with the 'ashrsi3_n_call' pattern.  Not sure
why ...

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (90 preceding siblings ...)
  2023-07-06 14:16 ` olegendo at gcc dot gnu.org
@ 2023-07-07 11:10 ` klepikov.alex+bugs at gmail dot com
  2023-07-07 11:45 ` olegendo at gcc dot gnu.org
                   ` (8 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-07-07 11:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #96 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
(In reply to Oleg Endo from comment #95)
> The infinite loop is in splitting of the 'ashrsi3_n_call' pattern with the
> constant 1.  This is because 'ashrsi3_n_call' match overlaps with the 'shar'
> pattern.

Yes, I confirm that. 'operands[2] != const1_rtx' works well, thank you!

> Another point is that you had the
> 'cfun->machine->ashrsi3_libcall_expanded++;' in the wrong place.  It was
> counting up even if it wouldn't have emitted the libcall.

Not really. 'expand_ashiftrt' could emit

mov  #imm, r1
shad r1,   r0

and 'mov' instruction would be loop invariant if there's a loop. It looks like
'ashrsi3_libcall_expanded' is a bad name. I think name
'ashrsi3_n_call_expanded' would be more appropriate.

> However, there is one case from  your previous posts in PR 49263:
> 
> int f_rshift(char v){
>     return v >> S;
> }
> 
> This will not work on SH2 and expand to a libcall.

I think you mean SH2A, because the same is going on with SH4.

>  On SH4 the combine pass
> converts it into:
> 
> Successfully matched this instruction:
> (set (reg:SI 166)
>     (ashiftrt:SI (reg/v:SI 164 [ v ])
>         (const_int 31 [0x1f])))
> 
> But it doesn't even try to do so with the 'ashrsi3_n_call' pattern.  Not
> sure why ...

Well, the same thing is going on when using vanilla GCC. It looks like it's
happening due to char sign extension. Then instruction is catched by
'ashrsi2_31' pattern. In another words, it looks to me like an optimization.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (91 preceding siblings ...)
  2023-07-07 11:10 ` klepikov.alex+bugs at gmail dot com
@ 2023-07-07 11:45 ` olegendo at gcc dot gnu.org
  2023-07-08  7:56 ` klepikov.alex+bugs at gmail dot com
                   ` (7 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-07-07 11:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #97 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #96)
> 
> Not really. 'expand_ashiftrt' could emit
> 
> mov  #imm, r1
> shad r1,   r0
> 
> and 'mov' instruction would be loop invariant if there's a loop. It looks
> like 'ashrsi3_libcall_expanded' is a bad name. I think name
> 'ashrsi3_n_call_expanded' would be more appropriate.

Ah, yes, indeed.  I'm cleaning up your patch and will rename it.

> 
> > However, there is one case from  your previous posts in PR 49263:
> > 
> > int f_rshift(char v){
> >     return v >> S;
> > }
> > 
> > This will not work on SH2 and expand to a libcall.
> 
> I think you mean SH2A, because the same is going on with SH4.
> 
> >  On SH4 the combine pass
> > converts it into:
> > 
> > Successfully matched this instruction:
> > (set (reg:SI 166)
> >     (ashiftrt:SI (reg/v:SI 164 [ v ])
> >         (const_int 31 [0x1f])))
> > 
> > But it doesn't even try to do so with the 'ashrsi3_n_call' pattern.  Not
> > sure why ...
> 
> Well, the same thing is going on when using vanilla GCC. It looks like it's
> happening due to char sign extension. Then instruction is catched by
> 'ashrsi2_31' pattern. In another words, it looks to me like an optimization.

It can be fixed by adding another pattern.  I'll make another patch for that
later.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (92 preceding siblings ...)
  2023-07-07 11:45 ` olegendo at gcc dot gnu.org
@ 2023-07-08  7:56 ` klepikov.alex+bugs at gmail dot com
  2023-07-09 13:55 ` olegendo at gcc dot gnu.org
                   ` (6 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-07-08  7:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Alexander Klepikov <klepikov.alex+bugs at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55382|0                           |1
        is obsolete|                            |

--- Comment #98 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Created attachment 55503
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55503&action=edit
Testcase for SH specific loop optimization pass

I modified testcase for hoisting tests. It contains two different dg-final
commands at once. First uses 'scan-assembler' command and second uses
'check-function-bodies'.

The difference is that 'scan-assembler' uses "plain" regexp and is more
powerfull but more complicated to understand. 'check-function-bodies' is
simpler but less powerfull and requires C++, if I'm not mistaken.

They both work, but only one is needed because they do the same. I hope this
will help.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (93 preceding siblings ...)
  2023-07-08  7:56 ` klepikov.alex+bugs at gmail dot com
@ 2023-07-09 13:55 ` olegendo at gcc dot gnu.org
  2023-07-10 14:02 ` klepikov.alex+bugs at gmail dot com
                   ` (5 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-07-09 13:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #28207|0                           |1
        is obsolete|                            |
  Attachment #28633|0                           |1
        is obsolete|                            |

--- Comment #99 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 55506
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55506&action=edit
proposed patch

(In reply to Alexander Klepikov from comment #98)
> Created attachment 55503 [details]
> Testcase for SH specific loop optimization pass
> 

Thanks.  I'll check it out.
Meanwhile, here's my iteration on your patch.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (94 preceding siblings ...)
  2023-07-09 13:55 ` olegendo at gcc dot gnu.org
@ 2023-07-10 14:02 ` klepikov.alex+bugs at gmail dot com
  2023-07-13 23:40 ` olegendo at gcc dot gnu.org
                   ` (4 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-07-10 14:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #100 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Created attachment 55513
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55513&action=edit
Arithmetic right shift late expanding

(In reply to Oleg Endo from comment #99)
> Meanwhile, here's my iteration on your patch.

Thank you! I did all checks I did before, added testcases, and edited to fit
the code style.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (95 preceding siblings ...)
  2023-07-10 14:02 ` klepikov.alex+bugs at gmail dot com
@ 2023-07-13 23:40 ` olegendo at gcc dot gnu.org
  2023-07-14 14:31 ` klepikov.alex+bugs at gmail dot com
                   ` (3 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-07-13 23:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #101 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #100)
> Created attachment 55513 [details]
> Arithmetic right shift late expanding
> 
> (In reply to Oleg Endo from comment #99)
> > Meanwhile, here's my iteration on your patch.
> 
> Thank you! I did all checks I did before, added testcases, and edited to fit
> the code style.

Looks OK.  Just 3 things:

> +++ gcc-13.1.0/gcc/testsuite/gcc.target/sh/pr54089-11.c	2023-07-07 08:56:41.212345982 +0300
> @@ -0,0 +1,37 @@
> +/* Check that 'tst #64,r0' genrated.  */
> +/* { dg-do compile }  */
> +/* { dg-options "-O2" }  */

Please rename this test to pr49263... to have all tst #imm,r0 related tests in
one place.

Also:
  - 'genrated' -> 'generated'
  - space before opening paren of function args
  - 2 spaces indention
  - similarly, check code style of other added tests

> --- gcc-13.1.0.orig/gcc/testsuite/gcc.target/sh/pr54089-12.c	1970-01-01 03:00:00.000000000 +0300

Can you try out Segher's suggestion in #c88 to make the regex look less
cluttered?

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (96 preceding siblings ...)
  2023-07-13 23:40 ` olegendo at gcc dot gnu.org
@ 2023-07-14 14:31 ` klepikov.alex+bugs at gmail dot com
  2023-10-13  5:02 ` olegendo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-07-14 14:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Alexander Klepikov <klepikov.alex+bugs at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55503|0                           |1
        is obsolete|                            |
  Attachment #55513|0                           |1
        is obsolete|                            |

--- Comment #102 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
Created attachment 55543
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55543&action=edit
Arithmetic right shift late expanding v2

Here's the patch. I hope I did not miss anything.

Now considering regexp. I remade it using 'check-function-bodies' command and
now it looks less confusing. I also found that in this testcase right shift
expands to 'shad' instructions early on platforms that have dynamic shift
support, so I deleted checks for those CPUs.

I don't like that 'check-function-bodies' ignores asm labels but it's better
than nothing.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (97 preceding siblings ...)
  2023-07-14 14:31 ` klepikov.alex+bugs at gmail dot com
@ 2023-10-13  5:02 ` olegendo at gcc dot gnu.org
  2023-10-14 15:20 ` klepikov.alex+bugs at gmail dot com
  2023-10-15  1:06 ` olegendo at gcc dot gnu.org
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-10-13  5:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #103 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #102)
> Created attachment 55543 [details]
> Arithmetic right shift late expanding v2
> 
> Here's the patch. I hope I did not miss anything.
> 

Sorry, I've been busy with other things.  I think your patch is OK, but I
wanted to test it in more detail before committing it.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (98 preceding siblings ...)
  2023-10-13  5:02 ` olegendo at gcc dot gnu.org
@ 2023-10-14 15:20 ` klepikov.alex+bugs at gmail dot com
  2023-10-15  1:06 ` olegendo at gcc dot gnu.org
  100 siblings, 0 replies; 102+ messages in thread
From: klepikov.alex+bugs at gmail dot com @ 2023-10-14 15:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #104 from Alexander Klepikov <klepikov.alex+bugs at gmail dot com> ---
(In reply to Oleg Endo from comment #103)
> Sorry, I've been busy with other things.  I think your patch is OK, but I
> wanted to test it in more detail before committing it.

Oh, it's OK. By the way, your contribution is not less than mine.

I've been thinking about something. I suspect that this patch could take work
away from other patches. I'm sorry, I don't know how to express myself
properly. I mean there's several patches that corrects shift patterns and 'tst'
instruction generation (most of them are written by you, by the way). I suspect
that some of them might not run anymore because this patch looks more general
and should cover more cases, including yet unknown cases, I hope. And, in the
end, dead code may appear because of it. I hope I was able to make my point
clearly.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [Bug target/54089] [SH] Refactor shift patterns
  2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
                   ` (99 preceding siblings ...)
  2023-10-14 15:20 ` klepikov.alex+bugs at gmail dot com
@ 2023-10-15  1:06 ` olegendo at gcc dot gnu.org
  100 siblings, 0 replies; 102+ messages in thread
From: olegendo at gcc dot gnu.org @ 2023-10-15  1:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #105 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Alexander Klepikov from comment #104)
> I've been thinking about something. I suspect that this patch could take
> work away from other patches. I'm sorry, I don't know how to express myself
> properly. I mean there's several patches that corrects shift patterns and
> 'tst' instruction generation (most of them are written by you, by the way).
> I suspect that some of them might not run anymore because this patch looks
> more general and should cover more cases, including yet unknown cases, I
> hope. And, in the end, dead code may appear because of it. I hope I was able
> to make my point clearly.

Yes, I understand what you're saying.  As other parts of the compiler evolve,
the RTL input that the backend code has to work with changes.  It's a normal
thing that happens during the course of development.  Some patterns might stop
working (especially those combine patterns are prone to that).  And sometimes
things magically start working because something got fixed somewhere else.

I've tried to add SH specific test cases to try and keep it in check.  Ideally
we'd have to go through all of the specific SH quirks in the backend
periodically, try to remove them one by one and run tests to see if the
patterns are still working/needed or whether they can be removed.

Let me know if you have more test cases (that work or don't work).

^ permalink raw reply	[flat|nested] 102+ messages in thread

end of thread, other threads:[~2023-10-15  1:07 UTC | newest]

Thread overview: 102+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-24 23:42 [Bug target/54089] New: [SH] Refactor shift patterns olegendo at gcc dot gnu.org
2012-07-24 23:46 ` [Bug target/54089] " olegendo at gcc dot gnu.org
2012-07-25 23:03 ` olegendo at gcc dot gnu.org
2012-07-27 17:36 ` olegendo at gcc dot gnu.org
2012-07-30  6:43 ` olegendo at gcc dot gnu.org
2012-08-09 21:55 ` olegendo at gcc dot gnu.org
2012-08-09 23:18 ` olegendo at gcc dot gnu.org
2012-08-09 23:28 ` olegendo at gcc dot gnu.org
2012-08-09 23:36 ` olegendo at gcc dot gnu.org
2012-08-09 23:43 ` olegendo at gcc dot gnu.org
2012-08-10  0:40 ` kkojima at gcc dot gnu.org
2012-08-10 14:25 ` rmansfield at qnx dot com
2012-08-10 15:40 ` olegendo at gcc dot gnu.org
2012-08-11 20:26 ` olegendo at gcc dot gnu.org
2012-08-16 23:13 ` olegendo at gcc dot gnu.org
2012-08-20 21:29 ` olegendo at gcc dot gnu.org
2012-08-22 22:52 ` olegendo at gcc dot gnu.org
2012-09-10 20:35 ` olegendo at gcc dot gnu.org
2012-09-10 21:27 ` olegendo at gcc dot gnu.org
2012-09-17 23:30 ` olegendo at gcc dot gnu.org
2012-09-19 17:49 ` olegendo at gcc dot gnu.org
2012-09-25 19:07 ` olegendo at gcc dot gnu.org
2012-09-30 18:46 ` olegendo at gcc dot gnu.org
2012-10-16 10:53 ` amylaar at gcc dot gnu.org
2012-10-16 11:49 ` olegendo at gcc dot gnu.org
2012-11-06 11:56 ` olegendo at gcc dot gnu.org
2012-11-07 23:32 ` olegendo at gcc dot gnu.org
2012-11-09 10:48 ` olegendo at gcc dot gnu.org
2012-11-09 13:29 ` amylaar at gcc dot gnu.org
2012-11-13  1:14 ` olegendo at gcc dot gnu.org
2013-02-16 11:36 ` olegendo at gcc dot gnu.org
2013-12-17 11:11 ` olegendo at gcc dot gnu.org
2014-05-16 23:12 ` olegendo at gcc dot gnu.org
2014-12-16 21:38 ` olegendo at gcc dot gnu.org
2015-05-30 13:51 ` bugdal at aerifal dot cx
2015-05-30 14:11 ` olegendo at gcc dot gnu.org
2015-05-30 14:35 ` bugdal at aerifal dot cx
2015-05-30 14:42 ` olegendo at gcc dot gnu.org
2015-05-30 14:46 ` bugdal at aerifal dot cx
2015-05-30 14:57 ` olegendo at gcc dot gnu.org
2015-08-11  3:56 ` bugdal at aerifal dot cx
2015-08-11 14:17 ` olegendo at gcc dot gnu.org
2015-08-11 21:05 ` segher at gcc dot gnu.org
2015-08-12  1:30 ` segher at gcc dot gnu.org
2023-06-01  7:32 ` klepikov.alex+bugs at gmail dot com
2023-06-01  8:21 ` olegendo at gcc dot gnu.org
2023-06-01 11:18 ` klepikov.alex+bugs at gmail dot com
2023-06-01 12:08 ` olegendo at gcc dot gnu.org
2023-06-01 17:45 ` segher at gcc dot gnu.org
2023-06-01 20:46 ` olegendo at gcc dot gnu.org
2023-06-03  7:24 ` klepikov.alex+bugs at gmail dot com
2023-06-03  8:50 ` olegendo at gcc dot gnu.org
2023-06-03 15:43 ` klepikov.alex+bugs at gmail dot com
2023-06-03 16:09 ` olegendo at gcc dot gnu.org
2023-06-04 18:35 ` klepikov.alex+bugs at gmail dot com
2023-06-05  0:03 ` olegendo at gcc dot gnu.org
2023-06-06 10:30 ` klepikov.alex+bugs at gmail dot com
2023-06-06 10:51 ` olegendo at gcc dot gnu.org
2023-06-06 12:17 ` klepikov.alex+bugs at gmail dot com
2023-06-06 12:39 ` olegendo at gcc dot gnu.org
2023-06-06 12:55 ` klepikov.alex+bugs at gmail dot com
2023-06-06 16:10 ` klepikov.alex+bugs at gmail dot com
2023-06-06 19:07 ` klepikov.alex+bugs at gmail dot com
2023-06-06 23:11 ` olegendo at gcc dot gnu.org
2023-06-08 12:07 ` klepikov.alex+bugs at gmail dot com
2023-06-08 12:09 ` klepikov.alex+bugs at gmail dot com
2023-06-08 13:22 ` olegendo at gcc dot gnu.org
2023-06-08 15:22 ` klepikov.alex+bugs at gmail dot com
2023-06-08 20:17 ` olegendo at gcc dot gnu.org
2023-06-14  9:30 ` klepikov.alex+bugs at gmail dot com
2023-06-14  9:31 ` klepikov.alex+bugs at gmail dot com
2023-06-16 13:57 ` klepikov.alex+bugs at gmail dot com
2023-06-16 21:58 ` olegendo at gcc dot gnu.org
2023-06-17  6:08 ` klepikov.alex+bugs at gmail dot com
2023-06-17  7:06 ` olegendo at gcc dot gnu.org
2023-06-17  8:24 ` klepikov.alex+bugs at gmail dot com
2023-06-17  9:46 ` olegendo at gcc dot gnu.org
2023-06-17 11:00 ` klepikov.alex+bugs at gmail dot com
2023-06-17 17:57 ` klepikov.alex+bugs at gmail dot com
2023-06-20 11:28 ` klepikov.alex+bugs at gmail dot com
2023-06-20 17:03 ` klepikov.alex+bugs at gmail dot com
2023-06-20 23:46 ` olegendo at gcc dot gnu.org
2023-06-21 11:51 ` klepikov.alex+bugs at gmail dot com
2023-06-21 20:13 ` segher at gcc dot gnu.org
2023-06-21 20:32 ` segher at gcc dot gnu.org
2023-06-22 11:09 ` klepikov.alex+bugs at gmail dot com
2023-06-22 11:44 ` olegendo at gcc dot gnu.org
2023-06-22 12:34 ` klepikov.alex+bugs at gmail dot com
2023-06-23  6:02 ` klepikov.alex+bugs at gmail dot com
2023-06-23  6:06 ` olegendo at gcc dot gnu.org
2023-06-23 14:11 ` segher at gcc dot gnu.org
2023-07-06 14:16 ` olegendo at gcc dot gnu.org
2023-07-07 11:10 ` klepikov.alex+bugs at gmail dot com
2023-07-07 11:45 ` olegendo at gcc dot gnu.org
2023-07-08  7:56 ` klepikov.alex+bugs at gmail dot com
2023-07-09 13:55 ` olegendo at gcc dot gnu.org
2023-07-10 14:02 ` klepikov.alex+bugs at gmail dot com
2023-07-13 23:40 ` olegendo at gcc dot gnu.org
2023-07-14 14:31 ` klepikov.alex+bugs at gmail dot com
2023-10-13  5:02 ` olegendo at gcc dot gnu.org
2023-10-14 15:20 ` klepikov.alex+bugs at gmail dot com
2023-10-15  1:06 ` olegendo at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).