public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch] fixes for power4 scheduler description
@ 2012-07-10 12:20 Steven Bosscher
  2012-07-17 17:51 ` David Edelsohn
  0 siblings, 1 reply; 5+ messages in thread
From: Steven Bosscher @ 2012-07-10 12:20 UTC (permalink / raw)
  To: Alan Modra, GCC Patches; +Cc: David Edelsohn

Hello,

These look like typos:

* "power4-store-update" wants "iuX,iuY" for X=1|2 and Y=1|2. The
"iu2,iu1" case appeared twice.
* "power4-three" wants "iuX,iuX,iuY|iuX,iuY,iuY" for X=1|2 and Y=1|2.
The "iu1,iu1,iu2" case appeared twice.

Bootstrapped&tested on powerpc64-unknown-linux-gnu.
OK for trunk?

Note, it'd be nice if the size of the power4iu automaton could be
reduced somehow. It is by far the largest automaton in the rs6000 back
end, accounting for ~40% of the total.

Automaton `power4iu'
     8128 NDFA states,          68391 NDFA arcs
    12609 DFA states,           104521 DFA arcs
    10894 minimal DFA states,   89248 minimal DFA arcs
      683 all insns         23 insn equivalence classes
    0 locked states
107203 transition comb vector els, 250562 trans table els: use simple vect
250562 min delay table els, compression factor 1

(All:)
124282 all allocated states,     493955 all allocated arcs
811237 all allocated alternative states
248236 all transition comb vector els, 613376 all trans table els
613376 all min delay table els
    0 all locked states

For comparison, power7iu:

Automaton `power7iu'
     7697 NDFA states,          23034 NDFA arcs
     3738 DFA states,           10277 DFA arcs
     3690 minimal DFA states,   10056 minimal DFA arcs
      683 all insns          9 insn equivalence classes
    0 locked states
10690 transition comb vector els, 33210 trans table els: use comb vect
33210 min delay table els, compression factor 1

I don't understand well enough how the scheduler descriptions are
translated to DFAs, so I don't really understand why the power4iu
automaton needs so many table elts, but the above seems
disproportional to me.

Ciao!
Steven

        * config/rs6000/power4.md (power4-store-update): Fix reservation.
        (power4-three): Likewise.

Index: config/rs6000/power4.md
===================================================================
--- config/rs6000/power4.md     (revision 189388)
+++ config/rs6000/power4.md     (working copy)
@@ -145,7 +145,7 @@
     |(du3_power4+du4_power4,lsu2_power4))+\
    ((nothing,iu2_power4,iu1_power4)\
     |(nothing,iu2_power4,iu2_power4)\
-    |(nothing,iu1_power4,iu2_power4)\
+    |(nothing,iu1_power4,iu1_power4)\
     |(nothing,iu1_power4,iu2_power4))")

 (define_insn_reservation "power4-store-update-indexed" 12
@@ -212,7 +212,7 @@
    ((iu1_power4,nothing,iu2_power4,nothing,iu2_power4)\
     |(iu2_power4,nothing,iu2_power4,nothing,iu1_power4)\
     |(iu2_power4,nothing,iu1_power4,nothing,iu1_power4)\
-    |(iu1_power4,nothing,iu2_power4,nothing,iu2_power4))")
+    |(iu1_power4,nothing,iu1_power4,nothing,iu2_power4))")

 (define_insn_reservation "power4-insert" 4
   (and (eq_attr "type" "insert_word")

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fixes for power4 scheduler description
  2012-07-10 12:20 [patch] fixes for power4 scheduler description Steven Bosscher
@ 2012-07-17 17:51 ` David Edelsohn
  2012-07-19 18:44   ` Pat Haugen
  0 siblings, 1 reply; 5+ messages in thread
From: David Edelsohn @ 2012-07-17 17:51 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: GCC Patches, Vladimir Makarov, Pat Haugen

On Tue, Jul 10, 2012 at 8:20 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> Hello,
>
> These look like typos:
>
> * "power4-store-update" wants "iuX,iuY" for X=1|2 and Y=1|2. The
> "iu2,iu1" case appeared twice.
> * "power4-three" wants "iuX,iuX,iuY|iuX,iuY,iuY" for X=1|2 and Y=1|2.
> The "iu1,iu1,iu2" case appeared twice.
>
> Bootstrapped&tested on powerpc64-unknown-linux-gnu.
> OK for trunk?
>
> Note, it'd be nice if the size of the power4iu automaton could be
> reduced somehow. It is by far the largest automaton in the rs6000 back
> end, accounting for ~40% of the total.

Steven,

The change to the power3-three reservation is okay.

The change to power4-store-update looks incorrect or at least incomplete.

These reservations and others were changed by Vlad in March/April 2004
to fix a consistency check that he introduced at the time. Note that
the dispatch units for the final choice also is wrong and duplicated
from the previous line for power4-store-update:

   |(du3_power4+du4_power4,lsu2_power4)\
    |(du3_power4+du4_power4,lsu2_power4))+\

It looks like the final line should be

du4_power4+du1_power4,lsu1_power4

On POWER4, the dispatch slot forces specific function units.  I don't
remember and don't have the documents handy to know if the function
units for store-update should be

2-1
2-2
1-1
1-2

or

2-1
2-2
1-2
1-1

I think it is the latter.

If you look at the thread from 2009, we discuss that the POWER4
scheduler description already is an approximation because an accurate
description creates an unreasonably large automata.  A lot of the
problem is the 1 cycle delay for dependent integer ops.

Thanks, David

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fixes for power4 scheduler description
  2012-07-17 17:51 ` David Edelsohn
@ 2012-07-19 18:44   ` Pat Haugen
  2012-07-20 20:35     ` David Edelsohn
  0 siblings, 1 reply; 5+ messages in thread
From: Pat Haugen @ 2012-07-19 18:44 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Steven Bosscher, GCC Patches, Vladimir Makarov

On 07/17/2012 12:51 PM, David Edelsohn wrote:
> The change to power4-store-update looks incorrect or at least incomplete.
>
> These reservations and others were changed by Vlad in March/April 2004
> to fix a consistency check that he introduced at the time. Note that
> the dispatch units for the final choice also is wrong and duplicated
> from the previous line for power4-store-update:
>
>     |(du3_power4+du4_power4,lsu2_power4)\
>      |(du3_power4+du4_power4,lsu2_power4))+\
>
> It looks like the final line should be
>
> du4_power4+du1_power4,lsu1_power4

This 4th line should be removed altogether. If only the 4th dispatch slot is available and a cracked (or microcoded) insn is next, a nop will fill that slot and the cracked insn will start a new dispatch group.

>
> On POWER4, the dispatch slot forces specific function units.  I don't
> remember and don't have the documents handy to know if the function
> units for store-update should be
>
> 2-1
> 2-2
> 1-1
> 1-2
>
> or
>
> 2-1
> 2-2
> 1-2
> 1-1
>
> I think it is the latter.

On Power4, the dispatch to issue/execution unit affinity is as follows (4 dispatch slots, 2 FXU/LSU issue/execution units):
1 -> 1
2 -> 2
3 -> 2
4 -> 1

Update form stores are cracked into a store and addi, and the store is dual issued to the LSU/FXU. So based on this info I would think it should look like the following (untested).

@@ -141,12 +141,10 @@ (define_insn_reservation "power4-store-u
         (eq_attr "cpu" "power4"))
    "((du1_power4+du2_power4,lsu1_power4)\
      |(du2_power4+du3_power4,lsu2_power4)\
-    |(du3_power4+du4_power4,lsu2_power4)\
      |(du3_power4+du4_power4,lsu2_power4))+\
-   ((nothing,iu2_power4,iu1_power4)\
+   ((nothing,iu1_power4,iu2_power4)\
      |(nothing,iu2_power4,iu2_power4)\
-    |(nothing,iu1_power4,iu2_power4)\
-    |(nothing,iu1_power4,iu2_power4))")
+    |(nothing,iu2_power4,iu1_power4))")

-Pat

>
> If you look at the thread from 2009, we discuss that the POWER4
> scheduler description already is an approximation because an accurate
> description creates an unreasonably large automata.  A lot of the
> problem is the 1 cycle delay for dependent integer ops.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fixes for power4 scheduler description
  2012-07-19 18:44   ` Pat Haugen
@ 2012-07-20 20:35     ` David Edelsohn
  2012-07-20 22:05       ` Pat Haugen
  0 siblings, 1 reply; 5+ messages in thread
From: David Edelsohn @ 2012-07-20 20:35 UTC (permalink / raw)
  To: Pat Haugen; +Cc: Steven Bosscher, GCC Patches, Vladimir Makarov

On Thu, Jul 19, 2012 at 2:43 PM, Pat Haugen <pthaugen@linux.vnet.ibm.com> wrote:

> Update form stores are cracked into a store and addi, and the store is dual
> issued to the LSU/FXU. So based on this info I would think it should look
> like the following (untested).
>
> @@ -141,12 +141,10 @@ (define_insn_reservation "power4-store-u
>         (eq_attr "cpu" "power4"))
>    "((du1_power4+du2_power4,lsu1_power4)\
>      |(du2_power4+du3_power4,lsu2_power4)\
> -    |(du3_power4+du4_power4,lsu2_power4)\
>      |(du3_power4+du4_power4,lsu2_power4))+\
> -   ((nothing,iu2_power4,iu1_power4)\
> +   ((nothing,iu1_power4,iu2_power4)\
>      |(nothing,iu2_power4,iu2_power4)\
> -    |(nothing,iu1_power4,iu2_power4)\
> -    |(nothing,iu1_power4,iu2_power4))")
> +    |(nothing,iu2_power4,iu1_power4))")

Looks good.  Check it in after testing.

Thanks, David

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fixes for power4 scheduler description
  2012-07-20 20:35     ` David Edelsohn
@ 2012-07-20 22:05       ` Pat Haugen
  0 siblings, 0 replies; 5+ messages in thread
From: Pat Haugen @ 2012-07-20 22:05 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Steven Bosscher, GCC Patches, Vladimir Makarov

On 07/20/2012 03:34 PM, David Edelsohn wrote:
> Looks good.  Check it in after testing.
>

Testing went fine, patch checked in.

-Pat

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-07-20 22:05 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-10 12:20 [patch] fixes for power4 scheduler description Steven Bosscher
2012-07-17 17:51 ` David Edelsohn
2012-07-19 18:44   ` Pat Haugen
2012-07-20 20:35     ` David Edelsohn
2012-07-20 22:05       ` Pat Haugen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).