public inbox for gcc-patches@gcc.gnu.org
* Patch for PR target/20632
       [not found] <200503251745.j2PHjhD6029735@napali.hpl.hp.com>
@ 2005-03-31 17:44 ` Vladimir Makarov
  2005-03-31 20:36   ` James E Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Makarov @ 2005-03-31 17:44 UTC (permalink / raw)
  To: James E Wilson; +Cc: davidm, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1892 bytes --]

David Mosberger wrote:

>Yesterday I posted this GCC bug report:
>
>  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20632
>
>I wouldn't normally mention this here, but I thought it might be of
>some interest because it describes a stall condition that occurs on
>all McKinley-derived cores.  Specifically, if you issue an F-unit
>instruction within 6 cycles of reading ar.bsp, ar.bspstore, ar.unat,
>ar.rnat, or cr.ifs, the processor will stall for the remainder of that
>6 cycle window.  I found this the hard way while tuning the ia64 linux
>syscall path, but it's easy to demonstrate with the attached program
>and I confirmed with the chip folks that this is indeed the expected (if
>not desired/ideal) behavior.
>
>The moral of the story is that it's generally a good idea to avoid
>F-unit instructions and, in particular, F-unit NOPs.
>
>Now, as for GCC: does anybody on this list understand how the new
>scheduler/bundler works?  I suspect it's easy to fix current GCC to
>avoid F (and B) unit NOPs for someone who's already somewhat familiar
>with that code.
>
>  
>
  The following patch changes how the ia64 back end chooses nops.
With the patch, F nops are used only when no other kind of nop is
possible.

The patch improves SPECFP2000 by 0.7% (from 442 to 445 on a 1.4GHz
Itanium2), mainly because of a big improvement (about 11%) on the art
benchmark.  The SPECINT2000 rate is unchanged.

The patch has been successfully tested by bootstrap.

Jim, are the changes in ia64.c OK with you?  After your formal approval
of the changes I'll commit the patch.

2005-03-31  Vladimir Makarov  <vmakarov@redhat.com>

	* genautomata.c (first_cycle_unit_presence): Check all alternative
	states for unit presence.

	* doc/md.texi: Remove remark about impossibility to query unit
	presence in nondeterministic automaton state.
	
	* config/ia64/ia64.c (get_template): Change order of unit querying.
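
The genautomata.c change can be illustrated outside GCC with a minimal
sketch.  The types and function names below are hypothetical stand-ins,
not the real genautomata.c data structures: a composite state is modelled
as a linked list of alternatives, and each alternative's first-cycle
reservations as a bitmask.  The old behaviour consulted only the first
alternative; the new one reports a unit as present if any alternative
reserves it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a deterministic state's first-cycle
   reservations: bit N set means unit N is reserved on cycle 0.  */
struct simple_state
{
  unsigned first_cycle_units;
};

/* Hypothetical stand-in for one alternative of a composite
   (nondeterministic) state.  */
struct alt_state
{
  const struct simple_state *state;
  const struct alt_state *next;
};

/* Behaviour before the patch: only the first alternative is checked.  */
static bool
unit_present_first_alt (const struct alt_state *alts, int unit_num)
{
  return (alts->state->first_cycle_units >> unit_num) & 1u;
}

/* Behaviour after the patch: the unit counts as present as soon as
   any alternative reserves it on the first cycle.  */
static bool
unit_present_any_alt (const struct alt_state *alts, int unit_num)
{
  for (const struct alt_state *a = alts; a != NULL; a = a->next)
    if ((a->state->first_cycle_units >> unit_num) & 1u)
      return true;
  return false;
}
```

With two alternatives where only the second reserves a given unit, the
first function reports the unit as absent while the second finds it.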


[-- Attachment #2: avoid_f_nops.patch --]
[-- Type: text/plain, Size: 5911 bytes --]

Index: genautomata.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/genautomata.c,v
retrieving revision 1.61
diff -c -d -p -r1.61 genautomata.c
*** genautomata.c	21 Feb 2005 14:39:49 -0000	1.61
--- genautomata.c	31 Mar 2005 16:53:56 -0000
*************** copy_equiv_class (vla_ptr_t *to, const v
*** 6120,6134 ****
  static int
  first_cycle_unit_presence (state_t state, int unit_num)
  {
!   int presence_p;
  
    if (state->component_states == NULL)
!     presence_p = test_unit_reserv (state->reservs, 0, unit_num);
    else
!     presence_p
!       = test_unit_reserv (state->component_states->state->reservs,
! 			  0, unit_num);
!   return presence_p;
  }
  
  /* The function returns nonzero value if STATE is not equivalent to
--- 6120,6138 ----
  static int
  first_cycle_unit_presence (state_t state, int unit_num)
  {
!   alt_state_t alt_state;
  
    if (state->component_states == NULL)
!     return test_unit_reserv (state->reservs, 0, unit_num);
    else
!     {
!       for (alt_state = state->component_states;
! 	   alt_state != NULL;
! 	   alt_state = alt_state->next_sorted_alt_state)
! 	if (test_unit_reserv (alt_state->state->reservs, 0, unit_num))
! 	  return true;
!     }
!   return false;
  }
  
  /* The function returns nonzero value if STATE is not equivalent to
Index: doc/md.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/md.texi,v
retrieving revision 1.126
diff -c -d -p -r1.126 md.texi
*** doc/md.texi	5 Mar 2005 19:56:29 -0000	1.126
--- doc/md.texi	31 Mar 2005 16:53:56 -0000
*************** the treatment of operator @samp{|} in th
*** 6232,6240 ****
  usual treatment of the operator is to try the first alternative and,
  if the reservation is not possible, the second alternative.  The
  nondeterministic treatment means trying all alternatives, some of them
! may be rejected by reservations in the subsequent insns.  You can not
! query functional unit reservations in nondeterministic automaton
! states.
  
  @item
  @dfn{progress} means output of a progress bar showing how many states
--- 6232,6238 ----
  usual treatment of the operator is to try the first alternative and,
  if the reservation is not possible, the second alternative.  The
  nondeterministic treatment means trying all alternatives, some of them
! may be rejected by reservations in the subsequent insns.
  
  @item
  @dfn{progress} means output of a progress bar showing how many states
Index: config/ia64/ia64.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/ia64/ia64.c,v
retrieving revision 1.350
diff -c -d -p -r1.350 ia64.c
*** config/ia64/ia64.c	17 Mar 2005 17:35:16 -0000	1.350
--- config/ia64/ia64.c	31 Mar 2005 16:53:56 -0000
*************** get_template (state_t state, int pos)
*** 6489,6510 ****
    switch (pos)
      {
      case 3:
!       if (cpu_unit_reservation_p (state, _0mii_))
! 	return 0;
!       else if (cpu_unit_reservation_p (state, _0mmi_))
  	return 1;
!       else if (cpu_unit_reservation_p (state, _0mfi_))
! 	return 2;
!       else if (cpu_unit_reservation_p (state, _0mmf_))
! 	return 3;
!       else if (cpu_unit_reservation_p (state, _0bbb_))
! 	return 4;
!       else if (cpu_unit_reservation_p (state, _0mbb_))
! 	return 5;
!       else if (cpu_unit_reservation_p (state, _0mib_))
! 	return 6;
        else if (cpu_unit_reservation_p (state, _0mmb_))
  	return 7;
        else if (cpu_unit_reservation_p (state, _0mfb_))
  	return 8;
        else if (cpu_unit_reservation_p (state, _0mlx_))
--- 6489,6510 ----
    switch (pos)
      {
      case 3:
!       if (cpu_unit_reservation_p (state, _0mmi_))
  	return 1;
!       else if (cpu_unit_reservation_p (state, _0mii_))
! 	return 0;
        else if (cpu_unit_reservation_p (state, _0mmb_))
  	return 7;
+       else if (cpu_unit_reservation_p (state, _0mib_))
+ 	return 6;
+       else if (cpu_unit_reservation_p (state, _0mbb_))
+ 	return 5;
+       else if (cpu_unit_reservation_p (state, _0bbb_))
+ 	return 4;
+       else if (cpu_unit_reservation_p (state, _0mmf_))
+ 	return 3;
+       else if (cpu_unit_reservation_p (state, _0mfi_))
+ 	return 2;
        else if (cpu_unit_reservation_p (state, _0mfb_))
  	return 8;
        else if (cpu_unit_reservation_p (state, _0mlx_))
*************** get_template (state_t state, int pos)
*** 6512,6533 ****
        else
  	abort ();
      case 6:
!       if (cpu_unit_reservation_p (state, _1mii_))
! 	return 0;
!       else if (cpu_unit_reservation_p (state, _1mmi_))
  	return 1;
!       else if (cpu_unit_reservation_p (state, _1mfi_))
! 	return 2;
!       else if (_1mmf_ >= 0 && cpu_unit_reservation_p (state, _1mmf_))
! 	return 3;
!       else if (cpu_unit_reservation_p (state, _1bbb_))
! 	return 4;
!       else if (cpu_unit_reservation_p (state, _1mbb_))
! 	return 5;
!       else if (cpu_unit_reservation_p (state, _1mib_))
! 	return 6;
        else if (cpu_unit_reservation_p (state, _1mmb_))
  	return 7;
        else if (cpu_unit_reservation_p (state, _1mfb_))
  	return 8;
        else if (cpu_unit_reservation_p (state, _1mlx_))
--- 6512,6533 ----
        else
  	abort ();
      case 6:
!       if (cpu_unit_reservation_p (state, _1mmi_))
  	return 1;
!       else if (cpu_unit_reservation_p (state, _1mii_))
! 	return 0;
        else if (cpu_unit_reservation_p (state, _1mmb_))
  	return 7;
+       else if (cpu_unit_reservation_p (state, _1mib_))
+ 	return 6;
+       else if (cpu_unit_reservation_p (state, _1mbb_))
+ 	return 5;
+       else if (cpu_unit_reservation_p (state, _1bbb_))
+ 	return 4;
+       else if (_1mmf_ >= 0 && cpu_unit_reservation_p (state, _1mmf_))
+ 	return 3;
+       else if (cpu_unit_reservation_p (state, _1mfi_))
+ 	return 2;
        else if (cpu_unit_reservation_p (state, _1mfb_))
  	return 8;
        else if (cpu_unit_reservation_p (state, _1mlx_))

* Re: Patch for PR target/20632
  2005-03-31 17:44 ` Patch for PR target/20632 Vladimir Makarov
@ 2005-03-31 20:36   ` James E Wilson
  2005-03-31 23:33     ` Vladimir Makarov
  0 siblings, 1 reply; 9+ messages in thread
From: James E Wilson @ 2005-03-31 20:36 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: David Mosberger, gcc-patches

On Thu, 2005-03-31 at 09:25, Vladimir Makarov wrote:
> Jim, are the changes in ia64.c OK with you?  After your formal approval
> of the changes I'll commit the patch.

You should add a comment to get_template explaining why we check
templates in that specific order, i.e. to prefer M/I nops over B nops
over F nops.  It would also be helpful to note here that F nops can
cause stalls in Itanium2.

Ok with that change.
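
The ordering requested above can be sketched as a table-driven search.
This is only an illustration: the template codes 0 to 8 are the return
values visible in the patch, but the bitmask argument and the
pick_template helper are hypothetical (the real get_template queries the
scheduler state with cpu_unit_reservation_p and aborts when nothing
matches).

```c
#include <stddef.h>

/* Template codes as returned by get_template in the patch.  */
enum template_code
{
  T_MII = 0, T_MMI = 1, T_MFI = 2, T_MMF = 3, T_BBB = 4,
  T_MBB = 5, T_MIB = 6, T_MMB = 7, T_MFB = 8
};

/* Search order after the patch: templates with only M/I slots first,
   then templates with B slots, then templates with an F slot.  */
static const enum template_code preference[] =
{
  T_MMI, T_MII, T_MMB, T_MIB, T_MBB, T_BBB, T_MMF, T_MFI, T_MFB
};

/* Hypothetical helper: AVAILABLE has bit T set when template T is
   possible in the current state.  Returns the preferred template
   code, or -1 when none is available.  */
static int
pick_template (unsigned available)
{
  for (size_t i = 0; i < sizeof preference / sizeof preference[0]; i++)
    if (available & (1u << preference[i]))
      return preference[i];
  return -1;
}
```

For a state in which both MFI and MIB are possible, the old order
(MII, MMI, MFI, ...) would have picked MFI, whereas this order picks MIB
and so avoids an F slot that might need a nop.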


* Re: Patch for PR target/20632
  2005-03-31 20:36   ` James E Wilson
@ 2005-03-31 23:33     ` Vladimir Makarov
  2005-04-01  8:09       ` David Mosberger
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Makarov @ 2005-03-31 23:33 UTC (permalink / raw)
  To: James E Wilson; +Cc: David Mosberger, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 714 bytes --]

James E Wilson wrote:

>You should add a comment to get_template explaining why we check
>templates in that specific order, i.e. to prefer M/I nops over B nops
>over F nops.  It would also be helpful to note here that F nops can
>cause stalls in Itanium2.
>
>Ok with that change.
>
>
>  
>
This is the final version of the patch; it has been committed to mainline.

2005-03-31  Vladimir Makarov  <vmakarov@redhat.com>

	PR target/20632
	* genautomata.c (first_cycle_unit_presence): Check all alternative
	states for unit presence.

	* doc/md.texi: Remove remark about impossibility to query unit
	presence in nondeterministic automaton state.
	
	* config/ia64/ia64.c (get_template): Change order of unit querying.


[-- Attachment #2: avoid_f_nops.patch --]
[-- Type: text/plain, Size: 6924 bytes --]

Index: genautomata.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/genautomata.c,v
retrieving revision 1.61
diff -c -d -p -r1.61 genautomata.c
*** genautomata.c	21 Feb 2005 14:39:49 -0000	1.61
--- genautomata.c	31 Mar 2005 23:25:16 -0000
*************** copy_equiv_class (vla_ptr_t *to, const v
*** 6120,6134 ****
  static int
  first_cycle_unit_presence (state_t state, int unit_num)
  {
!   int presence_p;
  
    if (state->component_states == NULL)
!     presence_p = test_unit_reserv (state->reservs, 0, unit_num);
    else
!     presence_p
!       = test_unit_reserv (state->component_states->state->reservs,
! 			  0, unit_num);
!   return presence_p;
  }
  
  /* The function returns nonzero value if STATE is not equivalent to
--- 6120,6138 ----
  static int
  first_cycle_unit_presence (state_t state, int unit_num)
  {
!   alt_state_t alt_state;
  
    if (state->component_states == NULL)
!     return test_unit_reserv (state->reservs, 0, unit_num);
    else
!     {
!       for (alt_state = state->component_states;
! 	   alt_state != NULL;
! 	   alt_state = alt_state->next_sorted_alt_state)
! 	if (test_unit_reserv (alt_state->state->reservs, 0, unit_num))
! 	  return true;
!     }
!   return false;
  }
  
  /* The function returns nonzero value if STATE is not equivalent to
Index: doc/md.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/md.texi,v
retrieving revision 1.126
diff -c -d -p -r1.126 md.texi
*** doc/md.texi	5 Mar 2005 19:56:29 -0000	1.126
--- doc/md.texi	31 Mar 2005 23:25:16 -0000
*************** the treatment of operator @samp{|} in th
*** 6232,6240 ****
  usual treatment of the operator is to try the first alternative and,
  if the reservation is not possible, the second alternative.  The
  nondeterministic treatment means trying all alternatives, some of them
! may be rejected by reservations in the subsequent insns.  You can not
! query functional unit reservations in nondeterministic automaton
! states.
  
  @item
  @dfn{progress} means output of a progress bar showing how many states
--- 6232,6238 ----
  usual treatment of the operator is to try the first alternative and,
  if the reservation is not possible, the second alternative.  The
  nondeterministic treatment means trying all alternatives, some of them
! may be rejected by reservations in the subsequent insns.
  
  @item
  @dfn{progress} means output of a progress bar showing how many states
Index: config/ia64/ia64.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/ia64/ia64.c,v
retrieving revision 1.352
diff -c -d -p -r1.352 ia64.c
*** config/ia64/ia64.c	30 Mar 2005 21:34:41 -0000	1.352
--- config/ia64/ia64.c	31 Mar 2005 23:25:16 -0000
*************** get_max_pos (state_t state)
*** 6481,6487 ****
  
  /* The function returns code of a possible template for given position
     and state.  The function should be called only with 2 values of
!    position equal to 3 or 6.  */
  
  static int
  get_template (state_t state, int pos)
--- 6481,6493 ----
  
  /* The function returns code of a possible template for given position
     and state.  The function should be called only with 2 values of
!    position equal to 3 or 6.  We avoid generating F NOPs by putting
!    templates containing F insns at the end of the template search
!    because of an undocumented anomaly in McKinley-derived cores which can
!    cause stalls if an F-unit insn (including a NOP) is issued within a
!    six-cycle window after reading certain application registers (such
!    as ar.bsp).  Furthermore, power considerations also argue against
!    the use of F-unit instructions unless they're really needed.  */
  
  static int
  get_template (state_t state, int pos)
*************** get_template (state_t state, int pos)
*** 6489,6510 ****
    switch (pos)
      {
      case 3:
!       if (cpu_unit_reservation_p (state, _0mii_))
! 	return 0;
!       else if (cpu_unit_reservation_p (state, _0mmi_))
  	return 1;
!       else if (cpu_unit_reservation_p (state, _0mfi_))
! 	return 2;
!       else if (cpu_unit_reservation_p (state, _0mmf_))
! 	return 3;
!       else if (cpu_unit_reservation_p (state, _0bbb_))
! 	return 4;
!       else if (cpu_unit_reservation_p (state, _0mbb_))
! 	return 5;
!       else if (cpu_unit_reservation_p (state, _0mib_))
! 	return 6;
        else if (cpu_unit_reservation_p (state, _0mmb_))
  	return 7;
        else if (cpu_unit_reservation_p (state, _0mfb_))
  	return 8;
        else if (cpu_unit_reservation_p (state, _0mlx_))
--- 6495,6516 ----
    switch (pos)
      {
      case 3:
!       if (cpu_unit_reservation_p (state, _0mmi_))
  	return 1;
!       else if (cpu_unit_reservation_p (state, _0mii_))
! 	return 0;
        else if (cpu_unit_reservation_p (state, _0mmb_))
  	return 7;
+       else if (cpu_unit_reservation_p (state, _0mib_))
+ 	return 6;
+       else if (cpu_unit_reservation_p (state, _0mbb_))
+ 	return 5;
+       else if (cpu_unit_reservation_p (state, _0bbb_))
+ 	return 4;
+       else if (cpu_unit_reservation_p (state, _0mmf_))
+ 	return 3;
+       else if (cpu_unit_reservation_p (state, _0mfi_))
+ 	return 2;
        else if (cpu_unit_reservation_p (state, _0mfb_))
  	return 8;
        else if (cpu_unit_reservation_p (state, _0mlx_))
*************** get_template (state_t state, int pos)
*** 6512,6533 ****
        else
  	abort ();
      case 6:
!       if (cpu_unit_reservation_p (state, _1mii_))
! 	return 0;
!       else if (cpu_unit_reservation_p (state, _1mmi_))
  	return 1;
!       else if (cpu_unit_reservation_p (state, _1mfi_))
! 	return 2;
!       else if (_1mmf_ >= 0 && cpu_unit_reservation_p (state, _1mmf_))
! 	return 3;
!       else if (cpu_unit_reservation_p (state, _1bbb_))
! 	return 4;
!       else if (cpu_unit_reservation_p (state, _1mbb_))
! 	return 5;
!       else if (cpu_unit_reservation_p (state, _1mib_))
! 	return 6;
        else if (cpu_unit_reservation_p (state, _1mmb_))
  	return 7;
        else if (cpu_unit_reservation_p (state, _1mfb_))
  	return 8;
        else if (cpu_unit_reservation_p (state, _1mlx_))
--- 6518,6539 ----
        else
  	abort ();
      case 6:
!       if (cpu_unit_reservation_p (state, _1mmi_))
  	return 1;
!       else if (cpu_unit_reservation_p (state, _1mii_))
! 	return 0;
        else if (cpu_unit_reservation_p (state, _1mmb_))
  	return 7;
+       else if (cpu_unit_reservation_p (state, _1mib_))
+ 	return 6;
+       else if (cpu_unit_reservation_p (state, _1mbb_))
+ 	return 5;
+       else if (cpu_unit_reservation_p (state, _1bbb_))
+ 	return 4;
+       else if (_1mmf_ >= 0 && cpu_unit_reservation_p (state, _1mmf_))
+ 	return 3;
+       else if (cpu_unit_reservation_p (state, _1mfi_))
+ 	return 2;
        else if (cpu_unit_reservation_p (state, _1mfb_))
  	return 8;
        else if (cpu_unit_reservation_p (state, _1mlx_))

* Re: Patch for PR target/20632
  2005-03-31 23:33     ` Vladimir Makarov
@ 2005-04-01  8:09       ` David Mosberger
  2005-04-01 21:47         ` Geert Bosch
  2005-04-01 22:24         ` Vladimir N. Makarov
  0 siblings, 2 replies; 9+ messages in thread
From: David Mosberger @ 2005-04-01  8:09 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: James E Wilson, David Mosberger, gcc-patches

>>>>> On Thu, 31 Mar 2005 18:30:38 -0500, Vladimir Makarov <vmakarov@redhat.com> said:

  Vlad> This is the final version of the patch; it has been committed
  Vlad> to mainline.

Cool.  I tried it on the Linux kernel and we're down to 72 "nop.f 0"
instructions (from 71199!).  The remaining nop.f's are
compiler-generated (not hand-coded) and they typically appear in
bundles like these:

       0f f8 c8 22 13 20       [MMF]       shladd r31=r50,4,r17
       00 00 00 02 00 00                   nop.m 0x0
       00 00 04 00                         nop.f 0x0;;

As far as I can see, "nop.f" only occurs in bundles where the
second slot contains a "nop.m 0".

More curiously, I found a handful of bundles containing only NOPs.
For example:

00 00 00 00 01 00       [MII]       nop.m 0x0
a0 02 ae 3e 29 00                   shr.u r42=r43,32
00 00 04 00                         nop.i 0x0
0f 00 00 00 01 00       [MMF]       nop.m 0x0
00 00 00 02 00 00                   nop.m 0x0
00 00 04 00                         nop.f 0x0;;

Is there a good reason to emit such a NOP-only bundle?
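
The bracketed template names in the dumps above can be read straight
from the raw bytes: an IA-64 bundle is 128 bits, stored little-endian,
and its low 5 bits are the template field.  Here is a minimal sketch of
that decoding.  The struct and function names are hypothetical, and the
hardware template numbers used here differ from the internal codes 0 to
8 that get_template returns.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct bundle_template
{
  const char *units;   /* e.g. "MMF"; NULL for a reserved encoding */
  bool stop_at_end;    /* true when the bundle ends with a stop (;;) */
};

/* Decode the template field of a 16-byte bundle given in memory
   order.  Bit 0 of the field marks a stop after the last slot;
   the remaining four bits select the unit combination.  Mid-bundle
   stops (encodings 0x02/0x03 and 0x0a/0x0b) are not reported
   separately in this sketch.  */
static struct bundle_template
decode_template (const uint8_t bundle[16])
{
  static const char *const names[16] = {
    "MII", "MII", "MLX", NULL, "MMI", "MMI", "MFI", "MMF",
    "MIB", "MBB", NULL, "BBB", "MMB", NULL, "MFB", NULL
  };
  unsigned t = bundle[0] & 0x1f;
  struct bundle_template r = { names[t >> 1], (t & 1) != 0 };
  return r;
}
```

For the bundles quoted above, a first byte of 0x0f decodes to MMF with a
trailing stop, and a first byte of 0x00 decodes to MII without one,
which agrees with the disassembly.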

Thanks,

	--david

* Re: Patch for PR target/20632
  2005-04-01  8:09       ` David Mosberger
@ 2005-04-01 21:47         ` Geert Bosch
  2005-04-01 22:31           ` David Mosberger
  2005-04-01 22:33           ` Vladimir N. Makarov
  2005-04-01 22:24         ` Vladimir N. Makarov
  1 sibling, 2 replies; 9+ messages in thread
From: Geert Bosch @ 2005-04-01 21:47 UTC (permalink / raw)
  To: davidm; +Cc: James E Wilson, gcc-patches, Vladimir Makarov

On Apr 1, 2005, at 03:06, David Mosberger wrote:
> More curiously, I found a handful of bundles containing only NOPs.
> For example:
>
> 00 00 00 00 01 00       [MII]       nop.m 0x0
> a0 02 ae 3e 29 00                   shr.u r42=r43,32
> 00 00 04 00                         nop.i 0x0
> 0f 00 00 00 01 00       [MMF]       nop.m 0x0
> 00 00 00 02 00 00                   nop.m 0x0
> 00 00 04 00                         nop.f 0x0;;
>
> Is there a good reason to emit such a NOP-only bundle?

This is a direct result of GCC's new tree-ssa optimizers performing
load- and store elimination and advanced strength-reduction on your
code. In combination with the new DFA scheduler's just-in-time
scheduling, which prevents executing any instructions until absolutely
necessary, this allows drastic code simplification.

Instead of executing complex instructions, such as memory loads and
stores that may take many cycles to complete, the processor now only
needs to execute these much simplified instructions. Taking advantage
of the explicit parallelism in the code, the current Itanium-2
processors are capable of executing up to 6 of these instructions in
one cycle, thereby fully realizing the promises of Intel's EPIC
architecture by computing absolutely nothing at unprecedented speeds.

   -Geert

* Re: Patch for PR target/20632
  2005-04-01  8:09       ` David Mosberger
  2005-04-01 21:47         ` Geert Bosch
@ 2005-04-01 22:24         ` Vladimir N. Makarov
  2005-04-01 23:12           ` David Mosberger
  1 sibling, 1 reply; 9+ messages in thread
From: Vladimir N. Makarov @ 2005-04-01 22:24 UTC (permalink / raw)
  To: davidm; +Cc: James E Wilson, gcc-patches

David Mosberger wrote:

>
>More curiously, I found a handful of bundles containing only NOPs.
>For example:
>
>00 00 00 00 01 00       [MII]       nop.m 0x0
>a0 02 ae 3e 29 00                   shr.u r42=r43,32
>00 00 04 00                         nop.i 0x0
>0f 00 00 00 01 00       [MMF]       nop.m 0x0
>00 00 00 02 00 00                   nop.m 0x0
>00 00 04 00                         nop.f 0x0;;
>
>Is there a good reason to emit such a NOP-only bundle?
>
>  
>
Of course there is no good reason for this.  My guess is that there are
some errors in the NDFA description.  It is a very non-trivial
description because bundle information has to be carried from one cycle
to the next, which is complicated by stop bits in the middle of bundles.
Jim pointed out one such problem to me a week ago.  I am going to look
at these problems, but I cannot promise that I'll fix them soon.

Vlad

* Re: Patch for PR target/20632
  2005-04-01 21:47         ` Geert Bosch
@ 2005-04-01 22:31           ` David Mosberger
  2005-04-01 22:33           ` Vladimir N. Makarov
  1 sibling, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-04-01 22:31 UTC (permalink / raw)
  To: Geert Bosch; +Cc: davidm, James E Wilson, gcc-patches, Vladimir Makarov

>>>>> On Fri, 1 Apr 2005 16:47:37 -0500, Geert Bosch <bosch@adacore.com> said:

  Geert> This is a direct result of GCC's new tree-ssa optimizers
  Geert> performing load- and store elimination and advanced
  Geert> strength-reduction on your code. In combination with the new
  Geert> DFA scheduler's just-in-time scheduling, which prevents
  Geert> executing any instructions until absolutely necessary, this
  Geert> allows drastic code simplification.

  Geert> Instead of executing complex instructions, such as memory
  Geert> loads and stores that may take many cycles to complete, the
  Geert> processor now only needs to execute these much simplified
  Geert> instructions. Taking advantage of the explicit parallelism in
  Geert> the code, the current Itanium-2 processors are capable of
  Geert> executing up to 6 of these instructions in one cycle, thereby
  Geert> fully realizing the promises of Intel's EPIC architecture by
  Geert> computing absolutely nothing at unprecedented speeds.

Oh, man, my brain almost got fried reading this, but then the
AutoDrink(TM) feature of my Google Gulp came to the rescue! ;-)

	--david

* Re: Patch for PR target/20632
  2005-04-01 21:47         ` Geert Bosch
  2005-04-01 22:31           ` David Mosberger
@ 2005-04-01 22:33           ` Vladimir N. Makarov
  1 sibling, 0 replies; 9+ messages in thread
From: Vladimir N. Makarov @ 2005-04-01 22:33 UTC (permalink / raw)
  To: Geert Bosch; +Cc: davidm, James E Wilson, gcc-patches

Geert Bosch wrote:

> Instead of executing complex instructions, such as memory loads and
> stores that may take many cycles to complete, the processor now only
> needs to execute these much simplified instructions. Taking advantage
> of the explicit parallelism in the code, the current Itanium-2 processors
> are capable of executing up to 6 of these instructions in one cycle,
> thereby fully realizing the promises of Intel's EPIC architecture by
> computing absolutely nothing at unprecedented speeds.

I got it :)  It is April 1st.  At least here we can see that the
processor does nothing.  You will never see that on modern x86 and
x86_64 processors.  They are black boxes (with their mysterious
micro-ops, which can be combined to produce other mysterious things),
resulting in a try-and-pray approach to optimization.

Vlad

* Re: Patch for PR target/20632
  2005-04-01 22:24         ` Vladimir N. Makarov
@ 2005-04-01 23:12           ` David Mosberger
  0 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-04-01 23:12 UTC (permalink / raw)
  To: Vladimir N. Makarov; +Cc: davidm, James E Wilson, gcc-patches

>>>>> On Fri, 01 Apr 2005 17:24:54 -0500, "Vladimir N. Makarov" <vmakarov@redhat.com> said:

  Vladimir> Of course there is no good reason for this.  My guess is
  Vladimir> that there are some errors in the NDFA description.  It is
  Vladimir> a very non-trivial description because bundle information
  Vladimir> has to be carried from one cycle to the next, which is
  Vladimir> complicated by stop bits in the middle of bundles.

I can imagine...

  Vladimir> Jim pointed out one such problem to me a week ago.  I am
  Vladimir> going to look at these problems, but I cannot promise that
  Vladimir> I'll fix them soon.

Sounds great.

Thanks!

	--david
