public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH,i386] fix PR 11001
@ 2007-10-12 17:26 Uros Bizjak
  0 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2007-10-12 17:26 UTC (permalink / raw)
  To: GCC Patches; +Cc: Nathan Froyd, H. J. Lu

Hello!

> > +      /* Can't use this if the user has appropriated eax, ecx, or edi.  */
> > +      if (global_regs[0] || global_regs[2] || global_regs[5])
> > +        return false;
> > +
>
> Can you use/add AX_REG, CX_REG, DI_REG, SI_REG instead using 0, 2,
> 4, 5?
>   

I'll commit the update as soon as bootstrap finishes. A couple of 
constants has to be added to i386.md.

Uros.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH,i386] fix PR 11001
@ 2007-10-04  6:34 Uros Bizjak
  2007-10-04 20:55 ` Nathan Froyd
  0 siblings, 1 reply; 11+ messages in thread
From: Uros Bizjak @ 2007-10-04  6:34 UTC (permalink / raw)
  To: GCC Patches; +Cc: Nathan Froyd

Hello!

> The fix is to ensure that the registers are available before generating
> the instructions.  Note that the code is not optimal: in the memset
> case, for instance, if we choose an inlining strategy requiring 'rep
> stosl' and then discover that the necessary registers are not available,
> we generate a full call to 'memset' rather than generating an inline
> copy loop.  I don't see this as a serious defect; if you are using
> register globals on the x86, you deserve a performance penalty.


> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c	(revision 128981)
> +++ gcc/config/i386/i386.c	(working copy)
> @@ -15286,6 +15286,13 @@ ix86_expand_movmem (rtx dst, rtx src, rt
>        break;
>      }
>
> +  /* Can't use this if the user has appropriated ecx, esi, or edi.  */
> +  if ((alg == rep_prefix_1_byte
> +       || alg == rep_prefix_4_byte
> +       || alg == rep_prefix_8_byte)
> +      && (global_regs[2] || global_regs[4] || global_regs[5]))
> +    return 0;
> +

I think that you should put this check into decide_alg(). There you
can decide between copy loop and libcall, also taking into account
optimize_size flag, as well as TARGET_INLINE_ALL_STRINGOPS and
TARGET_INLINE_ALL_STRINGOPS_DYNAMICALLY.

Please note, that rep_prefix_* algorithms are used for larger blocks.
Perhaps in your case, we should scan algorithms table backwards until
we hit other non-rep_prefix algorithm. If we found none, we should
default to a loop algorithm for optimize_size or to a libcall for
other cases.

Uros.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* [PATCH,i386] fix PR 11001
@ 2007-10-03 15:01 Nathan Froyd
  0 siblings, 0 replies; 11+ messages in thread
From: Nathan Froyd @ 2007-10-03 15:01 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2075 bytes --]

The attached patch fixes PR11001, a long-standing bad interaction
between register globals and strlen/memset/memcpy on the x86.  The
primary issue is that the expanders for those functions would generate
(in some cases) x86 instructions that required particular registers
without checking whether those registers were available for use.

The fix is to ensure that the registers are available before generating
the instructions.  Note that the code is not optimal: in the memset
case, for instance, if we choose an inlining strategy requiring 'rep
stosl' and then discover that the necessary registers are not available,
we generate a full call to 'memset' rather than generating an inline
copy loop.  I don't see this as a serious defect; if you are using
register globals on the x86, you deserve a performance penalty.

The testcase in the PR no longer ICEs on mainline--it does produce
incorrect code, but that is mostly due to the semantics of
parameter-scope register variables differing from global reigster
variables and is therefore a bug in the code.  It is not included.  The
nine testcases included in the patch *do* ICE on current mainline,
however, and no longer ICE with the patch.

Tested on x86_64-unknown-linux-gnu with no regressions.  OK to commit?

-Nathan

2007-10-03  Nathan Froyd  <froydnj@codesourcery.com>

	PR 11001
	gcc/
	* config/i386/i386.md (strmov): Check for esi and edi usage.
	* config/i386/i386.c (ix86_expand_movmem): Check for ecx, esi,
	and edi usage.
	(ix86_expand_setmem): Check for eax, ecx, and edi usage.
	(ix86_expand_strlen): Likewise.

	gcc/testsuite/
	* gcc.target/i386/pr11001-strlen-1.c: New testcase.
	* gcc.target/i386/pr11001-strlen-2.c: New testcase.
	* gcc.target/i386/pr11001-strlen-3.c: New testcase.
	* gcc.target/i386/pr11001-memset-1.c: New testcase.
	* gcc.target/i386/pr11001-memset-2.c: New testcase.
	* gcc.target/i386/pr11001-memset-3.c: New testcase.
	* gcc.target/i386/pr11001-memcpy-1.c: New testcase.
	* gcc.target/i386/pr11001-memcpy-2.c: New testcase.
	* gcc.target/i386/pr11001-memcpy-3.c: New testcase.


[-- Attachment #2: pr11001.patch --]
[-- Type: text/x-diff, Size: 9006 bytes --]

Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 128981)
+++ gcc/config/i386/i386.md	(working copy)
@@ -18699,7 +18699,9 @@ (define_expand "strmov"
   operands[5] = gen_rtx_PLUS (Pmode, operands[0], adjust);
   operands[6] = gen_rtx_PLUS (Pmode, operands[2], adjust);
 
-  if (TARGET_SINGLE_STRINGOP || optimize_size)
+  /* Can't use this if the user has appropriated esi or edi.  */
+  if ((TARGET_SINGLE_STRINGOP || optimize_size)
+      && !(global_regs[4] || global_regs[5]))
     {
       emit_insn (gen_strmov_singleop (operands[0], operands[1],
 				      operands[2], operands[3],
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 128981)
+++ gcc/config/i386/i386.c	(working copy)
@@ -15286,6 +15286,13 @@ ix86_expand_movmem (rtx dst, rtx src, rt
       break;
     }
 
+  /* Can't use this if the user has appropriated ecx, esi, or edi.  */
+  if ((alg == rep_prefix_1_byte
+       || alg == rep_prefix_4_byte
+       || alg == rep_prefix_8_byte)
+      && (global_regs[2] || global_regs[4] || global_regs[5]))
+    return 0;
+
   epilogue_size_needed = size_needed;
 
   /* Step 1: Prologue guard.  */
@@ -15594,6 +15601,14 @@ ix86_expand_setmem (rtx dst, rtx count_e
       size_needed = 1;
       break;
     }
+
+  /* Can't use this if the user has appropriated eax, ecx, or edi.  */
+  if ((alg == rep_prefix_1_byte
+       || alg == rep_prefix_4_byte
+       || alg == rep_prefix_8_byte)
+      && (global_regs[0] || global_regs[2] || global_regs[5]))
+    return 0;
+
   epilogue_size_needed = size_needed;
 
   /* Step 1: Prologue guard.  */
@@ -15985,6 +16000,11 @@ ix86_expand_strlen (rtx out, rtx src, rt
   else
     {
       rtx unspec;
+
+      /* Can't use this if the user has appropriated eax, ecx, or edi.  */
+      if (global_regs[0] || global_regs[2] || global_regs[5])
+        return false;
+
       scratch2 = gen_reg_rtx (Pmode);
       scratch3 = gen_reg_rtx (Pmode);
       scratch4 = force_reg (Pmode, constm1_rtx);
Index: gcc/testsuite/gcc.target/i386/pr11001-strlen-2.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-strlen-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-strlen-2.c	(revision 0)
@@ -0,0 +1,16 @@
+/* Ensure that we don't use 'repnz scasb' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-O1 -m32" } */
+
+extern __SIZE_TYPE__ strlen (const char *);
+extern void *malloc (__SIZE_TYPE__);
+
+register int regvar asm("%eax"); /* { dg-warning "call-clobbered register" } */
+
+char *
+do_copy (char *str)
+{
+  return malloc (strlen (str) + 1);
+}
+
+/* { dg-final { scan-assembler-not "repnz scasb" } } */
Index: gcc/testsuite/gcc.target/i386/pr11001-memset-2.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-memset-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-memset-2.c	(revision 0)
@@ -0,0 +1,23 @@
+/* Ensure that we don't use 'rep stoX' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-Os -m32" } */
+
+extern void *memset (void *, int, __SIZE_TYPE__);
+
+register int regvar asm("%ecx"); /* { dg-warning "call-clobbered register" } */
+
+int foo[10];
+int bar[10];
+
+char baz[15];
+char quux[15];
+
+void
+do_copy ()
+{
+  memset (foo, 0, sizeof foo);
+  memset (baz, 0, sizeof baz);
+}
+
+/* { dg-final { scan-assembler-not "rep stosl" } } */
+/* { dg-final { scan-assembler-not "rep stosb" } } */
Index: gcc/testsuite/gcc.target/i386/pr11001-strlen-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-strlen-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-strlen-3.c	(revision 0)
@@ -0,0 +1,16 @@
+/* Ensure that we don't use 'repnz scasb' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-O1 -m32" } */
+
+extern __SIZE_TYPE__ strlen (const char *);
+extern void *malloc (__SIZE_TYPE__);
+
+register int regvar asm("%ecx"); /* { dg-warning "call-clobbered register" } */
+
+char *
+do_copy (char *str)
+{
+  return malloc (strlen (str) + 1);
+}
+
+/* { dg-final { scan-assembler-not "repnz scasb" } } */
Index: gcc/testsuite/gcc.target/i386/pr11001-memset-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-memset-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-memset-3.c	(revision 0)
@@ -0,0 +1,23 @@
+/* Ensure that we don't use 'rep stoX' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-Os -m32" } */
+
+extern void *memset (void *, int, __SIZE_TYPE__);
+
+register int regvar asm("%edi");
+
+int foo[10];
+int bar[10];
+
+char baz[15];
+char quux[15];
+
+void
+do_copy ()
+{
+  memset (foo, 0, sizeof foo);
+  memset (baz, 0, sizeof baz);
+}
+
+/* { dg-final { scan-assembler-not "rep stosl" } } */
+/* { dg-final { scan-assembler-not "rep stosb" } } */
Index: gcc/testsuite/gcc.target/i386/pr11001-memcpy-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-memcpy-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-memcpy-1.c	(revision 0)
@@ -0,0 +1,23 @@
+/* Ensure that we don't use 'rep movX' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-Os -m32" } */
+
+extern void *memcpy (void *, const void *, __SIZE_TYPE__);
+
+register int regvar asm("%esi");
+
+int foo[10];
+int bar[10];
+
+char baz[15];
+char quux[15];
+
+void
+do_copy ()
+{
+  memcpy (foo, bar, sizeof foo);
+  memcpy (baz, quux, sizeof baz);
+}
+
+/* { dg-final { scan-assembler-not "rep movsl" } } */
+/* { dg-final { scan-assembler-not "rep movsb" } } */
Index: gcc/testsuite/gcc.target/i386/pr11001-memcpy-2.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-memcpy-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-memcpy-2.c	(revision 0)
@@ -0,0 +1,23 @@
+/* Ensure that we don't use 'rep movX' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-Os -m32" } */
+
+extern void *memcpy (void *, const void *, __SIZE_TYPE__);
+
+register int regvar asm("%edi");
+
+int foo[10];
+int bar[10];
+
+char baz[15];
+char quux[15];
+
+void
+do_copy ()
+{
+  memcpy (foo, bar, sizeof foo);
+  memcpy (baz, quux, sizeof baz);
+}
+
+/* { dg-final { scan-assembler-not "rep movsl" } } */
+/* { dg-final { scan-assembler-not "rep movsb" } } */
Index: gcc/testsuite/gcc.target/i386/pr11001-memcpy-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-memcpy-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-memcpy-3.c	(revision 0)
@@ -0,0 +1,23 @@
+/* Ensure that we don't use 'rep movX' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-Os -m32" } */
+
+extern void *memcpy (void *, const void *, __SIZE_TYPE__);
+
+register int regvar asm("%ecx"); /* { dg-warning "call-clobbered register" } */
+
+int foo[10];
+int bar[10];
+
+char baz[15];
+char quux[15];
+
+void
+do_copy ()
+{
+  memcpy (foo, bar, sizeof foo);
+  memcpy (baz, quux, sizeof baz);
+}
+
+/* { dg-final { scan-assembler-not "rep movsl" } } */
+/* { dg-final { scan-assembler-not "rep movsb" } } */
Index: gcc/testsuite/gcc.target/i386/pr11001-strlen-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-strlen-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-strlen-1.c	(revision 0)
@@ -0,0 +1,16 @@
+/* Ensure that we don't use 'repnz scasb' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-O1 -m32" } */
+
+extern __SIZE_TYPE__ strlen (const char *);
+extern void *malloc (__SIZE_TYPE__);
+
+register int regvar asm("%edi");
+
+char *
+do_copy (char *str)
+{
+  return malloc (strlen (str) + 1);
+}
+
+/* { dg-final { scan-assembler-not "repnz scasb" } } */
Index: gcc/testsuite/gcc.target/i386/pr11001-memset-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr11001-memset-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr11001-memset-1.c	(revision 0)
@@ -0,0 +1,23 @@
+/* Ensure that we don't use 'rep stoX' in the presence of register globals.  */
+/* { dg-do compile } */
+/* { dg-options "-Os -m32" } */
+
+extern void *memset (void *, int, __SIZE_TYPE__);
+
+register int regvar asm("%eax"); /* { dg-warning "call-clobbered register" } */
+
+int foo[10];
+int bar[10];
+
+char baz[15];
+char quux[15];
+
+void
+do_copy ()
+{
+  memset (foo, 0, sizeof foo);
+  memset (baz, 0, sizeof baz);
+}
+
+/* { dg-final { scan-assembler-not "rep stosl" } } */
+/* { dg-final { scan-assembler-not "rep stosb" } } */

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2007-10-12 19:40 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-10-12 17:26 [PATCH,i386] fix PR 11001 Uros Bizjak
  -- strict thread matches above, loose matches on Subject: below --
2007-10-04  6:34 Uros Bizjak
2007-10-04 20:55 ` Nathan Froyd
2007-10-05  5:59   ` Uros Bizjak
2007-10-12 16:13     ` Nathan Froyd
2007-10-12 16:40       ` H.J. Lu
2007-10-12 17:13         ` Nathan Froyd
2007-10-12 17:28           ` Uros Bizjak
2007-10-12 19:34           ` H.J. Lu
2007-10-12 19:40             ` Nathan Froyd
2007-10-03 15:01 Nathan Froyd

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).