public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
@ 2008-05-20 13:07 Uros Bizjak
  2008-05-20 16:27 ` Richard Guenther
  2008-05-20 16:29 ` Jakub Jelinek
  0 siblings, 2 replies; 12+ messages in thread
From: Uros Bizjak @ 2008-05-20 13:07 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 2216 bytes --]

Hello!

OK, here is the controversial patch that enables cld instruction with
-mcld. It includes --enable-cld configure option, and in this case
-mno-cld disables generation of cld instructions.

2008-05-20  Uros Bizjak  <ubizjak@gmail.com>

        PR target/36079
        * configure.ac: Handle --enable-cld.
        * configure.in: Regenerate.
        * configure.ac: Ditto.
        * config/i386/i386.h (TARGET_CLD): New define.
        (struct machine_function): Add needs_cld field.
        (ix86_current_function_needs_cld): New define.
        * config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
        ("cld"): New isns pattern.
        ("strmov_singleop"): Set ix86_current_function_needs_cld flag.
        ("rep_mov"): Ditto.
        ("strset_singleop"): Ditto.
        ("rep_stos"): Ditto.
        ("cmpstrnqi_nz_1"): Ditto.
        ("cmpstrnqi_1"): Ditto.
        ("strlenqi_1"): Ditto.
        * config/i386/i386.opt (mcld, mno-cld): New options.
        * config/i386/i386.c (ix86_expand_prologue): Emit cld insn for
        when ix86_current_function_needs_cld is set and when combination
        of configure and compile options allow it.

The patch was bootstrapped and regression tested on i686-pc-linux-gnu
a month or so ago. Currently vanila (without patch!) gcc-4_3 branch
doesn't build for me, it dies with:

gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall
-Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition -Wmissing-format-attribute -pedantic
-Wno-long-long -Wno-variadic-macros      -Wno-overlength-strings
-fno-common   -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild
-I../../gcc-svn/branches/gcc-4_3-branch/gcc
-I../../gcc-svn/branches/gcc-4_3-branch/gcc/build
-I../../gcc-svn/branches/gcc-4_3-branch/gcc/../include
-I../../gcc-svn/branches/gcc-4_3-branch/gcc/../libcpp/include
-I/usr/local/include
-I../../gcc-svn/branches/gcc-4_3-branch/gcc/../libdecnumber
-I../../gcc-svn/branches/gcc-4_3-branch/gcc/../libdecnumber/bid
-I../libdecnumber    -o build/errors.o
../../gcc-svn/branches/gcc-4_3-branch/gcc/errors.c
build/genmodes -h > tmp-modes.h
/bin/sh: build/genmodes: No such file or directory
gmake[3]: *** [s-modes-h] Error 127

Uros.

[-- Attachment #2: cld.diff.txt --]
[-- Type: text/plain, Size: 8478 bytes --]

Index: gcc/configure
===================================================================
--- gcc/configure	(revision 135609)
+++ gcc/configure	(working copy)
@@ -898,6 +898,7 @@ Optional Features:
   --enable-sjlj-exceptions
                           arrange to use setjmp/longjmp exception handling
   --enable-secureplt      enable -msecure-plt by default for PowerPC
+  --enable-cld            enable -mcld by default for 32bit x86
   --disable-win32-registry
                           disable lookup of installation paths in the
                           Registry on Windows hosts
@@ -12951,6 +12952,22 @@ if test "${enable_secureplt+set}" = set;
 
 fi;
 
+# Check whether --enable-cld or --disable-cld was given.
+if test "${enable_cld+set}" = set; then
+  enableval="$enable_cld"
+
+else
+  enable_cld=no
+fi;
+
+if test "$enable_cld" = yes ; then
+
+cat >>confdefs.h <<\_ACEOF
+#define USE_IX86_CLD 1
+_ACEOF
+
+fi
+
 # Windows32 Registry support for specifying GCC installation paths.
 # Check whether --enable-win32-registry or --disable-win32-registry was given.
 if test "${enable_win32_registry+set}" = set; then
Index: gcc/config.in
===================================================================
--- gcc/config.in	(revision 135609)
+++ gcc/config.in	(working copy)
@@ -1369,37 +1369,37 @@
 #endif
 
 
-/* The size of `int', as computed by sizeof. */
+/* The size of a `int', as computed by sizeof. */
 #ifndef USED_FOR_TARGET
 #undef SIZEOF_INT
 #endif
 
 
-/* The size of `long', as computed by sizeof. */
+/* The size of a `long', as computed by sizeof. */
 #ifndef USED_FOR_TARGET
 #undef SIZEOF_LONG
 #endif
 
 
-/* The size of `long long', as computed by sizeof. */
+/* The size of a `long long', as computed by sizeof. */
 #ifndef USED_FOR_TARGET
 #undef SIZEOF_LONG_LONG
 #endif
 
 
-/* The size of `short', as computed by sizeof. */
+/* The size of a `short', as computed by sizeof. */
 #ifndef USED_FOR_TARGET
 #undef SIZEOF_SHORT
 #endif
 
 
-/* The size of `void *', as computed by sizeof. */
+/* The size of a `void *', as computed by sizeof. */
 #ifndef USED_FOR_TARGET
 #undef SIZEOF_VOID_P
 #endif
 
 
-/* The size of `__int64', as computed by sizeof. */
+/* The size of a `__int64', as computed by sizeof. */
 #ifndef USED_FOR_TARGET
 #undef SIZEOF___INT64
 #endif
@@ -1441,6 +1441,13 @@
 #endif
 
 
+/* Define if gcc should emit cld at the function entry when stringops are
+   used. */
+#ifndef USED_FOR_TARGET
+#undef USE_IX86_CLD
+#endif
+
+
 /* Define to 1 if the 'long long' (or '__int64') is wider than 'long' but
    still efficiently supported by the host hardware. */
 #ifndef USED_FOR_TARGET
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac	(revision 135609)
+++ gcc/configure.ac	(working copy)
@@ -1548,6 +1548,15 @@ AC_ARG_ENABLE(secureplt,
 [  --enable-secureplt      enable -msecure-plt by default for PowerPC],
 [], [])
 
+AC_ARG_ENABLE(cld,
+[  --enable-cld            enable -mcld by default for 32bit x86], [],
+[enable_cld=no])
+
+if test "$enable_cld" = yes ; then
+  AC_DEFINE(USE_IX86_CLD, 1,
+[Define if gcc should emit cld at the function entry when stringops are used.])
+fi
+
 # Windows32 Registry support for specifying GCC installation paths.
 AC_ARG_ENABLE(win32-registry,
 [  --disable-win32-registry
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 135609)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2443,8 +2443,9 @@ struct machine_function GTY(())
   int save_varrargs_registers;
   int accesses_prev_frame;
   int optimize_mode_switching[MAX_386_ENTITIES];
-  /* Set by ix86_compute_frame_layout and used by prologue/epilogue expander to
-     determine the style used.  */
+  int needs_cld;
+  /* Set by ix86_compute_frame_layout and used by prologue/epilogue
+     expander to determine the style used.  */
   int use_fast_prologue_epilogue;
   /* Number of saved registers USE_FAST_PROLOGUE_EPILOGUE has been computed
      for.  */
@@ -2464,6 +2465,7 @@ struct machine_function GTY(())
 #define ix86_stack_locals (cfun->machine->stack_locals)
 #define ix86_save_varrargs_registers (cfun->machine->save_varrargs_registers)
 #define ix86_optimize_mode_switching (cfun->machine->optimize_mode_switching)
+#define ix86_current_function_needs_cld (cfun->machine->needs_cld)
 #define ix86_tls_descriptor_calls_expanded_in_cfun \
   (cfun->machine->tls_descriptor_call_expanded_p)
 /* Since tls_descriptor_call_expanded is not cleared, even if all TLS
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 135609)
+++ gcc/config/i386/i386.md	(working copy)
@@ -205,6 +205,7 @@
    (UNSPECV_XCHG		12)
    (UNSPECV_LOCK		13)
    (UNSPECV_PROLOGUE_USE	14)
+   (UNSPECV_CLD			15)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -18514,6 +18515,14 @@
 \f
 ;; Block operation instructions
 
+(define_insn "cld"
+  [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
+  ""
+  "cld"
+  [(set_attr "length" "1")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 (define_expand "movmemsi"
   [(use (match_operand:BLK 0 "memory_operand" ""))
    (use (match_operand:BLK 1 "memory_operand" ""))
@@ -18586,7 +18595,7 @@
 	      (set (match_operand 2 "register_operand" "")
 		   (match_operand 5 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strmovdi_rex_1"
   [(set (mem:DI (match_operand:DI 2 "register_operand" "0"))
@@ -18703,7 +18712,7 @@
 		   (match_operand 3 "memory_operand" ""))
 	      (use (match_dup 4))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_movdi_rex64"
   [(set (match_operand:DI 2 "register_operand" "=c") (const_int 0))
@@ -18863,7 +18872,7 @@
 	      (set (match_operand 0 "register_operand" "")
 		   (match_operand 3 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strsetdi_rex_1"
   [(set (mem:DI (match_operand:DI 1 "register_operand" "0"))
@@ -18957,7 +18966,7 @@
 	      (use (match_operand 3 "register_operand" ""))
 	      (use (match_dup 1))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_stosdi_rex64"
   [(set (match_operand:DI 1 "register_operand" "=c") (const_int 0))
@@ -19133,7 +19142,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_nz_1"
   [(set (reg:CC FLAGS_REG)
@@ -19180,7 +19189,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_1"
   [(set (reg:CC FLAGS_REG)
@@ -19249,7 +19258,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (reg:CC FLAGS_REG))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strlenqi_1"
   [(set (match_operand:SI 0 "register_operand" "=&c")
Index: gcc/config/i386/i386.opt
===================================================================
--- gcc/config/i386/i386.opt	(revision 135609)
+++ gcc/config/i386/i386.opt	(working copy)
@@ -250,6 +250,14 @@ Support SSE5 built-in functions and code
 
 ;; Instruction support
 
+mcld
+Target Report RejectNegative Mask(X86_CLD)
+Generate cld instruction in the function prologue.
+
+mno-cld
+Target Report RejectNegative Mask(NO_X86_CLD)
+Do not generate cld instruction in the function prologue.
+
 mabm
 Target Report RejectNegative Var(x86_abm)
 Support code generation of Advanced Bit Manipulation (ABM) instructions.
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 135609)
+++ gcc/config/i386/i386.c	(working copy)
@@ -6485,6 +6485,16 @@ ix86_expand_prologue (void)
 	emit_insn (gen_prologue_use (pic_offset_table_rtx));
       emit_insn (gen_blockage ());
     }
+
+  /* Emit cld instruction if stringops are used in the function.  */
+  if (!TARGET_64BIT
+#ifdef USE_IX86_CLD
+      && !TARGET_NO_X86_CLD
+#else
+      && TARGET_X86_CLD
+#endif
+      && ix86_current_function_needs_cld)
+    emit_insn (gen_cld ());
 }
 
 /* Emit code to restore saved registers using MOV insns.  First register

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
  2008-05-20 13:07 [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore Uros Bizjak
@ 2008-05-20 16:27 ` Richard Guenther
  2008-05-20 16:29 ` Jakub Jelinek
  1 sibling, 0 replies; 12+ messages in thread
From: Richard Guenther @ 2008-05-20 16:27 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC Patches

On Tue, May 20, 2008 at 2:47 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
>  OK, here is the controversial patch that enables cld instruction with
>  -mcld. It includes --enable-cld configure option, and in this case
>  -mno-cld disables generation of cld instructions.
>
>  2008-05-20  Uros Bizjak  <ubizjak@gmail.com>
>
>         PR target/36079
>         * configure.ac: Handle --enable-cld.
>         * configure.in: Regenerate.
>         * configure.ac: Ditto.
>         * config/i386/i386.h (TARGET_CLD): New define.
>         (struct machine_function): Add needs_cld field.
>         (ix86_current_function_needs_cld): New define.
>         * config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
>         ("cld"): New isns pattern.
>         ("strmov_singleop"): Set ix86_current_function_needs_cld flag.
>         ("rep_mov"): Ditto.
>         ("strset_singleop"): Ditto.
>         ("rep_stos"): Ditto.
>         ("cmpstrnqi_nz_1"): Ditto.
>         ("cmpstrnqi_1"): Ditto.
>         ("strlenqi_1"): Ditto.
>         * config/i386/i386.opt (mcld, mno-cld): New options.
>         * config/i386/i386.c (ix86_expand_prologue): Emit cld insn for
>         when ix86_current_function_needs_cld is set and when combination
>         of configure and compile options allow it.
>
>  The patch was bootstrapped and regression tested on i686-pc-linux-gnu
>  a month or so ago. Currently vanila (without patch!) gcc-4_3 branch
>  doesn't build for me, it dies with:
>
>  gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall
>  -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes
>  -Wold-style-definition -Wmissing-format-attribute -pedantic
>  -Wno-long-long -Wno-variadic-macros      -Wno-overlength-strings
>  -fno-common   -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild
>  -I../../gcc-svn/branches/gcc-4_3-branch/gcc
>  -I../../gcc-svn/branches/gcc-4_3-branch/gcc/build
>  -I../../gcc-svn/branches/gcc-4_3-branch/gcc/../include
>  -I../../gcc-svn/branches/gcc-4_3-branch/gcc/../libcpp/include
>  -I/usr/local/include
>  -I../../gcc-svn/branches/gcc-4_3-branch/gcc/../libdecnumber
>  -I../../gcc-svn/branches/gcc-4_3-branch/gcc/../libdecnumber/bid
>  -I../libdecnumber    -o build/errors.o
>  ../../gcc-svn/branches/gcc-4_3-branch/gcc/errors.c
>  build/genmodes -h > tmp-modes.h
>  /bin/sh: build/genmodes: No such file or directory
>  gmake[3]: *** [s-modes-h] Error 127

I have bootstrapped and tested this patch successfully on i686-pc-linux-gnu
with --enable-cld.

Richard.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
  2008-05-20 13:07 [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore Uros Bizjak
  2008-05-20 16:27 ` Richard Guenther
@ 2008-05-20 16:29 ` Jakub Jelinek
  2008-05-20 16:43   ` Jakub Jelinek
                     ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Jakub Jelinek @ 2008-05-20 16:29 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC Patches

On Tue, May 20, 2008 at 02:47:15PM +0200, Uros Bizjak wrote:
> OK, here is the controversial patch that enables cld instruction with
> -mcld. It includes --enable-cld configure option, and in this case
> -mno-cld disables generation of cld instructions.

The configury bits are IMHO too ugly and the ChangeLog entry doesn't match
the patch either.

I think there is no reason why we need to use two separate target_flags
bits, one for -mcld and one for -mno-cld - we already have
target_flags_explicit, so there are 2 bits for each mask option already.
Also, it is IMHO cleaner to just set target_flags & MASK_CLD bit during
option processing.  I think we shouldn't ignore -mcld even for 64-bit code,
remember the original bug has been found in a 64-bit app with weirdo hand
written assembly, but as libc never uses std for 64-bit code, IMHO
--enable-cld shouldn't make -mcld the default for 64-bit code.

2008-05-20  Uros Bizjak  <ubizjak@gmail.com>
	    Jakub Jelinek  <jakub@redhat.com>

	PR target/36079
	* configure.ac: Handle --enable-cld.
	* configure.in: Regenerate.
	* configure.ac: Ditto.
	* config/i386/i386.h (struct machine_function): Add needs_cld field.
	(ix86_current_function_needs_cld): New define.
	* config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
	(cld): New isns pattern.
	(strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
	cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
	* config/i386/i386.opt (mcld): New option.
	* config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
	TARGET_CLD and ix86_current_function_needs_cld.
	(override_options): Use -mcld by default for 32-bit code if
	USE_IX86_CLD.

--- gcc/config/i386/i386.md.jj	2008-04-22 00:19:40.000000000 +0200
+++ gcc/config/i386/i386.md	2008-05-20 16:34:37.000000000 +0200
@@ -205,6 +205,7 @@
    (UNSPECV_XCHG		12)
    (UNSPECV_LOCK		13)
    (UNSPECV_PROLOGUE_USE	14)
+   (UNSPECV_CLD			15)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -18514,6 +18515,14 @@
 \f
 ;; Block operation instructions
 
+(define_insn "cld"
+  [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
+  ""
+  "cld"
+  [(set_attr "length" "1")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 (define_expand "movmemsi"
   [(use (match_operand:BLK 0 "memory_operand" ""))
    (use (match_operand:BLK 1 "memory_operand" ""))
@@ -18586,7 +18595,7 @@
 	      (set (match_operand 2 "register_operand" "")
 		   (match_operand 5 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strmovdi_rex_1"
   [(set (mem:DI (match_operand:DI 2 "register_operand" "0"))
@@ -18703,7 +18712,7 @@
 		   (match_operand 3 "memory_operand" ""))
 	      (use (match_dup 4))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_movdi_rex64"
   [(set (match_operand:DI 2 "register_operand" "=c") (const_int 0))
@@ -18863,7 +18872,7 @@
 	      (set (match_operand 0 "register_operand" "")
 		   (match_operand 3 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strsetdi_rex_1"
   [(set (mem:DI (match_operand:DI 1 "register_operand" "0"))
@@ -18957,7 +18966,7 @@
 	      (use (match_operand 3 "register_operand" ""))
 	      (use (match_dup 1))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_stosdi_rex64"
   [(set (match_operand:DI 1 "register_operand" "=c") (const_int 0))
@@ -19133,7 +19142,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_nz_1"
   [(set (reg:CC FLAGS_REG)
@@ -19180,7 +19189,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_1"
   [(set (reg:CC FLAGS_REG)
@@ -19249,7 +19258,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (reg:CC FLAGS_REG))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strlenqi_1"
   [(set (match_operand:SI 0 "register_operand" "=&c")
--- gcc/config/i386/i386.opt.jj	2008-02-18 23:45:48.000000000 +0100
+++ gcc/config/i386/i386.opt	2008-05-20 16:34:37.000000000 +0200
@@ -250,6 +250,10 @@ Support SSE5 built-in functions and code
 
 ;; Instruction support
 
+mcld
+Target Report Mask(CLD)
+Generate cld instruction in the function prologue.
+
 mabm
 Target Report RejectNegative Var(x86_abm)
 Support code generation of Advanced Bit Manipulation (ABM) instructions.
--- gcc/config/i386/i386.h.jj	2008-02-18 23:45:48.000000000 +0100
+++ gcc/config/i386/i386.h	2008-05-20 16:34:37.000000000 +0200
@@ -2443,8 +2443,9 @@ struct machine_function GTY(())
   int save_varrargs_registers;
   int accesses_prev_frame;
   int optimize_mode_switching[MAX_386_ENTITIES];
-  /* Set by ix86_compute_frame_layout and used by prologue/epilogue expander to
-     determine the style used.  */
+  int needs_cld;
+  /* Set by ix86_compute_frame_layout and used by prologue/epilogue
+     expander to determine the style used.  */
   int use_fast_prologue_epilogue;
   /* Number of saved registers USE_FAST_PROLOGUE_EPILOGUE has been computed
      for.  */
@@ -2464,6 +2465,7 @@ struct machine_function GTY(())
 #define ix86_stack_locals (cfun->machine->stack_locals)
 #define ix86_save_varrargs_registers (cfun->machine->save_varrargs_registers)
 #define ix86_optimize_mode_switching (cfun->machine->optimize_mode_switching)
+#define ix86_current_function_needs_cld (cfun->machine->needs_cld)
 #define ix86_tls_descriptor_calls_expanded_in_cfun \
   (cfun->machine->tls_descriptor_call_expanded_p)
 /* Since tls_descriptor_call_expanded is not cleared, even if all TLS
--- gcc/config/i386/i386.c.jj	2008-05-18 22:16:19.000000000 +0200
+++ gcc/config/i386/i386.c	2008-05-20 16:44:36.000000000 +0200
@@ -2733,6 +2733,12 @@ override_options (void)
      can be optimized to ap = __builtin_next_arg (0).  */
   if (!TARGET_64BIT || TARGET_64BIT_MS_ABI)
     targetm.expand_builtin_va_start = NULL;
+
+#ifdef USE_IX86_CLD
+  /* Use -mcld by default for 32-bit code if configured with --enable-cld.  */
+  if (!TARGET_64BIT && !(target_flags_explicit & MASK_CLD))
+    target_flags |= MASK_CLD;
+#endif
 }
 \f
 /* Return true if this goes in large data/bss.  */
@@ -6485,6 +6491,10 @@ ix86_expand_prologue (void)
 	emit_insn (gen_prologue_use (pic_offset_table_rtx));
       emit_insn (gen_blockage ());
     }
+
+  /* Emit cld instruction if stringops are used in the function.  */
+  if (TARGET_CLD && ix86_current_function_needs_cld)
+    emit_insn (gen_cld ());
 }
 
 /* Emit code to restore saved registers using MOV insns.  First register
--- gcc/configure.ac.jj	2008-03-05 17:17:21.000000000 +0100
+++ gcc/configure.ac	2008-05-20 16:34:37.000000000 +0200
@@ -1548,6 +1548,15 @@ AC_ARG_ENABLE(secureplt,
 [  --enable-secureplt      enable -msecure-plt by default for PowerPC],
 [], [])
 
+AC_ARG_ENABLE(cld,
+[  --enable-cld            enable -mcld by default for 32bit x86], [],
+[enable_cld=no])
+
+if test "$enable_cld" = yes ; then
+  AC_DEFINE(USE_IX86_CLD, 1,
+[Define if gcc should by default emit cld at the function entry when stringops are used.])
+fi
+
 # Windows32 Registry support for specifying GCC installation paths.
 AC_ARG_ENABLE(win32-registry,
 [  --disable-win32-registry
--- gcc/configure.jj	2008-03-05 17:17:21.000000000 +0100
+++ gcc/configure	2008-05-20 16:34:36.000000000 +0200
@@ -898,6 +898,7 @@ Optional Features:
   --enable-sjlj-exceptions
                           arrange to use setjmp/longjmp exception handling
   --enable-secureplt      enable -msecure-plt by default for PowerPC
+  --enable-cld            enable -mcld by default for 32bit x86
   --disable-win32-registry
                           disable lookup of installation paths in the
                           Registry on Windows hosts
@@ -12951,6 +12952,22 @@ if test "${enable_secureplt+set}" = set;
 
 fi;
 
+# Check whether --enable-cld or --disable-cld was given.
+if test "${enable_cld+set}" = set; then
+  enableval="$enable_cld"
+
+else
+  enable_cld=no
+fi;
+
+if test "$enable_cld" = yes ; then
+
+cat >>confdefs.h <<\_ACEOF
+#define USE_IX86_CLD 1
+_ACEOF
+
+fi
+
 # Windows32 Registry support for specifying GCC installation paths.
 # Check whether --enable-win32-registry or --disable-win32-registry was given.
 if test "${enable_win32_registry+set}" = set; then
--- gcc/config.in.jj	2008-02-18 23:46:08.000000000 +0100
+++ gcc/config.in	2008-05-20 16:34:37.000000000 +0200
@@ -1441,6 +1441,13 @@
 #endif
 
 
+/* Define if gcc should by default emit cld at the function entry when
+   stringops are used. */
+#ifndef USED_FOR_TARGET
+#undef USE_IX86_CLD
+#endif
+
+
 /* Define to 1 if the 'long long' (or '__int64') is wider than 'long' but
    still efficiently supported by the host hardware. */
 #ifndef USED_FOR_TARGET


	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
  2008-05-20 16:29 ` Jakub Jelinek
@ 2008-05-20 16:43   ` Jakub Jelinek
  2008-05-20 17:03   ` Paolo Bonzini
  2008-05-21  2:58   ` Mark Mitchell
  2 siblings, 0 replies; 12+ messages in thread
From: Jakub Jelinek @ 2008-05-20 16:43 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC Patches

On Tue, May 20, 2008 at 11:10:34AM -0400, Jakub Jelinek wrote:
> On Tue, May 20, 2008 at 02:47:15PM +0200, Uros Bizjak wrote:
> --- gcc/config/i386/i386.c.jj	2008-05-18 22:16:19.000000000 +0200
> +++ gcc/config/i386/i386.c	2008-05-20 16:44:36.000000000 +0200
> @@ -2733,6 +2733,12 @@ override_options (void)
>       can be optimized to ap = __builtin_next_arg (0).  */
>    if (!TARGET_64BIT || TARGET_64BIT_MS_ABI)
>      targetm.expand_builtin_va_start = NULL;
> +
> +#ifdef USE_IX86_CLD
> +  /* Use -mcld by default for 32-bit code if configured with --enable-cld.  */
> +  if (!TARGET_64BIT && !(target_flags_explicit & MASK_CLD))
> +    target_flags |= MASK_CLD;
> +#endif
>  }
>  \f
>  /* Return true if this goes in large data/bss.  */

This hunk should actually be:

@@ -2733,6 +2733,12 @@ override_options (void)
      can be optimized to ap = __builtin_next_arg (0).  */
   if (!TARGET_64BIT || TARGET_64BIT_MS_ABI)
     targetm.expand_builtin_va_start = NULL;
+
+#ifdef USE_IX86_CLD
+  /* Use -mcld by default for 32-bit code if configured with --enable-cld.  */
+  if (!TARGET_64BIT)
+    target_flags |= MASK_CLD & ~target_flags_explicit;
+#endif
 }
 \f
 /* Return true if this goes in large data/bss.  */


	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction  is not emitted anymore.
  2008-05-20 16:29 ` Jakub Jelinek
  2008-05-20 16:43   ` Jakub Jelinek
@ 2008-05-20 17:03   ` Paolo Bonzini
  2008-05-21  2:58   ` Mark Mitchell
  2 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2008-05-20 17:03 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Uros Bizjak, GCC Patches

Jakub Jelinek wrote:
> On Tue, May 20, 2008 at 02:47:15PM +0200, Uros Bizjak wrote:
>> OK, here is the controversial patch that enables cld instruction with
>> -mcld. It includes --enable-cld configure option, and in this case
>> -mno-cld disables generation of cld instructions.
> 
> The configury bits are IMHO too ugly

Yes, the problem is that configure.ac is polluted with target-specific 
stuff.  You can instead add a

   if test $enable_cld = yes; then
      tm_defines=${tm_defines} USE_IX86_CLD=1"
   fi

in config.gcc around this place:

case ${target} in
x86_64-*-*)
         tm_file="i386/biarch64.h ${tm_file}"
         ;;
esac

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction  is not emitted anymore.
  2008-05-20 16:29 ` Jakub Jelinek
  2008-05-20 16:43   ` Jakub Jelinek
  2008-05-20 17:03   ` Paolo Bonzini
@ 2008-05-21  2:58   ` Mark Mitchell
  2008-05-21  8:24     ` [PATCH v2, " Uros Bizjak
  2 siblings, 1 reply; 12+ messages in thread
From: Mark Mitchell @ 2008-05-21  2:58 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Uros Bizjak, GCC Patches

Jakub Jelinek wrote:

> option processing.  I think we shouldn't ignore -mcld even for 64-bit code,
> remember the original bug has been found in a 64-bit app with weirdo hand
> written assembly, but as libc never uses std for 64-bit code, IMHO
> --enable-cld shouldn't make -mcld the default for 64-bit code.

Uros, thank you for writing this patch!  Richard, Jakub, thank you for 
the reviews.

Per our off-list RM discussions, let's not worry too much about the 
defaults at this point.  Whether --enable-cld is the default or not, and 
whether --enable-cld enables -mcld by default in 64-bit mode or not, are 
topics we can and should debate, but let's get the core technology in.

Thanks,

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
  2008-05-21  2:58   ` Mark Mitchell
@ 2008-05-21  8:24     ` Uros Bizjak
  2008-05-21  8:25       ` Paolo Bonzini
  2008-05-21  9:00       ` Richard Guenther
  0 siblings, 2 replies; 12+ messages in thread
From: Uros Bizjak @ 2008-05-21  8:24 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Jakub Jelinek, GCC Patches, Paolo Bonzini, Richard Guenther

[-- Attachment #1: Type: text/plain, Size: 1765 bytes --]

On Wed, May 21, 2008 at 3:40 AM, Mark Mitchell <mark@codesourcery.com> wrote:

>> option processing.  I think we shouldn't ignore -mcld even for 64-bit
>> code,
>> remember the original bug has been found in a 64-bit app with weirdo hand
>> written assembly, but as libc never uses std for 64-bit code, IMHO
>> --enable-cld shouldn't make -mcld the default for 64-bit code.
>
> Uros, thank you for writing this patch!  Richard, Jakub, thank you for the
> reviews.
>
> Per our off-list RM discussions, let's not worry too much about the defaults
> at this point.  Whether --enable-cld is the default or not, and whether
> --enable-cld enables -mcld by default in 64-bit mode or not, are topics we
> can and should debate, but let's get the core technology in.

Attached patch adds all suggestions by Jakub and Paolo. Patch works as
advertised, bootstrap and regression test on i686-pc-linux with
--enable-cld is in progress.

2008-05-21  Uros Bizjak  <ubizjak@gmail.com>
	    Jakub Jelinek  <jakub@redhat.com>

	PR target/36079
	* configure.ac: Handle --enable-cld.
	* configure: Regenerated.
	* config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
	* config/i386/i386.h (struct machine_function): Add needs_cld field.
	(ix86_current_function_needs_cld): New define.
	* config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
	(cld): New isns pattern.
	(strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
	cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
	* config/i386/i386.opt (mcld): New option.
	* config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
	TARGET_CLD and ix86_current_function_needs_cld.
	(override_options): Use -mcld by default for 32-bit code if
	USE_IX86_CLD.

OK for 4.3 branch?

Uros.

[-- Attachment #2: cld.diff.txt --]
[-- Type: text/plain, Size: 7345 bytes --]

Index: gcc/configure
===================================================================
--- gcc/configure	(revision 135707)
+++ gcc/configure	(working copy)
@@ -898,6 +898,7 @@ Optional Features:
   --enable-sjlj-exceptions
                           arrange to use setjmp/longjmp exception handling
   --enable-secureplt      enable -msecure-plt by default for PowerPC
+  --enable-cld            enable -mcld by default for 32bit x86
   --disable-win32-registry
                           disable lookup of installation paths in the
                           Registry on Windows hosts
@@ -12951,6 +12952,14 @@ if test "${enable_secureplt+set}" = set;
 
 fi;
 
+# Check whether --enable-cld or --disable-cld was given.
+if test "${enable_cld+set}" = set; then
+  enableval="$enable_cld"
+
+else
+  enable_cld=no
+fi;
+
 # Windows32 Registry support for specifying GCC installation paths.
 # Check whether --enable-win32-registry or --disable-win32-registry was given.
 if test "${enable_win32_registry+set}" = set; then
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac	(revision 135707)
+++ gcc/configure.ac	(working copy)
@@ -1548,6 +1548,10 @@ AC_ARG_ENABLE(secureplt,
 [  --enable-secureplt      enable -msecure-plt by default for PowerPC],
 [], [])
 
+AC_ARG_ENABLE(cld,
+[  --enable-cld            enable -mcld by default for 32bit x86], [],
+[enable_cld=no])
+
 # Windows32 Registry support for specifying GCC installation paths.
 AC_ARG_ENABLE(win32-registry,
 [  --disable-win32-registry
Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc	(revision 135707)
+++ gcc/config.gcc	(working copy)
@@ -397,6 +397,14 @@ x86_64-*-*)
 	;;
 esac
 
+case ${target} in
+i[34567]86-*-* | x86_64-*-*)
+	if test $enable_cld = yes; then
+		tm_defines="${tm_defines} USE_IX86_CLD=1"
+	fi
+	;;
+esac
+
 # On a.out targets, we need to use collect2.
 case ${target} in
 *-*-*aout*)
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 135707)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2443,8 +2443,9 @@ struct machine_function GTY(())
   int save_varrargs_registers;
   int accesses_prev_frame;
   int optimize_mode_switching[MAX_386_ENTITIES];
-  /* Set by ix86_compute_frame_layout and used by prologue/epilogue expander to
-     determine the style used.  */
+  int needs_cld;
+  /* Set by ix86_compute_frame_layout and used by prologue/epilogue
+     expander to determine the style used.  */
   int use_fast_prologue_epilogue;
   /* Number of saved registers USE_FAST_PROLOGUE_EPILOGUE has been computed
      for.  */
@@ -2464,6 +2465,7 @@ struct machine_function GTY(())
 #define ix86_stack_locals (cfun->machine->stack_locals)
 #define ix86_save_varrargs_registers (cfun->machine->save_varrargs_registers)
 #define ix86_optimize_mode_switching (cfun->machine->optimize_mode_switching)
+#define ix86_current_function_needs_cld (cfun->machine->needs_cld)
 #define ix86_tls_descriptor_calls_expanded_in_cfun \
   (cfun->machine->tls_descriptor_call_expanded_p)
 /* Since tls_descriptor_call_expanded is not cleared, even if all TLS
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 135707)
+++ gcc/config/i386/i386.md	(working copy)
@@ -205,6 +205,7 @@
    (UNSPECV_XCHG		12)
    (UNSPECV_LOCK		13)
    (UNSPECV_PROLOGUE_USE	14)
+   (UNSPECV_CLD			15)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -18514,6 +18515,14 @@
 \f
 ;; Block operation instructions
 
+(define_insn "cld"
+  [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
+  ""
+  "cld"
+  [(set_attr "length" "1")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 (define_expand "movmemsi"
   [(use (match_operand:BLK 0 "memory_operand" ""))
    (use (match_operand:BLK 1 "memory_operand" ""))
@@ -18586,7 +18595,7 @@
 	      (set (match_operand 2 "register_operand" "")
 		   (match_operand 5 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strmovdi_rex_1"
   [(set (mem:DI (match_operand:DI 2 "register_operand" "0"))
@@ -18703,7 +18712,7 @@
 		   (match_operand 3 "memory_operand" ""))
 	      (use (match_dup 4))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_movdi_rex64"
   [(set (match_operand:DI 2 "register_operand" "=c") (const_int 0))
@@ -18863,7 +18872,7 @@
 	      (set (match_operand 0 "register_operand" "")
 		   (match_operand 3 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strsetdi_rex_1"
   [(set (mem:DI (match_operand:DI 1 "register_operand" "0"))
@@ -18957,7 +18966,7 @@
 	      (use (match_operand 3 "register_operand" ""))
 	      (use (match_dup 1))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_stosdi_rex64"
   [(set (match_operand:DI 1 "register_operand" "=c") (const_int 0))
@@ -19133,7 +19142,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_nz_1"
   [(set (reg:CC FLAGS_REG)
@@ -19180,7 +19189,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_1"
   [(set (reg:CC FLAGS_REG)
@@ -19249,7 +19258,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (reg:CC FLAGS_REG))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strlenqi_1"
   [(set (match_operand:SI 0 "register_operand" "=&c")
Index: gcc/config/i386/i386.opt
===================================================================
--- gcc/config/i386/i386.opt	(revision 135707)
+++ gcc/config/i386/i386.opt	(working copy)
@@ -250,6 +250,10 @@ Support SSE5 built-in functions and code
 
 ;; Instruction support
 
+mcld
+Target Report Mask(CLD)
+Generate cld instruction in the function prologue.
+
 mabm
 Target Report RejectNegative Var(x86_abm)
 Support code generation of Advanced Bit Manipulation (ABM) instructions.
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 135707)
+++ gcc/config/i386/i386.c	(working copy)
@@ -2733,6 +2733,12 @@ override_options (void)
      can be optimized to ap = __builtin_next_arg (0).  */
   if (!TARGET_64BIT || TARGET_64BIT_MS_ABI)
     targetm.expand_builtin_va_start = NULL;
+
+#ifdef USE_IX86_CLD
+  /* Use -mcld by default for 32-bit code if configured with --enable-cld.  */
+  if (!TARGET_64BIT)
+    target_flags |= MASK_CLD & ~target_flags_explicit;
+#endif
 }
 \f
 /* Return true if this goes in large data/bss.  */
@@ -6485,6 +6491,10 @@ ix86_expand_prologue (void)
 	emit_insn (gen_prologue_use (pic_offset_table_rtx));
       emit_insn (gen_blockage ());
     }
+
+  /* Emit cld instruction if stringops are used in the function.  */
+  if (TARGET_CLD && ix86_current_function_needs_cld)
+    emit_insn (gen_cld ());
 }
 
 /* Emit code to restore saved registers using MOV insns.  First register

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction  is not emitted anymore.
  2008-05-21  8:24     ` [PATCH v2, " Uros Bizjak
@ 2008-05-21  8:25       ` Paolo Bonzini
  2008-05-21  9:04         ` Jakub Jelinek
  2008-05-21  9:00       ` Richard Guenther
  1 sibling, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2008-05-21  8:25 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: Mark Mitchell, Jakub Jelinek, GCC Patches, Paolo Bonzini,
	Richard Guenther


> 2008-05-21  Uros Bizjak  <ubizjak@gmail.com>
> 	    Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/36079
> 	* configure.ac: Handle --enable-cld.
> 	* configure: Regenerated.
> 	* config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
> 	* config/i386/i386.h (struct machine_function): Add needs_cld field.
> 	(ix86_current_function_needs_cld): New define.
> 	* config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
> 	(cld): New isns pattern.
> 	(strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
> 	cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
> 	* config/i386/i386.opt (mcld): New option.
> 	* config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
> 	TARGET_CLD and ix86_current_function_needs_cld.
> 	(override_options): Use -mcld by default for 32-bit code if
> 	USE_IX86_CLD.
> 
> OK for 4.3 branch?

Build parts are ok for 4.3 and trunk, except that you could merge the 
case statement you added with the one before it (yes, that involves a 
small duplication).  Those parts are preapproved with this change.

The rest, you can approve yourself (though IIUC you want a word from the 
branch RMs).

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
  2008-05-21  8:24     ` [PATCH v2, " Uros Bizjak
  2008-05-21  8:25       ` Paolo Bonzini
@ 2008-05-21  9:00       ` Richard Guenther
  2008-05-21  9:14         ` Uros Bizjak
  2008-05-21 10:52         ` Uros Bizjak
  1 sibling, 2 replies; 12+ messages in thread
From: Richard Guenther @ 2008-05-21  9:00 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: Mark Mitchell, Jakub Jelinek, GCC Patches, Paolo Bonzini,
	Richard Guenther

On Wed, May 21, 2008 at 9:41 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, May 21, 2008 at 3:40 AM, Mark Mitchell <mark@codesourcery.com> wrote:
>
>>> option processing.  I think we shouldn't ignore -mcld even for 64-bit
>>> code,
>>> remember the original bug has been found in a 64-bit app with weirdo hand
>>> written assembly, but as libc never uses std for 64-bit code, IMHO
>>> --enable-cld shouldn't make -mcld the default for 64-bit code.
>>
>> Uros, thank you for writing this patch!  Richard, Jakub, thank you for the
>> reviews.
>>
>> Per our off-list RM discussions, let's not worry too much about the defaults
>> at this point.  Whether --enable-cld is the default or not, and whether
>> --enable-cld enables -mcld by default in 64-bit mode or not, are topics we
>> can and should debate, but let's get the core technology in.
>
> Attached patch adds all suggestions by Jakub and Paolo. Patch works as
> advertised, bootstrap and regression test on i686-pc-linux with
> --enable-cld is in progress.
>
> 2008-05-21  Uros Bizjak  <ubizjak@gmail.com>
>            Jakub Jelinek  <jakub@redhat.com>
>
>        PR target/36079
>        * configure.ac: Handle --enable-cld.
>        * configure: Regenerated.
>        * config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
>        * config/i386/i386.h (struct machine_function): Add needs_cld field.
>        (ix86_current_function_needs_cld): New define.
>        * config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
>        (cld): New isns pattern.
>        (strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
>        cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
>        * config/i386/i386.opt (mcld): New option.
>        * config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
>        TARGET_CLD and ix86_current_function_needs_cld.
>        (override_options): Use -mcld by default for 32-bit code if
>        USE_IX86_CLD.
>
> OK for 4.3 branch?

This is ok for the mainline and the branch if you add documentation for both the
configure option and the new target option.

If I specify --enable-cld on x86_64 I get clds emitted for -m32 but
not for -m64,
correct?  But if I specify -mcld I get clds on both -m32 and -m64?  I think this
is reasonable behavior, but the documentation should probably explicitly
mention this.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
  2008-05-21  8:25       ` Paolo Bonzini
@ 2008-05-21  9:04         ` Jakub Jelinek
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Jelinek @ 2008-05-21  9:04 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Uros Bizjak, Mark Mitchell, GCC Patches, Paolo Bonzini, Richard Guenther

On Wed, May 21, 2008 at 09:53:35AM +0200, Paolo Bonzini wrote:
> >	PR target/36079
> >	* configure.ac: Handle --enable-cld.
> >	* configure: Regenerated.
> >	* config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
> >	* config/i386/i386.h (struct machine_function): Add needs_cld field.
> >	(ix86_current_function_needs_cld): New define.
> >	* config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
> >	(cld): New isns pattern.
> >	(strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
> >	cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
> >	* config/i386/i386.opt (mcld): New option.
> >	* config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
> >	TARGET_CLD and ix86_current_function_needs_cld.
> >	(override_options): Use -mcld by default for 32-bit code if
> >	USE_IX86_CLD.
> >
> >OK for 4.3 branch?

It should go to trunk too.

> Build parts are ok for 4.3 and trunk, except that you could merge the 
> case statement you added with the one before it (yes, that involves a 
> small duplication).  Those parts are preapproved with this change.
> 
> The rest, you can approve yourself (though IIUC you want a word from the 
> branch RMs).

Yes, it is ok for the branch.

	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
  2008-05-21  9:00       ` Richard Guenther
@ 2008-05-21  9:14         ` Uros Bizjak
  2008-05-21 10:52         ` Uros Bizjak
  1 sibling, 0 replies; 12+ messages in thread
From: Uros Bizjak @ 2008-05-21  9:14 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Mark Mitchell, Jakub Jelinek, GCC Patches, Paolo Bonzini,
	Richard Guenther

[-- Attachment #1: Type: text/plain, Size: 878 bytes --]

On Wed, May 21, 2008 at 10:35 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:

>>> Per our off-list RM discussions, let's not worry too much about the defaults
>>> at this point.  Whether --enable-cld is the default or not, and whether
>>> --enable-cld enables -mcld by default in 64-bit mode or not, are topics we
>>> can and should debate, but let's get the core technology in.
>>
>> Attached patch adds all suggestions by Jakub and Paolo. Patch works as
>> advertised, bootstrap and regression test on i686-pc-linux with
>> --enable-cld is in progress.

> This is ok for the mainline and the branch if you add documentation for both the
> configure option and the new target option.

I have committed attached patch to the branch.

Documentation will be written in a day or two in order to get core
functionality into the branch ASAP (as per Mark's request).

Uros.

[-- Attachment #2: cld.diff.txt --]
[-- Type: text/plain, Size: 7297 bytes --]

Index: configure
===================================================================
--- configure	(revision 135707)
+++ configure	(working copy)
@@ -898,6 +898,7 @@ Optional Features:
   --enable-sjlj-exceptions
                           arrange to use setjmp/longjmp exception handling
   --enable-secureplt      enable -msecure-plt by default for PowerPC
+  --enable-cld            enable -mcld by default for 32bit x86
   --disable-win32-registry
                           disable lookup of installation paths in the
                           Registry on Windows hosts
@@ -12951,6 +12952,14 @@ if test "${enable_secureplt+set}" = set;
 
 fi;
 
+# Check whether --enable-cld or --disable-cld was given.
+if test "${enable_cld+set}" = set; then
+  enableval="$enable_cld"
+
+else
+  enable_cld=no
+fi;
+
 # Windows32 Registry support for specifying GCC installation paths.
 # Check whether --enable-win32-registry or --disable-win32-registry was given.
 if test "${enable_win32_registry+set}" = set; then
Index: configure.ac
===================================================================
--- configure.ac	(revision 135707)
+++ configure.ac	(working copy)
@@ -1548,6 +1548,10 @@ AC_ARG_ENABLE(secureplt,
 [  --enable-secureplt      enable -msecure-plt by default for PowerPC],
 [], [])
 
+AC_ARG_ENABLE(cld,
+[  --enable-cld            enable -mcld by default for 32bit x86], [],
+[enable_cld=no])
+
 # Windows32 Registry support for specifying GCC installation paths.
 AC_ARG_ENABLE(win32-registry,
 [  --disable-win32-registry
Index: config.gcc
===================================================================
--- config.gcc	(revision 135707)
+++ config.gcc	(working copy)
@@ -392,8 +392,16 @@ then
 fi
 
 case ${target} in
+i[34567]86-*-*)
+	if test $enable_cld = yes; then
+		tm_defines="${tm_defines} USE_IX86_CLD=1"
+	fi
+	;;
 x86_64-*-*)
 	tm_file="i386/biarch64.h ${tm_file}"
+	if test $enable_cld = yes; then
+		tm_defines="${tm_defines} USE_IX86_CLD=1"
+	fi
 	;;
 esac
 
Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h	(revision 135707)
+++ config/i386/i386.h	(working copy)
@@ -2443,8 +2443,9 @@ struct machine_function GTY(())
   int save_varrargs_registers;
   int accesses_prev_frame;
   int optimize_mode_switching[MAX_386_ENTITIES];
-  /* Set by ix86_compute_frame_layout and used by prologue/epilogue expander to
-     determine the style used.  */
+  int needs_cld;
+  /* Set by ix86_compute_frame_layout and used by prologue/epilogue
+     expander to determine the style used.  */
   int use_fast_prologue_epilogue;
   /* Number of saved registers USE_FAST_PROLOGUE_EPILOGUE has been computed
      for.  */
@@ -2464,6 +2465,7 @@ struct machine_function GTY(())
 #define ix86_stack_locals (cfun->machine->stack_locals)
 #define ix86_save_varrargs_registers (cfun->machine->save_varrargs_registers)
 #define ix86_optimize_mode_switching (cfun->machine->optimize_mode_switching)
+#define ix86_current_function_needs_cld (cfun->machine->needs_cld)
 #define ix86_tls_descriptor_calls_expanded_in_cfun \
   (cfun->machine->tls_descriptor_call_expanded_p)
 /* Since tls_descriptor_call_expanded is not cleared, even if all TLS
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 135707)
+++ config/i386/i386.md	(working copy)
@@ -205,6 +205,7 @@
    (UNSPECV_XCHG		12)
    (UNSPECV_LOCK		13)
    (UNSPECV_PROLOGUE_USE	14)
+   (UNSPECV_CLD			15)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -18514,6 +18515,14 @@
 \f
 ;; Block operation instructions
 
+(define_insn "cld"
+  [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
+  ""
+  "cld"
+  [(set_attr "length" "1")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 (define_expand "movmemsi"
   [(use (match_operand:BLK 0 "memory_operand" ""))
    (use (match_operand:BLK 1 "memory_operand" ""))
@@ -18586,7 +18595,7 @@
 	      (set (match_operand 2 "register_operand" "")
 		   (match_operand 5 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strmovdi_rex_1"
   [(set (mem:DI (match_operand:DI 2 "register_operand" "0"))
@@ -18703,7 +18712,7 @@
 		   (match_operand 3 "memory_operand" ""))
 	      (use (match_dup 4))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_movdi_rex64"
   [(set (match_operand:DI 2 "register_operand" "=c") (const_int 0))
@@ -18863,7 +18872,7 @@
 	      (set (match_operand 0 "register_operand" "")
 		   (match_operand 3 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strsetdi_rex_1"
   [(set (mem:DI (match_operand:DI 1 "register_operand" "0"))
@@ -18957,7 +18966,7 @@
 	      (use (match_operand 3 "register_operand" ""))
 	      (use (match_dup 1))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_stosdi_rex64"
   [(set (match_operand:DI 1 "register_operand" "=c") (const_int 0))
@@ -19133,7 +19142,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_nz_1"
   [(set (reg:CC FLAGS_REG)
@@ -19180,7 +19189,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_1"
   [(set (reg:CC FLAGS_REG)
@@ -19249,7 +19258,7 @@
 	      (clobber (match_operand 1 "register_operand" ""))
 	      (clobber (reg:CC FLAGS_REG))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strlenqi_1"
   [(set (match_operand:SI 0 "register_operand" "=&c")
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt	(revision 135707)
+++ config/i386/i386.opt	(working copy)
@@ -250,6 +250,10 @@ Support SSE5 built-in functions and code
 
 ;; Instruction support
 
+mcld
+Target Report Mask(CLD)
+Generate cld instruction in the function prologue.
+
 mabm
 Target Report RejectNegative Var(x86_abm)
 Support code generation of Advanced Bit Manipulation (ABM) instructions.
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 135707)
+++ config/i386/i386.c	(working copy)
@@ -2733,6 +2733,12 @@ override_options (void)
      can be optimized to ap = __builtin_next_arg (0).  */
   if (!TARGET_64BIT || TARGET_64BIT_MS_ABI)
     targetm.expand_builtin_va_start = NULL;
+
+#ifdef USE_IX86_CLD
+  /* Use -mcld by default for 32-bit code if configured with --enable-cld.  */
+  if (!TARGET_64BIT)
+    target_flags |= MASK_CLD & ~target_flags_explicit;
+#endif
 }
 \f
 /* Return true if this goes in large data/bss.  */
@@ -6485,6 +6491,10 @@ ix86_expand_prologue (void)
 	emit_insn (gen_prologue_use (pic_offset_table_rtx));
       emit_insn (gen_blockage ());
     }
+
+  /* Emit cld instruction if stringops are used in the function.  */
+  if (TARGET_CLD && ix86_current_function_needs_cld)
+    emit_insn (gen_cld ());
 }
 
 /* Emit code to restore saved registers using MOV insns.  First register

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore.
  2008-05-21  9:00       ` Richard Guenther
  2008-05-21  9:14         ` Uros Bizjak
@ 2008-05-21 10:52         ` Uros Bizjak
  1 sibling, 0 replies; 12+ messages in thread
From: Uros Bizjak @ 2008-05-21 10:52 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Mark Mitchell, Jakub Jelinek, GCC Patches, Paolo Bonzini,
	Richard Guenther

On Wed, May 21, 2008 at 10:35 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:

> If I specify --enable-cld on x86_64 I get clds emitted for -m32 but
> not for -m64,
> correct?  But if I specify -mcld I get clds on both -m32 and -m64?  I think this
> is reasonable behavior, but the documentation should probably explicitly
> mention this.

This is exactly how -mcld works now. Additionally, you can add
-mno-cld to suppress generation of cld on --enable-cld configured
32bit targets.

Uros.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2008-05-21  9:39 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-05-20 13:07 [PATCH, i386]: Fix PR 36079 [4.3/4.4 Regression] cld instruction is not emitted anymore Uros Bizjak
2008-05-20 16:27 ` Richard Guenther
2008-05-20 16:29 ` Jakub Jelinek
2008-05-20 16:43   ` Jakub Jelinek
2008-05-20 17:03   ` Paolo Bonzini
2008-05-21  2:58   ` Mark Mitchell
2008-05-21  8:24     ` [PATCH v2, " Uros Bizjak
2008-05-21  8:25       ` Paolo Bonzini
2008-05-21  9:04         ` Jakub Jelinek
2008-05-21  9:00       ` Richard Guenther
2008-05-21  9:14         ` Uros Bizjak
2008-05-21 10:52         ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).