* cmov in asm()
@ 2003-05-22 13:20 Peter Kovar
2003-05-22 13:27 ` Andrew Pinski
From: Peter Kovar @ 2003-05-22 13:20 UTC
To: gcc
Hi *,
this is a small performance improvement hint for IA-32 and x86-64.

In userland I have been using the following inline function to avoid
unpredictable conditional branches (admittedly, it looks ugly).
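static inline int
absolute (int x)
{
#if defined (__i386__)
  /* x arrives in eax and is returned in eax; ecx is scratch.  */
  asm volatile
  (
    ".intel_syntax noprefix\n\t"
    "mov ecx, eax\n\t"      /* ecx = x */
    "neg ecx\n\t"           /* ecx = -x, sets SF from the result */
    "cmovns eax, ecx\n\t"   /* if -x >= 0, eax = -x */
    ".att_syntax\n\t"
    : "=a" (x) : "a" (x) : "ecx", "flags"
  ) ;
#else
  if (x < 0)
    {
      x = -x ;
    }
#endif
  return x ;
}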
GCC 3.x now emits cmov; however, there is one specific case where it
could be done a bit more efficiently.

Below is the code GCC generated for the Linux 2.5.69 kernel, followed by
a hand-written equivalent of abs():
c036703c:  89 d0     mov    %edx,%eax
c036703e:  f7 d8     neg    %eax
c0367040:  83 fa ff  cmp    $0xffffffff,%edx
c0367043:  0f 4e d0  cmovle %eax,%edx

c036703e:  89 c1     mov    %eax,%ecx
c0367040:  f7 d9     neg    %ecx
c0367042:  0f 49 c1  cmovns %ecx,%eax
neg already sets the flags, so the cmp with the constant -1 is not
necessary.

Would it be possible to integrate this into the architecture-specific
code generation?
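
The observation that neg already sets the sign flag can be sketched with
compiler-chosen registers instead of hard-coded eax/ecx. This is only an
illustration, not code from this thread: the names absolute_cmov and neg
are made up, it assumes a GCC-compatible compiler with extended asm
(AT&T operand order, source first) and a CPU that has cmov (P6 or
later), and it falls back to plain C on other targets.

```c
/* Hypothetical sketch: abs() via neg + cmovns, no cmp needed. */
static inline int
absolute_cmov (int x)
{
#if defined (__i386__) || defined (__x86_64__)
  int neg = x;                      /* scratch copy that will hold -x */

  __asm__ ("negl %1\n\t"            /* neg = -x; SF reflects the sign of -x */
           "cmovns %1, %0"          /* if -x >= 0 (SF clear), x = -x */
           : "+r" (x), "+r" (neg)   /* both operands are read and written */
           :
           : "cc");                 /* the flags are clobbered */
  return x;
#else
  /* Portable fallback for targets without cmov. */
  return x < 0 ? -x : x;
#endif
}
```

Dropping volatile and the fixed registers leaves the compiler free to
schedule the sequence or remove it when the result is unused. As with
abs(), the most negative int is returned unchanged.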
br
Peter Kovar
* Re: cmov in asm()
2003-05-22 13:20 cmov in asm() Peter Kovar
@ 2003-05-22 13:27 ` Andrew Pinski
2003-05-22 22:28 ` Richard Henderson
From: Andrew Pinski @ 2003-05-22 13:27 UTC
To: Peter Kovar; +Cc: Andrew Pinski, gcc
What about using abs() or (x < 0 ? -x : x):

AT&T syntax:
        movl    4(%esp), %eax
        cltd
        xorl    %edx, %eax
        subl    %edx, %eax

Intel syntax:
        mov     eax, DWORD PTR [esp+4]
        cdq
        xor     eax, edx
        sub     eax, edx

It is the same number of instructions and does not depend on the
flags (condition code) register.
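
For readers who prefer C, the cltd/xor/sub sequence above corresponds
roughly to the sign-mask idiom below. This is a sketch, not code from
this thread: the name abs_branchless is made up, and it assumes that
right-shifting a negative int is an arithmetic shift (implementation-
defined in C, but what GCC does).

```c
#include <limits.h>

/* Hypothetical sketch of the sign-mask abs(). */
static inline int
abs_branchless (int x)
{
  /* mask is 0 for x >= 0 and -1 (all ones) for x < 0, like edx after cltd.
     Assumes an arithmetic right shift of negative values.  */
  int mask = x >> (sizeof (int) * CHAR_BIT - 1);

  /* x >= 0: (x ^ 0) - 0 = x.   x < 0: (~x) - (-1) = -x.  */
  return (x ^ mask) - mask;
}
```

As with abs(), the result for INT_MIN is undefined, since the
subtraction overflows.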
Thanks,
Andrew Pinski
* Re: cmov in asm()
2003-05-22 13:27 ` Andrew Pinski
@ 2003-05-22 22:28 ` Richard Henderson
From: Richard Henderson @ 2003-05-22 22:28 UTC
To: Andrew Pinski; +Cc: Peter Kovar, gcc
Try the following.
r~
* optabs.c (expand_abs_nojump): Split out from ...
(expand_abs): ... here.
* optabs.h (expand_abs_nojump): Declare.
* ifcvt.c (noce_try_abs): Use it.
* Makefile.in (ifcvt.o): Depend on optabs.h.
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1056
diff -c -p -d -r1.1056 Makefile.in
*** Makefile.in 14 May 2003 15:29:07 -0000 1.1056
--- Makefile.in 22 May 2003 22:08:47 -0000
*************** timevar.o : timevar.c $(CONFIG_H) $(SYST
*** 1770,1778 ****
regrename.o : regrename.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h output.h $(RECOG_H) function.h \
resource.h $(OBSTACK_H) flags.h $(TM_P_H)
! ifcvt.o : ifcvt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(REGS_H) toplev.h \
! flags.h insn-config.h function.h $(RECOG_H) $(BASIC_BLOCK_H) $(EXPR_H) \
! output.h except.h $(TM_P_H) real.h
params.o : params.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(PARAMS_H) toplev.h
hooks.o: hooks.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(HOOKS_H)
--- 1770,1778 ----
regrename.o : regrename.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h output.h $(RECOG_H) function.h \
resource.h $(OBSTACK_H) flags.h $(TM_P_H)
! ifcvt.o : ifcvt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
! $(REGS_H) toplev.h flags.h insn-config.h function.h $(RECOG_H) \
! $(BASIC_BLOCK_H) $(EXPR_H) output.h except.h $(TM_P_H) real.h optabs.h
params.o : params.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(PARAMS_H) toplev.h
hooks.o: hooks.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(HOOKS_H)
Index: ifcvt.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ifcvt.c,v
retrieving revision 1.115
diff -c -p -d -r1.115 ifcvt.c
*** ifcvt.c 14 Apr 2003 21:44:36 -0000 1.115
--- ifcvt.c 22 May 2003 22:08:49 -0000
***************
*** 35,40 ****
--- 35,41 ----
#include "expr.h"
#include "real.h"
#include "output.h"
+ #include "optabs.h"
#include "toplev.h"
#include "tm_p.h"
*************** noce_try_abs (if_info)
*** 1602,1608 ****
start_sequence ();
! target = expand_simple_unop (GET_MODE (if_info->x), ABS, b, if_info->x, 0);
/* ??? It's a quandry whether cmove would be better here, especially
for integers. Perhaps combine will clean things up. */
--- 1603,1609 ----
start_sequence ();
! target = expand_abs_nojump (GET_MODE (if_info->x), b, if_info->x, 1);
/* ??? It's a quandry whether cmove would be better here, especially
for integers. Perhaps combine will clean things up. */
Index: optabs.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/optabs.c,v
retrieving revision 1.172
diff -c -p -d -r1.172 optabs.c
*** optabs.c 15 Apr 2003 13:06:58 -0000 1.172
--- optabs.c 22 May 2003 22:08:53 -0000
*************** expand_unop (mode, unoptab, op0, target,
*** 2773,2786 ****
*/
rtx
! expand_abs (mode, op0, target, result_unsignedp, safe)
enum machine_mode mode;
rtx op0;
rtx target;
int result_unsignedp;
- int safe;
{
! rtx temp, op1;
if (! flag_trapv)
result_unsignedp = 1;
--- 2773,2785 ----
*/
rtx
! expand_abs_nojump (mode, op0, target, result_unsignedp)
enum machine_mode mode;
rtx op0;
rtx target;
int result_unsignedp;
{
! rtx temp;
if (! flag_trapv)
result_unsignedp = 1;
*************** expand_abs (mode, op0, target, result_un
*** 2867,2872 ****
--- 2866,2888 ----
if (temp != 0)
return temp;
}
+
+ return NULL_RTX;
+ }
+
+ rtx
+ expand_abs (mode, op0, target, result_unsignedp, safe)
+ enum machine_mode mode;
+ rtx op0;
+ rtx target;
+ int result_unsignedp;
+ int safe;
+ {
+ rtx temp, op1;
+
+ temp = expand_abs_nojump (mode, op0, target, result_unsignedp);
+ if (temp != 0)
+ return temp;
/* If that does not win, use conditional jump and negate. */
Index: optabs.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/optabs.h,v
retrieving revision 1.12
diff -c -p -d -r1.12 optabs.h
*** optabs.h 11 Feb 2003 19:34:08 -0000 1.12
--- optabs.h 22 May 2003 22:08:53 -0000
*************** extern int expand_twoval_binop PARAMS ((
*** 306,311 ****
--- 306,312 ----
extern rtx expand_unop PARAMS ((enum machine_mode, optab, rtx, rtx, int));
/* Expand the absolute value operation. */
+ extern rtx expand_abs_nojump PARAMS ((enum machine_mode, rtx, rtx, int));
extern rtx expand_abs PARAMS ((enum machine_mode, rtx, rtx, int, int));
/* Expand the complex absolute value operation. */
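
To see the effect of a change like this, a small test input along the
following lines can be compiled with `gcc -O2 -S` (adding `-march=i686`
on IA-32 so that cmov is available) and the generated assembly
inspected. The function names abs_if and abs_ternary are made up for
illustration; the two functions simply spell out the source forms
discussed in this thread.

```c
/* Hypothetical test input: the if/negate form of abs() that
   if-conversion (noce_try_abs) is meant to recognize.  */
int
abs_if (int x)
{
  if (x < 0)
    x = -x;
  return x;
}

/* The conditional-expression form suggested earlier in the thread.  */
int
abs_ternary (int x)
{
  return x < 0 ? -x : x;
}
```

Which instruction sequence the compiler picks depends on the GCC version
and target options, so the output should be checked rather than assumed.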