public inbox for libc-alpha@sourceware.org
* question regarding div / std::div implementation
@ 2016-04-20 19:44 Daniel Gutson
  2016-04-20 20:27 ` Mike Frysinger
       [not found] ` <5717DF65.5060606@linaro.org>
  0 siblings, 2 replies; 10+ messages in thread
From: Daniel Gutson @ 2016-04-20 19:44 UTC (permalink / raw)
  To: libc-alpha

Hi,

   is there any reason that std::div / cstdlib div is not implemented
in such a way that it is expanded to
the assembly instruction -when available- that calculates both the
remainder and the quotient,
e.g. x86's div?

For example, why not an inline function with inline assembly? Or,
should this require a gcc built-in?

Thanks,

   Daniel.


* Re: question regarding div / std::div implementation
  2016-04-20 19:44 question regarding div / std::div implementation Daniel Gutson
@ 2016-04-20 20:27 ` Mike Frysinger
       [not found] ` <5717DF65.5060606@linaro.org>
  1 sibling, 0 replies; 10+ messages in thread
From: Mike Frysinger @ 2016-04-20 20:27 UTC (permalink / raw)
  To: Daniel Gutson; +Cc: libc-alpha


On 20 Apr 2016 16:44, Daniel Gutson wrote:
>    is there any reason that std::div / cstdlib div is not implemented
> in such a way that it is expanded to
> the assembly instruction -when available- that calculates both the
> remainder and the quotient,
> e.g. x86's div?
> 
> For example, why not an inline function with inline assembly? Or,
> should this require a gcc built-in?

sounds like an optimization for gcc rather than hacking something at the
cpp level.  you could post to the gcc-help@ list and they would know best.
	https://gcc.gnu.org/ml/gcc-help/
-mike



* Re: question regarding div / std::div implementation
       [not found]   ` <CAF5HaEWdpAGiXtCO36u3F0QGAXfVHL+qkY+RLsszpv7paPVdMg@mail.gmail.com>
@ 2016-04-20 20:29     ` Adhemerval Zanella
  2016-04-20 20:36       ` Daniel Gutson
  0 siblings, 1 reply; 10+ messages in thread
From: Adhemerval Zanella @ 2016-04-20 20:29 UTC (permalink / raw)
  To: GNU C Library; +Cc: Daniel Gutson



On 20-04-2016 17:07, Daniel Gutson wrote:
> On Wed, Apr 20, 2016 at 4:58 PM, Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>>
>>
>> On 20-04-2016 16:44, Daniel Gutson wrote:
>>> Hi,
>>>
>>>    is there any reason that std::div / cstdlib div is not implemented
>>> in such a way that it is expanded to
>>> the assembly instruction -when available- that calculates both the
>>> remainder and the quotient,
>>> e.g. x86's div?
>>>
>>> For example, why not an inline function with inline assembly? Or,
>>> should this require a gcc built-in?
>>
>> I believe this is because nobody has really implemented this optimization,
>> and my feeling is that if this is a hotspot in your application you
>> will probably gain more by rewriting that code than by using the
>> libc call.
> 
> then it won't be portable, or optimally-portable, meaning that the optimization
> would show up whenever my target supports it. Suppose I need to provide
> my application for several architectures, I would expect that I should
> be able to
> write my application using standard functions, and that it will get
> optimized for each platform.
> 
> I'm reporting it in bugzilla and asking to assign it to one of my team members.

I do not really get what exactly you are referring to as non-portable,
since the glibc div code is implemented in stdlib/div.c and it will
generate an idivl instruction on x86_64 for all supported chips. And
afaik this is true for all supported architectures (I am not
aware of any architecture that added a more optimized
division/modulus operation with a *different* opcode).
 
I mean to use the integer operation directly instead of using the
libcall. The code is quite simple:

div_t
div (int numer, int denom)
{
  div_t result;

  result.quot = numer / denom;
  result.rem = numer % denom;

  return result;
}

You can try to add an inline version to the headers, such as the ones
for string.h, but I would strongly recommend that you either work on
your application if this is the hotspot (by calling the
operations directly instead) or on the compiler side to make it
handle div as a builtin (and thus avoid the libcall).
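
As a rough illustration of "calling the operations directly": the sketch
below splits a count of seconds into minutes and seconds, once through
div() and once through / and %. The helper names split_with_div and
split_with_ops are made up for this example. Whether the compiler folds
the / and % pair into a single division instruction depends on the target
and the optimization level, so that part is an expectation rather than a
guarantee.

```c
#include <stdlib.h>

/* Hypothetical helper: goes through the libc div() call. */
static div_t split_with_div(int seconds)
{
    return div(seconds, 60);
}

/* Hypothetical helper: uses the operators directly, so no libcall
   is needed and the compiler sees both operations. */
static div_t split_with_ops(int seconds)
{
    div_t r;
    r.quot = seconds / 60;
    r.rem = seconds % 60;
    return r;
}
```

Both helpers return the same quotient and remainder, including for
negative inputs, because div() is specified with the same truncating
semantics as / and %.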


* Re: question regarding div / std::div implementation
  2016-04-20 20:29     ` Adhemerval Zanella
@ 2016-04-20 20:36       ` Daniel Gutson
  2016-04-20 20:49         ` Adhemerval Zanella
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Gutson @ 2016-04-20 20:36 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: GNU C Library

On Wed, Apr 20, 2016 at 5:29 PM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
>
> On 20-04-2016 17:07, Daniel Gutson wrote:
>> [...]
>> I'm reporting it in bugzilla and asking to assign it to one of my team members.

FWIW, https://sourceware.org/bugzilla/show_bug.cgi?id=19974

>
> I do not really get what exactly you are referring to as non-portable,
> since the glibc div code is implemented in stdlib/div.c and it will
> generate an idivl instruction on x86_64 for all supported chips. And

I don't see it generating the idivl instruction, but
      callq  400430 <div@plt>
so I think it should be implemented as an inline function maybe with
inline assembly
(or rely on the pattern recognition as you suggest below).

> afaik this is true for all supported architectures (I am not
> aware of any architecture that added a more optimized
> division/modulus operation with a *different* opcode).

Could you please post an example and the gcc command line call where
you do get the idiv?

>
> [...]
>
> You can try to add an inline version to the headers, such as the ones
> for string.h, but I would strongly recommend that you either work on
> your application if this is the hotspot (by calling the
> operations directly instead) or on the compiler side to make it
> handle div as a builtin (and thus avoid the libcall).

Why should this be a builtin? I can implement it in gcc, but I still
don't see why I should pass the burden to the compiler
when it is a matter of library implementation.


-- 

Daniel F. Gutson
Engineering Manager

San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina

Phone:   +54 351 4217888 / +54 351 4218211
Skype:    dgutson
LinkedIn: http://ar.linkedin.com/in/danielgutson


* Re: question regarding div / std::div implementation
  2016-04-20 20:36       ` Daniel Gutson
@ 2016-04-20 20:49         ` Adhemerval Zanella
  2016-04-20 21:16           ` Daniel Gutson
  0 siblings, 1 reply; 10+ messages in thread
From: Adhemerval Zanella @ 2016-04-20 20:49 UTC (permalink / raw)
  To: Daniel Gutson; +Cc: GNU C Library



On 20-04-2016 17:36, Daniel Gutson wrote:
> On Wed, Apr 20, 2016 at 5:29 PM, Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>> [...]
> 
> I don't see it generating the idivl instruction, but
>       callq  400430 <div@plt>
> so I think it should be implemented as an inline function maybe with
> inline assembly
> (or rely on the pattern recognition as you suggest below).

Of course it will generate a libcall, since the stdlib.h header declares
it as an external function and the compiler does not have any information
on how to lower it.

> 
>> afaik this is true for all supported architectures (I am not
>> aware of any architecture that added a more optimized
>> division/modulus operation with a *different* opcode).
> 
> Could you please post an example and the gcc command line call where
> you do get the idiv?

I mean when building stdlib/div.c itself.

> 
>> [...]
> 
> Why should this be a builtin? I can implement it on gcc, but I still
> don't see why should I pass the burden to the compiler
> whereas it is a matter of library implementation.

Because carrying such an implementation adds header complexity and a
maintenance burden; just check the string{2}.h header cleanup Wilco is
pushing.

IMHO I do not see a compelling reason to use inline
assembly for such an operation, and I would avoid adding an inline
operation just to remove the libcall.


* Re: question regarding div / std::div implementation
  2016-04-20 20:49         ` Adhemerval Zanella
@ 2016-04-20 21:16           ` Daniel Gutson
  2016-04-20 21:38             ` Paul Eggert
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Gutson @ 2016-04-20 21:16 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: GNU C Library

On Wed, Apr 20, 2016 at 5:49 PM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
>
> On 20-04-2016 17:36, Daniel Gutson wrote:
>> [...]
>> Why should this be a builtin? I can implement it on gcc, but I still
>> don't see why should I pass the burden to the compiler
>> whereas it is a matter of library implementation.
>
> Because carrying such an implementation adds header complexity and a
> maintenance burden; just check the string{2}.h header cleanup Wilco is
> pushing.
>
> IMHO I do not see a compelling reason to use inline
> assembly for such an operation, and I would avoid adding an inline
> operation just to remove the libcall.

OK with no inline asm, but a libcall might be expensive, especially in a
tight loop, and it messes with predictions;
a builtin is nonportable as well.


* Re: question regarding div / std::div implementation
  2016-04-20 21:16           ` Daniel Gutson
@ 2016-04-20 21:38             ` Paul Eggert
  2016-04-20 21:55               ` Daniel Gutson
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Eggert @ 2016-04-20 21:38 UTC (permalink / raw)
  To: Daniel Gutson, Adhemerval Zanella; +Cc: GNU C Library

On 04/20/2016 02:15 PM, Daniel Gutson wrote:
> OK with no inline asm, but a libcall might be expensive specially in a
> tight loop and messes with predictions;
> a builtin is nonportable as well.

In practice, C programs that need integer quotient and remainder 
typically don't call 'div'. They just use % and /, and compilers are now 
smart enough to do just one machine-level operation to get both quotient 
and remainder. For example, time/offtime.c has this macro:

#define DIV(a, b) ((a) / (b) - ((a) % (b) < 0))

which should work just fine as-is. In theory one could change this to 
use div/ldiv/lldiv, but why bother making the code way more complicated?
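
For readers unfamiliar with the macro: with a positive divisor it yields
the quotient rounded toward negative infinity, because C's / truncates
toward zero and % takes the sign of the dividend. A small sketch follows;
floor_div is a name invented here for illustration and is not something
in glibc.

```c
/* Macro as quoted above from time/offtime.c. */
#define DIV(a, b) ((a) / (b) - ((a) % (b) < 0))

/* Hypothetical wrapper so the macro can be exercised as a function.
   Assumes b > 0, which is the case the macro is written for. */
static long floor_div(long a, long b)
{
    return DIV(a, b);
}
```

Here the / and % use the same operands within a single expression, which
is the pattern the paragraph above expects compilers to handle with one
division.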

As the 'div' function family was designed back when C compilers were not 
that smart and is largely obsolete now, simplicity would appear to be 
more important than performance here. Perhaps someone someday will work 
up the energy to get 'div' removed from the C standard.


* Re: question regarding div / std::div implementation
  2016-04-20 21:38             ` Paul Eggert
@ 2016-04-20 21:55               ` Daniel Gutson
  2016-04-20 22:10                 ` Paul Eggert
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Gutson @ 2016-04-20 21:55 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Adhemerval Zanella, GNU C Library

On Wed, Apr 20, 2016 at 6:38 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 04/20/2016 02:15 PM, Daniel Gutson wrote:
>>
>> OK with no inline asm, but a libcall might be expensive specially in a
>> tight loop and messes with predictions;
>> a builtin is nonportable as well.
>
>
> In practice, C programs that need integer quotient and remainder typically
> don't call 'div'. They just use % and /, and compilers are now smart enough
> to do just one machine-level operation to get both quotient and remainder.
> For example, time/offtime.c has this macro:
>
> #define DIV(a, b) ((a) / (b) - ((a) % (b) < 0))
>
> which should work just fine as-is. In theory one could change this to use
> div/ldiv/lldiv, but why bother making the code way more complicated?

I personally find the macro above more complicated to read than a
simple function call with a meaningful name.

Additionally, consider

int foo(int x, int y)
{
    if (x > 0 && y > 0)
        return x * y * foo(x - 1, y - 1);
    else
        return 1;
}

int addqr2(int x, int y)
{
    const int quot = x / y;
    int something = x + y - 2 * x * y;
    int something_else = something * foo(x + 1, y + 1);
    const int rem = x % y;
    return quot + rem + something_else;
}

That fools the compiler. Moving the % just below the / induces the
optimization, but we'd be relying on weak pattern recognition. Code
should not rely on that IMHO.
Anyway, I already filed the issue in the gcc bugzilla, and if people
agree there, I will ask them to assign it to someone on my team.
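
One way to avoid depending on where the two statements sit, sketched
under assumptions: wrap the / and % in a small inline helper so they are
always adjacent. The names quot_rem, quot_rem_int and addqr2_helper are
hypothetical and are not part of glibc or any standard header. Whether a
given compiler then emits a single division is not guaranteed and would
need checking in the generated assembly.

```c
/* Hypothetical result type and helper: keeps the / and % on the same
   operands next to each other. */
struct quot_rem { int quot; int rem; };

static inline struct quot_rem quot_rem_int(int x, int y)
{
    struct quot_rem r;
    r.quot = x / y;
    r.rem = x % y;
    return r;
}

/* Variant of addqr2 above, with the unrelated work passed in as a
   parameter so the example stays self-contained. */
static int addqr2_helper(int x, int y, int something_else)
{
    const struct quot_rem qr = quot_rem_int(x, y);
    return qr.quot + qr.rem + something_else;
}
```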


* Re: question regarding div / std::div implementation
  2016-04-20 21:55               ` Daniel Gutson
@ 2016-04-20 22:10                 ` Paul Eggert
  2016-04-20 22:18                   ` Daniel Gutson
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Eggert @ 2016-04-20 22:10 UTC (permalink / raw)
  To: Daniel Gutson; +Cc: Adhemerval Zanella, GNU C Library

On 04/20/2016 02:55 PM, Daniel Gutson wrote:
> That fools the compiler.

Yes, bizarre C code can fool the compiler. I was talking about realistic 
code, not contrived examples.

glibc itself uses % and / in places where div/ldiv/lldiv might plausibly 
be used internally, and this is portable and maintainable and efficient 
and there is no real reason to change this, even if div/ldiv/lldiv were 
tuned. Similarly for other C applications.


* Re: question regarding div / std::div implementation
  2016-04-20 22:10                 ` Paul Eggert
@ 2016-04-20 22:18                   ` Daniel Gutson
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Gutson @ 2016-04-20 22:18 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Adhemerval Zanella, GNU C Library

On Wed, Apr 20, 2016 at 7:10 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 04/20/2016 02:55 PM, Daniel Gutson wrote:
>>
>> That fools the compiler.
>
>
> Yes, bizarre C code can fool the compiler. I was talking about realistic
> code, not contrived examples.

It need not be bizarre (I just wanted to exaggerate). Code written by
less knowledgeable programmers,
or simply by programmers who don't know that such an optimization exists
(one that requires both lines to be written adjacent to each other),
will miss the optimization.

>
> glibc itself uses % and / in places where div/ldiv/lldiv might plausibly be
> used internally, and this is portable and maintainable and efficient and
> there is no real reason to change this, even if div/ldiv/lldiv were tuned.
> Similarly for other C applications.

This discussion has already moved to the compiler arena. Once the
builtins are implemented,
you glibc maintainers may want to consider changing that code to
use those functions, or not.
In any case, code already using the div/ldiv/lldiv functions will
benefit, and that's already something good.

Thanks for your feedback.

    Daniel.


