public inbox for gcc-bugs@sourceware.org
* [Bug c/36041]  New: Speed up builtin_popcountll
From: intvnut at gmail dot com @ 2008-04-25  0:35 UTC
  To: gcc-bugs

The current __builtin_popcountll (and likely __builtin_popcount) is fairly
slow compared to a simple, short C version derived from what can be found in
Knuth's recent publications.  The following short function is about 3x as fast
as the __builtin version, which runs counter to the idea that __builtin_XXX
provides access to implementations that are exemplars for a given platform.

unsigned int popcount64(unsigned long long x)
{
    /* Sum adjacent bits into 2-bit fields, then 4-bit, then 8-bit fields. */
    x = (x & 0x5555555555555555ULL) + ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    /* The multiply adds all eight byte counts into the top byte. */
    return (x * 0x0101010101010101ULL) >> 56;
}

This version has the additional benefit that it omits the lookup table that the
current "builtin" version uses.
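
As a sanity check on the function above, here is a minimal sketch that
compares it against a naive bit-clearing loop.  The helper names
popcount_naive and popcount64_mismatches are hypothetical (they are not part
of the report or of GCC), the test values come from an arbitrary linear
congruential generator, and the code assumes unsigned long long is 64 bits
wide, as it is on x86_64.

```c
/* Parallel bit count, copied from the report above. */
static unsigned int popcount64(unsigned long long x)
{
    x = (x & 0x5555555555555555ULL) + ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    return (unsigned int)((x * 0x0101010101010101ULL) >> 56);
}

/* Hypothetical reference: clear the lowest set bit until none remain. */
static unsigned int popcount_naive(unsigned long long x)
{
    unsigned int n = 0;
    while (x) {
        x &= x - 1;   /* drops exactly one set bit per iteration */
        n++;
    }
    return n;
}

/* Hypothetical helper: count disagreements over `iterations` test values. */
static long popcount64_mismatches(long iterations)
{
    unsigned long long x = 0x0123456789ABCDEFULL;
    long bad = 0;
    long i;

    for (i = 0; i < iterations; i++) {
        if (popcount64(x) != popcount_naive(x))
            bad++;
        /* Arbitrary LCG step, used only to vary the input. */
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
    }
    return bad;
}
```

Calling popcount64_mismatches(100000) from main() and checking that it
returns 0 is one way to use this.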

I measured the above function vs. __builtin_popcountll with a loop like the
following:

    t1 = clock();
    for (j = 0; j < 1000000; j++)
        for (i = 0; i < 1024; i++)
            pt = popcount64(data[i]);
    t2 = clock();

    printf("popcount64 = %ld clocks\n", (long)(t2 - t1));

...where data[] is a preinitialized array of u64 values.

I'll attach the exact source I used, which also includes two other possible
implementations of popcountll.
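
For readers without the attachment, the measurement described above could be
fleshed out roughly as follows.  This is a sketch under stated assumptions,
not the attached source: init_data, time_popcount64 and time_builtin are
hypothetical names, data[] is filled from an arbitrary generator, and the
results are summed (rather than assigned to pt) so that the compiler is less
likely to discard the calls.  An optimizing compiler may still transform the
loops, so timings should be read with care.

```c
#include <time.h>

#define N 1024

static unsigned long long data[N];

static unsigned int popcount64(unsigned long long x)
{
    x = (x & 0x5555555555555555ULL) + ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    return (unsigned int)((x * 0x0101010101010101ULL) >> 56);
}

/* Hypothetical setup: fill data[] with arbitrary pseudo-random values. */
static void init_data(void)
{
    unsigned long long s = 1;
    int i;

    for (i = 0; i < N; i++) {
        s = s * 6364136223846793005ULL + 1442695040888963407ULL;
        data[i] = s;
    }
}

/* Time `reps` passes over data[] with popcount64; returns seconds.
   *sum_out receives the total bit count so the work has a visible result. */
static double time_popcount64(long reps, unsigned long long *sum_out)
{
    unsigned long long sum = 0;
    clock_t t1, t2;
    long j;
    int i;

    t1 = clock();
    for (j = 0; j < reps; j++)
        for (i = 0; i < N; i++)
            sum += popcount64(data[i]);
    t2 = clock();

    *sum_out = sum;
    return (double)(t2 - t1) / CLOCKS_PER_SEC;
}

/* Same loop using the GCC builtin, for comparison. */
static double time_builtin(long reps, unsigned long long *sum_out)
{
    unsigned long long sum = 0;
    clock_t t1, t2;
    long j;
    int i;

    t1 = clock();
    for (j = 0; j < reps; j++)
        for (i = 0; i < N; i++)
            sum += (unsigned long long)__builtin_popcountll(data[i]);
    t2 = clock();

    *sum_out = sum;
    return (double)(t2 - t1) / CLOCKS_PER_SEC;
}
```

A caller would run init_data() once, then call both timing functions with the
same reps value (the report used 1000000) and confirm that the two sums match
before comparing the times.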


-- 
           Summary: Speed up builtin_popcountll
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: intvnut at gmail dot com
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36041



Thread overview: 20+ messages
     [not found] <bug-36041-4@http.gcc.gnu.org/bugzilla/>
2012-09-05 10:40 ` [Bug middle-end/36041] Speed up builtin_popcountll jsalavert at gmail dot com
2012-09-05 15:21 ` paolo.carlini at oracle dot com
2012-10-26 15:51 ` gpiez at web dot de
2013-06-26 18:52 ` crrodriguez at opensuse dot org
2013-06-26 23:28 ` glisse at gcc dot gnu.org
2013-06-26 23:31 ` pinskia at gcc dot gnu.org
2013-06-26 23:38 ` crrodriguez at opensuse dot org
2013-06-26 23:49 ` glisse at gcc dot gnu.org
2013-06-27  5:34 ` jakub at gcc dot gnu.org
2013-06-27  6:14 ` crrodriguez at opensuse dot org
2013-06-27  7:13 ` jakub at gcc dot gnu.org
2013-06-28 12:50 ` glisse at gcc dot gnu.org
2013-06-28 13:01 ` jakub at gcc dot gnu.org
2021-08-16 23:28 ` pinskia at gcc dot gnu.org
2008-04-25  0:35 [Bug c/36041] New: " intvnut at gmail dot com
2008-04-25  0:40 ` [Bug middle-end/36041] " intvnut at gmail dot com
2008-04-25  8:45 ` rguenth at gcc dot gnu dot org
2008-04-25 12:29 ` intvnut at gmail dot com
2008-04-25 14:52 ` rguenth at gcc dot gnu dot org
2008-04-29  3:42 ` intvnut at gmail dot com
2010-02-21  1:34 ` manu at gcc dot gnu dot org
