public inbox for gcc-bugs@sourceware.org
From: "hubicka at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/109812] GraphicsMagick resize is a lot slower in GCC 13.1 vs Clang 16 on Intel Raptor Lake
Date: Sun, 28 May 2023 18:50:53 +0000
Message-ID: <bug-109812-4-xmKa7bYW8L@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-109812-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812

--- Comment #10 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
This is a benchmarkable version of the simplified testcase:

jan@localhost:/tmp> cat t.c
#define N 10000000
struct rgb {unsigned char r,g,b;} rgbs[N];
int *addr;
struct drgb {double r,g,b;
#ifdef OPACITY
             double o;
#endif
};

struct drgb sum(double w)
{
        struct drgb r;
        for (int i = 0; i < N; i++)
        {
          r.r += rgbs[i].r * w;
          r.g += rgbs[i].g * w;
          r.b += rgbs[i].b * w;
        }
        return r;
}
jan@localhost:/tmp> cat q.c
struct drgb {double r,g,b;
#ifdef OPACITY
             double o;
#endif
};
struct drgb sum(double w);
int
main()
{
        for (int i = 0; i < 1000; i++)
                sum(i);
}


jan@localhost:/tmp> gcc t.c q.c -march=native -O3 -g ; objdump -d a.out | grep vfmadd231pd  ; perf stat ./a.out
  40119d:       c4 e2 d9 b8 d1          vfmadd231pd %xmm1,%xmm4,%xmm2

 Performance counter stats for './a.out':

         12,148.04 msec task-clock:u                     #    1.000 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
               736      page-faults:u                    #   60.586 /sec
    50,018,421,148      cycles:u                         #    4.117 GHz
           220,502      stalled-cycles-frontend:u        #    0.00% frontend cycles idle
    39,950,154,369      stalled-cycles-backend:u         #   79.87% backend cycles idle
   120,000,191,713      instructions:u                   #    2.40  insn per cycle
                                                  #    0.33  stalled cycles per insn
    10,000,048,918      branches:u                       #  823.182 M/sec
             7,959      branch-misses:u                  #    0.00% of all branches

      12.149466078 seconds time elapsed

      12.149084000 seconds user
       0.000000000 seconds sys


jan@localhost:/tmp> gcc t.c q.c -march=native -O3 -g -DOPACITY ; objdump -d a.out | grep vfmadd231pd  ; perf stat ./a.out

 Performance counter stats for './a.out':

         12,141.11 msec task-clock:u                     #    1.000 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
               735      page-faults:u                    #   60.538 /sec
    50,018,839,129      cycles:u                         #    4.120 GHz
           185,034      stalled-cycles-frontend:u        #    0.00% frontend cycles idle
    29,963,999,798      stalled-cycles-backend:u         #   59.91% backend cycles idle
   120,000,191,729      instructions:u                   #    2.40  insn per cycle
                                                  #    0.25  stalled cycles per insn
    10,000,048,913      branches:u                       #  823.652 M/sec
             7,311      branch-misses:u                  #    0.00% of all branches

      12.142252354 seconds time elapsed

      12.138237000 seconds user
       0.004000000 seconds sys


So on Zen 2 hardware I get the same performance with and without -DOPACITY
(about 12.1 seconds and 50.0 billion cycles in both runs).  It may be
interesting to test it on Raptor Lake.
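
To compare the two runs per loop iteration rather than in totals, the perf
counters can be divided by the iteration count.  The following is only a
sketch: the helper names total_iterations, per_iteration and report are
made up for illustration and are not part of the testcase.  It assumes the
1000 calls to sum() from q.c and N = 10000000 from t.c, and it ignores the
small startup overhead outside the loop.

```c
#include <assert.h>
#include <stdio.h>

/* Total inner-loop iterations: calls to sum() times the loop bound N. */
double total_iterations(unsigned long long calls, unsigned long long n)
{
    assert(calls > 0 && n > 0);   /* avoid dividing by zero later */
    return (double)calls * (double)n;
}

/* Average value of one perf counter per inner-loop iteration. */
double per_iteration(unsigned long long counter,
                     unsigned long long calls, unsigned long long n)
{
    return (double)counter / total_iterations(calls, n);
}

/* Hypothetical helper: print per-iteration figures for one perf stat run,
   assuming 1000 calls (q.c) and N = 10000000 (t.c). */
void report(const char *label, unsigned long long cycles,
            unsigned long long insns, unsigned long long branches)
{
    const unsigned long long calls = 1000ULL, n = 10000000ULL;

    printf("%s: %.2f cycles, %.2f insns, %.2f branches per iteration\n",
           label,
           per_iteration(cycles, calls, n),
           per_iteration(insns, calls, n),
           per_iteration(branches, calls, n));
}
```

Calling report("default", 50018421148ULL, 120000191713ULL, 10000048918ULL)
and report("OPACITY", 50018839129ULL, 120000191729ULL, 10000048913ULL) from
a main() gives roughly 5 cycles, 12 instructions and 1 branch per iteration
for both builds, which matches the identical run times above.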
