public inbox for gcc-patches@gcc.gnu.org
From: Hongyu Wang <wwwhhhyyy333@gmail.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: Richard Biener <richard.guenther@gmail.com>,
	Jan Hubicka <jh@suse.cz>,  Hongtao Liu <hongtao.liu@intel.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	 Hongyu Wang <hongyu.wang@intel.com>
Subject: Re: [PATCH 3/3] x86: Update memcpy/memset inline strategies for -mtune=generic
Date: Tue, 23 Mar 2021 10:41:36 +0800
Message-ID: <CA+OydWnRAt2rfN-9iGXdE=LLky2yovWLAP=iJ6QxECKu5xb1Sw@mail.gmail.com>
In-Reply-To: <CAMe9rOp0ptAZd4X9RyqXoDVged=9UfnmyOFV_+c=JHbj0Dpa=w@mail.gmail.com>

> Hongyu, please collect code size differences on SPEC CPU 2017 and
> eembc.

Here are the code size differences for this patch:

SPEC CPU 2017
                    difference     w patch   w/o patch
500.perlbench_r         0.051%     1622637     1621805
502.gcc_r               0.039%     6930877     6928141
505.mcf_r               0.098%       16413       16397
520.omnetpp_r           0.083%     1327757     1326653
523.xalancbmk_r         0.001%     3575709     3575677
525.x264_r             -0.067%      769095      769607
531.deepsjeng_r         0.071%       67629       67581
541.leela_r            -3.062%      127629      131661
548.exchange2_r        -0.338%       66141       66365
557.xz_r                0.946%      128061      126861

503.bwaves_r            0.534%       33117       32941
507.cactuBSSN_r         0.004%     2993645     2993517
508.namd_r              0.006%      851677      851629
510.parest_r            0.488%     6741277     6708557
511.povray_r           -0.021%      849290      849466
521.wrf_r               0.022%    29682154    29675530
526.blender_r           0.054%     7544057     7540009
527.cam4_r              0.043%     6102234     6099594
538.imagick_r          -0.015%     1625770     1626010
544.nab_r               0.155%      155453      155213
549.fotonik3d_r         0.000%      351757      351757
554.roms_r              0.041%      735837      735533

eembc
                    difference     w patch   w/o patch
aifftr01                0.762%       14813       14701
aiifft01                0.556%       14477       14397
idctrn01                0.101%       15853       15837
cjpeg-rose7-preset      0.114%       56125       56061
nnet_test              -0.848%       35549       35853
aes                     0.125%       38493       38445
cjpegv2data             0.108%       59213       59149
djpegv2data             0.025%       63821       63805
huffde                 -0.104%       30621       30653
mp2decoddata           -0.047%       68285       68317
mp2enf32data1           0.018%       86925       86909
mp2enf32data2           0.018%       89357       89341
mp2enf32data3           0.018%       88253       88237
mp3playerfixeddata      0.103%       46877       46829
ip_pktcheckb1m          0.191%       25213       25165
nat                     0.527%       45757       45517
ospfv2                  0.196%       24573       24525
routelookup             0.189%       25389       25341
tcpbulk                 0.155%       30925       30877
textv2data              0.055%       29101       29085
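
For anyone re-checking the numbers: the difference column is consistent
with (w patch - w/o patch) / w/o patch, expressed in percent.  A minimal
sketch of that arithmetic follows; the helper name size_diff_percent is
only illustrative and is not part of the patch or of any measurement
script used here.

```c
#include <assert.h>

/* Hypothetical helper: relative code size change in percent, measured
   against the build without the patch.  A positive result means the
   patched build is larger.  */
static double
size_diff_percent (long with_patch, long without_patch)
{
  assert (without_patch > 0);
  return 100.0 * (double) (with_patch - without_patch)
	 / (double) without_patch;
}
```

For example, size_diff_percent (127629, 131661) reproduces the 541.leela_r
row (about -3.062), and size_diff_percent (128061, 126861) the 557.xz_r
row (about 0.946).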

H.J. Lu via Gcc-patches <gcc-patches@gcc.gnu.org> wrote on Mon, Mar 22, 2021 at 9:39 PM:
>
> On Mon, Mar 22, 2021 at 6:29 AM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Mon, Mar 22, 2021 at 2:19 PM H.J. Lu via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > Simplify memcpy and memset inline strategies to avoid branches for
> > > -mtune=generic:
> > >
> > > 1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
> > >    load and store for up to 16 * 16 (256) bytes when the data size is
> > >    fixed and known.
> > > 2. Inline only if data size is known to be <= 256.
> > >    a. Use "rep movsb/stosb" with simple code sequence if the data size
> > >       is a constant.
> > >    b. Use loop if data size is not a constant.
> > > 3. Use memcpy/memset library function if data size is unknown or > 256.
> > >
> > > With -mtune=generic -O2,
> >
> > Is there any visible code-size effect of increasing CLEAR_RATIO on
>
> Hongyu, please collect code size differences on SPEC CPU 2017 and
> eembc.
>
> > SPEC/eembc?  Did you play with other values of MOVE/CLEAR_RATIO?
> > 17 memory-to-memory/memory-clear insns looks quite a lot.
> >
>
> Yes, we did.  256 bytes is the threshold above which memcpy/memset in libc
> win. Below 256 bytes, 16 by_pieces move/store is faster.
>
> --
> H.J.
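
To make the 256-byte threshold in the quoted description concrete, here
is a rough sketch of the arithmetic.  It assumes that a by_pieces
expansion is used only while the insn count stays below the ratio, and
that the widest piece is a 16-byte vector move; the identifiers are
hypothetical and this is not the code in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values taken from the description above; the identifiers
   are hypothetical and are not GCC's internal names.  */
enum { SKETCH_RATIO = 17, SKETCH_MAX_PIECE_BYTES = 16 };

/* A by_pieces expansion is used only while the number of move/store
   insns stays below the ratio, so at most (ratio - 1) pieces.  */
static size_t
by_pieces_limit (unsigned int ratio, size_t piece_bytes)
{
  assert (ratio > 0);
  return (size_t) (ratio - 1) * piece_bytes;
}

/* Items 2 and 3 of the description: inline only when an upper bound on
   the size is known and does not exceed the limit; otherwise fall back
   to the memcpy/memset library function.  */
static bool
expand_inline (bool bound_known, size_t max_size)
{
  return bound_known
	 && max_size <= by_pieces_limit (SKETCH_RATIO, SKETCH_MAX_PIECE_BYTES);
}
```

With the values above, by_pieces_limit (17, 16) gives 256, so a known
bound of 256 bytes is still inlined while 257 bytes, or an unknown size,
goes to the library call.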

-- 
Regards,

Hongyu, Wang
