From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-alpha-return-74625-listarch-libc-alpha=sources.redhat.com@sourceware.org>
Received: (qmail 49801 invoked by alias); 10 Nov 2016 08:01:20 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Received: (qmail 49789 invoked by uid 89); 10 Nov 2016 08:01:19 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=no version=3.3.2 spammy=wellknown, well-known, U*pinskia, sk:pinskia
X-HELO: mail-yb0-f194.google.com
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20130820;
        h=x-gm-message-state:mime-version:in-reply-to:references:from:date
         :message-id:subject:to:cc;
        bh=bS/TgiIWFVmx61xvmd1M4AYUfhtyCDMq42Smeddm7gM=;
        b=DfZEtwHCWjGkIHU1Qn5NX0nSDrayDQt5AkLti/DANK7hyKud9/ZzZXHx5K5d4Hy5NO
         EpnNw9V0QoCOITTMAa7E48noReVHDS7+HAMf0pOJN+PKghO0P7VPM/F2Cq2nRrppTRFc
         ftQtSQs+FPD9ja1DytGRLi3Hd/3fJdQHuWMlQ5qFwnt0hqtMzgAiMk8sdU0Ogx90QOBa
         m9hTOfWnuwbhXa1hi3ejehAOgZamU+sCqgQGfqFmUv336A86K5v7vDWNPbCDcXVlLofa
         mUuHhDtfSvGDndSOmlL1lOD96NRijIDqHMUMKKYsPtjJ+OnwmHZAhFBTmoBVh3gBoAth
         QQfg==
X-Gm-Message-State: ABUngvfnYCgN5b1dNZuNqbxxt61XOGWqgXjxxkOposv/LqjIEPkHIM2B67/nMvl06o6k/OhfJcmBLYmFbVQ+Tw==
X-Received: by 10.37.165.65 with SMTP id h59mr4115906ybi.128.1478764867656;
 Thu, 10 Nov 2016 00:01:07 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <FAC7553E-A7FA-4A00-A984-0F2C1D3A2D36@linaro.org>
References: <E122D105-776D-4B9B-A15C-0A7C24BC0910@linaro.org>
 <c05f0be4-6e4b-80cd-ad74-edc16a443b14@linaro.org> <193512EC-DC6C-4BDF-8138-1C1F54B30A12@linaro.org>
 <CA+=Sn1nEq4mjH0T1+vcdA9N1Vzo8n5DAvaDkzhzNQB-akJVfjw@mail.gmail.com> <FAC7553E-A7FA-4A00-A984-0F2C1D3A2D36@linaro.org>
From: Andrew Pinski <pinskia@gmail.com>
Date: Thu, 10 Nov 2016 08:01:00 -0000
Message-ID: <CA+=Sn1=kfWA89LSy15HQjQoLrwYrcyz2bM49uE64t76_0rQd6w@mail.gmail.com>
Subject: Re: [libc/string] State of PAGE_COPY_FWD / PAGE_COPY_THRESHOLD
To: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>, 
	GNU C Library <libc-alpha@sourceware.org>
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2016-11/txt/msg00378.txt.bz2

On Wed, Nov 9, 2016 at 11:52 PM, Maxim Kuvyrkov
<maxim.kuvyrkov@linaro.org> wrote:
>> On Nov 10, 2016, at 11:48 AM, Andrew Pinski <pinskia@gmail.com> wrote:
>>
>> On Wed, Nov 9, 2016 at 11:39 PM, Maxim Kuvyrkov
>> <maxim.kuvyrkov@linaro.org> wrote:
>>>> On Nov 1, 2016, at 5:59 PM, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:
>>>>
>>> ...
>>>> $ cat sysdeps/x86/pagecopy.h
>>>>
>>>> #define PAGE_SIZE           4096
>>>> #define PAGE_COPY_THRESHOLD PAGE_SIZE
>>>>
>>>> #define PAGE_COPY_FWD(dstp, srcp, nbytes_left, nbytes)  /* Implement it */
>>>>
>>>> It should work on any other architecture as well.  Now the question
>>>> is whether this actually does make sense for Linux.  Hurd/mach provided
>>>> a syscall (?) to actually copy the pages (vm_copy) which seems to apply
>>>> some tricks to avoid copying full pages. By 'linux zero page sharing' are
>>>> you referring to KSM (Kernel Samepage Merging)?
>>>>
>>>> If so, on a system without a kernel interface for working directly
>>>> with the underlying memory mapping (such as mach provides), mem{cpy,set}
>>>> will actually need to touch the pages and it will be up to the kernel page
>>>> fault mechanism to handle it (by identifying common pages and adjusting
>>>> the vma mapping accordingly). And AFAIK this is only enabled with KSM if
>>>> you explicitly madvise the pages. So I am not grasping the need to
>>>> actually implement page copying on Linux.
>>>
>>> The Linux kernel has a reserved page filled with zeroes, so if there /were/ a syscall to tell the kernel to map N consecutive pages starting at address PTR to that zero page, we could use that in GLIBC for really big memset(0).
>>>
>>> A quick investigation shows that there is no such syscall provided by the Linux kernel.  Doesn't mean we can't ask for / implement one.
>>
>> And then there would be a COW interrupt on the first write.  Not a
>> good idea, since most likely you are writing zeros to a big page for
>> security reasons before filling it again with other data.
>
> I'm looking at this as a possible performance optimization for a well-known benchmark.

Please don't do it unless you benchmark real workloads.  Doing this
for a benchmark is not a good idea.  Please use something like WRF,
mysql, hadoop, spark or any other real workload that does lots of
memset/memcpy.  Please don't do this just for a well-known broken
benchmark.  Seriously, this is just a broken benchmark anyway.

>
>>   That means
>> each page would need to be copied, which is normally slower than
>> zeroing in the first place.
>
> It may be like you say, or it may be a significant performance improvement.  I want to see numbers before deciding on how useful this may be.
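
If you want numbers for the fault cost alone, a rough sketch like the
one below can give them.  It is only an illustration under assumptions:
the proposed syscall does not exist, so this uses a fresh anonymous
private mapping, where (as far as I know) a read of an untouched page
maps the shared zero page and the first write then takes a second fault
to get a private page.  touch_pages() and minor_faults() are
hypothetical helper names, not anything in glibc or the kernel.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor (no I/O) page faults taken by this process so far. */
static long minor_faults(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}

/* Hypothetical helper: map npages fresh anonymous pages, optionally
   read one byte of each page first, then write one byte to each page.
   Returns the number of minor faults taken while touching the pages,
   or -1 if mmap fails or a fresh page did not read back as zero.  */
static long touch_pages(size_t npages, int read_first)
{
    assert(npages > 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    long before = minor_faults();
    unsigned sum = 0;
    if (read_first)
        for (size_t i = 0; i < len; i += page)
            sum += p[i];            /* read fault: page reads as zero */
    for (size_t i = 0; i < len; i += page)
        p[i] = 1;                   /* write fault: private page needed */
    long after = minor_faults();

    munmap((void *)p, len);
    return sum == 0 ? after - before : -1;
}
```

Comparing touch_pages(n, 1) against touch_pages(n, 0) shows how many
extra faults the read-then-write pattern costs on a given system.  The
counts depend on kernel configuration (transparent huge pages in
particular), so treat them as indicative only.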

Copying is always slower than zeroing.  There are instructions on
most major architectures (including AArch64) for zeroing a cache
line.  Copying means loading one cache line into L1 and then doing the
stores.  Yes, you can mark the cache line as not going to be used
later, but that still means going to the cache.
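
For illustration, here is a minimal sketch of what using such an
instruction looks like on AArch64 with DC ZVA.  This is not glibc's
memset, just an example; zero_region() is a hypothetical name, and on
other architectures the sketch simply falls back to memset.  It assumes
the destination is normal (not device) memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero len bytes starting at dst. */
static void zero_region(void *dst, size_t len)
{
    assert(dst != NULL || len == 0);
#if defined(__aarch64__)
    uint64_t dczid;
    __asm__ volatile("mrs %0, dczid_el0" : "=r"(dczid));
    if (!(dczid & 0x10)) {          /* DZP bit clear: DC ZVA is permitted */
        /* The low four bits give log2 of the block size in 4-byte words. */
        size_t block = (size_t)4 << (dczid & 0xf);
        unsigned char *p = dst;
        unsigned char *end = p + len;

        /* Zero the unaligned head with ordinary stores. */
        size_t mis = (uintptr_t)p & (block - 1);
        if (mis) {
            size_t head = block - mis;
            if (head > len)
                head = len;
            memset(p, 0, head);
            p += head;
        }
        /* Zero whole blocks without reading them first. */
        while ((size_t)(end - p) >= block) {
            __asm__ volatile("dc zva, %0" : : "r"(p) : "memory");
            p += block;
        }
        /* Zero whatever is left after the last whole block. */
        memset(p, 0, (size_t)(end - p));
        return;
    }
#endif
    memset(dst, 0, len);            /* generic fallback */
}
```

The point is that the block-zeroing loop never loads the old contents,
whereas a copy has to load every source line before it can store it.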

Thanks,
Andrew

>
>>
>> COW is only useful when most of the pages will not be written to; it
>> is not useful when doing memcpy or memset, mainly because you then
>> take the overhead of an interrupt twice (a system call
>> is still an interrupt).
>
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>