From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [libc/string] State of PAGE_COPY_FWD / PAGE_COPY_THRESHOLD
To: libc-alpha@sourceware.org
From: Adhemerval Zanella
Date: Tue, 01 Nov 2016 13:59:00 -0000
X-SW-Source: 2016-11/txt/msg00009.txt.bz2

On 01/11/2016 07:28, Maxim Kuvyrkov wrote:
> I wanted to check performance impact of using linux zero page
> sharing in calls to memset (PTR, 0, SIZE). I remembered seeing PAGE_COPY_FWD_MAYBE and PAGE_COPY_THRESHOLD in string/memcpy.c, and my plan was to copy this logic to an experimental memset() implementation.
>
> Closer inspection of the current code showed that only Mach port attempted to use full-page copying in memcpy.c, but now even the Mach port disables it. The net result is that code in string/memcpy.c, as well as parts of headers sysdeps/generic/pagecopy.h and sysdeps/generic/memcopy.h are dead code.
>
> From the above we have 2 questions:
>
> 1. Is it possible to use full-page copy (with COW) in the Linux glibc port for memcpy() and/or memset(0)?

It is still possible to use the generic algorithms in string/mem{cpy,set}.c; you just need to make some changes for the architecture you are targeting. On x86_64, for instance, you will need to remove any assembly implementation so that sysdeps won't pick it up instead. While configuring with --disable-multi-arch (to remove the ifunc usage and keep it simpler), I removed:

    deleted: sysdeps/x86_64/memcpy.S
    deleted: sysdeps/x86_64/memcpy_chk.S
    deleted: sysdeps/x86_64/memmove.S
    deleted: sysdeps/x86_64/mempcpy.S
    deleted: sysdeps/x86_64/wordcopy.c

'memcpy.S' is the default optimized implementation, and 'memcpy_chk.S' is an empty file (on x86_64 it is implemented in memcpy.S, and we will still need the symbols it provides). The same logic applies to the other removed files (memmove.S and mempcpy.S). I also removed sysdeps/x86_64/wordcopy.c because the idea is to use the default one in string/wordcopy.c.

Next, OP_T_THRES needs to be defined, so I created the file 'sysdeps/x86_64/memcopy.h' with the contents:

    $ cat sysdeps/x86_64/memcopy.h
    #include
    #undef OP_T_THRES
    #define OP_T_THRES 8

(I think we should just define it to WORDSIZE/8 somewhere.) This should enable the build and use of the generic memcpy implementation.

To actually use the PAGE_COPY_* macros you will need to add an arch-specific pagecopy.h header.
Using the x86_64 example:

    $ cat sysdeps/x86/pagecopy.h
    #define PAGE_SIZE 4096
    #define PAGE_COPY_THRESHOLD PAGE_SIZE
    #define PAGE_COPY_FWD(dstp, srcp, nbytes_left, nbytes) /* Implement it */

It should work on any other architecture as well.

Now the question is whether this actually makes sense for Linux. Hurd/Mach provides a syscall (?) to actually copy the pages (vm_copy), which seems to apply some tricks to avoid copying full pages. By 'linux zero page sharing', are you referring to KSM (Kernel Samepage Merging)? If so, on a system without a kernel interface for working directly with the underlying memory mapping (such as the one Mach has), mem{cpy,set} will actually need to touch the pages, and it will be up to the kernel page fault mechanism to handle it (by identifying common pages and adjusting the vma mapping accordingly). And AFAIK KSM only does this if you explicitly madvise the pages. So I am not grasping the need to actually implement page copying on Linux.

>
> 2. If not, then is there any reason to keep the dead code around or should we clean it up?

In fact, I think the Hurd/Mach intent is indeed to use it, and it is not being used only because of a missing adjustment in commit 99f8dc922033821edcc13f9f8360e9fda40dfcff (Fix -Wundef warning on PAGE_COPY_THRESHOLD). That commit should have changed the PAGE_THRESHOLD definition in 'sysdeps/mach/pagecopy.h' to PAGE_COPY_THRESHOLD.

>
> --
> Maxim Kuvyrkov
> www.linaro.org
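
To make the pagecopy.h discussion above a bit more concrete, here is a minimal, self-contained sketch of how a page-copy fast path could be wired into a forward copy. It is not the glibc code: sketch_memcpy and PAGE_OFFSET are names made up for this illustration, and PAGE_COPY_FWD is filled in with a plain memcpy of whole pages as a stand-in, since Linux has no direct equivalent of vm_copy. Only PAGE_SIZE, PAGE_COPY_THRESHOLD and the PAGE_COPY_FWD argument list are taken from the example header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Values taken from the pagecopy.h example in the mail.  The #undef only
   guards against a libc header that already defines PAGE_SIZE.  */
#undef PAGE_SIZE
#define PAGE_SIZE 4096
#define PAGE_COPY_THRESHOLD PAGE_SIZE

/* Assumed helper: offset of an address (or address difference) in a page.  */
#define PAGE_OFFSET(n) ((n) & (PAGE_SIZE - 1))

/* Hypothetical stand-in for a kernel-assisted page copy: copy as many whole
   pages as fit in NBYTES, advance DSTP and SRCP past them, and store the
   length of the remaining tail in NBYTES_LEFT.  */
#define PAGE_COPY_FWD(dstp, srcp, nbytes_left, nbytes)                  \
  do                                                                    \
    {                                                                   \
      size_t page_bytes_ = (nbytes) & ~(size_t) (PAGE_SIZE - 1);        \
      memcpy ((void *) (dstp), (const void *) (srcp), page_bytes_);     \
      (dstp) += page_bytes_;                                            \
      (srcp) += page_bytes_;                                            \
      (nbytes_left) = (nbytes) - page_bytes_;                           \
    }                                                                   \
  while (0)

/* Hypothetical helper, not a glibc function.  Copies LEN bytes from SRC to
   DST (non-overlapping) and returns how many bytes went through the page
   path, so the effect of the threshold and alignment checks is visible.  */
static size_t
sketch_memcpy (void *dst, const void *src, size_t len)
{
  uintptr_t dstp = (uintptr_t) dst;
  uintptr_t srcp = (uintptr_t) src;
  size_t paged = 0;

  /* The page path only applies to large copies whose source and
     destination share the same offset within a page.  */
  if (len >= PAGE_COPY_THRESHOLD && PAGE_OFFSET (dstp - srcp) == 0)
    {
      /* Copy the bytes before the first page boundary of the destination.  */
      size_t before = PAGE_OFFSET (-dstp);
      memcpy ((void *) dstp, (const void *) srcp, before);
      dstp += before;
      srcp += before;
      len -= before;

      size_t left;
      PAGE_COPY_FWD (dstp, srcp, left, len);
      paged = len - left;
      len = left;
    }

  /* Copy the tail, or the whole buffer if the page path did not apply.  */
  memcpy ((void *) dstp, (const void *) srcp, len);
  return paged;
}
```

With two page-aligned buffers, copying from offset 10 in both sends the middle whole pages through PAGE_COPY_FWD, while copying between different in-page offsets falls back to the ordinary byte copy. Whether a real PAGE_COPY_FWD would pay off on Linux is exactly the open question raised above.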