From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 84897 invoked by alias); 25 Sep 2019 04:45:06 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 84864 invoked by uid 89); 25 Sep 2019 04:45:04 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.9 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=HContent-Transfer-Encoding:8bit X-HELO: mail-wm1-f52.google.com Received: from mail-wm1-f52.google.com (HELO mail-wm1-f52.google.com) (209.85.128.52) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 25 Sep 2019 04:45:02 +0000 Received: by mail-wm1-f52.google.com with SMTP id p7so3034326wmp.4; Tue, 24 Sep 2019 21:45:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:from:to:references:message-id:date:user-agent:mime-version :in-reply-to:content-transfer-encoding:content-language; bh=H49LWRpPv2sN2lru1HAvg1d4pywHPuXKh2DS/0BHF2k=; b=m+GlCGd/K4yzmlNVnXOKrrVf99aFvhyCA5Izqpq/GDGLoWFC0hvO7BH0pCtwo51xyX b+A2u6HBqKCx/oPz9ebWVQuf+tNeOYl/WfTfO3inli3dtcULvS22JahyNgLPf32ESCAo hDqQX4Ap0983jqae4fBN9uvJzCzI6ibaOy613Gy4V7/Kp4jxN2hrrjeCVnRdZVxjTt9Z 1AFnu5+hzmTVmQt8/qBADLxtXMWGnbM7P+nhDBIp2cPaWA+YgkL9l+ZjQPgWiG97Qlh7 Ga3uHfEvU5x/9durTyQFmmePPzda08nBb1bInsFdgU/2M7BRPzt8N479RTtA9RFGcmJe uzDg== Return-Path: Received: from [10.127.5.166] ([109.190.253.11]) by smtp.googlemail.com with ESMTPSA id m16sm1894341wml.11.2019.09.24.21.44.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 24 Sep 2019 21:44:58 -0700 (PDT) Subject: Re: copy/copy_backward/fill/fill_n/equal rework From: =?UTF-8?Q?Fran=c3=a7ois_Dumont?= To: "libstdc++@gcc.gnu.org" , gcc-patches References: <1aba6050-85ae-8753-eb81-0ec76340e10b@gmail.com> Message-ID: Date: Wed, 25 Sep 2019 04:45:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <1aba6050-85ae-8753-eb81-0ec76340e10b@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-SW-Source: 2019-09/txt/msg01427.txt.bz2 Ping ? On 9/9/19 8:34 PM, François Dumont wrote: > Hi > >     This patch improves stl_algobase.h > copy/copy_backward/fill/fill_n/equal implementations. The improvements > are: > > - activation of algo specialization for __gnu_debug::_Safe_iterator > (w/o _GLIBCXX_DEBUG mode) > > - activation of algo specialization for _Deque_iterator even if mixed > with another kind of iterator. > > - activation of algo specializations __copy_move_a2 for something else > than pointers. For example this code: > > std::vector v { 'a', 'b', .... }; > > ostreambuf_iterator out(std::cout); > > std::copy(v.begin(), v.end(), out); > > is not calling the specialization __copy_move_a2(const char*, const > char*, ostreambuf_iterator<>); > > It also fix a _GLIBCXX_DEBUG issue where the __niter_base > specialization was wrongly removing the _Safe_iterator<> layer. The > testsuite/25_algorithms/copy/debug/1_neg.cc test case was failing on a > debug assertion because _after_ the copy we were trying to increment > the vector iterator after past-the-end. Of course the problem is the > _after_, Debug mode should detect this _before_ it takes place which > it does now. > > Note that std::fill_n is now making use of std::fill for some > optimizations dealing with random access iterators. > > Performances are very good: > > Before: > > copy_backward_deque_iterators.cc    deque 2 deque 1084r 1084u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    deque 2 vector 3373r 3372u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    vector 2 deque 3316r 3316u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    int deque 2 char vector 3610r > 3609u    0s         0mem    0pf > copy_backward_deque_iterators.cc    char vector 2 int deque 3552r > 3552u    0s         0mem    0pf > copy_backward_deque_iterators.cc    deque 2 list 10528r 10528u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    list 2 deque 2161r 2162u > 0s         0mem    0pf > copy_deque_iterators.cc      deque 2 deque                 752r > 751u    0s         0mem    0pf > copy_deque_iterators.cc      deque 2 vector               3300r > 3299u    0s         0mem    0pf > copy_deque_iterators.cc      vector 2 deque               3144r > 3140u    0s         0mem    0pf > copy_deque_iterators.cc      int deque 2 char vector      3340r > 3338u    1s         0mem    0pf > copy_deque_iterators.cc      char vector 2 int deque      3132r > 3132u    0s         0mem    0pf > copy_deque_iterators.cc      deque 2 list                 10013r > 10012u    0s         0mem    0pf > copy_deque_iterators.cc      list 2 deque                 2274r > 2275u    0s         0mem    0pf > equal_deque_iterators.cc     deque vs deque               8676r > 8675u    0s         0mem    0pf > equal_deque_iterators.cc     deque vs vector              5870r > 5870u    0s         0mem    0pf > equal_deque_iterators.cc     vector vs deque              3163r > 3163u    0s         0mem    0pf > equal_deque_iterators.cc     int deque vs char vector     5845r > 5845u    0s         0mem    0pf > equal_deque_iterators.cc     char vector vs int deque     3307r > 3307u    0s         0mem    0pf > > After: > > copy_backward_deque_iterators.cc    deque 2 deque  697r  697u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    deque 2 vector  219r  218u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    vector 2 deque  453r  453u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    int deque 2 char vector 1914r > 1915u    0s         0mem    0pf > copy_backward_deque_iterators.cc    char vector 2 int deque 2112r > 2111u    0s         0mem    0pf > copy_backward_deque_iterators.cc    deque 2 list 7770r 7771u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    list 2 deque 2194r 2193u > 0s         0mem    0pf > copy_deque_iterators.cc      deque 2 deque                 505r > 504u    0s         0mem    0pf > copy_deque_iterators.cc      deque 2 vector                221r > 221u    0s         0mem    0pf > copy_deque_iterators.cc      vector 2 deque                398r > 397u    0s         0mem    0pf > copy_deque_iterators.cc      int deque 2 char vector      1770r > 1767u    0s         0mem    0pf > copy_deque_iterators.cc      char vector 2 int deque      1995r > 1993u    0s         0mem    0pf > copy_deque_iterators.cc      deque 2 list                 7650r > 7641u    2s         0mem    0pf > copy_deque_iterators.cc      list 2 deque                 2270r > 2270u    0s         0mem    0pf > equal_deque_iterators.cc     deque vs deque                769r > 768u    0s         0mem    0pf > equal_deque_iterators.cc     deque vs vector               231r > 230u    0s         0mem    0pf > equal_deque_iterators.cc     vector vs deque               397r > 397u    0s         0mem    0pf > equal_deque_iterators.cc     int deque vs char vector     1541r > 1541u    0s         0mem    0pf > equal_deque_iterators.cc     char vector vs int deque     1623r > 1623u    0s         0mem    0pf > > In Debug Mode it is of course even better. I haven't had the patience > to run the benches before the patch, it just takes hours to run. So > here is just the After part: > > copy_backward_deque_iterators.cc    deque 2 deque 1128r 1128u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    deque 2 vector  616r  616u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    vector 2 deque  856r  855u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    int deque 2 char vector 2277r > 2277u    0s         0mem    0pf > copy_backward_deque_iterators.cc    char vector 2 int deque 2518r > 2519u    0s         0mem    0pf > copy_backward_deque_iterators.cc    deque 2 list 8029r 8028u > 0s         0mem    0pf > copy_backward_deque_iterators.cc    list 2 deque 10418r 10416u > 0s         0mem    0pf > copy_deque_iterators.cc      deque 2 deque                 931r > 930u    0s         0mem    0pf > copy_deque_iterators.cc      deque 2 vector                613r > 613u    0s         0mem    0pf > copy_deque_iterators.cc      vector 2 deque                794r > 795u    0s         0mem    0pf > copy_deque_iterators.cc      int deque 2 char vector      2192r > 2192u    0s         0mem    0pf > copy_deque_iterators.cc      char vector 2 int deque      2365r > 2364u    0s         0mem    0pf > copy_deque_iterators.cc      deque 2 list                 8009r > 8010u    0s         0mem    0pf > copy_deque_iterators.cc      list 2 deque                 10979r > 10970u    1s         0mem    0pf > equal_deque_iterators.cc     deque vs deque               1034r > 1032u    0s         0mem    0pf > equal_deque_iterators.cc     deque vs vector               481r > 482u    0s         0mem    0pf > equal_deque_iterators.cc     vector vs deque               646r > 646u    0s         0mem    0pf > equal_deque_iterators.cc     int deque vs char vector     1802r > 1802u    0s         0mem    0pf > equal_deque_iterators.cc     char vector vs int deque     1867r > 1865u    0s         0mem    0pf > > Note that copy/copy_backward 'list 2 deque' is still much slower > because for the moment we can't remove the debug layer. I plan to do a > refinement to also cover this use case. > > In Debug implementations I have duplicated the Debug checks because > those algo specialization will also be used when users are directly > using for instance without defining _GLIBCXX_DEBUG. > > All algos tests passed except the constexpr ones in Debug mode, I've > put this on my todo list. > > Ok to commit once I completed all the other tests ? > > François >