From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mail-oi1-x229.google.com (mail-oi1-x229.google.com
 [IPv6:2607:f8b0:4864:20::229]) by sourceware.org (Postfix) with ESMTPS
 id 952FC3861896; Mon, 4 Jan 2021 15:55:30 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 952FC3861896
Received: by mail-oi1-x229.google.com with SMTP id q205so32549428oig.13;
 Mon, 04 Jan 2021 07:55:30 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=pybe7knKZQxLCDx4nPEKsditVB8Iubi44/oCeW0j/0Q=;
 b=LhifPe4ck5LNZuXGOJNLW2v93SJ4eP/hr8HhFnILyM65psPIIDxVd5jaFEzTOKMuvR
 mBjf456Bh+1xCb8eBIji/3QGwUi0Bz+MeE4IiDPMapiH+GU76anu21w6DKDnIL0IhBgk
 1nn+u3cpzT2rGft9zNBF3rHa2r9Y086Uxf4tsRDrSCO4OA9OsK96ggHlVpGEVEgK9uQA
 0sdB+OhciYr0ii+GX5+v44cZJaHbaFWG1T9q6ut+kSdZuIfKAEYqPLpaWRMxzBV1AvTf
 VYQ3+L7lHIInU7hmXUwN/K39YxmV3gsIkgRJOXEgvbn7Me/kG1QP7z3q77i+sni0d09A
 wFcw==
X-Gm-Message-State: AOAM532gx83usitiHHid9uAEJNfU5dOp7lFNqlA3FRUFTPPIAqrybwRH
 6FZj4pkccF/biGBtVhfiAvh2Kpxo0K6f1KpQgys=
X-Google-Smtp-Source: ABdhPJxmm1C1Rt07hLPbOEiQUWbakrzVC7Yy5/+PtDOspCWRKtVmCuagOmJAPib4EUB7Liaq57R8aXKyNeOKJgUH4qw=
X-Received: by 2002:aca:f5d3:: with SMTP id t202mr18487000oih.25.1609775730085;
 Mon, 04 Jan 2021 07:55:30 -0800 (PST)
MIME-Version: 1.0
References: <20210104151706.2129490-1-hjl.tools@gmail.com>
 <874kjwloah.fsf@oldenburg2.str.redhat.com>
 <87wnwsk8lh.fsf@oldenburg2.str.redhat.com>
In-Reply-To: <87wnwsk8lh.fsf@oldenburg2.str.redhat.com>
From: "H.J. Lu"
Date: Mon, 4 Jan 2021 07:54:53 -0800
Message-ID:
Subject: Re: V2 [PATCH] x86-64: Avoid rep movsb with short distance [BZ #27130]
To: Florian Weimer, Libc-stable Mailing List
Cc: "H.J. Lu via Libc-alpha"
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-3030.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham
 autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: libc-stable@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-stable mailing list
List-Unsubscribe:
List-Archive:
List-Help:
List-Subscribe:
X-List-Received-Date: Mon, 04 Jan 2021 15:55:31 -0000

On Mon, Jan 4, 2021 at 7:47 AM Florian Weimer wrote:
>
> * H. J. Lu:
>
> > On Mon, Jan 4, 2021 at 7:22 AM Florian Weimer wrote:
> >>
> >> * H. J. Lu via Libc-alpha:
> >>
> >> >  1:
> >> > +# if AVOID_SHORT_DISTANCE_REP_MOVSB
> >> > +	movq	%rsi, %rcx
> >> > +	subq	%rdi, %rcx
> >> > +2:
> >> > +/* Avoid "rep movsb" if RCX, the distance between source and destination,
> >> > +   is N*4GB + [1..63] with N >= 0.  */
> >> > +	cmpl	$63, %ecx
> >> > +	jbe	L(more_2x_vec)	/* Avoid "rep movsb" if ECX <= 63.  */
> >> > +# endif
> >> >  	mov	%RDX_LP, %RCX_LP
> >> >  	rep movsb
> >> >  L(nop):
> >>
> >> Why not use _LP names here?  I think the %ecx comparison at least can
> >> give false results on x86-64 (64-bit).
> >>
> >
> > This is done on purpose since we want to avoid "rep movsb" for distances of
> > N*4GB + [1..63] with N >= 0 which include 0x100000003.
>
> Ah, and the comment is quite clear (the commit subject less so).

It isn't easy to describe it with so few letters.

> I tried to make sense of the assembler code, and I think the change is
> okay because L(movsb) is only reached when there is more to copy than
> twice the vector size.
>

That is correct.

I am checking it in.  I will backport it to release branches next
week.

Thanks.

-- 
H.J.
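
For readers following the %ecx versus %rcx point in this thread, here is a
minimal sketch that models the arithmetic of the quoted hunk in Python. It
is not glibc code: the function names, the example addresses, and the
"full 64-bit" variant are hypothetical and exist only for comparison. It
assumes src and dst are 64-bit addresses held in %rsi and %rdi.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def distance(src: int, dst: int) -> int:
    """Model `movq %rsi, %rcx; subq %rdi, %rcx`: src - dst with 64-bit wraparound."""
    return (src - dst) & MASK64


def skips_rep_movsb(src: int, dst: int) -> bool:
    """Model `cmpl $63, %ecx; jbe L(more_2x_vec)`.

    Only the low 32 bits of the distance take part in the unsigned compare,
    so any distance of the form N*4GB + k with k <= 63 takes the branch.
    """
    return (distance(src, dst) & MASK32) <= 63


def skips_rep_movsb_full_width(src: int, dst: int) -> bool:
    """Hypothetical variant that compares all 64 bits of the distance."""
    return distance(src, dst) <= 63


dst = 0x10000  # illustrative address, not taken from the thread

# Distance 3: both variants avoid "rep movsb".
print(skips_rep_movsb(dst + 3, dst), skips_rep_movsb_full_width(dst + 3, dst))

# Distance 0x100000003 (the value mentioned in the thread): only the
# 32-bit compare avoids "rep movsb".
far = dst + 0x100000003
print(skips_rep_movsb(far, dst), skips_rep_movsb_full_width(far, dst))

# Distance 64, and a source 3 bytes below the destination: neither branches.
print(skips_rep_movsb(dst + 64, dst), skips_rep_movsb(dst - 3, dst))
```

In this model the 32-bit compare is what makes 0x100000003 behave like a
distance of 3. As written, the model also takes the branch when the low
32 bits are exactly 0, since the compare is "below or equal".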