From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-x32c.google.com (mail-ot1-x32c.google.com [IPv6:2607:f8b0:4864:20::32c]) by sourceware.org (Postfix) with ESMTPS id 012A03858D28; Sat, 16 Dec 2023 20:10:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 012A03858D28 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 012A03858D28 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::32c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702757447; cv=none; b=alaw0HpgHPRsAOwHHEdgIFptZbFcoYPrEDyZwbsCm4aefFFUeEuVDxHJFjzLRx3ZC3ffBp3+2soW+2jPJI0Svat2C0PomUlwtzB57OGIZQBqAMVz73xKDumUCzw0s87SMfHug1hHYsYCK7Qqf7BnmXsvL8wmZXNoISRYFTHPbEc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702757447; c=relaxed/simple; bh=madTjnM8bCftpYxU4dU6lWoeM1z53h9ylqc62GZlZ28=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=JkUME3tWPX17y76McsZLEVP2LBMjD7/N739gRSyNsidP49N43xdcEZUt5zp9g3Rc3of2pxK3Z+XFGvIy16xIfO4O4ZWnUXJ+dYyT9IqQe4JHOZdS6QS5P+8UuZRezr3CYcjum/mnqX71UC3V5C857m2qWMlGF5L2ds7dVI+qwQw= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-ot1-x32c.google.com with SMTP id 46e09a7af769-6d9e0f0cba9so1435858a34.1; Sat, 16 Dec 2023 12:10:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1702757445; x=1703362245; darn=gcc.gnu.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=QPHmMQEPtAH6CC1gSE2OG5QsriYJl9qquDOzeZr2RnM=; b=ane/r2dBsNzKco6ZStByjorcJWo8MpZDp/Ud4RmRW4n0pn4WQuue1bfv2IqOaS4sci pQQJ49UBMfCiiI9w2Vm0vCjyE30Rgflgvqer0Efya8JLbIPLuLgZ2zNn/niulpyip4Kl g8au43LyOdGFnupiQvbKdyEfGJilLUcnzF701k0i4yW2TcH46Oj2mTA3N/QePITVzunI rJUz61Y1XC6lCHkdVuQKUCR/AseY9xi/gZ/0mxyVUwf7cki7iZn9Abrv5buG0yTIvDUX 
nwvf8AQ3t/IcBxK2pavmH4os6BRmicLSzERU5UcCPV8f1H5qUyReklWRilWwmpnYJ48V Nd2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702757445; x=1703362245; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=QPHmMQEPtAH6CC1gSE2OG5QsriYJl9qquDOzeZr2RnM=; b=sjF9s4CZzCqsRPggArMl95OQdwYAY7hKuDYHmgf37NrWkhdGWz2CvtrsYQxhA0nois hdgGFgXYswxNhYurpcqLS0FcAnyqWDy5nL9Y930c8RWlK+jF5HM1lZpt47wfBpGUtyhT R2cvQOZul0Ml0CYkwrn2BpTFH9vs4T3kFIpWB/w9C4aYDwbyceOSAzJAEyA4o3u1ny5Z SplaYSXKtNqnGtsyKM5cotPUdAQVxUS2Ikp5tDm3xEdnYE3xGpthHZ8j8o6sdGSjDElC NKw0TGtuqJUU2DzSVtiGCWjF3nP2bceI9HtzaXKFyuDoExwYAIscL7d0iDhYFaIwr2oL 29ag== X-Gm-Message-State: AOJu0YyP36yI3XtkzVRS3tpKhv086f+RX6cvb+R0FDoBIWxbDN0wlEdh 1U15hshbuaDiWGcx0M0fjVQLQK0FsYCgfCcMrtW2l/1kJnc= X-Google-Smtp-Source: AGHT+IFNfx30nnlWkFPfS0L9Nuv+qDEBrlCUuf/xhZY6w/+SIlLXZ/yN+sbPcKTslEcSskelCrENjt3zdQ9vgPH1y9c= X-Received: by 2002:a05:6808:38c3:b0:3ba:5dd:9457 with SMTP id el3-20020a05680838c300b003ba05dd9457mr19688400oib.38.1702757444849; Sat, 16 Dec 2023 12:10:44 -0800 (PST) MIME-Version: 1.0 From: Antony Polukhin Date: Sat, 16 Dec 2023 23:10:33 +0300 Message-ID: Subject: [PATCH] PR libstdc++/112682 More efficient std::basic_string move To: "libstdc++" , gcc-patches List Content-Type: multipart/mixed; boundary="000000000000a3ba97060ca61cf3" X-Spam-Status: No, score=-8.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: --000000000000a3ba97060ca61cf3 Content-Type: text/plain; charset="UTF-8" A few places in bits/basic_string.h use `traits_type::copy` to copy `__str.length() + 1` bytes. 
Although `__str.length()` is known to be at most 15, the compiler emits (and sometimes inlines) a `memcpy` call. That results in a fairly large set of instructions: https://godbolt.org/z/j35MMfxzq

Replacing `__str.length() + 1` with `_S_local_capacity + 1` explicitly forces the compiler to copy the whole `__str._M_local_buf`. As a result, the assembly becomes almost 5 times shorter, with no function calls and no chains of conditional jumps: https://godbolt.org/z/bfq8bxra9

This patch always copies `_S_local_capacity + 1` bytes when the traits type is `std::char_traits<char>`.

PR libstdc++/112682:
	* include/bits/basic_string.h: Optimize string moves.

P.S.: I am still not sure whether this optimization is free of UB or acceptable for libstdc++. However, the assembly looks much better with it.

-- 
Best regards, Antony Polukhin

--000000000000a3ba97060ca61cf3
Content-Type: text/plain; charset="US-ASCII"; name="copy_indeterminate.txt"
Content-Disposition: attachment; filename="copy_indeterminate.txt"
Content-Transfer-Encoding: base64
Content-ID:
X-Attachment-Id: f_lq8hmwdu0

ZGlmZiAtLWdpdCBhL2xpYnN0ZGMrKy12My9pbmNsdWRlL2JpdHMvYmFzaWNfc3RyaW5nLmggYi9s aWJzdGRjKystdjMvaW5jbHVkZS9iaXRzL2Jhc2ljX3N0cmluZy5oCmluZGV4IDFiOGViY2E3ZGFk Li43YTVlMzQ4MjgwYyAxMDA2NDQKLS0tIGEvbGlic3RkYysrLXYzL2luY2x1ZGUvYml0cy9iYXNp Y19zdHJpbmcuaAorKysgYi9saWJzdGRjKystdjMvaW5jbHVkZS9iaXRzL2Jhc2ljX3N0cmluZy5o CkBAIC0xODgsNiArMTg4LDIzIEBAIF9HTElCQ1hYX0JFR0lOX05BTUVTUEFDRV9DWFgxMQogICAg ICAgOiBiYXNpY19zdHJpbmcoX19zdncuX01fc3YuZGF0YSgpLCBfX3N2dy5fTV9zdi5zaXplKCks IF9fYSkgeyB9CiAjZW5kaWYKIAorICAgICAgX0dMSUJDWFgxN19DT05TVEVYUFIKKyAgICAgIHN0 YXRpYyBib29sCisgICAgICBfU19wZXJtaXRfY29weWluZ19pbmRldGVybWluYXRlKCkgbm9leGNl cHQKKyAgICAgIHsKKwkvLyBDb3B5aW5nIGNvbXBpbGUtdGltZSBrbm93biBfU19sb2NhbF9jYXBh Y2l0eSArIDEgYnl0ZXMgaXMgbXVjaCBtb3JlCisJLy8gZWZmaWNpZW50IHRoYW4gY29weWluZyBy dW50aW1lIGtub3duIF9fc3RyLmxlbmd0aCgpICsgMS4gVGhpcworCS8vIGZ1bmN0aW9uIHJldHVy bnMgdHJ1ZSwgaWYgc3VjaCBpbml0aWFsaXphdGlvbiBpcyBwZXJtaXR0ZWQgZXZlbiBpZgorCS8v
IHRoZSByaWdodCBzaWRlIGhhcyBpbmRldGVybWluYXRlIHZhbHVlcy4KKwkvLworCS8vIFtkY2wu aW5pdF0gcGVybWl0cyBpbml0aWFsaXppbmcgd2l0aCBpbmRldGVybWluYXRlIHZhbHVlIG9mIHVu c2lnbmVkCisJLy8gbmFycm93IGNoYXJhY3RlciB0eXBlLgorCS8vCisJLy8gTGlicmFyeSB1c2Vy cyBzaG91bGQgbm90IHNwZWNpYWxpemUgY2hhcl90cmFpdHM8Y2hhcj4gc28gdGhpcyBpcworCS8v IG5vdCBvYnNlcnZhYmxlIGZvciB1c2VyLgorCXJldHVybiBpc19zYW1lPHRyYWl0c190eXBlLCBj aGFyX3RyYWl0czxjaGFyPiA+Ojp2YWx1ZTsKKwkgIH0KKwogICAgICAgLy8gVXNlIGVtcHR5LWJh c2Ugb3B0aW1pemF0aW9uOiBodHRwOi8vd3d3LmNhbnRyaXAub3JnL2VtcHR5b3B0Lmh0bWwKICAg ICAgIHN0cnVjdCBfQWxsb2NfaGlkZXIgOiBhbGxvY2F0b3JfdHlwZSAvLyBUT0RPIGNoZWNrIF9f aXNfZmluYWwKICAgICAgIHsKQEAgLTY3Miw4ICs2ODksMTAgQEAgX0dMSUJDWFhfQkVHSU5fTkFN RVNQQUNFX0NYWDExCiAgICAgICB7CiAJaWYgKF9fc3RyLl9NX2lzX2xvY2FsKCkpCiAJICB7Ci0J ICAgIHRyYWl0c190eXBlOjpjb3B5KF9NX2xvY2FsX2J1ZiwgX19zdHIuX01fbG9jYWxfYnVmLAot CQkJICAgICAgX19zdHIubGVuZ3RoKCkgKyAxKTsKKwkgICAgc2l6ZV90eXBlIF9fY29weV9jb3Vu dCA9IF9TX2xvY2FsX2NhcGFjaXR5ICsgMTsKKwkgICAgaWYgX0dMSUJDWFgxN19DT05TVEVYUFIg KCFfU19wZXJtaXRfY29weWluZ19pbmRldGVybWluYXRlKCkpCisJICAgICAgX19jb3B5X2NvdW50 ID0gX19zdHIubGVuZ3RoKCkgKyAxOworCSAgICB0cmFpdHNfdHlwZTo6Y29weShfTV9sb2NhbF9i dWYsIF9fc3RyLl9NX2xvY2FsX2J1ZiwgX19jb3B5X2NvdW50KTsKIAkgIH0KIAllbHNlCiAJICB7 CkBAIC03MTEsOCArNzMwLDEwIEBAIF9HTElCQ1hYX0JFR0lOX05BTUVTUEFDRV9DWFgxMQogICAg ICAgewogCWlmIChfX3N0ci5fTV9pc19sb2NhbCgpKQogCSAgewotCSAgICB0cmFpdHNfdHlwZTo6 Y29weShfTV9sb2NhbF9idWYsIF9fc3RyLl9NX2xvY2FsX2J1ZiwKLQkJCSAgICAgIF9fc3RyLmxl bmd0aCgpICsgMSk7CisJICAgIHNpemVfdHlwZSBfX2NvcHlfY291bnQgPSBfU19sb2NhbF9jYXBh Y2l0eSArIDE7CisJICAgIGlmIF9HTElCQ1hYMTdfQ09OU1RFWFBSICghX1NfcGVybWl0X2NvcHlp bmdfaW5kZXRlcm1pbmF0ZSgpKQorCSAgICAgIF9fY29weV9jb3VudCA9IF9fc3RyLmxlbmd0aCgp ICsgMTsKKwkgICAgdHJhaXRzX3R5cGU6OmNvcHkoX01fbG9jYWxfYnVmLCBfX3N0ci5fTV9sb2Nh bF9idWYsIF9fY29weV9jb3VudCk7CiAJICAgIF9NX2xlbmd0aChfX3N0ci5sZW5ndGgoKSk7CiAJ ICAgIF9fc3RyLl9NX3NldF9sZW5ndGgoMCk7CiAJICB9Cg== --000000000000a3ba97060ca61cf3--
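
For readers who do not want to decode the attachment, here is a minimal standalone sketch of the idea described in the message: copy a compile-time-constant `capacity + 1` bytes when the traits type is `std::char_traits<char>`, and fall back to the runtime `length + 1` otherwise. This is not the libstdc++ code. `LocalBuf`, `CustomTraits`, `permit_whole_buffer_copy`, `move_local` and `make_local` are hypothetical names invented for illustration, and the buffer is zero-initialized here, so the sketch deliberately sidesteps the indeterminate-value question raised in the P.S.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the small-string buffer of a string class.
struct LocalBuf {
    static constexpr std::size_t local_capacity = 15;
    // Zero-initialized on purpose: this sketch never reads indeterminate bytes.
    char buf[local_capacity + 1] = {};
    std::size_t len = 0;
};

// Hypothetical user-provided traits type; it only inherits the default behavior.
struct CustomTraits : std::char_traits<char> {};

// True only for std::char_traits<char>, whose copy() cannot be observed by users.
template <class Traits>
constexpr bool permit_whole_buffer_copy() {
    return std::is_same<Traits, std::char_traits<char>>::value;
}

// Moves a locally stored string from src to dst.
template <class Traits = std::char_traits<char>>
void move_local(LocalBuf& dst, LocalBuf& src) {
    // Compile-time constant count: the compiler can emit fixed-width loads/stores.
    std::size_t count = LocalBuf::local_capacity + 1;
    if (!permit_whole_buffer_copy<Traits>())
        count = src.len + 1;  // runtime count: characters plus the terminator
    Traits::copy(dst.buf, src.buf, count);
    dst.len = src.len;
    // Leave the source as a valid empty string.
    src.len = 0;
    src.buf[0] = '\0';
}

// Builds a LocalBuf from a C string that fits into the local buffer.
LocalBuf make_local(const char* s) {
    LocalBuf b;
    b.len = std::strlen(s);
    assert(b.len <= LocalBuf::local_capacity);
    std::memcpy(b.buf, s, b.len + 1);
    return b;
}
```

Calling `move_local(dst, src)` takes the fixed-size path, while `move_local<CustomTraits>(dst, src)` takes the length-dependent path. Comparing the generated assembly of the two instantiations should show a difference similar in kind to the one in the Godbolt links above, though the exact output depends on the compiler and flags.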