From: Bernhard Reutner-Fischer
To: Ondřej Bílka
Cc: Richard Biener, GCC Patches, "Joseph S. Myers"
Date: Fri, 10 Jul 2015 11:52:00 -0000
Subject: Re: [PATCH] fold builtin_tolower, builtin_toupper
In-Reply-To: <20150710065144.GA9215@domone>
References: <1436446689-12180-1-git-send-email-rep.dot.nop@gmail.com> <20150710065144.GA9215@domone>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
X-SW-Source: 2015-07/txt/msg00873.txt.bz2
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=001a11336b72f9522c051a840285

--001a11336b72f9522c051a840285
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 10 July 2015 at 08:51, Ondřej Bílka wrote:
> On Thu, Jul 09, 2015 at 03:46:08PM +0200, Richard Biener wrote:
>> On Thu, 9 Jul 2015, Bernhard Reutner-Fischer wrote:

[toupper/tolower patch withdrawn]

>> I don't think this can be correct for all locales which need not
>> have a lower-case character for all upper-case ones nor do
>> all letters having one need to be in the range of 'A' to 'Z'.
>>
>> Joseph will surely correct me if I am wrong.
>>
> Thats correct as this doesn't handle toupper('č') with appropriate
> single byte locale. You cannot even rely on fact that if x<128 then only
> conversion is happens in 'A'..'Z' range, there are locales where that
> doesn't hold and we need to check _NL_CTYPE_NONASCII_CASE. We don't
> export that so you would need to check that while constructing table with 256 entries.

I detest locales.

>
> Also your example is invalid as you used __builtin_tolower instead
> tolower. As usual gcc builtins are slow, you will get better performance

You're of course right, libc usually has a map-lookup for the fast
path for these.
(tolower) (...) comes to mind but doesn't matter here.

> with following.
>
> #include
> int foo(char *c)
> {
>   int i;
>   for(i=0;i<1000;i++)
>     c[i]=tolower(c[i]);
> }
>
>
> As your example first problem is that it doesn't work with utf8 due
> multibyte characters.

yea, the app i saw doing that strcpy/tolower has a defined input of
ASCII A-Za-z0-9- so i should not have used toupper in the example in
the first place.

>
> Second problem is that sse4.2 doesn't help at all as generating masks
> with it is quite slow. Using just sse2 is faster here.

The point of the PR was that a) loop-fusion is missing and b) nothing
is vectorized.
The quick sse4.2 example just used an extension my CPU happens to
support, and it showed that the result would be smaller than before and
maybe even a tiny bit faster.. ;)

>
> It could be possible to add such function to libc. For vectorization you

I think it would be better if GCC were able to fuse two or more loops
and understood how to vectorize patterns like these.
As you point out, toupper is a bad example; a better one would perhaps
be something like the attached.
I guess that there is real-world code that does a
memcpy/memmove/str[n]cpy and then masks out some bits in the
destination, so this should be useful generally.

thanks for your comments, though!
cheers,

--001a11336b72f9522c051a840285
Content-Type: text/x-csrc; charset=US-ASCII; name="tolower_strcpy-0b.c"
Content-Disposition: attachment; filename="tolower_strcpy-0b.c"
Content-Transfer-Encoding: 7bit
X-Attachment-Id: f_ibxkddeh0

/* PR middle-end/66741 */
/* Manually expanded variant */
/* We were not fusing the 2 loops (strcpy and tolower) and we did not
 * vectorize the loop.  */
typedef __SIZE_TYPE__ size_t;
static __attribute__ ((noinline, noclone)) char *
tolower_strcpy_1(char *dest, const char *src) {
	char *d = dest, *s = (char *)src;
	while (*s) /* strcpy */
		*d++ = *s++;
	*d = '\0';
	d = dest;
	/* while (*d) should work as well but might be too complicated, so: */
	/* use same loop condition as above */
	s = (char *)src;
	while (*s) { /* ascii_tolower */
		int ch = *d;
		*d++ = ch >= 'A' && ch <= 'Z' ? ch | 0x20 : ch;
		s++;
	}
	return dest;
}
char *tolower_strcpy(char *dest, const char *src) {
	char *s = (char *)src;
	unsigned int len = 0;
	/* Reject characters outside '-'..'z' and inputs longer than 255. */
	while (*s) {
		if (*s < '-' || *s > 'z' || ++len > 255)
			return (void*)0;
		s++;
	}
	return tolower_strcpy_1(dest, src);
}
#ifdef MAIN
#include <unistd.h>
#include <string.h>
#define N 128
int main(void) {
	unsigned long sum = 0;
	char src[N + 1], dest[N + 1];
	while (1) {
		int n = read(0, &src, N);
		if (n == 0)
			break;
		if (n < 0)
			return 1;
		src[n] = 0;
		sum |= (unsigned long)tolower_strcpy(dest, src);
//		write(1, dest, strlen(dest));
	}
	return sum == 42;
}
#endif
--001a11336b72f9522c051a840285--
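
To make the loop-fusion point concrete, here is a minimal sketch of what
a hand-fused copy-and-lowercase loop could look like next to the
two-pass form. It is not taken from the PR or from GCC; the names
ascii_tolower, tolower_strcpy_two_pass and tolower_strcpy_fused are made
up for illustration. It assumes plain ASCII input (so no locale tables
are involved), non-overlapping buffers, and a dest that is large enough
for src including the terminating NUL.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: ASCII-only lowercase, independent of the locale. */
static inline char ascii_tolower(char ch)
{
	return (ch >= 'A' && ch <= 'Z') ? (char)(ch | 0x20) : ch;
}

/* Two passes over the data: first copy, then fold to lowercase. */
static char *tolower_strcpy_two_pass(char *dest, const char *src)
{
	char *d;

	assert(dest != NULL && src != NULL);
	strcpy(dest, src);
	for (d = dest; *d; d++)
		*d = ascii_tolower(*d);
	return dest;
}

/* Hand-fused variant: copy and fold in a single pass over src. */
static char *tolower_strcpy_fused(char *dest, const char *src)
{
	char *d = dest;

	assert(dest != NULL && src != NULL);
	while (*src)
		*d++ = ascii_tolower(*src++);
	*d = '\0';
	return dest;
}
```

Both functions should produce the same destination string for any
NUL-terminated ASCII input; the fused one reads each byte once and never
re-reads dest, which is roughly the shape one would hope the compiler
could derive from the two-loop version before vectorizing it.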