From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 800D138515FF for ; Tue, 21 Jun 2022 12:01:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 800D138515FF Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-468-xruGXvmSO-Kl0IPUo9xhwg-1; Tue, 21 Jun 2022 08:01:34 -0400 X-MC-Unique: xruGXvmSO-Kl0IPUo9xhwg-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E4A6519705C7; Tue, 21 Jun 2022 12:01:33 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.14]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 8F03D18EB4; Tue, 21 Jun 2022 12:01:33 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 25LC1VE9695974 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Tue, 21 Jun 2022 14:01:31 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 25LC1UIA695973; Tue, 21 Jun 2022 14:01:30 +0200 Date: Tue, 21 Jun 2022 14:01:30 +0200 From: Jakub Jelinek To: Noah Goldstein Cc: gcc-patches@gcc.gnu.org, hjl.tools@gmail.com Subject: Re: [PATCH v2] tree-optimization/95821 - Convert strlen + strchr to memchr Message-ID: Reply-To: Jakub Jelinek References: <20220620163536.2653437-1-goldstein.w.n@gmail.com> <20220620214220.2776648-1-goldstein.w.n@gmail.com> MIME-Version: 1.0 In-Reply-To: <20220620214220.2776648-1-goldstein.w.n@gmail.com> X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Jun 2022 12:01:38 -0000 On Mon, Jun 20, 2022 at 02:42:20PM -0700, Noah Goldstein wrote: > This patch allows for strchr(x, c) to the replace with memchr(x, c, > strlen(x) + 1) if strlen(x) has already been computed earlier in the > tree. > > Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 > > Since memchr doesn't need to re-find the null terminator it is faster > than strchr. > > bootstrapped and tested on x86_64-linux. > > PR tree-optimization/95821 This should be indented by a single tab, not two. > > gcc/ > > * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit > memchr instead of strchr if strlen already computed. > > gcc/testsuite/ > > * c-c++-common/pr95821-1.c: New test. > * c-c++-common/pr95821-2.c: New test. > * c-c++-common/pr95821-3.c: New test. > * c-c++-common/pr95821-4.c: New test. > * c-c++-common/pr95821-5.c: New test. > * c-c++-common/pr95821-6.c: New test. > * c-c++-common/pr95821-7.c: New test. > * c-c++-common/pr95821-8.c: New test. > --- a/gcc/tree-ssa-strlen.cc > +++ b/gcc/tree-ssa-strlen.cc > @@ -2405,9 +2405,12 @@ strlen_pass::handle_builtin_strlen () > } > } > > -/* Handle a strchr call. If strlen of the first argument is known, replace > - the strchr (x, 0) call with the endptr or x + strlen, otherwise remember > - that lhs of the call is endptr and strlen of the argument is endptr - x. */ > +/* Handle a strchr call. If strlen of the first argument is known, > + replace the strchr (x, 0) call with the endptr or x + strlen, > + otherwise remember that lhs of the call is endptr and strlen of the > + argument is endptr - x. If strlen of x is not know but has been > + computed earlier in the tree then replace strchr(x, c) to Still missing space before ( above. > + memchr (x, c, strlen + 1). */ > > void > strlen_pass::handle_builtin_strchr () > @@ -2418,8 +2421,12 @@ strlen_pass::handle_builtin_strchr () > if (lhs == NULL_TREE) > return; > > - if (!integer_zerop (gimple_call_arg (stmt, 1))) > - return; > + tree chr = gimple_call_arg (stmt, 1); > + /* strchr only uses the lower char of input so to check if its > + strchr (s, zerop) only take into account the lower char. */ > + bool is_strchr_zerop > + = (TREE_CODE (chr) == INTEGER_CST > + && integer_zerop (fold_convert (char_type_node, chr))); The indentation rule is that = should be 2 columns to the right from bool, so bool is_strchr_zerop = (TREE_CODE (chr) == INTEGER_CST && integer_zerop (fold_convert (char_type_node, chr))); > + /* If its not strchr (s, zerop) then try and convert to > + memchr since strlen has already been computed. */ This comment still has the second line weirdly indented. > + tree fn = builtin_decl_explicit (BUILT_IN_MEMCHR); > + > + /* Only need to check length strlen (s) + 1 if chr may be zero. > + Otherwise the last chr (which is known to be zero) can never > + be a match. NB: We don't need to test if chr is a non-zero > + integer const with zero char bits because that is taken into > + account with is_strchr_zerop. */ > + if (!tree_expr_nonzero_p (chr)) The above is unsafe though. tree_expr_nonzero_p (chr) will return true if say VRP can prove it is not zero, but because of the implicit (char) chr cast done by the function we need something different. Say if VRP determines that chr is in [1, INT_MAX] or even just [255, 257] it doesn't mean (char) chr won't be 0. So, as I've tried to explain in the previous mail, it can be done e.g. with bool chr_nonzero = false; if (TREE_CODE (chr) == INTEGER_CST && integer_nonzerop (fold_convert (char_type_node, chr))) chr_nonzero = true; else if (TREE_CODE (chr) == SSA_NAME && CHAR_TYPE_SIZE < INT_TYPE_SIZE) { value_range r; /* Try to determine using ranges if (char) chr must be always 0. That is true e.g. if all the subranges have the INT_TYPE_SIZE - CHAR_TYPE_SIZE bits the same on lower and upper bounds. */ if (get_range_query (cfun)->range_of_expr (r, chr, stmt) && r.kind () == VR_RANGE) { chr_nonzero = true; wide_int mask = wi::mask (CHAR_TYPE_SIZE, true, INT_TYPE_SIZE); for (int i = 0; i < r.num_pairs (); ++i) if ((r.lower_bound (i) & mask) != (r.upper_bound (i) & mask)) { chr_nonzero = false; break; } } } if (!chr_nonzero) Jakub