From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) by sourceware.org (Postfix) with ESMTPS id 1E7FA3858C55 for ; Wed, 27 Mar 2024 18:44:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1E7FA3858C55 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1E7FA3858C55 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1711565071; cv=none; b=f6RIdoPL5Ogr+1f0FhdCUypc4O9gIVK0n27DgB98e90+79oBAjVdg0tDTdBUrPYRNPFTsaJMtwmiLHN4wzUpjF3+Dga1ddm3/j/u8xrnKhWyjZTOAsjE+HMNBtjbYot6OHAHke64+IymZKYb4kQtTBQXrfYfyoKAIP/Mm8k1H88= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1711565071; c=relaxed/simple; bh=BJrcvqC+dZt1zKsPgEC56yJqsnbx7OvVB2Cq8+Vi2Lc=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:Message-ID:MIME-Version; b=jHZhIcs2wuq56VDf3z8gpCEELOgnxxwXeUabrckTokpno99QlJjbfFqjJyZzTSWjdRrxhAWBVaR/umFWe2PD/aTzD/o8iFp8npPhRjvGRkqv3ncODxgM6sD4Dn/q710njW5PJQqPd7st/ZmXObXSzHab1bGaMq+QpNnzW/bOFdc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from knuth.suse.de (unknown [10.168.5.16]) by smtp-out1.suse.de (Postfix) with ESMTP id 1E12521F86; Wed, 27 Mar 2024 18:44:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1711565069; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=friXiktJXm6RpULVLxiTvJfuJ54050Ar43sSyvFlP2Q=; b=uvAfrAf5IsqqX7COh0Z/jvVHc2YAKEChg70B0saYEvsYHa4scchWivReVowfUBBRYxAwK8 ZdsA5uw4/XgyxLvy2cLvmp1AlkfTKaeUAH7WsS+3wIQG1KmpuxrFqWmjO9+sjNJy+Uysfv iVkuRMyf5egh+IFQXGgqrzlNw59AdQM= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1711565069; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=friXiktJXm6RpULVLxiTvJfuJ54050Ar43sSyvFlP2Q=; b=DhTjJW5Kt4VNogiTAZRO8QyzJNPXygoYiCU3KtpEy0X0Am8uDON+yMC4/m+EA0mtFpKJH1 AiZVoxvFyl2K1sCw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1711565069; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=friXiktJXm6RpULVLxiTvJfuJ54050Ar43sSyvFlP2Q=; b=uvAfrAf5IsqqX7COh0Z/jvVHc2YAKEChg70B0saYEvsYHa4scchWivReVowfUBBRYxAwK8 ZdsA5uw4/XgyxLvy2cLvmp1AlkfTKaeUAH7WsS+3wIQG1KmpuxrFqWmjO9+sjNJy+Uysfv iVkuRMyf5egh+IFQXGgqrzlNw59AdQM= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1711565069; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=friXiktJXm6RpULVLxiTvJfuJ54050Ar43sSyvFlP2Q=; b=DhTjJW5Kt4VNogiTAZRO8QyzJNPXygoYiCU3KtpEy0X0Am8uDON+yMC4/m+EA0mtFpKJH1 AiZVoxvFyl2K1sCw== Received: by knuth.suse.de (Postfix, from userid 10510) id 0FE58343575; Wed, 27 Mar 2024 19:44:28 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by knuth.suse.de (Postfix) with ESMTP id F33E8343574; Wed, 27 Mar 2024 19:44:28 +0100 (CET) Date: Wed, 27 Mar 2024 19:44:28 +0100 (CET) From: Michael Matz To: Jakub Jelinek cc: Richard Biener , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] middle-end/114480 - IDF compute is slow In-Reply-To: Message-ID: References: <96606.124032711422300395@us-mta-154.us.mimecast.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Score: -1.13 X-Spamd-Result: default: False [-1.13 / 50.00]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; NEURAL_HAM_LONG(-1.00)[-1.000]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.09)[-0.437]; FUZZY_BLOCKED(0.00)[rspamd.com]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]; MID_RHS_MATCH_FROM(0.00)[]; BAYES_HAM(-0.04)[58.82%] X-Spam-Level: Authentication-Results: smtp-out1.suse.de; none X-Spam-Status: No, score=-3.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hey, On Wed, 27 Mar 2024, Jakub Jelinek wrote: > > @@ -1712,12 +1711,9 @@ compute_idf (bitmap def_blocks, bitmap_head *dfs) > > gcc_checking_assert (bb_index > > < (unsigned) last_basic_block_for_fn (cfun)); > > > > - EXECUTE_IF_AND_COMPL_IN_BITMAP (&dfs[bb_index], phi_insertion_points, > > - 0, i, bi) > > - { > > + EXECUTE_IF_SET_IN_BITMAP (&dfs[bb_index], 0, i, bi) > > + if (bitmap_set_bit (phi_insertion_points, i)) > > bitmap_set_bit (work_set, i); > > - bitmap_set_bit (phi_insertion_points, i); > > - } > > } > > I don't understand why the above is better. > Wouldn't it be best to do > bitmap_ior_and_compl_into (work_set, &dfs[bb_index], > phi_insertion_points); > bitmap_ior_into (phi_insertion_points, &dfs[bb_index]); > ? I had the same hunch, but: 1) One would have to make work_set be non-tree-view again (which with the current structure is a wash anyway, and that makes sense as accesses to work_set aren't heavily random here). 2) But doing that and using bitmap_ior.._into is still measurably slower: on a reduced testcase with -O0 -fno-checking, proposed structure (tree-view or not-tree-view workset doesn't matter): tree SSA rewrite : 14.93 ( 12%) 0.01 ( 2%) 14.95 ( 12%) 27M ( 8%) with non-tree-view, and your suggestion: tree SSA rewrite : 20.68 ( 12%) 0.02 ( 4%) 20.75 ( 12%) 27M ( 8%) I can only speculate that the usually extreme sparsity of the bitmaps in question make the setup costs of the two bitmap_ior calls actually more expensive than the often skipped second call to bitmap_set_bit in Richis proposed structure. (That or cache effects) Ciao, Michael.