From: "manolis.tsamis at vrull dot eu"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114010] Unwanted effects of using SSA free lists.
Date: Wed, 21 Feb 2024 12:03:27 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114010

--- Comment #4 from Manolis Tsamis ---
Hi Andrew,

Thanks for your insights on this. Let me reply to some of your points:

(In reply to Andrew Pinski from comment #1)
> >The most important case I have observed is that the vectorizer can fail or
> >create inferior code with more shuffles/moves when the SSA names aren't
> >monotonically increasing.
>
> That should not be true.

Indeed, after further cleaning up the dumps, some differences that I was
considering turned out to be just the diff algorithm not doing a good job, and
that confused me (sigh). So, for this example, while we're in tree form I
observe only naming changes, but no different code or order of statements.

(In reply to Andrew Pinski from comment #2)
> Note what I had found in the past it is not exactly SSA_NAMEs that cause the
> difference but rather the RTL register pesdu # causes differences in
> register allocation which was exposed from the different in operands
> canonicalization.

Yes, I have also observed this and it looks to be the main issue.

(In reply to Andrew Pinski from comment #3)
> The first example (of assembly here) in comment #0 is extra moves due to the
> RA not handling subreg that decent for the load/store lane. There are other
> bug reports dealing with that. Why the SSA_NAMES being monotonically help is
> just by an accident really.

Do you happen to recall the relevant ticket(s)? I would like to have a look but
couldn't find them so far.

Also, while I agree that in some cases changes like this 'just happen' to
improve codegen, in multiple experiments the vectorized code was superior with
sorted names, and it was never worse with sorted names. In most cases that I
recall, the version that used unsorted names had additional shuffles of
different sorts or moves. So, while anecdotal, the effect doesn't look
accidental to me in this case. I feel like there may be some subtle difference
due to the names that helps in this case?

> Also:
> > This mostly affects all the bitmaps that use SSA_NAME_VERSION as a key.
>
> Most use sparse bitmaps there so it is not a big deal.

Agreed, and that's probably why I couldn't measure any non-trivial difference
in compilation times.
I should just note that there are also places that create vectors or other data
structures sized to the number of ssa_names, so in theory this could still help
in extreme cases.

> I think this should be split up in a few different bug reports really.
> One for each case where better optimizations happen.

Ok, the only cases that I found to be clearly better are the ones related to
vectorization. Would it help to create a ticket just for that now, or should I
wait for the discussion in this one to conclude first?

> Also:
> >I have seen two similar source files generating the exact same GIMPLE code
> >up to some optimization pass but then completely diverging due to different
> >freelists.
>
> The only case where I have seen this happen is expand will have different
> pesdu # really. Yes I noticed this effect while I did
> r14-569-g21e2ef2dc25de3 really.

Afaik, the codegen differences that I observed were due to the same reason, but
it nonetheless felt weird that the same GIMPLE could later produce two files
that differ w.r.t. name ordering just because the freelists were different (but
invisible in the dumps). So I naturally questioned 'why don't we just flush the
freelists after every pass if it's not a performance issue'?
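
To make the freelist effect concrete, here is a toy model in Python. It is only
a sketch and not GCC's actual allocator: `NameAllocator`, `release_name`,
`flush_free_list` and `simulate` are made-up names, and the LIFO reuse order is
an assumption chosen for illustration. Both runs end up with the same live
names (so the dumps would look identical), but their hidden freelists differ,
so the next pass hands out different versions. Flushing makes the two runs
agree.

```python
class NameAllocator:
    """Toy SSA-name allocator with a free list (hypothetical, not GCC code)."""

    def __init__(self):
        self.next_version = 1
        self.free_list = []  # released versions; assumed to be reused LIFO

    def make_name(self):
        # Prefer a released version; otherwise hand out a fresh one.
        if self.free_list:
            return self.free_list.pop()
        version = self.next_version
        self.next_version += 1
        return version

    def release_name(self, version):
        self.free_list.append(version)

    def flush_free_list(self):
        # Forget released versions so that they are never reused.
        self.free_list.clear()


def simulate(release_order, flush):
    """Create versions 1..5, release some, then let a later 'pass' make 2 names."""
    alloc = NameAllocator()
    for _ in range(5):
        alloc.make_name()
    for version in release_order:
        alloc.release_name(version)
    if flush:
        alloc.flush_free_list()
    return [alloc.make_name() for _ in range(2)]


# Same live names {1, 3, 5} in both runs, but different release order.
print(simulate([2, 4], flush=False))  # [4, 2]
print(simulate([4, 2], flush=False))  # [2, 4]

# With a flush, the release order no longer matters.
print(simulate([2, 4], flush=True))   # [6, 7]
print(simulate([4, 2], flush=True))   # [6, 7]
```

The trade-off in this model is that flushing never reuses a version, so the
highest version number grows faster, which is what would matter for any
structure sized to the number of names.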