public inbox for gcc-bugs@sourceware.org
From: "tnfchris at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.
Date: Wed, 05 Jun 2024 09:42:44 +0000
Message-ID: <bug-114932-4-9hZSVXq2oK@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-114932-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932

--- Comment #9 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
It's taken me a bit of time to track down all the reasons for the speedup with
the earlier patch.

It comes from two parts:

1. Signed IVs don't get simplified.  Because signed overflow is potentially UB,
gimple expressions don't get simplified when the type is signed.

However, for addressing modes this doesn't matter, since any potential overflow
can still happen after the constants are simplified.  Secondly, most
architectures say you can never reach the full address space range anyway.
Those that do (like those that offer baremetal variants, such as Arm and
AArch64) explicitly specify that overflow is defined as wrapping around.  That
means that IVs, for their use in IV opts, should be safe to simplify as if they
were unsigned.

I have a patch that, during the creation of IV candidates, folds them as
unsigned and then converts them back to their original signed types.  This
maintains all the original overflow analysis and the correct typing in gimple.

2. The second problem is that, because Fortran has no unsigned types, the
front-end generates a signed IV.  Some optimizations can convert these to
unsigned as a side effect of folding, e.g. extract_muldiv is one place where
this is done.

This can leave us with the same IV as both signed and unsigned, as is the case
here:

<Invariant Expressions>:
inv_expr 1:     stride.3_27 * 4
inv_expr 2:     (unsigned long) stride.3_27 * 4

These end up being used in the same group:

Group 1:
  cand  cost    compl.  inv.expr.       inv.vars
  1     0       0       NIL;    6
  2     0       0       NIL;    6
  3     0       0       NIL;    6
  4     0       0       NIL;    6

which ends up with IV opts picking both the signed and the unsigned IV:

Improved to:
  cost: 24 (complexity 3)
  reg_cost: 9
  cand_cost: 15
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6, 8
   group:0 --> iv_cand:6, cost=(0,1)
   group:1 --> iv_cand:1, cost=(0,0)
   group:2 --> iv_cand:8, cost=(0,1)
   group:3 --> iv_cand:8, cost=(0,1)
  invariant variables: 6
  invariant expressions: 1, 2

and so generates the same IV as both signed and unsigned:

;;   basic block 21, loop depth 3, count 214748368 (estimated locally, freq 58.2545), maybe hot
;;    prev block 28, next block 31, flags: (NEW, REACHABLE, VISITED)
;;    pred:       28 [always]  count:23622320 (estimated locally, freq 6.4080) (FALLTHRU,EXECUTABLE)
;;                25 [always]  count:191126046 (estimated locally, freq 51.8465) (FALLTHRU,DFS_BACK,EXECUTABLE)
  # .MEM_66 = PHI <.MEM_34(28), .MEM_22(25)>
  # ivtmp.22_41 = PHI <0(28), ivtmp.22_82(25)>
  # ivtmp.26_51 = PHI <ivtmp.26_55(28), ivtmp.26_72(25)>
  # ivtmp.28_90 = PHI <ivtmp.28_99(28), ivtmp.28_98(25)>

...

;;   basic block 24, loop depth 3, count 214748366 (estimated locally, freq 58.2545), maybe hot
;;    prev block 22, next block 25, flags: (NEW, REACHABLE, VISITED)
;;    pred:       22 [always]  count:95443719 (estimated locally, freq 25.8909) (FALLTHRU)
;;                21 [33.3% (guessed)]  count:71582790 (estimated locally, freq 19.4182) (TRUE_VALUE,EXECUTABLE)
;;                31 [33.3% (guessed)]  count:47721860 (estimated locally, freq 12.9455) (TRUE_VALUE,EXECUTABLE)
  # .MEM_22 = PHI <.MEM_44(22), .MEM_31(21), .MEM_79(31)>
  ivtmp.22_82 = ivtmp.22_41 + 1;
  ivtmp.26_72 = ivtmp.26_51 + _80;
  ivtmp.28_98 = ivtmp.28_90 + _39;

These two IVs are always used as unsigned, so IV opts generates:

  _73 = stride.3_27 * 4;
  _80 = (unsigned long) _73;
  _54 = (unsigned long) stride.3_27;
  _39 = _54 * 4;

which means that in e.g. exchange2 we generate a lot of duplicate code.

I'm unsure yet how to fix this.  I think I need to know how the IV values are
used.  Given that the signed IV is only ever used as unsigned, the two should be
the same.
Thread overview: 16+ messages
2024-05-03  5:51 [Bug tree-optimization/114932] New: Improvement in CHREC can give large performance gains tnfchris at gcc dot gnu.org
2024-05-03  6:26 ` [Bug tree-optimization/114932] " rguenth at gcc dot gnu.org
2024-05-03  7:03 ` pinskia at gcc dot gnu.org
2024-05-03  8:09 ` tnfchris at gcc dot gnu.org
2024-05-03  8:41 ` tnfchris at gcc dot gnu.org
2024-05-03  8:44 ` tnfchris at gcc dot gnu.org
2024-05-03  8:45 ` tnfchris at gcc dot gnu.org
2024-05-03  9:12 ` rguenth at gcc dot gnu.org
2024-05-13  8:28 ` tnfchris at gcc dot gnu.org
2024-06-05  9:42 ` tnfchris at gcc dot gnu.org [this message]
2024-06-05 10:23 ` [Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing rguenth at gcc dot gnu.org
2024-06-05 19:02 ` tnfchris at gcc dot gnu.org
2024-06-06  6:17 ` rguenther at suse dot de
2024-06-06  6:40 ` tnfchris at gcc dot gnu.org
2024-06-06  7:55 ` rguenther at suse dot de
2024-06-06  8:01 ` tnfchris at gcc dot gnu.org