public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/46554] New: Less inlining leads to CSiBE regression
@ 2010-11-19  8:27 hubicka at gcc dot gnu.org
  2010-11-19 10:55 ` [Bug middle-end/46554] " rguenth at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: hubicka at gcc dot gnu.org @ 2010-11-19  8:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46554

           Summary: Less inlining leads to CSiBE regression
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: hubicka@gcc.gnu.org


Created attachment 22451
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22451
testcase flex-2.5.31/regex.c

The loss here comes from not inlining regmatch_len. The catch is that the
condition if (m == ((void *)0) || m->rm_so < 0) is already tested before every
use of regmatch_len, so after inlining the copy inside the function is
optimized out.  The inlined body thus simplifies into the m->rm_so < 0 test and
arithmetic, which ends up being cheaper than the call.

int regmatch_len (regmatch_t * m)
{
 if (m == ((void *)0) || m->rm_so < 0) {
  return 0;
 }

 return m->rm_eo - m->rm_so;
}

It is used as:

 if (m == ((void *)0) || m->rm_so < 0)
  return 0;

 if (regmatch_len (m) < 20)
  s = regmatch_cpy (m, buf, src);
 else
  s = regmatch_dup (m, src);
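
To make the redundancy concrete, here is a minimal sketch of the situation.
The caller names (pick_copy_strategy_call, pick_copy_strategy_inlined) and
their 0/1/2 return codes are hypothetical stand-ins for the regmatch_cpy /
regmatch_dup choice above, not code from flex. The second caller is what one
would expect after inlining, once the guard already tested by the caller has
been dropped from the inlined body.

```c
#include <regex.h>   /* POSIX regmatch_t with rm_so / rm_eo */
#include <stddef.h>

/* Out-of-line helper, as quoted in the report. */
int regmatch_len(regmatch_t *m)
{
    if (m == NULL || m->rm_so < 0)
        return 0;
    return m->rm_eo - m->rm_so;
}

/* Hypothetical caller: 0 = no match, 1 = short copy path, 2 = dup path.
   The guard here repeats the one inside regmatch_len. */
int pick_copy_strategy_call(regmatch_t *m)
{
    if (m == NULL || m->rm_so < 0)
        return 0;
    return regmatch_len(m) < 20 ? 1 : 2;
}

/* Same caller with regmatch_len inlined by hand. The callee's guard is
   dominated by the caller's guard, so only the subtraction remains. */
int pick_copy_strategy_inlined(regmatch_t *m)
{
    if (m == NULL || m->rm_so < 0)
        return 0;
    return (m->rm_eo - m->rm_so) < 20 ? 1 : 2;
}
```

Both callers return the same value for every input; the inlined form just
avoids the call and the duplicated checks.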

Tricky.  The inliner sees it as:

Analyzing function body size: regmatch_len
  freq:  1000 size:  2 time:  2 if (m_2(D) == 0B)
  freq:   898 size:  1 time:  1 D.7268_3 = m_2(D)->rm_so;
    50% will be eliminated by inlining
  freq:   898 size:  2 time:  2 if (D.7268_3 < 0)
  freq:   726 size:  1 time:  1 D.7270_4 = m_2(D)->rm_eo;
    50% will be eliminated by inlining
  freq:   726 size:  1 time:  1 D.7268_5 = m_2(D)->rm_so;
    50% will be eliminated by inlining
  freq:   726 size:  1 time:  1 D.7269_6 = D.7270_4 - D.7268_5;
  freq:  1000 size:  1 time:  2 return D.7269_1;
    will eliminated by inlining
Overall function body time: 9-3 size: 11-5
With function call overhead time: 9-15 size: 11-8

I can imagine that we could try to build the summary based on value ranges
instead of known constants, run early VRP, and work out the first test that
way.

Even optimizing the first conditional away won't make it inlined: it will
still be considered to have size 9, so the code will be expected to grow by
1 byte.  Optimizing the second conditional is even trickier.

The code could be optimized away by IP value range propagation, which would
be an interesting optimization to have...
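
As a rough illustration of what such range information would buy, the sketch
below states the caller-side facts by hand. The clone name
regmatch_len_known_valid is hypothetical, and __builtin_unreachable() (a
GCC/Clang builtin) is used here only to emulate what an interprocedural range
analysis might derive on its own; it is not how GCC implements anything
described in this report.

```c
#include <regex.h>   /* POSIX regmatch_t with rm_so / rm_eo */
#include <stddef.h>

/* Original helper, as quoted in the report. */
int regmatch_len(regmatch_t *m)
{
    if (m == NULL || m->rm_so < 0)
        return 0;
    return m->rm_eo - m->rm_so;
}

/* Hypothetical specialized clone. It is only valid when every caller
   guarantees m != NULL and m->rm_so >= 0; calling it otherwise is
   undefined behavior. */
int regmatch_len_known_valid(regmatch_t *m)
{
    /* Assumption stated by hand: the facts a range analysis would need
       to prove from the call sites. */
    if (m == NULL || m->rm_so < 0)
        __builtin_unreachable();

    /* With those facts known, this guard is dead code and the function
       body reduces to the subtraction. */
    if (m == NULL || m->rm_so < 0)
        return 0;
    return m->rm_eo - m->rm_so;
}
```

For inputs that satisfy the precondition, the clone and the original return
the same length.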



end of thread, other threads:[~2021-11-29  2:16 UTC | newest]

Thread overview: 5+ messages
2010-11-19  8:27 [Bug middle-end/46554] New: Less inlining leads to CSiBE regression hubicka at gcc dot gnu.org
2010-11-19 10:55 ` [Bug middle-end/46554] " rguenth at gcc dot gnu.org
2010-11-19 10:58   ` Jan Hubicka
2010-11-19 11:14 ` hubicka at ucw dot cz
2021-11-29  2:16 ` [Bug ipa/46554] " pinskia at gcc dot gnu.org
