public inbox for gcc-bugs@sourceware.org
* [Bug c++/14703] New: Inadequate optimization of inline templated functions
From: eric-gcc at omnifarious dot org @ 2004-03-24  2:25 UTC (permalink / raw)
  To: gcc-bugs

This code:

=====================
#include <iostream>

namespace {
template <unsigned long long L> class fib {
 public:
   static const unsigned long long value = fib<L - 1>::value + fib<L - 2>::value;
};

template <> class fib<0> {
 public:
   static const unsigned long long value = 1;
};

template <> class fib<1> {
 public:
   static const unsigned long long value = 1;
};

template<unsigned long long L> inline unsigned long long fibconst()
{
   return fibconst<L - 1>() + fibconst<L - 2>();
}

template <> inline unsigned long long fibconst<0>()
{
   return 1ull;
}

template <> inline unsigned long long fibconst<1>()
{
   return 1ull;
}

template <> inline unsigned long long fibconst<2>()
{
   return 2ull;
}

}

int main()
{
   ::std::cerr << "fib<90>::value == " << fib<90>::value << "\n";
   ::std::cerr << "fibconst<90>() == " << fibconst<90>() << "\n";
}
=====================

does not result in the call to fibconst<90>() being folded to a constant.  In
fact, at -O3 only calls to fibconst<11>() and lower are optimized away.  The
compiler should have more than enough information to realize that they can all
be optimized away.  After all, in order to expand the template at all, it has
to get down to fibconst<0>, fibconst<1>, or fibconst<2> and build things back
up again.

This is against gcc 3.4 pre-release.  I certainly don't expect anything to be
done to gcc 3.4 about it, but I think it should still be taken care of someday.

-- 
           Summary: Inadequate optimization of inline templated functions
           Product: gcc
           Version: 3.4.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P2
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: eric-gcc at omnifarious dot org
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: pentium4-redhat-linux
  GCC host triplet: pentium4-redhat-linux
GCC target triplet: pentium4-redhat-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14703



* [Bug c++/14703] Inadequate optimization of inline templated functions
From: eric-gcc at omnifarious dot org @ 2004-03-24  2:27 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From eric-gcc at omnifarious dot org  2004-03-24 02:27 -------
Created an attachment (id=5985)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=5985&action=view)
Source code for a test case



* [Bug c++/14703] Inadequate optimization of inline templated functions
From: eric-gcc at omnifarious dot org @ 2004-03-24  2:30 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From eric-gcc at omnifarious dot org  2004-03-24 02:30 -------
Created an attachment (id=5986)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=5986&action=view)
Assembly code resulting from attachment 5985

g++-4 -v -pipe -march=pentium4 -O3 -Wall -S torturefib.cxx
Reading specs from
/workplace/hopdist-garnome-0.29.1/usr/bin/../lib/gcc/pentium4-redhat-linux/3.4.0/specs

Configured with: ../gcc/configure --program-suffix=-4 --prefix=/hopdist/usr
--enable-languages=c,c++ --enable-shared pentium4-redhat-linux
Thread model: posix
gcc version 3.4.0 20040323 (prerelease)

/workplace/hopdist-garnome-0.29.1/usr/bin/../libexec/gcc/pentium4-redhat-linux/3.4.0/cc1plus
-quiet -v -iprefix
/workplace/hopdist-garnome-0.29.1/usr/bin/../lib/gcc/pentium4-redhat-linux/3.4.0/
-D_GNU_SOURCE torturefib.cxx -quiet -dumpbase torturefib.cxx -march=pentium4
-auxbase torturefib -O3 -Wall -version -o torturefib.s
ignoring nonexistent directory "/workplace/hopdist-garnome-0.29.1/usr/bin/../lib/gcc/pentium4-redhat-linux/3.4.0/../../../../pentium4-redhat-linux/include"
ignoring duplicate directory "/hopdist/usr/lib/gcc/pentium4-redhat-linux/3.4.0/../../../../include/c++/3.4.0"
ignoring duplicate directory "/hopdist/usr/lib/gcc/pentium4-redhat-linux/3.4.0/../../../../include/c++/3.4.0/pentium4-redhat-linux"
ignoring duplicate directory "/hopdist/usr/lib/gcc/pentium4-redhat-linux/3.4.0/../../../../include/c++/3.4.0/backward"
ignoring duplicate directory "/hopdist/usr/lib/gcc/pentium4-redhat-linux/3.4.0/include"
ignoring nonexistent directory "/hopdist/usr/lib/gcc/pentium4-redhat-linux/3.4.0/../../../../pentium4-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /workplace/hopdist-garnome-0.29.1/usr/bin/../lib/gcc/pentium4-redhat-linux/3.4.0/../../../../include/c++/3.4.0
 /workplace/hopdist-garnome-0.29.1/usr/bin/../lib/gcc/pentium4-redhat-linux/3.4.0/../../../../include/c++/3.4.0/pentium4-redhat-linux
 /workplace/hopdist-garnome-0.29.1/usr/bin/../lib/gcc/pentium4-redhat-linux/3.4.0/../../../../include/c++/3.4.0/backward
 /workplace/hopdist-garnome-0.29.1/usr/bin/../lib/gcc/pentium4-redhat-linux/3.4.0/include
 /usr/local/include
 /hopdist/usr/include
 /usr/include
End of search list.
GNU C++ version 3.4.0 20040323 (prerelease) (pentium4-redhat-linux)
	compiled by GNU C version 3.4.0 20040323 (prerelease).
GGC heuristics: --param ggc-min-expand=98 --param ggc-min-heapsize=128170
 
Compilation finished at Tue Mar 23 18:28:03



* [Bug c++/14703] Inadequate optimization of inline templated functions
From: pinskia at gcc dot gnu dot org @ 2004-03-24  3:44 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #5985|application/octet-stream    |text/plain
          mime type|                            |



* [Bug c++/14703] Inadequate optimization of inline templated functions
From: pinskia at gcc dot gnu dot org @ 2004-03-24  4:25 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-24 04:25 -------
Confirmed.  It is much worse on the tree-ssa branch.  Most likely what needs to
happen is that inlining runs after some optimizations, followed by more inlining
and more optimizations, repeating until the function size stops growing.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
           Keywords|                            |pessimizes-code
   Last reconfirmed|0000-00-00 00:00:00         |2004-03-24 04:25:08
               date|                            |



* [Bug tree-optimization/14703] Inadequate optimization of inline templated functions
From: rguenth at gcc dot gnu dot org @ 2005-02-25 23:49 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rguenth at gcc dot gnu dot org  2005-02-25 16:53 -------
http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01571.html

improves this to the extent that the inliner now estimates the size of
fibconst<n> as

   n    size
 0,1,2  0
   3    1
   4    2
   5    4
   6    7

etc., i.e. as the number of additions required.

Inlining all of fibconst<90> now only requires the appropriate limits or,
of course, folding during inlining.
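
The size column is consistent with the recurrence
size(n) = size(n-1) + size(n-2) + 1, with size 0 for the three explicit
specializations.  A minimal sketch that reproduces the table; the name
`additions` is made up for illustration (it is not a GCC internal), and
C++14 `constexpr` is used only so the values can be checked at compile time:

```cpp
// Number of additions left in fibconst<n> if every call is inlined
// and nothing is folded.  Hypothetical helper, not part of GCC.
constexpr unsigned long long additions(unsigned n)
{
    if (n <= 2) return 0;                   // explicit specializations: constants
    unsigned long long prev = 0, cur = 0;   // additions(1), additions(2)
    for (unsigned i = 3; i <= n; ++i) {
        unsigned long long next = cur + prev + 1;  // two callees plus one '+'
        prev = cur;
        cur = next;
    }
    return cur;
}

// Rows of the table above.
static_assert(additions(3) == 1, "n = 3");
static_assert(additions(4) == 2, "n = 4");
static_assert(additions(5) == 4, "n = 5");
static_assert(additions(6) == 7, "n = 6");
```

By the same recurrence, fibconst<90> fully inlined without folding would
contain on the order of 10^18 additions, which is why limits or folding during
inlining are needed.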

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at tat dot physik
                   |                            |dot uni-tuebingen dot de



* [Bug tree-optimization/14703] Inadequate optimization of inline templated functions
From: pinskia at gcc dot gnu dot org @ 2005-05-27  0:18 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-05-27 00:15 -------
What we need for this testcase is the following:
inline, optimize, inline, ...
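
As a rough illustration of that loop, here is a toy model of the fibconst call
graph.  It is only a sketch under simplifying assumptions: it is not how GCC's
passes are organized, every name in it is hypothetical, and C++14 `constexpr`
is used only so the result can be checked at compile time.  A body is
"inlined" once both callees are already constants, and "optimized" by folding
the one remaining addition:

```cpp
// Toy model of "inline, optimize, inline, ..." for fibconst<0..90>.
// NOT GCC's pass manager; all names are hypothetical.
constexpr unsigned kMax = 91;

struct Result {
    unsigned long long value;  // constant the body was folded to
    unsigned rounds;           // inline+fold rounds that made progress
};

constexpr Result inline_fold_until_stable(unsigned n)  // requires n < kMax
{
    bool folded[kMax] = {};
    unsigned long long value[kMax] = {};

    // The explicit specializations are constants from the start.
    folded[0] = folded[1] = folded[2] = true;
    value[0] = 1; value[1] = 1; value[2] = 2;

    unsigned rounds = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        // Visit callers before callees, the least favourable order.
        for (unsigned i = n; i >= 3; --i) {
            // Inline only when both callees are constants, then fold.
            if (!folded[i] && folded[i - 1] && folded[i - 2]) {
                value[i] = value[i - 1] + value[i - 2];
                folded[i] = true;
                changed = true;
            }
        }
        if (changed) ++rounds;
    }
    return Result{value[n], rounds};
}

static_assert(inline_fold_until_stable(6).value == 13, "fibconst<6>");
static_assert(inline_fold_until_stable(6).rounds == 4, "one level per round");
```

In this model no body ever holds more than one addition; the cost is one round
per template level when callers are visited before callees.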

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2004-12-25 01:27:52         |2005-05-27 00:15:50
               date|                            |



* [Bug tree-optimization/14703] Inadequate optimization of inline templated functions
From: rguenth at gcc dot gnu dot org @ 2006-03-03 10:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from rguenth at gcc dot gnu dot org  2006-03-03 10:29 -------
Note that using functions as in fibconst is not really an efficient way to do
this.  Still, with gcc 4.1.0 you now have the ability to use

template<unsigned long long L> inline __attribute__((flatten))
unsigned long long fibconst()
{
   return fibconst<L - 1>() + fibconst<L - 2>();
}

which will result in one call in main and a fibconst that returns the constant
requested.  Of course, for big numbers the time and memory required to build
the program blow up, so the limits in place actually prevent you from running
into this issue.  For example:

rguenther@g148:/tmp> ~/bin/maxmem2.sh
/space/rguenther/install/gcc-4.1.0/bin/gcc -O2 -S t.C -DVAL=23
total: 149886 kB
rguenther@g148:/tmp> ~/bin/maxmem2.sh
/space/rguenther/install/gcc-4.1.0/bin/gcc -O2 -S t.C -DVAL=24
total: 238102 kB
rguenther@g148:/tmp> ~/bin/maxmem2.sh
/space/rguenther/install/gcc-4.1.0/bin/gcc -O2 -S t.C -DVAL=25
total: 385738 kB
rguenther@g148:/tmp> ~/bin/maxmem2.sh
/space/rguenther/install/gcc-4.1.0/bin/gcc -O2 -S t.C -DVAL=26
total: 623094 kB

So memory usage grows exponentially because of the way we're doing the
inlining.
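
The growth pattern can be checked directly from the four figures above.  Each
step multiplies memory use by roughly 1.6, close to the golden ratio (about
1.618) by which Fibonacci numbers grow, and each figure is within 1% of the
sum of the previous two.  A small sketch; `kb`, `growth` and `near_sum` are
illustrative names, and C++14 `constexpr` is used only so the checks run at
compile time:

```cpp
// Peak memory in kB as reported above for -DVAL=23, 24, 25, 26.
constexpr unsigned long kb[] = {149886, 238102, 385738, 623094};

// Growth factor between measurement i and measurement i + 1 (i = 0..2).
constexpr double growth(unsigned i)
{
    return static_cast<double>(kb[i + 1]) / static_cast<double>(kb[i]);
}

// True if kb[i + 2] is within 1% of kb[i] + kb[i + 1] (i = 0..1).
constexpr bool near_sum(unsigned i)
{
    unsigned long sum = kb[i] + kb[i + 1];
    unsigned long diff = sum > kb[i + 2] ? sum - kb[i + 2] : kb[i + 2] - sum;
    return diff * 100 < kb[i + 2];
}

static_assert(growth(0) > 1.5 && growth(0) < 1.7, "VAL 23 -> 24");
static_assert(growth(1) > 1.5 && growth(1) < 1.7, "VAL 24 -> 25");
static_assert(growth(2) > 1.5 && growth(2) < 1.7, "VAL 25 -> 26");
static_assert(near_sum(0) && near_sum(1), "Fibonacci-like growth");
```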



* [Bug tree-optimization/14703] Inadequate optimization of inline templated functions
From: eric-gcc at omnifarious dot org @ 2006-03-02 20:25 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from eric-gcc at omnifarious dot org  2006-03-02 20:25 -------
I'm pleased that I came up with such a difficult test case for the optimizer. 
I never thought it'd be that hard.  :-)

I don't know anything about the internals, but...

The compiler has to generate everything down to the fibconst<0> and fibconst<1>
specializations anyway.  So why can't it memoize and filter the optimization
up?  Say it generates fibconst<1> and fibconst<2> in order to generate
fibconst<3>, then it discovers that fibconst<3> can be optimized to return
plain old '3'.  It can save that, and then when it comes down again needing
fibconst<2> and fibconst<3> in order to generate fibconst<4>, it can see the
already optimized version of fibconst<3> and generate an optimized version of
fibconst<4> that just returns plain old '5'.

Maybe I have things totally wrong and there's no way to do anything like that
with the code.  Or maybe it would turn out that that way of doing things is so
special case that it's not worth bothering with.

But, I just wonder if memoizing some sort of optimized version of a function
would help with a lot of things.
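
One way to picture this kind of memoization is the class-template form from
the original report: each fib<L> is instantiated at most once, and its value
is reused by every later reference.  A minimal sketch, assuming a wrapper
named `fibmemo` (a hypothetical name) and using C++11 `constexpr`, which the
GCC versions discussed in this thread did not have, only so the result can be
checked at compile time:

```cpp
// Each specialization is instantiated at most once, so its value is
// computed once and reused: template instantiation acts as the memo table.
template <unsigned long long L> struct fib {
    static const unsigned long long value = fib<L - 1>::value + fib<L - 2>::value;
};
template <> struct fib<0> { static const unsigned long long value = 1; };
template <> struct fib<1> { static const unsigned long long value = 1; };

// Hypothetical wrapper: function-call syntax, but nothing left to inline.
template <unsigned long long L>
constexpr unsigned long long fibmemo()
{
    return fib<L>::value;
}

static_assert(fibmemo<3>() == 3, "the '3' mentioned above");
static_assert(fibmemo<4>() == 5, "the '5' mentioned above");
```

This sidesteps the inliner rather than teaching it to memoize, so it shows the
idea and is not a fix for the optimizer.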

