public inbox for gcc-bugs@sourceware.org
From: "hubicka at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/110713] New: Fatigue2 runs twice as fast with increased inlining limits
Date: Tue, 18 Jul 2023 11:03:02 +0000
Message-ID: <bug-110713-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110713

            Bug ID: 110713
           Summary: Fatigue2 runs twice as fast with increased inlining
                    limits
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

jh@ryzen3:~/pb11/lin/source> ~/trunk-histogram/bin/gfortran fatigue2.f90 -Ofast -march=native -fdump-tree-all-details-blocks -fdump-rtl-all-details -fdump-ipa-all-details --param max-inline-insns-auto=110 ; perf stat ./a.out >/dev/null

 Performance counter stats for './a.out':

          13937.07 msec task-clock:u                     #    1.000 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
               138      page-faults:u                    #    9.902 /sec
       67489472294      cycles:u                         #    4.842 GHz                         (83.33%)
          38791427      stalled-cycles-frontend:u        #    0.06% frontend cycles idle        (83.33%)
           2351353      stalled-cycles-backend:u         #    0.00% backend cycles idle         (83.33%)
      147268347462      instructions:u                   #    2.18  insn per cycle
                                                         #    0.00  stalled cycles per insn     (83.33%)
        5705431257      branches:u                       #  409.371 M/sec                       (83.35%)
          13638274      branch-misses:u                  #    0.24% of all branches             (83.35%)

      13.941876147 seconds time elapsed

      13.933226000 seconds user
       0.003999000 seconds sys


jh@ryzen3:~/pb11/lin/source> ~/trunk-histogram/bin/gfortran fatigue2.f90 -Ofast -march=native -fdump-tree-all-details-blocks -fdump-rtl-all-details -fdump-ipa-all-details ; perf stat ./a.out >/dev/null

 Performance counter stats for './a.out':

          31300.68 msec task-clock:u                     #    1.000 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
               138      page-faults:u                    #    4.409 /sec
      150619261261      cycles:u                         #    4.812 GHz                         (83.32%)
         779861463      stalled-cycles-frontend:u        #    0.52% frontend cycles idle        (83.33%)
           4695025      stalled-cycles-backend:u         #    0.00% backend cycles idle         (83.34%)
      242822794319      instructions:u                   #    1.61  insn per cycle
                                                         #    0.00  stalled cycles per insn     (83.34%)
       13542051898      branches:u                       #  432.644 M/sec                       (83.34%)
          14587945      branch-misses:u                  #    0.11% of all branches             (83.34%)

      31.301169341 seconds time elapsed

      31.296826000 seconds user
       0.003999000 seconds sys
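
For a quick comparison of the two runs, the counters can be divided directly. The sketch below only redoes that arithmetic on the numbers printed above; the dictionary names and the helpers ratios() and ipc() are made up for illustration. By this arithmetic the default build takes roughly 2.2x the time and cycles, about 1.65x the instructions, and about 2.4x the branches of the build with the raised limit.

```python
# Counter values copied from the two perf stat runs above.
inlined = {  # built with --param max-inline-insns-auto=110
    "seconds": 13.941876147,
    "cycles": 67489472294,
    "instructions": 147268347462,
    "branches": 5705431257,
}
default = {  # built with the default inlining limits
    "seconds": 31.301169341,
    "cycles": 150619261261,
    "instructions": 242822794319,
    "branches": 13542051898,
}


def ratios(slow, fast):
    """Return slow/fast for every counter in the two runs."""
    return {name: slow[name] / fast[name] for name in fast}


def ipc(run):
    """Instructions per cycle, as perf stat reports it."""
    return run["instructions"] / run["cycles"]


if __name__ == "__main__":
    for name, r in ratios(default, inlined).items():
        print(f"{name:>12}: default/inlined = {r:.2f}x")
    print(f"IPC: inlined {ipc(inlined):.2f}, default {ipc(default):.2f}")
```

The larger drop in branches than in instructions is consistent with loop overhead disappearing, though the counters alone do not prove where the savings come from.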

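The main difference is the inlining of generalized_hookes_law: the run with
--param max-inline-insns-auto=110 takes 13.9 s, while the run with default
limits takes 31.3 s. While the function looks quite big at release_ssa time,
after vectorization it becomes loopless, and inlining it is a big win.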

      function generalized_hookes_law (strain_tensor, lambda, mu) result (stress_tensor)
!
!      Author:       Dr. John K. Prentice
!      Affiliation:  Quetzal Computational Associates, Inc.
!      Dates:        28 November 1997
!
!      Purpose:      Apply the generalized Hooke's law for elasticity to the strain tensor
!                    (or strain rate tensor) to compute the stress tensor (or stress rate
!                    tensor)
!
!############################################################################################
!
!      Input:
!
!        strain_tensor                [selected_real_kind(15,90), dimension(3,3)]
!                                     strain tensor
!
!        lambda                       [selected_real_kind(15,90)]
!                                     Lame constant Lambda
!
!        mu                           [selected_real_kind(15,90)]
!                                     Lame constant mu
!
!     Output:
!
!        stress_tensor                [selected_real_kind(15,90), dimension(3,3)]
!                                     stress tensor
!
!############################################################################################
!
!
!=========== formal variables =============
!
      real (kind = LONGreal), dimension(:,:), intent(in) :: strain_tensor
      real (kind = LONGreal), intent(in) :: lambda, mu
      real (kind = LONGreal), dimension(3,3) :: stress_tensor
!
!========== internal variables ============
!
      real (kind = LONGreal), dimension(6) :: generalized_strain_vector,  &
                                              generalized_stress_vector
      real (kind = LONGreal), dimension(6,6) :: generalized_constitutive_tensor
      integer :: i
!
!        construct the generalized constitutive tensor for elasticity
!
      generalized_constitutive_tensor(:,:) = 0.0_LONGreal
      generalized_constitutive_tensor(1,1) = lambda + 2.0_LONGreal * mu
      generalized_constitutive_tensor(1,2) = lambda
      generalized_constitutive_tensor(1,3) = lambda
      generalized_constitutive_tensor(2,1) = lambda
      generalized_constitutive_tensor(2,2) = lambda + 2.0_LONGreal * mu
      generalized_constitutive_tensor(2,3) = lambda
      generalized_constitutive_tensor(3,1) = lambda
      generalized_constitutive_tensor(3,2) = lambda
      generalized_constitutive_tensor(3,3) = lambda + 2.0_LONGreal * mu
      generalized_constitutive_tensor(4,4) = mu
      generalized_constitutive_tensor(5,5) = mu
      generalized_constitutive_tensor(6,6) = mu
!
!        construct the generalized strain vector (using double index notation)
!
      generalized_strain_vector(1) = strain_tensor(1,1)
      generalized_strain_vector(2) = strain_tensor(2,2)
      generalized_strain_vector(3) = strain_tensor(3,3)
      generalized_strain_vector(4) = strain_tensor(2,3)
      generalized_strain_vector(5) = strain_tensor(1,3)
      generalized_strain_vector(6) = strain_tensor(1,2)
!
!        compute the generalized stress vector
!
      do i = 1, 6
          generalized_stress_vector(i) = dot_product(generalized_constitutive_tensor(i,:),  &
                                                     generalized_strain_vector(:))
      end do
!
!        update the stress tensor 
!
      stress_tensor(1,1) = generalized_stress_vector(1)
      stress_tensor(2,2) = generalized_stress_vector(2)
      stress_tensor(3,3) = generalized_stress_vector(3)
      stress_tensor(2,3) = generalized_stress_vector(4)
      stress_tensor(1,3) = generalized_stress_vector(5)
      stress_tensor(1,2) = generalized_stress_vector(6)
      stress_tensor(3,2) = stress_tensor(2,3)
      stress_tensor(3,1) = stress_tensor(1,3)
      stress_tensor(2,1) = stress_tensor(1,2)
!
      end function generalized_hookes_law
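
To illustrate why this function can end up loopless: the 6x6 constitutive tensor has only twelve nonzero entries, so the six dot products reduce to one trace term plus a few multiplies. The Python sketch below is not taken from the benchmark or from GCC's dumps; the function names are hypothetical and it only checks that the two formulations agree. It makes no claim about the exact code GCC generates.

```python
def assemble(s):
    """Scatter the 6-vector back into a symmetric 3x3 tensor."""
    return [[s[0], s[5], s[4]],
            [s[5], s[1], s[3]],
            [s[4], s[3], s[2]]]


def hookes_law_matrix(strain, lam, mu):
    """Mirror the Fortran: build the 6x6 tensor, then take six dot products."""
    c = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            c[i][j] = lam
        c[i][i] = lam + 2.0 * mu
    for i in range(3, 6):
        c[i][i] = mu
    # Same component order as the Fortran (0-based here): 11, 22, 33, 23, 13, 12.
    e = [strain[0][0], strain[1][1], strain[2][2],
         strain[1][2], strain[0][2], strain[0][1]]
    s = [sum(c[i][k] * e[k] for k in range(6)) for i in range(6)]
    return assemble(s)


def hookes_law_closed_form(strain, lam, mu):
    """Same result without the 6x6 tensor or any loop."""
    trace = strain[0][0] + strain[1][1] + strain[2][2]
    s = [lam * trace + 2.0 * mu * strain[0][0],
         lam * trace + 2.0 * mu * strain[1][1],
         lam * trace + 2.0 * mu * strain[2][2],
         mu * strain[1][2],
         mu * strain[0][2],
         mu * strain[0][1]]
    return assemble(s)


if __name__ == "__main__":
    strain = [[1.0, 0.5, 0.25], [0.5, 2.0, 0.125], [0.25, 0.125, 3.0]]
    print(hookes_law_matrix(strain, 2.0, 0.5))
    print(hookes_law_closed_form(strain, 2.0, 0.5))
```

Once the callee is inlined, the caller sees only this handful of scalar operations instead of a call that builds and multiplies a mostly-zero matrix.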
