From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 66CCD385843F; Tue,  1 Mar 2022 21:58:47 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 66CCD385843F
From: "vmakarov at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/104686] [12 Regression] Huge compile-time regression
 building SPEC 2017 538.imagick_r with -march=skylake
Date: Tue, 01 Mar 2022 21:58:47 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: compile-time-hog, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Who: vmakarov at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-104686-4-40P7qwWc3x@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-104686-4@http.gcc.gnu.org/bugzilla/>
References: <bug-104686-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 01 Mar 2022 21:58:47 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D104686
--- Comment #19 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #16)
> it doesn't make a difference for this testcase but profiling shows that
> allocnos_conflict_p is quite expensive so it's best to do it after the ot=
her
> continue checks like the following.  I also notice that the comment of
> allocnos_conflict_p says
>=20
> /* Return TRUE if allocnos A1 and A2 conflicts. Here we are
>    interesting only in conflicts of allocnos with intersected allocno
>    classes. */
>=20
> so doing it after the ira_reg_classes_intersect_p check makes even more
> sense(?)
>=20
> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
> index 8b6db1bb417..a5fd79484eb 100644
> --- a/gcc/ira-color.cc
> +++ b/gcc/ira-color.cc
> @@ -1572,15 +1572,14 @@ update_conflict_hard_regno_costs (int *costs, enum
> reg_class aclass,
>         else
>           gcc_unreachable ();
>=20=20
> +       another_aclass =3D ALLOCNO_CLASS (another_allocno);
>         if (another_allocno =3D=3D from
> +           || ALLOCNO_ASSIGNED_P (another_allocno)
> +           || ALLOCNO_COLOR_DATA (another_allocno)->may_be_spilled_p
> +           || ! ira_reg_classes_intersect_p[aclass][another_aclass]
>             || allocnos_conflict_p (another_allocno, start))
>           continue;
>=20=20
> -       another_aclass =3D ALLOCNO_CLASS (another_allocno);
> -       if (! ira_reg_classes_intersect_p[aclass][another_aclass]
> -           || ALLOCNO_ASSIGNED_P (another_allocno)
> -           || ALLOCNO_COLOR_DATA (another_allocno)->may_be_spilled_p)
> -         continue;
>         class_size =3D ira_class_hard_regs_num[another_aclass];
>         ira_allocate_and_copy_costs
>           (&ALLOCNO_UPDATED_CONFLICT_HARD_REG_COSTS (another_allocno),
>=20
>=20

If it is allocnos_conflict_p takes significant time, this change definitely=
 has
sense.  On my estimation it will decrease allocnos_conflict_p calls in abou=
t 4
times (assuming fp and int reg classes and half allocnos already assigned).

In any case, the above change is profitable as allocnos_conflict_p practica=
lly
always takes more time than the condition tests moved up.

> Now, what's more odd is that we sometimes have a nice bitmap representati=
on
> for the conflicts but we always iterate.  So it _seems_ we should be able
> to do sth like
>=20
> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
> index 8b6db1bb417..682d1ef7562 100644
> --- a/gcc/ira-color.cc
> +++ b/gcc/ira-color.cc
> @@ -1352,9 +1352,23 @@ allocnos_conflict_p (ira_allocno_t a1, ira_allocno=
_t
> a2)
>      {
>        obj =3D ALLOCNO_OBJECT (a1, word);
>        /* Take preferences of conflicting allocnos into account.  */
> -      FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
> -       if (OBJECT_ALLOCNO (conflict_obj) =3D=3D a2)
> -         return true;
> +      if  (!OBJECT_CONFLICT_VEC_P (obj))
> +       {
> +         for (int w2 =3D 0; w2 < ALLOCNO_NUM_OBJECTS (a2); w2++)
> +           {
> +             ira_object_t obj2 =3D ALLOCNO_OBJECT (a2, w2);
> +             if (OBJECT_CONFLICT_ID (obj2) >=3D OBJECT_MIN (obj)
> +                 && OBJECT_CONFLICT_ID (obj2) <=3D OBJECT_MAX (obj)
> +                 && TEST_MINMAX_SET_BIT (OBJECT_CONFLICT_BITVEC (obj),
> +                                         OBJECT_CONFLICT_ID (obj2),
> +                                         OBJECT_MIN (obj), OBJECT_MAX
> (obj)))
> +               return true;
> +           }
> +       }
> +      else
> +       FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
> +         if (OBJECT_ALLOCNO (conflict_obj) =3D=3D a2)
> +           return true;
>      }
>    return false;
>  }=20=20
>=20
> which reduces compile-time from 10s to 1s for me ... the above should
> be split out so we can "optimally" use the bit test for
> object vs. allocno when possible.
>=20
> Vlad - any thoughts about the above two things?  Shall I try to polish and
> optimize the bit test or would you be willing to pick those two speedups =
up?

This change also has sense.  Usually for big functions conflict sets are ve=
ry
sparse and bit vectors are not used.  But it seems this is not the case for=
 the
PR.

Please, polish and optimize the change as you proposed and I approve the fi=
nal
version promptly.

Thank you for working on this PR, Richard.=