From: Richard Biener
Date: Wed, 8 Nov 2023 08:57:09 +0100
Subject: Re: [PATCH 1/7] ira: Refactor the handling of register conflicts to make it more general
To: Lehua Ding
Cc: gcc-patches@gcc.gnu.org, vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai
References: <20231108034740.834590-1-lehua.ding@rivai.ai> <20231108034740.834590-2-lehua.ding@rivai.ai>
In-Reply-To: <20231108034740.834590-2-lehua.ding@rivai.ai>

On Wed, Nov 8, 2023 at 4:48 AM Lehua Ding wrote:
>
> This patch does not make any functional changes. It mainly refactor two parts:
>
> 1. The ira_allocno's objects field is expanded to an scalable array, and multi-word
>    pseduo registers are split and tracked only when necessary.
> 2. Since the objects array has been expanded, there will be more subreg objects
>    that pass through later, rather than the previous fixed two. Therefore, it
>    is necessary to modify the detection of whether two objects conflict, and
>    the check method is to pull back the registers occupied by the object to
>    the first register of the allocno for judgment.

Did you profile this before/after?  RA performance is critical ...

> gcc/ChangeLog:
>
>         * hard-reg-set.h (struct HARD_REG_SET): Add operator>>.
>         * ira-build.cc (init_object_start_and_nregs): New func.
>         (find_object): Ditto.
>         (ira_create_allocno): Adjust.
>         (ira_set_allocno_class): Set subreg info.
>         (ira_create_allocno_objects): Adjust.
>         (init_regs_with_subreg): Collect access in subreg.
>         (ira_build): Call init_regs_with_subreg
>         (ira_destroy): Clear regs_with_subreg
>         * ira-color.cc (setup_profitable_hard_regs): Adjust.
>         (get_conflict_and_start_profitable_regs): Adjust.
>         (check_hard_reg_p): Adjust.
>         (assign_hard_reg): Adjust.
>         (improve_allocation): Adjust.
>         * ira-int.h (struct ira_object): Adjust fields.
>         (struct ira_allocno): Adjust objects filed.
>         (ALLOCNO_NUM_OBJECTS): Adjust.
>         (ALLOCNO_UNIT_SIZE): New.
>         (ALLOCNO_TRACK_SUBREG_P): New.
>         (ALLOCNO_NREGS): New.
>         (OBJECT_SIZE): New.
>         (OBJECT_OFFSET): New.
>         (OBJECT_START): New.
>         (OBJECT_NREGS): New.
>         (find_object): New.
>         (has_subreg_object_p): New.
>         (get_full_object): New.
>         * ira.cc (check_allocation): Adjust.
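
For readers following item 2 of the description, here is a minimal sketch of the "pull back to the first register" check as I read it. RegMask, SubObject and forbidden_start_regs are hypothetical stand-ins (one 64-bit mask instead of HARD_REG_SET, little-endian word order only); this is not IRA code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for HARD_REG_SET: bit N set means hard register N.
using RegMask = std::uint64_t;

// Hypothetical stand-in for an ira_object: it covers registers
// [start, start + nregs) of its allocno and must avoid `conflicts`.
struct SubObject {
  int start;
  int nregs;
  RegMask conflicts;
};

// Return the allocno start registers that are ruled out by this object.
// If the allocno starts at hard register R, the object occupies
// R + start + j for j in [0, nregs); so a conflict on register C forbids
// R = C - start - j.  Shifting the conflict mask right by (start + j)
// "pulls" each conflict back to the allocno's first register.
inline RegMask forbidden_start_regs(const SubObject &obj) {
  assert(obj.start >= 0 && obj.nregs > 0 && obj.start + obj.nregs <= 64);
  RegMask forbidden = 0;
  for (int j = 0; j < obj.nregs; ++j)
    forbidden |= obj.conflicts >> (obj.start + j);
  return forbidden;
}
```

With an object covering register 1 of its allocno and a conflict on hard register 5, the only start register this rules out is 4. The union over all objects of an allocno then gives a single mask that can be tested with one bit lookup per candidate register.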
>
> ---
>  gcc/hard-reg-set.h |  33 +++++++
>  gcc/ira-build.cc   | 106 +++++++++++++++++++-
>  gcc/ira-color.cc   | 234 ++++++++++++++++++++++++++++++---------------
>  gcc/ira-int.h      |  45 ++++++++-
>  gcc/ira.cc         |  52 ++++------
>  5 files changed, 349 insertions(+), 121 deletions(-)
>
> diff --git a/gcc/hard-reg-set.h b/gcc/hard-reg-set.h
> index b0bb9bce074..760eadba186 100644
> --- a/gcc/hard-reg-set.h
> +++ b/gcc/hard-reg-set.h
> @@ -113,6 +113,39 @@ struct HARD_REG_SET
>      return !operator== (other);
>    }
>
> +  HARD_REG_SET
> +  operator>> (unsigned int shift_amount) const

This is a quite costly operation, why do we need it instead
of keeping an "offset" for set queries?

> +  {
> +    if (shift_amount == 0)
> +      return *this;
> +
> +    HARD_REG_SET res;
> +    unsigned int total_bits = sizeof (HARD_REG_ELT_TYPE) * 8;
> +    if (shift_amount >= total_bits)
> +      {
> +        unsigned int n_elt = shift_amount % total_bits;
> +        shift_amount -= n_elt * total_bits;
> +        for (unsigned int i = 0; i < ARRAY_SIZE (elts) - n_elt - 1; i += 1)
> +          res.elts[i] = elts[i + n_elt];
> +        /* clear upper n_elt elements.  */
> +        for (unsigned int i = 0; i < n_elt; i += 1)
> +          res.elts[ARRAY_SIZE (elts) - 1 - i] = 0;
> +      }
> +
> +    if (shift_amount > 0)
> +      {
> +        /* The left bits of an element be shifted.  */
> +        HARD_REG_ELT_TYPE left = 0;
> +        /* Total bits of an element.  */
> +        for (int i = ARRAY_SIZE (elts); i >= 0; --i)
> +          {
> +            res.elts[i] = (elts[i] >> shift_amount) | left;
> +            left = elts[i] << (total_bits - shift_amount);
> +          }
> +      }
> +    return res;
> +  }
> +
>    HARD_REG_ELT_TYPE elts[HARD_REG_SET_LONGS];
>  };
>  typedef const HARD_REG_SET &const_hard_reg_set;
> diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc
> index 93e46033170..07aba27c1c9 100644
> --- a/gcc/ira-build.cc
> +++ b/gcc/ira-build.cc
> @@ -440,6 +440,40 @@ initiate_allocnos (void)
>    memset (ira_regno_allocno_map, 0, max_reg_num () * sizeof (ira_allocno_t));
>  }
>
> +/* Update OBJ's start and nregs field according A and OBJ info.  */
> +static void
> +init_object_start_and_nregs (ira_allocno_t a, ira_object_t obj)
> +{
> +  enum reg_class aclass = ALLOCNO_CLASS (a);
> +  gcc_assert (aclass != NO_REGS);
> +
> +  machine_mode mode = ALLOCNO_MODE (a);
> +  int nregs = ira_reg_class_max_nregs[aclass][mode];
> +  if (ALLOCNO_TRACK_SUBREG_P (a))
> +    {
> +      poly_int64 end = OBJECT_OFFSET (obj) + OBJECT_SIZE (obj);
> +      for (int i = 0; i < nregs; i += 1)
> +        {
> +          poly_int64 right = ALLOCNO_UNIT_SIZE (a) * (i + 1);
> +          if (OBJECT_START (obj) < 0 && maybe_lt (OBJECT_OFFSET (obj), right))
> +            {
> +              OBJECT_START (obj) = i;
> +            }
> +          if (OBJECT_NREGS (obj) < 0 && maybe_le (end, right))
> +            {
> +              OBJECT_NREGS (obj) = i + 1 - OBJECT_START (obj);
> +              break;
> +            }
> +        }
> +      gcc_assert (OBJECT_START (obj) >= 0 && OBJECT_NREGS (obj) > 0);
> +    }
> +  else
> +    {
> +      OBJECT_START (obj) = 0;
> +      OBJECT_NREGS (obj) = nregs;
> +    }
> +}
> +
>  /* Create and return an object corresponding to a new allocno A.  */
>  static ira_object_t
>  ira_create_object (ira_allocno_t a, int subword)
> @@ -460,15 +494,36 @@ ira_create_object (ira_allocno_t a, int subword)
>    OBJECT_MIN (obj) = INT_MAX;
>    OBJECT_MAX (obj) = -1;
>    OBJECT_LIVE_RANGES (obj) = NULL;
> +  OBJECT_SIZE (obj) = UNITS_PER_WORD;
> +  OBJECT_OFFSET (obj) = subword * UNITS_PER_WORD;
> +  OBJECT_START (obj) = -1;
> +  OBJECT_NREGS (obj) = -1;
>
>    ira_object_id_map_vec.safe_push (obj);
>    ira_object_id_map
>      = ira_object_id_map_vec.address ();
>    ira_objects_num = ira_object_id_map_vec.length ();
>
> +  if (aclass != NO_REGS)
> +    init_object_start_and_nregs (a, obj);
> +
> +  a->objects.push_back (obj);
> +
>    return obj;
>  }
>
> +/* Return the object in allocno A which match START & NREGS.  */
> +ira_object_t
> +find_object (ira_allocno_t a, int start, int nregs)
> +{
> +  for (ira_object_t obj : a->objects)

linear search?  really?

> +    {
> +      if (OBJECT_START (obj) == start && OBJECT_NREGS (obj) == nregs)
> +        return obj;
> +    }
> +  return NULL;
> +}
> +
>  /* Create and return the allocno corresponding to REGNO in
>     LOOP_TREE_NODE.  Add the allocno to the list of allocnos with the
>     same regno if CAP_P is FALSE.  */
> @@ -525,7 +580,8 @@ ira_create_allocno (int regno, bool cap_p,
>    ALLOCNO_MEMORY_COST (a) = 0;
>    ALLOCNO_UPDATED_MEMORY_COST (a) = 0;
>    ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a) = 0;
> -  ALLOCNO_NUM_OBJECTS (a) = 0;
> +  ALLOCNO_UNIT_SIZE (a) = 0;
> +  ALLOCNO_TRACK_SUBREG_P (a) = false;
>
>    ALLOCNO_ADD_DATA (a) = NULL;
>    allocno_vec.safe_push (a);
> @@ -535,6 +591,9 @@ ira_create_allocno (int regno, bool cap_p,
>    return a;
>  }
>
> +/* Record the regs referenced by subreg.  */
> +static bitmap_head regs_with_subreg;
> +
>  /* Set up register class for A and update its conflict hard
>     registers.  */
>  void
> @@ -549,6 +608,19 @@ ira_set_allocno_class (ira_allocno_t a, enum reg_class aclass)
>        OBJECT_CONFLICT_HARD_REGS (obj) |= ~reg_class_contents[aclass];
>        OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= ~reg_class_contents[aclass];
>      }
> +
> +  if (aclass == NO_REGS)
> +    return;
> +  /* SET the unit_size of one register.  */
> +  machine_mode mode = ALLOCNO_MODE (a);
> +  int nregs = ira_reg_class_max_nregs[aclass][mode];
> +  if (nregs == 2 && maybe_eq (GET_MODE_SIZE (mode), nregs * UNITS_PER_WORD)
> +      && bitmap_bit_p (&regs_with_subreg, ALLOCNO_REGNO (a)))
> +    {
> +      ALLOCNO_UNIT_SIZE (a) = UNITS_PER_WORD;
> +      ALLOCNO_TRACK_SUBREG_P (a) = true;
> +      return;
> +    }
>  }
>
>  /* Determine the number of objects we should associate with allocno A
> @@ -561,12 +633,12 @@ ira_create_allocno_objects (ira_allocno_t a)
>    int n = ira_reg_class_max_nregs[aclass][mode];
>    int i;
>
> -  if (n != 2 || maybe_ne (GET_MODE_SIZE (mode), n * UNITS_PER_WORD))
> +  if (n != 2 || maybe_ne (GET_MODE_SIZE (mode), n * UNITS_PER_WORD)
> +      || !bitmap_bit_p (&regs_with_subreg, ALLOCNO_REGNO (a)))
>      n = 1;
>
> -  ALLOCNO_NUM_OBJECTS (a) = n;
>    for (i = 0; i < n; i++)
> -    ALLOCNO_OBJECT (a, i) = ira_create_object (a, i);
> +    ira_create_object (a, i);
>  }
>
>  /* For each allocno, set ALLOCNO_NUM_OBJECTS and create the
> @@ -3460,6 +3532,30 @@ update_conflict_hard_reg_costs (void)
>      }
>  }
>
> +/* Traverse all instructions to determine which ones have access through subreg.
> + */
> +static void
> +init_regs_with_subreg ()
> +{
> +  bitmap_initialize (&regs_with_subreg, &reg_obstack);
> +  basic_block bb;
> +  rtx_insn *insn;
> +  df_ref def, use;
> +  FOR_ALL_BB_FN (bb, cfun)
> +    FOR_BB_INSNS (bb, insn)
> +      {
> +        if (!NONDEBUG_INSN_P (insn))
> +          continue;
> +        df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
> +        FOR_EACH_INSN_INFO_DEF (def, insn_info)
> +          if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_SUBREG))
> +            bitmap_set_bit (&regs_with_subreg, DF_REF_REGNO (def));
> +        FOR_EACH_INSN_INFO_USE (use, insn_info)
> +          if (DF_REF_FLAGS (use) & (DF_REF_PARTIAL | DF_REF_SUBREG))
> +            bitmap_set_bit (&regs_with_subreg, DF_REF_REGNO (use));
> +      }
> +}
> +
>  /* Create a internal representation (IR) for IRA (allocnos, copies,
>     loop tree nodes).  The function returns TRUE if we generate loop
>     structure (besides nodes representing all function and the basic
> @@ -3475,6 +3571,7 @@ ira_build (void)
>    initiate_allocnos ();
>    initiate_prefs ();
>    initiate_copies ();
> +  init_regs_with_subreg ();
>    create_loop_tree_nodes ();
>    form_loop_tree ();
>    create_allocnos ();
> @@ -3565,4 +3662,5 @@ ira_destroy (void)
>    finish_allocnos ();
>    finish_cost_vectors ();
>    ira_finish_allocno_live_ranges ();
> +  bitmap_clear (&regs_with_subreg);
>  }
> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
> index f2e8ea34152..6af8318e5f5 100644
> --- a/gcc/ira-color.cc
> +++ b/gcc/ira-color.cc
> @@ -1031,7 +1031,7 @@ static void
>  setup_profitable_hard_regs (void)
>  {
>    unsigned int i;
> -  int j, k, nobj, hard_regno, nregs, class_size;
> +  int j, k, nobj, hard_regno, class_size;
>    ira_allocno_t a;
>    bitmap_iterator bi;
>    enum reg_class aclass;
> @@ -1076,7 +1076,6 @@ setup_profitable_hard_regs (void)
>            || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0)
>          continue;
>        mode = ALLOCNO_MODE (a);
> -      nregs = hard_regno_nregs (hard_regno, mode);
>        nobj = ALLOCNO_NUM_OBJECTS (a);
>        for (k = 0; k < nobj; k++)
>          {
> @@ -1088,24 +1087,39 @@ setup_profitable_hard_regs (void)
>              {
>                ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
>
> -              /* We can process the conflict allocno repeatedly with
> -                 the same result.  */
> -              if (nregs == nobj && nregs > 1)
> +              if (!has_subreg_object_p (a))
>                  {
> -                  int num = OBJECT_SUBWORD (conflict_obj);
> -
> -                  if (REG_WORDS_BIG_ENDIAN)
> -                    CLEAR_HARD_REG_BIT
> -                      (ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs,
> -                       hard_regno + nobj - num - 1);
> -                  else
> -                    CLEAR_HARD_REG_BIT
> -                      (ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs,
> -                       hard_regno + num);
> +                  ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs
> +                    &= ~ira_reg_mode_hard_regset[hard_regno][mode];
> +                  continue;
> +                }
> +
> +              /* Clear all hard regs occupied by obj.  */
> +              if (REG_WORDS_BIG_ENDIAN)
> +                {
> +                  int start_regno
> +                    = hard_regno + ALLOCNO_NREGS (a) - 1 - OBJECT_START (obj);
> +                  for (int i = 0; i < OBJECT_NREGS (obj); i += 1)
> +                    {
> +                      int regno = start_regno - i;
> +                      if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER)
> +                        CLEAR_HARD_REG_BIT (
> +                          ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs,
> +                          regno);
> +                    }
>                  }
>                else
> -                ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs
> -                  &= ~ira_reg_mode_hard_regset[hard_regno][mode];
> +                {
> +                  int start_regno = hard_regno + OBJECT_START (obj);
> +                  for (int i = 0; i < OBJECT_NREGS (obj); i += 1)
> +                    {
> +                      int regno = start_regno + i;
> +                      if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER)
> +                        CLEAR_HARD_REG_BIT (
> +                          ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs,
> +                          regno);
> +                    }
> +                }
>              }
>          }
>      }
> @@ -1677,18 +1691,25 @@ update_conflict_hard_regno_costs (int *costs, enum reg_class aclass,
>     aligned.  */
>  static inline void
>  get_conflict_and_start_profitable_regs (ira_allocno_t a, bool retry_p,
> -                                        HARD_REG_SET *conflict_regs,
> +                                        HARD_REG_SET *start_conflict_regs,
>                                          HARD_REG_SET *start_profitable_regs)
>  {
>    int i, nwords;
>    ira_object_t obj;
>
>    nwords = ALLOCNO_NUM_OBJECTS (a);
> -  for (i = 0; i < nwords; i++)
> -    {
> -      obj = ALLOCNO_OBJECT (a, i);
> -      conflict_regs[i] = OBJECT_TOTAL_CONFLICT_HARD_REGS (obj);
> -    }
> +  CLEAR_HARD_REG_SET (*start_conflict_regs);
> +  if (has_subreg_object_p (a))
> +    for (i = 0; i < nwords; i++)
> +      {
> +        obj = ALLOCNO_OBJECT (a, i);
> +        for (int j = 0; j < OBJECT_NREGS (obj); j += 1)
> +          *start_conflict_regs |= OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)
> +                                  >> (OBJECT_START (obj) + j);
> +      }
> +  else
> +    *start_conflict_regs
> +      = OBJECT_TOTAL_CONFLICT_HARD_REGS (get_full_object (a));
>    if (retry_p)
>      *start_profitable_regs
>        = (reg_class_contents[ALLOCNO_CLASS (a)]
> @@ -1702,9 +1723,9 @@ get_conflict_and_start_profitable_regs (ira_allocno_t a, bool retry_p,
>     PROFITABLE_REGS and whose objects have CONFLICT_REGS.  */
>  static inline bool
>  check_hard_reg_p (ira_allocno_t a, int hard_regno,
> -                  HARD_REG_SET *conflict_regs, HARD_REG_SET profitable_regs)
> +                  HARD_REG_SET start_conflict_regs,
> +                  HARD_REG_SET profitable_regs)
>  {
> -  int j, nwords, nregs;
>    enum reg_class aclass;
>    machine_mode mode;
>
> @@ -1716,28 +1737,17 @@ check_hard_reg_p (ira_allocno_t a, int hard_regno,
>    /* Checking only profitable hard regs.  */
>    if (! TEST_HARD_REG_BIT (profitable_regs, hard_regno))
>      return false;
> -  nregs = hard_regno_nregs (hard_regno, mode);
> -  nwords = ALLOCNO_NUM_OBJECTS (a);
> -  for (j = 0; j < nregs; j++)
> +
> +  if (has_subreg_object_p (a))
> +    return !TEST_HARD_REG_BIT (start_conflict_regs, hard_regno);
> +  else
>      {
> -      int k;
> -      int set_to_test_start = 0, set_to_test_end = nwords;
> -
> -      if (nregs == nwords)
> -        {
> -          if (REG_WORDS_BIG_ENDIAN)
> -            set_to_test_start = nwords - j - 1;
> -          else
> -            set_to_test_start = j;
> -          set_to_test_end = set_to_test_start + 1;
> -        }
> -      for (k = set_to_test_start; k < set_to_test_end; k++)
> -        if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j))
> -          break;
> -      if (k != set_to_test_end)
> -        break;
> +      int nregs = hard_regno_nregs (hard_regno, mode);
> +      for (int i = 0; i < nregs; i += 1)
> +        if (TEST_HARD_REG_BIT (start_conflict_regs, hard_regno + i))
> +          return false;
> +      return true;
>      }
> -  return j == nregs;
>  }
>
>  /* Return number of registers needed to be saved and restored at
> @@ -1945,7 +1955,7 @@ spill_soft_conflicts (ira_allocno_t a, bitmap allocnos_to_spill,
>  static bool
>  assign_hard_reg (ira_allocno_t a, bool retry_p)
>  {
> -  HARD_REG_SET conflicting_regs[2], profitable_hard_regs;
> +  HARD_REG_SET start_conflicting_regs, profitable_hard_regs;
>    int i, j, hard_regno, best_hard_regno, class_size;
>    int cost, mem_cost, min_cost, full_cost, min_full_cost, nwords, word;
>    int *a_costs;
> @@ -1962,8 +1972,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
>    HARD_REG_SET soft_conflict_regs = {};
>
>    ira_assert (! ALLOCNO_ASSIGNED_P (a));
> -  get_conflict_and_start_profitable_regs (a, retry_p,
> -                                          conflicting_regs,
> +  get_conflict_and_start_profitable_regs (a, retry_p, &start_conflicting_regs,
>                                            &profitable_hard_regs);
>    aclass = ALLOCNO_CLASS (a);
>    class_size = ira_class_hard_regs_num[aclass];
> @@ -2041,7 +2050,6 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
>                        (hard_regno, ALLOCNO_MODE (conflict_a),
>                         reg_class_contents[aclass])))
>              {
> -              int n_objects = ALLOCNO_NUM_OBJECTS (conflict_a);
>                int conflict_nregs;
>
>                mode = ALLOCNO_MODE (conflict_a);
> @@ -2076,24 +2084,95 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
>                            note_conflict (r);
>                        }
>                    }
> +                else if (has_subreg_object_p (a))
> +                  {
> +                    /* Set start_conflicting_regs if that cause obj and
> +                       conflict_obj overlap. the overlap position:
> +                                       +--------------+
> +                                       | conflict_obj |
> +                                       +--------------+
> +
> +                         +-----------+                 +-----------+
> +                         |   obj     |       ...       |   obj     |
> +                         +-----------+                 +-----------+
> +
> +                       Point: A        B               C
> +
> +                       the hard regs from A to C point will cause overlap.
> +                       For REG_WORDS_BIG_ENDIAN:
> +                          A = hard_regno + ALLOCNO_NREGS (conflict_a) - 1
> +                              - OBJECT_START (conflict_obj)
> +                              - OBJECT_NREGS (obj) + 1
> +                          C = A + OBJECT_NREGS (obj)
> +                              + OBJECT_NREGS (conflict_obj) - 2
> +                       For !REG_WORDS_BIG_ENDIAN:
> +                          A = hard_regno + OBJECT_START (conflict_obj)
> +                              - OBJECT_NREGS (obj) + 1
> +                          C = A + OBJECT_NREGS (obj)
> +                              + OBJECT_NREGS (conflict_obj) - 2
> +                       */
> +                    int start_regno;
> +                    int conflict_allocno_nregs, conflict_object_nregs,
> +                      conflict_object_start;
> +                    if (has_subreg_object_p (conflict_a))
> +                      {
> +                        conflict_allocno_nregs = ALLOCNO_NREGS (conflict_a);
> +                        conflict_object_nregs = OBJECT_NREGS (conflict_obj);
> +                        conflict_object_start = OBJECT_START (conflict_obj);
> +                      }
> +                    else
> +                      {
> +                        conflict_allocno_nregs = conflict_object_nregs
> +                          = hard_regno_nregs (hard_regno, mode);
> +                        conflict_object_start = 0;
> +                      }
> +                    if (REG_WORDS_BIG_ENDIAN)
> +                      {
> +                        int A = hard_regno + conflict_allocno_nregs - 1
> +                                - conflict_object_start - OBJECT_NREGS (obj)
> +                                + 1;
> +                        start_regno = A + OBJECT_NREGS (obj) - 1
> +                                      + OBJECT_START (obj) - ALLOCNO_NREGS (a)
> +                                      + 1;
> +                      }
> +                    else
> +                      {
> +                        int A = hard_regno + conflict_object_start
> +                                - OBJECT_NREGS (obj) + 1;
> +                        start_regno = A - OBJECT_START (obj);
> +                      }
> +
> +                    for (int i = 0;
> +                         i <= OBJECT_NREGS (obj) + conflict_object_nregs - 2;
> +                         i += 1)
> +                      {
> +                        int regno = start_regno + i;
> +                        if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER)
> +                          SET_HARD_REG_BIT (start_conflicting_regs, regno);
> +                      }
> +                    if (hard_reg_set_subset_p (profitable_hard_regs,
> +                                               start_conflicting_regs))
> +                      goto fail;
> +                  }
>                  else
>                    {
> -                    if (conflict_nregs == n_objects && conflict_nregs > 1)
> +                    if (has_subreg_object_p (conflict_a))
>                        {
> -                        int num = OBJECT_SUBWORD (conflict_obj);
> -
> -                        if (REG_WORDS_BIG_ENDIAN)
> -                          SET_HARD_REG_BIT (conflicting_regs[word],
> -                                            hard_regno + n_objects - num - 1);
> -                        else
> -                          SET_HARD_REG_BIT (conflicting_regs[word],
> -                                            hard_regno + num);
> +                        int start_hard_regno
> +                          = REG_WORDS_BIG_ENDIAN
> +                              ? hard_regno + ALLOCNO_NREGS (conflict_a)
> +                                  - OBJECT_START (conflict_obj)
> +                              : hard_regno + OBJECT_START (conflict_obj);
> +                        for (int i = 0; i < OBJECT_NREGS (conflict_obj);
> +                             i += 1)
> +                          SET_HARD_REG_BIT (start_conflicting_regs,
> +                                            start_hard_regno + i);
>                        }
>                      else
> -                      conflicting_regs[word]
> +                      start_conflicting_regs
>                          |= ira_reg_mode_hard_regset[hard_regno][mode];
>                      if (hard_reg_set_subset_p (profitable_hard_regs,
> -                                               conflicting_regs[word]))
> +                                               start_conflicting_regs))
>                        goto fail;
>                    }
>                }
> @@ -2160,8 +2239,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
>            && FIRST_STACK_REG <= hard_regno && hard_regno <= LAST_STACK_REG)
>          continue;
>  #endif
> -      if (! check_hard_reg_p (a, hard_regno,
> -                              conflicting_regs, profitable_hard_regs))
> +      if (!check_hard_reg_p (a, hard_regno, start_conflicting_regs,
> +                             profitable_hard_regs))
>          continue;
>        cost = costs[i];
>        full_cost = full_costs[i];
> @@ -3154,7 +3233,7 @@ improve_allocation (void)
>    machine_mode mode;
>    int *allocno_costs;
>    int costs[FIRST_PSEUDO_REGISTER];
> -  HARD_REG_SET conflicting_regs[2], profitable_hard_regs;
> +  HARD_REG_SET start_conflicting_regs, profitable_hard_regs;
>    ira_allocno_t a;
>    bitmap_iterator bi;
>    int saved_nregs;
> @@ -3193,7 +3272,7 @@ improve_allocation (void)
>                     - allocno_copy_cost_saving (a, hregno));
>        try_p = false;
>        get_conflict_and_start_profitable_regs (a, false,
> -                                              conflicting_regs,
> +                                              &start_conflicting_regs,
>                                                &profitable_hard_regs);
>        class_size = ira_class_hard_regs_num[aclass];
>        mode = ALLOCNO_MODE (a);
> @@ -3202,8 +3281,8 @@ improve_allocation (void)
>        for (j = 0; j < class_size; j++)
>          {
>            hregno = ira_class_hard_regs[aclass][j];
> -          if (! check_hard_reg_p (a, hregno,
> -                                  conflicting_regs, profitable_hard_regs))
> +          if (!check_hard_reg_p (a, hregno, start_conflicting_regs,
> +                                 profitable_hard_regs))
>              continue;
>            ira_assert (ira_class_hard_reg_index[aclass][hregno] == j);
>            k = allocno_costs == NULL ? 0 : j;
> @@ -3287,16 +3366,15 @@ improve_allocation (void)
>                  }
>                conflict_nregs = hard_regno_nregs (conflict_hregno,
>                                                   ALLOCNO_MODE (conflict_a));
> -              auto note_conflict = [&](int r)
> -                {
> -                  if (check_hard_reg_p (a, r,
> -                                        conflicting_regs, profitable_hard_regs))
> -                    {
> -                      if (spill_a)
> -                        SET_HARD_REG_BIT (soft_conflict_regs, r);
> -                      costs[r] += spill_cost;
> -                    }
> -                };
> +              auto note_conflict = [&] (int r) {
> +                if (check_hard_reg_p (a, r, start_conflicting_regs,
> +                                      profitable_hard_regs))
> +                  {
> +                    if (spill_a)
> +                      SET_HARD_REG_BIT (soft_conflict_regs, r);
> +                    costs[r] += spill_cost;
> +                  }
> +              };
>                for (r = conflict_hregno;
>                     r >= 0 && (int) end_hard_regno (mode, r) > conflict_hregno;
>                     r--)
> @@ -3314,8 +3392,8 @@ improve_allocation (void)
>        for (j = 0; j < class_size; j++)
>          {
>            hregno = ira_class_hard_regs[aclass][j];
> -          if (check_hard_reg_p (a, hregno,
> -                                conflicting_regs, profitable_hard_regs)
> +          if (check_hard_reg_p (a, hregno, start_conflicting_regs,
> +                                profitable_hard_regs)
>                && min_cost > costs[hregno])
>              {
>                best = hregno;
> diff --git a/gcc/ira-int.h b/gcc/ira-int.h
> index 0685e1f4e8d..b6281d3df6d 100644
> --- a/gcc/ira-int.h
> +++ b/gcc/ira-int.h
> @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  #include "recog.h"
>  #include "function-abi.h"
> +#include <vector>
>
>  /* To provide consistency in naming, all IRA external variables,
>     functions, common typedefs start with prefix ira_.  */
> @@ -240,6 +241,13 @@ struct ira_object
>       Zero means the lowest-order subword (or the entire allocno in case
>       it is not being tracked in subwords).  */
>    int subword;
> +  /* Reprensent OBJECT occupied [start, start + nregs) registers of it's
> +     ALLOCNO.  */
> +  int start, nregs;
> +  /* Reprensent the size and offset of current object, use to track subreg
> +     range, For full reg, the size is GET_MODE_SIZE (ALLOCNO_MODE (allocno)),
> +     offset is 0.  */
> +  poly_int64 size, offset;
>    /* Allocated size of the conflicts array.  */
>    unsigned int conflicts_array_size;
>    /* A unique number for every instance of this structure, which is used
> @@ -295,6 +303,11 @@ struct ira_allocno
>       reload (at this point pseudo-register has only one allocno) which
>       did not get stack slot yet.  */
>    signed int hard_regno : 16;
> +  /* Unit size of one register that allocate for the allocno. Only use to
> +     compute the start and nregs of subreg which be tracked.  */
> +  poly_int64 unit_size;
> +  /* Flag means need track subreg live range for the allocno.  */
> +  bool track_subreg_p;
>    /* A bitmask of the ABIs used by calls that occur while the allocno
>       is live.  */
>    unsigned int crossed_calls_abis : NUM_ABI_IDS;
> @@ -353,8 +366,6 @@ struct ira_allocno
>       register class living at the point than number of hard-registers
>       of the class available for the allocation.  */
>    int excess_pressure_points_num;
> -  /* The number of objects tracked in the following array.  */
> -  int num_objects;
>    /* Accumulated frequency of calls which given allocno
>       intersects.  */
>    int call_freq;
> @@ -387,8 +398,8 @@ struct ira_allocno
>    /* An array of structures describing conflict information and live
>       ranges for each object associated with the allocno.  There may be
>       more than one such object in cases where the allocno represents a
> -     multi-word register.  */
> -  ira_object_t objects[2];
> +     multi-hardreg pesudo.  */
> +  std::vector<ira_object_t> objects;
>    /* Registers clobbered by intersected calls.  */
>    HARD_REG_SET crossed_calls_clobbered_regs;
>    /* Array of usage costs (accumulated and the one updated during
> @@ -468,8 +479,12 @@ struct ira_allocno
>  #define ALLOCNO_EXCESS_PRESSURE_POINTS_NUM(A) \
>    ((A)->excess_pressure_points_num)
>  #define ALLOCNO_OBJECT(A,N) ((A)->objects[N])
> -#define ALLOCNO_NUM_OBJECTS(A) ((A)->num_objects)
> +#define ALLOCNO_NUM_OBJECTS(A) ((int) (A)->objects.size ())
>  #define ALLOCNO_ADD_DATA(A) ((A)->add_data)
> +#define ALLOCNO_UNIT_SIZE(A) ((A)->unit_size)
> +#define ALLOCNO_TRACK_SUBREG_P(A) ((A)->track_subreg_p)
> +#define ALLOCNO_NREGS(A)                                                       \
> +  (ira_reg_class_max_nregs[ALLOCNO_CLASS (A)][ALLOCNO_MODE (A)])
>
>  /* Typedef for pointer to the subsequent structure.  */
>  typedef struct ira_emit_data *ira_emit_data_t;
> @@ -511,6 +526,8 @@ allocno_emit_reg (ira_allocno_t a)
>  }
>
>  #define OBJECT_ALLOCNO(O) ((O)->allocno)
> +#define OBJECT_SIZE(O) ((O)->size)
> +#define OBJECT_OFFSET(O) ((O)->offset)
>  #define OBJECT_SUBWORD(O) ((O)->subword)
>  #define OBJECT_CONFLICT_ARRAY(O) ((O)->conflicts_array)
>  #define OBJECT_CONFLICT_VEC(O) ((ira_object_t *)(O)->conflicts_array)
> @@ -524,6 +541,8 @@ allocno_emit_reg (ira_allocno_t a)
>  #define OBJECT_MAX(O) ((O)->max)
>  #define OBJECT_CONFLICT_ID(O) ((O)->id)
>  #define OBJECT_LIVE_RANGES(O) ((O)->live_ranges)
> +#define OBJECT_START(O) ((O)->start)
> +#define OBJECT_NREGS(O) ((O)->nregs)
>
>  /* Map regno -> allocnos with given regno (see comments for
>     allocno member `next_regno_allocno').  */
> @@ -1041,6 +1060,8 @@ extern void ira_free_cost_vector (int *, reg_class_t);
>  extern void ira_flattening (int, int);
>  extern bool ira_build (void);
>  extern void ira_destroy (void);
> +extern ira_object_t
> +find_object (ira_allocno_t, int, int);
>
>  /* ira-costs.cc */
>  extern void ira_init_costs_once (void);
> @@ -1708,4 +1729,18 @@ ira_caller_save_loop_spill_p (ira_allocno_t a, ira_allocno_t subloop_a,
>    return call_cost && call_cost >= spill_cost;
>  }
>
> +/* Return true if allocno A has subreg object.  */
> +inline bool
> +has_subreg_object_p (ira_allocno_t a)
> +{
> +  return ALLOCNO_NUM_OBJECTS (a) > 1;
> +}
> +
> +/* Return the full object of allocno A.  */
> +inline ira_object_t
> +get_full_object (ira_allocno_t a)
> +{
> +  return find_object (a, 0, ALLOCNO_NREGS (a));
> +}
> +
>  #endif /* GCC_IRA_INT_H */
> diff --git a/gcc/ira.cc b/gcc/ira.cc
> index d7530f01380..2fa6e0e5c94 100644
> --- a/gcc/ira.cc
> +++ b/gcc/ira.cc
> @@ -2623,7 +2623,7 @@ static void
>  check_allocation (void)
>  {
>    ira_allocno_t a;
> -  int hard_regno, nregs, conflict_nregs;
> +  int hard_regno;
>    ira_allocno_iterator ai;
>
>    FOR_EACH_ALLOCNO (a, ai)
> @@ -2634,28 +2634,18 @@ check_allocation (void)
>        if (ALLOCNO_CAP_MEMBER (a) != NULL
>            || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0)
>          continue;
> -      nregs = hard_regno_nregs (hard_regno, ALLOCNO_MODE (a));
> -      if (nregs == 1)
> -        /* We allocated a single hard register.  */
> -        n = 1;
> -      else if (n > 1)
> -        /* We allocated multiple hard registers, and we will test
> -           conflicts in a granularity of single hard regs.  */
> -        nregs = 1;
>
>        for (i = 0; i < n; i++)
>          {
>            ira_object_t obj = ALLOCNO_OBJECT (a, i);
>            ira_object_t conflict_obj;
>            ira_object_conflict_iterator oci;
> -          int this_regno = hard_regno;
> -          if (n > 1)
> -            {
> -              if (REG_WORDS_BIG_ENDIAN)
> -                this_regno += n - i - 1;
> -              else
> -                this_regno += i;
> -            }
> +          int this_regno;
> +          if (REG_WORDS_BIG_ENDIAN)
> +            this_regno = hard_regno + ALLOCNO_NREGS (a) - 1 - OBJECT_START (obj)
> +                         - OBJECT_NREGS (obj) + 1;
> +          else
> +            this_regno = hard_regno + OBJECT_START (obj);
>            FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
>              {
>                ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
> @@ -2665,24 +2655,18 @@ check_allocation (void)
>                if (ira_soft_conflict (a, conflict_a))
>                  continue;
>
> -              conflict_nregs = hard_regno_nregs (conflict_hard_regno,
> -                                                 ALLOCNO_MODE (conflict_a));
> -
> -              if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1
> -                  && conflict_nregs == ALLOCNO_NUM_OBJECTS (conflict_a))
> -                {
> -                  if (REG_WORDS_BIG_ENDIAN)
> -                    conflict_hard_regno += (ALLOCNO_NUM_OBJECTS (conflict_a)
> -                                            - OBJECT_SUBWORD (conflict_obj) - 1);
> -                  else
> -                    conflict_hard_regno += OBJECT_SUBWORD (conflict_obj);
> -                  conflict_nregs = 1;
> -                }
> +              if (REG_WORDS_BIG_ENDIAN)
> +                conflict_hard_regno = conflict_hard_regno
> +                                      + ALLOCNO_NREGS (conflict_a) - 1
> +                                      - OBJECT_START (conflict_obj)
> +                                      - OBJECT_NREGS (conflict_obj) + 1;
> +              else
> +                conflict_hard_regno
> +                  = conflict_hard_regno + OBJECT_START (conflict_obj);
>
> -              if ((conflict_hard_regno <= this_regno
> -                   && this_regno < conflict_hard_regno + conflict_nregs)
> -                  || (this_regno <= conflict_hard_regno
> -                      && conflict_hard_regno < this_regno + nregs))
> +              if (!(this_regno + OBJECT_NREGS (obj) <= conflict_hard_regno
> +                    || conflict_hard_regno + OBJECT_NREGS (conflict_obj)
> +                         <= this_regno))
>                  {
>                    fprintf (stderr, "bad allocation for %d and %d\n",
>                             ALLOCNO_REGNO (a), ALLOCNO_REGNO (conflict_a));
> --
> 2.36.3
>