From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21675 invoked by alias); 13 Aug 2010 21:00:08 -0000 Received: (qmail 21385 invoked by uid 22791); 13 Aug 2010 20:59:54 -0000 X-SWARE-Spam-Status: No, hits=-6.0 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_HI,SPF_HELO_PASS,TW_CP,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 13 Aug 2010 20:59:39 +0000 Received: from int-mx04.intmail.prod.int.phx2.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.17]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o7DKxc7O010756 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 13 Aug 2010 16:59:38 -0400 Received: from x61s.makarov (ovpn-113-52.phx2.redhat.com [10.3.113.52]) by int-mx04.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id o7DKxZT7023721; Fri, 13 Aug 2010 16:59:36 -0400 Message-ID: <4C65B237.6010201@redhat.com> Date: Fri, 13 Aug 2010 21:37:00 -0000 From: "Vladimir N. Makarov" User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.11) Gecko/20100720 Fedora/3.0.6-1.fc12 Thunderbird/3.0.6 MIME-Version: 1.0 To: GCC Patches CC: Jeff Law Subject: resubmitting IRA improvements: patch 4/4 Content-Type: multipart/mixed; boundary="------------040302030206090201090801" X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2010-08/txt/msg01044.txt.bz2 This is a multi-part message in MIME format. --------------040302030206090201090801 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-length: 6969 This another patch to compensate slow down by the patch removing cover classes. Major changes is to decrease memory footprint of register allocator IR. Is it ok to commit into the trunk? 2010-08-13 Vladimir Makarov * ira-build.c: (ira_create_object): Remove initialization of OBJECT_PROFITABLE_HARD_REGS. Initialize OBJECT_ADD_DATA. (ira_create_allocno): Remove initialization of ALLOCNO_MEM_OPTIMIZED_DEST, ALLOCNO_MEM_OPTIMIZED_DEST_P, ALLOCNO_SOMEWHERE_RENAMED_P, ALLOCNO_CHILD_RENAMED_P, ALLOCNO_IN_GRAPH_P, ALLOCNO_MAY_BE_SPILLED_P, ALLOCNO_COLORABLE_P, ALLOCNO_NEXT_BUCKET_ALLOCNO, ALLOCNO_PREV_BUCKET_ALLOCNO, ALLOCNO_FIRST_COALESCED_ALLOCNO, ALLOCNO_NEXT_COALESCED_ALLOCNO. Initialize ALLOCNO_ADD_DATA. (copy_info_to_removed_store_destinations): Use ALLOCNO_EMIT_DATA and allocno_emit_reg instead of ALLOCNO_MEM_OPTIMIZED_DEST_P and ALLOCNO_REG. (ira_flattening): Ditto. Use ALLOCNO_EMIT_DATA instead of ALLOCNO_MEM_OPTIMIZED_DEST and ALLOCNO_SOMEWHERE_RENAMED_P. * ira.c (ira_reallocate): Remove. (setup_pressure_classes): Call ira_init_register_move_cost_if_necessary. Use ira_register_move_cost instead of ira_get_register_move_cost. (setup_allocno_assignment_flags): Use ALLOCNO_EMIT_DATA. (ira): Call ira_initiate_emit_data and ira_finish_emit_data. * ira-color.c: Use ALLOCNO_COLOR_DATA instead of ALLOCNO_IN_GRAPH_P, ALLOCNO_MAY_BE_SPILLED_P, ALLOCNO_COLORABLE_P, ALLOCNO_AVAILABLE_REGS_NUM, ALLOCNO_NEXT_BUCKET_ALLOCNO, ALLOCNO_PREV_BUCKET_ALLOCNO. ALLOCNO_TEMP. Use OBJECT_COLOR_DATA instead of OBJECT_PROFITABLE_HARD_REGS, OBJECT_HARD_REGS_NODE, OBJECT_HARD_REGS_SUBNODES_START, OBJECT_HARD_REGS_SUBNODES_NUM. Fix formatting. (object_hard_regs_t, object_hard_regs_node_t): Move from ira-int.h. (struct object_hard_regs, struct object_hard_regs_node): Ditto. (struct allocno_color_data): New. (allocno_color_data_t): New typedef. (allocno_color_data): New definition. (ALLOCNO_COLOR_DATA): New macro. (struct object_color_data): New. (object_color_data_t): New typedef. (object_color_data): New definition. (OBJECT_COLOR_DATA): New macro. (update_copy_costs, calculate_allocno_spill_cost): Call ira_init_register_move_cost_if_necessary. Use ira_register_move_cost instead of ira_get_register_move_cost. (move_spill_restore, update_curr_costs): Ditto. (allocno_spill_priority): Make it inline. (color_pass): Allocate and free allocno_color_dat and object_color_data. (struct coalesce_data, coalesce_data_t): New. (allocno_coalesce_data): New definition. (ALLOCNO_COALESCE_DATA): New macro. (merge_allocnos, coalesced_allocno_conflict_p): Use ALLOCNO_COALESCED_DATA instead of ALLOCNO_FIRST_COALESCED_ALLOCNO, ALLOCNO_NEXT_COALESCED_ALLOCNO, ALLOCNO_TEMP. (coalesce_allocnos): Ditto. (setup_coalesced_allocno_costs_and_nums): Ditto. (collect_spilled_coalesced_allocnos): Ditto. (slot_coalesced_allocno_live_ranges_intersect_p): Ditto. (setup_slot_coalesced_allocno_live_ranges): Ditto. (coalesce_spill_slots): Ditto. (ira_sort_regnos_for_alter_reg): Ditto. Allocate, initialize and free allocno_coalesce_data. * ira-conflicts.c: Fix formatting. (process_regs_for_copy): Call ira_init_register_move_cost_if_necessary. Use ira_register_move_cost instead of ira_get_register_move_cost. (build_object_conflicts): Optimize. * ira-costs.c (record_reg_classes): Optimize. Call ira_init_register_move_cost_if_necessary. Use ira_register_move_cost, ira_may_move_in_cost, and ira_may_move_out_cost instead of ira_get_register_move_cost and ira_get_may_move_cost. (record_address_regs): Ditto. (scan_one_insn): Optimize. (find_costs_and_classes): Optimize. (process_bb_node_for_hard_reg_moves): Call ira_init_register_move_cost_if_necessary. Use ira_register_move_cost instead of ira_get_register_move_cost. * ira-emit.c: Use allocno_emit_reg, ALLOCNO_EMIT_DATA instead of ALLOCNO_REG, ALLOCNO_CHILD_RENAMED_P, ALLOCNO_MEM_OPTIMIZED_DEST, ALLOCNO_MEM_OPTIMIZED_DEST_P, and ALLOCNO_SOMEWHERE_RENAMED_P. (ira_allocno_emit_data, void_p, new_allocno_emit_data_vec): New definitions. (ira_initiate_emit_data, ira_finish_emit_data) (create_new_allocno): New functions. (modify_move_list): Call create_new_alloc instead of ira_create_allocno. (emit_move_list): Call ira_init_register_move_cost_if_necessary. Use ira_register_move_cost instead of ira_get_register_move_cost. * ira-int.h: Fix some comments. (object_hard_regs_t, object_hard_regs_node_t): Move to ira-color.c. (struct object_hard_regs, struct object_hard_regs_node): Ditto. (struct ira_object): Remove profitable_hard_regs, hard_regs_node, hard_regs_subnodes_start, hard_regs_subnodes_num. Add new member add_data. (struct ira_allocno): Make mode and aclass a bitfield. Move other bitfield after mode. Make hard_regno a short int. Make hard_regno short. Remove first_coalesced_allocno and next_coalesced_allocno. Move mem_optimized_dest_p, somewhere_renamed_p, child_renamed_p, reg, and mem_optimized_dest into struct ira_emit_data. Remove in_graph_p, may_be_spilled_p, available_regs_num, next_bucket_allocno, prev_bucket_allocno, temp, colorable_p. Add new member add_data. (ALLOCNO_IN_GRAPH_P, ALLOCNO_MAY_BE_SPILLED_P): Remove. (ALLOCNO_COLORABLE_P, ALLOCNO_AVAILABLE_REGS_NUM): Remove. (ALLOCNO_NEXT_BUCKET_ALLOCNO, ALLOCNO_PREV_BUCKET_ALLOCNO): Remove. (ALLOCNO_TEMP, ALLOCNO_FIRST_COALESCED_ALLOCNO): Remove. (ALLOCNO_NEXT_COALESCED_ALLOCNO): Remove. (ALLOCNO_ADD_DATA): New macro. (ira_emit_data_t): New typedef. (struct ira_emit_data): New. Move mem_optimized_dest_p, somewhere_renamed_p, child_renamed_p, reg, mem_optimized_dest from struct ira_allocno. (ALLOCNO_EMIT_DATA): New macro. (ira_allocno_emit_data, allocno_emit_reg): New. (ALLOCNO_PROFITABLE_HARD_REGS, OBJECT_HARD_REGS_NODE): Remove. (OBJECT_HARD_REGS_SUBNODES_STAR, OBJECT_HARD_REGS_SUBNODES_NUM): Remove. (OBJECT_ADD_DATA): New macro. (ira_reallocate): Remove. (ira_initiate_emit_data, ira_finish_emit_data): New. (ira_get_register_move_cost, ira_get_may_move_cost): Remove. (ira_init_register_move_cost_if_necessary): New. (ira_object_conflict_iter_next): Merge into ira_object_conflict_iter_cond. (FOR_EACH_OBJECT_CONFLICT): Don't use ira_object_conflict_iter_next. * ira-live.c: (process_single_reg_class_operands): Call ira_init_register_move_cost_if_necessary. Use ira_register_move_cost instead of ira_get_register_move_cost. --------------040302030206090201090801 Content-Type: text/plain; name="speedup2.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="speedup2.patch" Content-length: 104850 --- t2/ira-build.c 2010-08-13 09:37:52.000000000 -0400 +++ ../../gcc/gcc/ira-build.c 2010-08-10 14:10:47.000000000 -0400 @@ -443,7 +443,7 @@ ira_create_object (ira_allocno_t a, int OBJECT_MIN (obj) = INT_MAX; OBJECT_MAX (obj) = -1; OBJECT_LIVE_RANGES (obj) = NULL; - CLEAR_HARD_REG_SET (OBJECT_PROFITABLE_HARD_REGS (obj)); + OBJECT_ADD_DATA (obj) = NULL; VEC_safe_push (ira_object_t, heap, ira_object_id_map_vec, obj); ira_object_id_map @@ -488,16 +488,9 @@ ira_create_allocno (int regno, bool cap_ ALLOCNO_NO_STACK_REG_P (a) = false; ALLOCNO_TOTAL_NO_STACK_REG_P (a) = false; #endif - ALLOCNO_MEM_OPTIMIZED_DEST (a) = NULL; - ALLOCNO_MEM_OPTIMIZED_DEST_P (a) = false; - ALLOCNO_SOMEWHERE_RENAMED_P (a) = false; - ALLOCNO_CHILD_RENAMED_P (a) = false; ALLOCNO_DONT_REASSIGN_P (a) = false; ALLOCNO_BAD_SPILL_P (a) = false; - ALLOCNO_IN_GRAPH_P (a) = false; ALLOCNO_ASSIGNED_P (a) = false; - ALLOCNO_MAY_BE_SPILLED_P (a) = false; - ALLOCNO_COLORABLE_P (a) = false; ALLOCNO_MODE (a) = (regno < 0 ? VOIDmode : PSEUDO_REGNO_MODE (regno)); ALLOCNO_COPIES (a) = NULL; ALLOCNO_HARD_REG_COSTS (a) = NULL; @@ -510,12 +503,9 @@ ira_create_allocno (int regno, bool cap_ ALLOCNO_MEMORY_COST (a) = 0; ALLOCNO_UPDATED_MEMORY_COST (a) = 0; ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a) = 0; - ALLOCNO_NEXT_BUCKET_ALLOCNO (a) = NULL; - ALLOCNO_PREV_BUCKET_ALLOCNO (a) = NULL; - ALLOCNO_FIRST_COALESCED_ALLOCNO (a) = a; - ALLOCNO_NEXT_COALESCED_ALLOCNO (a) = a; ALLOCNO_NUM_OBJECTS (a) = 0; + ALLOCNO_ADD_DATA (a) = NULL; VEC_safe_push (ira_allocno_t, heap, allocno_vec, a); ira_allocnos = VEC_address (ira_allocno_t, allocno_vec); ira_allocnos_num = VEC_length (ira_allocno_t, allocno_vec); @@ -2635,7 +2625,7 @@ copy_info_to_removed_store_destinations a != NULL; a = ALLOCNO_NEXT_REGNO_ALLOCNO (a)) { - if (a != regno_top_level_allocno_map[REGNO (ALLOCNO_REG (a))]) + if (a != regno_top_level_allocno_map[REGNO (allocno_emit_reg (a))]) /* This allocno will be removed. */ continue; @@ -2647,8 +2637,8 @@ copy_info_to_removed_store_destinations if ((parent_a = parent->regno_allocno_map[regno]) == NULL || (parent_a == regno_top_level_allocno_map[REGNO - (ALLOCNO_REG (parent_a))] - && ALLOCNO_MEM_OPTIMIZED_DEST_P (parent_a))) + (allocno_emit_reg (parent_a))] + && ALLOCNO_EMIT_DATA (parent_a)->mem_optimized_dest_p)) break; if (parent == NULL || parent_a == NULL) continue; @@ -2719,28 +2709,31 @@ ira_flattening (int max_regno_before_emi a != NULL; a = ALLOCNO_NEXT_REGNO_ALLOCNO (a)) { + ira_emit_data_t parent_data, data = ALLOCNO_EMIT_DATA (a); + ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL); - if (ALLOCNO_SOMEWHERE_RENAMED_P (a)) + if (data->somewhere_renamed_p) new_pseudos_p = true; parent_a = ira_parent_allocno (a); if (parent_a == NULL) { ALLOCNO_COPIES (a) = NULL; - regno_top_level_allocno_map[REGNO (ALLOCNO_REG (a))] = a; + regno_top_level_allocno_map[REGNO (data->reg)] = a; continue; } ira_assert (ALLOCNO_CAP_MEMBER (parent_a) == NULL); - if (ALLOCNO_MEM_OPTIMIZED_DEST (a) != NULL) + if (data->mem_optimized_dest != NULL) mem_dest_p = true; - if (REGNO (ALLOCNO_REG (a)) == REGNO (ALLOCNO_REG (parent_a))) + parent_data = ALLOCNO_EMIT_DATA (parent_a); + if (REGNO (data->reg) == REGNO (parent_data->reg)) { merge_hard_reg_conflicts (a, parent_a, true); move_allocno_live_ranges (a, parent_a); merged_p = true; - ALLOCNO_MEM_OPTIMIZED_DEST_P (parent_a) - = (ALLOCNO_MEM_OPTIMIZED_DEST_P (parent_a) - || ALLOCNO_MEM_OPTIMIZED_DEST_P (a)); + parent_data->mem_optimized_dest_p + = (parent_data->mem_optimized_dest_p + || data->mem_optimized_dest_p); continue; } new_pseudos_p = true; @@ -2776,7 +2769,7 @@ ira_flattening (int max_regno_before_emi break; } ALLOCNO_COPIES (a) = NULL; - regno_top_level_allocno_map[REGNO (ALLOCNO_REG (a))] = a; + regno_top_level_allocno_map[REGNO (data->reg)] = a; } if (mem_dest_p && copy_info_to_removed_store_destinations (i)) merged_p = true; @@ -2794,7 +2787,7 @@ ira_flattening (int max_regno_before_emi ira_allocno_object_iterator oi; ira_object_t obj; - if (a != regno_top_level_allocno_map[REGNO (ALLOCNO_REG (a))] + if (a != regno_top_level_allocno_map[REGNO (allocno_emit_reg (a))] || ALLOCNO_CAP_MEMBER (a) != NULL) continue; FOR_EACH_ALLOCNO_OBJECT (a, obj, oi) @@ -2812,7 +2805,7 @@ ira_flattening (int max_regno_before_emi ira_object_t obj = r->object; a = OBJECT_ALLOCNO (obj); - if (a != regno_top_level_allocno_map[REGNO (ALLOCNO_REG (a))] + if (a != regno_top_level_allocno_map[REGNO (allocno_emit_reg (a))] || ALLOCNO_CAP_MEMBER (a) != NULL) continue; @@ -2849,17 +2842,17 @@ ira_flattening (int max_regno_before_emi (ira_dump_file, " Remove cp%d:%c%dr%d-%c%dr%d\n", cp->num, ALLOCNO_CAP_MEMBER (cp->first) != NULL ? 'c' : 'a', ALLOCNO_NUM (cp->first), - REGNO (ALLOCNO_REG (cp->first)), + REGNO (allocno_emit_reg (cp->first)), ALLOCNO_CAP_MEMBER (cp->second) != NULL ? 'c' : 'a', ALLOCNO_NUM (cp->second), - REGNO (ALLOCNO_REG (cp->second))); + REGNO (allocno_emit_reg (cp->second))); cp->loop_tree_node = NULL; continue; } first - = regno_top_level_allocno_map[REGNO (ALLOCNO_REG (cp->first))]; + = regno_top_level_allocno_map[REGNO (allocno_emit_reg (cp->first))]; second - = regno_top_level_allocno_map[REGNO (ALLOCNO_REG (cp->second))]; + = regno_top_level_allocno_map[REGNO (allocno_emit_reg (cp->second))]; node = cp->loop_tree_node; if (node == NULL) keep_p = true; /* It copy generated in ira-emit.c. */ @@ -2869,10 +2862,10 @@ ira_flattening (int max_regno_before_emi which we will have different pseudos. */ node_first = node->regno_allocno_map[ALLOCNO_REGNO (cp->first)]; node_second = node->regno_allocno_map[ALLOCNO_REGNO (cp->second)]; - keep_p = ((REGNO (ALLOCNO_REG (first)) - == REGNO (ALLOCNO_REG (node_first))) - && (REGNO (ALLOCNO_REG (second)) - == REGNO (ALLOCNO_REG (node_second)))); + keep_p = ((REGNO (allocno_emit_reg (first)) + == REGNO (allocno_emit_reg (node_first))) + && (REGNO (allocno_emit_reg (second)) + == REGNO (allocno_emit_reg (node_second)))); } if (keep_p) { @@ -2886,25 +2879,25 @@ ira_flattening (int max_regno_before_emi if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL) fprintf (ira_dump_file, " Remove cp%d:a%dr%d-a%dr%d\n", cp->num, ALLOCNO_NUM (cp->first), - REGNO (ALLOCNO_REG (cp->first)), + REGNO (allocno_emit_reg (cp->first)), ALLOCNO_NUM (cp->second), - REGNO (ALLOCNO_REG (cp->second))); + REGNO (allocno_emit_reg (cp->second))); } } /* Remove unnecessary allocnos on lower levels of the loop tree. */ FOR_EACH_ALLOCNO (a, ai) { - if (a != regno_top_level_allocno_map[REGNO (ALLOCNO_REG (a))] + if (a != regno_top_level_allocno_map[REGNO (allocno_emit_reg (a))] || ALLOCNO_CAP_MEMBER (a) != NULL) { if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL) fprintf (ira_dump_file, " Remove a%dr%d\n", - ALLOCNO_NUM (a), REGNO (ALLOCNO_REG (a))); + ALLOCNO_NUM (a), REGNO (allocno_emit_reg (a))); finish_allocno (a); continue; } ALLOCNO_LOOP_TREE_NODE (a) = ira_loop_tree_root; - ALLOCNO_REGNO (a) = REGNO (ALLOCNO_REG (a)); + ALLOCNO_REGNO (a) = REGNO (allocno_emit_reg (a)); ALLOCNO_CAP (a) = NULL; /* Restore updated costs for assignments from reload. */ ALLOCNO_UPDATED_MEMORY_COST (a) = ALLOCNO_MEMORY_COST (a); --- t2/ira.c 2010-08-13 09:37:52.000000000 -0400 +++ ../../gcc/gcc/ira.c 2010-08-07 15:43:09.000000000 -0400 @@ -636,20 +636,6 @@ ira_allocate (size_t len) return res; } -/* Reallocate memory PTR of size LEN for IRA data. */ -void * -ira_reallocate (void *ptr, size_t len) -{ - void *res; - -#ifndef IRA_NO_OBSTACK - res = obstack_alloc (&ira_obstack, len); -#else - res = xrealloc (ptr, len); -#endif - return res; -} - /* Free memory ADDR allocated for IRA data. */ void ira_free (void *addr ATTRIBUTE_UNUSED) @@ -775,9 +761,8 @@ setup_pressure_classes (void) ira_prohibited_class_mode_regs[cl][m]); if (hard_reg_set_empty_p (temp_hard_regset)) continue; - cost = ira_get_register_move_cost ((enum machine_mode) m, - (enum reg_class) cl, - (enum reg_class) cl); + ira_init_register_move_cost_if_necessary ((enum machine_mode) m); + cost = ira_register_move_cost[m][cl][cl]; if (cost <= ira_max_memory_move_cost[m][cl][1] || cost <= ira_max_memory_move_cost[m][cl][0]) break; @@ -1818,7 +1803,7 @@ setup_allocno_assignment_flags (void) allocnos because the cost info and info about intersected calls are incorrect for them. */ ALLOCNO_ASSIGNED_P (a) = (hard_regno >= 0 - || ALLOCNO_MEM_OPTIMIZED_DEST_P (a) + || ALLOCNO_EMIT_DATA (a)->mem_optimized_dest_p || (ALLOCNO_MEMORY_COST (a) - ALLOCNO_CLASS_COST (a)) < 0); ira_assert (hard_regno < 0 @@ -3393,7 +3378,7 @@ ira (FILE *f) df_clear_flags (DF_NO_INSN_RESCAN); regstat_init_n_sets_and_refs (); regstat_compute_ri (); - + /* If we are not optimizing, then this is the only place before register allocation where dataflow is done. And that is needed to generate these warnings. */ @@ -3465,6 +3450,8 @@ ira (FILE *f) ira_max_point_before_emit = ira_max_point; + ira_initiate_emit_data (); + ira_emit (loops_p); if (ira_conflicts_p) @@ -3497,6 +3484,8 @@ ira (FILE *f) } } + ira_finish_emit_data (); + setup_reg_renumber (); calculate_allocation_cost (); --- t2/ira-color.c 2010-08-13 09:37:52.000000000 -0400 +++ ../../gcc/gcc/ira-color.c 2010-08-11 17:40:13.000000000 -0400 @@ -40,6 +40,119 @@ along with GCC; see the file COPYING3. #include "df.h" #include "ira-int.h" +/* See below. */ +typedef struct object_hard_regs *object_hard_regs_t; + +/* The structure contains information about hard registers can be + assigned to objects. Usually it is allocno profitable hard + registers but in some cases this set can be a bit different. Major + reason of the difference is a requirement to use hard register sets + that form a tree or a forest (set of trees), i.e. hard register set + of a node should contain hard register sets of its subnodes. */ +struct object_hard_regs +{ + /* Hard registers can be assigned to an allocno. */ + HARD_REG_SET set; + /* Overall (spilling) cost of all allocnos with given register + set. */ + long long int cost; +}; + +/* See below. */ +typedef struct object_hard_regs_node *object_hard_regs_node_t; + +/* A node representing object hard registers. Such nodes form a + forest (set of trees). Each subnode of given node in the forest + refers for hard register set (usually object profitable hard + register set) which is a subset of one referred from given + node. */ +struct object_hard_regs_node +{ + /* Set up number of the node in preorder traversing of the forest. */ + int preorder_num; + /* Used for different calculation like finding conflict size of an + allocno. */ + int check; + /* Used for calculation of conflict size of an allocno. The + conflict size of the allocno is maximal number of given object + hard registers needed for allocation of the conflicting allocnos. + Given allocno is trivially colored if this number plus the number + of hard registers needed for given allocno is not greater than + the number of given allocno hard register set. */ + int conflict_size; + /* The number of hard registers given by member hard_regs. */ + int hard_regs_num; + /* The following member is used to form the final forest. */ + bool used_p; + /* Pointer to the corresponding profitable hard registers. */ + object_hard_regs_t hard_regs; + /* Parent, first subnode, previous and next node with the same + parent in the forest. */ + object_hard_regs_node_t parent, first, prev, next; +}; + +/* To decrease footprint of ira_allocno structure we store all data + needed only for coloring in the following structure. */ +struct allocno_color_data +{ + /* TRUE value means that the allocno was not removed yet from the + conflicting graph during colouring. */ + unsigned int in_graph_p : 1; + /* TRUE if it is put on the stack to make other allocnos + colorable. */ + unsigned int may_be_spilled_p : 1; + /* TRUE if the object is trivially colorable. */ + unsigned int colorable_p : 1; + /* Number of hard registers of the allocno class really + available for the allocno allocation. It is number of the + profitable hard regs. */ + int available_regs_num; + /* Allocnos in a bucket (used in coloring) chained by the following + two members. */ + ira_allocno_t next_bucket_allocno; + ira_allocno_t prev_bucket_allocno; + /* Used for temporary purposes. */ + int temp; +}; + +/* See above. */ +typedef struct allocno_color_data *allocno_color_data_t; + +/* Container for storing allocno data concerning coloring. */ +static allocno_color_data_t allocno_color_data; + +/* Macro to access the data concerning coloring. */ +#define ALLOCNO_COLOR_DATA(a) ((allocno_color_data_t) ALLOCNO_ADD_DATA (a)) + +/* To decrease footprint of ira_object structure we store all data + needed only for coloring in the following structure. */ +struct object_color_data +{ + /* Profitable hard regs available for this pseudo allocation. It + means that the set excludes unavailable hard regs and hard regs + conflicting with given pseudo. They should be of the allocno + class. */ + HARD_REG_SET profitable_hard_regs; + /* The object hard registers node. */ + object_hard_regs_node_t hard_regs_node; + /* Array of structures object_hard_regs_subnode representing + given object hard registers node (the 1st element in the array) + and all its subnodes in the tree (forest) of object hard + register nodes (see comments above). */ + int hard_regs_subnodes_start; + /* The length of the previous array. */ + int hard_regs_subnodes_num; +}; + +/* See above. */ +typedef struct object_color_data *object_color_data_t; + +/* Container for storing object data concerning coloring. */ +static object_color_data_t object_color_data; + +/* Macro to access the data concerning coloring. */ +#define OBJECT_COLOR_DATA(o) ((object_color_data_t) OBJECT_ADD_DATA (o)) + /* This file contains code for regional graph coloring, spill/restore code placement optimization, and code helping the reload pass to do a better job. */ @@ -576,11 +689,12 @@ form_object_hard_regs_nodes_forest (void for (k = 0; k < ALLOCNO_NUM_OBJECTS (a); k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); - if (hard_reg_set_empty_p (OBJECT_PROFITABLE_HARD_REGS (obj))) + if (hard_reg_set_empty_p (obj_data->profitable_hard_regs)) continue; hv = (add_object_hard_regs - (OBJECT_PROFITABLE_HARD_REGS (obj), + (obj_data->profitable_hard_regs, ALLOCNO_MEMORY_COST (a) - ALLOCNO_CLASS_COST (a))); } } @@ -607,12 +721,13 @@ form_object_hard_regs_nodes_forest (void for (k = 0; k < ALLOCNO_NUM_OBJECTS (a); k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); - if (hard_reg_set_empty_p (OBJECT_PROFITABLE_HARD_REGS (obj))) + if (hard_reg_set_empty_p (obj_data->profitable_hard_regs)) continue; VEC_truncate (object_hard_regs_node_t, hard_regs_node_vec, 0); collect_object_hard_regs_cover (hard_regs_roots, - OBJECT_PROFITABLE_HARD_REGS (obj)); + obj_data->profitable_hard_regs); object_hard_regs_node = NULL; for (j = 0; VEC_iterate (object_hard_regs_node_t, hard_regs_node_vec, @@ -624,7 +739,7 @@ form_object_hard_regs_nodes_forest (void : first_common_ancestor_node (node, object_hard_regs_node)); /* That is a temporary storage. */ object_hard_regs_node->used_p = true; - OBJECT_HARD_REGS_NODE (obj) = object_hard_regs_node; + obj_data->hard_regs_node = object_hard_regs_node; } } ira_assert (hard_regs_roots->next == NULL); @@ -649,12 +764,13 @@ form_object_hard_regs_nodes_forest (void for (k = 0; k < ALLOCNO_NUM_OBJECTS (a); k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); - if (hard_reg_set_empty_p (OBJECT_PROFITABLE_HARD_REGS (obj))) + if (hard_reg_set_empty_p (obj_data->profitable_hard_regs)) continue; - len = get_object_hard_regs_subnodes_num (OBJECT_HARD_REGS_NODE (obj)); - OBJECT_HARD_REGS_SUBNODES_START (obj) = start; - OBJECT_HARD_REGS_SUBNODES_NUM (obj) = len; + len = get_object_hard_regs_subnodes_num (obj_data->hard_regs_node); + obj_data->hard_regs_subnodes_start = start; + obj_data->hard_regs_subnodes_num = len; start += len; } } @@ -702,9 +818,11 @@ static bool setup_left_conflict_sizes_p (ira_allocno_t a) { int k, nobj, conflict_size; + allocno_color_data_t data; nobj = ALLOCNO_NUM_OBJECTS (a); conflict_size = 0; + data = ALLOCNO_COLOR_DATA (a); for (k = 0; k < nobj; k++) { int i, node_preorder_num, start, left_conflict_subnodes_size; @@ -715,13 +833,13 @@ setup_left_conflict_sizes_p (ira_allocno ira_object_t obj = ALLOCNO_OBJECT (a, k); ira_object_t conflict_obj; ira_object_conflict_iterator oci; + object_color_data_t obj_data; node_check_tick++; - subnodes - = object_hard_regs_subnodes + OBJECT_HARD_REGS_SUBNODES_START (obj); - COPY_HARD_REG_SET (profitable_hard_regs, - OBJECT_PROFITABLE_HARD_REGS (obj)); - node = OBJECT_HARD_REGS_NODE (obj); + obj_data = OBJECT_COLOR_DATA (obj); + subnodes = object_hard_regs_subnodes + obj_data->hard_regs_subnodes_start; + COPY_HARD_REG_SET (profitable_hard_regs, obj_data->profitable_hard_regs); + node = obj_data->hard_regs_node; node_preorder_num = node->preorder_num; COPY_HARD_REG_SET (node_set, node->hard_regs->set); FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) @@ -730,13 +848,15 @@ setup_left_conflict_sizes_p (ira_allocno ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); object_hard_regs_node_t conflict_node, temp_node; HARD_REG_SET conflict_node_set; + object_color_data_t conflict_obj_data; - if (! ALLOCNO_IN_GRAPH_P (conflict_a) + conflict_obj_data = OBJECT_COLOR_DATA (conflict_obj); + if (! ALLOCNO_COLOR_DATA (conflict_a)->in_graph_p || ! hard_reg_set_intersect_p (profitable_hard_regs, - OBJECT_PROFITABLE_HARD_REGS - (conflict_obj))) + conflict_obj_data + ->profitable_hard_regs)) continue; - conflict_node = OBJECT_HARD_REGS_NODE (conflict_obj); + conflict_node = conflict_obj_data->hard_regs_node; COPY_HARD_REG_SET (conflict_node_set, conflict_node->hard_regs->set); if (hard_reg_set_subset_p (node_set, conflict_node_set)) temp_node = node; @@ -760,7 +880,7 @@ setup_left_conflict_sizes_p (ira_allocno } temp_node->conflict_size += size; } - for (i = 0; i < OBJECT_HARD_REGS_SUBNODES_NUM (obj); i++) + for (i = 0; i < obj_data->hard_regs_subnodes_num; i++) { object_hard_regs_node_t temp_node; @@ -788,7 +908,7 @@ setup_left_conflict_sizes_p (ira_allocno subnodes[i].left_conflict_subnodes_size = 0; } start = node_preorder_num * object_hard_regs_nodes_num; - for (i = OBJECT_HARD_REGS_SUBNODES_NUM (obj) - 1; i >= 0; i--) + for (i = obj_data->hard_regs_subnodes_num - 1; i >= 0; i--) { int size, parent_i; object_hard_regs_node_t parent; @@ -813,8 +933,8 @@ setup_left_conflict_sizes_p (ira_allocno subnodes[0].left_conflict_size)); } conflict_size += ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; - ALLOCNO_COLORABLE_P (a) = conflict_size <= ALLOCNO_AVAILABLE_REGS_NUM (a); - return ALLOCNO_COLORABLE_P (a); + data->colorable_p = conflict_size <= data->available_regs_num; + return data->colorable_p; } /* Update left conflict sizes of hard registers subnodes of allocno A @@ -828,16 +948,18 @@ update_left_conflict_sizes_p (ira_allocn int node_preorder_num, parent_i; object_hard_regs_node_t node, removed_node, parent; object_hard_regs_subnode_t subnodes; + allocno_color_data_t data = ALLOCNO_COLOR_DATA (a); bool colorable_p = true; - ira_assert (! ALLOCNO_COLORABLE_P (a)); + ira_assert (! data->colorable_p); for (k = 0; k < ALLOCNO_NUM_OBJECTS (a); k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); - node = OBJECT_HARD_REGS_NODE (obj); + node = obj_data->hard_regs_node; node_preorder_num = node->preorder_num; - removed_node = OBJECT_HARD_REGS_NODE (removed_obj); + removed_node = OBJECT_COLOR_DATA (removed_obj)->hard_regs_node; ira_assert (hard_reg_set_subset_p (removed_node->hard_regs->set, node->hard_regs->set) || hard_reg_set_subset_p (node->hard_regs->set, @@ -846,8 +968,7 @@ update_left_conflict_sizes_p (ira_allocn i = object_hard_regs_subnode_index[start + removed_node->preorder_num]; if (i < 0) i = 0; - subnodes - = object_hard_regs_subnodes + OBJECT_HARD_REGS_SUBNODES_START (obj); + subnodes = object_hard_regs_subnodes + obj_data->hard_regs_subnodes_start; before_conflict_size = (subnodes[i].left_conflict_subnodes_size + MIN (subnodes[i].max_node_impact @@ -882,7 +1003,7 @@ update_left_conflict_sizes_p (ira_allocn if (i != 0 || (conflict_size + ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)] - > ALLOCNO_AVAILABLE_REGS_NUM (a))) + > data->available_regs_num)) { colorable_p = false; break; @@ -890,7 +1011,7 @@ update_left_conflict_sizes_p (ira_allocn } if (colorable_p) { - ALLOCNO_COLORABLE_P (a) = true; + data->colorable_p = true; return true; } return false; @@ -907,8 +1028,9 @@ empty_profitable_hard_regs (ira_allocno_ for (k = 0; k < nobj; k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); - if (hard_reg_set_empty_p (OBJECT_PROFITABLE_HARD_REGS (obj))) + if (hard_reg_set_empty_p (obj_data->profitable_hard_regs)) return true; } return false; @@ -936,20 +1058,21 @@ setup_profitable_hard_regs (void) for (k = 0; k < nobj; k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); if (ALLOCNO_UPDATED_HARD_REG_COSTS (a) == NULL && ALLOCNO_CLASS_COST (a) > ALLOCNO_MEMORY_COST (a)) - CLEAR_HARD_REG_SET (OBJECT_PROFITABLE_HARD_REGS (obj)); + CLEAR_HARD_REG_SET (obj_data->profitable_hard_regs); else { - COPY_HARD_REG_SET (OBJECT_PROFITABLE_HARD_REGS (obj), + COPY_HARD_REG_SET (obj_data->profitable_hard_regs, reg_class_contents[aclass]); AND_COMPL_HARD_REG_SET - (OBJECT_PROFITABLE_HARD_REGS (obj), + (obj_data->profitable_hard_regs, ira_prohibited_class_mode_regs[aclass][mode]); - AND_COMPL_HARD_REG_SET (OBJECT_PROFITABLE_HARD_REGS (obj), + AND_COMPL_HARD_REG_SET (obj_data->profitable_hard_regs, ira_no_alloc_regs); - AND_COMPL_HARD_REG_SET (OBJECT_PROFITABLE_HARD_REGS (obj), + AND_COMPL_HARD_REG_SET (obj_data->profitable_hard_regs, OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)); } } @@ -978,16 +1101,16 @@ setup_profitable_hard_regs (void) if (WORDS_BIG_ENDIAN) CLEAR_HARD_REG_BIT - (OBJECT_PROFITABLE_HARD_REGS (conflict_obj), + (OBJECT_COLOR_DATA (conflict_obj)->profitable_hard_regs, hard_regno + nobj - num - 1); else CLEAR_HARD_REG_BIT - (OBJECT_PROFITABLE_HARD_REGS (conflict_obj), + (OBJECT_COLOR_DATA (conflict_obj)->profitable_hard_regs, hard_regno + num); } else AND_COMPL_HARD_REG_SET - (OBJECT_PROFITABLE_HARD_REGS (conflict_obj), + (OBJECT_COLOR_DATA (conflict_obj)->profitable_hard_regs, ira_reg_mode_hard_regset[hard_regno][mode]); } } @@ -1006,6 +1129,7 @@ setup_profitable_hard_regs (void) for (k = 0; k < nobj; k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); if ((costs = ALLOCNO_UPDATED_HARD_REG_COSTS (a)) != NULL || (costs = ALLOCNO_HARD_REG_COSTS (a)) != NULL) @@ -1024,11 +1148,11 @@ setup_profitable_hard_regs (void) else hard_regno += num; } - if (! TEST_HARD_REG_BIT (OBJECT_PROFITABLE_HARD_REGS (obj), + if (! TEST_HARD_REG_BIT (obj_data->profitable_hard_regs, hard_regno)) continue; if (ALLOCNO_UPDATED_MEMORY_COST (a) < costs[j]) - CLEAR_HARD_REG_BIT (OBJECT_PROFITABLE_HARD_REGS (obj), + CLEAR_HARD_REG_BIT (obj_data->profitable_hard_regs, hard_regno); else if (min_cost > costs[j]) min_cost = costs[j]; @@ -1036,7 +1160,7 @@ setup_profitable_hard_regs (void) } else if (ALLOCNO_UPDATED_MEMORY_COST (a) < ALLOCNO_UPDATED_CLASS_COST (a)) - CLEAR_HARD_REG_SET (OBJECT_PROFITABLE_HARD_REGS (obj)); + CLEAR_HARD_REG_SET (obj_data->profitable_hard_regs); } if (ALLOCNO_UPDATED_CLASS_COST (a) > min_cost) ALLOCNO_UPDATED_CLASS_COST (a) = min_cost; @@ -1183,6 +1307,7 @@ update_copy_costs (ira_allocno_t allocno do { mode = ALLOCNO_MODE (allocno); + ira_init_register_move_cost_if_necessary (mode); for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp) { if (cp->first == allocno) @@ -1204,8 +1329,8 @@ update_copy_costs (ira_allocno_t allocno continue; cost = (cp->second == allocno - ? ira_get_register_move_cost (mode, rclass, aclass) - : ira_get_register_move_cost (mode, aclass, rclass)); + ? ira_register_move_cost[mode][rclass][aclass] + : ira_register_move_cost[mode][aclass][rclass]); if (decr_p) cost = -cost; @@ -1267,7 +1392,7 @@ update_conflict_hard_regno_costs (int *c another_aclass = ALLOCNO_CLASS (another_allocno); if (! ira_reg_classes_intersect_p[aclass][another_aclass] || ALLOCNO_ASSIGNED_P (another_allocno) - || ALLOCNO_MAY_BE_SPILLED_P (another_allocno)) + || ALLOCNO_COLOR_DATA (another_allocno)->may_be_spilled_p) continue; class_size = ira_class_hard_regs_num[another_aclass]; ira_allocate_and_copy_costs @@ -1332,7 +1457,7 @@ setup_conflict_profitable_regs (ira_allo reg_class_contents[ALLOCNO_CLASS (a)]); else COPY_HARD_REG_SET (profitable_regs[i], - OBJECT_PROFITABLE_HARD_REGS (obj)); + OBJECT_COLOR_DATA (obj)->profitable_hard_regs); } } @@ -1448,7 +1573,8 @@ assign_hard_reg (ira_allocno_t a, bool r || ALLOCNO_HARD_REGNO (conflict_a) < 0) && !(hard_reg_set_intersect_p (profitable_hard_regs[word], - OBJECT_PROFITABLE_HARD_REGS (conflict_obj)))))) + OBJECT_COLOR_DATA + (conflict_obj)->profitable_hard_regs))))) continue; conflict_aclass = ALLOCNO_CLASS (conflict_a); ira_assert (ira_reg_classes_intersect_p @@ -1484,7 +1610,7 @@ assign_hard_reg (ira_allocno_t a, bool r } } else if (! retry_p - && ! ALLOCNO_MAY_BE_SPILLED_P (conflict_a)) + && ! ALLOCNO_COLOR_DATA (conflict_a)->may_be_spilled_p) { int k, *conflict_costs; @@ -1600,10 +1726,12 @@ static int uncolorable_allocnos_num; /* Return the current spill priority of allocno A. The less the number, the more preferable the allocno for spilling. */ -static int +static inline int allocno_spill_priority (ira_allocno_t a) { - return (ALLOCNO_TEMP (a) + allocno_color_data_t data = ALLOCNO_COLOR_DATA (a); + + return (data->temp / (ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a) * ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)] + 1)); @@ -1615,6 +1743,7 @@ static void add_allocno_to_bucket (ira_allocno_t a, ira_allocno_t *bucket_ptr) { ira_allocno_t first_a; + allocno_color_data_t data; if (bucket_ptr == &uncolorable_allocno_bucket && ALLOCNO_CLASS (a) != NO_REGS) @@ -1623,10 +1752,11 @@ add_allocno_to_bucket (ira_allocno_t a, ira_assert (uncolorable_allocnos_num > 0); } first_a = *bucket_ptr; - ALLOCNO_NEXT_BUCKET_ALLOCNO (a) = first_a; - ALLOCNO_PREV_BUCKET_ALLOCNO (a) = NULL; + data = ALLOCNO_COLOR_DATA (a); + data->next_bucket_allocno = first_a; + data->prev_bucket_allocno = NULL; if (first_a != NULL) - ALLOCNO_PREV_BUCKET_ALLOCNO (first_a) = a; + ALLOCNO_COLOR_DATA (first_a)->prev_bucket_allocno = a; *bucket_ptr = a; } @@ -1650,8 +1780,8 @@ bucket_allocno_compare_func (const void a2_freq = ALLOCNO_FREQ (a2); if ((diff = a1_freq - a2_freq) != 0) return diff; - a1_num = ALLOCNO_AVAILABLE_REGS_NUM (a1); - a2_num = ALLOCNO_AVAILABLE_REGS_NUM (a2); + a1_num = ALLOCNO_COLOR_DATA (a1)->available_regs_num; + a2_num = ALLOCNO_COLOR_DATA (a2)->available_regs_num; if ((diff = a2_num - a1_num) != 0) return diff; return ALLOCNO_NUM (a2) - ALLOCNO_NUM (a1); @@ -1668,7 +1798,7 @@ sort_bucket (ira_allocno_t *bucket_ptr, for (n = 0, a = *bucket_ptr; a != NULL; - a = ALLOCNO_NEXT_BUCKET_ALLOCNO (a)) + a = ALLOCNO_COLOR_DATA (a)->next_bucket_allocno) sorted_allocnos[n++] = a; if (n <= 1) return; @@ -1677,10 +1807,10 @@ sort_bucket (ira_allocno_t *bucket_ptr, for (n--; n >= 0; n--) { a = sorted_allocnos[n]; - ALLOCNO_NEXT_BUCKET_ALLOCNO (a) = head; - ALLOCNO_PREV_BUCKET_ALLOCNO (a) = NULL; + ALLOCNO_COLOR_DATA (a)->next_bucket_allocno = head; + ALLOCNO_COLOR_DATA (a)->prev_bucket_allocno = NULL; if (head != NULL) - ALLOCNO_PREV_BUCKET_ALLOCNO (head) = a; + ALLOCNO_COLOR_DATA (head)->prev_bucket_allocno = a; head = a; } *bucket_ptr = head; @@ -1704,17 +1834,17 @@ add_allocno_to_ordered_bucket (ira_alloc for (before = *bucket_ptr, after = NULL; before != NULL; after = before, - before = ALLOCNO_NEXT_BUCKET_ALLOCNO (before)) + before = ALLOCNO_COLOR_DATA (before)->next_bucket_allocno) if (bucket_allocno_compare_func (&allocno, &before) < 0) break; - ALLOCNO_NEXT_BUCKET_ALLOCNO (allocno) = before; - ALLOCNO_PREV_BUCKET_ALLOCNO (allocno) = after; + ALLOCNO_COLOR_DATA (allocno)->next_bucket_allocno = before; + ALLOCNO_COLOR_DATA (allocno)->prev_bucket_allocno = after; if (after == NULL) *bucket_ptr = allocno; else - ALLOCNO_NEXT_BUCKET_ALLOCNO (after) = allocno; + ALLOCNO_COLOR_DATA (after)->next_bucket_allocno = allocno; if (before != NULL) - ALLOCNO_PREV_BUCKET_ALLOCNO (before) = allocno; + ALLOCNO_COLOR_DATA (before)->prev_bucket_allocno = allocno; } /* Delete ALLOCNO from bucket *BUCKET_PTR. It should be there before @@ -1730,17 +1860,17 @@ delete_allocno_from_bucket (ira_allocno_ uncolorable_allocnos_num--; ira_assert (uncolorable_allocnos_num >= 0); } - prev_allocno = ALLOCNO_PREV_BUCKET_ALLOCNO (allocno); - next_allocno = ALLOCNO_NEXT_BUCKET_ALLOCNO (allocno); + prev_allocno = ALLOCNO_COLOR_DATA (allocno)->prev_bucket_allocno; + next_allocno = ALLOCNO_COLOR_DATA (allocno)->next_bucket_allocno; if (prev_allocno != NULL) - ALLOCNO_NEXT_BUCKET_ALLOCNO (prev_allocno) = next_allocno; + ALLOCNO_COLOR_DATA (prev_allocno)->next_bucket_allocno = next_allocno; else { ira_assert (*bucket_ptr == allocno); *bucket_ptr = next_allocno; } if (next_allocno != NULL) - ALLOCNO_PREV_BUCKET_ALLOCNO (next_allocno) = prev_allocno; + ALLOCNO_COLOR_DATA (next_allocno)->prev_bucket_allocno = prev_allocno; } /* Put allocno A onto the coloring stack without removing it from its @@ -1751,9 +1881,11 @@ static void push_allocno_to_stack (ira_allocno_t a) { enum reg_class aclass; + allocno_color_data_t data, conflict_data; int size, i, n = ALLOCNO_NUM_OBJECTS (a); - ALLOCNO_IN_GRAPH_P (a) = false; + data = ALLOCNO_COLOR_DATA (a); + data->in_graph_p = false; VEC_safe_push (ira_allocno_t, heap, allocno_stack_vec, a); aclass = ALLOCNO_CLASS (a); if (aclass == NO_REGS) @@ -1775,12 +1907,13 @@ push_allocno_to_stack (ira_allocno_t a) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); - if (ALLOCNO_COLORABLE_P (conflict_a) - || ! ALLOCNO_IN_GRAPH_P (conflict_a) + conflict_data = ALLOCNO_COLOR_DATA (conflict_a); + if (conflict_data->colorable_p + || ! conflict_data->in_graph_p || ALLOCNO_ASSIGNED_P (conflict_a) || !(hard_reg_set_intersect_p - (OBJECT_PROFITABLE_HARD_REGS (obj), - OBJECT_PROFITABLE_HARD_REGS (conflict_obj)))) + (OBJECT_COLOR_DATA (obj)->profitable_hard_regs, + OBJECT_COLOR_DATA (conflict_obj)->profitable_hard_regs))) continue; ira_assert (bitmap_bit_p (coloring_allocno_bitmap, ALLOCNO_NUM (conflict_a))); @@ -1817,15 +1950,15 @@ remove_allocno_from_bucket_and_push (ira ira_print_expanded_allocno (allocno); if (colorable_p) fprintf (ira_dump_file, "(cost %d)\n", - ALLOCNO_TEMP (allocno)); + ALLOCNO_COLOR_DATA (allocno)->temp); else fprintf (ira_dump_file, "(potential spill: %spri=%d, cost=%d)\n", ALLOCNO_BAD_SPILL_P (allocno) ? "bad spill, " : "", allocno_spill_priority (allocno), - ALLOCNO_TEMP (allocno)); + ALLOCNO_COLOR_DATA (allocno)->temp); } if (! colorable_p) - ALLOCNO_MAY_BE_SPILLED_P (allocno) = true; + ALLOCNO_COLOR_DATA (allocno)->may_be_spilled_p = true; push_allocno_to_stack (allocno); } @@ -1901,13 +2034,16 @@ calculate_allocno_spill_cost (ira_allocn + ira_memory_move_cost[mode][rclass][1] * ira_loop_edge_freq (loop_node, regno, false)); else - cost += ((ira_memory_move_cost[mode][rclass][1] - * ira_loop_edge_freq (loop_node, regno, true) - + ira_memory_move_cost[mode][rclass][0] - * ira_loop_edge_freq (loop_node, regno, false)) - - (ira_get_register_move_cost (mode, rclass, rclass) - * (ira_loop_edge_freq (loop_node, regno, false) - + ira_loop_edge_freq (loop_node, regno, true)))); + { + ira_init_register_move_cost_if_necessary (mode); + cost += ((ira_memory_move_cost[mode][rclass][1] + * ira_loop_edge_freq (loop_node, regno, true) + + ira_memory_move_cost[mode][rclass][0] + * ira_loop_edge_freq (loop_node, regno, false)) + - (ira_register_move_cost[mode][rclass][rclass] + * (ira_loop_edge_freq (loop_node, regno, false) + + ira_loop_edge_freq (loop_node, regno, true)))); + } return cost; } @@ -1926,7 +2062,7 @@ allocno_spill_priority_compare (ira_allo if ((diff = pri1 - pri2) != 0) return diff; if ((diff - = ALLOCNO_TEMP (a1) - ALLOCNO_TEMP (a2)) != 0) + = ALLOCNO_COLOR_DATA (a1)->temp - ALLOCNO_COLOR_DATA (a2)->temp) != 0) return diff; return ALLOCNO_NUM (a1) - ALLOCNO_NUM (a2); } @@ -1947,7 +2083,19 @@ static void push_allocnos_to_stack (void) { ira_allocno_t a; + int cost; + /* Calculate uncolorable allocno spill costs. */ + for (a = uncolorable_allocno_bucket; + a != NULL; + a = ALLOCNO_COLOR_DATA (a)->next_bucket_allocno) + if (ALLOCNO_CLASS (a) != NO_REGS) + { + cost = calculate_allocno_spill_cost (a); + /* ??? Remove cost of copies between the coalesced + allocnos. */ + ALLOCNO_COLOR_DATA (a)->temp = cost; + } sort_bucket (&uncolorable_allocno_bucket, allocno_spill_sort_compare); for (;;) { @@ -2001,7 +2149,7 @@ pop_allocnos_from_stack (void) if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) fprintf (ira_dump_file, "spill\n"); } - ALLOCNO_IN_GRAPH_P (allocno) = true; + ALLOCNO_COLOR_DATA (allocno)->in_graph_p = true; } } @@ -2012,9 +2160,11 @@ setup_allocno_available_regs_num (ira_al int i, j, n, hard_regno, hard_regs_num, nwords, nregs; enum reg_class aclass; enum machine_mode mode; + allocno_color_data_t data; aclass = ALLOCNO_CLASS (a); - ALLOCNO_AVAILABLE_REGS_NUM (a) = 0; + data = ALLOCNO_COLOR_DATA (a); + data->available_regs_num = 0; if (aclass == NO_REGS) return; hard_regs_num = ira_class_hard_regs_num[aclass]; @@ -2040,11 +2190,12 @@ setup_allocno_available_regs_num (ira_al for (k = set_to_test_start; k < set_to_test_end; k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); /* Checking only profitable hard regs. */ if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), hard_regno + j) - || ! TEST_HARD_REG_BIT (OBJECT_PROFITABLE_HARD_REGS (obj), + || ! TEST_HARD_REG_BIT (obj_data->profitable_hard_regs, hard_regno + j)) break; } @@ -2054,7 +2205,7 @@ setup_allocno_available_regs_num (ira_al if (j == nregs) n++; } - ALLOCNO_AVAILABLE_REGS_NUM (a) = n; + data->available_regs_num = n; if (internal_flag_ira_verbose <= 2 || ira_dump_file == NULL) return; fprintf @@ -2065,6 +2216,7 @@ setup_allocno_available_regs_num (ira_al for (i = 0; i < nwords; i++) { ira_object_t obj = ALLOCNO_OBJECT (a, i); + object_color_data_t obj_data = OBJECT_COLOR_DATA (obj); if (nwords != 1) { @@ -2072,18 +2224,16 @@ setup_allocno_available_regs_num (ira_al fprintf (ira_dump_file, ", "); fprintf (ira_dump_file, " obj %d", i); } - print_hard_reg_set (ira_dump_file, OBJECT_PROFITABLE_HARD_REGS (obj), - false); + print_hard_reg_set (ira_dump_file, obj_data->profitable_hard_regs, false); fprintf (ira_dump_file, " (confl regs = "); print_hard_reg_set (ira_dump_file, OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), false); fprintf (ira_dump_file, " ) %snode: ", - hard_reg_set_equal_p (OBJECT_PROFITABLE_HARD_REGS (obj), - OBJECT_HARD_REGS_NODE (obj) - ->hard_regs->set) + hard_reg_set_equal_p (obj_data->profitable_hard_regs, + obj_data->hard_regs_node->hard_regs->set) ? "" : "^"); print_hard_reg_set (ira_dump_file, - OBJECT_HARD_REGS_NODE (obj)->hard_regs->set, false); + obj_data->hard_regs_node->hard_regs->set, false); } fprintf (ira_dump_file, "\n"); @@ -2094,7 +2244,7 @@ setup_allocno_available_regs_num (ira_al static void put_allocno_into_bucket (ira_allocno_t allocno) { - ALLOCNO_IN_GRAPH_P (allocno) = true; + ALLOCNO_COLOR_DATA (allocno)->in_graph_p = true; setup_allocno_available_regs_num (allocno); if (setup_left_conflict_sizes_p (allocno)) add_allocno_to_bucket (allocno, &colorable_allocno_bucket); @@ -2183,12 +2333,12 @@ improve_allocation (void) bitmap_iterator bi; EXECUTE_IF_SET_IN_BITMAP (coloring_allocno_bitmap, 0, i, bi) - ALLOCNO_TEMP (ira_allocnos[i]) = 0; + ALLOCNO_COLOR_DATA (ira_allocnos[i])->temp = 0; check = n = 0; EXECUTE_IF_SET_IN_BITMAP (coloring_allocno_bitmap, 0, i, bi) { a = ira_allocnos[i]; - ALLOCNO_TEMP (a) = 0; + ALLOCNO_COLOR_DATA (a)->temp = 0; if (empty_profitable_hard_regs (a)) continue; check++; @@ -2234,9 +2384,9 @@ improve_allocation (void) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); - if (ALLOCNO_TEMP (conflict_a) == check) + if (ALLOCNO_COLOR_DATA (conflict_a)->temp == check) continue; - ALLOCNO_TEMP (conflict_a) = check; + ALLOCNO_COLOR_DATA (conflict_a)->temp = check; if ((conflict_hregno = ALLOCNO_HARD_REGNO (conflict_a)) < 0) continue; spill_cost = ALLOCNO_UPDATED_MEMORY_COST (conflict_a); @@ -2430,7 +2580,7 @@ color_allocnos (void) { a = ira_allocnos[i]; if (ALLOCNO_CLASS (a) != NO_REGS && ! empty_profitable_hard_regs (a)) - ALLOCNO_IN_GRAPH_P (a) = true; + ALLOCNO_COLOR_DATA (a)->in_graph_p = true; else { ALLOCNO_HARD_REGNO (a) = -1; @@ -2451,15 +2601,8 @@ color_allocnos (void) EXECUTE_IF_SET_IN_BITMAP (coloring_allocno_bitmap, 0, i, bi) { a = ira_allocnos[i]; - if (ALLOCNO_IN_GRAPH_P (a)) - { - int cost = calculate_allocno_spill_cost (a); - - /* ??? Remove cost of copies between the coalesced - allocnos. */ - ALLOCNO_TEMP (a) = cost; - put_allocno_into_bucket (a); - } + if (ALLOCNO_COLOR_DATA (a)->in_graph_p) + put_allocno_into_bucket (a); } push_allocnos_to_stack (); pop_allocnos_from_stack (); @@ -2531,7 +2674,7 @@ print_loop_title (ira_loop_tree_node_t l static void color_pass (ira_loop_tree_node_t loop_tree_node) { - int regno, hard_regno, index = -1; + int i, regno, hard_regno, index = -1, n, nobj; int cost, exit_freq, enter_freq; unsigned int j; bitmap_iterator bi; @@ -2546,13 +2689,36 @@ color_pass (ira_loop_tree_node_t loop_tr bitmap_copy (coloring_allocno_bitmap, loop_tree_node->all_allocnos); bitmap_copy (consideration_allocno_bitmap, coloring_allocno_bitmap); + n = nobj = 0; EXECUTE_IF_SET_IN_BITMAP (consideration_allocno_bitmap, 0, j, bi) { a = ira_allocnos[j]; + n++; + nobj += ALLOCNO_NUM_OBJECTS (a); if (! ALLOCNO_ASSIGNED_P (a)) continue; bitmap_clear_bit (coloring_allocno_bitmap, ALLOCNO_NUM (a)); } + allocno_color_data + = (allocno_color_data_t) ira_allocate (sizeof (struct allocno_color_data) + * n); + memset (allocno_color_data, 0, sizeof (struct allocno_color_data) * n); + object_color_data + = (object_color_data_t) ira_allocate (sizeof (struct object_color_data) + * nobj); + memset (object_color_data, 0, sizeof (struct object_color_data) * nobj); + n = nobj = 0; + EXECUTE_IF_SET_IN_BITMAP (consideration_allocno_bitmap, 0, j, bi) + { + a = ira_allocnos[j]; + ALLOCNO_ADD_DATA (a) = allocno_color_data + n; + n++; + for (i = 0; i < ALLOCNO_NUM_OBJECTS (a); i++) + { + OBJECT_ADD_DATA (ALLOCNO_OBJECT (a, i)) = object_color_data + nobj; + nobj++; + } + } /* Color all mentioned allocnos including transparent ones. */ color_allocnos (); /* Process caps. They are processed just once. */ @@ -2660,7 +2826,8 @@ color_pass (ira_loop_tree_node_t loop_tr else { aclass = ALLOCNO_CLASS (subloop_allocno); - cost = (ira_get_register_move_cost (mode, rclass, rclass) + ira_init_register_move_cost_if_necessary (mode); + cost = (ira_register_move_cost[mode][rclass][rclass] * (exit_freq + enter_freq)); ira_allocate_and_set_or_copy_costs (&ALLOCNO_UPDATED_HARD_REG_COSTS (subloop_allocno), aclass, @@ -2682,6 +2849,15 @@ color_pass (ira_loop_tree_node_t loop_tr } } } + ira_free (object_color_data); + ira_free (allocno_color_data); + EXECUTE_IF_SET_IN_BITMAP (coloring_allocno_bitmap, 0, j, bi) + { + a = ira_allocnos[j]; + ALLOCNO_ADD_DATA (a) = NULL; + for (i = 0; i < ALLOCNO_NUM_OBJECTS (a); i++) + OBJECT_ADD_DATA (a) = NULL; + } } /* Initialize the common data for coloring and calls functions to do @@ -2749,6 +2925,7 @@ move_spill_restore (void) - (ALLOCNO_HARD_REG_COSTS (a) == NULL ? ALLOCNO_CLASS_COST (a) : ALLOCNO_HARD_REG_COSTS (a)[index])); + ira_init_register_move_cost_if_necessary (mode); for (subloop_node = loop_node->subloops; subloop_node != NULL; subloop_node = subloop_node->subloop_next) @@ -2776,7 +2953,7 @@ move_spill_restore (void) += (ira_memory_move_cost[mode][rclass][0] * exit_freq + ira_memory_move_cost[mode][rclass][1] * enter_freq); if (hard_regno2 != hard_regno) - cost -= (ira_get_register_move_cost (mode, rclass, rclass) + cost -= (ira_register_move_cost[mode][rclass][rclass] * (exit_freq + enter_freq)); } } @@ -2795,7 +2972,7 @@ move_spill_restore (void) += (ira_memory_move_cost[mode][rclass][1] * exit_freq + ira_memory_move_cost[mode][rclass][0] * enter_freq); if (hard_regno2 != hard_regno) - cost -= (ira_get_register_move_cost (mode, rclass, rclass) + cost -= (ira_register_move_cost[mode][rclass][rclass] * (exit_freq + enter_freq)); } } @@ -2838,6 +3015,7 @@ update_curr_costs (ira_allocno_t a) if (aclass == NO_REGS) return; mode = ALLOCNO_MODE (a); + ira_init_register_move_cost_if_necessary (mode); for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { if (cp->first == a) @@ -2861,8 +3039,8 @@ update_curr_costs (ira_allocno_t a) if (i < 0) continue; cost = (cp->first == a - ? ira_get_register_move_cost (mode, rclass, aclass) - : ira_get_register_move_cost (mode, aclass, rclass)); + ? ira_register_move_cost[mode][rclass][aclass] + : ira_register_move_cost[mode][aclass][rclass]); ira_allocate_and_set_or_copy_costs (&ALLOCNO_UPDATED_HARD_REG_COSTS (a), aclass, ALLOCNO_CLASS_COST (a), ALLOCNO_HARD_REG_COSTS (a)); @@ -3038,6 +3216,27 @@ static bool allocno_coalesced_p; coalescing. */ static bitmap processed_coalesced_allocno_bitmap; +/* See below. */ +typedef struct coalesce_data *coalesce_data_t; + +/* To decrease footprint of ira_allocno structure we store all data + needed only for coalescing in the following structure. */ +struct coalesce_data +{ + /* Coalesced allocnos form a cyclic list. One allocno given by + FIRST represents all coalesced allocnos. The + list is chained by NEXT. */ + ira_allocno_t first; + ira_allocno_t next; + int temp; +}; + +/* Container for storing allocno data concerning coalescing. */ +static coalesce_data_t allocno_coalesce_data; + +/* Macro to access the data concerning coalescing. */ +#define ALLOCNO_COALESCE_DATA(a) ((coalesce_data_t) ALLOCNO_ADD_DATA (a)) + /* The function is used to sort allocnos according to their execution frequencies. */ static int @@ -3064,21 +3263,21 @@ merge_allocnos (ira_allocno_t a1, ira_al { ira_allocno_t a, first, last, next; - first = ALLOCNO_FIRST_COALESCED_ALLOCNO (a1); - a = ALLOCNO_FIRST_COALESCED_ALLOCNO (a2); + first = ALLOCNO_COALESCE_DATA (a1)->first; + a = ALLOCNO_COALESCE_DATA (a2)->first; if (first == a) return; - for (last = a2, a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a2);; - a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a)) + for (last = a2, a = ALLOCNO_COALESCE_DATA (a2)->next;; + a = ALLOCNO_COALESCE_DATA (a)->next) { - ALLOCNO_FIRST_COALESCED_ALLOCNO (a) = first; + ALLOCNO_COALESCE_DATA (a)->first = first; if (a == a2) break; last = a; } - next = ALLOCNO_NEXT_COALESCED_ALLOCNO (first); - ALLOCNO_NEXT_COALESCED_ALLOCNO (first) = a2; - ALLOCNO_NEXT_COALESCED_ALLOCNO (last) = next; + next = allocno_coalesce_data[ALLOCNO_NUM (first)].next; + allocno_coalesce_data[ALLOCNO_NUM (first)].next = a2; + allocno_coalesce_data[ALLOCNO_NUM (last)].next = next; } /* Return TRUE if there are conflicting allocnos from two sets of @@ -3094,19 +3293,19 @@ coalesced_allocno_conflict_p (ira_allocn if (allocno_coalesced_p) { bitmap_clear (processed_coalesced_allocno_bitmap); - for (a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a1);; - a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a)) + for (a = ALLOCNO_COALESCE_DATA (a1)->next;; + a = ALLOCNO_COALESCE_DATA (a)->next) { bitmap_set_bit (processed_coalesced_allocno_bitmap, ALLOCNO_NUM (a)); if (a == a1) break; } } - for (a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a2);; - a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a)) + for (a = ALLOCNO_COALESCE_DATA (a2)->next;; + a = ALLOCNO_COALESCE_DATA (a)->next) { - for (conflict_a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a1);; - conflict_a = ALLOCNO_NEXT_COALESCED_ALLOCNO (conflict_a)) + for (conflict_a = ALLOCNO_COALESCE_DATA (a1)->next;; + conflict_a = ALLOCNO_COALESCE_DATA (conflict_a)->next) { if (allocnos_conflict_by_live_ranges_p (a, conflict_a)) return true; @@ -3194,8 +3393,8 @@ coalesce_allocnos (void) for (n = 0; i < cp_num; i++) { cp = sorted_copies[i]; - if (ALLOCNO_FIRST_COALESCED_ALLOCNO (cp->first) - != ALLOCNO_FIRST_COALESCED_ALLOCNO (cp->second)) + if (allocno_coalesce_data[ALLOCNO_NUM (cp->first)].first + != allocno_coalesce_data[ALLOCNO_NUM (cp->second)].first) sorted_copies[n++] = cp; } cp_num = n; @@ -3293,18 +3492,18 @@ setup_coalesced_allocno_costs_and_nums ( regno_coalesced_allocno_num[regno] = ++num; continue; } - if (ALLOCNO_FIRST_COALESCED_ALLOCNO (allocno) != allocno) + if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno) continue; num++; - for (cost = 0, a = ALLOCNO_NEXT_COALESCED_ALLOCNO (allocno);; - a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a)) + for (cost = 0, a = ALLOCNO_COALESCE_DATA (allocno)->next;; + a = ALLOCNO_COALESCE_DATA (a)->next) { cost += ALLOCNO_FREQ (a); if (a == allocno) break; } - for (a = ALLOCNO_NEXT_COALESCED_ALLOCNO (allocno);; - a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a)) + for (a = ALLOCNO_COALESCE_DATA (allocno)->next;; + a = ALLOCNO_COALESCE_DATA (a)->next) { regno_coalesced_allocno_num[ALLOCNO_REGNO (a)] = num; regno_coalesced_allocno_cost[ALLOCNO_REGNO (a)] = cost; @@ -3331,7 +3530,7 @@ collect_spilled_coalesced_allocnos (int regno = pseudo_regnos[i]; allocno = ira_regno_allocno_map[regno]; if (allocno == NULL || ALLOCNO_HARD_REGNO (allocno) >= 0 - || ALLOCNO_FIRST_COALESCED_ALLOCNO (allocno) != allocno) + || ALLOCNO_COALESCE_DATA (allocno)->first != allocno) continue; spilled_coalesced_allocnos[num++] = allocno; } @@ -3351,8 +3550,8 @@ slot_coalesced_allocno_live_ranges_inter { ira_allocno_t a; - for (a = ALLOCNO_NEXT_COALESCED_ALLOCNO (allocno);; - a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a)) + for (a = ALLOCNO_COALESCE_DATA (allocno)->next;; + a = ALLOCNO_COALESCE_DATA (a)->next) { int i; int nr = ALLOCNO_NUM_OBJECTS (a); @@ -3381,9 +3580,9 @@ setup_slot_coalesced_allocno_live_ranges ira_allocno_t a; live_range_t r; - n = ALLOCNO_TEMP (allocno); - for (a = ALLOCNO_NEXT_COALESCED_ALLOCNO (allocno);; - a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a)) + n = ALLOCNO_COALESCE_DATA (allocno)->temp; + for (a = ALLOCNO_COALESCE_DATA (allocno)->next;; + a = ALLOCNO_COALESCE_DATA (a)->next) { int nr = ALLOCNO_NUM_OBJECTS (a); for (i = 0; i < nr; i++) @@ -3423,7 +3622,7 @@ coalesce_spill_slots (ira_allocno_t *spi for (i = 0; i < num; i++) { allocno = spilled_coalesced_allocnos[i]; - if (ALLOCNO_FIRST_COALESCED_ALLOCNO (allocno) != allocno + if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno || bitmap_bit_p (set_jump_crosses, ALLOCNO_REGNO (allocno)) || (ALLOCNO_REGNO (allocno) < ira_reg_equiv_len && (ira_reg_equiv_const[ALLOCNO_REGNO (allocno)] != NULL_RTX @@ -3432,8 +3631,8 @@ coalesce_spill_slots (ira_allocno_t *spi for (j = 0; j < i; j++) { a = spilled_coalesced_allocnos[j]; - n = ALLOCNO_TEMP (a); - if (ALLOCNO_FIRST_COALESCED_ALLOCNO (a) == a + n = ALLOCNO_COALESCE_DATA (a)->temp; + if (ALLOCNO_COALESCE_DATA (a)->first == a && ! bitmap_bit_p (set_jump_crosses, ALLOCNO_REGNO (a)) && (ALLOCNO_REGNO (a) >= ira_reg_equiv_len || (! ira_reg_equiv_invariant_p[ALLOCNO_REGNO (a)] @@ -3445,7 +3644,7 @@ coalesce_spill_slots (ira_allocno_t *spi { /* No coalescing: set up number for coalesced allocnos represented by ALLOCNO. */ - ALLOCNO_TEMP (allocno) = last_coalesced_allocno_num++; + ALLOCNO_COALESCE_DATA (allocno)->temp = last_coalesced_allocno_num++; setup_slot_coalesced_allocno_live_ranges (allocno); } else @@ -3457,10 +3656,11 @@ coalesce_spill_slots (ira_allocno_t *spi " Coalescing spilled allocnos a%dr%d->a%dr%d\n", ALLOCNO_NUM (allocno), ALLOCNO_REGNO (allocno), ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); - ALLOCNO_TEMP (allocno) = ALLOCNO_TEMP (a); + ALLOCNO_COALESCE_DATA (allocno)->temp + = ALLOCNO_COALESCE_DATA (a)->temp; setup_slot_coalesced_allocno_live_ranges (allocno); merge_allocnos (a, allocno); - ira_assert (ALLOCNO_FIRST_COALESCED_ALLOCNO (a) == a); + ira_assert (ALLOCNO_COALESCE_DATA (a)->first == a); } } for (i = 0; i < ira_allocnos_num; i++) @@ -3481,6 +3681,7 @@ ira_sort_regnos_for_alter_reg (int *pseu int max_regno = max_reg_num (); int i, regno, num, slot_num; ira_allocno_t allocno, a; + ira_allocno_iterator ai; ira_allocno_t *spilled_coalesced_allocnos; /* Set up allocnos can be coalesced. */ @@ -3494,6 +3695,16 @@ ira_sort_regnos_for_alter_reg (int *pseu } allocno_coalesced_p = false; processed_coalesced_allocno_bitmap = ira_allocate_bitmap (); + allocno_coalesce_data + = (coalesce_data_t) ira_allocate (sizeof (struct coalesce_data) + * ira_allocnos_num); + /* Initialize coalesce data for allocnos. */ + FOR_EACH_ALLOCNO (a, ai) + { + ALLOCNO_ADD_DATA (a) = allocno_coalesce_data + ALLOCNO_NUM (a); + ALLOCNO_COALESCE_DATA (a)->first = a; + ALLOCNO_COALESCE_DATA (a)->next = a; + } coalesce_allocnos (); ira_free_bitmap (coloring_allocno_bitmap); regno_coalesced_allocno_cost @@ -3531,7 +3742,7 @@ ira_sort_regnos_for_alter_reg (int *pseu for (i = 0; i < num; i++) { allocno = spilled_coalesced_allocnos[i]; - if (ALLOCNO_FIRST_COALESCED_ALLOCNO (allocno) != allocno + if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno || ALLOCNO_HARD_REGNO (allocno) >= 0 || (ALLOCNO_REGNO (allocno) < ira_reg_equiv_len && (ira_reg_equiv_const[ALLOCNO_REGNO (allocno)] != NULL_RTX @@ -3540,8 +3751,8 @@ ira_sort_regnos_for_alter_reg (int *pseu if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) fprintf (ira_dump_file, " Slot %d (freq,size):", slot_num); slot_num++; - for (a = ALLOCNO_NEXT_COALESCED_ALLOCNO (allocno);; - a = ALLOCNO_NEXT_COALESCED_ALLOCNO (a)) + for (a = ALLOCNO_COALESCE_DATA (allocno)->next;; + a = ALLOCNO_COALESCE_DATA (a)->next) { ira_assert (ALLOCNO_HARD_REGNO (a) < 0); ALLOCNO_HARD_REGNO (a) = -slot_num; @@ -3562,6 +3773,9 @@ ira_sort_regnos_for_alter_reg (int *pseu /* Sort regnos according the slot numbers. */ regno_max_ref_width = reg_max_ref_width; qsort (pseudo_regnos, n, sizeof (int), coalesced_pseudo_reg_slot_compare); + FOR_EACH_ALLOCNO (a, ai) + ALLOCNO_ADD_DATA (a) = NULL; + ira_free (allocno_coalesce_data); ira_free (regno_coalesced_allocno_num); ira_free (regno_coalesced_allocno_cost); } --- t2/ira-conflicts.c 2010-08-13 09:37:52.000000000 -0400 +++ ../../gcc/gcc/ira-conflicts.c 2010-08-03 18:00:38.000000000 -0400 @@ -206,6 +206,7 @@ allocnos_conflict_for_copy_p (ira_allocn the lowest order words. */ ira_object_t obj1 = ALLOCNO_OBJECT (a1, 0); ira_object_t obj2 = ALLOCNO_OBJECT (a2, 0); + return OBJECTS_CONFLICT_P (obj1, obj2); } @@ -427,7 +428,8 @@ process_regs_for_copy (rtx reg1, rtx reg return false; } - if (! IN_RANGE (allocno_preferenced_hard_regno, 0, FIRST_PSEUDO_REGISTER - 1)) + if (! IN_RANGE (allocno_preferenced_hard_regno, + 0, FIRST_PSEUDO_REGISTER - 1)) /* Can not be tied. */ return false; rclass = REGNO_REG_CLASS (allocno_preferenced_hard_regno); @@ -441,10 +443,11 @@ process_regs_for_copy (rtx reg1, rtx reg if (index < 0) /* Can not be tied. It is not in the allocno class. */ return false; + ira_init_register_move_cost_if_necessary (mode); if (HARD_REGISTER_P (reg1)) - cost = ira_get_register_move_cost (mode, aclass, rclass) * freq; + cost = ira_register_move_cost[mode][aclass][rclass] * freq; else - cost = ira_get_register_move_cost (mode, rclass, aclass) * freq; + cost = ira_register_move_cost[mode][rclass][aclass] * freq; do { ira_allocate_and_set_costs @@ -508,7 +511,8 @@ add_insn_allocno_copies (rtx insn) ? SET_SRC (set) : SUBREG_REG (SET_SRC (set))) != NULL_RTX) { - process_regs_for_copy (SET_DEST (set), SET_SRC (set), false, insn, freq); + process_regs_for_copy (SET_DEST (set), SET_SRC (set), + false, insn, freq); return; } /* Fast check of possibility of constraint or shuffle copies. If @@ -609,6 +613,7 @@ build_object_conflicts (ira_object_t obj ira_allocno_t a = OBJECT_ALLOCNO (obj); IRA_INT_TYPE *object_conflicts; minmax_set_iterator asi; + int parent_min, parent_max; object_conflicts = conflicts[OBJECT_CONFLICT_ID (obj)]; px = 0; @@ -617,6 +622,7 @@ build_object_conflicts (ira_object_t obj { ira_object_t another_obj = ira_object_id_map[i]; ira_allocno_t another_a = OBJECT_ALLOCNO (obj); + ira_assert (ira_reg_classes_intersect_p [ALLOCNO_CLASS (a)][ALLOCNO_CLASS (another_a)]); collected_conflict_objects[px++] = another_obj; @@ -633,6 +639,7 @@ build_object_conflicts (ira_object_t obj else { int conflict_bit_vec_words_num; + OBJECT_CONFLICT_ARRAY (obj) = object_conflicts; if (OBJECT_MAX (obj) < OBJECT_MIN (obj)) conflict_bit_vec_words_num = 0; @@ -651,6 +658,8 @@ build_object_conflicts (ira_object_t obj ira_assert (ALLOCNO_NUM_OBJECTS (a) == ALLOCNO_NUM_OBJECTS (parent_a)); parent_obj = ALLOCNO_OBJECT (parent_a, OBJECT_SUBWORD (obj)); parent_num = OBJECT_CONFLICT_ID (parent_obj); + parent_min = OBJECT_MIN (parent_obj); + parent_max = OBJECT_MAX (parent_obj); FOR_EACH_BIT_IN_MINMAX_SET (object_conflicts, OBJECT_MIN (obj), OBJECT_MAX (obj), i, asi) { @@ -671,9 +680,8 @@ build_object_conflicts (ira_object_t obj == ALLOCNO_NUM_OBJECTS (another_parent_a)); SET_MINMAX_SET_BIT (conflicts[parent_num], OBJECT_CONFLICT_ID (ALLOCNO_OBJECT (another_parent_a, - another_word)), - OBJECT_MIN (parent_obj), - OBJECT_MAX (parent_obj)); + another_word)), + parent_min, parent_max); } } @@ -876,6 +884,7 @@ ira_build_conflicts (void) FOR_EACH_ALLOCNO (a, ai) { int i, n = ALLOCNO_NUM_OBJECTS (a); + for (i = 0; i < n; i++) { ira_object_t obj = ALLOCNO_OBJECT (a, i); --- t3/ira-costs.c 2010-08-13 12:10:16.000000000 -0400 +++ ../../gcc/gcc/ira-costs.c 2010-08-11 18:04:51.000000000 -0400 @@ -488,31 +488,57 @@ record_reg_classes (int n_alts, int n_op Moreover, if we cannot tie them, this alternative needs to do a copy, which is one insn. */ struct costs *pp = this_op_costs[i]; + int *pp_costs = pp->cost; cost_classes_t cost_classes_ptr = regno_cost_classes[REGNO (op)]; enum reg_class *cost_classes = cost_classes_ptr->classes; + bool in_p = recog_data.operand_type[i] != OP_OUT; + bool out_p = recog_data.operand_type[i] != OP_IN; + enum reg_class op_class = classes[i]; + move_table *move_in_cost, *move_out_cost; - for (k = cost_classes_ptr->num - 1; k >= 0; k--) + ira_init_register_move_cost_if_necessary (mode); + if (! in_p) { - rclass = cost_classes[k]; - pp->cost[k] - = (((recog_data.operand_type[i] != OP_OUT - ? ira_get_may_move_cost (mode, rclass, - classes[i], true) : 0) - + (recog_data.operand_type[i] != OP_IN - ? ira_get_may_move_cost (mode, rclass, - classes[i], false) : 0)) - * frequency); + ira_assert (out_p); + move_out_cost = ira_may_move_out_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] + = move_out_cost[op_class][rclass] * frequency; + } + } + else if (! out_p) + { + ira_assert (in_p); + move_in_cost = ira_may_move_in_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] + = move_in_cost[rclass][op_class] * frequency; + } + } + else + { + move_in_cost = ira_may_move_in_cost[mode]; + move_out_cost = ira_may_move_out_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] = ((move_in_cost[rclass][op_class] + + move_out_cost[op_class][rclass]) + * frequency); + } } /* If the alternative actually allows memory, make things a bit cheaper since we won't need an extra insn to load it. */ pp->mem_cost - = ((recog_data.operand_type[i] != OP_IN - ? ira_memory_move_cost[mode][classes[i]][0] : 0) - + (recog_data.operand_type[i] != OP_OUT - ? ira_memory_move_cost[mode][classes[i]][1] : 0) + = ((out_p ? ira_memory_move_cost[mode][op_class][0] : 0) + + (in_p ? ira_memory_move_cost[mode][op_class][1] : 0) - allows_mem[i]) * frequency; /* If we have assigned a class to this allocno in @@ -525,17 +551,15 @@ record_reg_classes (int n_alts, int n_op if (pref_class == NO_REGS) alt_cost - += ((recog_data.operand_type[i] != OP_IN - ? ira_memory_move_cost[mode][classes[i]][0] - : 0) - + (recog_data.operand_type[i] != OP_OUT - ? ira_memory_move_cost[mode][classes[i]][1] + += ((out_p + ? ira_memory_move_cost[mode][op_class][0] : 0) + + (in_p + ? ira_memory_move_cost[mode][op_class][1] : 0)); else if (ira_reg_class_intersect - [pref_class][classes[i]] == NO_REGS) - alt_cost += ira_get_register_move_cost (mode, - pref_class, - classes[i]); + [pref_class][op_class] == NO_REGS) + alt_cost + += ira_register_move_cost[mode][pref_class][op_class]; } if (REGNO (ops[i]) != REGNO (ops[j]) && ! find_reg_note (insn, REG_DEAD, op)) @@ -732,30 +756,56 @@ record_reg_classes (int n_alts, int n_op { unsigned int regno = REGNO (op); struct costs *pp = this_op_costs[i]; + int *pp_costs = pp->cost; cost_classes_t cost_classes_ptr = regno_cost_classes[regno]; enum reg_class *cost_classes = cost_classes_ptr->classes; + bool in_p = recog_data.operand_type[i] != OP_OUT; + bool out_p = recog_data.operand_type[i] != OP_IN; + enum reg_class op_class = classes[i]; + move_table *move_in_cost, *move_out_cost; - for (k = cost_classes_ptr->num - 1; k >= 0; k--) + ira_init_register_move_cost_if_necessary (mode); + if (! in_p) { - rclass = cost_classes[k]; - pp->cost[k] - = (((recog_data.operand_type[i] != OP_OUT - ? ira_get_may_move_cost (mode, rclass, - classes[i], true) : 0) - + (recog_data.operand_type[i] != OP_IN - ? ira_get_may_move_cost (mode, classes[i], - rclass, false) : 0)) - * frequency); + ira_assert (out_p); + move_out_cost = ira_may_move_out_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] + = move_out_cost[op_class][rclass] * frequency; + } + } + else if (! out_p) + { + ira_assert (in_p); + move_in_cost = ira_may_move_in_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] + = move_in_cost[rclass][op_class] * frequency; + } + } + else + { + move_in_cost = ira_may_move_in_cost[mode]; + move_out_cost = ira_may_move_out_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] = ((move_in_cost[rclass][op_class] + + move_out_cost[op_class][rclass]) + * frequency); + } } /* If the alternative actually allows memory, make things a bit cheaper since we won't need an extra insn to load it. */ pp->mem_cost - = ((recog_data.operand_type[i] != OP_IN - ? ira_memory_move_cost[mode][classes[i]][0] : 0) - + (recog_data.operand_type[i] != OP_OUT - ? ira_memory_move_cost[mode][classes[i]][1] : 0) + = ((out_p ? ira_memory_move_cost[mode][op_class][0] : 0) + + (in_p ? ira_memory_move_cost[mode][op_class][1] : 0) - allows_mem[i]) * frequency; /* If we have assigned a class to this allocno in our first pass, add a cost to this alternative @@ -767,17 +817,14 @@ record_reg_classes (int n_alts, int n_op if (pref_class == NO_REGS) alt_cost - += ((recog_data.operand_type[i] != OP_IN - ? ira_memory_move_cost[mode][classes[i]][0] - : 0) - + (recog_data.operand_type[i] != OP_OUT - ? ira_memory_move_cost[mode][classes[i]][1] + += ((out_p + ? ira_memory_move_cost[mode][op_class][0] : 0) + + (in_p + ? ira_memory_move_cost[mode][op_class][1] : 0)); - else if (ira_reg_class_intersect[pref_class][classes[i]] + else if (ira_reg_class_intersect[pref_class][op_class] == NO_REGS) - alt_cost += ira_get_register_move_cost (mode, - pref_class, - classes[i]); + alt_cost += ira_register_move_cost[mode][pref_class][op_class]; } } } @@ -820,6 +867,7 @@ record_reg_classes (int n_alts, int n_op if (REG_P (ops[i]) && REGNO (ops[i]) >= FIRST_PSEUDO_REGISTER) { struct costs *pp = op_costs[i], *qq = this_op_costs[i]; + int *pp_costs = pp->cost, *qq_costs = qq->cost; int scale = 1 + (recog_data.operand_type[i] == OP_INOUT); cost_classes_t cost_classes_ptr = regno_cost_classes[REGNO (ops[i])]; @@ -828,8 +876,8 @@ record_reg_classes (int n_alts, int n_op (qq->mem_cost + op_cost_add) * scale); for (k = cost_classes_ptr->num - 1; k >= 0; k--) - pp->cost[k] - = MIN (pp->cost[k], (qq->cost[k] + op_cost_add) * scale); + pp_costs[k] + = MIN (pp_costs[k], (qq_costs[k] + op_cost_add) * scale); } } @@ -1085,10 +1133,12 @@ record_address_regs (enum machine_mode m case REG: { struct costs *pp; + int *pp_costs; enum reg_class i; int k, regno, add_cost; cost_classes_t cost_classes_ptr; enum reg_class *cost_classes; + move_table *move_in_cost; if (REGNO (x) < FIRST_PSEUDO_REGISTER) break; @@ -1104,14 +1154,17 @@ record_address_regs (enum machine_mode m pp->mem_cost += add_cost; cost_classes_ptr = regno_cost_classes[regno]; cost_classes = cost_classes_ptr->classes; + pp_costs = pp->cost; + ira_init_register_move_cost_if_necessary (Pmode); + move_in_cost = ira_may_move_in_cost[Pmode]; for (k = cost_classes_ptr->num - 1; k >= 0; k--) { i = cost_classes[k]; - add_cost = (ira_get_may_move_cost (Pmode, i, rclass, true) * scale) / 2; - if (INT_MAX - add_cost < pp->cost[k]) - pp->cost[k] = INT_MAX; + add_cost = (move_in_cost[i][rclass] * scale) / 2; + if (INT_MAX - add_cost < pp_costs[k]) + pp_costs[k] = INT_MAX; else - pp->cost[k] += add_cost; + pp_costs[k] += add_cost; } } break; @@ -1243,6 +1296,7 @@ scan_one_insn (rtx insn) int regno = REGNO (recog_data.operand[i]); struct costs *p = COSTS (costs, COST_INDEX (regno)); struct costs *q = op_costs[i]; + int *p_costs = p->cost, *q_costs = q->cost; cost_classes_t cost_classes_ptr = regno_cost_classes[regno]; int add_cost; @@ -1253,11 +1307,11 @@ scan_one_insn (rtx insn) p->mem_cost += add_cost; for (k = cost_classes_ptr->num - 1; k >= 0; k--) { - add_cost = q->cost[k]; - if (add_cost > 0 && INT_MAX - add_cost < p->cost[k]) - p->cost[k] = INT_MAX; + add_cost = q_costs[k]; + if (add_cost > 0 && INT_MAX - add_cost < p_costs[k]) + p_costs[k] = INT_MAX; else - p->cost[k] += add_cost; + p_costs[k] += add_cost; } } @@ -1500,6 +1554,8 @@ find_costs_and_classes (FILE *dump_file) #endif cost_classes_t cost_classes_ptr = regno_cost_classes[i]; enum reg_class *cost_classes = cost_classes_ptr->classes; + int *i_costs = temp_costs->cost; + int i_mem_cost; int equiv_savings = regno_equiv_gains[i]; if (! allocno_p) @@ -1510,17 +1566,21 @@ find_costs_and_classes (FILE *dump_file) inc_dec_p = in_inc_dec[i]; #endif memcpy (temp_costs, COSTS (costs, i), struct_costs_size); + i_mem_cost = temp_costs->mem_cost; } else { if (ira_regno_allocno_map[i] == NULL) continue; memset (temp_costs, 0, struct_costs_size); + i_mem_cost = 0; /* Find cost of all allocnos with the same regno. */ for (a = ira_regno_allocno_map[i]; a != NULL; a = ALLOCNO_NEXT_REGNO_ALLOCNO (a)) { + int *a_costs, *p_costs; + a_num = ALLOCNO_NUM (a); if ((flag_ira_region == IRA_REGION_ALL || flag_ira_region == IRA_REGION_MIXED) @@ -1534,14 +1594,15 @@ find_costs_and_classes (FILE *dump_file) /* Propagate costs to upper levels in the region tree. */ parent_a_num = ALLOCNO_NUM (parent_a); + a_costs = COSTS (total_allocno_costs, a_num)->cost; + p_costs = COSTS (total_allocno_costs, parent_a_num)->cost; for (k = cost_classes_ptr->num - 1; k >= 0; k--) { - add_cost = COSTS (total_allocno_costs, a_num)->cost[k]; - if (add_cost > 0 && INT_MAX - add_cost - < COSTS (total_allocno_costs, parent_a_num)->cost[k]) - COSTS (total_allocno_costs, parent_a_num)->cost[k] = INT_MAX; + add_cost = a_costs[k]; + if (add_cost > 0 && INT_MAX - add_cost < p_costs[k]) + p_costs[k] = INT_MAX; else - COSTS (total_allocno_costs, parent_a_num)->cost[k] += add_cost; + p_costs[k] += add_cost; } add_cost = COSTS (total_allocno_costs, a_num)->mem_cost; if (add_cost > 0 @@ -1555,19 +1616,20 @@ find_costs_and_classes (FILE *dump_file) += add_cost; } + a_costs = COSTS (costs, a_num)->cost; for (k = cost_classes_ptr->num - 1; k >= 0; k--) { - add_cost = COSTS (costs, a_num)->cost[k]; - if (add_cost > 0 && INT_MAX - add_cost < temp_costs->cost[k]) - temp_costs->cost[k] = INT_MAX; + add_cost = a_costs[k]; + if (add_cost > 0 && INT_MAX - add_cost < i_costs[k]) + i_costs[k] = INT_MAX; else - temp_costs->cost[k] += add_cost; + i_costs[k] += add_cost; } add_cost = COSTS (costs, a_num)->mem_cost; - if (add_cost && INT_MAX - add_cost < temp_costs->mem_cost) - temp_costs->mem_cost = INT_MAX; + if (add_cost && INT_MAX - add_cost < i_mem_cost) + i_mem_cost = INT_MAX; else - temp_costs->mem_cost += add_cost; + i_mem_cost += add_cost; #ifdef FORBIDDEN_INC_DEC_CLASSES if (in_inc_dec[a_num]) inc_dec_p = true; @@ -1580,7 +1642,7 @@ find_costs_and_classes (FILE *dump_file) { temp_costs->mem_cost = 0; for (k = cost_classes_ptr->num - 1; k >= 0; k--) - temp_costs->cost[k] += equiv_savings; + i_costs[k] += equiv_savings; } best_cost = (1 << (HOST_BITS_PER_INT - 2)) - 1; @@ -1603,21 +1665,21 @@ find_costs_and_classes (FILE *dump_file) #endif ) continue; - if (temp_costs->cost[k] < best_cost) + if (i_costs[k] < best_cost) { - best_cost = temp_costs->cost[k]; + best_cost = i_costs[k]; best = (enum reg_class) rclass; } - else if (temp_costs->cost[k] == best_cost) + else if (i_costs[k] == best_cost) best = ira_reg_class_subunion[best][rclass]; if (pass == flag_expensive_optimizations - && temp_costs->cost[k] < temp_costs->mem_cost + && i_costs[k] < i_mem_cost && (reg_class_size[reg_class_subunion[alt_class][rclass]] > reg_class_size[alt_class])) alt_class = reg_class_subunion[alt_class][rclass]; } alt_class = ira_allocno_class_translate[alt_class]; - if (best_cost > temp_costs->mem_cost) + if (best_cost > i_mem_cost) regno_aclass[i] = NO_REGS; else { @@ -1630,7 +1692,7 @@ find_costs_and_classes (FILE *dump_file) } if (pass == flag_expensive_optimizations) { - if (best_cost > temp_costs->mem_cost) + if (best_cost > i_mem_cost) best = alt_class = NO_REGS; else if (best == alt_class) alt_class = NO_REGS; @@ -1645,7 +1707,7 @@ find_costs_and_classes (FILE *dump_file) regno_best_class[i] = best; if (! allocno_p) { - pref[i] = best_cost > temp_costs->mem_cost ? NO_REGS : best; + pref[i] = best_cost > i_mem_cost ? NO_REGS : best; continue; } for (a = ira_regno_allocno_map[i]; @@ -1657,6 +1719,9 @@ find_costs_and_classes (FILE *dump_file) best = NO_REGS; else { + int *total_a_costs = COSTS (total_allocno_costs, a_num)->cost; + int *a_costs = COSTS (costs, a_num)->cost; + /* Finding best class which is subset of the common class. */ best_cost = (1 << (HOST_BITS_PER_INT - 2)) - 1; @@ -1680,16 +1745,16 @@ find_costs_and_classes (FILE *dump_file) #endif ) ; - else if (COSTS (total_allocno_costs, a_num)->cost[k] < best_cost) + else if (total_a_costs[k] < best_cost) { - best_cost = COSTS (total_allocno_costs, a_num)->cost[k]; - allocno_cost = COSTS (costs, a_num)->cost[k]; + best_cost = total_a_costs[k]; + allocno_cost = a_costs[k]; best = (enum reg_class) rclass; } - else if (COSTS (total_allocno_costs, a_num)->cost[k] == best_cost) + else if (total_a_costs[k] == best_cost) { best = ira_reg_class_subunion[best][rclass]; - allocno_cost = MAX (allocno_cost, COSTS (costs, a_num)->cost[k]); + allocno_cost = MAX (allocno_cost, a_costs[k]); } } ALLOCNO_CLASS_COST (a) = allocno_cost; @@ -1787,9 +1852,10 @@ process_bb_node_for_hard_reg_moves (ira_ continue; mode = ALLOCNO_MODE (a); hard_reg_class = REGNO_REG_CLASS (hard_regno); + ira_init_register_move_cost_if_necessary (mode); cost - = (to_p ? ira_get_register_move_cost (mode, hard_reg_class, rclass) - : ira_get_register_move_cost (mode, rclass, hard_reg_class)) * freq; + = (to_p ? ira_register_move_cost[mode][hard_reg_class][rclass] + : ira_register_move_cost[mode][rclass][hard_reg_class]) * freq; ira_allocate_and_set_costs (&ALLOCNO_HARD_REG_COSTS (a), rclass, ALLOCNO_CLASS_COST (a)); ira_allocate_and_set_costs (&ALLOCNO_CONFLICT_HARD_REG_COSTS (a), --- t2/ira-emit.c 2010-08-13 09:37:52.000000000 -0400 +++ ../../gcc/gcc/ira-emit.c 2010-08-10 14:06:51.000000000 -0400 @@ -44,6 +44,73 @@ along with GCC; see the file COPYING3. #include "ira-int.h" +/* Data used to emit live range split insns and to flattening IR. */ +ira_emit_data_t ira_allocno_emit_data; + +/* Definitions for vectors of pointers. */ +typedef void *void_p; +DEF_VEC_P (void_p); +DEF_VEC_ALLOC_P (void_p,heap); + +/* Pointers to data allocated for allocnos being created during + emitting. Usually there are quite few such allocnos because they + are created only for resolving loop in register shuffling. */ +static VEC(void_p, heap) *new_allocno_emit_data_vec; + +/* Allocate and initiate the emit data. */ +void +ira_initiate_emit_data (void) +{ + ira_allocno_t a; + ira_allocno_iterator ai; + + ira_allocno_emit_data + = (ira_emit_data_t) ira_allocate (ira_allocnos_num + * sizeof (struct ira_emit_data)); + memset (ira_allocno_emit_data, 0, + ira_allocnos_num * sizeof (struct ira_emit_data)); + FOR_EACH_ALLOCNO (a, ai) + ALLOCNO_ADD_DATA (a) = ira_allocno_emit_data + ALLOCNO_NUM (a); + new_allocno_emit_data_vec = VEC_alloc (void_p, heap, 50); + +} + +/* Free the emit data. */ +void +ira_finish_emit_data (void) +{ + void_p p; + ira_allocno_t a; + ira_allocno_iterator ai; + + ira_free (ira_allocno_emit_data); + FOR_EACH_ALLOCNO (a, ai) + ALLOCNO_ADD_DATA (a) = NULL; + for (;VEC_length (void_p, new_allocno_emit_data_vec) != 0;) + { + p = VEC_pop (void_p, new_allocno_emit_data_vec); + ira_free (p); + } + VEC_free (void_p, heap, new_allocno_emit_data_vec); +} + +/* Create and return a new allocno with given REGNO and + LOOP_TREE_NODE. Allocate emit data for it. */ +static ira_allocno_t +create_new_allocno (int regno, ira_loop_tree_node_t loop_tree_node) +{ + ira_allocno_t a; + + a = ira_create_allocno (regno, false, loop_tree_node); + ALLOCNO_ADD_DATA (a) = ira_allocate (sizeof (struct ira_emit_data)); + memset (ALLOCNO_ADD_DATA (a), 0, sizeof (struct ira_emit_data)); + VEC_safe_push (void_p, heap, new_allocno_emit_data_vec, ALLOCNO_ADD_DATA (a)); + return a; +} + + + +/* See comments below. */ typedef struct move *move_t; /* The structure represents an allocno move. Both allocnos have the @@ -171,7 +238,7 @@ change_regs (rtx *loc) return false; if (ira_curr_regno_allocno_map[regno] == NULL) return false; - reg = ALLOCNO_REG (ira_curr_regno_allocno_map[regno]); + reg = allocno_emit_reg (ira_curr_regno_allocno_map[regno]); if (reg == *loc) return false; *loc = reg; @@ -258,9 +325,9 @@ set_allocno_reg (ira_allocno_t allocno, a != NULL; a = ALLOCNO_NEXT_REGNO_ALLOCNO (a)) if (subloop_tree_node_p (ALLOCNO_LOOP_TREE_NODE (a), node)) - ALLOCNO_REG (a) = reg; + ALLOCNO_EMIT_DATA (a)->reg = reg; for (a = ALLOCNO_CAP (allocno); a != NULL; a = ALLOCNO_CAP (a)) - ALLOCNO_REG (a) = reg; + ALLOCNO_EMIT_DATA (a)->reg = reg; regno = ALLOCNO_REGNO (allocno); for (a = allocno;;) { @@ -273,9 +340,9 @@ set_allocno_reg (ira_allocno_t allocno, } if (a == NULL) continue; - if (ALLOCNO_CHILD_RENAMED_P (a)) + if (ALLOCNO_EMIT_DATA (a)->child_renamed_p) break; - ALLOCNO_CHILD_RENAMED_P (a) = true; + ALLOCNO_EMIT_DATA (a)->child_renamed_p = true; } } @@ -346,14 +413,14 @@ store_can_be_removed_p (ira_allocno_t sr ira_assert (ALLOCNO_CAP_MEMBER (src_allocno) == NULL && ALLOCNO_CAP_MEMBER (dest_allocno) == NULL); orig_regno = ALLOCNO_REGNO (src_allocno); - regno = REGNO (ALLOCNO_REG (dest_allocno)); + regno = REGNO (allocno_emit_reg (dest_allocno)); for (node = ALLOCNO_LOOP_TREE_NODE (src_allocno); node != NULL; node = node->parent) { a = node->regno_allocno_map[orig_regno]; ira_assert (a != NULL); - if (REGNO (ALLOCNO_REG (a)) == (unsigned) regno) + if (REGNO (allocno_emit_reg (a)) == (unsigned) regno) /* We achieved the destination and everything is ok. */ return true; else if (bitmap_bit_p (node->modified_regnos, orig_regno)) @@ -397,8 +464,8 @@ generate_edge_moves (edge e) { src_allocno = src_map[regno]; dest_allocno = dest_map[regno]; - if (REGNO (ALLOCNO_REG (src_allocno)) - == REGNO (ALLOCNO_REG (dest_allocno))) + if (REGNO (allocno_emit_reg (src_allocno)) + == REGNO (allocno_emit_reg (dest_allocno))) continue; /* Remove unnecessary stores at the region exit. We should do this for readonly memory for sure and this is guaranteed by @@ -409,8 +476,8 @@ generate_edge_moves (edge e) && ALLOCNO_HARD_REGNO (src_allocno) >= 0 && store_can_be_removed_p (src_allocno, dest_allocno)) { - ALLOCNO_MEM_OPTIMIZED_DEST (src_allocno) = dest_allocno; - ALLOCNO_MEM_OPTIMIZED_DEST_P (dest_allocno) = true; + ALLOCNO_EMIT_DATA (src_allocno)->mem_optimized_dest = dest_allocno; + ALLOCNO_EMIT_DATA (dest_allocno)->mem_optimized_dest_p = true; if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) fprintf (ira_dump_file, " Remove r%d:a%d->a%d(mem)\n", regno, ALLOCNO_NUM (src_allocno), @@ -500,9 +567,9 @@ change_loop (ira_loop_tree_node_t node) || ira_reg_equiv_invariant_p[regno] || ira_reg_equiv_const[regno] != NULL_RTX)) continue; - original_reg = ALLOCNO_REG (allocno); + original_reg = allocno_emit_reg (allocno); if (parent_allocno == NULL - || (REGNO (ALLOCNO_REG (parent_allocno)) + || (REGNO (allocno_emit_reg (parent_allocno)) == REGNO (original_reg))) { if (internal_flag_ira_verbose > 3 && ira_dump_file) @@ -527,11 +594,11 @@ change_loop (ira_loop_tree_node_t node) continue; used_p = bitmap_bit_p (used_regno_bitmap, regno); bitmap_set_bit (used_regno_bitmap, regno); - ALLOCNO_SOMEWHERE_RENAMED_P (allocno) = true; + ALLOCNO_EMIT_DATA (allocno)->somewhere_renamed_p = true; if (! used_p) continue; bitmap_set_bit (renamed_regno_bitmap, regno); - set_allocno_reg (allocno, create_new_reg (ALLOCNO_REG (allocno))); + set_allocno_reg (allocno, create_new_reg (allocno_emit_reg (allocno))); } } @@ -547,8 +614,8 @@ set_allocno_somewhere_renamed_p (void) { regno = ALLOCNO_REGNO (allocno); if (bitmap_bit_p (renamed_regno_bitmap, regno) - && REGNO (ALLOCNO_REG (allocno)) == regno) - ALLOCNO_SOMEWHERE_RENAMED_P (allocno) = true; + && REGNO (allocno_emit_reg (allocno)) == regno) + ALLOCNO_EMIT_DATA (allocno)->somewhere_renamed_p = true; } } @@ -728,7 +795,7 @@ modify_move_list (move_t list) subsequent IRA internal representation flattening. */ new_allocno - = ira_create_allocno (ALLOCNO_REGNO (set_move->to), false, + = create_new_allocno (ALLOCNO_REGNO (set_move->to), ALLOCNO_LOOP_TREE_NODE (set_move->to)); ALLOCNO_MODE (new_allocno) = ALLOCNO_MODE (set_move->to); ira_set_allocno_class (new_allocno, @@ -736,8 +803,8 @@ modify_move_list (move_t list) ira_create_allocno_objects (new_allocno); ALLOCNO_ASSIGNED_P (new_allocno) = true; ALLOCNO_HARD_REGNO (new_allocno) = -1; - ALLOCNO_REG (new_allocno) - = create_new_reg (ALLOCNO_REG (set_move->to)); + ALLOCNO_EMIT_DATA (new_allocno)->reg + = create_new_reg (allocno_emit_reg (set_move->to)); /* Make it possibly conflicting with all earlier created allocnos. Cases where temporary allocnos @@ -760,7 +827,7 @@ modify_move_list (move_t list) fprintf (ira_dump_file, " Creating temporary allocno a%dr%d\n", ALLOCNO_NUM (new_allocno), - REGNO (ALLOCNO_REG (new_allocno))); + REGNO (allocno_emit_reg (new_allocno))); } } if ((hard_regno = ALLOCNO_HARD_REGNO (to)) < 0) @@ -796,8 +863,8 @@ emit_move_list (move_t list, int freq) for (; list != NULL; list = list->next) { start_sequence (); - emit_move_insn (ALLOCNO_REG (list->to), - ALLOCNO_REG (list->from)); + emit_move_insn (allocno_emit_reg (list->to), + allocno_emit_reg (list->from)); list->insn = get_insns (); end_sequence (); /* The reload needs to have set up insn codes. If the reload @@ -828,8 +895,8 @@ emit_move_list (move_t list, int freq) } else { - cost = (ira_get_register_move_cost (mode, aclass, aclass) - * freq); + ira_init_register_move_cost_if_necessary (mode); + cost = ira_register_move_cost[mode][aclass][aclass] * freq; ira_shuffle_cost += cost; } ira_overall_cost += cost; @@ -961,7 +1028,7 @@ add_range_and_copies_from_move_list (mov { if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) fprintf (ira_dump_file, " Allocate conflicts for a%dr%d\n", - ALLOCNO_NUM (to), REGNO (ALLOCNO_REG (to))); + ALLOCNO_NUM (to), REGNO (allocno_emit_reg (to))); ira_allocate_object_conflicts (to_obj, n); } } @@ -974,9 +1041,9 @@ add_range_and_copies_from_move_list (mov if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) fprintf (ira_dump_file, " Adding cp%d:a%dr%d-a%dr%d\n", cp->num, ALLOCNO_NUM (cp->first), - REGNO (ALLOCNO_REG (cp->first)), + REGNO (allocno_emit_reg (cp->first)), ALLOCNO_NUM (cp->second), - REGNO (ALLOCNO_REG (cp->second))); + REGNO (allocno_emit_reg (cp->second))); nr = ALLOCNO_NUM_OBJECTS (from); for (i = 0; i < nr; i++) @@ -990,7 +1057,7 @@ add_range_and_copies_from_move_list (mov fprintf (ira_dump_file, " Adding range [%d..%d] to allocno a%dr%d\n", start, ira_max_point, ALLOCNO_NUM (from), - REGNO (ALLOCNO_REG (from))); + REGNO (allocno_emit_reg (from))); } else { @@ -999,7 +1066,7 @@ add_range_and_copies_from_move_list (mov fprintf (ira_dump_file, " Adding range [%d..%d] to allocno a%dr%d\n", r->start, ira_max_point, ALLOCNO_NUM (from), - REGNO (ALLOCNO_REG (from))); + REGNO (allocno_emit_reg (from))); } } ira_max_point++; @@ -1026,7 +1093,7 @@ add_range_and_copies_from_move_list (mov fprintf (ira_dump_file, " Adding range [%d..%d] to allocno a%dr%d\n", r->start, r->finish, ALLOCNO_NUM (move->to), - REGNO (ALLOCNO_REG (move->to))); + REGNO (allocno_emit_reg (move->to))); } } } @@ -1036,7 +1103,7 @@ add_range_and_copies_from_move_list (mov int nr, i; a = node->regno_allocno_map[regno]; - if ((to = ALLOCNO_MEM_OPTIMIZED_DEST (a)) != NULL) + if ((to = ALLOCNO_EMIT_DATA (a)->mem_optimized_dest) != NULL) a = to; nr = ALLOCNO_NUM_OBJECTS (a); for (i = 0; i < nr; i++) @@ -1050,7 +1117,7 @@ add_range_and_copies_from_move_list (mov " Adding range [%d..%d] to live through %s allocno a%dr%d\n", start, ira_max_point - 1, to != NULL ? "upper level" : "", - ALLOCNO_NUM (a), REGNO (ALLOCNO_REG (a))); + ALLOCNO_NUM (a), REGNO (allocno_emit_reg (a))); } } @@ -1103,7 +1170,7 @@ ira_emit (bool loops_p) ira_allocno_iterator ai; FOR_EACH_ALLOCNO (a, ai) - ALLOCNO_REG (a) = regno_reg_rtx[ALLOCNO_REGNO (a)]; + ALLOCNO_EMIT_DATA (a)->reg = regno_reg_rtx[ALLOCNO_REGNO (a)]; if (! loops_p) return; at_bb_start = (move_t *) ira_allocate (sizeof (move_t) * last_basic_block); --- t3/ira-int.h 2010-08-13 12:10:16.000000000 -0400 +++ ../../gcc/gcc/ira-int.h 2010-08-10 14:03:53.000000000 -0400 @@ -221,57 +221,6 @@ extern int ira_max_point; live ranges with given start/finish point. */ extern live_range_t *ira_start_point_ranges, *ira_finish_point_ranges; -/* See below. */ -typedef struct object_hard_regs *object_hard_regs_t; - -/* The structure contains information about hard registers can be - assigned to objects. Usually it is allocno profitable hard - registers but in some cases this set can be a bit different. Major - reason of the difference is a requirement to use hard register sets - that form a tree or a forest (set of trees), i.e. hard register set - of a node should contain hard register sets of its subnodes. */ -struct object_hard_regs -{ - /* Hard registers can be assigned to an allocno. */ - HARD_REG_SET set; - /* Overall (spilling) cost of all allocnos with given register - set. */ - long long int cost; -}; - -/* See below. */ -typedef struct object_hard_regs_node *object_hard_regs_node_t; - -/* A node representing object hard registers. Such nodes form a - forest (set of trees). Each subnode of given node in the forest - refers for hard register set (usually object profitable hard - register set) which is a subset of one referred from given - node. */ -struct object_hard_regs_node -{ - /* Set up number of the node in preorder traversing of the forest. */ - int preorder_num; - /* Used for different calculation like finding conflict size of an - allocno. */ - int check; - /* Used for calculation of conflict size of an allocno. The - conflict size of the allocno is maximal number of given object - hard registers needed for allocation of the conflicting allocnos. - Given allocno is trivially colored if this number plus the number - of hard registers needed for given allocno is not greater than - the number of given allocno hard register set. */ - int conflict_size; - /* The number of hard registers given by member hard_regs. */ - int hard_regs_num; - /* The following member is used to form the final forest. */ - bool used_p; - /* Pointer to the corresponding profitable hard registers. */ - object_hard_regs_t hard_regs; - /* Parent, first subnode, previous and next node with the same - parent in the forest. */ - object_hard_regs_node_t parent, first, prev, next; -}; - /* A structure representing conflict information for an allocno (or one of its subwords). */ struct ira_object @@ -313,20 +262,9 @@ struct ira_object ira_object structures. Otherwise, we use a bit vector indexed by conflict ID numbers. */ unsigned int conflict_vec_p : 1; - /* Profitable hard regs available for this pseudo allocation. It - means that the set excludes unavailable hard regs and hard regs - conflicting with given pseudo. They should be of the allocno - class. */ - HARD_REG_SET profitable_hard_regs; - /* The object hard registers node. */ - object_hard_regs_node_t hard_regs_node; - /* Array of structures object_hard_regs_subnode representing - given object hard registers node (the 1st element in the array) - and all its subnodes in the tree (forest) of object hard - register nodes (see comments above). */ - int hard_regs_subnodes_start; - /* The length of the previous array. */ - int hard_regs_subnodes_num; + /* Different additional data. It is used to decrease size of + allocno data footprint. */ + void *add_data; }; /* A structure representing an allocno (allocation entity). Allocno @@ -346,16 +284,40 @@ struct ira_allocno int regno; /* Mode of the allocno which is the mode of the corresponding pseudo-register. */ - enum machine_mode mode; + ENUM_BITFIELD (machine_mode) mode : 8; + /* Register class which should be used for allocation for given + allocno. NO_REGS means that we should use memory. */ + ENUM_BITFIELD (reg_class) aclass : 16; + /* During the reload, value TRUE means that we should not reassign a + hard register to the allocno got memory earlier. It is set up + when we removed memory-memory move insn before each iteration of + the reload. */ + unsigned int dont_reassign_p : 1; +#ifdef STACK_REGS + /* Set to TRUE if allocno can't be assigned to the stack hard + register correspondingly in this region and area including the + region and all its subregions recursively. */ + unsigned int no_stack_reg_p : 1, total_no_stack_reg_p : 1; +#endif + /* TRUE value means that there is no sense to spill the allocno + during coloring because the spill will result in additional + reloads in reload pass. */ + unsigned int bad_spill_p : 1; + /* TRUE if a hard register or memory has been assigned to the + allocno. */ + unsigned int assigned_p : 1; + /* TRUE if conflicts for given allocno are represented by vector of + pointers to the conflicting allocnos. Otherwise, we use a bit + vector where a bit with given index represents allocno with the + same number. */ + unsigned int conflict_vec_p : 1; /* Hard register assigned to given allocno. Negative value means that memory was allocated to the allocno. During the reload, spilled allocno has value equal to the corresponding stack slot number (0, ...) - 2. Value -1 is used for allocnos spilled by the reload (at this point pseudo-register has only one allocno) which did not get stack slot yet. */ - int hard_regno; - /* Final rtx representation of the allocno. */ - rtx reg; + short int hard_regno; /* Allocnos with the same regno are linked by the following member. Allocnos corresponding to inner loops are first in the list (it corresponds to depth-first traverse of the loops). */ @@ -373,9 +335,6 @@ struct ira_allocno int nrefs; /* Accumulated frequency of usage of the allocno. */ int freq; - /* Register class which should be used for allocation for given - allocno. NO_REGS means that we should use memory. */ - enum reg_class aclass; /* Minimal accumulated and updated costs of usage register of the allocno class. */ int class_cost, updated_class_cost; @@ -403,11 +362,6 @@ struct ira_allocno /* It is a link to allocno (cap) on lower loop level represented by given cap. Null if given allocno is not a cap. */ ira_allocno_t cap_member; - /* Coalesced allocnos form a cyclic list. One allocno given by - FIRST_COALESCED_ALLOCNO represents all coalesced allocnos. The - list is chained by NEXT_COALESCED_ALLOCNO. */ - ira_allocno_t first_coalesced_allocno; - ira_allocno_t next_coalesced_allocno; /* The number of objects tracked in the following array. */ int num_objects; /* An array of structures describing conflict information and live @@ -420,48 +374,6 @@ struct ira_allocno int call_freq; /* Accumulated number of the intersected calls. */ int calls_crossed_num; - /* TRUE if the allocno assigned to memory was a destination of - removed move (see ira-emit.c) at loop exit because the value of - the corresponding pseudo-register is not changed inside the - loop. */ - unsigned int mem_optimized_dest_p : 1; - /* TRUE if the corresponding pseudo-register has disjoint live - ranges and the other allocnos of the pseudo-register except this - one changed REG. */ - unsigned int somewhere_renamed_p : 1; - /* TRUE if allocno with the same REGNO in a subregion has been - renamed, in other words, got a new pseudo-register. */ - unsigned int child_renamed_p : 1; - /* During the reload, value TRUE means that we should not reassign a - hard register to the allocno got memory earlier. It is set up - when we removed memory-memory move insn before each iteration of - the reload. */ - unsigned int dont_reassign_p : 1; -#ifdef STACK_REGS - /* Set to TRUE if allocno can't be assigned to the stack hard - register correspondingly in this region and area including the - region and all its subregions recursively. */ - unsigned int no_stack_reg_p : 1, total_no_stack_reg_p : 1; -#endif - /* TRUE value means that there is no sense to spill the allocno - during coloring because the spill will result in additional - reloads in reload pass. */ - unsigned int bad_spill_p : 1; - /* TRUE value means that the allocno was not removed yet from the - conflicting graph during colouring. */ - unsigned int in_graph_p : 1; - /* TRUE if a hard register or memory has been assigned to the - allocno. */ - unsigned int assigned_p : 1; - /* TRUE if it is put on the stack to make other allocnos - colorable. */ - unsigned int may_be_spilled_p : 1; - /* TRUE if the allocno is trivially colorable. */ - unsigned int colorable_p : 1; - /* Non NULL if we remove restoring value from given allocno to - MEM_OPTIMIZED_DEST at loop exit (see ira-emit.c) because the - allocno value is not changed inside the loop. */ - ira_allocno_t mem_optimized_dest; /* Array of usage costs (accumulated and the one updated during coloring) for each hard register of the allocno class. The member value can be NULL if all costs are the same and equal to @@ -484,15 +396,9 @@ struct ira_allocno of other allocnos not assigned yet during assigning to given allocno. */ int *conflict_hard_reg_costs, *updated_conflict_hard_reg_costs; - /* Number of hard registers of the allocno class really available - for the allocno allocation. */ - int available_regs_num; - /* Allocnos in a bucket (used in coloring) chained by the following - two members. */ - ira_allocno_t next_bucket_allocno; - ira_allocno_t prev_bucket_allocno; - /* Used for temporary purposes. */ - int temp; + /* Different additional data. It is used to decrease size of + allocno data footprint. */ + void *add_data; }; @@ -520,10 +426,7 @@ struct ira_allocno #define ALLOCNO_TOTAL_NO_STACK_REG_P(A) ((A)->total_no_stack_reg_p) #endif #define ALLOCNO_BAD_SPILL_P(A) ((A)->bad_spill_p) -#define ALLOCNO_IN_GRAPH_P(A) ((A)->in_graph_p) #define ALLOCNO_ASSIGNED_P(A) ((A)->assigned_p) -#define ALLOCNO_MAY_BE_SPILLED_P(A) ((A)->may_be_spilled_p) -#define ALLOCNO_COLORABLE_P(A) ((A)->colorable_p) #define ALLOCNO_MODE(A) ((A)->mode) #define ALLOCNO_COPIES(A) ((A)->allocno_copies) #define ALLOCNO_HARD_REG_COSTS(A) ((A)->hard_reg_costs) @@ -539,14 +442,48 @@ struct ira_allocno #define ALLOCNO_UPDATED_MEMORY_COST(A) ((A)->updated_memory_cost) #define ALLOCNO_EXCESS_PRESSURE_POINTS_NUM(A) \ ((A)->excess_pressure_points_num) -#define ALLOCNO_AVAILABLE_REGS_NUM(A) ((A)->available_regs_num) -#define ALLOCNO_NEXT_BUCKET_ALLOCNO(A) ((A)->next_bucket_allocno) -#define ALLOCNO_PREV_BUCKET_ALLOCNO(A) ((A)->prev_bucket_allocno) -#define ALLOCNO_TEMP(A) ((A)->temp) -#define ALLOCNO_FIRST_COALESCED_ALLOCNO(A) ((A)->first_coalesced_allocno) -#define ALLOCNO_NEXT_COALESCED_ALLOCNO(A) ((A)->next_coalesced_allocno) #define ALLOCNO_OBJECT(A,N) ((A)->objects[N]) #define ALLOCNO_NUM_OBJECTS(A) ((A)->num_objects) +#define ALLOCNO_ADD_DATA(A) ((A)->add_data) + +/* Typedef for pointer to the subsequent structure. */ +typedef struct ira_emit_data *ira_emit_data_t; + +/* Allocno bound data used for emit pseudo live range split insns and + to flattening IR. */ +struct ira_emit_data +{ + /* TRUE if the allocno assigned to memory was a destination of + removed move (see ira-emit.c) at loop exit because the value of + the corresponding pseudo-register is not changed inside the + loop. */ + unsigned int mem_optimized_dest_p : 1; + /* TRUE if the corresponding pseudo-register has disjoint live + ranges and the other allocnos of the pseudo-register except this + one changed REG. */ + unsigned int somewhere_renamed_p : 1; + /* TRUE if allocno with the same REGNO in a subregion has been + renamed, in other words, got a new pseudo-register. */ + unsigned int child_renamed_p : 1; + /* Final rtx representation of the allocno. */ + rtx reg; + /* Non NULL if we remove restoring value from given allocno to + MEM_OPTIMIZED_DEST at loop exit (see ira-emit.c) because the + allocno value is not changed inside the loop. */ + ira_allocno_t mem_optimized_dest; +}; + +#define ALLOCNO_EMIT_DATA(a) ((ira_emit_data_t) ALLOCNO_ADD_DATA (a)) + +/* Data used to emit live range split insns and to flattening IR. */ +extern ira_emit_data_t ira_allocno_emit_data; + +/* Abbreviation for frequent emit data access. */ +static inline rtx +allocno_emit_reg (ira_allocno_t a) +{ + return ALLOCNO_EMIT_DATA (a)->reg; +} #define OBJECT_ALLOCNO(O) ((O)->allocno) #define OBJECT_SUBWORD(O) ((O)->subword) @@ -562,10 +499,7 @@ struct ira_allocno #define OBJECT_MAX(O) ((O)->max) #define OBJECT_CONFLICT_ID(O) ((O)->id) #define OBJECT_LIVE_RANGES(O) ((O)->live_ranges) -#define OBJECT_PROFITABLE_HARD_REGS(O) ((O)->profitable_hard_regs) -#define OBJECT_HARD_REGS_NODE(O) ((O)->hard_regs_node) -#define OBJECT_HARD_REGS_SUBNODES_START(O) ((O)->hard_regs_subnodes_start) -#define OBJECT_HARD_REGS_SUBNODES_NUM(O) ((O)->hard_regs_subnodes_num) +#define OBJECT_ADD_DATA(O) ((O)->add_data) /* Map regno -> allocnos with given regno (see comments for allocno member `next_regno_allocno'). */ @@ -997,7 +931,6 @@ extern struct target_ira_int *this_targe /* ira.c: */ extern void *ira_allocate (size_t); -extern void *ira_reallocate (void *, size_t); extern void ira_free (void *addr); extern bitmap ira_allocate_bitmap (void); extern void ira_free_bitmap (bitmap); @@ -1094,36 +1027,20 @@ extern void ira_finish_assign (void); extern void ira_color (void); /* ira-emit.c */ +extern void ira_initiate_emit_data (void); +extern void ira_finish_emit_data (void); extern void ira_emit (bool); -/* Return cost of moving value of MODE from register of class FROM to - register of class TO. */ -static inline int -ira_get_register_move_cost (enum machine_mode mode, - enum reg_class from, enum reg_class to) +/* Initialize register costs for MODE if necessary. */ +static inline void +ira_init_register_move_cost_if_necessary (enum machine_mode mode) { if (ira_register_move_cost[mode] == NULL) ira_init_register_move_cost (mode); - return ira_register_move_cost[mode][from][to]; } -/* Return cost of moving value of MODE from register of class FROM to - register of class TO. Return zero if IN_P is true and FROM is - subset of TO or if IN_P is false and FROM is superset of TO. */ -static inline int -ira_get_may_move_cost (enum machine_mode mode, - enum reg_class from, enum reg_class to, - bool in_p) -{ - if (ira_register_move_cost[mode] == NULL) - ira_init_register_move_cost (mode); - return (in_p - ? ira_may_move_in_cost[mode][from][to] - : ira_may_move_out_cost[mode][from][to]); -} - /* The iterator for all allocnos. */ @@ -1299,7 +1216,7 @@ typedef struct { unsigned IRA_INT_TYPE word; } ira_object_conflict_iterator; -/* Initialize the iterator I with OBJ conflicts. */ +/* Initialize the iterator I with ALLOCNO conflicts. */ static inline void ira_object_conflict_iter_init (ira_object_conflict_iterator *i, ira_object_t obj) @@ -1323,8 +1240,8 @@ ira_object_conflict_iter_init (ira_objec } } -/* Return TRUE if we have more conflicting objects to visit, in which - case *POBJ is set to the object to be visited. Otherwise, return +/* Return TRUE if we have more conflicting allocnos to visit, in which + case *A is set to the allocno to be visited. Otherwise, return FALSE. */ static inline bool ira_object_conflict_iter_cond (ira_object_conflict_iterator *i, @@ -1334,14 +1251,17 @@ ira_object_conflict_iter_cond (ira_objec if (i->conflict_vec_p) { - obj = ((ira_object_t *) i->vec)[i->word_num]; + obj = ((ira_object_t *) i->vec)[i->word_num++]; if (obj == NULL) return false; } else { + unsigned IRA_INT_TYPE word = i->word; + unsigned int bit_num = i->bit_num; + /* Skip words that are zeros. */ - for (; i->word == 0; i->word = ((IRA_INT_TYPE *) i->vec)[i->word_num]) + for (; word == 0; word = ((IRA_INT_TYPE *) i->vec)[i->word_num]) { i->word_num++; @@ -1349,40 +1269,28 @@ ira_object_conflict_iter_cond (ira_objec if (i->word_num * sizeof (IRA_INT_TYPE) >= i->size) return false; - i->bit_num = i->word_num * IRA_INT_BITS; + bit_num = i->word_num * IRA_INT_BITS; } /* Skip bits that are zero. */ - for (; (i->word & 1) == 0; i->word >>= 1) - i->bit_num++; + for (; (word & 1) == 0; word >>= 1) + bit_num++; - obj = ira_object_id_map[i->bit_num + i->base_conflict_id]; + obj = ira_object_id_map[bit_num + i->base_conflict_id]; + i->bit_num = bit_num + 1; + i->word = word >> 1; } *pobj = obj; return true; } -/* Advance to the next conflicting object. */ -static inline void -ira_object_conflict_iter_next (ira_object_conflict_iterator *i) -{ - if (i->conflict_vec_p) - i->word_num++; - else - { - i->word >>= 1; - i->bit_num++; - } -} - /* Loop over all objects conflicting with OBJ. In each iteration, CONF is set to the next conflicting object. ITER is an instance of ira_object_conflict_iterator used to iterate the conflicts. */ #define FOR_EACH_OBJECT_CONFLICT(OBJ, CONF, ITER) \ for (ira_object_conflict_iter_init (&(ITER), (OBJ)); \ - ira_object_conflict_iter_cond (&(ITER), &(CONF)); \ - ira_object_conflict_iter_next (&(ITER))) + ira_object_conflict_iter_cond (&(ITER), &(CONF));) --- t2/ira-lives.c 2010-08-13 09:37:52.000000000 -0400 +++ ../../gcc/gcc/ira-lives.c 2010-08-10 22:37:54.000000000 -0400 @@ -953,11 +953,11 @@ process_single_reg_class_operands (bool && reg_class_size[cl] <= (unsigned) CLASS_MAX_NREGS (cl, mode)) { int i, size; - cost - = (freq - * (in_p - ? ira_get_register_move_cost (mode, aclass, cl) - : ira_get_register_move_cost (mode, cl, aclass))); + + ira_init_register_move_cost_if_necessary (mode); + cost = freq * (in_p + ? ira_register_move_cost[mode][aclass][cl] + : ira_register_move_cost[mode][cl][aclass]); ira_allocate_and_set_costs (&ALLOCNO_CONFLICT_HARD_REG_COSTS (operand_a), aclass, 0); size = ira_reg_class_max_nregs[aclass][mode]; --------------040302030206090201090801--