* [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) @ 2017-06-19 18:25 Jakub Jelinek 2017-06-20 7:41 ` Richard Biener 0 siblings, 1 reply; 21+ messages in thread From: Jakub Jelinek @ 2017-06-19 18:25 UTC (permalink / raw) To: Richard Biener, Martin Liška; +Cc: gcc-patches Hi! The following patch adds -fsanitize=pointer-overflow support, which adds instrumentation (included in -fsanitize=undefined) that checks that pointer arithmetics doesn't wrap. If the offset on ptr p+ off when treating it as signed value is non-negative, we check whether the result is bigger (uintptr_t comparison) than ptr, if it is negative in ssizetype, we check whether the result is smaller than ptr, otherwise we check at runtime whether (ssizetype) off < 0 and do the check based on that. The patch checks both POINTER_PLUS_EXPR, as well as e.g. ADDR_EXPR of handled components, and even handled components themselves (exception is for constant offset when the base is an automatic non-VLA decl or decl that binds to current function where we can at compile time for sure guarantee it will fit). Martin has said he'll write the sanopt part of optimization (if UBSAN_PTR for some pointer is dominated by UBSAN_PTR for the same pointer and the offset is constant in both cases and equal or absolute value bigger and same sign in the dominating UBSAN_PTR, we can avoid the dominated check). For the cases where there is a dereference (i.e. not ADDR_EXPR of the handled component or POINTER_PLUS_EXPR), I wonder if we couldn't ignore say constant offsets in range <-4096, 4096> or something similar, hoping people don't have anything mapped at the page 0 and -pagesize in hosted env. Thoughts on that? I've bootstrapped/regtested the patch on x86_64-linux and i686-linux and additionally bootstrapped/regtested with bootstrap-ubsan on both too. The latter revealed a couple of issues I'd like to discuss: 1) libcpp/symtab.c contains a couple of spots reduced into: #define DELETED ((char *) -1) void bar (char *); void foo (char *p) { if (p && p != DELETED) bar (p); } where we fold it early into if ((p p+ -1) <= (char *) -3) and as the instrumentation is done during ubsan pass, if p is NULL, we diagnose this as invalid pointer overflow from NULL to 0xffff*f. Shall we change the folder so that during GENERIC folding it actually does the addition and comparison in pointer_sized_int instead (my preference), or shall I move the UBSAN_PTR instrumentation earlier into the FEs (but then I still risk stuff is folded earlier)? 2) libcpp/line-map.c has this: static int location_adhoc_data_update (void **slot, void *data) { *((char **) slot) += *((int64_t *) data); return 1; } where the (why int64_t always?, we really need just intptr_t) adjusts one pointer from an unrelated one (result of realloc). That is a UB and actually can trigger this sanitization if the two regions are far away from each other, e.g. on i686-linux: ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x0899e308 overflowed to 0xf74c4ab8 ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x08add7c0 overflowed to 0xf74c9a08 ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x092ba308 overflowed to 0xf741cab8 ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x0a3757c0 overflowed to 0xf7453a08 Shall we perform the addition in uintptr_t instead to make it implementation defined rather than UB? 3) not really related to this patch, but something I also saw during the bootstrap-ubsan on i686-linux: ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147475412 cannot be represented in type 'int' ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147478324 cannot be represented in type 'int' ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147451580 cannot be represented in type 'int' ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147465664 cannot be represented in type 'int' ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147469348 - 2147451544 cannot be represented in type 'int' ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147482364 - 2147475376 cannot be represented in type 'int' ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483624 - 2147475376 cannot be represented in type 'int' ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483628 - 2147451544 cannot be represented in type 'int' ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147426384 - 2147475376 cannot be represented in type 'int' ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147450216 - 2147451544 cannot be represented in type 'int' The problem here is that we lower pointer subtraction, e.g. long foo (char *p, char *q) { return q - p; } as return (ptrdiff_t) ((ssizetype) q - (ssizetype) p); and even for a valid testcase where we have an array across the middle of the virtual address space, say the first one above is (char *) 0x8000dfb0 - (char *) 0x7fffdfd4 subtraction, even if there is 128KB array starting at 0x7fffd000, it will yield UB (not in the source, but in whatever the compiler lowered it into). So, shall we instead do the subtraction in sizetype and only then cast? For sizeof (*ptr) > 1 I think we have some outstanding PR, and it is more difficult to find out in what types to compute it. Or do we want to introduce POINTER_DIFF_EXPR? 2017-06-19 Jakub Jelinek <jakub@redhat.com> PR sanitizer/80998 * sanopt.c (pass_sanopt::execute): Handle IFN_UBSAN_PTR. * tree-ssa-alias.c (call_may_clobber_ref_p_1): Likewise. * flag-types.h (enum sanitize_code): Add SANITIZER_POINTER_OVERFLOW. Or it into SANITIZER_UNDEFINED. * ubsan.c: Include gimple-fold.h and varasm.h. (ubsan_expand_null_ifn): Use PROB_VERY_LIKELY instead of REG_BR_PROB_BASE - PROB_VERY_UNLIKELY. (ubsan_expand_ptr_ifn): New function. (instrument_pointer_overflow): New function. (maybe_instrument_pointer_overflow): New function. (instrument_object_size): Formatting fix. (pass_ubsan::execute): Call instrument_pointer_overflow and maybe_instrument_pointer_overflow. * internal-fn.c (expand_UBSAN_PTR): New function. * ubsan.h (ubsan_expand_ptr_ifn): Declare. * sanitizer.def (__ubsan_handle_pointer_overflow, __ubsan_handle_pointer_overflow_abort): New builtins. * tree-ssa-tail-merge.c (merge_stmts_p): Handle IFN_UBSAN_PTR. * internal-fn.def (UBSAN_PTR): New internal function. * opts.c (sanitizer_opts): Add pointer-overflow. * lto-streamer-in.c (input_function): Handle IFN_UBSAN_PTR. gcc/testsuite/ * c-c++-common/ubsan/ptr-overflow-1.c: New test. * c-c++-common/ubsan/ptr-overflow-2.c: New test. libsanitizer/ * ubsan/ubsan_handlers.cc: Cherry-pick upstream r304461. * ubsan/ubsan_checks.inc: Likewise. * ubsan/ubsan_handlers.h: Likewise. --- gcc/sanopt.c.jj 2017-06-14 18:07:46.459748266 +0200 +++ gcc/sanopt.c 2017-06-15 11:06:53.567321615 +0200 @@ -924,6 +924,9 @@ pass_sanopt::execute (function *fun) case IFN_UBSAN_OBJECT_SIZE: no_next = ubsan_expand_objsize_ifn (&gsi); break; + case IFN_UBSAN_PTR: + no_next = ubsan_expand_ptr_ifn (&gsi); + break; case IFN_UBSAN_VPTR: no_next = ubsan_expand_vptr_ifn (&gsi); break; --- gcc/tree-ssa-alias.c.jj 2017-06-14 18:07:46.215751214 +0200 +++ gcc/tree-ssa-alias.c 2017-06-15 11:06:53.568321603 +0200 @@ -1991,6 +1991,7 @@ call_may_clobber_ref_p_1 (gcall *call, a case IFN_UBSAN_BOUNDS: case IFN_UBSAN_VPTR: case IFN_UBSAN_OBJECT_SIZE: + case IFN_UBSAN_PTR: case IFN_ASAN_CHECK: return false; default: --- gcc/flag-types.h.jj 2017-06-14 18:07:46.068752990 +0200 +++ gcc/flag-types.h 2017-06-15 11:06:53.569321591 +0200 @@ -238,6 +238,7 @@ enum sanitize_code { SANITIZE_OBJECT_SIZE = 1UL << 21, SANITIZE_VPTR = 1UL << 22, SANITIZE_BOUNDS_STRICT = 1UL << 23, + SANITIZE_POINTER_OVERFLOW = 1UL << 24, SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT, SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN @@ -245,7 +246,8 @@ enum sanitize_code { | SANITIZE_BOUNDS | SANITIZE_ALIGNMENT | SANITIZE_NONNULL_ATTRIBUTE | SANITIZE_RETURNS_NONNULL_ATTRIBUTE - | SANITIZE_OBJECT_SIZE | SANITIZE_VPTR, + | SANITIZE_OBJECT_SIZE | SANITIZE_VPTR + | SANITIZE_POINTER_OVERFLOW, SANITIZE_UNDEFINED_NONDEFAULT = SANITIZE_FLOAT_DIVIDE | SANITIZE_FLOAT_CAST | SANITIZE_BOUNDS_STRICT }; --- gcc/ubsan.c.jj 2017-06-15 11:06:45.275423452 +0200 +++ gcc/ubsan.c 2017-06-16 13:27:48.946545636 +0200 @@ -45,6 +45,8 @@ along with GCC; see the file COPYING3. #include "builtins.h" #include "tree-object-size.h" #include "tree-cfg.h" +#include "gimple-fold.h" +#include "varasm.h" /* Map from a tree to a VAR_DECL tree. */ @@ -792,7 +794,7 @@ ubsan_expand_null_ifn (gimple_stmt_itera e = find_edge (cond_bb, fallthru_bb); e->flags = EDGE_FALSE_VALUE; e->count = cond_bb->count; - e->probability = REG_BR_PROB_BASE - PROB_VERY_UNLIKELY; + e->probability = PROB_VERY_LIKELY; /* Update dominance info for the newly created then_bb; note that fallthru_bb's dominance info has already been updated by @@ -861,7 +863,7 @@ ubsan_expand_null_ifn (gimple_stmt_itera e = find_edge (cond1_bb, cond2_bb); e->flags = EDGE_FALSE_VALUE; e->count = cond1_bb->count; - e->probability = REG_BR_PROB_BASE - PROB_VERY_UNLIKELY; + e->probability = PROB_VERY_LIKELY; /* Update dominance info. */ if (dom_info_available_p (CDI_DOMINATORS)) @@ -1011,6 +1013,170 @@ ubsan_expand_objsize_ifn (gimple_stmt_it return true; } +/* Expand UBSAN_PTR internal call. */ + +bool +ubsan_expand_ptr_ifn (gimple_stmt_iterator *gsip) +{ + gimple_stmt_iterator gsi = *gsip; + gimple *stmt = gsi_stmt (gsi); + location_t loc = gimple_location (stmt); + gcc_assert (gimple_call_num_args (stmt) == 2); + tree ptr = gimple_call_arg (stmt, 0); + tree off = gimple_call_arg (stmt, 1); + + if (integer_zerop (off)) + { + gsi_remove (gsip, true); + unlink_stmt_vdef (stmt); + return true; + } + + basic_block cur_bb = gsi_bb (gsi); + tree ptrplusoff = make_ssa_name (pointer_sized_int_node); + tree ptri = make_ssa_name (pointer_sized_int_node); + int pos_neg = get_range_pos_neg (off); + + /* Split the original block holding the pointer dereference. */ + edge e = split_block (cur_bb, stmt); + + /* Get a hold on the 'condition block', the 'then block' and the + 'else block'. */ + basic_block cond_bb = e->src; + basic_block fallthru_bb = e->dest; + basic_block then_bb = create_empty_bb (cond_bb); + basic_block cond_pos_bb = NULL, cond_neg_bb = NULL; + add_bb_to_loop (then_bb, cond_bb->loop_father); + loops_state_set (LOOPS_NEED_FIXUP); + + /* Set up the fallthrough basic block. */ + e->flags = EDGE_FALSE_VALUE; + if (pos_neg != 3) + { + e->count = cond_bb->count; + e->probability = PROB_VERY_LIKELY; + + /* Connect 'then block' with the 'else block'. This is needed + as the ubsan routines we call in the 'then block' are not noreturn. + The 'then block' only has one outcoming edge. */ + make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU); + + /* Make an edge coming from the 'cond block' into the 'then block'; + this edge is unlikely taken, so set up the probability + accordingly. */ + e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE); + e->probability = PROB_VERY_UNLIKELY; + } + else + { + profile_count count = cond_bb->count.apply_probability (PROB_EVEN); + e->count = count; + e->probability = PROB_EVEN; + + e = split_block (fallthru_bb, (gimple *) NULL); + cond_neg_bb = e->src; + fallthru_bb = e->dest; + e->count = count; + e->probability = PROB_VERY_LIKELY; + e->flags = EDGE_FALSE_VALUE; + + e = make_edge (cond_neg_bb, then_bb, EDGE_TRUE_VALUE); + e->probability = PROB_VERY_UNLIKELY; + + cond_pos_bb = create_empty_bb (cond_bb); + add_bb_to_loop (cond_pos_bb, cond_bb->loop_father); + + e = make_edge (cond_bb, cond_pos_bb, EDGE_TRUE_VALUE); + e->count = count; + e->probability = PROB_EVEN; + + e = make_edge (cond_pos_bb, then_bb, EDGE_TRUE_VALUE); + e->probability = PROB_VERY_UNLIKELY; + + e = make_edge (cond_pos_bb, fallthru_bb, EDGE_FALSE_VALUE); + e->count = count; + e->probability = PROB_VERY_LIKELY; + + make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU); + } + + gimple *g = gimple_build_assign (ptri, NOP_EXPR, ptr); + gimple_set_location (g, loc); + gsi_insert_before (&gsi, g, GSI_SAME_STMT); + g = gimple_build_assign (ptrplusoff, PLUS_EXPR, ptri, off); + gimple_set_location (g, loc); + gsi_insert_before (&gsi, g, GSI_SAME_STMT); + + /* Update dominance info for the newly created then_bb; note that + fallthru_bb's dominance info has already been updated by + split_block. */ + if (dom_info_available_p (CDI_DOMINATORS)) + { + set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb); + if (pos_neg == 3) + { + set_immediate_dominator (CDI_DOMINATORS, cond_pos_bb, cond_bb); + set_immediate_dominator (CDI_DOMINATORS, fallthru_bb, cond_bb); + } + } + + /* Put the ubsan builtin call into the newly created BB. */ + if (flag_sanitize_undefined_trap_on_error) + g = gimple_build_call (builtin_decl_implicit (BUILT_IN_TRAP), 0); + else + { + enum built_in_function bcode + = (flag_sanitize_recover & SANITIZE_POINTER_OVERFLOW) + ? BUILT_IN_UBSAN_HANDLE_POINTER_OVERFLOW + : BUILT_IN_UBSAN_HANDLE_POINTER_OVERFLOW_ABORT; + tree fn = builtin_decl_implicit (bcode); + tree data + = ubsan_create_data ("__ubsan_ptrovf_data", 1, &loc, + NULL_TREE, NULL_TREE); + data = build_fold_addr_expr_loc (loc, data); + g = gimple_build_call (fn, 3, data, ptr, ptrplusoff); + } + gimple_stmt_iterator gsi2 = gsi_start_bb (then_bb); + gimple_set_location (g, loc); + gsi_insert_after (&gsi2, g, GSI_NEW_STMT); + + /* Unlink the UBSAN_PTRs vops before replacing it. */ + unlink_stmt_vdef (stmt); + + if (TREE_CODE (off) == INTEGER_CST) + g = gimple_build_cond (wi::neg_p (off) ? LT_EXPR : GE_EXPR, ptri, + fold_build1 (NEGATE_EXPR, sizetype, off), + NULL_TREE, NULL_TREE); + else if (pos_neg != 3) + g = gimple_build_cond (pos_neg == 1 ? LT_EXPR : GT_EXPR, + ptrplusoff, ptri, NULL_TREE, NULL_TREE); + else + { + gsi2 = gsi_start_bb (cond_pos_bb); + g = gimple_build_cond (LT_EXPR, ptrplusoff, ptri, NULL_TREE, NULL_TREE); + gimple_set_location (g, loc); + gsi_insert_after (&gsi2, g, GSI_NEW_STMT); + + gsi2 = gsi_start_bb (cond_neg_bb); + g = gimple_build_cond (GT_EXPR, ptrplusoff, ptri, NULL_TREE, NULL_TREE); + gimple_set_location (g, loc); + gsi_insert_after (&gsi2, g, GSI_NEW_STMT); + + gimple_seq seq = NULL; + tree t = gimple_build (&seq, loc, NOP_EXPR, ssizetype, off); + t = gimple_build (&seq, loc, GE_EXPR, boolean_type_node, + t, ssize_int (0)); + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + g = gimple_build_cond (NE_EXPR, t, boolean_false_node, + NULL_TREE, NULL_TREE); + } + gimple_set_location (g, loc); + /* Replace the UBSAN_PTR with a GIMPLE_COND stmt. */ + gsi_replace (&gsi, g, false); + return false; +} + + /* Cached __ubsan_vptr_type_cache decl. */ static GTY(()) tree ubsan_vptr_type_cache_decl; @@ -1215,6 +1381,103 @@ instrument_null (gimple_stmt_iterator gs instrument_mem_ref (t, base, &gsi, is_lhs); } +/* Instrument pointer arithmetics PTR p+ OFF. */ + +static void +instrument_pointer_overflow (gimple_stmt_iterator *gsi, tree ptr, tree off) +{ + if (TYPE_PRECISION (sizetype) != POINTER_SIZE) + return; + gcall *g = gimple_build_call_internal (IFN_UBSAN_PTR, 2, ptr, off); + gimple_set_location (g, gimple_location (gsi_stmt (*gsi))); + gsi_insert_before (gsi, g, GSI_SAME_STMT); +} + +/* Instrument pointer arithmetics if any. */ + +static void +maybe_instrument_pointer_overflow (gimple_stmt_iterator *gsi, tree t) +{ + if (TYPE_PRECISION (sizetype) != POINTER_SIZE) + return; + + /* Handle also e.g. &s->i. */ + if (TREE_CODE (t) == ADDR_EXPR) + t = TREE_OPERAND (t, 0); + + switch (TREE_CODE (t)) + { + case COMPONENT_REF: + if (TREE_CODE (t) == COMPONENT_REF + && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1)) != NULL_TREE) + { + tree repr = DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1)); + t = build3 (COMPONENT_REF, TREE_TYPE (repr), TREE_OPERAND (t, 0), + repr, TREE_OPERAND (t, 2)); + } + break; + case ARRAY_REF: + case MEM_REF: + break; + default: + return; + } + + HOST_WIDE_INT bitsize, bitpos; + tree offset; + machine_mode mode; + int volatilep = 0, reversep, unsignedp = 0; + tree inner = get_inner_reference (t, &bitsize, &bitpos, &offset, &mode, + &unsignedp, &reversep, &volatilep); + + if ((offset == NULL_TREE && bitpos == 0) + || bitpos % BITS_PER_UNIT != 0) + return; + + bool decl_p = DECL_P (inner); + tree base; + if (decl_p) + { + if (DECL_REGISTER (inner)) + return; + base = inner; + /* If BASE is a fixed size automatic variable or + global variable defined in the current TU and bitpos + fits, don't instrument anything. */ + if (offset == NULL_TREE + && bitpos > 0 + && (VAR_P (base) + || TREE_CODE (base) == PARM_DECL + || TREE_CODE (base) == RESULT_DECL) + && DECL_SIZE (base) + && TREE_CODE (DECL_SIZE (base)) == INTEGER_CST + && compare_tree_int (DECL_SIZE (base), bitpos) >= 0 + && (!is_global_var (base) || decl_binds_to_current_def_p (base))) + return; + } + else if (TREE_CODE (inner) == MEM_REF) + base = TREE_OPERAND (inner, 0); + else + return; + tree ptr = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (t)), t); + + if (!POINTER_TYPE_P (TREE_TYPE (base)) && !DECL_P (base)) + return; + + tree base_addr = base; + if (decl_p) + base_addr = build1 (ADDR_EXPR, + build_pointer_type (TREE_TYPE (base)), base); + t = fold_build2 (MINUS_EXPR, sizetype, + fold_convert (pointer_sized_int_node, ptr), + fold_convert (pointer_sized_int_node, base_addr)); + t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, + GSI_SAME_STMT); + base_addr = force_gimple_operand_gsi (gsi, base_addr, true, NULL_TREE, true, + GSI_SAME_STMT); + instrument_pointer_overflow (gsi, base_addr, t); +} + /* Build an ubsan builtin call for the signed-integer-overflow sanitization. CODE says what kind of builtin are we building, LOC is a location, LHSTYPE is the type of LHS, OP0 and OP1 @@ -1828,7 +2091,7 @@ instrument_object_size (gimple_stmt_iter { tree rhs1 = gimple_assign_rhs1 (def_stmt); if (TREE_CODE (rhs1) == SSA_NAME - && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1)) + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1)) break; else base = rhs1; @@ -1952,7 +2215,8 @@ public: | SANITIZE_ALIGNMENT | SANITIZE_NONNULL_ATTRIBUTE | SANITIZE_RETURNS_NONNULL_ATTRIBUTE - | SANITIZE_OBJECT_SIZE)); + | SANITIZE_OBJECT_SIZE + | SANITIZE_POINTER_OVERFLOW)); } virtual unsigned int execute (function *); @@ -2043,6 +2307,32 @@ pass_ubsan::execute (function *fun) } } } + + if (sanitize_flags_p (SANITIZE_POINTER_OVERFLOW, fun->decl)) + { + if (is_gimple_assign (stmt) + && gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR) + instrument_pointer_overflow (&gsi, + gimple_assign_rhs1 (stmt), + gimple_assign_rhs2 (stmt)); + if (gimple_store_p (stmt)) + maybe_instrument_pointer_overflow (&gsi, + gimple_get_lhs (stmt)); + if (gimple_assign_single_p (stmt)) + maybe_instrument_pointer_overflow (&gsi, + gimple_assign_rhs1 (stmt)); + if (is_gimple_call (stmt)) + { + unsigned args_num = gimple_call_num_args (stmt); + for (unsigned i = 0; i < args_num; ++i) + { + tree arg = gimple_call_arg (stmt, i); + if (is_gimple_reg (arg)) + continue; + maybe_instrument_pointer_overflow (&gsi, arg); + } + } + } gsi_next (&gsi); } --- gcc/internal-fn.c.jj 2017-06-15 11:03:25.053821114 +0200 +++ gcc/internal-fn.c 2017-06-15 11:06:53.570321578 +0200 @@ -401,6 +401,14 @@ expand_UBSAN_VPTR (internal_fn, gcall *) /* This should get expanded in the sanopt pass. */ static void +expand_UBSAN_PTR (internal_fn, gcall *) +{ + gcc_unreachable (); +} + +/* This should get expanded in the sanopt pass. */ + +static void expand_UBSAN_OBJECT_SIZE (internal_fn, gcall *) { gcc_unreachable (); --- gcc/ubsan.h.jj 2017-06-14 18:07:46.174751709 +0200 +++ gcc/ubsan.h 2017-06-15 11:06:53.570321578 +0200 @@ -45,6 +45,7 @@ enum ubsan_print_style { extern bool ubsan_expand_bounds_ifn (gimple_stmt_iterator *); extern bool ubsan_expand_null_ifn (gimple_stmt_iterator *); extern bool ubsan_expand_objsize_ifn (gimple_stmt_iterator *); +extern bool ubsan_expand_ptr_ifn (gimple_stmt_iterator *); extern bool ubsan_expand_vptr_ifn (gimple_stmt_iterator *); extern bool ubsan_instrument_unreachable (gimple_stmt_iterator *); extern tree ubsan_create_data (const char *, int, const location_t *, ...); --- gcc/sanitizer.def.jj 2017-06-15 11:03:25.054821102 +0200 +++ gcc/sanitizer.def 2017-06-15 11:06:53.571321566 +0200 @@ -444,6 +444,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HAN "__ubsan_handle_load_invalid_value", BT_FN_VOID_PTR_PTR, ATTR_COLD_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_POINTER_OVERFLOW, + "__ubsan_handle_pointer_overflow", + BT_FN_VOID_PTR_PTR_PTR, + ATTR_COLD_NOTHROW_LEAF_LIST) DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW_ABORT, "__ubsan_handle_divrem_overflow_abort", BT_FN_VOID_PTR_PTR_PTR, @@ -480,6 +484,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HAN "__ubsan_handle_load_invalid_value_abort", BT_FN_VOID_PTR_PTR, ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_POINTER_OVERFLOW_ABORT, + "__ubsan_handle_pointer_overflow_abort", + BT_FN_VOID_PTR_PTR_PTR, + ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST) DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW, "__ubsan_handle_float_cast_overflow", BT_FN_VOID_PTR_PTR, --- gcc/tree-ssa-tail-merge.c.jj 2017-06-14 18:07:45.739756964 +0200 +++ gcc/tree-ssa-tail-merge.c 2017-06-15 11:06:53.571321566 +0200 @@ -1239,6 +1239,7 @@ merge_stmts_p (gimple *stmt1, gimple *st case IFN_UBSAN_CHECK_SUB: case IFN_UBSAN_CHECK_MUL: case IFN_UBSAN_OBJECT_SIZE: + case IFN_UBSAN_PTR: case IFN_ASAN_CHECK: /* For these internal functions, gimple_location is an implicit parameter, which will be used explicitly after expansion. --- gcc/internal-fn.def.jj 2017-06-14 18:07:45.679757689 +0200 +++ gcc/internal-fn.def 2017-06-15 11:06:53.572321554 +0200 @@ -165,6 +165,7 @@ DEF_INTERNAL_FN (UBSAN_VPTR, ECF_LEAF | DEF_INTERNAL_FN (UBSAN_CHECK_ADD, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (UBSAN_CHECK_SUB, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (UBSAN_CHECK_MUL, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) +DEF_INTERNAL_FN (UBSAN_PTR, ECF_LEAF | ECF_NOTHROW, ".R.") DEF_INTERNAL_FN (UBSAN_OBJECT_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (ABNORMAL_DISPATCHER, ECF_NORETURN, NULL) DEF_INTERNAL_FN (BUILTIN_EXPECT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) --- gcc/opts.c.jj 2017-06-14 18:07:46.411748846 +0200 +++ gcc/opts.c 2017-06-15 11:06:53.572321554 +0200 @@ -1504,6 +1504,7 @@ const struct sanitizer_opts_s sanitizer_ true), SANITIZER_OPT (object-size, SANITIZE_OBJECT_SIZE, true), SANITIZER_OPT (vptr, SANITIZE_VPTR, true), + SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true), SANITIZER_OPT (all, ~0U, true), #undef SANITIZER_OPT { NULL, 0U, 0UL, false } --- gcc/lto-streamer-in.c.jj 2017-06-14 18:07:45.803756191 +0200 +++ gcc/lto-streamer-in.c 2017-06-15 11:06:53.573321541 +0200 @@ -1143,6 +1143,10 @@ input_function (tree fn_decl, struct dat if ((flag_sanitize & SANITIZE_OBJECT_SIZE) == 0) remove = true; break; + case IFN_UBSAN_PTR: + if ((flag_sanitize & SANITIZE_POINTER_OVERFLOW) == 0) + remove = true; + break; case IFN_ASAN_MARK: if ((flag_sanitize & SANITIZE_ADDRESS) == 0) remove = true; --- gcc/testsuite/c-c++-common/ubsan/ptr-overflow-1.c.jj 2017-06-15 11:06:17.700755118 +0200 +++ gcc/testsuite/c-c++-common/ubsan/ptr-overflow-1.c 2017-06-16 13:04:29.216377665 +0200 @@ -0,0 +1,65 @@ +/* PR sanitizer/80998 */ +/* { dg-do run } */ +/* { dg-options "-fsanitize=pointer-overflow -fno-sanitize-recover=pointer-overflow -Wall" } */ + +struct S { int a; int b; int c[64]; }; +__attribute__((noinline, noclone)) char *f1 (char *p) { return p + 1; } +__attribute__((noinline, noclone)) char *f2 (char *p) { return p - 1; } +__attribute__((noinline, noclone)) char *f3 (char *p, int i) { return p + i; } +__attribute__((noinline, noclone)) char *f4 (char *p, int i) { return p - i; } +__attribute__((noinline, noclone)) char *f5 (char *p, unsigned long int i) { return p + i; } +__attribute__((noinline, noclone)) char *f6 (char *p, unsigned long int i) { return p - i; } +__attribute__((noinline, noclone)) int *f7 (struct S *p) { return &p->a; } +__attribute__((noinline, noclone)) int *f8 (struct S *p) { return &p->b; } +__attribute__((noinline, noclone)) int *f9 (struct S *p) { return &p->c[64]; } +__attribute__((noinline, noclone)) int *f10 (struct S *p, int i) { return &p->c[i]; } + +char *volatile p; +struct S *volatile q; +char a[64]; +struct S s; +int *volatile r; + +int +main () +{ + struct S t; + p = &a[32]; + p = f1 (p); + p = f1 (p); + p = f2 (p); + p = f3 (p, 1); + p = f3 (p, -1); + p = f3 (p, 3); + p = f3 (p, -6); + p = f4 (p, 1); + p = f4 (p, -1); + p = f4 (p, 3); + p = f4 (p, -6); + p = f5 (p, 1); + p = f5 (p, 3); + p = f6 (p, 1); + p = f6 (p, 3); + if (sizeof (unsigned long) >= sizeof (char *)) + { + p = f5 (p, -1); + p = f5 (p, -6); + p = f6 (p, -1); + p = f6 (p, -6); + } + q = &s; + r = f7 (q); + r = f8 (q); + r = f9 (q); + r = f10 (q, 0); + r = f10 (q, 10); + r = f10 (q, 64); + q = &t; + r = f7 (q); + r = f8 (q); + r = f9 (q); + r = f10 (q, 0); + r = f10 (q, 10); + r = f10 (q, 64); + return 0; +} --- gcc/testsuite/c-c++-common/ubsan/ptr-overflow-2.c.jj 2017-06-15 11:06:17.700755118 +0200 +++ gcc/testsuite/c-c++-common/ubsan/ptr-overflow-2.c 2017-06-16 14:00:57.545611263 +0200 @@ -0,0 +1,113 @@ +/* PR sanitizer/80998 */ +/* { dg-do run } */ +/* { dg-options "-fsanitize=pointer-overflow -fsanitize-recover=pointer-overflow -fno-ipa-icf -Wall" } */ + +__attribute__((noinline, noclone)) char * f1 (char *p) { return p + 1; } +__attribute__((noinline, noclone)) char * f2 (char *p) { return p - 1; } +__attribute__((noinline, noclone)) char * f3 (char *p, int i) { return p + i; } +__attribute__((noinline, noclone)) char * f4 (char *p, int i) { return p + i; } +__attribute__((noinline, noclone)) char * f5 (char *p, int i) { return p - i; } +__attribute__((noinline, noclone)) char * f6 (char *p, int i) { return p - i; } +__attribute__((noinline, noclone)) char * f7 (char *p, unsigned long int i) { return p + i; } +__attribute__((noinline, noclone)) char * f8 (char *p, unsigned long int i) { return p + i; } +__attribute__((noinline, noclone)) char * f9 (char *p, unsigned long int i) { return p - i; } +__attribute__((noinline, noclone)) char * f10 (char *p, unsigned long int i) { return p - i; } +struct S { int a; int b; int c[64]; }; +__attribute__((noinline, noclone)) int *f11 (struct S *p) { return &p->a; } +__attribute__((noinline, noclone)) int *f12 (struct S *p) { return &p->b; } +__attribute__((noinline, noclone)) int *f13 (struct S *p) { return &p->c[64]; } +__attribute__((noinline, noclone)) int *f14 (struct S *p, int i) { return &p->c[i]; } +__attribute__((noinline, noclone)) int *f15 (struct S *p, int i) { return &p->c[i]; } +__attribute__((noinline, noclone)) int *f16 (struct S *p) { return &p->a; } +__attribute__((noinline, noclone)) int *f17 (struct S *p) { return &p->b; } +__attribute__((noinline, noclone)) int *f18 (struct S *p) { return &p->c[64]; } +__attribute__((noinline, noclone)) int *f19 (struct S *p, int i) { return &p->c[i]; } +__attribute__((noinline, noclone)) int *f20 (struct S *p, int i) { return &p->c[i]; } +__attribute__((noinline, noclone)) int *f21 (struct S *p) { return &p->a; } +__attribute__((noinline, noclone)) int *f22 (struct S *p) { return &p->b; } +__attribute__((noinline, noclone)) int *f23 (struct S *p) { return &p->c[64]; } +__attribute__((noinline, noclone)) int *f24 (struct S *p, int i) { return &p->c[i]; } +__attribute__((noinline, noclone)) int *f25 (struct S *p, int i) { return &p->c[i]; } + +char *volatile p; +__UINTPTR_TYPE__ volatile u; +struct S *volatile q; +int *volatile r; + +int +main () +{ + u = ~(__UINTPTR_TYPE__) 0; + p = (char *) u; + p = f1 (p); + u = 0; + p = (char *) u; + p = f2 (p); + u = -(__UINTPTR_TYPE__) 7; + p = (char *) u; + p = f3 (p, 7); + u = 3; + p = (char *) u; + p = f4 (p, -4); + u = 23; + p = (char *) u; + p = f5 (p, 27); + u = -(__UINTPTR_TYPE__) 15; + p = (char *) u; + p = f6 (p, -15); + u = -(__UINTPTR_TYPE__) 29; + p = (char *) u; + p = f7 (p, 31); + u = 23; + p = (char *) u; + p = f9 (p, 24); + if (sizeof (unsigned long) < sizeof (char *)) + return 0; + u = 7; + p = (char *) u; + p = f8 (p, -8); + u = -(__UINTPTR_TYPE__) 25; + p = (char *) u; + p = f10 (p, -25); + u = ~(__UINTPTR_TYPE__) 0; + q = (struct S *) u; + r = f11 (q); + r = f12 (q); + r = f13 (q); + r = f14 (q, 0); + r = f15 (q, 63); + u = ~(__UINTPTR_TYPE__) 0 - (17 * sizeof (int)); + q = (struct S *) u; + r = f16 (q); + r = f17 (q); + r = f18 (q); + r = f19 (q, 0); + r = f20 (q, 63); + u = 3 * sizeof (int); + q = (struct S *) u; + r = f21 (q); + r = f22 (q); + r = f23 (q); + r = f24 (q, -2); + r = f25 (q, -6); + return 0; +} + +/* { dg-output ":5:6\[79]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:6:6\[79]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+ overflowed to (0\[xX])?\[fF]\+(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:7:7\[46]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+9 overflowed to (0\[xX])?0\+(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:8:7\[46]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+3 overflowed to (0\[xX])?\[fF]\+(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:9:7\[46]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+17 overflowed to (0\[xX])?\[fF]\+\[cC](\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:10:7\[46]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+1 overflowed to (0\[xX])?0\+(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:11:\[89]\[80]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+\[eE]3 overflowed to (0\[xX])?0\+2(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:13:\[89]\[80]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+17 overflowed to (0\[xX])?\[fF]\+(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:12:\[89]\[80]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+7 overflowed to (0\[xX])?\[fF]\+(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*:14:\[89]\[91]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+\[eE]7 overflowed to (0\[xX])?0\+" } */ +/* { dg-output "(\n|\r\n|\r)" { target int32 } } */ +/* { dg-output "\[^\n\r]*:17:\[67]\[82]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+3(\n|\r\n|\r)" { target int32 } } */ +/* { dg-output "\[^\n\r]*:18:\[67]\[86]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+107(\n|\r\n|\r)" { target int32 } } */ +/* { dg-output "\[^\n\r]*:19:\[78]\[52]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+7(\n|\r\n|\r)" { target int32 } } */ +/* { dg-output "\[^\n\r]*:20:\[78]\[52]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+103(\n|\r\n|\r)" { target int32 } } */ +/* { dg-output "\[^\n\r]*:23:\[67]\[86]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+\[bB]\[bB] overflowed to (0\[xX])?0\+\[cC]3(\n|\r\n|\r)" { target int32 } } */ +/* { dg-output "\[^\n\r]*:25:\[78]\[52]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+\[bB]\[bB] overflowed to (0\[xX])?0\+\[bB]\[fF](\n|\r\n|\r)" { target int32 } } */ +/* { dg-output "\[^\n\r]*:30:\[78]\[52]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+\[cC] overflowed to (0\[xX])?\[fF]\+\[cC]" { target int32 } } */ --- libsanitizer/ubsan/ubsan_handlers.cc.jj 2016-11-16 18:51:53.028794605 +0100 +++ libsanitizer/ubsan/ubsan_handlers.cc 2017-06-14 09:54:25.571687721 +0200 @@ -521,6 +521,37 @@ void __ubsan::__ubsan_handle_nonnull_arg Die(); } +static void handlePointerOverflowImpl(PointerOverflowData *Data, + ValueHandle Base, + ValueHandle Result, + ReportOptions Opts) { + SourceLocation Loc = Data->Loc.acquire(); + ErrorType ET = ErrorType::PointerOverflow; + + if (ignoreReport(Loc, Opts, ET)) + return; + + ScopedReport R(Opts, Loc, ET); + + Diag(Loc, DL_Error, "pointer index expression with base %0 overflowed to %1") + << (void *)Base << (void*)Result; +} + +void __ubsan::__ubsan_handle_pointer_overflow(PointerOverflowData *Data, + ValueHandle Base, + ValueHandle Result) { + GET_REPORT_OPTIONS(false); + handlePointerOverflowImpl(Data, Base, Result, Opts); +} + +void __ubsan::__ubsan_handle_pointer_overflow_abort(PointerOverflowData *Data, + ValueHandle Base, + ValueHandle Result) { + GET_REPORT_OPTIONS(true); + handlePointerOverflowImpl(Data, Base, Result, Opts); + Die(); +} + static void handleCFIBadIcall(CFICheckFailData *Data, ValueHandle Function, ReportOptions Opts) { if (Data->CheckKind != CFITCK_ICall) --- libsanitizer/ubsan/ubsan_checks.inc.jj 2016-11-09 15:22:50.139249654 +0100 +++ libsanitizer/ubsan/ubsan_checks.inc 2017-06-14 09:54:25.571687721 +0200 @@ -17,6 +17,7 @@ UBSAN_CHECK(GenericUB, "undefined-behavior", "undefined") UBSAN_CHECK(NullPointerUse, "null-pointer-use", "null") +UBSAN_CHECK(PointerOverflow, "pointer-overflow", "pointer-overflow") UBSAN_CHECK(MisalignedPointerUse, "misaligned-pointer-use", "alignment") UBSAN_CHECK(InsufficientObjectSize, "insufficient-object-size", "object-size") UBSAN_CHECK(SignedIntegerOverflow, "signed-integer-overflow", --- libsanitizer/ubsan/ubsan_handlers.h.jj 2016-11-16 18:51:53.029794593 +0100 +++ libsanitizer/ubsan/ubsan_handlers.h 2017-06-14 09:54:25.571687721 +0200 @@ -146,6 +146,13 @@ struct NonNullArgData { /// \brief Handle passing null pointer to function with nonnull attribute. RECOVERABLE(nonnull_arg, NonNullArgData *Data) +struct PointerOverflowData { + SourceLocation Loc; +}; + +RECOVERABLE(pointer_overflow, PointerOverflowData *Data, ValueHandle Base, + ValueHandle Result) + /// \brief Known CFI check kinds. /// Keep in sync with the enum of the same name in CodeGenFunction.h enum CFITypeCheckKind : unsigned char { Jakub ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) 2017-06-19 18:25 [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) Jakub Jelinek @ 2017-06-20 7:41 ` Richard Biener 2017-06-20 8:14 ` Jakub Jelinek 2017-06-21 8:00 ` Jakub Jelinek 0 siblings, 2 replies; 21+ messages in thread From: Richard Biener @ 2017-06-20 7:41 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Martin Liška, gcc-patches On Mon, 19 Jun 2017, Jakub Jelinek wrote: > Hi! > > The following patch adds -fsanitize=pointer-overflow support, > which adds instrumentation (included in -fsanitize=undefined) that checks > that pointer arithmetics doesn't wrap. If the offset on ptr p+ off when treating > it as signed value is non-negative, we check whether the result is bigger > (uintptr_t comparison) than ptr, if it is negative in ssizetype, we check > whether the result is smaller than ptr, otherwise we check at runtime > whether (ssizetype) off < 0 and do the check based on that. > The patch checks both POINTER_PLUS_EXPR, as well as e.g. ADDR_EXPR of > handled components, and even handled components themselves (exception > is for constant offset when the base is an automatic non-VLA decl or > decl that binds to current function where we can at compile time for > sure guarantee it will fit). Does this "properly" interact with any array-bound sanitizing we do? Say, for &a->b[i].c.d ? > Martin has said he'll write the sanopt part of optimization > (if UBSAN_PTR for some pointer is dominated by UBSAN_PTR for the same > pointer and the offset is constant in both cases and equal or absolute value > bigger and same sign in the dominating UBSAN_PTR, we can avoid the dominated > check). > > For the cases where there is a dereference (i.e. not ADDR_EXPR of the > handled component or POINTER_PLUS_EXPR), I wonder if we couldn't ignore > say constant offsets in range <-4096, 4096> or something similar, hoping > people don't have anything mapped at the page 0 and -pagesize in hosted > env. Thoughts on that? Not sure what the problem is here? > I've bootstrapped/regtested the patch on x86_64-linux and i686-linux > and additionally bootstrapped/regtested with bootstrap-ubsan on both too. > The latter revealed a couple of issues I'd like to discuss: > > 1) libcpp/symtab.c contains a couple of spots reduced into: > #define DELETED ((char *) -1) > void bar (char *); > void > foo (char *p) > { > if (p && p != DELETED) > bar (p); > } > where we fold it early into if ((p p+ -1) <= (char *) -3) > and as the instrumentation is done during ubsan pass, if p is NULL, > we diagnose this as invalid pointer overflow from NULL to 0xffff*f. > Shall we change the folder so that during GENERIC folding it > actually does the addition and comparison in pointer_sized_int > instead (my preference), or shall I move the UBSAN_PTR instrumentation > earlier into the FEs (but then I still risk stuff is folded earlier)? Aww, so we turn the pointer test into a range test ;) That it uses a pointer type rather than an unsigned integer type is a bug, probably caused by pointers being TYPE_UNSIGNED. Not sure if the folding itself is worthwhile to keep though, thus an option would be to not generate range tests from pointers? > 2) libcpp/line-map.c has this: > static int > location_adhoc_data_update (void **slot, void *data) > { > *((char **) slot) += *((int64_t *) data); > return 1; > } > where the (why int64_t always?, we really need just intptr_t) adjusts > one pointer from an unrelated one (result of realloc). That is a UB > and actually can trigger this sanitization if the two regions are > far away from each other, e.g. on i686-linux: > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x0899e308 overflowed to 0xf74c4ab8 > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x08add7c0 overflowed to 0xf74c9a08 > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x092ba308 overflowed to 0xf741cab8 > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x0a3757c0 overflowed to 0xf7453a08 > Shall we perform the addition in uintptr_t instead to make it > implementation defined rather than UB? Yes. > 3) not really related to this patch, but something I also saw during the > bootstrap-ubsan on i686-linux: > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147475412 cannot be represented in type 'int' > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147478324 cannot be represented in type 'int' > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147451580 cannot be represented in type 'int' > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147465664 cannot be represented in type 'int' > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147469348 - 2147451544 cannot be represented in type 'int' > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147482364 - 2147475376 cannot be represented in type 'int' > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483624 - 2147475376 cannot be represented in type 'int' > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483628 - 2147451544 cannot be represented in type 'int' > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147426384 - 2147475376 cannot be represented in type 'int' > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147450216 - 2147451544 cannot be represented in type 'int' > The problem here is that we lower pointer subtraction, e.g. > long foo (char *p, char *q) { return q - p; } > as return (ptrdiff_t) ((ssizetype) q - (ssizetype) p); > and even for a valid testcase where we have an array across > the middle of the virtual address space, say the first one above > is (char *) 0x8000dfb0 - (char *) 0x7fffdfd4 subtraction, even if > there is 128KB array starting at 0x7fffd000, it will yield > UB (not in the source, but in whatever the compiler lowered it into). > So, shall we instead do the subtraction in sizetype and only then > cast? For sizeof (*ptr) > 1 I think we have some outstanding PR, > and it is more difficult to find out in what types to compute it. > Or do we want to introduce POINTER_DIFF_EXPR? Just use uintptr_t for the difference computation (well, an unsigned integer type of desired precision -- mind address-spaces), then cast the result to signed. Richard. > 2017-06-19 Jakub Jelinek <jakub@redhat.com> > > PR sanitizer/80998 > * sanopt.c (pass_sanopt::execute): Handle IFN_UBSAN_PTR. > * tree-ssa-alias.c (call_may_clobber_ref_p_1): Likewise. > * flag-types.h (enum sanitize_code): Add SANITIZER_POINTER_OVERFLOW. > Or it into SANITIZER_UNDEFINED. > * ubsan.c: Include gimple-fold.h and varasm.h. > (ubsan_expand_null_ifn): Use PROB_VERY_LIKELY instead of > REG_BR_PROB_BASE - PROB_VERY_UNLIKELY. > (ubsan_expand_ptr_ifn): New function. > (instrument_pointer_overflow): New function. > (maybe_instrument_pointer_overflow): New function. > (instrument_object_size): Formatting fix. > (pass_ubsan::execute): Call instrument_pointer_overflow > and maybe_instrument_pointer_overflow. > * internal-fn.c (expand_UBSAN_PTR): New function. > * ubsan.h (ubsan_expand_ptr_ifn): Declare. > * sanitizer.def (__ubsan_handle_pointer_overflow, > __ubsan_handle_pointer_overflow_abort): New builtins. > * tree-ssa-tail-merge.c (merge_stmts_p): Handle IFN_UBSAN_PTR. > * internal-fn.def (UBSAN_PTR): New internal function. > * opts.c (sanitizer_opts): Add pointer-overflow. > * lto-streamer-in.c (input_function): Handle IFN_UBSAN_PTR. > gcc/testsuite/ > * c-c++-common/ubsan/ptr-overflow-1.c: New test. > * c-c++-common/ubsan/ptr-overflow-2.c: New test. > libsanitizer/ > * ubsan/ubsan_handlers.cc: Cherry-pick upstream r304461. > * ubsan/ubsan_checks.inc: Likewise. > * ubsan/ubsan_handlers.h: Likewise. > > --- gcc/sanopt.c.jj 2017-06-14 18:07:46.459748266 +0200 > +++ gcc/sanopt.c 2017-06-15 11:06:53.567321615 +0200 > @@ -924,6 +924,9 @@ pass_sanopt::execute (function *fun) > case IFN_UBSAN_OBJECT_SIZE: > no_next = ubsan_expand_objsize_ifn (&gsi); > break; > + case IFN_UBSAN_PTR: > + no_next = ubsan_expand_ptr_ifn (&gsi); > + break; > case IFN_UBSAN_VPTR: > no_next = ubsan_expand_vptr_ifn (&gsi); > break; > --- gcc/tree-ssa-alias.c.jj 2017-06-14 18:07:46.215751214 +0200 > +++ gcc/tree-ssa-alias.c 2017-06-15 11:06:53.568321603 +0200 > @@ -1991,6 +1991,7 @@ call_may_clobber_ref_p_1 (gcall *call, a > case IFN_UBSAN_BOUNDS: > case IFN_UBSAN_VPTR: > case IFN_UBSAN_OBJECT_SIZE: > + case IFN_UBSAN_PTR: > case IFN_ASAN_CHECK: > return false; > default: > --- gcc/flag-types.h.jj 2017-06-14 18:07:46.068752990 +0200 > +++ gcc/flag-types.h 2017-06-15 11:06:53.569321591 +0200 > @@ -238,6 +238,7 @@ enum sanitize_code { > SANITIZE_OBJECT_SIZE = 1UL << 21, > SANITIZE_VPTR = 1UL << 22, > SANITIZE_BOUNDS_STRICT = 1UL << 23, > + SANITIZE_POINTER_OVERFLOW = 1UL << 24, > SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT, > SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE > | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN > @@ -245,7 +246,8 @@ enum sanitize_code { > | SANITIZE_BOUNDS | SANITIZE_ALIGNMENT > | SANITIZE_NONNULL_ATTRIBUTE > | SANITIZE_RETURNS_NONNULL_ATTRIBUTE > - | SANITIZE_OBJECT_SIZE | SANITIZE_VPTR, > + | SANITIZE_OBJECT_SIZE | SANITIZE_VPTR > + | SANITIZE_POINTER_OVERFLOW, > SANITIZE_UNDEFINED_NONDEFAULT = SANITIZE_FLOAT_DIVIDE | SANITIZE_FLOAT_CAST > | SANITIZE_BOUNDS_STRICT > }; > --- gcc/ubsan.c.jj 2017-06-15 11:06:45.275423452 +0200 > +++ gcc/ubsan.c 2017-06-16 13:27:48.946545636 +0200 > @@ -45,6 +45,8 @@ along with GCC; see the file COPYING3. > #include "builtins.h" > #include "tree-object-size.h" > #include "tree-cfg.h" > +#include "gimple-fold.h" > +#include "varasm.h" > > /* Map from a tree to a VAR_DECL tree. */ > > @@ -792,7 +794,7 @@ ubsan_expand_null_ifn (gimple_stmt_itera > e = find_edge (cond_bb, fallthru_bb); > e->flags = EDGE_FALSE_VALUE; > e->count = cond_bb->count; > - e->probability = REG_BR_PROB_BASE - PROB_VERY_UNLIKELY; > + e->probability = PROB_VERY_LIKELY; > > /* Update dominance info for the newly created then_bb; note that > fallthru_bb's dominance info has already been updated by > @@ -861,7 +863,7 @@ ubsan_expand_null_ifn (gimple_stmt_itera > e = find_edge (cond1_bb, cond2_bb); > e->flags = EDGE_FALSE_VALUE; > e->count = cond1_bb->count; > - e->probability = REG_BR_PROB_BASE - PROB_VERY_UNLIKELY; > + e->probability = PROB_VERY_LIKELY; > > /* Update dominance info. */ > if (dom_info_available_p (CDI_DOMINATORS)) > @@ -1011,6 +1013,170 @@ ubsan_expand_objsize_ifn (gimple_stmt_it > return true; > } > > +/* Expand UBSAN_PTR internal call. */ > + > +bool > +ubsan_expand_ptr_ifn (gimple_stmt_iterator *gsip) > +{ > + gimple_stmt_iterator gsi = *gsip; > + gimple *stmt = gsi_stmt (gsi); > + location_t loc = gimple_location (stmt); > + gcc_assert (gimple_call_num_args (stmt) == 2); > + tree ptr = gimple_call_arg (stmt, 0); > + tree off = gimple_call_arg (stmt, 1); > + > + if (integer_zerop (off)) > + { > + gsi_remove (gsip, true); > + unlink_stmt_vdef (stmt); > + return true; > + } > + > + basic_block cur_bb = gsi_bb (gsi); > + tree ptrplusoff = make_ssa_name (pointer_sized_int_node); > + tree ptri = make_ssa_name (pointer_sized_int_node); > + int pos_neg = get_range_pos_neg (off); > + > + /* Split the original block holding the pointer dereference. */ > + edge e = split_block (cur_bb, stmt); > + > + /* Get a hold on the 'condition block', the 'then block' and the > + 'else block'. */ > + basic_block cond_bb = e->src; > + basic_block fallthru_bb = e->dest; > + basic_block then_bb = create_empty_bb (cond_bb); > + basic_block cond_pos_bb = NULL, cond_neg_bb = NULL; > + add_bb_to_loop (then_bb, cond_bb->loop_father); > + loops_state_set (LOOPS_NEED_FIXUP); > + > + /* Set up the fallthrough basic block. */ > + e->flags = EDGE_FALSE_VALUE; > + if (pos_neg != 3) > + { > + e->count = cond_bb->count; > + e->probability = PROB_VERY_LIKELY; > + > + /* Connect 'then block' with the 'else block'. This is needed > + as the ubsan routines we call in the 'then block' are not noreturn. > + The 'then block' only has one outcoming edge. */ > + make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU); > + > + /* Make an edge coming from the 'cond block' into the 'then block'; > + this edge is unlikely taken, so set up the probability > + accordingly. */ > + e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE); > + e->probability = PROB_VERY_UNLIKELY; > + } > + else > + { > + profile_count count = cond_bb->count.apply_probability (PROB_EVEN); > + e->count = count; > + e->probability = PROB_EVEN; > + > + e = split_block (fallthru_bb, (gimple *) NULL); > + cond_neg_bb = e->src; > + fallthru_bb = e->dest; > + e->count = count; > + e->probability = PROB_VERY_LIKELY; > + e->flags = EDGE_FALSE_VALUE; > + > + e = make_edge (cond_neg_bb, then_bb, EDGE_TRUE_VALUE); > + e->probability = PROB_VERY_UNLIKELY; > + > + cond_pos_bb = create_empty_bb (cond_bb); > + add_bb_to_loop (cond_pos_bb, cond_bb->loop_father); > + > + e = make_edge (cond_bb, cond_pos_bb, EDGE_TRUE_VALUE); > + e->count = count; > + e->probability = PROB_EVEN; > + > + e = make_edge (cond_pos_bb, then_bb, EDGE_TRUE_VALUE); > + e->probability = PROB_VERY_UNLIKELY; > + > + e = make_edge (cond_pos_bb, fallthru_bb, EDGE_FALSE_VALUE); > + e->count = count; > + e->probability = PROB_VERY_LIKELY; > + > + make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU); > + } > + > + gimple *g = gimple_build_assign (ptri, NOP_EXPR, ptr); > + gimple_set_location (g, loc); > + gsi_insert_before (&gsi, g, GSI_SAME_STMT); > + g = gimple_build_assign (ptrplusoff, PLUS_EXPR, ptri, off); > + gimple_set_location (g, loc); > + gsi_insert_before (&gsi, g, GSI_SAME_STMT); > + > + /* Update dominance info for the newly created then_bb; note that > + fallthru_bb's dominance info has already been updated by > + split_block. */ > + if (dom_info_available_p (CDI_DOMINATORS)) > + { > + set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb); > + if (pos_neg == 3) > + { > + set_immediate_dominator (CDI_DOMINATORS, cond_pos_bb, cond_bb); > + set_immediate_dominator (CDI_DOMINATORS, fallthru_bb, cond_bb); > + } > + } > + > + /* Put the ubsan builtin call into the newly created BB. */ > + if (flag_sanitize_undefined_trap_on_error) > + g = gimple_build_call (builtin_decl_implicit (BUILT_IN_TRAP), 0); > + else > + { > + enum built_in_function bcode > + = (flag_sanitize_recover & SANITIZE_POINTER_OVERFLOW) > + ? BUILT_IN_UBSAN_HANDLE_POINTER_OVERFLOW > + : BUILT_IN_UBSAN_HANDLE_POINTER_OVERFLOW_ABORT; > + tree fn = builtin_decl_implicit (bcode); > + tree data > + = ubsan_create_data ("__ubsan_ptrovf_data", 1, &loc, > + NULL_TREE, NULL_TREE); > + data = build_fold_addr_expr_loc (loc, data); > + g = gimple_build_call (fn, 3, data, ptr, ptrplusoff); > + } > + gimple_stmt_iterator gsi2 = gsi_start_bb (then_bb); > + gimple_set_location (g, loc); > + gsi_insert_after (&gsi2, g, GSI_NEW_STMT); > + > + /* Unlink the UBSAN_PTRs vops before replacing it. */ > + unlink_stmt_vdef (stmt); > + > + if (TREE_CODE (off) == INTEGER_CST) > + g = gimple_build_cond (wi::neg_p (off) ? LT_EXPR : GE_EXPR, ptri, > + fold_build1 (NEGATE_EXPR, sizetype, off), > + NULL_TREE, NULL_TREE); > + else if (pos_neg != 3) > + g = gimple_build_cond (pos_neg == 1 ? LT_EXPR : GT_EXPR, > + ptrplusoff, ptri, NULL_TREE, NULL_TREE); > + else > + { > + gsi2 = gsi_start_bb (cond_pos_bb); > + g = gimple_build_cond (LT_EXPR, ptrplusoff, ptri, NULL_TREE, NULL_TREE); > + gimple_set_location (g, loc); > + gsi_insert_after (&gsi2, g, GSI_NEW_STMT); > + > + gsi2 = gsi_start_bb (cond_neg_bb); > + g = gimple_build_cond (GT_EXPR, ptrplusoff, ptri, NULL_TREE, NULL_TREE); > + gimple_set_location (g, loc); > + gsi_insert_after (&gsi2, g, GSI_NEW_STMT); > + > + gimple_seq seq = NULL; > + tree t = gimple_build (&seq, loc, NOP_EXPR, ssizetype, off); > + t = gimple_build (&seq, loc, GE_EXPR, boolean_type_node, > + t, ssize_int (0)); > + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); > + g = gimple_build_cond (NE_EXPR, t, boolean_false_node, > + NULL_TREE, NULL_TREE); > + } > + gimple_set_location (g, loc); > + /* Replace the UBSAN_PTR with a GIMPLE_COND stmt. */ > + gsi_replace (&gsi, g, false); > + return false; > +} > + > + > /* Cached __ubsan_vptr_type_cache decl. */ > static GTY(()) tree ubsan_vptr_type_cache_decl; > > @@ -1215,6 +1381,103 @@ instrument_null (gimple_stmt_iterator gs > instrument_mem_ref (t, base, &gsi, is_lhs); > } > > +/* Instrument pointer arithmetics PTR p+ OFF. */ > + > +static void > +instrument_pointer_overflow (gimple_stmt_iterator *gsi, tree ptr, tree off) > +{ > + if (TYPE_PRECISION (sizetype) != POINTER_SIZE) > + return; > + gcall *g = gimple_build_call_internal (IFN_UBSAN_PTR, 2, ptr, off); > + gimple_set_location (g, gimple_location (gsi_stmt (*gsi))); > + gsi_insert_before (gsi, g, GSI_SAME_STMT); > +} > + > +/* Instrument pointer arithmetics if any. */ > + > +static void > +maybe_instrument_pointer_overflow (gimple_stmt_iterator *gsi, tree t) > +{ > + if (TYPE_PRECISION (sizetype) != POINTER_SIZE) > + return; > + > + /* Handle also e.g. &s->i. */ > + if (TREE_CODE (t) == ADDR_EXPR) > + t = TREE_OPERAND (t, 0); > + > + switch (TREE_CODE (t)) > + { > + case COMPONENT_REF: > + if (TREE_CODE (t) == COMPONENT_REF > + && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1)) != NULL_TREE) > + { > + tree repr = DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1)); > + t = build3 (COMPONENT_REF, TREE_TYPE (repr), TREE_OPERAND (t, 0), > + repr, TREE_OPERAND (t, 2)); > + } > + break; > + case ARRAY_REF: > + case MEM_REF: > + break; > + default: > + return; > + } > + > + HOST_WIDE_INT bitsize, bitpos; > + tree offset; > + machine_mode mode; > + int volatilep = 0, reversep, unsignedp = 0; > + tree inner = get_inner_reference (t, &bitsize, &bitpos, &offset, &mode, > + &unsignedp, &reversep, &volatilep); > + > + if ((offset == NULL_TREE && bitpos == 0) > + || bitpos % BITS_PER_UNIT != 0) > + return; > + > + bool decl_p = DECL_P (inner); > + tree base; > + if (decl_p) > + { > + if (DECL_REGISTER (inner)) > + return; > + base = inner; > + /* If BASE is a fixed size automatic variable or > + global variable defined in the current TU and bitpos > + fits, don't instrument anything. */ > + if (offset == NULL_TREE > + && bitpos > 0 > + && (VAR_P (base) > + || TREE_CODE (base) == PARM_DECL > + || TREE_CODE (base) == RESULT_DECL) > + && DECL_SIZE (base) > + && TREE_CODE (DECL_SIZE (base)) == INTEGER_CST > + && compare_tree_int (DECL_SIZE (base), bitpos) >= 0 > + && (!is_global_var (base) || decl_binds_to_current_def_p (base))) > + return; > + } > + else if (TREE_CODE (inner) == MEM_REF) > + base = TREE_OPERAND (inner, 0); > + else > + return; > + tree ptr = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (t)), t); > + > + if (!POINTER_TYPE_P (TREE_TYPE (base)) && !DECL_P (base)) > + return; > + > + tree base_addr = base; > + if (decl_p) > + base_addr = build1 (ADDR_EXPR, > + build_pointer_type (TREE_TYPE (base)), base); > + t = fold_build2 (MINUS_EXPR, sizetype, > + fold_convert (pointer_sized_int_node, ptr), > + fold_convert (pointer_sized_int_node, base_addr)); > + t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, > + GSI_SAME_STMT); > + base_addr = force_gimple_operand_gsi (gsi, base_addr, true, NULL_TREE, true, > + GSI_SAME_STMT); > + instrument_pointer_overflow (gsi, base_addr, t); > +} > + > /* Build an ubsan builtin call for the signed-integer-overflow > sanitization. CODE says what kind of builtin are we building, > LOC is a location, LHSTYPE is the type of LHS, OP0 and OP1 > @@ -1828,7 +2091,7 @@ instrument_object_size (gimple_stmt_iter > { > tree rhs1 = gimple_assign_rhs1 (def_stmt); > if (TREE_CODE (rhs1) == SSA_NAME > - && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1)) > + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1)) > break; > else > base = rhs1; > @@ -1952,7 +2215,8 @@ public: > | SANITIZE_ALIGNMENT > | SANITIZE_NONNULL_ATTRIBUTE > | SANITIZE_RETURNS_NONNULL_ATTRIBUTE > - | SANITIZE_OBJECT_SIZE)); > + | SANITIZE_OBJECT_SIZE > + | SANITIZE_POINTER_OVERFLOW)); > } > > virtual unsigned int execute (function *); > @@ -2043,6 +2307,32 @@ pass_ubsan::execute (function *fun) > } > } > } > + > + if (sanitize_flags_p (SANITIZE_POINTER_OVERFLOW, fun->decl)) > + { > + if (is_gimple_assign (stmt) > + && gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR) > + instrument_pointer_overflow (&gsi, > + gimple_assign_rhs1 (stmt), > + gimple_assign_rhs2 (stmt)); > + if (gimple_store_p (stmt)) > + maybe_instrument_pointer_overflow (&gsi, > + gimple_get_lhs (stmt)); > + if (gimple_assign_single_p (stmt)) > + maybe_instrument_pointer_overflow (&gsi, > + gimple_assign_rhs1 (stmt)); > + if (is_gimple_call (stmt)) > + { > + unsigned args_num = gimple_call_num_args (stmt); > + for (unsigned i = 0; i < args_num; ++i) > + { > + tree arg = gimple_call_arg (stmt, i); > + if (is_gimple_reg (arg)) > + continue; > + maybe_instrument_pointer_overflow (&gsi, arg); > + } > + } > + } > > gsi_next (&gsi); > } > --- gcc/internal-fn.c.jj 2017-06-15 11:03:25.053821114 +0200 > +++ gcc/internal-fn.c 2017-06-15 11:06:53.570321578 +0200 > @@ -401,6 +401,14 @@ expand_UBSAN_VPTR (internal_fn, gcall *) > /* This should get expanded in the sanopt pass. */ > > static void > +expand_UBSAN_PTR (internal_fn, gcall *) > +{ > + gcc_unreachable (); > +} > + > +/* This should get expanded in the sanopt pass. */ > + > +static void > expand_UBSAN_OBJECT_SIZE (internal_fn, gcall *) > { > gcc_unreachable (); > --- gcc/ubsan.h.jj 2017-06-14 18:07:46.174751709 +0200 > +++ gcc/ubsan.h 2017-06-15 11:06:53.570321578 +0200 > @@ -45,6 +45,7 @@ enum ubsan_print_style { > extern bool ubsan_expand_bounds_ifn (gimple_stmt_iterator *); > extern bool ubsan_expand_null_ifn (gimple_stmt_iterator *); > extern bool ubsan_expand_objsize_ifn (gimple_stmt_iterator *); > +extern bool ubsan_expand_ptr_ifn (gimple_stmt_iterator *); > extern bool ubsan_expand_vptr_ifn (gimple_stmt_iterator *); > extern bool ubsan_instrument_unreachable (gimple_stmt_iterator *); > extern tree ubsan_create_data (const char *, int, const location_t *, ...); > --- gcc/sanitizer.def.jj 2017-06-15 11:03:25.054821102 +0200 > +++ gcc/sanitizer.def 2017-06-15 11:06:53.571321566 +0200 > @@ -444,6 +444,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HAN > "__ubsan_handle_load_invalid_value", > BT_FN_VOID_PTR_PTR, > ATTR_COLD_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_POINTER_OVERFLOW, > + "__ubsan_handle_pointer_overflow", > + BT_FN_VOID_PTR_PTR_PTR, > + ATTR_COLD_NOTHROW_LEAF_LIST) > DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW_ABORT, > "__ubsan_handle_divrem_overflow_abort", > BT_FN_VOID_PTR_PTR_PTR, > @@ -480,6 +484,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HAN > "__ubsan_handle_load_invalid_value_abort", > BT_FN_VOID_PTR_PTR, > ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_POINTER_OVERFLOW_ABORT, > + "__ubsan_handle_pointer_overflow_abort", > + BT_FN_VOID_PTR_PTR_PTR, > + ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST) > DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW, > "__ubsan_handle_float_cast_overflow", > BT_FN_VOID_PTR_PTR, > --- gcc/tree-ssa-tail-merge.c.jj 2017-06-14 18:07:45.739756964 +0200 > +++ gcc/tree-ssa-tail-merge.c 2017-06-15 11:06:53.571321566 +0200 > @@ -1239,6 +1239,7 @@ merge_stmts_p (gimple *stmt1, gimple *st > case IFN_UBSAN_CHECK_SUB: > case IFN_UBSAN_CHECK_MUL: > case IFN_UBSAN_OBJECT_SIZE: > + case IFN_UBSAN_PTR: > case IFN_ASAN_CHECK: > /* For these internal functions, gimple_location is an implicit > parameter, which will be used explicitly after expansion. > --- gcc/internal-fn.def.jj 2017-06-14 18:07:45.679757689 +0200 > +++ gcc/internal-fn.def 2017-06-15 11:06:53.572321554 +0200 > @@ -165,6 +165,7 @@ DEF_INTERNAL_FN (UBSAN_VPTR, ECF_LEAF | > DEF_INTERNAL_FN (UBSAN_CHECK_ADD, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) > DEF_INTERNAL_FN (UBSAN_CHECK_SUB, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) > DEF_INTERNAL_FN (UBSAN_CHECK_MUL, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) > +DEF_INTERNAL_FN (UBSAN_PTR, ECF_LEAF | ECF_NOTHROW, ".R.") > DEF_INTERNAL_FN (UBSAN_OBJECT_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) > DEF_INTERNAL_FN (ABNORMAL_DISPATCHER, ECF_NORETURN, NULL) > DEF_INTERNAL_FN (BUILTIN_EXPECT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) > --- gcc/opts.c.jj 2017-06-14 18:07:46.411748846 +0200 > +++ gcc/opts.c 2017-06-15 11:06:53.572321554 +0200 > @@ -1504,6 +1504,7 @@ const struct sanitizer_opts_s sanitizer_ > true), > SANITIZER_OPT (object-size, SANITIZE_OBJECT_SIZE, true), > SANITIZER_OPT (vptr, SANITIZE_VPTR, true), > + SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true), > SANITIZER_OPT (all, ~0U, true), > #undef SANITIZER_OPT > { NULL, 0U, 0UL, false } > --- gcc/lto-streamer-in.c.jj 2017-06-14 18:07:45.803756191 +0200 > +++ gcc/lto-streamer-in.c 2017-06-15 11:06:53.573321541 +0200 > @@ -1143,6 +1143,10 @@ input_function (tree fn_decl, struct dat > if ((flag_sanitize & SANITIZE_OBJECT_SIZE) == 0) > remove = true; > break; > + case IFN_UBSAN_PTR: > + if ((flag_sanitize & SANITIZE_POINTER_OVERFLOW) == 0) > + remove = true; > + break; > case IFN_ASAN_MARK: > if ((flag_sanitize & SANITIZE_ADDRESS) == 0) > remove = true; > --- gcc/testsuite/c-c++-common/ubsan/ptr-overflow-1.c.jj 2017-06-15 11:06:17.700755118 +0200 > +++ gcc/testsuite/c-c++-common/ubsan/ptr-overflow-1.c 2017-06-16 13:04:29.216377665 +0200 > @@ -0,0 +1,65 @@ > +/* PR sanitizer/80998 */ > +/* { dg-do run } */ > +/* { dg-options "-fsanitize=pointer-overflow -fno-sanitize-recover=pointer-overflow -Wall" } */ > + > +struct S { int a; int b; int c[64]; }; > +__attribute__((noinline, noclone)) char *f1 (char *p) { return p + 1; } > +__attribute__((noinline, noclone)) char *f2 (char *p) { return p - 1; } > +__attribute__((noinline, noclone)) char *f3 (char *p, int i) { return p + i; } > +__attribute__((noinline, noclone)) char *f4 (char *p, int i) { return p - i; } > +__attribute__((noinline, noclone)) char *f5 (char *p, unsigned long int i) { return p + i; } > +__attribute__((noinline, noclone)) char *f6 (char *p, unsigned long int i) { return p - i; } > +__attribute__((noinline, noclone)) int *f7 (struct S *p) { return &p->a; } > +__attribute__((noinline, noclone)) int *f8 (struct S *p) { return &p->b; } > +__attribute__((noinline, noclone)) int *f9 (struct S *p) { return &p->c[64]; } > +__attribute__((noinline, noclone)) int *f10 (struct S *p, int i) { return &p->c[i]; } > + > +char *volatile p; > +struct S *volatile q; > +char a[64]; > +struct S s; > +int *volatile r; > + > +int > +main () > +{ > + struct S t; > + p = &a[32]; > + p = f1 (p); > + p = f1 (p); > + p = f2 (p); > + p = f3 (p, 1); > + p = f3 (p, -1); > + p = f3 (p, 3); > + p = f3 (p, -6); > + p = f4 (p, 1); > + p = f4 (p, -1); > + p = f4 (p, 3); > + p = f4 (p, -6); > + p = f5 (p, 1); > + p = f5 (p, 3); > + p = f6 (p, 1); > + p = f6 (p, 3); > + if (sizeof (unsigned long) >= sizeof (char *)) > + { > + p = f5 (p, -1); > + p = f5 (p, -6); > + p = f6 (p, -1); > + p = f6 (p, -6); > + } > + q = &s; > + r = f7 (q); > + r = f8 (q); > + r = f9 (q); > + r = f10 (q, 0); > + r = f10 (q, 10); > + r = f10 (q, 64); > + q = &t; > + r = f7 (q); > + r = f8 (q); > + r = f9 (q); > + r = f10 (q, 0); > + r = f10 (q, 10); > + r = f10 (q, 64); > + return 0; > +} > --- gcc/testsuite/c-c++-common/ubsan/ptr-overflow-2.c.jj 2017-06-15 11:06:17.700755118 +0200 > +++ gcc/testsuite/c-c++-common/ubsan/ptr-overflow-2.c 2017-06-16 14:00:57.545611263 +0200 > @@ -0,0 +1,113 @@ > +/* PR sanitizer/80998 */ > +/* { dg-do run } */ > +/* { dg-options "-fsanitize=pointer-overflow -fsanitize-recover=pointer-overflow -fno-ipa-icf -Wall" } */ > + > +__attribute__((noinline, noclone)) char * f1 (char *p) { return p + 1; } > +__attribute__((noinline, noclone)) char * f2 (char *p) { return p - 1; } > +__attribute__((noinline, noclone)) char * f3 (char *p, int i) { return p + i; } > +__attribute__((noinline, noclone)) char * f4 (char *p, int i) { return p + i; } > +__attribute__((noinline, noclone)) char * f5 (char *p, int i) { return p - i; } > +__attribute__((noinline, noclone)) char * f6 (char *p, int i) { return p - i; } > +__attribute__((noinline, noclone)) char * f7 (char *p, unsigned long int i) { return p + i; } > +__attribute__((noinline, noclone)) char * f8 (char *p, unsigned long int i) { return p + i; } > +__attribute__((noinline, noclone)) char * f9 (char *p, unsigned long int i) { return p - i; } > +__attribute__((noinline, noclone)) char * f10 (char *p, unsigned long int i) { return p - i; } > +struct S { int a; int b; int c[64]; }; > +__attribute__((noinline, noclone)) int *f11 (struct S *p) { return &p->a; } > +__attribute__((noinline, noclone)) int *f12 (struct S *p) { return &p->b; } > +__attribute__((noinline, noclone)) int *f13 (struct S *p) { return &p->c[64]; } > +__attribute__((noinline, noclone)) int *f14 (struct S *p, int i) { return &p->c[i]; } > +__attribute__((noinline, noclone)) int *f15 (struct S *p, int i) { return &p->c[i]; } > +__attribute__((noinline, noclone)) int *f16 (struct S *p) { return &p->a; } > +__attribute__((noinline, noclone)) int *f17 (struct S *p) { return &p->b; } > +__attribute__((noinline, noclone)) int *f18 (struct S *p) { return &p->c[64]; } > +__attribute__((noinline, noclone)) int *f19 (struct S *p, int i) { return &p->c[i]; } > +__attribute__((noinline, noclone)) int *f20 (struct S *p, int i) { return &p->c[i]; } > +__attribute__((noinline, noclone)) int *f21 (struct S *p) { return &p->a; } > +__attribute__((noinline, noclone)) int *f22 (struct S *p) { return &p->b; } > +__attribute__((noinline, noclone)) int *f23 (struct S *p) { return &p->c[64]; } > +__attribute__((noinline, noclone)) int *f24 (struct S *p, int i) { return &p->c[i]; } > +__attribute__((noinline, noclone)) int *f25 (struct S *p, int i) { return &p->c[i]; } > + > +char *volatile p; > +__UINTPTR_TYPE__ volatile u; > +struct S *volatile q; > +int *volatile r; > + > +int > +main () > +{ > + u = ~(__UINTPTR_TYPE__) 0; > + p = (char *) u; > + p = f1 (p); > + u = 0; > + p = (char *) u; > + p = f2 (p); > + u = -(__UINTPTR_TYPE__) 7; > + p = (char *) u; > + p = f3 (p, 7); > + u = 3; > + p = (char *) u; > + p = f4 (p, -4); > + u = 23; > + p = (char *) u; > + p = f5 (p, 27); > + u = -(__UINTPTR_TYPE__) 15; > + p = (char *) u; > + p = f6 (p, -15); > + u = -(__UINTPTR_TYPE__) 29; > + p = (char *) u; > + p = f7 (p, 31); > + u = 23; > + p = (char *) u; > + p = f9 (p, 24); > + if (sizeof (unsigned long) < sizeof (char *)) > + return 0; > + u = 7; > + p = (char *) u; > + p = f8 (p, -8); > + u = -(__UINTPTR_TYPE__) 25; > + p = (char *) u; > + p = f10 (p, -25); > + u = ~(__UINTPTR_TYPE__) 0; > + q = (struct S *) u; > + r = f11 (q); > + r = f12 (q); > + r = f13 (q); > + r = f14 (q, 0); > + r = f15 (q, 63); > + u = ~(__UINTPTR_TYPE__) 0 - (17 * sizeof (int)); > + q = (struct S *) u; > + r = f16 (q); > + r = f17 (q); > + r = f18 (q); > + r = f19 (q, 0); > + r = f20 (q, 63); > + u = 3 * sizeof (int); > + q = (struct S *) u; > + r = f21 (q); > + r = f22 (q); > + r = f23 (q); > + r = f24 (q, -2); > + r = f25 (q, -6); > + return 0; > +} > + > +/* { dg-output ":5:6\[79]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+(\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:6:6\[79]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+ overflowed to (0\[xX])?\[fF]\+(\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:7:7\[46]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+9 overflowed to (0\[xX])?0\+(\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:8:7\[46]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+3 overflowed to (0\[xX])?\[fF]\+(\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:9:7\[46]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+17 overflowed to (0\[xX])?\[fF]\+\[cC](\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:10:7\[46]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+1 overflowed to (0\[xX])?0\+(\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:11:\[89]\[80]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+\[eE]3 overflowed to (0\[xX])?0\+2(\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:13:\[89]\[80]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+17 overflowed to (0\[xX])?\[fF]\+(\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:12:\[89]\[80]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+7 overflowed to (0\[xX])?\[fF]\+(\n|\r\n|\r)" } */ > +/* { dg-output "\[^\n\r]*:14:\[89]\[91]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+\[eE]7 overflowed to (0\[xX])?0\+" } */ > +/* { dg-output "(\n|\r\n|\r)" { target int32 } } */ > +/* { dg-output "\[^\n\r]*:17:\[67]\[82]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+3(\n|\r\n|\r)" { target int32 } } */ > +/* { dg-output "\[^\n\r]*:18:\[67]\[86]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+107(\n|\r\n|\r)" { target int32 } } */ > +/* { dg-output "\[^\n\r]*:19:\[78]\[52]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+7(\n|\r\n|\r)" { target int32 } } */ > +/* { dg-output "\[^\n\r]*:20:\[78]\[52]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+ overflowed to (0\[xX])?0\+103(\n|\r\n|\r)" { target int32 } } */ > +/* { dg-output "\[^\n\r]*:23:\[67]\[86]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+\[bB]\[bB] overflowed to (0\[xX])?0\+\[cC]3(\n|\r\n|\r)" { target int32 } } */ > +/* { dg-output "\[^\n\r]*:25:\[78]\[52]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?\[fF]\+\[bB]\[bB] overflowed to (0\[xX])?0\+\[bB]\[fF](\n|\r\n|\r)" { target int32 } } */ > +/* { dg-output "\[^\n\r]*:30:\[78]\[52]\[^\n\r]*runtime error: pointer index expression with base (0\[xX])?0\+\[cC] overflowed to (0\[xX])?\[fF]\+\[cC]" { target int32 } } */ > --- libsanitizer/ubsan/ubsan_handlers.cc.jj 2016-11-16 18:51:53.028794605 +0100 > +++ libsanitizer/ubsan/ubsan_handlers.cc 2017-06-14 09:54:25.571687721 +0200 > @@ -521,6 +521,37 @@ void __ubsan::__ubsan_handle_nonnull_arg > Die(); > } > > +static void handlePointerOverflowImpl(PointerOverflowData *Data, > + ValueHandle Base, > + ValueHandle Result, > + ReportOptions Opts) { > + SourceLocation Loc = Data->Loc.acquire(); > + ErrorType ET = ErrorType::PointerOverflow; > + > + if (ignoreReport(Loc, Opts, ET)) > + return; > + > + ScopedReport R(Opts, Loc, ET); > + > + Diag(Loc, DL_Error, "pointer index expression with base %0 overflowed to %1") > + << (void *)Base << (void*)Result; > +} > + > +void __ubsan::__ubsan_handle_pointer_overflow(PointerOverflowData *Data, > + ValueHandle Base, > + ValueHandle Result) { > + GET_REPORT_OPTIONS(false); > + handlePointerOverflowImpl(Data, Base, Result, Opts); > +} > + > +void __ubsan::__ubsan_handle_pointer_overflow_abort(PointerOverflowData *Data, > + ValueHandle Base, > + ValueHandle Result) { > + GET_REPORT_OPTIONS(true); > + handlePointerOverflowImpl(Data, Base, Result, Opts); > + Die(); > +} > + > static void handleCFIBadIcall(CFICheckFailData *Data, ValueHandle Function, > ReportOptions Opts) { > if (Data->CheckKind != CFITCK_ICall) > --- libsanitizer/ubsan/ubsan_checks.inc.jj 2016-11-09 15:22:50.139249654 +0100 > +++ libsanitizer/ubsan/ubsan_checks.inc 2017-06-14 09:54:25.571687721 +0200 > @@ -17,6 +17,7 @@ > > UBSAN_CHECK(GenericUB, "undefined-behavior", "undefined") > UBSAN_CHECK(NullPointerUse, "null-pointer-use", "null") > +UBSAN_CHECK(PointerOverflow, "pointer-overflow", "pointer-overflow") > UBSAN_CHECK(MisalignedPointerUse, "misaligned-pointer-use", "alignment") > UBSAN_CHECK(InsufficientObjectSize, "insufficient-object-size", "object-size") > UBSAN_CHECK(SignedIntegerOverflow, "signed-integer-overflow", > --- libsanitizer/ubsan/ubsan_handlers.h.jj 2016-11-16 18:51:53.029794593 +0100 > +++ libsanitizer/ubsan/ubsan_handlers.h 2017-06-14 09:54:25.571687721 +0200 > @@ -146,6 +146,13 @@ struct NonNullArgData { > /// \brief Handle passing null pointer to function with nonnull attribute. > RECOVERABLE(nonnull_arg, NonNullArgData *Data) > > +struct PointerOverflowData { > + SourceLocation Loc; > +}; > + > +RECOVERABLE(pointer_overflow, PointerOverflowData *Data, ValueHandle Base, > + ValueHandle Result) > + > /// \brief Known CFI check kinds. > /// Keep in sync with the enum of the same name in CodeGenFunction.h > enum CFITypeCheckKind : unsigned char { > > Jakub > > -- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) 2017-06-20 7:41 ` Richard Biener @ 2017-06-20 8:14 ` Jakub Jelinek 2017-06-20 8:18 ` Richard Biener 2017-06-21 8:00 ` Jakub Jelinek 1 sibling, 1 reply; 21+ messages in thread From: Jakub Jelinek @ 2017-06-20 8:14 UTC (permalink / raw) To: Richard Biener; +Cc: Martin Liška, gcc-patches On Tue, Jun 20, 2017 at 09:41:43AM +0200, Richard Biener wrote: > On Mon, 19 Jun 2017, Jakub Jelinek wrote: > > > Hi! > > > > The following patch adds -fsanitize=pointer-overflow support, > > which adds instrumentation (included in -fsanitize=undefined) that checks > > that pointer arithmetics doesn't wrap. If the offset on ptr p+ off when treating > > it as signed value is non-negative, we check whether the result is bigger > > (uintptr_t comparison) than ptr, if it is negative in ssizetype, we check > > whether the result is smaller than ptr, otherwise we check at runtime > > whether (ssizetype) off < 0 and do the check based on that. > > The patch checks both POINTER_PLUS_EXPR, as well as e.g. ADDR_EXPR of > > handled components, and even handled components themselves (exception > > is for constant offset when the base is an automatic non-VLA decl or > > decl that binds to current function where we can at compile time for > > sure guarantee it will fit). > > Does this "properly" interact with any array-bound sanitizing we do? > Say, for > > &a->b[i].c.d > > ? It doesn't interact with it right now at all, and I think it shouldn't for the case you wrote, &a->b[i].c.d even if i is within bounds of the declared array the pointer still could point to something that wraps around. struct S { struct T { struct U { int d; } c; } b[0x1000000]; } *a; could still in a buggy program point to something that is actually much shorter like a = (struct S *) malloc (0x10000 * sizeof (int)); or similar and there could be a wrap around. What I have considered, but haven't implemented yet, is checking if there is UBSAN_BOUNDS sanitization and it is actually array or structure with array etc. (i.e. there is no pointer dereference) - for &q.b[i].c.d if the bounds check is present and we are sure about the object size (i.e. automatic variable or locally defined file scope var). > > Martin has said he'll write the sanopt part of optimization > > (if UBSAN_PTR for some pointer is dominated by UBSAN_PTR for the same > > pointer and the offset is constant in both cases and equal or absolute value > > bigger and same sign in the dominating UBSAN_PTR, we can avoid the dominated > > check). > > > > For the cases where there is a dereference (i.e. not ADDR_EXPR of the > > handled component or POINTER_PLUS_EXPR), I wonder if we couldn't ignore > > say constant offsets in range <-4096, 4096> or something similar, hoping > > people don't have anything mapped at the page 0 and -pagesize in hosted > > env. Thoughts on that? > > Not sure what the problem is here? It would be an attempt to avoid sanitizing int foo (int *p) { return p[10] + p[-5]; } (when the offset is constant and small and we dereference it). If there is no page mapped at NULL or at the highest page in the virtual address space, then the above will crash in case p + 10 or p - 5 wraps around. > > I've bootstrapped/regtested the patch on x86_64-linux and i686-linux > > and additionally bootstrapped/regtested with bootstrap-ubsan on both too. > > The latter revealed a couple of issues I'd like to discuss: > > > > 1) libcpp/symtab.c contains a couple of spots reduced into: > > #define DELETED ((char *) -1) > > void bar (char *); > > void > > foo (char *p) > > { > > if (p && p != DELETED) > > bar (p); > > } > > where we fold it early into if ((p p+ -1) <= (char *) -3) > > and as the instrumentation is done during ubsan pass, if p is NULL, > > we diagnose this as invalid pointer overflow from NULL to 0xffff*f. > > Shall we change the folder so that during GENERIC folding it > > actually does the addition and comparison in pointer_sized_int > > instead (my preference), or shall I move the UBSAN_PTR instrumentation > > earlier into the FEs (but then I still risk stuff is folded earlier)? > > Aww, so we turn the pointer test into a range test ;) That it uses > a pointer type rather than an unsigned integer type is a bug, probably > caused by pointers being TYPE_UNSIGNED. > > Not sure if the folding itself is worthwhile to keep though, thus an > option would be to not generate range tests from pointers? I'll have a look. Maybe only do it during reassoc and not earlier. > > 3) not really related to this patch, but something I also saw during the > > bootstrap-ubsan on i686-linux: > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147475412 cannot be represented in type 'int' > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147478324 cannot be represented in type 'int' > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147451580 cannot be represented in type 'int' > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147465664 cannot be represented in type 'int' > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147469348 - 2147451544 cannot be represented in type 'int' > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147482364 - 2147475376 cannot be represented in type 'int' > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483624 - 2147475376 cannot be represented in type 'int' > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483628 - 2147451544 cannot be represented in type 'int' > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147426384 - 2147475376 cannot be represented in type 'int' > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147450216 - 2147451544 cannot be represented in type 'int' > > The problem here is that we lower pointer subtraction, e.g. > > long foo (char *p, char *q) { return q - p; } > > as return (ptrdiff_t) ((ssizetype) q - (ssizetype) p); > > and even for a valid testcase where we have an array across > > the middle of the virtual address space, say the first one above > > is (char *) 0x8000dfb0 - (char *) 0x7fffdfd4 subtraction, even if > > there is 128KB array starting at 0x7fffd000, it will yield > > UB (not in the source, but in whatever the compiler lowered it into). > > So, shall we instead do the subtraction in sizetype and only then > > cast? For sizeof (*ptr) > 1 I think we have some outstanding PR, > > and it is more difficult to find out in what types to compute it. > > Or do we want to introduce POINTER_DIFF_EXPR? > > Just use uintptr_t for the difference computation (well, an unsigned > integer type of desired precision -- mind address-spaces), then cast > the result to signed. Ok (of course, will handle this separately from the rest). Jakub ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) 2017-06-20 8:14 ` Jakub Jelinek @ 2017-06-20 8:18 ` Richard Biener 2017-06-21 7:58 ` Jakub Jelinek ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Richard Biener @ 2017-06-20 8:18 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Martin Liška, gcc-patches On Tue, 20 Jun 2017, Jakub Jelinek wrote: > On Tue, Jun 20, 2017 at 09:41:43AM +0200, Richard Biener wrote: > > On Mon, 19 Jun 2017, Jakub Jelinek wrote: > > > > > Hi! > > > > > > The following patch adds -fsanitize=pointer-overflow support, > > > which adds instrumentation (included in -fsanitize=undefined) that checks > > > that pointer arithmetics doesn't wrap. If the offset on ptr p+ off when treating > > > it as signed value is non-negative, we check whether the result is bigger > > > (uintptr_t comparison) than ptr, if it is negative in ssizetype, we check > > > whether the result is smaller than ptr, otherwise we check at runtime > > > whether (ssizetype) off < 0 and do the check based on that. > > > The patch checks both POINTER_PLUS_EXPR, as well as e.g. ADDR_EXPR of > > > handled components, and even handled components themselves (exception > > > is for constant offset when the base is an automatic non-VLA decl or > > > decl that binds to current function where we can at compile time for > > > sure guarantee it will fit). > > > > Does this "properly" interact with any array-bound sanitizing we do? > > Say, for > > > > &a->b[i].c.d > > > > ? > > It doesn't interact with it right now at all, and I think it shouldn't > for the case you wrote, &a->b[i].c.d even if i is within bounds of the > declared array the pointer still could point to something that wraps around. > struct S { struct T { struct U { int d; } c; } b[0x1000000]; } *a; > could still in a buggy program point to something that is actually much > shorter like a = (struct S *) malloc (0x10000 * sizeof (int)); > or similar and there could be a wrap around. > > What I have considered, but haven't implemented yet, is checking if there > is UBSAN_BOUNDS sanitization and it is actually array or structure with > array etc. (i.e. there is no pointer dereference) - for > &q.b[i].c.d > if the bounds check is present and we are sure about the object size > (i.e. automatic variable or locally defined file scope var). > > > > Martin has said he'll write the sanopt part of optimization > > > (if UBSAN_PTR for some pointer is dominated by UBSAN_PTR for the same > > > pointer and the offset is constant in both cases and equal or absolute value > > > bigger and same sign in the dominating UBSAN_PTR, we can avoid the dominated > > > check). > > > > > > For the cases where there is a dereference (i.e. not ADDR_EXPR of the > > > handled component or POINTER_PLUS_EXPR), I wonder if we couldn't ignore > > > say constant offsets in range <-4096, 4096> or something similar, hoping > > > people don't have anything mapped at the page 0 and -pagesize in hosted > > > env. Thoughts on that? > > > > Not sure what the problem is here? > > It would be an attempt to avoid sanitizing int foo (int *p) { return p[10] + p[-5]; } > (when the offset is constant and small and we dereference it). > If there is no page mapped at NULL or at the highest page in the virtual > address space, then the above will crash in case p + 10 or p - 5 wraps > around. Ah, so merely an optimization to avoid excessive instrumentation then, yes, this might work (maybe make 4096 a --param configurable to be able to disable it). > > > I've bootstrapped/regtested the patch on x86_64-linux and i686-linux > > > and additionally bootstrapped/regtested with bootstrap-ubsan on both too. > > > The latter revealed a couple of issues I'd like to discuss: > > > > > > 1) libcpp/symtab.c contains a couple of spots reduced into: > > > #define DELETED ((char *) -1) > > > void bar (char *); > > > void > > > foo (char *p) > > > { > > > if (p && p != DELETED) > > > bar (p); > > > } > > > where we fold it early into if ((p p+ -1) <= (char *) -3) > > > and as the instrumentation is done during ubsan pass, if p is NULL, > > > we diagnose this as invalid pointer overflow from NULL to 0xffff*f. > > > Shall we change the folder so that during GENERIC folding it > > > actually does the addition and comparison in pointer_sized_int > > > instead (my preference), or shall I move the UBSAN_PTR instrumentation > > > earlier into the FEs (but then I still risk stuff is folded earlier)? > > > > Aww, so we turn the pointer test into a range test ;) That it uses > > a pointer type rather than an unsigned integer type is a bug, probably > > caused by pointers being TYPE_UNSIGNED. > > > > Not sure if the folding itself is worthwhile to keep though, thus an > > option would be to not generate range tests from pointers? > > I'll have a look. Maybe only do it during reassoc and not earlier. It certainly looks somewhat premature in fold-const.c. > > > 3) not really related to this patch, but something I also saw during the > > > bootstrap-ubsan on i686-linux: > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147475412 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147478324 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147451580 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147465664 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147469348 - 2147451544 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147482364 - 2147475376 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483624 - 2147475376 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483628 - 2147451544 cannot be represented in type 'int' > > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147426384 - 2147475376 cannot be represented in type 'int' > > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147450216 - 2147451544 cannot be represented in type 'int' > > > The problem here is that we lower pointer subtraction, e.g. > > > long foo (char *p, char *q) { return q - p; } > > > as return (ptrdiff_t) ((ssizetype) q - (ssizetype) p); > > > and even for a valid testcase where we have an array across > > > the middle of the virtual address space, say the first one above > > > is (char *) 0x8000dfb0 - (char *) 0x7fffdfd4 subtraction, even if > > > there is 128KB array starting at 0x7fffd000, it will yield > > > UB (not in the source, but in whatever the compiler lowered it into). > > > So, shall we instead do the subtraction in sizetype and only then > > > cast? For sizeof (*ptr) > 1 I think we have some outstanding PR, > > > and it is more difficult to find out in what types to compute it. > > > Or do we want to introduce POINTER_DIFF_EXPR? > > > > Just use uintptr_t for the difference computation (well, an unsigned > > integer type of desired precision -- mind address-spaces), then cast > > the result to signed. > > Ok (of course, will handle this separately from the rest). Yes. Note I didn't look at the actual patch (yet). Richard. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) 2017-06-20 8:18 ` Richard Biener @ 2017-06-21 7:58 ` Jakub Jelinek 2017-06-21 8:04 ` Richard Biener 2017-06-21 14:40 ` [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) Jakub Jelinek 2017-07-04 8:53 ` [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) Jakub Jelinek 2 siblings, 1 reply; 21+ messages in thread From: Jakub Jelinek @ 2017-06-21 7:58 UTC (permalink / raw) To: Richard Biener; +Cc: Martin Liška, gcc-patches [-- Attachment #1: Type: text/plain, Size: 2971 bytes --] On Tue, Jun 20, 2017 at 10:18:20AM +0200, Richard Biener wrote: > > It would be an attempt to avoid sanitizing int foo (int *p) { return p[10] + p[-5]; } > > (when the offset is constant and small and we dereference it). > > If there is no page mapped at NULL or at the highest page in the virtual > > address space, then the above will crash in case p + 10 or p - 5 wraps > > around. > > Ah, so merely an optimization to avoid excessive instrumentation then, > yes, this might work (maybe make 4096 a --param configurable to be able > to disable it). Yes. And I think it can be implemented incrementally. > > > > I've bootstrapped/regtested the patch on x86_64-linux and i686-linux > > > > and additionally bootstrapped/regtested with bootstrap-ubsan on both too. > > > > The latter revealed a couple of issues I'd like to discuss: > > > > > > > > 1) libcpp/symtab.c contains a couple of spots reduced into: > > > > #define DELETED ((char *) -1) > > > > void bar (char *); > > > > void > > > > foo (char *p) > > > > { > > > > if (p && p != DELETED) > > > > bar (p); > > > > } > > > > where we fold it early into if ((p p+ -1) <= (char *) -3) > > > > and as the instrumentation is done during ubsan pass, if p is NULL, > > > > we diagnose this as invalid pointer overflow from NULL to 0xffff*f. > > > > Shall we change the folder so that during GENERIC folding it > > > > actually does the addition and comparison in pointer_sized_int > > > > instead (my preference), or shall I move the UBSAN_PTR instrumentation > > > > earlier into the FEs (but then I still risk stuff is folded earlier)? > > > > > > Aww, so we turn the pointer test into a range test ;) That it uses > > > a pointer type rather than an unsigned integer type is a bug, probably > > > caused by pointers being TYPE_UNSIGNED. > > > > > > Not sure if the folding itself is worthwhile to keep though, thus an > > > option would be to not generate range tests from pointers? > > > > I'll have a look. Maybe only do it during reassoc and not earlier. > > It certainly looks somewhat premature in fold-const.c. So for this, I have right now 2 variant patches: The first one keeps doing what we were except for the -fsanitize=pointer-overflow case and has been bootstrap-ubsan bootstrapped/regtested on x86_64-linux and i686-linux. The second one performs the addition and comparison in pointer sized unsigned type instead (not bootstrapped yet). I think the second one would be my preference. Note build_range_check is used not just during early folding, but e.g. during ifcombine, reassoc etc. Martin is contemplating instrumentation of pointer <=/</>=/> comparisons and in that case we'd need some further build_range_check changes, because while ptr == (void *) 0 || ptr == (void *) 1 || ptr == (void *) 2 would be without UB, ptr <= (void *) 2 would be UB, so we'd need to perform all pointer range checks in integral type except the ones where we just do EQ_EXPR/NE_EXPR. Jakub [-- Attachment #2: U606j_ --] [-- Type: text/plain, Size: 1128 bytes --] 2017-06-21 Jakub Jelinek <jakub@redhat.com> PR sanitizer/80998 * fold-const.c: Include asan.h. (build_range_check): For -fsanitize=pointer-overflow don't add pointer arithmetics for range test. --- gcc/fold-const.c.jj 2017-06-14 18:07:47.000000000 +0200 +++ gcc/fold-const.c 2017-06-20 17:05:44.351608513 +0200 @@ -79,6 +79,7 @@ along with GCC; see the file COPYING3. #include "tree-vrp.h" #include "tree-ssanames.h" #include "selftest.h" +#include "asan.h" /* Nonzero if we are folding constants inside an initializer; zero otherwise. */ @@ -4906,6 +4907,14 @@ build_range_check (location_t loc, tree { if (value != 0 && !TREE_OVERFLOW (value)) { + /* Avoid creating pointer arithmetics that is not present + in the source when sanitizing. */ + if (!integer_zerop (low) + && current_function_decl + && sanitize_flags_p (SANITIZE_POINTER_OVERFLOW, + current_function_decl)) + return 0; + low = fold_build1_loc (loc, NEGATE_EXPR, TREE_TYPE (low), low); return build_range_check (loc, type, fold_build_pointer_plus_loc (loc, exp, low), [-- Attachment #3: U625 --] [-- Type: text/plain, Size: 2274 bytes --] 2017-06-21 Jakub Jelinek <jakub@redhat.com> * fold-const.c (build_range_check): Compute pointer range check in integral type if pointer arithmetics would be needed. Formatting fixes. --- gcc/fold-const.c.jj 2017-06-20 21:38:04.000000000 +0200 +++ gcc/fold-const.c 2017-06-21 09:23:00.572404964 +0200 @@ -4818,21 +4818,21 @@ build_range_check (location_t loc, tree if (low == 0) return fold_build2_loc (loc, LE_EXPR, type, exp, - fold_convert_loc (loc, etype, high)); + fold_convert_loc (loc, etype, high)); if (high == 0) return fold_build2_loc (loc, GE_EXPR, type, exp, - fold_convert_loc (loc, etype, low)); + fold_convert_loc (loc, etype, low)); if (operand_equal_p (low, high, 0)) return fold_build2_loc (loc, EQ_EXPR, type, exp, - fold_convert_loc (loc, etype, low)); + fold_convert_loc (loc, etype, low)); if (TREE_CODE (exp) == BIT_AND_EXPR && maskable_range_p (low, high, etype, &mask, &value)) return fold_build2_loc (loc, EQ_EXPR, type, fold_build2_loc (loc, BIT_AND_EXPR, etype, - exp, mask), + exp, mask), value); if (integer_zerop (low)) @@ -4864,7 +4864,7 @@ build_range_check (location_t loc, tree exp = fold_convert_loc (loc, etype, exp); } return fold_build2_loc (loc, GT_EXPR, type, exp, - build_int_cst (etype, 0)); + build_int_cst (etype, 0)); } } @@ -4895,25 +4895,15 @@ build_range_check (location_t loc, tree return 0; } + if (POINTER_TYPE_P (etype)) + etype = unsigned_type_for (etype); + high = fold_convert_loc (loc, etype, high); low = fold_convert_loc (loc, etype, low); exp = fold_convert_loc (loc, etype, exp); value = const_binop (MINUS_EXPR, high, low); - - if (POINTER_TYPE_P (etype)) - { - if (value != 0 && !TREE_OVERFLOW (value)) - { - low = fold_build1_loc (loc, NEGATE_EXPR, TREE_TYPE (low), low); - return build_range_check (loc, type, - fold_build_pointer_plus_loc (loc, exp, low), - 1, build_int_cst (etype, 0), value); - } - return 0; - } - if (value != 0 && !TREE_OVERFLOW (value)) return build_range_check (loc, type, fold_build2_loc (loc, MINUS_EXPR, etype, exp, low), ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) 2017-06-21 7:58 ` Jakub Jelinek @ 2017-06-21 8:04 ` Richard Biener 0 siblings, 0 replies; 21+ messages in thread From: Richard Biener @ 2017-06-21 8:04 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Martin Liška, gcc-patches On Wed, 21 Jun 2017, Jakub Jelinek wrote: > On Tue, Jun 20, 2017 at 10:18:20AM +0200, Richard Biener wrote: > > > It would be an attempt to avoid sanitizing int foo (int *p) { return p[10] + p[-5]; } > > > (when the offset is constant and small and we dereference it). > > > If there is no page mapped at NULL or at the highest page in the virtual > > > address space, then the above will crash in case p + 10 or p - 5 wraps > > > around. > > > > Ah, so merely an optimization to avoid excessive instrumentation then, > > yes, this might work (maybe make 4096 a --param configurable to be able > > to disable it). > > Yes. And I think it can be implemented incrementally. > > > > > > I've bootstrapped/regtested the patch on x86_64-linux and i686-linux > > > > > and additionally bootstrapped/regtested with bootstrap-ubsan on both too. > > > > > The latter revealed a couple of issues I'd like to discuss: > > > > > > > > > > 1) libcpp/symtab.c contains a couple of spots reduced into: > > > > > #define DELETED ((char *) -1) > > > > > void bar (char *); > > > > > void > > > > > foo (char *p) > > > > > { > > > > > if (p && p != DELETED) > > > > > bar (p); > > > > > } > > > > > where we fold it early into if ((p p+ -1) <= (char *) -3) > > > > > and as the instrumentation is done during ubsan pass, if p is NULL, > > > > > we diagnose this as invalid pointer overflow from NULL to 0xffff*f. > > > > > Shall we change the folder so that during GENERIC folding it > > > > > actually does the addition and comparison in pointer_sized_int > > > > > instead (my preference), or shall I move the UBSAN_PTR instrumentation > > > > > earlier into the FEs (but then I still risk stuff is folded earlier)? > > > > > > > > Aww, so we turn the pointer test into a range test ;) That it uses > > > > a pointer type rather than an unsigned integer type is a bug, probably > > > > caused by pointers being TYPE_UNSIGNED. > > > > > > > > Not sure if the folding itself is worthwhile to keep though, thus an > > > > option would be to not generate range tests from pointers? > > > > > > I'll have a look. Maybe only do it during reassoc and not earlier. > > > > It certainly looks somewhat premature in fold-const.c. > > So for this, I have right now 2 variant patches: > > The first one keeps doing what we were except for the > -fsanitize=pointer-overflow case and has been bootstrap-ubsan > bootstrapped/regtested on x86_64-linux and i686-linux. > > The second one performs the addition and comparison in pointer sized > unsigned type instead (not bootstrapped yet). > > I think the second one would be my preference. Note build_range_check > is used not just during early folding, but e.g. during ifcombine, reassoc > etc. > > Martin is contemplating instrumentation of pointer <=/</>=/> comparisons > and in that case we'd need some further build_range_check changes, > because while ptr == (void *) 0 || ptr == (void *) 1 || ptr == (void *) 2 > would be without UB, ptr <= (void *) 2 would be UB, so we'd need to perform > all pointer range checks in integral type except the ones where we just do > EQ_EXPR/NE_EXPR. Yes, exactly. The 2nd patch is ok if it passes bootstrap/test. Richard. ^ permalink raw reply [flat|nested] 21+ messages in thread
* [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-06-20 8:18 ` Richard Biener 2017-06-21 7:58 ` Jakub Jelinek @ 2017-06-21 14:40 ` Jakub Jelinek 2017-06-21 15:17 ` Jakub Jelinek 2017-06-21 16:27 ` Marc Glisse 2017-07-04 8:53 ` [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) Jakub Jelinek 2 siblings, 2 replies; 21+ messages in thread From: Jakub Jelinek @ 2017-06-21 14:40 UTC (permalink / raw) To: Richard Biener; +Cc: Martin Liška, gcc-patches On Tue, Jun 20, 2017 at 10:18:20AM +0200, Richard Biener wrote: > > > > 3) not really related to this patch, but something I also saw during the > > > > bootstrap-ubsan on i686-linux: > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147475412 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147478324 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147451580 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147465664 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147469348 - 2147451544 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147482364 - 2147475376 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483624 - 2147475376 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483628 - 2147451544 cannot be represented in type 'int' > > > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147426384 - 2147475376 cannot be represented in type 'int' > > > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147450216 - 2147451544 cannot be represented in type 'int' > > > > The problem here is that we lower pointer subtraction, e.g. > > > > long foo (char *p, char *q) { return q - p; } > > > > as return (ptrdiff_t) ((ssizetype) q - (ssizetype) p); > > > > and even for a valid testcase where we have an array across > > > > the middle of the virtual address space, say the first one above > > > > is (char *) 0x8000dfb0 - (char *) 0x7fffdfd4 subtraction, even if > > > > there is 128KB array starting at 0x7fffd000, it will yield > > > > UB (not in the source, but in whatever the compiler lowered it into). > > > > So, shall we instead do the subtraction in sizetype and only then > > > > cast? For sizeof (*ptr) > 1 I think we have some outstanding PR, > > > > and it is more difficult to find out in what types to compute it. > > > > Or do we want to introduce POINTER_DIFF_EXPR? > > > > > > Just use uintptr_t for the difference computation (well, an unsigned > > > integer type of desired precision -- mind address-spaces), then cast > > > the result to signed. > > > > Ok (of course, will handle this separately from the rest). > > Yes. Note I didn't look at the actual patch (yet). So, I wrote following patch to do the subtraction in unsigned type. It passes bootstrap, but on both x86_64-linux and i686-linux regresses: +FAIL: gcc.dg/torture/pr66178.c -O* (test for excess errors) +FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized "minus_expr" +FAIL: g++.dg/tree-ssa/pr21082.C -std=gnu++* (test for excess errors) E.g. in the first testcase we have in the test: static uintptr_t a = ((char *)&&l2-(char *)&&l3)+((char *)&&l1-(char *)&&l2); Without the patch, we ended up with: static uintptr_t a = (uintptr_t) (((long int) &l2 - (long int) &l3) + ((long int) &l1 - (long int) &l2)); but with the patch with (the negation in signed type sounds like a folding bug), which is too difficult for the initializer_constant_valid_p* handling: (uintptr_t) (((long unsigned int) -(long int) &l3 - (long unsigned int) &l2) + ((long unsigned int) &l2 + (long unsigned int) &l1)); Shall we just xfail that test, or make sure we don't reassociate such subtractions, something different? The second failure is on: int f (long *a, long *b, long *c) { __PTRDIFF_TYPE__ l1 = b - a; __PTRDIFF_TYPE__ l2 = c - a; return l1 < l2; } where without the patch during forwprop2 we optimize it using match.pd: /* X - Z < Y - Z is the same as X < Y when there is no overflow. */ because we had: b.0_1 = (long int) b_8(D); a.1_2 = (long int) a_9(D); _3 = b.0_1 - a.1_2; c.2_4 = (long int) c_11(D); a.3_5 = (long int) a_9(D); _6 = c.2_4 - a.3_5; _7 = _3 < _6; But with the patch we have: b.0_1 = (long unsigned int) b_9(D); a.1_2 = (long unsigned int) a_10(D); _3 = b.0_1 - a.1_2; _4 = (long int) _3; c.2_5 = (long unsigned int) c_11(D); _6 = c.2_5 - a.1_2; _7 = (long int) _6; _8 = _4 < _7; instead. But that is something we can't generally optimize. So do we need to introduce POINTER_DIFF (where we could still optimize this) or remove the test? If we rely on largest possible array to be half of the VA size - 1 (i.e. where for x > y both being pointers into the same array x - y > 0), then it is a valid optimization of the 2 pointer subtractions, but it is not a valid optimization on comparison of unsigned subtractions cast to signed type. The third one is if (&a[b] - &a[c] != b - c) link_error(); where fold already during generic folding used to be able to cope with it, but now we have: (long int) (((long unsigned int) b - (long unsigned int) c) * 4) /[ex] 4 != b - c which we don't fold. 2017-06-21 Jakub Jelinek <jakub@redhat.com> * c-typeck.c (pointer_diff): Perform subtraction in unsigned type and only cast the result to signed type for the division. * typeck.c (pointer_diff): Perform subtraction in unsigned type and only cast the result to signed type for the division. --- gcc/c/c-typeck.c.jj 2017-06-21 12:46:10.130822026 +0200 +++ gcc/c/c-typeck.c 2017-06-21 13:22:45.868310030 +0200 @@ -3781,7 +3781,7 @@ static tree pointer_diff (location_t loc, tree op0, tree op1) { tree restype = ptrdiff_type_node; - tree result, inttype; + tree result, inttype, uinttype; addr_space_t as0 = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op0))); addr_space_t as1 = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op1))); @@ -3809,11 +3809,21 @@ pointer_diff (location_t loc, tree op0, /* Determine integer type to perform computations in. This will usually be the same as the result type (ptrdiff_t), but may need to be a wider - type if pointers for the address space are wider than ptrdiff_t. */ + type if pointers for the address space are wider than ptrdiff_t. + The subtraction should be performed in the unsigned type, because for + the case where one pointer is above the middle of the address space + and the other pointer is below the middle of the address space + we'd otherwise introduce undefined behavior during the subtraction. */ if (TYPE_PRECISION (restype) < TYPE_PRECISION (TREE_TYPE (op0))) - inttype = c_common_type_for_size (TYPE_PRECISION (TREE_TYPE (op0)), 0); + { + inttype = c_common_type_for_size (TYPE_PRECISION (TREE_TYPE (op0)), 0); + uinttype = c_common_type_for_size (TYPE_PRECISION (TREE_TYPE (op0)), 1); + } else - inttype = restype; + { + inttype = restype; + uinttype = c_common_type_for_size (TYPE_PRECISION (restype), 1); + } if (TREE_CODE (target_type) == VOID_TYPE) pedwarn (loc, OPT_Wpointer_arith, @@ -3822,14 +3832,15 @@ pointer_diff (location_t loc, tree op0, pedwarn (loc, OPT_Wpointer_arith, "pointer to a function used in subtraction"); - /* First do the subtraction as integers; + /* First do the subtraction as unsigned integers; + then convert to signed integer and then drop through to build the divide operator. Do not do default conversions on the minus operator in case restype is a short type. */ - op0 = build_binary_op (loc, - MINUS_EXPR, convert (inttype, op0), - convert (inttype, op1), 0); + op0 = build_binary_op (loc, MINUS_EXPR, convert (uinttype, op0), + convert (uinttype, op1), 0); + op0 = convert (inttype, op0); /* This generates an error if op1 is pointer to incomplete type. */ if (!COMPLETE_OR_VOID_TYPE_P (TREE_TYPE (TREE_TYPE (orig_op1)))) error_at (loc, "arithmetic on pointer to an incomplete type"); --- gcc/cp/typeck.c.jj 2017-06-19 08:28:07.000000000 +0200 +++ gcc/cp/typeck.c 2017-06-21 13:26:17.120829891 +0200 @@ -5398,14 +5398,18 @@ pointer_diff (location_t loc, tree op0, return error_mark_node; } - /* First do the subtraction as integers; + tree uinttype = c_common_type_for_size (TYPE_PRECISION (restype), 1); + + /* First do the subtraction as unsigned integers; + then cast to restype and then drop through to build the divide operator. */ op0 = cp_build_binary_op (loc, MINUS_EXPR, - cp_convert (restype, op0, complain), - cp_convert (restype, op1, complain), + cp_convert (uinttype, op0, complain), + cp_convert (uinttype, op1, complain), complain); + op0 = cp_convert (restype, op0, complain); /* This generates an error if op1 is a pointer to an incomplete type. */ if (!COMPLETE_TYPE_P (TREE_TYPE (TREE_TYPE (op1)))) Jakub ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-06-21 14:40 ` [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) Jakub Jelinek @ 2017-06-21 15:17 ` Jakub Jelinek 2017-06-21 16:27 ` Marc Glisse 1 sibling, 0 replies; 21+ messages in thread From: Jakub Jelinek @ 2017-06-21 15:17 UTC (permalink / raw) To: Richard Biener; +Cc: Martin Liška, gcc-patches On Wed, Jun 21, 2017 at 04:40:01PM +0200, Jakub Jelinek wrote: > So, I wrote following patch to do the subtraction in unsigned > type. It passes bootstrap, but on both x86_64-linux and i686-linux > regresses: > +FAIL: gcc.dg/torture/pr66178.c -O* (test for excess errors) > +FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized "minus_expr" > +FAIL: g++.dg/tree-ssa/pr21082.C -std=gnu++* (test for excess errors) Another option is to do what the patch does only when sanitizing and accept in that case less efficient code and rejection of weird corner case testcases like the first one. We risk miscompilation of the pointer differences, but I haven't managed to come up with a testcase where it would show (I guess more likely is when we propagate constants into the pointers). Jakub ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-06-21 14:40 ` [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) Jakub Jelinek 2017-06-21 15:17 ` Jakub Jelinek @ 2017-06-21 16:27 ` Marc Glisse 2017-06-22 8:31 ` Richard Biener 1 sibling, 1 reply; 21+ messages in thread From: Marc Glisse @ 2017-06-21 16:27 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, Martin Liška, gcc-patches On Wed, 21 Jun 2017, Jakub Jelinek wrote: > So, I wrote following patch to do the subtraction in unsigned > type. It passes bootstrap, but on both x86_64-linux and i686-linux > regresses: > +FAIL: gcc.dg/torture/pr66178.c -O* (test for excess errors) > +FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized "minus_expr" > +FAIL: g++.dg/tree-ssa/pr21082.C -std=gnu++* (test for excess errors) > > E.g. in the first testcase we have in the test: > static uintptr_t a = ((char *)&&l2-(char *)&&l3)+((char *)&&l1-(char *)&&l2); > Without the patch, we ended up with: > static uintptr_t a = (uintptr_t) (((long int) &l2 - (long int) &l3) + ((long int) &l1 - (long int) &l2)); > but with the patch with (the negation in signed type sounds like a folding > bug), which is too difficult for the initializer_constant_valid_p* handling: > (uintptr_t) (((long unsigned int) -(long int) &l3 - (long unsigned int) &l2) + ((long unsigned int) &l2 + (long unsigned int) &l1)); > Shall we just xfail that test, or make sure we don't reassociate such > subtractions, something different? Adding to match.pd a few more simple reassoc transformations (like (c-b)+(b-a) for instance) that work for both signed and unsigned is on my TODO-list, though that may not be enough. Maybe together with fixing whatever produced the negation would suffice? > The second failure is on: > int f (long *a, long *b, long *c) { > __PTRDIFF_TYPE__ l1 = b - a; > __PTRDIFF_TYPE__ l2 = c - a; > return l1 < l2; > } > where without the patch during forwprop2 we optimize it > using match.pd: > /* X - Z < Y - Z is the same as X < Y when there is no overflow. */ > because we had: > b.0_1 = (long int) b_8(D); > a.1_2 = (long int) a_9(D); > _3 = b.0_1 - a.1_2; > c.2_4 = (long int) c_11(D); > a.3_5 = (long int) a_9(D); > _6 = c.2_4 - a.3_5; > _7 = _3 < _6; > But with the patch we have: > b.0_1 = (long unsigned int) b_9(D); > a.1_2 = (long unsigned int) a_10(D); > _3 = b.0_1 - a.1_2; > _4 = (long int) _3; > c.2_5 = (long unsigned int) c_11(D); > _6 = c.2_5 - a.1_2; > _7 = (long int) _6; > _8 = _4 < _7; > instead. But that is something we can't generally optimize. > So do we need to introduce POINTER_DIFF (where we could still > optimize this) or remove the test? If we rely on largest possible > array to be half of the VA size - 1 (i.e. where for x > y both being > pointers into the same array x - y > 0), then it is a valid optimization > of the 2 pointer subtractions, but it is not a valid optimization on > comparison of unsigned subtractions cast to signed type. (this testcase was meant as a simpler version of vector.size() < vector.capacity() ) It does indeed seem impossible to do this optimization with the unsigned pointer subtraction. If we consider pointers as unsigned, with a subtraction that has a signed result with the constraint that overflow is undefined, we cannot model that optimally with just the usual signed/unsigned operations, so I am in favor of POINTER_DIFF, at least in the long run (together with having a signed second argument for POINTER_PLUS, etc). For 64-bit platforms it might have been easier to declare that the upper half (3/4 ?) of the address space doesn't exist... > The third one is > if (&a[b] - &a[c] != b - c) > link_error(); > where fold already during generic folding used to be able to cope with it, > but now we have: > (long int) (((long unsigned int) b - (long unsigned int) c) * 4) /[ex] 4 != b - c > which we don't fold. Once we have this last expression, we have lost, we need to know that the multiplication cannot overflow for this. When the size multiplications are done in a signed type in the future (?), it might help. On the other hand, is this an important optimization? I am surprised we are only doing this transformation in generic (so some hack in the front-end could still work), it shouldn't be hard to implement some subset of fold_addr_of_array_ref_difference in match.pd (it is recursive so a complete move may be harder). But that would make your patch break even more stuff :-( -- Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-06-21 16:27 ` Marc Glisse @ 2017-06-22 8:31 ` Richard Biener 2017-06-22 9:29 ` Marc Glisse 0 siblings, 1 reply; 21+ messages in thread From: Richard Biener @ 2017-06-22 8:31 UTC (permalink / raw) To: gcc-patches; +Cc: Jakub Jelinek, Martin Liška On Wed, 21 Jun 2017, Marc Glisse wrote: > On Wed, 21 Jun 2017, Jakub Jelinek wrote: > > > So, I wrote following patch to do the subtraction in unsigned > > type. It passes bootstrap, but on both x86_64-linux and i686-linux > > regresses: > > +FAIL: gcc.dg/torture/pr66178.c -O* (test for excess errors) > > +FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized > > "minus_expr" > > +FAIL: g++.dg/tree-ssa/pr21082.C -std=gnu++* (test for excess errors) > > > > E.g. in the first testcase we have in the test: > > static uintptr_t a = ((char *)&&l2-(char *)&&l3)+((char *)&&l1-(char > > *)&&l2); > > Without the patch, we ended up with: > > static uintptr_t a = (uintptr_t) (((long int) &l2 - (long int) &l3) + ((long > > int) &l1 - (long int) &l2)); > > but with the patch with (the negation in signed type sounds like a folding > > bug), which is too difficult for the initializer_constant_valid_p* handling: > > (uintptr_t) (((long unsigned int) -(long int) &l3 - (long unsigned int) &l2) > > + ((long unsigned int) &l2 + (long unsigned int) &l1)); > > Shall we just xfail that test, or make sure we don't reassociate such > > subtractions, something different? > > Adding to match.pd a few more simple reassoc transformations (like > (c-b)+(b-a) for instance) that work for both signed and unsigned is on > my TODO-list, though that may not be enough. Maybe together with fixing > whatever produced the negation would suffice? I guess try to fix the negation and see if that magically fixes things... > > The second failure is on: > > int f (long *a, long *b, long *c) { > > __PTRDIFF_TYPE__ l1 = b - a; > > __PTRDIFF_TYPE__ l2 = c - a; > > return l1 < l2; > > } > > where without the patch during forwprop2 we optimize it > > using match.pd: > > /* X - Z < Y - Z is the same as X < Y when there is no overflow. */ > > because we had: > > b.0_1 = (long int) b_8(D); > > a.1_2 = (long int) a_9(D); > > _3 = b.0_1 - a.1_2; > > c.2_4 = (long int) c_11(D); > > a.3_5 = (long int) a_9(D); > > _6 = c.2_4 - a.3_5; > > _7 = _3 < _6; > > But with the patch we have: > > b.0_1 = (long unsigned int) b_9(D); > > a.1_2 = (long unsigned int) a_10(D); > > _3 = b.0_1 - a.1_2; > > _4 = (long int) _3; > > c.2_5 = (long unsigned int) c_11(D); > > _6 = c.2_5 - a.1_2; > > _7 = (long int) _6; > > _8 = _4 < _7; > > instead. But that is something we can't generally optimize. > > So do we need to introduce POINTER_DIFF (where we could still > > optimize this) or remove the test? If we rely on largest possible > > array to be half of the VA size - 1 (i.e. where for x > y both being > > pointers into the same array x - y > 0), then it is a valid optimization > > of the 2 pointer subtractions, but it is not a valid optimization on > > comparison of unsigned subtractions cast to signed type. > > (this testcase was meant as a simpler version of > vector.size() < vector.capacity() ) > > It does indeed seem impossible to do this optimization with the unsigned > pointer subtraction. Yep :/ This is the issue with all the non-pointer integer folding fixes as well -- as soon as we introduce unsigned ops for the fear of introducing undefined overflow we lose on followup optimization opportunities. > If we consider pointers as unsigned, with a subtraction that has a signed > result with the constraint that overflow is undefined, we cannot model that > optimally with just the usual signed/unsigned operations, so I am in favor of > POINTER_DIFF, at least in the long run (together with having a signed second > argument for POINTER_PLUS, etc). For 64-bit platforms it might have been > easier to declare that the upper half (3/4 ?) of the address space doesn't > exist... I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree code is quite a big job. So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce ptrdiff_t. What's the advantage of having this? And yes, I agree that POINTER_PLUS_EXPR should take ptrdiff_t rather than sizetype offset -- changing one without the other will lead to awkwardness in required patterns involving both like (p - q) + q. As said, it's a big job with likely all sorts of (testsuite) fallout. > > The third one is > > if (&a[b] - &a[c] != b - c) > > link_error(); > > where fold already during generic folding used to be able to cope with it, > > but now we have: > > (long int) (((long unsigned int) b - (long unsigned int) c) * 4) /[ex] 4 != > > b - c > > which we don't fold. > > Once we have this last expression, we have lost, we need to know that the > multiplication cannot overflow for this. When the size multiplications are > done in a signed type in the future (?), it might help. Not sure where the unsigned multiply comes from -- I guess we fold it inside the cast ... > On the other hand, is this an important optimization? I am surprised we are > only doing this transformation in generic (so some hack in the front-end could > still work), it shouldn't be hard to implement some subset of > fold_addr_of_array_ref_difference in match.pd (it is recursive so a complete > move may be harder). But that would make your patch break even more stuff :-( It's always the question whether the above testsuite regressions have any real-world impact of course. I don't like the idea of not doing foldings condidional on sanitization too much. Richard. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-06-22 8:31 ` Richard Biener @ 2017-06-22 9:29 ` Marc Glisse 2017-06-22 9:46 ` Richard Biener 0 siblings, 1 reply; 21+ messages in thread From: Marc Glisse @ 2017-06-22 9:29 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches, Jakub Jelinek, Martin Liška On Thu, 22 Jun 2017, Richard Biener wrote: >> If we consider pointers as unsigned, with a subtraction that has a signed >> result with the constraint that overflow is undefined, we cannot model that >> optimally with just the usual signed/unsigned operations, so I am in favor of >> POINTER_DIFF, at least in the long run (together with having a signed second >> argument for POINTER_PLUS, etc). For 64-bit platforms it might have been >> easier to declare that the upper half (3/4 ?) of the address space doesn't >> exist... > > I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree > code is quite a big job. Yes :-( It is probably not realistic to introduce it just to avoid a couple regressions while fixing a bug. > So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce > ptrdiff_t. What's the advantage of having this? It represents q-p with one statement instead of 3 (long)q-(long)p or 4 (long)((ulong)q-(ulong)p). It allows us to stay in the pointer world, so (q-p)>0 is equivalent to p<q, not just (long)p<(long)q. It properly models what (undefined) overflow means for pointers. Of course it is hard to know in advance if that's significant or negligible, maybe size_t finds its way in too many places anyway. > And yes, I agree that POINTER_PLUS_EXPR should take > ptrdiff_t rather than sizetype offset -- changing one without the > other will lead to awkwardness in required patterns involving > both like (p - q) + q. > > As said, it's a big job with likely all sorts of (testsuite) fallout. > >>> The third one is >>> if (&a[b] - &a[c] != b - c) >>> link_error(); >>> where fold already during generic folding used to be able to cope with it, >>> but now we have: >>> (long int) (((long unsigned int) b - (long unsigned int) c) * 4) /[ex] 4 != >>> b - c >>> which we don't fold. >> >> Once we have this last expression, we have lost, we need to know that the >> multiplication cannot overflow for this. When the size multiplications are >> done in a signed type in the future (?), it might help. > > Not sure where the unsigned multiply comes from -- I guess we fold > it inside the cast ... We usually do those multiplications in an unsigned type. I experimented with changing one such place in https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01641.html , there is probably at least another one in the middle-end. -- Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-06-22 9:29 ` Marc Glisse @ 2017-06-22 9:46 ` Richard Biener 2017-07-01 16:41 ` Marc Glisse 0 siblings, 1 reply; 21+ messages in thread From: Richard Biener @ 2017-06-22 9:46 UTC (permalink / raw) To: gcc-patches; +Cc: Jakub Jelinek, Martin Liška On Thu, 22 Jun 2017, Marc Glisse wrote: > On Thu, 22 Jun 2017, Richard Biener wrote: > > > > If we consider pointers as unsigned, with a subtraction that has a signed > > > result with the constraint that overflow is undefined, we cannot model > > > that > > > optimally with just the usual signed/unsigned operations, so I am in favor > > > of > > > POINTER_DIFF, at least in the long run (together with having a signed > > > second > > > argument for POINTER_PLUS, etc). For 64-bit platforms it might have been > > > easier to declare that the upper half (3/4 ?) of the address space doesn't > > > exist... > > > > I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree > > code is quite a big job. > > Yes :-( > It is probably not realistic to introduce it just to avoid a couple > regressions while fixing a bug. > > > So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce > > ptrdiff_t. What's the advantage of having this? > > It represents q-p with one statement instead of 3 (long)q-(long)p or 4 > (long)((ulong)q-(ulong)p). It allows us to stay in the pointer world, so > (q-p)>0 is equivalent to p<q, not just (long)p<(long)q. It properly models > what (undefined) overflow means for pointers. > > Of course it is hard to know in advance if that's significant or > negligible, maybe size_t finds its way in too many places anyway. As with all those experiments ... Well, if I would sell this as a consultant to somebody I'd estimate 3 man months for this work which realistically means you have to start now otherwise you won't make it this stage 1. A smaller job would be to make POINTER_PLUS_EXPR take ptrdiff_t as offset operand. But the fallout is likely similar. A lame(?) half-way transition would allow for both unsigned and signed ptrdiff_t (note sizetype -> [u]ptrdiff_t is already a transition for some embedded archs eventually). Maybe allowing both signed and unsigned offsets is desirable (you of course get interesting effects when combining those in GIMPLE where signedness matters). Richard. > > And yes, I agree that POINTER_PLUS_EXPR should take > > ptrdiff_t rather than sizetype offset -- changing one without the > > other will lead to awkwardness in required patterns involving > > both like (p - q) + q. > > > > As said, it's a big job with likely all sorts of (testsuite) fallout. > > > > > > The third one is > > > > if (&a[b] - &a[c] != b - c) > > > > link_error(); > > > > where fold already during generic folding used to be able to cope with > > > > it, > > > > but now we have: > > > > (long int) (((long unsigned int) b - (long unsigned int) c) * 4) /[ex] 4 > > > > != > > > > b - c > > > > which we don't fold. > > > > > > Once we have this last expression, we have lost, we need to know that the > > > multiplication cannot overflow for this. When the size multiplications are > > > done in a signed type in the future (?), it might help. > > > > Not sure where the unsigned multiply comes from -- I guess we fold > > it inside the cast ... > > We usually do those multiplications in an unsigned type. I experimented > with changing one such place in > https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01641.html , there is > probably at least another one in the middle-end. > > -- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-06-22 9:46 ` Richard Biener @ 2017-07-01 16:41 ` Marc Glisse 2017-07-03 12:37 ` Richard Biener 0 siblings, 1 reply; 21+ messages in thread From: Marc Glisse @ 2017-07-01 16:41 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches, Jakub Jelinek, Martin Liška [-- Attachment #1: Type: TEXT/PLAIN, Size: 2099 bytes --] On Thu, 22 Jun 2017, Richard Biener wrote: > On Thu, 22 Jun 2017, Marc Glisse wrote: > >> On Thu, 22 Jun 2017, Richard Biener wrote: >> >>>> If we consider pointers as unsigned, with a subtraction that has a signed >>>> result with the constraint that overflow is undefined, we cannot model >>>> that >>>> optimally with just the usual signed/unsigned operations, so I am in favor >>>> of >>>> POINTER_DIFF, at least in the long run (together with having a signed >>>> second >>>> argument for POINTER_PLUS, etc). For 64-bit platforms it might have been >>>> easier to declare that the upper half (3/4 ?) of the address space doesn't >>>> exist... >>> >>> I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree >>> code is quite a big job. >> >> Yes :-( >> It is probably not realistic to introduce it just to avoid a couple >> regressions while fixing a bug. >> >>> So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce >>> ptrdiff_t. What's the advantage of having this? >> >> It represents q-p with one statement instead of 3 (long)q-(long)p or 4 >> (long)((ulong)q-(ulong)p). It allows us to stay in the pointer world, so >> (q-p)>0 is equivalent to p<q, not just (long)p<(long)q. It properly models >> what (undefined) overflow means for pointers. >> >> Of course it is hard to know in advance if that's significant or >> negligible, maybe size_t finds its way in too many places anyway. > > As with all those experiments ... > > Well, if I would sell this as a consultant to somebody I'd estimate > 3 man months for this work which realistically means you have to > start now otherwise you won't make it this stage 1. I wrote a quick prototype to see what the fallout would look like. Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all languages with no regression. Sure, it is probably not a complete migration, there are likely a few places still converting to ptrdiff_t to perform a regular subtraction, but this seems to indicate that the work isn't as bad as using a signed type in pointer_plus_expr for instance. -- Marc Glisse [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: TEXT/x-diff; name=ptrdiff.patch, Size: 27968 bytes --] Index: gcc/c/c-fold.c =================================================================== --- gcc/c/c-fold.c (revision 249856) +++ gcc/c/c-fold.c (working copy) @@ -238,20 +238,21 @@ c_fully_fold_internal (tree expr, bool i case COMPOUND_EXPR: case MODIFY_EXPR: case PREDECREMENT_EXPR: case PREINCREMENT_EXPR: case POSTDECREMENT_EXPR: case POSTINCREMENT_EXPR: case PLUS_EXPR: case MINUS_EXPR: case MULT_EXPR: case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case TRUNC_MOD_EXPR: case RDIV_EXPR: case EXACT_DIV_EXPR: case LSHIFT_EXPR: case RSHIFT_EXPR: case BIT_IOR_EXPR: case BIT_XOR_EXPR: Index: gcc/c/c-typeck.c =================================================================== --- gcc/c/c-typeck.c (revision 249856) +++ gcc/c/c-typeck.c (working copy) @@ -3820,23 +3820,21 @@ pointer_diff (location_t loc, tree op0, "pointer of type %<void *%> used in subtraction"); if (TREE_CODE (target_type) == FUNCTION_TYPE) pedwarn (loc, OPT_Wpointer_arith, "pointer to a function used in subtraction"); /* First do the subtraction as integers; then drop through to build the divide operator. Do not do default conversions on the minus operator in case restype is a short type. */ - op0 = build_binary_op (loc, - MINUS_EXPR, convert (inttype, op0), - convert (inttype, op1), 0); + op0 = build2_loc (loc, POINTER_DIFF_EXPR, ptrdiff_type_node, op0, op1); /* This generates an error if op1 is pointer to incomplete type. */ if (!COMPLETE_OR_VOID_TYPE_P (TREE_TYPE (TREE_TYPE (orig_op1)))) error_at (loc, "arithmetic on pointer to an incomplete type"); op1 = c_size_in_bytes (target_type); if (pointer_to_zero_sized_aggr_p (TREE_TYPE (orig_op1))) error_at (loc, "arithmetic on pointer to an empty aggregate"); /* Divide by the size, in easiest possible way. */ @@ -9967,20 +9965,21 @@ c_finish_return (location_t loc, tree re { switch (TREE_CODE (inner)) { CASE_CONVERT: case NON_LVALUE_EXPR: case PLUS_EXPR: case POINTER_PLUS_EXPR: inner = TREE_OPERAND (inner, 0); continue; + case POINTER_DIFF_EXPR: case MINUS_EXPR: /* If the second operand of the MINUS_EXPR has a pointer type (or is converted from it), this may be valid, so don't give a warning. */ { tree op1 = TREE_OPERAND (inner, 1); while (!POINTER_TYPE_P (TREE_TYPE (op1)) && (CONVERT_EXPR_P (op1) || TREE_CODE (op1) == NON_LVALUE_EXPR)) Index: gcc/c-family/c-pretty-print.c =================================================================== --- gcc/c-family/c-pretty-print.c (revision 249856) +++ gcc/c-family/c-pretty-print.c (working copy) @@ -1863,20 +1863,21 @@ c_pretty_printer::multiplicative_express additive-expression - multiplicative-expression */ static void pp_c_additive_expression (c_pretty_printer *pp, tree e) { enum tree_code code = TREE_CODE (e); switch (code) { case POINTER_PLUS_EXPR: case PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: pp_c_additive_expression (pp, TREE_OPERAND (e, 0)); pp_c_whitespace (pp); if (code == PLUS_EXPR || code == POINTER_PLUS_EXPR) pp_plus (pp); else pp_minus (pp); pp_c_whitespace (pp); pp->multiplicative_expression (TREE_OPERAND (e, 1)); break; @@ -2279,20 +2280,21 @@ c_pretty_printer::expression (tree e) case NE_EXPR: pp_c_equality_expression (this, e); break; case COND_EXPR: conditional_expression (e); break; case POINTER_PLUS_EXPR: case PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: pp_c_additive_expression (this, e); break; case MODIFY_EXPR: case INIT_EXPR: assignment_expression (e); break; case COMPOUND_EXPR: Index: gcc/cfgexpand.c =================================================================== --- gcc/cfgexpand.c (revision 249856) +++ gcc/cfgexpand.c (working copy) @@ -4595,20 +4595,21 @@ expand_debug_expr (tree exp) the operand, because the operand is always unsigned here even if the original C expression is signed. */ op1 = simplify_gen_unary (SIGN_EXTEND, GET_MODE (op0), op1, GET_MODE (op1)); } /* Fall through. */ case PLUS_EXPR: return simplify_gen_binary (PLUS, mode, op0, op1); case MINUS_EXPR: + case POINTER_DIFF_EXPR: return simplify_gen_binary (MINUS, mode, op0, op1); case MULT_EXPR: return simplify_gen_binary (MULT, mode, op0, op1); case RDIV_EXPR: case TRUNC_DIV_EXPR: case EXACT_DIV_EXPR: if (unsignedp) return simplify_gen_binary (UDIV, mode, op0, op1); Index: gcc/cp/constexpr.c =================================================================== --- gcc/cp/constexpr.c (revision 249856) +++ gcc/cp/constexpr.c (working copy) @@ -4247,20 +4247,21 @@ cxx_eval_constant_expression (const cons return t; op1 = TREE_OPERAND (t, 1); r = cxx_eval_constant_expression (ctx, op1, lval, non_constant_p, overflow_p, jump_target); } } break; case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case PLUS_EXPR: case MINUS_EXPR: case MULT_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case ROUND_DIV_EXPR: case TRUNC_MOD_EXPR: case CEIL_MOD_EXPR: case ROUND_MOD_EXPR: @@ -5460,20 +5461,21 @@ potential_constant_expression_1 (tree t, && TYPE_POLYMORPHIC_P (TREE_TYPE (e))) { if (flags & tf_error) error_at (loc, "typeid-expression is not a constant expression " "because %qE is of polymorphic type", e); return false; } return true; } + case POINTER_DIFF_EXPR: case MINUS_EXPR: want_rval = true; goto binary; case LT_EXPR: case LE_EXPR: case GT_EXPR: case GE_EXPR: case EQ_EXPR: case NE_EXPR: Index: gcc/cp/cp-gimplify.c =================================================================== --- gcc/cp/cp-gimplify.c (revision 249856) +++ gcc/cp/cp-gimplify.c (working copy) @@ -2198,20 +2198,21 @@ cp_fold (tree x) case POSTINCREMENT_EXPR: case INIT_EXPR: case PREDECREMENT_EXPR: case PREINCREMENT_EXPR: case COMPOUND_EXPR: case MODIFY_EXPR: rval_ops = false; /* FALLTHRU */ case POINTER_PLUS_EXPR: case PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: case MULT_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case ROUND_DIV_EXPR: case TRUNC_MOD_EXPR: case CEIL_MOD_EXPR: case ROUND_MOD_EXPR: case RDIV_EXPR: Index: gcc/cp/typeck.c =================================================================== --- gcc/cp/typeck.c (revision 249856) +++ gcc/cp/typeck.c (working copy) @@ -4303,20 +4303,21 @@ cp_build_binary_op (location_t location, converted = 1; break; } default: break; } } switch (code) { + case POINTER_DIFF_EXPR: case MINUS_EXPR: /* Subtraction of two similar pointers. We must subtract them as integers, then divide by object size. */ if (code0 == POINTER_TYPE && code1 == POINTER_TYPE && same_type_ignoring_top_level_qualifiers_p (TREE_TYPE (type0), TREE_TYPE (type1))) return pointer_diff (location, op0, op1, common_pointer_type (type0, type1), complain); /* In all other cases except pointer - int, the usual arithmetic rules apply. */ @@ -5394,25 +5395,26 @@ pointer_diff (location_t loc, tree op0, if (complain & tf_error) permerror (loc, "ISO C++ forbids using pointer to " "a method in subtraction"); else return error_mark_node; } /* First do the subtraction as integers; then drop through to build the divide operator. */ - op0 = cp_build_binary_op (loc, - MINUS_EXPR, - cp_convert (restype, op0, complain), - cp_convert (restype, op1, complain), - complain); + op0 = build2_loc (loc, + POINTER_DIFF_EXPR, + ssizetype, + op0, + op1 + ); /* This generates an error if op1 is a pointer to an incomplete type. */ if (!COMPLETE_TYPE_P (TREE_TYPE (TREE_TYPE (op1)))) { if (complain & tf_error) error_at (loc, "invalid use of a pointer to an incomplete type in " "pointer arithmetic"); else return error_mark_node; } Index: gcc/expr.c =================================================================== --- gcc/expr.c (revision 249856) +++ gcc/expr.c (working copy) @@ -8537,20 +8537,21 @@ expand_expr_real_2 (sepops ops, rtx targ if (op1 == const0_rtx) return op0; goto binop2; } expand_operands (treeop0, treeop1, subtarget, &op0, &op1, modifier); return REDUCE_BIT_FIELD (simplify_gen_binary (PLUS, mode, op0, op1)); case MINUS_EXPR: + case POINTER_DIFF_EXPR: do_minus: /* For initializers, we are allowed to return a MINUS of two symbolic constants. Here we handle all cases when both operands are constant. */ /* Handle difference of two symbolic constants, for the sake of an initializer. */ if ((modifier == EXPAND_SUM || modifier == EXPAND_INITIALIZER) && really_constant_p (treeop0) && really_constant_p (treeop1)) { Index: gcc/fold-const.c =================================================================== --- gcc/fold-const.c (revision 249856) +++ gcc/fold-const.c (working copy) @@ -1138,20 +1138,24 @@ const_binop (enum tree_code code, tree a return NULL_TREE; STRIP_NOPS (arg1); STRIP_NOPS (arg2); if (TREE_CODE (arg1) == INTEGER_CST && TREE_CODE (arg2) == INTEGER_CST) { if (code == POINTER_PLUS_EXPR) return int_const_binop (PLUS_EXPR, arg1, fold_convert (TREE_TYPE (arg1), arg2)); + if (code == POINTER_DIFF_EXPR) + return int_const_binop (MINUS_EXPR, + fold_convert (ptrdiff_type_node, arg1), + fold_convert (ptrdiff_type_node, arg2)); return int_const_binop (code, arg1, arg2); } if (TREE_CODE (arg1) == REAL_CST && TREE_CODE (arg2) == REAL_CST) { machine_mode mode; REAL_VALUE_TYPE d1; REAL_VALUE_TYPE d2; REAL_VALUE_TYPE value; @@ -9764,20 +9768,21 @@ fold_binary_loc (location_t loc, con0 = associate_trees (loc, con0, lit0, code, atype); return fold_convert_loc (loc, type, associate_trees (loc, var0, con0, code, atype)); } } return NULL_TREE; + case POINTER_DIFF_EXPR: case MINUS_EXPR: /* (-A) - B -> (-B) - A where B is easily negated and we can swap. */ if (TREE_CODE (arg0) == NEGATE_EXPR && negate_expr_p (op1)) return fold_build2_loc (loc, MINUS_EXPR, type, negate_expr (op1), fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0))); /* Fold __complex__ ( x, 0 ) - __complex__ ( 0, y ) to Index: gcc/match.pd =================================================================== --- gcc/match.pd (revision 249856) +++ gcc/match.pd (working copy) @@ -117,20 +117,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Simplify x - x. This is unsafe for certain floats even in non-IEEE formats. In IEEE, it is unsafe because it does wrong for NaNs. Also note that operand_equal_p is always false if an operand is volatile. */ (simplify (minus @0 @0) (if (!FLOAT_TYPE_P (type) || !HONOR_NANS (type)) { build_zero_cst (type); })) +(simplify + (pointer_diff @0 @0) + { build_zero_cst (type); }) (simplify (mult @0 integer_zerop@1) @1) /* Maybe fold x * 0 to 0. The expressions aren't the same when x is NaN, since x * 0 is also NaN. Nor are they the same in modes with signed zeros, since multiplying a negative value by 0 gives -0, not +0. */ (simplify @@ -1269,20 +1272,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) tem5 = ptr1 + tem4; and produce tem5 = ptr2; */ (simplify (pointer_plus @0 (convert?@2 (minus@3 (convert @1) (convert @0)))) /* Conditionally look through a sign-changing conversion. */ (if (TYPE_PRECISION (TREE_TYPE (@2)) == TYPE_PRECISION (TREE_TYPE (@3)) && ((GIMPLE && useless_type_conversion_p (type, TREE_TYPE (@1))) || (GENERIC && type == TREE_TYPE (@1)))) @1)) +(simplify + (pointer_plus @0 (convert?@2 (pointer_diff@3 @1 @0))) + /* Conditionally look through a sign-changing conversion. */ + (if (TYPE_PRECISION (TREE_TYPE (@2)) == TYPE_PRECISION (TREE_TYPE (@3)) + && ((GIMPLE && useless_type_conversion_p (type, TREE_TYPE (@1))) + || (GENERIC && type == TREE_TYPE (@1)))) + @1)) /* Pattern match tem = (sizetype) ptr; tem = tem & algn; tem = -tem; ... = ptr p+ tem; and produce the simpler and easier to analyze with respect to alignment ... = ptr & ~algn; */ (simplify (pointer_plus @0 (negate (bit_and (convert @0) INTEGER_CST@1))) @@ -1295,20 +1305,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (with { HOST_WIDE_INT diff; } (if (ptr_difference_const (@0, @1, &diff)) { build_int_cst_type (type, diff); })))) (simplify (minus (convert @0) (convert ADDR_EXPR@1)) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (with { HOST_WIDE_INT diff; } (if (ptr_difference_const (@0, @1, &diff)) { build_int_cst_type (type, diff); })))) +(simplify + (pointer_diff (convert?@2 ADDR_EXPR@0) (convert?@3 @1)) + (if (tree_nop_conversion_p (TREE_TYPE(@2), TREE_TYPE (@0)) + && tree_nop_conversion_p (TREE_TYPE(@3), TREE_TYPE (@1))) + (with { HOST_WIDE_INT diff; } + (if (ptr_difference_const (@0, @1, &diff)) + { build_int_cst_type (type, diff); })))) +(simplify + (pointer_diff (convert?@2 @0) (convert?@3 ADDR_EXPR@1)) + (if (tree_nop_conversion_p (TREE_TYPE(@2), TREE_TYPE (@0)) + && tree_nop_conversion_p (TREE_TYPE(@3), TREE_TYPE (@1))) + (with { HOST_WIDE_INT diff; } + (if (ptr_difference_const (@0, @1, &diff)) + { build_int_cst_type (type, diff); })))) /* If arg0 is derived from the address of an object or function, we may be able to fold this expression using the object or function's alignment. */ (simplify (bit_and (convert? @0) INTEGER_CST@1) (if (POINTER_TYPE_P (TREE_TYPE (@0)) && tree_nop_conversion_p (type, TREE_TYPE (@0))) (with { @@ -1491,20 +1515,29 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) /* For pointer types, if the conversion of A to the final type requires a sign- or zero-extension, then we have to punt - it is not defined which one is correct. */ || (POINTER_TYPE_P (TREE_TYPE (@0)) && TREE_CODE (@1) == INTEGER_CST && tree_int_cst_sign_bit (@1) == 0)) (convert @1)))) + (simplify + (pointer_diff (pointer_plus @0 @1) @0) + (if (element_precision (type) <= element_precision (@1) + || tree_expr_nonnegative_p (@1)) + /* For pointer types, if the conversion of A to the + final type requires a sign- or zero-extension, + then we have to punt - it is not defined which + one is correct. */ + (convert @1))) /* (T)P - (T)(P + A) -> -(T) A */ (for add (plus pointer_plus) (simplify (minus (convert @0) (convert (add @@0 @1))) (if (element_precision (type) <= element_precision (TREE_TYPE (@1)) /* For integer types, if A has a smaller type than T the result depends on the possible overflow in P + A. @@ -1515,20 +1548,29 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) /* For pointer types, if the conversion of A to the final type requires a sign- or zero-extension, then we have to punt - it is not defined which one is correct. */ || (POINTER_TYPE_P (TREE_TYPE (@0)) && TREE_CODE (@1) == INTEGER_CST && tree_int_cst_sign_bit (@1) == 0)) (negate (convert @1))))) + (simplify + (pointer_diff @0 (pointer_plus @0 @1)) + (if (element_precision (type) <= element_precision (@1) + || tree_expr_nonnegative_p (@1)) + /* For pointer types, if the conversion of A to the + final type requires a sign- or zero-extension, + then we have to punt - it is not defined which + one is correct. */ + (negate (convert @1)))) /* (T)(P + A) - (T)(P + B) -> (T)A - (T)B */ (for add (plus pointer_plus) (simplify (minus (convert (add @@0 @1)) (convert (add @0 @2))) (if (element_precision (type) <= element_precision (TREE_TYPE (@1)) /* For integer types, if A has a smaller type than T the result depends on the possible overflow in P + A. @@ -1541,20 +1583,29 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* For pointer types, if the conversion of A to the final type requires a sign- or zero-extension, then we have to punt - it is not defined which one is correct. */ || (POINTER_TYPE_P (TREE_TYPE (@0)) && TREE_CODE (@1) == INTEGER_CST && tree_int_cst_sign_bit (@1) == 0 && TREE_CODE (@2) == INTEGER_CST && tree_int_cst_sign_bit (@2) == 0)) (minus (convert @1) (convert @2))))))) + (simplify + (pointer_diff (pointer_plus @0 @1) (pointer_plus @0 @2)) + (if (element_precision (type) <= element_precision (@1) + || (tree_expr_nonnegative_p (@1) && tree_expr_nonnegative_p (@2))) + /* For pointer types, if the conversion of A to the + final type requires a sign- or zero-extension, + then we have to punt - it is not defined which + one is correct. */ + (minus (convert @1) (convert @2)))) /* Simplifications of MIN_EXPR, MAX_EXPR, fmin() and fmax(). */ (for minmax (min max FMIN FMAX) (simplify (minmax @0 @0) @0)) /* min(max(x,y),y) -> y. */ (simplify @@ -2524,20 +2575,25 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Transform comparisons of the form X - Y CMP 0 to X CMP Y. ??? The transformation is valid for the other operators if overflow is undefined for the type, but performing it here badly interacts with the transformation in fold_cond_expr_with_comparison which attempts to synthetize ABS_EXPR. */ (for cmp (eq ne) (simplify (cmp (minus@2 @0 @1) integer_zerop) (if (single_use (@2)) (cmp @0 @1)))) +(for cmp (eq ne) + (simplify + (cmp (pointer_diff@2 @0 @1) integer_zerop) + (if (single_use (@2)) + (cmp @0 @1)))) /* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the signed arithmetic case. That form is created by the compiler often enough for folding it to be of value. One example is in computing loop trip counts after Operator Strength Reduction. */ (for cmp (simple_comparison) scmp (swapped_simple_comparison) (simplify (cmp (mult@3 @0 INTEGER_CST@1) integer_zerop@2) /* Handle unfolded multiplication by zero. */ Index: gcc/optabs-tree.c =================================================================== --- gcc/optabs-tree.c (revision 249856) +++ gcc/optabs-tree.c (working copy) @@ -216,20 +216,21 @@ optab_for_tree_code (enum tree_code code trapv = INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_TRAPS (type); switch (code) { case POINTER_PLUS_EXPR: case PLUS_EXPR: if (TYPE_SATURATING (type)) return TYPE_UNSIGNED (type) ? usadd_optab : ssadd_optab; return trapv ? addv_optab : add_optab; + case POINTER_DIFF_EXPR: case MINUS_EXPR: if (TYPE_SATURATING (type)) return TYPE_UNSIGNED (type) ? ussub_optab : sssub_optab; return trapv ? subv_optab : sub_optab; case MULT_EXPR: if (TYPE_SATURATING (type)) return TYPE_UNSIGNED (type) ? usmul_optab : ssmul_optab; return trapv ? smulv_optab : smul_optab; Index: gcc/tree-cfg.c =================================================================== --- gcc/tree-cfg.c (revision 249856) +++ gcc/tree-cfg.c (working copy) @@ -3966,20 +3966,37 @@ verify_gimple_assign_binary (gassign *st error ("type mismatch in pointer plus expression"); debug_generic_stmt (lhs_type); debug_generic_stmt (rhs1_type); debug_generic_stmt (rhs2_type); return true; } return false; } + case POINTER_DIFF_EXPR: + { + if (!POINTER_TYPE_P (rhs1_type) + || !POINTER_TYPE_P (rhs2_type) + // || !useless_type_conversion_p (rhs2_type, rhs1_type) + || !useless_type_conversion_p (ptrdiff_type_node, lhs_type)) + { + error ("type mismatch in pointer diff expression"); + debug_generic_stmt (lhs_type); + debug_generic_stmt (rhs1_type); + debug_generic_stmt (rhs2_type); + return true; + } + + return false; + } + case TRUTH_ANDIF_EXPR: case TRUTH_ORIF_EXPR: case TRUTH_AND_EXPR: case TRUTH_OR_EXPR: case TRUTH_XOR_EXPR: gcc_unreachable (); case LT_EXPR: case LE_EXPR: Index: gcc/tree-inline.c =================================================================== --- gcc/tree-inline.c (revision 249856) +++ gcc/tree-inline.c (working copy) @@ -3906,20 +3906,21 @@ estimate_operator_cost (enum tree_code c return 0; /* Assign cost of 1 to usual operations. ??? We may consider mapping RTL costs to this. */ case COND_EXPR: case VEC_COND_EXPR: case VEC_PERM_EXPR: case PLUS_EXPR: case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: case MULT_EXPR: case MULT_HIGHPART_EXPR: case FMA_EXPR: case ADDR_SPACE_CONVERT_EXPR: case FIXED_CONVERT_EXPR: case FIX_TRUNC_EXPR: case NEGATE_EXPR: Index: gcc/tree-pretty-print.c =================================================================== --- gcc/tree-pretty-print.c (revision 249856) +++ gcc/tree-pretty-print.c (working copy) @@ -2311,20 +2311,21 @@ dump_generic_node (pretty_printer *pp, t pp_greater (pp); break; /* Binary arithmetic and logic expressions. */ case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case MULT_EXPR: case MULT_HIGHPART_EXPR: case PLUS_EXPR: case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case ROUND_DIV_EXPR: case TRUNC_MOD_EXPR: case CEIL_MOD_EXPR: case FLOOR_MOD_EXPR: case ROUND_MOD_EXPR: case RDIV_EXPR: @@ -3546,20 +3547,21 @@ op_code_prio (enum tree_code code) case LROTATE_EXPR: case RROTATE_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: case WIDEN_LSHIFT_EXPR: return 11; case WIDEN_SUM_EXPR: case PLUS_EXPR: case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: return 12; case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case WIDEN_MULT_EXPR: case DOT_PROD_EXPR: case WIDEN_MULT_PLUS_EXPR: case WIDEN_MULT_MINUS_EXPR: case MULT_EXPR: @@ -3732,20 +3734,21 @@ op_symbol_code (enum tree_code code) return "w+"; case WIDEN_MULT_EXPR: return "w*"; case MULT_HIGHPART_EXPR: return "h*"; case NEGATE_EXPR: case MINUS_EXPR: + case POINTER_DIFF_EXPR: return "-"; case BIT_NOT_EXPR: return "~"; case TRUTH_NOT_EXPR: return "!"; case MULT_EXPR: case INDIRECT_REF: Index: gcc/tree-vrp.c =================================================================== --- gcc/tree-vrp.c (revision 249856) +++ gcc/tree-vrp.c (working copy) @@ -3093,29 +3093,29 @@ extract_range_from_binary_expr (value_ra set_value_range (&n_vr0, VR_RANGE, op0, op0, NULL); extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1); } /* If we didn't derive a range for MINUS_EXPR, and op1's range is ~[op0,op0] or vice-versa, then we can derive a non-null range. This happens often for pointer subtraction. */ if (vr->type == VR_VARYING - && code == MINUS_EXPR + && (code == MINUS_EXPR || code == POINTER_DIFF_EXPR) && TREE_CODE (op0) == SSA_NAME && ((vr0.type == VR_ANTI_RANGE && vr0.min == op1 && vr0.min == vr0.max) || (vr1.type == VR_ANTI_RANGE && vr1.min == op0 && vr1.min == vr1.max))) - set_value_range_to_nonnull (vr, TREE_TYPE (op0)); + set_value_range_to_nonnull (vr, expr_type); } /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ void extract_range_from_unary_expr (value_range *vr, enum tree_code code, tree type, value_range *vr0_, tree op0_type) Index: gcc/tree.def =================================================================== --- gcc/tree.def (revision 249856) +++ gcc/tree.def (working copy) @@ -671,20 +671,22 @@ DEFTREECODE (PLACEHOLDER_EXPR, "placehol /* Simple arithmetic. */ DEFTREECODE (PLUS_EXPR, "plus_expr", tcc_binary, 2) DEFTREECODE (MINUS_EXPR, "minus_expr", tcc_binary, 2) DEFTREECODE (MULT_EXPR, "mult_expr", tcc_binary, 2) /* Pointer addition. The first operand is always a pointer and the second operand is an integer of type sizetype. */ DEFTREECODE (POINTER_PLUS_EXPR, "pointer_plus_expr", tcc_binary, 2) +DEFTREECODE (POINTER_DIFF_EXPR, "pointer_diff_expr", tcc_binary, 2) + /* Highpart multiplication. For an integral type with precision B, returns bits [2B-1, B] of the full 2*B product. */ DEFTREECODE (MULT_HIGHPART_EXPR, "mult_highpart_expr", tcc_binary, 2) /* Division for integer result that rounds the quotient toward zero. */ DEFTREECODE (TRUNC_DIV_EXPR, "trunc_div_expr", tcc_binary, 2) /* Division for integer result that rounds it toward plus infinity. */ DEFTREECODE (CEIL_DIV_EXPR, "ceil_div_expr", tcc_binary, 2) Index: gcc/varasm.c =================================================================== --- gcc/varasm.c (revision 249856) +++ gcc/varasm.c (working copy) @@ -4542,20 +4542,21 @@ initializer_constant_valid_p_1 (tree val else /* Support narrowing pointer differences. */ ret = narrowing_initializer_constant_valid_p (value, endtype, NULL); if (cache) { cache[0] = value; cache[1] = ret; } return ret; + case POINTER_DIFF_EXPR: case MINUS_EXPR: if (TREE_CODE (endtype) == REAL_TYPE) return NULL_TREE; if (cache && cache[0] == value) return cache[1]; if (! INTEGRAL_TYPE_P (endtype) || TYPE_PRECISION (endtype) >= TYPE_PRECISION (TREE_TYPE (value))) { tree ncache[4] = { NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE }; tree valid0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-07-01 16:41 ` Marc Glisse @ 2017-07-03 12:37 ` Richard Biener 2017-10-09 11:01 ` Marc Glisse 0 siblings, 1 reply; 21+ messages in thread From: Richard Biener @ 2017-07-03 12:37 UTC (permalink / raw) To: Marc Glisse; +Cc: gcc-patches, Jakub Jelinek, Martin Liška On Sat, 1 Jul 2017, Marc Glisse wrote: > On Thu, 22 Jun 2017, Richard Biener wrote: > > > On Thu, 22 Jun 2017, Marc Glisse wrote: > > > > > On Thu, 22 Jun 2017, Richard Biener wrote: > > > > > > > > If we consider pointers as unsigned, with a subtraction that has a > > > > > signed > > > > > result with the constraint that overflow is undefined, we cannot model > > > > > that > > > > > optimally with just the usual signed/unsigned operations, so I am in > > > > > favor > > > > > of > > > > > POINTER_DIFF, at least in the long run (together with having a signed > > > > > second > > > > > argument for POINTER_PLUS, etc). For 64-bit platforms it might have > > > > > been > > > > > easier to declare that the upper half (3/4 ?) of the address space > > > > > doesn't > > > > > exist... > > > > > > > > I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree > > > > code is quite a big job. > > > > > > Yes :-( > > > It is probably not realistic to introduce it just to avoid a couple > > > regressions while fixing a bug. > > > > > > > So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce > > > > ptrdiff_t. What's the advantage of having this? > > > > > > It represents q-p with one statement instead of 3 (long)q-(long)p or 4 > > > (long)((ulong)q-(ulong)p). It allows us to stay in the pointer world, so > > > (q-p)>0 is equivalent to p<q, not just (long)p<(long)q. It properly models > > > what (undefined) overflow means for pointers. > > > > > > Of course it is hard to know in advance if that's significant or > > > negligible, maybe size_t finds its way in too many places anyway. > > > > As with all those experiments ... > > > > Well, if I would sell this as a consultant to somebody I'd estimate > > 3 man months for this work which realistically means you have to > > start now otherwise you won't make it this stage 1. > > I wrote a quick prototype to see what the fallout would look like. > Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all > languages with no regression. Sure, it is probably not a complete > migration, there are likely a few places still converting to ptrdiff_t > to perform a regular subtraction, but this seems to indicate that the > work isn't as bad as using a signed type in pointer_plus_expr for > instance. The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR from POINTER_MINUS_EXPR in some cases I guess). The tree code needs documenting in tree.def and generic.texi. Otherwise ok(*). Thanks, Richard. (*) ok, just kidding -- or maybe not ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-07-03 12:37 ` Richard Biener @ 2017-10-09 11:01 ` Marc Glisse 2017-10-19 15:11 ` Richard Biener 0 siblings, 1 reply; 21+ messages in thread From: Marc Glisse @ 2017-10-09 11:01 UTC (permalink / raw) To: gcc-patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 1992 bytes --] On Mon, 3 Jul 2017, Richard Biener wrote: > On Sat, 1 Jul 2017, Marc Glisse wrote: > >> I wrote a quick prototype to see what the fallout would look like. >> Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all >> languages with no regression. Sure, it is probably not a complete >> migration, there are likely a few places still converting to ptrdiff_t >> to perform a regular subtraction, but this seems to indicate that the >> work isn't as bad as using a signed type in pointer_plus_expr for >> instance. > > The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR > from POINTER_MINUS_EXPR in some cases I guess). > > The tree code needs documenting in tree.def and generic.texi. > > Otherwise ok(*). > > Thanks, > Richard. > > (*) ok, just kidding -- or maybe not I updated the prototype a bit. Some things I noticed: - the C front-end has support for address spaces that seems to imply that pointer subtraction (and division by the size) may be done in a type larger than ptrdiff_t. Currently, it generates (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is some type potentially larger than ptrdiff_t. I am thus generating a POINTER_DIFF_EXPR with that type, while I was originally hoping its type would always be ptrdiff_t. It complicates code and I am not sure I didn't break address spaces anyway... (expansion likely needs to do the inttype dance) Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t (what POINTER_PLUS_EXPR takes as second argument) always the same type signed/unsigned? Counter-examples: m32c (when !TARGET_A16), visium, darwin, powerpcspe, s390, vms... and it isn't even always the same that is bigger than the other. That's quite messy. - I had to use @@ in match.pd, not for constants, but because in GENERIC we sometimes ended up with pointers where operand_equal_p said yes but types_match disagreed. - It currently regresses 2 go tests: net/http runtime/debug -- Marc Glisse [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: TEXT/x-diff; name=diff.patch, Size: 34574 bytes --] Index: gcc/c/c-fold.c =================================================================== --- gcc/c/c-fold.c (revision 253536) +++ gcc/c/c-fold.c (working copy) @@ -238,20 +238,21 @@ c_fully_fold_internal (tree expr, bool i case COMPOUND_EXPR: case MODIFY_EXPR: case PREDECREMENT_EXPR: case PREINCREMENT_EXPR: case POSTDECREMENT_EXPR: case POSTINCREMENT_EXPR: case PLUS_EXPR: case MINUS_EXPR: case MULT_EXPR: case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case TRUNC_MOD_EXPR: case RDIV_EXPR: case EXACT_DIV_EXPR: case LSHIFT_EXPR: case RSHIFT_EXPR: case BIT_IOR_EXPR: case BIT_XOR_EXPR: Index: gcc/c/c-typeck.c =================================================================== --- gcc/c/c-typeck.c (revision 253536) +++ gcc/c/c-typeck.c (working copy) @@ -3772,21 +3772,21 @@ parser_build_binary_op (location_t locat && TREE_CODE (type2) == ENUMERAL_TYPE && TYPE_MAIN_VARIANT (type1) != TYPE_MAIN_VARIANT (type2)) warning_at (location, OPT_Wenum_compare, "comparison between %qT and %qT", type1, type2); return result; } \f /* Return a tree for the difference of pointers OP0 and OP1. - The resulting tree has type int. */ + The resulting tree has type ptrdiff_t. */ static tree pointer_diff (location_t loc, tree op0, tree op1) { tree restype = ptrdiff_type_node; tree result, inttype; addr_space_t as0 = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op0))); addr_space_t as1 = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op1))); tree target_type = TREE_TYPE (TREE_TYPE (op0)); @@ -3824,23 +3824,21 @@ pointer_diff (location_t loc, tree op0, "pointer of type %<void *%> used in subtraction"); if (TREE_CODE (target_type) == FUNCTION_TYPE) pedwarn (loc, OPT_Wpointer_arith, "pointer to a function used in subtraction"); /* First do the subtraction as integers; then drop through to build the divide operator. Do not do default conversions on the minus operator in case restype is a short type. */ - op0 = build_binary_op (loc, - MINUS_EXPR, convert (inttype, op0), - convert (inttype, op1), false); + op0 = build2_loc (loc, POINTER_DIFF_EXPR, inttype, op0, op1); /* This generates an error if op1 is pointer to incomplete type. */ if (!COMPLETE_OR_VOID_TYPE_P (TREE_TYPE (TREE_TYPE (orig_op1)))) error_at (loc, "arithmetic on pointer to an incomplete type"); op1 = c_size_in_bytes (target_type); if (pointer_to_zero_sized_aggr_p (TREE_TYPE (orig_op1))) error_at (loc, "arithmetic on pointer to an empty aggregate"); /* Divide by the size, in easiest possible way. */ Index: gcc/c-family/c-pretty-print.c =================================================================== --- gcc/c-family/c-pretty-print.c (revision 253536) +++ gcc/c-family/c-pretty-print.c (working copy) @@ -1869,20 +1869,21 @@ c_pretty_printer::multiplicative_express additive-expression - multiplicative-expression */ static void pp_c_additive_expression (c_pretty_printer *pp, tree e) { enum tree_code code = TREE_CODE (e); switch (code) { case POINTER_PLUS_EXPR: case PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: pp_c_additive_expression (pp, TREE_OPERAND (e, 0)); pp_c_whitespace (pp); if (code == PLUS_EXPR || code == POINTER_PLUS_EXPR) pp_plus (pp); else pp_minus (pp); pp_c_whitespace (pp); pp->multiplicative_expression (TREE_OPERAND (e, 1)); break; @@ -2285,20 +2286,21 @@ c_pretty_printer::expression (tree e) case NE_EXPR: pp_c_equality_expression (this, e); break; case COND_EXPR: conditional_expression (e); break; case POINTER_PLUS_EXPR: case PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: pp_c_additive_expression (this, e); break; case MODIFY_EXPR: case INIT_EXPR: assignment_expression (e); break; case COMPOUND_EXPR: Index: gcc/cfgexpand.c =================================================================== --- gcc/cfgexpand.c (revision 253536) +++ gcc/cfgexpand.c (working copy) @@ -4597,20 +4597,21 @@ expand_debug_expr (tree exp) /* We always sign-extend, regardless of the signedness of the operand, because the operand is always unsigned here even if the original C expression is signed. */ op1 = simplify_gen_unary (SIGN_EXTEND, op0_mode, op1, op1_mode); } /* Fall through. */ case PLUS_EXPR: return simplify_gen_binary (PLUS, mode, op0, op1); case MINUS_EXPR: + case POINTER_DIFF_EXPR: return simplify_gen_binary (MINUS, mode, op0, op1); case MULT_EXPR: return simplify_gen_binary (MULT, mode, op0, op1); case RDIV_EXPR: case TRUNC_DIV_EXPR: case EXACT_DIV_EXPR: if (unsignedp) return simplify_gen_binary (UDIV, mode, op0, op1); Index: gcc/cp/constexpr.c =================================================================== --- gcc/cp/constexpr.c (revision 253536) +++ gcc/cp/constexpr.c (working copy) @@ -4265,20 +4265,21 @@ cxx_eval_constant_expression (const cons return t; op1 = TREE_OPERAND (t, 1); r = cxx_eval_constant_expression (ctx, op1, lval, non_constant_p, overflow_p, jump_target); } } break; case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case PLUS_EXPR: case MINUS_EXPR: case MULT_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case ROUND_DIV_EXPR: case TRUNC_MOD_EXPR: case CEIL_MOD_EXPR: case ROUND_MOD_EXPR: @@ -5499,20 +5500,21 @@ potential_constant_expression_1 (tree t, && TYPE_POLYMORPHIC_P (TREE_TYPE (e))) { if (flags & tf_error) error_at (loc, "typeid-expression is not a constant expression " "because %qE is of polymorphic type", e); return false; } return true; } + case POINTER_DIFF_EXPR: case MINUS_EXPR: want_rval = true; goto binary; case LT_EXPR: case LE_EXPR: case GT_EXPR: case GE_EXPR: case EQ_EXPR: case NE_EXPR: Index: gcc/cp/cp-gimplify.c =================================================================== --- gcc/cp/cp-gimplify.c (revision 253536) +++ gcc/cp/cp-gimplify.c (working copy) @@ -2205,20 +2205,21 @@ cp_fold (tree x) case POSTINCREMENT_EXPR: case INIT_EXPR: case PREDECREMENT_EXPR: case PREINCREMENT_EXPR: case COMPOUND_EXPR: case MODIFY_EXPR: rval_ops = false; /* FALLTHRU */ case POINTER_PLUS_EXPR: case PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: case MULT_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case ROUND_DIV_EXPR: case TRUNC_MOD_EXPR: case CEIL_MOD_EXPR: case ROUND_MOD_EXPR: case RDIV_EXPR: Index: gcc/cp/error.c =================================================================== --- gcc/cp/error.c (revision 253536) +++ gcc/cp/error.c (working copy) @@ -2221,20 +2221,24 @@ dump_expr (cxx_pretty_printer *pp, tree default argument. Note we may have cleared out the first operand in expand_expr, so don't go killing ourselves. */ if (TREE_OPERAND (t, 1)) dump_expr (pp, TREE_OPERAND (t, 1), flags | TFF_EXPR_IN_PARENS); break; case POINTER_PLUS_EXPR: dump_binary_op (pp, "+", t, flags); break; + case POINTER_DIFF_EXPR: + dump_binary_op (pp, "-", t, flags); + break; + case INIT_EXPR: case MODIFY_EXPR: dump_binary_op (pp, assignment_operator_name_info[NOP_EXPR].name, t, flags); break; case PLUS_EXPR: case MINUS_EXPR: case MULT_EXPR: case TRUNC_DIV_EXPR: Index: gcc/cp/typeck.c =================================================================== --- gcc/cp/typeck.c (revision 253536) +++ gcc/cp/typeck.c (working copy) @@ -4305,20 +4305,21 @@ cp_build_binary_op (location_t location, converted = 1; break; } default: break; } } switch (code) { + case POINTER_DIFF_EXPR: case MINUS_EXPR: /* Subtraction of two similar pointers. We must subtract them as integers, then divide by object size. */ if (code0 == POINTER_TYPE && code1 == POINTER_TYPE && same_type_ignoring_top_level_qualifiers_p (TREE_TYPE (type0), TREE_TYPE (type1))) return pointer_diff (location, op0, op1, common_pointer_type (type0, type1), complain); /* In all other cases except pointer - int, the usual arithmetic rules apply. */ @@ -5397,25 +5398,21 @@ pointer_diff (location_t loc, tree op0, if (complain & tf_error) permerror (loc, "ISO C++ forbids using pointer to " "a method in subtraction"); else return error_mark_node; } /* First do the subtraction as integers; then drop through to build the divide operator. */ - op0 = cp_build_binary_op (loc, - MINUS_EXPR, - cp_convert (restype, op0, complain), - cp_convert (restype, op1, complain), - complain); + op0 = build2_loc (loc, POINTER_DIFF_EXPR, restype, op0, op1); /* This generates an error if op1 is a pointer to an incomplete type. */ if (!COMPLETE_TYPE_P (TREE_TYPE (TREE_TYPE (op1)))) { if (complain & tf_error) error_at (loc, "invalid use of a pointer to an incomplete type in " "pointer arithmetic"); else return error_mark_node; } Index: gcc/doc/generic.texi =================================================================== --- gcc/doc/generic.texi (revision 253536) +++ gcc/doc/generic.texi (working copy) @@ -1217,20 +1217,21 @@ the byte offset of the field, but should @tindex RSHIFT_EXPR @tindex BIT_IOR_EXPR @tindex BIT_XOR_EXPR @tindex BIT_AND_EXPR @tindex TRUTH_ANDIF_EXPR @tindex TRUTH_ORIF_EXPR @tindex TRUTH_AND_EXPR @tindex TRUTH_OR_EXPR @tindex TRUTH_XOR_EXPR @tindex POINTER_PLUS_EXPR +@tindex POINTER_DIFF_EXPR @tindex PLUS_EXPR @tindex MINUS_EXPR @tindex MULT_EXPR @tindex MULT_HIGHPART_EXPR @tindex RDIV_EXPR @tindex TRUNC_DIV_EXPR @tindex FLOOR_DIV_EXPR @tindex CEIL_DIV_EXPR @tindex ROUND_DIV_EXPR @tindex TRUNC_MOD_EXPR @@ -1406,22 +1407,27 @@ always of @code{BOOLEAN_TYPE} or @code{I These nodes represent logical and, logical or, and logical exclusive or. They are strict; both arguments are always evaluated. There are no corresponding operators in C or C++, but the front end will sometimes generate these expressions anyhow, if it can tell that strictness does not matter. The type of the operands and that of the result are always of @code{BOOLEAN_TYPE} or @code{INTEGER_TYPE}. @item POINTER_PLUS_EXPR This node represents pointer arithmetic. The first operand is always a pointer/reference type. The second operand is always an unsigned -integer type compatible with sizetype. This is the only binary -arithmetic operand that can operate on pointer types. +integer type compatible with sizetype. This and POINTER_DIFF_EXPR are +the only binary arithmetic operators that can operate on pointer types. + +@item POINTER_DIFF_EXPR +This node represents pointer subtraction. The two operands always +have pointer/reference type. The second operand is always an unsigned +integer type compatible with sizetype. It returns a signed integer. @item PLUS_EXPR @itemx MINUS_EXPR @itemx MULT_EXPR These nodes represent various binary arithmetic operations. Respectively, these operations are addition, subtraction (of the second operand from the first) and multiplication. Their operands may have either integral or floating type, but there will never be case in which one operand is of floating type and the other is of integral type. Index: gcc/expr.c =================================================================== --- gcc/expr.c (revision 253536) +++ gcc/expr.c (working copy) @@ -8539,20 +8539,21 @@ expand_expr_real_2 (sepops ops, rtx targ if (op1 == const0_rtx) return op0; goto binop2; } expand_operands (treeop0, treeop1, subtarget, &op0, &op1, modifier); return REDUCE_BIT_FIELD (simplify_gen_binary (PLUS, mode, op0, op1)); case MINUS_EXPR: + case POINTER_DIFF_EXPR: do_minus: /* For initializers, we are allowed to return a MINUS of two symbolic constants. Here we handle all cases when both operands are constant. */ /* Handle difference of two symbolic constants, for the sake of an initializer. */ if ((modifier == EXPAND_SUM || modifier == EXPAND_INITIALIZER) && really_constant_p (treeop0) && really_constant_p (treeop1)) { Index: gcc/fold-const.c =================================================================== --- gcc/fold-const.c (revision 253536) +++ gcc/fold-const.c (working copy) @@ -1130,20 +1130,25 @@ const_binop (enum tree_code code, tree a return NULL_TREE; STRIP_NOPS (arg1); STRIP_NOPS (arg2); if (TREE_CODE (arg1) == INTEGER_CST && TREE_CODE (arg2) == INTEGER_CST) { if (code == POINTER_PLUS_EXPR) return int_const_binop (PLUS_EXPR, arg1, fold_convert (TREE_TYPE (arg1), arg2)); + /* FIXME. */ + if (code == POINTER_DIFF_EXPR) + return int_const_binop (MINUS_EXPR, + fold_convert (ptrdiff_type_node, arg1), + fold_convert (ptrdiff_type_node, arg2)); return int_const_binop (code, arg1, arg2); } if (TREE_CODE (arg1) == REAL_CST && TREE_CODE (arg2) == REAL_CST) { machine_mode mode; REAL_VALUE_TYPE d1; REAL_VALUE_TYPE d2; REAL_VALUE_TYPE value; @@ -8816,24 +8821,23 @@ fold_addr_of_array_ref_difference (locat /* If the bases are array references as well, recurse. If the bases are pointer indirections compute the difference of the pointers. If the bases are equal, we are set. */ if ((TREE_CODE (base0) == ARRAY_REF && TREE_CODE (base1) == ARRAY_REF && (base_offset = fold_addr_of_array_ref_difference (loc, type, base0, base1))) || (INDIRECT_REF_P (base0) && INDIRECT_REF_P (base1) && (base_offset - = fold_binary_loc (loc, MINUS_EXPR, type, - fold_convert (type, TREE_OPERAND (base0, 0)), - fold_convert (type, - TREE_OPERAND (base1, 0))))) + = fold_binary_loc (loc, POINTER_DIFF_EXPR, type, + TREE_OPERAND (base0, 0), + TREE_OPERAND (base1, 0)))) || operand_equal_p (base0, base1, OEP_ADDRESS_OF)) { tree op0 = fold_convert_loc (loc, type, TREE_OPERAND (aref0, 1)); tree op1 = fold_convert_loc (loc, type, TREE_OPERAND (aref1, 1)); tree esz = fold_convert_loc (loc, type, array_ref_element_size (aref0)); tree diff = fold_build2_loc (loc, MINUS_EXPR, type, op0, op1); return fold_build2_loc (loc, PLUS_EXPR, type, base_offset, fold_build2_loc (loc, MULT_EXPR, type, diff, esz)); @@ -9691,21 +9695,39 @@ fold_binary_loc (location_t loc, } return fold_convert_loc (loc, type, associate_trees (loc, var0, con0, code, atype)); } } return NULL_TREE; + case POINTER_DIFF_EXPR: case MINUS_EXPR: + /* Fold &a[i] - &a[j] to i-j. */ + if (TREE_CODE (arg0) == ADDR_EXPR + && TREE_CODE (TREE_OPERAND (arg0, 0)) == ARRAY_REF + && TREE_CODE (arg1) == ADDR_EXPR + && TREE_CODE (TREE_OPERAND (arg1, 0)) == ARRAY_REF) + { + tree tem = fold_addr_of_array_ref_difference (loc, type, + TREE_OPERAND (arg0, 0), + TREE_OPERAND (arg1, 0)); + if (tem) + return tem; + } + + /* Further transformations are not for pointers. */ + if (code == POINTER_DIFF_EXPR) + return NULL_TREE; + /* (-A) - B -> (-B) - A where B is easily negated and we can swap. */ if (TREE_CODE (arg0) == NEGATE_EXPR && negate_expr_p (op1)) return fold_build2_loc (loc, MINUS_EXPR, type, negate_expr (op1), fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0))); /* Fold __complex__ ( x, 0 ) - __complex__ ( 0, y ) to __complex__ ( x, -y ). This is not the same for SNaNs or if @@ -9749,33 +9771,20 @@ fold_binary_loc (location_t loc, && ! TYPE_OVERFLOW_SANITIZED (type) && ((FLOAT_TYPE_P (type) /* Avoid this transformation if B is a positive REAL_CST. */ && (TREE_CODE (op1) != REAL_CST || REAL_VALUE_NEGATIVE (TREE_REAL_CST (op1)))) || INTEGRAL_TYPE_P (type))) return fold_build2_loc (loc, PLUS_EXPR, type, fold_convert_loc (loc, type, arg0), negate_expr (op1)); - /* Fold &a[i] - &a[j] to i-j. */ - if (TREE_CODE (arg0) == ADDR_EXPR - && TREE_CODE (TREE_OPERAND (arg0, 0)) == ARRAY_REF - && TREE_CODE (arg1) == ADDR_EXPR - && TREE_CODE (TREE_OPERAND (arg1, 0)) == ARRAY_REF) - { - tree tem = fold_addr_of_array_ref_difference (loc, type, - TREE_OPERAND (arg0, 0), - TREE_OPERAND (arg1, 0)); - if (tem) - return tem; - } - if (FLOAT_TYPE_P (type) && flag_unsafe_math_optimizations && (TREE_CODE (arg0) == RDIV_EXPR || TREE_CODE (arg0) == MULT_EXPR) && (TREE_CODE (arg1) == RDIV_EXPR || TREE_CODE (arg1) == MULT_EXPR) && (tem = distribute_real_division (loc, code, type, arg0, arg1))) return tem; /* Handle (A1 * C1) - (A2 * C2) with A1, A2 or C1, C2 being the same or one. Make sure the type is not saturating and has the signedness of the stripped operands, as fold_plusminus_mult_expr will re-associate. Index: gcc/match.pd =================================================================== --- gcc/match.pd (revision 253536) +++ gcc/match.pd (working copy) @@ -117,20 +117,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Simplify x - x. This is unsafe for certain floats even in non-IEEE formats. In IEEE, it is unsafe because it does wrong for NaNs. Also note that operand_equal_p is always false if an operand is volatile. */ (simplify (minus @0 @0) (if (!FLOAT_TYPE_P (type) || !HONOR_NANS (type)) { build_zero_cst (type); })) +(simplify + (pointer_diff @@0 @0) + { build_zero_cst (type); }) (simplify (mult @0 integer_zerop@1) @1) /* Maybe fold x * 0 to 0. The expressions aren't the same when x is NaN, since x * 0 is also NaN. Nor are they the same in modes with signed zeros, since multiplying a negative value by 0 gives -0, not +0. */ (simplify @@ -1377,20 +1380,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) tem5 = ptr1 + tem4; and produce tem5 = ptr2; */ (simplify (pointer_plus @0 (convert?@2 (minus@3 (convert @1) (convert @0)))) /* Conditionally look through a sign-changing conversion. */ (if (TYPE_PRECISION (TREE_TYPE (@2)) == TYPE_PRECISION (TREE_TYPE (@3)) && ((GIMPLE && useless_type_conversion_p (type, TREE_TYPE (@1))) || (GENERIC && type == TREE_TYPE (@1)))) @1)) +(simplify + (pointer_plus @0 (convert?@2 (pointer_diff@3 @1 @@0))) + /* Conditionally look through a sign-changing conversion. */ + (if (TYPE_PRECISION (TREE_TYPE (@2)) == TYPE_PRECISION (TREE_TYPE (@3)) + && ((GIMPLE && useless_type_conversion_p (type, TREE_TYPE (@1))) + || (GENERIC && type == TREE_TYPE (@1)))) + @1)) /* Pattern match tem = (sizetype) ptr; tem = tem & algn; tem = -tem; ... = ptr p+ tem; and produce the simpler and easier to analyze with respect to alignment ... = ptr & ~algn; */ (simplify (pointer_plus @0 (negate (bit_and (convert @0) INTEGER_CST@1))) @@ -1403,20 +1413,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (with { HOST_WIDE_INT diff; } (if (ptr_difference_const (@0, @1, &diff)) { build_int_cst_type (type, diff); })))) (simplify (minus (convert @0) (convert ADDR_EXPR@1)) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (with { HOST_WIDE_INT diff; } (if (ptr_difference_const (@0, @1, &diff)) { build_int_cst_type (type, diff); })))) +(simplify + (pointer_diff (convert?@2 ADDR_EXPR@0) (convert?@3 @1)) + (if (tree_nop_conversion_p (TREE_TYPE(@2), TREE_TYPE (@0)) + && tree_nop_conversion_p (TREE_TYPE(@3), TREE_TYPE (@1))) + (with { HOST_WIDE_INT diff; } + (if (ptr_difference_const (@0, @1, &diff)) + { build_int_cst_type (type, diff); })))) +(simplify + (pointer_diff (convert?@2 @0) (convert?@3 ADDR_EXPR@1)) + (if (tree_nop_conversion_p (TREE_TYPE(@2), TREE_TYPE (@0)) + && tree_nop_conversion_p (TREE_TYPE(@3), TREE_TYPE (@1))) + (with { HOST_WIDE_INT diff; } + (if (ptr_difference_const (@0, @1, &diff)) + { build_int_cst_type (type, diff); })))) /* If arg0 is derived from the address of an object or function, we may be able to fold this expression using the object or function's alignment. */ (simplify (bit_and (convert? @0) INTEGER_CST@1) (if (POINTER_TYPE_P (TREE_TYPE (@0)) && tree_nop_conversion_p (type, TREE_TYPE (@0))) (with { @@ -1599,20 +1623,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) /* For pointer types, if the conversion of A to the final type requires a sign- or zero-extension, then we have to punt - it is not defined which one is correct. */ || (POINTER_TYPE_P (TREE_TYPE (@0)) && TREE_CODE (@1) == INTEGER_CST && tree_int_cst_sign_bit (@1) == 0)) (convert @1)))) + (simplify + (pointer_diff (pointer_plus @@0 @1) @0) + /* The second argument of pointer_plus must be interpreted as signed, and + thus sign-extended if necessary. */ + (with { tree stype = signed_type_for (TREE_TYPE (@1)); } + (convert (convert:stype @1)))) /* (T)P - (T)(P + A) -> -(T) A */ (for add (plus pointer_plus) (simplify (minus (convert @0) (convert (add @@0 @1))) (if (element_precision (type) <= element_precision (TREE_TYPE (@1)) /* For integer types, if A has a smaller type than T the result depends on the possible overflow in P + A. @@ -1623,20 +1653,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) /* For pointer types, if the conversion of A to the final type requires a sign- or zero-extension, then we have to punt - it is not defined which one is correct. */ || (POINTER_TYPE_P (TREE_TYPE (@0)) && TREE_CODE (@1) == INTEGER_CST && tree_int_cst_sign_bit (@1) == 0)) (negate (convert @1))))) + (simplify + (pointer_diff @0 (pointer_plus @@0 @1)) + /* The second argument of pointer_plus must be interpreted as signed, and + thus sign-extended if necessary. */ + (with { tree stype = signed_type_for (TREE_TYPE (@1)); } + (negate (convert (convert:stype @1))))) /* (T)(P + A) - (T)(P + B) -> (T)A - (T)B */ (for add (plus pointer_plus) (simplify (minus (convert (add @@0 @1)) (convert (add @0 @2))) (if (element_precision (type) <= element_precision (TREE_TYPE (@1)) /* For integer types, if A has a smaller type than T the result depends on the possible overflow in P + A. @@ -1649,20 +1685,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* For pointer types, if the conversion of A to the final type requires a sign- or zero-extension, then we have to punt - it is not defined which one is correct. */ || (POINTER_TYPE_P (TREE_TYPE (@0)) && TREE_CODE (@1) == INTEGER_CST && tree_int_cst_sign_bit (@1) == 0 && TREE_CODE (@2) == INTEGER_CST && tree_int_cst_sign_bit (@2) == 0)) (minus (convert @1) (convert @2))))))) + (simplify + (pointer_diff (pointer_plus @@0 @1) (pointer_plus @0 @2)) + /* The second argument of pointer_plus must be interpreted as signed, and + thus sign-extended if necessary. */ + (with { tree stype = signed_type_for (TREE_TYPE (@1)); } + (minus (convert (convert:stype @1)) (convert (convert:stype @2))))) /* Simplifications of MIN_EXPR, MAX_EXPR, fmin() and fmax(). */ (for minmax (min max FMIN FMAX) (simplify (minmax @0 @0) @0)) /* min(max(x,y),y) -> y. */ (simplify @@ -2639,20 +2681,25 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Transform comparisons of the form X - Y CMP 0 to X CMP Y. ??? The transformation is valid for the other operators if overflow is undefined for the type, but performing it here badly interacts with the transformation in fold_cond_expr_with_comparison which attempts to synthetize ABS_EXPR. */ (for cmp (eq ne) (simplify (cmp (minus@2 @0 @1) integer_zerop) (if (single_use (@2)) (cmp @0 @1)))) +(for cmp (eq ne) + (simplify + (cmp (pointer_diff@2 @0 @1) integer_zerop) + (if (single_use (@2)) + (cmp @0 @1)))) /* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the signed arithmetic case. That form is created by the compiler often enough for folding it to be of value. One example is in computing loop trip counts after Operator Strength Reduction. */ (for cmp (simple_comparison) scmp (swapped_simple_comparison) (simplify (cmp (mult@3 @0 INTEGER_CST@1) integer_zerop@2) /* Handle unfolded multiplication by zero. */ Index: gcc/optabs-tree.c =================================================================== --- gcc/optabs-tree.c (revision 253536) +++ gcc/optabs-tree.c (working copy) @@ -216,20 +216,21 @@ optab_for_tree_code (enum tree_code code trapv = INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_TRAPS (type); switch (code) { case POINTER_PLUS_EXPR: case PLUS_EXPR: if (TYPE_SATURATING (type)) return TYPE_UNSIGNED (type) ? usadd_optab : ssadd_optab; return trapv ? addv_optab : add_optab; + case POINTER_DIFF_EXPR: case MINUS_EXPR: if (TYPE_SATURATING (type)) return TYPE_UNSIGNED (type) ? ussub_optab : sssub_optab; return trapv ? subv_optab : sub_optab; case MULT_EXPR: if (TYPE_SATURATING (type)) return TYPE_UNSIGNED (type) ? usmul_optab : ssmul_optab; return trapv ? smulv_optab : smul_optab; Index: gcc/tree-cfg.c =================================================================== --- gcc/tree-cfg.c (revision 253536) +++ gcc/tree-cfg.c (working copy) @@ -3973,20 +3973,39 @@ verify_gimple_assign_binary (gassign *st error ("type mismatch in pointer plus expression"); debug_generic_stmt (lhs_type); debug_generic_stmt (rhs1_type); debug_generic_stmt (rhs2_type); return true; } return false; } + case POINTER_DIFF_EXPR: + { + if (!POINTER_TYPE_P (rhs1_type) + || !POINTER_TYPE_P (rhs2_type) + // || !useless_type_conversion_p (rhs2_type, rhs1_type) + // || !useless_type_conversion_p (ptrdiff_type_node, lhs_type)) + || TREE_CODE (lhs_type) != INTEGER_TYPE + || TYPE_UNSIGNED (lhs_type)) + { + error ("type mismatch in pointer diff expression"); + debug_generic_stmt (lhs_type); + debug_generic_stmt (rhs1_type); + debug_generic_stmt (rhs2_type); + return true; + } + + return false; + } + case TRUTH_ANDIF_EXPR: case TRUTH_ORIF_EXPR: case TRUTH_AND_EXPR: case TRUTH_OR_EXPR: case TRUTH_XOR_EXPR: gcc_unreachable (); case LT_EXPR: case LE_EXPR: Index: gcc/tree-inline.c =================================================================== --- gcc/tree-inline.c (revision 253536) +++ gcc/tree-inline.c (working copy) @@ -3914,20 +3914,21 @@ estimate_operator_cost (enum tree_code c return 0; /* Assign cost of 1 to usual operations. ??? We may consider mapping RTL costs to this. */ case COND_EXPR: case VEC_COND_EXPR: case VEC_PERM_EXPR: case PLUS_EXPR: case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: case MULT_EXPR: case MULT_HIGHPART_EXPR: case FMA_EXPR: case ADDR_SPACE_CONVERT_EXPR: case FIXED_CONVERT_EXPR: case FIX_TRUNC_EXPR: case NEGATE_EXPR: Index: gcc/tree-pretty-print.c =================================================================== --- gcc/tree-pretty-print.c (revision 253536) +++ gcc/tree-pretty-print.c (working copy) @@ -2299,20 +2299,21 @@ dump_generic_node (pretty_printer *pp, t pp_greater (pp); break; /* Binary arithmetic and logic expressions. */ case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case MULT_EXPR: case MULT_HIGHPART_EXPR: case PLUS_EXPR: case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case ROUND_DIV_EXPR: case TRUNC_MOD_EXPR: case CEIL_MOD_EXPR: case FLOOR_MOD_EXPR: case ROUND_MOD_EXPR: case RDIV_EXPR: @@ -3534,20 +3535,21 @@ op_code_prio (enum tree_code code) case LROTATE_EXPR: case RROTATE_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: case WIDEN_LSHIFT_EXPR: return 11; case WIDEN_SUM_EXPR: case PLUS_EXPR: case POINTER_PLUS_EXPR: + case POINTER_DIFF_EXPR: case MINUS_EXPR: return 12; case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case WIDEN_MULT_EXPR: case DOT_PROD_EXPR: case WIDEN_MULT_PLUS_EXPR: case WIDEN_MULT_MINUS_EXPR: case MULT_EXPR: @@ -3720,20 +3722,21 @@ op_symbol_code (enum tree_code code) return "w+"; case WIDEN_MULT_EXPR: return "w*"; case MULT_HIGHPART_EXPR: return "h*"; case NEGATE_EXPR: case MINUS_EXPR: + case POINTER_DIFF_EXPR: return "-"; case BIT_NOT_EXPR: return "~"; case TRUTH_NOT_EXPR: return "!"; case MULT_EXPR: case INDIRECT_REF: Index: gcc/tree-vrp.c =================================================================== --- gcc/tree-vrp.c (revision 253536) +++ gcc/tree-vrp.c (working copy) @@ -3104,29 +3104,29 @@ extract_range_from_binary_expr (value_ra set_value_range (&n_vr0, VR_RANGE, op0, op0, NULL); extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1); } /* If we didn't derive a range for MINUS_EXPR, and op1's range is ~[op0,op0] or vice-versa, then we can derive a non-null range. This happens often for pointer subtraction. */ if (vr->type == VR_VARYING - && code == MINUS_EXPR + && (code == MINUS_EXPR || code == POINTER_DIFF_EXPR) && TREE_CODE (op0) == SSA_NAME && ((vr0.type == VR_ANTI_RANGE && vr0.min == op1 && vr0.min == vr0.max) || (vr1.type == VR_ANTI_RANGE && vr1.min == op0 && vr1.min == vr1.max))) - set_value_range_to_nonnull (vr, TREE_TYPE (op0)); + set_value_range_to_nonnull (vr, expr_type); } /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ void extract_range_from_unary_expr (value_range *vr, enum tree_code code, tree type, value_range *vr0_, tree op0_type) Index: gcc/tree.def =================================================================== --- gcc/tree.def (revision 253536) +++ gcc/tree.def (working copy) @@ -669,20 +669,24 @@ DEFTREECODE (PLACEHOLDER_EXPR, "placehol /* Simple arithmetic. */ DEFTREECODE (PLUS_EXPR, "plus_expr", tcc_binary, 2) DEFTREECODE (MINUS_EXPR, "minus_expr", tcc_binary, 2) DEFTREECODE (MULT_EXPR, "mult_expr", tcc_binary, 2) /* Pointer addition. The first operand is always a pointer and the second operand is an integer of type sizetype. */ DEFTREECODE (POINTER_PLUS_EXPR, "pointer_plus_expr", tcc_binary, 2) +/* Pointer subtraction. The two arguments are pointers, and the result + is a signed integer. */ +DEFTREECODE (POINTER_DIFF_EXPR, "pointer_diff_expr", tcc_binary, 2) + /* Highpart multiplication. For an integral type with precision B, returns bits [2B-1, B] of the full 2*B product. */ DEFTREECODE (MULT_HIGHPART_EXPR, "mult_highpart_expr", tcc_binary, 2) /* Division for integer result that rounds the quotient toward zero. */ DEFTREECODE (TRUNC_DIV_EXPR, "trunc_div_expr", tcc_binary, 2) /* Division for integer result that rounds it toward plus infinity. */ DEFTREECODE (CEIL_DIV_EXPR, "ceil_div_expr", tcc_binary, 2) Index: gcc/varasm.c =================================================================== --- gcc/varasm.c (revision 253536) +++ gcc/varasm.c (working copy) @@ -4606,20 +4606,21 @@ initializer_constant_valid_p_1 (tree val else /* Support narrowing pointer differences. */ ret = narrowing_initializer_constant_valid_p (value, endtype, NULL); if (cache) { cache[0] = value; cache[1] = ret; } return ret; + case POINTER_DIFF_EXPR: case MINUS_EXPR: if (TREE_CODE (endtype) == REAL_TYPE) return NULL_TREE; if (cache && cache[0] == value) return cache[1]; if (! INTEGRAL_TYPE_P (endtype) || TYPE_PRECISION (endtype) >= TYPE_PRECISION (TREE_TYPE (value))) { tree ncache[4] = { NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE }; tree valid0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-10-09 11:01 ` Marc Glisse @ 2017-10-19 15:11 ` Richard Biener 2017-10-28 13:04 ` Marc Glisse 0 siblings, 1 reply; 21+ messages in thread From: Richard Biener @ 2017-10-19 15:11 UTC (permalink / raw) To: Marc Glisse; +Cc: GCC Patches On Mon, Oct 9, 2017 at 12:55 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Mon, 3 Jul 2017, Richard Biener wrote: > >> On Sat, 1 Jul 2017, Marc Glisse wrote: >> >>> I wrote a quick prototype to see what the fallout would look like. >>> Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all >>> languages with no regression. Sure, it is probably not a complete >>> migration, there are likely a few places still converting to ptrdiff_t >>> to perform a regular subtraction, but this seems to indicate that the >>> work isn't as bad as using a signed type in pointer_plus_expr for >>> instance. >> >> >> The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR >> from POINTER_MINUS_EXPR in some cases I guess). >> >> The tree code needs documenting in tree.def and generic.texi. >> >> Otherwise ok(*). >> >> Thanks, >> Richard. >> >> (*) ok, just kidding -- or maybe not > > > I updated the prototype a bit. Some things I noticed: > > - the C front-end has support for address spaces that seems to imply that > pointer subtraction (and division by the size) may be done in a type larger > than ptrdiff_t. Currently, it generates > (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is some type > potentially larger than ptrdiff_t. It uses a ptrdiff_t corresponding to the respective address space, yes. That we use sizetype elsewhere unconditionally is a bug :/ I am thus generating a POINTER_DIFF_EXPR > with that type, while I was originally hoping its type would always be > ptrdiff_t. It complicates code and I am not sure I didn't break address > spaces anyway... (expansion likely needs to do the inttype dance) I think that's fine. Ideally targets would provide a type to use for each respective address space given we have targets that have sizetype smaller than ptr_mode. > Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t (what > POINTER_PLUS_EXPR takes as second argument) always the same type > signed/unsigned? POINTER_PLUS_EXPR takes 'sizetype', not size_t. So the answer is clearly no. And yes, that's ugly and broken and should be fixed. > Counter-examples: m32c (when !TARGET_A16), visium, darwin, > powerpcspe, s390, vms... and it isn't even always the same that is bigger > than the other. That's quite messy. Eh. Yeah, targets are free to choose 'sizetype' and they do so for efficiency. IMHO we should get rid of this "feature". > - I had to use @@ in match.pd, not for constants, but because in GENERIC we > sometimes ended up with pointers where operand_equal_p said yes but > types_match disagreed. > > - It currently regresses 2 go tests: net/http runtime/debug Those are flaky for me and fail sometimes and sometimes not. +@item POINTER_DIFF_EXPR +This node represents pointer subtraction. The two operands always +have pointer/reference type. The second operand is always an unsigned +integer type compatible with sizetype. It returns a signed integer. the 2nd sentence looks bogusly copied. + /* FIXME. */ + if (code == POINTER_DIFF_EXPR) + return int_const_binop (MINUS_EXPR, + fold_convert (ptrdiff_type_node, arg1), + fold_convert (ptrdiff_type_node, arg2)); wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest (arg2)); ? + case POINTER_DIFF_EXPR: + { + if (!POINTER_TYPE_P (rhs1_type) + || !POINTER_TYPE_P (rhs2_type) + // || !useless_type_conversion_p (rhs2_type, rhs1_type) types_compatible_p (rhs1_type, rhs2_type)? + // || !useless_type_conversion_p (ptrdiff_type_node, lhs_type)) + || TREE_CODE (lhs_type) != INTEGER_TYPE + || TYPE_UNSIGNED (lhs_type)) + { + error ("type mismatch in pointer diff expression"); + debug_generic_stmt (lhs_type); + debug_generic_stmt (rhs1_type); + debug_generic_stmt (rhs2_type); + return true; there's also verify_expr which could want adjustment for newly created tree kinds. So if the precision of the result is larger than that of the pointer operands we sign-extend the result, right? So the subtraction is performed in precision of the pointer operands and then sign-extended/truncated to the result type? Which means it is _not_ a widening subtraction to get around the half-address-space issue. The tree.def documentation should reflect this semantic detail. Not sure if the RTL expansion follows it. I think that we'd ideally have a single legal INTEGER_TYPE precision per pointer type precision and that those precisions should match... we don't have to follow the mistakes of POINTER_PLUS_EXPR. So ... above verify TYPE_PRECISION (rhs1_type) == TYPE_PRECISION (lhs_type)? Some targets have 24bit ptr_mode but no 24bit integer type which means the FE likely chooses 32bit int for the difference computation. But the middle-end should be free to create a 24bit precision type with SImode. Otherwise as said before - go ahead, I think this would be great to have for GCC 8. I'd say ask the maintainers of the above list of targets to do some testing. "Fixing" POINTER_PLUS_EXPR would be very nice as well. Again, matching precision - I'm not sure if we need to force a signed operand, having either might be nice given all sizes are usually unsigned. Thanks and sorry for the delay, Richard. > -- > Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-10-19 15:11 ` Richard Biener @ 2017-10-28 13:04 ` Marc Glisse 2017-10-28 17:13 ` Richard Biener 0 siblings, 1 reply; 21+ messages in thread From: Marc Glisse @ 2017-10-28 13:04 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches I am sending the new version of the patch in a separate email, to make it more visible, and only replying to a few points here. On Thu, 19 Oct 2017, Richard Biener wrote: > On Mon, Oct 9, 2017 at 12:55 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >> On Mon, 3 Jul 2017, Richard Biener wrote: >> >>> On Sat, 1 Jul 2017, Marc Glisse wrote: >>> >>>> I wrote a quick prototype to see what the fallout would look like. >>>> Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all >>>> languages with no regression. Sure, it is probably not a complete >>>> migration, there are likely a few places still converting to ptrdiff_t >>>> to perform a regular subtraction, but this seems to indicate that the >>>> work isn't as bad as using a signed type in pointer_plus_expr for >>>> instance. >>> >>> >>> The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR >>> from POINTER_MINUS_EXPR in some cases I guess). >>> >>> The tree code needs documenting in tree.def and generic.texi. >>> >>> Otherwise ok(*). >>> >>> Thanks, >>> Richard. >>> >>> (*) ok, just kidding -- or maybe not >> >> >> I updated the prototype a bit. Some things I noticed: >> >> - the C front-end has support for address spaces that seems to imply that >> pointer subtraction (and division by the size) may be done in a type larger >> than ptrdiff_t. Currently, it generates >> (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is some type >> potentially larger than ptrdiff_t. > > It uses a ptrdiff_t corresponding to the respective address space, yes. > That we use sizetype elsewhere unconditionally is a bug :/ > >> I am thus generating a POINTER_DIFF_EXPR >> with that type, while I was originally hoping its type would always be >> ptrdiff_t. It complicates code and I am not sure I didn't break address >> spaces anyway... (expansion likely needs to do the inttype dance) > > I think that's fine. Ideally targets would provide a type to use for each > respective address space given we have targets that have sizetype smaller > than ptr_mode. > >> Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t (what >> POINTER_PLUS_EXPR takes as second argument) always the same type >> signed/unsigned? > > POINTER_PLUS_EXPR takes 'sizetype', not size_t. Ah, yes, that's the one I meant... > So the answer is clearly no. And yes, that's ugly and broken and should be fixed. > >> Counter-examples: m32c (when !TARGET_A16), visium, darwin, >> powerpcspe, s390, vms... and it isn't even always the same that is bigger >> than the other. That's quite messy. > > Eh. Yeah, targets are free to choose 'sizetype' and they do so for > efficiency. IMHO we should get rid of this "feature". > >> - I had to use @@ in match.pd, not for constants, but because in GENERIC we >> sometimes ended up with pointers where operand_equal_p said yes but >> types_match disagreed. >> >> - It currently regresses 2 go tests: net/http runtime/debug > > Those are flaky for me and fail sometimes and sometimes not. Yes, I noticed that (there are 1 or 2 others in the go testsuite). > +@item POINTER_DIFF_EXPR > +This node represents pointer subtraction. The two operands always > +have pointer/reference type. The second operand is always an unsigned > +integer type compatible with sizetype. It returns a signed integer. > > the 2nd sentence looks bogusly copied. Oups, removed. > > + /* FIXME. */ > + if (code == POINTER_DIFF_EXPR) > + return int_const_binop (MINUS_EXPR, > + fold_convert (ptrdiff_type_node, arg1), > + fold_convert (ptrdiff_type_node, arg2)); > > wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest (arg2)); Before your reply, I wrote something similar using offset_int instead of widest_int (and handling overflow, mostly for documentation purposes). I wasn't sure which one to pick, I can change to widest_int if you think it is better... > + case POINTER_DIFF_EXPR: > + { > + if (!POINTER_TYPE_P (rhs1_type) > + || !POINTER_TYPE_P (rhs2_type) > + // || !useless_type_conversion_p (rhs2_type, rhs1_type) > > types_compatible_p (rhs1_type, rhs2_type)? Makes sense. > + // || !useless_type_conversion_p (ptrdiff_type_node, lhs_type)) > + || TREE_CODE (lhs_type) != INTEGER_TYPE > + || TYPE_UNSIGNED (lhs_type)) > + { > + error ("type mismatch in pointer diff expression"); > + debug_generic_stmt (lhs_type); > + debug_generic_stmt (rhs1_type); > + debug_generic_stmt (rhs2_type); > + return true; > > there's also verify_expr which could want adjustment for newly created > tree kinds. ok. > So if the precision of the result is larger than that of the pointer > operands we sign-extend the result, right? So the subtraction is > performed in precision of the pointer operands and then > sign-extended/truncated to the result type? Which means it is _not_ a > widening subtraction to get around the half-address-space issue. The > tree.def documentation should reflect this semantic detail. Not sure if > the RTL expansion follows it. I actually have no idea what expansion does if the size is different (that was one of my comments). Crashing is not unlikely. I have changed things a bit. Now POINTER_DIFF_EXPR always has the same precision as input and output (so expansion should be fine). And there is code in the C/C++ front-ends that uses casts and MINUS_EXPR if we want a result wider than pointers, although it might very well be dead code... My goal was not to help with objects larger than half the address space, but to fix things for small objects unlucky enough to straddle the middle of the address space. By properly modeling what overflow means for this operation, I expect we can add more simplifications in match.pd. > I think that we'd ideally have a single legal INTEGER_TYPE precision > per pointer type precision and that those precisions should match... > we don't have to follow the mistakes of POINTER_PLUS_EXPR. > > So ... above verify TYPE_PRECISION (rhs1_type) == TYPE_PRECISION (lhs_type)? > Some targets have 24bit ptr_mode but no 24bit integer type which means the > FE likely chooses 32bit int for the difference computation. But the middle-end > should be free to create a 24bit precision type with SImode. > > Otherwise as said before - go ahead, I think this would be great to > have for GCC 8. I'd say > ask the maintainers of the above list of targets to do some testing. > > "Fixing" POINTER_PLUS_EXPR would be very nice as well. Again, matching > precision - I'm not sure if we need to force a signed operand, having either > might be nice given all sizes are usually unsigned. > > Thanks and sorry for the delay, > Richard. -- Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) 2017-10-28 13:04 ` Marc Glisse @ 2017-10-28 17:13 ` Richard Biener 0 siblings, 0 replies; 21+ messages in thread From: Richard Biener @ 2017-10-28 17:13 UTC (permalink / raw) To: Marc Glisse; +Cc: GCC Patches On October 28, 2017 2:53:56 PM GMT+02:00, Marc Glisse <marc.glisse@inria.fr> wrote: > >I am sending the new version of the patch in a separate email, to make >it >more visible, and only replying to a few points here. > >On Thu, 19 Oct 2017, Richard Biener wrote: > >> On Mon, Oct 9, 2017 at 12:55 PM, Marc Glisse <marc.glisse@inria.fr> >wrote: >>> On Mon, 3 Jul 2017, Richard Biener wrote: >>> >>>> On Sat, 1 Jul 2017, Marc Glisse wrote: >>>> >>>>> I wrote a quick prototype to see what the fallout would look like. >>>>> Surprisingly, it actually passes bootstrap+testsuite on ppc64el >with all >>>>> languages with no regression. Sure, it is probably not a complete >>>>> migration, there are likely a few places still converting to >ptrdiff_t >>>>> to perform a regular subtraction, but this seems to indicate that >the >>>>> work isn't as bad as using a signed type in pointer_plus_expr for >>>>> instance. >>>> >>>> >>>> The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR >>>> from POINTER_MINUS_EXPR in some cases I guess). >>>> >>>> The tree code needs documenting in tree.def and generic.texi. >>>> >>>> Otherwise ok(*). >>>> >>>> Thanks, >>>> Richard. >>>> >>>> (*) ok, just kidding -- or maybe not >>> >>> >>> I updated the prototype a bit. Some things I noticed: >>> >>> - the C front-end has support for address spaces that seems to imply >that >>> pointer subtraction (and division by the size) may be done in a type >larger >>> than ptrdiff_t. Currently, it generates >>> (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is >some type >>> potentially larger than ptrdiff_t. >> >> It uses a ptrdiff_t corresponding to the respective address space, >yes. >> That we use sizetype elsewhere unconditionally is a bug :/ >> >>> I am thus generating a POINTER_DIFF_EXPR >>> with that type, while I was originally hoping its type would always >be >>> ptrdiff_t. It complicates code and I am not sure I didn't break >address >>> spaces anyway... (expansion likely needs to do the inttype dance) >> >> I think that's fine. Ideally targets would provide a type to use for >each >> respective address space given we have targets that have sizetype >smaller >> than ptr_mode. >> >>> Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t >(what >>> POINTER_PLUS_EXPR takes as second argument) always the same type >>> signed/unsigned? >> >> POINTER_PLUS_EXPR takes 'sizetype', not size_t. > >Ah, yes, that's the one I meant... > >> So the answer is clearly no. And yes, that's ugly and broken and >should be fixed. >> >>> Counter-examples: m32c (when !TARGET_A16), visium, darwin, >>> powerpcspe, s390, vms... and it isn't even always the same that is >bigger >>> than the other. That's quite messy. >> >> Eh. Yeah, targets are free to choose 'sizetype' and they do so for >> efficiency. IMHO we should get rid of this "feature". >> >>> - I had to use @@ in match.pd, not for constants, but because in >GENERIC we >>> sometimes ended up with pointers where operand_equal_p said yes but >>> types_match disagreed. >>> >>> - It currently regresses 2 go tests: net/http runtime/debug >> >> Those are flaky for me and fail sometimes and sometimes not. > >Yes, I noticed that (there are 1 or 2 others in the go testsuite). > >> +@item POINTER_DIFF_EXPR >> +This node represents pointer subtraction. The two operands always >> +have pointer/reference type. The second operand is always an >unsigned >> +integer type compatible with sizetype. It returns a signed integer. >> >> the 2nd sentence looks bogusly copied. > >Oups, removed. > >> >> + /* FIXME. */ >> + if (code == POINTER_DIFF_EXPR) >> + return int_const_binop (MINUS_EXPR, >> + fold_convert (ptrdiff_type_node, >arg1), >> + fold_convert (ptrdiff_type_node, >arg2)); >> >> wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest >(arg2)); > >Before your reply, I wrote something similar using offset_int instead >of >widest_int (and handling overflow, mostly for documentation purposes). >I >wasn't sure which one to pick, I can change to widest_int if you think >it >is better... Offset_int is better. >> + case POINTER_DIFF_EXPR: >> + { >> + if (!POINTER_TYPE_P (rhs1_type) >> + || !POINTER_TYPE_P (rhs2_type) >> + // || !useless_type_conversion_p (rhs2_type, rhs1_type) >> >> types_compatible_p (rhs1_type, rhs2_type)? > >Makes sense. > >> + // || !useless_type_conversion_p (ptrdiff_type_node, >lhs_type)) >> + || TREE_CODE (lhs_type) != INTEGER_TYPE >> + || TYPE_UNSIGNED (lhs_type)) >> + { >> + error ("type mismatch in pointer diff expression"); >> + debug_generic_stmt (lhs_type); >> + debug_generic_stmt (rhs1_type); >> + debug_generic_stmt (rhs2_type); >> + return true; >> >> there's also verify_expr which could want adjustment for newly >created >> tree kinds. > >ok. > >> So if the precision of the result is larger than that of the pointer >> operands we sign-extend the result, right? So the subtraction is >> performed in precision of the pointer operands and then >> sign-extended/truncated to the result type? Which means it is _not_ a > >> widening subtraction to get around the half-address-space issue. The > >> tree.def documentation should reflect this semantic detail. Not sure >if >> the RTL expansion follows it. > >I actually have no idea what expansion does if the size is different >(that >was one of my comments). Crashing is not unlikely. > >I have changed things a bit. Now POINTER_DIFF_EXPR always has the same >precision as input and output (so expansion should be fine). And there >is >code in the C/C++ front-ends that uses casts and MINUS_EXPR if we want >a >result wider than pointers, although it might very well be dead code... > >My goal was not to help with objects larger than half the address >space, >but to fix things for small objects unlucky enough to straddle the >middle >of the address space. By properly modeling what overflow means for this > >operation, I expect we can add more simplifications in match.pd. Yeah. Thanks, Richard. >> I think that we'd ideally have a single legal INTEGER_TYPE precision >> per pointer type precision and that those precisions should match... >> we don't have to follow the mistakes of POINTER_PLUS_EXPR. >> >> So ... above verify TYPE_PRECISION (rhs1_type) == TYPE_PRECISION >(lhs_type)? >> Some targets have 24bit ptr_mode but no 24bit integer type which >means the >> FE likely chooses 32bit int for the difference computation. But the >middle-end >> should be free to create a 24bit precision type with SImode. >> >> Otherwise as said before - go ahead, I think this would be great to >> have for GCC 8. I'd say >> ask the maintainers of the above list of targets to do some testing. >> >> "Fixing" POINTER_PLUS_EXPR would be very nice as well. Again, >matching >> precision - I'm not sure if we need to force a signed operand, having >either >> might be nice given all sizes are usually unsigned. >> >> Thanks and sorry for the delay, >> Richard. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) 2017-06-20 8:18 ` Richard Biener 2017-06-21 7:58 ` Jakub Jelinek 2017-06-21 14:40 ` [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) Jakub Jelinek @ 2017-07-04 8:53 ` Jakub Jelinek 2 siblings, 0 replies; 21+ messages in thread From: Jakub Jelinek @ 2017-07-04 8:53 UTC (permalink / raw) To: Richard Biener; +Cc: Martin Liška, gcc-patches On Tue, Jun 20, 2017 at 10:18:20AM +0200, Richard Biener wrote: > > Ok (of course, will handle this separately from the rest). > > Yes. Note I didn't look at the actual patch (yet). I'd like to ping the -fsanitize=pointer-overflow patch (though if you're busy, it can certainly wait a few weeks). Jakub ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) 2017-06-20 7:41 ` Richard Biener 2017-06-20 8:14 ` Jakub Jelinek @ 2017-06-21 8:00 ` Jakub Jelinek 2017-06-21 8:05 ` Richard Biener 1 sibling, 1 reply; 21+ messages in thread From: Jakub Jelinek @ 2017-06-21 8:00 UTC (permalink / raw) To: Richard Biener; +Cc: Martin Liška, gcc-patches On Tue, Jun 20, 2017 at 09:41:43AM +0200, Richard Biener wrote: > > 2) libcpp/line-map.c has this: > > static int > > location_adhoc_data_update (void **slot, void *data) > > { > > *((char **) slot) += *((int64_t *) data); > > return 1; > > } > > where the (why int64_t always?, we really need just intptr_t) adjusts > > one pointer from an unrelated one (result of realloc). That is a UB > > and actually can trigger this sanitization if the two regions are > > far away from each other, e.g. on i686-linux: > > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x0899e308 overflowed to 0xf74c4ab8 > > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x08add7c0 overflowed to 0xf74c9a08 > > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x092ba308 overflowed to 0xf741cab8 > > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x0a3757c0 overflowed to 0xf7453a08 > > Shall we perform the addition in uintptr_t instead to make it > > implementation defined rather than UB? > > Yes. Here is a patch for 2), bootstrap-ubsan bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Note both ptrdiff_t and uintptr_t are already used in libcpp, so I think it shouldn't create new portability issues. 2017-06-21 Jakub Jelinek <jakub@redhat.com> * line-map.c (location_adhoc_data_update): Perform addition in uintptr_t type rather than char * type. Read *data using ptrdiff_t type instead of int64_t. (get_combined_adhoc_loc): Change offset type to ptrdiff_t from int64_t. --- libcpp/line-map.c.jj 2017-06-19 08:28:18.000000000 +0200 +++ libcpp/line-map.c 2017-06-20 16:43:25.193063344 +0200 @@ -99,7 +99,8 @@ location_adhoc_data_eq (const void *l1, static int location_adhoc_data_update (void **slot, void *data) { - *((char **) slot) += *((int64_t *) data); + *((char **) slot) + = (char *) ((uintptr_t) *((char **) slot) + *((ptrdiff_t *) data)); return 1; } @@ -221,7 +222,7 @@ get_combined_adhoc_loc (struct line_maps set->location_adhoc_data_map.allocated) { char *orig_data = (char *) set->location_adhoc_data_map.data; - int64_t offset; + ptrdiff_t offset; /* Cast away extern "C" from the type of xrealloc. */ line_map_realloc reallocator = (set->reallocator ? set->reallocator Jakub ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) 2017-06-21 8:00 ` Jakub Jelinek @ 2017-06-21 8:05 ` Richard Biener 0 siblings, 0 replies; 21+ messages in thread From: Richard Biener @ 2017-06-21 8:05 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Martin Liška, gcc-patches On Wed, 21 Jun 2017, Jakub Jelinek wrote: > On Tue, Jun 20, 2017 at 09:41:43AM +0200, Richard Biener wrote: > > > 2) libcpp/line-map.c has this: > > > static int > > > location_adhoc_data_update (void **slot, void *data) > > > { > > > *((char **) slot) += *((int64_t *) data); > > > return 1; > > > } > > > where the (why int64_t always?, we really need just intptr_t) adjusts > > > one pointer from an unrelated one (result of realloc). That is a UB > > > and actually can trigger this sanitization if the two regions are > > > far away from each other, e.g. on i686-linux: > > > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x0899e308 overflowed to 0xf74c4ab8 > > > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x08add7c0 overflowed to 0xf74c9a08 > > > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x092ba308 overflowed to 0xf741cab8 > > > ../../libcpp/line-map.c:102:21: runtime error: pointer index expression with base 0x0a3757c0 overflowed to 0xf7453a08 > > > Shall we perform the addition in uintptr_t instead to make it > > > implementation defined rather than UB? > > > > Yes. > > Here is a patch for 2), bootstrap-ubsan bootstrapped/regtested on > x86_64-linux and i686-linux, ok for trunk? Ok. Richard. > Note both ptrdiff_t and uintptr_t are already used in libcpp, so I think > it shouldn't create new portability issues. > > 2017-06-21 Jakub Jelinek <jakub@redhat.com> > > * line-map.c (location_adhoc_data_update): Perform addition in > uintptr_t type rather than char * type. Read *data using > ptrdiff_t type instead of int64_t. > (get_combined_adhoc_loc): Change offset type to ptrdiff_t from > int64_t. > > --- libcpp/line-map.c.jj 2017-06-19 08:28:18.000000000 +0200 > +++ libcpp/line-map.c 2017-06-20 16:43:25.193063344 +0200 > @@ -99,7 +99,8 @@ location_adhoc_data_eq (const void *l1, > static int > location_adhoc_data_update (void **slot, void *data) > { > - *((char **) slot) += *((int64_t *) data); > + *((char **) slot) > + = (char *) ((uintptr_t) *((char **) slot) + *((ptrdiff_t *) data)); > return 1; > } > > @@ -221,7 +222,7 @@ get_combined_adhoc_loc (struct line_maps > set->location_adhoc_data_map.allocated) > { > char *orig_data = (char *) set->location_adhoc_data_map.data; > - int64_t offset; > + ptrdiff_t offset; > /* Cast away extern "C" from the type of xrealloc. */ > line_map_realloc reallocator = (set->reallocator > ? set->reallocator > > > Jakub > > -- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2017-10-28 17:11 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-06-19 18:25 [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) Jakub Jelinek 2017-06-20 7:41 ` Richard Biener 2017-06-20 8:14 ` Jakub Jelinek 2017-06-20 8:18 ` Richard Biener 2017-06-21 7:58 ` Jakub Jelinek 2017-06-21 8:04 ` Richard Biener 2017-06-21 14:40 ` [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) Jakub Jelinek 2017-06-21 15:17 ` Jakub Jelinek 2017-06-21 16:27 ` Marc Glisse 2017-06-22 8:31 ` Richard Biener 2017-06-22 9:29 ` Marc Glisse 2017-06-22 9:46 ` Richard Biener 2017-07-01 16:41 ` Marc Glisse 2017-07-03 12:37 ` Richard Biener 2017-10-09 11:01 ` Marc Glisse 2017-10-19 15:11 ` Richard Biener 2017-10-28 13:04 ` Marc Glisse 2017-10-28 17:13 ` Richard Biener 2017-07-04 8:53 ` [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) Jakub Jelinek 2017-06-21 8:00 ` Jakub Jelinek 2017-06-21 8:05 ` Richard Biener
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).