From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 6D9983889809 for ; Fri, 18 Mar 2022 17:01:04 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6D9983889809 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-463-_eNB0k8AMxiw8xE9iuiEOw-1; Fri, 18 Mar 2022 13:00:59 -0400 X-MC-Unique: _eNB0k8AMxiw8xE9iuiEOw-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6DE98802803; Fri, 18 Mar 2022 17:00:59 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.81]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 073064292CB; Fri, 18 Mar 2022 17:00:58 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.16.1/8.16.1) with ESMTPS id 22IH0uDI1822124 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Fri, 18 Mar 2022 18:00:56 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.16.1/8.16.1/Submit) id 22IH0sdA1822123; Fri, 18 Mar 2022 18:00:54 +0100 Date: Fri, 18 Mar 2022 18:00:54 +0100 From: Jakub Jelinek To: Richard Biener , Jeff Law Cc: gcc-patches@gcc.gnu.org, Siddhesh Poyarekar , Martin Sebor Subject: [PATCH] Allow (void *) 0xdeadbeef accesses without warnings [PR99578] Message-ID: Reply-To: Jakub Jelinek MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.85 on 10.11.54.10 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Mar 2022 17:01:07 -0000 Hi! Starting with GCC11 we keep emitting false positive -Warray-bounds or -Wstringop-overflow etc. warnings on widely used *(type *)0x12345000 style accesses (or memory/string routines to such pointers). This is a standard programming style supported by all C/C++ compilers I've ever tried, used mostly in kernel or DSP programming, but sometimes also together with mmap MAP_FIXED when certain things, often I/O registers but could be anything else too are known to be present at fixed addresses. Such INTEGER_CST addresses can appear in code either because a user used it like that (in which case it is fine) or because somebody used pointer arithmetics (including &((struct whatever *)NULL)->field) on a NULL pointer. The middle-end warning code wrongly assumes that the latter case is what is very likely, while the former is unlikely and users should change their code. The following patch attempts to differentiate between the two, but because we are in stage4, does it in a temporary compromise way. For GCC 13, the aim is to represent in the IL the user (type *)0x12345000 style addresses as INTEGER_CSTs as before, and represent the addresses derived from pointer arithmetics on NULL pointer as &MEM_REF[(void*)0B + offset]. When we see say POINTER_PLUS of NULL and non-zero offset, I think we should fold it to this form. We IMNSHO need to support the poor man's offsetof I'm afraid still in wide use, so we should IMHO fold casts of such addresses to integers into INTEGER_CST offset. SCCVN IMHO shouldn't the &MEM_REF[(void*)0B + offset] and (void*)offset forms as equivalent (yes, it will be a small missed optimization but RTL will see them the same), but e.g. the patch already does ==/!= folding between the two forms. As doing this fully seems to be more like a stage1 project, the following patch keeps doing what GCC11/GCC12 did until now for pointers smaller than a parameter, by default equal to 4KB, which means that PR102037 will remain unfixed for now (unless users lower the new param manually) and for the most common cases nothing changes right now, but for higher addresses we'll assume that such INTEGER_CSTs are valid direct address accesses and use the &MEM_REF[(void*)0B + offset] forms for the invalid code on which we'll warn. Yet another alternative would be just the params.opt and pointer-query.cc (compute_objsize_r) hunks which will stop warning if we go more than 4KB from a NULL pointer altogether and doing the full set of changes only in GCC 13 (I've bootstrapped/regtested such patch last night on x86_64-linux and i686-linux). So far tested with make check-gcc/check-g++ on x86_64-linux, furthermore also tested with the parameter's default changed from 4096 to 16. Ok for trunk if it passes full bootstrap/regtest? 2022-03-18 Jakub Jelinek PR middle-end/99578 PR middle-end/100680 PR tree-optimization/100834 * params.opt (--param=min-pagesize=): New parameter. * gimple-expr.cc (is_gimple_invariant_address): Allow ADDR_EXPR of MEM_REF with integer_zerop as first operand as invariant. (is_gimple_ip_invariant_address): Likewise. * match.pd (&MEM[0 + off] == off): New simplification. * gimple-ssa-warn-restrict.cc (builtin_memref::set_base_and_offset): Ignore MEM_REFs with integer_zerop as address. * gimple-fold.cc (maybe_canonicalize_mem_ref_addr): Don't fold &MEM[0 + off] into off if off is larger than min-pagesize. * gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Return false for MEM_REFs with integer_zerop as address. * pointer-query.cc (handle_mem_ref): Handle MEM_REFs with integer_zerop as address specially. (compute_objsize_r) : Formatting fix. (compute_objsize_r) : Use maximum object size instead of zero for pointer constants equal or larger than min-pagesize. * fold-const.cc (build_fold_addr_expr_with_type_loc): * gcc.dg/tree-ssa/pr99578-1.c: New test. * gcc.dg/pr99578-1.c: New test. * gcc.dg/pr99578-2.c: New test. * gcc.dg/pr99578-3.c: New test. * gcc.dg/pr100680.c: New test. * gcc.dg/pr100834.c: New test. --- gcc/params.opt.jj 2022-03-17 18:48:06.723406664 +0100 +++ gcc/params.opt 2022-03-18 11:59:31.518452714 +0100 @@ -613,6 +613,10 @@ The maximum number of insns in loop head Common Joined UInteger Var(param_max_modulo_backtrack_attempts) Init(40) Param Optimization The maximum number of backtrack attempts the scheduler should make when modulo scheduling a loop. +-param=min-pagesize= +Common Joined UInteger Var(param_min_pagesize) Init(4096) Param Optimization +Minimum page size for warning purposes. + -param=max-partial-antic-length= Common Joined UInteger Var(param_max_partial_antic_length) Init(100) Param Optimization Maximum length of partial antic set when performing tree pre optimization. --- gcc/gimple-expr.cc.jj 2022-03-17 18:48:06.691407102 +0100 +++ gcc/gimple-expr.cc 2022-03-18 11:59:31.484453184 +0100 @@ -690,9 +690,13 @@ is_gimple_invariant_address (const_tree if (TREE_CODE (op) == MEM_REF) { const_tree op0 = TREE_OPERAND (op, 0); - return (TREE_CODE (op0) == ADDR_EXPR - && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0)) - || decl_address_invariant_p (TREE_OPERAND (op0, 0)))); + return ((TREE_CODE (op0) == ADDR_EXPR + && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0)) + || decl_address_invariant_p (TREE_OPERAND (op0, 0)))) + /* Allow &MEM [(void *)0B + 8192B] as representation + of &NULL->field_at_offset_8192 to differentiate for + warning purposes those cases from (type *)0x2000. */ + || integer_zerop (op0)); } return CONSTANT_CLASS_P (op) || decl_address_invariant_p (op); @@ -716,9 +720,10 @@ is_gimple_ip_invariant_address (const_tr if (TREE_CODE (op) == MEM_REF) { const_tree op0 = TREE_OPERAND (op, 0); - return (TREE_CODE (op0) == ADDR_EXPR - && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0)) - || decl_address_ip_invariant_p (TREE_OPERAND (op0, 0)))); + return ((TREE_CODE (op0) == ADDR_EXPR + && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0)) + || decl_address_ip_invariant_p (TREE_OPERAND (op0, 0)))) + || integer_zerop (op0)); } return CONSTANT_CLASS_P (op) || decl_address_ip_invariant_p (op); --- gcc/match.pd.jj 2022-03-16 10:56:18.995730484 +0100 +++ gcc/match.pd 2022-03-18 14:53:55.726876501 +0100 @@ -5606,6 +5606,25 @@ (define_operator_list SYNC_FETCH_AND_AND (if (cmp == NE_EXPR) { constant_boolean_node (true, type); }))))))) +/* Simplify pointer equality compares of &MEM_REF[0 + offset] and integer. */ +(for neeq (ne eq) + (simplify + (neeq (convert? addr@0) INTEGER_CST@1) + (if (TYPE_PRECISION (TREE_TYPE (@1)) == TYPE_PRECISION (TREE_TYPE (@0)) + && TYPE_PRECISION (TREE_TYPE (@1)) == TYPE_PRECISION (sizetype)) + (with { poly_int64 off; + HOST_WIDE_INT coff; + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (@0, 0), + &off); } + (if (base + && TREE_CODE (base) == MEM_REF + && integer_zerop (TREE_OPERAND (base, 0)) + && TREE_CODE (TREE_OPERAND (base, 1)) == INTEGER_CST + && off.is_constant (&coff)) + { constant_boolean_node ((wi::to_wide (TREE_OPERAND (base, 1)) + coff + == wi::to_wide (@1)) + ^ (neeq != EQ_EXPR), type); }))))) + /* Simplify pointer equality compares using PTA. */ (for neeq (ne eq) (simplify --- gcc/gimple-ssa-warn-restrict.cc.jj 2022-02-04 14:36:55.000000000 +0100 +++ gcc/gimple-ssa-warn-restrict.cc 2022-03-18 13:42:05.590650262 +0100 @@ -521,7 +521,7 @@ builtin_memref::set_base_and_offset (tre offrange[1] += maxobjsize; } - if (TREE_CODE (base) == MEM_REF) + if (TREE_CODE (base) == MEM_REF && !integer_zerop (TREE_OPERAND (base, 0))) { tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND (base, 1)); extend_offset_range (memrefoff); --- gcc/gimple-fold.cc.jj 2022-03-17 18:48:06.709406855 +0100 +++ gcc/gimple-fold.cc 2022-03-18 11:59:31.500452963 +0100 @@ -6018,7 +6018,25 @@ maybe_canonicalize_mem_ref_addr (tree *t if (wi::to_poly_wide (TREE_OPERAND (base, 0)).to_shwi (&moffset)) { coffset += moffset; - *orig_t = build_int_cst (TREE_TYPE (*orig_t), coffset); + /* Don't fold &NULL->field into INTEGER_CSTs, so that we + for warning purposes keep the distinction between such + invalid cases and valid (type *)0x7e124000. */ + if (integer_zerop (TREE_OPERAND (*t, 0)) + && maybe_ge (coffset, param_min_pagesize)) + { + if (t == &TREE_OPERAND (*orig_t, 0)) + return res; + location_t loc = EXPR_LOCATION (*orig_t); + tree ptype = TREE_TYPE (*orig_t); + tree tem = build2_loc (loc, MEM_REF, TREE_TYPE (ptype), + TREE_OPERAND (*t, 0), + build_int_cst (ptr_type_node, + coffset)); + *orig_t + = build_fold_addr_expr_with_type_loc (loc, tem, ptype); + } + else + *orig_t = build_int_cst (TREE_TYPE (*orig_t), coffset); return true; } } --- gcc/gimple-array-bounds.cc.jj 2022-03-17 18:48:06.691407102 +0100 +++ gcc/gimple-array-bounds.cc 2022-03-18 11:59:31.509452838 +0100 @@ -393,7 +393,8 @@ bool array_bounds_checker::check_mem_ref (location_t location, tree ref, bool ignore_off_by_one) { - if (warning_suppressed_p (ref, OPT_Warray_bounds)) + if (warning_suppressed_p (ref, OPT_Warray_bounds) + || integer_zerop (TREE_OPERAND (ref, 0))) return false; /* The statement used to allocate the array or null. */ --- gcc/pointer-query.cc.jj 2022-03-17 18:48:06.724406650 +0100 +++ gcc/pointer-query.cc 2022-03-18 16:14:38.184691128 +0100 @@ -1968,6 +1968,17 @@ handle_mem_ref (tree mref, gimple *stmt, } tree mrefop = TREE_OPERAND (mref, 0); + if (integer_zerop (mrefop)) + { + addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (mref)); + if (targetm.addr_space.zero_address_valid (as)) + pref->set_max_size_range (); + else + pref->sizrng[0] = pref->sizrng[1] = 0; + pref->ref = null_pointer_node; + return true; + } + if (!compute_objsize_r (mrefop, stmt, false, ostype, pref, snlim, qry)) return false; @@ -2243,7 +2254,7 @@ compute_objsize_r (tree ptr, gimple *stm } case ARRAY_REF: - return handle_array_ref (ptr, stmt, addr, ostype, pref, snlim, qry); + return handle_array_ref (ptr, stmt, addr, ostype, pref, snlim, qry); case COMPONENT_REF: return handle_component_ref (ptr, stmt, addr, ostype, pref, snlim, qry); @@ -2264,12 +2275,14 @@ compute_objsize_r (tree ptr, gimple *stm } case INTEGER_CST: - /* Pointer constants other than null are most likely the result - of erroneous null pointer addition/subtraction. Unless zero - is a valid address set size to zero. For null pointers, set - size to the maximum for now since those may be the result of - jump threading. */ - if (integer_zerop (ptr)) + /* Pointer constants other than null smaller than param_min_pagesize + might be the result of erroneous null pointer addition/subtraction. + Unless zero is a valid address set size to zero. For null pointers, + set size to the maximum for now since those may be the result of + jump threading. Similarly, for values >= param_min_pagesize in + order to support (type *) 0x7cdeab00. */ + if (integer_zerop (ptr) + || wi::to_widest (ptr) >= param_min_pagesize) pref->set_max_size_range (); else if (POINTER_TYPE_P (TREE_TYPE (ptr))) { --- gcc/fold-const.cc.jj 2022-03-17 18:48:06.675407320 +0100 +++ gcc/fold-const.cc 2022-03-18 11:59:31.541452395 +0100 @@ -9206,7 +9206,12 @@ build_fold_addr_expr_with_type_loc (loca t = fold_convert_loc (loc, ptrtype, t); } else if (TREE_CODE (t) == MEM_REF - && TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST) + && TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST + /* Don't fold &MEM [(void *)0B + 8192B] + as a representation of &((type *)NULL)->field_at_offset_8192. */ + && (!integer_zerop (TREE_OPERAND (t, 0)) + || TREE_CODE (TREE_OPERAND (t, 1)) != INTEGER_CST + || wi::to_widest (TREE_OPERAND (t, 1)) < param_min_pagesize)) return fold_binary (POINTER_PLUS_EXPR, ptrtype, TREE_OPERAND (t, 0), convert_to_ptrofftype (TREE_OPERAND (t, 1))); --- gcc/testsuite/gcc.dg/tree-ssa/pr99578-1.c.jj 2022-03-18 15:12:47.470156316 +0100 +++ gcc/testsuite/gcc.dg/tree-ssa/pr99578-1.c 2022-03-18 15:18:30.249401754 +0100 @@ -0,0 +1,22 @@ +/* PR middle-end/99578 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ +/* { dg-final { scan-tree-dump-not "&MEM" "optimized" } } */ +/* { dg-final { scan-tree-dump-times "PHI <-?1\\\(\[0-9\]+\\\), -?1\\\(\[0-9\]+\\\)>" 2 "optimized" } } */ + +struct S { int a, b[4]; }; +struct T { int a, b[8192], c[4]; }; + +int +foo (struct S *p) +{ + if (p) return -1; + return p->b == (void *)4; +} + +int +bar (struct T *p) +{ + if (p) return -1; + return p->c == (void *)32772; +} --- gcc/testsuite/gcc.dg/pr99578-1.c.jj 2022-03-18 15:04:22.825163042 +0100 +++ gcc/testsuite/gcc.dg/pr99578-1.c 2022-03-18 15:09:26.693941208 +0100 @@ -0,0 +1,26 @@ +/* PR middle-end/99578 */ +/* { dg-do compile { target int32 } } */ +/* { dg-options "-O2 -Warray-bounds" } */ + +struct S { int a, b[4]; }; +struct T { int a, b[8192], c[4]; }; + +void +foo (struct S *p) +{ + if (p) return; + __builtin_memset (p->b, 0, sizeof p->b); /* { dg-warning "offset \\\[0, 15\\\] is out of the bounds \\\[0, 0\\\]" } */ +} + +void +bar (struct T *p) +{ + if (p) return; + __builtin_memset (p->c, 0, sizeof p->c); /* { dg-warning "offset \\\[0, 15\\\] is out of the bounds \\\[0, 0\\\]" } */ +} + +void +baz (void) +{ + __builtin_memset ((void *) 0x8004, 0, 16); /* { dg-bogus "is out of the bounds" } */ +} --- gcc/testsuite/gcc.dg/pr99578-2.c.jj 2022-03-18 15:09:58.339502263 +0100 +++ gcc/testsuite/gcc.dg/pr99578-2.c 2022-03-18 15:10:33.308017231 +0100 @@ -0,0 +1,26 @@ +/* PR middle-end/99578 */ +/* { dg-do compile { target int32 } } */ +/* { dg-options "-O2 -Wstringop-overflow" } */ + +struct S { int a, b[4]; }; +struct T { int a, b[8192], c[4]; }; + +void +foo (struct S *p) +{ + if (p) return; + __builtin_memset (p->b, 0, sizeof p->b); /* { dg-warning "writing 16 bytes into a region of size 0 overflows the destination" } */ +} + +void +bar (struct T *p) +{ + if (p) return; + __builtin_memset (p->c, 0, sizeof p->c); /* { dg-warning "writing 16 bytes into a region of size 0 overflows the destination" } */ +} + +void +baz (void) +{ + __builtin_memset ((void *) 0x8004, 0, 16); /* { dg-bogus "overflows the destination" } */ +} --- gcc/testsuite/gcc.dg/pr99578-3.c.jj 2022-03-18 16:43:56.303435806 +0100 +++ gcc/testsuite/gcc.dg/pr99578-3.c 2022-03-18 16:43:51.379503346 +0100 @@ -0,0 +1,13 @@ +/* PR middle-end/99578 */ +/* { dg-do compile { target size32plus } } */ +/* { dg-options "-O2 -Wstringop-overread" } */ + +struct S { unsigned int s; }; +extern struct S v; +extern void *memcpy (void *, const void *, __SIZE_TYPE__); + +void +foo (void) +{ + memcpy (&v, (void *)(0xe8ffc000), sizeof (struct S)); /* { dg-bogus "from a region of size 0" } */ +} --- gcc/testsuite/gcc.dg/pr100680.c.jj 2022-03-18 16:48:39.136556295 +0100 +++ gcc/testsuite/gcc.dg/pr100680.c 2022-03-18 16:55:37.512876159 +0100 @@ -0,0 +1,31 @@ +/* PR middle-end/100680 */ +/* { dg-do compile { target size32plus } } */ +/* { dg-options "-O2 -Wstringop-overread" } */ + +struct s { + char a[8]; + int i; + long l; +}; + +extern char ea[8]; +static char sa[8] = { 1, 2, 3, 4 }; + +int +test (void) +{ + const struct s *ps = (const struct s *) 0x12345678L; + if (__builtin_memcmp (ps->a, ps->a, 8)) + return 0; + + if (__builtin_memcmp (ps->a, ea, 8)) /* { dg-bogus "exceeds source size 0" } */ + return 0; + + if (__builtin_memcmp (ps->a, sa, 8)) /* { dg-bogus "exceeds source size 0" } */ + return 0; + + if (__builtin_memcmp (ps->a, "abcdABCD", 8)) /* { dg-bogus "exceeds source size 0" } */ + return 0; + + return 1; +} --- gcc/testsuite/gcc.dg/pr100834.c.jj 2022-03-18 17:23:13.802552517 +0100 +++ gcc/testsuite/gcc.dg/pr100834.c 2022-03-18 17:24:21.985630103 +0100 @@ -0,0 +1,42 @@ +/* PR tree-optimization/100834 */ +/* { dg-do compile { target size32plus } } */ +/* { dg-options "-O2 -Wall" } */ + +#define PAGE_SIZE 4096 +#define STACK_SIZE PAGE_SIZE + +union registers +{ + struct + { + unsigned long r15, r14, r13, r12, r11, r10, r9, r8; + unsigned long rdi, rsi, rbp, unused, rbx, rdx, rcx, rax; + }; + unsigned long by_index[16]; +}; + +struct per_cpu +{ + union + { + unsigned char stack[STACK_SIZE]; + struct + { + unsigned char __fill[STACK_SIZE - sizeof (union registers)]; + union registers guest_regs; + }; + }; +} __attribute__((aligned (PAGE_SIZE))); + +static inline struct per_cpu * +this_cpu_data (void) +{ + return (struct per_cpu *) 0xdeadbeef; +} + +void +foo (void) +{ + struct per_cpu *cpu_data = this_cpu_data (); + __builtin_memset (&cpu_data->guest_regs, 0, sizeof (cpu_data->guest_regs)); /* { dg-bogus "is out of the bounds" } */ +} Jakub