On 6/26/19 6:11 PM, Jeff Law wrote: > On 6/18/19 9:19 PM, Martin Sebor wrote: >> On 6/14/19 2:59 PM, Jeff Law wrote: > [ big snip ] >>> A COND_EXPR on the RHS of an assignment is valid gimple.  That's what we >>> need to consider here -- what is and what is not valid gimple.  And its >>> more likely that PHIs will be transformed into RHS COND_EXPRs -- that's >>> standard practice for if-conversion. >>> >>> Gosh, how to get one?  It happens all the time :-)  Since I know DOM so >>> well, I just shoved a quick assert into optimize_stmt to abort if we >>> encounter a gimple assignment where the RHS is a COND_EXPR.  It blew up >>> instantly building libgcc :-) >>> >>> COmpile the attached code with -O2 -m32. >> >> I wasn't saying it's not valid Gimple, just that I hadn't seen it >> come up here despite compiling Glibc and the kernel with the patched >> GCC.  The only codes I saw are these: >> >>   addr_expr, array_ref, bit_and_expr, component_ref, max_expr, >>   mem_ref, nop_expr, parm_decl, pointer_plus_expr, ssa_name, >>   and var_decl > The only one here that's really surprising is the MAX_EXPR. But it is > what it is. > >> >> What I was asking for is a test case that makes COND_EXPR come up >> in this context.  But I managed to find one by triggering the ICE >> with GDB.  The file reduced to the following test case: > Sorry. email can be a tough medium to nail down specific details. > >> >>   extern struct S s;   // S has to be an incomplete type >> >>   void* f (int n) >>   { >>     void *p; >>     int x = 0; >> >>     for (int i = n; i >= 0; i--) >>       { >>         p = &s; >>         if (p == (void*)-1) >>            x = 1; >>         else if (p) >>            return p; >>       } >> >>     return x ? (void*)-1 : 0; >>   } >> >> and the dump: >> >>    [local count: 59055800]: >>   # x_10 = PHI <1(5), 0(2)> >>   _5 = x_10 != 0 ? -1B : 0B; >> >>    [local count: 114863532]: >>   # _3 = PHI <&s(4), _5(6), &s(3)> >>   return _3; >> >> It seems a little odd that the COND_EXPR disappears when either >> of the constant addresses becomes the address of an object (or >> the result of alloca), and also when the number of iterations >> of the loop is constant.  That's probably why it so rarely comes >> up in this context. > Going into phiopt2 we have: > > ;; basic block 6, loop depth 0 > ;; pred: 5 > if (x_1 != 0) > goto ; [71.00%] > else > goto ; [29.00%] > ;; succ: 8 > ;; 7 > > ;; basic block 7, loop depth 0 > ;; pred: 6 > ;; succ: 8 > > ;; basic block 8, loop depth 0 > ;; pred: 3 > ;; 7 > ;; 6 > # _3 = PHI <&s(3), 0B(7), -1B(6)> > return _3; > > The subgraph starting at block #6 is a classic case for turning branchy > code into straightline code using a COND_EXPR on the RHS of an > assignment. So you end up with something like this: > > ;; basic block 6, loop depth 0 > ;; pred: 5 > _5 = x_1 != 0 ? -1B : 0B; > ;; succ: 7 > > ;; basic block 7, loop depth 0 > ;; pred: 3 > ;; 6 > # _3 = PHI <&s(3), _5(6)> > return _3; > > > Now for this specific case within phiopt we are limited to cases there > the result is 0/1 or 0/-1. That's why you get something different when > you exchange one of the constants for the address of an object, or > anything else for that matter. > > This is all a bit academic -- the key point is that we can have a > COND_EXPR on the RHS of an assignment. That's allowed by gimple. > > Sadly this is also likely one of the places where target characteristics > come into play -- targets define a BRANCH_COST which can significantly > change the decisions for the initial generation of conditionals. It's > one of the things that makes writing tests for jump threading, if > conversion and other optimizations so damn painful -- on one target > we'll have a series of conditional jumps, on anothers we may have a > series of logicals, potentially with COND_EXPRs. > > >> >> That said, even though I've seen a few references to COND_EXPR >> in the midle-end I have been assuming that they, in general, do >> get transformed into PHIs.  But checking the internals manual >> I found in Section 12.6.3 this: >> >>   A C ?: expression is converted into an if statement with each >>   branch assigning to the same temporary. ... The GIMPLE level >>   if-conversion pass re-introduces ?: expression, if appropriate. >>   It is used to vectorize loops with conditions using vector >>   conditional operations. >> >> This GDB test case is likely the result of this reintroduction. > Nope. It happens much earlier in the pipeline :-) > > >>> >>> And in a more general sense, this kind of permissiveness is not future >>> proof.  What happens if someone needs to add another EXPR node that is >>> valid on the RHS where such recursion is undesirable?  How are they >>> supposed to know that we've got this permissive recursive call and that >>> it's going to do the wrong thing?  And if it's an EXPR node with no >>> arguments, then we're going to do a read out of the bounds of the object >>> and all bets are off at that point (yes we have zero operand EXPR nodes, >>> but thankfully I don't think they should up in the contexts you care about). >>> >> >> Sure.  When things are hardcoded this way the same argument can >> be made both ways.  If we just handle A and B and then someone >> adds an X that matches one of the two, it won't be handled. Either >> way, some work is necessary to deal with the new Z.  One way to >> avoid it would be by providing an API to query this property of >> a code (e.g., a function like int gimple_result_aliases_operand >> (gimple *stmt, int opno)). > Adding a new opcode could have caused the original code to generate > incorrect code. In the new version an indirect opcode would just result > in something we refuse to analyze -- so we could miss a warning, but > more importantly we would _not_ transform the return statement so we'd > be safe WRT correct code generation. > > WRT making an API to do these kinds of queries. Sure. It's a fairly > natural extension to stuff we've done in the past. You could (for > example) wrap up all kinds existing properties of nodes into an API. > You'll see that's already done in a fairly ad-hoc fashion, but it's not > even close to being pervasive or standard practice. > > I would certainly applaud bringing some sanity to this kind of issue and > iterating towards a better place. But it's ugly, painful work with > little visible end user benefit. > >> >>> But those seem like distinct issues.  The one I pointed out looks like a >>> pretty simple case to handle.  It's just a logic error. >> >> It wasn't a logic error.  I simply didn't think it was necessary >> to worry about the accuracy of an inherently inaccurate warning, >> and I didn't want make more intrusive changes than was called for >> by my simple fix/enhancement.  I was also keeping in mind your >> preference for smaller change sets.  But since you insist I went >> ahead and improved things.  The warning is all the better for it, >> and the changes I made exposed a number of other bugs in GCC. >> At the same time, it seems that the bar for a simple bug fixes >> and small enhancements should not be in excess of what's possible >> by the original design. > It looked like a logic error to me that could be easily fixed. I didn't > want to make it out to a huge deal. ANd yes, there's a natural tension > between addressing small stuff as you see them in a larger kit and > breaking things down and iterating. There aren't hard and fast rules here. > > > >> >> >> gcc-71924.diff >> >> PR middle-end/71924 - missing -Wreturn-local-addr returning alloca result >> PR middle-end/90549 - missing -Wreturn-local-addr maybe returning an address of a local array plus offset >> >> gcc/ChangeLog: >> >> PR middle-end/71924 >> PR middle-end/90549 >> * gimple-ssa-isolate-paths.c (isolate_path): Add attribute. >> (args_loc_t): New type. >> (args_loc_t, locmap_t): same. >> (diag_returned_locals): New function. >> (is_addr_local): Same. >> (handle_return_addr_local_phi_arg, warn_return_addr_local): Same. >> (find_implicit_erroneous_behavior): Call warn_return_addr_local_phi_arg. >> (find_explicit_erroneous_behavior): Call warn_return_addr_local. >> >> libgcc/ChangeLog: >> * generic-morestack.c: Disable -fisolate-erroneous-paths-dereference >> to get around PR libgcc/90918. >> >> gcc/testsuite/ChangeLog: >> >> PR middle-end/71924 >> PR middle-end/90549 >> * gcc.dg/Wreturn-local-addr-2.c: New test. >> * gcc.dg/Wreturn-local-addr-4.c: New test. >> * gcc.dg/Wreturn-local-addr-5.c: New test. >> * gcc.dg/Wreturn-local-addr-6.c: New test. >> * gcc.dg/Wreturn-local-addr-7.c: New test. >> * gcc.dg/Wreturn-local-addr-8.c: New test. >> * gcc.dg/Walloca-4.c: Prune expected warnings. >> * gcc.dg/pr41551.c: Same. >> * gcc.dg/pr59523.c: Same. >> * gcc.dg/tree-ssa/pr88775-2.c: Same. >> * gcc.dg/winline-7.c: Same. >> >> diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c >> index 33fe352bb23..13b1f5f2349 100644 >> --- a/gcc/gimple-ssa-isolate-paths.c >> +++ b/gcc/gimple-ssa-isolate-paths.c >> @@ -130,7 +130,7 @@ insert_trap (gimple_stmt_iterator *si_p, tree op) >> >> Return BB'. */ >> >> -basic_block >> +ATTRIBUTE_RETURNS_NONNULL basic_block >> isolate_path (basic_block bb, basic_block duplicate, >> edge e, gimple *stmt, tree op, bool ret_zero) >> { >> @@ -341,6 +341,283 @@ stmt_uses_0_or_null_in_undefined_way (gimple *stmt) >> return false; >> } >> >> +/* Describes the property of a return statement that may return >> + the address of one or more local variables. */ >> +struct args_loc_t >> +{ >> + args_loc_t (): nargs (), locvec (), ptr (&ptr) >> + { >> + locvec.create (4); >> + } >> + >> + args_loc_t (const args_loc_t &rhs) >> + : nargs (rhs.nargs), locvec (rhs.locvec.copy ()), ptr (&ptr) { } >> + >> + args_loc_t& operator= (const args_loc_t &rhs) >> + { >> + nargs = rhs.nargs; >> + locvec.release (); >> + locvec = rhs.locvec.copy (); >> + return *this; >> + } >> + >> + ~args_loc_t () >> + { >> + locvec.release (); >> + gcc_assert (ptr == &ptr); >> + } >> + >> + /* For a PHI in a return statement its number of arguments. When less >> + than LOCVEC.LENGTH () implies that an address of local may but need >> + not be returned by the statement. Otherwise it implies it definitely >> + is returned. Used to issue "may be" vs "is" diagnostics. */ >> + unsigned nargs; >> + /* The locations of local variables/alloca calls returned by the return >> + statement. Avoid using auto_vec here since it's not safe to copy due >> + to pr90904. */ >> + vec locvec; >> + void *ptr; >> +}; > Make this a class, please. We use "struct" for POD. I've made this change. > Is there a strong need for an overloaded operator? Our guidelines > generally discourage operator overloads. This one seems on the border > to me. Others may have different ideas where the line is. If we really > don't need the overloaded operator, then we can avoid the controversy > completely. The copy assignment operator is necessary for the type to be stored in a hash_map. Otherwise, the implicitly generated assignment will be used instead which is unsafe in vec and results in memory corription (I opened bug 90904 for this). After working around this bug by providing the assignment operator I then ran into the hash_map bug 90959 that also results in memory corruption. I spent a few fun hours tracking down the GCC miscompilations that they led to. The golden rule in C++ is that every type should either be safe to copy and assign with the expected semantics or made not assignable and not copyable by declaring the copy ctor and copy assignment operator private (or deleted in more modern C++). It would be very helpful to detect this kind of problem (classes that aren't safely assignable and copyable because they acquire/release resources in ctors/dtors). Not just to us in GCC but I'm sure to others as well. It's something I've been hoping to implement for a while now. >> + >> +/* Return true if EXPR is a expression of pointer type that refers >> + to the address of a variable with automatic storage duration. >> + If so, add an entry to *PLOCMAP and insert INTO PLOCMAP->LOCVEC >> + the locations of the corresponding local variables whose address >> + is returned by the statement. VISITED is a bitmap of PHI nodes >> + already visited by recursive calls. */ >> + >> +static bool >> +is_addr_local (gimple *return_stmt, tree exp, locmap_t *plocmap, >> + hash_set *visited) >> +{ >> + if (TREE_CODE (exp) == ADDR_EXPR) >> + { >> + tree baseaddr = get_base_address (TREE_OPERAND (exp, 0)); >> + if (TREE_CODE (baseaddr) == MEM_REF) >> + return is_addr_local (return_stmt, TREE_OPERAND (baseaddr, 0), >> + plocmap, visited); >> + >> + if ((!VAR_P (baseaddr) >> + || is_global_var (baseaddr)) >> + && TREE_CODE (baseaddr) != PARM_DECL) >> + return false; >> + >> + args_loc_t &argsloc = plocmap->get_or_insert (return_stmt); >> + argsloc.locvec.safe_push (DECL_SOURCE_LOCATION (baseaddr)); >> + return true; >> + } >> + >> + if (!POINTER_TYPE_P (TREE_TYPE (exp))) >> + return false; >> + >> + if (TREE_CODE (exp) == SSA_NAME) >> + { >> + gimple *def_stmt = SSA_NAME_DEF_STMT (exp); >> + enum gimple_code code = gimple_code (def_stmt); >> + >> + if (is_gimple_assign (def_stmt)) >> + { >> + tree type = TREE_TYPE (gimple_assign_lhs (def_stmt)); >> + if (POINTER_TYPE_P (type)) >> + { >> + tree_code code = gimple_assign_rhs_code (def_stmt); >> + tree ptr1 = NULL_TREE, ptr2 = NULL_TREE; >> + if (code == COND_EXPR) >> + { >> + ptr1 = gimple_assign_rhs2 (def_stmt); >> + ptr2 = gimple_assign_rhs3 (def_stmt); >> + } >> + else if (code == MAX_EXPR || code == MIN_EXPR) >> + { >> + ptr1 = gimple_assign_rhs1 (def_stmt); >> + ptr2 = gimple_assign_rhs2 (def_stmt); >> + } >> + else if (code == ADDR_EXPR >> + || code == NOP_EXPR >> + || code == POINTER_PLUS_EXPR) >> + ptr1 = gimple_assign_rhs1 (def_stmt); >> + >> + /* Avoid short-circuiting the logical OR result in case >> + both oerands refer to local variables, in which case >> + both should be identified in the warning. */ >> + bool res1 = false, res2 = false; >> + if (ptr1) >> + res1 = is_addr_local (return_stmt, ptr1, plocmap, visited); >> + if (ptr2) >> + res2 = is_addr_local (return_stmt, ptr2, plocmap, visited); >> + return res1 || res2; >> + } >> + return false; >> + } > s/oerands/operands/ a few lines up. > > So how do you deal with the case where one operand of a > {MIN,MAX,COND}_EXPR is the address of a local, but the other is not? It > seems like we'll end up returning true from this function in that case > and potentially trigger changing the return value to NULL. Or am I > missing something? I only briefly considered MIN/MAX but because in C and C++ the less than and greater than expressions are only defined for pointers into the same array/object or subobject and I didn't ponder it too hard. (In C they're undefined otherwise, in C++ the result is unspecified.) But you bring it up because you think the address should be returned regardless, even if the expression is invalid (or if it's COND_EXPR with mixed local/nonlocal addresses) so I've changed it to do that for all these explicitly handled codes and added tests to verify it. Retested on x86_64-linux. Martin