> >Do you want me to revert that patch? > > Yes -- if your tests confirm that it is safe to do so. The attached patch successfully completed a testing cycle for the 3.4 branch on x86 (including the testcase for PR opt/8634) but I'm not convinced it is safe because, even if the comment in maybe_set_unchanging seems to imply that the problem for PR opt/8634 was caused by the double store in store_constructor, this is not the case: the original problem came from the mere existence of /u on memory writes for non-static initializers. See http://gcc.gnu.org/ml/gcc-patches/2002-12/msg00533.html http://gcc.gnu.org/ml/gcc-patches/2003-04/msg00482.html http://gcc.gnu.org/ml/gcc-patches/2003-04/msg00489.html And I'm not sure whether rth's patch catches all the cases. > It's supposed to be the RTL equivalent of "const". In other words, once > initialized, an RTX_UNCHANGING_P thing is immutable, and therefore no > writes can alias it. If you're after the one write to the > RTX_UNCHANGING_P thing, then it's value is always valid. That's true > even if someone has its address; they cannot write through the pointer. Yes, I understand the definition. The problem for me is really the purpose. > This is clearly a valuable optimization aid. Optimization or correctness? We had several bugreports recently on the 3.3 branch where reload generates moves to read-only memory. The only way to prevent that without rewriting the compiler seems to be testing the /u flag. > I think that simply making this flag a tri-state will be the cleanest fix. > For memory to which this clearing optimization applies, the flag should not > be set, because the memory is written twice. If we're worried about > pessimization in that case, we should avoid writing the memory twice. I don't think this is radical enough. I think we need to separate correctness from optimization, i.e having a way to say "this place can never ever be written to" and to say "this place is not supposed to have been written more than once because of the semantics of the language". The latter could be relaxed by the middle-end if it deems it profitable, the former being of course immutable. > Frankly, I suspect that there is virtually no real code where writing > only to the holes (where "holes" means "fields that are not explicitly > initialized to a non-zero value, and, if the compiler so desires, parts > of the object that are not part of any field") has any observable > performance from clearing the whole structure. If most of the structure > is zero, then that will certainly be true. If only a tiny bit of the > structure is non-zero, that will certainly be true. If the non-zero > parts are contiguous, that will probably be true. In practice, there > are few inner loops involving initializing every other field in a > structure, and that is the case where we would lose. Frankly, I don't feel confident enough to implement myself that change on the 3.4 branch at this point. On the other hand, it's clearly a better solution than the blockage. So, if someone more confident than me wants to do it, I'll help him to merge its work with the patch from ACT and verify that it fixes all the known failures. 2004-03-20 Eric Botcazou Mark Mitchell PR optimization/13424 * explow.c (maybe_set_unchanging): Revert 2003-04-07 patch. * tree.h (readwrite_fields_p): New prototype. * alias.c (readwrite_fields_p): New function. * expr.c (store_constructor): When clearing the aggregate because of an incomplete or mostly zero constructor, do not put the /u flag if the target is already unchanging. Record whether a non-unchanging aggregate containing read-write fields is cleared with the /u flag. In that case, emit a blockage right after the clearing. -- Eric Botcazou