From mboxrd@z Thu Jan  1 00:00:00 1970
From: Zack Weinberg
To: Diego Novillo
Cc: gcc@gcc.gnu.org
Subject: Re: better -Wuninitialized (Re: Ada files now checked in)
Date: Sun, 07 Oct 2001 11:55:00 -0000
Message-id: <20011007115518.Q9432@codesourcery.com>
References: <20011007012444.47FDFF28C1@nile.gnat.com> <20011007033442.A8217@tornado.cygnus.com> <20011007005944.I9432@codesourcery.com> <20011007132101.A10392@tornado.cygnus.com> <20011007105319.M9432@codesourcery.com> <20011007142131.A10736@tornado.cygnus.com>
X-SW-Source: 2001-10/msg00507.html

On Sun, Oct 07, 2001 at 02:21:31PM -0400, Diego Novillo wrote:
> On Sun, 07 Oct 2001, Zack Weinberg wrote:
>
> > >   - if its only reaching definition is the ghost def, the variable
> > >     *is* used uninitialized.
> > >
> > >   - if one of its reaching definitions is the ghost def, the
> > >     variable *may be* used uninitialized.
> > ...
> >
> > I'm not too familiar with reaching definitions, do they take control
> > dependencies into account?
> >
> Yes, that's what the SSA form is for:
>
>      1  int a, b;
>      2
>      3  b = foo();
>      4  if (b < 100)
>      5    a = 10;
>      6  b = b + a;

The question is what happens with

     1  int a, b;
     2
     3  b = foo();
     4  if (b < 100)
     5    a = 10;
     6  bar();
     7  if (b < 100)
     8    b = b + a;

which is the canonical case that the current code gets wrong.  (And
imagine that line 7 is actually several hundred lines of spaghetti
which do not touch A or B.)

hmm... At line 6, the reaching set for A is {def(A, 5), def(A, 0)},
but at line 7 it ought to be just {def(A, 5)}.  Does it know that?

> > It would often be helpful if an uninitialized variable could be
> > automatically set to a "poison" value by the compiler.  This would
> > prevent one major cause of hard-to-find context-dependent bugs.  It
> > sounds like this can easily be implemented by emitting real code for
> > the ghost definitions; dead code elimination would then zap it in all
> > cases where there isn't a problem.  Have you considered this?
> >
> Not really.  But it is definitely doable.  The only problem is
> what to consider a 'poison' value.

Something that will cause an immediate fault in the program if it gets
used.  More, you want a value that is *likely* to get used and
therefore to expose the bug.  (Zero, for instance, is a bad choice.)

For pointers, this is easy - pick a non-NULL value pointing into
unmapped memory.  Floats should probably get a signalling NaN.
Integers are harder, but on the theory that numbers in real life tend
to be small, use a big one.  For signed int, probably it should be
negative on the theory that the programmer may not have considered
negative numbers.  (But *not* -1.)  Booleans you are probably up a
creek with.

Another consideration is that the bit pattern ought to be recognizable
as a poison value.  The garbage collector uses 0xA5A5A5A5.... for this
reason.  It's also an opportunity to make jokes with the hexadecimal
constant.  Dead beef anyone?

> OTOH, if the compiler is already warning you that you're using the
> thing uninitialized, why would you also need this run-time trick?

Because you may incorrectly think you have inspected the code and
determined that there is no actual problem.  This is of course more
likely the more false positives there are.

> Hmm, I should've initialized p in the example.  But good point.
> This would've given you a warning for *p.  De-referencing a
> pointer is a use of the pointer and a def of every variable in
> its equivalence set.  In this case, we could empty the
> equivalence set if p is used uninitialized.

Sounds good.

> In tree SSA we call calculate_dominance_info and
> compute_dominance_frontiers directly.  Also, the code uses
> sbitmaps quite frequently.  The bitmaps are typically
> O(n_basic_blocks).  What problem are you referring to?

The bitmaps are probably sparse, and n_basic_blocks can blow up, at
which point your memory usage blows up too.  Brad Lucier has some good
examples of this problem.

zw
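
For illustration, here is a minimal C sketch of the poison values
described above, written by hand rather than emitted by the compiler.
Every name in it (POISON_INT32, poison_ptr, canonical_case, and so on)
is hypothetical and not part of GCC.  It also assumes that the float
format is IEEE 754 binary32 and that the pointer pattern happens to be
unmapped on the target, neither of which is guaranteed.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical poison patterns; none of these names exist in GCC. */

/* Bit pattern 0xA5A5A5A5 read as a signed 32-bit int:
   large in magnitude, negative, recognizable, and not -1. */
#define POISON_INT32      (-INT32_C(1515870811))

/* Non-NULL address that is only ASSUMED to be unmapped. */
#define POISON_PTR_BITS   ((uintptr_t)0xDEADBEEFu)

/* Signalling NaN, ASSUMING IEEE 754 binary32: exponent all ones,
   quiet bit clear, non-zero mantissa. */
#define POISON_FLOAT_BITS UINT32_C(0x7FA00000)

static int32_t poison_int32(void)
{
    return POISON_INT32;
}

static void *poison_ptr(void)
{
    /* Integer-to-pointer conversion is implementation-defined. */
    return (void *)POISON_PTR_BITS;
}

static float poison_float(void)
{
    uint32_t bits = POISON_FLOAT_BITS;
    float f;
    memcpy(&f, &bits, sizeof f);   /* copy the bits without converting */
    return f;
}

/* The canonical case from the message, with the ghost definition of
   'a' turned into a real store of the poison value. */
static int32_t canonical_case(int32_t b)
{
    int32_t a = poison_int32();    /* ghost def made real */
    if (b < 100)
        a = 10;
    /* imagine several hundred unrelated lines here */
    if (b < 100)
        b = b + a;                 /* poison never reaches this use */
    return b;
}
```

In canonical_case the poison store is dead on every path that reads
'a', so an optimizer that can prove this is free to delete it; if the
second condition were ever changed so that it no longer matched the
first, the result would show the 0xA5A5A5A5 pattern instead of a
plausible small number.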