* Question about declaring an array in the stack on runtime @ 2023-07-15 10:43 James R T 2023-07-15 10:49 ` John Scott ` (2 more replies) 0 siblings, 3 replies; 5+ messages in thread From: James R T @ 2023-07-15 10:43 UTC (permalink / raw) To: gcc-help Hi folks, I hope that this is the correct mailing list to ask this question. I have the following C code snippet: ```c #include <stdio.h> int main() { unsigned int* arr; int some_var = 7; if (some_var == 7) { arr = (unsigned int[7]){9, 10, 11, 12, 13, 14, 15}; } printf("Value of arr:\n"); for (unsigned int i = 0; i < 7; i++) { printf("%u ", arr[i]); } return 0; } ``` I have included the relevant Godbolt link here: https://godbolt.org/z/b4rbn6eGT I have a few questions related to this code snippet: 1. Is the conditional assignment to `arr` considered undefined behavior? If it is, which exact clause of the C standard does this code snippet violate and why? As seen in the Godbolt link, there is different behavior between GCC and Clang (only GCC `-O1` and above prints garbage values) which made me suspect that this is UB. 2. Regardless of whether this is UB or not, is it possible for GCC to also output a warning in `-O0` as in `-O2`? If the behavior changes across different optimization levels, it seems that it's worth a warning or two. It can be a different warning instead of `-Wdangling-pointer` since looking at the produced assembly code, GCC seems to simply optimize out the whole conditional assignment block in `-O2`. If it is UB, I understand that it is impossible to catch all UB, but I am just checking on whether it is possible to catch this specific one from GCC's perspective. Just FYI, I have also tried using `-fsanitize=address` and `-fsanitize=undefined` and it seems that AddressSanitizer would throw a `stack-use-after-scope` error in GCC if `-fsanitize=address` is specified for both `-O0` and `-O2`, but not in Clang. `-fsanitize=undefined` does not seem to be able to detect anything. If the GCC maintainers consider this an acceptable proposal to add the warning, I am also willing to post a bug report and develop the corresponding patch for it, although I would appreciate some guidance since I am not very familiar with GCC's codebase. Looking forward to your reply and have a great day ahead! Best regards, James Raphael Tiovalen ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Question about declaring an array in the stack on runtime 2023-07-15 10:43 Question about declaring an array in the stack on runtime James R T @ 2023-07-15 10:49 ` John Scott 2023-07-15 11:19 ` Xi Ruoyao 2023-07-17 12:45 ` David Brown 2 siblings, 0 replies; 5+ messages in thread From: John Scott @ 2023-07-15 10:49 UTC (permalink / raw) To: gcc-help [-- Attachment #1.1: Type: text/plain, Size: 756 bytes --] On Sat, 2023-07-15 at 18:43 +0800, James R T via Gcc-help wrote: > Is the conditional assignment to `arr` considered undefined > behavior? It's not the assignment that's the problem. The problem is that you're using a compound literal, and a compound literal, like typical automatically allocated variables, has a lifetime determined by its scope. In other words, the following: if (some_var == 7) { arr = (unsigned int[7]){9, 10, 11, 12, 13, 14, 15}; } is the same as if (some_var == 7) { unsigned int baz[7] = {9, 10, 11, 12, 13, 14, 15}; arr = baz; } I hope this makes the problem a little more obvious: after you leave the '}', the array doesn't exist anymore, and arr is a dangling pointer to an object that doesn't exist. [-- Attachment #1.2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 228 bytes --] [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5880 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Question about declaring an array in the stack on runtime 2023-07-15 10:43 Question about declaring an array in the stack on runtime James R T 2023-07-15 10:49 ` John Scott @ 2023-07-15 11:19 ` Xi Ruoyao 2023-07-15 12:57 ` James R T 2023-07-17 12:45 ` David Brown 2 siblings, 1 reply; 5+ messages in thread From: Xi Ruoyao @ 2023-07-15 11:19 UTC (permalink / raw) To: James R T, gcc-help On Sat, 2023-07-15 at 18:43 +0800, James R T via Gcc-help wrote: > Hi folks, > > I hope that this is the correct mailing list to ask this question. > > I have the following C code snippet: > > ```c > #include <stdio.h> > > int main() { > unsigned int* arr; > int some_var = 7; > > if (some_var == 7) { > arr = (unsigned int[7]){9, 10, 11, 12, 13, 14, 15}; > } > > printf("Value of arr:\n"); > for (unsigned int i = 0; i < 7; i++) { > printf("%u ", arr[i]); > } > > return 0; > } > ``` > > I have included the relevant Godbolt link here: > https://godbolt.org/z/b4rbn6eGT > > I have a few questions related to this code snippet: > > 1. Is the conditional assignment to `arr` considered undefined > behavior? No. But the usage of arr[i] in the printf call is considered undefined behavior. > If it is, which exact clause of the C standard does this > code snippet violate and why? From C23 6.5.2.5p5: A compound literal provides an unnamed object whose value, type, storage duration and other properties are as if given by the definition syntax in the constraints; if the storage duration is automatic, the lifetime of the instance of the unnamed object is the current execution of the enclosing block. And 6.2.4p2: If an object is referred to outside of its lifetime, the behavior is undefined. And 6.8p1: primary-block: compound-statement selection-statement iteration-statement And 6.8p3: A block is either a primary block, a secondary block, or the block associated with a function definition; So here the "enclosing block" is the if statement, after the if statement the lifetime of the unnamed object provided by the compound literal has ended, thus referring it in the printf call is undefined. > 2. Regardless of whether this is UB or not, is it possible for GCC to > also output a warning in `-O0` as in `-O2`? If the behavior changes > across different optimization levels, it seems that it's worth a > warning or two. It can be a different warning instead of > `-Wdangling-pointer` since looking at the produced assembly code, GCC > seems to simply optimize out the whole conditional assignment block in > `-O2`. If it is UB, I understand that it is impossible to catch all > UB, but I am just checking on whether it is possible to catch this > specific one from GCC's perspective. It's very difficult. The emit of the warning depends on the optimizing of the loop. Without optimization the compiler doesn't even know the loop will be iterated at least once, so there will be no warning is produced at -O0. > Just FYI, I have also tried using > `-fsanitize=address` and `-fsanitize=undefined` and it seems that > AddressSanitizer would throw a `stack-use-after-scope` error in GCC if > `-fsanitize=address` is specified for both `-O0` and `-O2`, but not in > Clang. `-fsanitize=undefined` does not seem to be able to detect > anything. It seems Clang completely unroll the loop and produced a series of printf calls at -O2: leaq .str.1(%rip), %rbx movq %rbx, %rdi movl $9, %esi xorl %eax, %eax callq printf@PLT movq %rbx, %rdi movl $10, %esi xorl %eax, %eax callq printf@PLT ... ... So obviously -fsanitize=address won't work because there is no memory access at all. On the contrary, clang -fsanitize=address -O0 is able to detect the issue. And this is just completely out of the capability of - fsanitize=undefined. Despite the naming, -fsanitize=undefined can only catch a *small* subset of undefined behaviors. Generally sanitizers are useful tools, but not silver bullets. > If the GCC maintainers consider this an acceptable proposal to add the > warning, I am also willing to post a bug report and develop the > corresponding patch for it, although I would appreciate some guidance > since I am not very familiar with GCC's codebase. I believe there has been a lot of duplicates about "no warnings at -O0". And as I've said, these are very difficult to fix. A lot of warnings just inherently needs some information from the optimizer, or there will be either too many false positives, or too many false negatives. If you need such warnings, you should enable the optimization. -- Xi Ruoyao <xry111@xry111.site> School of Aerospace Science and Technology, Xidian University ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Question about declaring an array in the stack on runtime 2023-07-15 11:19 ` Xi Ruoyao @ 2023-07-15 12:57 ` James R T 0 siblings, 0 replies; 5+ messages in thread From: James R T @ 2023-07-15 12:57 UTC (permalink / raw) To: Xi Ruoyao, jscott; +Cc: gcc-help On Sat, Jul 15, 2023 at 6:49 PM John Scott <jscott@posteo.net> wrote: > It's not the assignment that's the problem. The problem is that you're > using a compound literal, and a compound literal, like typical > automatically allocated variables, has a lifetime determined by its > scope. > > In other words, the following: > if (some_var == 7) { > arr = (unsigned int[7]){9, 10, 11, 12, 13, 14, 15}; > } > > is the same as > if (some_var == 7) { > unsigned int baz[7] = {9, 10, 11, 12, 13, 14, 15}; > arr = baz; > } > > I hope this makes the problem a little more obvious: after you leave the > '}', the array doesn't exist anymore, and arr is a dangling pointer to > an object that doesn't exist. Ah thank you for this John, the rearranged code is particularly illuminating. On Sat, Jul 15, 2023 at 7:19 PM Xi Ruoyao <xry111@xry111.site> wrote: > No. But the usage of arr[i] in the printf call is considered undefined > behavior. Got it. > From C23 6.5.2.5p5: > > A compound literal provides an unnamed object whose value, type, storage > duration and other properties are as if given by the definition syntax > in the constraints; if the storage duration is automatic, the lifetime > of the instance of the unnamed object is the current execution of the > enclosing block. > > And 6.2.4p2: > > If an object is referred to outside of its lifetime, the behavior is > undefined. > > And 6.8p1: > > primary-block: > compound-statement > selection-statement > iteration-statement > > And 6.8p3: > > A block is either a primary block, a secondary block, or the block > associated with a function definition; > > So here the "enclosing block" is the if statement, after the if > statement the lifetime of the unnamed object provided by the compound > literal has ended, thus referring it in the printf call is undefined. Understood, thank you for the references to the specific clauses. Will personally study these notes on the lifetimes of compound literals more for my understanding. > It's very difficult. The emit of the warning depends on the optimizing > of the loop. Without optimization the compiler doesn't even know the > loop will be iterated at least once, so there will be no warning is > produced at -O0. Ah alright, understood. > It seems Clang completely unroll the loop and produced a series of > printf calls at -O2: > > leaq .str.1(%rip), %rbx > movq %rbx, %rdi > movl $9, %esi > xorl %eax, %eax > callq printf@PLT > movq %rbx, %rdi > movl $10, %esi > xorl %eax, %eax > callq printf@PLT > > ... ... Yes, I noticed that. I would assume that this is more of an artifact of me hardcoding the value of `some_var`. If the value of `some_var` cannot be determined at compile time, then I would assume that Clang would not be able to unroll the loop and all bets are off, especially since it is UB. > So obviously -fsanitize=address won't work because there is no memory > access at all. On the contrary, clang -fsanitize=address -O0 is able to > detect the issue. Oh, interesting. Attempting to specify `-fsanitize=address` on Godbolt for Clang `-O0` does not seem to detect the issue. I have not tried it locally though. > And this is just completely out of the capability of - > fsanitize=undefined. Despite the naming, -fsanitize=undefined can only > catch a *small* subset of undefined behaviors. > > Generally sanitizers are useful tools, but not silver bullets. Noted the part on `-fsanitize=undefined`. Yes, of course, I understand that sanitizers would not be able to catch all issues. > I believe there has been a lot of duplicates about "no warnings at -O0". > And as I've said, these are very difficult to fix. A lot of warnings > just inherently needs some information from the optimizer, or there will > be either too many false positives, or too many false negatives. If you > need such warnings, you should enable the optimization. Understood, no worries. I was just checking on this specific issue. Thank you for your responses, Xi and John. I hope that both of you have a great day ahead. Best regards, James Raphael Tiovalen ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Question about declaring an array in the stack on runtime 2023-07-15 10:43 Question about declaring an array in the stack on runtime James R T 2023-07-15 10:49 ` John Scott 2023-07-15 11:19 ` Xi Ruoyao @ 2023-07-17 12:45 ` David Brown 2 siblings, 0 replies; 5+ messages in thread From: David Brown @ 2023-07-17 12:45 UTC (permalink / raw) To: gcc-help On 15/07/2023 12:43, James R T via Gcc-help wrote: > Hi folks, > > 2. Regardless of whether this is UB or not, is it possible for GCC to > also output a warning in `-O0` as in `-O2`? If the behavior changes > across different optimization levels, it seems that it's worth a > warning or two. It can be a different warning instead of > `-Wdangling-pointer` since looking at the produced assembly code, GCC > seems to simply optimize out the whole conditional assignment block in > `-O2`. If it is UB, I understand that it is impossible to catch all > UB, but I am just checking on whether it is possible to catch this > specific one from GCC's perspective. Just FYI, I have also tried using > `-fsanitize=address` and `-fsanitize=undefined` and it seems that > AddressSanitizer would throw a `stack-use-after-scope` error in GCC if > `-fsanitize=address` is specified for both `-O0` and `-O2`, but not in > Clang. `-fsanitize=undefined` does not seem to be able to detect > anything. > I think you've had solid answers to your other points already, so you can see why it is UB. When you use the -O flags, you are not just enabling a selection of optimisation passes - you are also changing the code analysis passes. With -O0, relatively little analysis is done. This means that many warnings are not active at -O0. So it is quite normal for warnings like this to require -O1 or above to work, and some may need -O2 to work fully (I don't have details off-hand - I'm a long-term gcc user, not a gcc developer). Personally, I've never had any use for -O0 - I use -O2 usually and -O1 at a minimum, even for early checking of code, precisely because it allows far better static warnings. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-07-17 12:45 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-07-15 10:43 Question about declaring an array in the stack on runtime James R T 2023-07-15 10:49 ` John Scott 2023-07-15 11:19 ` Xi Ruoyao 2023-07-15 12:57 ` James R T 2023-07-17 12:45 ` David Brown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).