Hello, Attached is a patch that aims to substantially improve FORTIFY's usefulness with clang, and make defining FORTIFY'ed functions require less ceremony. I'm not looking for a thorough review at this time, nor do I hope to land this patch in one piece. The goal of this thread is to ensure that everyone's more or less happy with where I'd like to take glibc's FORTIFY implementation. Please note that this change is intended to be a functional nop for all compilers that aren't clang >= 5.0, which was just released last Thursday. Diving in: as said, this patch removes a lot of duplication from FORTIFY in the common case, and makes FORTIFY far more useful for those who use glibc with clang. Namely, with this patch, clang becomes capable of emitting compile-time diagnostics on par (*) with GCC's, and clang's ability to perform run-time checks is substantially improved over what we have today. It essentially does this by wrapping up the majority of the compiler-specific incantations (declaring __foo_chk_warn, conditionally calling it, ...) behind a macro, and uses that to stamp out FORTIFY's compile-time diagnostic bits. While this approach is the cleanest I've been able to come up with, it has potential downsides: - Compile-time diagnostics with GCC are somewhat different than what they are today. To show this, I've attached tst-chk2-output.diff, which is a diff of the diagnostics produced by running GCC 7.1 on debug/tst-chk2.c. I don't find the difference to be substantial, but it does exist. - In very rare cases, the code generated by GCC will be a bit worse (e.g. slower+larger) with this patch. I know this may be a tough sell, but please hear me out. :) With this patch, we sometimes emit diagnostics by emitting code like: if (should_emit_compile_time_warning) { volatile char c = 0; if (__glibc_unlikely (c)) function_with_warnattr (); } Where `should_emit_compile_time_warning` always folds to a constant during compilation. So, 0 diagnostics should mean 0 change in code quality. I don't believe this is a deal-breaker, since: - if you're using FORTIFY, you presumably care enough to fix FORTIFY-related warnings, which makes this regression nonexistent for you, - if you're using FORTIFY, you know there will be a (small, but present) performance penalty, - directly after each __glibc_unlikely(c) branch is call to a FORTIFY function with known broken input, which should abort the program anyway, and - this doesn't apply to *all* diagnostics (e.g. bad calls to open() don't have this penalty); just ones where we can unify clang's and GCC's diagnostic emission bits. In any case, please see binary-output.diff for an idea of the difference this makes on code compiled with GCC 7.1. The function being compiled was a call to wmemmove with an undersized buffer. With a sufficiently large buffer, both today's FORTIFY implementation and the proposed one produce identical bodies for the function in question. Other than that, I know of no regressions that this patch causes with GCC. For clang, in very rare cases (read: I've seen ~10 instances of this testing similar implementations across Android, ChromeOS, and another very large code base), it can also break existing code. For specifics on that, and an overview on how clang's FORTIFY implementation works, please see https://docs.google.com/document/d/1DFfZDICTbL7RqS74wJVIJ-YnjQOj1SaoqfhbgddFYSM/edit?usp=sharing The "How does clang handle it?" section covers the primary attributes this patch uses, and the "Incompatibilities between clang and GCC FORTIFY" section covers places where this patch might break existing clang users. Though I expect clang breakages to be very rare and trivial to fix, this patch introduces a _CLANG_FORTIFY_DISABLE macro, which can be used to turn this entire patch into a nop for clang >= 5.0. ----- I apologize if this is a lot to dump into one email. As said, if we decide this is an acceptable direction to head in, I hope to land this patch in many easily reviewable parts later on. If you have any questions, comments, concerns, etc. please don't hesitate to voice them. Thank you very much for your time, :) George (*) - This means that clang can catch almost as many *types* of bugs as GCC at compile-time. Due to architectural constraints, clang can only perform these checks prior to optimizing code. So, clang is still unable to statically diagnose bugs that require it to do more than simple constant folding. __builtin_object_size, however, isn't required to be folded until after we optimize code, so many dynamic checks can still take advantage of optimizations.