* [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) @ 2018-02-20 17:49 Jakub Jelinek 2018-02-20 19:03 ` Martin Sebor 0 siblings, 1 reply; 19+ messages in thread From: Jakub Jelinek @ 2018-02-20 17:49 UTC (permalink / raw) To: Richard Biener, Jeff Law; +Cc: gcc-patches, Martin Sebor Hi! The following testcase distilled from pdftex is miscompiled on i?86, because gimple_fold_builtin_strlen sets incorrect value range on strlen call on SSA_NAME with def_stmt of PHI <"mu", something> where we can't determine anything about string length of something, so the right value range is [0, maximum_possible_strlen], but we actually used [2, INF]. get_range_strlen has many modes, for get_maxval_strlen we check return value of get_range_strlen, don't set fuzzy and everything seems correct. Otherwise we have the fuzzy mode which is used in 2 arg get_range_strlen overload, which is used for warnings and recently also for gimple_fold_builtin_strlen which sets value ranges from it. Unlike warnings, which can perhaps afford some heuristics and guessing, for gimple_fold_builtin_strlen we need to be 100% conservative. The thing that isn't handled conservatively is PHIs and COND_EXPR. The current code, if we can't figure one of the args out, for PHIs in fuzzy mode increases the *maxval value to +INF, but doesn't touch *minval, for COND_EXPR doesn't adjust the *minval/*maxval at all and just returns false. Unfortunately, changing that breaks FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 165) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 166) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 167) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 168) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 309) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 310) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 311) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 312) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for excess errors) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 62) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 63) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 402) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 403) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 430) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 431) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 458) FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 459) so this patch instead stops using the 2 argument get_range_strlen in gimple_fold_builtin_strlen and makes sure we handle that case conservatively, i.e. PHI of something known + unknown is unknown, likewise COND_EXPR. Bootstrapped on x86_64-linux and i686-linux, regtest pending, ok for trunk if it passes regtest on both? For stage1 I guess Martin can tweak the warning code paths and if they will become the same as the conservatively correct ones, what is in this patch can be adjusted again. 2018-02-20 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/84478 * gimple-fold.c (get_range_strlen): Make minlen const and assume it can't be NULL. Add type 3 support which is conservatively correct in PHIs. Formatting and comment capitalization fixes. Add warning that the 2 argument get_range_strlen is only usable for warnings. (gimple_fold_builtin_strlen): Use the 6 arg get_range_strlen overload rather than 2 arg, use it only if it returns true and flexarray is false, pass 3 as type to it. * gcc.c-torture/execute/pr84478.c: New test. --- gcc/gimple-fold.c.jj 2018-02-19 19:57:03.424279589 +0100 +++ gcc/gimple-fold.c 2018-02-20 16:21:34.583020305 +0100 @@ -1283,10 +1283,11 @@ gimple_fold_builtin_memset (gimple_stmt_ value of ARG in LENGTH[0] and LENGTH[1], respectively. If ARG is an SSA name variable, follow its use-def chains. When TYPE == 0, if LENGTH[1] is not equal to the length we determine or - if we are unable to determine the length or value, return False. + if we are unable to determine the length or value, return false. VISITED is a bitmap of visited variables. TYPE is 0 if string length should be obtained, 1 for maximum string - length and 2 for maximum value ARG can have. + length and 2 for maximum value ARG can have, 3 is like 1, but provide + conservatively correct rather than optimistic answer. When FUZZY is set and the length of a string cannot be determined, the function instead considers as the maximum possible length the size of a character array it may refer to. @@ -1302,9 +1303,8 @@ get_range_strlen (tree arg, tree length[ tree var, val = NULL_TREE; gimple *def_stmt; - /* The minimum and maximum length. The MAXLEN pointer stays unchanged - but MINLEN may be cleared during the execution of the function. */ - tree *minlen = length; + /* The minimum and maximum length. */ + tree *const minlen = length; tree *const maxlen = length + 1; if (TREE_CODE (arg) != SSA_NAME) @@ -1445,12 +1445,11 @@ get_range_strlen (tree arg, tree length[ if (!val) return false; - if (minlen - && (!*minlen - || (type > 0 - && TREE_CODE (*minlen) == INTEGER_CST - && TREE_CODE (val) == INTEGER_CST - && tree_int_cst_lt (val, *minlen)))) + if (!*minlen + || (type > 0 + && TREE_CODE (*minlen) == INTEGER_CST + && TREE_CODE (val) == INTEGER_CST + && tree_int_cst_lt (val, *minlen))) *minlen = val; if (*maxlen) @@ -1503,18 +1502,16 @@ get_range_strlen (tree arg, tree length[ { tree op2 = gimple_assign_rhs2 (def_stmt); tree op3 = gimple_assign_rhs3 (def_stmt); - return get_range_strlen (op2, length, visited, type, fuzzy, flexp) - && get_range_strlen (op3, length, visited, type, fuzzy, flexp); + return (get_range_strlen (op2, length, visited, type, fuzzy, flexp) + && get_range_strlen (op3, length, visited, type, fuzzy, + flexp)); } return false; case GIMPLE_PHI: - { - /* All the arguments of the PHI node must have the same constant - length. */ - unsigned i; - - for (i = 0; i < gimple_phi_num_args (def_stmt); i++) + /* All the arguments of the PHI node must have the same constant + length. */ + for (unsigned i = 0; i < gimple_phi_num_args (def_stmt); i++) { tree arg = gimple_phi_arg (def_stmt, i)->def; @@ -1529,13 +1526,12 @@ get_range_strlen (tree arg, tree length[ if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp)) { - if (fuzzy) + if (fuzzy && type != 3) *maxlen = build_all_ones_cst (size_type_node); else return false; } } - } return true; default: @@ -1554,7 +1550,10 @@ get_range_strlen (tree arg, tree length[ Return true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound - due to it being used as a poor-man's flexible array member. */ + due to it being used as a poor-man's flexible array member. + + This function should be only used for warning code, as it doesn't + handle PHIs in a conservatively correct way. */ bool get_range_strlen (tree arg, tree minmaxlen[2]) @@ -3533,8 +3532,12 @@ gimple_fold_builtin_strlen (gimple_stmt_ wide_int minlen; wide_int maxlen; - tree lenrange[2]; - if (!get_range_strlen (gimple_call_arg (stmt, 0), lenrange) + tree lenrange[2] = { NULL_TREE, NULL_TREE }; + bitmap visited = NULL; + bool flexarray = false; + if (get_range_strlen (gimple_call_arg (stmt, 0), lenrange, &visited, + 3, true, &flexarray) + && !flexarray && lenrange[0] && TREE_CODE (lenrange[0]) == INTEGER_CST && lenrange[1] && TREE_CODE (lenrange[1]) == INTEGER_CST) { @@ -3554,6 +3557,9 @@ gimple_fold_builtin_strlen (gimple_stmt_ maxlen = wi::to_wide (max_object_size (), prec) - 2; } + if (visited) + BITMAP_FREE (visited); + if (minlen == maxlen) { lenrange[0] = force_gimple_operand_gsi (gsi, lenrange[0], true, NULL, --- gcc/testsuite/gcc.c-torture/execute/pr84478.c.jj 2018-02-20 16:32:00.683086212 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr84478.c 2018-02-20 16:31:33.497081640 +0100 @@ -0,0 +1,49 @@ +/* PR tree-optimization/84478 */ + +long poolptr; +unsigned char *strpool; +static const char *poolfilearr[] = { + "mu", + "", +#define A "x", +#define B A "xx", A A "xxx", A A A A A +#define C B B B B B B B B B B +#define D C C C C C C C C C C + D C C C C C C C B B B + ((void *)0) +}; + +__attribute__((noipa)) long +makestring (void) +{ + return 1; +} + +__attribute__((noipa)) long +loadpoolstrings (long spare_size) +{ + const char *s; + long g = 0; + int i = 0, j = 0; + while ((s = poolfilearr[j++])) + { + int l = __builtin_strlen (s); + i += l; + if (i >= spare_size) return 0; + while (l-- > 0) strpool[poolptr++] = *s++; + g = makestring (); + } + return g; +} + +int +main () +{ + strpool = __builtin_malloc (4000); + if (!strpool) + return 0; + asm volatile ("" : : : "memory"); + volatile int r = loadpoolstrings (4000); + __builtin_free (strpool); + return 0; +} Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-20 17:49 [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) Jakub Jelinek @ 2018-02-20 19:03 ` Martin Sebor 2018-02-20 20:13 ` Martin Sebor ` (2 more replies) 0 siblings, 3 replies; 19+ messages in thread From: Martin Sebor @ 2018-02-20 19:03 UTC (permalink / raw) To: Jakub Jelinek, Richard Biener, Jeff Law; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 9972 bytes --] On 02/20/2018 10:17 AM, Jakub Jelinek wrote: > Hi! > > The following testcase distilled from pdftex is miscompiled on i?86, > because gimple_fold_builtin_strlen sets incorrect value range on > strlen call on SSA_NAME with def_stmt of PHI <"mu", something> where > we can't determine anything about string length of something, so the > right value range is [0, maximum_possible_strlen], but we actually used > [2, INF]. > > get_range_strlen has many modes, for get_maxval_strlen we check return > value of get_range_strlen, don't set fuzzy and everything seems correct. > Otherwise we have the fuzzy mode which is used in 2 arg get_range_strlen > overload, which is used for warnings and recently also for > gimple_fold_builtin_strlen which sets value ranges from it. Unlike > warnings, which can perhaps afford some heuristics and guessing, for > gimple_fold_builtin_strlen we need to be 100% conservative. > > The thing that isn't handled conservatively is PHIs and COND_EXPR. > The current code, if we can't figure one of the args out, for PHIs in > fuzzy mode increases the *maxval value to +INF, but doesn't touch > *minval, for COND_EXPR doesn't adjust the *minval/*maxval at all and just > returns false. Unfortunately, changing that breaks It sounds like not setting *minlen is the problem then. Attached is a slightly smaller patch that fixes the bug with no testsuite regressions on x86_64-linux. How does it look to you? Martin > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 165) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 166) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 167) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 168) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 309) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 310) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 311) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for warnings, line 312) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-11.c (test for excess errors) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 62) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 63) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 402) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 403) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 430) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 431) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 458) > FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for warnings, line 459) > so this patch instead stops using the 2 argument get_range_strlen > in gimple_fold_builtin_strlen and makes sure we handle that case > conservatively, i.e. PHI of something known + unknown is unknown, likewise > COND_EXPR. > > Bootstrapped on x86_64-linux and i686-linux, regtest pending, ok for trunk > if it passes regtest on both? > > For stage1 I guess Martin can tweak the warning code paths and if they will > become the same as the conservatively correct ones, what is in this patch > can be adjusted again. > > 2018-02-20 Jakub Jelinek <jakub@redhat.com> > > PR tree-optimization/84478 > * gimple-fold.c (get_range_strlen): Make minlen const and assume it > can't be NULL. Add type 3 support which is conservatively correct > in PHIs. Formatting and comment capitalization fixes. Add warning > that the 2 argument get_range_strlen is only usable for warnings. > (gimple_fold_builtin_strlen): Use the 6 arg get_range_strlen overload > rather than 2 arg, use it only if it returns true and flexarray is > false, pass 3 as type to it. > > * gcc.c-torture/execute/pr84478.c: New test. > > --- gcc/gimple-fold.c.jj 2018-02-19 19:57:03.424279589 +0100 > +++ gcc/gimple-fold.c 2018-02-20 16:21:34.583020305 +0100 > @@ -1283,10 +1283,11 @@ gimple_fold_builtin_memset (gimple_stmt_ > value of ARG in LENGTH[0] and LENGTH[1], respectively. > If ARG is an SSA name variable, follow its use-def chains. When > TYPE == 0, if LENGTH[1] is not equal to the length we determine or > - if we are unable to determine the length or value, return False. > + if we are unable to determine the length or value, return false. > VISITED is a bitmap of visited variables. > TYPE is 0 if string length should be obtained, 1 for maximum string > - length and 2 for maximum value ARG can have. > + length and 2 for maximum value ARG can have, 3 is like 1, but provide > + conservatively correct rather than optimistic answer. > When FUZZY is set and the length of a string cannot be determined, > the function instead considers as the maximum possible length the > size of a character array it may refer to. > @@ -1302,9 +1303,8 @@ get_range_strlen (tree arg, tree length[ > tree var, val = NULL_TREE; > gimple *def_stmt; > > - /* The minimum and maximum length. The MAXLEN pointer stays unchanged > - but MINLEN may be cleared during the execution of the function. */ > - tree *minlen = length; > + /* The minimum and maximum length. */ > + tree *const minlen = length; > tree *const maxlen = length + 1; > > if (TREE_CODE (arg) != SSA_NAME) > @@ -1445,12 +1445,11 @@ get_range_strlen (tree arg, tree length[ > if (!val) > return false; > > - if (minlen > - && (!*minlen > - || (type > 0 > - && TREE_CODE (*minlen) == INTEGER_CST > - && TREE_CODE (val) == INTEGER_CST > - && tree_int_cst_lt (val, *minlen)))) > + if (!*minlen > + || (type > 0 > + && TREE_CODE (*minlen) == INTEGER_CST > + && TREE_CODE (val) == INTEGER_CST > + && tree_int_cst_lt (val, *minlen))) > *minlen = val; > > if (*maxlen) > @@ -1503,18 +1502,16 @@ get_range_strlen (tree arg, tree length[ > { > tree op2 = gimple_assign_rhs2 (def_stmt); > tree op3 = gimple_assign_rhs3 (def_stmt); > - return get_range_strlen (op2, length, visited, type, fuzzy, flexp) > - && get_range_strlen (op3, length, visited, type, fuzzy, flexp); > + return (get_range_strlen (op2, length, visited, type, fuzzy, flexp) > + && get_range_strlen (op3, length, visited, type, fuzzy, > + flexp)); > } > return false; > > case GIMPLE_PHI: > - { > - /* All the arguments of the PHI node must have the same constant > - length. */ > - unsigned i; > - > - for (i = 0; i < gimple_phi_num_args (def_stmt); i++) > + /* All the arguments of the PHI node must have the same constant > + length. */ > + for (unsigned i = 0; i < gimple_phi_num_args (def_stmt); i++) > { > tree arg = gimple_phi_arg (def_stmt, i)->def; > > @@ -1529,13 +1526,12 @@ get_range_strlen (tree arg, tree length[ > > if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp)) > { > - if (fuzzy) > + if (fuzzy && type != 3) > *maxlen = build_all_ones_cst (size_type_node); > else > return false; > } > } > - } > return true; > > default: > @@ -1554,7 +1550,10 @@ get_range_strlen (tree arg, tree length[ > Return true if the range of the string lengths has been obtained > from the upper bound of an array at the end of a struct. Such > an array may hold a string that's longer than its upper bound > - due to it being used as a poor-man's flexible array member. */ > + due to it being used as a poor-man's flexible array member. > + > + This function should be only used for warning code, as it doesn't > + handle PHIs in a conservatively correct way. */ > > bool > get_range_strlen (tree arg, tree minmaxlen[2]) > @@ -3533,8 +3532,12 @@ gimple_fold_builtin_strlen (gimple_stmt_ > wide_int minlen; > wide_int maxlen; > > - tree lenrange[2]; > - if (!get_range_strlen (gimple_call_arg (stmt, 0), lenrange) > + tree lenrange[2] = { NULL_TREE, NULL_TREE }; > + bitmap visited = NULL; > + bool flexarray = false; > + if (get_range_strlen (gimple_call_arg (stmt, 0), lenrange, &visited, > + 3, true, &flexarray) > + && !flexarray > && lenrange[0] && TREE_CODE (lenrange[0]) == INTEGER_CST > && lenrange[1] && TREE_CODE (lenrange[1]) == INTEGER_CST) > { > @@ -3554,6 +3557,9 @@ gimple_fold_builtin_strlen (gimple_stmt_ > maxlen = wi::to_wide (max_object_size (), prec) - 2; > } > > + if (visited) > + BITMAP_FREE (visited); > + > if (minlen == maxlen) > { > lenrange[0] = force_gimple_operand_gsi (gsi, lenrange[0], true, NULL, > --- gcc/testsuite/gcc.c-torture/execute/pr84478.c.jj 2018-02-20 16:32:00.683086212 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr84478.c 2018-02-20 16:31:33.497081640 +0100 > @@ -0,0 +1,49 @@ > +/* PR tree-optimization/84478 */ > + > +long poolptr; > +unsigned char *strpool; > +static const char *poolfilearr[] = { > + "mu", > + "", > +#define A "x", > +#define B A "xx", A A "xxx", A A A A A > +#define C B B B B B B B B B B > +#define D C C C C C C C C C C > + D C C C C C C C B B B > + ((void *)0) > +}; > + > +__attribute__((noipa)) long > +makestring (void) > +{ > + return 1; > +} > + > +__attribute__((noipa)) long > +loadpoolstrings (long spare_size) > +{ > + const char *s; > + long g = 0; > + int i = 0, j = 0; > + while ((s = poolfilearr[j++])) > + { > + int l = __builtin_strlen (s); > + i += l; > + if (i >= spare_size) return 0; > + while (l-- > 0) strpool[poolptr++] = *s++; > + g = makestring (); > + } > + return g; > +} > + > +int > +main () > +{ > + strpool = __builtin_malloc (4000); > + if (!strpool) > + return 0; > + asm volatile ("" : : : "memory"); > + volatile int r = loadpoolstrings (4000); > + __builtin_free (strpool); > + return 0; > +} > > Jakub > [-- Attachment #2: gcc-84478.diff --] [-- Type: text/x-patch, Size: 2436 bytes --] PR tree-optimization/84478 - [8 Regression] pdftex miscompilation on i386 gcc/ChangeLog: PR tree-optimization/84478 * gimple-fold.c (get_range_strlen): Set *MINLEN to zero. (get_range_strlen): Reset range on failure. gcc/testsuite/ChangeLog: PR tree-optimization/84478 * gcc.c-torture/execute/pr84478.c: New test. Index: gcc/gimple-fold.c =================================================================== --- gcc/gimple-fold.c (revision 257796) +++ gcc/gimple-fold.c (working copy) @@ -1369,7 +1369,10 @@ get_range_strlen (tree arg, tree length[2], bitmap tree eltype = TREE_TYPE (type); if (TREE_CODE (type) != ARRAY_TYPE || !INTEGRAL_TYPE_P (eltype)) - return false; + { + *minlen = ssize_int (0); + return false; + } val = TYPE_SIZE_UNIT (type); if (!val || integer_zerop (val)) @@ -1565,7 +1568,11 @@ get_range_strlen (tree arg, tree minmaxlen[2]) minmaxlen[1] = NULL_TREE; bool flexarray = false; - get_range_strlen (arg, minmaxlen, &visited, 1, true, &flexarray); + if (!get_range_strlen (arg, minmaxlen, &visited, 1, true, &flexarray)) + { + minmaxlen[0] = NULL_TREE; + minmaxlen[1] = NULL_TREE; + } if (visited) BITMAP_FREE (visited); Index: gcc/testsuite/gcc.c-torture/execute/pr84478.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/pr84478.c (nonexistent) +++ gcc/testsuite/gcc.c-torture/execute/pr84478.c (working copy) @@ -0,0 +1,48 @@ +/* PR tree-optimization/84478 - pdftex miscompilation on i386 */ + +long poolptr; +unsigned char *strpool; +static const char *poolfilearr[] = { + "mu", +#define A "", +#define B A A A A A A A A A A +#define C B B B B B B B B B B +#define D C C C C C C C C C C + D C C C C C C C B B B A + ((void *)0) +}; + +__attribute__((noipa)) long +makestring (void) +{ + return 0; +} + +__attribute__((noipa)) +int +loadpoolstrings (long spare_size) +{ + const char *s; + long g = 0; + int i = 0, j = 0; + while ((s = poolfilearr[j++])) + { + int l = __builtin_strlen (s); + i += l; + if (i >= spare_size) return 0; + while (l-- > 0) strpool[poolptr++] = *s++; + g = makestring (); + } + return g; +} + +int +main () +{ + poolptr = 0; + strpool = __builtin_malloc (1000); + asm volatile ("" : : : "memory"); + volatile int r = loadpoolstrings (1000); + __builtin_free (strpool); + return 0; +} ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-20 19:03 ` Martin Sebor @ 2018-02-20 20:13 ` Martin Sebor 2018-02-21 0:49 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) Jakub Jelinek 2018-02-21 3:36 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) Jeff Law 2018-02-20 20:49 ` Jakub Jelinek 2018-02-21 3:42 ` Jeff Law 2 siblings, 2 replies; 19+ messages in thread From: Martin Sebor @ 2018-02-20 20:13 UTC (permalink / raw) To: Jakub Jelinek, Richard Biener, Jeff Law; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 1834 bytes --] On 02/20/2018 12:03 PM, Martin Sebor wrote: > On 02/20/2018 10:17 AM, Jakub Jelinek wrote: >> Hi! >> >> The following testcase distilled from pdftex is miscompiled on i?86, >> because gimple_fold_builtin_strlen sets incorrect value range on >> strlen call on SSA_NAME with def_stmt of PHI <"mu", something> where >> we can't determine anything about string length of something, so the >> right value range is [0, maximum_possible_strlen], but we actually used >> [2, INF]. >> >> get_range_strlen has many modes, for get_maxval_strlen we check return >> value of get_range_strlen, don't set fuzzy and everything seems correct. >> Otherwise we have the fuzzy mode which is used in 2 arg get_range_strlen >> overload, which is used for warnings and recently also for >> gimple_fold_builtin_strlen which sets value ranges from it. Unlike >> warnings, which can perhaps afford some heuristics and guessing, for >> gimple_fold_builtin_strlen we need to be 100% conservative. >> >> The thing that isn't handled conservatively is PHIs and COND_EXPR. >> The current code, if we can't figure one of the args out, for PHIs in >> fuzzy mode increases the *maxval value to +INF, but doesn't touch >> *minval, for COND_EXPR doesn't adjust the *minval/*maxval at all and just >> returns false. Unfortunately, changing that breaks > > It sounds like not setting *minlen is the problem then. Attached > is a slightly smaller patch that fixes the bug with no testsuite > regressions on x86_64-linux. How does it look to you? A safer and even more conservative alternative that should be equivalent to your approach while avoiding the sprintf regressions is to add another mode to the function and have it clear *minlen as an option. This lets the strlen code obtain the conservative lower bound without compromising the sprintf warnings. Martin [-- Attachment #2: gcc-84478.diff --] [-- Type: text/x-patch, Size: 7266 bytes --] PR tree-optimization/84478 - [8 Regression] pdftex miscompilation on i386 gcc/ChangeLog: PR tree-optimization/84478 * gimple-fold.h (get_range_strlen): Add argument. * gimple-fold.c (get_range_strlen): Set *MINLEN to zero. (get_range_strlen): Reset range on failure. * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Add third argument to the call to get_range_strlen. gcc/testsuite/ChangeLog: PR tree-optimization/84478 * gcc.c-torture/execute/pr84478.c: New test. Index: gcc/gimple-fold.c =================================================================== --- gcc/gimple-fold.c (revision 257850) +++ gcc/gimple-fold.c (working copy) @@ -1290,6 +1290,9 @@ gimple_fold_builtin_memset (gimple_stmt_iterator * When FUZZY is set and the length of a string cannot be determined, the function instead considers as the maximum possible length the size of a character array it may refer to. + When STRICT_MIN is set the function will clear MINMAXLEN[0] if + the length of any of the referenced strings cannot be determined + (and thus can be zero). Set *FLEXP to true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound @@ -1297,7 +1300,7 @@ gimple_fold_builtin_memset (gimple_stmt_iterator * static bool get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, - bool fuzzy, bool *flexp) + bool fuzzy, bool strict_min, bool *flexp) { tree var, val = NULL_TREE; gimple *def_stmt; @@ -1320,7 +1323,8 @@ get_range_strlen (tree arg, tree length[2], bitmap if (TREE_CODE (aop0) == INDIRECT_REF && TREE_CODE (TREE_OPERAND (aop0, 0)) == SSA_NAME) return get_range_strlen (TREE_OPERAND (aop0, 0), - length, visited, type, fuzzy, flexp); + length, visited, type, fuzzy, + strict_min, flexp); } else if (TREE_CODE (TREE_OPERAND (op, 0)) == COMPONENT_REF && fuzzy) { @@ -1354,7 +1358,7 @@ get_range_strlen (tree arg, tree length[2], bitmap { if (TREE_CODE (arg) == ADDR_EXPR) return get_range_strlen (TREE_OPERAND (arg, 0), length, - visited, type, fuzzy, flexp); + visited, type, fuzzy, strict_min, flexp); if (TREE_CODE (arg) == ARRAY_REF) { @@ -1497,14 +1501,17 @@ get_range_strlen (tree arg, tree length[2], bitmap || gimple_assign_unary_nop_p (def_stmt)) { tree rhs = gimple_assign_rhs1 (def_stmt); - return get_range_strlen (rhs, length, visited, type, fuzzy, flexp); + return get_range_strlen (rhs, length, visited, type, fuzzy, + strict_min, flexp); } else if (gimple_assign_rhs_code (def_stmt) == COND_EXPR) { tree op2 = gimple_assign_rhs2 (def_stmt); tree op3 = gimple_assign_rhs3 (def_stmt); - return get_range_strlen (op2, length, visited, type, fuzzy, flexp) - && get_range_strlen (op3, length, visited, type, fuzzy, flexp); + return (get_range_strlen (op2, length, visited, type, fuzzy, + strict_min, flexp) + && get_range_strlen (op3, length, visited, type, fuzzy, + strict_min, flexp)); } return false; @@ -1527,10 +1534,15 @@ get_range_strlen (tree arg, tree length[2], bitmap if (arg == gimple_phi_result (def_stmt)) continue; - if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp)) + if (!get_range_strlen (arg, length, visited, type, fuzzy, + strict_min, flexp)) { if (fuzzy) - *maxlen = build_all_ones_cst (size_type_node); + { + if (strict_min) + *minlen = ssize_int (0); + *maxlen = build_all_ones_cst (size_type_node); + } else return false; } @@ -1551,6 +1563,9 @@ get_range_strlen (tree arg, tree length[2], bitmap and array declared as 'char array[8]', MINMAXLEN[0] will be set to 3 and MINMAXLEN[1] to 7, the longest string that could be stored in array. + When STRICT_MIN is set the function will clear MINMAXLEN[0] if + the length of any of the referenced strings cannot be determined + (and thus can be zero). Return true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound @@ -1557,7 +1572,7 @@ get_range_strlen (tree arg, tree length[2], bitmap due to it being used as a poor-man's flexible array member. */ bool -get_range_strlen (tree arg, tree minmaxlen[2]) +get_range_strlen (tree arg, tree minmaxlen[2], bool strict_min /* = true */) { bitmap visited = NULL; @@ -1565,7 +1580,12 @@ bool minmaxlen[1] = NULL_TREE; bool flexarray = false; - get_range_strlen (arg, minmaxlen, &visited, 1, true, &flexarray); + if (!get_range_strlen (arg, minmaxlen, &visited, 1, true, strict_min, + &flexarray)) + { + minmaxlen[0] = NULL_TREE; + minmaxlen[1] = NULL_TREE; + } if (visited) BITMAP_FREE (visited); @@ -1580,7 +1600,7 @@ get_maxval_strlen (tree arg, int type) tree len[2] = { NULL_TREE, NULL_TREE }; bool dummy; - if (!get_range_strlen (arg, len, &visited, type, false, &dummy)) + if (!get_range_strlen (arg, len, &visited, type, false, true, &dummy)) len[1] = NULL_TREE; if (visited) BITMAP_FREE (visited); Index: gcc/gimple-fold.h =================================================================== --- gcc/gimple-fold.h (revision 257850) +++ gcc/gimple-fold.h (working copy) @@ -25,7 +25,7 @@ along with GCC; see the file COPYING3. If not see extern tree create_tmp_reg_or_ssa_name (tree, gimple *stmt = NULL); extern tree canonicalize_constructor_val (tree, tree); extern tree get_symbol_constant_value (tree); -extern bool get_range_strlen (tree, tree[2]); +extern bool get_range_strlen (tree, tree[2], bool = true); extern tree get_maxval_strlen (tree, int); extern void gimplify_and_update_call_from_tree (gimple_stmt_iterator *, tree); extern bool fold_stmt (gimple_stmt_iterator *); Index: gcc/testsuite/gcc.c-torture/execute/pr84478.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/pr84478.c (nonexistent) +++ gcc/testsuite/gcc.c-torture/execute/pr84478.c (working copy) @@ -0,0 +1,48 @@ +/* PR tree-optimization/84478 - pdftex miscompilation on i386 */ + +long poolptr; +unsigned char *strpool; +static const char *poolfilearr[] = { + "mu", +#define A "", +#define B A A A A A A A A A A +#define C B B B B B B B B B B +#define D C C C C C C C C C C + D C C C C C C C B B B A + ((void *)0) +}; + +__attribute__((noipa)) long +makestring (void) +{ + return 0; +} + +__attribute__((noipa)) +int +loadpoolstrings (long spare_size) +{ + const char *s; + long g = 0; + int i = 0, j = 0; + while ((s = poolfilearr[j++])) + { + int l = __builtin_strlen (s); + i += l; + if (i >= spare_size) return 0; + while (l-- > 0) strpool[poolptr++] = *s++; + g = makestring (); + } + return g; +} + +int +main () +{ + poolptr = 0; + strpool = __builtin_malloc (1000); + asm volatile ("" : : : "memory"); + volatile int r = loadpoolstrings (1000); + __builtin_free (strpool); + return 0; +} ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) 2018-02-20 20:13 ` Martin Sebor @ 2018-02-21 0:49 ` Jakub Jelinek 2018-02-20 22:49 ` Martin Sebor ` (2 more replies) 2018-02-21 3:36 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) Jeff Law 1 sibling, 3 replies; 19+ messages in thread From: Jakub Jelinek @ 2018-02-21 0:49 UTC (permalink / raw) To: Richard Biener, Jeff Law, Martin Sebor; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 1305 bytes --] On Tue, Feb 20, 2018 at 01:13:13PM -0700, Martin Sebor wrote: > A safer and even more conservative alternative that should be > equivalent to your approach while avoiding the sprintf regressions > is to add another mode to the function and have it clear *minlen > as an option. This lets the strlen code obtain the conservative > lower bound without compromising the sprintf warnings. I fail to see what it would be good for to set *MINLEN to zero and *MAXLEN to all ones for the non-warning use cases, we simply don't know anything about it, both NULL_TREEs i.e. returning false is better. I'm offering two alternate patches which use fuzzy == 0 for the previous !fuzzy, fuzzy == 1 for conservatively correct code that assumes strlen can't cross field/variable boundaries in compliant programs and fuzzy == 2 which does that + whatever the warning code wants. Additionally, I've rewritten the COND_EXPR handling, so that it matches exactly the PHI handling. The first patch doesn't change the 2 argument get_range_strlen and changes gimple_fold_builtin_strlen to use the 6 argument one, the second patch changes also the 2 argument get_range_strlen similarly to what you've done in your patch. Tested on x86_64-linux and i686-linux, ok for trunk if it passes bootstrap/regtest? Which one? Jakub [-- Attachment #2: T076d --] [-- Type: text/plain, Size: 8442 bytes --] 2018-02-20 Jakub Jelinek <jakub@redhat.com> Martin Sebor <msebor@redhat.com> PR tree-optimization/84478 * gimple-fold.c (get_range_strlen): Make minlen const and assume it can't be NULL. Change FUZZY from bool to int, for 1 add PHI/COND_EXPR support which is conservatively correct, for 2 only stay conservative for maxlen. Formatting and comment capitalization fixes. Add warning that the 2 argument get_range_strlen is only usable for warnings, adjust 6 arg get_range_strlen caller and clear minmaxlen[0] and [1] if it returned false. (get_maxval_strlen): Adjust 6 arg get_range_strlen caller. (gimple_fold_builtin_strlen): Use the 6 arg get_range_strlen overload rather than 2 arg, use it only if it returns true and flexarray is false, pass 3 as type to it. * gcc.c-torture/execute/pr84478.c: New test. --- gcc/gimple-fold.c.jj 2018-02-19 19:57:03.424279589 +0100 +++ gcc/gimple-fold.c 2018-02-20 22:03:47.595265756 +0100 @@ -1283,13 +1283,16 @@ gimple_fold_builtin_memset (gimple_stmt_ value of ARG in LENGTH[0] and LENGTH[1], respectively. If ARG is an SSA name variable, follow its use-def chains. When TYPE == 0, if LENGTH[1] is not equal to the length we determine or - if we are unable to determine the length or value, return False. + if we are unable to determine the length or value, return false. VISITED is a bitmap of visited variables. TYPE is 0 if string length should be obtained, 1 for maximum string length and 2 for maximum value ARG can have. - When FUZZY is set and the length of a string cannot be determined, + When FUZZY is non-zero and the length of a string cannot be determined, the function instead considers as the maximum possible length the - size of a character array it may refer to. + size of a character array it may refer to. If FUZZY is 2, it will handle + PHIs and COND_EXPRs optimistically, if we can determine string length + minimum and maximum, it will use the minimum from the ones where it + can be determined. Set *FLEXP to true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound @@ -1297,14 +1300,13 @@ gimple_fold_builtin_memset (gimple_stmt_ static bool get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, - bool fuzzy, bool *flexp) + int fuzzy, bool *flexp) { tree var, val = NULL_TREE; gimple *def_stmt; - /* The minimum and maximum length. The MAXLEN pointer stays unchanged - but MINLEN may be cleared during the execution of the function. */ - tree *minlen = length; + /* The minimum and maximum length. */ + tree *const minlen = length; tree *const maxlen = length + 1; if (TREE_CODE (arg) != SSA_NAME) @@ -1445,12 +1447,11 @@ get_range_strlen (tree arg, tree length[ if (!val) return false; - if (minlen - && (!*minlen - || (type > 0 - && TREE_CODE (*minlen) == INTEGER_CST - && TREE_CODE (val) == INTEGER_CST - && tree_int_cst_lt (val, *minlen)))) + if (!*minlen + || (type > 0 + && TREE_CODE (*minlen) == INTEGER_CST + && TREE_CODE (val) == INTEGER_CST + && tree_int_cst_lt (val, *minlen))) *minlen = val; if (*maxlen) @@ -1501,20 +1502,26 @@ get_range_strlen (tree arg, tree length[ } else if (gimple_assign_rhs_code (def_stmt) == COND_EXPR) { - tree op2 = gimple_assign_rhs2 (def_stmt); - tree op3 = gimple_assign_rhs3 (def_stmt); - return get_range_strlen (op2, length, visited, type, fuzzy, flexp) - && get_range_strlen (op3, length, visited, type, fuzzy, flexp); + tree ops[2] = { gimple_assign_rhs2 (def_stmt), + gimple_assign_rhs3 (def_stmt) }; + + for (unsigned int i = 0; i < 2; i++) + if (!get_range_strlen (ops[i], length, visited, type, fuzzy, + flexp)) + { + if (fuzzy == 2) + *maxlen = build_all_ones_cst (size_type_node); + else + return false; + } + return true; } return false; case GIMPLE_PHI: - { - /* All the arguments of the PHI node must have the same constant - length. */ - unsigned i; - - for (i = 0; i < gimple_phi_num_args (def_stmt); i++) + /* All the arguments of the PHI node must have the same constant + length. */ + for (unsigned i = 0; i < gimple_phi_num_args (def_stmt); i++) { tree arg = gimple_phi_arg (def_stmt, i)->def; @@ -1529,13 +1536,12 @@ get_range_strlen (tree arg, tree length[ if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp)) { - if (fuzzy) + if (fuzzy == 2) *maxlen = build_all_ones_cst (size_type_node); else return false; } } - } return true; default: @@ -1549,12 +1555,15 @@ get_range_strlen (tree arg, tree length[ character arrays, use the upper bound of the array as the maximum length. For example, given an expression like 'x ? array : "xyz"' and array declared as 'char array[8]', MINMAXLEN[0] will be set - to 3 and MINMAXLEN[1] to 7, the longest string that could be + to 0 and MINMAXLEN[1] to 7, the longest string that could be stored in array. Return true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound - due to it being used as a poor-man's flexible array member. */ + due to it being used as a poor-man's flexible array member. + + This function should be only used for warning code, as it doesn't + handle PHIs in a conservatively correct way. */ bool get_range_strlen (tree arg, tree minmaxlen[2]) @@ -1565,7 +1574,11 @@ get_range_strlen (tree arg, tree minmaxl minmaxlen[1] = NULL_TREE; bool flexarray = false; - get_range_strlen (arg, minmaxlen, &visited, 1, true, &flexarray); + if (!get_range_strlen (arg, minmaxlen, &visited, 1, 2, &flexarray)) + { + minmaxlen[0] = NULL_TREE; + minmaxlen[1] = NULL_TREE; + } if (visited) BITMAP_FREE (visited); @@ -1580,7 +1593,7 @@ get_maxval_strlen (tree arg, int type) tree len[2] = { NULL_TREE, NULL_TREE }; bool dummy; - if (!get_range_strlen (arg, len, &visited, type, false, &dummy)) + if (!get_range_strlen (arg, len, &visited, type, 0, &dummy)) len[1] = NULL_TREE; if (visited) BITMAP_FREE (visited); @@ -3533,8 +3546,12 @@ gimple_fold_builtin_strlen (gimple_stmt_ wide_int minlen; wide_int maxlen; - tree lenrange[2]; - if (!get_range_strlen (gimple_call_arg (stmt, 0), lenrange) + tree lenrange[2] = { NULL_TREE, NULL_TREE }; + bitmap visited = NULL; + bool flexarray = false; + if (get_range_strlen (gimple_call_arg (stmt, 0), lenrange, &visited, + 1, 1, &flexarray) + && !flexarray && lenrange[0] && TREE_CODE (lenrange[0]) == INTEGER_CST && lenrange[1] && TREE_CODE (lenrange[1]) == INTEGER_CST) { @@ -3554,6 +3571,9 @@ gimple_fold_builtin_strlen (gimple_stmt_ maxlen = wi::to_wide (max_object_size (), prec) - 2; } + if (visited) + BITMAP_FREE (visited); + if (minlen == maxlen) { lenrange[0] = force_gimple_operand_gsi (gsi, lenrange[0], true, NULL, --- gcc/testsuite/gcc.c-torture/execute/pr84478.c.jj 2018-02-20 16:32:00.683086212 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr84478.c 2018-02-20 16:31:33.497081640 +0100 @@ -0,0 +1,49 @@ +/* PR tree-optimization/84478 */ + +long poolptr; +unsigned char *strpool; +static const char *poolfilearr[] = { + "mu", + "", +#define A "x", +#define B A "xx", A A "xxx", A A A A A +#define C B B B B B B B B B B +#define D C C C C C C C C C C + D C C C C C C C B B B + ((void *)0) +}; + +__attribute__((noipa)) long +makestring (void) +{ + return 1; +} + +__attribute__((noipa)) long +loadpoolstrings (long spare_size) +{ + const char *s; + long g = 0; + int i = 0, j = 0; + while ((s = poolfilearr[j++])) + { + int l = __builtin_strlen (s); + i += l; + if (i >= spare_size) return 0; + while (l-- > 0) strpool[poolptr++] = *s++; + g = makestring (); + } + return g; +} + +int +main () +{ + strpool = __builtin_malloc (4000); + if (!strpool) + return 0; + asm volatile ("" : : : "memory"); + volatile int r = loadpoolstrings (4000); + __builtin_free (strpool); + return 0; +} [-- Attachment #3: T076 --] [-- Type: text/plain, Size: 8901 bytes --] 2018-02-20 Jakub Jelinek <jakub@redhat.com> Martin Sebor <msebor@redhat.com> PR tree-optimization/84478 * gimple-fold.h (get_range_strlen): Add a bool argument defaulted to false. * gimple-fold.c (get_range_strlen): Make minlen const and assume it can't be NULL. Change FUZZY from bool to int, for 1 add PHI/COND_EXPR support which is conservatively correct, for 2 only stay conservative for maxlen. Formatting and comment capitalization fixes. Add STRICT argument to the 2 argument get_range_strlen, adjust 6 arg get_range_strlen caller and clear minmaxlen[0] and [1] if it returned false. (get_maxval_strlen): Adjust 6 arg get_range_strlen caller. (gimple_fold_builtin_strlen): Pass true as last argument to get_range_strlen. * gcc.c-torture/execute/pr84478.c: New test. --- gcc/gimple-fold.h.jj 2018-01-03 10:19:55.771534056 +0100 +++ gcc/gimple-fold.h 2018-02-20 22:13:24.012326866 +0100 @@ -25,7 +25,7 @@ along with GCC; see the file COPYING3. extern tree create_tmp_reg_or_ssa_name (tree, gimple *stmt = NULL); extern tree canonicalize_constructor_val (tree, tree); extern tree get_symbol_constant_value (tree); -extern bool get_range_strlen (tree, tree[2]); +extern bool get_range_strlen (tree, tree[2], bool = false); extern tree get_maxval_strlen (tree, int); extern void gimplify_and_update_call_from_tree (gimple_stmt_iterator *, tree); extern bool fold_stmt (gimple_stmt_iterator *); --- gcc/gimple-fold.c.jj 2018-02-19 19:57:03.424279589 +0100 +++ gcc/gimple-fold.c 2018-02-20 22:17:32.460373162 +0100 @@ -1283,13 +1283,16 @@ gimple_fold_builtin_memset (gimple_stmt_ value of ARG in LENGTH[0] and LENGTH[1], respectively. If ARG is an SSA name variable, follow its use-def chains. When TYPE == 0, if LENGTH[1] is not equal to the length we determine or - if we are unable to determine the length or value, return False. + if we are unable to determine the length or value, return false. VISITED is a bitmap of visited variables. TYPE is 0 if string length should be obtained, 1 for maximum string length and 2 for maximum value ARG can have. - When FUZZY is set and the length of a string cannot be determined, + When FUZZY is non-zero and the length of a string cannot be determined, the function instead considers as the maximum possible length the - size of a character array it may refer to. + size of a character array it may refer to. If FUZZY is 2, it will handle + PHIs and COND_EXPRs optimistically, if we can determine string length + minimum and maximum, it will use the minimum from the ones where it + can be determined. Set *FLEXP to true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound @@ -1297,14 +1300,13 @@ gimple_fold_builtin_memset (gimple_stmt_ static bool get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, - bool fuzzy, bool *flexp) + int fuzzy, bool *flexp) { tree var, val = NULL_TREE; gimple *def_stmt; - /* The minimum and maximum length. The MAXLEN pointer stays unchanged - but MINLEN may be cleared during the execution of the function. */ - tree *minlen = length; + /* The minimum and maximum length. */ + tree *const minlen = length; tree *const maxlen = length + 1; if (TREE_CODE (arg) != SSA_NAME) @@ -1445,12 +1447,11 @@ get_range_strlen (tree arg, tree length[ if (!val) return false; - if (minlen - && (!*minlen - || (type > 0 - && TREE_CODE (*minlen) == INTEGER_CST - && TREE_CODE (val) == INTEGER_CST - && tree_int_cst_lt (val, *minlen)))) + if (!*minlen + || (type > 0 + && TREE_CODE (*minlen) == INTEGER_CST + && TREE_CODE (val) == INTEGER_CST + && tree_int_cst_lt (val, *minlen))) *minlen = val; if (*maxlen) @@ -1501,20 +1502,26 @@ get_range_strlen (tree arg, tree length[ } else if (gimple_assign_rhs_code (def_stmt) == COND_EXPR) { - tree op2 = gimple_assign_rhs2 (def_stmt); - tree op3 = gimple_assign_rhs3 (def_stmt); - return get_range_strlen (op2, length, visited, type, fuzzy, flexp) - && get_range_strlen (op3, length, visited, type, fuzzy, flexp); + tree ops[2] = { gimple_assign_rhs2 (def_stmt), + gimple_assign_rhs3 (def_stmt) }; + + for (unsigned int i = 0; i < 2; i++) + if (!get_range_strlen (ops[i], length, visited, type, fuzzy, + flexp)) + { + if (fuzzy == 2) + *maxlen = build_all_ones_cst (size_type_node); + else + return false; + } + return true; } return false; case GIMPLE_PHI: - { - /* All the arguments of the PHI node must have the same constant - length. */ - unsigned i; - - for (i = 0; i < gimple_phi_num_args (def_stmt); i++) + /* All the arguments of the PHI node must have the same constant + length. */ + for (unsigned i = 0; i < gimple_phi_num_args (def_stmt); i++) { tree arg = gimple_phi_arg (def_stmt, i)->def; @@ -1529,13 +1536,12 @@ get_range_strlen (tree arg, tree length[ if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp)) { - if (fuzzy) + if (fuzzy == 2) *maxlen = build_all_ones_cst (size_type_node); else return false; } } - } return true; default: @@ -1549,15 +1555,21 @@ get_range_strlen (tree arg, tree length[ character arrays, use the upper bound of the array as the maximum length. For example, given an expression like 'x ? array : "xyz"' and array declared as 'char array[8]', MINMAXLEN[0] will be set - to 3 and MINMAXLEN[1] to 7, the longest string that could be + to 0 and MINMAXLEN[1] to 7, the longest string that could be stored in array. Return true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound - due to it being used as a poor-man's flexible array member. */ + due to it being used as a poor-man's flexible array member. + + STRICT is true if it will handle PHIs and COND_EXPRs conservatively + and false if PHIs and COND_EXPRs are to be handled optimistically, + if we can determine string length minimum and maximum; it will use + the minimum from the ones where it can be determined. + STRICT false should be only used for warning code. */ bool -get_range_strlen (tree arg, tree minmaxlen[2]) +get_range_strlen (tree arg, tree minmaxlen[2], bool strict) { bitmap visited = NULL; @@ -1565,7 +1577,12 @@ get_range_strlen (tree arg, tree minmaxl minmaxlen[1] = NULL_TREE; bool flexarray = false; - get_range_strlen (arg, minmaxlen, &visited, 1, true, &flexarray); + if (!get_range_strlen (arg, minmaxlen, &visited, 1, strict ? 1 : 2, + &flexarray)) + { + minmaxlen[0] = NULL_TREE; + minmaxlen[1] = NULL_TREE; + } if (visited) BITMAP_FREE (visited); @@ -1580,7 +1597,7 @@ get_maxval_strlen (tree arg, int type) tree len[2] = { NULL_TREE, NULL_TREE }; bool dummy; - if (!get_range_strlen (arg, len, &visited, type, false, &dummy)) + if (!get_range_strlen (arg, len, &visited, type, 0, &dummy)) len[1] = NULL_TREE; if (visited) BITMAP_FREE (visited); @@ -3534,7 +3551,7 @@ gimple_fold_builtin_strlen (gimple_stmt_ wide_int maxlen; tree lenrange[2]; - if (!get_range_strlen (gimple_call_arg (stmt, 0), lenrange) + if (!get_range_strlen (gimple_call_arg (stmt, 0), lenrange, true) && lenrange[0] && TREE_CODE (lenrange[0]) == INTEGER_CST && lenrange[1] && TREE_CODE (lenrange[1]) == INTEGER_CST) { --- gcc/testsuite/gcc.c-torture/execute/pr84478.c.jj 2018-02-20 16:32:00.683086212 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr84478.c 2018-02-20 16:31:33.497081640 +0100 @@ -0,0 +1,49 @@ +/* PR tree-optimization/84478 */ + +long poolptr; +unsigned char *strpool; +static const char *poolfilearr[] = { + "mu", + "", +#define A "x", +#define B A "xx", A A "xxx", A A A A A +#define C B B B B B B B B B B +#define D C C C C C C C C C C + D C C C C C C C B B B + ((void *)0) +}; + +__attribute__((noipa)) long +makestring (void) +{ + return 1; +} + +__attribute__((noipa)) long +loadpoolstrings (long spare_size) +{ + const char *s; + long g = 0; + int i = 0, j = 0; + while ((s = poolfilearr[j++])) + { + int l = __builtin_strlen (s); + i += l; + if (i >= spare_size) return 0; + while (l-- > 0) strpool[poolptr++] = *s++; + g = makestring (); + } + return g; +} + +int +main () +{ + strpool = __builtin_malloc (4000); + if (!strpool) + return 0; + asm volatile ("" : : : "memory"); + volatile int r = loadpoolstrings (4000); + __builtin_free (strpool); + return 0; +} ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) 2018-02-21 0:49 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) Jakub Jelinek @ 2018-02-20 22:49 ` Martin Sebor 2018-02-21 3:56 ` Jeff Law 2018-02-21 3:56 ` Jeff Law 2 siblings, 0 replies; 19+ messages in thread From: Martin Sebor @ 2018-02-20 22:49 UTC (permalink / raw) To: Jakub Jelinek, Richard Biener, Jeff Law; +Cc: gcc-patches On 02/20/2018 02:34 PM, Jakub Jelinek wrote: > On Tue, Feb 20, 2018 at 01:13:13PM -0700, Martin Sebor wrote: >> A safer and even more conservative alternative that should be >> equivalent to your approach while avoiding the sprintf regressions >> is to add another mode to the function and have it clear *minlen >> as an option. This lets the strlen code obtain the conservative >> lower bound without compromising the sprintf warnings. > > I fail to see what it would be good for to set *MINLEN to zero and > *MAXLEN to all ones for the non-warning use cases, we simply don't know > anything about it, both NULL_TREEs i.e. returning false is better. I'm > offering two alternate patches which use > fuzzy == 0 for the previous !fuzzy, fuzzy == 1 for conservatively correct > code that assumes strlen can't cross field/variable boundaries in > compliant programs and fuzzy == 2 which does that + whatever the warning > code wants. Additionally, I've rewritten the COND_EXPR handling, so that > it matches exactly the PHI handling. > > The first patch doesn't change the 2 argument get_range_strlen and changes > gimple_fold_builtin_strlen to use the 6 argument one, the second patch > changes also the 2 argument get_range_strlen similarly to what you've done > in your patch. > > Tested on x86_64-linux and i686-linux, ok for trunk if it passes > bootstrap/regtest? Which one? I don't have a preference as long as it doesn't introduce any test suite regressions. That's all I was trying to do in my approach (besides fixing the bug). Martin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) 2018-02-21 0:49 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) Jakub Jelinek 2018-02-20 22:49 ` Martin Sebor @ 2018-02-21 3:56 ` Jeff Law 2018-02-21 14:49 ` Jakub Jelinek 2018-02-21 3:56 ` Jeff Law 2 siblings, 1 reply; 19+ messages in thread From: Jeff Law @ 2018-02-21 3:56 UTC (permalink / raw) To: Jakub Jelinek, Richard Biener, Martin Sebor; +Cc: gcc-patches On 02/20/2018 02:34 PM, Jakub Jelinek wrote: > On Tue, Feb 20, 2018 at 01:13:13PM -0700, Martin Sebor wrote: >> A safer and even more conservative alternative that should be >> equivalent to your approach while avoiding the sprintf regressions >> is to add another mode to the function and have it clear *minlen >> as an option. This lets the strlen code obtain the conservative >> lower bound without compromising the sprintf warnings. > > I fail to see what it would be good for to set *MINLEN to zero and > *MAXLEN to all ones for the non-warning use cases, we simply don't know > anything about it, both NULL_TREEs i.e. returning false is better. I'm > offering two alternate patches which use > fuzzy == 0 for the previous !fuzzy, fuzzy == 1 for conservatively correct > code that assumes strlen can't cross field/variable boundaries in > compliant programs and fuzzy == 2 which does that + whatever the warning > code wants. Additionally, I've rewritten the COND_EXPR handling, so that > it matches exactly the PHI handling. > > The first patch doesn't change the 2 argument get_range_strlen and changes > gimple_fold_builtin_strlen to use the 6 argument one, the second patch > changes also the 2 argument get_range_strlen similarly to what you've done > in your patch. > > Tested on x86_64-linux and i686-linux, ok for trunk if it passes > bootstrap/regtest? Which one? I'd lean towards the second -- essentially trying to keep the 6 operand version for internal (recursive) use only. In my mind I'd like to encapsulate the 6 operand version so that it can't be directly called and the only public interface is the 3 operand version using strict/no-strict. Let's go with that. We can try to clean this up further during gcc-9. Jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) 2018-02-21 3:56 ` Jeff Law @ 2018-02-21 14:49 ` Jakub Jelinek 0 siblings, 0 replies; 19+ messages in thread From: Jakub Jelinek @ 2018-02-21 14:49 UTC (permalink / raw) To: Jeff Law; +Cc: Richard Biener, Martin Sebor, gcc-patches On Tue, Feb 20, 2018 at 08:56:11PM -0700, Jeff Law wrote: > On 02/20/2018 02:34 PM, Jakub Jelinek wrote: > > On Tue, Feb 20, 2018 at 01:13:13PM -0700, Martin Sebor wrote: > >> A safer and even more conservative alternative that should be > >> equivalent to your approach while avoiding the sprintf regressions > >> is to add another mode to the function and have it clear *minlen > >> as an option. This lets the strlen code obtain the conservative > >> lower bound without compromising the sprintf warnings. > > > > I fail to see what it would be good for to set *MINLEN to zero and > > *MAXLEN to all ones for the non-warning use cases, we simply don't know > > anything about it, both NULL_TREEs i.e. returning false is better. I'm > > offering two alternate patches which use > > fuzzy == 0 for the previous !fuzzy, fuzzy == 1 for conservatively correct > > code that assumes strlen can't cross field/variable boundaries in > > compliant programs and fuzzy == 2 which does that + whatever the warning > > code wants. Additionally, I've rewritten the COND_EXPR handling, so that > > it matches exactly the PHI handling. > > > > The first patch doesn't change the 2 argument get_range_strlen and changes > > gimple_fold_builtin_strlen to use the 6 argument one, the second patch > > changes also the 2 argument get_range_strlen similarly to what you've done > > in your patch. > > > > Tested on x86_64-linux and i686-linux, ok for trunk if it passes > > bootstrap/regtest? Which one? > I'd lean towards the second -- essentially trying to keep the 6 operand > version for internal (recursive) use only. > > In my mind I'd like to encapsulate the 6 operand version so that it > can't be directly called and the only public interface is the 3 operand > version using strict/no-strict. > > Let's go with that. We can try to clean this up further during gcc-9. Ok, committed the second patch after bootstrapping/regtesting it on x86_64-linux and i686-linux. There is one thing that might still be desirable to change, because right now the default arg for the non-internal get_range_strlen is the warning non-conservative one, so if somebody adds another use for code generation it might break stuff again. Perhaps we should either flip the default or make the third argument a required argument and adjust all the 6 callers. Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) 2018-02-21 0:49 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) Jakub Jelinek 2018-02-20 22:49 ` Martin Sebor 2018-02-21 3:56 ` Jeff Law @ 2018-02-21 3:56 ` Jeff Law 2 siblings, 0 replies; 19+ messages in thread From: Jeff Law @ 2018-02-21 3:56 UTC (permalink / raw) To: Jakub Jelinek, Richard Biener, Martin Sebor; +Cc: gcc-patches On 02/20/2018 02:34 PM, Jakub Jelinek wrote: > On Tue, Feb 20, 2018 at 01:13:13PM -0700, Martin Sebor wrote: >> A safer and even more conservative alternative that should be >> equivalent to your approach while avoiding the sprintf regressions >> is to add another mode to the function and have it clear *minlen >> as an option. This lets the strlen code obtain the conservative >> lower bound without compromising the sprintf warnings. > > I fail to see what it would be good for to set *MINLEN to zero and > *MAXLEN to all ones for the non-warning use cases, we simply don't know > anything about it, both NULL_TREEs i.e. returning false is better. I'm > offering two alternate patches which use > fuzzy == 0 for the previous !fuzzy, fuzzy == 1 for conservatively correct > code that assumes strlen can't cross field/variable boundaries in > compliant programs and fuzzy == 2 which does that + whatever the warning > code wants. Additionally, I've rewritten the COND_EXPR handling, so that > it matches exactly the PHI handling. > > The first patch doesn't change the 2 argument get_range_strlen and changes > gimple_fold_builtin_strlen to use the 6 argument one, the second patch > changes also the 2 argument get_range_strlen similarly to what you've done > in your patch. > > Tested on x86_64-linux and i686-linux, ok for trunk if it passes > bootstrap/regtest? Which one? Just to be clear, the encapsulation proposal is a gcc-9 thingie. I don't think we need that to move forward. jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-20 20:13 ` Martin Sebor 2018-02-21 0:49 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) Jakub Jelinek @ 2018-02-21 3:36 ` Jeff Law 1 sibling, 0 replies; 19+ messages in thread From: Jeff Law @ 2018-02-21 3:36 UTC (permalink / raw) To: Martin Sebor, Jakub Jelinek, Richard Biener; +Cc: gcc-patches On 02/20/2018 01:13 PM, Martin Sebor wrote: > On 02/20/2018 12:03 PM, Martin Sebor wrote: >> On 02/20/2018 10:17 AM, Jakub Jelinek wrote: >>> Hi! >>> >>> The following testcase distilled from pdftex is miscompiled on i?86, >>> because gimple_fold_builtin_strlen sets incorrect value range on >>> strlen call on SSA_NAME with def_stmt of PHI <"mu", something> where >>> we can't determine anything about string length of something, so the >>> right value range is [0, maximum_possible_strlen], but we actually used >>> [2, INF]. >>> >>> get_range_strlen has many modes, for get_maxval_strlen we check return >>> value of get_range_strlen, don't set fuzzy and everything seems correct. >>> Otherwise we have the fuzzy mode which is used in 2 arg get_range_strlen >>> overload, which is used for warnings and recently also for >>> gimple_fold_builtin_strlen which sets value ranges from it. Unlike >>> warnings, which can perhaps afford some heuristics and guessing, for >>> gimple_fold_builtin_strlen we need to be 100% conservative. >>> >>> The thing that isn't handled conservatively is PHIs and COND_EXPR. >>> The current code, if we can't figure one of the args out, for PHIs in >>> fuzzy mode increases the *maxval value to +INF, but doesn't touch >>> *minval, for COND_EXPR doesn't adjust the *minval/*maxval at all and >>> just >>> returns false. Unfortunately, changing that breaks >> >> It sounds like not setting *minlen is the problem then. Attached >> is a slightly smaller patch that fixes the bug with no testsuite >> regressions on x86_64-linux. How does it look to you? > > A safer and even more conservative alternative that should be > equivalent to your approach while avoiding the sprintf regressions > is to add another mode to the function and have it clear *minlen > as an option. This lets the strlen code obtain the conservative > lower bound without compromising the sprintf warnings. Adding another mode to this function seems wrong to me -- it's already a bit of a tangled mess to figure out. ISTM that either we allow fuzzy results or not. For warnings, fuzzy rules. For code generation fuzzy is strictly not allowed. Is there some reason that will not work? jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-20 19:03 ` Martin Sebor 2018-02-20 20:13 ` Martin Sebor @ 2018-02-20 20:49 ` Jakub Jelinek 2018-02-20 23:59 ` Martin Sebor 2018-02-21 3:42 ` Jeff Law 2 siblings, 1 reply; 19+ messages in thread From: Jakub Jelinek @ 2018-02-20 20:49 UTC (permalink / raw) To: Martin Sebor; +Cc: Richard Biener, Jeff Law, gcc-patches On Tue, Feb 20, 2018 at 12:03:26PM -0700, Martin Sebor wrote: > PR tree-optimization/84478 - [8 Regression] pdftex miscompilation on i386 > > gcc/ChangeLog: > > PR tree-optimization/84478 > * gimple-fold.c (get_range_strlen): Set *MINLEN to zero. > (get_range_strlen): Reset range on failure. > > gcc/testsuite/ChangeLog: > > PR tree-optimization/84478 > * gcc.c-torture/execute/pr84478.c: New test. > > Index: gcc/gimple-fold.c > =================================================================== > --- gcc/gimple-fold.c (revision 257796) > +++ gcc/gimple-fold.c (working copy) > @@ -1369,7 +1369,10 @@ get_range_strlen (tree arg, tree length[2], bitmap > tree eltype = TREE_TYPE (type); > if (TREE_CODE (type) != ARRAY_TYPE > || !INTEGRAL_TYPE_P (eltype)) > - return false; > + { > + *minlen = ssize_int (0); > + return false; > + } This is just one of the 13 spots where we return false, so this doesn't look safe or sufficient to me, even when you actually honor the return value in 2 argument get_range_strlen. You'd really need to do { if (fuzzy) - *maxlen = build_all_ones_cst (size_type_node); + { + *minlen = size_int (0); + *maxlen = build_all_ones_cst (size_type_node); + } else return false; } or just drop that if (fuzzy) stuff from there, but that breaks the warning tests. It would help if you explained why you think it is a good idea ignoring the other phi arguments if you have one (or more) where you can determine length. One variation of my patch could be instead of adding type 3 change fuzzy from bool to int, and use fuzzy == 1 for the strlen value ranges and fuzzy == 2 for the warning code (i.e. 2 operand get_range_strlen). Note, my patch passed regtest on both x86_64-linux and i686-linux. Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-20 20:49 ` Jakub Jelinek @ 2018-02-20 23:59 ` Martin Sebor 2018-02-21 3:25 ` Jeff Law 2018-02-21 7:49 ` Jakub Jelinek 0 siblings, 2 replies; 19+ messages in thread From: Martin Sebor @ 2018-02-20 23:59 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, Jeff Law, gcc-patches > It would help if you explained why you think it is a good idea > ignoring the other phi arguments if you have one (or more) where you can > determine length. It's a heuristic that was meant just for the -Wformat-overflow warning. When making decisions that affect code generation it's obviously not correct to ignore the possibility that unknown arguments may be shorter than the minimum or longer than the maximum. The fuzzy argument was meant to differentiate between two got but I forgot about it when I added the fix for PR 83671. For GCC 8 I don't have a preference for how to fix this as long as it doesn't regress the warning tests. I think the ultimate solution (for GCC 9) may be to either disable the heuristic for code generation purposes (e.g., via another argument/flag) or provide a pointer argument to indicate to the caller that the minimum is based on known strings, and that the real minimum may be zero. Martin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-20 23:59 ` Martin Sebor @ 2018-02-21 3:25 ` Jeff Law 2018-02-21 15:49 ` Martin Sebor 2018-02-22 7:39 ` Richard Biener 2018-02-21 7:49 ` Jakub Jelinek 1 sibling, 2 replies; 19+ messages in thread From: Jeff Law @ 2018-02-21 3:25 UTC (permalink / raw) To: Martin Sebor, Jakub Jelinek; +Cc: Richard Biener, gcc-patches On 02/20/2018 04:59 PM, Martin Sebor wrote: >> It would help if you explained why you think it is a good idea >> ignoring the other phi arguments if you have one (or more) where you can >> determine length. > > It's a heuristic that was meant just for the -Wformat-overflow > warning. When making decisions that affect code generation it's > obviously not correct to ignore the possibility that unknown > arguments may be shorter than the minimum or longer than > the maximum. The fuzzy argument was meant to differentiate > between two got but I forgot about it when I added the fix > for PR 83671. > > For GCC 8 I don't have a preference for how to fix this as long > as it doesn't regress the warning tests. > > I think the ultimate solution (for GCC 9) may be to either > disable the heuristic for code generation purposes (e.g., via > another argument/flag) or provide a pointer argument to indicate > to the caller that the minimum is based on known strings, and that > the real minimum may be zero. I'm still getting refamiliar with this code. But one thing that jumps out immediately is how much this reminds me of the discussion we had around 77608 -- where I argued that returning something that was not conservatively correct was just asking for long term problems. I realize we're talking about different routines, but the concerns are the same -- when we return something that is not conservatively correct it's easy for someone to mistakenly use those results for code generation purposes. The fuzzy stuff is in there to reduce the false positive rates and we're not *supposed* to be using fuzzy results for code generation purposes, but as I argued in 77608, it's easy to miss. I'll reiterate my desire to make this kind of mistake harder to make. Jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-21 3:25 ` Jeff Law @ 2018-02-21 15:49 ` Martin Sebor 2018-02-22 7:39 ` Richard Biener 1 sibling, 0 replies; 19+ messages in thread From: Martin Sebor @ 2018-02-21 15:49 UTC (permalink / raw) To: Jeff Law, Jakub Jelinek; +Cc: Richard Biener, gcc-patches On 02/20/2018 08:25 PM, Jeff Law wrote: > On 02/20/2018 04:59 PM, Martin Sebor wrote: >>> It would help if you explained why you think it is a good idea >>> ignoring the other phi arguments if you have one (or more) where you can >>> determine length. >> >> It's a heuristic that was meant just for the -Wformat-overflow >> warning. When making decisions that affect code generation it's >> obviously not correct to ignore the possibility that unknown >> arguments may be shorter than the minimum or longer than >> the maximum. The fuzzy argument was meant to differentiate >> between two got but I forgot about it when I added the fix >> for PR 83671. >> >> For GCC 8 I don't have a preference for how to fix this as long >> as it doesn't regress the warning tests. >> >> I think the ultimate solution (for GCC 9) may be to either >> disable the heuristic for code generation purposes (e.g., via >> another argument/flag) or provide a pointer argument to indicate >> to the caller that the minimum is based on known strings, and that >> the real minimum may be zero. > I'm still getting refamiliar with this code. But one thing that jumps > out immediately is how much this reminds me of the discussion we had > around 77608 -- where I argued that returning something that was not > conservatively correct was just asking for long term problems. > > I realize we're talking about different routines, but the concerns are > the same -- when we return something that is not conservatively correct > it's easy for someone to mistakenly use those results for code > generation purposes. > > The fuzzy stuff is in there to reduce the false positive rates and we're > not *supposed* to be using fuzzy results for code generation purposes, > but as I argued in 77608, it's easy to miss. > > I'll reiterate my desire to make this kind of mistake harder to make. I agree. I'll take care of it in stage 1. Martin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-21 3:25 ` Jeff Law 2018-02-21 15:49 ` Martin Sebor @ 2018-02-22 7:39 ` Richard Biener 1 sibling, 0 replies; 19+ messages in thread From: Richard Biener @ 2018-02-22 7:39 UTC (permalink / raw) To: Jeff Law, Martin Sebor, Jakub Jelinek; +Cc: gcc-patches On February 21, 2018 4:25:14 AM GMT+01:00, Jeff Law <law@redhat.com> wrote: >On 02/20/2018 04:59 PM, Martin Sebor wrote: >>> It would help if you explained why you think it is a good idea >>> ignoring the other phi arguments if you have one (or more) where you >can >>> determine length. >> >> It's a heuristic that was meant just for the -Wformat-overflow >> warning. When making decisions that affect code generation it's >> obviously not correct to ignore the possibility that unknown >> arguments may be shorter than the minimum or longer than >> the maximum. The fuzzy argument was meant to differentiate >> between two got but I forgot about it when I added the fix >> for PR 83671. >> >> For GCC 8 I don't have a preference for how to fix this as long >> as it doesn't regress the warning tests. >> >> I think the ultimate solution (for GCC 9) may be to either >> disable the heuristic for code generation purposes (e.g., via >> another argument/flag) or provide a pointer argument to indicate >> to the caller that the minimum is based on known strings, and that >> the real minimum may be zero. >I'm still getting refamiliar with this code. But one thing that jumps >out immediately is how much this reminds me of the discussion we had >around 77608 -- where I argued that returning something that was not >conservatively correct was just asking for long term problems. > >I realize we're talking about different routines, but the concerns are >the same -- when we return something that is not conservatively correct >it's easy for someone to mistakenly use those results for code >generation purposes. > >The fuzzy stuff is in there to reduce the false positive rates and >we're >not *supposed* to be using fuzzy results for code generation purposes, >but as I argued in 77608, it's easy to miss. > >I'll reiterate my desire to make this kind of mistake harder to make. Can you simply provide that interface via a separately named API rather than overloading one that is also supposed to be used by optimization? Richard. >Jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-20 23:59 ` Martin Sebor 2018-02-21 3:25 ` Jeff Law @ 2018-02-21 7:49 ` Jakub Jelinek 2018-02-21 1:00 ` Martin Sebor 2018-02-21 3:31 ` Jeff Law 1 sibling, 2 replies; 19+ messages in thread From: Jakub Jelinek @ 2018-02-21 7:49 UTC (permalink / raw) To: Martin Sebor; +Cc: Richard Biener, Jeff Law, gcc-patches On Tue, Feb 20, 2018 at 04:59:12PM -0700, Martin Sebor wrote: > > It would help if you explained why you think it is a good idea > > ignoring the other phi arguments if you have one (or more) where you can > > determine length. > > It's a heuristic that was meant just for the -Wformat-overflow > warning. When making decisions that affect code generation it's > obviously not correct to ignore the possibility that unknown > arguments may be shorter than the minimum or longer than > the maximum. The fuzzy argument was meant to differentiate > between two got but I forgot about it when I added the fix > for PR 83671. get_range_strlen (the 2 argument one) is right now called: 3 times from builtins.c for -Wstringop-overflow once from gimple-ssa-sprintf.c for -Wformat-overflow once from tree-ssa-strlen.c for -Wstringop-truncation once from gimple-fold.c for gimple_fold_builtin_strlen value ranges So, which of these do you want the heuristics for? All 3 warnings, just one of them, something else? In the 2 patches I've posted last all the 3 different warnings are treated the same (fuzzy == 2). Looking at strlenopt-40.c testcase, in the test you clearly rely on fuzzy argument being set for the value ranges case (gimple_fold_builtin_strlen), otherwise many of the subtests would fail. Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-21 7:49 ` Jakub Jelinek @ 2018-02-21 1:00 ` Martin Sebor 2018-02-21 3:31 ` Jeff Law 1 sibling, 0 replies; 19+ messages in thread From: Martin Sebor @ 2018-02-21 1:00 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, Jeff Law, gcc-patches [-- Attachment #1: Type: text/plain, Size: 2276 bytes --] On 02/20/2018 05:14 PM, Jakub Jelinek wrote: > On Tue, Feb 20, 2018 at 04:59:12PM -0700, Martin Sebor wrote: >>> It would help if you explained why you think it is a good idea >>> ignoring the other phi arguments if you have one (or more) where you can >>> determine length. >> >> It's a heuristic that was meant just for the -Wformat-overflow >> warning. When making decisions that affect code generation it's >> obviously not correct to ignore the possibility that unknown >> arguments may be shorter than the minimum or longer than >> the maximum. The fuzzy argument was meant to differentiate >> between two got but I forgot about it when I added the fix >> for PR 83671. > > get_range_strlen (the 2 argument one) is right now called: > 3 times from builtins.c for -Wstringop-overflow > once from gimple-ssa-sprintf.c for -Wformat-overflow > once from tree-ssa-strlen.c for -Wstringop-truncation Right. The warnings depend on the heuristic and on using a non-zero lower bound when some of the strings is unknown or has unknown length but other don't. > once from gimple-fold.c for gimple_fold_builtin_strlen value ranges Right. This depends on it too, except it needs to avoid using a nonzero lower bound when any of the strings is unknown/has unknown length. Let's fix that. > > So, which of these do you want the heuristics for? All 3 warnings, > just one of them, something else? In the 2 patches I've posted last > all the 3 different warnings are treated the same (fuzzy == 2). All of them. The bug is in how strlen is folded. There is nothing wrong with the warnings, so fix the bug and leave the warnings alone. My second patch did that and it passed all tests. (I forgot to include the change to gimple-ssa-sprintf.c but it was the same as the one in tree-ssa-strlen.c: call get_range_strlen with true as the third argument. Attached is the same patch with that bit added.) > Looking at strlenopt-40.c testcase, in the test you clearly rely on > fuzzy argument being set for the value ranges case > (gimple_fold_builtin_strlen), otherwise many of the > subtests would fail. Right, that needs to continue to work. It does with my patch. If your solution regresses any of the tests then use the second patch I posted. What am I missing? Martin [-- Attachment #2: gcc-84478.diff --] [-- Type: text/x-patch, Size: 6636 bytes --] PR tree-optimization/84478 - [8 Regression] pdftex miscompilation on i386 gcc/ChangeLog: PR tree-optimization/84478 * gimple-fold.h (get_range_strlen): Add argument. * gimple-fold.c (get_range_strlen): Set *MINLEN to zero. (get_range_strlen): Reset range on failure. * gimple-ssa-sprintf.c (get_string_length): Add third argument to the call to get_range_strlen. * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same. gcc/testsuite/ChangeLog: PR tree-optimization/84478 * gcc.c-torture/execute/pr84478.c: New test. Index: gcc/gimple-fold.c =================================================================== --- gcc/gimple-fold.c (revision 257859) +++ gcc/gimple-fold.c (working copy) @@ -1290,6 +1290,9 @@ gimple_fold_builtin_memset (gimple_stmt_iterator * When FUZZY is set and the length of a string cannot be determined, the function instead considers as the maximum possible length the size of a character array it may refer to. + When STRICT_MIN is set the function will clear MINMAXLEN[0] if + the length of any of the referenced strings cannot be determined + (and thus can be zero). Set *FLEXP to true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound @@ -1297,7 +1300,7 @@ gimple_fold_builtin_memset (gimple_stmt_iterator * static bool get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, - bool fuzzy, bool *flexp) + bool fuzzy, bool strict_min, bool *flexp) { tree var, val = NULL_TREE; gimple *def_stmt; @@ -1320,7 +1323,8 @@ get_range_strlen (tree arg, tree length[2], bitmap if (TREE_CODE (aop0) == INDIRECT_REF && TREE_CODE (TREE_OPERAND (aop0, 0)) == SSA_NAME) return get_range_strlen (TREE_OPERAND (aop0, 0), - length, visited, type, fuzzy, flexp); + length, visited, type, fuzzy, + strict_min, flexp); } else if (TREE_CODE (TREE_OPERAND (op, 0)) == COMPONENT_REF && fuzzy) { @@ -1354,7 +1358,7 @@ get_range_strlen (tree arg, tree length[2], bitmap { if (TREE_CODE (arg) == ADDR_EXPR) return get_range_strlen (TREE_OPERAND (arg, 0), length, - visited, type, fuzzy, flexp); + visited, type, fuzzy, strict_min, flexp); if (TREE_CODE (arg) == ARRAY_REF) { @@ -1497,14 +1501,17 @@ get_range_strlen (tree arg, tree length[2], bitmap || gimple_assign_unary_nop_p (def_stmt)) { tree rhs = gimple_assign_rhs1 (def_stmt); - return get_range_strlen (rhs, length, visited, type, fuzzy, flexp); + return get_range_strlen (rhs, length, visited, type, fuzzy, + strict_min, flexp); } else if (gimple_assign_rhs_code (def_stmt) == COND_EXPR) { tree op2 = gimple_assign_rhs2 (def_stmt); tree op3 = gimple_assign_rhs3 (def_stmt); - return get_range_strlen (op2, length, visited, type, fuzzy, flexp) - && get_range_strlen (op3, length, visited, type, fuzzy, flexp); + return (get_range_strlen (op2, length, visited, type, fuzzy, + strict_min, flexp) + && get_range_strlen (op3, length, visited, type, fuzzy, + strict_min, flexp)); } return false; @@ -1527,10 +1534,15 @@ get_range_strlen (tree arg, tree length[2], bitmap if (arg == gimple_phi_result (def_stmt)) continue; - if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp)) + if (!get_range_strlen (arg, length, visited, type, fuzzy, + strict_min, flexp)) { if (fuzzy) - *maxlen = build_all_ones_cst (size_type_node); + { + if (strict_min) + *minlen = ssize_int (0); + *maxlen = build_all_ones_cst (size_type_node); + } else return false; } @@ -1551,6 +1563,9 @@ get_range_strlen (tree arg, tree length[2], bitmap and array declared as 'char array[8]', MINMAXLEN[0] will be set to 3 and MINMAXLEN[1] to 7, the longest string that could be stored in array. + When STRICT_MIN is set the function will clear MINMAXLEN[0] if + the length of any of the referenced strings cannot be determined + (and thus can be zero). Return true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound @@ -1557,7 +1572,7 @@ get_range_strlen (tree arg, tree length[2], bitmap due to it being used as a poor-man's flexible array member. */ bool -get_range_strlen (tree arg, tree minmaxlen[2]) +get_range_strlen (tree arg, tree minmaxlen[2], bool strict_min /* = true */) { bitmap visited = NULL; @@ -1565,7 +1580,12 @@ bool minmaxlen[1] = NULL_TREE; bool flexarray = false; - get_range_strlen (arg, minmaxlen, &visited, 1, true, &flexarray); + if (!get_range_strlen (arg, minmaxlen, &visited, 1, true, strict_min, + &flexarray)) + { + minmaxlen[0] = NULL_TREE; + minmaxlen[1] = NULL_TREE; + } if (visited) BITMAP_FREE (visited); @@ -1580,7 +1600,7 @@ get_maxval_strlen (tree arg, int type) tree len[2] = { NULL_TREE, NULL_TREE }; bool dummy; - if (!get_range_strlen (arg, len, &visited, type, false, &dummy)) + if (!get_range_strlen (arg, len, &visited, type, false, true, &dummy)) len[1] = NULL_TREE; if (visited) BITMAP_FREE (visited); Index: gcc/gimple-fold.h =================================================================== --- gcc/gimple-fold.h (revision 257859) +++ gcc/gimple-fold.h (working copy) @@ -25,7 +25,7 @@ along with GCC; see the file COPYING3. If not see extern tree create_tmp_reg_or_ssa_name (tree, gimple *stmt = NULL); extern tree canonicalize_constructor_val (tree, tree); extern tree get_symbol_constant_value (tree); -extern bool get_range_strlen (tree, tree[2]); +extern bool get_range_strlen (tree, tree[2], bool = true); extern tree get_maxval_strlen (tree, int); extern void gimplify_and_update_call_from_tree (gimple_stmt_iterator *, tree); extern bool fold_stmt (gimple_stmt_iterator *); Index: gcc/gimple-ssa-sprintf.c =================================================================== --- gcc/gimple-ssa-sprintf.c (revision 257859) +++ gcc/gimple-ssa-sprintf.c (working copy) @@ -2075,7 +2075,7 @@ get_string_length (tree str) aren't known to point any such arrays result in LENRANGE[1] set to SIZE_MAX. */ tree lenrange[2]; - bool flexarray = get_range_strlen (str, lenrange); + bool flexarray = get_range_strlen (str, lenrange, true); if (lenrange [0] || lenrange [1]) { ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-21 7:49 ` Jakub Jelinek 2018-02-21 1:00 ` Martin Sebor @ 2018-02-21 3:31 ` Jeff Law 2018-02-21 8:49 ` Jakub Jelinek 1 sibling, 1 reply; 19+ messages in thread From: Jeff Law @ 2018-02-21 3:31 UTC (permalink / raw) To: Jakub Jelinek, Martin Sebor; +Cc: Richard Biener, gcc-patches On 02/20/2018 05:14 PM, Jakub Jelinek wrote: > On Tue, Feb 20, 2018 at 04:59:12PM -0700, Martin Sebor wrote: >>> It would help if you explained why you think it is a good idea >>> ignoring the other phi arguments if you have one (or more) where you can >>> determine length. >> >> It's a heuristic that was meant just for the -Wformat-overflow >> warning. When making decisions that affect code generation it's >> obviously not correct to ignore the possibility that unknown >> arguments may be shorter than the minimum or longer than >> the maximum. The fuzzy argument was meant to differentiate >> between two got but I forgot about it when I added the fix >> for PR 83671. > > get_range_strlen (the 2 argument one) is right now called: > 3 times from builtins.c for -Wstringop-overflow > once from gimple-ssa-sprintf.c for -Wformat-overflow > once from tree-ssa-strlen.c for -Wstringop-truncation > once from gimple-fold.c for gimple_fold_builtin_strlen value ranges Presumably it's the last one that's causing problems and were we should focus our effort. > > So, which of these do you want the heuristics for? All 3 warnings, > just one of them, something else? In the 2 patches I've posted last > all the 3 different warnings are treated the same (fuzzy == 2). > > Looking at strlenopt-40.c testcase, in the test you clearly rely on > fuzzy argument being set for the value ranges case > (gimple_fold_builtin_strlen), otherwise many of the > subtests would fail. Which tests specifically? I did a real quick scan and didn't see anything in there that depended on incorrect behavior for the call from gimple_fold_builtin_strlen. BUt it was a quick scan and I could have missed something. jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-21 3:31 ` Jeff Law @ 2018-02-21 8:49 ` Jakub Jelinek 0 siblings, 0 replies; 19+ messages in thread From: Jakub Jelinek @ 2018-02-21 8:49 UTC (permalink / raw) To: Jeff Law; +Cc: Martin Sebor, Richard Biener, gcc-patches On Tue, Feb 20, 2018 at 08:31:06PM -0700, Jeff Law wrote: > On 02/20/2018 05:14 PM, Jakub Jelinek wrote: > > On Tue, Feb 20, 2018 at 04:59:12PM -0700, Martin Sebor wrote: > >>> It would help if you explained why you think it is a good idea > >>> ignoring the other phi arguments if you have one (or more) where you can > >>> determine length. > >> > >> It's a heuristic that was meant just for the -Wformat-overflow > >> warning. When making decisions that affect code generation it's > >> obviously not correct to ignore the possibility that unknown > >> arguments may be shorter than the minimum or longer than > >> the maximum. The fuzzy argument was meant to differentiate > >> between two got but I forgot about it when I added the fix > >> for PR 83671. > > > > get_range_strlen (the 2 argument one) is right now called: > > 3 times from builtins.c for -Wstringop-overflow > > once from gimple-ssa-sprintf.c for -Wformat-overflow > > once from tree-ssa-strlen.c for -Wstringop-truncation > > once from gimple-fold.c for gimple_fold_builtin_strlen value ranges > Presumably it's the last one that's causing problems and were we should > focus our effort. Sure. And that is what my patches do, not change anything for the warning cases and fix up the non-warning one. > > Looking at strlenopt-40.c testcase, in the test you clearly rely on > > fuzzy argument being set for the value ranges case > > (gimple_fold_builtin_strlen), otherwise many of the > > subtests would fail. > Which tests specifically? I did a real quick scan and didn't see > anything in there that depended on incorrect behavior for the call from > gimple_fold_builtin_strlen. BUt it was a quick scan and I could have > missed something. elim_global_arrays, elim_pointer_to_arrays etc. Basically, calling strlen on an array of known fixed length and checking that the upper bound of it is at most the array size minus 1. Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) 2018-02-20 19:03 ` Martin Sebor 2018-02-20 20:13 ` Martin Sebor 2018-02-20 20:49 ` Jakub Jelinek @ 2018-02-21 3:42 ` Jeff Law 2 siblings, 0 replies; 19+ messages in thread From: Jeff Law @ 2018-02-21 3:42 UTC (permalink / raw) To: Martin Sebor, Jakub Jelinek, Richard Biener; +Cc: gcc-patches On 02/20/2018 12:03 PM, Martin Sebor wrote: >> The thing that isn't handled conservatively is PHIs and COND_EXPR. >> The current code, if we can't figure one of the args out, for PHIs in >> fuzzy mode increases the *maxval value to +INF, but doesn't touch >> *minval, for COND_EXPR doesn't adjust the *minval/*maxval at all and just >> returns false. Unfortunately, changing that breaks > > It sounds like not setting *minlen is the problem then. Attached > is a slightly smaller patch that fixes the bug with no testsuite > regressions on x86_64-linux. How does it look to you? What I don't like here is that we ultimately continue to use the two operand get_range_strlen from the folder. Meaning we're asking for fuzzy results in a code generation path. I'd lean more towards a solution that always gives conservatively correct results in the codegen path while allowing fuzzy on the warning paths. Jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2018-02-22 7:39 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-02-20 17:49 [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) Jakub Jelinek 2018-02-20 19:03 ` Martin Sebor 2018-02-20 20:13 ` Martin Sebor 2018-02-21 0:49 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478, take 2 and 3) Jakub Jelinek 2018-02-20 22:49 ` Martin Sebor 2018-02-21 3:56 ` Jeff Law 2018-02-21 14:49 ` Jakub Jelinek 2018-02-21 3:56 ` Jeff Law 2018-02-21 3:36 ` [PATCH] Fix pdftex miscompilation due to get_range_strlen (PR tree-optimization/84478) Jeff Law 2018-02-20 20:49 ` Jakub Jelinek 2018-02-20 23:59 ` Martin Sebor 2018-02-21 3:25 ` Jeff Law 2018-02-21 15:49 ` Martin Sebor 2018-02-22 7:39 ` Richard Biener 2018-02-21 7:49 ` Jakub Jelinek 2018-02-21 1:00 ` Martin Sebor 2018-02-21 3:31 ` Jeff Law 2018-02-21 8:49 ` Jakub Jelinek 2018-02-21 3:42 ` Jeff Law
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).