public inbox for bzip2-devel@sourceware.org
 help / color / mirror / Atom feed
* [Bug bzip2/28904] New: libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
@ 2022-02-18  3:18 demerphq at gmail dot com
  2022-04-19 22:04 ` [Bug bzip2/28904] " mark at klomp dot org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: demerphq at gmail dot com @ 2022-02-18  3:18 UTC (permalink / raw)
  To: bzip2-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=28904

            Bug ID: 28904
           Summary: libbzip2: ‘cost[3]’ may be used uninitialized in this
                    function [-Wmaybe-uninitialized]
           Product: bzip2
           Version: unspecified
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P2
         Component: bzip2
          Assignee: nobody at sourceware dot org
          Reporter: demerphq at gmail dot com
                CC: bzip2-devel at sourceware dot org
  Target Milestone: ---

Hi,

I am a contributor to the Perl project. https://github.com/Perl/perl5

We bundle your library libbzip2 via the perl Module Compress-Raw-Bzip2, which
is hosted at:

https://github.com/pmqs/Compress-Raw-Bzip2

We are trying to ensure our builds are warning clean, however when building
this package your library produces "may be used uninitialized" warnings on some
compilers, you can see these warnings below. 

While this is almost certainly a false positive warning, it is also trivial to
silence the warnings with essentially no performance hit by initializing the
cost structure with 0's at the start.

We have an open bug report for this for perl here:
https://github.com/Perl/perl5/issues/19432

And an open bug report for this for the module here:
https://github.com/pmqs/Compress-Raw-Bzip2/issues/4

And a patch for it here:
https://github.com/pmqs/Compress-Raw-Bzip2/pull/5/commits/641a440ec6229c1d368b9ead48f4968b955c0115

The maintainer of Compress-Raw-Bzip2 (on CC) is known to prefer not to apply
changes to this library code that are not upstream. (A reasonable policy.)

I was wondering if you could apply this patch, or something equivalent that
would fix this so that the perl project could have warning free builds. Your
library is the last module we ship that produces warnings during the build
process. We target a wide range of compilers and try to keep our code as clean
as possible, and while again I accept that your code likely is not buggy as you
initialize it in a loop at line 357, 

357:         for (t = 0; t < nGroups; t++) cost[t] = 0;

as far as I can tell nGroups is not a constant, and the compiler reasonably
cannot be certain that all BZ_N_GROUPS entries in the array are initialized
before use.  

We in the perl development community would appreciate it if you would release a
patch that would silence these warnings for us. Then Paul could roll a new
release of his module, and we in the Perl could then use that and be happy with
a completely warning free build.

gcc -c  -I. -D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe
-fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -Wall -Werror=pointer-arith -Werror=vla -Wextra
-Wno-long-long -Wno-declaration-after-statement -Wc++-compat -Wwrite-strings
-O3   -DVERSION=\"2.101\" -DXS_VERSION=\"2.101\" -fPIC "-I../.."  -Wall
-Wno-comment  -DBZ_NO_STDIO  compress.c
compress.c: In function ‘BZ2_compressBlock’:
compress.c:392:54: warning: ‘cost[5]’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
                for (t = 0; t < nGroups; t++) cost[t] += s->len[t][icv];
                                                      ^~
compress.c:256:11: note: ‘cost[5]’ was declared here
    UInt16 cost[BZ_N_GROUPS];
           ^~~~
compress.c:402:21: warning: ‘cost[3]’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
             if (cost[t] < bc) { bc = cost[t]; bt = t; };
                 ~~~~^~~
compress.c:256:11: note: ‘cost[3]’ was declared here
    UInt16 cost[BZ_N_GROUPS];
           ^~~~
compress.c:402:21: warning: ‘cost[2]’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
             if (cost[t] < bc) { bc = cost[t]; bt = t; };
                 ~~~~^~~
compress.c:256:11: note: ‘cost[2]’ was declared here
    UInt16 cost[BZ_N_GROUPS];
           ^~~~
Your consideration on this matter would be much appreciated. 

Thank you very much for your time in producing this library, I personally have
made use of it many times in my career.

cheers,
Yves

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug bzip2/28904] libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  2022-02-18  3:18 [Bug bzip2/28904] New: libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized] demerphq at gmail dot com
@ 2022-04-19 22:04 ` mark at klomp dot org
  2022-04-19 22:23 ` mark at klomp dot org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: mark at klomp dot org @ 2022-04-19 22:04 UTC (permalink / raw)
  To: bzip2-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=28904

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-04-19
     Ever confirmed|0                           |1
                 CC|                            |mark at klomp dot org

--- Comment #1 from Mark Wielaard <mark at klomp dot org> ---
Sorry for the late reply.

I think you are right that initializing these two small (6 ints) arrays at the
start of the function is fine if it helps the compiler.

I haven't been able to reproduce the compiler warning myself though.
My proposed patch would be:

diff --git a/compress.c b/compress.c
index 5dfa002..c825c78 100644
--- a/compress.c
+++ b/compress.c
@@ -253,8 +253,8 @@ void sendMTFValues ( EState* s )
    --*/


-   UInt16 cost[BZ_N_GROUPS];
-   Int32  fave[BZ_N_GROUPS];
+   UInt16 cost[BZ_N_GROUPS] = { 0 };
+   Int32  fave[BZ_N_GROUPS] = { 0 };

    UInt16* mtfv = s->mtfv;


{ 0 } initializes the whole array with zero, so we don't have to know the exact
number of elements.

Does the above work for you?

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug bzip2/28904] libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  2022-02-18  3:18 [Bug bzip2/28904] New: libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized] demerphq at gmail dot com
  2022-04-19 22:04 ` [Bug bzip2/28904] " mark at klomp dot org
@ 2022-04-19 22:23 ` mark at klomp dot org
  2022-04-20  1:57 ` demerphq at gmail dot com
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: mark at klomp dot org @ 2022-04-19 22:23 UTC (permalink / raw)
  To: bzip2-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=28904

--- Comment #2 from Mark Wielaard <mark at klomp dot org> ---
A different solution might be to use the constant BZ_N_GROUPS instead of the
variable nGroups when resetting the arrays:

diff --git a/compress.c b/compress.c
index 5dfa002..2dc5dc1 100644
--- a/compress.c
+++ b/compress.c
@@ -321,7 +321,7 @@ void sendMTFValues ( EState* s )
    ---*/
    for (iter = 0; iter < BZ_N_ITERS; iter++) {

-      for (t = 0; t < nGroups; t++) fave[t] = 0;
+      for (t = 0; t < BZ_N_GROUPS; t++) fave[t] = 0;

       for (t = 0; t < nGroups; t++)
          for (v = 0; v < alphaSize; v++)
@@ -353,7 +353,7 @@ void sendMTFValues ( EState* s )
             Calculate the cost of this group as coded
             by each of the coding tables.
          --*/
-         for (t = 0; t < nGroups; t++) cost[t] = 0;
+         for (t = 0; t < BZ_N_GROUPS; t++) cost[t] = 0;

          if (nGroups == 6 && 50 == ge-gs+1) {
             /*--- fast track the common case ---*/

Which might also help the compiler to simply zero the whole (6 element) arrays
in one go (the common case), instead of having to check whether nGroups is
smaller in this particular case.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug bzip2/28904] libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  2022-02-18  3:18 [Bug bzip2/28904] New: libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized] demerphq at gmail dot com
  2022-04-19 22:04 ` [Bug bzip2/28904] " mark at klomp dot org
  2022-04-19 22:23 ` mark at klomp dot org
@ 2022-04-20  1:57 ` demerphq at gmail dot com
  2022-05-26 20:53 ` mark at klomp dot org
  2022-05-31 19:06 ` email at arsoftware dot net.br
  4 siblings, 0 replies; 6+ messages in thread
From: demerphq at gmail dot com @ 2022-04-20  1:57 UTC (permalink / raw)
  To: bzip2-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=28904

--- Comment #3 from demerphq <demerphq at gmail dot com> ---
On Wed, 20 Apr 2022 at 00:23, mark at klomp dot org
<sourceware-bugzilla@sourceware.org> wrote:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=28904
>
> --- Comment #2 from Mark Wielaard <mark at klomp dot org> ---
> A different solution might be to use the constant BZ_N_GROUPS instead of the
> variable nGroups when resetting the arrays:
>
> diff --git a/compress.c b/compress.c
> index 5dfa002..2dc5dc1 100644
> --- a/compress.c
> +++ b/compress.c
> @@ -321,7 +321,7 @@ void sendMTFValues ( EState* s )
>     ---*/
>     for (iter = 0; iter < BZ_N_ITERS; iter++) {
>
> -      for (t = 0; t < nGroups; t++) fave[t] = 0;
> +      for (t = 0; t < BZ_N_GROUPS; t++) fave[t] = 0;
>
>        for (t = 0; t < nGroups; t++)
>           for (v = 0; v < alphaSize; v++)
> @@ -353,7 +353,7 @@ void sendMTFValues ( EState* s )
>              Calculate the cost of this group as coded
>              by each of the coding tables.
>           --*/
> -         for (t = 0; t < nGroups; t++) cost[t] = 0;
> +         for (t = 0; t < BZ_N_GROUPS; t++) cost[t] = 0;
>
>           if (nGroups == 6 && 50 == ge-gs+1) {
>              /*--- fast track the common case ---*/
>
> Which might also help the compiler to simply zero the whole (6 element) arrays
> in one go (the common case), instead of having to check whether nGroups is
> smaller in this particular case.

I am fine with both of your proposals. I was wondering why you don't
use memset() instead of the explicit loop? From what I understand it
is faster.

Yves

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug bzip2/28904] libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  2022-02-18  3:18 [Bug bzip2/28904] New: libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized] demerphq at gmail dot com
                   ` (2 preceding siblings ...)
  2022-04-20  1:57 ` demerphq at gmail dot com
@ 2022-05-26 20:53 ` mark at klomp dot org
  2022-05-31 19:06 ` email at arsoftware dot net.br
  4 siblings, 0 replies; 6+ messages in thread
From: mark at klomp dot org @ 2022-05-26 20:53 UTC (permalink / raw)
  To: bzip2-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=28904

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Mark Wielaard <mark at klomp dot org> ---
(In reply to demerphq from comment #3)
> I am fine with both of your proposals. I was wondering why you don't
> use memset() instead of the explicit loop? From what I understand it
> is faster.

I went with the second option. I am not using memset but let the compiler pick
the most efficient way of initializing. With just 6 elements calling memcpy
might just be extra overhead. Thanks for the report.

commit 9de658d248f9fd304afa3321dd7a9de1280356ec
Author: Mark Wielaard <mark@klomp.org>
Date:   Thu May 26 22:38:01 2022 +0200

    Initialize the fave and cost arrays fully

    We try to be smart in sendMTFValues by initializing just nGroups
    number of elements instead of all BZ_N_GROUPS elements. But this means
    the compiler doesn't know all elements are correctly initialized and
    might warn. The arrays are really small, BZ_N_GROUPS, 6 elements. And
    nGroups == BZ_N_GROUPS is the common case. So just initialize them all
    always. Using a constant loop might also help the compiler to optimize
    the initialization.

    https://sourceware.org/bugzilla/show_bug.cgi?id=28904

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug bzip2/28904] libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  2022-02-18  3:18 [Bug bzip2/28904] New: libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized] demerphq at gmail dot com
                   ` (3 preceding siblings ...)
  2022-05-26 20:53 ` mark at klomp dot org
@ 2022-05-31 19:06 ` email at arsoftware dot net.br
  4 siblings, 0 replies; 6+ messages in thread
From: email at arsoftware dot net.br @ 2022-05-31 19:06 UTC (permalink / raw)
  To: bzip2-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=28904

--- Comment #5 from email at arsoftware dot net.br ---
   Need to check the code
   Wait...
   Thanks
   Em 19 de abr. de 2022 19:04, mark at klomp dot org via Bzip2-devel
   <bzip2-devel@sourceware.org> escreveu:

     https://sourceware.org/bugzilla/show_bug.cgi?id=28904

     Mark Wielaard <mark at klomp dot org> changed:

     What |Removed |Added
    
----------------------------------------------------------------------------
     Status|UNCONFIRMED |NEW
     Last reconfirmed| |2022-04-19
     Ever confirmed|0 |1
     CC| |mark at klomp dot org

     --- Comment #1 from Mark Wielaard <mark at klomp dot org> ---
     Sorry for the late reply.

     I think you are right that initializing these two small (6 ints)
     arrays at the
     start of the function is fine if it helps the compiler.

     I haven't been able to reproduce the compiler warning myself
     though.
     My proposed patch would be:

     diff --git a/compress.c b/compress.c
     index 5dfa002..c825c78 100644
     --- a/compress.c
     +++ b/compress.c
     @@ -253,8 +253,8 @@ void sendMTFValues ( EState* s )
     --*/


     - UInt16 cost[BZ_N_GROUPS];
     - Int32 fave[BZ_N_GROUPS];
     + UInt16 cost[BZ_N_GROUPS] = { 0 };
     + Int32 fave[BZ_N_GROUPS] = { 0 };

     UInt16* mtfv = s->mtfv;


     { 0 } initializes the whole array with zero, so we don't have to
     know the exact
     number of elements.

     Does the above work for you?

     --
     You are receiving this mail because:
     You are on the CC list for the bug.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-05-31 19:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-18  3:18 [Bug bzip2/28904] New: libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized] demerphq at gmail dot com
2022-04-19 22:04 ` [Bug bzip2/28904] " mark at klomp dot org
2022-04-19 22:23 ` mark at klomp dot org
2022-04-20  1:57 ` demerphq at gmail dot com
2022-05-26 20:53 ` mark at klomp dot org
2022-05-31 19:06 ` email at arsoftware dot net.br

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).