From: Paul Wise <pabs3@bonedaddy.net>
To: libc-help <libc-help@sourceware.org>
Subject: is this a bug in glibc or readpst?
Date: Wed, 23 Nov 2022 10:02:57 +0800
Message-ID: <2cefc4fa95dd439c2581f4f06d520c004cd33708.camel@bonedaddy.net>


Hi folks,

I have a strange bug that might be an issue in glibc, or it might be a
bug in how readpst works and/or uses freopen in multi-process mode.

This is a summary of how the debugging process went up until now:

readpst from Debian buster works in multi-process mode, but readpst from
Debian bullseye randomly loses some data. Current readpst works on
Debian buster but not on Debian bullseye. The problem isn't related to
the GCC optimisation level, and it isn't compiler related either: clang
exhibits the problem too.

Upgrading libc6 from 2.28-10 to 2.29-1 caused the issue. Bisecting
glibc pointed at commit 0b727ed4d, which is titled "libio: Flush stream
at freopen (BZ#21037)" and looks legitimate, as it aligns glibc freopen
with the POSIX specification.

readpst is using freopen() after fork() to get new *.pst FILE pointers
for child processes. Both the parent and child FILE are opened
read-only, and the FILE position is 0 after freopen in both the working
and broken scenarios. In the broken scenario readpst seems to be
skipping some PST file blocks, and the debug logs seem to indicate that
it reads data from a wrong location, even though the file position is 0
after freopen. Switching the readpst code to use fclose()+fopen() after
fork() instead of freopen() after fork() fixes the issue.

Here are the glibc freopen commit and the initial libpst bug report.

https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=0b727ed4d605d9318cb0d323c88abb0d5a441a9b
https://github.com/pst-format/libpst/issues/7

The readpst forking and freopen functions are here:

https://github.com/pst-format/libpst/blob/main/src/readpst.c#L203
https://github.com/pst-format/libpst/blob/main/src/libpst.c#L395

I have been debugging this using the attached scripts on a Debian
system. The outside script sets up a chroot using Debian schroot and
then runs the inside script to do the test inside the chroot. If you
want to run the inside script you may want to customise the paths it
uses so the script doesn't put files in your home directory.

I am hoping someone can help give me an idea of whether this is likely to
be a problem in glibc or in readpst/libpst, and/or what other debugging
strategy might be useful to figure that out and pinpoint where the
problem is.

-- 
bye,
pabs

https://bonedaddy.net/pabs3/

[-- Attachment #1.2: outside --]
[-- Type: application/x-shellscript, Size: 698 bytes --]

[-- Attachment #1.3: inside --]
[-- Type: application/x-shellscript, Size: 1244 bytes --]

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

Thread overview: 9+ messages
2022-11-23  2:02 Paul Wise [this message]
2022-11-23  8:57 ` Florian Weimer
2022-11-24  0:06   ` Paul Wise
2022-12-02 17:37     ` Florian Weimer
2023-08-06  6:07       ` Paul Wise
2023-08-07  8:59         ` Florian Weimer
2023-08-07 11:00           ` Paul Wise
2023-08-07 11:46             ` Florian Weimer
2023-08-07 11:58               ` Paul Wise
