From: Florian Weimer
To: Paul Wise via Libc-help
Cc: Paul Wise
Subject: Re: is this a bug in glibc or readpst?
Date: Wed, 23 Nov 2022 09:57:32 +0100
Message-ID: <875yf6nj43.fsf@oldenburg.str.redhat.com>
In-Reply-To: <2cefc4fa95dd439c2581f4f06d520c004cd33708.camel@bonedaddy.net>

* Paul Wise via Libc-help:

> readpst from Debian buster in multi-process mode works but readpst from
> Debian bullseye randomly loses some data. Current readpst works on
> Debian buster but not Debian bullseye. The problem isn't related to the
> GCC optimisation level. The problem isn't compiler related, clang
> exhibits the problem too. Upgrading libc6 from 2.28-10 to 2.29-1 caused
> the issue. Bisecting glibc pointed at commit 0b727ed4d, which is titled
> "libio: Flush stream at freopen (BZ#21037)" and looks legitimate as it
> aligns glibc freopen with POSIX specifications. readpst is using
> freopen() after fork() to get new *.pst FILE pointers for child
> processes. Both the parent and child FILE are opened read-only. The
> FILE position is 0 after freopen for both scenarios. readpst seems to
> be skipping some PST file blocks in the broken scenario. The debug logs
> seem to indicate that in the broken scenario it reads data from a wrong
> location, even though the file position is 0 after freopen. Switching
> the readpst code to use fclose()+fopen() after fork() instead of
> freopen() after fork() fixes the issue.

Fork still shares the underlying file description.  It only duplicates
the descriptors.  If the subprocess changes the file pointer back to 0,
it will affect the original process, too.  This is just how file
descriptors work.

Could this explain the issue?

Thanks,
Florian