From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by sourceware.org (Postfix) with ESMTP id 5808B3857C59 for ; Mon, 10 May 2021 19:17:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5808B3857C59 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-9-_93XunQyPpiSzuMPafUU9w-1; Mon, 10 May 2021 15:17:39 -0400 X-MC-Unique: _93XunQyPpiSzuMPafUU9w-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4A101EC1A0; Mon, 10 May 2021 19:17:38 +0000 (UTC) Received: from oldenburg.str.redhat.com (ovpn-112-137.ams2.redhat.com [10.36.112.137]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 710457012E; Mon, 10 May 2021 19:17:37 +0000 (UTC) From: Florian Weimer To: Adhemerval Zanella Cc: libc-alpha@sourceware.org Subject: Re: [PATCH v2 4/4] libio: Fix race access to _IO_FLAGS2_NEED_LOCK References: <20210510170426.2533768-1-adhemerval.zanella@linaro.org> <20210510170426.2533768-4-adhemerval.zanella@linaro.org> <875yzq7a8s.fsf@oldenburg.str.redhat.com> <04d487bf-6c41-97ed-df4e-7d2eac9e1337@linaro.org> Date: Mon, 10 May 2021 21:18:03 +0200 In-Reply-To: <04d487bf-6c41-97ed-df4e-7d2eac9e1337@linaro.org> (Adhemerval Zanella's message of "Mon, 10 May 2021 16:11:16 -0300") Message-ID: <87k0o65rf8.fsf@oldenburg.str.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 May 2021 19:17:43 -0000 * Adhemerval Zanella: > I see that my first assumption of moving the _IO_FLAGS2_NEED_LOCK after > the lock should be suffice. The data race is not really related to=20 > _IO_FLAGS2_NEED_LOCK set in fact, but rather the small window where=20 > the _flags2 read and update with _IO_FLAGS2_NEED_LOCK might lost some > bits update. Right. > Maybe the comment: > > /* The _IO_FLAGS2_NEED_LOCK is required here to enable the non-atomic > fast path on function that call _IO_need_lock. However it requires > to be set with stream locked to avoid a potential data race. */ s/fast path/path/ I suppose, and =E2=80=9Cit necessary to set it with the s= tream locked=E2=80=9D. >> With more careful review, maybe we could eliminate the _flags2 update >> altogether. What we cannot do is make the flockfile operation itself >> conditional on _IO_FLAGS2_NEED_LOCK because of observable impact >> vis-a-vis funlockfile. If we want to optimize flockfile, I think we'd >> have to change _IO_lock_lock and add an atomics-free fast path for >> SINGLE_THREAD_P (that still performs the lock update). With that, the >> entire _IO_FLAGS2_NEED_LOCK business could be removed again, I think. > > I am not sure we could eliminate since we need a way to inform the > function that call _IO_need_lock that they can elide _IO_flockfile. _IO_need_lock would go away (along with _IO_FLAGS2_NEED_LOCK) if we had the SINGLE_THREAD_P optimization inside _IO_lock_lock and _IO_lock_unlock. > The single-thread optimization won't really work: um multithread=20 > process the idea exactly to make a set of stream operation to avoid > taking the lock. Why not? The single-thread optimization would still avoid the atomic compare-and-set, which should be a significant win on AArch64 and POWER. There is more work to be done for the single-threaded case compared to _IO_FLAGS2_NEED_LOCK, but all that is without atomics, so I think those should be reasonably fast. > And _IO_FILE::_flag does not see to have space (high-order word is > _IO_MAGIC and all flags seems to be used). This general optimzation pattern for locks relies on SINGLE_THREAD_P only. Thanks, Florian