public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened
@ 2021-04-24 21:50 alexandre.ferrieux at orange dot com
2021-04-24 21:51 ` [Bug stdio/27777] fclose does a linear search, takes ages when many " alexandre.ferrieux at orange dot com
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: alexandre.ferrieux at orange dot com @ 2021-04-24 21:50 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
Bug ID: 27777
Summary: fclose does a linear search, takes ages when FILE* are
opened
Product: glibc
Version: 2.34
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: stdio
Assignee: unassigned at sourceware dot org
Reporter: alexandre.ferrieux at orange dot com
Target Milestone: ---
If one has *many* opened streams (which is possible if the max number of opened
file descriptors per process has been tuned beyond the typical 1024), fclose()
starts being *very* slow.
The root cause is the following linear search in genops.c/_IO_un_link:
d18ea0c5 68 for (f = &_IO_list_all->file._chain; *f; f =
&(*f)->_chain)
9964a145 69 if (*f == (FILE *) fp)
40a55d20 70 {
cedb4109 71 *f = fp->file._chain;
40a55d20 72 break;
UD 73 }
Clearly a singly-linked list does not allow for O(1) removal.
Given what I understand of the design constraints of this list, the most
natural fix would be to switch to a doubly-linked list.
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
@ 2021-04-24 21:51 ` alexandre.ferrieux at orange dot com
2024-04-24 17:05 ` carlos at redhat dot com
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: alexandre.ferrieux at orange dot com @ 2021-04-24 21:51 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
Alexandre Ferrieux <alexandre.ferrieux at orange dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|fclose does a linear |fclose does a linear
|search, takes ages when |search, takes ages when
|FILE* are opened |many FILE* are opened
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
2021-04-24 21:51 ` [Bug stdio/27777] fclose does a linear search, takes ages when many " alexandre.ferrieux at orange dot com
@ 2024-04-24 17:05 ` carlos at redhat dot com
2024-04-24 21:12 ` alexandre.ferrieux at orange dot com
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: carlos at redhat dot com @ 2024-04-24 17:05 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
Carlos O'Donell <carlos at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |carlos at redhat dot com
Status|UNCONFIRMED |NEW
Last reconfirmed| |2024-04-24
Ever confirmed|0 |1
--- Comment #1 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Alexandre Ferrieux from comment #0)
> If one has *many* opened streams (which is possible if the max number of
> opened file descriptors per process has been tuned beyond the typical 1024),
> fclose() starts being *very* slow.
>
> The root cause is the following linear search in genops.c/_IO_un_link:
>
> d18ea0c5 68 for (f = &_IO_list_all->file._chain; *f; f = &(*f)->_chain)
> 9964a145 69 if (*f == (FILE *) fp)
> 40a55d20 70 {
> cedb4109 71 *f = fp->file._chain;
> 40a55d20 72 break;
> UD 73 }
>
> Clearly a singly-linked list does not allow for O(1) removal.
> Given what I understand of the design constraints of this list, the most
> natural fix would be to switch to a doubly-linked list.
Just wanted to mention that I saw this recently while walking the list of bugs.
I can confirm that this is the case. Please feel free to work with the
community to improve the performance. Likewise you might also want to look at
fcloseall() performance too.
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
2021-04-24 21:51 ` [Bug stdio/27777] fclose does a linear search, takes ages when many " alexandre.ferrieux at orange dot com
2024-04-24 17:05 ` carlos at redhat dot com
@ 2024-04-24 21:12 ` alexandre.ferrieux at orange dot com
2024-04-25 10:52 ` carlos at redhat dot com
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: alexandre.ferrieux at orange dot com @ 2024-04-24 21:12 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
--- Comment #2 from Alexandre Ferrieux <alexandre.ferrieux at orange dot com> ---
Hi,
Since that was (exactly) 3 years ago, I assume some parts have evolved in
between. Would love to contribute: can you point me to the official git repo ?
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
` (2 preceding siblings ...)
2024-04-24 21:12 ` alexandre.ferrieux at orange dot com
@ 2024-04-25 10:52 ` carlos at redhat dot com
2024-04-25 14:17 ` sam at gentoo dot org
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: carlos at redhat dot com @ 2024-04-25 10:52 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
--- Comment #3 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Alexandre Ferrieux from comment #2)
> Hi,
> Since that was (exactly) 3 years ago, I assume some parts have evolved in
> between. Would love to contribute: can you point me to the official git repo
> ?
The issue still remains :-)
68 for (f = &_IO_list_all->file._chain; *f; f = &(*f)->_chain)
69 if (*f == (FILE *) fp)
70 {
71 *f = fp->file._chain;
72 break;
73 }
The interesting think here would be to write a microbenchmark that shows the
degenerate case of the performance, like we have for empty critical sections.
(a) Write benchtests/bench-file-unlink.c has lots of open descriptors and times
the unlink time indirectly (worst case).
(b) Change implementation and show the performance improves.
References:
Contribution Checklist:
https://sourceware.org/glibc/wiki/Contribution%20checklist
Testing Builds:
https://sourceware.org/glibc/wiki/Testing/Builds
Good luck!
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
` (3 preceding siblings ...)
2024-04-25 10:52 ` carlos at redhat dot com
@ 2024-04-25 14:17 ` sam at gentoo dot org
2024-04-25 16:20 ` alexandre.ferrieux at orange dot com
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: sam at gentoo dot org @ 2024-04-25 14:17 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
Sam James <sam at gentoo dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |sam at gentoo dot org
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
` (4 preceding siblings ...)
2024-04-25 14:17 ` sam at gentoo dot org
@ 2024-04-25 16:20 ` alexandre.ferrieux at orange dot com
2024-04-26 14:23 ` alexandre.ferrieux at orange dot com
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: alexandre.ferrieux at orange dot com @ 2024-04-25 16:20 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
--- Comment #4 from Alexandre Ferrieux <alexandre.ferrieux at orange dot com> ---
Could you please help me find my path into benchtest ? The README leavse
unanswered a few critical questions:
- what is the minimal unit bench template ?
- visibly several styles have accumulated over decades , which is the
"current" one ?
- how do you invoke a single bench with 'make' ?
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
` (5 preceding siblings ...)
2024-04-25 16:20 ` alexandre.ferrieux at orange dot com
@ 2024-04-26 14:23 ` alexandre.ferrieux at orange dot com
2024-05-17 21:14 ` cvs-commit at gcc dot gnu.org
2024-05-17 21:15 ` hjl.tools at gmail dot com
8 siblings, 0 replies; 10+ messages in thread
From: alexandre.ferrieux at orange dot com @ 2024-04-26 14:23 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
--- Comment #5 from Alexandre Ferrieux <alexandre.ferrieux at orange dot com> ---
Okay, I proceeded without the benchtest as I couldn't make it work in a
reasonable time.
Patch submitted as
https://patchwork.sourceware.org/project/glibc/patch/20240426141847.2458829-1-alexandre.ferrieux@orange.com/
The simple program below could be used in benchtest :)
-----
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <signal.h>
#include <time.h>
#define GDBSYNC 0
#define N 1000000
#define M 100
int time_ms(void)
{
struct timespec tp;
if (0>clock_gettime(CLOCK_REALTIME, &tp)) {fprintf(stderr,"### failed to
clock_gettime - errno=%d\n",errno);exit(1);}
return (int)(tp.tv_sec*1000+tp.tv_nsec/1000000);
}
int main(int argc,char **argv)
{
#if GDBSYNC
char z[65536];
#endif
FILE *ff,*keep[M];
int i,t1,t2,t3,t4;
fprintf(stderr,"Preparing %d FILEs ...\n",N);fflush(stderr);
for(i=0;i<N;i++) {
ff=fdopen(0,"r");
if (!ff) {fprintf(stderr,"### failed to fdopen -
errno=%d\n",errno);exit(1);}
//if (i<10) fprintf(stderr,"ff=%p\n",ff);
if (i<M) keep[i]=ff;
}
#if GDBSYNC
fprintf(stderr,"Type Enter to proceed: ");fflush(stderr);
gets(z);
#endif
fprintf(stderr,"Now fclosing the first %d ...\n",M);fflush(stderr);
t1=time_ms();
for(i=0;i<M;i++) fclose(keep[i]);
t2=time_ms();
fprintf(stderr,"Now calling fcloseall()...\n");fflush(stderr);
t3=time_ms();
fcloseall();
t4=time_ms();
fprintf(stderr,"DONE fclose(%d)=%dms
fcloseall()=%dms\n",M,t2-t1,t4-t3);fflush(stderr);
exit(0);
}
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
` (6 preceding siblings ...)
2024-04-26 14:23 ` alexandre.ferrieux at orange dot com
@ 2024-05-17 21:14 ` cvs-commit at gcc dot gnu.org
2024-05-17 21:15 ` hjl.tools at gmail dot com
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-05-17 21:14 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
--- Comment #6 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=2a99e2398d9d717c034e915f7846a49e623f5450
commit 2a99e2398d9d717c034e915f7846a49e623f5450
Author: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Date: Thu May 16 05:54:30 2024 -0700
Use a doubly-linked list for _IO_list_all (bug 27777)
This patch fixes BZ #27777 "fclose does a linear search, takes ages when
many FILE* are opened". Simply put, the master list of opened (FILE*),
namely _IO_list_all, is a singly-linked list. As a consequence, the
removal of a single element is in O(N), which cripples the performance
of fclose(). The patch switches to a doubly-linked list, yielding O(1)
removal. The one padding field in struct _IO_FILE, __pad5, is renamed
to _prevchain for a doubly-linked list. Since fields in struct _IO_FILE
after the _lock field are internal to glibc and opaque to applications.
We can change them as long as the size of struct _IO_FILE is unchanged,
which is checked as the part of glibc ABI with sizes of _IO_2_1_stdin_,
_IO_2_1_stdout_ and _IO_2_1_stderr_.
NB: When _IO_vtable_offset (fp) == 0, copy relocation will cover the
whole struct _IO_FILE. Otherwise, only fields up to the _lock field
will be copied to applications at run-time. It is used to check if
the _prevchain field can be safely accessed.
After opening 2 million (FILE*), the fclose() of 100 of them takes quite
a few seconds without the patch, and under 2 seconds with it on a loaded
machine.
No test is added since there are no functional changes.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug stdio/27777] fclose does a linear search, takes ages when many FILE* are opened
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
` (7 preceding siblings ...)
2024-05-17 21:14 ` cvs-commit at gcc dot gnu.org
@ 2024-05-17 21:15 ` hjl.tools at gmail dot com
8 siblings, 0 replies; 10+ messages in thread
From: hjl.tools at gmail dot com @ 2024-05-17 21:15 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27777
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Target Milestone|--- |2.40
Status|NEW |RESOLVED
--- Comment #7 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed for 2.40.
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-05-17 21:15 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-24 21:50 [Bug stdio/27777] New: fclose does a linear search, takes ages when FILE* are opened alexandre.ferrieux at orange dot com
2021-04-24 21:51 ` [Bug stdio/27777] fclose does a linear search, takes ages when many " alexandre.ferrieux at orange dot com
2024-04-24 17:05 ` carlos at redhat dot com
2024-04-24 21:12 ` alexandre.ferrieux at orange dot com
2024-04-25 10:52 ` carlos at redhat dot com
2024-04-25 14:17 ` sam at gentoo dot org
2024-04-25 16:20 ` alexandre.ferrieux at orange dot com
2024-04-26 14:23 ` alexandre.ferrieux at orange dot com
2024-05-17 21:14 ` cvs-commit at gcc dot gnu.org
2024-05-17 21:15 ` hjl.tools at gmail dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).