On 11/23/23 10:16, Tetsuo Handa wrote: > On 2023/09/28 5:03, wcohen at redhat dot com via Systemtap wrote: >> commit ca71442b93af61cbd9a1386e24e967373f928eae (HEAD -> master, origin/master, >> origin/HEAD) >> Author: William Cohen >> Date: Wed Sep 27 10:09:11 2023 -0400 >> >> Eliminate use of kernel's flush_scheduled_work() in systemtap modules >> >> Kernel git commit 20bdedafd2f63e0ba70991127f9b5c0826ebdb32 turns use >> of flush_scheduled_work() into a warning which causes builds of >> anything using it to fail because warnings are treated as errors. >> Previous users of flush_scheduled_work() in the kernel have been >> converted over to use individual workqueues. Systemtap runtime now >> does the same. It creates and uses its own workqueue to eliminate the >> use of flush_scheduled_work(). > Hi, Sorry for not getting to this earlier. Attached is a patch that attempts to address the problems mentioned in this email. Could you verify that the patch addresses all the concerns. > That commit does not look correct. > > systemtap_wq = alloc_workqueue("systemtap-wq", WQ_UNBOUND | WQ_MEM_RECLAIM, 0); > if (!systemtap_wq) > return !systemtap_wq; > > causes stap_init_module() to return 1 (i.e. a positive value) when alloc_workqueue() > failed. The caller ( https://elixir.bootlin.com/linux/v6.6/source/kernel/module/main.c#L2528 ) > emits a warning message but thinks that stap_init_module() succeeded. > stap_init_module() should return -ENOMEM when alloc_workqueue() failed. This area of code now return -ENOMEM if alloc_workqueue fails. It also eliminates the WQ_MEM_RECLAIM flag. > > Also, when alloc_workqueue() with WQ_MEM_RECLAIM flag succeeded, alloc_workqueue() > allocated a "struct task_struct". Not calling destroy_workqueue(systemtap_wq) leaks > that "struct task_struct" and related kernel resources. > > rc = systemtap_kernel_module_init(); > if (rc) > return rc; > > I don't know the content of systemtap_kernel_module_init() emitted by translate.cxx , > but it seems to me that destroy_workqueue(systemtap_wq) will not be called. > Like https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp > says, please be sure to call destroy_workqueue(systemtap_wq) when stap_init_module() > returns an error after alloc_workqueue() succeeded. The stap_init_module is generated by the stap translator, but it doesn't call destroy_workqueue(). The code now calls destroy_workqueue before each of the returns in stap_init_module(). There is a use of the workqueue in systemtap_kernel_module_exit(), so the destroy_workqueue() follows it. The stap_cleanup_module() was simplified as systemtap_wq must be available after the stap_init_module() is successful. -Will > > Also, system_wq is created without WQ_MEM_RECLAIM flag > ( https://elixir.bootlin.com/linux/v6.6/source/kernel/workqueue.c#L6596 ). > If you don't have a reason to create systemtap_wq with WQ_MEM_RECLAIM flag, > use of WQ_MEM_RECLAIM flag is a waste of kernel resource. >