From pavel.polacek at ujep.cz  Wed Mar 22 07:53:09 2017
From: pavel.polacek at ujep.cz (Pavel Polacek)
Date: Wed, 22 Mar 2017 08:53:09 +0100
Subject: Bug#858417: libapache2-mod-shib2: Lots of apache workers in "Closing connection" state. Endless sleeping of apache workers.
Message-ID: <20170322075309.19144.66201.reportbug@elf.ujep.cz>

Package: libapache2-mod-shib2
Version: 2.5.3+dfsg-2
Severity: normal

Dear Maintainer,

LAMP server: apache2-mpm-itk + mod_php + mod_shib. Apache workers hang in
the "Closing connection" state.

An Apache worker in the "C" state (process 754) waits for another Apache worker:
strace -p 754
Process 754 attached
wait4(771,

Apache process 771:
gdb --pid 771
(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f032831c479 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f032831c2a0 in __GI___pthread_mutex_lock (mutex=0x7f032ab42420) at ../nptl/pthread_mutex_lock.c:79
#3  0x00007f031fa7ab8f in log4shib::Category::removeAllAppenders() () from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
#4  0x00007f031fa7b9da in log4shib::HierarchyMaintainer::shutdown() () from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
#5  0x00007f031fa7bc4c in log4shib::HierarchyMaintainer::~HierarchyMaintainer() () from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
#6  0x00007f0327f9eb29 in __run_exit_handlers (status=0, listp=0x7f032830c5a8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#7  0x00007f0327f9eb75 in __GI_exit (status=) at exit.c:104
#8  0x00007f0325281aea in itk_fork_process () from /usr/lib/apache2/modules/mpm_itk.so
#9  0x00007f0328e81f40 in ap_run_process_connection ()
#10 0x00007f0324c6f7ba in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#11 0x00007f0324c6fa01 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#12 0x00007f0324c70667 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#13 0x00007f0328e5c7ee in ap_run_mpm ()
#14 0x00007f0328e555f3 in main ()

I suspect the Apache mod_shib module.
So I turned logging off in /etc/shibboleth/native.logger by commenting out all lines.

native.logger:
log4j.rootCategory=INFO, native_log

# This is a Debian-specific change.
log4j.appender.native_log=org.apache.log4j.LocalSyslogAppender
log4j.appender.native_log.syslogName=shibboleth-sp
log4j.appender.native_log.facility=3
log4j.appender.native_log.layout=org.apache.log4j.BasicLayout

Turning logging off is only a workaround.
Now I have 50 workers instead of 1000 after 10 hours of running apache.

Thank you
Pavel Polacek

-- System Information:
Debian Release: 8.7
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages libapache2-mod-shib2 depends on:
ii  apache2-bin [apache2-api-20120211]  2.4.10-10+deb8u8
ii  libc6                               2.19-18+deb8u7
ii  libgcc1                             1:4.9.2-10
ii  libgssapi-krb5-2                    1.12.1+dfsg-19+deb8u2
ii  libshibsp-plugins                   2.5.3+dfsg-2
ii  libshibsp6                          2.5.3+dfsg-2
ii  libstdc++6                          4.9.2-10
ii  libxerces-c3.1                      3.1.1-5.1+deb8u3
ii  libxmltooling6                      1.5.3-2+deb8u1
ii  shibboleth-sp2-utils                2.5.3+dfsg-2

libapache2-mod-shib2 recommends no packages.

libapache2-mod-shib2 suggests no packages.

-- no debconf information

From wferi at niif.hu  Fri Mar 24 13:29:20 2017
From: wferi at niif.hu (Ferenc Wágner)
Date: Fri, 24 Mar 2017 14:29:20 +0100
Subject: Bug#858417: libapache2-mod-shib2: Lots of apache workers in "Closing connection" state. Endless sleeping of apache workers.
In-Reply-To: <20170322075309.19144.66201.reportbug@elf.ujep.cz> (Pavel Polacek's message of "Wed, 22 Mar 2017 08:53:09 +0100")
References: <20170322075309.19144.66201.reportbug@elf.ujep.cz>
Message-ID: <878tnuu7kf.fsf@lant.ki.iif.hu>

Pavel Polacek writes:

> (gdb) bt
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
> #1  0x00007f032831c479 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2  0x00007f032831c2a0 in __GI___pthread_mutex_lock (mutex=0x7f032ab42420) at ../nptl/pthread_mutex_lock.c:79
> #3  0x00007f031fa7ab8f in log4shib::Category::removeAllAppenders() () from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #4  0x00007f031fa7b9da in log4shib::HierarchyMaintainer::shutdown() () from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #5  0x00007f031fa7bc4c in log4shib::HierarchyMaintainer::~HierarchyMaintainer() () from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #6  0x00007f0327f9eb29 in __run_exit_handlers (status=0, listp=0x7f032830c5a8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true)
>     at exit.c:82
> #7  0x00007f0327f9eb75 in __GI_exit (status=) at exit.c:104
> #8  0x00007f0325281aea in itk_fork_process () from /usr/lib/apache2/modules/mpm_itk.so
> #9  0x00007f0328e81f40 in ap_run_process_connection ()
> #10 0x00007f0324c6f7ba in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
> #11 0x00007f0324c6fa01 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
> #12 0x00007f0324c70667 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
> #13 0x00007f0328e5c7ee in ap_run_mpm ()
> #14 0x00007f0328e555f3 in main ()
>
> I suspect the Apache mod_shib module.
> So I turned logging off in /etc/shibboleth/native.logger by commenting out all lines.
> [...]
> Now I have 50 workers instead of 1000 after 10 hours of running apache.

Hi,

Do you mean that disabling logging fixed the problem for you and there
are no hung workers in that case?

Also, please provide the output of the "thread apply all bt" gdb command
on a hung process.

This looks like a log4shib threading problem, probably inherited from
log4cpp.
-- 
Thanks,
Feri

From cantor.2 at osu.edu  Fri Mar 24 15:58:17 2017
From: cantor.2 at osu.edu (Cantor, Scott)
Date: Fri, 24 Mar 2017 15:58:17 +0000
Subject: Bug#858417: libapache2-mod-shib2: Lots of apache workers in "Closing connection" state. Endless sleeping of apache workers.
In-Reply-To: <878tnuu7kf.fsf@lant.ki.iif.hu>
References: <20170322075309.19144.66201.reportbug@elf.ujep.cz> <878tnuu7kf.fsf@lant.ki.iif.hu>
Message-ID: <54A8B997-AA34-43E9-A6C5-561178F4064A@osu.edu>

On 3/24/17, 9:29 AM, "Pkg-shibboleth-devel on behalf of Ferenc Wágner" wrote:

> This looks like a log4shib threading problem, probably inherited from log4cpp.

It's more a "the design of the library just doesn't work for these kinds of
process lifecycles" problem. I suspect there are related issues open in the
Shibboleth issue tracker, but I don't have a specific issue number at hand
right this second. I believe there are a number of issues around changes to
that code, some other changes in the SP to deal with the Apache permission
issues, etc.

At this point, if you can use syslog you can probably avoid a lot of this
mess, since native.log doesn't really get used much anyway. Another key is to
make sure the logger setting in shibboleth2.xml isn't set and that it's not
trying to reload the logging configuration.

This trace also suggests prefork is being used, which should never be used
with mod_shib; that's a DoS attack waiting to happen.

-- Scott

From pavel.polacek at ujep.cz  Fri Mar 24 17:16:29 2017
From: pavel.polacek at ujep.cz (Pavel Polacek)
Date: Fri, 24 Mar 2017 18:16:29 +0100 (CET)
Subject: Bug#858417: libapache2-mod-shib2: Lots of apache workers in "Closing connection" state. Endless sleeping of apache workers.
In-Reply-To: <878tnuu7kf.fsf@lant.ki.iif.hu>
References: <20170322075309.19144.66201.reportbug@elf.ujep.cz> <878tnuu7kf.fsf@lant.ki.iif.hu>
Message-ID:

Hi,

(gdb) thread apply all bt

Thread 3 (Thread 0x7ff0094ed700 (LWP 31498)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007ff017af0ecd in xmltooling::ReloadableXMLFile::reload_fn(void*) () from /usr/lib/x86_64-linux-gnu/libxmltooling-lite.so.6
#2  0x00007ff01f7ea064 in start_thread (arg=0x7ff0094ed700) at pthread_create.c:309
#3  0x00007ff01f51f62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7ff000cab700 (LWP 31510)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007ff017afbec3 in xmltooling::CondWaitImpl::timedwait(xmltooling::Mutex*, int) () from /usr/lib/x86_64-linux-gnu/libxmltooling-lite.so.6
#2  0x00007ff0177d9241 in ?? () from /usr/lib/x86_64-linux-gnu/libshibsp-lite.so.6
#3  0x00007ff01f7ea064 in start_thread (arg=0x7ff000cab700) at pthread_create.c:309
#4  0x00007ff01f51f62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7ff020579780 (LWP 31497)):
#0  0x00007ff01f7f1489 in __libc_waitpid (pid=31518, stat_loc=0x7ffc44121764, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1  0x00007ff01c751b06 in itk_fork_process () from /usr/lib/apache2/modules/mpm_itk.so
#2  0x00007ff020351f40 in ap_run_process_connection ()
#3  0x00007ff01c13f7ba in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#4  0x00007ff01c13fa01 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#5  0x00007ff01c140667 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#6  0x00007ff02032c7ee in ap_run_mpm ()
#7  0x00007ff0203255f3 in main ()
(gdb)

> Do you mean that disabling logging fixed the problem for you and there
> are no hung workers in that case?

Yes, I commented out all lines in /etc/shibboleth/native.logger and it
solved my problem.

>
> Also, please provide the output of the "thread apply all bt" gdb command
> on a hung process.
>
> This looks like a log4shib threading problem, probably inherited from
> log4cpp.
> -- 
> Thanks,
> Feri
>

Thank you
Pavel Polacek
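
Illustration of the hang pattern in the backtraces: the stuck process is a
child created by itk_fork_process that blocks in pthread_mutex_lock while
running exit handlers, and the "thread apply all bt" output shows a parent
that runs additional threads. One possible explanation, which nobody in this
thread confirms, is the general fork-with-threads hazard: a mutex held by
another thread at fork time is copied into the child in the locked state, and
the thread that would unlock it does not exist there. The Python sketch below
shows only that general hazard on a POSIX system. It does not use log4shib,
Apache or mpm-itk, and the function name child_can_take_lock_after_fork is
hypothetical.

```python
import os
import threading


def child_can_take_lock_after_fork(hold_lock_in_thread: bool) -> bool:
    """Fork once and report whether the child managed to acquire `lock`.

    Hypothetical helper for illustration only (POSIX, needs os.fork).
    """
    lock = threading.Lock()
    holding = threading.Event()   # set once the helper thread owns the lock
    release = threading.Event()   # tells the helper thread to let go

    def holder() -> None:
        with lock:
            holding.set()
            release.wait()

    thread = None
    if hold_lock_in_thread:
        thread = threading.Thread(target=holder)
        thread.start()
        holding.wait()            # fork only after the lock is really held

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here. If the lock was held
        # by the helper thread, nobody in this process can ever release it.
        # A timeout is used so the demo ends instead of hanging forever.
        acquired = lock.acquire(timeout=0.5)
        os._exit(0 if acquired else 1)

    # Parent: collect the child, then clean up the helper thread.
    _, status = os.waitpid(pid, 0)
    release.set()
    if thread is not None:
        thread.join()
    return os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("no thread holds the lock:", child_can_take_lock_after_fork(False))
    print("another thread holds it: ", child_can_take_lock_after_fork(True))
```

With no other thread holding the lock the child acquires it normally. When
another thread holds it at fork time, the child's acquire is expected to time
out; without the timeout it would block indefinitely, much like a child stuck
in an exit handler.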