Bug#516356: Logout delay using XDMCP

Margarita Manterola margamanterola at gmail.com
Fri Feb 20 19:53:26 UTC 2009


Package: gdm
Version: 2.20.7-4

Hi!

On Tuestday I upgraded a thinclient server that was running Etch, to
Lenny. One of the problems I found is only triggered on my user, and
not the other two users of the server, but I've been investigating it
for a long while without finding what's going on, so I'm reporting it
as a bug

The problem is like this: when I try to logout (by clicking
System->Logout and then the Logout button), there's something like a
70% chance that the logout process will take 45 seconds instead of 1.
During these 45 seconds, I'm able to keep doing stuff, opening and
closing programs, reading new mail, chatting on xchat, etc.  Then
finally, everything closes as usual.

Yesterday, just after the windows closed, gdm was segfaulting.  So I
compiled it with noopt,nostrip, in order to obtain a core file and
debug the output.  Today, I can't get it to segfault again.  I
compiled it only with nostrip, but it didn't segfault then, either.  I
guess I changed something else in the middle, but I can't remember
exactly what.

What I've found up to now:

When there's a delay, this is written to the syslog (with debug enabled):

Feb 20 10:01:26 calipso gdm[12224]: DEBUG: gdm_socket_handler:
Accepting new connection fd 9
Feb 20 10:01:26 calipso gdm[12224]: DEBUG: Handling user message: 'VERSION'
Feb 20 10:01:26 calipso gdm[12224]: DEBUG: close_if_needed: Got G_IO_HUP on 9
Feb 20 10:01:26 calipso gdm[12224]: DEBUG: close_if_needed: Got error on 9
Feb 20 10:01:26 calipso gdm[16838]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:01:39 calipso gdm[12231]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:01:39 calipso gdm[12239]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:01:41 calipso gdm[16838]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:01:54 calipso gdm[12231]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:01:54 calipso gdm[12239]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:01:56 calipso gdm[16838]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:02:09 calipso gdm[12231]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:02:09 calipso gdm[12239]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:02:11 calipso gdm[16838]: DEBUG: Attempting to parse key
string: xdmcp/PingIntervalSeconds=15
Feb 20 10:02:12 calipso gdm[16838]: DEBUG: slave_waitpid: done_waiting
Feb 20 10:02:13 calipso gdm[16838]: DEBUG: Session: start_time:
1235131208 end_time: 1235131333
(...)

When there's no delay, this is what's written to syslog:

Feb 20 10:15:43 calipso gdm[12224]: DEBUG: gdm_socket_handler:
Accepting new connection fd 9
Feb 20 10:15:43 calipso gdm[12224]: DEBUG: Handling user message: 'VERSION'
Feb 20 10:15:43 calipso gdm[12224]: DEBUG: close_if_needed: Got G_IO_HUP on 9
Feb 20 10:15:43 calipso gdm[12224]: DEBUG: close_if_needed: Got error on 9
Feb 20 10:15:43 calipso gdm[17636]: DEBUG: slave_waitpid: done_waiting
Feb 20 10:15:43 calipso gdm[17636]: DEBUG: Session: start_time:
1235131635 end_time: 1235132143
(...)

It can be seen that there's no much difference, except the pings every
15 seconds, done by the 3 gdm instances running.  In both cases the
parent gdm gets an error, but in neither case says which error it is.

Later on in the output, there's a snippet that can only be seen in the
one with the delay, compare:

Feb 20 10:02:13 calipso gdm[16838]: DEBUG: Running gdm_verify_cleanup
and pamh != NULL
Feb 20 10:02:13 calipso gdm[16838]: DEBUG: Running pam_close_session
Feb 20 10:02:13 calipso gdm[16838]: DEBUG: Running pam_setcred with
PAM_DELETE_CRED
Feb 20 10:02:13 calipso gdm[16838]: DEBUG: gdm_slave_session_start:
Session ended OK (now all finished)
Feb 20 10:02:13 calipso gdm[16838]: DEBUG: Sending LOGGED_IN == 0 for
slave 16838
Feb 20 10:02:13 calipso gdm[12224]: DEBUG: Attempting to parse key
string: debug/Enable=false
Feb 20 10:02:13 calipso gdm[12224]: DEBUG: Handling message: 'LOGGED_IN 16838 0'
Feb 20 10:02:13 calipso gdm[12224]: DEBUG: Got logged in == FALSE
Feb 20 10:02:13 calipso gdm[12224]: DEBUG: mainloop_sig_callback: Got signal 17
Feb 20 10:02:13 calipso gdm[12224]: WARNING: gdm_cleanup_children:
child 16838 crashed of signal 6
Feb 20 10:02:13 calipso gdm[12224]: WARNING: gdm_cleanup_children:
Slave crashed, killing its children

with the case with no delay (i.e. the correct one):

Feb 20 10:15:44 calipso gdm[17636]: DEBUG: Running gdm_verify_cleanup
and pamh != NULL
Feb 20 10:15:44 calipso gdm[17636]: DEBUG: Running pam_close_session
Feb 20 10:15:44 calipso gdm[17636]: DEBUG: Running pam_setcred with
PAM_DELETE_CRED
Feb 20 10:15:44 calipso gdm[12224]: DEBUG: mainloop_sig_callback: Got signal 17
Feb 20 10:15:44 calipso gdm[12224]: WARNING: gdm_cleanup_children:
child 17636 crashed of signal 6
Feb 20 10:15:44 calipso gdm[12224]: WARNING: gdm_cleanup_children:
Slave crashed, killing its children

It's quite shocking that the success message ("Session ended OK") is
actually sent when there was a problem, and not when things were fine,
and only then is the "logged in = FALSE" set.  Also that in both cases
"Slave crashed".

In any case, this doesn't really shed much light on the delay.  The
other thing I checked was strace.  This is what happens (on my gdm,
not the global one) while the delay is taking place:

select(17, [13 16], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
time(NULL)                        = 1235137838
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1061, ...}) = 0
writev(2, [{"gdm[18918]: DEBUG: Attempting to parse key string:
xdmcp/PingIntervalSeconds=15\n"..., 80}], 1) = 80
send(5, "<28>Feb 20 11:50:38 gdm[18918]: DEBUG: Attempting to parse
key string: xdmcp/PingIntervalSeconds=15\n"..., 100, MSG_NOSIGNAL) =
100
alarm(15)                         = 0
select(5, [4], [4], NULL, NULL)   = 1 (out [4])
writev(4, [{"+\27\1\0"..., 4}], 1) = 4
select(5, [4], [], NULL, NULL)    = 1 (in [4])
read(4, "\1\2F\1\0\0\0\0Y\0`\1\0\0\0\0\1\0\0\0\f\0\0\0pI/\10\0\0\0\0"...,
4096) = 32
read(4, 0x8b168b4, 4096)          = -1 EAGAIN (Resource temporarily unavailable)
sigreturn()                       = ? (mask now [])
select(17, [13 16], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
time(NULL)                        = 1235137853
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1061, ...}) = 0
writev(2, [{"gdm[18918]: DEBUG: Attempting to parse key string:
xdmcp/PingIntervalSeconds=15\n"..., 80}], 1) = 80
send(5, "<28>Feb 20 11:50:53 gdm[18918]: DEBUG: Attempting to parse
key string: xdmcp/PingIntervalSeconds=15\n"..., 100, MSG_NOSIGNAL) =
100
alarm(15)                         = 0
select(5, [4], [4], NULL, NULL)   = 1 (out [4])
writev(4, [{"+\27\1\0"..., 4}], 1) = 4
select(5, [4], [], NULL, NULL)    = 1 (in [4])
read(4, "\1\2G\1\0\0\0\0Y\0`\1\0\0\0\0\1\0\0\0\f\0\0\0pI/\10\0\0\0\0"...,
4096) = 32
read(4, 0x8b168b4, 4096)          = -1 EAGAIN (Resource temporarily unavailable)
sigreturn()                       = ? (mask now [])
select(17, [13 16], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
time(NULL)                        = 1235137868
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1061, ...}) = 0
writev(2, [{"gdm[18918]: DEBUG: Attempting to parse key string:
xdmcp/PingIntervalSeconds=15\n"..., 80}], 1) = 80
send(5, "<28>Feb 20 11:51:08 gdm[18918]: DEBUG: Attempting to parse
key string: xdmcp/PingIntervalSeconds=15\n"..., 100, MSG_NOSIGNAL) =
100
alarm(15)                         = 0
select(5, [4], [4], NULL, NULL)   = 1 (out [4])
writev(4, [{"+\27\1\0"..., 4}], 1) = 4
select(5, [4], [], NULL, NULL)    = 1 (in [4])
read(4, "\1\2H\1\0\0\0\0Y\0`\1\0\0\0\0\1\0\0\0\f\0\0\0pI/\10\0\0\0\0"...,
4096) = 32
read(4, 0x8b168b4, 4096)          = -1 EAGAIN (Resource temporarily unavailable)
sigreturn()                       = ? (mask now [])

It looks like it's only the pinging stuff seen before.  I don't know
if there's anything else.

I couldn't find anything useful in .xsession-errors.  There's the
"Failed to authenticate with GDM" message from gnome-panel, but it's
there in both cases.  And there's a switch in the order of some lines,
but nothing that looks relevant.

In case there's anything I've missed, I'm attaching the full logs.

Please tell me what other tests I might perform.

-- 
Besos,
Marga
-------------- next part --------------
A non-text attachment was scrubbed...
Name: logout_no_delay.log
Type: text/x-log
Size: 112670 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/pkg-gnome-maintainers/attachments/20090220/224073fe/attachment-0004.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: logout_with_delay.log
Type: text/x-log
Size: 114151 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/pkg-gnome-maintainers/attachments/20090220/224073fe/attachment-0005.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xsession-no-delay2.log
Type: text/x-log
Size: 10854 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/pkg-gnome-maintainers/attachments/20090220/224073fe/attachment-0006.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xsession-with-delay.log
Type: text/x-log
Size: 10527 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/pkg-gnome-maintainers/attachments/20090220/224073fe/attachment-0007.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gdm-no-delay.strace
Type: application/octet-stream
Size: 102426 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/pkg-gnome-maintainers/attachments/20090220/224073fe/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gdm-with-delay.strace
Type: application/octet-stream
Size: 106162 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/pkg-gnome-maintainers/attachments/20090220/224073fe/attachment-0003.obj 


More information about the pkg-gnome-maintainers mailing list