SIGPIPE on Socket
SIGPIPE on Socket - Nov. 30, '04, 7:46:41 PM
|
|
|
sfrare
Posts: 28
Joined: Oct. 24, '04,
Status: offline
|
Hi:
I am working on some socket code that isn't behaving as it should. The sequence goes like this:
Server starts:
  Server sets to non-blocking I/O
Client starts:
  Client gets a connection with the server
  Client switches to blocking I/O
  Client sends data
  Client shuts down the write side of the pipe only
  Client waits on ACK from server
Server:
  Server gets released from the blocking I/O wait on the socket since the write end is shut
  Server goes to send ACK to client and gets SIGPIPE (signal 13)
I did verify that the client was closing with the correct enum so that only the write side is supposed to be closed (i.e. #define SHUT_WR 1). Is there some other socket option I need to set?
Thanks
Steve
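A minimal sketch of the client side of the sequence above, assuming plain POSIX sockets rather than the ACE wrappers. The helper name send_then_wait_for_ack is hypothetical, and EINTR handling is left out for brevity.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send a request, half-close, then wait for a 1-byte ACK.
 * Returns 0 if an ACK byte arrived, -1 on any error or premature EOF. */
int send_then_wait_for_ack(int fd, const char *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = write(fd, buf + sent, len - sent);
        if (n < 0)
            return -1;
        sent += (size_t)n;
    }

    /* Close only the sending direction; the peer's read() will see EOF. */
    if (shutdown(fd, SHUT_WR) == -1)
        return -1;

    /* The receiving direction should still be open, so the ACK can be read. */
    char ack;
    ssize_t n = read(fd, &ack, 1);
    return n == 1 ? 0 : -1;
}
```

On a stack that honours half-close, the server can still write its ACK after the client's shutdown(fd, SHUT_WR), which is the behaviour the test expects.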
|
|
|
RE: SIGPIPE on Socket - Dec. 1, '04, 12:45:11 AM
|
|
|
Rodney
Posts: 3714
Joined: Jul. 9, '02,
From: /Tools lab
Status: offline
|
Actual code is the only way to really see what is being told to be done.
You have given your interpretation of what the code should do.
The two need not be the same. It may also be that you have coded as
you describe, but you have not coded as you intended.
If you write something at one end of a socket then close it and the other end of the socket
is doing a select(), then should not the select() kick off because of the write? And if
it does kick off because of the close, the write bit should be set too for the file descriptor.
A SIGPIPE is as it says on the write(2) man page: the socket is no longer connected to a peer.
> I did verify that the client was closing with the correct enum so that write only is supposed to be closed (I.E. #define SHUT_WR 1)
What does this mean? Which API did you close the socket (file descriptor) with?
(A "#define" is a macro, not an enum, BTW)
The design sounds flaky. The client should wait for an ACK before closing if it wants an ACK.
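To illustrate the write(2) behaviour mentioned above, here is a small sketch, assuming a local AF_UNIX socket pair stands in for the TCP connection. The function name write_after_peer_close is hypothetical. With SIGPIPE ignored, the failed write reports EPIPE through errno instead of killing the process.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: write to a socket whose peer has fully closed.
 * Returns the errno from write(), 0 if the write unexpectedly succeeded,
 * or -1 if the socket pair could not be created. */
int write_after_peer_close(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Without this, the write below would raise SIGPIPE (signal 13),
     * whose default action terminates the process. */
    signal(SIGPIPE, SIG_IGN);

    close(sv[1]);                 /* peer closes both directions */

    char ack = '\0';
    int err = 0;
    if (write(sv[0], &ack, 1) == -1)
        err = errno;              /* expected: EPIPE, nobody can read */

    close(sv[0]);
    return err;
}
```

Ignoring SIGPIPE does not fix a half-close problem, but it turns the fatal signal into an error return that can be logged and examined.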
|
|
|
RE: SIGPIPE on Socket - Dec. 1, '04, 1:23:59 AM
|
|
|
sfrare
Posts: 28
Joined: Oct. 24, '04,
Status: offline
|
quote:
ORIGINAL: Rodney
Actual code is the only way to really see what is being told to be done.
I was trying to avoid that.. ;-> But will do.
> I did verify that the client was closing with the correct enum so that write only is supposed to be closed (I.E. #define SHUT_WR 1)
quote:
ORIGINAL: Rodney
What does this mean? Which API did you close the socket (file descriptor) with?
(A "#define" is a macro, not an enum, BTW)
Sorry, I got the pedantic semantics wrong. :-)
quote:
ORIGINAL: Rodney
The design sounds flakey. The client should wait for an ACK before closing if it wants an ACK.
It does wait: the client closes the write end, then waits on the ACK. The server attempts to send the ACK but finds the socket has been shut down.
I don't write 'em, I just port 'em. You can see the test if you search Google for "SOCK_Test.cpp"
Thanks Rodney!
Steve
|
|
|
RE: SIGPIPE on Socket - Dec. 1, '04, 2:09:56 AM
|
|
|
Rodney
Posts: 3714
Joined: Jul. 9, '02,
From: /Tools lab
Status: offline
|
> You can see the test if you search Google "SOCK_Test.cpp"
Okay, I've just read this file.
The "real" code, cli_stream.close_writer() et al., is in other files.
It's the shutdown() API. This is where I will start to *rant* about WinSock
because the WinSock shutdown() sucks. I spent a grief-ridden amount of time
on this during the Softway days. (For those wondering, Interix uses WinSock
as the TCP/IP stack because (a) it already exists and (b) so there was co-ord
between Win32 & Interix for sockets). I think it was one of the daemons I was
porting that used shutdown() that I had problems with. What did I do?...
Dang, 8 years later is too long to recall one line out of millions.
>It does wait, the client closes the write end then waits on the ACK, the server attempts to send the ACK but finds the socket has been shutdown.
I mean: client writes data, server reads data, server writes ACK, client reads ACK, client closes, server closes.
So let's change the ACE code with some good olde #ifndef __INTERIX
In the client() function do:
#ifndef __INTERIX
    // Explicitly close the writer-side of the connection.
    if (cli_stream.close_writer () == -1)
        ACE_ERROR ((LM_ERROR, ACE_TEXT ("(%P|%t) %p\n"), ACE_TEXT ("close_writer")));
#endif /* __INTERIX */
There should be no need for changes in the server() function from a quick read of it.
Server() chews off 1 byte at a time until it can't read any more. Then read() returns 0 (zero)
and then server() sends the ACK. Server() only selects when there is data to read.
Give the above change a whirl.
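The server loop described above can be sketched roughly as follows, assuming a blocking descriptor and plain POSIX calls instead of the ACE classes. The name drain_then_ack is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>   /* fd is assumed to be a connected stream socket */
#include <unistd.h>

/* Hypothetical helper mirroring the described server loop on a blocking fd.
 * Returns the number of bytes consumed, or -1 on a read or write error. */
long drain_then_ack(int fd)
{
    long total = 0;
    char c;

    for (;;) {
        ssize_t n = read(fd, &c, 1);   /* one byte at a time, as in the test */
        if (n == 1) {
            total++;
            continue;
        }
        if (n == 0)
            break;                     /* EOF: the peer finished writing */
        return -1;                     /* read error */
    }

    /* Only after EOF does the server send its one-byte ACK. */
    char ack = '\0';
    if (write(fd, &ack, 1) != 1)
        return -1;
    return total;
}
```

Note that this loop relies on read() returning 0, which only happens once the client has shut down or closed its sending side.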
|
|
|
RE: SIGPIPE on Socket - Dec. 1, '04, 10:34:22 PM
|
|
|
sfrare
Posts: 28
Joined: Oct. 24, '04,
Status: offline
|
Hi Rodney:
Unfortunately that change makes the server hang since it is doing blocking I/O.
There must be a way to do this. The steps are the same ones Microsoft says to use (minus one step for the server, which is to have the server side do a close_writer as well). I tried that too, but it breaks before reaching that point. These are the steps I checked; hopefully this is supported under Interix (or at least some variation of it).
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/graceful_shutdown_linger_options_and_socket_closure_2.asp
Thanks
Steve
|
|
|
RE: SIGPIPE on Socket - Dec. 2, '04, 1:51:00 AM
|
|
|
sfrare
Posts: 28
Joined: Oct. 24, '04,
Status: offline
|
Hi Rodney:
I wanted to let you know that most of the ACE port is working, I don't think I have updated you on that.
Roughly 140 tests (which test everything from the behavior of 'new' to file writing, byte ordering, etc.) work. I have six or so tests that fail. Three are failing because I don't have the semantics clear on this graceful socket shutdown thing, one has to do with pthread TSS, which shouldn't fail since I told it to use its freakin' emulator!, and two more are socket related. Whew! So pretty good overall.
Thanks for all your help!
Steve
|
|
|
RE: SIGPIPE on Socket - Dec. 2, '04, 3:26:08 PM
|
|
|
Rodney
Posts: 3714
Joined: Jul. 9, '02,
From: /Tools lab
Status: offline
|
> Unfortunately that change makes the server hang since it is doing blocking I/O.
errr, you said it was doing non-blocking I/O before.
Anyway the select() is doing a timing of 0 (zero) which means the select will
wait until one of the descriptors is selected. In this case it's only the read
selector of the one file descriptor. So the server gets select() kicking with
data from the client. It then read()'s this data one char at a time. When
no more data can be read (read() returns 0) then the server sends a '\0' (the
nul being its ACK). The server doesn't depend on any other notification.
Then the server does a close. #ifdef'ing out the client's shutdown() call
should have no effect on the server. So where in the code is the server hanging?
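On the select() timeout: in POSIX terms, a null timeout pointer makes select() wait indefinitely, while a zero-valued struct timeval makes it return at once. The sketch below assumes the "0" above refers to a null pointer. Both helper names are hypothetical.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: poll fd without blocking.
 * Returns 1 if readable now, 0 if not, -1 on error. */
int readable_now(int fd)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    struct timeval zero = {0, 0};   /* zero-valued timeval: return at once */
    return select(fd + 1, &rfds, NULL, NULL, &zero);
}

/* Hypothetical helper: block until fd is readable.
 * A NULL timeout pointer tells select() to wait indefinitely. */
int wait_readable(int fd)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, NULL);
}
```

A descriptor at end-of-file also counts as readable, so select() wakes up when the peer shuts down its write side; the following read() then returns 0.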
|
|
|
RE: SIGPIPE on Socket - Dec. 3, '04, 12:34:12 AM
|
|
|
sfrare
Posts: 28
Joined: Oct. 24, '04,
Status: offline
|
Hi:
I captured some truss output, the server goes back to select and hangs. Here is a before and after change:
Before:
fcntl(5, 0x5, 0x1002) fcntl returned 0
sysconf() sysconf returned 1024 0x400
select() select returned 0
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 0
pthread_mutex_lock(0x124D70) pthread_mutex_lock returned 0
pthread_mutex_unlock(0x124D70) pthread_mutex_unlock returned 0
pthread_mutex_lock(0x124D70) pthread_mutex_lock returned 0
pthread_cond_signal(0x124D74) pthread_cond_signal returned 0
pthread_mutex_unlock(0x124D70) pthread_mutex_unlock returned 0
pthread_sigmask(1, 0x124F98, 0x83E540) pthread_sigmask returned 0
pthread_mutex_lock(0x125538) pthread_mutex_lock returned 0
pthread_mutex_unlock(0x125538) pthread_mutex_unlock returned 0
write(3, 0x12A020, 93) write returned 93 0x5D
pthread_mutex_lock(0x125538) pthread_mutex_lock returned 0
pthread_cond_signal(0x12553C) pthread_cond_signal returned 0
pthread_mutex_unlock(0x125538) pthread_mutex_unlock returned 0
pthread_sigmask(3, 0x83E540, 0x0) pthread_sigmask returned 0
write(5, 0x40621B, 1) write failed: errno 32, Broken pipe
signal 13 SIGPIPE code=1
process killed by signal 13
After:
fcntl(5, 0x5, 0x1002) fcntl returned 0
sysconf() sysconf returned 1024 0x400
select() select returned 0
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read failed: errno 11, Resource temporarily unavailable
pthread_mutex_lock(0x124D70) pthread_mutex_lock returned 0
pthread_mutex_unlock(0x124D70) pthread_mutex_unlock returned 0
pthread_mutex_lock(0x124D70) pthread_mutex_lock returned 0
pthread_cond_signal(0x124D74) pthread_cond_signal returned 0
pthread_mutex_unlock(0x124D70) pthread_mutex_unlock returned 0
pthread_sigmask(1, 0x124F98, 0x83E540) pthread_sigmask returned 0
pthread_mutex_lock(0x125538) pthread_mutex_lock returned 0
pthread_mutex_unlock(0x125538) pthread_mutex_unlock returned 0
write(3, 0x12A020, 85) write returned 85 0x55
pthread_mutex_lock(0x125538) pthread_mutex_lock returned 0
pthread_cond_signal(0x12553C) pthread_cond_signal returned 0
pthread_mutex_unlock(0x125538) pthread_mutex_unlock returned 0
pthread_sigmask(3, 0x83E540, 0x0) pthread_sigmask returned 0
sysconf() sysconf returned 1024 0x400
select()
That select just above is where it (the server) hangs. Maybe I have something reversed in the manner in which blocking and non-blocking are set up? I am still unsure...
Thanks
Steve
|
|
|
RE: SIGPIPE on Socket - Dec. 3, '04, 2:21:03 AM
|
|
|
Rodney
Posts: 3714
Joined: Jul. 9, '02,
From: /Tools lab
Status: offline
|
Select(), the API, really doesn't care about blocking or non-blocking.
Read(), write() and open() care.
> read(5, 0x83F800, 1) read failed: errno 11, Resource temporarily unavailable
This is EAGAIN which means that the read() is happening on a non-blocking descriptor.
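The difference between the two traces can be sketched as below, assuming a non-blocking descriptor. In the "before" trace the last read() returned 0 (end-of-file after the client's shutdown()); in the "after" trace it failed with EAGAIN, which would be consistent with the client never signalling end-of-data, so the server goes back to select() while the client waits for its ACK. The names classify_read and set_nonblocking are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum read_state { READ_DATA, READ_EOF, READ_WOULD_BLOCK, READ_ERROR };

/* Hypothetical helper: classify a 1-byte read on a non-blocking descriptor. */
enum read_state classify_read(int fd, char *out)
{
    ssize_t n = read(fd, out, 1);
    if (n == 1)
        return READ_DATA;
    if (n == 0)
        return READ_EOF;             /* peer closed its write side */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_WOULD_BLOCK;     /* no data yet, peer still open */
    return READ_ERROR;
}

/* Hypothetical helper: turn on O_NONBLOCK, preserving other status flags. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

Only READ_EOF tells the server that the client is done sending; READ_WOULD_BLOCK just means "try again later".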
I've been thinking about the original code a bunch more WRT shutdown() and the write()
to a socket. After mulling, hemming and hawing I think I know what's happening with the
EPIPE getting returned. But I don't have the src to play with anymore.
In the meantime, we need a workaround of some sort.
The problem is, I think, that the socket is marked closed (both directions) when only
the remote side (our recv side from the server perspective) gets closed (FIN).
So the next write() gets an EPIPE.
I'll have to think about a workaround some more because I think you want an ACE client
to be talking with any ACE server the same way. Though this may be an SOL state
because of something that happened much earlier.
< Message edited by Rodney -- Dec. 3, '04, 2:45:38 AM >
|
|
|
RE: SIGPIPE on Socket - Dec. 3, '04, 2:48:11 AM
|
|
|
Rodney
Posts: 3714
Joined: Jul. 9, '02,
From: /Tools lab
Status: offline
|
In all of the truss output (I imagine that there are gobs of it) are there
any other write()s from the server side to file descriptor 5 than the one in
the bit of o/p you quote?
|
|
|
RE: SIGPIPE on Socket - Dec. 3, '04, 4:07:21 AM
|
|
|
sfrare
Posts: 28
Joined: Oct. 24, '04,
Status: offline
|
Hi Rodney:
I searched on 'write(5' and came up empty, in the hang case. In the original case there is just the one write. Here is the hang case with the getids and pthreads lines removed (it is only 17KB but removing that junk cleans up a lot!)
tracing pid 1699
open("/opt/ACE_wrappers/lib/libdl.so.3.5", 0x1) open failed: errno 2, No such file or directory
open("/usr/lib/libdl.so.3.5", 0x1) open returned 3
read(3, 0x83EC18, 4096) read returned 4096 0x1000
close(3) close returned 0
unixpath2win() unixpath2win returned 0
open("/opt/ACE_wrappers/lib/libTest_Output.so.5.4.2", 0x1) open returned 3
read(3, 0x83EBF8, 4096) read returned 4096 0x1000
close(3) close returned 0
unixpath2win() unixpath2win returned 0
open("/opt/ACE_wrappers/lib/libACE.so.5.4.2", 0x1) open returned 3
read(3, 0x83EBD8, 4096) read returned 4096 0x1000
close(3) close returned 0
unixpath2win() unixpath2win returned 0
open("/opt/ACE_wrappers/lib/libstdc++.so.3.5", 0x1) open failed: errno 2, No such file or directory
open("/usr/lib/libstdc++.so.3.5", 0x1) open returned 3
read(3, 0x83EBB8, 4096) read returned 4096 0x1000
close(3) close returned 0
unixpath2win() unixpath2win returned 0
open("/opt/ACE_wrappers/lib/libm.so.3.5", 0x1) open failed: errno 2, No such file or directory
open("/usr/lib/libm.so.3.5", 0x1) open returned 3
read(3, 0x83EB98, 4096) read returned 4096 0x1000
close(3) close returned 0
unixpath2win() unixpath2win returned 0
open("/opt/ACE_wrappers/lib/libc.so.3.5", 0x1) open failed: errno 2, No such file or directory
open("/usr/lib/libc.so.3.5", 0x1) open returned 3
read(3, 0x83EB78, 4096) read returned 4096 0x1000
close(3) close returned 0
unixpath2win() unixpath2win returned 0
mkdir("log/") mkdir failed: errno 17, File exists
open("log/SOCK_Test.log", 0x302, 0666) open returned 3
gettzenv() gettzenv returned 0
open_nocancel("/usr/share/zoneinfo/America/Tijuana", 0x1) open_nocancel returned 4
read_nocancel(4, 0x83CC8C, 7484) read_nocancel returned 844 0x34C
close_nocancel(4) close_nocancel returned 0
fstat(3, 0xB60710, 0x83EA34) fstat ret: 0 dev: 0x40000000000043 ino: 0x00062832
isatty(3) isatty returned 0
write(3, 0x12A020, 102) write returned 102 0x66
socket() socket returned 4
bind() bind returned 0
listen() listen returned 0
getsockname() getsockname returned 0
write(3, 0x12A020, 72) write returned 72 0x48
fork() fork returned 1407 0x57F
getids() getids returned 0
fcntl(4, 0x4, 0x401078) fcntl returned 2
fcntl(4, 0x5, 0x1002) fcntl returned 0
sysconf() sysconf returned 1024 0x400
select() select returned 0
accept() accept returned 5
gethostbyaddr() gethostbyaddr returned 0
write(3, 0x12A020, 80) write returned 80 0x50
fcntl(5, 0x4, 0x0) fcntl returned 2
fcntl(5, 0x5, 0x1002) fcntl returned 0
sysconf() sysconf returned 1024 0x400
select() select returned 0
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read returned 1
read(5, 0x83F800, 1) read failed: errno 11, Resource temporarily unavailable
write(3, 0x12A020, 85) write returned 85 0x55
sysconf() sysconf returned 1024 0x400
select()
Thanks
Steve
|
|
|
RE: SIGPIPE on Socket - Dec. 4, '04, 2:31:20 AM
|
|
|
Rodney
Posts: 3714
Joined: Jul. 9, '02,
From: /Tools lab
Status: offline
|
> I searched on 'write(5' and came up empty, in the hang case
Thanks for the info.
Okay, I was looking to see if a twisted scenario might cause the
problem, but this nulls my theory. Gotta come up with a new one.
|
|
|
RE: SIGPIPE on Socket - Dec. 7, '04, 1:53:41 PM
|
|
|
Rodney
Posts: 3714
Joined: Jul. 9, '02,
From: /Tools lab
Status: offline
|
Okay, I think I've come up with a new theory based on the data. The
idea came to me a few minutes ago. But I have to test it to see if it'll
hold water. I expect there's a good chance it won't wash, but c'est la vie;
if I don't try we won't know. FYI.
|
|
|
RE: SIGPIPE on Socket - Dec. 7, '04, 5:00:08 PM
|
|
|
Rodney
Posts: 3714
Joined: Jul. 9, '02,
From: /Tools lab
Status: offline
|
I've written code that duplicates the problem.
The problem happens whether the descriptor is in blocking or non-blocking mode.
So mode of the socket is not an issue (or a workaround).
What I was hoping was that if the server-side (the side not doing the shutdown())
did a write that succeeded before the shutdown() call that things would be okay
because a "write bit" would be certainly on. But it did not work.
The only thing I can confirm is that the notification of shutdown can be received
before the read or before the read has finished (by WinSock text). So the notification
of the shutdown must set or clear something about the writability of the socket.
The read() side seems to be okay with the shutdown.
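For comparison, a self-contained check of the same sequence might look like the sketch below. It assumes a local AF_UNIX socket pair in a single process instead of a TCP connection between two processes, so it shows the expected half-close behaviour and is not the reproduction code mentioned above. The function name half_close_roundtrip is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical check of the half-close sequence on a local socket pair.
 * Returns 0 if the ACK got through, else the number of the failing step. */
int half_close_roundtrip(void)
{
    int sv[2];                    /* sv[0] plays the client, sv[1] the server */
    char c, ack = '\0';
    int step = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 1;

    if (write(sv[0], "data", 4) != 4) {
        step = 2;                             /* client sends data */
    } else if (shutdown(sv[0], SHUT_WR) == -1) {
        step = 3;                             /* client half-closes */
    } else {
        ssize_t n;
        while ((n = read(sv[1], &c, 1)) == 1)
            ;                                 /* server drains until EOF */
        if (n != 0)
            step = 4;
        else if (write(sv[1], &ack, 1) != 1)
            step = 5;                         /* the write that hit EPIPE */
        else if (read(sv[0], &c, 1) != 1)
            step = 6;                         /* client reads the ACK */
    }

    close(sv[0]);
    close(sv[1]);
    return step;
}
```

If the stack wrongly marks both directions closed, step 5 is where the process would receive SIGPIPE unless that signal is ignored or handled first.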
|
|
|
RE: SIGPIPE on Socket - Dec. 8, '04, 12:41:39 AM
|
|
|
sfrare
Posts: 28
Joined: Oct. 24, '04,
Status: offline
|
Hi Rodney:
First of all I really appreciate all the help you have given to me. It appears I am in the SOL case and Interix cannot perform the graceful shutdown operation as described here:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/graceful_shutdown_linger_options_and_socket_closure_2.asp
With or without the one extra Winsock step (it appears to me that Unix does not require the server side to close the write end...)
Hmm... I will have to think on this... Of course I could always throw in a hack in the upper layers of ACE for a home-grown signalling protocol to coerce these semantics, but I don't see any other hacks in the ACE code for this....
Going to have to chew on this for a while...
Thanks!
Steve
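One possible shape for such a home-grown signalling protocol is a length prefix, so the server knows when the data ends without relying on shutdown(). This is only a sketch under that assumption; it is not something ACE provides, and the names send_framed and recv_framed_and_ack are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>   /* fd is assumed to be a connected stream socket */
#include <unistd.h>

/* Read exactly len bytes; returns 0 on success, -1 on error or early EOF. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical sender: 4-byte big-endian length, then the payload. */
int send_framed(int fd, const void *data, uint32_t len)
{
    uint32_t be = htonl(len);
    if (write_full(fd, &be, sizeof be) == -1)
        return -1;
    return write_full(fd, data, len);
}

/* Hypothetical receiver: reads one frame into buf (capacity cap), then
 * sends a one-byte ACK. Returns the payload length, or -1 on error or
 * if the frame does not fit. */
long recv_framed_and_ack(int fd, void *buf, uint32_t cap)
{
    uint32_t be;
    if (read_full(fd, &be, sizeof be) == -1)
        return -1;
    uint32_t len = ntohl(be);
    if (len > cap)
        return -1;
    if (read_full(fd, buf, len) == -1)
        return -1;
    char ack = '\0';
    return write_full(fd, &ack, 1) == -1 ? -1 : (long)len;
}
```

The trade-off is that both ends must agree on the framing, so an unmodified peer that expects end-of-file as the end-of-data marker would not interoperate.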
|
|
|