Date: 2014-05-20

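Socket.Send stays blocked for a long time without returning or throwing an exception. Has anyone run into this?
One of my programs recently hit a rather troublesome problem. Roughly, the code looks like this: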
C# code

        public override int Send(byte[] buffer)
        {
            if (Connected)
            {
                lock (clientSocket)
                {
                    return clientSocket.Send(buffer);
                }
            }
            else
                return COMM_FAILURE;
        }


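clientSocket is a System.Net.Sockets.Socket object, obtained by calling AcceptSocket on a listening TcpListener.
In long-running operation, roughly once a day the following happens:
the thread calling Send stops (blocks) on the line
 return clientSocket.Send(buffer);
I confirmed this in Visual Studio 2008: breaking into the process while the other threads are still working normally shows where this thread is stopped. In addition, because the code takes lock(clientSocket), some other threads end up blocked on that lock as well (let's set that aside for now).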

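The odd part is that the program also holds a second clientSocket, a connection that the program itself initiates to a TCP server (an external system). That connection also calls Send, yet it never shows this problem.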

Looking at Send in the source obtained by decompiling System.dll, the code appears to be stopped at:
 num = UnsafeNclNativeMethods.OSSOCK.send(this.m_Handle.DangerousGetHandle(), numRef + offset, size, socketFlags);
which actually calls:
[DllImport("ws2_32.dll", SetLastError=true)]
internal static extern unsafe int send([In] IntPtr socketHandle, [In] byte* pinnedBuffer, [In] int len, [In] SocketFlags socketFlags);
 
Searching MSDN turns up the documentation for this function:
http://msdn.microsoft.com/en-us/library/ms740149(VS.85).aspx

Two passages there seem worth noting.
One is in the Remarks section:
C# code
/*
If no buffer space is available within the transport system to hold the data to be transmitted, send will block unless the socket has been placed in nonblocking mode. On nonblocking stream oriented sockets, the number of bytes written can be between 1 and the requested length, depending on buffer availability on both the client and server computers. The select, WSAAsyncSelect or WSAEventSelect functions can be used to determine when it is possible to send more data.
*/
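
As a rough illustration of the nonblocking approach this remark describes, the sketch below puts the socket into nonblocking mode, waits for writability with Socket.Poll (the .NET counterpart of select), and sends whatever the transport buffers will accept. The NonBlockingSender class, the SendAll name and the totalTimeoutMs parameter are hypothetical and not part of the original program; this is a sketch under those assumptions, not a drop-in fix.

```csharp
using System;
using System.Net.Sockets;

public static class NonBlockingSender
{
    // Hypothetical helper: tries to send the whole buffer within totalTimeoutMs.
    // Returns the number of bytes actually handed to the transport.
    public static int SendAll(Socket socket, byte[] buffer, int totalTimeoutMs)
    {
        bool wasBlocking = socket.Blocking;
        socket.Blocking = false; // Send now returns instead of waiting for buffer space
        try
        {
            int sent = 0;
            DateTime deadline = DateTime.UtcNow.AddMilliseconds(totalTimeoutMs);
            while (sent < buffer.Length)
            {
                int remainingMs = (int)(deadline - DateTime.UtcNow).TotalMilliseconds;
                if (remainingMs <= 0)
                    break; // out of time, report the partial count

                // Poll takes microseconds; wait at most one second per iteration.
                if (!socket.Poll(Math.Min(remainingMs, 1000) * 1000, SelectMode.SelectWrite))
                    continue; // not writable yet, re-check the deadline

                SocketError error;
                int n = socket.Send(buffer, sent, buffer.Length - sent, SocketFlags.None, out error);
                if (error == SocketError.Success)
                    sent += n; // may be fewer bytes than requested
                else if (error != SocketError.WouldBlock)
                    throw new SocketException((int)error);
            }
            return sent;
        }
        finally
        {
            socket.Blocking = wasBlocking; // restore the caller's mode
        }
    }
}
```

A caller would compare the return value with buffer.Length and treat a short count as a stalled peer, for example by closing the connection instead of leaving the thread stuck inside Send.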


The other is in the Add New Content (community content) section further down:
C# code
/*
blocking send() and delayed ack  |  Modified on 12/10/2008 1:39 AM by Ralph_G

I have an (time critical) application where a sender sends large blocks of data in constant time intervals using a blocking call to send(). The receiver doesn't return anything. Sometimes nothing is sent for about 200ms (which means that we have to drop packets an the sender side), I used wireshark to log the network traffic and saw that always when the 200ms pause occured the TCP ACK from the sender was delayed. I assume that we sometimes send an even number of packets, causing the ACK to be sent immediatly and sometimes an uneven number of packets, causing the TCP delayed ACK mechanism to delay the ACK for 200 ms. And i assume that the blocked send() returns only after it received an ACK for the last sent packet - is this true? If I set the TcpAckFrequency registry key to 1 on the receiver side, the problem disappears. TCP_NODELAY is set on both sides.

I'm looking for a better solution than setting the registry key, should it be possible to resolve the issue using overlapped I/O?
*/


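The second one in particular: if that claim really holds, it would pretty much explain the problem I'm seeing. I hope an expert here can offer some advice. Most importantly, I'd like the claim in that New Content post confirmed or refuted. Thanks.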




------Solution--------------------
Packet loss on the network may be fairly severe. It is best to set SendTimeout so that packet loss cannot hang the program.
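
A minimal sketch of that suggestion, assuming a wrapper similar to the one in the question. The SendWithTimeout class, the TrySend name and the value of COMM_FAILURE are assumptions (the original post does not show how COMM_FAILURE is defined). With SendTimeout set, a blocking Send that cannot complete in time throws a SocketException instead of waiting indefinitely.

```csharp
using System;
using System.Net.Sockets;

public static class SendWithTimeout
{
    // Assumed value; the original program's constant is not shown in the post.
    public const int COMM_FAILURE = -1;

    // Hypothetical helper: blocking send that gives up after timeoutMs milliseconds.
    public static int TrySend(Socket clientSocket, byte[] buffer, int timeoutMs)
    {
        // SendTimeout applies to synchronous Send calls only; the unit is milliseconds.
        clientSocket.SendTimeout = timeoutMs;
        try
        {
            return clientSocket.Send(buffer);
        }
        catch (SocketException ex)
        {
            // A timeout is reported as SocketError.TimedOut; other errors land here too.
            Console.Error.WriteLine("Send failed: " + ex.SocketErrorCode);
            return COMM_FAILURE;
        }
    }
}
```

After a timeout it is unclear how much of the buffer actually went out, so it is probably safest to close the connection and let the peer reconnect rather than keep sending on the same socket.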