IOCP [Source: CodeProject]


하늘을닮은호수M 2005. 5. 20. 13:41


1.1 Requirements

The article expects the reader to be familiar with C++, TCP/IP, Socket programming, MFC and Multithreading.

The source code uses Winsock 2.0 and IOCP technology and requires:

Windows NT/2000 or later: Requires Windows NT 3.5 or later.
Windows 95/98/ME: Unsupported.
Visual C++ .NET or a fully updated Visual C++ 6.0.

1.2 Abstract

When you develop different types of software, sooner or later you have to deal with server/client development. Writing comprehensive server/client code is a difficult task for a programmer. This documentation presents simple but powerful server/client source code that can be extended to any type of server/client application. The source code uses IOCP technology, which can efficiently service multiple clients. IOCP presents an efficient solution to the “one-thread-per-client” bottleneck problem (among others), using only a few processing threads and asynchronous send/receive calls. IOCP technology is widely used for different types of high-performance servers, such as Apache. The source code also provides a set of functions that are frequently needed in communication and server/client software, such as file sending/receiving functions and logical thread pool handling. This article focuses on the practical problems that arise with the IOCP programming API and their solutions, and also gives an overview of the source code. Furthermore, a simple echo server/client that can handle multiple connections and file transfer is presented.

2.1 Introduction

This article presents a class which can be used for both client and server code. The class uses IOCP (Input Output Completion Ports) and asynchronous (non-blocking) function calls, which we explain later. The source code is based on many other sources and articles, such as [1, 2, 3].

With this simple source code, you can:

Service or connect to multiple clients and servers.
Send or receive files asynchronously.
Create and manage a logical worker thread pool to process heavier client/server requests or computations.

It is difficult to find comprehensive but simple source code to handle client/server communications. The source code that can be found on the net is either too complex (20+ classes) or not efficient enough. This source code is designed to be as simple and well documented as possible. In this article, we briefly present the IOCP technology provided by the Winsock 2.0 API, and also explain the thorny problems that arise when coding and the solutions to them.

2.2 Introduction to asynchronous Input Output Completion Ports (IOCP)

A server application is fairly meaningless if it cannot service multiple clients at the same time. Asynchronous I/O calls and multithreading are commonly used for this purpose. By definition, an asynchronous I/O call returns immediately, leaving the I/O call pending. At some point, the result of the asynchronous I/O call must be synchronized with the main thread. This can be done in different ways, each of which has its disadvantages. The synchronization can be performed by:

Using events. A signal is set as soon as the asynchronous call has been finished. The disadvantage of this approach is that the thread has to check or wait for the event to be set.
Using the GetOverlappedResult function. The approach also has the same disadvantage as the approach above.
Using asynchronous procedure calls (APCs). There are several disadvantages in using this approach as well. First, the APC is always called in the context of the calling thread, and second, in order to be able to execute APCs, the calling thread has to be suspended in a so-called alertable wait state.
Using IOCP. The disadvantage of this approach is that there are many practical thorny programming problems that must be solved. Coding IOCP can be a bit of a hassle.

2.2.1 Why use IOCP?

By using IOCP, we can overcome the “one-thread-per-client” problem. With one thread per client, performance is commonly known to decrease heavily if the software does not run on a true multiprocessor machine. Threads are system resources that are neither unlimited nor cheap.

IOCP provides a way to have a few (I/O worker) threads handle the input/output of multiple clients “fairly”. The threads are suspended and do not use CPU cycles until there is something to do.

2.3 What is IOCP?

An IOCP is essentially a thread synchronization object, similar to a semaphore, so it is not a sophisticated concept. An IOCP object can be associated with several I/O objects that support pending asynchronous I/O calls. A thread that has access to an IOCP can be suspended until a pending asynchronous I/O call is finished.

3 How does IOCP work?

For more information about this part, see other articles such as [1, 2, 3].

When working with IOCP, you have to deal with three things: associating a socket with the completion port, making the asynchronous I/O call, and synchronizing with the thread. To get the result from the asynchronous I/O call and to know, for example, which client has made the call, you have to pass two parameters around: the “CompletionKey” parameter and the “OVERLAPPED” structure.

3.1 The Completion Key parameter

The first parameter, the “CompletionKey”, is just a variable of type DWORD (declared as ULONG_PTR in current Windows SDK headers, so that it can hold a pointer on 64-bit systems). You can pass whatever unique value you want, and it will always be associated with the object. Normally, a pointer to a structure or class that contains client-specific data is passed with this parameter. In the source code, a pointer to a “ClientContext” structure is passed as the “CompletionKey” parameter.

3.2 The OVERLAPPED parameter

This parameter is commonly used to pass around the memory buffer that is used by the asynchronous I/O call. It is important to notice that this data will be locked and is not paged out of physical memory. We will discuss this later.

3.3 Associating a socket to the completion port

Once a completion port is created, the association of a socket to the completion port can be done by calling the function CreateIoCompletionPort in the following way:

BOOL IOCPS::AssociateSocketWithCompletionPort(SOCKET socket,
               HANDLE hCompletionPort, DWORD dwCompletionKey)
{
   HANDLE h = CreateIoCompletionPort((HANDLE) socket,
                 hCompletionPort, dwCompletionKey, m_nIOWorkers);
   return h == hCompletionPort;
}

3.4 Making the asynchronous I/O call

To make the actual asynchronous call, the functions WSASend, WSARecv, etc., are called. They also need a WSABUF parameter that contains a pointer to the buffer that is going to be used. As a rule of thumb, when the server/client wants to perform an I/O operation, the call is not made directly but is posted to the completion port and performed by the I/O worker threads. The reason for this is that we want the CPU cycles to be partitioned fairly. The I/O calls are made by posting a status to the completion port, see below:

BOOL bSuccess = PostQueuedCompletionStatus(m_hCompletionPort,
    pOverlapBuff->GetUsed(), (DWORD) pContext, &pOverlapBuff->m_ol);

3.5 Synchronization with the thread

Synchronization with the I/O worker threads is done by calling the GetQueuedCompletionStatus function (see below). The function also returns the CompletionKey parameter and the OVERLAPPED parameter.

BOOL GetQueuedCompletionStatus(
   HANDLE CompletionPort,       // handle to completion port
   LPDWORD lpNumberOfBytes,     // bytes transferred
   PULONG_PTR lpCompletionKey,  // file completion key
   LPOVERLAPPED *lpOverlapped,  // buffer
   DWORD dwMilliseconds         // optional timeout value
   );
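To make the role of the two parameters more concrete, here is a minimal portable sketch of what an I/O worker does with them once a completion arrives. It does not call the Windows API: FakeOverlapped, IoBuffer, CompletionPacket, ContextFromKey and BufferFromOverlapped are hypothetical names introduced for illustration only, and the sketch assumes that the overlapped block is the first member of the buffer object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the Windows OVERLAPPED structure (hypothetical, illustration only).
struct FakeOverlapped {
    std::uintptr_t internal = 0;
    std::uintptr_t internalHigh = 0;
};

// Hypothetical per-operation buffer. The overlapped block is embedded in it,
// in the spirit of the article's &pOverlapBuff->m_ol.
struct IoBuffer {
    FakeOverlapped m_ol;      // assumed to be the first member (see cast below)
    int            sequence = 0;
    char           data[64] = {};
};

// Hypothetical per-client state, passed around as the completion key.
struct ClientContext {
    int id = 0;
};

// The three values a worker receives for each completed operation.
struct CompletionPacket {
    std::uint32_t   bytes;          // number of bytes transferred
    std::uintptr_t  completionKey;  // pointer-sized key, here a ClientContext*
    FakeOverlapped* overlapped;     // identifies the buffer of this operation
};

// Turn the completion key back into the client context.
inline ClientContext* ContextFromKey(std::uintptr_t key) {
    return reinterpret_cast<ClientContext*>(key);
}

// Turn the overlapped pointer back into the buffer that contains it.
inline IoBuffer* BufferFromOverlapped(FakeOverlapped* ol) {
    static_assert(offsetof(IoBuffer, m_ol) == 0, "m_ol must be the first member");
    assert(ol != nullptr);
    return reinterpret_cast<IoBuffer*>(ol);
}
```

Using a pointer-sized integer for the key is a deliberate choice in this sketch: a 32-bit value cannot hold a pointer on a 64-bit system.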

3.6 Four thorny IOCP coding hassles and their solutions

Some problems arise from using IOCP, and some of them are not intuitive. In a multithreaded scenario using IOCPs, the control flow of a thread function is less straightforward, because there is no relationship between threads and communications. In this section, we present four different problems that can occur when developing server/client applications using IOCPs. They are:

WSAENOBUFS error problem.
The package reordering problem.
The pending read problem.
The access violation problem.

3.6.1 The WSAENOBUFS error problem

This problem is non intuitive and difficult to detect, because at first sight, it seems to be a normal deadlock or memory leakage “bug”. Assume that you have developed your server and everything runs fine. When you stress test the server, it suddenly hangs. If you are lucky, you find out that it has something to do with the WSAENOBUFS error.

With every overlapped send or receive operation, it is probable that the data buffers submitted will be locked. When memory is locked, it cannot be paged out of physical memory. The operating system imposes a limit on the amount of memory that may be locked. When this limit is reached, overlapped operations will fail with the WSAENOBUFS error.

If a server posts many overlapped receives on each connection, this limit will be reached as the number of connections grows. If a server anticipates handling a very high number of concurrent clients, the server can post a single zero-byte receive on each connection. Because there is no buffer associated with the receive operation, no memory needs to be locked. With this approach, the per-socket receive buffer should be left intact, because once the zero-byte receive operation completes, the server can simply perform a non-blocking receive to retrieve all the data buffered in the socket's receive buffer. There is no more data pending when the non-blocking receive fails with WSAEWOULDBLOCK.

This design would be for those that require the maximum possible concurrent connections while sacrificing the data throughput on each connection. Of course, the more you are aware of how the clients will be interacting with the server, the better. In the previous example, a non-blocking receive is performed once the zero-byte receive completes to retrieve the buffered data. If the server knows that clients send data in bursts, then once the zero-byte receive completes, it may post one or more overlapped receives in case the client sends a substantial amount of data (greater than the per-socket receive buffer that is 8 KB by default).

A simple practical solution to the WSAENOBUFS error problem in the source code provided is to post a NULL, zero-byte asynchronous WSARecv before every real asynchronous read call. For each client connection, the I/O completion port therefore runs a loop in which a zero-byte read completes, a real read is posted and completes, and the next zero-byte read is posted.

Alternatively, start several pending read loops and then submit a NULL, zero-byte asynchronous WSARecv loop. Because calls submitted to the completion port always return in order, the overlapped memory is always unlocked.

3.6.2 The package reordering problem

This problem has also been discussed by [3]. Although committed operations using the IO completion port will always complete in the order that they were submitted, thread scheduling issues may mean that the actual work associated with the completion is processed in an undefined order. For example, if you have two I/O worker threads and you should receive “byte chunk 1, byte chunk 2, byte chunk 3”, you may process the byte chunks in wrong order namely “byte chunk 2, byte chunk 1, byte chunk 3”. This also means that when you are sending the data by posting a send request on the I/O completion port, the data can actually be sent reordered.

This can be solved by only using one worker thread and committing only one I/O call and waiting for it to finish, but if we do this we lose all the benefits of IOCP.

A simple practical solution to this problem is to add a sequence number to our buffer class and only process the data in the buffer if the buffer sequence number is in order. This means that the buffers that have incorrect numbers have to be saved for later, and because of performance reasons, we are saving the buffers in a hash map object (e.g., m_SendBufferMap and m_ReadBufferMap).

For more information about this solution, please review the source code and take a look at the following functions in the IOCPS class:

GetNextSendBuffer(..) and GetNextReadBuffer(..) to get the ordered send or receive buffer.
IncreaseSendSequenceNumber(..) and IncreaseReadSequenceNumber(..) to increase the sequence numbers.
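The sequence-number idea can be sketched in portable C++ as follows. This is not the article's GetNextReadBuffer(..) implementation; InOrderQueue and its members are hypothetical names, payloads are plain strings instead of CIOCPBuffer objects, and sequence numbers are assumed to start at 0. The sketch is not thread safe by itself, so a real server would guard it with a per-client lock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: releases buffers strictly in sequence-number order.
class InOrderQueue {
public:
    // Stores a completed buffer and returns every buffer that can now be
    // processed in order (possibly none, possibly several).
    std::vector<std::string> Push(std::uint32_t sequence, std::string payload) {
        pending_.emplace(sequence, std::move(payload));
        std::vector<std::string> ready;
        // Drain the map for as long as the next expected number is present.
        for (auto it = pending_.find(next_); it != pending_.end();
             it = pending_.find(next_)) {
            ready.push_back(std::move(it->second));
            pending_.erase(it);
            ++next_;
        }
        return ready;
    }

    // Number of buffers that arrived early and are saved for later.
    std::size_t PendingCount() const { return pending_.size(); }

private:
    std::uint32_t next_ = 0;  // next sequence number to deliver
    std::unordered_map<std::uint32_t, std::string> pending_;
};
```

A worker thread that finishes "byte chunk 2" before "byte chunk 1" gets an empty result from Push and simply returns to the completion port; the thread that delivers the missing chunk receives both chunks in the right order.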

3.6.3 Asynchronous pending reads and the byte chunk package processing problem

Most common server protocols are packet based: the first X bytes are a header, and the header contains the length of the complete packet. The server can read the header, work out how much more data is required, and keep reading until it has a complete packet. This works fine when the server makes one asynchronous read call at a time. But if we want to use the full potential of the IOCP server, we should have several pending asynchronous reads waiting for data to arrive. This means that several asynchronous reads can complete out of order (as discussed in section 3.6.2), and the byte chunk streams returned by pending reads will not be processed in order. Furthermore, a byte chunk stream can contain one or several packages and also partial packages, as shown in figure 1.

Figure 1. The figure shows how partial packages (green) and complete packages (yellow) can arrive asynchronously in different byte chunk streams (marked 1, 2 ,3).

This means that we have to process the byte chunk streams in sequence to successfully read a complete package. Furthermore, we have to handle partial packages (marked green in figure 1), which makes the byte chunk package processing more difficult. The full solution to this problem can be found in the ProcessPackage(..) function in the IOCPS class.
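As an illustration of the partial-package handling, here is a minimal portable sketch. It is not the article's ProcessPackage(..) function; PackageAssembler, Frame and kHeaderSize are hypothetical names, and the header is assumed to be a 4-byte little-endian payload length. The chunks are assumed to be fed in sequence, which is what the ordering solution of section 3.6.2 provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed header layout: a 4-byte little-endian payload length.
constexpr std::size_t kHeaderSize = 4;

// Builds one package (header + payload); used here to produce test data.
std::string Frame(const std::string& payload) {
    const std::uint32_t length = static_cast<std::uint32_t>(payload.size());
    std::string out;
    for (std::size_t i = 0; i < kHeaderSize; ++i) {
        out.push_back(static_cast<char>((length >> (8 * i)) & 0xFFu));
    }
    out += payload;
    return out;
}

// Hypothetical helper: turns in-order byte chunks into complete packages.
class PackageAssembler {
public:
    // Appends one byte chunk and returns the payload of every package that
    // is now complete. Leftover bytes are kept for the next call.
    std::vector<std::string> Feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> packages;
        std::size_t offset = 0;
        while (buffer_.size() - offset >= kHeaderSize) {
            std::uint32_t length = 0;
            for (std::size_t i = 0; i < kHeaderSize; ++i) {
                const unsigned char byte =
                    static_cast<unsigned char>(buffer_[offset + i]);
                length |= static_cast<std::uint32_t>(byte) << (8 * i);
            }
            if (buffer_.size() - offset - kHeaderSize < length) {
                break;  // partial package: wait for more data
            }
            packages.push_back(buffer_.substr(offset + kHeaderSize, length));
            offset += kHeaderSize + length;
        }
        buffer_.erase(0, offset);  // drop the bytes that were consumed
        return packages;
    }

    // Bytes of an incomplete package that are still waiting.
    std::size_t BufferedBytes() const { return buffer_.size(); }

private:
    std::string buffer_;
};
```

A single chunk can therefore yield zero, one or several packages, which matches the situation shown in figure 1.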

3.6.4 The access violation problem

This is a minor problem and is a result of the design of the code, rather than an IOCP-specific problem. Suppose that a client connection is lost and an I/O call returns with an error flag, so that we know that the client is gone. In the parameter “CompletionKey”, we pass a pointer to a structure “ClientContext” that contains client-specific data. What happens if we free the memory occupied by this “ClientContext” structure, and then some other I/O call made for the same client returns with an error code, and we cast the “CompletionKey” DWORD to a pointer to “ClientContext” and try to access or delete it? An access violation occurs!

The solution to this problem is to add a counter to the structure that holds the number of pending I/O calls (m_nNumberOfPendlingIO), and to delete the structure only when we know that there are no more pending I/O calls. This is done by the EnterIoLoop(..) and ReleaseClientContext(..) functions.
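The counting scheme can be sketched in portable C++ as below. This is not the code of EnterIoLoop(..) or ReleaseClientContext(..); EnterIo, ExitIo and the pendingIo member are hypothetical names. The sketch assumes that a handler calls EnterIo for any follow-up operation before it calls ExitIo for the completed one, so that the counter only reaches zero when nothing else refers to the context.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical per-client state with a counter of pending I/O calls.
struct ClientContext {
    std::atomic<int> pendingIo{0};
    int socketId = -1;
};

// Call before posting an asynchronous operation for this client.
inline void EnterIo(ClientContext& context) {
    context.pendingIo.fetch_add(1, std::memory_order_relaxed);
}

// Call once for every completion (successful or failed). The context is
// deleted only by the call that sees the counter drop to zero.
// Returns true if the context was deleted and must no longer be used.
inline bool ExitIo(ClientContext* context) {
    assert(context != nullptr);
    if (context->pendingIo.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        delete context;
        return true;
    }
    return false;
}
```

With two reads pending on a lost connection, the first failed completion only decrements the counter; the second one frees the structure, so no completion ever touches freed memory.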

3.7 Overview of the source code

The goal of the source code is to provide a set of simple classes that handle all the troublesome code related to IOCP. The source code also provides a set of functions that are frequently needed in communication and server/client software, such as file sending/receiving functions, logical thread pool handling, etc.

Figure 2. The figure above illustrates the overview of the IOCP class source code functionality.

Several I/O worker threads handle asynchronous I/O calls through the completion port (IOCP). These workers call virtual functions, which can put requests that need a large amount of computation in a work queue. The logical workers take a job from the queue, process it, and send back the result by using functions provided by the class. The Graphical User Interface (GUI) usually communicates with the main class through Windows messages (because MFC is not thread safe) and by calling functions or using shared variables.
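The work queue between the I/O workers and the logical workers can be sketched as a small locked queue. This is a portable illustration only, not the IOCPS implementation: JobQueue and the fields of JobItem are assumptions, while the AddJob/GetJob behaviour (GetJob returns a null pointer when the queue is empty) follows the description in section 4.1.3.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical job description; the real JobItem may hold different fields.
struct JobItem {
    int command = 0;
    std::string data;
};

// Minimal thread-safe FIFO queue shared by I/O workers and logical workers.
class JobQueue {
public:
    // Called by an I/O worker to hand over a heavy request.
    void AddJob(std::unique_ptr<JobItem> job) {
        assert(job != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        jobs_.push_back(std::move(job));
    }

    // Called by a logical worker. Returns nullptr if there are no jobs.
    std::unique_ptr<JobItem> GetJob() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (jobs_.empty()) {
            return nullptr;
        }
        std::unique_ptr<JobItem> job = std::move(jobs_.front());
        jobs_.pop_front();
        return job;
    }

private:
    std::mutex mutex_;
    std::deque<std::unique_ptr<JobItem>> jobs_;
};
```

Keeping the queue this small means the I/O workers only pay for a lock and a pointer move, and go straight back to the completion port.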

Figure 3. The figure above shows the class overview.

The classes that can be observed in figure 3 are:

CIOCPBuffer: A class used to manage the buffers used by the asynchronous I/O calls.
IOCPS: The main class that handles all the communication.
JobItem: A structure which contains the job to be performed by logical worker threads.
ClientContext: A structure that holds client specific information (status, data, etc.).

3.7.1 The buffer design – The CIOCPBuffer class

When using asynchronous I/O calls, we have to provide a private buffer to be used with the I/O operation. There are some considerations to take into account when allocating these buffers:

We should not allocate buffers in the memory heap and we should use “VirtualAlloc” function to allocate memory in a virtual memory page instead of the heap.

Allocating and freeing memory is expensive, so we should reuse buffers (memory) that have already been allocated. For this purpose, we save buffers in the linked list structures below:

// Free Buffer List..
   CRWCriticalSection m_FreeBufferListLock;
   CPtrList m_FreeBufferList;
// OccupiedBuffer List.. (Buffers that are currently used)
   CRWCriticalSection m_BufferListLock;
   CPtrList m_BufferList;
// Now we use the function AllocateBuffer(..)
// to allocate memory or reuse a buffer.

Sometimes when an asynchronous I/O call completes, we may have partial packages in the buffer, so we need to “split” the buffer to get a complete message.
This is done by the SplitBuffer function in the IOCPS class. Also, sometimes we need to copy information between buffers, and this is done by the AddAndFlush(..) function in the IOCPS class.
As we know, we also need to add a sequence number and a state (IOType variable IOZeroReadCompleted, etc.) to our buffer.
We need also methods to convert data to byte stream and byte stream to data, some of these functions are also provided in the CIOCPBuffer class.

All the solutions to the things we have discussed above exist in the CIOCPBuffer class.
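The buffer reuse idea can be sketched in portable C++ as follows. This is not the AllocateBuffer(..) implementation: BufferPool, Buffer and the 4096-byte size are assumptions, the buffers come from the normal heap rather than VirtualAlloc, and the limit on saved buffers is simplified to a plain count (the special values -1 and 0 of iMaxNumberOfFreeBuffer are not modelled).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical I/O buffer with a fixed capacity (size chosen for illustration).
struct Buffer {
    std::vector<char> data;
    Buffer() : data(4096) {}
};

// Keeps released buffers in a free list so that they can be reused.
class BufferPool {
public:
    explicit BufferPool(std::size_t maxFree) : maxFree_(maxFree) {}

    // Reuses a saved buffer if one is available, otherwise allocates a new one.
    std::unique_ptr<Buffer> Allocate() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!free_.empty()) {
            std::unique_ptr<Buffer> buffer = std::move(free_.back());
            free_.pop_back();
            return buffer;
        }
        return std::unique_ptr<Buffer>(new Buffer());
    }

    // Saves the buffer for reuse, or frees it if the free list is full.
    void Release(std::unique_ptr<Buffer> buffer) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.size() < maxFree_) {
            free_.push_back(std::move(buffer));
        }
        // Otherwise the buffer is destroyed when it goes out of scope here.
    }

    // Number of buffers currently saved for reuse.
    std::size_t FreeCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    const std::size_t maxFree_;
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<Buffer>> free_;
};
```

Capping the free list bounds the memory that an idle server keeps around after a burst of connections.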

3.8 How to use the source code?

By deriving your own class from IOCPS (shown in figure 3) and using the virtual functions and the functionality provided by the IOCPS class (e.g., the thread pool), it is possible to implement any type of server or client that can efficiently manage a huge number of connections using only a few threads.

3.8.1 Starting and closing the server/client

To start the server, call the function:

BOOL Start(int nPort=999, int iMaxNumConnections=1201, int iMaxIOWorkers=1,
   int nOfWorkers=1, int iMaxNumberOfFreeBuffer=0, int iMaxNumberOfFreeContext=0,
   BOOL bOrderedSend=TRUE, BOOL bOrderedRead=TRUE, int iNumberOfPendlingReads=4);

Parameter	Description
`nPort`	Is the port number that the server will listen on. (Let it be -1 for client mode.)
`iMaxNumConnections`	Maximum number of connections allowed (use a big prime number).
`iMaxIOWorkers`	Number of Input Output Worker threads
`nOfWorkers`	Number of logical Workers. (Can be changed in runtime.)
`iMaxNumberOfFreeBuffer`	Maximum number of buffers that we save for reuse. (-1 for none. 0=Infinite number.)
`iMaxNumberOfFreeContext`	Maximum number of Client information objects that are saved for reuse. (-1 for none. 0=Infinite number.)
`bOrderedRead`	Make sequential reads (we discussed this before in section 3.6.2).
`bOrderedSend`	Make sequential writes (we discussed this before in section 3.6.2).
`iNumberOfPendlingReads`	Number of pending asynchronous read loops that are waiting for data.

To connect to a remote server (client mode, nPort = -1), call the function:

Connect(const CString &strIPAddr, int nPort)

Parameter	Description
`strIPAddr`	The IP address of the remote server.
`nPort`	The port

To close the server, call the function: ShutDown().

Example:

MyIOCP m_iocp;
if(!m_iocp.Start(-1,1210,2,1,0,0))
  AfxMessageBox("Error could not start the Client");
….
m_iocp.ShutDown();

4 Source code description

For more details about the source code, please check the comments in the source code.

4.1.1 Virtual Functions

NotifyNewConnection
Called when a new connection has been established.
NotifyNewClientContext
Called when an empty ClientContext structure is allocated.
NotifyDisconnectedClient
Called when a client disconnects.
ProcessJob
Called when logical workers want to process a Job.
NotifyReceivedPackage
Notifies that a new package has arrived.
NotifyFileCompleted
Notifies that a file transfer is finished.

4.1.2 Important variables

Note that every shared variable must be exclusively locked by the function that uses it; this is important to avoid access violations and overlapping writes. Every variable XXX that needs to be locked has a corresponding XXXLock variable.

m_ContextMapLock
```
ContextMap m_ContextMap;
```
Holds all the client data (socket, client data, etc.).
m_NumberOfActiveConnections
Holds the number of connected connections.

4.1.3 Important functions

GetNumberOfConnections()
Returns the number of connections.
CString GetHostAdress(ClientContext* p)
Returns the host address, given a client context.
BOOL ASendToAll(CIOCPBuffer *pBuff);
Sends the content of the buffer to all the connected clients.
DisconnectClient(CString sID)
Disconnect a client given the unique identification number.
CString GetHostIP()
Return the local IP number.
JobItem* GetJob()
Removes a JobItem from the queue, returns NULL if there are no Jobs.
BOOL AddJob(JobItem *pJob)
Adds a Job to the queue.
BOOL SetWorkers(int nThreads)
Sets the number of logical workers. Can be called at any time.
DisconnectAll();
Disconnect all the clients.
ARead(…)
Makes an asynchronous read.
ASend(…)
Makes an asynchronous send. Sends data to a client.
ClientContext* FindClient(CString strClient)
Finds a client given a string ID. Note: not thread safe!
DisconnectClient(ClientContext* pContext, BOOL bGraceful=FALSE);
Disconnects a client.
DisconnectAll()
Disconnects all the connected clients.
StartSendFile(ClientContext *pContext)
Sends a file specified in the ClientContext structure, using the optimized TransmitFile(..) function.
PrepareReceiveFile(..)
Prepares the connection for receiving a file. When you call this function, all incoming byte streams are written to a file.
PrepareSendFile(..)
Opens a file and sends a package containing information about the file to the remote connection. The function also disables ASend(..) function until the file is transmitted or aborted.
DisableSendFile(..)
Disables send file mode.
DisableRecevideFile(..)
Disables receive file mode.

5 File transfer

The file transfer is done by using the Winsock 2.0 TransmitFile function. The TransmitFile function transmits file data over a connected socket handle. This function uses the operating system's cache manager to retrieve the file data, and provides high-performance file data transfer over sockets. There are some important aspects of asynchronous file transfer:

While the TransmitFile function has not returned, no other sends or writes to the socket should be performed because this will corrupt the file. Therefore, all the calls to ASend will be disabled after the PrepareSendFile(..) function.
Since the operating system reads the file data sequentially, you can improve caching performance by opening the file handle with FILE_FLAG_SEQUENTIAL_SCAN.
We use kernel asynchronous procedure calls when sending the file (TF_USE_KERNEL_APC). Use of TF_USE_KERNEL_APC can deliver significant performance benefits. It is possible (though unlikely), however, that the thread in whose context TransmitFile is initiated is being used for heavy computations; this situation may prevent APCs from launching.

The file transfer is made in this order: the server initiates the file transfer by calling the PrepareSendFile(..) function. When the client receives the information about the file, it prepares for it by calling PrepareReceiveFile(..), and sends a package to the server to start the file transfer. When the package arrives on the server side, the server calls the StartSendFile(..) function, which uses the high-performance TransmitFile function to transmit the specified file.

6 The source code example

The provided source code example is an echo server/client that also supports file transmission (figure 4). In the source code, a class MyIOCP derived from IOCPS handles the communication between the client and the server by using the virtual functions mentioned in section 4.1.1.

The most important part of the client or server code is the virtual function NotifyReceivedPackage, as described below:

void MyIOCP::NotifyReceivedPackage(CIOCPBuffer *pOverlapBuff,
                           int nSize, ClientContext *pContext)
{
   // The package type selects the handler for the incoming package.
   BYTE PackageType=pOverlapBuff->GetPackageType();
   switch (PackageType)
   {
     case Job_SendText2Client :
       Packagetext(pOverlapBuff,nSize,pContext);
       break;
     case Job_SendFileInfo :
       PackageFileTransfer(pOverlapBuff,nSize,pContext);
       break;
     case Job_StartFileTransfer:
       PackageStartFileTransfer(pOverlapBuff,nSize,pContext);
       break;
     case Job_AbortFileTransfer:
       DisableSendFile(pContext);
       break;
   };
}

The function handles an incoming message and performs the request sent by the remote connection. In this case, it is only a matter of a simple echo or file transfer. The source code is divided into two projects, IOCP and IOCPClient, which are the server and the client side of the connection.

6.1 Compiler issues

When compiling with VC++ 6.0, you may get some strange errors dealing with the CFile class, such as:

if (pContext->m_File.m_hFile != INVALID_HANDLE_VALUE)   <-error C2446: '!=' : no conversion from 'void *' to 'unsigned int'

This problem can be solved by updating the header files (*.h) or your VC++ 6.0 version. The source code compiles well under .NET or VC++ 7.0.

7 Future work

In the future, the source code should be updated to use the AcceptEx(..) function to accept new connections.

8 References

[1] “Developing a Truly Scalable Winsock Server using IO Completion Ports”, norm.net, 22/03/2005.
[2] “Windows Sockets 2.0: Write Scalable Winsock Apps Using Completion Ports”, Anthony Jones & Amol Deshpande, 22/02/2005.
[3] “A reusable, high performance, socket server class - Part 1-6”, Len Holgate, JetByte Limited.
