Wepoll source code analysis
For GSoC’s tokio project.
Fundamental data structures
queue
A simple doubly linked queue. By default its head is a dummy (sentinel) node, which removes the empty-queue special cases from the code.
typedef struct queue_node {
Wepoll provides functions that add a node to the front or the back of an existing queue in O(1) time.
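To make the sentinel idea concrete, here is a minimal sketch of an intrusive doubly linked queue with a dummy head node. The names (qnode_t, queue_sketch_t and the queue_sketch_* functions) are made up for this illustration and are not wepoll’s identifiers; the point is only that every insert is the same four pointer writes, with no branch for an empty queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not wepoll's actual definitions. */
typedef struct qnode {
  struct qnode* prev;
  struct qnode* next;
} qnode_t;

typedef struct {
  qnode_t head; /* dummy node: never holds data, always present */
} queue_sketch_t;

void queue_sketch_init(queue_sketch_t* q) {
  /* An empty queue is a head that points at itself. */
  q->head.prev = &q->head;
  q->head.next = &q->head;
}

bool queue_sketch_empty(const queue_sketch_t* q) {
  return q->head.next == &q->head;
}

/* Link `node` between two adjacent nodes: four pointer writes, O(1). */
static void insert_between(qnode_t* node, qnode_t* before, qnode_t* after) {
  node->prev = before;
  node->next = after;
  before->next = node;
  after->prev = node;
}

void queue_sketch_prepend(queue_sketch_t* q, qnode_t* node) {
  insert_between(node, &q->head, q->head.next);
}

void queue_sketch_append(queue_sketch_t* q, qnode_t* node) {
  insert_between(node, q->head.prev, &q->head);
}

void queue_sketch_remove(qnode_t* node) {
  assert(node->prev != NULL && node->next != NULL);
  node->prev->next = node->next;
  node->next->prev = node->prev;
  /* Leave the node self-linked so a second remove is harmless. */
  node->prev = node;
  node->next = node;
}

qnode_t* queue_sketch_first(const queue_sketch_t* q) {
  return queue_sketch_empty(q) ? NULL : q->head.next;
}
```

Because the nodes are intrusive, a larger struct can embed a qnode_t and be queued without any extra allocation.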
ts_tree
Wepoll implements a red-black tree and provides a thread-safe version of it. The thread-safe tree manages the port_state objects, each of which belongs to one IOCP port. A ts_tree_node is embedded in every port_state and stored in a local ts_tree called epoll__handle_tree. This tree tracks all IOCP ports and gives access to the associated port_state, looked up by the HANDLE returned from CreateIoCompletionPort().
typedef struct ts_tree {
The ts_tree is guarded by an SRWLOCK that controls reads and writes of the tree, and each ts_tree_node carries a reflock that controls its lifetime. Under normal operation, threads increase and decrease the reference count, which are wait-free operations. The reflock normally prevents a chunk of memory from being freed, but does allow the chunk of memory to eventually be released in a coordinated fashion.
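The reference-counting half of that design can be sketched with C11 atomics. This is a simplified illustration, not wepoll’s reflock: the type, function names and bit layout are assumptions made for the example, and the blocking wait that lets the destroying thread sleep until the last reference is dropped is left out. Here the destroy call only reports whether other references are still outstanding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: low bits count references, one high bit marks
 * "destruction has started". */
#define SKETCH_REF      0x00000001L
#define SKETCH_REF_MASK 0x0fffffffL
#define SKETCH_DESTROY  0x10000000L

typedef struct {
  atomic_long state;
} reflock_sketch_t;

void reflock_sketch_init(reflock_sketch_t* r) {
  atomic_init(&r->state, 0);
}

/* Taking a reference is a single atomic add: wait-free. */
void reflock_sketch_ref(reflock_sketch_t* r) {
  long old = atomic_fetch_add(&r->state, SKETCH_REF);
  /* Nobody may take a new reference once destruction has begun. */
  assert((old & SKETCH_DESTROY) == 0);
}

/* Dropping a reference is a single atomic sub: wait-free.
 * Returns true when this was the last reference of a lock that is being
 * destroyed, i.e. the caller is the one who should wake the destroyer. */
bool reflock_sketch_unref(reflock_sketch_t* r) {
  long now = atomic_fetch_sub(&r->state, SKETCH_REF) - SKETCH_REF;
  return now == SKETCH_DESTROY;
}

/* Drops the caller's own reference and sets the destroy bit in one step.
 * Returns true if no other references remain, so the memory may be freed
 * right away; false means the destroyer has to wait for the last unref. */
bool reflock_sketch_unref_and_destroy(reflock_sketch_t* r) {
  long delta = SKETCH_DESTROY - SKETCH_REF;
  long now = atomic_fetch_add(&r->state, delta) + delta;
  return (now & SKETCH_REF_MASK) == 0;
}
```

A real implementation needs some way to block in the "must wait" case; this sketch only shows why the common ref and unref paths never have to take a lock.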
epoll_create()
epoll_create() and epoll_create1() only validate their arguments (size must be positive, flags must be zero) and otherwise ignore them; both call epoll__create(). epoll__create() creates a new port by calling CreateIoCompletionPort, initializes a port_state struct, and adds the new port_state to epoll__handle_tree.
port_state_t* port_new(HANDLE* iocp_out) {
The port_new function takes care of port_state initialization.
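The create path is a three-step construction (open the port, allocate the state, register it) that has to undo the earlier steps if a later one fails. The portable sketch below models only that shape. Everything in it is a stand-in: an int plays the role of the Windows HANDLE, open_port_stub() replaces CreateIoCompletionPort(), and a linked list replaces epoll__handle_tree, with no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef int handle_sketch_t;          /* stand-in for a Windows HANDLE */
#define INVALID_HANDLE_SKETCH (-1)

typedef struct port_sketch {
  handle_sketch_t iocp;
  struct port_sketch* next;           /* stand-in for the embedded tree node */
} port_sketch_t;

static port_sketch_t* registry_head = NULL; /* stand-in for the handle tree */
static handle_sketch_t next_handle = 1;

/* Hypothetical stubs; real code would talk to the operating system here. */
static handle_sketch_t open_port_stub(void) { return next_handle++; }
static void close_port_stub(handle_sketch_t h) { (void) h; }

/* Look up the state that belongs to a handle, or NULL if unknown. */
port_sketch_t* registry_find(handle_sketch_t h) {
  for (port_sketch_t* it = registry_head; it != NULL; it = it->next)
    if (it->iocp == h)
      return it;
  return NULL;
}

/* Register a state under its handle; fails on a duplicate key. */
static bool registry_add(port_sketch_t* p) {
  assert(p != NULL);
  if (registry_find(p->iocp) != NULL)
    return false;
  p->next = registry_head;
  registry_head = p;
  return true;
}

handle_sketch_t epoll_create_sketch(void) {
  port_sketch_t* p;
  handle_sketch_t h = open_port_stub();        /* step 1: open the port */
  if (h == INVALID_HANDLE_SKETCH)
    return INVALID_HANDLE_SKETCH;

  p = malloc(sizeof *p);                       /* step 2: allocate state */
  if (p == NULL)
    goto err_close;
  p->iocp = h;
  p->next = NULL;

  if (!registry_add(p))                        /* step 3: register it */
    goto err_free;
  return h;

err_free:
  free(p);
err_close:
  close_port_stub(h);
  return INVALID_HANDLE_SKETCH;
}
```

Returning the handle rather than the state pointer is what makes the later lookups necessary: every other call starts by mapping the handle back to its state.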
epoll_ctl()
First, epoll_ctl finds the port_state specified by the HANDLE and increases the reference count associated with it.
Second, it enters the critical section and completes the work according to op:
- EPOLL_CTL_ADD 
 port__ctl_add creates a new sock_state, gets the ws_base_socket and poll_group for it, then adds it to the port_state’s tree so that it can be managed.
- EPOLL_CTL_MOD 
 port__ctl_mod looks up the associated sock_state, sets the events on it, and adds it to the sock_update_queue to wait for an update. It then calls port__update_events_if_polling…
- EPOLL_CTL_DEL 
 port__ctl_del looks up the sock_state in the port_state by SOCKET and deletes it. If this socket’s polling request is still pending, it is cancelled:
/* CancelIoEx() may fail with ERROR_NOT_FOUND if the overlapped operation has
The sock_state is then removed from the port_state’s update queue and from the tree that manages the sock_state objects. If the poll request still needs to complete, the sock_state object can’t be free()d yet, so it is added to the port_state’s deleted socket queue.
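The delete path described above comes down to one decision: free now, or park the object until its cancelled poll request completes. The sketch below models that decision with plain flags instead of real queues, trees and free() calls. All names are hypothetical, and the cancel step is only a state change where real code would call CancelIoEx().

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { POLL_IDLE, POLL_PENDING, POLL_CANCELLED } poll_status_t;

/* Hypothetical model of a socket state; flags replace real containers. */
typedef struct {
  poll_status_t poll_status;
  bool registered;        /* still in the port's socket tree */
  bool in_update_queue;
  bool in_deleted_queue;
  bool freed;             /* stands in for the actual free() */
} del_sock_t;

/* Returns true if the object was released immediately. */
bool sock_delete_sketch(del_sock_t* s, bool force) {
  assert(!s->freed);
  if (s->registered) {
    /* A poll request is in flight: ask for cancellation first. */
    if (s->poll_status == POLL_PENDING)
      s->poll_status = POLL_CANCELLED;
    /* Nobody can find or update the socket from now on. */
    s->in_update_queue = false;
    s->registered = false;
  }
  if (force || s->poll_status == POLL_IDLE) {
    s->in_deleted_queue = false;
    s->freed = true;
    return true;
  }
  /* The kernel may still write into memory owned by this object, so keep
   * it alive until the completion for the cancelled request arrives. */
  s->in_deleted_queue = true;
  return false;
}

/* Called when the completion for this socket's poll request is dequeued.
 * Returns true if the parked object was released as a result. */
bool sock_on_completion_sketch(del_sock_t* s) {
  s->poll_status = POLL_IDLE;
  if (s->in_deleted_queue)
    return sock_delete_sketch(s, false);
  return false;
}
```

The force flag covers teardown of the whole port, where waiting for individual completions is no longer an option.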
epoll_wait()
First, epoll_wait finds the port_state specified by the HANDLE and increases the reference count associated with it.
Second, it chooses the appropriate timeout and the location for storing the iocp_events (on the stack or on the heap). Then the loop begins: it dequeues completion packets until either at least one interesting event has been discovered or the timeout is reached.
In detail, it updates every sock_state in the port_state’s update_queue, and then waits, by calling GetQueuedCompletionStatusEx, for pending I/O operations associated with the completion port to complete. The IOCP events returned are stored in the location chosen earlier. Each event is then processed by calling sock_feed_event:
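Both setup decisions are small enough to sketch in portable C. The helper names, the event_sketch_t type and the WAIT_FOREVER_SKETCH constant are assumptions for this illustration (the constant plays the role that INFINITE plays on Windows), and the fallback to the stack buffer when malloc fails is one reasonable policy rather than a statement about wepoll’s exact behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define WAIT_FOREVER_SKETCH UINT32_MAX /* stand-in for INFINITE */

typedef struct { int placeholder; } event_sketch_t; /* stand-in entry type */

/* Map an epoll-style timeout (ms; negative means "block forever") to the
 * timeout used for the first dequeue call. */
uint32_t initial_wait_ms(int timeout) {
  if (timeout > 0)
    return (uint32_t) timeout;
  if (timeout == 0)
    return 0;
  return WAIT_FOREVER_SKETCH;
}

/* After a wake-up that produced nothing interesting, compute how long the
 * next dequeue call may block. Returns 0 once the deadline has passed. */
uint32_t remaining_wait_ms(uint64_t now_ms, uint64_t due_ms) {
  return now_ms >= due_ms ? 0 : (uint32_t) (due_ms - now_ms);
}

/* Use the caller's stack array when it is large enough, otherwise try the
 * heap. If allocation fails, fall back to the stack array and shrink
 * *maxevents to fit. The caller frees the result only if it differs from
 * stack_buf. */
event_sketch_t* pick_event_buffer(event_sketch_t* stack_buf,
                                  size_t stack_count,
                                  size_t* maxevents) {
  event_sketch_t* heap_buf;
  assert(*maxevents > 0);
  if (*maxevents <= stack_count)
    return stack_buf;
  heap_buf = malloc(*maxevents * sizeof *heap_buf);
  if (heap_buf == NULL) {
    *maxevents = stack_count;
    return stack_buf;
  }
  return heap_buf;
}
```

Recomputing the remaining time from a fixed deadline keeps the total wait bounded even when the loop wakes up several times without finding an event worth reporting.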
int sock_feed_event(port_state_t* port_state,
If EPOLLONESHOT is set, all event flags are cleared.
Finally, if a poll is still in progress, the events in the port_state’s update_queue are updated.
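The filtering and one-shot behaviour of the event-feeding step can be shown in isolation. This sketch assumes a bitmask representation with illustrative flag values (the SK_* constants and feed_sock_t are made up for the example) and leaves out everything else the real function has to do, such as translating the completion status and requeueing the socket for another poll.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values chosen for this sketch. */
#define SK_IN      (1u << 0)
#define SK_OUT     (1u << 2)
#define SK_ERR     (1u << 3)
#define SK_ONESHOT (1u << 31)

typedef struct {
  uint32_t user_events; /* what the user asked to be told about */
} feed_sock_t;

/* `completed` holds the events a finished poll request reported.
 * Returns 1 and fills *reported when there is something to hand back to
 * the caller of the wait function, 0 otherwise. */
int feed_event_sketch(feed_sock_t* s, uint32_t completed, uint32_t* reported) {
  uint32_t events;
  assert(reported != NULL);

  /* Drop events the user did not subscribe to. */
  events = completed & s->user_events;
  if (events == 0)
    return 0;

  /* One-shot: after the first report, stop monitoring every event until
   * the user re-arms the socket with a modify call. */
  if (s->user_events & SK_ONESHOT)
    s->user_events = 0;

  *reported = events;
  return 1;
}
```

Note the order: the filter runs before the one-shot reset, so an uninteresting completion does not consume the one-shot subscription.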
epoll_close()
First, epoll_close finds the port_state specified by the HANDLE and increases the reference count associated with it, then closes the IOCP port that belongs to it.
Then it force-deletes every sock_state in both the sock_tree and the sock_deleted_queue. It also deletes all poll_group objects, together with the AFD handles they hold.