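Three paths every Java programmer must take to advance: databases, the virtual machine, and asynchronous communication.
Preface
Continuing my from-scratch series on high-performance topics, this time it is asynchronous communication's turn. The field is a little hard to get into: you need to understand the five UNIX I/O models and the TCP protocol, and be fluent with the three major asynchronous communication frameworks: Netty, NodeJS, and Tornado. The frameworks that currently advertise themselves as asynchronous do not actually use the asynchronous I/O model; on Linux they are built on epoll, which belongs to the I/O multiplexing model. Because Python provides a friendly wrapper around the Linux kernel API, I chose Python for learning I/O multiplexing.
I/O multiplexing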
- select
Take an EchoServer as an example: whatever the client sends, the server returns unchanged.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
'''
Created on Feb 16, 2016
@author: mountain
'''
import socket
import select
try:
    from queue import Queue  # Python 3
except ImportError:
    from Queue import Queue  # Python 2
# AF_INET selects IPv4; for the newer IPv6, use AF_INET6.
# SOCK_STREAM selects stream-oriented TCP; for datagram-oriented UDP, use SOCK_DGRAM.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(False)
# IP and port to listen on
server_address = ('localhost', 1234)
server.bind(server_address)
# Set the backlog to 5. A client connects to the server, and once the server
# accepts, a long-lived connection is established. The backlog is the number of
# connections allowed to queue up waiting for accept; beyond it, the server
# refuses new connections.
server.listen(5)
# Sockets registered for read events
inputs = [server]
# Sockets registered for write events
outputs = []
# Sockets registered for exceptional conditions (left empty in this example)
exceptions = []
# Each socket has its own queue of outgoing messages
msg_queues = {}
print("server is listening on %s:%s." % server_address)
while inputs:
    # select takes an optional fourth argument, a timeout: if no event arrives
    # within n seconds, select returns and the code below runs anyway.
    readable, writable, exceptional = select.select(inputs, outputs, exceptions)
    for sock in readable:
        # A client connect is also a read event on the listening socket;
        # the socket returned by accept is added to the read list.
        if sock is server:
            conn, addr = sock.accept()
            conn.setblocking(False)
            inputs.append(conn)
            msg_queues[conn] = Queue()
            print("server accepts a conn.")
        else:
            # Read the client's data, at most 1 KB at a time.
            data = sock.recv(1024)
            # Echo whatever was received back to the client.
            if data:
                msg_queues[sock].put(data)
                if sock not in outputs:
                    # The next select will report a write event. Write events differ
                    # from read events: they fire whenever the socket is writable,
                    # whether or not there is anything to write.
                    outputs.append(sock)
            else:
                # An empty read means the client has closed the connection.
                print("server closes a conn.")
                if sock in outputs:
                    outputs.remove(sock)
                inputs.remove(sock)
                sock.close()
                del msg_queues[sock]
    for sock in writable:
        # Skip sockets that were closed in the read pass above.
        if sock not in msg_queues:
            continue
        if not msg_queues[sock].empty():
            sock.send(msg_queues[sock].get_nowait())
        if msg_queues[sock].empty():
            outputs.remove(sock)
    for sock in exceptional:
        inputs.remove(sock)
        if sock in outputs:
            outputs.remove(sock)
        sock.close()
        del msg_queues[sock]
[mountain@king ~/workspace/wire]$ telnet localhost 1234
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
1
1
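The server's comment about write events (a socket is reported writable whenever it can be written to, not only when there is something to send) can be checked with a small experiment. The following is a minimal sketch, assuming Python 3 and a platform where `socket.socketpair()` is available; `probe_readiness` is a name made up for this illustration and is not part of the server above.

```python
import select
import socket


def probe_readiness():
    """Return ((readable, writable) before the peer sends, the same after)."""
    a, b = socket.socketpair()
    try:
        # A timeout of 0 turns select into a non-blocking readiness check.
        r1, w1, _ = select.select([a], [a], [], 0)
        before = (a in r1, a in w1)

        b.send(b"ping")

        # Wait up to one second for the data to become readable on `a`.
        r2, _, _ = select.select([a], [], [], 1)
        _, w2, _ = select.select([], [a], [], 0)
        after = (a in r2, a in w2)
        return before, after
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(probe_readiness())
```

This is why the server removes a socket from `outputs` as soon as its queue is empty: left registered, an idle connection would make every select call return immediately.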
select has three drawbacks:
1. Every call to select copies the fd set from user space to kernel space, which is expensive when there are many fds.
2. On every call to select, the kernel has to scan all the fds passed in, which is also expensive when there are many fds. The Python example hides this, because Python's select API is friendlier and returns the lists of ready sockets directly. The Linux select API only returns the number of ready fds, so the caller then has to test the fd sets itself to find out which ones are ready:
   int select (int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
3. The number of fds is limited, 1024 by default.
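The second drawback is easier to see when Python's convenience is stripped away. The sketch below imitates the C contract, where the call yields only a count and the caller tests every monitored descriptor (the role `FD_ISSET` plays in C). It assumes Python 3; `c_style_select`, `find_ready`, and `demo` are hypothetical helpers written for this illustration, not real library functions.

```python
import select
import socket


def c_style_select(read_socks, timeout=1):
    """Hypothetical wrapper: return a ready count plus a set to test against."""
    readable, _, _ = select.select(read_socks, [], [], timeout)
    return len(readable), set(readable)


def find_ready(read_socks):
    """Find ready sockets the way a C caller would: test every monitored one."""
    count, ready_set = c_style_select(read_socks)
    found, checked = [], 0
    for sock in read_socks:      # analogous to calling FD_ISSET on each fd
        checked += 1
        if sock in ready_set:
            found.append(sock)
    return count, found, checked


def demo(n_conns=50):
    """Monitor n_conns idle connections, only one of which has data."""
    pairs = [socket.socketpair() for _ in range(n_conns)]
    try:
        pairs[7][1].send(b"x")
        monitored = [a for a, _ in pairs]
        count, found, checked = find_ready(monitored)
        return count, found == [pairs[7][0]], checked
    finally:
        for a, b in pairs:
            a.close()
            b.close()


if __name__ == "__main__":
    print(demo())
```

With 50 monitored connections and a single ready one, the caller still performs 50 membership tests; the work grows with the number of monitored fds rather than with the number of ready ones.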
- poll
Here is the EchoServer reimplemented with poll. Once you understand select, poll is not hard; only the API's parameters are a little different.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
'''
Created on Feb 27, 2016
@author: mountain
'''
import select
import socket
try:
    from queue import Queue  # Python 3
except ImportError:
    from Queue import Queue  # Python 2
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(False)
server_address = ('localhost', 1234)
server.bind(server_address)
server.listen(5)
print('server is listening on %s port %s' % server_address)
msg_queues = {}
# poll() takes its timeout in milliseconds.
timeout = 1000 * 60
#POLLIN: There is data to read
#POLLPRI: There is urgent data to read
#POLLOUT: Ready for output
#POLLERR: Error condition of some sort
#POLLHUP: Hung up
#POLLNVAL: Invalid request: descriptor not open
READ_ONLY = select.POLLIN | select.POLLPRI | select.POLLHUP | select.POLLERR
READ_WRITE = READ_ONLY | select.POLLOUT
poller = select.poll()
# Register the events to watch for
poller.register(server, READ_ONLY)
# Map file descriptors to sockets
fd_to_socket = {server.fileno(): server}
def close_conn(fd, sock):
    # Unregister and drop the fd mapping before closing, so a reused fd
    # number can never resolve to a dead socket.
    poller.unregister(fd)
    del fd_to_socket[fd]
    del msg_queues[sock]
    sock.close()
while True:
    events = poller.poll(timeout)
    for fd, flag in events:
        sock = fd_to_socket[fd]
        if flag & (select.POLLIN | select.POLLPRI):
            if sock is server:
                conn, client_address = sock.accept()
                conn.setblocking(False)
                fd_to_socket[conn.fileno()] = conn
                poller.register(conn, READ_ONLY)
                msg_queues[conn] = Queue()
            else:
                data = sock.recv(1024)
                if data:
                    msg_queues[sock].put(data)
                    poller.modify(sock, READ_WRITE)
                else:
                    # An empty read means the client has closed the connection.
                    close_conn(fd, sock)
        elif flag & select.POLLHUP:
            close_conn(fd, sock)
        elif flag & select.POLLOUT:
            if not msg_queues[sock].empty():
                msg = msg_queues[sock].get_nowait()
                sock.send(msg)
            else:
                # Nothing left to send: stop watching for writability.
                poller.modify(sock, READ_ONLY)
        elif flag & select.POLLERR:
            close_conn(fd, sock)
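poll fixes select's third drawback: the number of fds is no longer limited. The price is select's cross-platform portability (Python's select.poll, for instance, is not available on Windows). Its Linux kernel API looks like this: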
int poll (struct pollfd *fds, unsigned int nfds, int timeout);
struct pollfd {
    int fd;        /* file descriptor */
    short events;  /* requested events to watch */
    short revents; /* returned events witnessed */
};
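The two event fields of `struct pollfd` map directly onto Python's wrapper: the mask passed to `register` plays the role of `events`, and each `(fd, flag)` pair returned by `poll` carries the `revents` the kernel filled in. A minimal sketch, assuming Python 3 on a platform that provides `select.poll`; `poll_once` is a name invented for this illustration.

```python
import select
import socket


def poll_once():
    """Register one socket, make it readable, and inspect what poll reports."""
    a, b = socket.socketpair()
    try:
        poller = select.poll()
        # The requested mask corresponds to pollfd.events.
        poller.register(a.fileno(), select.POLLIN | select.POLLOUT)
        b.send(b"hi")
        # Timeout is in milliseconds; each result is (fd, revents).
        results = poller.poll(1000)
        fd, revents = results[0]
        return (
            len(results),
            fd == a.fileno(),
            bool(revents & select.POLLIN),
            bool(revents & select.POLLOUT),
        )
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(poll_once())
```

Because the request (`events`) and the answer (`revents`) are separate fields, the registration does not have to be rebuilt before each call, which is one of the ways the interface differs from select's in-place fd sets.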
- epoll
Usage is almost identical to poll.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
'''
Created on Feb 28, 2016
@author: mountain
'''
import select
import socket
try:
    from queue import Queue  # Python 3
except ImportError:
    from Queue import Queue  # Python 2
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(False)
server_address = ('localhost', 1234)
server.bind(server_address)
server.listen(5)
print('server is listening on %s port %s' % server_address)
msg_queues = {}
# epoll.poll() takes its timeout in seconds, unlike poll()'s milliseconds.
timeout = 60
READ_ONLY = select.EPOLLIN | select.EPOLLPRI
READ_WRITE = READ_ONLY | select.EPOLLOUT
epoll = select.epoll()
# Register the events to watch for
epoll.register(server, READ_ONLY)
# Map file descriptors to sockets
fd_to_socket = {server.fileno(): server}
def close_conn(fd, sock):
    # Unregister and drop the fd mapping before closing, so a reused fd
    # number can never resolve to a dead socket.
    epoll.unregister(fd)
    del fd_to_socket[fd]
    del msg_queues[sock]
    sock.close()
while True:
    events = epoll.poll(timeout)
    for fd, flag in events:
        sock = fd_to_socket[fd]
        if flag & READ_ONLY:
            if sock is server:
                conn, client_address = sock.accept()
                conn.setblocking(False)
                fd_to_socket[conn.fileno()] = conn
                epoll.register(conn, READ_ONLY)
                msg_queues[conn] = Queue()
            else:
                data = sock.recv(1024)
                if data:
                    msg_queues[sock].put(data)
                    epoll.modify(sock, READ_WRITE)
                else:
                    # An empty read means the client has closed the connection.
                    close_conn(fd, sock)
        elif flag & select.EPOLLHUP:
            close_conn(fd, sock)
        elif flag & select.EPOLLOUT:
            if not msg_queues[sock].empty():
                msg = msg_queues[sock].get_nowait()
                sock.send(msg)
            else:
                # Nothing left to send: stop watching for writability.
                epoll.modify(sock, READ_ONLY)
        elif flag & select.EPOLLERR:
            close_conn(fd, sock)
epoll addresses all three of select's drawbacks and is currently the best I/O multiplexing solution on Linux. To understand epoll better, let's look at how its Linux kernel API is used.
int epoll_create(int size) // Creates an epoll handle; size tells the kernel roughly how many fds will be monitored.
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event) // Registers an event; each fd is copied into the kernel only once.
int epoll_wait(int epfd, struct epoll_event * events, int maxevents, int timeout) /* Waits for I/O events. When an event
occurs, the kernel invokes a callback that puts the ready fd on the ready list and wakes up epoll_wait, so epoll_wait only
has to walk the ready list, whereas select and poll both scan every fd. The difference in efficiency is obvious. */
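The ready-list behaviour described in these comments can be observed from Python, whose `select.epoll` object wraps the three calls: constructing it corresponds to `epoll_create`, `register` to `epoll_ctl`, and `poll` to `epoll_wait`. The sketch below is Linux-only and assumes Python 3; `ready_only` and its parameters are hypothetical names chosen for this illustration.

```python
import select
import socket


def ready_only(n_conns=100, active=(3, 42)):
    """Watch n_conns connections and return the indexes epoll reports as ready."""
    pairs = [socket.socketpair() for _ in range(n_conns)]
    ep = select.epoll()                               # epoll_create
    try:
        fd_to_index = {}
        for i, (a, _) in enumerate(pairs):
            ep.register(a.fileno(), select.EPOLLIN)   # epoll_ctl, once per fd
            fd_to_index[a.fileno()] = i
        for i in active:
            pairs[i][1].send(b"x")                    # only these become readable
        events = ep.poll(1)                           # epoll_wait, timeout in seconds
        return sorted(fd_to_index[fd] for fd, _ in events)
    finally:
        ep.close()
        for a, b in pairs:
            a.close()
            b.close()


if __name__ == "__main__":
    print(ready_only())
```

Although 100 connections are registered, the returned event list contains only the two that have data, so the loop that handles events does work proportional to the number of ready fds.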