I tested this myself; conclusions first:
select/accept has no thundering herd: each incoming socket event wakes only one consumer process.
epoll has a thundering herd: each incoming socket connection request wakes every idle consumer process.
First, the test of the select-style (blocking accept) path. Straight to the code:
/* select test: multiple processes blocking in accept() on the same socket */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <sys/wait.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>    /* fork, getpid, close */

#define PROCESS_NUM 10

int main()
{
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    int connfd;
    char sendbuff[1024];
    struct sockaddr_in serveraddr;

    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    serveraddr.sin_port = htons(1234);
    bind(fd, (struct sockaddr*)&serveraddr, sizeof(serveraddr));
    listen(fd, 1024);

    int i;
    for (i = 0; i < PROCESS_NUM; i++)
    {
        int pid = fork();
        if (pid == 0)
        {
            /* child: block in accept() forever */
            while (1)
            {
                connfd = accept(fd, (struct sockaddr*)NULL, NULL);
                snprintf(sendbuff, sizeof(sendbuff), "accept PID is %d\n", getpid());
                send(connfd, sendbuff, strlen(sendbuff) + 1, 0);
                printf("process %d accept success!\n", getpid());
                close(connfd);
            }
        }
    }

    int status;
    wait(&status);
    return 0;
}
Test result, on the server side:
[root@localhost operea_study]# gcc select.c -o a
[root@localhost operea_study]# ./a
process 12515 accept success!
And on the client side:
[minping@localhost ~]$ curl 127.0.0.1:1234
accept PID is 12515
[minping@localhost ~]$
This shows that Linux's select/accept path has no thundering herd: when multiple processes all block in accept on the same socket and a new connection arrives, only one process is woken by the kernel; the others stay asleep.
Next, the epoll test:
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <netinet/in.h>
#include <netdb.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/wait.h>

#define PROCESS_NUM 10
#define MAXEVENTS 64

static int
create_and_bind(char *port)
{
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    struct sockaddr_in serveraddr;
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    serveraddr.sin_port = htons(atoi(port));
    bind(fd, (struct sockaddr*)&serveraddr, sizeof(serveraddr));
    return fd;
}

static int
make_socket_non_blocking(int sfd)
{
    int flags, s;
    flags = fcntl(sfd, F_GETFL, 0);
    if (flags == -1)
    {
        perror("fcntl");
        return -1;
    }
    flags |= O_NONBLOCK;
    s = fcntl(sfd, F_SETFL, flags);
    if (s == -1)
    {
        perror("fcntl");
        return -1;
    }
    return 0;
}

int
main(int argc, char *argv[])
{
    int sfd, s;
    int efd;
    struct epoll_event event;
    struct epoll_event *events;

    sfd = create_and_bind("1234");
    if (sfd == -1)
        abort();
    s = make_socket_non_blocking(sfd);
    if (s == -1)
        abort();
    s = listen(sfd, SOMAXCONN);
    if (s == -1)
    {
        perror("listen");
        abort();
    }
    efd = epoll_create(MAXEVENTS);
    if (efd == -1)
    {
        perror("epoll_create");
        abort();
    }
    event.data.fd = sfd;
    //event.events = EPOLLIN | EPOLLET;
    event.events = EPOLLIN;    /* level-triggered */
    s = epoll_ctl(efd, EPOLL_CTL_ADD, sfd, &event);
    if (s == -1)
    {
        perror("epoll_ctl");
        abort();
    }
    /* Buffer where events are returned */
    events = calloc(MAXEVENTS, sizeof event);

    int k;
    for (k = 0; k < PROCESS_NUM; k++)
    {
        int pid = fork();
        if (pid == 0)
        {
            /* The event loop */
            while (1)
            {
                int n, i;
                n = epoll_wait(efd, events, MAXEVENTS, -1);
                printf("process %d return from epoll_wait!\n", getpid());
                /* sleep here is very important! */
                //sleep(2);
                for (i = 0; i < n; i++)
                {
                    if ((events[i].events & EPOLLERR) || (events[i].events & EPOLLHUP) ||
                        (!(events[i].events & EPOLLIN)))
                    {
                        /* An error has occurred on this fd, or the socket is
                           not ready for reading (why were we notified then?) */
                        fprintf(stderr, "epoll error\n");
                        close(events[i].data.fd);
                        continue;
                    }
                    else if (sfd == events[i].data.fd)
                    {
                        /* A notification on the listening socket means one or
                           more incoming connections. */
                        struct sockaddr in_addr;
                        socklen_t in_len = sizeof in_addr;
                        int infd = accept(sfd, &in_addr, &in_len);
                        if (infd == -1)
                        {
                            /* usually EAGAIN: another woken process already
                               accepted this connection */
                            printf("process %d accept failed!\n", getpid());
                            break;
                        }
                        printf("process %d accept successed!\n", getpid());
                        /* close immediately; the client sees an empty reply */
                        close(infd);
                    }
                }
            }
        }
    }

    int status;
    wait(&status);
    free(events);
    close(sfd);
    return EXIT_SUCCESS;
}
Server-side test result:
[root@localhost operea_study]# gcc e_poll.c -o b
[root@localhost operea_study]# ./b
process 12778 return from epoll_wait!
process 12779 return from epoll_wait!
process 12779 accept successed!
process 12780 return from epoll_wait!
process 12787 return from epoll_wait!
process 12781 return from epoll_wait!
process 12786 return from epoll_wait!
process 12786 accept failed!
process 12780 accept failed!
process 12785 return from epoll_wait!
process 12782 return from epoll_wait!
process 12783 return from epoll_wait!
process 12784 return from epoll_wait!
process 12784 accept failed!
process 12778 accept failed!
process 12781 accept failed!
process 12787 accept failed!
process 12783 accept failed!
process 12782 accept failed!
process 12785 accept failed!
Client-side test result:
[minping@localhost ~]$ curl 127.0.0.1:1234
curl: (52) Empty reply from server
[minping@localhost ~]$
Under epoll, all 10 processes sitting in epoll_wait on this socket are woken when a single connection request arrives: the thundering herd occurs.
To recap the sequence of events: the main process creates the socket, binds and listens, then adds that socket to an epoll instance. It then forks multiple processes, each of which blocks in epoll_wait on that socket; when a new connection arrives, all of these processes are woken up.
So the question is: why did the Linux kernel fix the accept path to avoid the thundering herd, yet leave epoll alone, so that to this day epoll can still exhibit it?
Below is a quote from a fellow netizen that I find fairly reasonable:
accept should indeed succeed in only one process, but the epoll situation is more complicated: a file descriptor watched by epoll may later be passed to accept, but it may also be watched for other network I/O events, and whether that other I/O can only be handled by a single process is not something the kernel can know.
So Linux does not fix the thundering herd for epoll; it leaves it to user space to handle.
Second question: since Linux leaves the epoll thundering herd to user space (the application layer), how does nginx handle it?
nginx's network architecture is roughly this: after reading the main configuration file and doing the socket/bind/listen sequence, it forks a number of worker processes (to make the most of the CPU, usually one per core). Each worker then calls ngx_event_process_init to initialize its own in-process connection pool. ngx_event_process_init also does one very important thing: if accept_mutex is not enabled in the configuration file, it adds the listening socket fd to every forked worker's epoll via ngx_add_event.
After ngx_event_process_init finishes, each worker runs ngx_process_events_and_timers in an endless loop. That function checks whether accept_mutex is enabled in the configuration; if it is, the worker tries to grab the lock, and only on success does it add the listening socket fd to its own epoll. ngx_process_events_and_timers then calls ngx_process_events, which blocks in epoll_wait.
In short, nginx uses the accept_mutex lock to solve the epoll_wait thundering herd.
If accept_mutex is not enabled in the configuration, all listening sockets are unconditionally added to every worker's epoll, so when a new connection arrives, every worker process is woken.
If accept_mutex is enabled, only one worker adds the listening socket to its epoll, so when a new connection arrives, naturally only that one worker is woken.
Summary:
1. accept does not thunder; epoll_wait does.
2. nginx's accept_mutex does not solve an accept thundering herd; it solves the epoll_wait one.
3. Saying that nginx "solved" the epoll_wait thundering herd is also not quite right: it merely controls whether the listening socket is added to a worker's epoll. With the listening socket in only one worker's epoll, the other workers obviously cannot be woken by a new connection.