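1. Background
In a production environment, a customer's architecture required private-network connectivity between their Alibaba Cloud environment and an on-premises IDC. The link was an IPsec tunnel between a third-party Sangfor cloud appliance on the Alibaba Cloud side and a Cisco firewall on the IDC side, used mainly for scheduled backups of files and data from Alibaba Cloud to the IDC. In production the tunnel dropped at unpredictable times for no apparent reason. Sangfor and Cisco after-sales support both investigated without result, yet manually rebooting the Sangfor appliance on Alibaba Cloud restored the tunnel immediately. After the network engineers on both sides had run out of leads, we decided to write a monitoring script: if the tunnel goes down, Python reboots the Sangfor appliance and thereby restores the tunnel, while the data transfer copes with the interruption through delay timeouts and resumable transfer. When a problem cannot be solved at the network layer, a different line of attack can still solve it.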
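2. Technical Points
2.1 Writing the tunnel monitoring script
The Alibaba Cloud side is public cloud with no EIP or NAT gateway configured. The ECS instances receive business traffic through a public-facing SLB and cannot reach the public network from inside. The monitoring script needs public network connectivity to send alerts to WeChat and then to operate the Sangfor appliance, so the check and reboot scripts are placed on the on-premises IDC side.
2.2 Rebooting the Sangfor appliance
- Use Python to operate the Sangfor appliance by simulating a login to its web page, mainly with the selenium module, plus logging to record logs.
- Use the Alibaba Cloud ECS API to reboot the Sangfor appliance.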
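3. Source Code
3.1 Tunnel detection script
3.1.1 Detection script functionality
Checks tunnel connectivity. If the tunnel is down, it alerts to WeChat and DingTalk and then triggers the Sangfor reboot script.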
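#!/bin/bash
# Private address to probe through the tunnel
IP=10.10.10.2
dir="/sangfor/Shscripts/pdc/"
if [ ! -d "${dir}" ];then
    mkdir -p "${dir}"
fi
# Lock file: 1 = the next failure may trigger a reboot, 0 = a reboot was already triggered
echo 1 > ${dir}pdcping.lock
while true
do
    # Timestamps used for log rotation and log headers
    Time=`date +%F`
    TIME="${Time} 23:59"
    data=`date +%F' '%H:%M`
    data1=`date +%F' '%H:%M:%S`
    # Rotate and archive the log at 23:59; mkdir fails on later passes in the same minute, so the move runs once
    if [ "${data}" == "${TIME}" ];then
        mkdir ${dir}${Time} && mv ${dir}pingpdc.log ${dir}${Time}-pingpdc.log \
            && mv ${dir}${Time}-pingpdc.log ${dir}${Time}
    fi
    # Delete archives older than 7 days
    find ${dir} -mtime +7 -type d -exec rm -rf {} \;
    find ${dir} -mtime +7 -name "*-pingpdc.log" -exec rm -rf {} \;
    echo "------------${data1}---------------">>${dir}pingpdc.log
    ping -c 10 ${IP} >>${dir}pingpdc.log
    # ping exits with 1 when no reply arrives and 2 on other errors; treat both as a failure
    if [ $? -ne 0 ];then
        STAT=`cat ${dir}pdcping.lock`
        if [ ${STAT} -eq 1 ];then
            # First failure of this outage: run the reboot script, then disarm
            /usr/local/python34/bin/python3 /sangfor/Pysangfor/sangfor_public.py
            echo 0 > ${dir}pdcping.lock
        else
            continue
        fi
    else
        STAT=`cat ${dir}pdcping.lock`
        if [ ${STAT} -eq 0 ];then
            # Tunnel is back: re-arm for the next outage
            echo 1 > ${dir}pdcping.lock
        else
            continue
        fi
    fi
done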
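The lock file in the script above works as a one-shot latch: a reboot is triggered only on the first failed check of an outage, and the latch is re-armed once the tunnel answers again. A minimal Python sketch of that decision logic follows. The function names `next_state` and `simulate` are hypothetical and are not part of the original scripts.

```python
from typing import Iterable, Tuple


def next_state(ping_ok: bool, armed: bool) -> Tuple[bool, bool]:
    """Return (trigger_reboot, new_armed) for one monitoring cycle.

    armed=True corresponds to the lock file holding 1 (the next failure
    may trigger a reboot); armed=False corresponds to 0 (a reboot was
    already triggered for the current outage).
    """
    if not ping_ok:
        if armed:
            return True, False   # first failure of an outage: reboot and disarm
        return False, False      # outage continues: do nothing
    return False, True           # tunnel is up: stay armed or re-arm


def simulate(ping_results: Iterable[bool]) -> int:
    """Replay a sequence of ping outcomes and count the triggered reboots."""
    armed = True
    reboots = 0
    for ok in ping_results:
        trigger, armed = next_state(ok, armed)
        if trigger:
            reboots += 1
    return reboots
```

Replaying two separate outages, for example `simulate([True, False, False, False, True, False])`, triggers one reboot per outage rather than one per failed check.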
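3.1.2 Watchdog script functionality
To guard against the tunnel detection script itself failing, a second script monitors the detection script and runs from a scheduled task. If the detection script is not running, it is started again.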
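#!/bin/bash
# Match on the full script path: a plain "grep pdc.sh" would also count
# checkpdc.sh and the grep process itself, so the restart would never fire
script="/sangfor/Shscripts/pdc/pdc.sh"
num=$(pgrep -f "${script}" | wc -l)
if [ "${num}" -lt 1 ];then
    # "&" has to be written on the command line; stored inside a variable it is passed as a literal argument
    /usr/bin/nohup /bin/bash ${script} >/dev/null 2>&1 &
fi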
Scheduled task (crontab) entry:
* * * * * /bin/bash /sangfor/Shscripts/pdc/checkpdc.sh
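Deciding whether the detection script is alive comes down to how process command lines are matched. A substring match on `pdc.sh` also matches `checkpdc.sh` and a `grep pdc.sh` process, so matching the full script path as a separate argument is safer. The sketch below illustrates the difference on a hypothetical list of command lines; `naive_count` and `is_running` are illustrative names, not part of the original scripts.

```python
from typing import Iterable


def naive_count(cmdlines: Iterable[str], needle: str) -> int:
    """Count command lines that merely contain needle as a substring."""
    return sum(1 for cmdline in cmdlines if needle in cmdline)


def is_running(cmdlines: Iterable[str], script_path: str) -> bool:
    """Return True only if some command line has script_path as a whole argument."""
    return any(script_path in cmdline.split() for cmdline in cmdlines)


# Hypothetical process list while only the watchdog is running
sample = [
    "/bin/bash /sangfor/Shscripts/pdc/checkpdc.sh",
    "grep pdc.sh",
]
```

With this sample, the substring count is already 2 even though the detection script is not running, whereas `is_running(sample, "/sangfor/Shscripts/pdc/pdc.sh")` reports that it is absent.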
3.2 Sangfor operation scripts
3.2.1 Base environment deployment
- A Install dependencies and Python 3.4
yum -y install zlib-devel zlib readline-devel openssl-devel wget gcc-c++ Xvfb lrzsz firefox
cd /tmp
wget -c https://www.python.org/ftp/python/3.4.5/Python-3.4.5.tgz
tar -zxvf Python-3.4.5.tgz
cd Python-3.4.5
./configure --prefix=/usr/local/python34
make && make install
echo 'export PATH=$PATH:/usr/local/python34/bin' >/etc/profile.d/python34.sh
source /etc/profile.d/python34.sh
- B Install pip
cd /tmp
wget https://bootstrap.pypa.io/get-pip.py
python3 get-pip.py
- C Install Python modules
pip3 install selenium
pip3 install pyvirtualdisplay
pip3 install xvfbwrapper
- D Download and install geckodriver
cd /tmp
wget -c https://github.com/mozilla/geckodriver/releases/download/v0.16.1/geckodriver-v0.16.1-linux64.tar.gz
tar zxvf geckodriver-v0.16.1-linux64.tar.gz
cp geckodriver /usr/bin/
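Before running the Selenium script it can help to confirm that the binaries installed above are actually on `PATH`. The following is a minimal sketch using the standard library's `shutil.which`. The helper name `missing_tools` and the tuple `REQUIRED_TOOLS` are assumptions for illustration; the tool names are taken from the installation steps above.

```python
import shutil

# Binaries the installation steps above are expected to provide
REQUIRED_TOOLS = ("firefox", "Xvfb", "geckodriver")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in tools that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]
```

Calling `missing_tools()` on the IDC host returns an empty list when everything is installed; any name it returns points at a step that needs to be repeated.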
3.2.2 Python code
GitHub repository
Operating the Sangfor appliance through a simulated web login
cat > /sangfor/Pysangfor/sangfor_public.py<<'EOF'
#!/usr/bin/env python3
# -*- coding:UTF-8 -*-
# _author:kaliarch
# Import modules
# Note: the find_element_by_* helpers used below were removed in Selenium 4.3,
# so this script targets Selenium 3.x
from pyvirtualdisplay import Display
from selenium import webdriver
import time
import os
import logging

# Sangfor reboot class
class Glp_SangFor:
    def __init__(self,logger):
        self.logger = logger
        self.logger.info("--------------start log----------------")
        # Start a virtual display so Firefox can run on a headless server
        self.display = Display(visible=0, size=(800, 600))
        self.display.start()
        self.browser = webdriver.Firefox()
        self.logger.info("start browser successfully")
        self.sangfor_url = "<Sangfor public URL>"
        self.username = '<Sangfor login username>'
        self.password = '<Sangfor login password>'

    def login(self):
        self.browser.get(self.sangfor_url)
        self.browser.implicitly_wait(5)
        self.browser.find_element_by_name('user').send_keys(self.username)
        self.browser.find_element_by_name('password').send_keys(self.password)
        self.browser.find_element_by_class_name('buttons').click()
        self.browser.implicitly_wait(5)
        self.logger.info("logged in to sangfor successfully")

    def client_reboot(self):
        self.browser.find_element_by_id("ext-gen111").click()
        print(self.browser.find_element_by_id("ext-gen111").text)
        self.browser.implicitly_wait(15)
        time.sleep(60)
        self.logger.info("switch mainiframe start")
        try:
            # The link text must match the appliance's web UI exactly
            # (Restart / Restart service / Shut down)
            print(self.browser.find_element_by_link_text("重启/重启服务/关机").text)
            self.browser.find_element_by_link_text("重启/重启服务/关机").click()
            self.browser.implicitly_wait(3)
            self.browser.switch_to.frame("mainiframe")
            self.browser.implicitly_wait(8)
            time.sleep(10)
            self.browser.find_element_by_xpath("//button[@id='ext-gen19']").click()
            print(self.browser.find_element_by_xpath("//button[@id='ext-gen19']").text)
            self.browser.implicitly_wait(10)
            #self.browser.find_element_by_xpath("//button[@id='ext-gen42']").click()
            print(self.browser.find_element_by_xpath("//button[@id='ext-gen42']").text)
        except Exception as e:
            # Any exception in this block is logged as a successful reboot,
            # presumably because the page drops when the appliance restarts
            self.logger.exception("reboot successful")
            return 1
        self.browser.close()
        self.logger.info("browser close successful")
        self.logger.info("--------------end log----------------")
        return 0

# Log setup
class Glp_Log:
    def __init__(self,filename):
        self.filename = filename

    def createDir(self):
        # Logs go to a "publiclog" directory next to this script, one file per day
        _LOGDIR = os.path.join(os.path.dirname(__file__), 'publiclog')
        print(_LOGDIR)
        _TIME = time.strftime('%Y-%m-%d', time.gmtime()) + '-'
        _LOGNAME = _TIME + self.filename
        print(_LOGNAME)
        LOGFILENAME = os.path.join(_LOGDIR, _LOGNAME)
        print(LOGFILENAME)
        if not os.path.exists(_LOGDIR):
            os.mkdir(_LOGDIR)
        return LOGFILENAME

    def createlogger(self,logfilename):
        logger= logging.getLogger()
        logger.setLevel(logging.INFO)
        handler = logging.FileHandler(logfilename)
        handler.setLevel(logging.INFO)
        formater = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
        handler.setFormatter(formater)
        logger.addHandler(handler)
        return logger

# Main entry point
if __name__ == '__main__':
    # Clean up any browser or display left over from a previous run
    os.system("pkill firefox")
    os.system("pkill Xvfb")
    glploger = Glp_Log('public-***.log')
    logfilename = glploger.createDir()
    logger = glploger.createlogger(logfilename)
    sangfor_oper = Glp_SangFor(logger)
    sangfor_oper.login()
    sangfor_oper.client_reboot()
EOF
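The script waits for pages with fixed `time.sleep` calls (60 and 10 seconds). One alternative, offered here as an assumption rather than something the original script does, is to poll a condition until it holds or a timeout expires. `wait_until` below is a hypothetical, dependency-free helper that shows the idea; Selenium also ships `WebDriverWait` in `selenium.webdriver.support.ui` for the same purpose.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns a truthy value or timeout seconds pass.

    Returns True as soon as the predicate succeeds, and False if the
    deadline is reached first. The predicate is always tried at least once.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In the reboot script, the predicate could be a small function that checks whether the target element is present, so the script moves on as soon as the page is ready instead of always waiting the full 60 seconds.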
Operating the Sangfor appliance through the Alibaba Cloud ECS API
#!/usr/bin/env python3
# -*- coding:UTF-8 -*-
# _author:kaliarch
from aliyunsdkcore import client
from aliyunsdkecs.request.v20140526 import RebootInstanceRequest,StartInstanceRequest,StopInstanceRequest
import time
import os
import logging

class ecsOper():
    def __init__(self,logger):
        # Replace the placeholders with the real AccessKey pair and region
        self.clentoper = client.AcsClient('<accessKeyId>', '<accessSecret>', 'cn-hangzhou')
        self.logger = logger
        self.logger.info("------------------------start reboot *** ecs of API log-------------")

    def reboot_instance(self):
        # Set parameters
        request = RebootInstanceRequest.RebootInstanceRequest()
        request.set_accept_format('json')
        request.add_query_param('InstanceId', 'i-bpxxzx1rlsgvclq79au')
        # Send the request
        response = self.clentoper.do_action_with_exception(request)
        self.logger.info("public ecs *** reboot successful!")
        self.logger.info(response)
        print(response)

    def start_instance(self):
        request = StartInstanceRequest.StartInstanceRequest()
        request.set_accept_format('json')
        request.add_query_param('InstanceId', 'i-bpxxzx1rlsgvclq79au')
        # Send the request
        response = self.clentoper.do_action_with_exception(request)
        self.logger.info("public ecs *** start successful!")
        self.logger.info(response)
        print(response)

    def stop_instance(self):
        request = StopInstanceRequest.StopInstanceRequest()
        request.set_accept_format('json')
        request.add_query_param('InstanceId', 'i-bp1djzd1rlsgvclq79au')
        request.add_query_param('ForceStop', 'false')
        # Send the request
        response = self.clentoper.do_action_with_exception(request)
        self.logger.info("public ecs *** stop successful!")
        self.logger.info(response)
        print(response)

    def testlog(self):
        self.logger.info("public test log")

class Glp_Log:
    def __init__(self,filename):
        self.filename = filename

    def createDir(self):
        # Logs go to a "publiclog" directory next to this script, one file per day
        _LOGDIR = os.path.join(os.path.dirname(__file__), 'publiclog')
        print(_LOGDIR)
        _TIME = time.strftime('%Y-%m-%d', time.gmtime()) + '-'
        _LOGNAME = _TIME + self.filename
        print(_LOGNAME)
        LOGFILENAME = os.path.join(_LOGDIR, _LOGNAME)
        print(LOGFILENAME)
        if not os.path.exists(_LOGDIR):
            os.mkdir(_LOGDIR)
        return LOGFILENAME

    def createlogger(self,logfilename):
        logger= logging.getLogger()
        logger.setLevel(logging.INFO)
        handler = logging.FileHandler(logfilename)
        handler.setLevel(logging.INFO)
        formater = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
        handler.setFormatter(formater)
        logger.addHandler(handler)
        return logger

if __name__ == "__main__":
    glploger = Glp_Log('public-***.log')
    logfilename = glploger.createDir()
    logger = glploger.createlogger(logfilename)
    app = ecsOper(logger)
    app.reboot_instance()
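The script above logs the raw response from `do_action_with_exception`. Assuming the response body is JSON that carries a `RequestId` field, which is the usual shape of an ECS API reply, a small helper can pull that ID out so it is easy to quote in a support ticket. `parse_request_id` is a hypothetical name, and the sample value in the usage note is made up.

```python
import json


def parse_request_id(response):
    """Extract the RequestId from an API response given as bytes or str.

    Returns None if the field is absent.
    """
    if isinstance(response, bytes):
        response = response.decode("utf-8")
    return json.loads(response).get("RequestId")
```

For a body such as `b'{"RequestId": "ABC-123"}'` the helper returns the string `ABC-123`.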
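4. Results
The detection script's logs are rotated and only 7 days of logs are kept, which prevents oversized logs from taking up too much disk space.
WeChat alert message
DingTalk alert message
Sangfor reboot log written by the Python script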
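The 7-day retention described above is handled in shell for the detection script. For the Python scripts, a comparable policy can be sketched with the standard library's `TimedRotatingFileHandler`. This is an alternative under stated assumptions, not what the original scripts do; the function name `create_rotating_logger` and the logger name `sangfor` are hypothetical.

```python
import logging
import logging.handlers


def create_rotating_logger(logfile, name="sangfor"):
    """Return a logger that rotates logfile at midnight and keeps 7 old files."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.TimedRotatingFileHandler(
        logfile, when="midnight", backupCount=7)
    handler.setFormatter(logging.Formatter(
        '%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
    logger.addHandler(handler)
    return logger
```

With this handler the date-prefixed file names and the `find ... -mtime +7` cleanup are not needed for the Python logs, because rotation and deletion of old files happen inside the handler.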
5. Summary
This is a simple implementation of fault self-healing. The same idea can be applied to many other business scenarios, such as simple application restarts.