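1. Information Security Technology: Data Security Technical Requirements for Government Information Sharing
Data security technical requirements
When classifying and grading data, the provider of shared data shall meet the following security requirements:
- Classify, grade, and label shared data in accordance with the applicable requirements for classifying and grading government information resources. The labels must make the security level of the data identifiable, and labeling records must be retained as a basis for audits;
- Determine the necessary security protection measures according to the data level;
- Record changes to the classification and grading of shared data and notify the affected data users;
- Specify the permissions that users have over shared data, including whether the data may be stored, the protection requirements for stored data, and whether the user may pass the data on to third parties.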
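The requirements above (a label per asset, a retained labeling record, change notification, and usage permissions) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names `SharedDataLabel` and `LabelChange`, the field names, and the `notify` callback are hypothetical and do not come from the standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass
class LabelChange:
    """One audit entry: which level was replaced by which, and when."""
    old_level: Optional[str]
    new_level: str
    changed_at: datetime


@dataclass
class SharedDataLabel:
    """Hypothetical label attached to one shared data asset."""
    asset: str                       # e.g. "database.table.column"
    category: str                    # classification result
    level: str                       # grading result
    allow_storage: bool = False      # may the user store the data?
    storage_protection: str = ""     # protection required for stored copies
    allow_third_party: bool = False  # may the user pass the data on?
    history: List[LabelChange] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Keep the initial labeling as the first audit record.
        self.history.append(LabelChange(None, self.level, datetime.now(timezone.utc)))

    def change_level(self, new_level: str,
                     notify: Callable[[str, LabelChange], None]) -> LabelChange:
        """Record a level change and notify the data users through the callback."""
        change = LabelChange(self.level, new_level, datetime.now(timezone.utc))
        self.history.append(change)
        self.level = new_level
        notify(self.asset, change)
        return change
```

Permissions default to the most restrictive setting, so an asset that has been labeled but not yet reviewed cannot be stored or forwarded by accident.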
2. Sensitive data identification and grade labeling
Principles of data classification and grading
- Classification: classify data according to its source, content, and purpose;
- Grading: assign sensitivity levels to data according to its value, the sensitivity of its content, its impact, and its distribution scope.
Approach to data classification and grading
- Starting from the inventoried and registered data assets, detect sensitive data automatically, using feature detection to locate the data assets in which sensitive data resides;
- Label the sensitive data assets by class and grade, and identify the owner of the sensitive data (department, system, administrator, and so on);
- Have the business departments grade the classified data assets by sensitivity, assigning them to levels such as public, internal, and sensitive.
Sensitive data identification
Sensitive data is identified automatically through user-defined rules.
Built-in or user-defined rules are used to scan and grade structured tables or unstructured files as a whole.
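A minimal sketch of how built-in and user-defined rules could sit side by side. The `Rule` tuple, `BUILTIN_RULES`, `add_custom_rule`, and `classify` are hypothetical names chosen for illustration, and the risk levels are plain strings.

```python
import re
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    pattern: re.Pattern
    level: str


# A single built-in rule for illustration: dotted-quad IPv4 addresses.
BUILTIN_RULES: List[Rule] = [
    Rule(
        "ipv4",
        re.compile(r"(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
                   r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"),
        "low risk",
    ),
]


def add_custom_rule(rules: List[Rule], name: str, regex: str, level: str) -> None:
    """Register a user-defined rule; custom rules take precedence over built-ins."""
    rules.insert(0, Rule(name, re.compile(regex), level))


def classify(value: str, rules: List[Rule]) -> Optional[Rule]:
    """Return the first rule whose pattern matches the whole value, else None."""
    for rule in rules:
        if rule.pattern.fullmatch(value):
            return rule
    return None
```

For example, after `add_custom_rule(rules, "employee_id", r"EMP\d{6}", "internal")`, the call `classify("EMP123456", rules)` returns the custom rule, while a value that matches nothing returns `None`.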
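3. Implementing automatic sensitive data identification
3.1 Sensitive field labeling scheme
Sensitive fields include:
unified social credit code, vehicle identification number (VIN), business license number, tax registration certificate number, organization code, image, date, IP address, MAC address, city, gender, ethnicity, province, license plate number, telephone number, military officer ID, email address, passport number, Hong Kong and Macau travel permit, name, address, mobile phone number, ID card number, and bank card number.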
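Ways to discover sensitive fields
- Scan the whole database periodically to identify sensitive fields (triggered on a schedule).
- When tables or fields are added or modified, run an incremental scan to identify sensitive fields. This requires listening for database operations on tables and fields so that the affected tables or fields can be targeted for a sensitivity scan, and it has to be combined with a database proxy service.
- Trigger a scan manually.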
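The incremental scan above relies on a database proxy to observe schema changes. As a simpler stand-in (a swapped-in technique, not the proxy-based approach the text describes), two snapshots of the column catalog can be compared and only new or altered columns rescanned. The function name and the snapshot shape are assumptions made for illustration; a snapshot could be built from `information_schema.columns`.

```python
from typing import Dict, Set, Tuple

# (schema, table, column) identifies one field.
ColumnKey = Tuple[str, str, str]


def columns_to_rescan(previous: Dict[ColumnKey, str],
                      current: Dict[ColumnKey, str]) -> Set[ColumnKey]:
    """Return fields that are new, or whose declared type changed, since the last snapshot.

    Each snapshot maps a field to its column type, for example "varchar(11)".
    """
    return {key for key, col_type in current.items() if previous.get(key) != col_type}
```

Dropped columns are ignored on purpose: they no longer hold data, although their old labels may still need to be retired elsewhere.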
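3.2 Sensitive field identification
Identification methods: regular expression matching, keywords, and algorithms
- Bank card numbers, ID document numbers, and mobile phone numbers follow well-defined rules and can be matched with regular expressions and algorithms
- Names and special fields carry no definite format and may be arbitrary strings; they can be matched by configuring keywords
- Business licenses, addresses, images, and the like have no definite rules; they can be identified with natural language processing algorithms, using open-source algorithm libraries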
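Keyword matching, as described for names and special fields, can be applied to column names rather than to values. The sketch below assumes a small hypothetical keyword table (`FIELD_KEYWORDS`) and a helper `match_keywords`; a real deployment would load the keywords from configuration.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword configuration: sensitive field type -> column-name keywords.
FIELD_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "name": ("name", "xingming", "姓名"),
    "address": ("address", "addr", "dizhi", "地址"),
}


def match_keywords(column_name: str) -> List[str]:
    """Return the sensitive field types whose keywords occur in the column name."""
    lowered = column_name.lower()
    return [
        field_type
        for field_type, keywords in FIELD_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]
```

Substring matching is deliberately loose (for instance `file_name` would also be flagged as a name), so a keyword hit is best treated as a candidate for review rather than a final label.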
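Open issues in data identification
- A full database scan consumes considerable resources; could sampling be used instead?
- How should dirty data be judged? Some fields are NULL or contain only spaces; can they simply be assigned a default sensitivity level?
- For labeling, should every field in the database be labeled, or only the sampled data, with the labels stored separately for later statistical analysis?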
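One possible answer to the sampling and dirty-data questions is to classify a column from a sample of its values, skip NULL and blank entries, and require a minimum match ratio. This is a sketch under assumptions: the function name `classify_column` and the 0.8 default threshold are illustrative choices, not values from the text.

```python
import re
from typing import Iterable, Optional


def classify_column(samples: Iterable[object], pattern: str,
                    threshold: float = 0.8) -> Optional[bool]:
    """Decide from sampled values whether a column matches a sensitive pattern.

    Returns None when every sampled value is NULL or blank, so the caller can
    leave the column undetermined instead of guessing a level.
    """
    cleaned = [str(v).strip() for v in samples if v is not None and str(v).strip()]
    if not cleaned:
        return None
    hits = sum(1 for v in cleaned if re.fullmatch(pattern, v))
    return hits / len(cleaned) >= threshold
```

Returning `None` for an all-empty sample keeps dirty columns out of both the sensitive and the non-sensitive bucket; they can be re-sampled later or escalated for manual review.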
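4. Demo code
4.1 Identifying mobile phone number fields in a MySQL database
For a given MySQL instance, iterate over every database, table, and field, match a sampled value from each field against the regular expression, and label the result.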
# -*- coding:utf-8 -*-
"""
@Author : Browser
@file : identity_mysql.py
@time : 2019/09/30
@software : PyCharm
@description: " "
"""
import re

import pymysql

s1 = "no risk"
s2 = "low risk"
s3 = "medium risk"
s4 = "high risk"


# Match personal mobile phone numbers with a regular expression
def check_secret(value):
    # No commas inside the character classes: a comma there would match a literal ","
    phone_pattern = r'^[1](([3][0-9])|([4][5-9])|([5][0-35-9])|([6][56])|([7][0-8])|([8][0-9])|([9][189]))[0-9]{8}$'
    if re.match(phone_pattern, value):
        return s3
    else:
        return s1


class DB(object):
    def __init__(self, ip, username, password):
        self.ip = ip
        self.username = username
        self.password = password
        # Keyword arguments: recent PyMySQL releases no longer accept positional ones
        self.db = pymysql.connect(host=self.ip, user=self.username, password=self.password)
        self.cursor = self.db.cursor()

    # Get all database names from information_schema.schemata
    def get_database(self):
        self.cursor.execute("SELECT schema_name FROM information_schema.schemata")
        database_list = self.cursor.fetchall()
        result = []
        for line in database_list:
            # Skip the default system databases
            if line[0] not in ['information_schema', 'mysql', 'performance_schema', 'sys', 'loonflownew']:
                result.append(line[0])
        return result

    # Get table names
    def get_table(self, database):
        # Pass values as query parameters instead of formatting them into the SQL text
        self.cursor.execute(
            "SELECT table_name FROM information_schema.tables WHERE table_schema = %s", (database,))
        table_list = self.cursor.fetchall()
        result = []
        for line in table_list:
            result.append(line[0])
        return result

    # Get field names
    def get_column(self, database, table):
        self.cursor.execute(
            "SELECT column_name FROM information_schema.columns "
            "WHERE table_schema = %s AND table_name = %s", (database, table))
        column_list = self.cursor.fetchall()
        result = []
        for line in column_list:
            result.append(line[0])
        return result

    # Get field content (only the first row is sampled)
    def get_content(self, database, table, column):
        # Identifiers cannot be bound as parameters, so quote them with backticks
        self.cursor.execute("SELECT `%s` FROM `%s`.`%s` LIMIT 0,1" % (column, database, table))
        content = self.cursor.fetchall()
        if content:
            return content[0][0]

    def __del__(self):
        self.db.close()


if __name__ == '__main__':
    # db = DB('192.168.189.154','root','Gepoint')
    db = DB('rm-bp1i3518ykiqi60my8o.mysql.rds.aliyuncs.com','root','Epoint@123@)!(')
    databases = db.get_database()
    # Open the report once instead of reopening it for every field
    with open('message.txt', 'a+', encoding='UTF-8') as file:
        for database in databases:
            tables = db.get_table(database)
            for table in tables:
                columns = db.get_column(database, table)
                for column in columns:
                    data = db.get_content(database, table, column)
                    data_str = str(data)
                    result = [database, table, column, data_str, check_secret(data_str)]
                    file.write(str(result) + "\r\n")
4.2 Sensitive data identification rules
IP address: regular expression
# Match IPv4 addresses exactly
def check_ip(value):
    ip_pattern = r'^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'
    if re.match(ip_pattern, value):
        print('%s' % s2)
    else:
        print('%s' % s1)
MAC address: regular expression
# Match MAC addresses exactly (colon-separated or hyphen-separated)
def check_mac(value):
    mac_pattern = r'^(?:(?:(?:[a-f0-9A-F]{2}:){5})|(?:(?:[a-f0-9A-F]{2}-){5}))[a-f0-9A-F]{2}$'
    if re.match(mac_pattern, value):
        print('%s' % s2)
    else:
        print('%s' % s1)
IPv6 address: regular expression
def check_ipv6(value):
    ipv6_pattern = r'^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$'
    if re.match(ipv6_pattern, value):
        print('%s' % s2)
    else:
        print('%s' % s1)
Mobile phone number: regular expression
def check_phone(value):
    # No commas inside the character classes: a comma there would match a literal ","
    phone_pattern = r'^[1](([3][0-9])|([4][5-9])|([5][0-35-9])|([6][56])|([7][0-8])|([8][0-9])|([9][189]))[0-9]{8}$'
    if re.match(phone_pattern, value):
        print('%s' % s3)
    else:
        print('%s' % s1)
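Bank card: algorithm
# Luhn checksum: double every second digit from the right, then sum all digits
def check_bank_card(card_num):
    if not card_num.isdigit():
        return False  # empty or non-numeric input cannot be a card number
    total = 0
    card_num_length = len(card_num)
    for item in range(1, card_num_length + 1):
        t = int(card_num[card_num_length - item])
        if item % 2 == 0:
            t *= 2
            total += t if t < 10 else t % 10 + t // 10
        else:
            total += t
    return total % 10 == 0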
ID card: algorithm
def check_IDNumber(value):
    value = value.upper()  # accept a lowercase 'x' as the check character
    str_to_int = {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5,
                  '6': 6, '7': 7, '8': 8, '9': 9, 'X': 10}
    check_dict = {0: '1', 1: '0', 2: 'X', 3: '9', 4: '8', 5: '7',
                  6: '6', 7: '5', 8: '4', 9: '3', 10: '2'}
    if len(value) != 18:
        raise TypeError(u'Please enter a standard second-generation ID card number')
    check_num = 0
    for index, num in enumerate(value):
        if index == 17:
            # The 18th character is the check code computed from the first 17
            verify_code = check_dict.get(check_num % 11)
            if num == verify_code:
                print(u"ID number: %s, check passed, " % value + s4)
            else:
                print(u"ID number: %s, check failed, the correct last character should be: %s, " % (value, verify_code) + s1)
            break
        # Weight of position `index` is 2^(17 - index) mod 11
        check_num += str_to_int.get(num) * (2 ** (17 - index) % 11)
Address: natural language processing toolkit (CRF)
import re, sys
from pyhanlp import *
s1 = "no risk"
s2 = "low risk"
s3 = "medium risk"
s4 = "high risk"
def check_chinese_address_recognition(value):
    CRFnewSegment = HanLP.newSegment("crf")
    address_list = CRFnewSegment.seg(value)
    # Map each segmented word to its part-of-speech tag
    word_natures = {}
    for i in address_list:
        word_natures[str(i.word)] = [str(i.nature)]
    # ns / nsf are the place-name tags
    Address = r'(ns|nsf)'
    for key, nature in word_natures.items():
        nature = str(nature)
        if re.search(Address, nature):
            print('Address: %s' % key + ', risk level: ' + s3)
        else:
            print('Ordinary word: %s' % key + ', risk level: ' + s1)
if __name__ == "__main__":
    # The text to analyse is passed as the first command-line argument
    check_chinese_address_recognition(sys.argv[1])
Name: natural language processing toolkit (CRF)
import sys, re
from pyhanlp import *
s1 = "no risk"
s2 = "low risk"
s3 = "medium risk"
s4 = "high risk"
def check_chinese_name_recognition(value):
    CRFnewSegment = HanLP.newSegment("crf")
    name_list = CRFnewSegment.seg(value)
    # Map each segmented word to its part-of-speech tag
    word_natures = {}
    for i in name_list:
        word_natures[str(i.word)] = [str(i.nature)]
    # nr is the personal-name tag
    Person_Name = r'nr'
    for key, nature in word_natures.items():
        result = str(nature)
        if re.search(Person_Name, result):
            print('Name: %s' % key + ', risk level: ' + s4)
        else:
            print('Ordinary word: %s' % key + ', risk level: ' + s1)
if __name__ == "__main__":
    # The text to analyse is passed as the first command-line argument
    check_chinese_name_recognition(sys.argv[1])
Gender: regular expression
def check_gender(value):
    gender_pattern = r'^((男|male)|(女|female))$'
    if re.match(gender_pattern, value):
        print('%s' % s2)
    else:
        print('%s' % s1)
Ethnicity: regular expression
def check_national(value):
    # The 56 ethnic groups, with and without the trailing 族
    national_pattern = r'^((汉|满|蒙古|回|藏|维吾尔|苗|彝|壮|布依|侗|瑶|白|土家|哈尼|哈萨克|傣|黎' \
                       r'|傈僳|佤|畲|高山|拉祜|水|东乡|纳西|景颇|柯尔克孜|土|达斡尔|仫佬|羌|布朗' \
                       r'|撒拉|毛南|仡佬|锡伯|阿昌|普米|朝鲜|塔吉克|怒|乌孜别克|俄罗斯|鄂温克|德昂' \
                       r'|保安|裕固|京|塔塔尔|独龙|鄂伦春|赫哲|门巴|珞巴|基诺)' \
                       r'|(汉族|满族|蒙古族|回族|藏族|维吾尔族|苗族|彝族|壮族|布依族|侗族|瑶族|白族|' \
                       r'土家族|哈尼族|哈萨克族|傣族|黎族|傈僳族|佤族|畲族|高山族|拉祜族|水族|东乡族|' \
                       r'纳西族|景颇族|柯尔克孜族|土族|达斡尔族|仫佬族|羌族|布朗族|撒拉族|毛南族|仡佬族|' \
                       r'锡伯族|阿昌族|普米族|朝鲜族|塔吉克族|怒族|乌孜别克族|俄罗斯族|鄂温克族|德昂族|' \
                       r'保安族|裕固族|京族|塔塔尔族|独龙族|鄂伦春族|赫哲族|门巴族|珞巴族|基诺族))$'
    if re.match(national_pattern, value):
        print('%s' % s3)
    else:
        print('%s' % s1)
Province: regular expression
def check_provinces(value):
    provinces_pattern = r'^(北京市|天津市|上海市|重庆市|河北省|山西省|辽宁省|吉林省|黑龙江省|江苏省|' \
                        r'浙江省|安徽省|福建省|江西省|山东省|河南省|湖北省|湖南省|广东省|海南省|四川省|' \
                        r'贵州省|云南省|陕西省|甘肃省|青海省|台湾省|内蒙古自治区|广西壮族自治区|西藏自治区|' \
                        r'宁夏回族自治区|新疆维吾尔自治区|香港特别行政区|澳门特别行政区)$'
    if re.match(provinces_pattern, value):
        print('%s' % s2)
    else:
        print('%s' % s1)
License plate number: regular expression
def check_carnum(value):
    # Anchored with ^...$ so that trailing characters are rejected
    carnum_pattern = r'^(?:([京津沪渝冀豫云辽黑湘皖鲁新苏浙赣鄂桂甘晋蒙陕吉闽贵粤青藏川宁琼使领A-Z]' \
                     r'{1}[A-Z]{1}(([0-9]{5}[DF])|(DF[0-9]{4})))|' \
                     r'([京津沪渝冀豫云辽黑湘皖鲁新苏浙赣鄂桂甘晋蒙陕吉闽贵粤青藏川宁琼使领A-Z]' \
                     r'{1}[A-Z]{1}[A-HJ-NP-Z0-9]{4}[A-HJ-NP-Z0-9挂学警港澳]{1}))$'
    if re.match(carnum_pattern, value):
        print('%s' % s3)
    else:
        print('%s' % s1)
Telephone number: regular expression
def check_telephone(value):
    # Optional area code, then 7 or 8 digits; both anchors now apply to the whole pattern
    telephone_pattern = r'^((0\d{2,3})-)?(\d{7,8})$'
    if re.match(telephone_pattern, value):
        print('%s' % s3)
    else:
        print('%s' % s1)
Military officer ID: regular expression
def check_officer(value):
    officer_pattern = r'^[\u4E00-\u9FA5](字第)([0-9a-zA-Z]{4,8})(号?)$'
    if re.match(officer_pattern, value):
        print('%s' % s3)
    else:
        print('%s' % s1)
Email: regular expression
def check_email(value):
    # Escape the dot and anchor the pattern so the whole value must be an address
    email_pattern = r'^[\w-]+@[\w-]+(\.[\w-]+)+$'
    if re.match(email_pattern, value):
        print('%s' % s2)
    else:
        print('%s' % s1)
Passport number: regular expression
def check_passport(value):
    # [a-zA-Z], not [a-zA-z]: the latter also matches the punctuation between 'Z' and 'a'
    passport_pattern = r'^([a-zA-Z]|[0-9]){5,17}$'
    if re.match(passport_pattern, value):
        print('%s' % s3)
    else:
        print('%s' % s1)
Hong Kong and Macau travel permit: regular expression
def check_HM_pass(value):
    HM_pass_pattern = r'^[HMhm]{1}([0-9]{10}|[0-9]{8})$'
    if re.match(HM_pass_pattern, value):
        print('%s' % s3)
    else:
        print('%s' % s1)
JDBC connection string: regular expression
def check_jdbc(value):
    jdbc_pattern = r'^jdbc:(((microsoft:)?sqlserver:\/\/((25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)):(([1-9]([0-9]{0,3}))|([1-6][0-5][0-5][0-3][0-5]))(;[ \d\w\/=\?%\-&_~`@[\]\':+!]*)?)|' \
                   r'(oracle:thin:@((25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)):(([1-9]([0-9]{0,3}))|([1-6][0-5][0-5][0-3][0-5])):[A-Za-z0-9_]+)|' \
                   r'(mysql:\/\/((25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)\.(25[0-5]|2[0-4]\d|[0-1]\d{2}|[1-9]?\d)):(([1-9]([0-9]{0,3}))|([1-6][0-5][0-5][0-3][0-5]))\/([A-Za-z0-9_]+)(\?([\d\w\/=\?%\-&_~`@[\]\':+!]*))?))$'
    if re.match(jdbc_pattern, value):
        print('%s' % s4)
    else:
        print('%s' % s1)
Date: regular expression
def check_datetime(value):
    # Matches "YYYY-MM-DD HH:MM:SS" (or "/" as the date separator), including leap-year checks
    datetime_pattern = r'((((19|20)\d{2})[-/](0?(1|[3-9])|1[012])[-/](0?[1-9]|[12]\d|30))|(((19|20)\d{2})[-/](0?[13578]|1[02])[-/]31)|' \
                       r'(((19|20)\d{2})[-/]0?2[-/](0?[1-9]|1\d|2[0-8]))|((((19|20)([13579][26]|[2468][048]|0[48]))|(2000))[-/]0?2[-/]29))' \
                       r'\s([0-1][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])$'
    if re.match(datetime_pattern, value):
        print('%s' % s2)
    else:
        print('%s' % s1)
Vehicle identification number (VIN): regular expression
def check_vin(value):
    # In a raw string a single backslash is enough: \d, not \\d
    vin_pattern = r'^[A-HJ-NPR-Z\d]{8}[\dX][A-HJ-NPR-Z\d]{2}\d{6}$'
    if re.match(vin_pattern, value):
        print('%s' % s3)
    else:
        print('%s' % s1)
Organization code: algorithm
def check_organization(value):
    organization_str = value.upper().replace('-', '')
    organization_pattern = r'^[\dA-Z]{8}[X\d]$'
    if re.search(organization_pattern, organization_str, re.S):
        # Position weights for the first 8 characters; letters count as A=10 ... Z=35
        verify_code = [3, 7, 9, 10, 5, 8, 4, 2]
        verify_code = 11 - sum([int(
            (ord(organization_str[index]) - 55) if organization_str[index].isalpha() else organization_str[index]
        ) * verify_code[index] for index in range(8)]) % 11
        verify_code = 'X' if verify_code == 10 else ('0' if verify_code == 11 else str(verify_code))
        if verify_code == organization_str[-1]:
            print('%s' % s3)
        else:
            print('%s' % s1)
    else:
        print('%s' % s1)
Business license number: algorithm
def check_business(value):
    business_pattern = r'^\d{15}$'
    if re.search(business_pattern, value, re.S):
        # Recursive mod 11,10 check over the first 14 digits
        verify_code = 10
        for index in range(14):
            verify_code = (((verify_code % 11 + int(value[index])) % 10 or 10) * 2) % 11
        verify_code = (11 - (verify_code % 10)) % 10
        if str(verify_code) == value[-1]:
            print('%s' % s3)
        else:
            print('%s' % s1)
    else:
        print('%s' % s1)
Unified social credit code: algorithm
def check_credit(value):
    credit_str = value.upper()
    credit_pattern = r'^(1[129]|5[1239]|9[123]|Y1)\d{6}[\dA-Z]{8}[X\d][\dA-Z]$'
    if len(credit_str) != 18:
        print('%s' % s1)
        return
    search = re.search(credit_pattern, credit_str, re.S)
    if search:
        # Characters 9 to 17 hold the organization code, which could also be
        # validated separately: check_organization(credit_str[8:17])
        str_to_num = {
            'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15, 'G': 16, 'H': 17, 'J': 18, 'K': 19,
            'L': 20, 'M': 21, 'N': 22, 'P': 23, 'Q': 24, 'R': 25, 'T': 26, 'U': 27, 'W': 28, 'X': 29, 'Y': 30}
        num_to_str = {
            10: 'A', 11: 'B', 12: 'C', 13: 'D', 14: 'E', 15: 'F', 16: 'G', 17: 'H', 18: 'J', 19: 'K',
            20: 'L', 21: 'M', 22: 'N', 23: 'P', 24: 'Q', 25: 'R', 26: 'T', 27: 'U', 28: 'W', 29: 'X', 30: 'Y'}
        verify_code = [1, 3, 9, 27, 19, 26, 16, 17, 20, 29, 25, 13, 8, 24, 10, 30, 28]
        # The final "% 31" maps a result of 31 to 0
        verify_code = (31 - sum([(str_to_num.get(credit_str[index], 0) if credit_str[index].isalpha() else int(credit_str[index])
                                  ) * verify_code[index] for index in range(17)]) % 31) % 31
        # Compare as strings: the last character of credit_str is never an int
        verify_code = num_to_str.get(verify_code, '') if verify_code > 9 else str(verify_code)
        if verify_code == credit_str[-1]:
            print('%s' % s3)
        else:
            print('%s' % s1)
    else:
        print('%s' % s1)