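The Python standard library documentation does not cover this module very thoroughly, so today 嚴小日 is pasting the official text here and offering a translation alongside it (the translation follows my understanding of the meaning rather than going word for word, so there may be discrepancies):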
When identifying things (such as host names) in the internet, it is often necessary to compare such identifications for “equality”. Exactly how this comparison is executed may depend on the application domain, e.g. whether it should be case-insensitive or not. It may be also necessary to restrict the possible identifications, to allow only identifications consisting of “printable” characters.
RFC 3454 defines a procedure for “preparing” Unicode strings in internet protocols. Before passing strings onto the wire, they are processed with the preparation procedure, after which they have a certain normalized form. The RFC defines a set of tables, which can be combined into profiles. Each profile must define which tables it uses, and what other optional parts of the stringprep procedure are part of the profile. One example of a stringprep profile is nameprep, which is used for internationalized domain names.
The module stringprep only exposes the tables from RFC 3454. As these tables would be very large to represent them as dictionaries or lists, the module uses the Unicode character database internally. The module source code itself was generated using the mkstringprep.py utility.
As a result, these tables are exposed as functions, not as data structures. There are two kinds of tables in the RFC: sets and mappings. For a set, stringprep provides the “characteristic function”, i.e. a function that returns true if the parameter is part of the set. For mappings, it provides the mapping function: given the key, it returns the associated value. Below is a list of all functions available in the module.
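When you identify things on the internet, such as host names, you often need to decide whether two such identifiers are exactly "equal". How that comparison is carried out depends on the application, for example whether it should be case-sensitive. It may also be necessary to restrict which identifiers are allowed, for example to those made up only of printable characters.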
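RFC 3454 therefore defines a procedure for "preparing" Unicode strings in internet protocols: before a string is sent over the wire it goes through the preparation procedure, after which it has a normalized form.
The RFC defines a number of tables, and these tables can be combined into profiles. Each profile has to state which tables it uses and which optional parts of the stringprep procedure it includes. One example is nameprep, which is used for internationalized domain names.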
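To make the idea of a profile more concrete, here is a minimal sketch of a nameprep-style pipeline built from the stringprep tables. The function name `prepare_label` is hypothetical, and the sketch is deliberately simplified: it leaves out the bidirectional-text checks and the unassigned-code-point check, so treat it as an illustration of "map, normalize, prohibit" rather than a conforming implementation.

```python
import stringprep
from unicodedata import ucd_3_2_0  # RFC 3454 is defined against Unicode 3.2


def prepare_label(label: str) -> str:
    """Hypothetical, simplified nameprep-style profile (bidi checks omitted)."""
    # Step 1: map. Drop characters that table B.1 maps to nothing,
    # then case-fold everything else with table B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)
    )

    # Step 2: normalize with NFKC.
    normalized = ucd_3_2_0.normalize("NFKC", mapped)

    # Step 3: prohibit. Reject any character found in the prohibited tables.
    prohibited_tables = (
        stringprep.in_table_c12,
        stringprep.in_table_c22,
        stringprep.in_table_c3,
        stringprep.in_table_c4,
        stringprep.in_table_c5,
        stringprep.in_table_c6,
        stringprep.in_table_c7,
        stringprep.in_table_c8,
        stringprep.in_table_c9,
    )
    for ch in normalized:
        if any(in_table(ch) for in_table in prohibited_tables):
            raise ValueError(f"prohibited character: {ch!r}")

    return normalized


if __name__ == "__main__":
    print(prepare_label("Example"))
    print(prepare_label("Stra\u00dfe"))
```

The choice of tables here follows what nameprep is generally described as using; a different profile would simply pick a different combination of the same building blocks.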
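stringprep only exposes the tables from RFC 3454. Because these tables would be very large if represented as dictionaries or lists, the module uses the Unicode character database internally. The module's source code was itself generated with mkstringprep.py.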
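As a result, the tables are exposed as functions rather than as data structures. The RFC has two kinds of tables. One kind is a set, for which the module provides a "characteristic function" that returns True if the argument is in the set.
The other kind is a mapping, for which it provides a mapping function: given a key, it returns the associated value.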
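A quick way to see the difference between the two kinds of tables is to call a few of them directly. The snippet below is only a small sketch; the sample characters are my own picks for illustration, and `show_tables` is a hypothetical helper name.

```python
import stringprep


def show_tables() -> dict:
    """Collect a few sample lookups; the characters are arbitrary examples."""
    return {
        # Set tables: each in_table_* function answers "is this character in the set?"
        "space in C.1.1": stringprep.in_table_c11(" "),
        "'a' in C.1.1": stringprep.in_table_c11("a"),
        "soft hyphen in B.1": stringprep.in_table_b1("\u00ad"),
        "Hebrew alef in D.1": stringprep.in_table_d1("\u05d0"),
        "'a' in D.2": stringprep.in_table_d2("a"),
        # Mapping tables: each map_table_* function returns the mapped value.
        "B.2 of 'A'": stringprep.map_table_b2("A"),
        "B.3 of sharp s": stringprep.map_table_b3("\u00df"),
    }


if __name__ == "__main__":
    for description, result in show_tables().items():
        print(f"{description}: {result!r}")
```

Every function takes a single character, so checking a whole string means looping over it yourself.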
P.S. This module is fairly niche and rarely used, but it is a good excuse to learn a little about RFCs along the way.