3 RSA scalability
Obviously a post-quantum RSA public key n will need to be quite large to resist the attacks described in Section 2. This section analyzes the scalability of the best algorithms available for RSA key generation, encryption, decryption, signature generation, and signature verification.
Small exponents. The fundamental RSA public-key operation is computing an eth power modulo n. This modular exponentiation uses approximately lg e squarings modulo n, and, thanks to standard windowing techniques, o(lg e) extra multiplications modulo n.
In the original RSA paper [43], e was a random number with as many bits as n. Rabin in [42] suggested instead using a small constant e, and said that e = 2 is “several hundred times faster.” Rabin’s speedup factor grows as Θ(lg n), making it particularly important for the large sizes of n considered in this paper.
The slower but simpler choice e = 3 was deployed in a variety of real-world applications. The much slower alternative e = 65537 subsequently became popular as a means of compensating for poor choices of RSA message-randomization mechanisms, but with proper randomization no attacks against e = 3 are known that are faster than factorization.
For simplicity this paper also focuses on e = 3. Computing an eth power modulo n then takes one squaring modulo n and one general multiplication modulo n. Each of these steps takes just (lg n)^(1+o(1)) bit operations using standard fast-multiplication techniques; see below for further discussion. Notice that (lg n)^(1+o(1)) is asymptotically far below the (lg n)^(2+o(1)) cost of Shor’s algorithm.
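As a minimal sketch of the public-key operation for e = 3, the cube can be computed with exactly one squaring and one general multiplication modulo n. The function name `cube_mod` and the toy modulus are illustrative assumptions, not part of the paper.

```python
def cube_mod(x: int, n: int) -> int:
    """Compute x^3 mod n with one squaring and one general multiplication."""
    square = (x * x) % n       # one squaring modulo n
    return (square * x) % n    # one general multiplication modulo n


# Toy check against Python's built-in modular exponentiation.
assert cube_mod(7, 55) == pow(7, 3, 55)
```

Python integers are arbitrary precision, so the same two steps apply unchanged to very large n; only the cost of each multiplication grows.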
Many primes. The fundamental RSA secret-key operation is computing an eth root modulo n. For e = 3 one chooses n as a product of distinct primes congruent to 2 modulo 3; then the inverse of x → x^3 mod n is x → x^d mod n, where d = (1 + 2∏_(p|n) (p-1))/3. Unfortunately, d is not a small exponent: it has approximately lg n bits.
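The formula for d can be checked on a toy modulus. The sketch below assumes three small primes (5, 11, 17) chosen only for illustration; the helper name `cube_root_exponent` is hypothetical.

```python
from math import prod


def cube_root_exponent(primes):
    """Return d = (1 + 2 * prod(p - 1)) / 3 for distinct primes p = 2 mod 3."""
    assert all(p % 3 == 2 for p in primes), "each prime must be 2 mod 3"
    assert len(set(primes)) == len(primes), "primes must be distinct"
    phi = prod(p - 1 for p in primes)
    # Each p - 1 is 1 mod 3, so phi is 1 mod 3 and 1 + 2*phi is divisible by 3.
    return (1 + 2 * phi) // 3


primes = [5, 11, 17]          # toy primes, far too small for real use
n = prod(primes)
d = cube_root_exponent(primes)

# x -> x^d mod n inverts x -> x^3 mod n for every residue x.
assert all(pow(pow(x, 3, n), d, n) == x for x in range(n))
```

The check works because 3d = 1 + 2∏(p-1), and every p-1 divides that product, so x^(3d) is congruent to x modulo each prime.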
A classic speedup in the computation of x^d mod n is to compute x^d mod p and x^d mod q, where p and q are the prime divisors of n, and to combine them into x^d mod n by a suitably explicit form of the Chinese remainder theorem. Fermat’s identity x^p mod p = x mod p further implies that x^d mod p = x^(d mod (p-1)) mod p (since d mod (p-1) ≥ 1) and similarly x^d mod q = x^(d mod (q-1)) mod q. The exponents d mod (p-1) and d mod (q-1) have only half as many bits as n; the exponentiation x^d mod n is thus replaced by two exponentiations with half-size exponents and half-size moduli.
If n is a product of more primes, say k ≥ 3 primes, then the same speedup becomes even more effective, using k exponentiations with (1/k)-size exponents and (1/k)-size moduli. Prime generation also becomes much easier since the primes are smaller. Of course, if primes are too small then the attacker can find them using the ring algorithms discussed in the previous section, specifically EECM before quantum computers, and GEECM after quantum computers.
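The k-prime speedup described above can be sketched as follows. This is an illustrative toy, assuming odd primes and a naive explicit CRT sum rather than the fast interpolation a real implementation would use; `crt_power` is a hypothetical name.

```python
from math import prod


def crt_power(x, d, primes):
    """Compute x^d mod n (n = product of primes) via one small
    exponentiation per prime, then recombine with an explicit CRT sum."""
    n = prod(primes)
    result = 0
    for p in primes:
        # Fermat: x^d = x^(d mod (p-1)) (mod p), valid when d mod (p-1) >= 1.
        reduced_exponent = d % (p - 1)
        assert reduced_exponent >= 1
        residue = pow(x % p, reduced_exponent, p)
        # CRT weight: (n/p) times its inverse modulo p.
        cofactor = n // p
        result += residue * cofactor * pow(cofactor, -1, p)
    return result % n


primes = [5, 11, 17]
n, d = prod(primes), 427      # d = (1 + 2*4*10*16)/3 for these toy primes
assert all(crt_power(x, d, primes) == pow(x, d, n) for x in range(n))
```

Each inner `pow` works with an exponent and modulus of roughly (1/k) the size of n, which is where the saving comes from. The three-argument `pow` with exponent -1 requires Python 3.8 or later.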
What matters for this paper is how multi-prime RSA scales to much larger moduli n. Before quantum computers the top threats are EECM and NFS, and balancing these threats implies that each prime p has (lg n)^(2/3+o(1)) bits (see above), i.e., that k ∈ (lg n)^(1/3+o(1)). After quantum computers the top threats are GEECM and Shor’s algorithm, and balancing these threats implies that each prime p has just (lg lg n)^(2+o(1)) bits, i.e., that k ∈ (lg n)/(lg lg n)^(2+o(1)). RSA key generation, decryption, and signature generation then take (lg n)^(1+o(1)) bit operations; see below for further discussion.
Key generation. To recap: A k-prime exponent-3 RSA public key n is a product of k distinct primes p congruent to 2 modulo 3. In particular, a post-quantum RSA public key n is a product of k distinct primes p congruent to 2 modulo 3, where each prime p has (lg lg n)^(2+o(1)) bits.
Standard prime-generation techniques use (lg p)^(3+o(1)) bit operations. See, e.g., [6, Section 3] and [38, Section 4.5]. The point is that one must try about log p random numbers before finding a prime, and checking primality has similar cost to a single exponentiation modulo p.
A standard speedup is to check whether p is divisible by any primes up through some limit, say y. The chance of a random integer surviving this divisibility test is approximately 1/log y, reducing the original pool of log p random numbers to (log p)/log y random numbers and saving an overall factor of log y if the trial division is not a bottleneck. The conventional view is that keeping the cost of trial division under control requires y to be chosen as a polynomial in lg p, saving a factor of only Θ(lg lg p) and thus still requiring (lg p)^(3+o(1)) bit operations.
A nonstandard speedup is to replace trial division (or sieving) by batch trial division [8] or batch smoothness detection [9]. The algorithm of [9] reads a finite sequence S of positive integers and a finite set P of primes, and finds “the largest P-smooth divisor of each integer in S” using just b(lg b)^(2+o(1)) bit operations, where b is the total number of bits in P and S. In particular, if P is the set of primes up through y, and S is a sequence of Θ(y/lg p) integers each having Θ(lg p) bits, then b is Θ(y) and this algorithm uses just y(lg y)^(2+o(1)) bit operations, i.e., (lg p)(lg y)^(2+o(1)) bit operations for each element of S. Larger sequences S can trivially be split into sequences of size Θ(y/lg p), producing the same performance per element of S.
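The filtering criterion behind batch trial division can be illustrated with a single product z of all primes up to y: a candidate s has a prime factor at most y exactly when gcd(s, z mod s) > 1. The sketch below is a deliberately simplified stand-in, not the algorithm of [9]: it reduces z modulo each candidate naively, whereas the quasi-linear cost quoted above depends on sharing that work through product and remainder trees. The names `primes_up_to` and `filter_candidates` are hypothetical.

```python
from math import gcd, prod


def primes_up_to(y):
    """Sieve of Eratosthenes: all primes <= y."""
    sieve = bytearray([1]) * (y + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(y ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, y + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def filter_candidates(candidates, y):
    """Keep only candidates with no prime factor <= y.

    Naive per-candidate reduction of z; a real batch algorithm would
    compute all the values z mod s together with a remainder tree."""
    z = prod(primes_up_to(y))
    return [s for s in candidates if gcd(s, z % s) == 1]


# 30027 = 3 * 10009 and 1001 = 7 * 11 * 13 are weeded out;
# 10403 = 101 * 103 survives because both factors exceed y = 50.
survivors = filter_candidates([10403, 10007, 30027, 10009, 1001], 50)
assert survivors == [10403, 10007, 10009]
```

Survivors are not necessarily prime, as 10403 shows; they still need a primality test, but far fewer candidates reach that more expensive step.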
To do even better, assume that the original size of S is at least 2^(2^α), and apply batch smoothness detection successively for y = 2^(2^0), y = 2^(2^1), y = 2^(2^2), and so on through y = 2^(2^α). Each step weeds out about half of the remaining elements of S as composites; the next step costs about four times as much per element but is applied to only half as many elements. The total cost is just (lg p)(2^α)^(1+o(1)) bit operations for each of the original elements of S. Each of the original elements has probability about 1/2^α of surviving this process and incurring an exponentiation, which costs (lg p)^(2+o(1)) bit operations. Choosing 2^α ∈ (lg p)^(0.5+o(1)) balances these costs as (lg p)^(1.5+o(1)) for each of the original elements of S, i.e., (lg p)^(2.5+o(1)) for each prime generated.
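Ignoring the o(1) terms, the balancing argument above can be written out as a back-of-the-envelope derivation. The labels C_sieve and C_exp are introduced here for illustration only and are not the paper's notation; the last step uses the earlier observation that about lg p candidates are tried per prime.

```latex
\begin{align*}
C_{\text{sieve}} &\approx (\lg p)\sum_{j=0}^{\alpha} 2^{-j}\,4^{j}
                  \approx (\lg p)\,2^{\alpha}
  && \text{(step $j$ costs $\approx 4^{j}\lg p$, reaches a $2^{-j}$ fraction)} \\
C_{\text{exp}}   &\approx 2^{-\alpha}\,(\lg p)^{2}
  && \text{(survival probability times exponentiation cost)} \\
C_{\text{sieve}} = C_{\text{exp}}
  &\iff 2^{2\alpha} = \lg p
   \iff 2^{\alpha} = (\lg p)^{1/2} \\
\text{cost per candidate} &\approx (\lg p)^{3/2},
  \qquad
\text{cost per prime} \approx (\lg p)\cdot(\lg p)^{3/2} = (\lg p)^{5/2}
\end{align*}
```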
In the context of post-quantum RSA the assumption about the original size of S is satisfied: one has to generate (lg n)^(1+o(1)) primes, so the original size of S is (lg n)^(1+o(1)), which is at least 2^(2^α) for 2^α ∈ (1 + o(1)) lg lg n; this choice of α satisfies 2^α ∈ (lg p)^(0.5+o(1)) since lg p ∈ (lg lg n)^(2+o(1)). The primes are also balanced, in the sense that (lg n)/k ∈ (lg p)^(1+o(1)) for each p, so generating k primes in this way uses k(lg p)^(2.5+o(1)) = (lg n)(lg p)^(1.5+o(1)) = (lg n)(lg lg n)^(3+o(1)) bit operations.
Computing n by multiplying these primes uses only (lg n)(lg lg n)^(2+o(1)) bit operations using standard fast-arithmetic techniques; see, e.g., [10, Section 12]. At this level of detail it does not matter whether one uses the classic Schönhage–Strassen multiplication algorithm [46], Fürer’s multiplication algorithm [21], or the Harvey–van der Hoeven–Lecerf multiplication algorithm [27].
The total number of bit operations for key generation is essentially linear in lg n. For comparison, the usual picture is that prime generation is vastly more expensive than any of the other steps in RSA.
One can try to further accelerate key generation using Takagi’s idea [52] of choosing n as p^(k-1) q. We point out two reasons that this is worrisome. The first reason is lattice attacks [13]. The second reason is that any nth power modulo n has small order, namely some divisor of (p-1)(q-1); Shor’s algorithm finds the order at relatively high speed once the nth power is computed.
Encryption and decryption. There are many different RSA encryption mechanisms in the literature. The oldest mechanisms use RSA to directly encrypt a user’s message; this requires careful padding and scrambling of the message. Newer mechanisms generate a secret key (for example, an AES key), use the secret key to encrypt and authenticate the user’s message, and use RSA to encrypt the secret key; this allows simpler padding, since the secret key is already randomized. The newest mechanisms such as Shoup’s “RSA-KEM” [51] simply use RSA to encrypt lg n bits of random data, hash the random data to obtain a secret key, and use the secret key to encrypt and authenticate the user’s message; this does not require any padding. For simplicity this paper takes the last approach.
Generating large amounts of truly random data is expensive. Fortunately, truly random data can be simulated by pseudorandom data produced by a stream cipher from a much smaller key. (Even better, slight deficiencies in the randomness of the cipher key do not compromise security.) The literature contains several scalable ciphers that produce a Θ(b)-bit block of output from a Θ(b)-bit key, with a conjectured 2^b security level, using b^(2+o(1)) bit operations (and even fewer for some ciphers), i.e., b^(1+o(1)) bit operations for each output bit. In the context of post-quantum RSA one has b ∈ Θ(lg lg n) so generating lg n pseudorandom bits costs (lg n)(lg lg n)^(1+o(1)) bit operations. The same ciphers can also be converted into hash functions with only a constant-factor loss in efficiency, so hashing the bits also costs (lg n)(lg lg n)^(1+o(1)) bit operations.
Multiplication also takes (lg n)(lg lg n)^(1+o(1)) bit operations. Squaring, reduction modulo n, multiplication, and another reduction modulo n together take (lg n)(lg lg n)^(1+o(1)) bit operations. The overall cost of RSA encryption is therefore (lg n)(lg lg n)^(1+o(1)) bit operations plus the cost of encrypting and authenticating the user’s message under the resulting secret key.
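The RSA-KEM flow for e = 3 can be sketched in a few lines. Everything beyond the flow itself is an assumption made for illustration: SHA-256 stands in for the scalable hash discussed above, `secrets.randbelow` stands in for the stream-cipher expansion of a short seed, and the function names and toy modulus are hypothetical.

```python
import hashlib
import secrets


def _hash_to_key(r: int, n: int) -> bytes:
    """Hash the random value r (fixed-width big-endian) into a secret key.
    SHA-256 is an illustrative choice, not the paper's hash."""
    width = (n.bit_length() + 7) // 8
    return hashlib.sha256(r.to_bytes(width, "big")).digest()


def encapsulate(n: int):
    """Sender: pick random r < n, send c = r^3 mod n, keep hash(r) as key."""
    r = secrets.randbelow(n)
    c = (r * r % n) * r % n    # squaring, reduction, multiplication, reduction
    return c, _hash_to_key(r, n)


def decapsulate(c: int, d: int, n: int) -> bytes:
    """Receiver: recover r as the cube root of c, then hash it."""
    r = pow(c, d, n)           # a real implementation would use CRT here
    return _hash_to_key(r, n)


n, d = 5 * 11 * 17, 427        # toy key; d = (1 + 2*4*10*16)/3
ciphertext, sender_key = encapsulate(n)
assert decapsulate(ciphertext, d, n) == sender_key
```

No padding appears anywhere: the random value fills the whole range below n, and only its hash is ever used as a key.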
Decryption is more complicated but not much slower; it works as follows. First reduce the ciphertext modulo all of the prime divisors of n. This takes (lg n)(lg lg n)^(2+o(1)) bit operations using a remainder tree or a scaled remainder tree; see, e.g., [10, Section 18]. Then compute a cube root modulo each prime. A cube root modulo p takes (lg p)^(2+o(1)) bit operations, so all of the cube roots together take (lg n)(lg lg n)^(2+o(1)) bit operations. Then reconstruct the cube root modulo n. This takes (lg n)(lg lg n)^(2+o(1)) bit operations using fast interpolation techniques; see, e.g., [10, Section 23]. Finally hash the cube root. The overall cost of RSA decryption is (lg n)(lg lg n)^(2+o(1)) bit operations, plus the cost of verifying and decrypting the user’s message under the resulting secret key.
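The first two decryption steps, reduction through a remainder tree and a cube root modulo each prime, can be sketched as below. This is a toy under stated assumptions: an unscaled remainder tree over a plain pairwise product tree, with hypothetical function names; the recombination step (fast interpolation) and the final hash are left out.

```python
def product_tree(values):
    """Levels of pairwise products: tree[0] holds the leaves,
    tree[-1] holds the single root (the product of everything)."""
    tree = [list(values)]
    while len(tree[-1]) > 1:
        level = tree[-1]
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            nxt.append(pair[0] * pair[1] if len(pair) == 2 else pair[0])
        tree.append(nxt)
    return tree


def remainder_tree(c, primes):
    """Return [c mod p for p in primes], reducing top-down through the tree
    so that large reductions are shared between neighbouring primes."""
    tree = product_tree(primes)
    rems = [c % tree[-1][0]]
    for level in reversed(tree[:-1]):
        rems = [rems[i // 2] % m for i, m in enumerate(level)]
    return rems


def cube_roots_mod_primes(c, primes):
    """Cube root of c modulo each prime p = 2 mod 3.
    Cubing mod p is inverted by the exponent (2p - 1)/3."""
    residues = remainder_tree(c, primes)
    return [pow(r, (2 * p - 1) // 3, p) for r, p in zip(residues, primes)]


primes = [5, 11, 17, 23, 29]
n = 5 * 11 * 17 * 23 * 29
for r in (0, 1, 2, 12345, n - 1):
    c = pow(r, 3, n)
    assert cube_roots_mod_primes(c, primes) == [r % p for p in primes]
```

The per-prime exponent (2p - 1)/3 is the single-prime case of the formula for d: 3 times it equals 1 + 2(p - 1).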
Shamir in [47] proposed decrypting modulo just one prime, and choosing plaintexts to be smaller than primes. However, this requires exponents to be much larger for security, and in the context of post-quantum RSA this slows down encryption by vastly more than it speeds up decryption. A more interesting variant, which we do not explore further, is to use a significant fraction of the primes to decrypt a plaintext having (lg n)/(lg lg n)^(0.5+o(1)) bits; this should reduce the total cost of encryption and decryption to (lg n)(lg lg n)^(1.5+o(1)) bit operations with a properly chosen exponent.
Signature generation and verification. Standard padding schemes for RSA signatures involve the same operations discussed above, such as hashing to a short string and using a stream cipher to expand the short string to a long string.
The final speeds are, unsurprisingly, (lg n)(lg lg n)^(2+o(1)) bit operations to generate a signature and (lg n)(lg lg n)^(1+o(1)) bit operations to verify a signature, plus the cost of hashing the user’s message.
4 Concrete parameters and initial implementation
Summarizing what we’ve learned so far: Shor’s algorithm takes (lg n)^(2+o(1)) qubit operations to factor n. If the prime divisors of n are too small then GEECM becomes a larger threat than Shor’s algorithm; protecting against GEECM requires each prime to have (lg lg n)^(2+o(1)) bits. Section 3 showed that, under this constraint, all of the RSA operations can be carried out using (lg n)(lg lg n)^(O(1)) bit operations; the O(1) is 3 + o(1) for key generation, 2 + o(1) for decryption and signature generation, and 1 + o(1) for encryption and signature verification.
These asymptotics do not imply anything about any particular size of n. This section looks at performance in more detail, and in particular reports successful generation of a 1-terabyte post-quantum RSA key built from 4096-bit primes.
Prime sizes and key sizes. Before looking at performance, we explain why these sizes (1-terabyte key, 4096-bit primes) provide ample security.
A 1-terabyte key n has 2^43 bits, so Shor’s algorithm uses 2^44 multiplications modulo n. We have not found literature analyzing the cost of circuits for optimized FFT-based multiplication at this scale, so we extrapolate as follows.
The recent speed records from Harvey–van der Hoeven–Lecerf [28] for multiplication of degree-2^21 polynomials over a particularly favorable finite field, F_(2^60), use 640 milliseconds on a 3.4GHz CPU core. More than half of the cycles are performing 128-bit vector xor, and more than 10% of the cycles are performing 64×64-bit polynomial multiplications, according to [28, Section 3.3], for a total of approximately 2^40 bit operations to multiply 2^27-bit inputs.
Imagine that the same 2^13 ratio scales directly from 2^27-bit inputs to 2^43-bit inputs; that integer multiplication uses as few bit operations as binary-polynomial multiplication; that reduction modulo n does not cost anything; and that there are no overheads for switching from bit operations to reversible qubit operations inside a realistic quantum-computer architecture. (For comparison, the ratio in [56] is more than 2^20 for 2^20-bit inputs.) Each multiplication modulo n inside Shor’s algorithm then uses 2^56 qubit operations, and overall Shor’s algorithm consumes an astonishing 2^100 qubit operations.
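The extrapolation above is plain exponent arithmetic, reproduced here so each figure can be traced. The function name is hypothetical, and the inputs are exactly the numbers quoted in the text; nothing new is measured.

```python
from math import log2


def shor_cost_exponents():
    """Return the base-2 exponents of each quantity in the estimate."""
    key_bits = 8 * 2**40                   # 1 terabyte = 2^40 bytes
    multiplications = 2 * key_bits         # the text's count: 2^44 for 2^43 bits
    ops_per_input_bit = 2**40 // 2**27     # ratio taken from the [28] record
    ops_per_multiplication = key_bits * ops_per_input_bit
    total_ops = multiplications * ops_per_multiplication
    return {
        "key_bits": log2(key_bits),
        "multiplications": log2(multiplications),
        "ratio": log2(ops_per_input_bit),
        "ops_per_multiplication": log2(ops_per_multiplication),
        "total_ops": log2(total_ops),
    }


print(shor_cost_exponents())
```

The exponents come out as 43, 44, 13, 56, and 100, matching the figures in the paragraph above.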
We caution the reader that this is only a preliminary estimate. A thorough analysis would have to account for several overheads mentioned above; for the number of Shor iterations required; for known techniques to reduce the number of iterations; for techniques to use slightly fewer multiplications per iteration; and for the latest improvements in integer-multiplication algorithms.
As for prime sizes: Standard pre-quantum cost analyses conclude that 4096-bit RSA keys provide roughly 2^140 security against all available algorithms. ECM is well known to be inferior to NFS at such sizes; evidently it uses even more than 2^140 bit operations to find 2048-bit primes. ECM would be even slower against a much larger modulus, simply because arithmetic is slower. However, the speedup from ECM to GEECM reduces the post-quantum security level of 2048-bit primes. Rather than engaging in a detailed analysis of this loss, we move up to 4096-bit primes, obviously putting GEECM far out of reach.
Implementation. We now discuss our implementation of post-quantum RSA. Our main result is successful generation of a 1-terabyte exponent-3 RSA key consisting of 4096-bit primes. We also have preliminary results for encryption and decryption, although so far only for smaller sizes.
Our computations were performed on a heterogeneous cluster. We give a description of the machines in Appendix A. The memory-intensive portions of our computations were carried out on a single machine running Ubuntu with 24 cores at 3.40 GHz (4 Intel Xeon E7-8893 v2 processors), 3 terabytes of DRAM, and 4.9 terabytes of swap memory built from enterprise SSDs. We will refer to this machine as lattice0 below. We measured memory consumption and overall runtime for bignum multiplications using GNU’s Multiple Precision (GMP) Library [26]. We encountered a number of software limits and bugs, which we detail in Appendix A.
Prime generation. Generating a 1-terabyte exponent-3 RSA key requires 2^31 4096-bit primes that are congruent to 2 mod 3. To efficiently generate such a large number of primes, our implementation first applies the batched smoothness detection technique discussed in Section 3 to an input collection of random 4096-bit numbers. We then use the Fermat congruence primality test to produce our final set of primes. While we do not prove that each number in the final output is prime, this test is sufficient to guarantee with high confidence that all of the 4096-bit numbers in the final output are prime. See [31] for quantitative upper bounds on the error probability.
We found that first filtering for random numbers congruent to 5 mod 6, and then applying batch sieving with the successive bounds y = 2^10 and y = 2^20 worked well in practice. Our heterogeneous cluster was able to generate primes at a rate of 750–1585 primes per core-hour. Generating all 2^31 primes took approximately 1,975,000 core-hours. In calendar time, prime generation completed in four months running on spare compute capacity of a 1,400-core cluster.
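The outer loop of this pipeline, drawing candidates congruent to 5 mod 6 and applying a Fermat test, can be sketched as follows. The batch sieving stage is omitted, the Fermat base 2 is an assumption (the text does not state which base was used), and the function names and the 64-bit demo size are illustrative.

```python
import secrets


def random_candidate(bits: int) -> int:
    """Random integer with exactly `bits` bits that is 5 mod 6,
    hence odd and congruent to 2 mod 3."""
    while True:
        x = secrets.randbits(bits) | (1 << (bits - 1))   # force the top bit
        x += 5 - (x % 6)                                 # move up to 5 mod 6
        if x.bit_length() == bits:                       # reject rare overflow
            return x


def passes_fermat(x: int, base: int = 2) -> bool:
    """Fermat congruence test: base^(x-1) = 1 (mod x). Base 2 is assumed."""
    return pow(base, x - 1, x) == 1


def generate_probable_prime(bits: int) -> int:
    """Draw candidates until one passes the Fermat test."""
    while True:
        x = random_candidate(bits)
        if passes_fermat(x):
            return x


p = generate_probable_prime(64)   # small size so the demo runs instantly
assert p % 6 == 5 and p.bit_length() == 64
```

Restricting to 5 mod 6 discards multiples of 2 and 3 up front and guarantees the residue 2 mod 3 that exponent 3 requires.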
Product tree. After we successfully generated 2^31 4096-bit primes, we used a product tree to compute the 1-terabyte public RSA key. We distributed individual multiplications across our heterogeneous cluster to reduce the wall-clock time. We first multiplied batches of 8 million primes and wrote their products out to disk. Each subsequent single-threaded multiplication job read two integers from disk and wrote their product back to disk. Running times varied due to different CPU types and non-pqRSA related jobs sharing cache space. Once the integers reached 256GB in size, we finished computing the product on lattice0. The aggregate wall-clock time used by individual multiply jobs was about 1,239,626 seconds, and the elapsed time for the terabyte key generation was about four days. The final multiplication of two 512GB integers took 176,223 seconds in wall-clock time, using 3.166TB of RAM and 2.5TB of swap storage.
Encryption. We implemented RSA encryption using RSA-KEM, as described in Section 3. With the exponent e = 3, we found that a simple square-and-reduce using GMP’s mpz_mul and mpz_mod was almost twice as fast as using the modular exponentiation function mpz_powm. Each operation was single-threaded. We were able to complete RSA encryption for modulus sizes up to 2 terabits, as shown in Table 4.1. For the 2Tb (256GB) encryption, the longest multiplication took 13 hours, modular reduction took 40 hours, and in total encryption took a little over 100 hours.
Decryption. We implemented RSA decryption as described in Section 3. Table 4.1 gives wall-clock timings for the three computational steps in decryption, each parallelized across 48 threads. Precomputing the entire product and remainder tree for a terabyte-sized key and storing it to disk would have taken 32TB of disk space, so instead we recomputed portions of the trees on the fly. The reported timings for the remainder tree step in Table 4.1 include the time it takes to recompute both the product and remainder tree with a batch size of 8 million primes. Using a batch size of 8 million primes was roughly twice as fast as using a batch size of 2 million primes. We obtained experimental results for decryption of messages for key sizes of up to 16GB.
A Appendix: Implementation barriers and details
Extending GMP’s integer capacity. The GMP library uses hard-coded 32-bit integers to represent sizes in multiple locations in the library. Without any modifications, GMP supports 2^37-bit integers on 64-bit machines [25]. To represent large values, we extended GMP’s capacity from 32-bit integers to 64-bit integers by changing the data typing in GMP’s integer structure, mpz. Namely, we changed mpz_size and mpz_alloc from int types to int64_t types. To accommodate increased memory usage, we increased the bound for GMP’s memory allocation for the mpz struct in realloc.c to LLONG_MAX. The final modifications we made were to create binary-format I/O functions for 64-bit mpzs, namely in mpz_inp_out.c and mpz_out_raw.c.
Impact of swapping. We initially evaluated the performance of our product-tree implementation by generating a “dummy key”, a terabyte product of random 4096-bit integers. During this product computation, we counted instructions per CPU cycle (IPCs) with the command perf stat -e instructions,cycles -a sleep 1 to measure the lost performance caused by swapping. When no swapping occurred, the machine had about 2 instructions per cycle, but upon swapping, the instructions per cycle dropped as low as 0.37 and held around 0.5 to 1.2.
GMP memory consumption. GMP’s memory consumption is another concern. High RAM and swap usage at higher levels in the product tree are attributed to GMP’s FFT implementation. According to GMP’s developers, their FFT implementation consumes about 8n bytes of temporary memory space for an n·n product where n is the byte size of the factors [57]. This massive consumption of memory also triggered a known race condition in the Linux kernel [2]. The bug was found in the huge_memory.c code. There are numerous bug reports for variants of the same bug on various mainline Linux systems throughout the past six years. Disabling transparent huge pages avoided the transparent hugepage code in the kernel.
Measurements for 1-terabyte key product tree. In Table A.1, we show the wall-clock time for each level of computing a 1-terabyte product tree. Levels far down in the product tree are easily parallelized. We carried out the entire computation on lattice0 using 48 threads. The computation used a peak of 3.16TB of RAM and 2.22TB of swap memory, and completed in 356,709 seconds, or approximately 4 days, in wall-clock time.
Heterogeneous cluster description.
B Credits for multi-prime RSA
The idea of using RSA with more than two primes is most commonly credited to Collins, Hopkins, Langford, and Sabin, who received patent 5848159 in 1998 for “RSA with several primes”:
The invention, allowing 4 primes each about 150 digits long to obtain a 600 digit n, instead of two primes about 350 [sic] digits long, results in a marked improvement in computer performance. For, not only are primes that are 150 digits in size easier to find and verify than ones on the order of 350 digits, but by applying techniques the inventors derive from the Chinese Remainder Theorem (CRT), public key cryptography calculations for encryption and decryption are completed much faster, even if performed serially on a single processor system.
However, the same idea had already appeared in the original RSA patent in 1983:
In alternative embodiments, the present invention may use a modulus n which is a product of three or more primes (not necessarily distinct). Decoding may be performed modulo each of the prime factors of n and the results combined using “Chinese remaindering” or any equivalent method to obtain the result modulo n.
In any event, both of these patents have now expired, so they will not interfere with the deployment of post-quantum RSA.