# Perfect Matchings and RNA Secondary Structures
# Problem
# Figure 5. A perfect matching on the basepair edges is highlighted in red and represents a candidate secondary structure for the RNA strand.A matching in a graph G is a collection of edges of G for which no node belongs to more than one edge in the collection. See Figure 2 for examples of matchings. If G contains an even number of nodes (say 2n), then a matching on G is perfect if it contains n edges, which is clearly the maximum possible. An example of a graph containing a perfect matching is shown in Figure 3.
# First, let Kn denote the complete graph on 2n labeled nodes, in which every node is connected to every other node with an edge, and let pn denote the total number of perfect matchings in Kn. For a given node x, there are 2n?1 ways to join x to the other nodes in the graph, after which point we must form a perfect matching on the remaining 2n?2 nodes. This reasoning provides us with the recurrence relation pn=(2n?1)?pn?1; using the fact that p1 is 1, this recurrence relation implies the closed equation pn=(2n?1)(2n?3)(2n?5)?(3)(1).
# Given an RNA string s=s1…sn, a bonding graph for s is formed as follows. First, assign each symbol of s to a node, and arrange these nodes in order around a circle, connecting them with edges called adjacency edges. Second, form all possible edges {A, U} and {C, G}, called basepair edges; we will represent basepair edges with dashed edges, as illustrated by the bonding graph in Figure 4.
# Note that a matching contained in the basepair edges will represent one possibility for base pairing interactions in s, as shown in Figure 5. For such a matching to exist, s must have the same number of occurrences of 'A' as 'U' and the same number of occurrences of 'C' as 'G'.
# Given: An RNA string s of length at most 80 bp having the same number of occurrences of 'A' as 'U' and the same number of occurrences of 'C' as 'G'.
# Return: The total possible number of perfect matchings of basepair edges in the bonding graph of s.
# Sample Dataset
# >Rosalind_23
# AGCUAGUCAU
# Sample Output
# 12
# 本題要求計(jì)算RNA序列對(duì)應(yīng)的堿基配對(duì)圖中袋哼,所有含有k個(gè)堿基對(duì)的完美匹配數(shù)量。其中k為RNA序列的堿基數(shù)的一半楞件。一個(gè)完美匹配表示所有堿基都和其它一個(gè)堿基配對(duì),而且每個(gè)堿基只能匹配一次。我們可以使用公式 pn=(2n-1)(2n-3)...(3)(1) 來(lái)計(jì)算完美匹配的數(shù)量粒督,其中 n = k * 2民泵。因此皆撩,本題的步驟包括:根據(jù)給定的RNA序列生成堿基配對(duì)圖,計(jì)算該圖中所有含有 k 個(gè)堿基對(duì)的完美匹配的數(shù)量续誉。輸出計(jì)算結(jié)果莱没。
from mathimport factorial
def count_perfect_matchings(s):
# 計(jì)算A、C酷鸦、G和U在s中的出現(xiàn)次數(shù)
? ? counts = {'A':0, 'C':0, 'G':0, 'U':0}
for nucleotidein s:
counts[nucleotide] +=1
? ? ? ? print(counts)
# 檢查A的數(shù)量等于U的數(shù)量饰躲,C的數(shù)量等于G的數(shù)量
? ? if counts['A'] != counts['U']or counts['C'] != counts['G']:
return 0
? ? # 計(jì)算完美匹配的總數(shù)
? ? n = counts['A']
return factorial(n) * factorial(n -1)
s ="AGCUAGUCAU"
print(count_perfect_matchings(s))