Majority Element

Problem

Given an array of size n, find the majority element. The majority element is the element that appears more than ⌊n/2⌋ times.

You may assume that the array is non-empty and that the majority element always exists in the array.

Approach #1 Brute Force

Intuition
We can exhaust the search space in quadratic time by checking whether each element is the majority element.

Algorithm
The brute force algorithm iterates over the array, and then iterates again for each number to count its occurrences. As soon as a number is found to appear more than ⌊n/2⌋ times (more than any other number possibly can), it is returned.

#include <iostream>
#include <vector>

int majorityElement(std::vector<int>& nums)
{
    int size = (int)nums.size();
    int halfCount = size / 2;
    
    for (auto num : nums)
    {
        int count = 0;
        
        for (auto elem : nums)
        {
            if (elem == num)
            {
                ++count;
            }
        }
        
        if (count > halfCount)
        {
            return num;
        }
    }
    
    return -1;
}

int main()
{
    int arr[] = { 1, 2, 3, 2, 4, 2, 2, 2, 2, 5, 7};
    std::vector<int> nums(arr, arr + sizeof(arr) / sizeof(arr[0]));
    int result = majorityElement(nums);
    
    std::cout << result << std::endl;
    
    return 0;
}

Complexity Analysis

  • Time complexity : O(n^2)
    The brute force algorithm contains two nested for loops that each run for n iterations, which multiplies out to quadratic time complexity.
  • Space complexity : O(1)
    The brute force solution does not allocate additional space proportional to the input size.

Approach #2 HashMap

Intuition
We know that the majority element occurs more than ⌊n/2⌋ times, and a HashMap allows us to count element occurrences efficiently.

Algorithm
We can use a HashMap (std::unordered_map in the C++ code below) that maps elements to counts in order to count occurrences in linear time by looping over nums. Then, we simply return the element whose count exceeds ⌊n/2⌋.

#include <iostream>
#include <vector>
#include <unordered_map>

int majorityElement(std::vector<int>& nums)
{
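    // Count the occurrences of each value. operator[] value-initializes a
    // missing key to 0, so no explicit existence check is needed.
    std::unordered_map<int, int> counts;
    for (auto num : nums)
    {
        ++counts[num];
    }
    
    // Return the first element whose count exceeds half the array size.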
    int size = (int)nums.size();
    int halfCount = size / 2;
    
    for (auto elem : nums)
    {
        if (counts[elem] > halfCount)
        {
            return elem;
        }
    }
    
    return -1;
}

int main()
{
    int arr[] = { 1, 2, 3, 2, 4, 2, 2, 2, 2, 5, 7};
    std::vector<int> nums(arr, arr + sizeof(arr) / sizeof(arr[0]));
    int result = majorityElement(nums);
    
    std::cout << result << std::endl;
    
    return 0;
}

Complexity Analysis

  • Time complexity : O(n)
    We iterate over nums twice (once to count, once to find the element whose count exceeds ⌊n/2⌋) and perform an expected constant-time HashMap operation on each iteration. Therefore, the algorithm runs in O(n) time.
  • Space complexity : O(n)
    At most, the HashMap can contain n − ⌊n/2⌋ associations, so it occupies O(n) space. This is because an arbitrary array of length n can contain n distinct values, but nums is guaranteed to contain a majority element, which will occupy (at minimum) ⌊n/2⌋ + 1 array indices. Therefore, n − (⌊n/2⌋ + 1) indices can be occupied by distinct, non-majority elements (plus 1 for the majority element itself), leaving us with (at most) n − ⌊n/2⌋ distinct elements.
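
The counting pass described above can also stop as soon as some count exceeds ⌊n/2⌋, which avoids the second loop. The following is only a minimal sketch of that variant; the function name majorityElementEarlyExit is hypothetical and not part of the original solution.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical single-pass variant: count occurrences and return as soon as
// one value's count exceeds floor(n / 2).
int majorityElementEarlyExit(const std::vector<int>& nums)
{
    assert(!nums.empty());  // the problem statement guarantees a non-empty array

    const int halfCount = static_cast<int>(nums.size()) / 2;
    std::unordered_map<int, int> counts;

    for (int num : nums)
    {
        // operator[] value-initializes a missing key to 0 before the increment.
        if (++counts[num] > halfCount)
        {
            return num;
        }
    }

    // Only reached if no majority element exists. Like the code above, this
    // uses -1 as a sentinel, which is ambiguous if -1 is a valid element.
    return -1;
}
```

The asymptotic bounds are unchanged (O(n) time, O(n) space); the early exit only saves work when the majority element is reached before the end of the array.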

Approach #3 Sorting

Intuition
If the elements are sorted in monotonically increasing (or decreasing) order, the majority element can be found at (0-based) index ⌊n/2⌋ (and, incidentally, also at index ⌊n/2⌋ − 1 if n is even).

Algorithm
For this algorithm, we simply do exactly what is described: sort nums, and return the element in question. To see why this will always return the majority element (given that the array has one), consider the two most extreme placements of the majority element in the sorted array, for both an odd-length and an even-length array.

If the majority element happens to be the array minimum, it covers a range of indices that starts at 0 and extends at least to index ⌊n/2⌋. If it happens to be the array maximum, it covers a range that ends at index n − 1 and starts at or before index ⌊n/2⌋. In all other cases, the covered range lies somewhere between these two, but notice that even in these two most extreme cases, the ranges overlap at index ⌊n/2⌋ for both even- and odd-length arrays. Therefore, no matter what value the majority element has in relation to the rest of the array, returning the value at index ⌊n/2⌋ will never be wrong.

#include <iostream>
#include <vector>
#include <algorithm>

int majorityElement(std::vector<int>& nums)
{
    std::sort(nums.begin(), nums.end());
    return nums[nums.size() / 2];
}

int main()
{
    int arr[] = { 1, 2, 3, 2, 4, 2, 2, 2, 2, 5, 7};
    std::vector<int> nums(arr, arr + sizeof(arr) / sizeof(arr[0]));
    int result = majorityElement(nums);
    
    std::cout << result << std::endl;
    
    return 0;
}

Complexity Analysis

  • Time complexity : O(nlgn)
    Sorting the array with std::sort costs O(nlgn) time, so it dominates the overall runtime.
  • Space complexity : O(1) or O(n)
    We sorted nums in place here - if that is not allowed, then we must spend linear additional space on a copy of nums and sort the copy instead.
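
Since only the element at index ⌊n/2⌋ is needed, a full sort is more work than strictly required. As a hedged alternative (a swapped-in technique, not the approach described above), the sketch below uses std::nth_element, which places the correct element at one chosen position in linear time on average. It takes the vector by value, so it also illustrates the "sort a copy" case with O(n) extra space. The name majorityElementNth is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical variant: partial selection on a copy instead of a full sort.
// Taking nums by value leaves the caller's array untouched (O(n) extra space).
int majorityElementNth(std::vector<int> nums)
{
    assert(!nums.empty());  // the problem statement guarantees a non-empty array

    typedef std::vector<int>::difference_type Offset;
    std::vector<int>::iterator mid =
        nums.begin() + static_cast<Offset>(nums.size() / 2);

    // After this call, *mid holds the value that would sit at index n/2
    // if the whole array were sorted.
    std::nth_element(nums.begin(), mid, nums.end());
    return *mid;
}
```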

Approach #4 Randomization

Intuition
Because more than ⌊n/2⌋ array indices are occupied by the majority element, a random array index is likely to contain the majority element.

Algorithm
Because a given index is likely to have the majority element, we can just select a random index, check whether its value is the majority element, return if it is, and repeat if it is not. The algorithm is verifiably correct because we ensure that the randomly chosen value is the majority element before ever returning.
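
The original gives no code for this approach, so the following is only a minimal sketch of the pick-and-verify loop described above. The function name majorityElementRandom and the choice of std::mt19937 seeded from std::random_device are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch: pick a random index, verify the value, repeat.
// Loops forever if nums has no majority element, so it relies on the
// guarantee from the problem statement.
int majorityElementRandom(const std::vector<int>& nums)
{
    assert(!nums.empty());

    std::mt19937 gen(std::random_device{}());
    std::uniform_int_distribution<std::size_t> pick(0, nums.size() - 1);
    const std::size_t halfCount = nums.size() / 2;

    while (true)
    {
        const int candidate = nums[pick(gen)];

        // Linear verification pass: count how often the candidate occurs.
        std::size_t count = 0;
        for (int num : nums)
        {
            if (num == candidate)
            {
                ++count;
            }
        }

        if (count > halfCount)
        {
            return candidate;
        }
    }
}
```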

Complexity Analysis

  • Time complexity : O(∞)
    It is technically possible for this algorithm to run indefinitely (if we never manage to randomly select the majority element), so the worst possible runtime is unbounded. However, the expected runtime is far better - linear, in fact. For ease of analysis, convince yourself that because the majority element is guaranteed to occupy more than half of the array, the expected number of iterations will be less than it would be if the element we sought occupied exactly half of the array. Therefore, we can calculate the expected number of iterations for this modified version of the problem and assert that our version is easier.

The expected number of iterations for the modified problem is the sum over all i ≥ 1 of i · (1/2)^i, which equals 2. Because the series converges, the expected number of iterations for the modified problem is constant. Based on an expected-constant number of iterations in which we perform linear work, the expected runtime is linear for the modified problem. Therefore, the expected runtime for our problem is also linear, as the runtime of the modified problem serves as an upper bound for it.

  • Space complexity : O(1)
    Much like the brute force solution, the randomized approach runs with constant additional space.
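
The expected-iteration bound used in the time complexity argument above can be written out explicitly. In this sketch, p is a symbol introduced here (not in the original) for the probability that a single random pick lands on the majority element, and the picks are assumed to be independent and uniform.

```latex
\[
\mathbb{E}[\text{iterations}]
  \;=\; \sum_{i=1}^{\infty} i\,(1-p)^{i-1}\,p
  \;=\; \frac{1}{p}
  \;<\; 2
  \qquad \text{for } p > \tfrac{1}{2},
\]
\[
\text{and in the modified problem } \left(p = \tfrac{1}{2}\right):\quad
\sum_{i=1}^{\infty} \frac{i}{2^{i}} \;=\; 2 .
\]
```

Each iteration costs one linear scan, so the expected total work is at most about 2n comparisons under these assumptions.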

Approach #5 Divide and Conquer

Intuition
If we know the majority element in the left and right halves of an array, we can determine which is the global majority element in linear time.

Algorithm
Here, we apply a classical divide & conquer approach that recurses on the left and right halves of an array until an answer can be trivially achieved for a length-1 array. Note that because actually passing copies of subarrays costs time and space, we instead pass lo and hi indices that describe the relevant slice of the overall array (lo is inclusive and hi is exclusive). In this case, the majority element for a length-1 slice is trivially its only element, so the recursion stops there. If the current slice is longer than length 1, we must combine the answers for the slice's left and right halves. If they agree on the majority element, then the majority element for the overall slice is obviously the same. If they disagree, only one of them can be "right", so we need to count the occurrences of the left and right majority elements to determine which subslice's answer is globally correct. The overall answer for the array is thus the majority element between indices 0 and n.

#include <iostream>
#include <vector>
#include <algorithm>

int countInRange(std::vector<int>& nums, int num, int lo, int hi)
{
    int count = 0;
    for (int i = lo; i < hi; ++i)
    {
        if (nums[i] == num)
        {
            ++count;
        }
    }
    
    return count;
}

int majorityElementRec(std::vector<int>& nums, int lo, int hi)
{
    if (lo == hi - 1)
    {
        return nums[lo];
    }
    
    int mid = lo + (hi - lo) / 2;
    int left = majorityElementRec(nums, lo, mid);
    int right = majorityElementRec(nums, mid, hi);
    
    if (left == right)
    {
        return left;
    }
    
    int leftCount = countInRange(nums, left, lo, hi);
    int rightCount = countInRange(nums, right, lo, hi);
    
    return leftCount > rightCount ? left : right;
}

int majorityElement(std::vector<int>& nums)
{
    return majorityElementRec(nums, 0, (int)nums.size());
}

int main()
{
    int arr[] = { 1, 2, 3, 2, 4, 2, 2, 2, 2, 5, 7};
    std::vector<int> nums(arr, arr + sizeof(arr) / sizeof(arr[0]));
    int result = majorityElement(nums);
    
    std::cout << result << std::endl;
    
    return 0;
}

Complexity Analysis

  • Time complexity : O(nlgn)
    Each recursive call to majorityElementRec performs two recursive calls on subslices of size n/2 and two linear scans of length n. Therefore, the time complexity of the divide & conquer approach can be represented by the following recurrence relation:
    T(n) = 2T(n/2) + 2n

    By the master theorem with a = 2, b = 2 and f(n) = 2n = Θ(n^(log_2 2)), the recurrence satisfies case 2, so T(n) = Θ(n^(log_2 2) · lg n) = Θ(n lg n).

  • Space complexity : O(lgn)
    Although the divide & conquer approach does not explicitly allocate any additional memory, it uses a non-constant amount of additional memory in stack frames due to recursion. Because the algorithm "cuts" the array in half at each level of recursion, it follows that there can only be O(lgn) "cuts" before the base case of 1 is reached. It follows from this fact that the resulting recursion tree is balanced, and therefore all paths from the root to a leaf are of length O(lgn).

Because the recursion tree is traversed in a depth-first manner, the space complexity is therefore equivalent to the length of the longest path, which is, of course, O(lgn).
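
As a cross-check on the time bound, the recurrence T(n) = 2T(n/2) + 2n can also be unrolled directly. This sketch assumes n is a power of two and introduces k (a symbol not in the original) for the recursion depth.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + 2n \\
     &= 4\,T(n/4) + 4n \\
     &= 2^{k}\,T(n/2^{k}) + 2kn \\
     &= n\,T(1) + 2n\log_2 n \qquad (k = \log_2 n) \\
     &= \Theta(n \lg n)
\end{align*}
```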

Approach #6 Boyer-Moore Voting Algorithm

Intuition
If we had some way of counting instances of the majority element as +1 and instances of any other element as -1, summing them would make it obvious that the majority element is indeed the majority element.

Algorithm
Essentially, what Boyer-Moore does is look for a suffix suf of nums where suf[0] is the majority element in that suffix. To do this, we maintain a count, which is incremented whenever we see an instance of our current candidate for majority element and decremented whenever we see anything else.

Whenever count equals 0, we effectively forget about everything in nums up to the current index and consider the current number as the candidate for majority element. It is not immediately obvious why we can get away with forgetting prefixes of nums - consider the following examples (pipes are inserted to separate runs of nonzero count).
[7, 7, 5, 7, 5, 1 | 5, 7 | 5, 5, 7, 7 | 7, 7, 7, 7]

Here, the 7 at index 0 is selected to be the first candidate for majority element. count will eventually reach 0 after index 5 is processed, so the 5 at index 6 will be the next candidate. In this case, 7 is the true majority element, so by disregarding this prefix, we are ignoring an equal number of majority and minority elements - therefore, 7 will still be the majority element in the suffix formed by throwing away the first prefix.
[7, 7, 5, 7, 5, 1 | 5, 7 | 5, 5, 7, 7 | 5, 5, 5, 5]

Now, the majority element is 5 (we changed the last run of the array from 7s to 5s), but our first candidate is still 7. In this case, our candidate is not the true majority element, but we still cannot discard more majority elements than minority elements (this would imply that count could reach -1 before we reassign candidate, which is obviously false).

Therefore, given that it is impossible (in both cases) to discard more majority elements than minority elements, we are safe in discarding the prefix and attempting to recursively solve the majority element problem for the suffix. Eventually, a suffix will be found for which count does not hit 0, and the majority element of that suffix will necessarily be the same as the majority element of the overall array.
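
To make the "runs" in the examples above concrete, the sketch below records every index at which count is 0 and a new candidate is chosen. The helper name candidateResetIndices is hypothetical and exists only for illustration; the assert encodes the invariant used in the argument above, namely that count never drops below 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tracing helper: returns the indices where Boyer-Moore
// forgets the prefix and adopts nums[i] as the new candidate.
std::vector<std::size_t> candidateResetIndices(const std::vector<int>& nums)
{
    std::vector<std::size_t> resets;
    int count = 0;
    int candidate = 0;

    for (std::size_t i = 0; i < nums.size(); ++i)
    {
        if (count == 0)
        {
            candidate = nums[i];
            resets.push_back(i);
        }

        count += (nums[i] == candidate) ? 1 : -1;

        // count is at least 1 right after a reset and changes by 1 per step,
        // so it can reach 0 but never -1.
        assert(count >= 0);
    }

    return resets;
}
```

For both example arrays above, the recorded indices are 0, 6, 8 and 12, which are exactly the positions that follow the pipes.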

#include <iostream>
#include <vector>
#include <algorithm>

int majorityElement(std::vector<int>& nums)
{
    int count = 0;
    int candidate = 0;
    
    for (auto num : nums)
    {
        if (0 == count)
        {
            candidate = num;
        }
        
        count += (candidate == num) ? 1 : -1;
    }
    
    return candidate;
}

int main()
{
    int arr[] = { 1, 2, 3, 2, 4, 2, 2, 2, 2, 5, 7};
    std::vector<int> nums(arr, arr + sizeof(arr) / sizeof(arr[0]));
    int result = majorityElement(nums);
    
    std::cout << result << std::endl;
    
    return 0;
}

Complexity Analysis

  • Time complexity : O(n)
    Boyer-Moore performs constant work exactly n times, so the algorithm runs in linear time.
  • Space complexity : O(1)
    Boyer-Moore allocates only constant additional memory.
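
The code above relies on the guarantee that a majority element exists. When that guarantee is absent, the surviving candidate is not necessarily a majority element, so a second counting pass is needed to confirm it. The following is a minimal sketch under that assumption; the name tryMajorityElement and its bool-plus-output-parameter signature are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical variant: Boyer-Moore voting followed by a verification pass.
// Returns true and writes the majority element to out if one exists;
// returns false and leaves out untouched otherwise.
bool tryMajorityElement(const std::vector<int>& nums, int& out)
{
    int count = 0;
    int candidate = 0;

    // Pass 1: voting, exactly as in the algorithm above.
    for (int num : nums)
    {
        if (count == 0)
        {
            candidate = num;
        }
        count += (num == candidate) ? 1 : -1;
    }
    assert(count >= 0);  // the vote count never goes negative

    // Pass 2: confirm that the candidate really appears more than n/2 times.
    int occurrences = 0;
    for (int num : nums)
    {
        if (num == candidate)
        {
            ++occurrences;
        }
    }

    if (occurrences > static_cast<int>(nums.size()) / 2)
    {
        out = candidate;
        return true;
    }
    return false;
}
```

This keeps the O(n) time and O(1) space bounds, at the cost of a second scan.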

Majority Element II

Given an integer array of size n, find all elements that appear more than ⌊n/3⌋ times.
Note: The algorithm should run in linear time and in O(1) space.
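
A short counting argument explains why two candidates are enough for this variant. In the sketch below, k is a symbol introduced here (not in the original) for the number of distinct elements that each appear more than ⌊n/3⌋ times, that is, at least ⌊n/3⌋ + 1 times each.

```latex
\[
k\left(\left\lfloor \tfrac{n}{3} \right\rfloor + 1\right) \le n
\quad\text{and}\quad
\left\lfloor \tfrac{n}{3} \right\rfloor + 1 > \tfrac{n}{3}
\;\Longrightarrow\;
k \cdot \tfrac{n}{3} < n
\;\Longrightarrow\;
k < 3 .
\]
```

So at most two such elements can exist, which is why it suffices to track two candidates with two counters and then verify each one in a second pass.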

Approach #1 Boyer-Moore Voting Algorithm

#include <iostream>
#include <vector>
#include <algorithm>

std::vector<int> majorityElement(std::vector<int>& nums)
{
    std::vector<int> result;
    
    int candidate1 = 0;
    int candidate2 = 0;
    int count1 = 0;
    int count2 = 0;
    
    for (auto num : nums)
    {
        if (num == candidate1)
        {
            ++count1;
        }
        else if (num == candidate2)
        {
            ++count2;
        }
        else if (0 == count1)
        {
            candidate1 = num;
            count1 = 1;
        }
        else if (0 == count2)
        {
            candidate2 = num;
            count2 = 1;
        }
        else
        {
            --count1;
            --count2;
        }
    }
    
    count1 = 0;
    count2 = 0;
    
    for (auto elem : nums)
    {
        if (elem == candidate1)
        {
            ++count1;
        }
        else if (elem == candidate2)
        {
            ++count2;
        }
    }
    
    if (count1 > (int)nums.size() / 3)
    {
        result.push_back(candidate1);
    }
    
    if (count2 > (int)nums.size() / 3)
    {
        result.push_back(candidate2);
    }
    
    return result;
}

int main()
{
    int arr[] = { 2, 2, 3, 2, 4, 2, 3, 2, 3, 5, 3};
    std::vector<int> nums(arr, arr + sizeof(arr) / sizeof(arr[0]));
    std::vector<int> result = majorityElement(nums);
    
    for (auto ret : result)
    {
        std::cout << ret << std::endl;
    }
    
    return 0;
}

References:
https://leetcode.com/problems/majority-element/description/
https://leetcode.com/problems/majority-element-ii/description/
https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_majority_vote_algorithm
https://gregable.com/2013/10/majority-vote-algorithm-find-majority.html
https://blog.csdn.net/novostary/article/details/47680171
https://blog.csdn.net/wmdshhz0404/article/details/52602395
https://www.cnblogs.com/grandyang/p/4606822.html
https://www.cnblogs.com/grandyang/p/4233501.html
https://www.zhihu.com/question/49973163/answer/235921864
