View Question
Q: algorithms ( No Answer,   11 Comments )
 Question
 Subject: algorithms Category: Computers > Algorithms Asked by: semsem66-ga List Price: \$2.50 Posted: 20 Oct 2005 12:30 PDT Expires: 19 Nov 2005 11:30 PST Question ID: 582711
 ```Let A be an array containing n very large positive integers not necessarily distinct. A majority is a number that appears at least dn/2e times in the array. Describe an O(n)-time algorithm that finds a majority in A if exists. Explain why your algorithm has time complexity O(n).``` Clarification of Question by semsem66-ga on 20 Oct 2005 16:22 PDT `apears at least n/2 times. Not dn/2e.`
 There is no answer at this time.

 Subject: Re: algorithms From: philnj-ga on 21 Oct 2005 13:00 PDT
 ```This is easy at O(n squared) but O(n) has me stumped. Maybe someone smarter will wander by.```
 Subject: Re: algorithms From: mathtalk-ga on 21 Oct 2005 18:23 PDT
 ```I haven't found a solution, yet! Verifying that a value is or is not a majority in the array is O(n). A sorted array would make it trivial to find using n or fewer steps, as the value of any "majority" number must to appear in the middle of the array, and it would extend continguously to the required length n/2. An easy refinement by binary search cuts down the verification steps, assuming the sorted array, to O(log(n)). But soring the array is O(n*log(n)). I'd think about some kind of recursive solution for n = 2^m, based on the notion that if a value attains a majority, it must do so in at least one of the two halves of the array. But it's not obvious how one might suitably "pad" an array which is not exactly a power of 2. A more general problem would be to identify the "mode" (most frequently occuring value) in an array. A standard "adversary" argument, to the effect that in a worst case each entry of the array must be inspected, tells us that time complexity has O(n) lower bound. regards, mathtalk-ga```
 Subject: Re: algorithms From: philnj-ga on 24 Oct 2005 06:27 PDT
 ```I agree with mathtalk (exactly the type of smart person I was expecting). I have a hunch that the answer (if it exists) lies in recursion. Is the problem stated correctly?```
 Subject: Re: algorithms From: mathtalk-ga on 24 Oct 2005 10:51 PDT
 ```Here's a fairly recent paper that addresses an "approximation" to the more general problem of computing the mode, answering "approximately" range mode queries in constant time through preprocessing steps that take O(n log n) time: [Approximate Range Mode and Range Median Queries] http://www.scs.carleton.ca/~kranakis/Papers/stacs05.pdf Assuming the state of the art is fairly represented by this paper, we should not expect an algorithm for the exact mode to require less than O(n log n) time. Of course this leaves open a number of possible approaches to finding an element that has a "majority" frequency, or that none exists, in O(n) time, but it certainly helps to focus our attention on the circumstances that permit a "fast" decision. regards, mathtalk-ga```
 Subject: Re: algorithms From: mathtalk-ga on 24 Oct 2005 16:32 PDT
 ```The clever ideas needed here are outlined in this Wikipedia article: [Selection algorithm -- Wikipedia] http://en.wikipedia.org/wiki/Selection_algorithm See the heading "Linear general selection algorithms" and the explanation that follows for how to find the kth ascending element among n items in O(n) time. This amounts to saying we can determine the entry a[k] that would exist in the array if it were sorted. We do so without actually sorting the array, though the method proposed is tantamount to a partial quicksort (with a clever choice of pivots to preserve linear time complexity). Let's regard the indexing of (what would be) the sorted array as a[1],...,a[n]. If n is odd, check only one candidate a[(n+1)/2] to see if it is a majority value, by counting all its occurrences in the original array. If n is even, then one has to check two candidates, a[n/2] and a[(n+2)/2], as both, only one, or neither of these could turn out to be a majority value. regards, mathtalk-ga```
 Subject: Re: algorithms From: mathtalk-ga on 24 Oct 2005 16:37 PDT
 ```Here's another write up of the (worst case) linear-time selection algorithm: [Deterministic selection] http://www.ics.uci.edu/~eppstein/161/960130.html```
 Subject: Re: algorithms From: quadraticresidue-ga on 27 Oct 2005 23:41 PDT
 ```Use deterministic selection to find the median. Count the number of elements that are equal to the median. Note that if a majority of the elements are identical, the median element must be one of these elements.```
 Subject: Re: algorithms From: mathtalk-ga on 28 Oct 2005 07:06 PDT
 ```Hi, quadraticresidue-ga: If there are an odd number of elements, then yes, the only candidate is the median. But if there are an even number, then "median" is not quite what needs to be checked. [Often the median is defined as arithmetic mean of two middle entries, if the sample size is even.] Consider a sample of only two distinct (large positive) values, with even sample size. According to OP's specification, if both are equally represented, then both are "majority" values. By perturbing the upper or lower range of values, one can arrange that either (or both) to fail to be majority values (without changing the median). So in the even sample size it is necessary in general to check both candidates, but of course the time complexity remains linear in the sample size. regards, mathtalk-ga```
 Subject: Re: algorithms From: geezel-ga on 05 Nov 2005 13:40 PST
 ```The idea of finding a median is cute, but assuming a majority is a number that appears _more_ than n/2 times (as the term would suggest), there's a simpler way. The basic idea is that counting the number of times our number will appear, and subtracting the number of times all other numbers appear, will obviously leave us with a positive number (as it appears more than half of the time). Assume the numbers appear in an array nums[0...n-1]: candidate = nums[0] count = 1 for i = 1 to n-1 { if nums[i] == candidate { count = count+1 } else { count = count-1 } if count == 0 { candidate = nums[i] count = 1 } } If a "majority" exists, it will appear as the candidate at the end. The wording of the question seems to make this OK. This algorithm runs in O(n). If the size of the numbers in the array was limited to a number up to a constant in n, you could use something similar to Bin Sort, but this option was intentionally blocked. As I'm guessing this is homework, your teacher may want you to prove the correctness of this algorithm. You may consider a loop invarient that states that the current candidate will only be the candidate at the end (assuming the existance of a majority) if it is the major one. Good luck, and consider consulting people you study with!```
 Subject: Re: algorithms From: mathtalk-ga on 08 Nov 2005 05:23 PST
 ```geezel-ga's approach is a vast simplification over mine! It can also be modified to handle cases where n is even and some value appears in exactly half of the entries (the weaker notion of "majority" as defined by the original Question). One way is to keep track of a "previous" candidate in addition to the "current" candidate. Using the fact that the values are all positive integers, we can modify geezel-ga's code by starting this way: candidate0 = 0; candidate = nums[0]; count = 1; Later, when candidate is being updated (upon count dropping to zero): if count == 0 { candidate0 = candidate; candidate = nums[i]; count = 1l } If n is even, one must perform the O(n) checking for both candidate and candidate0 values, to determine whether either has the required n/2 entries. To illustrate, suppose we have four entries: 100 200 100 300 which winds up with a "true" majority candidate0 = 100 and "false" candidate = 300. regards, mathtalk-ga```
 Subject: Re: algorithms From: meranaamxpert-ga on 05 Jan 2006 00:10 PST
 ```I have a cute solution here... say a[0..n-1] is the given array Take a pivot element (say P) and do the first step of quick sort i.e., place this pivot element P in the correct position (say at k) of the array such that all elements in a[0]..a[k-1] are less than P and a[k+1] to a[n-1] are greater than P.. (Note: the array is not sorted!!) This step takes O(n) time only.. if we choose the pivot carefully such that "k" is somewhere in the middle of the new array, we can easily find the repeating number which will be either in the first or second half.. plz let me know if u have more doubts..```