Google Answers Logo
View Question
 
Q: algorithms ( No Answer,   11 Comments )
Question  
Subject: algorithms
Category: Computers > Algorithms
Asked by: semsem66-ga
List Price: $2.50
Posted: 20 Oct 2005 12:30 PDT
Expires: 19 Nov 2005 11:30 PST
Question ID: 582711
Let A be an array containing n very large positive integers not
necessarily distinct. A majority is a
number that appears at least dn/2e times in the array. Describe an
O(n)-time algorithm that finds a
majority in A if exists. Explain why your algorithm has time complexity O(n).

Clarification of Question by semsem66-ga on 20 Oct 2005 16:22 PDT
apears at least n/2 times. Not dn/2e.
Answer  
There is no answer at this time.

Comments  
Subject: Re: algorithms
From: philnj-ga on 21 Oct 2005 13:00 PDT
 
This is easy at O(n squared) but O(n) has me stumped.  Maybe someone
smarter will wander by.
Subject: Re: algorithms
From: mathtalk-ga on 21 Oct 2005 18:23 PDT
 
I haven't found a solution, yet!

Verifying that a value is or is not a majority in the array is O(n).

A sorted array would make it trivial to find using n or fewer steps,
as the value of any "majority" number must to appear in the middle of
the array, and it would extend continguously to the required length
n/2.  An easy refinement by binary search cuts down the verification
steps, assuming the sorted array, to O(log(n)).  But soring the array
is O(n*log(n)).

I'd think about some kind of recursive solution for n = 2^m, based on
the notion that if a value attains a majority, it must do so in at
least one of the two halves of the array.  But it's not obvious how
one might suitably "pad" an array which is not exactly a power of 2.

A more general problem would be to identify the "mode" (most
frequently occuring value) in an array.

A standard "adversary" argument, to the effect that in a worst case
each entry of the array must be inspected, tells us that time
complexity has O(n) lower bound.


regards, mathtalk-ga
Subject: Re: algorithms
From: philnj-ga on 24 Oct 2005 06:27 PDT
 
I agree with mathtalk (exactly the type of smart person I was
expecting).  I have a hunch that the answer (if it exists) lies in
recursion.  Is the problem stated correctly?
Subject: Re: algorithms
From: mathtalk-ga on 24 Oct 2005 10:51 PDT
 
Here's a fairly recent paper that addresses an "approximation" to the
more general problem of computing the mode, answering "approximately"
range mode queries in constant time through preprocessing steps that
take O(n log n) time:

[Approximate Range Mode and Range Median Queries]
http://www.scs.carleton.ca/~kranakis/Papers/stacs05.pdf

Assuming the state of the art is fairly represented by this paper, we
should not expect an algorithm for the exact mode to require less than
O(n log n) time.

Of course this leaves open a number of possible approaches to finding
an element that has a "majority" frequency, or that none exists, in
O(n) time, but it certainly helps to focus our attention on the
circumstances that permit a "fast" decision.


regards, mathtalk-ga
Subject: Re: algorithms
From: mathtalk-ga on 24 Oct 2005 16:32 PDT
 
The clever ideas needed here are outlined in this Wikipedia article:

[Selection algorithm -- Wikipedia]
http://en.wikipedia.org/wiki/Selection_algorithm

See the heading "Linear general selection algorithms" and the
explanation that follows for how to find the kth ascending element
among n items in O(n) time.

This amounts to saying we can determine the entry a[k] that would
exist in the array if it were sorted.  We do so without actually
sorting the array, though the method proposed is tantamount to a
partial quicksort (with a clever choice of pivots to preserve linear
time complexity).

Let's regard the indexing of (what would be) the sorted array as a[1],...,a[n].

If n is odd, check only one candidate a[(n+1)/2] to see if it is a
majority value, by counting all its occurrences in the original array.

If n is even, then one has to check two candidates, a[n/2] and
a[(n+2)/2], as both, only one, or neither of these could turn out to
be a majority value.


regards, mathtalk-ga
Subject: Re: algorithms
From: mathtalk-ga on 24 Oct 2005 16:37 PDT
 
Here's another write up of the (worst case) linear-time selection algorithm:

[Deterministic selection]
http://www.ics.uci.edu/~eppstein/161/960130.html
Subject: Re: algorithms
From: quadraticresidue-ga on 27 Oct 2005 23:41 PDT
 
Use deterministic selection to find the median.  Count the number of
elements that are equal to the median.  Note that if a majority of the
elements are identical, the median element must be one of these
elements.
Subject: Re: algorithms
From: mathtalk-ga on 28 Oct 2005 07:06 PDT
 
Hi, quadraticresidue-ga:

If there are an odd number of elements, then yes, the only candidate
is the median.  But if there are an even number, then "median" is not
quite what needs to be checked.  [Often the median is defined as
arithmetic mean of two middle entries, if the sample size is even.]

Consider a sample of only two distinct (large positive) values, with
even sample size.  According to OP's specification, if both are
equally represented, then both are "majority" values.

By perturbing the upper or lower range of values, one can arrange that
either (or both) to fail to be majority values (without changing the
median).  So in the even sample size it is necessary in general to
check both candidates, but of course the time complexity remains
linear in the sample size.


regards, mathtalk-ga
Subject: Re: algorithms
From: geezel-ga on 05 Nov 2005 13:40 PST
 
The idea of finding a median is cute, but assuming a majority is a
number that appears _more_ than n/2 times (as the term would suggest),
there's a simpler way.
The basic idea is that counting the number of times our number will
appear, and subtracting the number of times all other numbers appear,
will obviously leave us with a positive number (as it appears more
than half of the time).

Assume the numbers appear in an array nums[0...n-1]:

candidate = nums[0]
count = 1

for i = 1 to n-1 {
  if nums[i] == candidate {
    count = count+1
  } else {
    count = count-1
  } 

  if count == 0 {
    candidate = nums[i]
    count = 1
  }
}

If a "majority" exists, it will appear as the candidate at the end.
The wording of the question seems to make this OK.

This algorithm runs in O(n). If the size of the numbers in the array
was limited to a number up to a constant in n, you could use something
similar to Bin Sort, but this option was intentionally blocked.

As I'm guessing this is homework, your teacher may want you to prove
the correctness of this algorithm. You may consider a loop invarient
that states that the current candidate will only be the candidate at
the end (assuming the existance of a majority) if it is the major one.

Good luck, and consider consulting people you study with!
Subject: Re: algorithms
From: mathtalk-ga on 08 Nov 2005 05:23 PST
 
geezel-ga's approach is a vast simplification over mine!  It can also
be modified to handle cases where n is even and some value appears in
exactly half of the entries (the weaker notion of "majority" as
defined by the original Question).

One way is to keep track of a "previous" candidate in addition to the
"current" candidate.  Using the fact that the values are all positive
integers, we can modify geezel-ga's code by starting this way:

candidate0 = 0;
candidate = nums[0];
count = 1;

Later, when candidate is being updated (upon count dropping to zero):

if count == 0 {
  candidate0 = candidate;
  candidate = nums[i];
  count = 1l
}

If n is even, one must perform the O(n) checking for both candidate
and candidate0 values, to determine whether either has the required
n/2 entries.  To illustrate, suppose we have four entries:

100 200 100 300

which winds up with a "true" majority candidate0 = 100 and "false" candidate = 300.


regards, mathtalk-ga
Subject: Re: algorithms
From: meranaamxpert-ga on 05 Jan 2006 00:10 PST
 
I have a cute solution here...
say a[0..n-1] is the given array

Take a pivot element (say P) and do the first step of quick sort i.e.,
place this pivot element P in the correct position (say at k) of the
array such that all elements in a[0]..a[k-1] are less than P and
a[k+1] to a[n-1] are greater than P..
(Note: the array is not sorted!!)

This step takes O(n) time only..

if we choose the pivot carefully such that "k" is somewhere in the
middle of the  new array, we can easily find the repeating number
which will be either in the first or second half..

plz let me know if u have more doubts..

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy