Hello JB,
Interesting question! Just a few comments:
1. The distribution you assume is crucial to the answer you get.
E.g., assuming a Gaussian will lead to a very different answer than
assuming a Laplace distribution. Once that
is assumed (with reasonable justification), then you can estimate the
parameters of the distribution (in case of Gaussian it is the mean and
variance) using some actual data collected. There are many ways to do
that, and the simplest and most common is Maximum Likelihood Estimation.
Having only the mean # of visits per hour will not work, as you can
imagine: in order to estimate the peak value you need something which
can tell you the spread of the distribution, not just the center.
2. The max # of visits for an avg. day is actually bounded above by
the total # of visits on an average day. This is useful since
even if you use a distribution with unbounded support (like Gaussian)
you may still want to consider a truncated Gaussian which removes the
tails (mainly the right tail). This typically gives you a more
reliable estimate.
3. You need to get some data (# of visits for each hour) in order to get
a sensible estimate. Actually a lot of web services can do that, by
simply adding a block of JavaScript to the webpage. For example,
http://www.webstats4u.com
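
To make points 1 and 2 concrete, here is a rough Python sketch (standard
library only). The hourly counts are made-up numbers for illustration, not
real data. Reading the "peak" as the 99th percentile is my own choice, and
so is cutting the left tail at zero; the helper names are hypothetical.

```python
from statistics import NormalDist, fmean

# Hypothetical visits per hour over one day (24 made-up values).
hourly_visits = [12, 8, 5, 4, 3, 5, 10, 22, 35, 41, 44, 47,
                 50, 46, 43, 40, 38, 36, 33, 30, 27, 22, 18, 14]


def fit_gaussian_mle(counts):
    """Maximum-likelihood Gaussian fit: sample mean and 1/n variance."""
    mu = fmean(counts)
    var = fmean([(x - mu) ** 2 for x in counts])
    return NormalDist(mu, var ** 0.5)


def truncated_quantile(dist, p, lower, upper):
    """Quantile p of dist after restricting it to [lower, upper]."""
    f_lo, f_hi = dist.cdf(lower), dist.cdf(upper)
    # Rescale p into the probability mass that survives the truncation.
    return dist.inv_cdf(f_lo + p * (f_hi - f_lo))


dist = fit_gaussian_mle(hourly_visits)
daily_total = sum(hourly_visits)  # upper bound from point 2

# "Peak" taken as the 99th percentile (an assumption, pick your own level).
peak_plain = dist.inv_cdf(0.99)
peak_truncated = truncated_quantile(dist, 0.99, 0, daily_total)

print(f"mean={dist.mean:.2f}, stdev={dist.stdev:.2f}")
print(f"99% level, plain Gaussian:     {peak_plain:.1f}")
print(f"99% level, truncated Gaussian: {peak_truncated:.1f}")
```

The same two steps (fit the parameters, then read off a high quantile)
apply to a Laplace or any other family; only the fitting formulas change.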
Good luck!