Hello Codemasterjs,
There are a number of optimization methods that can be used to solve
this kind of problem. Their effectiveness can vary significantly. For
example, there are some "single pass" algorithms that will not provide
an optimal solution but will often get you "close" to a good answer
and are simple to implement. The algorithm used by the program you
refer to may use an iterative technique and I'll provide pointers to
those as well.
For example, let's assume you have three workers (A, B, C) and a set
of accounts with values as follows:
100 150 210 30 89 75 66 29 22 40
A very simple method can be described as follows:
1. sum the account value (TV) and divide by the number of workers (N)
- this is the "goal". For the values above, TV=811, goal=270.3
2 sort the accounts based on value (AV). For the values above, the
sorted values are 210 150 100 89 75 66 40 30 29 22.
3 starting with the account with the highest AV, assign one account
to each worker. So A has 210, B has 150, and C has 100.
4 assign accounts with the highest AV to each worker in order unless
the sum exceeds the goal. Repeat until you have an account which
cannot be assigned. For this example, it would go something like...
A - 210; B 150, 89; C 100 (assign 89 to B) [totals 210, 239, 100]
A - 210; B 150, 89; C 100, 75 (assign 75 to C) [totals 210, 239,
175]
A - 210; B 150, 89; C 100, 75, 66 (assign 66 to C) [totals 210, 239,
241]
A - 210, 40; B 150, 89; C 100, 75, 66 (assign 40 to A) [totals 250,
239, 241]
A - 210, 40; B 150, 89, 30; C 100, 75, 66 (assign 30 to B) [totals
250, 269, 241]
A - 210, 40; B 150, 89, 30; C 100, 75, 66, 29 (assign 29 to C)
[totals 250, 269, 270]
at this point, 22 can't be assigned to any of the workers without
exceeding the goal.
5 assign the remaining accounts to the worker with the smallest sum
(each step)
A - 210, 40, 22; B 150, 89, 30; C 100 75, 66, 29 (assign 22 to A)
[totals 272, 269, 270]
which appears to be the "best solution" (and the algorithm got lucky
with this set of data).
This kind of algorithm is often called "greedy" in that it tries to
make the best choice at each step of the algorithm. If you stop at
this point, you will generally get a solution may be optimum or may be
'good enough".
Further optimization is often done with some kind of stochastic
method. For example, choose one (or more) account at random to swap
between two workers. If the result is "closer to the goal" than the
current "best solution", then keep it. If not, stay with what you have
as the best. Repeat the swap with the current best some number of
iterations - for this much data, say 10 times. Again, the solution may
not be optimal, but will complete in a reasonable time and often
improves the result.
A similar problem to this one is called the "knapsack" problem. The
problem and possible solvers for this kind of problem has references
at
http://www.irisa.fr/cosi/Rajopadhye/knapsack.html
an article that describes the general problem and refers to a few
algorithms to solve it
http://citeseer.nj.nec.com/context/208189/0
a list of papers that refer to a genetic algorithm solution to the
multidimensional knapsack problem.
Plenty of more references can be found with a search using
knapsack problem algorithm
References to greedy algorithms (e.g., Dijkstra's algorithm)
http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/dijkstra.html
or minimal spanning trees
http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/mst.html
Plenty of more references can be found using a search using
distribute work evenly greedy algorithm
References to stochastic methods (e.g, simulated annealing)
http://courses.ece.uiuc.edu/ece482/Projects/part1.pdf
and search with a phrases such as
simulated annealing algorithm
synthetic annealing algorithm
A few other references:
http://faculty.smu.edu/barr/ip/01ip16newDir.PDF
New directions in optimization; has a number of suggested algorithms
to optimize solutions.
Good search terms in general include
distribute work evenly algorithm
--Maniac |