Sunday, March 20, 2011

Project Euler Problem 1

I recently attempted many of the simpler Project Euler problems, and decided it would be good to write down my thinking process.  If you don't know, Project Euler is a collection of programming problems with a heavy emphasis on mathematics.  They highly discourage sharing the answers since they want you to solve the problems by yourself, but since answers are already scattered all over the Internet it's up to you to challenge yourself and avoid spoilers.  Consider yourself warned - this post contains solutions and discussions thereof.

Problem statement:  Add all the natural numbers below one thousand that are multiples of 3 or 5.

As a reminder, natural numbers are the set of positive integers.  This problem, despite being fairly simple, actually has a few different approaches, and the simplicity is actually nice because it makes the explanations simple too.

The brute-force approach is to simply loop through the numbers from 1 to 999, and add them to some sum if it's a multiple of 3 or 5:

    public static long sumMultiples(long limit) {
        long sum = 0;
        int[] factors = {  3, 5 };

        for (int i = 1; i < limit; i++) {
            for (int m : factors) {
                if (i % m == 0) {
                    sum += i;
                    break;
                }
            }
        }

        return sum;
    }

This is an O(n * m) algorithm, where n is the limit and m is the number of factors to consider.  For larger values of the input n, this approach would take longer amounts of time because it has to evaluate every single number from 1 to the limit, for each factor.

A slight improvement is to realize that the factors occur in fixed intervals between the values 1 and (limit - 1).  So instead of checking every number, we just hop over the ones that aren't multiples of the factor:


    public static long sumMultiples2(long limit) {
        long sum = 0;
        int[] factors = {  3, 5 };

        for (int factor : factors) {
            for (int i = 1; i * factor < limit; i++) {
                sum += i * factor;
            }
        }

        for (int i = 1; i * 15 < limit; i++) {
            sum -= i * 15;
        }

        return sum;
    }

We see the first for-loop having a nested for-loop, with the terminating condition (i * factor < limit) - this could be improved also by simply doing (i < limit) with the increment set to the factor (in other words, do i += factor instead of i++).  Then there is another for-loop which subtracts the multiples of 15, since these multiples would have been added twice (once when going through factors of 3, and again when handling factors of 5).

A greatly improved approach is to find a formula for the sequence of numbers.  It turns out that the sum of multiples of a number, up to a limit, can be calculated using this formula:  sum = n * p * (p + 1) / 2, where p = (limit - 1) / n .  The derivation of this formula is left as an exercise for the reader.  Thus the solution for problem 1 can be given with this:

    public static long sumMultiples3(long limit) {
        return sumDivisibleBy(3, limit) +
               sumDivisibleBy(5, limit) -
               sumDivisibleBy(15, limit);
    }

    public static long sumDivisibleBy(int n, long lim) {
        long p = (lim - 1) / n;

        return n * (p * (p + 1)) / 2;
    }

Here we sum the sums of the sequences of those which have as factors 3 and 5, and then subtract the sum for 15 as those would have been counted twice.  This O(1) implementation thus can handle much larger limits and calculate them much more quickly.  A timing table for the 3 approaches, using increasingly larger limits:

     Limit |  elapsed time in seconds
   1000000 |   0.09 vs  0.00 vs 0.00
   2000000 |   0.09 vs  0.00 vs 0.00
   4000000 |   0.04 vs  0.01 vs 0.00
   8000000 |   0.09 vs  0.01 vs 0.00
  16000000 |   0.17 vs  0.03 vs 0.00
  32000000 |   0.35 vs  0.06 vs 0.00
  64000000 |   0.66 vs  0.12 vs 0.00
 128000000 |   1.30 vs  0.24 vs 0.00
 256000000 |   2.80 vs  0.59 vs 0.00
 512000000 |   5.33 vs  1.17 vs 0.00
1024000000 |  11.88 vs  2.05 vs 0.00

Note this calculation was done on a personal computer with hardware of vintage circa 2007.

No comments:

Post a Comment