Tuesday, February 22, 2011

Project Euler–Problem 7


From Project Euler:
By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.
What is the 10001st prime number?


Finding primes is an interesting problem. There are a number of ways that we can go about doing so, and it is something we will need to do several times throughout the Project Euler problems. So we will want to develop a solution that is not only accurate, but also fairly efficient.
A fairly efficient method for finding all of the prime numbers below a given value is the Sieve of Eratosthenes.  The way it works is simple:
  1. Create  a list of consecutive integers from two to n: (2, 3, 4, …, n)
  2. Start with p = 2, the first prime number.
  3. Remove from the list all multiples of p less than or equal to n.
  4. Find the smallest number remaining on the list greater than p. This is the next prime number. Set p equal to this number.
  5. Repeat this process until p2 is greater than n.
  6. All remaining numbers on the list are prime.
This works extremely well if you know what your upper bound is going to be.  While we could take a guess and generate all the primes up to some arbitrarily large number, we would like a way of using this method to generate any arbitrary number of primes. This can still be done if we take into account the following facts:
  1. If we want an arbitrary limit, there is no n value, so we cannot stop striking composite numbers as in step 5 above.
  2. Since we have no upper bound, striking all of the multiples of p would be impossible (there are infinitely many). So we will only strike the next multiple and worry about the larger ones when we get to them.
  3. There is insufficient memory to store a list of all numbers (obviously), so we will ignore all primes that have already been yielded and all composites that have already been stricken. Thus we will only store a list of composites that will need to be stricken when we reach them.
This last observation is the real trick. Instead of keeping a list of prime candidates, we’ll keep  a list of known composites to strike. We’ll also need to know what primes that composite is a multiple of in order to know what the next multiples of each will be. For instance, 12 is a multiple of both 2 and 3. This means that it would be stricken by both in the original sieve algorithm. We would need to know this so that we can see that the next multiple of 2 will be 14 and the next multiple of 3 will be 15.
One final note before we start implementing our algorithm. When we reach a prime number p, we do not need to strike the first p-1 multiples of p. This is because those multiples will be a multiple of another prime as well. This can easily be seen with an example:
  • Given the prime number p = 11
  • 2p = 22 which is a multiple of 2
  • 3p = 33 which is a multiple of 3
  • 4p = 2*2p = 44 which is a multiple of 2
  • 5p = 55 which is a multiple of 5
  • 6p = 2*3p = 66 which is a multiple of both 2 and 3
  • 7p = 77 which is a multiple of 7
  • 8p = 2*2*2p = 88 which is a multiple of 2
  • 9p = 3*3p = 99which is a multiple of 3
  • 10p = 5*2p = 110 which is a multiple of both 2 and 5
  • 11p = 121 which is only a multiple of 11.
Therefore, when we find a prime p, the first composite we need to concern ourselves with is p2.
class PrimeGenerator
 include Enumerable

 def each
  composites = {}
  n = 2
  while true
   if composites.has_key?(n)
    factors = composites.delete(n)
    factors.each { |factor| composites.merge!(n + factor => [factor]) { |key,left,right| left + right } }
    composites.merge!(n**2 => [n]) { |key,left,right| left + right }
    yield n
   n += 1
With this class in place, finding the 10,001st prime is as simple as:
The full source and specifications can be seen on github. Next time, discovering the largest product of five consecutive digits in a 1000-digit number.

No comments: