March 24, 2011

Finding the Nth prime number

It’s a common requirement in programming challenges (such as Project Euler) to determine whether a number is a prime number. Recently I was tasked with writing a nice method of finding the n^th prime number.

(Originally written in March 2010)

The obvious, brute-force solution here is to iterate through the whole number system, checking to see if each number is prime, counting how many primes we have found and repeating until we find the n^th. We can immediately improve this process by only considering odd numbers once we pass the number 2. Two is the only even prime: any even number is a multiple of 2 and therefore has factors besides 1 and itself.

By definition, a prime number only has factors 1 and itself and we are able to see if a number, f, is a factor of another, p, by checking f divides into p without remainder. If we then iterate from 2 to p-1, checking to see if each number is a factor of p, we can conclude p is prime when no factors are found.

This is quite a drawn-out process. The best optimisation we can include is found when we consider that each factor has a reciprocal pair. For example, 6n is divided by the pairs (2, 3n) and (3, 2n). The numbers in each pair converge at the square-root of p. Therefore we can further reduce the number of checks we need to make by only considering possible factors from 3 to p½ which are odd (knowing that p > 2 is not prime if it is even).

Not knowing anything else, this is about as efficient a method we can find for determining if some integer p is prime. But in this instance we do have more information. Indeed, from counting the primes we have come across we will know the primes less than p when determining if p is itself prime.

The missing piece of the puzzle is how primes relate to lower primes. If we consider some integer n, we know what we can express n as a product of its factors. Each of those factors (integers themselves) can equally be expressed as a product of their factors, and so we can express n as the product of factors of its immediate factors. At what point does this end? When the factors are prime. Indeed, any non-prime integer can be expressed solely as the product of prime factors. Therefore, if we know all of the primes less than p and none of them divide into p without remainder, p is also prime.

Algorithm: and so to find the n^th prime number:

Iterate over the whole number system, ignoring even numbers greater than 2: 2, 3, 5, 7, 9, 11, …

For each integer, p, check if p is prime:

Iterate over the primes already found which are less than the square-root of p

For each prime in this set, f, check to see if f is a factor of p:

If f divides p then p is non-prime. Continue from step 2 for the next p.

If no factors are found, p is prime. Continue to step 3.

If p is not the n^th prime we have found, add it to the list of primes. Continue from step 2 for the next p.

Otherwise, p is the n^th prime we have found and we should return it.

A PHP implementation of this algorithm follows with a couple of further improvements. If we wish to get a set of primes, from the ith to the jth, it is extremely wasteful to continue calculating what the primes are between 1 and i (and k, for each i < k ≤ j). Thus I have implemented the ability to request an array of n^th‘s, minimising the amount of work required. There are also a couple of optional parameters for specifying a time limit, and how accurately this limit is adhered to.

<?php

/**
 * A method for finding n-th prime numbers. Usage example follows
 * at the end. 24 March 2011.
 * @author Paul Scott <paul.scotto@gmail.com>
 */

/**
 * Calculate the n-th prime number(s).
 *
 * This function operates on the realisation that a number is
 * prime if a factor cannot be found in the primes less than it
 * (since all numbers can be expressed as a product of primes). We
 * can further optimise this process by only considering primes
 * less than or equal to the square root as potential factors.
 *
 * @param int|array $nth
 *        The index/indices of the primes you want.
 * @param int|false $t
 *        The number of seconds to allow for calculation, or false
 *        (denoting no time limit (though bare max_execution_time
 *        in mind)).
 * @param int $tCheck
 *        How frequently (in terms of numbers tested for being
 *        prime) a breached time limit is checked for.
 *
 * @return int|array|null
 *        Returns an integer if $nth is an integer and an array if
 *        $nth an array. The array is associative of the form $n
 *        => get_prime($n), where $n is an integer within the
 *        original $nth array.
 *            If the time limit is reached, null is returned where
 *        $nth is an integer and an array with any computed primes
 *        is returned where $nth is an array.
 */
function get_prime($nth, $t = false, $tCheck = 1000)
{
    // Transform the args into a common form, discarding bad n-ths
    $singular = !is_array($nth);
    $nth = array_filter((array) $nth,
        function($n) {
            return is_int($n) && $n > 0;
        });
    if (!$nth) return $singular ? null : array();

    // The n-th prime we're aiming for
    $n = max($nth);

    // The first prime is the only even one
    $primes = array(1 => 2);
    if ($n == 1) {
        return $singular ? $primes[1] : $primes;
    }

    // Loop counters
    $c = 1;
    $p = 3;
    $begin = microtime(true);

    while (true)
    {
        // Check if $p is prime
        $prime = true;
        $sqrt = sqrt($p);
        for ($i = 1; $i < $c && $primes[$i] <= $sqrt; $i++) {
            if ($p % $primes[$i] == 0) {
                $prime = false;
                break;
            }
        }
        // Record $p if prime
        if ($prime) {
            $primes[++$c] = $p;
            if ($c == $n) {
                break;
            }
        }
        // Check if time limit expired (every $tCheck passes)
        if (
                $t && ($p % $tCheck <= 1) &&
                (microtime(true) - $begin) > $t
        ) {
            break;
        }
        // Next $p to check
        $p += 2;
    }

    if ($singular) {
        return isset($primes[$n]) ? $primes[$n] : null;
    } else {
        return array_intersect_key(
            $primes,
            array_fill_keys($nth, null)
        );
    }
}

/*
 * =============
 * USAGE EXAMPLE
 */

?>

<p>The 1,000<sup>th</sup> prime number:</p>
<pre>get_prime(1000);</pre>
<?php stopwatch(function() { return get_prime(1000); }); ?>

<hr />

<p>In trying to calculate the 10,000<sup>th</sup> prime we limit
ourselves to 0.1 seconds. Where this is not enough time the
function will bail and return <code>null</code>.</p>
<pre>get_prime(10000);</pre>
<?php stopwatch(function() { return get_prime(10000); }); ?>
<pre>get_prime(10000, 0.1);</pre>
<?php stopwatch(function() { return get_prime(10000, 0.1); }); ?>

<hr />

<p>Array functions can be used to get a set of primes&hellip;</p>
<pre>array_map('get_prime', range(0, 50));</pre>
<?php stopwatch(function() {
    return array_map('get_prime', range(0, 50));
}); ?>

<hr />

<p>&hellip; but it is far more efficient to request them all from
<code>get_prime</code> simultaneously. The resultant array will be
slightly different.</p>
<pre>get_prime(range(0, 50));</pre>
<?php stopwatch(function() {
    return get_prime(range(0, 50));
}); ?>

<?php

/**
 * Function-call timing.
 * @param callable $fn
 * @param bool $dump
 */
function stopwatch($fn, $dump = true)
{
    $t = microtime(true);
    $result = $fn();
    $t = microtime(true) - $t;

    echo '<em>Call took ', $t, ' seconds.</em>';
    if ($dump) {
        echo '<pre>';
        var_dump($result);
        echo '</pre>';
    }
    return $result;
}

109

Kudos

109

Kudos