Skip to main content

Prime Factors

Authors: @Ishwarendra @wingkwong

Overview

A prime factor is a factor of a number that is also a prime number. For example, the prime factors of 1212 are 22 and 33, as 12=22312 = 2 * 2 * 3. Prime factorization is the process of finding the prime factors of a number. This can be done by repeatedly dividing the number by prime numbers until the remaining number is also a prime number. The prime factorization of a number is unique, meaning that there is only one combination of prime numbers that multiply together to equal that number. The prime factorization of a number can be useful in solving mathematical problems such as finding the greatest common divisor (GCD) of two numbers or the least common multiple (LCM) of two numbers.

Approach #1: Finding prime factor in n\sqrt n time

We know that every composite number NN has a prime factor less than NN. Thus, we can iterate over numbers from 22 to NN and whenever we find number that divides NN we divide NN by it until it no longer divides NN.

But wouldn't we accidently divide N by a non-prime number this way?

No, we won't because every composite number has a prime factor less than NN so as we iterate over 22 to NN we will cancel out all prime factors of any non-prime numbers from NN making it indivisible by that non-prime number.

We can further speed up our algorithm by using the fact that a number can have atmost 11 prime factor greater than (n)\sqrt(n). This can be easily proved by assuming that a number contains two prime factors greater than n\sqrt{n}, let's call them PP and QQ.

Since P>nP > \sqrt n and Q>nQ > \sqrt n     \implies PQ>nn(=n2)P \cdot Q > \sqrt n \cdot \sqrt n (= n^2). This is not possible as PP and QQ are prime factor of NN so their product must be less than or equal to NN.

Implementation

Written by @Ishwarendra
vector<int> getPrimeFactors(int n) {
vector<int> prime_factors;
// i * i <= n or i <= sqrt(n) both work.
for (int i = 2; i * i <= n; i++) {
while (n % i == 0) {
prime_factors.emplace_back(i);
n /= i;
}
}
if (n > 1) prime_factors.emplace_back(n);
return prime_factors;
}
  • Time Complexity: O(n)O(\sqrt{n})

Approach #2: Finding prime factor in log(n) time

Before moving further, it's important to recognise that a number NN cannot have more than log(N)log(N) prime factors. Since the smallest prime factor a number can have is 22 and 2log(n)n2^{log(n)} \geq n. If the number has a prime factor bigger than 22 than it will have even lesser prime factors. The pre-requisite for this method is knowing Sieve of Eratosthenes.

We can optimise our code to find prime factors of a number NN in O(log(N))O(log(N)) time. But to do so, we need to do some pre-computations which takes O(MAX)O(MAX) time, where MAXMAX is the largest number upto which we can find it's prime factors. Thus the final time complexity is O(MAX+log(N))O(MAX + log(N)). This is helpful only if we have to find prime factor of multiple numbers less than or equal to MAXMAX.

Since the pre computation takes O(MAX)O(MAX) time so we cannot find prime factors of very large numbers (due to constraints on time) using this method. The pre computation process is a modification of Sieve of Eratosthenes. For each number from 22 to MAXMAX we find it's smallest prime factor and store it in an array, let's call it min_prime. We will see how to find min_prime array later, first let us see how to calculate prime factor of a number NN if we already have the min_prime array in hand.

Implementation

Written by @Ishwarendra
// For sake of simplicity let us assume min_prime is a global vector.
vector<int> getPrimeFactorsInLogn(int n) {
vector<int> prime_factors;
while (n > 1) {
int smallest_prime_factor_of_n = min_prime[n];
prime_factors.push_back(smallest_prime_factor_of_n);
n /= smallest_prime_factor_of_n;
}
return prime_factors;
}
  • Time Complexity: O(log(n))O(log(n)), since each time the number nn is divided by a number greater than equal to 22, so it will take only log(n)log(n) operation for nn to reach 11.
How to find min_prime array?
vector<int> minPrime(int MAX) {
vector<int> min_prime(MAX + 1);
// Set min_prime[i] = i
iota(begin(min_prime), end(min_prime), 0);
for (int i = 2; i * i <= MAX; i++) {
if (min_prime[i] != i) continue;
// if min_prime[i] = i then i must be a prime number.
// Any multiple of i less than i * i (i.e., i * 2, i * 3, ... i * (i - 1)) has a smaller prime factor than i.
// Any number >= i * i may have i as it's minimum prime factor
for (int j = i * i; j <= MAX; j += i) min_prime[j] = min(min_prime[j], i);
}
return min_prime;
}

Now, that we are aware the techniques used to find prime factors let us solve some problem to get a better grip over the topic.

Example #1: 2521. Distinct Prime Factors Product of Array

Let us take two numbers A=p1a1×p2a2××pkakA = {p_1} ^ {a_1} \times {p_2} ^ {a_2} \times \cdots \times {p_k} ^ {a_k} and B=p1b1×p2b2××pkbkB= {p_1} ^ {b_1} \times {p_2} ^ {b_2} \times \cdots \times {p_k} ^ {b_k}, where pip_i is a prime numbers and ai,bi0a_i, b_i \geq 0.

What is A×BA \times B?

We know ab×ac=ab+ca^b \times a^c = a^{b + c}. Using this property we can say that A×B=(p1a1+b1)×(p2a2+b2)××(pkak+bk).A \times B = ({p_1}^{a_1 + b_1}) \times ({p_2} ^ {a_2 + b_2}) \times \cdots \times ({p_k}^{a_k + b_k}). Thus, we can find prime factor of individual numbers and just add up the power of primes. Function getPrimeFactors(n) is same as coded in the start. To get distinct elements we use set to store all prime factors of all numbers.

Written by @Ishwarendra
int distinctPrimeFactors(vector<int>& nums) {
set<int> distinct_pf;
for (auto num : nums) {
auto pf = getPrimeFactors(num);
for (auto prime : pf) distinct_pf.emplace(prime);
}
return distinct_pf.size();
}
  • Time Complexity: O(nMAX)O(n \cdot \sqrt MAX), where MAXMAX is the max number present in nums and nn is length of nums.
  • Space Complexity: O(\text\{number of prime from 2 to MAX}).

We can code same approach using getPrimeFactorsInLogn() function (Method-2).

Written by @Ishwarendra
class Solution {
vector<int> minPrime(int MAX) {
vector<int> min_prime(MAX + 1);
iota(begin(min_prime), end(min_prime), 0);
for (int i = 2; i * i <= MAX; i++) {
if (min_prime[i] != i) continue;
for (int j = i * i; j <= MAX; j += i) min_prime[j] = min(min_prime[j], i);
}
return min_prime;
}

// We need to pass the min_prime vector to use inside the function.
vector<int> getPrimeFactorsInLogn(int n, vector<int> &min_prime) {
vector<int> prime_factors;
while (n > 1) {
int smallest_prime_factor_of_n = min_prime[n];
prime_factors.push_back(smallest_prime_factor_of_n);
n /= smallest_prime_factor_of_n;
}
return prime_factors;
}

public:
int distinctPrimeFactors(vector<int>& nums) {
set<int> distinct_pf;
// Finding the limit upto which we should run sieve.
const int max = *max_element(begin(nums), end(nums));
auto min_prime = minPrime(max + 1);
for (auto num : nums) {
// Get prime factors of each number in log(n) time.
auto pf = getPrimeFactorsInLogn(num, min_prime);
for (auto prime : pf) distinct_pf.emplace(prime);
}
return distinct_pf.size();
}
};
  • Time Complexity: O(MAX+nlog(MAX))O(MAX + n \cdot log(MAX)), where MAXMAX is the max element of nums and nn is the length of nums.
  • Space Complexity: O(MAX)O(MAX), since we need the min_prime vector. Other vector are of size log(MAX)log(MAX) only.

Example #2: 2507. Smallest Value After Replacing with Sum of Prime Factors

In this problem, we have keep replacing nn with sum of it's prime factors. We need to stop the process once nn becomes prime as a prime number has only two factors 11 and itself. Since 11 is not a prime number so it doesn't change. Once our process stops, we will have the minimum number, because sum of prime factor of any number is always less than itself.

Sum of prime factor of any number is always less than or equal to itself. Why?

If we take a composite number then it can be written as product of atleast 22 different numbers. Let NN be a composite number such that N=ABN = A \cdot B, where A,B>1A, B > 1.

We know that

AB2A\begin{equation} A \cdot B \geq 2 \cdot A \end{equation}AB2B\begin{equation} A \cdot B \geq 2 \cdot B \end{equation}

Adding (1)(1) and (2)(2) we get, 2(AB)2(A+B)    ABA+B.2 \cdot (A \cdot B) \geq 2 \cdot (A + B) \implies A \cdot B \geq A + B.

Written by @Ishwarendra
class Solution {
public:
vector<int> getPrimeFactors(int n) {
vector<int> prime_factors;
// i * i <= n or i <= sqrt(n) both work.
for (int i = 2; i * i <= n; i++) {
while (n % i == 0) {
prime_factors.emplace_back(i);
n /= i;
}
}
if (n > 1) prime_factors.emplace_back(n);
return prime_factors;
}

int smallestValue(int n) {
while (true) {
auto pf = getPrimeFactors(n);
int sum = accumulate(begin(pf), end(pf), 0);
if (sum == n) break;
n = sum;
}
return n;
}
};
  • Time Complexity: O(nsqrtnlog(n)O(n \cdot sqrt{n} \cdot log(n).
  • Space Complexity: O(log(n))O(log(n)), since there cannot be more than log(n)log(n) prime factor of a number nn.

The above code is bound to come out of while loop in atmost log(n)log(n) steps. In the worst case, n=2Pn = 2 \cdot P where P=n2P = \frac{n}{2} is a prime number. In this case nn decrease from nn to n2+2\frac{n}{2} + 2 which is approximately n2\frac{n}{2}. Thus at each step nn decrease by a factor of 22. If it has bigger prime factors then it will take even less steps to terminate the while loop.

Example #3: 952. Largest Component Size by Common Factor

This problem requires basic knowledge of graph traversal. The two popular traversals are Depth First Search and Breadth First Search.

First let us see how to build the graph. Naive way of building graph would be to iterate over all (i,j)(i, j) pairs and check if gcd(numsi,numsj)>1\gcd(nums_i, nums_j) > 1. If it is then add an edge bweten them, else not. But this approch will take O(n2)O(n^2) operations in worst case, which will not pass.

A different way to build graph could be to add edge between pp and numsinums_i if pp is a prime factor of numsinums_i. Since any number nn has atmost log(n)log(n) prime factors so we will have atmost nlog(n)n \cdot log(n) edges in the graph. But while counting component we need to make sure that the node is present in nums. If it isn't then it can serve as a bridge connecting two numbers from nums, but it should not included in count of nodes in connected component.

An Example

For Example: nums = [6, 8]. In this case 22 which is a prime factor of 66 and 88 acts as a bridge connecting 66 to 88 but total connected component should be 22 only.

Graph for above example looks as shown below. Although total node in the connected component is 44 but we do not count 22 and 33 when reporting answer as it isn't present in nums. Image generated using CS Academy Graph Editor.

graph when nums=[2, 6]

Now, as the graph is built we need three main functions:

  1. getPrimeFactorsInLogn(n): Gives us prime factor of a number nn. Since max(nums)\max(nums) is only 10510^5 so we prefer this function over the one which does the same thing in n\sqrt n time.
  2. getComponentCount(src): Starting from node src\text{src} how many nodes (that are present in nums) can be reached from it.
  3. largestComponentSize(nums): The main function that builds graph and use other 22 functions.
Written by @Ishwarendra
class Solution {
// isPresent[i] = 1 if i is present in nums, else 0
// g = graph using which we can find maximum connected component size
vector<int> isPresent;
vector<vector<int>> g;
vector<int> minPrime(int MAX) {
vector<int> min_prime(MAX + 1);
iota(begin(min_prime), end(min_prime), 0);
for (int i = 2; i * i <= MAX; i++) {
if (min_prime[i] != i) continue;
for (int j = i * i; j <= MAX; j += i) min_prime[j] = min(min_prime[j], i);
}
return min_prime;
}

vector<int> getPrimeFactorsInLogn(int n, vector<int> &min_prime) {
vector<int> prime_factors;
while (n > 1) {
int smallest_prime_factor_of_n = min_prime[n];
// Since we only need unique prime factors so we won't push same number again and again
if (n % smallest_prime_factor_of_n == 0) prime_factors.push_back(smallest_prime_factor_of_n);
// Remove the prime factor completely from n
while (n % smallest_prime_factor_of_n == 0) n /= smallest_prime_factor_of_n;
}
return prime_factors;
}

int getComponentSize(int src, vector<bool> &vis) {
queue<int> q;
q.emplace(src);
vis[src] = true;
int count = isPresent[src];
while (!q.empty()) {
int node = q.front();
q.pop();
for (int child : g[node]) {
if (!vis[child]) {
q.emplace(child);
vis[child] = true;
// Increase component size only if the number is present in nums.
count += isPresent[child];
}
}
}
return count;
}
public:
int largestComponentSize(vector<int>& nums) {
int max_elem = *max_element(begin(nums), end(nums));
auto min_prime = minPrime(max_elem + 1);
isPresent.resize(max_elem + 1);
g.resize(max_elem + 1);
vector<bool> vis(max_elem + 1);
for (int num : nums) {
isPresent[num] = 1;
auto pf = getPrimeFactorsInLogn(num, min_prime);
for (auto prime : pf) {
if (prime == num) continue;
// num and prime has gcd > 1
g[prime].emplace_back(num);
g[num].emplace_back(prime);
}
}
int ans = isPresent[1];
for (int src = 2; src <= max_elem; src++) {
ans = max(ans, getComponentSize(src, vis));
}
return ans;
}
};
  • Time Complexity: O(MAX+nlog(n))O(MAX + n \cdot log(n)), where MAXMAX is the max element of nums and nn is length of nums.
  • Space Complexity: O(MAX+nlog(n))O(MAX + n \cdot log(n)), as we are building a graph of nlog(n)n \cdot log(n) edges and (MAX+1)(MAX + 1) nodes.

The process of finding connected components can be done using a data structure Disjoint Set Union as well.

Suggested Problems

Problem NameDifficultySolution Link
263. Ugly NumberEasyView Solutions
650. 2 Keys KeyboardMediumView Solutions
2709. Greatest Common Divisor TraversalHardView Solutions