Binary Search
Overview
Binary search is a widely used algorithm for searching an element in a sorted array or list. The basic idea of binary search is to divide the search space in half with each iteration and compare the middle element with the target element. If the middle element is greater than the target element, the search space is reduced to the left half of the array, otherwise, it is reduced to the right half. This process is repeated until the target element is found or the search space is exhausted.
The time complexity of binary search is , which is more efficient than the linear search algorithm , which checks all elements one by one. However, for binary search to work, the array or list must be sorted.
Binary search can be implemented using a while loop, recursion, or a combination of both. The implementation details may vary, but the basic idea remains the same. The basic steps for a binary search algorithm are:
- Initialize two pointers, one pointing to the start of the array and the other pointing to the end.
- Find the middle element of the array by calculating the average of the two pointers.
- Compare the middle element with the target element.
- If the middle element is equal to the target element, the search is complete and the index of the target element is returned.
- If the middle element is less than the target element, move the left pointer to the middle element + 1 and repeat step 2.
- If the middle element is greater than the target element, move the right pointer to the middle element - 1 and repeat step 2.
- If the left pointer is greater than the right pointer, the target element is not found and the function returns -1.
In conclusion, binary search is a fast and efficient algorithm for searching an element in a sorted array or list. Its time complexity is , which is much better than the linear search algorithm. However, it requires the array or list to be sorted for it to work.
Let's look at the most basic form of binary search:
Example: 0704. Binary Search
Given an array of integers
nums
which is sorted in ascending order, and an integertarget
, write a function to searchtarget
innums
. Iftarget
exists, then return its index. Otherwise, return-1
.You must write an algorithm with
O(log n)
runtime complexity.
For example, given the sorted input:
nums = [-1,0,3,5,9,12], target = 9
The index of the element 9 is 4. We can use the following template to find the target
- Python
- C++
- Java
def binarySearch(nums, target):
lp, rp = 0, len(nums) - 1
while (lp <= rp):
mid = (lp + rp) // 2
if (nums[mid] == target):
return mid
elif (nums[mid] < target):
lp = mid + 1
else:
rp = mid - 1
return -1
int binarySearch(vector<int>& nums, int target) {
int lp = 0, rp = nums.size() - 1;
while (lp <= rp) {
int mid = lp + (rp - lp) / 2;
if (nums[mid] == target) {
return mid;
} else if (nums[mid] < target) {
lp = mid + 1;
} else {
rp = mid - 1;
}
}
return -1;
}
int binarySearch(int[] nums, int target) {
int lp = 0, rp = nums.length - 1;
while (lp <= rp) {
int mid = lp + (rp - lp) / 2;
if (nums[mid] == target) {
return mid;
} else if (nums[mid] < target) {
lp = mid + 1;
} else {
rp = mid - 1;
}
}
return -1;
}
There can be very challenging questions using binary search, but we should focus on the basic application first.
Tips
-
When the input is sorted, it's probably a hint to use binary search.
-
When you need to find the first / last index of something, then it may be another hint to use binary search (as index is sorted).
-
When initialising the boundary, we may think about the possible range (sometimes may not necessarily be the max of the constraint
-
In while condition, you may see some people write , , , or . either one works depending on how to write the logic - just a personal preference. See here for more.
-
Think about the mid - if there are even elements, should we pick the left or the right mid? e.g. [1,2,3,4] like choosing or ?
-
Think about how to shrink - if is never be the answer, then we can exclude it ( or ).
-
If you are using C++, most people use to avoid integer overflow problem. Let's say & are large enough like ~INT_MAX, when you sum them up, it causes overflow.
We know that is actually somewhere in between and so let where is an arbitrary value. Since we know that , we can substitute to have the following