Finding frequency of given element in a sorted array in C

Arrays are a fundamental data structure in the C programming language, used for storing a collection of items at contiguous memory locations. The items can be accessed randomly by using the index number of any element. This concept becomes particularly useful when dealing with the problem of finding the frequency of a given element in a sorted array. Frequency, in this context, refers to the number of times a particular element appears in the array. Understanding how to efficiently perform this operation can significantly enhance the performance of an application, especially in data analysis or search algorithms.

Understanding the Problem

In the realm of arrays and data structures, ‘frequency’ of an element refers to the number of times the element appears within the array. For instance, if we have an array [1, 2, 2, 3, 3, 3, 4], the frequency of the element 3 is 3 because it appears three times.

Why Sorted Matters

Having a sorted array is a significant advantage when searching for the frequency of an element. In a sorted array, all instances of a given element are contiguous. This characteristic allows for optimized search techniques, such as binary search, to quickly locate the first occurrence of the element and subsequently count the total number of occurrences, potentially reducing the time complexity from linear to logarithmic in the best-case scenarios.

One straightforward method to count the occurrences of an element in an array is by using a linear search. This approach does not take advantage of the array being sorted but instead iterates through each element, comparing it to the target element to count its frequency.

Linear Search Explained

Linear search operates by starting at the first element of the array and sequentially checking every element until it finds the target element. Every time the target is found, a counter is incremented. This process continues until the entire array has been traversed. Although this method is simple to understand and implement, it is not the most efficient, especially for large arrays or arrays where the target element occurs early.

Code Snippet

1#include <stdio.h>

2

3// Function to find the frequency of a given element in an array

4int findFrequency(int arr[], int n, int x) {

5    int count = 0;

6    for (int i = 0; i < n; i++) {

7        if (arr[i] == x) {

8            count++;

9        }

10    }

11    return count;

12}

13

14int main() {

15    int arr[] = {1, 2, 2, 3, 3, 3, 4};

16    int n = sizeof(arr) / sizeof(arr[0]);

17    int x = 3;

18    printf("Frequency of %d is: %d\n", x, findFrequency(arr, n, x));

19    return 0;

20}

Time Complexity Analysis

The time complexity of the naive approach is O(n), where n is the number of elements in the array. This is because, in the worst case, the algorithm might need to examine each element of the array once. This linear time complexity is not the most efficient, particularly for large arrays or when the operation needs to be performed multiple times on similar datasets.

Binary Search Basics

Binary search operates on the principle of divide and conquer, targeting the middle of a sorted array and narrowing the search range based on comparison with the target value. Its time complexity is O(log n), making it vastly superior to a linear search for large datasets.

Finding the First Occurrence

To adapt binary search for finding the first occurrence of an element, we modify the condition when the target is found. If the target is equal to the middle element, instead of returning immediately, we continue the search in the left half to check for earlier occurrences.

1int findFirstOccurrence(int arr[], int n, int x) {

2    int low = 0, high = n - 1, result = -1;

3    while (low <= high) {

4        int mid = (low + high) / 2;

5        if (x == arr[mid]) {

6            result = mid;

7            high = mid - 1;

8        } else if (x < arr[mid]) low = mid + 1;

9        else high = mid - 1;

10    }

11    return result;

12}

Finding the Last Occurrence

Similarly, to find the last occurrence, the search continues in the right half after finding a match, to ensure no later instances of the element are present.

1int findLastOccurrence(int arr[], int n, int x) {

2    int low = 0, high = n - 1, result = -1;

3    while (low <= high) {

4        int mid = (low + high) / 2;

5        if (x == arr[mid]) {

6            result = mid;

7            low = mid + 1;

8        } else if (x < arr[mid]) low = mid + 1;

9        else high = mid - 1;

10    }

11    return result;

12}

Calculating Frequency

The frequency of the element can be calculated by subtracting the position of the first occurrence from that of the last occurrence and adding one. This method guarantees a quick and efficient calculation of frequency using binary search.

Code Optimization Tips

Edge Case Considerations

Handling edge cases, such as empty arrays or the target element not being present, is crucial for robustness. Always check for these scenarios before proceeding with the core logic.

Loop Invariants for Binary Search

Understanding and maintaining loop invariants, like ensuring the search space is halved at each step, is vital for correctly implementing binary search without falling into infinite loops or missing elements.

Handling Non-Existent Elements

Ensure to check if the first occurrence is -1 before proceeding to find the last occurrence to handle searches for elements not present in the array gracefully.