Prefix Sum Array
Concept Overview
- A prefix sum array, also known as a cumulative sum array, is a data structure used for efficient computation of the sum of elements within a given range of an array.
- From an architect's perspective, it's a fundamental optimization technique that trades space for time.
- By pre-calculating and storing the cumulative sums, we can answer multiple range sum queries in constant time, O(1), after an initial linear-time pre-computation.
Detailed Explanation
-
The core idea is simple:
-
We create a new array, let's call it prefix_sum, where prefix_sum[i] stores the sum of all elements from the beginning of the original array up to index i.
-
Let the original array be arr of size n.
- The prefix sum array, prefix_sum, will also be of size n (or n+1 for easier indexing).
-
Pre-computation:
- The prefix_sum array is built iteratively:
- prefix_sum[0] = arr[0]
- prefix_sum[i] = prefix_sum[i-1] + arr[i] for i > 0.
-
Range Sum Query:
-
To find the sum of elements from index i to j (inclusive) in the original array, we use the pre-computed values: sum(i, j) = prefix_sum[j] - prefix_sum[i-1], (assuming prefix_sum[-1] = 0).
- This works because prefix_sum[j] contains the sum from index 0 to j, and prefix_sum[i-1] contains the sum from 0 to i-1.
- Subtracting the latter from the former leaves the sum of elements from i to j.
Code Examples
public class PrefixSumExample {
public static int[] buildPrefixSumArray(int[] arr) {
int n = arr.length;
int[] prefixSum = new int[n];
if (n > 0) {
prefixSum[0] = arr[0];
for (int i = 1; i < n; i++) {
prefixSum[i] = prefixSum[i - 1] + arr[i];
}
}
return prefixSum;
}
public static int getRangeSum(int[] prefixSum, int i, int j) {
if (i > j || i < 0 || j >= prefixSum.length) {
throw new IllegalArgumentException("Invalid range.");
}
if (i == 0) {
return prefixSum[j];
} else {
return prefixSum[j] - prefixSum[i - 1];
}
}
public static void main(String[] args) {
int[] arr = {10, 20, 30, 40, 50};
int[] prefixSum = buildPrefixSumArray(arr);
System.out.println("Original array: " + java.util.Arrays.toString(arr));
System.out.println("Prefix sum array: " + java.util.Arrays.toString(prefixSum));
int startIdx = 1;
int endIdx = 3;
int sum = getRangeSum(prefixSum, startIdx, endIdx);
System.out.println("Sum from index " + startIdx + " to " + endIdx + ": " + sum);
}
}
Diagrams
- A simple, textual representation is effective.
Example:
- Calculating prefix sum array.
- Formula: pf[i]=pf[i−1]+arr[i]
Array Data: |
-3 |
6 |
2 |
4 |
5 |
2 |
8 |
-9 |
3 |
1 |
Index: |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
Prefix-Sum Array: |
-3 |
3 |
5 |
9 |
14 |
16 |
24 |
15 |
18 |
19 |
- Edge case:
- For index value of 0
- pf[0]=arr[0]
- For sum from index '0' to 'k'
- sum=pf[k]
Real-World Applications
Advantages & Drawbacks
- Advantages:
- O(1) Time Complexity for Queries:
- After the initial setup, any range sum query is incredibly fast.
- Simplicity:
- The concept is easy to understand and implement.
- Drawbacks:
- O(N) Space Complexity:
- Requires an additional array of the same size as the original, which doubles the space requirement.
- Static Data Requirement:
- It's not efficient for arrays that undergo frequent updates.
- Any change to an element arr[i] requires re-calculating the prefix sums for all indices from i to the end, which is an O(N) operation.
Trade-offs
- The core trade-off is Space vs. Time.
-
Prefix Sum:
- High space (O(N)) for low time (O(1)) per query. Best for many queries on a static dataset.
-
Brute-Force Summation:
- Low space (O(1)) but high time (O(N)) per query.
- Best for few queries or a frequently updated dataset.
- Other alternatives include more complex data structures like a Segment Tree or a Fenwick Tree (Binary Indexed Tree).
- These structures offer a better balance, providing O(logN) time for both range queries and updates.
However, they are more complex to implement.
Best Practices
-
Use prefix sums when you have a large number of range sum queries on an array that is static or infrequently updated.
-
Consider using an n+1 size array for the prefix sum to simplify the i=0 case, where the prefixSum[j]−prefixSum[i−1] formula still holds by assuming prefixSum[-1] = 0.
-
For 2D matrices, pre-computing a 2D prefix sum array is a common practice to answer queries on rectangular regions.
Interview Angle
- Possible Questions:
- How would you find the sum of elements in a range [i, j] of an array?
- Note: Start with brute-force, then introduce prefix sum as an optimization.
- What are the time and space complexities of a prefix sum array? What is its main drawback?"
- When would you choose a prefix sum array over a brute-force approach?"
- What if the array is frequently updated? What data structure would you use then?
- Note: This is a classic follow-up that leads to Segment Trees or Fenwick Trees.
- How to Answer:
- Start by defining the problem and the simple, brute-force solution (looping from i to j). State its complexity (O(N)).
- Introduce the prefix sum as an optimization for scenarios with multiple queries. Clearly explain the pre-computation step and the formula for queries.
- Be ready to discuss the trade-offs and when it's appropriate. The ability to compare different solutions (brute-force vs. prefix sum vs. segment tree) demonstrates a deeper architectural understanding.
Coding problems
- Q1. Find sum of all values between gien indexes (startIndex,endIndex) inclusive.
- Q2. Find Equilibrium index in given array
- Q3. Find all Equilibrium indexs in given array
- Q4. Given an array of integers, count the number of special indexes. Where a special index is defined as, after removing that index, it satisfies below equation, in the resulting array where the special index is removed.
- Special Index ==> Sum of all Odd Index elements == Sum of all even index elements.
- Asked in:
- Amazon, Google, Microsoft, Adobe, DirectI & Codenation.
Online References
Summary
- A prefix sum array is an elegant data structure that pre-computes the cumulative sums of an array.
- This allows for constant-time, O(1), range sum queries at the cost of O(N) space and a one-time O(N) pre-computation.
- It's an excellent technique for optimizing applications with numerous queries on static data.
- Analogy:
- Think of a running bank balance. The prefix_sum[i] is your bank balance at the end of day i.
- To find out how much you spent between day i and j (inclusive), you just look at the balances: balance at day j - balance at day (i-1).
- You don't need to re-add every transaction from day i to j individually.
- This analogy perfectly captures the essence of a prefix sum. It's a fundamental building block, a low-level optimization that can have a significant impact when applied correctly.
- It's also the basis for more advanced data structures like the Fenwick Tree.
Go to Home
------- End -------