Efficient Submatrix Sum Query Solutions: How to Solve Matrix Problems Asked in Google, Microsoft, and More.
Introductions:
This blog dives into an optimal approach for solving submatrix sum queries, a common problem in coding interviews at top tech companies like Google, Microsoft, and Amazon. We provide a step-by-step explanation of the prefix sum method and offer efficient Python and C++ solutions to tackle these matrix-related challenges.
Problem statement
You have given a 2-dimensional array ‘ARR’ with ‘N’ rows and ‘M’ columns. The queries are given in a 2-dimensional array ‘Queries’ of size ‘K’, in which Queries[i] contains four integers denoting the left top and right bottom indices of the submatrix whose sum we need to find. Your task is to find the sum of the submatrix for each query given in the array ‘Queries’.
For example:
Consider ARR = [[1 , 2 , 3] , [3 , 4 , 1] , [2 , 1 , 2]] and Queries = [[0 , 0 , 1 , 2]], the submatrix with the left top index (0 , 0) and right bottom index (1 , 2) is
[[1 , 2 , 3] ,
[3 , 4 , 1]].
The sum of the submatrix is 14. Hence, the answer is 14 in this case.
Detailed explanation ( Input/output format, Notes, Images )
Constraints :
1 <= N <= 10 ^ 3
1 <= M <= 10 ^ 3
1 <= K <= 10 ^ 3
1 <= ARR[i][j] <= 10 ^ 6
0 <= Queries[i][0] , Queries[i][2] < N
0 <= Queries[i][1] , Queries[i][3] < M
Where 'T' denotes the number of test cases, 'N' and 'M' denotes the number of rows and the number of columns in the array ‘ARR’ respectively, ’K’ denotes the number of rows in the array ‘Queries’, 'ARR[i][j]' denotes the ’j-th’ element of 'i-th' row of the array 'ARR' and 'Queries[i]' contains four integers denoting the left top and right bottom indices of the submatrix.
Time Limit: 1sec
Sample Input 1 :
2
2 2 1
4 2
1 3
0 0 1 0
3 3 2
2 1 2
3 2 6
1 4 5
1 1 2 2
0 1 0 2
Sample Output 1 :
5
17 3
Explanation of sample input 1:
For the first test case,
The submatrix with the left top index (0 , 0) and right bottom index (1 , 0) is
[[4] ,
[1]].
The sum of the submatrix is 5. Hence, the answer is 5 in this case.
For the second test case,
The submatrix with the left top index (1 ,1) and right bottom index (2 , 2) is
[[2 , 6] ,
[4 , 5]].
The sum of the submatrix is 17. Hence, the answer is 17 in this case.
The submatrix with the left top index (0 , 1) and right bottom index (0 , 2) is
[[1 , 2]].
The sum of the submatrix is 3. Hence, the answer is 3 in this case.
Sample Input 2 :
2
2 2 2
5 6
7 5
0 0 0 0
0 0 1 1
3 3 2
3 4 3
4 3 4
1 1 1
0 0 0 2
0 0 2 1
Sample Output 2 :
5 23
10 16
Steps to approach:
- Prefix Sum Array Construction: Construct a 2D prefix sum array
prefix_sum
, whereprefix_sum[i][j]
stores the sum of elements in the rectangle from the top-left corner(0,0)
to(i,j)
of the original matrix. This can be calculated as:
[
\text{prefix_sum}[i][j] = \text{ARR}[i][j] + \text{prefix_sum}[i-1][j] + \text{prefix_sum}[i][j-1] – \text{prefix_sum}[i-1][j-1]
] - Query Calculation: For each query, you can compute the sum of the submatrix using the inclusion-exclusion principle:
[
\text{submatrix_sum} = \text{prefix_sum}[br][bc] – \text{prefix_sum}[br][tc-1] – \text{prefix_sum}[tr-1][bc] + \text{prefix_sum}[tr-1][tc-1]
]
where(tr, tc)
is the top-left and(br, bc)
is the bottom-right of the submatrix. - Edge Case Handling: If the submatrix touches the boundaries of the matrix (i.e., rows or columns are 0), handle it separately by skipping terms that go out of bounds.
Python Implementation
def preprocess_prefix_sum(ARR, N, M):
# Create a prefix sum array
prefix_sum = [[0 for _ in range(M)] for _ in range(N)]
# Compute the prefix sum
for i in range(N):
for j in range(M):
prefix_sum[i][j] = ARR[i][j]
if i > 0:
prefix_sum[i][j] += prefix_sum[i-1][j]
if j > 0:
prefix_sum[i][j] += prefix_sum[i][j-1]
if i > 0 and j > 0:
prefix_sum[i][j] -= prefix_sum[i-1][j-1]
return prefix_sum
def query_sum(prefix_sum, tr, tc, br, bc):
total = prefix_sum[br][bc]
if tr > 0:
total -= prefix_sum[tr-1][bc]
if tc > 0:
total -= prefix_sum[br][tc-1]
if tr > 0 and tc > 0:
total += prefix_sum[tr-1][tc-1]
return total
def solve():
T = int(input()) # Number of test cases
for _ in range(T):
N, M, K = map(int, input().split())
ARR = [list(map(int, input().split())) for _ in range(N)]
Queries = [list(map(int, input().split())) for _ in range(K)]
# Precompute prefix sum
prefix_sum = preprocess_prefix_sum(ARR, N, M)
# Process each query
results = []
for query in Queries:
tr, tc, br, bc = query
result = query_sum(prefix_sum, tr, tc, br, bc)
results.append(result)
# Print the results for this test case
print(" ".join(map(str, results)))
# Example usage:
solve()
C++ Implementation
#include <iostream>
#include <vector>
using namespace std;
// Function to preprocess the prefix sum array
vector<vector<int>> preprocess_prefix_sum(const vector<vector<int>>& ARR, int N, int M) {
vector<vector<int>> prefix_sum(N, vector<int>(M, 0));
for (int i = 0; i < N; i++) {
for (int j = 0; j < M; j++) {
prefix_sum[i][j] = ARR[i][j];
if (i > 0) {
prefix_sum[i][j] += prefix_sum[i - 1][j];
}
if (j > 0) {
prefix_sum[i][j] += prefix_sum[i][j - 1];
}
if (i > 0 && j > 0) {
prefix_sum[i][j] -= prefix_sum[i - 1][j - 1];
}
}
}
return prefix_sum;
}
// Function to calculate the sum of the submatrix
int query_sum(const vector<vector<int>>& prefix_sum, int tr, int tc, int br, int bc) {
int total = prefix_sum[br][bc];
if (tr > 0) {
total -= prefix_sum[tr - 1][bc];
}
if (tc > 0) {
total -= prefix_sum[br][tc - 1];
}
if (tr > 0 && tc > 0) {
total += prefix_sum[tr - 1][tc - 1];
}
return total;
}
void solve() {
int T;
cin >> T; // Number of test cases
while (T--) {
int N, M, K;
cin >> N >> M >> K;
vector<vector<int>> ARR(N, vector<int>(M));
for (int i = 0; i < N; i++) {
for (int j = 0; j < M; j++) {
cin >> ARR[i][j];
}
}
vector<vector<int>> Queries(K, vector<int>(4));
for (int i = 0; i < K; i++) {
cin >> Queries[i][0] >> Queries[i][1] >> Queries[i][2] >> Queries[i][3];
}
// Precompute prefix sum
vector<vector<int>> prefix_sum = preprocess_prefix_sum(ARR, N, M);
// Process each query
for (const auto& query : Queries) {
int tr = query[0], tc = query[1], br = query[2], bc = query[3];
cout << query_sum(prefix_sum, tr, tc, br, bc) << " ";
}
cout << endl;
}
}
// Example usage:
int main() {
solve();
return 0;
}
Time Complexity:
- Preprocessing (building prefix sum): O(N * M) where N and M are the dimensions of the matrix.
- Querying each submatrix: O(1) per query using the prefix sum technique.
Thus, the overall time complexity is O(N * M + K) per test case, which is highly efficient for large matrices.
Space Complexity:
- Space: O(N * M) for the prefix sum array.
Easy Mode Approach:
Here’s a more polished and easy-to-understand version that you can directly add to your blog page, complete with detailed explanations, clear formatting, and structured steps:
Efficient Submatrix Sum Calculation Using 2D Prefix Sum Array
When solving matrix-related problems, one of the most efficient techniques for answering multiple sum queries is using a 2D Prefix Sum Array. This method significantly reduces the time complexity for each query, allowing you to handle even large matrices efficiently.
Step 1: Constructing the 2D Prefix Sum Array
The prefix sum array is a 2D array where each element at position prefix_sum[i][j]
stores the sum of all elements in the rectangle from the top-left corner (0, 0) to (i, j) of the original matrix ARR
.
The formula for calculating each element of the prefix sum array is:
[
\text{prefix_sum}[i][j] = \text{ARR}[i][j] + \text{prefix_sum}[i-1][j] + \text{prefix_sum}[i][j-1] – \text{prefix_sum}[i-1][j-1]
]
Explanation:
ARR[i][j]
: The value at the current position.prefix_sum[i-1][j]
: The sum of elements above the current row.prefix_sum[i][j-1]
: The sum of elements to the left of the current column.prefix_sum[i-1][j-1]
: Subtract the overlap (top-left corner) that gets added twice.
Example:
Consider the matrix ARR
:
[
\begin{bmatrix}
1 & 2 & 3 \
3 & 4 & 1 \
2 & 1 & 2
\end{bmatrix}
]
The corresponding prefix_sum
array would be:
[
\begin{bmatrix}
1 & 3 & 6 \
4 & 10 & 14 \
6 & 13 & 19
\end{bmatrix}
]
Step 2: Calculating the Submatrix Sum Using Inclusion-Exclusion
Once the prefix sum array is built, you can efficiently compute the sum of any submatrix in constant time using the inclusion-exclusion principle.
For a given submatrix defined by:
- Top-left corner (tr, tc)
- Bottom-right corner (br, bc)
The formula to calculate the sum of the submatrix is:
[
\text{submatrix_sum} = \text{prefix_sum}[br][bc] – \text{prefix_sum}[br][tc-1] – \text{prefix_sum}[tr-1][bc] + \text{prefix_sum}[tr-1][tc-1]
]
Explanation:
prefix_sum[br][bc]
: Sum of elements from(0,0)
to(br, bc)
.prefix_sum[br][tc-1]
: Subtract the sum of elements to the left of the submatrix.prefix_sum[tr-1][bc]
: Subtract the sum of elements above the submatrix.prefix_sum[tr-1][tc-1]
: Add back the overlap that was subtracted twice.
Example:
For the submatrix with top-left corner (1,1)
and bottom-right corner (2,2)
in our previous matrix, the sum is:
[
\text{submatrix_sum} = 19 – 6 – 10 + 1 = 4
]
Step 3: Handling Edge Cases
When the submatrix touches the boundaries of the matrix (i.e., when tr
, tc
, br
, or bc
are 0), some of the terms in the formula might go out of bounds. To handle these cases, we can conditionally skip terms:
- If
tr == 0
, skipprefix_sum[tr-1][bc]
. - If
tc == 0
, skipprefix_sum[br][tc-1]
. - If both
tr == 0
andtc == 0
, skip addingprefix_sum[tr-1][tc-1]
.
Key Benefits of This Approach:
- Efficiency: Preprocessing the matrix with a 2D prefix sum takes O(N * M) time. After that, each submatrix sum query can be computed in O(1) time, making it ideal for large matrices and multiple queries.
- Scalability: This method can handle matrices of size up to 1000×1000 and thousands of queries within the time limits of competitive programming.
Code Implementation:
def preprocess_prefix_sum(ARR, N, M):
prefix_sum = [[0 for _ in range(M)] for _ in range(N)]
for i in range(N):
for j in range(M):
prefix_sum[i][j] = ARR[i][j]
if i > 0:
prefix_sum[i][j] += prefix_sum[i-1][j]
if j > 0:
prefix_sum[i][j] += prefix_sum[i][j-1]
if i > 0 and j > 0:
prefix_sum[i][j] -= prefix_sum[i-1][j-1]
return prefix_sum
def query_sum(prefix_sum, tr, tc, br, bc):
total = prefix_sum[br][bc]
if tr > 0:
total -= prefix_sum[tr-1][bc]
if tc > 0:
total -= prefix_sum[br][tc-1]
if tr > 0 and tc > 0:
total += prefix_sum[tr-1][tc-1]
return total
Conclusion:
The 2D prefix sum array technique is a highly efficient solution to submatrix sum queries. By preprocessing the matrix, you can reduce the time complexity of each query to constant time, making it an ideal approach for coding interviews and competitive programming problems, such as those frequently asked in top tech companies like Google, Microsoft, and Amazon.