This project involves two parallel implementations of LU decomposition that use Gaussian elimination to factor a dense N x N
matrix into an upper-triangular one and a lower-triangular one. In matrix computations, pivoting involves finding the largest magnitude value in a row, column, or both and then interchanging rows and/or columns in the matrix for the next step in the algorithm. The purpose of pivoting is to reduce round-off error, which enhances numerical stability. This project uses row pivoting, a form of pivoting involves interchanging rows of a trailing submatrix based on the largest value in the current column. To perform LU decomposition with row pivoting, one needs to compute a permutation matrix P such that PA = LU. The permutation matrix keeps track of row exchanges performed.
My primary objective was to reduce the execution time of the algorithm for large matrices like a 8000 X 8000
matrix leveraging parallel computing techniques like Open-MP and p-thread.
Below is pseudocode for a sequential implementation of LU decomposition with row pivoting.
inputs: a(n,n)
outputs: π(n), l(n,n), and u(n,n)
initialize π as a vector of length n
initialize u as an n x n matrix with 0s below the diagonal
initialize l as an n x n matrix with 1s on the diagonal and 0s above the diagonal
for i = 1 to n
π[i] = i
for k = 1 to n
max = 0
for i = k to n
if max < |a(i,k)|
max = |a(i,k)|
k' = i
if max == 0
error (singular matrix)
swap π[k] and π[k']
swap a(k,:) and a(k',:)
swap l(k,1:k-1) and l(k',1:k-1)
u(k,k) = a(k,k)
for i = k+1 to n
l(i,k) = a(i,k)/u(k,k)
u(k,i) = a(k,i)
for i = k+1 to n
for j = k+1 to n
a(i,j) = a(i,j) - l(i,k)*u(k,j)
Here, the vector π is a compact representation of a permutation matrix p(n,n),
which is very sparse. For the ith row of p, π(i) stores the column index of
the sole position that contains a 1.
File names | Description |
---|---|
matrix_generator.cpp | Contains source code to generate random 𝑛 × 𝑛 matrix. |
serial.cpp | Contains source code to get LU decomposition of input matrix serially. (i.e.,without multithreading) |
pthread.cpp | Contains source code to get LU decomposition of input matrix parallelly. (using pthreads) |
omp.cpp | Contains source code to get LU decomposition of input matrix parallelly. (using OpenMp) |
looper.sh | Bash file used to Save matrix generated by input.cpp to a text file. Run all the 3 implementations i.e., serial, pthread and openmp for different matrix sizes and threads. Saves the output of the above three programs into a text file so that they can be used for analysis. |
graph_plot.py | Python script used to Read data from the text file that contains all the output. Plot required graphs by making use of appropriate libraries. |
MakeFile | Contains code to invoke all the above mentioned programs and finally display required plots on screen. |
- Given control flow describes the flow of data.
- Plot is between time & matrix size where matrix size varies from 100 to 1000 and time from 0 to 4s.
- Graph is plotted between time & matrix but all three variants are considered i.e., Serial , pthread , OpenMP.
- This plot shows the comparative analysis of pthread & OpenMP.
- 3D plot is plotted between Serial, pthread, OpenMP with matrix size, number of threads & time as its dimensions