Dynamic Programming—Chained Matrix Multiplication

 

 

Multiplying unequal matrices

•      Suppose we want to multiply two matrices that do not have the same number of rows and columns

•      We can multiply two matrices A1 and A2 only if the number of columns of A1 is equal to the number of rows of A2

•      Example:  We want to multiply a 2 X 3 matrix by a 3 X 4 matrix

–    The resultant matrix is a 2 X 4 matrix

•    This will have 4 terms in the top row and 4 in the bottom

•    Each term is the result of 3 multiplications

•    So the total number of multiplications is 2*3*4

•      Generalizing: if we want to multiply an N X M matrix by an M X P matrix it will take N*M*P multiplications
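The N*M*P count can be checked by counting the scalar multiplications in the ordinary triple-loop product. This is a minimal sketch; the function name `multiply_counting` and the sample matrices are illustrative and not part of the original notes.

```python
def multiply_counting(a, b):
    """Multiply a (N x M) by b (M x P), both given as lists of rows.

    Returns the product and the number of scalar multiplications performed.
    """
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("columns of a must equal rows of b")
    product = [[0] * p for _ in range(n)]
    count = 0
    for i in range(n):          # each row of a
        for j in range(p):      # each column of b
            for k in range(m):  # M multiplications per term
                product[i][j] += a[i][k] * b[k][j]
                count += 1
    return product, count


a = [[1, 2, 3], [4, 5, 6]]                        # 2 x 3
b = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]    # 3 x 4
product, count = multiply_counting(a, b)
print(count)  # 24, which is 2*3*4
```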

 

Chained Matrix Multiplication

•      We are given a sequence (chain) A1, A2, …,An of n matrices, and we wish to find the product

•      The way we parenthesize a chain of matrices can have a dramatic impact on the cost of evaluating the product.

•      This problem is to determine the best way to parenthesize the matrices to minimize the number of multiplications

Example:

•       A1  5 X 3

•       A2  3 X 4

•       A3  4 X 6

•      A4  6 X 5

•      The problem: what is the best order to multiply them?

•      If we multiply

–    (A1( (A2A3) A4) )    takes 237 multiplications

–    (A1  (A2  (A3A4) ))   takes 255 multiplications

–    ( (A1A2) (A3A4) )    takes 280 multiplications

–    ( ( (A1A2)  A3)  A4) takes 330 multiplications

–    (  (A1(A2A3)  )  A4)  takes 312 multiplications
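The five totals above can be reproduced by evaluating each parenthesization recursively and charging rows*inner*cols for every product. This is a sketch under an assumed representation: a parenthesization is written as nested pairs of matrix numbers, and `cost` and `dims` are hypothetical names chosen for this example.

```python
def cost(tree, dims):
    """Return (rows, cols, multiplications) for a parenthesization.

    tree is either a matrix number (int) or a pair (left, right).
    dims maps a matrix number to its (rows, cols).
    """
    if isinstance(tree, int):
        rows, cols = dims[tree]
        return rows, cols, 0
    left, right = tree
    r1, c1, m1 = cost(left, dims)
    r2, c2, m2 = cost(right, dims)
    if c1 != r2:
        raise ValueError("inner dimensions must agree")
    return r1, c2, m1 + m2 + r1 * c1 * c2


dims = {1: (5, 3), 2: (3, 4), 3: (4, 6), 4: (6, 5)}
orders = [
    (1, ((2, 3), 4)),     # (A1((A2A3)A4))
    (1, (2, (3, 4))),     # (A1(A2(A3A4)))
    ((1, 2), (3, 4)),     # ((A1A2)(A3A4))
    (((1, 2), 3), 4),     # (((A1A2)A3)A4)
    ((1, (2, 3)), 4),     # ((A1(A2A3))A4)
]
totals = [cost(t, dims)[2] for t in orders]
print(totals)  # [237, 255, 280, 330, 312]
```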

 

How to parenthesize the matrices

•      In the case of four matrices, there are only five ways to order the multiplications

•      But with n matrices, the number of ways to parenthesize them grows exponentially, on the order of $4^n / n^{3/2}$, so we do not want to look at all the possibilities

•      Dividing  the problem into subproblems

–    We use the principle of optimality, which is said to apply if an optimal solution to an instance of a problem always contains optimal solutions to all subinstances.

–    If A1((((A2A3)A4)A5)A6) is the optimal order, then we know that (A2A3)A4 is the optimal order for A2A3A4
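To see why looking at every possibility is impractical, the number of parenthesizations can be counted directly: the last multiplication splits a chain of n matrices into a left part of k matrices and a right part of n-k. The sketch below assumes that counting recurrence; `count_orders` is a name made up for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_orders(n):
    """Number of ways to fully parenthesize a chain of n matrices."""
    if n == 1:
        return 1
    # The outermost product splits the chain after the k-th matrix.
    return sum(count_orders(k) * count_orders(n - k) for k in range(1, n))


for n in (4, 6, 10, 20):
    print(n, count_orders(n))
```

For four matrices this gives the five orders listed earlier; by twenty matrices the count is already above a billion.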

 

 

The matrix-chain problem

•      We need to find the best way to parenthesize the chain of matrices to minimize the number of scalar multiplications

•      We do this by dividing the problem into subproblems, then finding the optimal solution to each subproblem.

•       Suppose we have matrix-chain A1 .. An

•       We divide this into subproblems A1..Ak and Ak+1 .. An

•      The problem is that we do not know what the k should be

•      We find k by looking at the optimal solutions of each of the subproblems.

–    This means looking at all the values for k
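The idea of trying every k and keeping the best can be written as a direct recursion. The sketch below assumes the dimensions are stored in a sequence `d` so that Ai is d[i-1] X d[i] (the convention these notes adopt for the 1-D array), and it caches results so each subchain is solved once. The names `min_multiplications` and `best` are illustrative.

```python
from functools import lru_cache


def min_multiplications(d):
    """Minimum scalar multiplications for A1..An, where Ai is d[i-1] x d[i]."""
    n = len(d) - 1

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == j:               # a single matrix needs no multiplication
            return 0
        # Try every split A_i..A_k | A_k+1..A_j and keep the cheapest.
        return min(best(i, k) + best(k + 1, j) + d[i - 1] * d[k] * d[j]
                   for k in range(i, j))

    return best(1, n)


print(min_multiplications([5, 3, 4, 6, 5]))  # 237
```

The result agrees with the cheapest of the five orders in the four-matrix example (5 X 3, 3 X 4, 4 X 6, 6 X 5).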

 

The chain A1..A4

•      Look at the possible chains of length two

–    How many values of k are there for chains of length two?

 

•      Then the possible chains of length three

–    How many values of k?

•      Then chains of length four;  etc.

•      We need some kind of organization here to help us find the optimal way to make chains of length three out of chains of length two

•      We need some symbolism and tables to keep track of calculations and minimum number of scalar multiplications

 

How we keep track

•      We keep the dimensions of the matrices in a 1-D array.

index:      0    1    2    3    4
dimension:  5    3    4    6    2

•       A1  5 X 3

•       A2  3 X 4

•       A3  4 X 6

•      A4  6 X 2

•      Ai has dimensions d[i-1] X d[i], where d is the 1-D array of dimensions

•      So our array starts at zero and goes to four

•      We keep the number of multiplications for chains of length one, two, three and four in a 2-D array

 

The 2-D array  called N

•      N[1][1] will hold the number of multiplications to multiply from A1 to A1

–    This of course is zero, since it is a chain of length one

–    N[i][i] where i = 1 to n is always zero

•      N[1][2] holds the minimum number of multiplications to multiply A1 to A2

–    This is a chain of length two

–    Add N[2][3] and N[3][4] to the table; these hold the other chains of length two

•        N[1][3], also written N1,3, holds the minimum number of multiplications for one chain of length three

–    This could be either (A1A2)A3 or A1(A2A3)

–    So we have two possibilities, and we must decide on the minimum

•      We use the term k to show where we decide to put the parentheses

–    In the terminology we use, we must decide on k, which is how we divide the matrix chain

•      $N_{i,j} = \min_{i \le k < j} \{ N_{i,k} + N_{k+1,j} + d_{i-1} d_k d_j \}$

 

Explaining the formula

•      $N_{i,j} = \min_{i \le k < j} \{ N_{i,k} + N_{k+1,j} + d_{i-1} d_k d_j \}$

•      Our chains are symbolized by Ni,j

•      In our problem we have two chains of length three, denoted N1,3 and N2,4

•      Looking first at N1,3

–    There are two possibilities (A1A2)A3  or A1(A2A3)

•    We already have in our 2-D table how many multiplications it takes to multiply (A1A2)   and  (A2A3)

–    So we have two possible values of k, and we must check them both, and take the minimum

–    i = 1 and j = 3   and k can equal either 1 or 2

•      We plug them into the formula and take the minimum.

 

Basically, what the formula is doing is taking the number of multiplications for a chain of length two from the table, and adding to it the number of multiplications needed to link in another matrix
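As a worked instance, the two candidates for N1,3 can be evaluated for the four-matrix chain with dimensions 5, 3, 4, 6, 2. This is only a sketch: the dictionary `N` keyed by (i, j) and the name `candidates` are choices made for this example.

```python
d = [5, 3, 4, 6, 2]   # A1 is 5 x 3, A2 is 3 x 4, A3 is 4 x 6, A4 is 6 x 2

# Chains of length one cost nothing; chains of length two have a single split.
N = {(i, i): 0 for i in range(1, 5)}
for i in range(1, 4):
    N[i, i + 1] = d[i - 1] * d[i] * d[i + 1]

# N1,3: evaluate the formula for k = 1 and k = 2.
i, j = 1, 3
candidates = {k: N[i, k] + N[k + 1, j] + d[i - 1] * d[k] * d[j]
              for k in range(i, j)}
best_k = min(candidates, key=candidates.get)
print(candidates)                   # {1: 162, 2: 180}
print(best_k, candidates[best_k])   # 1 162
```

With k = 1 the cost is 0 + 72 + 5*3*6 = 162, and with k = 2 it is 60 + 0 + 5*4*6 = 180, so A1(A2A3) is the cheaper way to form this chain.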

 

Another Example

•      A1 is the first matrix in our chain, An the last

•   A1 30 X 35

•   A2 35 X 15

•   A3 15 X 5

•   A4 5 X 10

•   A5 10 X 20

•   A6 20 X 25

•      We keep the dimensions in an array.

•      Ai has dimensions d[i-1] X d[i], where d is the array of dimensions

•      So our array index starts at zero and goes to six

 

The variables and data structures

•       The notation for the optimal number of scalar multiplications for a subproblem is Ni,j

–   i is the subscript of the first matrix in the subproblem;

–   j is the subscript of the last

–   k is the subscript for the way to break Ni,j into subproblems

•   i <= k < j

•     Breaking  Ni,j  into subproblems would be Ni,k  and Nk+1, j

–    We are looking for the k that gives Ni,j (the optimal value for this subproblem).

–    It is given by the formula:

•      $N_{i,j} = \min_{i \le k < j} \{ N_{i,k} + N_{k+1,j} + d_{i-1} d_k d_j \}$

 

The 2-D array for multiplications

 

Rows are i, columns are j; entries below the diagonal are unused.

          1        2        3        4        5        6
1         0   15,750    7,875    9,375   11,875   15,125
2         -        0    2,625    4,375    7,125   10,500
3         -        -        0      750    2,500    5,375
4         -        -        -        0    1,000    3,500
5         -        -        -        -        0    5,000
6         -        -        -        -        -        0
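The table for the six-matrix example can be regenerated by filling it one chain length at a time using the formula for Ni,j. The following is a minimal sketch, assuming a list-of-lists table with row and column 0 left unused; `fill_table` is a name introduced here.

```python
def fill_table(d):
    """Return N, where N[i][j] is the minimum cost of multiplying Ai..Aj."""
    n = len(d) - 1
    N = [[0] * (n + 1) for _ in range(n + 1)]   # row and column 0 unused
    for length in range(2, n + 1):              # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            N[i][j] = min(N[i][k] + N[k + 1][j] + d[i - 1] * d[k] * d[j]
                          for k in range(i, j))
    return N


d = [30, 35, 15, 5, 10, 20, 25]
N = fill_table(d)
for i in range(1, 7):
    # Print only the entries on or above the diagonal.
    print(" ".join(f"{N[i][j]:>7,}" if j >= i else " " * 7
                   for j in range(1, 7)))
```

The entry N[1][6] is the answer for the whole chain.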

 

The 2-D array to keep track of k

 

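Rows are i, columns are j; each entry is the k that gave the minimum for Ni,j.

      1    2    3    4    5    6
1     -    1    1    3    3    3
2     -    -    2    3    3    3
3     -    -    -    3    3    3
4     -    -    -    -    4    5
5     -    -    -    -    -    5
6     -    -    -    -    -    -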

 

Pseudocode for chained matrices

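// d[0..n] holds the dimensions: matrix Ai is d[i-1] x d[i].
// M is the table called N in these notes; P[i][j] receives the k that gave the min.
// P must have at least n+1 rows and n+1 columns.
static int minMult(int n, int[] d, int[][] P)
{   int[][] M = new int[n + 1][n + 1];        // row and column 0 are unused
    for (int i = 1; i <= n; i++) M[i][i] = 0;  // initialize
    for (int dia = 1; dia <= n - 1; dia++)     // dia = j - i
        for (int i = 1; i <= n - dia; i++)
        {   int j = i + dia;
            M[i][j] = Integer.MAX_VALUE;
            for (int k = i; k <= j - 1; k++)   // minimum over i <= k <= j-1
            {   int cost = M[i][k] + M[k+1][j] + d[i-1] * d[k] * d[j];
                if (cost < M[i][j])
                {   M[i][j] = cost;
                    P[i][j] = k;               // the value of k that gave the min
                }
            }
        }
    return M[1][n];
}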

      

Printing the optimal order

•      In addition to finding the number of multiplications, we need to keep track of where to put the parentheses

•      This is done in a separate 2-D array

•      The value of k is recorded for Ni,j when the minimum is found
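Once the k values are stored, the optimal order can be written out recursively: the chain Ai..Aj is the product of Ai..Ak and Ak+1..Aj, where k is the recorded split. The sketch below recomputes both tables so that it is self-contained; `matrix_chain` and `order` are hypothetical names, and ties between equally cheap splits are broken toward the smaller k.

```python
def matrix_chain(d):
    """Return (N, P): minimum costs and the split k recorded for each chain."""
    n = len(d) - 1
    N = [[0] * (n + 1) for _ in range(n + 1)]
    P = [[0] * (n + 1) for _ in range(n + 1)]
    for dia in range(1, n):                    # dia = j - i
        for i in range(1, n - dia + 1):
            j = i + dia
            # min over (cost, k) pairs keeps the k that gave the minimum
            N[i][j], P[i][j] = min(
                (N[i][k] + N[k + 1][j] + d[i - 1] * d[k] * d[j], k)
                for k in range(i, j))
    return N, P


def order(P, i, j):
    """Return the optimal parenthesization of Ai..Aj as a string."""
    if i == j:
        return f"A{i}"
    k = P[i][j]
    return "(" + order(P, i, k) + order(P, k + 1, j) + ")"


N, P = matrix_chain([30, 35, 15, 5, 10, 20, 25])
print(N[1][6], order(P, 1, 6))  # 15125 ((A1(A2A3))((A4A5)A6))
```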

 

Analyzing the Matrix-Chain Algorithm

•      We can compute N1,n with an algorithm that consists primarily of three nested for-loops

•      The outer loop is executed n-1 times

•      The inner loop is executed at most n-1 times, usually somewhat fewer

–    A loop inside the inner loop, for k, is executed fewer than n times

•      But the bottom line is that there are three nested loops

•      So the total running time is $O(n^3)$

•      This can be improved on

–    This algorithm was developed in 1973 by Godbole

–    In 1982 Yao developed an $O(n^2)$ algorithm

–    In 1984 Hu and Shing described an O(n log n) algorithm
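The cubic bound for the three nested loops can be checked empirically by counting how often the innermost statement runs. This sketch mirrors the loop structure of the pseudocode without doing any arithmetic on matrices; the closed form (n^3 - n)/6 used for comparison comes from summing (n - dia) * dia over dia = 1..n-1, and `innermost_iterations` is a name invented for this example.

```python
def innermost_iterations(n):
    """Count executions of the innermost loop body for a chain of n matrices."""
    count = 0
    for dia in range(1, n):                 # outer loop: n-1 passes
        for i in range(1, n - dia + 1):     # inner loop: n-dia passes
            j = i + dia
            for k in range(i, j):           # k loop: dia passes
                count += 1
    return count


for n in (6, 12, 24):
    print(n, innermost_iterations(n), (n**3 - n) // 6)
```

Doubling n multiplies the count by roughly eight, as expected for cubic growth.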