A matrix $X$ is called a linear matrix if its entries are affine forms, i.e., degree-one polynomials in $n$ variables. What is a minimal-sized representation of a given matrix $F$ as a product of linear matrices? Finding such a minimal representation is closely related to finding an optimal way to compute a given polynomial via an algebraic branching program. Here we devise an efficient algorithm for an average-case version of this problem. Specifically, given $w, d, n \in \N$ and blackbox access to the $w^2$ entries of a matrix product $F = X_1 \cdots X_d$, where each $X_i$ is a $w \times w$ linear matrix over a given finite field $\F_q$, we wish to recover a factorization $F = Y_1 \cdots Y_d$, where every $Y_i$ is also a linear matrix over $\F_q$ (or a small extension of $\F_q$). We show that when the input $F$ is sampled from a distribution defined by choosing random linear matrices $X_1, \ldots, X_d$ over $\F_q$ independently and taking their product, and when $n \geq 4w^2$ and $\mathrm{char}(\F_q) = (ndw)^{\Omega(1)}$, then an equivalent factorization $F = Y_1 \cdots Y_d$ can be recovered in (randomized) time $(wdn \log q)^{O(1)}$. In fact, we give a (worst-case) polynomial-time randomized algorithm to factor any non-degenerate or pure matrix product (a notion we define in the paper) into linear matrices; a matrix product $F = X_1 \cdots X_d$ is pure with high probability when the $X_i$'s are chosen independently at random. We also show that in this situation, if we are instead given a single entry of $F$ rather than its $w^2$ correlated entries, then the recovery can be done in (randomized) time $(d^{w^3} n \log q)^{O(1)}$.
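To make the problem setup concrete, the following sketch samples an instance from the average-case distribution described above and exposes blackbox evaluation access to the product $F = X_1 \cdots X_d$. It is an illustrative assumption-laden toy, not part of the paper's algorithm: the field is taken to be a prime field $\F_p$ with $p = 101$ for simplicity, the parameters $w, d, n$ are small arbitrary choices, and the helper names (`random_linear_matrix`, `eval_matrix`, `F`) are hypothetical.

```python
import random

p = 101          # a prime; F_p stands in for F_q (assumption: q prime, for simplicity)
w, d, n = 2, 3, 16  # width w, number of factors d, number of variables n (n >= 4w^2)

def random_linear_matrix(w, n, p):
    # A w x w linear matrix: each entry is an affine form
    # c_0 + c_1*x_1 + ... + c_n*x_n over F_p, stored as a list of n+1 coefficients.
    return [[[random.randrange(p) for _ in range(n + 1)] for _ in range(w)]
            for _ in range(w)]

def eval_matrix(X, point, p):
    # Evaluate every affine-form entry of X at a point in F_p^n,
    # producing an ordinary w x w matrix over F_p.
    return [[(entry[0] + sum(c * a for c, a in zip(entry[1:], point))) % p
             for entry in row] for row in X]

def matmul(A, B, p):
    # Ordinary matrix multiplication over F_p.
    w = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(w)) % p for j in range(w)]
            for i in range(w)]

# Sample X_1, ..., X_d independently at random, as in the average-case model.
Xs = [random_linear_matrix(w, n, p) for _ in range(d)]

def F(point):
    # Blackbox access to F = X_1 * ... * X_d: given a point in F_p^n,
    # return the w x w matrix of values of F's entries at that point.
    M = eval_matrix(Xs[0], point, p)
    for X in Xs[1:]:
        M = matmul(M, eval_matrix(X, point, p), p)
    return M
```

The recovery task is then: using only calls to `F` (all $w^2$ entries at once, or in the harder variant a single entry), reconstruct linear matrices $Y_1, \ldots, Y_d$ whose product equals $F$.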