Computer Science Graduate Seminar: Transformation of Linearized Computational Graphs Driven by the Associativity of the Chain Rule

 

Monday, October 16, 2017, 1:00pm

Location: Room 231, Seffenter Weg 23

Speaker: Dipl.-Math. Viktor Mosenkis

Abstract:

The efficient computation of derivatives of mathematical functions which are implemented as computer programs has become more important in various branches of industry and academia. While derivative information is highly desired, its efficient and reliable computation remains a big challenge. Algorithmic (or Automatic) Differentiation (AD) becomes the method of choice for computing derivatives as it promises to address both problems. AD is a collection of techniques allowing computation of accurate derivative information of a function F implemented as a computer program in some high level programming language. Often the algorithms for AD are expressed as elimination methods on the linearized directed acyclic graphs or l-DAGs. L-DAGs are computational graphs, where the edge weights are equal to the local partial derivatives of the respective operations. The most popular elimination methods are edge and vertex elimination. Associativity of the chain rule allows to perform the edge and vertex eliminations in different orders. The problem of finding an order of vertex [edge] eliminations that minimizes the number of multiplications involved in the computation of the Jacobian is known as vertex [edge] elimination problem (VE [EE]). Another closely related problem is the minimum edge count problem (MEC). This problem arises when Jacobian-vector products are needed and computation of full Jacobian is not required. The MEC problem aims to find a sequence of vertex (edge) eliminations such that the resulting l-DAG has the smallest amount of edges.

VE, EE and MEC problems are conjectured to be NP-hard. Therefore in practice the solution of these problems is approximated by greedy like heuristics or computed by branch and bound based methods if the exact solution is of interest. The main aim of this thesis is to support further development of algorithms for solving these problems. Therefore in this thesis we prove several properties of l-DAGs that help to reduce the search space and establish the base for other outcomes of the thesis. We also develop a new approach for computing lower bounds and prove various optimality preserving eliminations for all three problems. The search space of the EE problem is much bigger than that of VE problem. Thus edge elimination is barely used in any application. We show that mixing of edge and vertex eliminations in algorithms can be used to improve the results of approximative algorithms and to reduce the corresponding search space if the exact solution is required, demonstrating that edge eliminations should not be neglected in practice.

The Computer Science lecturers invite interested people to join.