Discrete Time Dynamic Programming Using Tensor Trains

Date issued

2025

Abstract

Discrete-time dynamic programming has many applications in decision-making and econometrics. One seeks a so-called value function that obeys a functional equation known as the Bellman equation. The difficulty is that the value function may depend on very many variables, so a brute-force iteration of the Bellman equation is not feasible. Some authors address this problem with deep neural networks, which have their own drawbacks. In this paper, we propose to represent the (sampled) value function as a tensor train on a rectangular grid, and we propose two novel techniques for function interpolation. The decomposition has to be repeated in each Bellman iteration, and since the number of tensor elements is still astronomically large, we decompose the tensor with the TT-cross technique, which uses only a fraction of the tensor elements. In this way, it is possible to find approximate solutions in dimensions where traditional methods fail. We further propose a smoothing operation that may improve convergence, as well as a novel way of computing the approximation error and estimating when the iteration should be halted. The method's performance is demonstrated on the example of the linear-quadratic controller, for which the exact solution is known and serves as ground truth. Finally, the proposed technique is applied to the problem of active fault detection, and its performance is compared to that of a neural-network technique.
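The core loop described in the abstract — iterating the Bellman equation for a value function sampled on a rectangular grid — can be sketched in a few lines for the scalar linear-quadratic case mentioned there. This is a minimal illustration, not the paper's method: all parameter values (`a`, `b`, `r`, `gamma`, the grid ranges) are illustrative assumptions, and the plain NumPy array `V` stands in for the tensor train, which in high dimension would be stored in TT format and recompressed by TT-cross after every iteration.

```python
import numpy as np

# Scalar discrete-time LQR:  x_{t+1} = a x_t + b u_t,
# stage cost x^2 + r u^2, discount factor gamma.
# All parameter values are illustrative assumptions.
a, b, r, gamma = 0.9, 0.5, 1.0, 0.95

xs = np.linspace(-1.0, 1.0, 201)   # rectangular state grid
us = np.linspace(-2.0, 2.0, 201)   # candidate controls
V = np.zeros_like(xs)              # sampled value function (a TT in high dimension)

for _ in range(500):
    # One Bellman iteration: V(x) = min_u [x^2 + r u^2 + gamma V(a x + b u)]
    x_next = a * xs[:, None] + b * us[None, :]
    V_next = np.interp(x_next, xs, V)          # linear grid interpolation
    Q = xs[:, None] ** 2 + r * us[None, :] ** 2 + gamma * V_next
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-9:       # crude halting criterion
        V = V_new
        break
    V = V_new

# Ground truth for the scalar discounted LQR: V(x) = p x^2, where p is the
# fixed point of the discounted Riccati recursion.
p = 1.0
for _ in range(1000):
    p = 1.0 + gamma * p * a**2 - (gamma * p * a * b) ** 2 / (r + gamma * p * b**2)

print(V[150] / xs[150] ** 2, p)    # the two values should nearly agree
```

Comparing the grid solution against the Riccati fixed point mirrors the abstract's use of the linear-quadratic controller as a known ground truth; the small residual mismatch comes from the finite grid and the clamped interpolation at the grid boundary.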

Subject(s)

control design, Bellman equation, tensor train
