Decision making is hard. It often requires reasoning about uncertain environments, partial observability and action spaces that are too large to enumerate. In such complex decision-making tasks decision-theoretic agents, that can reason about their environments on the basis of mathematical models and produce *policies* that optimize the utility for their users, can often assist us.

In most research on decision-theoretic agents, the desirability of actions and their effects is codified in a scalar reward function. However, many real-world decision problems have multiple objectives. In such cases the problem is more naturally expressed using a vector-valued reward function. Rather than having a single optimal policy, we then want to produce a set of policies that covers all possible preferences between the objectives. We call such a set a *coverage set*.

In this dissertation, we focus on decision-theoretic planning algorithms that produce the *convex coverage set (CCS)*, which is the optimal solution set when either: 1) the user utility can be expressed as a weighted sum over the values for each objective; or 2) policies can be stochastic.

We propose new methods based on two popular approaches to creating planning algorithms that produce an (approximate) CCS by building on an existing single-objective algorithm. In the *inner loop* approach, we replace the summations and maximizations in the inner most loops of the single-objective algorithm by cross-sums and pruning operations. In the *outer loop* approach, we solve a multi-objective problem as a series of scalarized problems by employing the single-objective method as a subroutine.

Our most important contribution is an outer loop framework that we call *optimistic linear support (OLS)*. As an outer loop method OLS builds the CCS incrementally. We show that, contrary to existing outer loop methods, each intermediate result is a bounded approximation of the CCS with known bounds (even when the single-objective method used is a bounded approximate method as well) and is guaranteed to terminate in a finite number of iterations.

We apply OLS-based algorithms to a variety of multi-objective decision problems, and show that it is more memory-efficient, and faster than corresponding inner loop algorithms for moderate numbers of objectives. We show that exchanging subroutines in OLS is relatively easy and illustrate the importance on a complex planning problem. Finally, we show that it is often possible to reuse parts of the policies and values, found in earlier iterations of OLS, to hot-start later iterations of OLS. Using this last insight, we propose the first method for multi-objective POMDPs that employs point-based planning and can produce an ε-CCS in reasonable time.

Overall, the methods we propose bring us closer to truly practical multi-objective decision-theoretic planning.