Notes:
Please see [BBC+14] for a journal version of this paper.

Abstract.
We study Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) functions. We consider two different objectives, namely, expectation and satisfaction objectives. Given an MDP with k reward functions, in the expectation objective the goal is to maximize the expected value, and in the satisfaction objective the goal is to maximize the probability of runs such that the limit-average value stays above a given vector. We show that under the expectation objective, in contrast to the single-objective case, both randomization and memory are necessary for strategies, and we show that finite-memory randomized strategies are sufficient. We show that under the satisfaction objective, in contrast to the single-objective case, randomization is necessary for strategies, and we show that randomized memoryless strategies are sufficient for epsilon-approximation, for all epsilon>0. We show that the decision problems for both expectation and satisfaction objectives can be solved in polynomial time and that the trade-off curve (Pareto curve) can be epsilon-approximated in time polynomial in the size of the MDP and 1/epsilon, and exponential in the number of reward functions, for all epsilon>0. Our results also reveal flaws in an earlier paper ("Markov Decision Processes with Multiple Long-Run Average Objectives", FSTTCS 2007) on MDPs with multiple mean-payoff functions under the expectation objective; we correct those flaws and obtain improved results.
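As a concrete illustration of the expectation objective (this sketch is not from the paper: the 2-state chain, transition probabilities, and reward values below are made up), once a memoryless randomized strategy is fixed, the MDP collapses to a Markov chain, and the expected limit-average value of each reward function is simply its average under the chain's stationary distribution:

```python
# Hypothetical 2-state Markov chain induced by fixing a memoryless
# randomized strategy in an MDP (all numbers are illustrative).
P = [[0.5, 0.5],   # row-stochastic transition matrix
     [0.2, 0.8]]

rewards = [          # k = 2 reward functions, one value per state
    [1.0, 0.0],      # reward function 1
    [0.0, 1.0],      # reward function 2
]

def stationary(P, iters=10_000):
    """Approximate the stationary distribution by power iteration."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(iters):
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dist

pi = stationary(P)
# Expected mean-payoff vector: one limit-average value per reward function.
mean_payoff = [sum(pi[s] * r[s] for s in range(len(pi))) for r in rewards]
print(mean_payoff)  # approximately [2/7, 5/7]
```

For this chain the stationary distribution is (2/7, 5/7), so the mean-payoff vector is approximately (0.286, 0.714); varying the (randomized) strategy traces out the different achievable vectors whose undominated frontier is the Pareto curve discussed in the abstract.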
