Examples in Markov decision processes

Author: A B Piunovskiy
Publisher: London : Imperial College Press, ©2013.
Series: Imperial College Press optimization series, 2.
Edition/Format: Book : English
Database: WorldCat
Summary:
This invaluable book provides approximately 100 examples illustrating the theory of controlled discrete-time Markov processes. Except for applications of the theory to real-life problems like stock exchange, queues, gambling, optimal search etc., the main attention is paid to counter-intuitive, unexpected properties of optimization problems. Such examples illustrate the importance of conditions imposed in the
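The dynamic programming approach described in Chapter 1.3 can be sketched with a small backward-induction example. The toy model below (states, actions, transition probabilities, and losses) is invented purely for illustration; the book's models are far more general, allowing Borel state and action spaces.

```python
# Minimal finite-horizon MDP solved by backward induction (dynamic programming).
# All model data here is hypothetical and chosen only to illustrate the recursion
# v_t(x) = min_a [ c(x, a) + E v_{t+1}(X') ].

states = [0, 1]
actions = ["a", "b"]
T = 3  # horizon

# transition[s][a] = list of (next_state, probability)
transition = {
    0: {"a": [(0, 0.5), (1, 0.5)], "b": [(1, 1.0)]},
    1: {"a": [(0, 1.0)], "b": [(1, 1.0)]},
}
# one-step loss c(s, a)
loss = {0: {"a": 1.0, "b": 2.0}, 1: {"a": 0.0, "b": 1.0}}

# v[t][s] = minimal expected total loss from time t onward, starting in state s
v = {T: {s: 0.0 for s in states}}
policy = {}
for t in range(T - 1, -1, -1):
    v[t], policy[t] = {}, {}
    for s in states:
        best = min(
            (loss[s][a] + sum(p * v[t + 1][s2] for s2, p in transition[s][a]), a)
            for a in actions
        )
        v[t][s], policy[t][s] = best

print(v[0], policy[0])
```

The resulting `policy` is a Markov selector: at each time step it picks one action per state, which is sufficient for optimality in this simple setting (several examples in Chapter 1.4 show when such selectors fail to suffice).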

Details

Document Type: Book
All Authors / Contributors: A B Piunovskiy
ISBN: 9781848167933 1848167938
OCLC Number: 809939351
Description: xiii, 293 p. : ill. ; 24 cm.
Contents: Preface --
1. Finite-Horizon Models --
1.1. Preliminaries --
1.2. Model Description --
1.3. Dynamic Programming Approach --
1.4. Examples --
1.4.1. Non-transitivity of the correlation --
1.4.2. The more frequently used control is not better --
1.4.3. Voting --
1.4.4. The secretary problem --
1.4.5. Constrained optimization --
1.4.6. Equivalent Markov selectors in non-atomic MDPs --
1.4.7. Strongly equivalent Markov selectors in non-atomic MDPs --
1.4.8. Stock exchange --
1.4.9. Markov or non-Markov strategy? Randomized or not? When is the Bellman principle violated? --
1.4.10. Uniformly optimal, but not optimal strategy --
1.4.11. Martingales and the Bellman principles --
1.4.12. Conventions on expectation and infinities --
1.4.13. Nowhere-differentiable function v_t(x); discontinuous function v_t(x) --
1.4.14. The non-measurable Bellman function --
1.4.15. No one strategy is uniformly ε-optimal --
1.4.16. Semi-continuous model --
2. Homogeneous Infinite-Horizon Models: Expected Total Loss --
2.1. Homogeneous Non-discounted Model --
2.2. Examples --
2.2.1. Mixed Strategies --
2.2.2. Multiple solutions to the optimality equation --
2.2.3. Finite model: multiple solutions to the optimality equation; conserving but not equalizing strategy --
2.2.4. The single conserving strategy is not equalizing and not optimal --
2.2.5. When strategy iteration is not successful --
2.2.6. When value iteration is not successful --
2.2.7. When value iteration is not successful: positive model I --
2.2.8. When value iteration is not successful: positive model II --
2.2.9. Value iteration and stability in optimal stopping problems --
2.2.10. A non-equalizing strategy is uniformly optimal --
2.2.11. A stationary uniformly ε-optimal selector does not exist (positive model) --
2.2.12. A stationary uniformly ε-optimal selector does not exist (negative model) --
2.2.13. Finite-action negative model where a stationary uniformly ε-optimal selector does not exist --
2.2.14. Nearly uniformly optimal selectors in negative models --
2.2.15. Semi-continuous models and the blackmailer's dilemma --
2.2.16. Not a semi-continuous model --
2.2.17. The Bellman function is non-measurable and no one strategy is uniformly ε-optimal --
2.2.18. A randomized strategy is better than any selector (finite action space) --
2.2.19. The fluid approximation does not work --
2.2.20. The fluid approximation: refined model --
2.2.21. Occupation measures: phantom solutions --
2.2.22. Occupation measures in transient models --
2.2.23. Occupation measures and duality --
2.2.24. Occupation measures: compactness --
2.2.25. The bold strategy in gambling is not optimal (house limit) --
2.2.26. The bold strategy in gambling is not optimal (inflation) --
2.2.27. Search strategy for a moving target --
2.2.28. The three-way duel ("Truel") --
3. Homogeneous Infinite-Horizon Models: Discounted Loss --
3.1. Preliminaries --
3.2. Examples --
3.2.1. Phantom solutions of the optimality equation --
3.2.2. When value iteration is not successful: positive model --
3.2.3. A non-optimal strategy π for which v^π_x solves the optimality equation --
3.2.4. The single conserving strategy is not equalizing and not optimal --
3.2.5. Value iteration and convergence of strategies --
3.2.6. Value iteration in countable models --
3.2.7. The Bellman function is non-measurable and no one strategy is uniformly ε-optimal --
3.2.8. No one selector is uniformly ε-optimal --
3.2.9. Myopic strategies --
3.2.10. Stable and unstable controllers for linear systems --
3.2.11. Incorrect optimal actions in the model with partial information --
3.2.12. Occupation measures and stationary strategies --
3.2.13. Constrained optimization and the Bellman principle --
3.2.14. Constrained optimization and Lagrange multipliers --
3.2.15. Constrained optimization: multiple solutions --
3.2.16. Weighted discounted loss and (N, ∞)-stationary selectors --
3.2.17. Non-constant discounting --
3.2.18. The nearly optimal strategy is not Blackwell optimal --
3.2.19. Blackwell optimal strategies and opportunity loss --
3.2.20. Blackwell optimal and n-discount optimal strategies --
3.2.21. No Blackwell (Maitra) optimal strategies --
3.2.22. Optimal strategies as β → 1- and MDPs with the average loss I --
3.2.23. Optimal strategies as β → 1- and MDPs with the average loss II --
4. Homogeneous Infinite-Horizon Models: Average Loss and Other Criteria --
4.1. Preliminaries --
4.2. Examples --
4.2.1. Why lim sup? --
4.2.2. AC-optimal non-canonical strategies --
4.2.3. Canonical triplets and canonical equations --
4.2.4. Multiple solutions to the canonical equations in finite models --
4.2.5. No AC-optimal strategies --
4.2.6. Canonical equations have no solutions: the finite action space --
4.2.7. No AC-ε-optimal stationary strategies in a finite state model --
4.2.8. No AC-optimal strategies in a finite-state semi-continuous model --
4.2.9. Semi-continuous models and the sufficiency of stationary selectors --
4.2.10. No AC-optimal stationary strategies in a unichain model with a finite action space --
4.2.11. No AC-ε-optimal stationary strategies in a finite action model --
4.2.12. No AC-ε-optimal Markov strategies --
4.2.13. Singular perturbation of an MDP --
4.2.14. Blackwell optimal strategies and AC-optimality --
4.2.15. Strategy iteration in a unichain model --
4.2.16. Unichain strategy iteration in a finite communicating model --
4.2.17. Strategy iteration in semi-continuous models --
4.2.18. When value iteration is not successful --
4.2.19. The finite-horizon approximation does not work --
4.2.20. The linear programming approach to finite models --
4.2.21. Linear programming for infinite models --
4.2.22. Linear programs and expected frequencies in finite models --
4.2.23. Constrained optimization --
4.2.24. AC-optimal, bias optimal, overtaking optimal and opportunity-cost optimal strategies: periodic model --
4.2.25. AC-optimal and average-overtaking optimal strategies --
4.2.26. Blackwell optimal, bias optimal, average-overtaking optimal and AC-optimal strategies --
4.2.27. Nearly optimal and average-overtaking optimal strategies --
4.2.28. Strong-overtaking/average optimal, overtaking optimal, AC-optimal strategies and minimal opportunity loss --
4.2.29. Strong-overtaking optimal and strong*-overtaking optimal strategies --
4.2.30. Parrondo's paradox --
4.2.31. An optimal service strategy in a queueing system --
Afterword --
Appendix A. Borel Spaces and Other Theoretical Issues --
A.1. Main Concepts --
A.2. Probability Measures on Borel Spaces --
A.3. Semi-continuous Functions and Measurable Selection --
A.4. Abelian (Tauberian) Theorem --
Appendix B. Proofs of Auxiliary Statements --
Notation --
List of the Main Statements --
Bibliography --
Index.
Series Title: Imperial College Press optimization series, 2.
Responsibility: A.B. Piunovskiy.
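Several of the listed examples (2.2.6-2.2.9, 3.2.2, 4.2.18) concern cases where value iteration fails. For contrast, the sketch below shows the standard value iteration recursion for a discounted infinite-horizon model in a well-behaved case; the two-state model and discount factor are invented for illustration and are not taken from the book.

```python
# Value iteration for a discounted infinite-horizon MDP (cf. Chapter 3).
# Hypothetical two-state model; with a discount factor beta < 1 the Bellman
# operator is a contraction, so the iteration converges geometrically.

beta = 0.9  # discount factor
states = [0, 1]
actions = ["stay", "move"]
transition = {
    0: {"stay": [(0, 1.0)], "move": [(1, 1.0)]},
    1: {"stay": [(1, 1.0)], "move": [(0, 1.0)]},
}
loss = {0: {"stay": 1.0, "move": 2.0}, 1: {"stay": 0.0, "move": 1.0}}

v = {s: 0.0 for s in states}
for _ in range(1000):
    # Apply the Bellman operator: v(s) <- min_a [ c(s,a) + beta * E v(S') ]
    v_new = {
        s: min(
            loss[s][a] + beta * sum(p * v[s2] for s2, p in transition[s][a])
            for a in actions
        )
        for s in states
    }
    if max(abs(v_new[s] - v[s]) for s in states) < 1e-10:
        v = v_new
        break
    v = v_new

print(v)
```

In this example the iterates converge to the Bellman function (here v(0) = 2, v(1) = 0); the book's counterexamples show that in positive models and other settings this convergence can fail or converge to a phantom solution of the optimality equation.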
