Metoda Monte Carlo w uczeniu ze wzmocnieniem

Gromna, Martyna

Metoda Monte Carlo w uczeniu ze wzmocnieniem

master

Alternative title

Monte Carlo method in Reinforcement Learning

Author

Gromna Martyna

Reviewer

Kosiński Łukasz

Zapałowski Paweł

Advisor

Kosiński Łukasz

Date of defence

2020-09-30

Keywords in Polish

Programowanie dynamiczne, metody Monte Carlo, algorytm iteracji strategii, algorytm iteracji wartości, algorytm Q-learning, uczenie się ze wzmocnieniem, procesy decyzyjne Markowa.

Keywords in English

Dynamic programming, Monte Carlo methods, strategy iteration algorithm, value iteration algorithm, Q-learning algorithm, Reinforcement Learning, Markov decision processes.

Language

Polish

Abstract in Polish

Celem pracy jest rozwiązanie problemu decyzyjnego Markowa (MDP). Gdy znamy dynamikę MDP, korzystamy z programowania dynamicznego: algorytmu iteracji strategii oraz algorytmu iteracji wartości. Gdy natomiast nie znamy dynamiki MDP, stosujemy metody Monte Carlo. Część pracy poświęcona jest także algorytmowi Q-learning. Jego działanie prezentujemy, rozwiązując problem taxi-v2 w języku Python.

Abstract in English

The aim of this thesis is to solve a Markov decision problem (MDP). When the dynamics of the MDP are known, we use dynamic programming, namely the strategy (policy) iteration algorithm and the value iteration algorithm. When the dynamics are unknown, we use Monte Carlo methods. Part of the thesis is also devoted to the Q-learning algorithm, whose operation we present by solving the taxi-v2 problem in Python.
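The thesis file is not available from this record, so the following is only a minimal sketch of tabular Q-learning, not the author's implementation. To stay self-contained it replaces taxi-v2 with a hypothetical six-cell corridor. The names `step`, `q_learning` and `greedy_policy`, the reward values, and the hyperparameters are all assumptions made for illustration.

```python
import random

# Hypothetical toy environment (NOT taxi-v2): a corridor of N_STATES cells.
# The agent starts in cell 0; the episode ends when it reaches the last cell.
N_STATES = 6
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 10.0 if done else -1.0  # step cost, bonus on reaching the goal
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # The terminal state has no future value, so drop the bootstrap term.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

The update uses only sampled transitions `(state, action, reward, next_state)`, which is why the method applies when the MDP dynamics are unknown. In this toy corridor the learned greedy policy steps right in every cell.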

dc.abstract.en	The aim of this thesis is to solve a Markov decision problem (MDP). When the dynamics of the MDP are known, we use dynamic programming, namely the strategy (policy) iteration algorithm and the value iteration algorithm. When the dynamics are unknown, we use Monte Carlo methods. Part of the thesis is also devoted to the Q-learning algorithm, whose operation we present by solving the taxi-v2 problem in Python.	pl
dc.abstract.pl	Celem pracy jest rozwiązanie problemu decyzyjnego Markowa (MDP). Gdy znamy dynamikę MDP, korzystamy z programowania dynamicznego: algorytmu iteracji strategii oraz algorytmu iteracji wartości. Gdy natomiast nie znamy dynamiki MDP, stosujemy metody Monte Carlo. Część pracy poświęcona jest także algorytmowi Q-learning. Jego działanie prezentujemy, rozwiązując problem taxi-v2 w języku Python.	pl
dc.affiliation	Wydział Matematyki i Informatyki	pl
dc.area	obszar nauk ścisłych	pl
dc.contributor.advisor	Kosiński, Łukasz - 136119	pl
dc.contributor.author	Gromna, Martyna	pl
dc.contributor.departmentbycode	UJK/WMI2	pl
dc.contributor.reviewer	Kosiński, Łukasz - 136119	pl
dc.contributor.reviewer	Zapałowski, Paweł - 132860	pl
dc.date.accessioned	2020-10-21T19:37:09Z
dc.date.available	2020-10-21T19:37:09Z
dc.date.submitted	2020-09-30	pl
dc.fieldofstudy	matematyka finansowa	pl
dc.identifier.apd	diploma-145609-249759	pl
dc.identifier.project	APD / O	pl
dc.identifier.uri	https://ruj.uj.edu.pl/xmlui/handle/item/250527
dc.language	pol	pl
dc.subject.en	Dynamic programming, Monte Carlo methods, strategy iteration algorithm, value iteration algorithm, Q-learning algorithm, Reinforcement Learning, Markov decision processes.	pl
dc.subject.pl	Programowanie dynamiczne, metody Monte Carlo, algorytm iteracji strategii, algorytm iteracji wartości, algorytm Q-learning, uczenie się ze wzmocnieniem, procesy decyzyjne Markowa.	pl
dc.title	Metoda Monte Carlo w uczeniu ze wzmocnieniem	pl
dc.title.alternative	Monte Carlo method in Reinforcement Learning	pl
dc.type	master	pl
dspace.entity.type	Publication

No access

Collections

Masters theses

ROD UJ