Detailed Record - Chonnam National University Library


Reinforcement Learning Algorithms with Python [electronic resource] : Learn, Understand, and Develop Smart Algorithms for Addressing AI Challenges

Details
Material type	e-Book
Title / Statement of responsibility	Reinforcement Learning Algorithms with Python [electronic resource]: Learn, Understand, and Develop Smart Algorithms for Addressing AI Challenges / Andrea Lonza.
Personal author	Lonza, Andrea.
Publication	Birmingham: Packt Publishing, Limited, 2019.
Physical description	1 online resource (356 pages).
Other format entry	Print version: Lonza, Andrea. Reinforcement Learning Algorithms with Python : Learn, Understand, and Develop Smart Algorithms for Addressing AI Challenges. Birmingham : Packt Publishing, Limited, ©2019 9781789131116
ISBN	1789139708 9781789139709
General note	Implementing REINFORCE with baseline
Contents	Cover; Title Page; Copyright and Credits; Dedication; About Packt; Contributors; Table of Contents; Preface; Section 1: Algorithms and Environments; Chapter 1: The Landscape of Reinforcement Learning; An introduction to RL; Comparing RL and supervised learning; History of RL; Deep RL; Elements of RL; Policy; The value function; Reward; Model; Applications of RL; Games; Robotics and Industry 4.0; Machine learning; Economics and finance; Healthcare; Intelligent transportation systems; Energy optimization and smart grid; Summary; Questions; Further reading; Chapter 2: Implementing RL Cycle and OpenAI Gym; Setting up the environment; Installing OpenAI Gym; Installing Roboschool; OpenAI Gym and RL cycles; Developing an RL cycle; Getting used to spaces; Development of ML models using TensorFlow; Tensor; Constant; Placeholder; Variable; Creating a graph; Simple linear regression example; Introducing TensorBoard; Types of RL environments; Why different environments?; Open source environments; Summary; Questions; Further reading; Chapter 3: Solving Problems with Dynamic Programming; MDP; Policy; Return; Value functions; Bellman equation; Categorizing RL algorithms; Model-free algorithms; Value-based algorithms; Policy gradient algorithms; Actor-Critic algorithms; Hybrid algorithms; Model-based RL; Algorithm diversity; Dynamic programming; Policy evaluation and policy improvement; Policy iteration; Policy iteration applied to FrozenLake; Value iteration; Value iteration applied to FrozenLake; Summary; Questions; Further reading; Section 2: Model-Free RL Algorithms; Chapter 4: Q-Learning and SARSA Applications; Learning without a model; User experience; Policy evaluation; The exploration problem; Why explore?; How to explore; TD learning; TD update; Policy improvement; Comparing Monte Carlo and TD; SARSA; The algorithm; Applying SARSA to Taxi-v2; Q-learning; Theory; The algorithm; Applying Q-learning to Taxi-v2; Comparing SARSA and Q-learning; Summary; Questions; Chapter 5: Deep Q-Network; Deep neural networks and Q-learning; Function approximation; Q-learning with neural networks; Deep Q-learning instabilities; DQN; The solution; Replay memory; The target network; The DQN algorithm; The loss function; Pseudocode; Model architecture; DQN applied to Pong; Atari games; Preprocessing; DQN implementation; DNNs; The experienced buffer; The computational graph and training loop; Results; DQN variations; Double DQN; DDQN implementation; Results; Dueling DQN; Dueling DQN implementation; Results; N-step DQN; Implementation; Results; Summary; Questions; Further reading; Chapter 6: Learning Stochastic and PG Optimization; Policy gradient methods; The gradient of the policy; Policy gradient theorem; Computing the gradient; The policy; On-policy PG; Understanding the REINFORCE algorithm; Implementing REINFORCE; Landing a spacecraft using REINFORCE; Analyzing the results; REINFORCE with baseline
Summary	With this book, you will understand the core concepts and techniques of reinforcement learning. You will take a look into each RL algorithm and will develop your own self-learning algorithms and models. You will optimize the algorithms for better precision, use high-speed actions and lower the risk of anomalies in your applications.
Subject headings	Computer algorithms. Python (Computer program language)
Classification (DDC)	005.1
Language	English

Holdings
No.	Accession no.	Call no.	Location	Compact storage no.	Status	Due date	Reservation	Service	Media info
1	E191971	EB 005.1	Main Library [Main Building]/E-Book/		Available


500-757, 77 Yongbong-ro, Buk-gu, Gwangju. TEL 062)530-3571~2 (Circulation Desk) FAX 062)530-3529

Copyright@2013 CHONNAM NATIONAL UNIVERSITY LIBRARY, All Rights Reserved.
