| Name | Modified | Size |
|---|---|---|
| README.md | 2019-02-28 | 3.0 kB |
| v0.6.0.tar.gz | 2019-02-28 | 6.9 MB |
| v0.6.0.zip | 2019-02-28 | 7.1 MB |
Important enhancements
- Implicit Quantile Network (IQN) (https://arxiv.org/abs/1806.06923) agent is added: `chainerrl.agents.IQN`.
- Training DQN and its variants with N-step returns is supported.
- Resetting the env with `done=False` via the `info` dict is supported. When `env.step` returns an `info` dict with `info['needs_reset']=True`, the env is reset. This feature is useful for implementing a continuing env (see the sketch after this list).
- Evaluation with a fixed number of timesteps is supported (except in async training). This evaluation protocol is popular in Atari benchmarks. `examples/atari/dqn` now implements the same evaluation protocol as the Nature DQN paper.
- An example script that trains a DoubleDQN agent on a PyBullet-based robotic grasping env is added: `examples/grasping`.
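
To make the `needs_reset` convention concrete, here is a minimal sketch (not taken from the release) of a gym-style continuing env that never sets `done=True` but asks the training loop to reset it through `info['needs_reset']`; the class name and the 1000-step limit are hypothetical.

```python
import gym
import numpy as np


class ContinuingEnv(gym.Env):
    """A toy continuing env: episodes never terminate on their own, but the
    env periodically asks ChainerRL's training loop to reset it by returning
    info['needs_reset'] = True from step()."""

    observation_space = gym.spaces.Box(
        low=-1.0, high=1.0, shape=(1,), dtype=np.float32)
    action_space = gym.spaces.Discrete(2)

    def reset(self):
        self.t = 0
        return np.zeros(1, dtype=np.float32)

    def step(self, action):
        self.t += 1
        obs = np.random.uniform(-1.0, 1.0, size=1).astype(np.float32)
        reward = float(action)
        done = False  # a continuing env never signals termination via done
        # Hypothetical limit: request a reset every 1000 steps.
        info = {'needs_reset': self.t >= 1000}
        return obs, reward, done, info
```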
Important bugfixes
- The bug that PPO's `obs_normalizer` was not saved is fixed.
- The bug that NonbiasWeightDecay didn't work with newer versions of Chainer is fixed.
- The bug that the `argv` argument was ignored by `chainerrl.experiments.prepare_output_dir` is fixed.
Important destructive changes
- `train_agent_with_evaluation` and `train_agent_batch_with_evaluation` now require `eval_n_steps` (number of timesteps for each evaluation phase) and `eval_n_episodes` (number of episodes for each evaluation phase) to be explicitly specified, with exactly one of them being `None` (see the sketch after this list).
- `train_agent_with_evaluation`'s `max_episode_len` argument is renamed to `train_max_episode_len`.
- `ReplayBuffer.sample` now returns a list of lists of N experiences to support N-step returns.
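
To illustrate the new requirement, below is a hedged sketch of a DQN training call loosely following the ChainerRL quickstart; the env choice, network sizes, and other hyperparameter values are illustrative assumptions, and only `eval_n_steps`, `eval_n_episodes`, and the renamed `train_max_episode_len` are the arguments affected by this release.

```python
import chainer
import gym
import numpy as np
import chainerrl
from chainerrl import experiments, explorers, replay_buffer

env = gym.make('CartPole-v0')
eval_env = gym.make('CartPole-v0')
obs_size = env.observation_space.shape[0]
n_actions = env.action_space.n

# Simple fully-connected Q-function and DQN agent (illustrative settings).
q_func = chainerrl.q_functions.FCStateQFunctionWithDiscreteAction(
    obs_size, n_actions, n_hidden_channels=64, n_hidden_layers=2)
opt = chainer.optimizers.Adam(eps=1e-2)
opt.setup(q_func)
rbuf = replay_buffer.ReplayBuffer(capacity=10 ** 5)
explorer = explorers.ConstantEpsilonGreedy(
    epsilon=0.1, random_action_func=env.action_space.sample)
agent = chainerrl.agents.DQN(
    q_func, opt, rbuf, gamma=0.99, explorer=explorer,
    replay_start_size=500,
    phi=lambda x: x.astype(np.float32, copy=False))

experiments.train_agent_with_evaluation(
    agent=agent,
    env=env,
    eval_env=eval_env,
    outdir='results',
    steps=10 ** 5,
    eval_interval=10 ** 4,
    # Exactly one of eval_n_steps / eval_n_episodes must be None.
    eval_n_steps=None,
    eval_n_episodes=10,
    # Renamed from max_episode_len in this release.
    train_max_episode_len=200,
)
```

To evaluate by a fixed number of timesteps instead of episodes, swap the pair, e.g. `eval_n_steps=125000, eval_n_episodes=None`.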
All updates
Enhancement
- Implicit quantile networks (IQN) (#288)
- Adds N-step learning for DQN-based agents. (#317)
- Replay warning (#321)
- Close envs in async training (#343)
- Allow envs to send a 'needs_reset' signal (#356)
- Changes variable names in train_agent_with_evaluation (#358)
- Use chainer.dataset.concat_examples in batch_states (#366)
- Implements Time-based evaluations (#367)
Documentation
- Add long description for pypi (#357, thanks @ljvmiranda921!)
- A small change to the installation documentation (#369)
- Adds a link to the ChainerRL visualizer from the main repository (#370)
- adds implicit quantile networks to readme (#393)
- Fix DQN.update's docstring (#394)
Examples
- Grasping example (#371)
- Adds Deepmind Scores to README in DQN Example (#383)
Testing
- Fix TestTrainAgentAsync (#363)
- Use AbnormalExitCodeWarning for nonzero exitcode warnings (#378)
- Avoid random test failures due to asynchronousness (#380)
- Drop hacking (#381)
- Avoid gym 0.11.0 in Travis (#396)
- Stabilize and speed up A3C tests (#401)
- Reduce ACER's test cases and maximum timesteps (#404)
- Add tests of IQN examples (#405)
Bugfixes
- Avoid UnicodeDecodeError in setup.py (#365)
- Save and load obs_normalizer of PPO (#377)
- Make NonbiasWeightDecay work again (#390)
- bug fix (#391, thanks @tappy27!)
- Fix episodic training of DDPG (#399)
- Fix PGT's training (#400)
- Fix ResidualDQN's training (#402)