robcaulk
|
94cfc8e63f
|
fix multiproc callback, add continual learning to multiproc, fix totalprofit bug in env, set eval_freq automatically, improve default reward
|
2022-08-25 11:46:18 +02:00 |
|
robcaulk
|
bd870e2331
|
fix monitor bug, set default values in case user doesn't set params
|
2022-08-24 16:32:14 +02:00 |
|
robcaulk
|
b708134c1a
|
switch multiproc thread count to rl_config definition
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
b26ed7dea4
|
fix generic reward, add time duration to reward
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
29f0e01c4a
|
expose environment reward parameters to the user config
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
3eb897c2f8
|
reuse callback, allow user to access all stable_baselines3 agents via config
|
2022-08-24 13:00:55 +02:00 |
|