stable

Author	SHA1	Message	Date
robcaulk	8aac644009	add tests. add guardrails.	2022-09-15 00:46:35 +02:00
robcaulk	240b529533	fix tensorboard path so that users can track all historical models	2022-08-31 16:50:39 +02:00
robcaulk	7766350c15	refactor environment inheritance tree to accommodate flexible action types/counts. fix bug in train profit handling	2022-08-28 19:21:57 +02:00
robcaulk	3199eb453b	reduce code for base use-case, ensure multiproc inherits custom env, add ability to limit ram use.	2022-08-25 19:05:51 +02:00
robcaulk	05ccebf9a1	automate eval freq in multiproc	2022-08-25 12:29:48 +02:00
robcaulk	94cfc8e63f	fix multiproc callback, add continual learning to multiproc, fix totalprofit bug in env, set eval_freq automatically, improve default reward	2022-08-25 11:46:18 +02:00
robcaulk	bd870e2331	fix monitor bug, set default values in case user doesn't set params	2022-08-24 16:32:14 +02:00
robcaulk	b708134c1a	switch multiproc thread count to rl_config definition	2022-08-24 13:00:55 +02:00
robcaulk	b26ed7dea4	fix generic reward, add time duration to reward	2022-08-24 13:00:55 +02:00
robcaulk	29f0e01c4a	expose environment reward parameters to the user config	2022-08-24 13:00:55 +02:00
robcaulk	3eb897c2f8	reuse callback, allow user to access all stable_baselines3 agents via config	2022-08-24 13:00:55 +02:00