robcaulk
|
936ca24482
|
separate RL install from general FAI install, update docs
|
2022-10-05 15:58:54 +02:00 |
|
robcaulk
|
83343dc2f1
|
control number of threads, update doc
|
2022-09-29 00:10:18 +02:00 |
|
Timothy Pogue
|
099137adac
|
remove hasattr calls
|
2022-09-27 22:35:15 -06:00 |
|
Timothy Pogue
|
9e36b0d2ea
|
fix formatting
|
2022-09-27 22:02:33 -06:00 |
|
Timothy Pogue
|
caa47a2f47
|
close subproc env on shutdown
|
2022-09-28 03:06:05 +00:00 |
|
robcaulk
|
647200e8a7
|
isort
|
2022-09-23 19:30:56 +02:00 |
|
robcaulk
|
77c360b264
|
improve typing, improve docstrings, ensure global tests pass
|
2022-09-23 19:17:27 +02:00 |
|
robcaulk
|
ea8e34e192
|
Merge branch 'develop' into dev-merge-rl
|
2022-09-22 19:46:50 +02:00 |
|
robcaulk
|
8aac644009
|
add tests. add guardrails.
|
2022-09-15 00:46:35 +02:00 |
|
robcaulk
|
81417cb795
|
Merge branch 'develop' into dev-merge-rl
|
2022-09-14 22:49:11 +02:00 |
|
Emre
|
330d7068ab
|
Merge branch 'develop' into add-xgboostclassifier
|
2022-09-10 23:59:11 +03:00 |
|
robcaulk
|
5a0cfee27e
|
allow user to multithread jobs (advanced users only)
|
2022-09-10 22:16:49 +02:00 |
|
Emre
|
60eb02bb62
|
Add XGBoostClassifier
|
2022-09-10 20:13:16 +03:00 |
|
robcaulk
|
10b6aebc5f
|
enable continual learning and evaluation sets on multioutput models.
|
2022-09-10 16:54:13 +02:00 |
|
robcaulk
|
a826c0eb83
|
ensure signatures match, reduce verbosity
|
2022-09-09 19:30:53 +02:00 |
|
Emre
|
acb410a0de
|
Remove verbosity params
|
2022-09-09 19:30:53 +02:00 |
|
Emre
|
df6e43d2c5
|
Add XGBoostRegressorMultiTarget class
|
2022-09-09 19:30:53 +02:00 |
|
Emre
|
1b6410d7d1
|
Add XGBoostRegressor for freqAI, fix mypy errors
|
2022-09-09 19:30:53 +02:00 |
|
robcaulk
|
4c9ac6b7c0
|
add kwargs, reduce duplicated code
|
2022-09-07 18:58:55 +02:00 |
|
robcaulk
|
97077ba18a
|
add continual learning to catboost and friends
|
2022-09-06 20:30:46 +02:00 |
|
robcaulk
|
240b529533
|
fix tensorboard path so that users can track all historical models
|
2022-08-31 16:50:39 +02:00 |
|
robcaulk
|
7766350c15
|
refactor environment inheritance tree to accommodate flexible action types/counts. fix bug in train profit handling
|
2022-08-28 19:21:57 +02:00 |
|
robcaulk
|
3199eb453b
|
reduce code for base use-case, ensure multiproc inherits custom env, add ability to limit ram use.
|
2022-08-25 19:05:51 +02:00 |
|
robcaulk
|
05ccebf9a1
|
automate eval freq in multiproc
|
2022-08-25 12:29:48 +02:00 |
|
robcaulk
|
94cfc8e63f
|
fix multiproc callback, add continual learning to multiproc, fix totalprofit bug in env, set eval_freq automatically, improve default reward
|
2022-08-25 11:46:18 +02:00 |
|
robcaulk
|
d1bee29b1e
|
improve default reward, fix bugs in environment
|
2022-08-24 18:32:40 +02:00 |
|
robcaulk
|
bd870e2331
|
fix monitor bug, set default values in case user doesn't set params
|
2022-08-24 16:32:14 +02:00 |
|
robcaulk
|
c0cee5df07
|
add continual retraining feature, handle mypy typing reqs, improve docstrings
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
b708134c1a
|
switch multiproc thread count to rl_config definition
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
b26ed7dea4
|
fix generic reward, add time duration to reward
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
29f0e01c4a
|
expose environment reward parameters to the user config
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
3eb897c2f8
|
reuse callback, allow user to access all stable_baselines3 agents via config
|
2022-08-24 13:00:55 +02:00 |
|
sonnhfit
|
4baa36bdcf
|
fix persist a single training environment for PPO
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
f95602f6bd
|
persist a single training environment.
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
5d4e5e69fe
|
reinforce training with state info, reinforce prediction with state info, restructure config to accommodate all parameters from any user imported model type. Set 5Act to default env on TDQN. Clean example config.
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
b90da46b1b
|
improve price df handling to enable backtesting
|
2022-08-24 13:00:55 +02:00 |
|
sonnhfit
|
0475b7cb18
|
remove unused code and fix coding conventions
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
d60a166fbf
|
multiproc TDQN with xtra callbacks
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
dd382dd370
|
add monitor to eval env so that multiproc can save best_model
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
e5df39e891
|
ensuring best_model is placed in ram and saved to disk and loaded from disk
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
bf7ceba958
|
set cpu threads in config
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
57c488a6f1
|
learning_rate + multicpu changes
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
acf3484e88
|
add multiprocessing variant of ReinforcementLearningPPO
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
13cd18dc9a
|
PPO policy change + verbose=1
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
926023935f
|
make base 3ac and base 5ac environments. TDQN defaults to 3AC.
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
096533bcb9
|
3ac to 5ac
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
718c9d0440
|
action fix
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
9c78e6c26f
|
base PPO model only customizes reward for 3AC
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
6048f60f13
|
get TDQN working with 5 action environment
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
d4db5c3281
|
ensure TDQN class is properly named
|
2022-08-24 13:00:55 +02:00 |
|