robcaulk
|
81417cb795
|
Merge branch 'develop' into dev-merge-rl
|
2022-09-14 22:49:11 +02:00 |
|
robcaulk
|
5a0cfee27e
|
allow user to multithread jobs (advanced users only)
|
2022-09-10 22:16:49 +02:00 |
|
robcaulk
|
10b6aebc5f
|
enable continual learning and evaluation sets on multioutput models.
|
2022-09-10 16:54:13 +02:00 |
|
robcaulk
|
a826c0eb83
|
ensure signatures match, reduce verbosity
|
2022-09-09 19:30:53 +02:00 |
|
Emre
|
acb410a0de
|
Remove verbosity params
|
2022-09-09 19:30:53 +02:00 |
|
Emre
|
df6e43d2c5
|
Add XGBoostRegressorMultiTarget class
|
2022-09-09 19:30:53 +02:00 |
|
Emre
|
1b6410d7d1
|
Add XGBoostRegressor for freqAI, fix mypy errors
|
2022-09-09 19:30:53 +02:00 |
|
robcaulk
|
4c9ac6b7c0
|
add kwargs, reduce duplicated code
|
2022-09-07 18:58:55 +02:00 |
|
robcaulk
|
97077ba18a
|
add continual learning to catboost and friends
|
2022-09-06 20:30:46 +02:00 |
|
robcaulk
|
240b529533
|
fix tensorboard path so that users can track all historical models
|
2022-08-31 16:50:39 +02:00 |
|
robcaulk
|
7766350c15
|
refactor environment inheritance tree to accommodate flexible action types/counts. fix bug in train profit handling
|
2022-08-28 19:21:57 +02:00 |
|
robcaulk
|
3199eb453b
|
reduce code for base use-case, ensure multiproc inherits custom env, add ability to limit ram use.
|
2022-08-25 19:05:51 +02:00 |
|
robcaulk
|
05ccebf9a1
|
automate eval freq in multiproc
|
2022-08-25 12:29:48 +02:00 |
|
robcaulk
|
94cfc8e63f
|
fix multiproc callback, add continual learning to multiproc, fix totalprofit bug in env, set eval_freq automatically, improve default reward
|
2022-08-25 11:46:18 +02:00 |
|
robcaulk
|
d1bee29b1e
|
improve default reward, fix bugs in environment
|
2022-08-24 18:32:40 +02:00 |
|
robcaulk
|
bd870e2331
|
fix monitor bug, set default values in case user doesn't set params
|
2022-08-24 16:32:14 +02:00 |
|
robcaulk
|
c0cee5df07
|
add continual retraining feature, handle mypy typing reqs, improve docstrings
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
b708134c1a
|
switch multiproc thread count to rl_config definition
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
b26ed7dea4
|
fix generic reward, add time duration to reward
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
29f0e01c4a
|
expose environment reward parameters to the user config
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
3eb897c2f8
|
reuse callback, allow user to access all stable_baselines3 agents via config
|
2022-08-24 13:00:55 +02:00 |
|
sonnhfit
|
4baa36bdcf
|
fix persist a single training environment for PPO
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
f95602f6bd
|
persist a single training environment.
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
5d4e5e69fe
|
reinforce training with state info, reinforce prediction with state info, restructure config to accommodate all parameters from any user imported model type. Set 5Act to default env on TDQN. Clean example config.
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
b90da46b1b
|
improve price df handling to enable backtesting
|
2022-08-24 13:00:55 +02:00 |
|
sonnhfit
|
0475b7cb18
|
remove unused code and fix coding conventions
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
d60a166fbf
|
multiproc TDQN with xtra callbacks
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
dd382dd370
|
add monitor to eval env so that multiproc can save best_model
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
e5df39e891
|
ensuring best_model is placed in ram and saved to disk and loaded from disk
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
bf7ceba958
|
set cpu threads in config
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
57c488a6f1
|
learning_rate + multicpu changes
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
acf3484e88
|
add multiprocessing variant of ReinforcementLearningPPO
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
13cd18dc9a
|
PPO policy change + verbose=1
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
926023935f
|
make base 3ac and base 5ac environments. TDQN defaults to 3AC.
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
096533bcb9
|
3ac to 5ac
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
718c9d0440
|
action fix
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
9c78e6c26f
|
base PPO model only customizes reward for 3AC
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
6048f60f13
|
get TDQN working with 5 action environment
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
d4db5c3281
|
ensure TDQN class is properly named
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
91683e1dca
|
restructure RL so that user can customize environment
|
2022-08-24 13:00:55 +02:00 |
|
sonnhfit
|
ecd1f55abc
|
add rl module
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
9b895500b3
|
initial commit - new dev branch
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
cd3fe44424
|
callback function and TDQN model added
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
01232e9a1f
|
callback function and TDQN model added
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
8eeaab2746
|
add reward function
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
ec813434f5
|
ReinforcementLearningModel
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
2f4d73eb06
|
Revert "ReinforcementLearningModel"
This reverts commit 4d8dfe1ff1daa47276eda77118ddf39c13512a85.
|
2022-08-24 13:00:55 +02:00 |
|
MukavaValkku
|
c1e7db3130
|
ReinforcementLearningModel
|
2022-08-24 13:00:55 +02:00 |
|
robcaulk
|
05ed1b544f
|
Working base for reinforcement learning model
|
2022-08-24 13:00:40 +02:00 |
|
robcaulk
|
4c0fda400f
|
fix input shape warning for LGBMClassifier, add sample_weights/eval_weights
|
2022-08-16 11:41:53 +02:00 |
|