Commit Graph

282 Commits

Author SHA1 Message Date
robcaulk 62c0a174c8 allow users to properly reverse train-test data ordering 2022-08-29 11:04:58 +02:00
robcaulk dd628eb525 add tests for outlier detection and removal functions 2022-08-28 12:56:39 +02:00
robcaulk 1e41c773a0 fix outlier protection 2022-08-28 12:11:29 +02:00
th0rntwig 71f7d68783 Fixed mypy error 2022-08-27 12:44:55 +02:00
elintornquist 86c5ac44e4 Add outlier percentage check 2022-08-26 23:05:07 +02:00
elintornquist b2d664c63c Change MinPts calculation 2022-08-26 18:57:27 +02:00
robcaulk 65b552e310 make docs reflect reality, move download_all_data to new utils.py file, automatic startup_candle detection 2022-08-26 15:30:01 +02:00
robcaulk 4b7e640f31 reduce code duplication, optimize auto data download per tf 2022-08-26 13:56:44 +02:00
th0rntwig 5ce1c69803 Improve DBSCAN epsilon identification (#7269) 2022-08-22 19:57:20 +02:00
robcaulk ac42c0153d deprecate indicator_max_period_candles, automatically compute startup candles for FreqAI backtesting. 2022-08-22 18:19:07 +02:00
robcaulk 96d8882f1e Plug mem leak, add training timer 2022-08-22 13:30:30 +02:00
longyu cfa5b3f12c add new line 2022-08-19 12:39:08 +02:00
longyu 277245c69d remove line 2022-08-19 12:39:00 +02:00
longyu f70b0bab80 remove line 2022-08-17 23:49:20 +02:00
robcaulk 5155afb4e7 clean up code remnants 2022-08-17 15:22:48 +02:00
robcaulk 0c34104e45 extract download-data from freqai to prepare for future async changes 2022-08-17 15:18:44 +02:00
longyu 9c38c27eed ignore sample itself distance for avg_mean_dist computation 2022-08-17 15:09:57 +02:00
longyu 72c34291e3 newline 2022-08-17 15:09:10 +02:00
robcaulk 006b11e5d5 fix leftover bug in indicator population 2022-08-14 21:42:55 +02:00
robcaulk ad846cdb76 fix lock bug, update docstring 2022-08-14 20:24:29 +02:00
Matthias a29402ddde Rename and move analysis_lock to data_kitchen 2022-08-14 17:23:14 +02:00
Matthias 3a9ec76c91 Move "freqai.lock" to backend to simplify user interface 2022-08-14 17:19:50 +02:00
robcaulk b1b76a2dbe debug classifier with predict proba 2022-08-13 19:40:24 +02:00
robcaulk 23cc21ce59 add predict_proba to base classifier, improve historic predictions handling 2022-08-13 19:40:24 +02:00
robcaulk fb4b73ce89 ensure dates are saved 2022-08-12 12:03:44 +02:00
robcaulk 2cae3c42e6 remove trade database analyzer, clean up a bit 2022-08-10 17:43:06 +02:00
robcaulk 5a16d5a512 Deactivate database analyzer if user does not use sqlite 2022-08-09 16:36:22 +02:00
robcaulk aef086b02e Improved dict typing, timeframe parser, collect dates associated with training data points 2022-08-09 15:30:25 +02:00
Matthias 77b3b8a134 Use main exchange instead of creating a separate instance. 2022-08-08 18:34:11 +00:00
Matthias b16f57cb0d Minor stylistic fixes 2022-08-06 14:55:46 +02:00
Robert Caulk c172ce1011 improve flexibility of user defined prediction dataframe 2022-08-06 13:51:19 +02:00
Robert Caulk 07763d0d4f add classifier, improve model naming scheme 2022-08-06 08:33:55 +02:00
Robert Caulk ce8fbbf743 ensure loading historical df matches frequi indices 2022-08-06 07:25:59 +02:00
robcaulk 60d782e5c5 remove unnecessary function 2022-08-05 21:31:32 +02:00
robcaulk a42a060ab5 fix DB once and for all. Make DBSCAN more efficient and robust. 2022-08-05 21:29:03 +02:00
robcaulk 29b7b014e5 fix bug in DB path initialization 2022-08-05 18:19:26 +02:00
robcaulk 05ec5c5e54 generalize database url path for any db type 2022-08-05 12:19:29 +02:00
Robert Caulk 51a6b4289f improve DBSCAN performance for subsequent trainings 2022-08-04 17:41:58 +02:00
Robert Caulk fe1b8515a8 fix bug in DBSCAN, update doc 2022-08-04 17:00:59 +02:00
robcaulk 29225e4baf add DBSCAN outlier detection feature, add supporting documentation 2022-08-04 12:15:16 +02:00
robcaulk eae82d0222 fix bug with database url during backtesting. comment out example trade db analysis. 2022-08-03 16:17:57 +02:00
robcaulk 95d3009a95 give user ability to analyze live trade dataframe inside custom prediction model. Add documentation to explain new functionality 2022-08-02 20:14:02 +02:00
robcaulk d830105605 *BREAKING CHANGE* remove unnecessary arguments from populate_any_indicators(), accommodate tests 2022-07-31 17:05:29 +02:00
robcaulk dd8288c090 expose full parameter set for SVM outlier detection. Set default shuffle to false to improve reproducibility 2022-07-30 13:40:05 +02:00
robcaulk f22b140782 fix backtesting bug, undo move of label stat calc, fix example strat exit logic 2022-07-29 17:27:35 +02:00
robcaulk c84d54b35e Fix typing issue, avoid using .get() when unnecessary, convert to fstrings 2022-07-29 08:12:50 +02:00
robcaulk e213d0ad55 isolate data_drawer functions from data_kitchen, accommodate tests, add new test 2022-07-26 10:24:14 +02:00
robcaulk 56b17e6f3c allow user to pass test_size = 0 and avoid using eval sets in prediction models 2022-07-25 19:40:13 +02:00
Robert Caulk 897f18a8c8 ensure proper integer type casting for timestamps. Add check test for backtesting subdaily time periods 2022-07-25 15:07:09 +02:00
Matthias 61c41fd919 Merge branch 'develop' into feat/freqai 2022-07-24 16:18:58 +02:00
Robert Caulk 88e10f7306 add exception for not passing timerange. Remove hard coded arguments for CatboostPredictionModels. Update docs 2022-07-24 09:01:23 +02:00
Robert Caulk fff39eff9e fix multitarget bug 2022-07-24 08:42:50 +02:00
robcaulk f3d46613ee move prediction denormalization into datakitchen. remove duplicate associated code. avoid normalization/denormalization for string dtypes. 2022-07-23 17:14:33 +02:00
robcaulk c91e23dc50 let user avoid normalizing labels 2022-07-23 16:14:13 +02:00
robcaulk a1cff377ec add record of contribution to data_kitchen.py 2022-07-23 13:32:04 +02:00
robcaulk 40f00196eb use cloudpickle in place of pickle. define Paths once in data_drawer. 2022-07-22 17:37:51 +02:00
robcaulk afcb0bec00 clean up obsolete comments, move remove_features_from_df to datakitchen 2022-07-22 12:29:20 +02:00
robcaulk 3205788bce extend doc to include descriptions of the return values from FreqAI to the strategy 2022-07-21 22:11:46 +02:00
robcaulk 183dec866a remove ability to backtest open ended timeranges (safer) 2022-07-21 13:02:52 +02:00
robcaulk ca4dd58642 remove superceded function from datakitchen 2022-07-21 12:40:54 +02:00
robcaulk 8f86b0deaa *breaking change* simplify user strat by consolidating feature loops into backend 2022-07-21 12:24:22 +02:00
robcaulk e7337728bf add separator in folder name just incase an asset ends in an integer 2022-07-21 11:25:28 +02:00
robcaulk d43c146676 add more tests for datakitchen functionalities, add regression tests for freqai_interface train/backtest 2022-07-20 12:56:46 +02:00
lolong 9c051958a6 Feat/freqai (#7105): Vectorize weight setting, log training dates. Co-authored-by: robcaulk <rob.caulk@gmail.com> 2022-07-19 17:49:18 +02:00
robcaulk 714d9534b6 start adding tests 2022-07-19 16:16:44 +02:00
lolong ed0f8b1189 Improve FreqAI documentation (#7072): Improve doc + some other small fixes. Co-authored-by: robcaulk <rob.caulk@gmail.com> 2022-07-18 11:57:52 +02:00
robcaulk ef409dd345 Add ground work for TensorFlow models, add protections from common mistakes 2022-07-12 18:09:17 +02:00
Robert Caulk 8ce6b18318 start collecting indefinite history of predictions. Allow user to generate statistics on these predictions. Direct FreqAI to save these to disk and reload them if available. 2022-07-11 22:01:48 +02:00
Robert Caulk 607455919e Change config parameter names to improve clarity and consistency throughout the code (!!breaking change, please check discord support channel for migration instructions or review templates/FreqaiExampleStrategy.py config_examples/config_freqai_futures.example.json file changes!!) 2022-07-10 12:35:44 +02:00
robcaulk d9acdc9767 remove excess, increase no model warning clarity 2022-07-06 18:20:21 +02:00
robcaulk 4cac67fd66 Catch infrequent issue associated with grabbing first candle 2022-07-05 12:43:33 +02:00
robcaulk bd3a6ba2fe update backtesting to handle new output framework 2022-07-03 17:34:44 +02:00
robcaulk 4ff0ef7359 fix bug returning multiple targets for training 2022-07-03 12:15:59 +02:00
robcaulk ffb39a5029 black formatting on freqai files 2022-07-03 10:59:38 +02:00
robcaulk 106131ff0f Rehaul organization of return values 2022-07-02 18:09:38 +02:00
robcaulk 93e1410ed9 first step toward cleaning output and enabling multimodel training per pair 2022-07-01 14:00:30 +02:00
robcaulk 6c7d02cb18 expose nu in the SVM outlier detection via svm_nu in config 2022-06-28 15:12:25 +02:00
robcaulk 051b99791d reduce unnecessary verbosity, fix error on first training sweep, add LightGBMPredictionModel 2022-06-26 19:04:23 +02:00
Robert Caulk 852706cd6b Fix default behavior for expiration_hours 2022-06-21 08:12:51 +02:00
robcaulk 6da7a98857 add docstrings to new functions, remove superceded code 2022-06-17 16:16:23 +02:00
robcaulk f631ae911b add model expiration feature, fix bug in DI return values 2022-06-17 14:55:40 +02:00
robcaulk c5de0c49e4 first functional scanning commit 2022-06-16 00:24:18 +02:00
robcaulk 4d472a0ea1 merging datarehaul into scanning branch 2022-06-16 00:22:49 +02:00
robcaulk eb47c74096 merge datarehaul into main freqai branch 2022-06-10 20:26:19 +02:00
robcaulk d9b79d94e4 increase candle update flexibility to allow long sequential trainings that may last more than one candle 2022-06-07 20:57:10 +02:00
robcaulk 15d049cffe detect if upper tf candles are new or not, append if so. Correct the epoch for candle update check 2022-06-07 19:49:20 +02:00
robcaulk 4b26b6aaec add lock to any historic data access 2022-06-07 00:54:18 +02:00
robcaulk d6b8801f41 fix follower bug 2022-06-05 04:40:58 +02:00
robcaulk e8c0dcf9f3 add debug message to timerange 2022-06-03 17:14:07 +02:00
robcaulk 16b4a5b71f rehaul of backend data management - increasing performance by holding history in memory, reducing load on the ratelimit by only pinging exchange once per candle. Improve code readability. 2022-06-03 15:19:46 +02:00
robcaulk 15a971346d catch infinity values when filtering 2022-06-02 17:13:20 +02:00
robcaulk 7523ed825e automatically detect maximum required data based on user fed indicators (to avoid NaNs in dataset for rolling indicators), add new config parameter for backtesting to let users increase their startup_candles to accommodate high timeframe indicators, add docs to explain all. Add new feature for automatic indicator duplication according to user defined intervals (exhibited in example strat and configs now). 2022-05-31 18:42:27 +02:00
robcaulk 70adf55643 Automatically detect and change follower data_path to accommodate remote systems 2022-05-31 12:35:09 +02:00
robcaulk 0306f5ca13 Add autopurge feature so that FreqAI cleans up after itself when it no longer needs old models on disk 2022-05-31 11:58:21 +02:00
robcaulk 29d2f59f12 fix PCA bug 2022-05-31 00:40:45 +02:00
robcaulk a20651efd8 Increase performance by only predicting on most recent candle instead of full strat provided dataframe. Collect predictions and store them so that we can feed true predictions back to strategy (so that frequi isnt updating historic predictions based on newly trained models). 2022-05-30 11:37:05 +02:00
robcaulk 2f1a2c1cd7 allow users to store data in custom formats, update spot config to reflect better target horizon to training period ratio 2022-05-30 02:12:31 +02:00
robcaulk 4eb4753e20 allow subdaily retraining for backtesting 2022-05-29 17:44:35 +02:00
robcaulk 83dd453723 catch errors occuring on background thread, and make sure to keep the ball rolling. Improve pair retraining queue. 2022-05-28 18:26:19 +02:00
robcaulk 2a4d1e2d64 fix bug in setting new timerange for retraining 2022-05-28 12:23:26 +02:00
robcaulk 7870a86e9a fix live retraining bug 2022-05-28 11:38:57 +02:00
robcaulk c5a16e91fb throw user error if user tries to load models but feeds the wrong features (while using PCA) 2022-05-28 11:11:41 +02:00
robcaulk b8f9c3557b dirty dirty, dont look here (hacking a flag to avoid reloading leverage_tiers in dry/live) 2022-05-27 13:56:34 +02:00
robcaulk 891fb87712 give load_cached_data_for_updating the right flags to avoid redownloading data in dry/live 2022-05-27 13:38:22 +02:00
robcaulk 65fdebab75 let load_pairs_histories load futures candles in live 2022-05-27 13:01:33 +02:00
robcaulk c080571b7a help futures go dry/live with auto download feature 2022-05-27 12:23:32 +02:00
robcaulk 8a501831d6 fix the error logic on previous commit 2022-05-27 01:15:55 +02:00
robcaulk 23c30dbc10 add error for user trying to backtest with backtest_period<1 2022-05-27 00:43:52 +02:00
robcaulk 6193205012 fix bug for target_mean/std array merging in backtesting 2022-05-26 21:07:50 +02:00
robcaulk b79d4e8876 Allow user to go live and start from pretrained models (after a completed backtest) by simply reusing the `identifier` config parameter while dry/live. 2022-05-25 14:40:32 +02:00
robcaulk 7486d9d9e2 proper validation of freqai config parameters 2022-05-25 12:37:25 +02:00
robcaulk 7ff3258607 remove assertions, log error if user has not assigned freqai in config, fix stratify bug 2022-05-25 11:43:45 +02:00
robcaulk 31ae2b3060 alleviate FutureWarning in sklearn about ensuring svm model features are passed with identical order 2022-05-24 14:46:16 +02:00
robcaulk 059c285425 paying closer attention to managing live retraining on separate thread without affecting prediction of other coins on master thread 2022-05-24 12:01:01 +02:00
robcaulk b0d2d13eb1 improve data persistence/mapping for live/dry. This accommodates quick reloads after crash and handles multi-pair cleanly 2022-05-23 21:05:05 +02:00
robcaulk e1c068ca66 add config asserts, use .get method with default values for optional functionality, move data_cleaning_* to freqai_interface (away from user custom pred model) since it is controlled by config params. 2022-05-23 12:07:09 +02:00
robcaulk ee3cdd0ffe more cleanup 2022-05-23 09:55:58 +02:00
robcaulk 3587bd82e1 cleanup superceded code 2022-05-23 00:10:36 +02:00
robcaulk af0cc21af9 Enable hourly/minute retraining in live/dry. Suppress catboost folder output. Update config + constants + docs to reflect updates. 2022-05-23 00:06:26 +02:00
robcaulk 42d95af829 Aggregated commit. Adding support vector machine for outlier detection, improve user interface to dry/live, better standardization, fix various other bugs 2022-05-22 17:51:49 +02:00
robcaulk 1fae6c9ef7 keep model accessible in memory to avoid loading objects from disk during live/dry 2022-05-19 19:27:38 +02:00
robcaulk db66b82f6f accept open-ended timeranges from user 2022-05-17 19:50:06 +02:00
robcaulk d1d451c27e auto populate features based on a prepended % in the strategy (remove feature assignment from config). Update doc/constants/example strategy to reflect change 2022-05-17 18:15:03 +02:00
robcaulk 80dcd88abf allow user to run config from anywhere on their system 2022-05-15 17:42:15 +02:00
robcaulk a8022c104a give beta testers more information in the doc 2022-05-15 17:42:15 +02:00
robcaulk 9b3e5faebe create more flexible whitelist, avoid duplicating whitelist features into corr_pairlist, update docs 2022-05-15 17:42:15 +02:00
robcaulk 22bd5556ed add self-retraining functionality for live/dry 2022-05-15 17:42:15 +02:00
robcaulk 178c2014b0 appease mypy 2022-05-15 17:42:15 +02:00
robcaulk f653ace24b another attempt at fixing datalength bug 2022-05-15 17:42:15 +02:00
robcaulk b03c7b514d optional style for interfacing freqai with backtesting 2022-05-15 17:42:15 +02:00
robcaulk 3020218096 fix bug on backtest timerange 2022-05-15 17:41:34 +02:00
robcaulk 00ff0c9b91 ensure user defined timerange truncates final backtest so that we arent mismatching data lengths upon return to strategy. Rename DataHandler class to FreqaiDataKitchen 2022-05-15 17:41:34 +02:00