fix conflicts

2022-07-22 17:20:26 +02:00 · dcf0275838
commit dcf0275838
parent 38841e30b8 accc629e32
6 changed files with 298 additions and 51 deletions
--- a/docs/assets/freqai_logo_no_md.svg
+++ b/docs/assets/freqai_logo_no_md.svg
--- a/docs/freqai.md
+++ b/docs/freqai.md
@@ -1,3 +1,5 @@
+![freqai-logo](assets/freqai_logo_no_md.svg)
+
 # FreqAI

 FreqAI is a module designed to automate a variety of tasks associated with
@@ -65,6 +67,7 @@ freqtrade backtesting --config config_examples/config_freqai.example.json --stra

 ## Configuring the bot

+### Parameter table
 The table below will list all configuration parameters available for `FreqAI`.

 Mandatory parameters are marked as **Required**, which means that they are required to be set in one of the possible ways.
@@ -79,6 +82,8 @@ Mandatory parameters are marked as **Required**, which means that they are requi
 | `follow_mode` | If true, this instance of FreqAI will look for models associated with `identifier` and load those for inferencing. A `follower` will **not** train new models. False by default. <br> **Datatype:** boolean.
 | `live_trained_timestamp` | Useful if user wants to start from models trained during a *backtest*. The timestamp can be located in the `user_data/models` backtesting folder. This is not a commonly used parameter, leave undefined for most applications. <br> **Datatype:** positive integer.
 | `fit_live_predictions_candles` | Computes target (label) statistics from prediction data, instead of from the training data set. Number of candles is the number of historical candles it uses to generate the statistics. <br> **Datatype:** positive integer.
+| `purge_old_models` | Tell FreqAI to delete obsolete models. Otherwise, all historic models will remain on disk. Defaults to False. <br> **Datatype:** boolean.
+| `expiration_hours` | Ask FreqAI to avoid making predictions if a model is more than `expiration_hours` old. Defaults to 0, which means models never expire. <br> **Datatype:** positive integer.
 |  |  **Feature Parameters**
 | `feature_parameters` | A dictionary containing the parameters used to engineer the feature set. Details and examples shown [here](#building-the-feature-set) <br> **Datatype:** dictionary.
 | `include_corr_pairlist` | A list of correlated coins that FreqAI will add as additional features to all `pair_whitelist` coins. All indicators set in `populate_any_indicators` will be created for each coin in this list, and that set of features is added to the base asset feature set. <br> **Datatype:** list of assets (strings).
@@ -103,6 +108,16 @@ Mandatory parameters are marked as **Required**, which means that they are requi
 | `learning_rate` | A common parameter among regressors which sets the boosting learning rate. <br> **Datatype:** float.
 | `n_jobs`, `thread_count`, `task_type` | Different libraries use different parameter names to control the number of threads used for parallel processing or whether or not it is a `task_type` of `gpu` or `cpu`. <br> **Datatype:** float.

+### Return values for use in strategy
+Here are the values you can expect to receive inside the dataframe returned by FreqAI:
+
+|  Parameter | Description |
+|------------|-------------|
+| `&-s*` | User-defined labels from the user's strategy. Any column name prefixed with `&` is treated as a training target inside FreqAI, and the same column names are returned to the user as the predictions. For example, to predict the price change over the next 40 candles (similar to `templates/FreqaiExampleStrategy.py`), the user sets `&-s_close`. FreqAI makes the predictions and returns them under the same key (`&-s_close`) for use in `populate_entry/exit_trend()`. <br> **Datatype:** depends on the output of the model.
+| `&-s*_std/mean` | The standard deviation and mean of the user-defined labels during training (or live tracking with `fit_live_predictions_candles`). Commonly used to gauge how rare a prediction is: use the z-score, as shown in `templates/FreqaiExampleStrategy.py`, to evaluate how often a particular prediction was observed during training (or historically with `fit_live_predictions_candles`). <br> **Datatype:** floats.
+| `do_predict` | An outlier indicator: an integer between -1 and 2 that tells the user whether the prediction is trustworthy. `do_predict == 1` means the prediction is trustworthy. If the [Dissimilarity Index](#removing-outliers-with-the-dissimilarity-index) is above the user-defined threshold, 1 is subtracted from `do_predict`. If `use_SVM_to_remove_outliers` is active, the Support Vector Machine (SVM) may also detect outliers in training and prediction data, in which case it also subtracts 1 from `do_predict`. A special case is `do_predict == 2`, which means that the model has expired due to `expiration_hours`. <br> **Datatype:** integer between -1 and 2.
+| `DI_values` | The raw Dissimilarity Index values to give the user a sense of confidence in the prediction. Lower DI means the data point is closer to the trained parameter space. <br> **Datatype:** float.
+
 ### Example config file

 The user interface is isolated to the typical config file. A typical Freqai
@@ -376,7 +391,7 @@ The Freqai strategy requires the user to include the following lines of code in
        # (& prepended targets), an indication of whether or not the prediction should be accepted,
        # the target mean/std values for each of the labels created by user in 
        # `populate_any_indicators()` for each training period.
-        
+
        dataframe = self.model.bridge.start(dataframe, metadata, self)

        return dataframe
@@ -504,7 +519,7 @@ The user can tell Freqai to remove outlier data points from the training/test da
 ```json
    "freqai": {
        "feature_parameters" : {
-            "use_SVM_to_remove_outliers: true
+            "use_SVM_to_remove_outliers": true
        }
    }
 ```
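
The `do_predict` arithmetic described in the return-values table added to `docs/freqai.md` can be sketched as below. This is a minimal illustration of the documented behaviour only, not FreqAI's implementation: the helper names `model_is_expired` and `sketch_do_predict`, and their arguments, are hypothetical.

```python
def model_is_expired(model_age_hours: float, expiration_hours: int) -> bool:
    """Hypothetical helper: an `expiration_hours` of 0 means models never expire."""
    return expiration_hours > 0 and model_age_hours > expiration_hours


def sketch_do_predict(di_above_threshold: bool, svm_outlier: bool, expired: bool) -> int:
    """Hypothetical helper reproducing the documented range of -1 to 2."""
    if expired:
        return 2  # special case: the model is older than `expiration_hours`
    value = 1  # 1 means the prediction is considered trustworthy
    if di_above_threshold:
        value -= 1  # Dissimilarity Index above the user-defined threshold
    if svm_outlier:
        value -= 1  # SVM flagged the data point as an outlier
    return value


# Both outlier checks firing gives the lowest documented value.
lowest = sketch_do_predict(di_above_threshold=True, svm_outlier=True, expired=False)
```

Read this way, a strategy would act only on rows where `do_predict == 1` and treat every other value as a reason to skip the prediction.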
--- a/freqtrade/freqai/data_kitchen.py
+++ b/freqtrade/freqai/data_kitchen.py
@@ -69,7 +69,7 @@ class FreqaiDataKitchen:
                config["freqai"]["train_period_days"],
                config["freqai"]["backtest_period_days"],
            )
-        # self.strat_dataframe: DataFrame = strat_dataframe
+
        self.dd = data_drawer

    def set_paths(
@@ -1116,6 +1116,16 @@ class FreqaiDataKitchen:
        # self.data["lower_quantile"] = lower_q
        return

+    def remove_features_from_df(self, dataframe: DataFrame) -> DataFrame:
+        """
+        Remove the features from the dataframe before returning it to strategy. This keeps it
+        compact for FreqUI purposes.
+        """
+        to_keep = [
+            col for col in dataframe.columns if not col.startswith("%") or col.startswith("%%")
+        ]
+        return dataframe[to_keep]
+
    def np_encoder(self, object):
        if isinstance(object, np.generic):
            return object.item()
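
The column filter in `remove_features_from_df` relies on operator precedence: `not a or b` parses as `(not a) or b`. The sketch below applies the same condition to a plain list of names to show which columns survive. The function name `columns_to_keep` and the example column names are hypothetical and chosen only for illustration.

```python
def columns_to_keep(columns):
    # Same condition as in remove_features_from_df: drop names with a single
    # leading "%", but keep anything that starts with "%%".
    return [col for col in columns if not col.startswith("%") or col.startswith("%%")]


# Illustrative column names, not taken from a real FreqAI dataframe.
columns = ["close", "%-rsi", "%%-plot_me", "&-s_close", "do_predict"]
kept = columns_to_keep(columns)
```

Here `%-rsi` is dropped, while `%%-plot_me`, the `&` target column and the regular columns are retained.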
--- a/freqtrade/freqai/freqai_interface.py
+++ b/freqtrade/freqai/freqai_interface.py
@@ -37,9 +37,7 @@ def threaded(fn):
 class IFreqaiModel(ABC):
    """
    Class containing all tools for training and prediction in the strategy.
-    User models should inherit from this class as shown in
-    templates/ExamplePredictionModel.py where the user overrides
-    train(), predict(), fit(), and make_labels().
+    Base*PredictionModels inherit from this class.
    Author: Robert Caulk, rob.caulk@gmail.com
    """

@@ -51,23 +49,15 @@ class IFreqaiModel(ABC):
        self.data_split_parameters = config.get("freqai", {}).get("data_split_parameters")
        self.model_training_parameters = config.get("freqai", {}).get("model_training_parameters")
        self.feature_parameters = config.get("freqai", {}).get("feature_parameters")
-        self.time_last_trained = None
-        self.current_time = None
        self.model = None
-        self.predictions = None
-        self.training_on_separate_thread = False
        self.retrain = False
        self.first = True
-        self.update_historic_data = 0
        self.set_full_path()
        self.follow_mode = self.freqai_info.get("follow_mode", False)
        self.dd = FreqaiDataDrawer(Path(self.full_path), self.config, self.follow_mode)
        self.lock = threading.Lock()
-        self.follow_mode = self.freqai_info.get("follow_mode", False)
        self.identifier = self.freqai_info.get("identifier", "no_id_provided")
        self.scanning = False
-        self.ready_to_scan = False
-        self.first = True
        self.keras = self.freqai_info.get("keras", False)
        if self.keras and self.freqai_info.get("feature_parameters", {}).get("DI_threshold", 0):
            self.freqai_info["feature_parameters"]["DI_threshold"] = 0
@@ -114,7 +104,7 @@ class IFreqaiModel(ABC):
            )
            dk = self.start_backtesting(dataframe, metadata, self.dk)

-        dataframe = self.remove_features_from_df(dk.return_dataframe)
+        dataframe = dk.remove_features_from_df(dk.return_dataframe)
        return self.return_values(dataframe, dk)

    @threaded
@@ -260,9 +250,6 @@ class IFreqaiModel(ABC):
            dk.update_historic_data(strategy)
            logger.debug(f'Updating historic data on pair {metadata["pair"]}')

-        # if trainable, check if model needs training, if so compute new timerange,
-        # then save model and metadata.
-        # if not trainable, load existing data
        if not self.follow_mode:

            (_, new_trained_timerange, data_load_timerange) = dk.check_if_new_training_required(
@@ -320,6 +307,8 @@ class IFreqaiModel(ABC):
        # correct array to strategy

        if pair not in self.dd.model_return_values:
+            # first predictions are made on entire historical candle set coming from strategy. This
+            # allows FreqUI to show full return values.
            pred_df, do_preds = self.predict(dataframe, dk)
            self.dd.set_initial_return_values(pair, dk, pred_df, do_preds)
            dk.return_dataframe = self.dd.attach_return_values_to_return_dataframe(pair, dataframe)
@@ -333,7 +322,8 @@ class IFreqaiModel(ABC):
                "prediction == 0 and do_predict == 2"
            )
        else:
-            # Only feed in the most recent candle for prediction in live scenario
+            # remaining predictions are made only on the most recent candles for performance and
+            # historical accuracy reasons.
            pred_df, do_preds = self.predict(dataframe.iloc[-self.CONV_WIDTH:], dk, first=False)

        self.dd.append_model_predictions(pair, pred_df, do_preds, dk, len(dataframe))
@@ -384,11 +374,6 @@ class IFreqaiModel(ABC):
        if self.freqai_info.get("feature_parameters", {}).get("DI_threshold", 0):
            dk.data["avg_mean_dist"] = dk.compute_distances()

-        # if self.feature_parameters["determine_statistical_distributions"]:
-        #     dk.determine_statistical_distributions()
-        # if self.feature_parameters["remove_outliers"]:
-        #     dk.remove_outliers(predict=False)
-
    def data_cleaning_predict(self, dk: FreqaiDataKitchen, dataframe: DataFrame) -> None:
        """
        Base data cleaning method for predict.
@@ -411,11 +396,6 @@ class IFreqaiModel(ABC):
        if self.freqai_info.get("feature_parameters", {}).get("DI_threshold", 0):
            dk.check_if_pred_in_training_spaces()

-        # if self.feature_parameters["determine_statistical_distributions"]:
-        #     dk.determine_statistical_distributions()
-        # if self.feature_parameters["remove_outliers"]:
-        #     dk.remove_outliers(predict=True)  # creates dropped index
-
    def model_exists(
        self,
        pair: str,
@@ -428,6 +408,8 @@ class IFreqaiModel(ABC):
        Given a pair and path, check if a model already exists
        :param pair: pair e.g. BTC/USD
        :param path: path to model
+        :return:
+        :boolean: whether the model file exists or not.
        """
        coin, _ = pair.split("/")

@@ -452,16 +434,6 @@ class IFreqaiModel(ABC):
            Path(self.full_path, Path(self.config["config_files"][0]).name),
        )

-    def remove_features_from_df(self, dataframe: DataFrame) -> DataFrame:
-        """
-        Remove the features from the dataframe before returning it to strategy. This keeps it
-        compact for Frequi purposes.
-        """
-        to_keep = [
-            col for col in dataframe.columns if not col.startswith("%") or col.startswith("%%")
-        ]
-        return dataframe[to_keep]
-
    def train_model_in_series(
        self,
        new_trained_timerange: TimeRange,
@@ -507,7 +479,6 @@ class IFreqaiModel(ABC):

        if self.freqai_info.get("purge_old_models", False):
            self.dd.purge_old_models()
-        # self.retrain = False

    def set_initial_historic_predictions(
        self, df: DataFrame, model: Any, dk: FreqaiDataKitchen, pair: str
@@ -567,16 +538,6 @@ class IFreqaiModel(ABC):
        data (NaNs) or felt uncertain about data (i.e. SVM and/or DI index)
        """

-    def make_labels(self, dataframe: DataFrame, dk: FreqaiDataKitchen) -> DataFrame:
-        """
-        User defines the labels here (target values).
-        :params:
-        dataframe: DataFrame = the full dataframe for the present training period
-        dk: FreqaiDataKitchen = Data management/analysis tool assoicated to present pair only
-        """
-
-        return
-
    @abstractmethod
    def return_values(self, dataframe: DataFrame, dk: FreqaiDataKitchen) -> DataFrame:
        """
--- a/freqtrade/freqai/prediction_models/LightGBMPredictionMultiModel.py
+++ b/freqtrade/freqai/prediction_models/LightGBMPredictionMultiModel.py
@@ -0,0 +1,40 @@
+import logging
+from typing import Any, Dict
+
+from lightgbm import LGBMRegressor
+from sklearn.multioutput import MultiOutputRegressor
+
+from freqtrade.freqai.prediction_models.BaseRegressionModel import BaseRegressionModel
+
+
+logger = logging.getLogger(__name__)
+
+
+class LightGBMPredictionMultiModel(BaseRegressionModel):
+    """
+    User created prediction model. The class inherits BaseRegressionModel and
+    overrides fit(), wrapping LGBMRegressor in a MultiOutputRegressor so that
+    several labels can be trained at once.
+    """
+
+    def fit(self, data_dictionary: Dict) -> Any:
+        """
+        User sets up the training and test data to fit their desired model here
+        :params:
+        :data_dictionary: the dictionary constructed by DataHandler to hold
+        all the training and test data/labels.
+        """
+
+        lgb = LGBMRegressor(**self.model_training_parameters)
+
+        X = data_dictionary["train_features"]
+        y = data_dictionary["train_labels"]
+        eval_set = (data_dictionary["test_features"], data_dictionary["test_labels"])
+        sample_weight = data_dictionary["train_weights"]
+
+        model = MultiOutputRegressor(estimator=lgb)
+        model.fit(X=X, y=y, sample_weight=sample_weight)  # , eval_set=eval_set)
+        train_score = model.score(X, y)
+        test_score = model.score(*eval_set)
+        logger.info(f"Train score {train_score}, Test score {test_score}")
+        return model
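
`MultiOutputRegressor`, used in `fit()` above, fits one independent copy of the wrapped estimator per label column. The pure-Python sketch below illustrates that idea without scikit-learn or LightGBM. `MeanRegressor` and `PerTargetRegressor` are hypothetical stand-ins written for this illustration and are not part of FreqAI or either library.

```python
class MeanRegressor:
    """Hypothetical single-output estimator: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class PerTargetRegressor:
    """Hypothetical wrapper: fits one independent estimator per label column."""

    def __init__(self, make_estimator):
        self.make_estimator = make_estimator

    def fit(self, X, Y):
        n_targets = len(Y[0])
        # One fresh estimator per column of Y, each trained on the same features.
        self.estimators_ = [
            self.make_estimator().fit(X, [row[t] for row in Y]) for t in range(n_targets)
        ]
        return self

    def predict(self, X):
        per_target = [est.predict(X) for est in self.estimators_]
        # Transpose so each output row holds one prediction per target.
        return [list(row) for row in zip(*per_target)]


X = [[0.0], [1.0], [2.0]]
Y = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # two label columns
model = PerTargetRegressor(MeanRegressor).fit(X, Y)
predictions = model.predict([[5.0]])
```

Because each target gets its own estimator, the targets are modelled independently; correlations between labels are not exploited by this kind of wrapper.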
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -32,10 +32,10 @@ nav:
        - Backtest analysis: advanced-backtesting.md
    - Advanced Topics:
        - Advanced Post-installation Tasks: advanced-setup.md
-        - Edge Positioning: edge.md
        - Advanced Strategy: strategy-advanced.md
        - Advanced Hyperopt: advanced-hyperopt.md
        - FreqAI: freqai.md
+        - Edge Positioning: edge.md
        - Sandbox Testing: sandbox-testing.md
    - FAQ: faq.md
    - SQL Cheat-sheet: sql_cheatsheet.md