improve doc, update test strats, change function names
This commit is contained in:
@@ -2,96 +2,132 @@
|
||||
|
||||
## Defining the features
|
||||
|
||||
Low level feature engineering is performed in the user strategy within a function called `populate_any_indicators()`. That function sets the `base features` such as, `RSI`, `MFI`, `EMA`, `SMA`, time of day, volume, etc. The `base features` can be custom indicators or they can be imported from any technical-analysis library that you can find. One important syntax rule is that all `base features` string names are prepended with `%-{pair}`, while labels/targets are prepended with `&`.
|
||||
Low level feature engineering is performed in the user strategy within a set of functions called `feature_engineering_*`. These function set the `base features` such as, `RSI`, `MFI`, `EMA`, `SMA`, time of day, volume, etc. The `base features` can be custom indicators or they can be imported from any technical-analysis library that you can find. One important syntax rule is that all `base features` string names defined within `feature_engineering_*` functions must be prepended with `%-{pair}`. FreqAI is equipped with a set of functions to simplify rapid large-scale feature engineering:
|
||||
|
||||
| Function | Description |
|
||||
|---------------|-------------|
|
||||
| `feature_engineering__expand_all()` | This optional function will automatically expand the defined features on the config defined `indicator_periods_candles`, `include_timeframes`, `include_shifted_candles`, and `include_corr_pairs`.
|
||||
| `feature_engineering__expand_basic()` | This optional function will automatically expand the defined features on the config defined `include_timeframes`, `include_shifted_candles`, and `include_corr_pairs`. Note: this function does *not* expand across `include_periods_candles`.
|
||||
| `feature_engineering_standard()` | This optional function will be called once with the dataframe of the base timeframe. This is the final function to be called, which means that the dataframe entering this function will contain all the features and columns created by all other `feature_engineering_expand` functions. This function is a good place to do custom exotic feature extractions (e.g. tsfresh). This function is also a good place for any feature that should not be auto-expanded upon (e.g. day of the week).
|
||||
| `set_freqai_targets()` | Required function to set the targets for the model. All targets must be prepended with `&` to be recognized by the FreqAI internals.
|
||||
|
||||
|
||||
!!! Note
|
||||
Adding the full pair string, e.g. XYZ/USD, in the feature name enables improved performance for dataframe caching on the backend. If you decide *not* to add the full pair string in the feature string, FreqAI will operate in a reduced performance mode.
|
||||
|
||||
Meanwhile, high level feature engineering is handled within `"feature_parameters":{}` in the FreqAI config. Within this file, it is possible to decide large scale feature expansions on top of the `base_features` such as "including correlated pairs" or "including informative timeframes" or even "including recent candles."
|
||||
|
||||
It is advisable to start from the template `populate_any_indicators()` in the source provided example strategy (found in `templates/FreqaiExampleStrategy.py`) to ensure that the feature definitions are following the correct conventions. Here is an example of how to set the indicators and labels in the strategy:
|
||||
It is advisable to start from the template `feature_engineering_*` functions in the source provided example strategy (found in `templates/FreqaiExampleStrategy.py`) to ensure that the feature definitions are following the correct conventions. Here is an example of how to set the indicators and labels in the strategy:
|
||||
|
||||
```python
|
||||
def populate_any_indicators(
|
||||
self, pair, df, tf, informative=None, set_generalized_indicators=False
|
||||
):
|
||||
def feature_engineering_expand_all(self, dataframe, period, **kwargs):
|
||||
"""
|
||||
Function designed to automatically generate, name, and merge features
|
||||
from user-indicated timeframes in the configuration file. The user controls the indicators
|
||||
passed to the training/prediction by prepending indicators with `'%-' + pair `
|
||||
(see convention below). I.e., the user should not prepend any supporting metrics
|
||||
(e.g., bb_lowerband below) with % unless they explicitly want to pass that metric to the
|
||||
model.
|
||||
:param pair: pair to be used as informative
|
||||
:param df: strategy dataframe which will receive merges from informatives
|
||||
:param tf: timeframe of the dataframe which will modify the feature names
|
||||
:param informative: the dataframe associated with the informative pair
|
||||
*Only functional with FreqAI enabled strategies*
|
||||
This function will automatically expand the defined features on the config defined
|
||||
`indicator_periods_candles`, `include_timeframes`, `include_shifted_candles`, and
|
||||
`include_corr_pairs`. In other words, a single feature defined in this function
|
||||
will automatically expand to a total of
|
||||
`indicator_periods_candles` * `include_timeframes` * `include_shifted_candles` *
|
||||
`include_corr_pairs` numbers of features added to the model.
|
||||
|
||||
All features must be prepended with `%` to be recognized by FreqAI internals.
|
||||
|
||||
:param df: strategy dataframe which will receive the features
|
||||
:param period: period of the indicator - usage example:
|
||||
dataframe["%-ema-period"] = ta.EMA(dataframe, timeperiod=period)
|
||||
"""
|
||||
|
||||
if informative is None:
|
||||
informative = self.dp.get_pair_dataframe(pair, tf)
|
||||
dataframe["%-rsi-period"] = ta.RSI(dataframe, timeperiod=period)
|
||||
dataframe["%-mfi-period"] = ta.MFI(dataframe, timeperiod=period)
|
||||
dataframe["%-adx-period"] = ta.ADX(dataframe, timeperiod=period)
|
||||
dataframe["%-sma-period"] = ta.SMA(dataframe, timeperiod=period)
|
||||
dataframe["%-ema-period"] = ta.EMA(dataframe, timeperiod=period)
|
||||
|
||||
# first loop is automatically duplicating indicators for time periods
|
||||
for t in self.freqai_info["feature_parameters"]["indicator_periods_candles"]:
|
||||
t = int(t)
|
||||
informative[f"%-{pair}rsi-period_{t}"] = ta.RSI(informative, timeperiod=t)
|
||||
informative[f"%-{pair}mfi-period_{t}"] = ta.MFI(informative, timeperiod=t)
|
||||
informative[f"%-{pair}adx-period_{t}"] = ta.ADX(informative, window=t)
|
||||
bollinger = qtpylib.bollinger_bands(
|
||||
qtpylib.typical_price(dataframe), window=period, stds=2.2
|
||||
)
|
||||
dataframe["bb_lowerband-period"] = bollinger["lower"]
|
||||
dataframe["bb_middleband-period"] = bollinger["mid"]
|
||||
dataframe["bb_upperband-period"] = bollinger["upper"]
|
||||
|
||||
bollinger = qtpylib.bollinger_bands(
|
||||
qtpylib.typical_price(informative), window=t, stds=2.2
|
||||
dataframe["%-bb_width-period"] = (
|
||||
dataframe["bb_upperband-period"]
|
||||
- dataframe["bb_lowerband-period"]
|
||||
) / dataframe["bb_middleband-period"]
|
||||
dataframe["%-close-bb_lower-period"] = (
|
||||
dataframe["close"] / dataframe["bb_lowerband-period"]
|
||||
)
|
||||
|
||||
dataframe["%-roc-period"] = ta.ROC(dataframe, timeperiod=period)
|
||||
|
||||
dataframe["%-relative_volume-period"] = (
|
||||
dataframe["volume"] / dataframe["volume"].rolling(period).mean()
|
||||
)
|
||||
|
||||
return dataframe
|
||||
|
||||
def feature_engineering_expand_basic(self, dataframe, **kwargs):
|
||||
"""
|
||||
*Only functional with FreqAI enabled strategies*
|
||||
This function will automatically expand the defined features on the config defined
|
||||
`include_timeframes`, `include_shifted_candles`, and `include_corr_pairs`.
|
||||
In other words, a single feature defined in this function
|
||||
will automatically expand to a total of
|
||||
`include_timeframes` * `include_shifted_candles` * `include_corr_pairs`
|
||||
numbers of features added to the model.
|
||||
|
||||
Features defined here will *not* be automatically duplicated on user defined
|
||||
`indicator_periods_candles`
|
||||
|
||||
All features must be prepended with `%` to be recognized by FreqAI internals.
|
||||
|
||||
:param df: strategy dataframe which will receive the features
|
||||
dataframe["%-pct-change"] = dataframe["close"].pct_change()
|
||||
dataframe["%-ema-200"] = ta.EMA(dataframe, timeperiod=200)
|
||||
"""
|
||||
dataframe["%-pct-change"] = dataframe["close"].pct_change()
|
||||
dataframe["%-raw_volume"] = dataframe["volume"]
|
||||
dataframe["%-raw_price"] = dataframe["close"]
|
||||
return dataframe
|
||||
|
||||
def feature_engineering_standard(self, dataframe, **kwargs):
|
||||
"""
|
||||
*Only functional with FreqAI enabled strategies*
|
||||
This optional function will be called once with the dataframe of the base timeframe.
|
||||
This is the final function to be called, which means that the dataframe entering this
|
||||
function will contain all the features and columns created by all other
|
||||
freqai_feature_engineering_* functions.
|
||||
|
||||
This function is a good place to do custom exotic feature extractions (e.g. tsfresh).
|
||||
This function is a good place for any feature that should not be auto-expanded upon
|
||||
(e.g. day of the week).
|
||||
|
||||
All features must be prepended with `%` to be recognized by FreqAI internals.
|
||||
|
||||
:param df: strategy dataframe which will receive the features
|
||||
usage example: dataframe["%-day_of_week"] = (dataframe["date"].dt.dayofweek + 1) / 7
|
||||
"""
|
||||
dataframe["%-day_of_week"] = (dataframe["date"].dt.dayofweek + 1) / 7
|
||||
dataframe["%-hour_of_day"] = (dataframe["date"].dt.hour + 1) / 25
|
||||
return dataframe
|
||||
|
||||
def set_freqai_targets(self, dataframe, **kwargs):
|
||||
"""
|
||||
*Only functional with FreqAI enabled strategies*
|
||||
Required function to set the targets for the model.
|
||||
All targets must be prepended with `&` to be recognized by the FreqAI internals.
|
||||
|
||||
:param df: strategy dataframe which will receive the targets
|
||||
usage example: dataframe["&-target"] = dataframe["close"].shift(-1) / dataframe["close"]
|
||||
"""
|
||||
dataframe["&-s_close"] = (
|
||||
dataframe["close"]
|
||||
.shift(-self.freqai_info["feature_parameters"]["label_period_candles"])
|
||||
.rolling(self.freqai_info["feature_parameters"]["label_period_candles"])
|
||||
.mean()
|
||||
/ dataframe["close"]
|
||||
- 1
|
||||
)
|
||||
informative[f"{pair}bb_lowerband-period_{t}"] = bollinger["lower"]
|
||||
informative[f"{pair}bb_middleband-period_{t}"] = bollinger["mid"]
|
||||
informative[f"{pair}bb_upperband-period_{t}"] = bollinger["upper"]
|
||||
|
||||
informative[f"%-{pair}bb_width-period_{t}"] = (
|
||||
informative[f"{pair}bb_upperband-period_{t}"]
|
||||
- informative[f"{pair}bb_lowerband-period_{t}"]
|
||||
) / informative[f"{pair}bb_middleband-period_{t}"]
|
||||
informative[f"%-{pair}close-bb_lower-period_{t}"] = (
|
||||
informative["close"] / informative[f"{pair}bb_lowerband-period_{t}"]
|
||||
)
|
||||
|
||||
informative[f"%-{pair}relative_volume-period_{t}"] = (
|
||||
informative["volume"] / informative["volume"].rolling(t).mean()
|
||||
)
|
||||
|
||||
indicators = [col for col in informative if col.startswith("%")]
|
||||
# This loop duplicates and shifts all indicators to add a sense of recency to data
|
||||
for n in range(self.freqai_info["feature_parameters"]["include_shifted_candles"] + 1):
|
||||
if n == 0:
|
||||
continue
|
||||
informative_shift = informative[indicators].shift(n)
|
||||
informative_shift = informative_shift.add_suffix("_shift-" + str(n))
|
||||
informative = pd.concat((informative, informative_shift), axis=1)
|
||||
|
||||
df = merge_informative_pair(df, informative, self.config["timeframe"], tf, ffill=True)
|
||||
skip_columns = [
|
||||
(s + "_" + tf) for s in ["date", "open", "high", "low", "close", "volume"]
|
||||
]
|
||||
df = df.drop(columns=skip_columns)
|
||||
|
||||
# Add generalized indicators here (because in live, it will call this
|
||||
# function to populate indicators during training). Notice how we ensure not to
|
||||
# add them multiple times
|
||||
if set_generalized_indicators:
|
||||
df["%-day_of_week"] = (df["date"].dt.dayofweek + 1) / 7
|
||||
df["%-hour_of_day"] = (df["date"].dt.hour + 1) / 25
|
||||
|
||||
# user adds targets here by prepending them with &- (see convention below)
|
||||
# If user wishes to use multiple targets, a multioutput prediction model
|
||||
# needs to be used such as templates/CatboostPredictionMultiModel.py
|
||||
df["&-s_close"] = (
|
||||
df["close"]
|
||||
.shift(-self.freqai_info["feature_parameters"]["label_period_candles"])
|
||||
.rolling(self.freqai_info["feature_parameters"]["label_period_candles"])
|
||||
.mean()
|
||||
/ df["close"]
|
||||
- 1
|
||||
)
|
||||
|
||||
return df
|
||||
```
|
||||
|
||||
In the presented example, the user does not wish to pass the `bb_lowerband` as a feature to the model,
|
||||
@@ -118,13 +154,13 @@ After having defined the `base features`, the next step is to expand upon them u
|
||||
}
|
||||
```
|
||||
|
||||
The `include_timeframes` in the config above are the timeframes (`tf`) of each call to `populate_any_indicators()` in the strategy. In the presented case, the user is asking for the `5m`, `15m`, and `4h` timeframes of the `rsi`, `mfi`, `roc`, and `bb_width` to be included in the feature set.
|
||||
The `include_timeframes` in the config above are the timeframes (`tf`) of each call to `feature_engineering_expand_*()` in the strategy. In the presented case, the user is asking for the `5m`, `15m`, and `4h` timeframes of the `rsi`, `mfi`, `roc`, and `bb_width` to be included in the feature set.
|
||||
|
||||
You can ask for each of the defined features to be included also for informative pairs using the `include_corr_pairlist`. This means that the feature set will include all the features from `populate_any_indicators` on all the `include_timeframes` for each of the correlated pairs defined in the config (`ETH/USD`, `LINK/USD`, and `BNB/USD` in the presented example).
|
||||
You can ask for each of the defined features to be included also for informative pairs using the `include_corr_pairlist`. This means that the feature set will include all the features from `feature_engineering_expand_*()` on all the `include_timeframes` for each of the correlated pairs defined in the config (`ETH/USD`, `LINK/USD`, and `BNB/USD` in the presented example).
|
||||
|
||||
`include_shifted_candles` indicates the number of previous candles to include in the feature set. For example, `include_shifted_candles: 2` tells FreqAI to include the past 2 candles for each of the features in the feature set.
|
||||
|
||||
In total, the number of features the user of the presented example strat has created is: length of `include_timeframes` * no. features in `populate_any_indicators()` * length of `include_corr_pairlist` * no. `include_shifted_candles` * length of `indicator_periods_candles`
|
||||
In total, the number of features the user of the presented example strat has created is: length of `include_timeframes` * no. features in `feature_engineering_expand_*()` * length of `include_corr_pairlist` * no. `include_shifted_candles` * length of `indicator_periods_candles`
|
||||
$= 3 * 3 * 3 * 2 * 2 = 108$.
|
||||
|
||||
### Returning additional info from training
|
||||
|
||||
Reference in New Issue
Block a user