Merge branch 'develop' into feat/externalsignals

This commit is contained in:
Timothy Pogue 2022-09-06 13:02:36 -06:00
commit 8bfaf0a998
11 changed files with 90 additions and 38 deletions

Binary file not shown.

Before

Width:  |  Height:  |  Size: 191 KiB

After

Width:  |  Height:  |  Size: 185 KiB

View File

@ -112,15 +112,15 @@ Mandatory parameters are marked as **Required**, which means that they are requi
| `DI_threshold` | Activates the Dissimilarity Index for outlier detection when > 0. See details about how it works [here](#removing-outliers-with-the-dissimilarity-index). <br> **Datatype:** Positive float (typically < 1).
| `use_SVM_to_remove_outliers` | Train a support vector machine to detect and remove outliers from the training data set, as well as from incoming data points. See details about how it works [here](#removing-outliers-using-a-support-vector-machine-svm). <br> **Datatype:** Boolean.
| `svm_params` | All parameters available in Sklearn's `SGDOneClassSVM()`. See details about some select parameters [here](#removing-outliers-using-a-support-vector-machine-svm). <br> **Datatype:** Dictionary.
| `use_DBSCAN_to_remove_outliers` | Cluster data using DBSCAN to identify and remove outliers from training and prediction data. See details about how it works [here](#removing-outliers-with-dbscan). <br> **Datatype:** Boolean.
| `outlier_protection_percentage` | If more than `outlier_protection_percentage` fraction of points are removed as outliers, FreqAI will log a warning message and ignore outlier detection while keeping the original dataset intact. <br> **Datatype:** float. Default: `30`
| `reverse_train_test_order` | If true, FreqAI will train on the latest data split and test on historical split of the data. This allows the model to be trained up to the most recent data point, while avoiding overfitting. However, users should be careful to understand unorthodox nature of this parameter before employing it. <br> **Datatype:** bool. Default: False
| `use_DBSCAN_to_remove_outliers` | Cluster data using DBSCAN to identify and remove outliers from training and prediction data. See details about how it works [here](#removing-outliers-with-dbscan). <br> **Datatype:** Boolean.
| `outlier_protection_percentage` | If more than `outlier_protection_percentage` % of points are detected as outliers by the SVM or DBSCAN, FreqAI will log a warning message and ignore outlier detection while keeping the original dataset intact. If the outlier protection is triggered, no predictions will be made based on the training data. <br> **Datatype:** Float. Default: `30`
| `reverse_train_test_order` | If true, FreqAI will train on the latest data split and test on historical split of the data. This allows the model to be trained up to the most recent data point, while avoiding overfitting. However, users should be careful to understand unorthodox nature of this parameter before employing it. <br> **Datatype:** Boolean. Default: False
| | **Data split parameters**
| `data_split_parameters` | Include any additional parameters available from Scikit-learn `test_train_split()`, which are shown [here](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) (external website). <br> **Datatype:** Dictionary.
| `test_size` | Fraction of data that should be used for testing instead of training. <br> **Datatype:** Positive float < 1.
| `shuffle` | Shuffle the training data points during training. Typically, for time-series forecasting, this is set to `False`. <br>
| `shuffle` | Shuffle the training data points during training. Typically, for time-series forecasting, this is set to `False`. <br> **Datatype:** Boolean.
| | **Model training parameters**
| `model_training_parameters` | A flexible dictionary that includes all parameters available by the user selected model library. For example, if the user uses `LightGBMRegressor`, this dictionary can contain any parameter available by the `LightGBMRegressor` [here](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMRegressor.html) (external website). If the user selects a different model, this dictionary can contain any parameter from that model. <br> **Datatype:** Dictionary.**Datatype:** Boolean.
| `model_training_parameters` | A flexible dictionary that includes all parameters available by the user selected model library. For example, if the user uses `LightGBMRegressor`, this dictionary can contain any parameter available by the `LightGBMRegressor` [here](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMRegressor.html) (external website). If the user selects a different model, this dictionary can contain any parameter from that model. <br> **Datatype:** Dictionary.
| `n_estimators` | The number of boosted trees to fit in regression. <br> **Datatype:** Integer.
| `learning_rate` | Boosting learning rate during regression. <br> **Datatype:** Float.
| `n_jobs`, `thread_count`, `task_type` | Set the number of threads for parallel processing and the `task_type` (`gpu` or `cpu`). Different model libraries use different parameter names. <br> **Datatype:** Float.
@ -280,7 +280,7 @@ The FreqAI strategy requires the user to include the following lines of code in
Notice how the `populate_any_indicators()` is where the user adds their own features ([more information](#feature-engineering)) and labels ([more information](#setting-classifier-targets)). See a full example at `templates/FreqaiExampleStrategy.py`.
### Setting the `startup_candle_count`
Users need to take care to set the `startup_candle_count` in their strategy the same way they would for any normal Freqtrade strategy (see details [here](strategy-customization.md/#strategy-startup-period)). This value is used by Freqtrade to ensure that a sufficient amount of data is provided when calling on the `dataprovider` to avoid any NaNs at the beginning of the first training. Users can easily set this value by identifying the longest period (in candle units) that they pass to their indicator creation functions (e.g. talib functions). In the present example, the user would pass 20 to as this value (since it is the maximum value in their `indicators_periods_candles`).
Users need to take care to set the `startup_candle_count` in their strategy the same way they would for any normal Freqtrade strategy (see details [here](strategy-customization.md#strategy-startup-period)). This value is used by Freqtrade to ensure that a sufficient amount of data is provided when calling on the `dataprovider` to avoid any NaNs at the beginning of the first training. Users can easily set this value by identifying the longest period (in candle units) that they pass to their indicator creation functions (e.g. talib functions). In the present example, the user would pass 20 to as this value (since it is the maximum value in their `indicators_periods_candles`).
!!! Note
Typically it is best for users to be safe and multiply their expected `startup_candle_count` by 2. There are instances where the talib functions actually require more data than just the passed `period`. Anecdotally, multiplying the `startup_candle_count` by 2 always leads to a fully NaN free training dataset. Look out for this log message to confirm that your data is clean:
@ -515,10 +515,10 @@ and if a full `live_retrain_hours` has elapsed since the end of the loaded model
The FreqAI backtesting module can be executed with the following command:
```bash
freqtrade backtesting --strategy FreqaiExampleStrategy --config config_examples/config_freqai.example.json --freqaimodel LightGBMRegressor --timerange 20210501-20210701
freqtrade backtesting --strategy FreqaiExampleStrategy --strategy-path freqtrade/templates --config config_examples/config_freqai.example.json --freqaimodel LightGBMRegressor --timerange 20210501-20210701
```
Backtesting mode requires the user to have the data pre-downloaded (unlike in dry/live mode where FreqAI automatically downloads the necessary data). The user should be careful to consider that the time range of the downloaded data is more than the backtesting time range. This is because FreqAI needs data prior to the desired backtesting time range in order to train a model to be ready to make predictions on the first candle of the user-set backtesting time range. More details on how to calculate the data to download can be found [here](#deciding-the-sliding-training-window-and-backtesting-duration).
Backtesting mode requires the user to have the data [pre-downloaded](#downloading-data-for-backtesting) (unlike in dry/live mode where FreqAI automatically downloads the necessary data). The user should be careful to consider that the time range of the downloaded data is more than the backtesting time range. This is because FreqAI needs data prior to the desired backtesting time range in order to train a model to be ready to make predictions on the first candle of the user-set backtesting time range. More details on how to calculate the data to download can be found [here](#deciding-the-sliding-training-window-and-backtesting-duration).
If this command has never been executed with the existing config file, it will train a new model
for each pair, for each backtesting window within the expanded `--timerange`.
@ -546,7 +546,7 @@ FreqAI will train have trained 8 separate models at the end of `--timerange` (be
Although fractional `backtest_period_days` is allowed, the user should be aware that the `--timerange` is divided by this value to determine the number of models that FreqAI will need to train in order to backtest the full range. For example, if the user wants to set a `--timerange` of 10 days, and asks for a `backtest_period_days` of 0.1, FreqAI will need to train 100 models per pair to complete the full backtest. Because of this, a true backtest of FreqAI adaptive training would take a *very* long time. The best way to fully test a model is to run it dry and let it constantly train. In this case, backtesting would take the exact same amount of time as a dry run.
### Downloading data for backtesting
Live/dry instances will download the data automatically for the user, but users who wish to use backtesting functionality still need to download the necessary data using `download-data` (details [here](data-download/#data-downloading)). FreqAI users need to pay careful attention to understanding how much *additional* data needs to be downloaded to ensure that they have a sufficient amount of training data *before* the start of their backtesting timerange. The amount of additional data can be roughly estimated by taking subtracting `train_period_days` and the `startup_candle_count` ([details](#setting-the-startupcandlecount)) from the beginning of the desired backtesting timerange.
Live/dry instances will download the data automatically for the user, but users who wish to use backtesting functionality still need to download the necessary data using `download-data` (details [here](data-download.md#data-downloading)). FreqAI users need to pay careful attention to understanding how much *additional* data needs to be downloaded to ensure that they have a sufficient amount of training data *before* the start of their backtesting timerange. The amount of additional data can be roughly estimated by moving the start date of the timerange backwards by `train_period_days` and the `startup_candle_count` ([details](#setting-the-startupcandlecount)) from the beginning of the desired backtesting timerange.
As an example, if we wish to backtest the `--timerange` above of `20210501-20210701`, and we use the example config which sets `train_period_days` to 15. The startup candle count is 40 on a maximum `include_timeframes` of 1h. We would need 20210501 - 15 days - 40 * 1h / 24 hours = 20210414 (16.7 days earlier than the start of the desired training timerange).
@ -738,7 +738,7 @@ Given a number of data points $N$, and a distance $\varepsilon$, DBSCAN clusters
![dbscan](assets/freqai_dbscan.jpg)
FreqAI uses `sklearn.cluster.DBSCAN` (details are available on scikit-learn's webpage [here](#https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html)) with `min_samples` ($N$) taken as double the no. of user-defined features, and `eps` ($\varepsilon$) taken as the longest distance in the *k-distance graph* computed from the nearest neighbors in the pairwise distances of all data points in the feature set.
FreqAI uses `sklearn.cluster.DBSCAN` (details are available on scikit-learn's webpage [here](#https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html)) with `min_samples` ($N$) taken as 1/4 of the no. of time points in the feature set, and `eps` ($\varepsilon$) taken as the elbow point in the *k-distance graph* computed from the nearest neighbors in the pairwise distances of all data points in the feature set.
## Additional information
@ -763,5 +763,5 @@ Code review, software architecture brainstorming:
@xmatthias
Beta testing and bug reporting:
@bloodhunter4rc, Salah Lamkadem @ikonx, @ken11o2, @longyu, @paranoidandy, @smidelis, @smarm
@bloodhunter4rc, Salah Lamkadem @ikonx, @ken11o2, @longyu, @paranoidandy, @smidelis, @smarm,
Juha Nykänen @suikula, Wagner Costa @wagnercosta

View File

@ -824,6 +824,8 @@ Options:
- Merge the dataframe without lookahead bias
- Forward-fill (optional)
For a full sample, please refer to the [complete data provider example](#complete-data-provider-sample) below.
All columns of the informative dataframe will be available on the returning dataframe in a renamed fashion:
!!! Example "Column renaming"

View File

@ -147,13 +147,16 @@ class FreqtradeBot(LoggingMixin):
:return: None
"""
logger.info('Cleaning up modules ...')
try:
# Wrap db activities in shutdown to avoid problems if database is gone,
# and raises further exceptions.
if self.config['cancel_open_orders_on_exit']:
self.cancel_all_open_orders()
if self.config['cancel_open_orders_on_exit']:
self.cancel_all_open_orders()
self.check_for_open_trades()
self.check_for_open_trades()
self.strategy.ft_bot_cleanup()
finally:
self.strategy.ft_bot_cleanup()
self.rpc.cleanup()
if self.emc:
@ -296,7 +299,7 @@ class FreqtradeBot(LoggingMixin):
pair=trade.pair,
amount=trade.amount,
is_short=trade.is_short,
open_date=trade.open_date_utc
open_date=trade.date_last_filled_utc
)
trade.funding_fees = funding_fees
else:
@ -741,10 +744,11 @@ class FreqtradeBot(LoggingMixin):
fee = self.exchange.get_fee(symbol=pair, taker_or_maker='maker')
base_currency = self.exchange.get_pair_base_currency(pair)
open_date = datetime.now(timezone.utc)
funding_fees = self.exchange.get_funding_fees(
pair=pair, amount=amount, is_short=is_short, open_date=open_date)
# This is a new trade
if trade is None:
funding_fees = self.exchange.get_funding_fees(
pair=pair, amount=amount, is_short=is_short, open_date=open_date)
trade = Trade(
pair=pair,
base_currency=base_currency,
@ -1499,7 +1503,7 @@ class FreqtradeBot(LoggingMixin):
pair=trade.pair,
amount=trade.amount,
is_short=trade.is_short,
open_date=trade.open_date_utc,
open_date=trade.date_last_filled_utc,
)
exit_type = 'exit'
exit_reason = exit_tag or exit_check.exit_reason

View File

@ -686,7 +686,7 @@ class Backtesting:
self.futures_data[trade.pair],
amount=trade.amount,
is_short=trade.is_short,
open_date=trade.open_date_utc,
open_date=trade.date_last_filled_utc,
close_date=exit_candle_time,
)

View File

@ -421,9 +421,10 @@ class Hyperopt:
preprocessed = self.backtesting.strategy.advise_all_indicators(data)
# Trim startup period from analyzed dataframe to get correct dates for output.
processed = trim_dataframes(preprocessed, self.timerange, self.backtesting.required_startup)
self.min_date, self.max_date = get_timerange(processed)
return processed
trimmed = trim_dataframes(preprocessed, self.timerange, self.backtesting.required_startup)
self.min_date, self.max_date = get_timerange(trimmed)
# Real trimming will happen as part of backtesting.
return preprocessed
def prepare_hyperopt_data(self) -> None:
HyperoptStateContainer.set_state(HyperoptState.DATALOAD)

View File

@ -212,17 +212,18 @@ def migrate_orders_table(engine, table_back_name: str, cols_order: List):
ft_fee_base = get_column_def(cols_order, 'ft_fee_base', 'null')
average = get_column_def(cols_order, 'average', 'null')
stop_price = get_column_def(cols_order, 'stop_price', 'null')
funding_fee = get_column_def(cols_order, 'funding_fee', '0.0')
# sqlite does not support literals for booleans
with engine.begin() as connection:
connection.execute(text(f"""
insert into orders (id, ft_trade_id, ft_order_side, ft_pair, ft_is_open, order_id,
status, symbol, order_type, side, price, amount, filled, average, remaining, cost,
stop_price, order_date, order_filled_date, order_update_date, ft_fee_base)
stop_price, order_date, order_filled_date, order_update_date, ft_fee_base, funding_fee)
select id, ft_trade_id, ft_order_side, ft_pair, ft_is_open, order_id,
status, symbol, order_type, side, price, amount, filled, {average} average, remaining,
cost, {stop_price} stop_price, order_date, order_filled_date,
order_update_date, {ft_fee_base} ft_fee_base
order_update_date, {ft_fee_base} ft_fee_base, {funding_fee} funding_fee
from {table_back_name}
"""))
@ -307,9 +308,10 @@ def check_migrate(engine, decl_base, previous_tables) -> None:
# Check if migration necessary
# Migrates both trades and orders table!
# if ('orders' not in previous_tables
# or not has_column(cols_orders, 'stop_price')):
# or not has_column(cols_orders, 'funding_fee')):
migrating = False
if not has_column(cols_trades, 'contract_size'):
# if not has_column(cols_trades, 'contract_size'):
if not has_column(cols_orders, 'funding_fee'):
migrating = True
logger.info(f"Running database migration for trades - "
f"backup: {table_back_name}, {order_table_bak_name}")

View File

@ -65,6 +65,8 @@ class Order(_DECL_BASE):
order_filled_date = Column(DateTime, nullable=True)
order_update_date = Column(DateTime, nullable=True)
funding_fee = Column(Float, nullable=True)
ft_fee_base = Column(Float, nullable=True)
@property
@ -72,6 +74,13 @@ class Order(_DECL_BASE):
""" Order-date with UTC timezoneinfo"""
return self.order_date.replace(tzinfo=timezone.utc)
@property
def order_filled_utc(self) -> Optional[datetime]:
""" last order-date with UTC timezoneinfo"""
return (
self.order_filled_date.replace(tzinfo=timezone.utc) if self.order_filled_date else None
)
@property
def safe_price(self) -> float:
return self.average or self.price
@ -119,6 +128,10 @@ class Order(_DECL_BASE):
self.ft_is_open = True
if self.status in NON_OPEN_EXCHANGE_STATES:
self.ft_is_open = False
if self.trade:
# Assign funding fee up to this point
# (represents the funding fee since the last order)
self.funding_fee = self.trade.funding_fees
if (order.get('filled', 0.0) or 0.0) > 0:
self.order_filled_date = datetime.now(timezone.utc)
self.order_update_date = datetime.now(timezone.utc)
@ -179,6 +192,10 @@ class Order(_DECL_BASE):
self.remaining = 0
self.status = 'closed'
self.ft_is_open = False
# Assign funding fees to Order.
# Assumes backtesting will use date_last_filled_utc to calculate future funding fees.
self.funding_fee = trade.funding_fees
if (self.ft_order_side == trade.entry_side):
trade.open_rate = self.price
trade.recalc_trade_from_orders()
@ -346,6 +363,15 @@ class LocalTrade():
else:
return self.amount
@property
def date_last_filled_utc(self) -> datetime:
""" Date of the last filled order"""
orders = self.select_filled_orders()
if not orders:
return self.open_date_utc
return max([self.open_date_utc,
max(o.order_filled_utc for o in orders if o.order_filled_utc)])
@property
def open_date_utc(self):
return self.open_date.replace(tzinfo=timezone.utc)
@ -843,10 +869,14 @@ class LocalTrade():
close_profit = 0.0
close_profit_abs = 0.0
profit = None
for o in self.orders:
# Reset funding fees
self.funding_fees = 0.0
funding_fees = 0.0
ordercount = len(self.orders) - 1
for i, o in enumerate(self.orders):
if o.ft_is_open or not o.filled:
continue
funding_fees += (o.funding_fee or 0.0)
tmp_amount = FtPrecise(o.safe_amount_after_fee)
tmp_price = FtPrecise(o.safe_price)
@ -861,7 +891,11 @@ class LocalTrade():
avg_price = current_stake / current_amount
if is_exit:
# Process partial exits
# Process exits
if i == ordercount and is_closing:
# Apply funding fees only to the last closing order
self.funding_fees = funding_fees
exit_rate = o.safe_price
exit_amount = o.safe_amount_after_fee
profit = self.calc_profit(rate=exit_rate, amount=exit_amount,
@ -871,6 +905,7 @@ class LocalTrade():
exit_rate, amount=exit_amount, open_rate=avg_price)
else:
total_stake = total_stake + self._calc_open_trade_value(tmp_amount, price)
self.funding_fees = funding_fees
if close_profit:
self.close_profit = close_profit

View File

@ -261,11 +261,15 @@ class RPC:
profit_str += f" ({fiat_profit:.2f})"
fiat_profit_sum = fiat_profit if isnan(fiat_profit_sum) \
else fiat_profit_sum + fiat_profit
open_order = (trade.select_order_by_order_id(
trade.open_order_id) if trade.open_order_id else None)
detail_trade = [
f'{trade.id} {direction_str}',
trade.pair + ('*' if (trade.open_order_id is not None
and trade.close_rate_requested is None) else '')
+ ('**' if (trade.close_rate_requested is not None) else ''),
trade.pair + ('*' if (open_order
and open_order.ft_order_side == trade.entry_side) else '')
+ ('**' if (open_order and
open_order.ft_order_side == trade.exit_side is not None) else ''),
shorten_date(arrow.get(trade.open_date).humanize(only_distance=True)),
profit_str
]

View File

@ -12,7 +12,7 @@ arrow==1.2.3
cachetools==4.2.2
requests==2.28.1
urllib3==1.26.12
jsonschema==4.14.0
jsonschema==4.15.0
TA-Lib==0.4.24
technical==1.3.0
tabulate==0.8.10

View File

@ -615,21 +615,25 @@ def test_calc_open_close_trade_price(
is_short=is_short,
leverage=lev,
trading_mode=trading_mode,
funding_fees=funding_fees
)
entry_order = limit_order[trade.entry_side]
exit_order = limit_order[trade.exit_side]
trade.open_order_id = f'something-{is_short}-{lev}-{exchange}'
oobj = Order.parse_from_ccxt_object(entry_order, 'ADA/USDT', trade.entry_side)
trade.orders.append(oobj)
oobj.trade = trade
oobj.update_from_ccxt_object(entry_order)
trade.update_trade(oobj)
trade.funding_fees = funding_fees
oobj = Order.parse_from_ccxt_object(exit_order, 'ADA/USDT', trade.exit_side)
trade.orders.append(oobj)
oobj.trade = trade
oobj.update_from_ccxt_object(exit_order)
trade.update_trade(oobj)
assert trade.is_open is False
assert trade.funding_fees == funding_fees
assert pytest.approx(trade._calc_open_trade_value(trade.amount, trade.open_rate)) == open_value
assert pytest.approx(trade.calc_close_trade_value(trade.close_rate)) == close_value