Merge pull request #5496 from LoveIsGrief/docs/performance-warning

Docs: Mention Performance Warning for strategies
Matthias 2021-09-23 20:16:58 +02:00 committed by GitHub
commit ff9c8fe234
2 changed files with 32 additions and 0 deletions


@@ -701,3 +701,33 @@ The variable 'content' will contain the strategy file in a BASE64 encoded form.
```
Please ensure that 'NameOfStrategy' is identical to the strategy name!
## Performance warning
When executing a strategy, one can sometimes be greeted by the following in the logs:
> PerformanceWarning: DataFrame is highly fragmented.
This is a warning from [`pandas`](https://github.com/pandas-dev/pandas). It is triggered when many columns are inserted into the dataframe one by one, and, as the warning itself suggests, the fix is to join all new columns at once using `pd.concat(axis=1)`.
This can have slight performance implications, which are usually only visible during hyperopt (when optimizing indicator parameters).
For example:
```python
for val in self.buy_ema_short.range:
dataframe[f'ema_short_{val}'] = ta.EMA(dataframe, timeperiod=val)
```
should be rewritten to
```python
frames = [dataframe]
for val in self.buy_ema_short.range:
    # Collect each new column as its own DataFrame (assumes the usual
    # `from pandas import DataFrame` strategy import) instead of inserting
    # it into the existing dataframe one column at a time.
    frames.append(DataFrame({
        f'ema_short_{val}': ta.EMA(dataframe, timeperiod=val)
    }))

# Append all new columns to the existing dataframe in one go
merged_frame = pd.concat(frames, axis=1)
```
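Note that `pd.concat` returns a new DataFrame rather than modifying `dataframe` in place, so the combined result (`merged_frame` above) is what should be used for any further calculations and returned from `populate_indicators()`.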


@@ -942,6 +942,8 @@ Printing more than a few rows is also possible (simply use `print(dataframe)` i
## Common mistakes when developing strategies
### Peeking into the future while backtesting
Backtesting analyzes the whole time-range at once for performance reasons. Because of this, strategy authors need to make sure that strategies do not look ahead into the future.
This is a common pain-point which can cause huge differences between backtesting and dry/live runs: such strategies use data that is not available during dry/live runs, so they will perform well during backtesting but will fail or perform badly in real conditions.
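As an illustration (a minimal sketch, not part of this commit): a negative `shift()` is one of the most common ways a strategy accidentally peeks into the future, since backtesting has the complete dataframe available while dry/live runs do not. The condition below is hypothetical; `buy` is the signal column used by strategies of this docs version.
```python
# Look-ahead bug: shift(-1) reads the NEXT candle's close, which is unknown
# during dry/live runs - backtest results cannot be reproduced live.
dataframe.loc[
    dataframe['close'].shift(-1) > dataframe['close'],
    'buy'] = 1

# Safe pattern: only reference the current candle or older ones,
# e.g. the previous candle via a positive shift.
dataframe.loc[
    dataframe['close'] > dataframe['close'].shift(1),
    'buy'] = 1
```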