Reduce image sizes in freqai doc (#7304)
BIN
docs/assets/freqai_DI.jpg
Normal file
After Width: | Height: | Size: 307 KiB |
Before Width: | Height: | Size: 38 MiB |
BIN
docs/assets/freqai_algo.jpg
Normal file
After Width: | Height: | Size: 345 KiB |
Before Width: | Height: | Size: 17 MiB |
BIN
docs/assets/freqai_dbscan.jpg
Normal file
After Width: | Height: | Size: 66 KiB |
Before Width: | Height: | Size: 1.9 MiB |
BIN
docs/assets/freqai_moving-window.jpg
Normal file
After Width: | Height: | Size: 270 KiB |
Before Width: | Height: | Size: 3.3 MiB |
BIN
docs/assets/freqai_weight-factor.jpg
Normal file
After Width: | Height: | Size: 191 KiB |
Before Width: | Height: | Size: 4.7 MiB |
Before Width: | Height: | Size: 126 KiB |
@@ -40,7 +40,7 @@ FreqAI trains a model to predict the target values based on the input of custom
An overview of the algorithm is shown below, explaining the data processing pipeline and the model usage.
data:image/s3,"s3://crabby-images/b8371/b83716859e3dfa61a69df684a7d4b3cbdb5f8b5e" alt="freqai-algo"
data:image/s3,"s3://crabby-images/37d4f/37d4f30077d2b50981a59c804f7ef251f832c406" alt="freqai-algo"
### Important machine learning vocabulary
@@ -469,7 +469,7 @@ Additionally, the example classifier models do not accommodate multiple labels,
There are two ways to train and deploy an adaptive machine learning model. FreqAI enables live deployment as well as backtesting analyses. In both cases, a model is trained periodically, as shown in the following figure.
data:image/s3,"s3://crabby-images/2d160/2d160d31f01cbd905b1d6afb759daa498e5bd656" alt="freqai-window"
data:image/s3,"s3://crabby-images/61114/61114efea9f21c8d79cb537439c5ac71b3c4b2cd" alt="freqai-window"
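The periodic retraining described above can be pictured as a window that slides forward through time. The following is a minimal sketch, not FreqAI's implementation: it assumes a fixed-length training window that advances by a fixed step, and the function name `sliding_windows` and its parameters `train_days` and `step_days` are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta


def sliding_windows(start, end, train_days, step_days):
    """Yield (train_start, train_end, predict_end) tuples that slide forward in time.

    Hypothetical helper: each model is trained on [train_start, train_end)
    and then used for predictions on [train_end, predict_end).
    """
    train_len = timedelta(days=train_days)
    step = timedelta(days=step_days)
    train_start = start
    while train_start + train_len < end:
        train_end = train_start + train_len
        # The last prediction window is clipped to the end of the data.
        predict_end = min(train_end + step, end)
        yield train_start, train_end, predict_end
        train_start += step


windows = list(
    sliding_windows(datetime(2022, 1, 1), datetime(2022, 2, 1), train_days=14, step_days=7)
)
for train_start, train_end, predict_end in windows:
    print(f"train {train_start:%Y-%m-%d} .. {train_end:%Y-%m-%d}, predict until {predict_end:%Y-%m-%d}")
```

Each tuple corresponds to one training run; in a backtest the windows are replayed over historical data, while in live deployment a new window would start whenever a retrain is triggered.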
### Running the model live
@@ -648,7 +648,7 @@ $$ W_i = \exp(\frac{-i}{\alpha*n}) $$
where $W_i$ is the weight of data point $i$ in a total set of $n$ data points. Below is a figure showing the effect of different weight factors on the data points (candles) in a feature set.
data:image/s3,"s3://crabby-images/ad810/ad810ad68085b9f61830e25d58f9cf0b5e0a5647" alt="weight-factor"
data:image/s3,"s3://crabby-images/9ff85/9ff85230e70849178ef87ab4cc34babe906cccc7" alt="weight-factor"
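To make the weighting formula concrete, here is a minimal sketch that evaluates $W_i = \exp(\frac{-i}{\alpha*n})$ for a small set of candles. It assumes that $i = 0$ denotes the most recent candle, so that older candles receive smaller weights; the function name `recency_weights` is hypothetical and not part of FreqAI.

```python
import math


def recency_weights(n, alpha):
    """Return [W_0, ..., W_{n-1}] with W_i = exp(-i / (alpha * n)).

    Assumption: i = 0 is the most recent data point.
    """
    return [math.exp(-i / (alpha * n)) for i in range(n)]


# A smaller weight factor (alpha) makes the weights decay faster with age.
fast_decay = recency_weights(n=4, alpha=0.5)
slow_decay = recency_weights(n=4, alpha=2.0)
print([round(w, 3) for w in fast_decay])
print([round(w, 3) for w in slow_decay])
```

For the same data point, a larger $\alpha$ always yields a weight closer to 1, which is why large weight factors treat old and new candles almost equally.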
`train_test_split()` has a parameter called `shuffle` that allows the user to keep the data unshuffled. This is particularly useful to avoid biasing training with temporally auto-correlated data.
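As an illustration of the unshuffled split, the sketch below calls scikit-learn's `train_test_split()` directly with `shuffle=False`. The toy arrays are made up for the example, and how FreqAI forwards this setting from its configuration is not shown here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten candles in chronological order; the single feature is just the candle index.
features = np.arange(10).reshape(-1, 1)
labels = np.arange(10)

# shuffle=False preserves time order: the test set is the three most recent candles.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=3, shuffle=False
)

print(y_train.tolist())
print(y_test.tolist())
```

With `shuffle=False` the training set contains only candles that precede every test candle, so neighboring, highly correlated candles are not scattered across both sets.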
@@ -691,7 +691,7 @@ The user can tweak the DI through the `DI_threshold` to increase or decrease the
Below is a figure that describes the DI for a 3D data set.
data:image/s3,"s3://crabby-images/35b6e/35b6ea2596cf803fcbd1e95c4f8a05685179f482" alt="DI"
data:image/s3,"s3://crabby-images/fed6b/fed6bd90b948006724d8f858859f85843f1eff86" alt="DI"
#### Removing outliers using a Support Vector Machine (SVM)
@@ -728,7 +728,7 @@ DBSCAN is an unsupervised machine learning algorithm that clusters data without
Given a number of data points $N$ and a distance $\varepsilon$, DBSCAN clusters the data set by marking every data point that has at least $N-1$ other data points within a distance of $\varepsilon$ as a *core point*. A data point that is within a distance of $\varepsilon$ from a *core point* but that does not have $N-1$ other data points within a distance of $\varepsilon$ from itself is considered an *edge point*. A cluster is then the collection of *core points* and *edge points*. Data points that are neither *core points* nor within a distance of $\varepsilon$ from a *core point* are considered outliers. The figure below shows a cluster with $N = 3$.
data:image/s3,"s3://crabby-images/e5f82/e5f82772065bfa04a2ba02d661fc619e5fc4e4c4" alt="dbscan"
data:image/s3,"s3://crabby-images/b6e28/b6e28c6bfd50e3beea5f1d2cb5b49549b19cd187" alt="dbscan"
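The core, edge, and outlier roles can be sketched in a few lines of plain Python. This is an illustration of the definitions only, not FreqAI's code and not a full clustering routine: the function name `classify_points` is hypothetical, and an outlier is taken to be any point that is neither a core point nor within $\varepsilon$ of one, which is the standard DBSCAN notion of noise.

```python
import math


def classify_points(points, eps, n_min):
    """Label each point 'core', 'edge', or 'outlier' (hypothetical helper)."""
    # For every point, collect the indices of the other points within eps.
    neighbors = [
        [j for j in range(len(points)) if j != i and math.dist(points[i], points[j]) <= eps]
        for i in range(len(points))
    ]
    # Core points have at least n_min - 1 other points within eps.
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= n_min - 1}

    labels = []
    for i, nb in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("edge")  # close to a core point, but not dense enough itself
        else:
            labels.append("outlier")
    return labels


points = [(0, 0), (1, 0), (0, 1), (2, 0), (10, 10)]
labels = classify_points(points, eps=1.1, n_min=3)
print(list(zip(points, labels)))
```

In this toy set with $N = 3$, the first two points each have two neighbors within $\varepsilon$ and are core points, the next two touch a core point but have only one neighbor, and the isolated point at (10, 10) is an outlier.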
FreqAI uses `sklearn.cluster.DBSCAN` (details are available on scikit-learn's webpage [here](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html)) with `min_samples` ($N$) taken as double the number of user-defined features, and `eps` ($\varepsilon$) taken as the longest distance in the *k-distance graph* computed from the nearest neighbors in the pairwise distances of all data points in the feature set.