From af8f308584a270c4e35d2ad6d768099459975cb1 Mon Sep 17 00:00:00 2001
From: robcaulk
Date: Sun, 28 Aug 2022 20:52:03 +0200
Subject: [PATCH] start the reinforcement learning doc

---
 docs/assets/tensorboard.png |  Bin 0 -> 9273 bytes
 docs/freqai.md              |  101 +++++++++++++++++++++++++++++++++++-
 2 files changed, 99 insertions(+), 2 deletions(-)
 create mode 100644 docs/assets/tensorboard.png

diff --git a/docs/assets/tensorboard.png b/docs/assets/tensorboard.png
new file mode 100644
index 0000000000000000000000000000000000000000..b986900435b28c89e9d9e8d1bdb5413d4411f913
GIT binary patch
literal 9273
[binary PNG data omitted]

diff --git a/docs/freqai.md b/docs/freqai.md
--- a/docs/freqai.md
+++ b/docs/freqai.md
 **Datatype:** Positive float < 1.
 | `shuffle` | Shuffle the training data points during training. Typically, for time-series forecasting, this is set to `False`. <br> **Datatype:** Boolean.
 |  |  **Model training parameters**
-| `model_training_parameters` | A flexible dictionary that includes all parameters available by the user selected model library. For example, if the user uses `LightGBMRegressor`, this dictionary can contain any parameter available by the `LightGBMRegressor` [here](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMRegressor.html) (external website). If the user selects a different model, this dictionary can contain any parameter from that model. <br> **Datatype:** Dictionary.
+| `model_training_parameters` | A flexible dictionary that includes all parameters available by the user selected model library. For example, if the user uses `LightGBMRegressor`, this dictionary can contain any parameter available by the `LightGBMRegressor` [here](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMRegressor.html) (external website). If the user selects a different model, such as `PPO` from stable_baselines3, this dictionary can contain any parameter from that model. <br> **Datatype:** Dictionary.
 | `n_estimators` | The number of boosted trees to fit in regression. <br> **Datatype:** Integer.
 | `learning_rate` | Boosting learning rate during regression. <br> **Datatype:** Float.
 | `n_jobs`, `thread_count`, `task_type` | Set the number of threads for parallel processing and the `task_type` (`gpu` or `cpu`). Different model libraries use different parameter names. <br> **Datatype:** Float.
+|  |  **Reinforcement Learning Parameters**
+| `rl_config` | A dictionary containing the control parameters for a Reinforcement Learning model (see the sketch after this table). <br> **Datatype:** Dictionary.
+| `train_cycles` | Training time steps will be set based on `train_cycles` * number of training data points. <br> **Datatype:** Integer.
+| `thread_count` | Number of threads to dedicate to the Reinforcement Learning training process. <br> **Datatype:** Integer.
+| `max_trade_duration_candles` | Guides the agent training to keep trades below the desired length. Example usage is shown in `prediction_models/ReinforcementLearner.py` within the user customizable `calculate_reward()`. <br> **Datatype:** Integer.
+| `model_type` | Model string from stable_baselines3 or SBcontrib. Available strings include: `'TRPO', 'ARS', 'RecurrentPPO', 'MaskablePPO', 'PPO', 'A2C', 'DQN'`. Users should ensure that `model_training_parameters` match those available to the corresponding stable_baselines3 model by visiting its documentation. [PPO doc](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (external website) <br> **Datatype:** String.
+| `policy_type` | One of the available policy types from stable_baselines3. <br> **Datatype:** String.
+| `continual_learning` | If true, the agent will start new trainings from the model selected during the previous training. If false, a new agent is trained from scratch for each training. <br> **Datatype:** Boolean.
+| `model_reward_parameters` | Parameters used inside the user customizable `calculate_reward()` function in `ReinforcementLearner.py`. <br> **Datatype:** Dictionary.
 |  |  **Extraneous parameters**
 | `keras` | If your model makes use of keras (typical of Tensorflow based prediction models), activate this flag so that the model save/loading follows keras standards. Default value `false` <br> **Datatype:** boolean.
 | `conv_width` | The width of a convolutional neural network input tensor or the `ReinforcementLearningModel` `window_size`. This replaces the need for `shift` by feeding in historical data points as the second dimension of the tensor. Technically, this parameter can also be used for regressors, but it only adds computational overhead and does not change the model training/prediction. Default value, 2 <br> **Datatype:** integer.
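
For orientation, the following is a minimal sketch of how the Reinforcement Learning keys from the table above might sit inside the `freqai` section of the configuration file. The key names come from the table; the specific values, the `MlpPolicy` string, and the PPO keyword arguments are illustrative assumptions rather than defaults taken from `config_freqai-rl.example.json`, and the usual feature/data keys of a `freqai` configuration are omitted:

```json
"freqai": {
    "model_training_parameters": {
        "learning_rate": 0.00025,
        "gamma": 0.9,
        "verbose": 1
    },
    "rl_config": {
        "train_cycles": 25,
        "thread_count": 4,
        "max_trade_duration_candles": 300,
        "model_type": "PPO",
        "policy_type": "MlpPolicy",
        "continual_learning": false,
        "model_reward_parameters": {
            "win_reward_factor": 2
        }
    }
}
```

The `max_trade_duration_candles` and `win_reward_factor` values simply echo the fallback defaults visible in the `calculate_reward()` example further below.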
@@ -731,6 +741,93 @@ Given a number of data points $N$, and a distance $\varepsilon$, DBSCAN clusters
 FreqAI uses `sklearn.cluster.DBSCAN` (details are available on scikit-learn's webpage [here](#https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html)) with `min_samples` ($N$) taken as double the no. of user-defined features, and `eps` ($\varepsilon$) taken as the longest distance in the *k-distance graph* computed from the nearest neighbors in the pairwise distances of all data points in the feature set.

+## Reinforcement Learning
+
+Setting up and running a Reinforcement Learning model is as quick and simple as running a Regressor. Users can start training and trading live from example files using:
+
+```bash
+freqtrade trade --freqaimodel ReinforcementLearner --strategy ReinforcementLearningExample5ac --strategy-path freqtrade/freqai/example_strats --config config_examples/config_freqai-rl.example.json
+```
+
+As users begin to modify the strategy and the prediction model, they will quickly realize some important differences between the Reinforcement Learner and the Regressors/Classifiers. Firstly, the strategy does not set a target value (no labels!). Instead, the user sets a `calculate_reward()` function inside their custom `ReinforcementLearner.py` file. A default `calculate_reward()` is provided inside `prediction_models/ReinforcementLearner.py` to give users the necessary building blocks to start their own models. It is inside `calculate_reward()` that users express their creative theories about the market. For example, the user may want to reward their agent when it makes a winning trade and penalize the agent when it makes a losing trade. Or perhaps the user wishes to reward the agent for entering trades and penalize the agent for sitting in a trade too long. Below we show examples of how these rewards are calculated:
+
+```python
+class MyRLEnv(Base5ActionRLEnv):
+    """
+    User made custom environment. This class inherits from BaseEnvironment and gym.Env.
+    Users can override any functions from those parent classes. Here is an example
+    of a user customized `calculate_reward()` function.
+ """ + + def calculate_reward(self, action): + + # first, penalize if the action is not valid + if not self._is_valid(action): + return -2 + + pnl = self.get_unrealized_profit() + rew = np.sign(pnl) * (pnl + 1) + factor = 100 + + # reward agent for entering trades + if action in (Actions.Long_enter.value, Actions.Short_enter.value) \ + and self._position == Positions.Neutral: + return 25 + # discourage agent from not entering trades + if action == Actions.Neutral.value and self._position == Positions.Neutral: + return -1 + + max_trade_duration = self.rl_config.get('max_trade_duration_candles', 300) + trade_duration = self._current_tick - self._last_trade_tick + + if trade_duration <= max_trade_duration: + factor *= 1.5 + elif trade_duration > max_trade_duration: + factor *= 0.5 + + # discourage sitting in position + if self._position in (Positions.Short, Positions.Long) and \ + action == Actions.Neutral.value: + return -1 * trade_duration / max_trade_duration + + # close long + if action == Actions.Long_exit.value and self._position == Positions.Long: + if pnl > self.profit_aim * self.rr: + factor *= self.rl_config['model_reward_parameters'].get('win_reward_factor', 2) + return float(rew * factor) + + # close short + if action == Actions.Short_exit.value and self._position == Positions.Short: + if pnl > self.profit_aim * self.rr: + factor *= self.rl_config['model_reward_parameters'].get('win_reward_factor', 2) + return float(rew * factor) + + return 0. + +``` + +After users realize there are no labels to set, they will soon understand that the agent is making its "own" entry and exit decisions. This makes strategy construction rather simple (as shown in `example_strats/ReinforcementLearningExample5ac.py`). The entry and exit signals come from the agent in the form of an integer - which are used directly to decide entries and exits in the strategy. + + +### Using Tensorboard + +Reinforcement Learning models benefit from tracking training metrics. FreqAI has integrated Tensorboard to allow users to track training and evaluation performance across all coins and across all retrainings. To start, the user should ensure Tensorboard is installed on their computer: + +```bash +pip3 install tensorboard +``` + +Next, the user can activate Tensorboard with the following command: + +```bash +cd freqtrade +tensorboard --logdir user_data/models/unique-id +``` + +where `unique-id` is the `identifier` set in the `freqai` configuration file. + +![tensorboard](assets/tensorboard.png) + ## Additional information ### Common pitfalls @@ -738,7 +835,7 @@ FreqAI uses `sklearn.cluster.DBSCAN` (details are available on scikit-learn's we FreqAI cannot be combined with dynamic `VolumePairlists` (or any pairlist filter that adds and removes pairs dynamically). This is for performance reasons - FreqAI relies on making quick predictions/retrains. To do this effectively, it needs to download all the training data at the beginning of a dry/live instance. FreqAI stores and appends -new candles automatically for future retrains. This means that if new pairs arrive later in the dry run due to a volume pairlist, it will not have the data ready. However, FreqAI does work with the `ShufflePairlist` or a `VolumePairlist` which keeps the total pairlist constant (but reorders the pairs according to volume). +new candles automatically for future retrains. This means that if new pairs arrive later in the dry run due to a volume pairlist, it will not have the data ready. 

 ## Credits