当前位置: 首页 > news >正文

cms建站系统有关学校网站建设策划书

cms建站系统,有关学校网站建设策划书,国外网站做任务套利,wordpress 传统循环神经网络 RNN 前言一、RNN 是什么&#xff1f;1.1 RNN 的结构1.2 结构举例 二、RNN 模型的分类2.1 按照 输入跟输出 的结构分类2.2 按照 内部结构 分类 三、传统 RNN 模型3.1 RNN内部结构图3.2 内部计算公式3.3 其中 tanh 激活函数的作用3.4 传统RNN优缺点 四、代码演示…<article class="baidu_pl"><div id="article_content" class="article_content clearfix"><div id="content_views" class="markdown_views prism-atom-one-dark"><svg xmlns="http://www.w3.org/2000/svg" style="display: none;"><path stroke-linecap="round" d="M5,0 0,2.5 5,5z" id="raphael-marker-block" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0);"></path></svg><hr /> <p></p> <div class="toc"> <h4>传统循环神经网络 RNN</h4> <ul><li>前言</li><li>一、RNN 是什么?</li><li><ul><li>1.1 RNN 的结构</li><li>1.2 结构举例</li></ul> </li><li>二、RNN 模型的分类</li><li><ul><li>2.1 按照 输入跟输出 的结构分类</li><li>2.2 按照 内部结构 分类</li></ul> </li><li>三、传统 RNN 模型</li><li><ul><li>3.1 RNN内部结构图</li><li>3.2 内部计算公式</li><li>3.3 其中 tanh 激活函数的作用</li><li>3.4 传统RNN优缺点</li></ul> </li><li>四、代码演示</li><li>总结</li></ul> </div> <p></p> <hr /> <h2><a id="_5"></a>前言</h2> <ul><li>前面我们学习了卷积神经网络CNN,通过对图像做卷积运算来提取到图片的局部特征,但是在文本中,我们该怎么对文本进行张量转换,并且让机器学习到文本前后的联系呢,接下里我们将对文本领域的循环神经网络进行讲解。</li></ul> <hr /> <h2><a id="RNN__10"></a>一、RNN 是什么?</h2> <ul><li>RNN(Recurrent Neural Network),中文称作循环神经网络:它一般以序列数据为输入, 通过网络内部的结构设计有效捕捉序列之间的关系特征,一般也是以序列形式进行输出。</li></ul> <h3><a id="11_RNN__12"></a>1.1 RNN 的结构</h3> <ul><li>一般单层神经网络结构:<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/c3d9606e39f24df68e8d212afe82e8d4.png" alt="在这里插入图片描述" /></li><li>RNN单层网络结构:<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/4863e00fb73448549083335c31df44d0.gif#pic_center" alt="在这里插入图片描述" /></li><li>以时间步对RNN进行展开后的单层网络结构:<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/624c930ae981499c90c214e94e7c582d.gif#pic_center" alt="在这里插入图片描述" /></li><li><code>RNN 的循环机制能够使模型的隐藏层上一步产生的结果,作为当下时间步输入的一部分(当下时间步的输入除了正常输入之外,还包括上一步的隐层输出)对当下时间步的输出产生影响。</code></li></ul> <h3><a id="12__20"></a>1.2 结构举例</h3> <ul><li>我们前面说RNN能有效捕捉序列之间的关系特征,下边我们举个例子:</li><li>假如用户输入了一段话:“What time is it ?”,那么机器是怎么捕捉他们之间的序列关系的呢? <ul><li>第一步:先对输入的 “What time is it ?” 进行分词,因为RNN是按照输入顺序来工作的,每次都接受一个单词进行处理。<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/40ae1e0404fa48fea2ef19701fe87d5b.gif#pic_center" alt="在这里插入图片描述" /></li><li>第二步:首先将一个单词 “What” 输入进RNN,他将产生一个输出 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> O 1 O1 </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.6833em;"></span><span class="mord mathnormal" style="margin-right: 0.0278em;">O</span><span class="mord">1</span></span></span></span></span><br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/6418c346afb64b3eb870cedeaa068a51.gif#pic_center" alt="在这里插入图片描述" /></li><li>第三步:继续将单词 “time” 输入到 RNN,此时 RNN 不仅利用 “time” 来产生 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> O 2 O2 </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.6833em;"></span><span class="mord mathnormal" style="margin-right: 0.0278em;">O</span><span class="mord">2</span></span></span></span></span>,还会使用上次隐藏层的输出<span class="katex--inline"><span class="katex"><span class="katex-mathml"> O 1 O1 </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.6833em;"></span><span class="mord mathnormal" style="margin-right: 0.0278em;">O</span><span class="mord">1</span></span></span></span></span>作为输入信息<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/c3636eba9712461bb009371c39127885.gif#pic_center" alt="在这里插入图片描述" /></li><li>第四步 :重复第三步,知道将所有单词输入<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/f84c0a5f9c07435681e1159cc111dab0.gif#pic_center" alt="在这里插入图片描述" /></li><li>第五步:最后将隐藏层的输出 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> O 5 O5 </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.6833em;"></span><span class="mord mathnormal" style="margin-right: 0.0278em;">O</span><span class="mord">5</span></span></span></span></span> 进行处理来理解用户意图<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/bdf3bc4f3ceb4b0e99837f95b8f449b2.gif#pic_center" alt="在这里插入图片描述" /></li></ul> </li></ul> <h2><a id="RNN__33"></a>二、RNN 模型的分类</h2> <h3><a id="21____34"></a>2.1 按照 输入跟输出 的结构分类</h3> <ul><li><strong>N</strong> vs <strong>N</strong> <ul><li>这种结构是RNN最基础的机构形式,最大的特点就是:输入跟输出序列是等长的</li><li>由于这种限制的存在,使其适用范围较小,可以用于生成的等长的诗句或对联。<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/cfee5dfb86f94ff58077c5043e4fe2aa.png" alt="在这里插入图片描述" /></li></ul> </li><li><strong>N</strong> vs <strong>1</strong> <ul><li>当我们输入的问题是一个序列,而要求输出是单个的一个值而不是序列,这时候我们就要在最后一个隐藏层的输出上进行线性变化了。</li><li>大部分情况下,为了更好的明确结果,还要使用 sigmoid 或者 softmax 进行处理,这样的结构经常用于文本分类问题上。<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/e43b746cdff144fd83c548d8ed1060b5.png" alt="在这里插入图片描述" /></li></ul> </li><li><strong>1</strong> vs <strong>N</strong> <ul><li>如果输入的不是一个序列,而要求输出是一个序列,那我们就要让每次的输入都作用到每次的输出上</li><li>一般用来将图片生成文字任务、<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/caa1e18b6246450d94f7f3066dfa928f.png" alt="在这里插入图片描述" /></li></ul> </li><li><strong>N</strong> vs <strong>M</strong> <ul><li>这是一种不限输入输出长度的RNN结构,它由编码器和解码器两部分组成,两者的内部结构都是某类RNN,它也被称为 seq2seq 架构</li><li>输入数据首先通过编码器,最终输出一个隐含变量 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> c c </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.4306em;"></span><span class="mord mathnormal">c</span></span></span></span></span>,之后最常用的做法是使用这个隐含变量 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> c c </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.4306em;"></span><span class="mord mathnormal">c</span></span></span></span></span> 作用在解码器进行解码的每一步上,以保证输入信息被有效利用。<br /> <img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/6fbac853a43847d5b3d75057bb0f3c42.png" alt="在这里插入图片描述" /></li></ul> </li><li>seq2seq架构最早被提出应用于机器翻译,因为其输入输出不受限制,如今也是应用最广的RNN模型结构。</li><li>在机器翻译, 阅读理解, 文本摘要等众多领域都进行了非常多的应用实践。</li></ul> <h3><a id="22____54"></a>2.2 按照 内部结构 分类</h3> <ul><li>我们先介绍分为几种,对于其工作原理</li><li>在之后的章节中,我们再进行详细讨论 <ul><li><strong>传统RNN</strong></li><li><strong>LSTM</strong></li><li><strong>Bi-LSTM</strong></li><li><strong>GRU</strong></li><li><strong>Bi-GRU</strong></li></ul> </li></ul> <h2><a id="_RNN__62"></a>三、传统 RNN 模型</h2> <h3><a id="31_RNN_63"></a>3.1 RNN内部结构图</h3> <p><img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/1514dd8f8cce4df5ac89f65c805b55f3.png" alt="在这里插入图片描述" /></p> <ul><li>解释: <ul><li>隐藏层也就是循环层接收到的是当前时间步的输入 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> X t X_t </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8333em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right: 0.0785em;">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.2806em;"><span class="" style="top: -2.55em; margin-left: -0.0785em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span></span></span></span></span> 和上个时间步的隐藏层的输出 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> h t − 1 h_{t-1} </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.9028em; vertical-align: -0.2083em;"></span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3011em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.2083em;"><span class=""></span></span></span></span></span></span></span></span></span></span></li><li>这两个进入RNN结构体中,各自有跟权重矩阵进行运算以后,会融合到一起(也就是拼接到一起),形成新的张量 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> [ X t , h t − 1 ] [X_t , h_{t-1}] </span><span class="katex-html"><span class="base"><span class="strut" style="height: 1em; vertical-align: -0.25em;"></span><span class="mopen">[</span><span class="mord"><span class="mord mathnormal" style="margin-right: 0.0785em;">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.2806em;"><span class="" style="top: -2.55em; margin-left: -0.0785em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right: 0.1667em;"></span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3011em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.2083em;"><span class=""></span></span></span></span></span></span><span class="mclose">]</span></span></span></span></span></li><li>之后这个张量经过一个全连接层(线性层),该层使用 tanh 作为激活函数,最终得到当前时间步的输出 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> h t h_t </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8444em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.2806em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span></span></span></span></span></li><li>最后,当前时间步的输出 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> h t h_t </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8444em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.2806em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span></span></span></span></span> 将和 下一个时间步的输入 <span class="katex--inline"><span class="katex"><span class="katex-mathml"> X t + 1 X_{t+1} </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8917em; vertical-align: -0.2083em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right: 0.0785em;">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3011em;"><span class="" style="top: -2.55em; margin-left: -0.0785em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.2083em;"><span class=""></span></span></span></span></span></span></span></span></span></span> 一起进入结构体</li></ul> </li></ul> <h3><a id="32__71"></a>3.2 内部计算公式</h3> <p><img referrerpolicy="no-referrer" src="https://i-blog.csdnimg.cn/direct/486adff3198e4f39a7ba26b237329f40.png" alt="在这里插入图片描述" /><br /> <span class="katex--display"><span class="katex-display"><span class="katex"><span class="katex-mathml"> h t = tanh ⁡ ( X t W i h T + b i h + h t − 1 W h h T + b h h ) h_t = \tanh(X_t W_{ih}^T + b_{ih} + h_{t-1}W_{hh}^T + b_{hh}) </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8444em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.2806em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span><span class="mspace" style="margin-right: 0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right: 0.2778em;"></span></span><span class="base"><span class="strut" style="height: 1.1413em; vertical-align: -0.25em;"></span><span class="mop">tanh</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right: 0.0785em;">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.2806em;"><span class="" style="top: -2.55em; margin-left: -0.0785em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right: 0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.8913em;"><span class="" style="top: -2.453em; margin-left: -0.1389em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ih</span></span></span></span><span class="" style="top: -3.113em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right: 0.1389em;">T</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.247em;"><span class=""></span></span></span></span></span></span><span class="mspace" style="margin-right: 0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right: 0.2222em;"></span></span><span class="base"><span class="strut" style="height: 0.8444em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3361em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ih</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span><span class="mspace" style="margin-right: 0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right: 0.2222em;"></span></span><span class="base"><span class="strut" style="height: 1.1383em; vertical-align: -0.247em;"></span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3011em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.2083em;"><span class=""></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right: 0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.8913em;"><span class="" style="top: -2.453em; margin-left: -0.1389em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">hh</span></span></span></span><span class="" style="top: -3.113em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right: 0.1389em;">T</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.247em;"><span class=""></span></span></span></span></span></span><span class="mspace" style="margin-right: 0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right: 0.2222em;"></span></span><span class="base"><span class="strut" style="height: 1em; vertical-align: -0.25em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3361em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">hh</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></span></span></p> <ul><li><span class="katex--inline"><span class="katex"><span class="katex-mathml"> W i h W_{ih} </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8333em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right: 0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3361em;"><span class="" style="top: -2.55em; margin-left: -0.1389em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ih</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span></span></span></span></span> 表示输入数据的权重</li><li><span class="katex--inline"><span class="katex"><span class="katex-mathml"> b i h b_{ih} </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8444em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3361em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ih</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span></span></span></span></span> 表示输入数据的偏置</li><li><span class="katex--inline"><span class="katex"><span class="katex-mathml"> W h h W_{hh} </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8333em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right: 0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3361em;"><span class="" style="top: -2.55em; margin-left: -0.1389em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">hh</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span></span></span></span></span> 表示隐藏状态的权重</li><li><span class="katex--inline"><span class="katex"><span class="katex-mathml"> b h h b_{hh} </span><span class="katex-html"><span class="base"><span class="strut" style="height: 0.8444em; vertical-align: -0.15em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height: 0.3361em;"><span class="" style="top: -2.55em; margin-left: 0em; margin-right: 0.05em;"><span class="pstrut" style="height: 2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">hh</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height: 0.15em;"><span class=""></span></span></span></span></span></span></span></span></span></span> 表示隐藏状态的偏置</li></ul> <h3><a id="33__tanh__82"></a>3.3 其中 tanh 激活函数的作用</h3> <ul><li><strong>非线性映射</strong>: <ul><li>RNN中的线性层(也称为全连接层或仿射变换)仅仅是对输入进行线性组合,而tanh函数则引入了非线性特性。这使得RNN能够学习和表示更复杂的输入-输出关系,因为非线性映射能够捕捉数据中的非线性特征。</li></ul> </li><li><strong>值域限制</strong>: <ul><li>tanh函数的输出值域为(-1, 1),这有助于将神经元的输出限制在一个合理的范围内。与sigmoid函数类似,tanh函数也能够在一定程度上缓解梯度消失的问题(尽管在非常深的网络中仍然可能存在),因为梯度在值域内不会趋于零。</li></ul> </li><li><strong>中心化输出</strong>: <ul><li>tanh函数的输出是中心化的,即均值为0。这有助于在训练过程中保持数据的分布相对稳定,有助于加快收敛速度和提高模型的稳定性。</li></ul> </li><li><strong>梯度传播</strong>: <ul><li>在反向传播过程中,tanh函数的导数(即梯度)在输入接近0时最大,而在输入接近-1或1时接近0。这意味着当神经元的输出接近极端值时,梯度会变小,可能导致梯度消失问题。</li></ul> </li></ul> <h3><a id="34_RNN_91"></a>3.4 传统RNN优缺点</h3> <ul><li>优势 <ul><li>由于内部结构简单,对计算资源要求低,相比之后我们要学习的RNN变体:LSTM和GRU模型参数总量少了很多,在短序列任务上性能和效果都表现优异</li></ul> </li><li>缺点 <ul><li>传统RNN在解决长序列之间的关联时,通过实践,证明经典RNN表现很差,原因是在进行反向传播的时候,过长的序列导致梯度的计算异常,发生梯度消失或爆炸。</li></ul> </li></ul> <h2><a id="_98"></a>四、代码演示</h2> <blockquote> <p>演示代码 1 :</p> </blockquote> <pre><code class="prism language-python"><span class="token keyword">import</span> torch <span class="token keyword">from</span> torch <span class="token keyword">import</span> nn<span class="token keyword">def</span> <span class="token function">my_rnn_dm01</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">:</span><span class="token triple-quoted-string string">'''RNN 的三个参数的含义第一个参数:input_size(输入张量 x 的维度)第二个参数:hidden_size(隐藏层的维度,隐藏层的神经元个数)第三个参数:num_layer(隐藏层的数量)'''</span>rnn <span class="token operator">=</span> nn<span class="token punctuation">.</span>RNN<span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">,</span> <span class="token number">6</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token triple-quoted-string string">'''input 的三个参数的含义第一个参数:sequence_length(输入序列的长度)第二个参数:batch_size(批次的样本数量)第三个参数:input_size(输入张量的维度)'''</span><span class="token builtin">input</span> <span class="token operator">=</span> torch<span class="token punctuation">.</span>randn<span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">,</span> <span class="token number">3</span><span class="token punctuation">,</span> <span class="token number">5</span><span class="token punctuation">)</span><span class="token triple-quoted-string string">'''output 的三个参数的含义第一个参数:num_layer * num_directions(层数*网络方向)第二个参数:batch_size(批次的样本数)第三个参数:hidden_size(隐藏层的维度, 隐藏层神经元的个数)'''</span><span class="token comment"># h0 = torch.randn(1, 5, 6) </span><span class="token comment"># output, hn = rnn(input, h0) # h0 可以传也可以不传</span>output<span class="token punctuation">,</span> hn <span class="token operator">=</span> rnn<span class="token punctuation">(</span><span class="token builtin">input</span><span class="token punctuation">)</span><span class="token keyword">print</span><span class="token punctuation">(</span>output<span class="token punctuation">.</span>shape<span class="token punctuation">)</span> <span class="token comment"># torch.Size([5, 3, 6])</span><span class="token keyword">print</span><span class="token punctuation">(</span>output<span class="token punctuation">)</span><span class="token comment"># print(hn.shape)</span><span class="token comment"># print(hn)</span> </code></pre> <blockquote> <p>演示代码 2 :</p> </blockquote> <pre><code class="prism language-python"><span class="token keyword">def</span> <span class="token function">my_rnn_dm02</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">:</span><span class="token triple-quoted-string string">'''RNN 的三个参数的含义第一个参数:input_size(输入张量 x 的维度)第二个参数:hidden_size(隐藏层的维度,隐藏层的神经元个数)第三个参数:num_layer(隐藏层的数量)第四个参数:输入层可以把 batch_size参数 放在一个位置'''</span>rnn <span class="token operator">=</span> nn<span class="token punctuation">.</span>RNN<span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">,</span> <span class="token number">6</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">,</span> batch_first<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span><span class="token triple-quoted-string string">'''input 的三个参数的含义第一个参数:batch_size(批次的样本数量)第二个参数:sequence_length(输入序列的长度)第三个参数:input_size(输入张量的维度)'''</span><span class="token builtin">input</span> <span class="token operator">=</span> torch<span class="token punctuation">.</span>randn<span class="token punctuation">(</span><span class="token number">3</span><span class="token punctuation">,</span> <span class="token number">20</span><span class="token punctuation">,</span> <span class="token number">5</span><span class="token punctuation">)</span><span class="token triple-quoted-string string">'''output 的三个参数的含义第一个参数:num_layer * num_directions(层数*网络方向)第二个参数:batch_size(批次的样本数)第三个参数:hidden_size(隐藏层的维度, 隐藏层神经元的个数)'''</span>output<span class="token punctuation">,</span> hn <span class="token operator">=</span> rnn<span class="token punctuation">(</span><span class="token builtin">input</span><span class="token punctuation">)</span><span class="token keyword">print</span><span class="token punctuation">(</span>output<span class="token punctuation">.</span>shape<span class="token punctuation">)</span> <span class="token comment"># torch.Size([3, 20, 6])</span><span class="token keyword">print</span><span class="token punctuation">(</span>output<span class="token punctuation">)</span> </code></pre> <hr /> <h2><a id="_180"></a>总结</h2> <ul><li>以上就是传统RNN的基本内容</li></ul></div></div> <div id="vip"><a class="submit" onclick="showArticle()">查看全文</a></div> <div class="entry-copyright"> <!--<span class="source_url"></span><br>--> <span class="Disclaimers"><a href="http://www.yayakq.cn/news/391290/">http://www.yayakq.cn/news/391290/</a></span> <span class="email"></span> </div> <div class="gkt-entry-xgwz clear" style="margin-bottom:8px;"> <h3>相关文章:</h3> <li> <a href="/news/391289/">广汉网站如何做英文ppt模板下载网站</a> </li> <li> <a href="/news/391288/">乐都区公司网站建设个人购物网站建设</a> </li> <li> <a href="/news/391287/">厦门建设与管理局网站网站建设进度控制</a> </li> <li> <a href="/news/391286/">中国城乡建设部证件查询网站关键词优化排名软件怎么样</a> </li> <li> <a href="/news/391285/">网站主机设置方法用wordpress制作软件</a> </li> <li> <a href="/news/391281/">建站模板外贸开封建设局网站</a> </li> <li> <a href="/news/391280/">网站加水印php网站开发具体的参考文献</a> </li> <li> <a href="/news/391279/">模板网站建设信息网站建设公司何去何从</a> </li> <li> <a href="/news/391278/">做招聘网站需要什么每天做特卖的网站是哪个</a> </li> <li> <a href="/news/391277/">网站开发技术的雏形 cgi上海人才引进官网</a> </li> <li> <a href="/news/391276/">西安网络推广网站优化去年做啥网站能致富</a> </li> <li> <a href="/news/391275/">在电脑上建设个人网站网页制作图片大小代码</a> </li> <li> <a href="/news/391274/">wordpress站点制作企业网站seo点击软件</a> </li> <li> <a href="/news/391272/">义乌网站制作是什么网站采用哪种开发语言</a> </li> <li> <a href="/news/391270/">在网站上显示地图wordpress留言群发</a> </li> <li> <a href="/news/391269/">如何做淘宝商城网站设计asp网站做安全</a> </li> <li> <a href="/news/391268/">宁波建设局网站首页免费推广的平台都有哪些</a> </li> <li> <a href="/news/391267/">网站建设报价单 excel企业英文网站建设的重要性</a> </li> <li> <a href="/news/391266/">大学生网站设计作品青岛做网站公司</a> </li> <li> <a href="/news/391265/">重庆网站seo多少钱企业网站建设品牌</a> </li> <li> <a href="/news/391264/">创新的中山网站建设淘宝客网站一般用什么做的</a> </li> <li> <a href="/news/391263/">网站备案密码使用织梦网站做seo优化</a> </li> <li> <a href="/news/391262/">网站网址大全网站使用mip后效果怎么样</a> </li> <li> <a href="/news/391261/">旅游网站设计与实现徐州建网站</a> </li> <li> <a href="/news/391260/">dede网站后台导入文档哪里制作网站好</a> </li> <li> <a href="/news/391259/">长沙高新区建设局网站公司宣传册设计与制作图片</a> </li> <li> <a href="/news/391258/">gta5房子网站建设中做视频官方网站</a> </li> <li> <a href="/news/391257/">做网站怎么买服务器廊坊高端模板建站</a> </li> <li> <a href="/news/391256/">购物网站 设计仿新浪微博网站代码</a> </li> <li> <a href="/news/391255/">石家庄网站设计泰安有什么互联网公司</a> </li> </div> </article> </main> </div> </div> <aside id="secondary" class="widget-area sidebar"> <div class="widget widget_posts_thumbnail" style="margin-top:6px;"> <h2 class="widget-title">最新文章</h2> <ul> <li class="clear"> <a href="/news/431091/" rel="bookmark"> <div class="thumbnail-wrap"> <img width="120" height="80" src="http://pic.xiahunao.cn/yaotu/不会代码怎么做网站上饶网站建设兼职" alt=" 不会代码怎么做网站上饶网站建设兼职" /> </div> </a> <div class="entry-wrap"> <a href="/news/431091/" rel="bookmark"> 不会代码怎么做网站上饶网站建设兼职</a> <div class="entry-meta">2025/9/17 2:00:47</div></div> </li> <li class="clear"> <a href="/news/431090/" rel="bookmark"> <div class="thumbnail-wrap"> <img width="120" height="80" src="http://pic.xiahunao.cn/yaotu/国产做网站如何做一个网站赚钱" alt=" 国产做网站如何做一个网站赚钱" /> </div> </a> <div class="entry-wrap"> <a href="/news/431090/" rel="bookmark"> 国产做网站如何做一个网站赚钱</a> <div class="entry-meta">2025/9/17 2:00:47</div></div> </li> <li class="clear"> <a href="/news/431089/" rel="bookmark"> <div class="thumbnail-wrap"> <img width="120" height="80" src="http://pic.xiahunao.cn/yaotu/所有的网站都要用htmlu做吗天津营销类网站设计" alt=" 所有的网站都要用htmlu做吗天津营销类网站设计" /> </div> </a> <div class="entry-wrap"> <a href="/news/431089/" rel="bookmark"> 所有的网站都要用htmlu做吗天津营销类网站设计</a> <div class="entry-meta">2025/9/17 2:00:47</div></div> </li> <li class="clear"> <a href="/news/431088/" rel="bookmark"> <div class="thumbnail-wrap"> <img width="120" height="80" src="http://pic.xiahunao.cn/yaotu/自助网站建设 网易枸杞网站怎么做" alt=" 自助网站建设 网易枸杞网站怎么做" /> </div> </a> <div class="entry-wrap"> <a href="/news/431088/" rel="bookmark"> 自助网站建设 网易枸杞网站怎么做</a> <div class="entry-meta">2025/9/17 2:00:47</div></div> </li> <li class="clear"> <a href="/news/431087/" rel="bookmark"> <div class="thumbnail-wrap"> <img width="120" height="80" src="http://pic.xiahunao.cn/yaotu/公司网站哪个建的好开店做网站有什么好处" alt=" 公司网站哪个建的好开店做网站有什么好处" /> </div> </a> <div class="entry-wrap"> <a href="/news/431087/" rel="bookmark"> 公司网站哪个建的好开店做网站有什么好处</a> <div class="entry-meta">2025/9/17 2:00:47</div></div> </li> <li class="clear"> <a href="/news/431086/" rel="bookmark"> <div class="thumbnail-wrap"> <img width="120" height="80" src="http://pic.xiahunao.cn/yaotu/网站下载的软件在哪里找的到查公司法人天眼查" alt=" 网站下载的软件在哪里找的到查公司法人天眼查" /> </div> </a> <div class="entry-wrap"> <a href="/news/431086/" rel="bookmark"> 网站下载的软件在哪里找的到查公司法人天眼查</a> <div class="entry-meta">2025/9/17 2:00:47</div></div> </li> </ul> </div> <div class="leftdiv2"> </div> </aside> </div> <footer id="colophon" class="site-footer"> <div class="clear"></div> <div id="site-bottom" class="clear"> <div class="container"> <div class="menu-m_footer-container"> <ul id="footer-menu" class="footer-nav"> <li> <strong> <a href="/">芽芽口腔健康站介绍</a></strong> </li> <li> <strong> <a href="/">商务合作</a></strong> </li> <li> <strong> <a href="/">免责声明</a></strong> </li> </ul> </div> <div class="site-info"> <p>CopyRight © <a href="/">芽芽口腔健康站</a>版权所有 </p> </div> </div> </div> </footer> </div> <div id="back-top"> <a href="#top" title="返回顶部"> <svg width="38" height="38" viewbox="0 0 48 48" fill="none" xmlns="http://www.w3.org/2000/svg"> <rect width="48" height="48" fill="white" fill-opacity="0.01" /> <path d="M24 44C35.0457 44 44 35.0457 44 24C44 12.9543 35.0457 4 24 4C12.9543 4 4 12.9543 4 24C4 35.0457 12.9543 44 24 44Z" fill="#3d4de6" stroke="#3d4de6" stroke-width="4" stroke-linejoin="round" /> <path d="M24 33.5V15.5" stroke="#FFF" stroke-width="4" stroke-linecap="round" stroke-linejoin="round" /> <path d="M33 24.5L24 15.5L15 24.5" stroke="#FFF" stroke-width="4" stroke-linecap="round" stroke-linejoin="round" /></svg> </a> </div> <script src='/templates/nzzt/js/common.js'></script> <script> $(function(){ $('.source_url').text('原文地址:https://blog.csdn.net/Lyg970112/article/details/143955264'); }); /*$('.source_url').on("click",function() { window.open('https://blog.csdn.net/Lyg970112/article/details/143955264', '_blank'); });*/ </script> </body> </html>