tensor2tensor

After All These Years, Someone Finally Explains the Transformer Clearly

故事扮演 submitted on 2020-10-21 14:31:06

Author | Jay Alammar  Translator | 香槟超新星  Editor | 夕颜  Source | CSDN (ID: CSDNnews)

The attention mechanism is ubiquitous in modern deep learning models; it is a concept that helps improve the performance of neural machine translation applications. In this post we introduce the Transformer, a model that uses attention to speed up training. On certain tasks the Transformer outperforms Google's neural machine translation model, but its biggest advantage is how well it lends itself to parallelization. In fact, Google Cloud recommends the Transformer as the reference model for its Cloud TPU offering. So let's take the model apart and see how it works.

The Transformer was proposed in the paper Attention is All You Need. Its TensorFlow implementation is part of the Tensor2Tensor package. Harvard's NLP group created an annotated guide to the paper with a PyTorch implementation. In this post we will keep things as simple as possible and introduce the concepts one at a time, hoping to make the Transformer easier to understand for readers without deep knowledge of the subject.

A High-Level Look at the Transformer

First, let's view the model as a single black box. In a machine translation application, it takes a sentence in one language and outputs its translation in another.

Popping the hood of Optimus Prime ("Transformer" is the same word as in Transformers
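To make the attention idea concrete, here is a minimal NumPy sketch of scaled dot-product attention, the building block at the heart of the Transformer. The shapes and names are illustrative, not the Tensor2Tensor implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # how well each query matches each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of the values

# A toy sequence of 3 tokens with 4-dimensional representations.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)   # (3, 4)
```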

Transformer_Introduce

佐手、 submitted on 2020-08-15 01:59:26

1. Embedding
After embedding the words in our input sequence, each of them flows through each of the two layers of the encoder. The word in each position flows through its own path in the encoder. In the self-attention layer there are dependencies between these paths. The feed-forward layer does not have those dependencies, however, so the various paths can be executed in parallel while flowing through the feed-forward layer.

2. Encoder
An encoder receives a list of vectors as input. It processes this list by passing these vectors into a 'self-attention' layer, then into a feed-forward
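Under these simplifications, one encoder layer can be sketched as follows. This is an illustrative NumPy toy (multi-head attention, residual connections, and layer normalization are omitted), and all names and dimensions are made up:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def encoder_layer(x, Wq, Wk, Wv, W1, b1, W2, b2):
    """One simplified encoder layer: self-attention, then a
    position-wise feed-forward network."""
    # Self-attention: every position attends to every other position,
    # so there are dependencies between the per-position paths here.
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V
    # Feed-forward: applied to each position independently,
    # so all positions can be processed in parallel.
    return np.maximum(attn @ W1 + b1, 0) @ W2 + b2

rng = np.random.default_rng(0)
d, d_ff, n = 4, 8, 3                        # model dim, FFN dim, sequence length
x = rng.standard_normal((n, d))             # embedded input sequence
params = [rng.standard_normal(s) for s in
          [(d, d), (d, d), (d, d), (d, d_ff), (d_ff,), (d_ff, d), (d,)]]
print(encoder_layer(x, *params).shape)      # (3, 4)
```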

TensorFlow 2.0, Beginner to Advanced: Taught First-Hand by a Google Engineer

前提是你 submitted on 2020-05-04 09:19:52

Chapter 1: Introduction to TensorFlow and Environment Setup
The introductory chapter of this course. It briefly explains what TensorFlow is, covers the evolution of TensorFlow's releases in detail along with its architecture and key strengths, compares TensorFlow 1.0, PyTorch, and TensorFlow 2.0, and finishes with a hands-on walkthrough of environment setup on Google Cloud and AWS.

Chapter 2: Hands-On with TensorFlow Keras
The foundational chapter of the course. It explains in detail how to build models with tf.keras and covers a large amount of deep learning theory: classification, regression, loss functions, neural networks, activation functions, dropout, batch normalization, deep neural networks, the Wide & Deep model, dense features, sparse features, hyperparameter search, and their implementation for image classification and house price prediction...

Chapter 3: Using the Basic TensorFlow APIs
Following the previous chapter's use of the high-level tf.keras API to build models, this chapter introduces the lower-level APIs that let you define and use models more flexibly: basic TensorFlow data types, custom models and loss functions, custom differentiation, tf.function, graph structures, and their application to image classification and house price prediction...

Chapter 4: Using TensorFlow Datasets
Introduces the APIs under the TensorFlow dataset namespace; the dataset API is mainly used for reading data
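As a taste of the workflow these chapters describe, here is a minimal sketch combining a tf.keras model (Chapter 2) with a tf.data input pipeline (Chapter 4). The toy data and layer sizes are illustrative, not from the course materials:

```python
import numpy as np
import tensorflow as tf

# Toy data standing in for the house-price example: 8 features per sample.
x = np.random.rand(1000, 8).astype("float32")
y = x.sum(axis=1, keepdims=True)

# A tf.data input pipeline (Chapter 4): shuffle, batch, prefetch.
ds = (tf.data.Dataset.from_tensor_slices((x, y))
      .shuffle(1000)
      .batch(32)
      .prefetch(tf.data.experimental.AUTOTUNE))

# A small tf.keras regression model (Chapter 2).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(ds, epochs=5)
```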

How to get current global_step in data pipeline

Deadly submitted on 2020-04-07 07:08:37

Question: I am trying to create a filter which depends on the current global_step of the training, but I am failing to do so properly. First, I cannot use tf.train.get_or_create_global_step() in the code below, because it will throw ValueError: Variable global_step already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at: This is why I tried fetching the scope with tf.get_default_graph().get_name_scope(), and within that context I was able to "
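One way to avoid that ValueError, sketched below, is to look up the existing global step instead of creating it again. This is a minimal TF 1.x sketch; the filter predicate and threshold are hypothetical, and whether a captured variable can be read inside a tf.data function depends on the TensorFlow version:

```python
import tensorflow as tf  # TF 1.x graph mode assumed

# tf.train.get_global_step() returns the existing global_step variable
# (or None) instead of raising when one has already been created.
global_step = tf.train.get_global_step()
if global_step is None:
    global_step = tf.train.create_global_step()

# Hypothetical filter: keep only elements whose value is at least the
# current step. The predicate captures the global_step tensor.
dataset = tf.data.Dataset.range(100)
dataset = dataset.filter(
    lambda x: x >= tf.cast(global_step, tf.int64))
```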
