Character-Based Sequence-to-Sequence (seq2seq) Models
=====================================================

Since release 0.6.0, shorttext supports sequence-to-sequence (seq2seq) learning. While a general seq2seq class works behind the scenes, the package exposes a character-based seq2seq implementation.

Creating One-hot Vectors
------------------------

To use it, create an instance of the class :class:`shorttext.generators.SentenceToCharVecEncoder`, here via the helper function ``initSentenceToCharVecEncoder``:

>>> import numpy as np
>>> import shorttext
>>> from urllib.request import urlopen
>>> chartovec_encoder = shorttext.generators.initSentenceToCharVecEncoder(urlopen('http://norvig.com/big.txt'))

The above code is the same as in :doc:`tutorial_charbaseonehot`.
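The encoder maps each character to a one-hot vector over the alphabet of the training text. As a quick sanity check, one can look at the size of the character dictionary and encode a short sentence (a hedged sketch: the ``dictionary`` attribute, the ``encode_sentence`` method, and the maximum length of 20 are assumptions based on the class interface documented below):

>>> numchars = len(chartovec_encoder.dictionary)    # dimension of each one-hot vector
>>> onehot_matrix = chartovec_encoder.encode_sentence('hello world', 20)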

.. automodule:: shorttext.generators.charbase.char2vec
   :members: initSentenceToCharVecEncoder


Training
--------

Then create the model to be trained as an instance of :class:`shorttext.generators.CharBasedSeq2SeqGenerator`:

>>> latent_dim = 100
>>> seq2seqer = shorttext.generators.CharBasedSeq2SeqGenerator(chartovec_encoder, latent_dim, 120)
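Here ``latent_dim`` is the dimensionality of the latent state of the underlying encoder and decoder networks, and the third argument (120 here) is taken to be the maximum length of the character sequences; see the class documentation below for the authoritative signature.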

And then train this neural network model on a text string. The variable ``text`` holds the training corpus; here it is assumed to be the same Norvig corpus, read and decoded into a Python string:

>>> text = urlopen('http://norvig.com/big.txt').read().decode('utf-8', errors='ignore')
>>> seq2seqer.train(text, epochs=100)

This model takes several hours to train on a laptop.
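Before committing to a run of that length, it may be worth a quick smoke test on a small slice of the corpus (a hedged sketch; the slice size and epoch count are arbitrary, and the resulting model will be of poor quality):

>>> seq2seqer.train(text[:10000], epochs=5)   # smoke test only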

.. autoclass:: shorttext.generators.seq2seq.charbaseS2S.CharBasedSeq2SeqGenerator
   :members:

Decoding
--------

After training, we can use this class as a generative model that answers questions like a chatbot:

>>> seq2seqer.decode('Happy Holiday!')

It does not give deterministic answers because there is stochasticity in the prediction.
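Because of this stochasticity, calling it again with the same input may return a different response:

>>> seq2seqer.decode('Happy Holiday!')   # may differ from the call above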

Model I/O
---------

This model can be saved by entering:

>>> seq2seqer.save_compact_model('/path/to/norvigtxt_iter5model.bin')

And it can be loaded by:

>>> seq2seqer2 = shorttext.generators.seq2seq.charbaseS2S.loadCharBasedSeq2SeqGenerator('/path/to/norvigtxt_iter5model.bin')
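To verify that the model was restored correctly, the loaded generator can be used for decoding just like the original one:

>>> seq2seqer2.decode('Happy Holiday!')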

.. automodule:: shorttext.generators.seq2seq.charbaseS2S
   :members: loadCharBasedSeq2SeqGenerator


Reference
---------

Aurélien Géron, *Hands-On Machine Learning with Scikit-Learn and TensorFlow* (Sebastopol, CA: O'Reilly Media, 2017). [O'Reilly]

Ilya Sutskever, James Martens, Geoffrey Hinton, "Generating Text with Recurrent Neural Networks," ICML (2011). [UToronto]

Ilya Sutskever, Oriol Vinyals, Quoc V. Le, "Sequence to Sequence Learning with Neural Networks," arXiv:1409.3215 (2014). [arXiv]

Oriol Vinyals, Quoc Le, "A Neural Conversational Model," arXiv:1506.05869 (2015). [arXiv]

Tom Young, Devamanyu Hazarika, Soujanya Poria, Erik Cambria, "Recent Trends in Deep Learning Based Natural Language Processing," arXiv:1708.02709 (2017). [arXiv]

Zachary C. Lipton, John Berkowitz, "A Critical Review of Recurrent Neural Networks for Sequence Learning," arXiv:1506.00019 (2015). [arXiv]