On the Memory Mechanism of Tensor-Power Recurrent Models

Hejia Qiu, Chao Li, Ying Weng, Zhun Sun, Xingyu He, Qibin Zhao

Research output: Journal Publication › Conference article › peer-review

3 Citations (Scopus)

Abstract

Tensor-power (TP) recurrent models are a family of non-linear dynamical systems whose recurrence relation consists of a p-fold (i.e., degree-p) tensor product. Although such models frequently appear in advanced recurrent neural networks (RNNs), to date there has been limited study of their memory property, a critical characteristic in sequence tasks. In this work, we conduct a thorough investigation of the memory mechanism of TP recurrent models. Theoretically, we prove that a large degree p is an essential condition for achieving the long memory effect, yet it leads to unstable dynamical behaviors. Empirically, we tackle this issue by extending the degree p from the discrete domain to a differentiable one, so that it can be efficiently learned from a variety of datasets. Taken together, the new model is expected to benefit from the long memory effect in a stable manner. We experimentally show that the proposed model achieves competitive performance compared with various advanced RNNs in both single-cell and seq2seq architectures.
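To make the recurrence concrete, below is a minimal illustrative sketch of one step of a degree-p tensor-power recurrence, in which the new hidden state depends on the p-fold tensor (Kronecker) product of the previous one. The function names (fold_power, tp_step), the tanh non-linearity, and the flattened weight shapes are assumptions chosen for illustration; they are not taken from the paper, and the paper's differentiable relaxation of p is not reproduced here.

    import numpy as np

    def fold_power(h, p):
        """p-fold tensor (Kronecker) product of a vector h, flattened to 1-D."""
        hp = h
        for _ in range(p - 1):
            hp = np.kron(hp, h)          # length grows to len(h) ** p
        return hp

    def tp_step(h, x, W, U, p):
        """One step of a degree-p tensor-power recurrence (illustrative):
           h_new = tanh(W @ (h ⊗ ... ⊗ h) + U @ x), with p copies of h.
        W is flattened to shape (d, d**p); U has shape (d, m)."""
        return np.tanh(W @ fold_power(h, p) + U @ x)

    # Toy usage with d = 3, m = 2, p = 2
    rng = np.random.default_rng(0)
    d, m, p = 3, 2, 2
    h = rng.standard_normal(d)
    x = rng.standard_normal(m)
    W = 0.1 * rng.standard_normal((d, d ** p))
    U = 0.1 * rng.standard_normal((d, m))
    print(tp_step(h, x, W, U, p))

A fractional degree p could, for instance, be handled by interpolating between the floor(p)- and ceil(p)-degree terms, although the actual differentiable construction used in the paper may differ.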

Original language: English
Pages (from-to): 3682-3690
Number of pages: 9
Journal: Proceedings of Machine Learning Research
Volume: 130
Publication status: Published - 2021
Event: 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021 - Virtual, Online, United States
Duration: 13 Apr 2021 - 15 Apr 2021

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

