On the Memory Mechanism of Tensor-Power Recurrent Models

Hejia Qiu, Chao Li, Ying Weng, Zhun Sun, Xingyu He, Qibin Zhao

Research output: Journal Publication › Conference article › peer-review

3 Citations (Scopus)

Abstract

Tensor-power (TP) recurrent models are a family of non-linear dynamical systems whose recurrence relation consists of a p-fold (i.e., degree-p) tensor product. Although such models frequently appear in advanced recurrent neural networks (RNNs), to date there has been limited study of their memory property, a critical characteristic in sequence tasks. In this work, we conduct a thorough investigation of the memory mechanism of TP recurrent models. Theoretically, we prove that a large degree p is an essential condition for achieving the long memory effect, yet it leads to unstable dynamical behaviors. Empirically, we tackle this issue by extending the degree p from the discrete domain to a differentiable one, so that it can be efficiently learned from a variety of datasets. Taken together, the new model is expected to benefit from the long memory effect in a stable manner. We experimentally show that the proposed model achieves competitive performance compared with various advanced RNNs in both single-cell and seq2seq architectures.
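To make the recurrence concrete, below is a minimal illustrative sketch of one step of a degree-p tensor-power recurrence, in which the new hidden state depends on the p-fold tensor (Kronecker) product of the previous one. The function names (fold_power, tp_step), the tanh non-linearity, and the flattened weight shapes are assumptions chosen for illustration; they are not taken from the paper, and the paper's differentiable relaxation of p is not reproduced here.

    import numpy as np

    def fold_power(h, p):
        """p-fold tensor (Kronecker) product of a vector h, flattened to 1-D."""
        hp = h
        for _ in range(p - 1):
            hp = np.kron(hp, h)          # length grows to len(h) ** p
        return hp

    def tp_step(h, x, W, U, p):
        """One step of a degree-p tensor-power recurrence (illustrative):
           h_new = tanh(W @ (h ⊗ ... ⊗ h) + U @ x), with p copies of h.
        W is flattened to shape (d, d**p); U has shape (d, m)."""
        return np.tanh(W @ fold_power(h, p) + U @ x)

    # Toy usage with d = 3, m = 2, p = 2
    rng = np.random.default_rng(0)
    d, m, p = 3, 2, 2
    h = rng.standard_normal(d)
    x = rng.standard_normal(m)
    W = 0.1 * rng.standard_normal((d, d ** p))
    U = 0.1 * rng.standard_normal((d, m))
    print(tp_step(h, x, W, U, p))

A fractional degree p could, for instance, be handled by interpolating between the floor(p)- and ceil(p)-degree terms, although the actual differentiable construction used in the paper may differ.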

Original language: English
Pages (from-to): 3682-3690
Number of pages: 9
Journal: Proceedings of Machine Learning Research
Volume: 130
Publication status: Published - 2021
Event: 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021 - Virtual, Online, United States
Duration: 13 Apr 2021 - 15 Apr 2021

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

