2023, SegRNN
Model architecture diagram
https://paperswithcode.com/paper/segrnn-segment-recurrent-neural-network-for
https://github.com/lss-1138/SegRNN?tab=readme-ov-file
Rank #1 on Time Series Forecasting on Weather (192)
SegRNN: Segment Recurrent Neural Network for Long-Term Time Series Forecasting
Abstract
RNN-based methods have faced challenges in the Long-term Time Series Forecasting (LTSF) domain when dealing with excessively long look-back windows and forecast horizons.
(excessively long look-back windows)
(excessively long forecast horizons)
These lead to an excessive number of RNN iteration steps.
To address these issues, we propose two novel strategies to reduce the number of iterations in RNNs for LTSF tasks: Segment-wise Iterations and Parallel Multi-step Forecasting (PMF).
Segment-wise iterations replace point-wise iterations (sketch below)
Parallel multi-step forecasting
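A minimal sketch of segment-wise iteration, assuming PyTorch and the paper's default look-back of 720 with segment length 48; class and variable names here are hypothetical, not the authors' code:

```python
import torch
import torch.nn as nn

class SegmentEncoder(nn.Module):
    """Segment-wise iteration: fold the look-back window into
    segments so the GRU runs L/w steps instead of L."""
    def __init__(self, seg_len=48, hidden=512):
        super().__init__()
        self.seg_len = seg_len
        self.proj = nn.Linear(seg_len, hidden)             # embed each segment
        self.gru = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, x):                                  # x: (batch, look_back)
        b, L = x.shape
        segs = x.view(b, L // self.seg_len, self.seg_len)  # (b, L/w, w)
        _, h = self.gru(self.proj(segs))                   # 15 steps, not 720
        return h                                           # (1, b, hidden)

enc = SegmentEncoder()
h = enc(torch.randn(32, 720))   # a 720-point window costs only 15 GRU steps
```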
Introduction
However, extending the forecast horizon poses significant challenges:
(i) Forecasting further into the future leads to increased uncertainty, resulting in decreased forecast accuracy.
Forecasting too far ahead degrades prediction accuracy.
(ii) Longer forecast horizons require models to consider a more extensive historical context for accurate predictions, significantly increasing the complexity of modeling.
An excessively long forecast horizon requires the model to take more contextual information into account, which significantly increases modeling complexity.
Figure 1 points out the problems:
- long look-back windows
- the longer the forecast horizon, the worse the accuracy and the higher the time cost
Contributions:
• We propose SegRNN, which utilizes the time-series segment technique to replace point-wise iterations with segment-wise iterations in LTSF.
• We further introduce the PMF technique to enhance the inference speed and performance of RNNs.
In PMF, the positional embeddings \(pe_i\) serve as a substitute for the order information of the recurrent structure (sketch below).
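A minimal sketch of PMF decoding based on the description above: the encoder's final hidden state is replicated once per forecast segment, and learned positional embeddings \(pe_i\) are fed to the GRU in place of recurrent order information, so all segments are decoded in one parallel call (names hypothetical; channel embeddings omitted):

```python
import torch
import torch.nn as nn

class PMFDecoder(nn.Module):
    """Parallel Multi-step Forecasting: decode all H/w forecast
    segments at once; pe_i stands in for the recurrent order signal."""
    def __init__(self, horizon=720, seg_len=48, hidden=512):
        super().__init__()
        self.m = horizon // seg_len                           # output segments
        self.pe = nn.Parameter(torch.randn(self.m, hidden))   # pe_i per segment
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, seg_len)                # hidden -> w values

    def forward(self, h):                       # h: (1, batch, hidden) from encoder
        b = h.size(1)
        h0 = h.repeat_interleave(self.m, dim=1)               # (1, b*m, hidden)
        inp = self.pe.repeat(b, 1).unsqueeze(1)               # (b*m, 1, hidden)
        out, _ = self.gru(inp, h0)                            # one parallel step
        return self.head(out.squeeze(1)).view(b, -1)          # (b, horizon)

dec = PMFDecoder()
y = dec(torch.randn(1, 32, 512))   # -> (32, 720), all segments decoded at once
```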
Related Work
- Transformers
- LogTrans (Li et al. 2019),
- Informer (Zhou et al. 2021),
- Pyraformer (Liu et al. 2021),
- Autoformer (Wu et al. 2021), and
- FEDformer (Zhou et al. 2022), aimed at reducing the complexity of Transformers.
More recently,
- PatchTST (Nie et al. 2023) and
- Crossformer (Zhang and Yan 2023)
leveraged patch-based techniques from computer vision (Dosovitskiy et al. 2021; He et al. 2022).
- MLPs
DLinear achieved superiority over then-state-of-the-art Transformer-based models through a simple linear layer and a channel-independent strategy (Zeng et al. 2023).
Subsequent MLP-based models include MTS-Mixers (Li et al. 2023), TSMixer (Vijay et al. 2023), and TiDE (Das et al. 2023).
The accomplishments of these MLP-based models have raised questions about the necessity of employing complex and cumbersome Transformers for time series prediction.
- CNNs
- MICN (Wang et al. 2023),
- TimesNet (Wu et al. 2023), and
- SCINet (Liu et al. 2022)
have demonstrated impressive results in the LTSF field.
- RNNs
Numerous efforts have been devoted to utilizing RNNs for short-term and probabilistic forecasting, achieving significant advancements (Lai et al. 2018; Bergsma et al. 2022; Wen et al. 2018; Tan, Xie, and Cheng 2023).
However, in the LTSF domain with excessively long look-back windows and forecast horizons, RNNs have been considered inadequate for effectively capturing long-term dependencies, leading to their gradual abandonment (Zhou et al. 2021, 2022).
The emergence of SegRNN aims to challenge and change this situation by attempting to address these limitations.
Preliminaries
Too many iteration steps cause exploding and vanishing gradients, which create convergence problems during training.
The vanilla RNN faces challenges such as vanishing and exploding gradients, which hinder the model’s convergence during training
The RNN cell adopted in this paper is the GRU.
Therefore, for consistency throughout the text, the SegRNN model is assumed to be based on the GRU cell.
Model Architecture
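The architecture chains the two sketches above: segment-wise GRU encoding of the look-back window, then PMF decoding of the horizon (instance normalization and channel-independent handling omitted for brevity):

```python
# Hypothetical end-to-end flow using the sketches above
x = torch.randn(32, 720)   # look-back window per channel
y_hat = dec(enc(x))        # (32, 720): 15 encode steps + 1 parallel decode
```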
Experiments
Experimental Setup
All experiments in this section are implemented in PyTorch and executed on two NVIDIA T4 GPUs, each equipped with 16GB of memory.
Datasets
Parameter Settings
Model configuration. The uniform configuration of SegRNN consists of a look-back of 720, a segment length of 48, a single GRU layer, a hidden size of 512, 30 training epochs, a learning rate decay of 0.8 after the initial 3 epochs, and early stopping with a patience of 10. The dropout rate, batch size, and learning rate vary based on the scale of the data.
- Look-back window: 720
- Segment length: 48
- Single GRU layer, hidden size: 512
- Training epochs: 30
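The same settings as an illustrative config dict (key names are my own, not the repository's):

```python
config = {
    "look_back": 720,     # input window length
    "seg_len": 48,        # 720 / 48 = 15 encoder iterations
    "gru_layers": 1,
    "hidden_size": 512,
    "epochs": 30,
    "lr_decay": 0.8,      # applied after the first 3 epochs
    "patience": 10,       # early stopping
    # dropout, batch size, and learning rate vary with dataset scale
}
```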
Comparison Experiments
Baseline models & metrics:
7 datasets × 4 forecast horizons × 2 metrics = 56 combinations in total, of which SegRNN achieved the best result in 50.
Ablation Studies
Segment-wise iterations vs. point-wise iterations.
PMF vs. RMF.
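For contrast, a sketch of RMF (recurrent multi-step forecasting), the baseline that PMF replaces: segments are predicted one at a time and fed back, so the decoder takes m sequential steps instead of one parallel step (reusing the hypothetical modules from the sketches above):

```python
def rmf_decode(dec, proj, h, last_seg):
    """RMF: m sequential GRU steps; each predicted segment is
    re-embedded and fed back as the next input."""
    preds, inp = [], proj(last_seg).unsqueeze(1)   # embed last input segment
    for _ in range(dec.m):
        out, h = dec.gru(inp, h)                   # one recurrent step
        seg = dec.head(out.squeeze(1))             # (batch, seg_len)
        preds.append(seg)
        inp = proj(seg).unsqueeze(1)               # feed the prediction back
    return torch.cat(preds, dim=1)                 # (batch, horizon)

y_rmf = rmf_decode(dec, enc.proj, h, torch.randn(32, 48))
```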
To recap, excessive point-wise iterations in RNNs for LTSF cause:
- heavy computational cost
- slow model training
- exploding or vanishing gradients