chenguolin/NuTime

[TMLR 2024] NuTime

NuTime: Numerically Multi-Scaled Embedding for Large-Scale Time-Series Pretraining

Chenguo Lin, Xumeng Wen, Wei Cao, Congrui Huang, Jiang Bian, Stephen Lin, Zhirong Wu

OpenReview arXiv License: MIT

[Figure: NuTime pipeline]

This repository contains the official implementation of the paper NuTime: Numerically Multi-Scaled Embedding for Large-Scale Time-Series Pretraining, accepted to TMLR 2024. In this work, we propose the NuTime model for large-scale time-series pretraining. The model is based on the Transformer architecture and takes as input a set of tokens from non-overlapping windows. Each window is represented by its normalized shape, its mean, and its standard deviation (std). We develop a numerically multi-scaled embedding (NME) method for representing the scalar values of the mean and std. As a result, the model can take raw time-series values at any numerical scale as input, without any data normalization or transformation.
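The window representation described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the repository's WindowNormEncoder: the function name `window_stats`, the `eps` term, the use of the population standard deviation, and the choice to drop a trailing partial window are all assumptions made for this example. It shows why the normalized shape does not depend on the numerical scale of the input, while the scale itself is carried by the mean and std.

```python
from statistics import fmean, pstdev


def window_stats(series, window_size, eps=1e-8):
    """Split `series` into non-overlapping windows.

    Returns one (normalized_shape, mean, std) tuple per window.
    Hypothetical helper for illustration; not part of the NuTime codebase.
    """
    tokens = []
    # Assumption of this sketch: a trailing partial window is dropped.
    for start in range(0, len(series) - window_size + 1, window_size):
        window = series[start:start + window_size]
        mean = fmean(window)
        std = pstdev(window)  # population std; an assumption of this sketch
        # eps guards against division by zero for constant windows.
        shape = [(v - mean) / (std + eps) for v in window]
        tokens.append((shape, mean, std))
    return tokens


# Two windows with the same shape but scales that differ by a factor of 100.
series = [1.0, 2.0, 3.0, 4.0, 100.0, 200.0, 300.0, 400.0]
tokens = window_stats(series, window_size=4)

for shape, mean, std in tokens:
    print([round(v, 3) for v in shape], mean, round(std, 3))
```

Both windows yield nearly identical normalized shapes, while their means and stds differ by two orders of magnitude. Those two scalars are the values that the NME is designed to embed.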

Feel free to contact me (chenguolin@stu.pku.edu.cn) or open an issue if you have any questions or suggestions.

📢 News

  • 2024-12-23: Check the latest repository under the Microsoft account: microsoft/NuTime.
  • 2024-11-12: Checkpoint of the self-supervised pretrained NuTime is released.
  • 2024-11-12: Code for data preprocessing, training, and evaluation is released.
  • 2024-07-15: Cleaning the entire codebase for release may take some time, so we first provide the code for the window, mean, and std embeddings, which is the essential part of the proposed NuTime.
  • 2024-07-10: NuTime is accepted to TMLR 2024.

📋 TODO

  • Release the training and evaluation code
  • Release the self-supervised pretrained NuTime

🔧 Installation

Please install PyTorch according to your CUDA version first. There are no restrictions on the torch version; feel free to use your preferred one.

git clone https://github.com/chenguolin/NuTime.git
cd NuTime
bash settings/setup.sh

📊 Dataset

Please refer to src/data/preprocess.py, which provides the script for preprocessing datasets including UCR, UEA, SleepEDF, Epilepsy, etc. As an example, the processed and split Epilepsy dataset is provided in datasets/Epilepsy.

🚀 Usage

  • The core of our work is WindowNormEncoder in src/models/encoders/WindowNormEncoder.py and WinT in src/models/networks.py. You can read the code directly for implementation details. The remaining code only covers data preprocessing, training, evaluation, and ablation studies, and can largely be ignored.

  • The checkpoint of the self-supervised (i.e., BYOL-style) pretrained NuTime (with 9 multi-scaled embeddings) is provided in ckpt/checkpoint_bias9.pth.

Fine-tune the pretrained NuTime on the Epilepsy dataset

python3 src/pipeline.py --config_file configs/demo_ft_epilepsy.json

📚 Citation

If you find our work helpful, please consider citing:

@article{lin2024nutime,
  title={NuTime: Numerically Multi-Scaled Embedding for Large-Scale Time-Series Pretraining},
  author={Chenguo Lin and Xumeng Wen and Wei Cao and Congrui Huang and Jiang Bian and Stephen Lin and Zhirong Wu},
  journal={Transactions on Machine Learning Research (TMLR)},
  year={2024}
}
