Lorenz System¶

Model Training CommandModel Evaluation CommandModel Export CommandModel Inference Command

# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_training_rk.hdf5 -P ./datasets/
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_valid_rk.hdf5 -P ./datasets/
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_training_rk.hdf5 --create-dirs -o ./datasets/lorenz_training_rk.hdf5
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_valid_rk.hdf5 --create-dirs -o ./datasets/lorenz_valid_rk.hdf5
python train_enn.py
python train_transformer.py

# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_training_rk.hdf5 -P ./datasets/
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_valid_rk.hdf5 -P ./datasets/
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_training_rk.hdf5 --create-dirs -o ./datasets/lorenz_training_rk.hdf5
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_valid_rk.hdf5 --create-dirs -o ./datasets/lorenz_valid_rk.hdf5
python train_enn.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/lorenz/lorenz_pretrained.pdparams
python train_transformer.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/lorenz/lorenz_transformer_pretrained.pdparams EMBEDDING_MODEL_PATH=https://paddle-org.bj.bcebos.com/paddlescience/models/lorenz/lorenz_pretrained.pdparams

python train_transformer.py mode=export EMBEDDING_MODEL_PATH=https://paddle-org.bj.bcebos.com/paddlescience/models/lorenz/lorenz_pretrained.pdparams

# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_training_rk.hdf5 -P ./datasets/
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_valid_rk.hdf5 -P ./datasets/
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_training_rk.hdf5 --create-dirs -o ./datasets/lorenz_training_rk.hdf5
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/lorenz_valid_rk.hdf5 --create-dirs -o ./datasets/lorenz_valid_rk.hdf5
python train_transformer.py mode=infer

Model	MSE
lorenz_transformer_pretrained.pdparams	0.054

1. Background Introduction¶

The Lorenz system, proposed by meteorologist Edward N. Lorenz in 1963, is a seminal model in chaos theory. It famously illustrates the "Butterfly Effect," where small changes in initial conditions can lead to vastly different outcomes—metaphorically, a butterfly flapping its wings in Brazil causing a tornado in Texas.

Mathematically, the Lorenz system describes atmospheric convection using a set of three ordinary differential equations. It exhibits chaotic behavior for certain parameter values, characterized by extreme sensitivity to initial conditions and long-term unpredictability. Due to this sensitivity, the Lorenz system serves as an excellent benchmark for evaluating the precision and stability of machine learning models in capturing complex dynamics.

2. Problem Definition¶

State equations of the Lorenz system:

\[ \begin{cases} \dfrac{\partial x}{\partial t} = \sigma(y - x), & \\ \dfrac{\partial y}{\partial t} = x(\rho - z) - y, & \\ \dfrac{\partial z}{\partial t} = xy - \beta z \end{cases} \]

When the parameters take the following values, the system exhibits classic chaotic characteristics:

\[\rho = 28, \sigma = 10, \beta = \frac{8}{3}\]

In this case, it is required to predict the trajectory of the point in the future period given the coordinates of the point at the initial moment.

3. Problem Solving¶

Next, we will explain how to solve this problem using deep learning methods based on PaddleScience code. This case is based on the method of the paper Transformers for Modeling Physical Systems. Next, we will first briefly introduce the theoretical method of this paper, then introduce the dataset used, and finally explain the construction of supervised constraints and model construction for the two training steps of this method (Embedding model training, Transformer model training), while other details please refer to API Documentation.

3.1 Method Introduction¶

While Transformers have revolutionized NLP and CV, their application to physical system modeling is relatively new. This example implements the method from Transformers for Modeling Physical Systems, which adapts the Transformer architecture for dynamical systems.

The approach involves two key components: 1. Embedding Model: An autoencoder structure. - Encoder: Maps physical state variables to a latent embedding space. - Decoder: Reconstructs physical states from the latent vectors. 2. Transformer Model: Operates within the latent space. It predicts the future latent state based on the current latent state (output of the Encoder).

Training Strategy: 1. Train the Embedding model to minimize reconstruction error. 2. Freeze the Embedding model and train the Transformer to predict dynamics in the latent space.

trphysx-arch — Left: Embedding network structure, Right: Transformer network structure

3.2 Dataset Introduction¶

The dataset uses data provided in Transformer-Physx. This dataset is obtained using the Runge-Kutta traditional numerical solution method, with a time step size of 0.01, and the initial position is randomly selected from the following range:

\[x_{0} \sim(-20, 20), y_{0} \sim(-20, 20), z_{0} \sim(10, 40)\]

The division of the dataset is as follows:

Dataset	Number of Time Series	Number of Time Steps	Download Link
Training Set	2048	256	lorenz_training_rk.hdf5
Validation Set	64	1024	lorenz_valid_rk.hdf5

The official website of the dataset is: https://zenodo.org/record/5148524#.ZDe77-xByrc

3.3 Embedding Model¶

We first define the key hyperparameters in the configuration file:

examples/conf/enn.yaml
output_dir: ${hydra:run.dir}
TRAIN_BLOCK_SIZE: 16
VALID_BLOCK_SIZE: 32
TRAIN_FILE_PATH: ./datasets/lorenz_training_rk.hdf5
VALID_FILE_PATH: ./datasets/lorenz_valid_rk.hdf5

# model settings
MODEL:
  input_keys: ["states"]

3.3.1 Constraint Construction¶

Since this is a data-driven task, we use SupervisedConstraint. First, we configure the data loader:

examples/lorenz/train_enn.py
train_dataloader_cfg = {
    "dataset": {
        "name": "LorenzDataset",
        "file_path": cfg.TRAIN_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.TRAIN_BLOCK_SIZE,
        "stride": 16,
        "weight_dict": {
            key: value for key, value in zip(cfg.MODEL.output_keys, weights)
        },
    },
    "sampler": {
        "name": "BatchSampler",
        "drop_last": True,
        "shuffle": True,
    },
    "batch_size": cfg.TRAIN.batch_size,
    "num_workers": 4,
}

Dataset: LorenzDataset handles loading the HDF5 data.
- block_size: Length of time sequence for training.
- stride: Step interval between samples.
Sampler: BatchSampler with shuffling enabled.

The code for defining supervised constraints is as follows:

examples/lorenz/train_enn.py
sup_constraint = ppsci.constraint.SupervisedConstraint(
    train_dataloader_cfg,
    ppsci.loss.MSELossWithL2Decay(
        regularization_dict={
            regularization_key: 1.0e-1 * (cfg.TRAIN_BLOCK_SIZE - 1)
        }
    ),
    {
        key: lambda out, k=key: out[k]
        for key in cfg.MODEL.output_keys + (regularization_key,)
    },
    name="Sup",
)
constraint = {sup_constraint.name: sup_constraint}

Dataloader: Uses train_dataloader_cfg.
Loss: MSELossWithL2Decay (MSE with L2 regularization).
Target: Model output.
Name: "Sup".

3.3.2 Model Construction¶

The Embedding model uses fully connected layers to map physical coordinates \((x, y, z)\) to and from the latent space.

lorenz_embedding — Embedding Network Model

Expressed in PaddleScience code as follows:

examples/lorenz/train_enn.py
data_mean, data_std = get_mean_std(sup_constraint.data_loader.dataset.data)
model = ppsci.arch.LorenzEmbedding(
    cfg.MODEL.input_keys,
    cfg.MODEL.output_keys + (regularization_key,),
    data_mean,
    data_std,
)

Among them, the first two parameters of LorenzEmbedding have been described above and will not be repeated here. The third and fourth parameters of the network model are the mean and variance of the training dataset, used to normalize the input data. The code for calculating the mean and variance is expressed as follows:

examples/lorenz/train_enn.py
def get_mean_std(data: np.ndarray):
    mean = np.asarray(
        [np.mean(data[:, :, 0]), np.mean(data[:, :, 1]), np.mean(data[:, :, 2])]
    ).reshape(1, 3)
    std = np.asarray(
        [np.std(data[:, :, 0]), np.std(data[:, :, 1]), np.std(data[:, :, 2])]
    ).reshape(1, 3)
    return mean, std

3.3.3 Learning Rate and Optimizer Construction¶

We use ExponentialDecay for the learning rate (initial lr=0.001) and the Adam optimizer with ClipGradByGlobalNorm for gradient clipping.

examples/lorenz/train_enn.py
# init optimizer and lr scheduler
clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=0.1)
lr_scheduler = ppsci.optimizer.lr_scheduler.ExponentialDecay(
    iters_per_epoch=ITERS_PER_EPOCH,
    decay_steps=ITERS_PER_EPOCH,
    **cfg.TRAIN.lr_scheduler,
)()
optimizer = ppsci.optimizer.Adam(
    lr_scheduler, grad_clip=clip, **cfg.TRAIN.optimizer
)(model)

3.3.4 Validator Construction¶

During the training process of this case, the training status of the current model will be evaluated using the validation set at certain training round intervals, and SupervisedValidator is needed to construct the validator. The code is as follows:

examples/lorenz/train_enn.py
eval_dataloader_cfg = {
    "dataset": {
        "name": "LorenzDataset",
        "file_path": cfg.VALID_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.VALID_BLOCK_SIZE,
        "stride": 32,
        "weight_dict": {
            key: value for key, value in zip(cfg.MODEL.output_keys, weights)
        },
    },
    "sampler": {
        "name": "BatchSampler",
        "drop_last": False,
        "shuffle": False,
    },
    "batch_size": cfg.EVAL.batch_size,
    "num_workers": 4,
}

mse_validator = ppsci.validate.SupervisedValidator(
    eval_dataloader_cfg,
    ppsci.loss.MSELoss(),
    metric={"MSE": ppsci.metric.MSE()},
    name="MSE_Validator",
)
validator = {mse_validator.name: mse_validator}

The SupervisedValidator validator is quite similar to SupervisedConstraint, the difference is that the validator needs to set evaluation metric metric, here ppsci.metric.MSE is used.

3.3.5 Model Training and Evaluation¶

After completing the above settings, you only need to pass the instantiated objects to ppsci.solver.Solver, and then start training and evaluation.

examples/lorenz/train_enn.py
solver = ppsci.solver.Solver(
    model,
    constraint,
    cfg.output_dir,
    optimizer,
    lr_scheduler,
    cfg.TRAIN.epochs,
    ITERS_PER_EPOCH,
    eval_during_train=True,
    validator=validator,
)
# train model
solver.train()
# evaluate after finished training
solver.eval()

3.4 Transformer Model¶

Having trained the Embedding model, we now train the Transformer model using the fixed Embedding model. The process is similar, so we focus on the differences.

examples/lorenz/conf/transformer.yaml
output_dir: ${hydra:run.dir}
log_freq: 20
TRAIN_BLOCK_SIZE: 64
VALID_BLOCK_SIZE: 256
TRAIN_FILE_PATH: ./datasets/lorenz_training_rk.hdf5
VALID_FILE_PATH: ./datasets/lorenz_valid_rk.hdf5

# set working condition

3.4.1 Constraint Construction¶

We again use SupervisedConstraint. The data loading configuration is:

examples/lorenz/train_transformer.py
train_dataloader_cfg = {
    "dataset": {
        "name": "LorenzDataset",
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "file_path": cfg.TRAIN_FILE_PATH,
        "block_size": cfg.TRAIN_BLOCK_SIZE,
        "stride": 64,
        "embedding_model": embedding_model,
    },
    "sampler": {
        "name": "BatchSampler",
        "drop_last": True,
        "shuffle": True,
    },
    "batch_size": cfg.TRAIN.batch_size,
    "num_workers": 4,
}

Note: The Transformer trains on data in the latent (encoding) space. We pass the pre-trained Embedding model to LorenzDataset to map the physical data to the encoding space during initialization.

The code for defining supervised constraints is as follows:

examples/lorenz/train_transformer.py
sup_constraint = ppsci.constraint.SupervisedConstraint(
    train_dataloader_cfg,
    ppsci.loss.MSELoss(),
    name="Sup",
)
constraint = {sup_constraint.name: sup_constraint}

3.4.2 Model Construction¶

The Transformer operates entirely within the latent encoding space.

lorenz_transformer — Transformer Network Model

Expressed in PaddleScience code as follows:

examples/lorenz/train_transformer.py
model = ppsci.arch.PhysformerGPT2(**cfg.MODEL)

In addition to filling in input_keys and output_keys, the class PhysformerGPT2 also needs to set the number of layers of the Transformer model num_layers, the context size num_ctx, the length of the input Embedding vector embed_size, and the parameter num_heads of the multi-head attention mechanism. The values filled in here are 4, 64, 32, 4.

3.4.3 Learning Rate and Optimizer Construction¶

The learning rate method used in this case is CosineWarmRestarts, and the learning rate size is set to 0.001. The optimizer uses Adam, and gradient clipping uses the ClipGradByGlobalNorm method built in Paddle. Expressed in PaddleScience code as follows:

examples/lorenz/train_transformer.py
clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=0.1)
lr_scheduler = ppsci.optimizer.lr_scheduler.CosineWarmRestarts(
    iters_per_epoch=ITERS_PER_EPOCH, **cfg.TRAIN.lr_scheduler
)()
optimizer = ppsci.optimizer.Adam(
    lr_scheduler, grad_clip=clip, **cfg.TRAIN.optimizer
)(model)

3.4.4 Validator Construction¶

During the training process, the training status of the current model will be evaluated using the validation set at certain training round intervals, and SupervisedValidator is needed to construct the validator. Expressed in PaddleScience code as follows:

examples/lorenz/train_transformer.py
eval_dataloader_cfg = {
    "dataset": {
        "name": "LorenzDataset",
        "file_path": cfg.VALID_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.VALID_BLOCK_SIZE,
        "stride": 1024,
        "embedding_model": embedding_model,
    },
    "sampler": {
        "name": "BatchSampler",
        "drop_last": False,
        "shuffle": False,
    },
    "batch_size": cfg.EVAL.batch_size,
    "num_workers": 4,
}

mse_validator = ppsci.validate.SupervisedValidator(
    eval_dataloader_cfg,
    ppsci.loss.MSELoss(),
    metric={"MSE": ppsci.metric.MSE()},
    name="MSE_Validator",
)
validator = {mse_validator.name: mse_validator}

3.4.5 Visualizer Construction¶

To visualize results, we must map the Transformer's latent output back to physical space using the Decoder. We define an OutputTransform for this purpose:

examples/lorenz/train_transformer.py
def build_embedding_model(embedding_model_path: str) -> ppsci.arch.LorenzEmbedding:
    input_keys = ("states",)
    output_keys = ("pred_states", "recover_states")
    regularization_key = "k_matrix"
    model = ppsci.arch.LorenzEmbedding(input_keys, output_keys + (regularization_key,))
    save_load.load_pretrain(model, embedding_model_path)
    return model


class OutputTransform(object):
    def __init__(self, model: base.Arch):
        self.model = model
        self.model.eval()

    def __call__(self, x: Dict[str, paddle.Tensor]):
        pred_embeds = x["pred_embeds"]
        pred_states = self.model.decoder(pred_embeds)

        return pred_states

examples/lorenz/train_transformer.py
embedding_model = build_embedding_model(cfg.EMBEDDING_MODEL_PATH)
output_transform = OutputTransform(embedding_model)

OutputTransform loads the Embedding model and decodes the latent vectors. We then construct the visualizer:

examples/lorenz/train_transformer.py
states = mse_validator.data_loader.dataset.data
embedding_data = mse_validator.data_loader.dataset.embedding_data
vis_data = {
    "embeds": embedding_data[: cfg.VIS_DATA_NUMS, :-1, :],
    "states": states[: cfg.VIS_DATA_NUMS, 1:, :],
}

visualizer = {
    "visualize_states": ppsci.visualize.VisualizerScatter3D(
        vis_data,
        {
            "pred_states": lambda d: output_transform(d),
            "states": lambda d: d["states"],
        },
        num_timestamps=1,
        prefix="result_states",
    )
}

First, use the dataset in mse_validator above for visualization, and introduce the vis_data_nums variable to control the number of samples to be visualized. Finally, construct the visualizer through VisualizerScatter3D.

3.4.6 Model Training, Evaluation and Visualization¶

After completing the above settings, you only need to pass the instantiated objects to ppsci.solver.Solver, and then start training and evaluation.

examples/lorenz/train_transformer.py
solver = ppsci.solver.Solver(
    model,
    constraint,
    cfg.output_dir,
    optimizer,
    lr_scheduler,
    cfg.TRAIN.epochs,
    ITERS_PER_EPOCH,
    eval_during_train=cfg.TRAIN.eval_during_train,
    eval_freq=cfg.TRAIN.eval_freq,
    validator=validator,
    visualizer=visualizer,
)
# train model
solver.train()
# evaluate after finished training
solver.eval()
# visualize prediction after finished training
solver.visualize()

4. Complete Code¶

lorenz/train_enn.py
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Two-stage training
# 1. Train a embedding model by running train_enn.py.
# 2. Load pretrained embedding model and freeze it, then train a transformer model by running train_transformer.py.

# This file is for step1: training a embedding model.
# This file is based on PaddleScience/ppsci API.
from os import path as osp

import hydra
import numpy as np
import paddle
from omegaconf import DictConfig

import ppsci
from ppsci.utils import logger


def get_mean_std(data: np.ndarray):
    mean = np.asarray(
        [np.mean(data[:, :, 0]), np.mean(data[:, :, 1]), np.mean(data[:, :, 2])]
    ).reshape(1, 3)
    std = np.asarray(
        [np.std(data[:, :, 0]), np.std(data[:, :, 1]), np.std(data[:, :, 2])]
    ).reshape(1, 3)
    return mean, std


def train(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    weights = (1.0 * (cfg.TRAIN_BLOCK_SIZE - 1), 1.0e4 * cfg.TRAIN_BLOCK_SIZE)
    regularization_key = "k_matrix"
    # manually build constraint(s)
    train_dataloader_cfg = {
        "dataset": {
            "name": "LorenzDataset",
            "file_path": cfg.TRAIN_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.TRAIN_BLOCK_SIZE,
            "stride": 16,
            "weight_dict": {
                key: value for key, value in zip(cfg.MODEL.output_keys, weights)
            },
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": True,
            "shuffle": True,
        },
        "batch_size": cfg.TRAIN.batch_size,
        "num_workers": 4,
    }

    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELossWithL2Decay(
            regularization_dict={
                regularization_key: 1.0e-1 * (cfg.TRAIN_BLOCK_SIZE - 1)
            }
        ),
        {
            key: lambda out, k=key: out[k]
            for key in cfg.MODEL.output_keys + (regularization_key,)
        },
        name="Sup",
    )
    constraint = {sup_constraint.name: sup_constraint}

    # set iters_per_epoch by dataloader length
    ITERS_PER_EPOCH = len(sup_constraint.data_loader)

    # manually init model
    data_mean, data_std = get_mean_std(sup_constraint.data_loader.dataset.data)
    model = ppsci.arch.LorenzEmbedding(
        cfg.MODEL.input_keys,
        cfg.MODEL.output_keys + (regularization_key,),
        data_mean,
        data_std,
    )

    # init optimizer and lr scheduler
    clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=0.1)
    lr_scheduler = ppsci.optimizer.lr_scheduler.ExponentialDecay(
        iters_per_epoch=ITERS_PER_EPOCH,
        decay_steps=ITERS_PER_EPOCH,
        **cfg.TRAIN.lr_scheduler,
    )()
    optimizer = ppsci.optimizer.Adam(
        lr_scheduler, grad_clip=clip, **cfg.TRAIN.optimizer
    )(model)

    # manually build validator
    weights = (1.0 * (cfg.VALID_BLOCK_SIZE - 1), 1.0e4 * cfg.VALID_BLOCK_SIZE)
    eval_dataloader_cfg = {
        "dataset": {
            "name": "LorenzDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.VALID_BLOCK_SIZE,
            "stride": 32,
            "weight_dict": {
                key: value for key, value in zip(cfg.MODEL.output_keys, weights)
            },
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": False,
        },
        "batch_size": cfg.EVAL.batch_size,
        "num_workers": 4,
    }

    mse_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        metric={"MSE": ppsci.metric.MSE()},
        name="MSE_Validator",
    )
    validator = {mse_validator.name: mse_validator}

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        constraint,
        cfg.output_dir,
        optimizer,
        lr_scheduler,
        cfg.TRAIN.epochs,
        ITERS_PER_EPOCH,
        eval_during_train=True,
        validator=validator,
    )
    # train model
    solver.train()
    # evaluate after finished training
    solver.eval()


def evaluate(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    weights = (1.0 * (cfg.TRAIN_BLOCK_SIZE - 1), 1.0e4 * cfg.TRAIN_BLOCK_SIZE)
    regularization_key = "k_matrix"
    # manually build constraint(s)
    train_dataloader_cfg = {
        "dataset": {
            "name": "LorenzDataset",
            "file_path": cfg.TRAIN_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.TRAIN_BLOCK_SIZE,
            "stride": 16,
            "weight_dict": {
                key: value for key, value in zip(cfg.MODEL.output_keys, weights)
            },
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": True,
            "shuffle": True,
        },
        "batch_size": cfg.TRAIN.batch_size,
        "num_workers": 4,
    }

    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELossWithL2Decay(
            regularization_dict={
                regularization_key: 1.0e-1 * (cfg.TRAIN_BLOCK_SIZE - 1)
            }
        ),
        {
            key: lambda out, k=key: out[k]
            for key in cfg.MODEL.output_keys + (regularization_key,)
        },
        name="Sup",
    )

    # manually init model
    data_mean, data_std = get_mean_std(sup_constraint.data_loader.dataset.data)
    model = ppsci.arch.LorenzEmbedding(
        cfg.MODEL.input_keys,
        cfg.MODEL.output_keys + (regularization_key,),
        data_mean,
        data_std,
    )

    # manually build validator
    weights = (1.0 * (cfg.VALID_BLOCK_SIZE - 1), 1.0e4 * cfg.VALID_BLOCK_SIZE)
    eval_dataloader_cfg = {
        "dataset": {
            "name": "LorenzDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.VALID_BLOCK_SIZE,
            "stride": 32,
            "weight_dict": {
                key: value for key, value in zip(cfg.MODEL.output_keys, weights)
            },
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": False,
        },
        "batch_size": cfg.EVAL.batch_size,
        "num_workers": 4,
    }

    mse_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        metric={"MSE": ppsci.metric.MSE()},
        name="MSE_Validator",
    )
    validator = {mse_validator.name: mse_validator}

    solver = ppsci.solver.Solver(
        model,
        output_dir=cfg.output_dir,
        validator=validator,
        pretrained_model_path=cfg.EVAL.pretrained_model_path,
    )
    solver.eval()


@hydra.main(version_base=None, config_path="./conf", config_name="enn.yaml")
def main(cfg: DictConfig):
    if cfg.mode == "train":
        train(cfg)
    elif cfg.mode == "eval":
        evaluate(cfg)
    else:
        raise ValueError(f"cfg.mode should in ['train', 'eval'], but got '{cfg.mode}'")


if __name__ == "__main__":
    main()

lorenz/train_transformer.py
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Two-stage training
# 1. Train a embedding model by running train_enn.py.
# 2. Load pretrained embedding model and freeze it, then train a transformer model by running train_transformer.py.

# This file is for step2: training a transformer model, based on frozen pretrained embedding model.
# This file is based on PaddleScience/ppsci API.
from os import path as osp
from typing import Dict

import hydra
import paddle
from omegaconf import DictConfig

import ppsci
from ppsci.arch import base
from ppsci.utils import logger
from ppsci.utils import save_load


def build_embedding_model(embedding_model_path: str) -> ppsci.arch.LorenzEmbedding:
    input_keys = ("states",)
    output_keys = ("pred_states", "recover_states")
    regularization_key = "k_matrix"
    model = ppsci.arch.LorenzEmbedding(input_keys, output_keys + (regularization_key,))
    save_load.load_pretrain(model, embedding_model_path)
    return model


class OutputTransform(object):
    def __init__(self, model: base.Arch):
        self.model = model
        self.model.eval()

    def __call__(self, x: Dict[str, paddle.Tensor]):
        pred_embeds = x["pred_embeds"]
        pred_states = self.model.decoder(pred_embeds)

        return pred_states


def train(cfg: DictConfig):
    # train time-series: 2048    time-steps: 256    block-size: 64  stride: 64
    # valid time-series: 64      time-steps: 1024   block-size: 256 stride: 1024
    # test  time-series: 256     time-steps: 1024
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    embedding_model = build_embedding_model(cfg.EMBEDDING_MODEL_PATH)
    output_transform = OutputTransform(embedding_model)

    # manually build constraint(s)
    train_dataloader_cfg = {
        "dataset": {
            "name": "LorenzDataset",
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "file_path": cfg.TRAIN_FILE_PATH,
            "block_size": cfg.TRAIN_BLOCK_SIZE,
            "stride": 64,
            "embedding_model": embedding_model,
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": True,
            "shuffle": True,
        },
        "batch_size": cfg.TRAIN.batch_size,
        "num_workers": 4,
    }

    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELoss(),
        name="Sup",
    )
    constraint = {sup_constraint.name: sup_constraint}

    # set iters_per_epoch by dataloader length
    ITERS_PER_EPOCH = len(constraint["Sup"].data_loader)

    # manually init model
    model = ppsci.arch.PhysformerGPT2(**cfg.MODEL)

    # init optimizer and lr scheduler
    clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=0.1)
    lr_scheduler = ppsci.optimizer.lr_scheduler.CosineWarmRestarts(
        iters_per_epoch=ITERS_PER_EPOCH, **cfg.TRAIN.lr_scheduler
    )()
    optimizer = ppsci.optimizer.Adam(
        lr_scheduler, grad_clip=clip, **cfg.TRAIN.optimizer
    )(model)

    # manually build validator
    eval_dataloader_cfg = {
        "dataset": {
            "name": "LorenzDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.VALID_BLOCK_SIZE,
            "stride": 1024,
            "embedding_model": embedding_model,
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": False,
        },
        "batch_size": cfg.EVAL.batch_size,
        "num_workers": 4,
    }

    mse_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        metric={"MSE": ppsci.metric.MSE()},
        name="MSE_Validator",
    )
    validator = {mse_validator.name: mse_validator}

    # set visualizer(optional)
    states = mse_validator.data_loader.dataset.data
    embedding_data = mse_validator.data_loader.dataset.embedding_data
    vis_data = {
        "embeds": embedding_data[: cfg.VIS_DATA_NUMS, :-1, :],
        "states": states[: cfg.VIS_DATA_NUMS, 1:, :],
    }

    visualizer = {
        "visualize_states": ppsci.visualize.VisualizerScatter3D(
            vis_data,
            {
                "pred_states": lambda d: output_transform(d),
                "states": lambda d: d["states"],
            },
            num_timestamps=1,
            prefix="result_states",
        )
    }

    solver = ppsci.solver.Solver(
        model,
        constraint,
        cfg.output_dir,
        optimizer,
        lr_scheduler,
        cfg.TRAIN.epochs,
        ITERS_PER_EPOCH,
        eval_during_train=cfg.TRAIN.eval_during_train,
        eval_freq=cfg.TRAIN.eval_freq,
        validator=validator,
        visualizer=visualizer,
    )
    # train model
    solver.train()
    # evaluate after finished training
    solver.eval()
    # visualize prediction after finished training
    solver.visualize()


def evaluate(cfg: DictConfig):
    # directly evaluate pretrained model(optional)
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    embedding_model = build_embedding_model(cfg.EMBEDDING_MODEL_PATH)
    output_transform = OutputTransform(embedding_model)

    # manually init model
    model = ppsci.arch.PhysformerGPT2(**cfg.MODEL)

    # manually build validator
    eval_dataloader_cfg = {
        "dataset": {
            "name": "LorenzDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.VALID_BLOCK_SIZE,
            "stride": 1024,
            "embedding_model": embedding_model,
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": False,
        },
        "batch_size": cfg.EVAL.batch_size,
        "num_workers": 4,
    }

    mse_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        metric={"MSE": ppsci.metric.MSE()},
        name="MSE_Validator",
    )
    validator = {mse_validator.name: mse_validator}

    # set visualizer(optional)
    states = mse_validator.data_loader.dataset.data
    embedding_data = mse_validator.data_loader.dataset.embedding_data
    vis_datas = {
        "embeds": embedding_data[: cfg.VIS_DATA_NUMS, :-1, :],
        "states": states[: cfg.VIS_DATA_NUMS, 1:, :],
    }

    visualizer = {
        "visulzie_states": ppsci.visualize.VisualizerScatter3D(
            vis_datas,
            {
                "pred_states": lambda d: output_transform(d),
                "states": lambda d: d["states"],
            },
            num_timestamps=1,
            prefix="result_states",
        )
    }

    solver = ppsci.solver.Solver(
        model,
        output_dir=cfg.output_dir,
        validator=validator,
        visualizer=visualizer,
        pretrained_model_path=cfg.EVAL.pretrained_model_path,
    )
    solver.eval()
    # visualize prediction for pretrained model(optional)
    solver.visualize()


def export(cfg: DictConfig):
    # set model
    embedding_model = build_embedding_model(cfg.EMBEDDING_MODEL_PATH)
    model_cfg = {
        **cfg.MODEL,
        "embedding_model": embedding_model,
        "input_keys": ["states"],
        "output_keys": ["pred_states"],
    }
    model = ppsci.arch.PhysformerGPT2(**model_cfg)

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        pretrained_model_path=cfg.INFER.pretrained_model_path,
    )
    # export model
    from paddle.static import InputSpec

    input_spec = [
        {
            key: InputSpec([None, 255, 3], "float32", name=key)
            for key in model.input_keys
        },
    ]

    solver.export(input_spec, cfg.INFER.export_path)


def inference(cfg: DictConfig):
    from deploy.python_infer import pinn_predictor

    predictor = pinn_predictor.PINNPredictor(cfg)

    dataset_cfg = {
        "name": "LorenzDataset",
        "file_path": cfg.VALID_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.VALID_BLOCK_SIZE,
        "stride": 1024,
    }

    dataset = ppsci.data.dataset.build_dataset(dataset_cfg)

    input_dict = {
        "states": dataset.data[: cfg.VIS_DATA_NUMS, :-1, :],
    }
    output_dict = predictor.predict(input_dict, cfg.INFER.batch_size)

    # mapping data to cfg.INFER.output_keys
    output_keys = ["pred_states"]
    output_dict = {
        store_key: output_dict[infer_key]
        for store_key, infer_key in zip(output_keys, output_dict.keys())
    }

    input_dict = {
        "states": dataset.data[: cfg.VIS_DATA_NUMS, 1:, :],
    }

    data_dict = {**input_dict, **output_dict}
    for i in range(cfg.VIS_DATA_NUMS):
        ppsci.visualize.save_plot_from_3d_dict(
            f"./lorenz_transformer_pred_{i}",
            {key: value[i] for key, value in data_dict.items()},
            ("states", "pred_states"),
        )


@hydra.main(version_base=None, config_path="./conf", config_name="transformer.yaml")
def main(cfg: DictConfig):
    if cfg.mode == "train":
        train(cfg)
    elif cfg.mode == "eval":
        evaluate(cfg)
    elif cfg.mode == "export":
        export(cfg)
    elif cfg.mode == "infer":
        inference(cfg)
    else:
        raise ValueError(
            f"cfg.mode should in ['train', 'eval', 'export', 'infer'], but got '{cfg.mode}'"
        )


if __name__ == "__main__":
    main()

5. Result Display¶

The plots below compare the model's predictions with the numerical ground truth for two different initial conditions.

result_states0 — Model prediction results ("pred_states") vs traditional numerical differentiation results ("states")

result_states1 — Model prediction results ("pred_states") vs traditional numerical differentiation results ("states")