Skip to content

Rossler System

AI Studio Quick Experience

# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_training.hdf5 -P ./datasets/
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_valid.hdf5 -P ./datasets/
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_training.hdf5 --create-dirs -o ./datasets/rossler_training.hdf5
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_valid.hdf5 --create-dirs -o ./datasets/rossler_valid.hdf5
python train_enn.py
python train_transformer.py
# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_training.hdf5 -P ./datasets/
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_valid.hdf5 -P ./datasets/
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_training.hdf5 --create-dirs -o ./datasets/rossler_training.hdf5
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_valid.hdf5 --create-dirs -o ./datasets/rossler_valid.hdf5
python train_enn.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/rossler/rossler_pretrained.pdparams
python train_transformer.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/rossler/rossler_transformer_pretrained.pdparams EMBEDDING_MODEL_PATH=https://paddle-org.bj.bcebos.com/paddlescience/models/rossler/rossler_pretrained.pdparams
python train_transformer.py mode=export EMBEDDING_MODEL_PATH=https://paddle-org.bj.bcebos.com/paddlescience/models/rossler/rossler_pretrained.pdparams
# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_training.hdf5 -P ./datasets/
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_valid.hdf5 -P ./datasets/
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_training.hdf5 --create-dirs -o ./datasets/rossler_training.hdf5
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer_physx/rossler_valid.hdf5 --create-dirs -o ./datasets/rossler_valid.hdf5
python train_transformer.py mode=infer
Model MSE
rossler_transformer_pretrained.pdparams 0.022

1. Background Introduction

The Rossler System, first proposed by German scientist Rossler, is also a common chaotic system. This system plays an important role in the study of chaos theory, providing a mathematical description and understanding method for chaotic phenomena. At the same time, because the system is extremely sensitive to numerical disturbances, it is also a good benchmark for evaluating the accuracy of machine learning (deep learning) models.

2. Problem Definition

The state equations of the Rossler system:

\[ \begin{cases} \dfrac{\partial x}{\partial t} = -\omega y - z, & \\ \dfrac{\partial y}{\partial t} = \omega x + \alpha y, & \\ \dfrac{\partial z}{\partial t} = \beta + z(x - \gamma) \end{cases} \]

When the parameters take the following values, the system exhibits classic chaotic characteristics:

\[\omega = 1.0, \alpha = 0.165, \beta = 0.2, \gamma = 10\]

In this case, given the coordinates of the point at the initial time, predict the movement trajectory of the point in the future period.

3. Problem Solving

Next, we will explain how to solve this problem using deep learning methods based on PaddleScience code. This case is solved based on the method in the paper Transformers for Modeling Physical Systems. For the theoretical part of this method, please refer to this document or original paper. Next, the dataset used will be introduced first, and then the supervised constraint construction and model construction of the two training steps of this method (Embedding model training, Transformer model training) will be explained. For other details, please refer to API Documentation.

3.1 Dataset Introduction

The dataset uses data provided in Transformer-Physx. This dataset is obtained using the traditional Runge-Kutta numerical solution method. The dataset is divided as follows:

Dataset Number of Time Series Number of Time Steps Download Address
Training Set 256 1025 rossler_training.hdf5
Validation Set 32 1025 rossler_valid.hdf5

Dataset official website: https://zenodo.org/record/5148524#.ZDe77-xByrc

3.2 Embedding Model

First, show the various parameter variables defined in the code. The specific meaning of each parameter will be explained when used below.

examples/rossler/conf/enn.yaml
# general settings
mode: train # running mode: train/eval
seed: 6
output_dir: ${hydra:run.dir}
TRAIN_BLOCK_SIZE: 16
VALID_BLOCK_SIZE: 32
TRAIN_FILE_PATH: ./datasets/rossler_training.hdf5
VALID_FILE_PATH: ./datasets/rossler_valid.hdf5

# model settings
MODEL:
  input_keys: ["states"]

3.2.1 Constraint Construction

This case solves the problem based on data-driven methods, so it is necessary to use SupervisedConstraint built in PaddleScience to construct supervised constraints. Before defining constraints, you need to first specify various parameters used for data loading in supervised constraints. The code is as follows:

examples/rossler/train_enn.py
train_dataloader_cfg = {
    "dataset": {
        "name": "RosslerDataset",
        "file_path": cfg.TRAIN_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.TRAIN_BLOCK_SIZE,
        "stride": 16,
        "weight_dict": {
            key: value for key, value in zip(cfg.MODEL.output_keys, weights)
        },
    },
    "sampler": {
        "name": "BatchSampler",
        "drop_last": True,
        "shuffle": True,
    },
    "batch_size": cfg.TRAIN.batch_size,
    "num_workers": 4,
}

Among them, the "dataset" field defines the used Dataset class name as RosslerDataset, and also specifies the values of parameters when initializing this class:

  1. file_path: Represents the file path of the training dataset, specified as the value of variable train_file_path;
  2. input_keys: Represents the variable name of model input data, here fill in variable input_keys;
  3. label_keys: Represents the variable name of true label, here fill in variable output_keys;
  4. block_size: Represents how long the time step is used for training, specified as the value of variable train_block_size;
  5. stride: Represents the time step interval between two consecutive training samples, specified as 16;
  6. weight_dict: Represents the weight of the loss function between model output variables and true labels, generated here using output_keys and weights.

The "sampler" field defines the used Sampler class name as BatchSampler, and also specifies that the parameters drop_last and shuffle are both True when initializing this class.

train_dataloader_cfg also defines the values of batch_size and num_workers.

The code for defining supervised constraints is as follows:

examples/rossler/train_enn.py
sup_constraint = ppsci.constraint.SupervisedConstraint(
    train_dataloader_cfg,
    ppsci.loss.MSELossWithL2Decay(
        regularization_dict={regularization_key: 1e-1 * (cfg.TRAIN_BLOCK_SIZE - 1)}
    ),
    {
        key: lambda out, k=key: out[k]
        for key in cfg.MODEL.output_keys + (regularization_key,)
    },
    name="Sup",
)

The first parameter of SupervisedConstraint is the data loading method, here train_dataloader_cfg defined above is used;

The second parameter is the definition of loss function, here MSELoss with L2Decay is used, class name is MSELossWithL2Decay, regularization_dict sets the variable name and corresponding weight of regularization;

The third parameter indicates how to calculate the intermediate variables that need to be constrained during training. Here, the variable we constrain is the output of the network;

The fourth parameter is the name of the constraint condition, which is convenient for subsequent indexing. Here it is named "Sup".

3.2.2 Model Construction

In this case, the input and output of the Embedding model are the position coordinates \((x, y, z)\) of points in physical space. A fully connected layer is used to implement the Embedding model, as shown in the figure below.

rossler_embedding

Embedding Network Model

Expressed in PaddleScience code as follows:

examples/rossler/train_enn.py
# manually init model
data_mean, data_std = get_mean_std(sup_constraint.data_loader.dataset.data)
model = ppsci.arch.RosslerEmbedding(
    cfg.MODEL.input_keys,
    cfg.MODEL.output_keys + (regularization_key,),
    data_mean,
    data_std,
)

Among them, the first two parameters of RosslerEmbedding have been described above and will not be repeated here. The third and fourth parameters of the network model are the mean and variance of the training dataset, which are used to normalize the input data. The code for calculating mean and variance is expressed as follows:

examples/rossler/train_enn.py
def get_mean_std(data: np.ndarray):
    mean = np.asarray(
        [np.mean(data[:, :, 0]), np.mean(data[:, :, 1]), np.min(data[:, :, 2])]
    ).reshape(1, 3)
    std = np.asarray(
        [
            np.std(data[:, :, 0]),
            np.std(data[:, :, 1]),
            np.max(data[:, :, 2]) - np.min(data[:, :, 2]),
        ]
    ).reshape(1, 3)
    return mean, std

3.2.3 Learning Rate and Optimizer Construction

The learning rate method used in this case is ExponentialDecay, and the learning rate size is set to 0.001. The optimizer uses Adam, and gradient clipping uses Paddle's built-in ClipGradByGlobalNorm method. Expressed in PaddleScience code as follows

examples/rossler/train_enn.py
# init optimizer and lr scheduler
clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=0.1)
lr_scheduler = ppsci.optimizer.lr_scheduler.ExponentialDecay(
    iters_per_epoch=ITERS_PER_EPOCH,
    decay_steps=ITERS_PER_EPOCH,
    **cfg.TRAIN.lr_scheduler,
)()
optimizer = ppsci.optimizer.Adam(
    lr_scheduler, grad_clip=clip, **cfg.TRAIN.optimizer
)(model)

3.2.4 Validator Construction

During the training process of this case, the training status of the current model will be evaluated using the validation set at certain training round intervals, and SupervisedValidator is needed to construct the validator. The code is as follows:

examples/rossler/train_enn.py
eval_dataloader_cfg = {
    "dataset": {
        "name": "RosslerDataset",
        "file_path": cfg.VALID_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.VALID_BLOCK_SIZE,
        "stride": 32,
        "weight_dict": {
            key: value for key, value in zip(cfg.MODEL.output_keys, weights)
        },
    },
    "sampler": {
        "name": "BatchSampler",
        "drop_last": False,
        "shuffle": False,
    },
    "batch_size": cfg.EVAL.batch_size,
    "num_workers": 4,
}

mse_validator = ppsci.validate.SupervisedValidator(
    eval_dataloader_cfg,
    ppsci.loss.MSELoss(),
    metric={"MSE": ppsci.metric.MSE()},
    name="MSE_Validator",
)
validator = {mse_validator.name: mse_validator}

The SupervisedValidator validator is similar to SupervisedConstraint, the difference is that the validator needs to set the evaluation metric metric, here ppsci.metric.MSE is used.

3.2.5 Model Training and Evaluation

After completing the above settings, you only need to pass the instantiated objects to ppsci.solver.Solver, and then start training and evaluation.

examples/rossler/train_enn.py
solver = ppsci.solver.Solver(
    model,
    constraint,
    cfg.output_dir,
    optimizer,
    lr_scheduler,
    cfg.TRAIN.epochs,
    ITERS_PER_EPOCH,
    eval_during_train=True,
    validator=validator,
)
# train model
solver.train()
# evaluate after finished training
solver.eval()

3.3 Transformer Model

The previous section introduced how to construct the training and evaluation of the Embedding model. In this section, we will introduce how to use the trained Embedding model to train the Transformer model. Because the steps for training the Transformer model are basically similar to the steps for training the Embedding model, the various parameters in the repeated parts of the two are not introduced in detail in this section. First, the various parameter variables defined in the code are shown below, and the specific meaning of each parameter will be explained when used below.

examples/rossler/conf/transformer.yaml
          - EVAL.pretrained_model_path
          - mode
          - output_dir
          - log_freq
          - EMBEDDING_MODEL_PATH
  sweep:
    # output directory for multirun
    dir: ${hydra.run.dir}
    subdir: ./

# general settings
mode: train # running mode: train/eval

3.3.1 Constraint Construction

The Transformer model also solves problems based on data-driven methods, so it is necessary to use SupervisedConstraint built in PaddleScience to construct supervised constraints. Before defining constraints, you need to first specify various parameters used for data loading in supervised constraints. The code is as follows:

examples/rossler/train_transformer.py
# manually build constraint(s)
train_dataloader_cfg = {
    "dataset": {
        "name": "RosslerDataset",
        "file_path": cfg.TRAIN_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.TRAIN_BLOCK_SIZE,
        "stride": 16,
        "embedding_model": embedding_model,
    },
    "sampler": {
        "name": "BatchSampler",
        "drop_last": True,
        "shuffle": True,
    },
    "batch_size": cfg.TRAIN.batch_size,
    "num_workers": 4,
}

The various parameters for data loading are basically consistent with those in the Embedding model and will not be repeated. It should be noted that since the input data for Transformer model training is the output data of the Encoder module of the Embedding model, we use the trained Embedding model as a parameter of RosslerDataset, and first map the training data to the coding space during initialization.

The code for defining supervised constraints is as follows:

examples/rossler/train_transformer.py
sup_constraint = ppsci.constraint.SupervisedConstraint(
    train_dataloader_cfg,
    ppsci.loss.MSELoss(),
    name="Sup",
)
constraint = {sup_constraint.name: sup_constraint}

3.3.2 Model Construction

In this case, the input and output of the Transformer model are vectors in the coding space. The Transformer structure used is as follows:

rossler_transformer

Transformer Network Model

Expressed in PaddleScience code as follows:

examples/rossler/train_transformer.py
model = ppsci.arch.PhysformerGPT2(**cfg.MODEL)

In addition to filling in input_keys and output_keys, the class PhysformerGPT2 also needs to set the number of layers of the Transformer model num_layers, the size of the context num_ctx, the length of the input Embedding vector embed_size, and the parameter of the multi-head attention mechanism num_heads. The values filled in here are 4, 64, 32, 4.

3.3.3 Learning Rate and Optimizer Construction

The learning rate method used in this case is CosineWarmRestarts, and the learning rate size is set to 0.001. The optimizer uses Adam, and gradient clipping uses Paddle's built-in ClipGradByGlobalNorm method. Expressed in PaddleScience code as follows:

examples/rossler/train_transformer.py
# init optimizer and lr scheduler
clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=0.1)
lr_scheduler = ppsci.optimizer.lr_scheduler.CosineWarmRestarts(
    iters_per_epoch=ITERS_PER_EPOCH, **cfg.TRAIN.lr_scheduler
)()
optimizer = ppsci.optimizer.Adam(
    lr_scheduler, grad_clip=clip, **cfg.TRAIN.optimizer
)(model)

3.3.4 Validator Construction

During the training process, the training status of the current model will be evaluated using the validation set at certain training round intervals, and SupervisedValidator is needed to construct the validator. Expressed in PaddleScience code as follows:

examples/rossler/train_transformer.py
eval_dataloader_cfg = {
    "dataset": {
        "name": "RosslerDataset",
        "file_path": cfg.VALID_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.VALID_BLOCK_SIZE,
        "stride": 1024,
        "embedding_model": embedding_model,
    },
    "sampler": {
        "name": "BatchSampler",
        "drop_last": False,
        "shuffle": False,
    },
    "batch_size": cfg.EVAL.batch_size,
    "num_workers": 4,
}

mse_validator = ppsci.validate.SupervisedValidator(
    eval_dataloader_cfg,
    ppsci.loss.MSELoss(),
    metric={"MSE": ppsci.metric.MSE()},
    name="MSE_Validator",
)
validator = {mse_validator.name: mse_validator}

3.3.5 Visualizer Construction

In this case, the visualizer can be constructed to visualize the evaluation results during model evaluation. Since the output data of the Transformer model is the predicted data in the coding space and cannot be directly visualized, it is necessary to additionally transform the output data to the physical state space using the Decoder module of the Embedding network.

In this paper, the code for transforming the output data of the Transformer model to the physical state space is defined first:

examples/rossler/train_transformer.py
def build_embedding_model(embedding_model_path: str) -> ppsci.arch.RosslerEmbedding:
    input_keys = ("states",)
    output_keys = ("pred_states", "recover_states")
    regularization_key = "k_matrix"
    model = ppsci.arch.RosslerEmbedding(input_keys, output_keys + (regularization_key,))
    save_load.load_pretrain(model, embedding_model_path)
    return model


class OutputTransform(object):
    def __init__(self, model: base.Arch):
        self.model = model
        self.model.eval()

    def __call__(self, x: Dict[str, paddle.Tensor]):
        pred_embeds = x["pred_embeds"]
        pred_states = self.model.decoder(pred_embeds)

        return pred_states
examples/rossler/train_transformer.py
# manually build constraint(s)

It can be seen that the program first loads the trained Embedding model, and then implements the transformation from the encoding vector to the physical state space in the __call__ function of OutputTransform.

After defining the above code, you can implement the construction of the visualizer code:

examples/rossler/train_transformer.py
# set visualizer(optional)
states = mse_validator.data_loader.dataset.data
embedding_data = mse_validator.data_loader.dataset.embedding_data
vis_data = {
    "embeds": embedding_data[: cfg.VIS_DATA_NUMS, :-1, :],
    "states": states[: cfg.VIS_DATA_NUMS, 1:, :],
}

visualizer = {
    "visualize_states": ppsci.visualize.VisualizerScatter3D(
        vis_data,
        {
            "pred_states": lambda d: output_transform(d),
            "states": lambda d: d["states"],
        },
        num_timestamps=1,
        prefix="result_states",
    )
}

First use the dataset in mse_validator above for visualization, and also introduce the vis_data_nums variable to control the number of samples to be visualized. Finally, build the visualizer through VisualizerScatter3D.

3.3.6 Model Training, Evaluation and Visualization

After completing the above settings, you only need to pass the instantiated objects to ppsci.solver.Solver, and then start training and evaluation.

examples/rossler/train_transformer.py
solver = ppsci.solver.Solver(
    model,
    constraint,
    cfg.output_dir,
    optimizer,
    lr_scheduler,
    cfg.TRAIN.epochs,
    ITERS_PER_EPOCH,
    eval_during_train=cfg.TRAIN.eval_during_train,
    eval_freq=cfg.TRAIN.eval_freq,
    validator=validator,
    visualizer=visualizer,
)
# train model
solver.train()
# evaluate after finished training
solver.eval()
# visualize prediction after finished training
solver.visualize()

4. Complete Code

rossler/train_enn.py
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Two-stage training
# 1. Train a embedding model by running train_enn.py.
# 2. Load pretrained embedding model and freeze it, then train a transformer model by running train_transformer.py.

# This file is for step1: training a embedding model.
# This file is based on PaddleScience/ppsci API.
from os import path as osp

import hydra
import numpy as np
import paddle
from omegaconf import DictConfig

import ppsci
from ppsci.utils import logger


def get_mean_std(data: np.ndarray):
    mean = np.asarray(
        [np.mean(data[:, :, 0]), np.mean(data[:, :, 1]), np.min(data[:, :, 2])]
    ).reshape(1, 3)
    std = np.asarray(
        [
            np.std(data[:, :, 0]),
            np.std(data[:, :, 1]),
            np.max(data[:, :, 2]) - np.min(data[:, :, 2]),
        ]
    ).reshape(1, 3)
    return mean, std


def train(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    weights = (1.0 * (cfg.TRAIN_BLOCK_SIZE - 1), 1.0e3 * cfg.TRAIN_BLOCK_SIZE)
    regularization_key = "k_matrix"
    # manually build constraint(s)
    train_dataloader_cfg = {
        "dataset": {
            "name": "RosslerDataset",
            "file_path": cfg.TRAIN_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.TRAIN_BLOCK_SIZE,
            "stride": 16,
            "weight_dict": {
                key: value for key, value in zip(cfg.MODEL.output_keys, weights)
            },
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": True,
            "shuffle": True,
        },
        "batch_size": cfg.TRAIN.batch_size,
        "num_workers": 4,
    }

    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELossWithL2Decay(
            regularization_dict={regularization_key: 1e-1 * (cfg.TRAIN_BLOCK_SIZE - 1)}
        ),
        {
            key: lambda out, k=key: out[k]
            for key in cfg.MODEL.output_keys + (regularization_key,)
        },
        name="Sup",
    )
    constraint = {sup_constraint.name: sup_constraint}

    # set iters_per_epoch by dataloader length
    ITERS_PER_EPOCH = len(sup_constraint.data_loader)

    # manually init model
    data_mean, data_std = get_mean_std(sup_constraint.data_loader.dataset.data)
    model = ppsci.arch.RosslerEmbedding(
        cfg.MODEL.input_keys,
        cfg.MODEL.output_keys + (regularization_key,),
        data_mean,
        data_std,
    )

    # init optimizer and lr scheduler
    clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=0.1)
    lr_scheduler = ppsci.optimizer.lr_scheduler.ExponentialDecay(
        iters_per_epoch=ITERS_PER_EPOCH,
        decay_steps=ITERS_PER_EPOCH,
        **cfg.TRAIN.lr_scheduler,
    )()
    optimizer = ppsci.optimizer.Adam(
        lr_scheduler, grad_clip=clip, **cfg.TRAIN.optimizer
    )(model)

    # manually build validator
    weights = (1.0 * (cfg.VALID_BLOCK_SIZE - 1), 1.0e4 * cfg.VALID_BLOCK_SIZE)
    eval_dataloader_cfg = {
        "dataset": {
            "name": "RosslerDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.VALID_BLOCK_SIZE,
            "stride": 32,
            "weight_dict": {
                key: value for key, value in zip(cfg.MODEL.output_keys, weights)
            },
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": False,
        },
        "batch_size": cfg.EVAL.batch_size,
        "num_workers": 4,
    }

    mse_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        metric={"MSE": ppsci.metric.MSE()},
        name="MSE_Validator",
    )
    validator = {mse_validator.name: mse_validator}

    solver = ppsci.solver.Solver(
        model,
        constraint,
        cfg.output_dir,
        optimizer,
        lr_scheduler,
        cfg.TRAIN.epochs,
        ITERS_PER_EPOCH,
        eval_during_train=True,
        validator=validator,
    )
    # train model
    solver.train()
    # evaluate after finished training
    solver.eval()


def evaluate(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    weights = (1.0 * (cfg.TRAIN_BLOCK_SIZE - 1), 1.0e3 * cfg.TRAIN_BLOCK_SIZE)
    regularization_key = "k_matrix"
    # manually build constraint(s)
    train_dataloader_cfg = {
        "dataset": {
            "name": "RosslerDataset",
            "file_path": cfg.TRAIN_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.TRAIN_BLOCK_SIZE,
            "stride": 16,
            "weight_dict": {
                key: value for key, value in zip(cfg.MODEL.output_keys, weights)
            },
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": True,
            "shuffle": True,
        },
        "batch_size": cfg.TRAIN.batch_size,
        "num_workers": 4,
    }

    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELossWithL2Decay(
            regularization_dict={regularization_key: 1e-1 * (cfg.TRAIN_BLOCK_SIZE - 1)}
        ),
        {
            key: lambda out, k=key: out[k]
            for key in cfg.MODEL.output_keys + (regularization_key,)
        },
        name="Sup",
    )

    # manually init model
    data_mean, data_std = get_mean_std(sup_constraint.data_loader.dataset.data)
    model = ppsci.arch.RosslerEmbedding(
        cfg.MODEL.input_keys,
        cfg.MODEL.output_keys + (regularization_key,),
        data_mean,
        data_std,
    )

    # manually build validator
    weights = (1.0 * (cfg.VALID_BLOCK_SIZE - 1), 1.0e4 * cfg.VALID_BLOCK_SIZE)
    eval_dataloader_cfg = {
        "dataset": {
            "name": "RosslerDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.VALID_BLOCK_SIZE,
            "stride": 32,
            "weight_dict": {
                key: value for key, value in zip(cfg.MODEL.output_keys, weights)
            },
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": False,
        },
        "batch_size": cfg.EVAL.batch_size,
        "num_workers": 4,
    }

    mse_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        metric={"MSE": ppsci.metric.MSE()},
        name="MSE_Validator",
    )
    validator = {mse_validator.name: mse_validator}
    solver = ppsci.solver.Solver(
        model,
        output_dir=cfg.output_dir,
        validator=validator,
        pretrained_model_path=cfg.EVAL.pretrained_model_path,
    )
    solver.eval()


@hydra.main(version_base=None, config_path="./conf", config_name="enn.yaml")
def main(cfg: DictConfig):
    if cfg.mode == "train":
        train(cfg)
    elif cfg.mode == "eval":
        evaluate(cfg)
    else:
        raise ValueError(f"cfg.mode should in ['train', 'eval'], but got '{cfg.mode}'")


if __name__ == "__main__":
    main()
rossler/train_transformer.py
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Two-stage training
# 1. Train a embedding model by running train_enn.py.
# 2. Load pretrained embedding model and freeze it, then train a transformer model by running train_transformer.py.

# This file is for step2: training a transformer model, based on frozen pretrained embedding model.
# This file is based on PaddleScience/ppsci API.
from os import path as osp
from typing import Dict

import hydra
import paddle
from omegaconf import DictConfig

import ppsci
from ppsci.arch import base
from ppsci.utils import logger
from ppsci.utils import save_load


def build_embedding_model(embedding_model_path: str) -> ppsci.arch.RosslerEmbedding:
    input_keys = ("states",)
    output_keys = ("pred_states", "recover_states")
    regularization_key = "k_matrix"
    model = ppsci.arch.RosslerEmbedding(input_keys, output_keys + (regularization_key,))
    save_load.load_pretrain(model, embedding_model_path)
    return model


class OutputTransform(object):
    def __init__(self, model: base.Arch):
        self.model = model
        self.model.eval()

    def __call__(self, x: Dict[str, paddle.Tensor]):
        pred_embeds = x["pred_embeds"]
        pred_states = self.model.decoder(pred_embeds)

        return pred_states


def train(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    embedding_model = build_embedding_model(cfg.EMBEDDING_MODEL_PATH)
    output_transform = OutputTransform(embedding_model)

    # manually build constraint(s)
    train_dataloader_cfg = {
        "dataset": {
            "name": "RosslerDataset",
            "file_path": cfg.TRAIN_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.TRAIN_BLOCK_SIZE,
            "stride": 16,
            "embedding_model": embedding_model,
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": True,
            "shuffle": True,
        },
        "batch_size": cfg.TRAIN.batch_size,
        "num_workers": 4,
    }

    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELoss(),
        name="Sup",
    )
    constraint = {sup_constraint.name: sup_constraint}

    # set iters_per_epoch by dataloader length
    ITERS_PER_EPOCH = len(constraint["Sup"].data_loader)

    # manually init model
    model = ppsci.arch.PhysformerGPT2(**cfg.MODEL)

    # init optimizer and lr scheduler
    clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=0.1)
    lr_scheduler = ppsci.optimizer.lr_scheduler.CosineWarmRestarts(
        iters_per_epoch=ITERS_PER_EPOCH, **cfg.TRAIN.lr_scheduler
    )()
    optimizer = ppsci.optimizer.Adam(
        lr_scheduler, grad_clip=clip, **cfg.TRAIN.optimizer
    )(model)

    # manually build validator
    eval_dataloader_cfg = {
        "dataset": {
            "name": "RosslerDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.VALID_BLOCK_SIZE,
            "stride": 1024,
            "embedding_model": embedding_model,
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": False,
        },
        "batch_size": cfg.EVAL.batch_size,
        "num_workers": 4,
    }

    mse_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        metric={"MSE": ppsci.metric.MSE()},
        name="MSE_Validator",
    )
    validator = {mse_validator.name: mse_validator}

    # set visualizer(optional)
    states = mse_validator.data_loader.dataset.data
    embedding_data = mse_validator.data_loader.dataset.embedding_data
    vis_data = {
        "embeds": embedding_data[: cfg.VIS_DATA_NUMS, :-1, :],
        "states": states[: cfg.VIS_DATA_NUMS, 1:, :],
    }

    visualizer = {
        "visualize_states": ppsci.visualize.VisualizerScatter3D(
            vis_data,
            {
                "pred_states": lambda d: output_transform(d),
                "states": lambda d: d["states"],
            },
            num_timestamps=1,
            prefix="result_states",
        )
    }

    solver = ppsci.solver.Solver(
        model,
        constraint,
        cfg.output_dir,
        optimizer,
        lr_scheduler,
        cfg.TRAIN.epochs,
        ITERS_PER_EPOCH,
        eval_during_train=cfg.TRAIN.eval_during_train,
        eval_freq=cfg.TRAIN.eval_freq,
        validator=validator,
        visualizer=visualizer,
    )
    # train model
    solver.train()
    # evaluate after finished training
    solver.eval()
    # visualize prediction after finished training
    solver.visualize()


def evaluate(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    embedding_model = build_embedding_model(cfg.EMBEDDING_MODEL_PATH)
    output_transform = OutputTransform(embedding_model)

    # manually init model
    model = ppsci.arch.PhysformerGPT2(**cfg.MODEL)

    # manually build validator
    eval_dataloader_cfg = {
        "dataset": {
            "name": "RosslerDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": cfg.MODEL.input_keys,
            "label_keys": cfg.MODEL.output_keys,
            "block_size": cfg.VALID_BLOCK_SIZE,
            "stride": 1024,
            "embedding_model": embedding_model,
        },
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": False,
        },
        "batch_size": cfg.EVAL.batch_size,
        "num_workers": 4,
    }

    mse_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        metric={"MSE": ppsci.metric.MSE()},
        name="MSE_Validator",
    )
    validator = {mse_validator.name: mse_validator}

    # set visualizer(optional)
    states = mse_validator.data_loader.dataset.data
    embedding_data = mse_validator.data_loader.dataset.embedding_data
    vis_datas = {
        "embeds": embedding_data[: cfg.VIS_DATA_NUMS, :-1, :],
        "states": states[: cfg.VIS_DATA_NUMS, 1:, :],
    }

    visualizer = {
        "visulzie_states": ppsci.visualize.VisualizerScatter3D(
            vis_datas,
            {
                "pred_states": lambda d: output_transform(d),
                "states": lambda d: d["states"],
            },
            num_timestamps=1,
            prefix="result_states",
        )
    }

    solver = ppsci.solver.Solver(
        model,
        output_dir=cfg.output_dir,
        validator=validator,
        visualizer=visualizer,
        pretrained_model_path=cfg.EVAL.pretrained_model_path,
    )
    solver.eval()
    # visualize prediction for pretrained model(optional)
    solver.visualize()


def export(cfg: DictConfig):
    # set model
    embedding_model = build_embedding_model(cfg.EMBEDDING_MODEL_PATH)
    model_cfg = {
        **cfg.MODEL,
        "embedding_model": embedding_model,
        "input_keys": ["states"],
        "output_keys": ["pred_states"],
    }
    model = ppsci.arch.PhysformerGPT2(**model_cfg)

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        pretrained_model_path=cfg.INFER.pretrained_model_path,
    )
    # export model
    from paddle.static import InputSpec

    input_spec = [
        {
            key: InputSpec([None, 255, 3], "float32", name=key)
            for key in model.input_keys
        },
    ]

    solver.export(input_spec, cfg.INFER.export_path)


def inference(cfg: DictConfig):
    from deploy.python_infer import pinn_predictor

    predictor = pinn_predictor.PINNPredictor(cfg)

    dataset_cfg = {
        "name": "RosslerDataset",
        "file_path": cfg.VALID_FILE_PATH,
        "input_keys": cfg.MODEL.input_keys,
        "label_keys": cfg.MODEL.output_keys,
        "block_size": cfg.VALID_BLOCK_SIZE,
        "stride": 1024,
    }

    dataset = ppsci.data.dataset.build_dataset(dataset_cfg)

    input_dict = {
        "states": dataset.data[: cfg.VIS_DATA_NUMS, :-1, :],
    }

    output_dict = predictor.predict(input_dict, cfg.INFER.batch_size)

    # mapping data to cfg.INFER.output_keys
    output_keys = ["pred_states"]
    output_dict = {
        store_key: output_dict[infer_key]
        for store_key, infer_key in zip(output_keys, output_dict.keys())
    }

    input_dict = {
        "states": dataset.data[: cfg.VIS_DATA_NUMS, 1:, :],
    }

    data_dict = {**input_dict, **output_dict}
    for i in range(cfg.VIS_DATA_NUMS):
        ppsci.visualize.save_plot_from_3d_dict(
            f"./rossler_transformer_pred_{i}",
            {key: value[i] for key, value in data_dict.items()},
            ("states", "pred_states"),
        )


@hydra.main(version_base=None, config_path="./conf", config_name="transformer.yaml")
def main(cfg: DictConfig):
    if cfg.mode == "train":
        train(cfg)
    elif cfg.mode == "eval":
        evaluate(cfg)
    elif cfg.mode == "export":
        export(cfg)
    elif cfg.mode == "infer":
        inference(cfg)
    else:
        raise ValueError(
            f"cfg.mode should in ['train', 'eval', 'export', 'infer'], but got '{cfg.mode}'"
        )


if __name__ == "__main__":
    main()

5. Result Display

The figure below shows the model prediction results and traditional numerical differentiation prediction results under two different initial conditions.

result_states0

Model prediction result ("pred_states") vs traditional numerical differentiation result ("states")

result_states1

Model prediction result ("pred_states") vs traditional numerical differentiation result ("states")