DeepONet¶

Model Training Command

# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/DeepONet/antiderivative_unaligned_train.npz
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/DeepONet/antiderivative_unaligned_test.npz
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/deeponet/antiderivative_unaligned_train.npz -o antiderivative_unaligned_train.npz
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/deeponet/antiderivative_unaligned_test.npz -o antiderivative_unaligned_test.npz
python deeponet.py

Model Evaluation Command

# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/DeepONet/antiderivative_unaligned_train.npz
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/DeepONet/antiderivative_unaligned_test.npz
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/deeponet/antiderivative_unaligned_train.npz -o antiderivative_unaligned_train.npz
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/deeponet/antiderivative_unaligned_test.npz -o antiderivative_unaligned_test.npz
python deeponet.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/deeponet/deeponet_pretrained.pdparams

Model Export Command

python deeponet.py mode=export

Model Inference Command

python deeponet.py mode=infer

Pretrained Model	Metrics
deeponet_pretrained.pdparams	loss(G_eval): 0.00003 L2Rel.G(G_eval): 0.01799

1. Background Introduction¶

Based on the universal approximation theorem for operators, neural networks can approximate not just functions, but also nonlinear operators that map one function space to another. This is the core concept of "operator learning."

DeepONet, a prominent operator learning framework, demonstrates significant potential across diverse fields:

Fluid Dynamics: Solving partial differential equations (PDEs) like the Navier-Stokes equations for aerodynamics and climate modeling.
Computer Vision: Learning complex mappings for image classification, segmentation, and medical analysis.
Signal Processing: Applications in denoising, compression, and restoration for communications and radar.
Control Systems: Modeling system dynamics for predictive control and optimization.
Finance & Environment: Risk assessment, market forecasting, and climate prediction.

While DeepONet is versatile, successful application requires domain-specific adaptation and optimization.

2. Problem Definition¶

Consider the following Ordinary Differential Equation (ODE) system:

\[ \begin{cases} \frac{d}{dx} \mathbf{s}(x) = \mathbf{g}(\mathbf{s}(x), u(x), x) \\ \mathbf{s}(a) = s_0 \end{cases} \]

Here, \(u \in V\) (continuous on \([a, b]\)) is the input signal, and the solution \(\mathbf{s}: [a,b] \rightarrow \mathbb{R}^K\) is the output. We define an operator \(G\) such that \(\mathbf{s}(x) = (G u)(x)\). This can be expressed in integral form:

\[ (G u)(x) = s_0 + \int_a^x \mathbf{g}((G u)(t), u(t), t) dt \]

Our goal is to train a neural network that takes the function \(u\) and a coordinate \(x\) as inputs and predicts the value \((G u)(x)\). Essentially, we aim to learn the operator \(G\).

Note: In this specific example, \(G\) acts as an integral operator (antiderivative) with the initial condition \((G u)(0)=0\).
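To make the note concrete: with \(\mathbf{g}(\mathbf{s}, u, x) = u(x)\), \(a = 0\), and \(s_0 = 0\), the integral form reduces to \((G u)(x) = \int_0^x u(t)\,dt\). The following NumPy sketch approximates this operator with the cumulative trapezoidal rule and checks it against a known antiderivative. The helper name `antiderivative_on_grid` and the grid size are illustrative assumptions, not part of PaddleScience or the dataset generation code.

```python
import numpy as np


def antiderivative_on_grid(u_values: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Approximate (G u)(x) = integral of u from x[0] to x (cumulative trapezoidal rule)."""
    # Area of each trapezoid between neighboring grid points.
    areas = 0.5 * (u_values[1:] + u_values[:-1]) * np.diff(x)
    # Prepend 0 so that (G u)(x[0]) = 0, matching the initial condition.
    return np.concatenate([[0.0], np.cumsum(areas)])


x = np.linspace(0.0, 1.0, 100)  # assumed grid of 100 points on [0, 1]
u = np.cos(x)                   # input function u
G_u = antiderivative_on_grid(u, x)

# The exact antiderivative of cos with G(u)(0) = 0 is sin.
max_err = float(np.max(np.abs(G_u - np.sin(x))))
print(f"max |G(u) - sin(x)| = {max_err:.2e}")
```

A trained DeepONet is meant to reproduce this mapping from sampled values of \(u\) to \((G u)(x)\) without performing the quadrature explicitly.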

3. Problem Solving¶

Next, we explain step by step how to translate this problem into PaddleScience code and solve it with deep learning. To help you get started with PaddleScience quickly, only the key steps such as model construction, equation construction, and computational domain construction are described below; for other details, please refer to the API Documentation.

3.1 Dataset Introduction¶

This case uses the dataset provided by the official DeepXDE documentation. The training set and the test set are stored in two npz files, which can be downloaded with the commands listed at the top of this page.

The data file description is as follows:

antiderivative_unaligned_train.npz

Field Name	Description
X_train0	Training input data corresponding to \(u\), shape is (10000, 100)
X_train1	Training input data corresponding to \(y\), shape is (10000, 1)
y_train	Training label data corresponding to \(G(u)\), shape is (10000, 1)

antiderivative_unaligned_test.npz

Field Name	Description
X_test0	Test input data corresponding to \(u\), shape is (100000, 100)
X_test1	Test input data corresponding to \(y\), shape is (100000, 1)
y_test	Test label data corresponding to \(G(u)\), shape is (100000, 1)
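The field names and shapes in the tables above can be checked with a few lines of NumPy. The sketch below is a hedged example: the helper `npz_shapes` is hypothetical, and a small stand-in file with the documented field names (but only 8 samples) is written to a temporary directory so that the snippet runs without downloading anything.

```python
import os
import tempfile

import numpy as np


def npz_shapes(path: str) -> dict:
    """Return {field name: array shape} for every array stored in an .npz file."""
    with np.load(path) as data:
        return {name: data[name].shape for name in data.files}


# Stand-in file with the documented field names. Point `demo_path` at the real
# antiderivative_unaligned_train.npz to inspect the actual dataset instead.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_train.npz")
np.savez(
    demo_path,
    X_train0=np.zeros((8, 100), dtype="float32"),  # u sampled at 100 locations
    X_train1=np.zeros((8, 1), dtype="float32"),    # query coordinate y
    y_train=np.zeros((8, 1), dtype="float32"),     # label G(u)(y)
)
shapes = npz_shapes(demo_path)
print(shapes)
```

Each row pairs one sampled function \(u\) with a single query coordinate \(y\), which is why the second and third fields have one column.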

3.2 Model Construction¶

The inputs are the function \(u\) and the coordinate \(y\), and the output is the value \(G(u)(y)\). Following the DeepONet architecture, we employ a Branch Net (for \(u\)) and a Trunk Net (for \(y\)).

model = ppsci.arch.DeepONet(**cfg.MODEL)

We specify input keys as u and y, and the output key as G. The DeepONet model is instantiated by configuring the number of sensors, feature channels, hidden layers, neurons, and activation functions.
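The way the two sub-networks are combined can be sketched in plain NumPy: the Branch Net encodes the sampled function \(u\), the Trunk Net encodes the coordinate \(y\), and the prediction is the inner product of the two feature vectors. This is a simplified illustration under assumed layer sizes and random weights; it omits the bias term and is not the `ppsci.arch.DeepONet` implementation.

```python
import numpy as np

rng = np.random.default_rng(0)


def mlp(sizes):
    """Create random weights for a small fully connected network (illustrative only)."""
    return [
        (rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """Apply the network: ReLU on hidden layers, linear last layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x


num_loc, num_features = 100, 40            # hypothetical sizes
branch = mlp([num_loc, 40, num_features])  # encodes u sampled at the sensors
trunk = mlp([1, 40, num_features])         # encodes the query coordinate y

u = rng.normal(size=(16, num_loc))         # batch of 16 sampled input functions
y = rng.uniform(size=(16, 1))              # one query coordinate per sample

# G(u)(y) is approximated by the inner product of branch and trunk features.
G = np.sum(forward(branch, u) * forward(trunk, y), axis=1, keepdims=True)
print(G.shape)  # (16, 1)
```

Because the Branch Net and the Trunk Net only meet in the final inner product, the same encoded \(u\) can be evaluated at many coordinates \(y\).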

3.3 Constraint Construction¶

We use supervised learning to train the model. First, we configure the data loader, specifying file paths, input/label keys, and aliases.

train_dataloader_cfg = {
    "dataset": {
        "name": "IterableNPZDataset",
        "file_path": cfg.TRAIN_FILE_PATH,
        "input_keys": ("u", "y"),
        "label_keys": ("G",),
        "alias_dict": {"u": "X_train0", "y": "X_train1", "G": "y_train"},
    },
}

3.3.1 Supervised Constraint¶

Because training is supervised, we use the supervised constraint SupervisedConstraint:

sup_constraint = ppsci.constraint.SupervisedConstraint(
    train_dataloader_cfg,
    ppsci.loss.MSELoss(),
    {"G": lambda out: out["G"]},
)

Dataloader: Uses train_dataloader_cfg.
Loss: MSE with reduction="mean".
Target: The model output G.
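For reference, mean squared error with mean reduction averages the squared difference over all elements. The NumPy sketch below illustrates the formula only; the function name `mse_mean` is hypothetical and this is not the `ppsci.loss.MSELoss` source.

```python
import numpy as np


def mse_mean(pred: np.ndarray, label: np.ndarray) -> float:
    """Mean squared error averaged over every element (reduction="mean")."""
    return float(np.mean((pred - label) ** 2))


pred = np.array([[0.0], [1.0], [2.0]])
label = np.array([[0.0], [1.0], [4.0]])
print(mse_mean(pred, label))  # (0 + 0 + 4) / 3
```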

The constraint is then stored in a dictionary.

# wrap constraints together
constraint = {sup_constraint.name: sup_constraint}

3.4 Hyperparameter Setting¶

We set the training epochs to 10,000 and the evaluation interval to 500 epochs.

# training settings
TRAIN:
  epochs: 10000
  iters_per_epoch: 1
  learning_rate: 1.0e-3
  save_freq: 500
  eval_freq: 500
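The arithmetic implied by these settings can be spelled out as follows. This sketch assumes that evaluation and checkpoint saving are triggered at every epoch that is a multiple of the corresponding frequency; the exact trigger condition inside the solver is not shown in this document.

```python
epochs = 10000
iters_per_epoch = 1
eval_freq = 500
save_freq = 500

# With one iteration per epoch, each epoch is a single optimizer step.
total_iters = epochs * iters_per_epoch

# Epochs at which evaluation would run under the assumption stated above.
eval_epochs = list(range(eval_freq, epochs + 1, eval_freq))
save_epochs = list(range(save_freq, epochs + 1, save_freq))

print(total_iters, len(eval_epochs), len(save_epochs))
```

Under that assumption, training performs 10,000 optimizer steps and evaluates 20 times.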

3.5 Optimizer Construction¶

We use the Adam optimizer with a learning rate of 0.001.

# set optimizer
optimizer = ppsci.optimizer.Adam(cfg.TRAIN.learning_rate)(model)
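As a reminder of what a single Adam update does, here is a minimal NumPy sketch of the standard algorithm. The `beta1`, `beta2`, and `eps` values are the commonly used defaults and are an assumption here; the function `adam_step` is illustrative and is not the PaddleScience optimizer.

```python
import numpy as np


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad     # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2  # second-moment estimate
    m_hat = m / (1 - beta1**t)             # bias correction for step t
    v_hat = v / (1 - beta2**t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


# First step of minimizing f(x) = (x - 3)^2 starting from x = 0.
x, m, v = np.array(0.0), np.array(0.0), np.array(0.0)
grad = 2 * (x - 3)
x, m, v = adam_step(x, grad, m, v, t=1)
print(float(x))  # the first step has a size close to lr
```

After bias correction, the first update has a magnitude of roughly the learning rate regardless of the gradient scale, which is one reason 1.0e-3 is a common starting value.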

3.6 Validator Construction¶

To monitor performance, we construct a SupervisedValidator for periodic evaluation on the test set.

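# set validator
eval_dataloader_cfg = {
    "dataset": {
        "name": "IterableNPZDataset",
        "file_path": cfg.VALID_FILE_PATH,
        "input_keys": ("u", "y"),
        "label_keys": ("G",),
        "alias_dict": {"u": "X_test0", "y": "X_test1", "G": "y_test"},
    },
}
sup_validator = ppsci.validate.SupervisedValidator(
    eval_dataloader_cfg,
    ppsci.loss.MSELoss(),
    {"G": lambda out: out["G"]},
    metric={"L2Rel": ppsci.metric.L2Rel()},
    name="G_eval",
)
validator = {sup_validator.name: sup_validator}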

For the evaluation metric (the metric argument), we select ppsci.metric.L2Rel.

Other configurations are similar to the settings of Constraint Construction.
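The relative L2 error reported as L2Rel can be illustrated with NumPy, assuming the usual definition \(\lVert \text{pred} - \text{label} \rVert_2 / \lVert \text{label} \rVert_2\). The function `l2_rel` is a hypothetical helper; details of `ppsci.metric.L2Rel`, such as how the norm is reduced across output columns, may differ.

```python
import numpy as np


def l2_rel(pred: np.ndarray, label: np.ndarray) -> float:
    """Relative L2 error: ||pred - label||_2 / ||label||_2."""
    return float(np.linalg.norm(pred - label) / np.linalg.norm(label))


label = np.array([[3.0], [4.0]])
pred = np.array([[3.0], [4.5]])
print(l2_rel(pred, label))  # 0.5 / 5.0
```

Unlike the MSE loss, this metric is normalized by the magnitude of the labels, so it can be read as a fractional error.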

3.7 Model Training and Evaluation¶

With all components configured, we pass them to ppsci.solver.Solver to commence training and evaluation.

solver = ppsci.solver.Solver(
    model,
    constraint,
    cfg.output_dir,
    optimizer,
    None,
    cfg.TRAIN.epochs,
    cfg.TRAIN.iters_per_epoch,
    save_freq=cfg.TRAIN.save_freq,
    eval_freq=cfg.TRAIN.eval_freq,
    log_freq=cfg.log_freq,
    seed=cfg.seed,
    validator=validator,
    eval_during_train=cfg.TRAIN.eval_during_train,
    checkpoint_path=cfg.TRAIN.checkpoint_path,
)
# train model
solver.train()
# evaluate after finished training
solver.eval()

3.8 Result Visualization¶

Post-training, we verify the model by constructing 9 synthetic \(u-G(u)\) function pairs. We discretize \(u\) and \(y\), predict \(G(u)(y)\), and compare the results with the analytical solutions.

    def predict_func(input_dict):
        return solver.predict(input_dict, return_numpy=True)[cfg.MODEL.G_key]

    plot(cfg, predict_func)
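The discretization step described above can be sketched independently of the solver. The snippet below builds the model inputs for one function pair and also checks numerically that a pair satisfies \(dG/dy = u\) and \(G(u)(0) = 0\). The sizes `num_loc` and `num_y` are hypothetical stand-ins for cfg.MODEL.num_loc and cfg.NUM_Y, and both helper functions are illustrative assumptions.

```python
import numpy as np

num_loc, num_y = 100, 1000  # hypothetical stand-ins for cfg.MODEL.num_loc and cfg.NUM_Y


def discretize(u_func, G_func):
    """Build model inputs (u, y) and the analytical reference G(u)(y)."""
    x = np.linspace(0, 1, num_loc).reshape(1, num_loc)
    u = np.tile(u_func(x), [num_y, 1])  # same u repeated for every query point
    y = np.linspace(0, 1, num_y).reshape(num_y, 1)
    return u, y, G_func(y)


def is_antiderivative_pair(u_func, G_func, tol=1e-3):
    """Check G(0) = 0 and dG/dy = u with a finite-difference derivative."""
    y = np.linspace(0, 1, 2001)
    dG = np.gradient(G_func(y), y)
    # Skip the end points, where np.gradient uses one-sided differences.
    derivative_ok = np.max(np.abs(dG[1:-1] - u_func(y)[1:-1])) < tol
    return bool(abs(G_func(0.0)) < 1e-12 and derivative_ok)


u, y, G_ref = discretize(np.cos, np.sin)
print(u.shape, y.shape, G_ref.shape)  # (1000, 100) (1000, 1) (1000, 1)
print(is_antiderivative_pair(np.cos, np.sin))
print(is_antiderivative_pair(lambda x: 5 * x**4, lambda y: y**6))
```

A check of this kind is a cheap way to catch a mismatched \(u\) and \(G(u)\) before comparing the prediction with the reference curve.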



4. Complete Code¶

deeponet.py
"""
Reference: https://deepxde.readthedocs.io/en/latest/demos/operator/antiderivative_unaligned.html
"""

import os
from os import path as osp
from typing import Callable
from typing import Tuple

import hydra
import numpy as np
import paddle
from matplotlib import pyplot as plt
from omegaconf import DictConfig

import ppsci
from ppsci.utils import logger


def train(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    # set model
    model = ppsci.arch.DeepONet(**cfg.MODEL)

    # set dataloader config
    train_dataloader_cfg = {
        "dataset": {
            "name": "IterableNPZDataset",
            "file_path": cfg.TRAIN_FILE_PATH,
            "input_keys": ("u", "y"),
            "label_keys": ("G",),
            "alias_dict": {"u": "X_train0", "y": "X_train1", "G": "y_train"},
        },
    }

    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELoss(),
        {"G": lambda out: out["G"]},
    )
    # wrap constraints together
    constraint = {sup_constraint.name: sup_constraint}

    # set optimizer
    optimizer = ppsci.optimizer.Adam(cfg.TRAIN.learning_rate)(model)

    # set validator
    eval_dataloader_cfg = {
        "dataset": {
            "name": "IterableNPZDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": ("u", "y"),
            "label_keys": ("G",),
            "alias_dict": {"u": "X_test0", "y": "X_test1", "G": "y_test"},
        },
    }
    sup_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        {"G": lambda out: out["G"]},
        metric={"L2Rel": ppsci.metric.L2Rel()},
        name="G_eval",
    )
    validator = {sup_validator.name: sup_validator}

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        constraint,
        cfg.output_dir,
        optimizer,
        None,
        cfg.TRAIN.epochs,
        cfg.TRAIN.iters_per_epoch,
        save_freq=cfg.TRAIN.save_freq,
        eval_freq=cfg.TRAIN.eval_freq,
        log_freq=cfg.log_freq,
        seed=cfg.seed,
        validator=validator,
        eval_during_train=cfg.TRAIN.eval_during_train,
        checkpoint_path=cfg.TRAIN.checkpoint_path,
    )
    # train model
    solver.train()
    # evaluate after finished training
    solver.eval()

    def predict_func(input_dict):
        return solver.predict(input_dict, return_numpy=True)[cfg.MODEL.G_key]

    plot(cfg, predict_func)


def evaluate(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    # set model
    model = ppsci.arch.DeepONet(**cfg.MODEL)

    # set validator
    eval_dataloader_cfg = {
        "dataset": {
            "name": "IterableNPZDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": ("u", "y"),
            "label_keys": ("G",),
            "alias_dict": {"u": "X_test0", "y": "X_test1", "G": "y_test"},
        },
    }
    sup_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        {"G": lambda out: out["G"]},
        metric={"L2Rel": ppsci.metric.L2Rel()},
        name="G_eval",
    )
    validator = {sup_validator.name: sup_validator}

    solver = ppsci.solver.Solver(
        model,
        None,
        cfg.output_dir,
        validator=validator,
        pretrained_model_path=cfg.EVAL.pretrained_model_path,
        eval_with_no_grad=cfg.EVAL.eval_with_no_grad,
    )
    solver.eval()

    def predict_func(input_dict):
        return solver.predict(input_dict, return_numpy=True)[cfg.MODEL.G_key]

    plot(cfg, predict_func)


def export(cfg: DictConfig):
    # set model
    model = ppsci.arch.DeepONet(**cfg.MODEL)

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        pretrained_model_path=cfg.INFER.pretrained_model_path,
    )

    # export model
    from paddle.static import InputSpec

    input_spec = [
        {
            model.input_keys[0]: InputSpec(
                # width of u must match the number of sensor locations
                [None, cfg.MODEL.num_loc], "float32", name=model.input_keys[0]
            ),
            model.input_keys[1]: InputSpec(
                [None, 1], "float32", name=model.input_keys[1]
            ),
        }
    ]
    solver.export(input_spec, cfg.INFER.export_path)


def inference(cfg: DictConfig):
    from deploy import python_infer

    predictor = python_infer.GeneralPredictor(cfg)

    def predict_func(input_dict):
        return next(iter(predictor.predict(input_dict).values()))

    plot(cfg, predict_func)


def plot(cfg: DictConfig, predict_func: Callable):
    # visualize prediction for different functions u and corresponding G(u)
    dtype = paddle.get_default_dtype()

    def generate_y_u_G_ref(
        u_func: Callable, G_u_func: Callable
    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        """Generate discretized data of given function u and corresponding G(u).

        Args:
            u_func (Callable): Function u.
            G_u_func (Callable): Function G(u).

        Returns:
            Tuple[np.ndarray, np.ndarray, np.ndarray]: Discretized data of u, y and G(u).
        """
        x = np.linspace(0, 1, cfg.MODEL.num_loc, dtype=dtype).reshape(
            [1, cfg.MODEL.num_loc]
        )
        u = u_func(x)
        u = np.tile(u, [cfg.NUM_Y, 1])

        y = np.linspace(0, 1, cfg.NUM_Y, dtype=dtype).reshape([cfg.NUM_Y, 1])
        G_ref = G_u_func(y)
        return u, y, G_ref

    func_u_G_pair = [
        # (title_string, func_u, func_G(u)), s.t. dG/dx == u and G(u)(0) = 0
        (r"$u=\cos(x), G(u)=sin(x)$", lambda x: np.cos(x), lambda y: np.sin(y)),  # 1
        (
            r"$u=sec^2(x), G(u)=tan(x)$",
            lambda x: (1 / np.cos(x)) ** 2,
            lambda y: np.tan(y),
        ),  # 2
        (
            r"$u=sec(x)tan(x), G(u)=sec(x) - 1$",
            lambda x: (1 / np.cos(x) * np.tan(x)),
            lambda y: 1 / np.cos(y) - 1,
        ),  # 3
        (
            r"$u=1.5^x\ln{1.5}, G(u)=1.5^x-1$",
            lambda x: 1.5**x * np.log(1.5),
            lambda y: 1.5**y - 1,
        ),  # 4
        (r"$u=3x^2, G(u)=x^3$", lambda x: 3 * x**2, lambda y: y**3),  # 5
        (r"$u=4x^3, G(u)=x^4$", lambda x: 4 * x**3, lambda y: y**4),  # 6
        (r"$u=5x^4, G(u)=x^5$", lambda x: 5 * x**4, lambda y: y**5),  # 7
        (r"$u=6x^5, G(u)=x^6$", lambda x: 6 * x**5, lambda y: y**6),  # 8
        (r"$u=e^x, G(u)=e^x-1$", lambda x: np.exp(x), lambda y: np.exp(y) - 1),  # 9
    ]

    os.makedirs(os.path.join(cfg.output_dir, "visual"), exist_ok=True)
    for i, (title, u_func, G_func) in enumerate(func_u_G_pair):
        u, y, G_ref = generate_y_u_G_ref(u_func, G_func)
        G_pred = predict_func({"u": u, "y": y})
        # label each curve with what it actually shows
        plt.plot(y, G_pred, label=r"$G(u)(y)_{pred}$")
        plt.plot(y, G_ref, label=r"$G(u)(y)_{ref}$")
        plt.legend()
        plt.title(title)
        plt.savefig(os.path.join(cfg.output_dir, "visual", f"func_{i}_result.png"))
        logger.message(
            f"Saved result of function {i} to {cfg.output_dir}/visual/func_{i}_result.png"
        )
        plt.clf()
    plt.close()


@hydra.main(version_base=None, config_path="./conf", config_name="deeponet.yaml")
def main(cfg: DictConfig):
    if cfg.mode == "train":
        train(cfg)
    elif cfg.mode == "eval":
        evaluate(cfg)
    elif cfg.mode == "export":
        export(cfg)
    elif cfg.mode == "infer":
        inference(cfg)
    else:
        raise ValueError(
            f"cfg.mode should be in ['train', 'eval', 'export', 'infer'], but got '{cfg.mode}'"
        )


if __name__ == "__main__":
    main()