
DeepONet


# Model training
# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/DeepONet/antiderivative_unaligned_train.npz
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/DeepONet/antiderivative_unaligned_test.npz
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/deeponet/antiderivative_unaligned_train.npz -o antiderivative_unaligned_train.npz
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/deeponet/antiderivative_unaligned_test.npz -o antiderivative_unaligned_test.npz
python deeponet.py

# Model evaluation (uses the same dataset files as training)
python deeponet.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/deeponet/deeponet_pretrained.pdparams

# Model export
python deeponet.py mode=export

# Model inference
python deeponet.py mode=infer
Pretrained Model | Metrics
deeponet_pretrained.pdparams | loss(G_eval): 0.00003, L2Rel.G(G_eval): 0.01799

1. Background Introduction

Based on the universal approximation theorem for operators, neural networks can approximate not just functions, but also nonlinear operators that map one function space to another. This is the core concept of "operator learning."

DeepONet, a prominent operator learning framework, demonstrates significant potential across diverse fields:

  • Fluid Dynamics: Solving partial differential equations (PDEs) like the Navier-Stokes equations for aerodynamics and climate modeling.
  • Computer Vision: Learning complex mappings for image classification, segmentation, and medical analysis.
  • Signal Processing: Applications in denoising, compression, and restoration for communications and radar.
  • Control Systems: Modeling system dynamics for predictive control and optimization.
  • Finance & Environment: Risk assessment, market forecasting, and climate prediction.

While DeepONet is versatile, successful application requires domain-specific adaptation and optimization.

2. Problem Definition

Consider the following Ordinary Differential Equation (ODE) system:

\[ \begin{cases} \frac{d}{dx} \mathbf{s}(x) = \mathbf{g}(\mathbf{s}(x), u(x), x) \\ \mathbf{s}(a) = s_0 \end{cases} \]

Here, \(u \in V\) (continuous on \([a, b]\)) is the input signal, and the solution \(\mathbf{s}: [a,b] \rightarrow \mathbb{R}^K\) is the output. We define an operator \(G\) such that \(\mathbf{s}(x) = (G u)(x)\). This can be expressed in integral form:

\[ (G u)(x) = s_0 + \int_a^x \mathbf{g}((G u)(t), u(t), t) dt \]

Our goal is to train a neural network that takes the function \(u\) and a coordinate \(x\) as inputs and predicts the value \((G u)(x)\). Essentially, we aim to learn the operator \(G\).

Note: In this specific example, \(G\) acts as an integral operator (antiderivative) with the initial condition \((G u)(0)=0\).
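
For the antiderivative case, taking \(\mathbf{g}(\mathbf{s}, u, x) = u(x)\), \(a = 0\), and \(s_0 = 0\) in the integral form gives \((G u)(x) = \int_0^x u(t)\,dt\). The sketch below is not part of the PaddleScience example. It approximates this operator with a cumulative trapezoidal rule in NumPy to show what the network is asked to reproduce; the helper name `antiderivative` and the grid size are illustrative assumptions.

```python
import numpy as np


def antiderivative(u_values: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Approximate (G u)(x), the integral of u from x[0] to x, by the trapezoidal rule."""
    dx = np.diff(x)
    # area of each trapezoid between neighbouring grid points
    areas = 0.5 * (u_values[1:] + u_values[:-1]) * dx
    # G(u)(x[0]) = 0, then accumulate the areas
    return np.concatenate([[0.0], np.cumsum(areas)])


x = np.linspace(0.0, 1.0, 101)
G_u = antiderivative(np.cos(x), x)
# the exact antiderivative of cos(x) with G(u)(0) = 0 is sin(x)
max_err = float(np.max(np.abs(G_u - np.sin(x))))
print(f"max |G(u) - sin(x)| = {max_err:.2e}")
```

A trained DeepONet plays the role of `antiderivative` here, except that it is evaluated at arbitrary query points rather than on the sensor grid.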

3. Problem Solving

Next, we explain step by step how to express this problem in PaddleScience code and solve it with deep learning. To keep the introduction to PaddleScience short, only key steps such as model construction, equation construction, and computational domain construction are described below; for other details, please refer to the API Documentation.

3.1 Dataset Introduction

This example uses the dataset provided in the official DeepXDE documentation. It is distributed as two npz files, one holding the training set and one holding the test set used for validation.

The data file description is as follows:

antiderivative_unaligned_train.npz

Field Name | Description
X_train0 | Training input data corresponding to \(u\), shape (10000, 100)
X_train1 | Training input data corresponding to \(y\), shape (10000, 1)
y_train | Training label data corresponding to \(G(u)\), shape (10000, 1)

antiderivative_unaligned_test.npz

Field Name | Description
X_test0 | Test input data corresponding to \(u\), shape (100000, 100)
X_test1 | Test input data corresponding to \(y\), shape (100000, 1)
y_test | Test label data corresponding to \(G(u)\), shape (100000, 1)
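
One way to check the files against the field descriptions above is to list every array and its shape. The snippet below is a minimal sketch: `describe_npz` is a hypothetical helper, and the tiny synthetic file (5 samples instead of 10000) exists only so the code runs without downloading anything.

```python
import os
import tempfile

import numpy as np


def describe_npz(path: str) -> dict:
    """Return a {field_name: shape} mapping for every array stored in an npz file."""
    with np.load(path) as data:
        return {key: data[key].shape for key in data.files}


# Synthetic stand-in that reuses the training field names listed above.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_train.npz")
np.savez(
    demo_path,
    X_train0=np.zeros((5, 100), dtype="float32"),
    X_train1=np.zeros((5, 1), dtype="float32"),
    y_train=np.zeros((5, 1), dtype="float32"),
)

shapes = describe_npz(demo_path)
for name, shape in shapes.items():
    print(name, shape)
```

Pointing `describe_npz` at the downloaded antiderivative_unaligned_train.npz should report the shapes given in the description above.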

3.2 Model Construction

The inputs are the function \(u\) and the coordinate \(y\), and the output is the value \(G(u)(y)\). Following the DeepONet architecture, we employ a Branch Net (for \(u\)) and a Trunk Net (for \(y\)).

model = ppsci.arch.DeepONet(**cfg.MODEL)

We specify input keys as u and y, and the output key as G. The DeepONet model is instantiated by configuring the number of sensors, feature channels, hidden layers, neurons, and activation functions.
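
To make the branch/trunk split concrete, here is a minimal NumPy sketch of a DeepONet-style forward pass. It is not the `ppsci.arch.DeepONet` implementation: the layer sizes, the tanh activation, the random weights, and the helper names are all assumptions, and details such as an output bias or an activation on the trunk output are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)


def mlp(x, weights, biases):
    """Fully connected network with tanh on hidden layers and a linear last layer."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(x @ w + b)
    return x @ weights[-1] + biases[-1]


def init_mlp(sizes):
    """Random weights and zero biases for the given layer sizes."""
    weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    return weights, biases


num_loc, num_features = 100, 40  # sensors for u, shared feature width (assumed)
branch = init_mlp([num_loc, 40, num_features])  # encodes the sampled function u
trunk = init_mlp([1, 40, num_features])  # encodes the query coordinate y


def deeponet_forward(u, y):
    """G(u)(y) ~ sum_k branch_k(u) * trunk_k(y): a dot product over features."""
    b = mlp(u, *branch)  # (batch, num_features)
    t = mlp(y, *trunk)  # (batch, num_features)
    return np.sum(b * t, axis=1, keepdims=True)  # (batch, 1)


u = rng.normal(size=(8, num_loc))
y = rng.uniform(size=(8, 1))
G = deeponet_forward(u, y)
print(G.shape)
```

Each row pairs one sampled function \(u\) with one query coordinate \(y\), which is why the dataset stores \(u\) as (N, 100) and \(y\) as (N, 1).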

3.3 Constraint Construction

We use supervised learning to train the model. First, we configure the data loader, specifying file paths, input/label keys, and aliases.

train_dataloader_cfg = {
    "dataset": {
        "name": "IterableNPZDataset",
        "file_path": cfg.TRAIN_FILE_PATH,
        "input_keys": ("u", "y"),
        "label_keys": ("G",),
        "alias_dict": {"u": "X_train0", "y": "X_train1", "G": "y_train"},
    },
}
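
The role of alias_dict can be pictured as a lookup from the model's key names to the field names stored in the npz file. The sketch below only illustrates that idea under assumed behavior; `apply_alias` is a hypothetical helper and does not reproduce how IterableNPZDataset is implemented.

```python
import numpy as np

alias_dict = {"u": "X_train0", "y": "X_train1", "G": "y_train"}

# Stand-in for the arrays read from the npz file (shapes shrunk for the demo).
raw = {
    "X_train0": np.zeros((4, 100), dtype="float32"),
    "X_train1": np.zeros((4, 1), dtype="float32"),
    "y_train": np.zeros((4, 1), dtype="float32"),
}


def apply_alias(raw_arrays, keys, alias):
    """Pick arrays by model key, looking each key up under its file field name."""
    return {key: raw_arrays[alias.get(key, key)] for key in keys}


inputs = apply_alias(raw, ("u", "y"), alias_dict)
labels = apply_alias(raw, ("G",), alias_dict)
print({k: v.shape for k, v in inputs.items()})
print({k: v.shape for k, v in labels.items()})
```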

3.3.1 Supervised Constraint

Because training is supervised, we use ppsci.constraint.SupervisedConstraint:

sup_constraint = ppsci.constraint.SupervisedConstraint(
    train_dataloader_cfg,
    ppsci.loss.MSELoss(),
    {"G": lambda out: out["G"]},
)
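  • First argument: the data loader configuration, train_dataloader_cfg.
  • Second argument: the loss function, MSELoss with reduction="mean".
  • Third argument: the output expression dictionary, which here returns the model output G unchanged.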
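
For reference, a mean-reduced MSE averages the squared error over every element of the batch. The NumPy function below is a sketch of that formula with made-up numbers, not the `ppsci.loss.MSELoss` source.

```python
import numpy as np


def mse_mean(pred: np.ndarray, label: np.ndarray) -> float:
    """Mean squared error averaged over every element (reduction='mean')."""
    return float(np.mean((pred - label) ** 2))


pred = np.array([[0.1], [0.5], [0.9]])
label = np.array([[0.0], [0.5], [1.0]])
loss = mse_mean(pred, label)  # (0.01 + 0.0 + 0.01) / 3
print(loss)
```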

The constraint is then stored in a dictionary.

# wrap constraints together
constraint = {sup_constraint.name: sup_constraint}

3.4 Hyperparameter Setting

We set the training epochs to 10,000 and the evaluation interval to 500 epochs.

# training settings
TRAIN:
  epochs: 10000
  iters_per_epoch: 1
  learning_rate: 1.0e-3
  save_freq: 500
  eval_freq: 500
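
These settings imply the following counts, assuming save_freq and eval_freq are both measured in epochs (the text above states this for the evaluation interval). This is plain arithmetic on the configured values, not output from the solver.

```python
epochs = 10000
iters_per_epoch = 1
save_freq = 500
eval_freq = 500

total_iterations = epochs * iters_per_epoch  # optimizer steps over the whole run
num_evaluations = epochs // eval_freq  # periodic evaluations during training
num_checkpoints = epochs // save_freq  # periodic checkpoints during training
print(total_iterations, num_evaluations, num_checkpoints)
```

With one iteration per epoch, an epoch and an optimizer step coincide, so the run performs 10000 updates and evaluates 20 times.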

3.5 Optimizer Construction

We use the Adam optimizer with a learning rate of 0.001.

# set optimizer
optimizer = ppsci.optimizer.Adam(cfg.TRAIN.learning_rate)(model)
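
As a reminder of what one Adam update does, here is a minimal sketch of the standard update rule applied to a scalar parameter. The beta and epsilon values are the commonly used defaults and are assumptions here, as is the toy objective; this is not the code that `ppsci.optimizer.Adam` runs.

```python
import numpy as np


def adam_step(param, grad, m, v, t, lr=1.0e-3, beta1=0.9, beta2=0.999, eps=1.0e-8):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad  # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2  # second-moment estimate
    m_hat = m / (1 - beta1**t)  # bias correction for the zero initialization
    v_hat = v / (1 - beta2**t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


# Toy problem: minimize f(w) = (w - 0.5)^2 starting from w = 0.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 5001):
    grad = 2.0 * (w - 0.5)
    w, m, v = adam_step(w, grad, m, v, t)
print(w)
```

Because the step is normalized by the gradient's running magnitude, early updates move the parameter by roughly the learning rate per step regardless of the gradient's scale.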

3.6 Validator Construction

To monitor performance, we construct a SupervisedValidator for periodic evaluation on the test set.

# set validator
eval_dataloader_cfg = {
    "dataset": {
        "name": "IterableNPZDataset",
        "file_path": cfg.VALID_FILE_PATH,
        "input_keys": ("u", "y"),
        "label_keys": ("G",),
        "alias_dict": {"u": "X_test0", "y": "X_test1", "G": "y_test"},
    },
}

For the evaluation metric, we select ppsci.metric.L2Rel.

The remaining settings mirror those in Constraint Construction.
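
The relative L2 error divides the L2 norm of the prediction error by the L2 norm of the labels. The function below is a sketch of that definition on made-up numbers; how `ppsci.metric.L2Rel` reduces over batches or output keys may differ.

```python
import numpy as np


def l2_relative_error(pred: np.ndarray, label: np.ndarray) -> float:
    """||pred - label||_2 / ||label||_2 computed over all elements."""
    return float(np.linalg.norm(pred - label) / np.linalg.norm(label))


label = np.array([[3.0], [4.0]])  # norm 5.0
pred = np.array([[3.0], [4.5]])  # error norm 0.5
err = l2_relative_error(pred, label)
print(err)
```

Read this way, the reported L2Rel.G(G_eval) of 0.01799 corresponds to an error of about 1.8% relative to the magnitude of the reference values.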

3.7 Model Training and Evaluation

With all components configured, we pass them to ppsci.solver.Solver to commence training and evaluation.

solver = ppsci.solver.Solver(
    model,
    constraint,
    cfg.output_dir,
    optimizer,
    None,
    cfg.TRAIN.epochs,
    cfg.TRAIN.iters_per_epoch,
    save_freq=cfg.TRAIN.save_freq,
    eval_freq=cfg.TRAIN.eval_freq,
    log_freq=cfg.log_freq,
    seed=cfg.seed,
    validator=validator,
    eval_during_train=cfg.TRAIN.eval_during_train,
    checkpoint_path=cfg.TRAIN.checkpoint_path,
)
# train model
solver.train()
# evaluate after finished training
solver.eval()

3.8 Result Visualization

Post-training, we verify the model by constructing 9 synthetic \(u-G(u)\) function pairs. We discretize \(u\) and \(y\), predict \(G(u)(y)\), and compare the results with the analytical solutions.

    def predict_func(input_dict):
        return solver.predict(input_dict, return_numpy=True)[cfg.MODEL.G_key]

    plot(cfg, predict_func)
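
The discretization step can be sketched in isolation as follows. The function \(u\) is sampled once on the sensor grid and then repeated for every query coordinate, so that each row of the batch is one \((u, y)\) pair. The value of `num_y` is an assumption for illustration; the example reads it from cfg.NUM_Y.

```python
import numpy as np

num_loc = 100  # sensor locations for u (matches the dataset's 100 columns)
num_y = 1000  # number of query points; an assumed value for illustration

# sample u(x) = cos(x) once on the sensor grid: shape (1, num_loc)
x = np.linspace(0, 1, num_loc, dtype="float32").reshape([1, num_loc])
u = np.cos(x)
# repeat the same sampled function for every query coordinate: (num_y, num_loc)
u = np.tile(u, [num_y, 1])

# query coordinates and the analytical antiderivative at those points
y = np.linspace(0, 1, num_y, dtype="float32").reshape([num_y, 1])
G_ref = np.sin(y)

print(u.shape, y.shape, G_ref.shape)
```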



4. Complete Code

deeponet.py
"""
Reference: https://deepxde.readthedocs.io/en/latest/demos/operator/antiderivative_unaligned.html
"""

import os
from os import path as osp
from typing import Callable
from typing import Tuple

import hydra
import numpy as np
import paddle
from matplotlib import pyplot as plt
from omegaconf import DictConfig

import ppsci
from ppsci.utils import logger


def train(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    # set model
    model = ppsci.arch.DeepONet(**cfg.MODEL)

    # set dataloader config
    train_dataloader_cfg = {
        "dataset": {
            "name": "IterableNPZDataset",
            "file_path": cfg.TRAIN_FILE_PATH,
            "input_keys": ("u", "y"),
            "label_keys": ("G",),
            "alias_dict": {"u": "X_train0", "y": "X_train1", "G": "y_train"},
        },
    }

    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELoss(),
        {"G": lambda out: out["G"]},
    )
    # wrap constraints together
    constraint = {sup_constraint.name: sup_constraint}

    # set optimizer
    optimizer = ppsci.optimizer.Adam(cfg.TRAIN.learning_rate)(model)

    # set validator
    eval_dataloader_cfg = {
        "dataset": {
            "name": "IterableNPZDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": ("u", "y"),
            "label_keys": ("G",),
            "alias_dict": {"u": "X_test0", "y": "X_test1", "G": "y_test"},
        },
    }
    sup_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        {"G": lambda out: out["G"]},
        metric={"L2Rel": ppsci.metric.L2Rel()},
        name="G_eval",
    )
    validator = {sup_validator.name: sup_validator}

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        constraint,
        cfg.output_dir,
        optimizer,
        None,
        cfg.TRAIN.epochs,
        cfg.TRAIN.iters_per_epoch,
        save_freq=cfg.TRAIN.save_freq,
        eval_freq=cfg.TRAIN.eval_freq,
        log_freq=cfg.log_freq,
        seed=cfg.seed,
        validator=validator,
        eval_during_train=cfg.TRAIN.eval_during_train,
        checkpoint_path=cfg.TRAIN.checkpoint_path,
    )
    # train model
    solver.train()
    # evaluate after finished training
    solver.eval()

    def predict_func(input_dict):
        return solver.predict(input_dict, return_numpy=True)[cfg.MODEL.G_key]

    plot(cfg, predict_func)


def evaluate(cfg: DictConfig):
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(cfg.seed)
    # initialize logger
    logger.init_logger("ppsci", osp.join(cfg.output_dir, f"{cfg.mode}.log"), "info")

    # set model
    model = ppsci.arch.DeepONet(**cfg.MODEL)

    # set validator
    eval_dataloader_cfg = {
        "dataset": {
            "name": "IterableNPZDataset",
            "file_path": cfg.VALID_FILE_PATH,
            "input_keys": ("u", "y"),
            "label_keys": ("G",),
            "alias_dict": {"u": "X_test0", "y": "X_test1", "G": "y_test"},
        },
    }
    sup_validator = ppsci.validate.SupervisedValidator(
        eval_dataloader_cfg,
        ppsci.loss.MSELoss(),
        {"G": lambda out: out["G"]},
        metric={"L2Rel": ppsci.metric.L2Rel()},
        name="G_eval",
    )
    validator = {sup_validator.name: sup_validator}

    solver = ppsci.solver.Solver(
        model,
        None,
        cfg.output_dir,
        validator=validator,
        pretrained_model_path=cfg.EVAL.pretrained_model_path,
        eval_with_no_grad=cfg.EVAL.eval_with_no_grad,
    )
    solver.eval()

    def predict_func(input_dict):
        return solver.predict(input_dict, return_numpy=True)[cfg.MODEL.G_key]

    plot(cfg, predict_func)


def export(cfg: DictConfig):
    # set model
    model = ppsci.arch.DeepONet(**cfg.MODEL)

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        pretrained_model_path=cfg.INFER.pretrained_model_path,
    )

    # export model
    from paddle.static import InputSpec

    input_spec = [
        {
            model.input_keys[0]: InputSpec(
                [None, 1000], "float32", name=model.input_keys[0]
            ),
            model.input_keys[1]: InputSpec(
                [None, 1], "float32", name=model.input_keys[1]
            ),
        }
    ]
    solver.export(input_spec, cfg.INFER.export_path)


def inference(cfg: DictConfig):
    from deploy import python_infer

    predictor = python_infer.GeneralPredictor(cfg)

    def predict_func(input_dict):
        return next(iter(predictor.predict(input_dict).values()))

    plot(cfg, predict_func)


def plot(cfg: DictConfig, predict_func: Callable):
    # visualize prediction for different functions u and corresponding G(u)
    dtype = paddle.get_default_dtype()

    def generate_y_u_G_ref(
        u_func: Callable, G_u_func: Callable
    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        """Generate discretized data of given function u and corresponding G(u).

        Args:
            u_func (Callable): Function u.
            G_u_func (Callable): Function G(u).

        Returns:
            Tuple[np.ndarray, np.ndarray, np.ndarray]: Discretized data of u, y and G(u).
        """
        x = np.linspace(0, 1, cfg.MODEL.num_loc, dtype=dtype).reshape(
            [1, cfg.MODEL.num_loc]
        )
        u = u_func(x)
        u = np.tile(u, [cfg.NUM_Y, 1])

        y = np.linspace(0, 1, cfg.NUM_Y, dtype=dtype).reshape([cfg.NUM_Y, 1])
        G_ref = G_u_func(y)
        return u, y, G_ref

    func_u_G_pair = [
        # (title_string, func_u, func_G(u)), s.t. dG/dx == u and G(u)(0) = 0
        (r"$u=\cos(x), G(u)=\sin(x)$", lambda x: np.cos(x), lambda y: np.sin(y)),  # 1
        (
            r"$u=sec^2(x), G(u)=tan(x)$",
            lambda x: (1 / np.cos(x)) ** 2,
            lambda y: np.tan(y),
        ),  # 2
        (
            r"$u=sec(x)tan(x), G(u)=sec(x) - 1$",
            lambda x: (1 / np.cos(x) * np.tan(x)),
            lambda y: 1 / np.cos(y) - 1,
        ),  # 3
        (
            r"$u=1.5^x\ln{1.5}, G(u)=1.5^x-1$",
            lambda x: 1.5**x * np.log(1.5),
            lambda y: 1.5**y - 1,
        ),  # 4
        (r"$u=3x^2, G(u)=x^3$", lambda x: 3 * x**2, lambda y: y**3),  # 5
        (r"$u=4x^3, G(u)=x^4$", lambda x: 4 * x**3, lambda y: y**4),  # 6
        (r"$u=5x^4, G(u)=x^5$", lambda x: 5 * x**4, lambda y: y**5),  # 7
        (r"$u=6x^5, G(u)=x^6$", lambda x: 6 * x**5, lambda y: y**6),  # 8
        (r"$u=e^x, G(u)=e^x-1$", lambda x: np.exp(x), lambda y: np.exp(y) - 1),  # 9
    ]

    os.makedirs(os.path.join(cfg.output_dir, "visual"), exist_ok=True)
    for i, (title, u_func, G_func) in enumerate(func_u_G_pair):
        u, y, G_ref = generate_y_u_G_ref(u_func, G_func)
        G_pred = predict_func({"u": u, "y": y})
        plt.plot(y, G_pred, label=r"$G(u)(y)_{pred}$")
        plt.plot(y, G_ref, label=r"$G(u)(y)_{ref}$")
        plt.legend()
        plt.title(title)
        plt.savefig(os.path.join(cfg.output_dir, "visual", f"func_{i}_result.png"))
        logger.message(
            f"Saved result of function {i} to {cfg.output_dir}/visual/func_{i}_result.png"
        )
        plt.clf()
    plt.close()


@hydra.main(version_base=None, config_path="./conf", config_name="deeponet.yaml")
def main(cfg: DictConfig):
    if cfg.mode == "train":
        train(cfg)
    elif cfg.mode == "eval":
        evaluate(cfg)
    elif cfg.mode == "export":
        export(cfg)
    elif cfg.mode == "infer":
        inference(cfg)
    else:
        raise ValueError(
            f"cfg.mode should be in ['train', 'eval', 'export', 'infer'], but got '{cfg.mode}'"
        )


if __name__ == "__main__":
    main()

5. Result Display

Figures result0.jpg to result8.jpg: comparison plots of predicted and reference \(G(u)(y)\), one per test function.

6. References