NLS-MB

Training:

``` sh
# soliton
python NLS-MB_optical_soliton.py
# rogue wave
python NLS-MB_optical_rogue_wave.py
```

Evaluation with pretrained models:

``` sh
# soliton
python NLS-MB_optical_soliton.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/NLS-MB/NLS-MB_soliton_pretrained.pdparams
# rogue wave
python NLS-MB_optical_rogue_wave.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/NLS-MB/NLS-MB_rogue_wave_pretrained.pdparams
```

Model export:

``` sh
# soliton
python NLS-MB_optical_soliton.py mode=export
# rogue wave
python NLS-MB_optical_rogue_wave.py mode=export
```

Inference:

``` sh
# soliton
python NLS-MB_optical_soliton.py mode=infer
# rogue wave
python NLS-MB_optical_rogue_wave.py mode=infer
```
| Pretrained model | Metrics |
| --- | --- |
| NLS-MB_soliton_pretrained.pdparams | Residual/loss: 0.00000<br>Residual/MSE.Schrodinger_1: 0.00000<br>Residual/MSE.Schrodinger_2: 0.00000<br>Residual/MSE.Maxwell_1: 0.00000<br>Residual/MSE.Maxwell_2: 0.00000<br>Residual/MSE.Bloch: 0.00000 |
| NLS-MB_rogue_wave_pretrained.pdparams | Residual/loss: 0.00001<br>Residual/MSE.Schrodinger_1: 0.00000<br>Residual/MSE.Schrodinger_2: 0.00000<br>Residual/MSE.Maxwell_1: 0.00000<br>Residual/MSE.Maxwell_2: 0.00000<br>Residual/MSE.Bloch: 0.00000 |

1. Background

Nonlinear localized-wave dynamics is an important branch of nonlinear science, covering the basic forms of nonlinear localized waves: solitons, breathers, and rogue waves. Mode-locked laser technology has provided an experimental platform for verifying these theoretically predicted waves; rich nonlinear phenomena such as soliton molecules and rogue waves have been observed with it, further advancing the study of nonlinear localized waves. Research in this area now extends deep into fluid mechanics, nonlinear optics, Bose-Einstein condensates (BEC), plasma physics, and other fields. In fiber optics, the study of nonlinear dynamics underpins fiber-based optical devices, information processing, material design, and signal transmission, and has played a key role in the development of fiber lasers, amplifiers, waveguides, and communication technology.

The propagation of optical pulses in fibers is governed by nonlinear partial differential equations such as the nonlinear Schrödinger equation (NLSE). When dispersion and nonlinear effects coexist, these equations are usually hard to solve analytically. The split-step Fourier method (SFM) and its refinements are therefore widely used to study nonlinear effects in fibers; their advantages are simple implementation and relatively high accuracy. However, for long-distance, highly nonlinear scenarios, the SFM step size must be reduced drastically to meet accuracy requirements, which increases the computational burden: the time-domain grid becomes very large and the computation time-consuming. PINNs outperform purely data-driven methods with far less data, and their computational complexity (measured as a multiple) is typically about two orders of magnitude lower than that of the SFM.

2. Problem definition

In an erbium-doped fiber, the propagation of an optical pulse can be described by the coupled NLS-MB equations, which take the form

\[ \begin{cases} \dfrac{\partial E}{\partial x} = i \alpha_1 \dfrac{\partial^2 E}{\partial t ^2} - i \alpha_2 |E|^2 E+2 p \\ \dfrac{\partial p}{\partial t} = 2 i \omega_0 p+2 E \eta \\ \dfrac{\partial \eta}{\partial t} = -(E p^* + E^* p) \end{cases} \]

Here x and t denote the normalized propagation distance and time, the complex envelope E is the slowly varying electric field, p is a measure of the polarization of the resonant medium, \(\eta\) denotes the degree of population inversion, and the symbol * denotes complex conjugation. \(\alpha_1\) is the group-velocity dispersion parameter, \(\alpha_2\) is the Kerr nonlinearity parameter, and \(\omega_0\) measures the offset of the resonance frequency. The NLS-MB system was first proposed by Maimistov and Manykin to describe the propagation of extremely short pulses in a Kerr nonlinear medium. It also plays an important role in addressing the problem that fiber losses limit the transmission distance. The system describes a mixed state of self-induced-transparency (SIT) solitons and NLS solitons, known as SIT-NLS solitons; the two kinds of solitons can coexist, and their application in optical fiber communication has been studied extensively.

2.1 Optical soliton

In the anomalous-dispersion regime of an optical fiber, the interplay between dispersion and nonlinear effects can produce a remarkable phenomenon: the optical soliton. A soliton is a special wave packet that can travel long distances without changing shape. Solitons have been studied widely in many branches of physics; the fiber solitons discussed in this case are not only of fundamental theoretical interest but also have practical applications in optical fiber communication. The analytic solution used in this case is

\[ \begin{gathered} E(x,t) = \frac{2\exp(-2it)}{\cosh(2t + 6x)}, \\ p(x,t) = \frac{\exp(-2it)\left[ \exp(-2t - 6x) - \exp(2t + 6x) \right]}{\cosh^2(2t + 6x)}, \\ \eta(x,t) = \frac{\cosh^2(2t + 6x) - 2}{\cosh^2(2t + 6x)}. \end{gathered} \]
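Before coding anything up, it is worth confirming the closed form. The short SymPy check below (a sanity check added here, not part of the tutorial script) verifies that E and p satisfy the first Schrödinger equation of the NLS-MB system with the parameter values used later in the tutorial (\(\alpha_1 = 0.5\), \(\alpha_2 = -1\)):

```python
import sympy as sp

x, t = sp.symbols("x t", real=True)
theta = 2 * t + 6 * x
E = 2 * sp.exp(-2 * sp.I * t) / sp.cosh(theta)
p = sp.exp(-2 * sp.I * t) * (sp.exp(-theta) - sp.exp(theta)) / sp.cosh(theta) ** 2

# Schrodinger part of NLS-MB: E_x = i*alpha_1*E_tt - i*alpha_2*|E|^2*E + 2*p
residual = sp.diff(E, x) - (
    sp.I * sp.Rational(1, 2) * sp.diff(E, t, 2)  # i*alpha_1*E_tt, alpha_1 = 0.5
    + sp.I * E * sp.conjugate(E) * E             # -i*alpha_2*|E|^2*E, alpha_2 = -1
    + 2 * p
)

# evaluate the exact symbolic residual at a few sample points
for xv, tv in [(0.0, 0.0), (0.1, 0.2), (-0.3, 0.5)]:
    val = complex(residual.subs({x: xv, t: tv}).evalf())
    assert abs(val) < 1e-9, (xv, tv, val)
print("soliton satisfies the Schrodinger part of the NLS-MB system")
```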

We take the computational domain to be \([-1, 1] \times [-1, 1]\). We first fix the optimization strategy: each boundary carries \(200\) points, i.e. \(N_b = 2 \times 200\), and \(20{,}000\) points are randomly selected inside the domain to compute the NLS-MB equation loss.
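This point budget can be sketched with plain NumPy (an illustration only; the actual script draws the boundary and initial points from a 256 × 256 grid and lets PaddleScience sample the interior):

```python
import numpy as np

rng = np.random.default_rng(42)

N_EDGE = 200     # points per boundary edge, N_b = 2 x 200
N_PDE = 20_000   # interior collocation points for the equation loss

# boundary points: x = -1 and x = +1, with t drawn uniformly from [-1, 1]
t_bc = rng.uniform(-1.0, 1.0, (2 * N_EDGE, 1))
x_bc = np.vstack([np.full((N_EDGE, 1), -1.0), np.full((N_EDGE, 1), 1.0)])

# interior collocation points, uniform over [-1, 1] x [-1, 1]
t_pde = rng.uniform(-1.0, 1.0, (N_PDE, 1))
x_pde = rng.uniform(-1.0, 1.0, (N_PDE, 1))

print(x_bc.shape, t_pde.shape)
```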

2.2 Optical rogue wave

Optical rogue waves are an optical phenomenon analogous to rogue waves on the ocean, but arising in optical systems: light waves that appear suddenly with anomalously large amplitude. They have potential applications, especially in optical communication and laser technology; some studies suggest they can be used to enhance the transmission and processing of optical signals, or to generate ultrashort laser pulses. We take the computational domain to be \([-0.5, 0.5] \times [-2.5, 2.5]\).

3. Solving the problem

The following sections explain, step by step, how to translate the problem into PaddleScience code and solve it with deep learning. To help you get up to speed with PaddleScience quickly, only the key steps, such as model construction, equation construction, and computational-domain construction, are covered; for the remaining details please refer to the API documentation.

3.1 Model construction

We train with the MLP model that is standard in PINN work.

model = ppsci.arch.MLP(**cfg.MODEL)
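The model hyperparameters come from `cfg.MODEL`, which is read from the Hydra config file. A plausible sketch of that section is shown below; the field names follow `ppsci.arch.MLP`, but the concrete values here are illustrative assumptions, and the authoritative settings live in `conf/NLS-MB_soliton.yaml`:

```yaml
MODEL:
  input_keys: ["t", "x"]                         # normalized time and distance
  output_keys: ["Eu", "Ev", "pu", "pv", "eta"]   # Re/Im parts of E and p, plus eta
  num_layers: 5        # illustrative depth
  hidden_size: 100     # illustrative width
  activation: "tanh"   # a common PINN choice
```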

3.2 Equation construction

Since the optical-soliton case is governed by the NLS-MB equations, we can directly use the NLSMB equation built into PaddleScience.

equation = {
    "NLS-MB": ppsci.equation.NLSMB(alpha_1=0.5, alpha_2=-1, omega_0=-1, time=True)
}

3.3 Computational-domain construction

The optical-soliton problem is posed on the spatial interval (-1.0, 1.0) and the time interval (-1.0, 1.0), so we can directly use the built-in spatio-temporal geometry time_interval from PaddleScience as the computational domain.

geom = {
    "time_interval": ppsci.geometry.TimeXGeometry(
        ppsci.geometry.TimeDomain(t_lower, t_upper, timestamps=timestamps),
        ppsci.geometry.Interval(x_lower, x_upper),
    )
}

3.4 Constraint construction

Since the dataset is given by an analytic solution, we first construct the analytic-solution function.

def analytic_solution(out):
    t, x = out["t"], out["x"]
    Eu_true = 2 * np.cos(2 * t) / np.cosh(2 * t + 6 * x)

    Ev_true = -2 * np.sin(2 * t) / np.cosh(2 * t + 6 * x)

    pu_true = (
        (np.exp(-2 * t - 6 * x) - np.exp(2 * t + 6 * x))
        * np.cos(2 * t)
        / np.cosh(2 * t + 6 * x) ** 2
    )
    pv_true = (
        -(np.exp(-2 * t - 6 * x) - np.exp(2 * t + 6 * x))
        * np.sin(2 * t)
        / np.cosh(2 * t + 6 * x) ** 2
    )
    eta_true = (np.cosh(2 * t + 6 * x) ** 2 - 2) / np.cosh(2 * t + 6 * x) ** 2

    return Eu_true, Ev_true, pu_true, pv_true, eta_true
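As a quick sanity check (not part of the original script), we can verify numerically that these closed forms satisfy the Bloch equation \(\partial \eta / \partial t = -(E p^* + E^* p)\) using a central finite difference; the helper restates analytic_solution so the snippet is self-contained:

```python
import numpy as np

def analytic_solution(out):
    # same closed forms as above, written with theta = 2t + 6x
    t, x = out["t"], out["x"]
    theta = 2 * t + 6 * x
    Eu = 2 * np.cos(2 * t) / np.cosh(theta)
    Ev = -2 * np.sin(2 * t) / np.cosh(theta)
    pu = (np.exp(-theta) - np.exp(theta)) * np.cos(2 * t) / np.cosh(theta) ** 2
    pv = -(np.exp(-theta) - np.exp(theta)) * np.sin(2 * t) / np.cosh(theta) ** 2
    eta = (np.cosh(theta) ** 2 - 2) / np.cosh(theta) ** 2
    return Eu, Ev, pu, pv, eta

t = np.linspace(-0.8, 0.8, 41)
x = np.full_like(t, 0.1)
h = 1e-5  # finite-difference step in t

Eu, Ev, pu, pv, _ = analytic_solution({"t": t, "x": x})
eta_p = analytic_solution({"t": t + h, "x": x})[4]
eta_m = analytic_solution({"t": t - h, "x": x})[4]

# Bloch equation: d(eta)/dt = -(E p* + E* p) = -2 * (Eu*pu + Ev*pv)
deta_dt = (eta_p - eta_m) / (2 * h)
rhs = -2 * (Eu * pu + Ev * pv)
print("max Bloch residual:", np.max(np.abs(deta_dt - rhs)))
```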

3.4.1 Interior-point constraint

Taking the InteriorConstraint that acts on interior points as an example, the code is as follows:

pde_constraint = ppsci.constraint.InteriorConstraint(
    equation["NLS-MB"].equations,
    {
        "Schrodinger_1": 0,
        "Schrodinger_2": 0,
        "Maxwell_1": 0,
        "Maxwell_2": 0,
        "Bloch": 0,
    },
    geom["time_interval"],
    {
        "dataset": {"name": "IterableNamedArrayDataset"},
        "batch_size": 20000,
        "iters_per_epoch": cfg.TRAIN.iters_per_epoch,
    },
    ppsci.loss.MSELoss(),
    evenly=True,
    name="EQ",
)

The first argument to InteriorConstraint is the expression (or system) of equations that describes how to compute the constraint target; here we pass the equation["NLS-MB"].equations instantiated in Section 3.2.

The second argument is the target value of the constrained variables; in this problem we want every NLS-MB equation residual to be optimized toward 0.

The third argument is the computational domain on which the constraint acts; here we pass the geom["time_interval"] instantiated in Section 3.3.

The fourth argument is the sampling configuration on the computational domain; here batch_size is set to 20000.

The fifth argument is the loss function; we choose the common MSE loss, with reduction set to "mean", i.e. the mean of the squared errors over all participating data points.

The last argument is the name of the constraint; every constraint needs a name so it can be referenced later. Here naming it "EQ" suffices.

3.4.2 Boundary constraint

Since the boundary and initial points have analytic values, we impose a supervised constraint on them.

sup_constraint = ppsci.constraint.SupervisedConstraint(
    train_dataloader_cfg,
    ppsci.loss.MSELoss("mean"),
    name="Sup",
)
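The train_dataloader_cfg passed here is defined earlier in the full script (Section 4). Below is a self-contained sketch of its shape, using random stand-ins for the 600 boundary/initial points and zero placeholders where the real script fills labels via analytic_solution:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 600  # 200 initial points + 2 x 200 boundary points

# stand-in coordinates; the real script samples them from a 256 x 256 grid
t_b = rng.uniform(-1.0, 1.0, (n, 1)).astype("float32")
x_b = rng.uniform(-1.0, 1.0, (n, 1)).astype("float32")

# stand-in labels; the real script computes these with analytic_solution(...)
labels = {k: np.zeros((n, 1), "float32") for k in ("Eu", "Ev", "pu", "pv", "eta")}

train_dataloader_cfg = {
    "dataset": {
        "name": "NamedArrayDataset",
        "input": {"t": t_b, "x": x_b},
        "label": labels,
    },
    "batch_size": 600,  # all supervised points in a single batch
    "iters_per_epoch": 1,
}
print(sorted(train_dataloader_cfg["dataset"]["label"]))
```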

3.5 Hyperparameter settings

Next we specify the number of training epochs and the learning rate. Based on experimental experience, we train for 50,000 epochs with an initial learning rate of 0.001.

# training settings
TRAIN:
  epochs: 50000
  iters_per_epoch: 1
  lbfgs:
    iters_per_epoch: ${TRAIN.iters_per_epoch}
    output_dir: ${output_dir}LBFGS
    learning_rate: 1.0
    max_iter: 1
    eval_freq: ${TRAIN.eval_freq}
    eval_during_train: ${TRAIN.eval_during_train}
  eval_during_train: true
  eval_freq: 1000
  learning_rate: 0.001

3.6 Optimizer construction

Training updates the model parameters via an optimizer; here we choose the widely used Adam optimizer.

optimizer = ppsci.optimizer.Adam(learning_rate=cfg.TRAIN.learning_rate)(model)

3.7 Validator construction

During training, the model is usually evaluated on a validation (test) set at fixed epoch intervals to monitor progress, so we build a validator with ppsci.validate.GeometryValidator.

residual_validator = ppsci.validate.GeometryValidator(
    equation["NLS-MB"].equations,
    {
        "Schrodinger_1": 0,
        "Schrodinger_2": 0,
        "Maxwell_1": 0,
        "Maxwell_2": 0,
        "Bloch": 0,
    },
    geom["time_interval"],
    {
        "dataset": "IterableNamedArrayDataset",
        "total_size": 20600,
    },
    ppsci.loss.MSELoss(),
    evenly=True,
    metric={"MSE": ppsci.metric.MSE()},
    with_initial=True,
    name="Residual",
)
validator = {residual_validator.name: residual_validator}

3.8 Visualizer construction

After training finishes, we can sample points in the computational domain, run prediction, compute the amplitudes manually, and visualize the results.

vis_points = geom["time_interval"].sample_interior(20000, evenly=True)
Eu_true, Ev_true, pu_true, pv_true, eta_true = analytic_solution(vis_points)
pred = solver.predict(vis_points, return_numpy=True)
t = vis_points["t"][:, 0]
x = vis_points["x"][:, 0]
E_ref = np.sqrt(Eu_true**2 + Ev_true**2)[:, 0]
E_pred = np.sqrt(pred["Eu"] ** 2 + pred["Ev"] ** 2)[:, 0]
p_ref = np.sqrt(pu_true**2 + pv_true**2)[:, 0]
p_pred = np.sqrt(pred["pu"] ** 2 + pred["pv"] ** 2)[:, 0]
eta_ref = eta_true[:, 0]
eta_pred = pred["eta"][:, 0]

# plot
plot(t, x, E_ref, E_pred, p_ref, p_pred, eta_ref, eta_pred, cfg.output_dir)

3.9 Model training, evaluation, and visualization

3.9.1 Training with Adam

With everything above in place, we simply pass the instantiated objects to ppsci.solver.Solver in order, then launch training, evaluation, and visualization.

solver = ppsci.solver.Solver(
    model,
    constraint,
    cfg.output_dir,
    optimizer,
    epochs=cfg.TRAIN.epochs,
    iters_per_epoch=cfg.TRAIN.iters_per_epoch,
    eval_during_train=cfg.TRAIN.eval_during_train,
    eval_freq=cfg.TRAIN.eval_freq,
    equation=equation,
    geom=geom,
    validator=validator,
)
# train model
solver.train()
# evaluate after finished training
solver.eval()

3.9.2 Fine-tuning with L-BFGS [optional]

After training with the Adam optimizer, we can switch to the second-order optimizer L-BFGS and continue training for a small number of epochs (here 10% of the Adam epoch count suffices) to further improve model accuracy.

OUTPUT_DIR = cfg.TRAIN.lbfgs.output_dir
logger.init_logger("ppsci", osp.join(OUTPUT_DIR, f"{cfg.mode}.log"), "info")
EPOCHS = cfg.TRAIN.epochs // 10
optimizer_lbfgs = ppsci.optimizer.LBFGS(
    cfg.TRAIN.lbfgs.learning_rate, cfg.TRAIN.lbfgs.max_iter
)(model)
solver = ppsci.solver.Solver(
    model,
    constraint,
    OUTPUT_DIR,
    optimizer_lbfgs,
    None,
    EPOCHS,
    cfg.TRAIN.lbfgs.iters_per_epoch,
    eval_during_train=cfg.TRAIN.lbfgs.eval_during_train,
    eval_freq=cfg.TRAIN.lbfgs.eval_freq,
    equation=equation,
    geom=geom,
    validator=validator,
)
# train model
solver.train()
# evaluate after finished training
solver.eval()
Tip

In most scenarios, fine-tuning with L-BFGS for a small number of epochs after training with a regular optimizer can further and effectively improve model accuracy.

4. Complete code

NLS-MB_optical_soliton.py
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from os import path as osp

import hydra
import numpy as np
from matplotlib import pyplot as plt
from omegaconf import DictConfig

import ppsci
from ppsci.utils import logger


def analytic_solution(out):
    t, x = out["t"], out["x"]
    Eu_true = 2 * np.cos(2 * t) / np.cosh(2 * t + 6 * x)

    Ev_true = -2 * np.sin(2 * t) / np.cosh(2 * t + 6 * x)

    pu_true = (
        (np.exp(-2 * t - 6 * x) - np.exp(2 * t + 6 * x))
        * np.cos(2 * t)
        / np.cosh(2 * t + 6 * x) ** 2
    )
    pv_true = (
        -(np.exp(-2 * t - 6 * x) - np.exp(2 * t + 6 * x))
        * np.sin(2 * t)
        / np.cosh(2 * t + 6 * x) ** 2
    )
    eta_true = (np.cosh(2 * t + 6 * x) ** 2 - 2) / np.cosh(2 * t + 6 * x) ** 2

    return Eu_true, Ev_true, pu_true, pv_true, eta_true


def plot(
    t: np.ndarray,
    x: np.ndarray,
    E_ref: np.ndarray,
    E_pred: np.ndarray,
    p_ref: np.ndarray,
    p_pred: np.ndarray,
    eta_ref: np.ndarray,
    eta_pred: np.ndarray,
    output_dir: str,
):
    fig = plt.figure(figsize=(10, 10))
    plt.subplot(3, 3, 1)
    plt.title("E_ref")
    plt.tricontourf(x, t, E_ref, levels=256, cmap="jet")
    plt.subplot(3, 3, 2)
    plt.title("E_pred")
    plt.tricontourf(x, t, E_pred, levels=256, cmap="jet")
    plt.subplot(3, 3, 3)
    plt.title("E_diff")
    plt.tricontourf(x, t, np.abs(E_ref - E_pred), levels=256, cmap="jet")
    plt.subplot(3, 3, 4)
    plt.title("p_ref")
    plt.tricontourf(x, t, p_ref, levels=256, cmap="jet")
    plt.subplot(3, 3, 5)
    plt.title("p_pred")
    plt.tricontourf(x, t, p_pred, levels=256, cmap="jet")
    plt.subplot(3, 3, 6)
    plt.title("p_diff")
    plt.tricontourf(x, t, np.abs(p_ref - p_pred), levels=256, cmap="jet")
    plt.subplot(3, 3, 7)
    plt.title("eta_ref")
    plt.tricontourf(x, t, eta_ref, levels=256, cmap="jet")
    plt.subplot(3, 3, 8)
    plt.title("eta_pred")
    plt.tricontourf(x, t, eta_pred, levels=256, cmap="jet")
    plt.subplot(3, 3, 9)
    plt.title("eta_diff")
    plt.tricontourf(x, t, np.abs(eta_ref - eta_pred), levels=256, cmap="jet")
    fig_path = osp.join(output_dir, "pred_optical_soliton.png")
    print(f"Saving figure to {fig_path}")
    fig.savefig(fig_path, bbox_inches="tight", dpi=400)
    plt.close()


def train(cfg: DictConfig):
    # set model
    model = ppsci.arch.MLP(**cfg.MODEL)

    # set equation
    equation = {
        "NLS-MB": ppsci.equation.NLSMB(alpha_1=0.5, alpha_2=-1, omega_0=-1, time=True)
    }

    x_lower = -1
    x_upper = 1
    t_lower = -1
    t_upper = 1
    # set timestamps(including initial t0)
    timestamps = np.linspace(t_lower, t_upper, cfg.NTIME_ALL, endpoint=True)
    # set time-geometry
    geom = {
        "time_interval": ppsci.geometry.TimeXGeometry(
            ppsci.geometry.TimeDomain(t_lower, t_upper, timestamps=timestamps),
            ppsci.geometry.Interval(x_lower, x_upper),
        )
    }

    X, T = np.meshgrid(
        np.linspace(x_lower, x_upper, 256), np.linspace(t_lower, t_upper, 256)
    )
    X_star = np.hstack((X.flatten()[:, None], T.flatten()[:, None]))

    # Boundary and Initial conditions
    ic = X_star[:, 1] == t_lower
    idx_ic = np.random.choice(np.where(ic)[0], 200, replace=False)
    lb = X_star[:, 0] == x_lower
    idx_lb = np.random.choice(np.where(lb)[0], 200, replace=False)
    ub = X_star[:, 0] == x_upper
    idx_ub = np.random.choice(np.where(ub)[0], 200, replace=False)
    icbc_idx = np.hstack((idx_lb, idx_ic, idx_ub))
    X_u_train = X_star[icbc_idx].astype("float32")
    X_u_train = {"t": X_u_train[:, 1:2], "x": X_u_train[:, 0:1]}

    Eu_train, Ev_train, pu_train, pv_train, eta_train = analytic_solution(X_u_train)

    train_dataloader_cfg = {
        "dataset": {
            "name": "NamedArrayDataset",
            "input": {"t": X_u_train["t"], "x": X_u_train["x"]},
            "label": {
                "Eu": Eu_train,
                "Ev": Ev_train,
                "pu": pu_train,
                "pv": pv_train,
                "eta": eta_train,
            },
        },
        "batch_size": 600,
        "iters_per_epoch": cfg.TRAIN.iters_per_epoch,
    }

    # set constraint
    pde_constraint = ppsci.constraint.InteriorConstraint(
        equation["NLS-MB"].equations,
        {
            "Schrodinger_1": 0,
            "Schrodinger_2": 0,
            "Maxwell_1": 0,
            "Maxwell_2": 0,
            "Bloch": 0,
        },
        geom["time_interval"],
        {
            "dataset": {"name": "IterableNamedArrayDataset"},
            "batch_size": 20000,
            "iters_per_epoch": cfg.TRAIN.iters_per_epoch,
        },
        ppsci.loss.MSELoss(),
        evenly=True,
        name="EQ",
    )

    # supervised constraint s.t ||u-u_0||
    sup_constraint = ppsci.constraint.SupervisedConstraint(
        train_dataloader_cfg,
        ppsci.loss.MSELoss("mean"),
        name="Sup",
    )

    # wrap constraints together
    constraint = {
        pde_constraint.name: pde_constraint,
        sup_constraint.name: sup_constraint,
    }

    # set optimizer
    optimizer = ppsci.optimizer.Adam(learning_rate=cfg.TRAIN.learning_rate)(model)

    # set validator
    residual_validator = ppsci.validate.GeometryValidator(
        equation["NLS-MB"].equations,
        {
            "Schrodinger_1": 0,
            "Schrodinger_2": 0,
            "Maxwell_1": 0,
            "Maxwell_2": 0,
            "Bloch": 0,
        },
        geom["time_interval"],
        {
            "dataset": "IterableNamedArrayDataset",
            "total_size": 20600,
        },
        ppsci.loss.MSELoss(),
        evenly=True,
        metric={"MSE": ppsci.metric.MSE()},
        with_initial=True,
        name="Residual",
    )
    validator = {residual_validator.name: residual_validator}

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        constraint,
        cfg.output_dir,
        optimizer,
        epochs=cfg.TRAIN.epochs,
        iters_per_epoch=cfg.TRAIN.iters_per_epoch,
        eval_during_train=cfg.TRAIN.eval_during_train,
        eval_freq=cfg.TRAIN.eval_freq,
        equation=equation,
        geom=geom,
        validator=validator,
    )
    # train model
    solver.train()
    # evaluate after finished training
    solver.eval()

    # fine-tuning pretrained model with L-BFGS
    OUTPUT_DIR = cfg.TRAIN.lbfgs.output_dir
    logger.init_logger("ppsci", osp.join(OUTPUT_DIR, f"{cfg.mode}.log"), "info")
    EPOCHS = cfg.TRAIN.epochs // 10
    optimizer_lbfgs = ppsci.optimizer.LBFGS(
        cfg.TRAIN.lbfgs.learning_rate, cfg.TRAIN.lbfgs.max_iter
    )(model)
    solver = ppsci.solver.Solver(
        model,
        constraint,
        OUTPUT_DIR,
        optimizer_lbfgs,
        None,
        EPOCHS,
        cfg.TRAIN.lbfgs.iters_per_epoch,
        eval_during_train=cfg.TRAIN.lbfgs.eval_during_train,
        eval_freq=cfg.TRAIN.lbfgs.eval_freq,
        equation=equation,
        geom=geom,
        validator=validator,
    )
    # train model
    solver.train()
    # evaluate after finished training
    solver.eval()

    # visualize prediction
    vis_points = geom["time_interval"].sample_interior(20000, evenly=True)
    Eu_true, Ev_true, pu_true, pv_true, eta_true = analytic_solution(vis_points)
    pred = solver.predict(vis_points, return_numpy=True)
    t = vis_points["t"][:, 0]
    x = vis_points["x"][:, 0]
    E_ref = np.sqrt(Eu_true**2 + Ev_true**2)[:, 0]
    E_pred = np.sqrt(pred["Eu"] ** 2 + pred["Ev"] ** 2)[:, 0]
    p_ref = np.sqrt(pu_true**2 + pv_true**2)[:, 0]
    p_pred = np.sqrt(pred["pu"] ** 2 + pred["pv"] ** 2)[:, 0]
    eta_ref = eta_true[:, 0]
    eta_pred = pred["eta"][:, 0]

    # plot
    plot(t, x, E_ref, E_pred, p_ref, p_pred, eta_ref, eta_pred, cfg.output_dir)


def evaluate(cfg: DictConfig):
    # set model
    model = ppsci.arch.MLP(**cfg.MODEL)

    # set equation
    equation = {
        "NLS-MB": ppsci.equation.NLSMB(alpha_1=0.5, alpha_2=-1, omega_0=-1, time=True)
    }

    # set geometry
    x_lower = -1
    x_upper = 1
    t_lower = -1
    t_upper = 1
    # set timestamps(including initial t0)
    timestamps = np.linspace(t_lower, t_upper, cfg.NTIME_ALL, endpoint=True)
    # set time-geometry
    geom = {
        "time_interval": ppsci.geometry.TimeXGeometry(
            ppsci.geometry.TimeDomain(t_lower, t_upper, timestamps=timestamps),
            ppsci.geometry.Interval(x_lower, x_upper),
        )
    }

    # set validator
    residual_validator = ppsci.validate.GeometryValidator(
        equation["NLS-MB"].equations,
        {
            "Schrodinger_1": 0,
            "Schrodinger_2": 0,
            "Maxwell_1": 0,
            "Maxwell_2": 0,
            "Bloch": 0,
        },
        geom["time_interval"],
        {
            "dataset": "IterableNamedArrayDataset",
            "total_size": 20600,
        },
        ppsci.loss.MSELoss(),
        evenly=True,
        metric={"MSE": ppsci.metric.MSE()},
        with_initial=True,
        name="Residual",
    )
    validator = {residual_validator.name: residual_validator}

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        output_dir=cfg.output_dir,
        eval_freq=cfg.TRAIN.eval_freq,
        equation=equation,
        geom=geom,
        validator=validator,
        pretrained_model_path=cfg.EVAL.pretrained_model_path,
    )
    solver.eval()

    # visualize prediction
    vis_points = geom["time_interval"].sample_interior(20000, evenly=True)
    Eu_true, Ev_true, pu_true, pv_true, eta_true = analytic_solution(vis_points)
    pred = solver.predict(vis_points, return_numpy=True)
    t = vis_points["t"][:, 0]
    x = vis_points["x"][:, 0]
    E_ref = np.sqrt(Eu_true**2 + Ev_true**2)[:, 0]
    E_pred = np.sqrt(pred["Eu"] ** 2 + pred["Ev"] ** 2)[:, 0]
    p_ref = np.sqrt(pu_true**2 + pv_true**2)[:, 0]
    p_pred = np.sqrt(pred["pu"] ** 2 + pred["pv"] ** 2)[:, 0]
    eta_ref = eta_true[:, 0]
    eta_pred = pred["eta"][:, 0]

    # plot
    plot(t, x, E_ref, E_pred, p_ref, p_pred, eta_ref, eta_pred, cfg.output_dir)


def export(cfg: DictConfig):
    # set model
    model = ppsci.arch.MLP(**cfg.MODEL)

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        pretrained_model_path=cfg.INFER.pretrained_model_path,
    )
    # export model
    from paddle.static import InputSpec

    input_spec = [
        {key: InputSpec([None, 1], "float32", name=key) for key in model.input_keys},
    ]
    solver.export(input_spec, cfg.INFER.export_path)


def inference(cfg: DictConfig):
    from deploy.python_infer import pinn_predictor

    predictor = pinn_predictor.PINNPredictor(cfg)

    # set geometry
    x_lower = -1
    x_upper = 1
    t_lower = -1
    t_upper = 1
    # set timestamps(including initial t0)
    timestamps = np.linspace(t_lower, t_upper, cfg.NTIME_ALL, endpoint=True)
    # set time-geometry
    geom = {
        "time_interval": ppsci.geometry.TimeXGeometry(
            ppsci.geometry.TimeDomain(t_lower, t_upper, timestamps=timestamps),
            ppsci.geometry.Interval(x_lower, x_upper),
        )
    }

    NPOINT_TOTAL = cfg.NPOINT_INTERIOR + cfg.NPOINT_BC
    input_dict = geom["time_interval"].sample_interior(NPOINT_TOTAL, evenly=True)

    output_dict = predictor.predict(
        {key: input_dict[key] for key in cfg.MODEL.input_keys}, cfg.INFER.batch_size
    )

    # mapping data to cfg.INFER.output_keys
    output_dict = {
        store_key: output_dict[infer_key]
        for store_key, infer_key in zip(cfg.MODEL.output_keys, output_dict.keys())
    }

    # visualize prediction
    Eu_true, Ev_true, pu_true, pv_true, eta_true = analytic_solution(input_dict)
    t = input_dict["t"][:, 0]
    x = input_dict["x"][:, 0]
    E_ref = np.sqrt(Eu_true**2 + Ev_true**2)[:, 0]
    E_pred = np.sqrt(output_dict["Eu"] ** 2 + output_dict["Ev"] ** 2)[:, 0]
    p_ref = np.sqrt(pu_true**2 + pv_true**2)[:, 0]
    p_pred = np.sqrt(output_dict["pu"] ** 2 + output_dict["pv"] ** 2)[:, 0]
    eta_ref = eta_true[:, 0]
    eta_pred = output_dict["eta"][:, 0]

    # plot
    plot(t, x, E_ref, E_pred, p_ref, p_pred, eta_ref, eta_pred, cfg.output_dir)


@hydra.main(version_base=None, config_path="./conf", config_name="NLS-MB_soliton.yaml")
def main(cfg: DictConfig):
    if cfg.mode == "train":
        train(cfg)
    elif cfg.mode == "eval":
        evaluate(cfg)
    elif cfg.mode == "export":
        export(cfg)
    elif cfg.mode == "infer":
        inference(cfg)
    else:
        raise ValueError(
            f"cfg.mode should in ['train', 'eval', 'export', 'infer'], but got '{cfg.mode}'"
        )


if __name__ == "__main__":
    main()

5. Results

5.1 optical_soliton

(Figure: optical_soliton)

Comparison of the analytic solution with the PINN prediction. From top to bottom: the slowly varying electric field (E), the resonant-medium polarization (p), and the degree of population inversion (eta).

5.2 optical_rogue_wave

(Figure: optical_rogue_wave)

Comparison of the analytic solution with the PINN prediction. From top to bottom: the slowly varying electric field (E), the resonant-medium polarization (p), and the degree of population inversion (eta).

As shown above, the PINN prediction agrees closely with the analytic solution.

6. References

[1] S.-Y. Xu, Q. Zhou, and W. Liu, Prediction of Soliton Evolution and Equation Parameters for NLS–MB Equation Based on the phPINN Algorithm, Nonlinear Dyn (2023).