Development Guide¶
This document describes how to develop code based on the PaddleScience suite and ultimately contribute to the PaddleScience suite.
Before starting PaddleScience-related paper reproduction and API development tasks, you need to submit an RFC document. Please refer to: PaddleScience RFC Template
1. Preparation¶
- Fork PaddleScience to your own repository on the web page.
- Clone PaddleScience from your own repository to your local machine and enter the directory. Fill in your GitHub username in the `USER_NAME` field of the `clone` command.
- Install the necessary dependencies.
- Create a new branch based on the current `develop` branch (assume the new branch is named `dev_model`).
- Add the PaddleScience directory to the system environment variable `PYTHONPATH`.
- Run the installation check to verify that the basic functions of the installed PaddleScience work. If `PaddleScience is installed successfully.✨ 🍰 ✨` appears, the installation is verified.
- Install pre-commit [Important]

    PaddleScience is an open-source code base developed by many contributors. To keep the merged code style clean and consistent, PaddleScience uses automated code checking and formatting plugins, including isort and black, so that committed code follows the Python PEP 8 style specification.

    Before committing your code, be sure to install `pre-commit` from the `PaddleScience/` directory. Otherwise, the code-style check will flag the submitted PR as unformatted and it cannot be merged.

    If you have already committed code, install pre-commit first, then format the code manually with `pre-commit run --files your/committed/code/file/or/folder`, `git add` the modified files, and `git commit` again.

    For details on pre-commit, refer to the Paddle Code Style Check Guide.
2. Write Code¶
After completing the above preparation work, you can start developing your own case or function based on PaddleScience.
Assume that the path of the new case code file is: PaddleScience/examples/demo/demo.py. Next, we will introduce this process in detail.
2.1 Import Necessary Packages¶
All APIs provided by PaddleScience are under the ppsci.* module. Therefore, at the beginning of demo.py, you first need to import the ppsci top-level module, then import the log printing module logger to facilitate automatically recording logs to local files when printing logs, and finally import other necessary modules according to your own needs.
```python
import ppsci
from ppsci.utils import logger

# Import other necessary modules
# import ...
```
2.2 Set Running Environment¶
Before running demo.py, some necessary running environment settings are required, such as fixing random seeds (guaranteeing experiment reproducibility), setting output directories and initializing log printing modules (saving important experimental data).
```python
if __name__ == "__main__":
    # set random seed for reproducibility
    ppsci.utils.misc.set_random_seed(42)
    # set output directory
    OUTPUT_DIR = "./output_example"
    # initialize logger
    logger.init_logger("ppsci", f"{OUTPUT_DIR}/train.log", "info")
```
After completing the above steps, demo.py has set up the necessary framework. Next, we will introduce how to develop or reuse other modules under ppsci.* based on your specific needs, so as to finally use them in demo.py.
2.3 Build Model¶
2.3.1 Build Existing Model¶
PaddleScience has some common models built in, such as the MLP model. To use a built-in model, call the API under ppsci.arch.* and fill in the parameters required for instantiation.

```python
# create a MLP model
model = ppsci.arch.MLP(("x", "y"), ("u", "v", "p"), 9, 50, "tanh")
```

The above code instantiates a fully connected MLP model. Its input data has two fields, "x" and "y", and its output data has three fields, "u", "v", and "p". The model has \(9\) hidden layers with \(50\) neurons each, and every hidden layer uses the hyperbolic tangent function \(\tanh\) as its activation.
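To make the constructor arguments concrete, the sketch below lists the layer shapes implied by the call above. It assumes plain fully connected layers with biases and none of the optional embeddings; the helper `mlp_layer_shapes` is hypothetical and is not part of ppsci.

```python
from typing import List, Sequence, Tuple


def mlp_layer_shapes(
    input_keys: Sequence[str],
    output_keys: Sequence[str],
    num_layers: int,
    hidden_size: int,
) -> List[Tuple[int, int]]:
    """Return (in_features, out_features) for every linear layer.

    Hypothetical helper: it only mirrors the size bookkeeping of a plain MLP
    (no period/Fourier embedding, no custom input_dim/output_dim).
    """
    shapes = []
    cur_size = len(input_keys)  # one input channel per input key
    for _ in range(num_layers):
        shapes.append((cur_size, hidden_size))
        cur_size = hidden_size
    shapes.append((cur_size, len(output_keys)))  # final output layer
    return shapes


shapes = mlp_layer_shapes(("x", "y"), ("u", "v", "p"), 9, 50)
# weights plus biases of every layer
num_params = sum(i * o + o for i, o in shapes)
print(shapes[0], shapes[-1], len(shapes), num_params)
# prints: (2, 50) (50, 3) 10 20703
```

Under these assumptions the network has nine hidden layers plus one output layer, so ten linear layers in total.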
2.3.2 Build New Model¶
When the built-in models of PaddleScience cannot meet your needs, you can use your custom model by adding a model file and writing model code. The steps are as follows:
- Create a new model structure file under the `ppsci/arch/` folder, taking `new_model.py` as an example.
- In `new_model.py`, import the module `base` where the PaddleScience model base class is located, and derive the new model class from `base.Arch` (taking `class NewModel` as an example).
- Write the `NewModel.__init__` method, which initializes the model when it is created, including its layers and parameter variables. Then write the `NewModel.forward` method, which defines how the model accepts input and computes output. `MLP.__init__` and `MLP.forward` serve as examples:

    ```python
    class MLP(base.Arch):
        """Multi layer perceptron network.

        Args:
            input_keys (Tuple[str, ...]): Name of input keys, such as ("x", "y", "z").
            output_keys (Tuple[str, ...]): Name of output keys, such as ("u", "v", "w").
            num_layers (int): Number of hidden layers.
            hidden_size (Union[int, Tuple[int, ...]]): Number of hidden size.
                An integer for all layers, or list of integer specify each layer's size.
            activation (str, optional): Name of activation function. Defaults to "tanh".
            skip_connection (bool, optional): Whether to use skip connection. Defaults to False.
            weight_norm (bool, optional): Whether to apply weight norm on parameter(s). Defaults to False.
            input_dim (Optional[int]): Number of input's dimension. Defaults to None.
            output_dim (Optional[int]): Number of output's dimension. Defaults to None.
            periods (Optional[Dict[int, Tuple[float, bool]]]): Period of each input key,
                input in given channel will be period embedded if specified, each tuple of
                periods list is [period, trainable]. Defaults to None.
            fourier (Optional[Dict[str, Union[float, int]]]): Random fourier feature embedding,
                e.g. {'dim': 256, 'scale': 1.0}. Defaults to None.
            random_weight (Optional[Dict[str, float]]): Mean and std of random weight
                factorization layer, e.g. {"mean": 0.5, "std": 0.1}. Defaults to None.

        Examples:
            >>> import paddle
            >>> import ppsci
            >>> model = ppsci.arch.MLP(
            ...     input_keys=("x", "y"),
            ...     output_keys=("u", "v"),
            ...     num_layers=5,
            ...     hidden_size=128
            ... )
            >>> input_dict = {"x": paddle.rand([64, 1]),
            ...               "y": paddle.rand([64, 1])}
            >>> output_dict = model(input_dict)
            >>> print(output_dict["u"].shape)
            [64, 1]
            >>> print(output_dict["v"].shape)
            [64, 1]
        """

        def __init__(
            self,
            input_keys: Tuple[str, ...],
            output_keys: Tuple[str, ...],
            num_layers: int,
            hidden_size: Union[int, Tuple[int, ...]],
            activation: str = "tanh",
            skip_connection: bool = False,
            weight_norm: bool = False,
            input_dim: Optional[int] = None,
            output_dim: Optional[int] = None,
            periods: Optional[Dict[int, Tuple[float, bool]]] = None,
            fourier: Optional[Dict[str, Union[float, int]]] = None,
            random_weight: Optional[Dict[str, float]] = None,
        ):
            super().__init__()
            self.input_keys = input_keys
            self.output_keys = output_keys
            self.linears = []
            self.acts = []
            self.periods = periods
            self.fourier = fourier
            if periods:
                self.period_emb = PeriodEmbedding(periods)

            if isinstance(hidden_size, (tuple, list)):
                if num_layers is not None:
                    raise ValueError(
                        "num_layers should be None when hidden_size is specified"
                    )
            elif isinstance(hidden_size, int):
                if not isinstance(num_layers, int):
                    raise ValueError(
                        "num_layers should be an int when hidden_size is an int"
                    )
                hidden_size = [hidden_size] * num_layers
            else:
                raise ValueError(
                    f"hidden_size should be list of int or int, but got {type(hidden_size)}"
                )

            # initialize FC layer(s)
            cur_size = len(self.input_keys) if input_dim is None else input_dim
            if input_dim is None and periods:
                # period embedded channel(s) will be doubled automatically
                # if input_dim is not specified
                cur_size += len(periods)

            if fourier:
                self.fourier_emb = FourierEmbedding(
                    cur_size, fourier["dim"], fourier["scale"]
                )
                cur_size = fourier["dim"]

            for i, _size in enumerate(hidden_size):
                if weight_norm:
                    self.linears.append(WeightNormLinear(cur_size, _size))
                elif random_weight:
                    self.linears.append(
                        RandomWeightFactorization(
                            cur_size,
                            _size,
                            mean=random_weight["mean"],
                            std=random_weight["std"],
                        )
                    )
                else:
                    self.linears.append(nn.Linear(cur_size, _size))

                # initialize activation function
                self.acts.append(
                    act_mod.get_activation(activation)
                    if activation != "stan"
                    else act_mod.get_activation(activation)(_size)
                )
                # special initialization for certain activation
                # TODO: Adapt code below to a more elegant style
                if activation == "siren":
                    if i == 0:
                        act_mod.Siren.init_for_first_layer(self.linears[-1])
                    else:
                        act_mod.Siren.init_for_hidden_layer(self.linears[-1])

                cur_size = _size

            self.linears = nn.LayerList(self.linears)
            self.acts = nn.LayerList(self.acts)
            if random_weight:
                self.last_fc = RandomWeightFactorization(
                    cur_size,
                    len(self.output_keys) if output_dim is None else output_dim,
                    mean=random_weight["mean"],
                    std=random_weight["std"],
                )
            else:
                self.last_fc = nn.Linear(
                    cur_size,
                    len(self.output_keys) if output_dim is None else output_dim,
                )

            self.skip_connection = skip_connection

        def forward(self, x):
            if self._input_transform is not None:
                x = self._input_transform(x)

            if self.periods:
                x = self.period_emb(x)

            y = self.concat_to_tensor(x, self.input_keys, axis=-1)

            if self.fourier:
                y = self.fourier_emb(y)

            y = self.forward_tensor(y)
            y = self.split_to_dict(y, self.output_keys, axis=-1)

            if self._output_transform is not None:
                y = self._output_transform(x, y)
            return y
    ```

- Import the new model class `NewModel` in `ppsci/arch/__init__.py` and add it to `__all__`.
After writing the new model code, you can instantiate the model in demo.py by calling ppsci.arch.NewModel, for example `model = ppsci.arch.NewModel(...)`.
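The `forward` method above follows a dictionary-in, dictionary-out pattern: input fields are concatenated by `input_keys`, passed through the network, and split again by `output_keys`. The following is a minimal sketch of that data flow in plain Python. `ToyModel`, `concat_to_rows`, and this list-based `split_to_dict` are hypothetical stand-ins that use nested lists instead of tensors; they are not ppsci APIs, and the computation inside is a placeholder.

```python
from typing import Dict, List, Sequence

Batch = Dict[str, List[List[float]]]  # key -> rows of shape [N, 1]


def concat_to_rows(data: Batch, keys: Sequence[str]) -> List[List[float]]:
    """Join the per-key columns into rows, like a concat along the last axis."""
    n = len(data[keys[0]])
    return [[v for k in keys for v in data[k][i]] for i in range(n)]


def split_to_dict(rows: List[List[float]], keys: Sequence[str]) -> Batch:
    """Split each row into one single-column field per output key."""
    return {k: [[row[j]] for row in rows] for j, k in enumerate(keys)}


class ToyModel:
    """Stand-in for a model class; only the input/output contract is shown."""

    def __init__(self, input_keys: Sequence[str], output_keys: Sequence[str]):
        self.input_keys = tuple(input_keys)
        self.output_keys = tuple(output_keys)

    def forward_rows(self, rows: List[List[float]]) -> List[List[float]]:
        # placeholder computation: output j is (j + 1) times the sum of the inputs
        return [
            [(j + 1) * sum(row) for j in range(len(self.output_keys))]
            for row in rows
        ]

    def __call__(self, x: Batch) -> Batch:
        rows = concat_to_rows(x, self.input_keys)
        out = self.forward_rows(rows)
        return split_to_dict(out, self.output_keys)


model = ToyModel(("x", "y"), ("u", "v"))
out = model({"x": [[1.0], [2.0]], "y": [[3.0], [4.0]]})
print(out["u"], out["v"])
```

A real `NewModel` would replace `forward_rows` with its own layers while keeping the same keyed input and output structure.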
2.4 Build Equation¶
If your case problem involves equation calculation, you can choose to use PaddleScience's built-in equations or write your own equations.
2.4.1 Build Existing Equation¶
PaddleScience has some common equations built in, such as the NavierStokes equation. To use a built-in equation, call the API under ppsci.equation.* and fill in the parameters required for instantiation.

```python
# create a Vibration equation
viv_equation = ppsci.equation.Vibration(2, -4, 0)
```
2.4.2 Build New Equation¶
When the built-in equations of PaddleScience cannot meet your needs, you can also use your custom equation by adding an equation file and writing equation code.
Assume the equation formula to be calculated is as follows.
Where \(x\), \(y\) are model inputs, representing x and y axis coordinates; \(u=u(x,y)\), \(v=v(x,y)\) are model outputs, representing x and y axis direction velocities at \((x,y)\).
First, rearrange the equations so that the terms containing variables and functions are on the left side and the constant terms are on the right side. This makes the later conversion into program code easier.
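The formulas themselves are not reproduced in this section. Judging from the residual expressions used in the code later in this section (`u.diff(x) + u.diff(y) - u` and `v.diff(x) + v.diff(y) - v`), the rearranged system should have the following shape. Here \(c_1\) and \(c_2\) are placeholder constants introduced only for illustration, and the equation numbers follow the code comments:

```latex
\begin{cases}
\dfrac{\partial u}{\partial x} + \dfrac{\partial u}{\partial y} - u = c_1 & (3) \\
\dfrac{\partial v}{\partial x} + \dfrac{\partial v}{\partial y} - v = c_2 & (4)
\end{cases}
```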
Then the above transposed equation system can be converted into corresponding program code according to the following steps.
- Create a new equation file under `ppsci/equation/pde/`. If your equation is not a PDE, create a new equation category folder, for example an `ode` folder under `ppsci/equation/`, and place your equation file there. Here we take the PDE file `new_pde.py` as an example.
- In `new_pde.py`, import the module `base` where the PaddleScience equation base class is located, and derive `class NewPDE` from `base.PDE`.
- Write the `__init__` method, which runs when the equation is created and defines the necessary variables and the formula calculation. PaddleScience supports two ways of writing equations: with the sympy symbolic computation library, or directly as Python functions.

    With sympy (`ppsci/equation/pde/new_pde.py`):

    ```python
    from ppsci.equation.pde import base


    class NewPDE(base.PDE):
        def __init__(self):
            super().__init__()  # initialize the base class before adding equations
            x, y = self.create_symbols("x y")  # Create independent variables x, y
            u = self.create_function("u", (x, y))  # Create function u(x,y) of the independent variables (x, y)
            v = self.create_function("v", (x, y))  # Create function v(x,y) of the independent variables (x, y)

            expr1 = u.diff(x) + u.diff(y) - u  # Expression on the left side of equation (3)
            expr2 = v.diff(x) + v.diff(y) - v  # Expression on the left side of equation (4)

            self.add_equation("expr1", expr1)  # Add the sympy expression expr1 to the equation collection of NewPDE
            self.add_equation("expr2", expr2)  # Add the sympy expression expr2 to the equation collection of NewPDE
    ```

    With Python functions (`ppsci/equation/pde/new_pde.py`):

    ```python
    from ppsci.autodiff import jacobian
    from ppsci.equation.pde import base


    class NewPDE(base.PDE):
        def __init__(self):
            super().__init__()  # initialize the base class before adding equations

            def expr1_compute_func(out):
                x, y = out["x"], out["y"]  # Extract the values of the independent variables x, y from the out dictionary
                u = out["u"]  # Extract the value of the dependent variable u from the out dictionary
                expr1 = jacobian(u, x) + jacobian(u, y) - u  # Left side of equation (3)
                return expr1  # Return the computed value

            def expr2_compute_func(out):
                x, y = out["x"], out["y"]  # Extract the values of the independent variables x, y from the out dictionary
                v = out["v"]  # Extract the value of the dependent variable v from the out dictionary
                expr2 = jacobian(v, x) + jacobian(v, y) - v  # Left side of equation (4)
                return expr2

            self.add_equation("expr1", expr1_compute_func)  # Add the function computing expr1 to the equation collection of NewPDE
            self.add_equation("expr2", expr2_compute_func)  # Add the function computing expr2 to the equation collection of NewPDE
    ```

- Import the new equation class `NewPDE` in `ppsci/equation/__init__.py` and add it to `__all__`.
After completing the work of writing new equation code above, we can call the new equation class we wrote and use it to create an equation instance in the manner of ppsci.equation.NewPDE, just like PaddleScience built-in equations.
After the equation construction is completed, we need to wrap all equations into a dictionary.
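To illustrate what such a residual expression measures, the sketch below evaluates the left-hand side \(\partial f/\partial x + \partial f/\partial y - f\) for two known functions and stores the residual function in a dictionary keyed by name. It is only an illustration: it swaps in central finite differences for `ppsci.autodiff.jacobian`, and `central_diff`, `residual`, and the `equations` dictionary are hypothetical names.

```python
import math
from typing import Callable, Dict

Field = Callable[[float, float], float]


def central_diff(f: Field, x: float, y: float, axis: int, h: float = 1e-5) -> float:
    """Approximate the partial derivative of f along axis 0 (x) or axis 1 (y)."""
    if axis == 0:
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def residual(f: Field, x: float, y: float) -> float:
    """Left-hand side df/dx + df/dy - f, the same form as expr1 and expr2."""
    return central_diff(f, x, y, 0) + central_diff(f, x, y, 1) - f(x, y)


# name -> residual function, loosely analogous to a named equation collection
equations: Dict[str, Callable[[Field, float, float], float]] = {"expr1": residual}


def exp_x(x: float, y: float) -> float:
    return math.exp(x)  # f_x = f and f_y = 0, so the residual is 0


def exp_xy(x: float, y: float) -> float:
    return math.exp(x + y)  # f_x = f_y = f, so the residual equals f


r_zero = equations["expr1"](exp_x, 0.3, 0.7)
r_nonzero = equations["expr1"](exp_xy, 0.3, 0.7)
print(abs(r_zero) < 1e-6, abs(r_nonzero - math.exp(1.0)) < 1e-6)
```

A function that satisfies the homogeneous form of the equation gives a residual close to zero, which is the quantity a PINN loss drives toward its target value during training.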
2.5 Build Geometry Module [Optional]¶
The source of the input and label data used for training and validation depends on the case. In most PINN-based cases, the data are coordinate points, normal vectors, and SDF values sampled inside or on the surface of geometric shapes. In data-driven methods, most input and label data come from external files or from in-memory data constructed with third-party libraries such as numpy. This section mainly introduces the geometry module required by the first kind of case. The second kind does not necessarily require a geometry module; for how to build its data, refer to Section 2.6 Build Constraint Condition.
2.5.1 Build Existing Geometry¶
PaddleScience has built-in several types of common geometric shapes, including simple geometries and complex geometries, as shown below.
| Geometry Call Method | Meaning |
|---|---|
| `ppsci.geometry.Interval` | 1D Line Segment Geometry |
| `ppsci.geometry.Disk` | 2D Disk Geometry |
| `ppsci.geometry.Polygon` | 2D Polygon Geometry |
| `ppsci.geometry.Rectangle` | 2D Rectangle Geometry |
| `ppsci.geometry.Triangle` | 2D Triangle Geometry |
| `ppsci.geometry.Cuboid` | 3D Cuboid Geometry |
| `ppsci.geometry.Sphere` | 3D Sphere Geometry |
| `ppsci.geometry.Mesh` | 3D Mesh Geometry |
| `ppsci.geometry.PointCloud` | Point Cloud Geometry |
| `ppsci.geometry.TimeDomain` | 1D Time Geometry (commonly used for transient problems) |
| `ppsci.geometry.TimeXGeometry` | 1 + N D Geometry with Time (commonly used for transient problems) |
Taking a 2D rectangular computational domain as an example, the code to instantiate a rectangle with an x-axis side length of 2, a y-axis side length of 1, and its bottom-left corner at point (-1, -3) is as follows:

```python
LEN_X, LEN_Y = 2, 1  # Define rectangle side lengths
rect = ppsci.geometry.Rectangle([-1, -3], [-1 + LEN_X, -3 + LEN_Y])  # Construct rectangle by bottom-left and top-right diagonal coordinates
```

Other geometries are constructed in a similar way; refer to the ppsci.geometry part of the API document.
2.5.2 Build New Geometry¶
The following shows how to build a new geometry, using a 2D ellipse (no rotation) as an example.

- First, create a new class `Ellipse` in the 2D geometry code file `ppsci/geometry/geometry_2d.py` and make the geometry base class `geometry.Geometry` its direct parent. From the algebraic equation of an ellipse, \(\dfrac{(x-x_0)^2}{a^2} + \dfrac{(y-y_0)^2}{b^2} = 1\), representing an ellipse requires its center coordinates \((x_0,y_0)\), x-axis radius \(a\), and y-axis radius \(b\), so the class records these values.
- Write the necessary basic methods for the ellipse class:
    - Determine whether the given points are inside the ellipse.
    - Determine whether the given points are on the boundary of the ellipse.
    - Randomly sample inside the ellipse (implemented here with rejection sampling).

        `ppsci/geometry/geometry_2d.py`:

        ```python
        def random_points(self, n, random="pseudo"):
            res_n = n  # number of points still to be sampled
            result = []
            # radius of the bounding disk; assumes the semi-axes are stored as self.a and self.b
            max_radius = max(self.a, self.b)
            while res_n > 0:
                rng = sampler.sample(n, 2, random)
                r, theta = rng[:, 0], 2 * np.pi * rng[:, 1]
                x = np.sqrt(r) * np.cos(theta)
                y = np.sqrt(r) * np.sin(theta)
                candidate = max_radius * np.stack((x, y), axis=1) + self.center
                # keep only the candidates that fall inside the ellipse
                candidate = candidate[self.is_inside(candidate)]
                if len(candidate) > res_n:
                    candidate = candidate[:res_n]

                result.append(candidate)
                res_n -= len(candidate)
            result = np.concatenate(result, axis=0)
            return result
        ```

    - Randomly sample on the boundary of the ellipse (implemented here from the parametric equation of the ellipse).
- Import the `Ellipse` class in `ppsci/geometry/__init__.py`.
After completing the above implementation, we can instantiate the ellipse class in the following way. Similarly, it is recommended to wrap all geometry class instances in a dictionary for subsequent indexing.
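The methods listed above can be sketched without any dependency on ppsci. The class below is a hypothetical stand-in (`EllipseSketch` is not a ppsci class): it works on plain Python tuples, and its interior sampler rejects points drawn from the bounding box rather than from a bounding disk. The constructor arguments `(x0, y0, a, b)` are an assumption based on the quantities named in the text.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]


class EllipseSketch:
    """Axis-aligned ellipse with center (x0, y0) and semi-axes a, b."""

    def __init__(self, x0: float, y0: float, a: float, b: float):
        self.center = (x0, y0)
        self.a = a
        self.b = b

    def _level(self, p: Point) -> float:
        """Value of (x-x0)^2/a^2 + (y-y0)^2/b^2, which equals 1 on the boundary."""
        dx, dy = p[0] - self.center[0], p[1] - self.center[1]
        return (dx / self.a) ** 2 + (dy / self.b) ** 2

    def is_inside(self, points: List[Point]) -> List[bool]:
        return [self._level(p) < 1.0 for p in points]

    def on_boundary(self, points: List[Point], tol: float = 1e-9) -> List[bool]:
        return [math.isclose(self._level(p), 1.0, abs_tol=tol) for p in points]

    def random_points(self, n: int, rng: random.Random) -> List[Point]:
        """Rejection sampling: draw from the bounding box, keep points inside."""
        result: List[Point] = []
        while len(result) < n:
            p = (
                self.center[0] + rng.uniform(-self.a, self.a),
                self.center[1] + rng.uniform(-self.b, self.b),
            )
            if self._level(p) < 1.0:
                result.append(p)
        return result

    def random_boundary_points(self, n: int, rng: random.Random) -> List[Point]:
        """Parametric form (x0 + a cos t, y0 + b sin t), t uniform in [0, 2*pi).

        Uniform in the parameter t, which is not uniform in arc length.
        """
        ts = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
        return [
            (self.center[0] + self.a * math.cos(t), self.center[1] + self.b * math.sin(t))
            for t in ts
        ]


rng = random.Random(42)
geom = {"ellipse": EllipseSketch(0.0, 0.0, 2.0, 1.0)}  # wrap instances in a dictionary
interior = geom["ellipse"].random_points(100, rng)
boundary = geom["ellipse"].random_boundary_points(100, rng)
print(all(geom["ellipse"].is_inside(interior)), all(geom["ellipse"].on_boundary(boundary)))
```

Keeping the instances in a dictionary such as `geom` lets later steps look up a geometry by name.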
2.6 Build Constraint Condition¶
Both the PINNs method and the data-driven method use data to guide the training of the network model. In PaddleScience, this process is handled by the Constraint module.
2.6.1 Build Existing Constraint¶
PaddleScience has built-in some common constraints, as shown below.
| Constraint Name | Function |
|---|---|
| `ppsci.constraint.BoundaryConstraint` | Boundary Constraint |
| `ppsci.constraint.InitialConstraint` | Internal Point Initial Value Constraint |
| `ppsci.constraint.IntegralConstraint` | Boundary Integral Constraint |
| `ppsci.constraint.InteriorConstraint` | Internal Point Constraint |
| `ppsci.constraint.PeriodicConstraint` | Boundary Periodic Constraint |
| `ppsci.constraint.SupervisedConstraint` | Supervised Data Constraint |
If you want to use these built-in constraints, you can directly call the API under ppsci.constraint.* and fill in the parameters required for constraint instantiation to quickly build constraint conditions.
```python
# create a SupervisedConstraint
sup_constraint = ppsci.constraint.SupervisedConstraint(
    train_dataloader_cfg,
    ppsci.loss.MSELoss("mean"),
    {"eta": lambda out: out["eta"], **equation["VIV"].equations},
    name="Sup",
)
```
For the parameter filling method of constraints, please refer to the corresponding API document parameter description and sample code.
2.6.2 Build New Constraint¶
When the built-in constraints of PaddleScience cannot meet your needs, you can also use your custom constraint by adding a constraint file and writing constraint code. The steps are as follows:
- Create a new constraint file under `ppsci/constraint` (taking `new_constraint.py` as an example).
- In `new_constraint.py`, import the module `base` where the PaddleScience constraint base class is located, and let the new constraint class (taking `class NewConstraint` as an example) inherit from `base.Constraint`.
- Write the `__init__` method, which runs when the constraint is created.
- Import the new constraint class `NewConstraint` in `ppsci/constraint/__init__.py` and add it to `__all__`.
After completing the work of writing new constraint code above, we can call the new constraint class we wrote and use it to create a constraint instance in the manner of ppsci.constraint.NewConstraint, just like PaddleScience built-in constraints.
```python
new_constraint = ppsci.constraint.NewConstraint(...)
constraint = {..., new_constraint.name: new_constraint}
```
2.7 Define Hyperparameters¶
Before the model starts training, some training-related hyperparameters need to be defined, such as training rounds, learning rate, etc., as shown below.
2.8 Build Optimizer¶
In addition to the model itself, an optimizer for updating model parameters needs to be defined during model training, as shown below.
2.9 Build Validator [Optional]¶
2.9.1 Build Existing Validator¶
PaddleScience has built-in some common validators, as shown below.
| Validator Name | Function |
|---|---|
| `ppsci.validate.GeometryValidator` | Geometry Validator |
| `ppsci.validate.SupervisedValidator` | Supervised Data Validator |
If you want to use these built-in validators, you can directly call the API under ppsci.validate.* and fill in the parameters required for validator instantiation to quickly build a validator.
```python
# create a SupervisedValidator
eta_mse_validator = ppsci.validate.SupervisedValidator(
    valid_dataloader_cfg,
    ppsci.loss.MSELoss("mean"),
    {"eta": lambda out: out["eta"], **equation["VIV"].equations},
    metric={"MSE": ppsci.metric.MSE()},
    name="eta_mse",
)
```
2.9.2 Build New Validator¶
When the built-in validators of PaddleScience cannot meet your needs, you can also use your custom validator by adding a validator file and writing validator code. The steps are as follows:
- Create a new validator file under `ppsci/validate` (taking `new_validator.py` as an example).
- In `new_validator.py`, import the module `base` where the PaddleScience validator base class is located, and let the new validator class (taking `class NewValidator` as an example) inherit from `base.Validator`.
- Write the `__init__` method, which runs when the validator is created.
- Import the new validator class `NewValidator` in `ppsci/validate/__init__.py` and add it to `__all__`.
After completing the work of writing new validator code above, we can call the new validator class we wrote and use it to create a validator instance in the manner of ppsci.validate.NewValidator, just like PaddleScience built-in validators. Similarly, after the validator is built, it is recommended to wrap all validators in a dictionary for convenient subsequent indexing.
```python
new_validator = ppsci.validate.NewValidator(...)
validator = {..., new_validator.name: new_validator}
```
2.10 Build Visualizer [Optional]¶
PaddleScience has some common visualizers built in, such as the VisualizerVtu visualizer. To use a built-in visualizer, call the API under ppsci.visualize.* and fill in the parameters required for instantiation to build the visualizer.

```python
# manually collate input data for visualization,
# interior+boundary
vis_points = {}
for key in vis_interior_points:
    vis_points[key] = np.concatenate(
        (vis_interior_points[key], vis_boundary_points[key])
    )

visualizer = {
    "visualize_u_v": ppsci.visualize.VisualizerVtu(
        vis_points,
        {"u": lambda d: d["u"], "v": lambda d: d["v"], "p": lambda d: d["p"]},
        prefix="result_u_v",
    )
}
```
If you need to add a new visualizer, the steps are similar to the addition method of other modules, and will not be repeated here.
2.11 Build Solver¶
Solver is the global management class responsible for calling training, evaluation, and visualization in PaddleScience. Before training starts, you need to pass the built model, constraint, optimizer and other instances to Solver for instantiation, and then call its built-in methods for training, evaluation, and visualization.
```python
# initialize solver
solver = ppsci.solver.Solver(
    model,
    constraint,
    output_dir,
    optimizer,
    lr_scheduler,
    EPOCHS,
    iters_per_epoch,
    eval_during_train=True,
    eval_freq=eval_freq,
    equation=equation,
    validator=validator,
    visualizer=visualizer,
)
```
2.12 Write Configuration File [Important]¶
After the development of the above steps, the main part of the case code has been completed.
When we want to run some tuning experiments based on this code to get better results, or better manage the running parameter settings,
we can use the configuration management system provided by PaddleScience to separate the experimental running parameters from the code and write them into the configuration file in yaml format, so as to better manage, record, and tune experiments.
Taking the viv case code as an example, at runtime we need to set equation parameters, the data file path, training rounds, batch_size, the random seed, the learning rate, and other hyperparameters in the appropriate positions, as shown below.

```python
...
# set dataloader config
train_dataloader_cfg = {
    "dataset": {
        "name": "MatDataset",
        "file_path": cfg.VIV_DATA_PATH,
        "input_keys": ("t_f",),
        "label_keys": ("eta", "f"),
        "weight_dict": {"eta": 100},
    },
    "batch_size": cfg.TRAIN.batch_size,
    "sampler": {
        "name": "BatchSampler",
        "drop_last": False,
        "shuffle": True,
    },
}
...
```

```python
...
# set optimizer
lr_scheduler = ppsci.optimizer.lr_scheduler.Step(**cfg.TRAIN.lr_scheduler)()
...
```
These parameters may be manually adjusted as variables at any time during the experiment. How to avoid frequent modification of source code leading to confusion of experimental records and ensure complete and traceable records during the adjustment process is a major problem. Therefore, PaddleScience provides a configuration file management system based on hydra + omegaconf to solve this problem.
It is very simple to modify the existing code to the way of configuration file control. You only need to write the necessary parameters into the yaml file, then read and parse the file through hydra when the program is running, and control the experiment running through its content. Taking the viv case as an example, it specifically includes the following steps.
- Create a `conf` folder in the directory where the code file `viv.py` is located, and create a `viv.yaml` file under `conf`, named after `viv.py`.
- Move the hyperparameters in `viv.py` into the appropriate level of `viv.yaml` according to their meaning. General parameters such as `mode`, `output_dir`, and `seed`, as well as equation parameters and file paths, go directly in the first level. Model structure parameters, training rounds, and other parameters related only to the model or to training go under the `MODEL` and `TRAIN` levels respectively (the `EVAL` level works the same way).
- Modify the existing `train` and `evaluate` functions to accept a parameter `cfg` (`cfg` holds the content read from the `yaml` file, stored in dictionary form), and obtain all internal hyperparameters through `cfg.xxx` instead of hard-coding them as numbers or strings.
- Create a new `main` function that accepts exactly one parameter, `cfg`. It calls `train` or `evaluate` according to `cfg.mode`. Add the decorator `@hydra.main(version_base=None, config_path="./conf", config_name="viv.yaml")` to `main`.
- Call `main()` in the program entry point `if __name__ == "__main__":`.
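The dispatch logic described in the steps above can be sketched as follows. To stay runnable without hydra, this sketch passes a `SimpleNamespace` as a stand-in for the configuration object that hydra would build from `conf/viv.yaml`. The mode string `"eval"` and the bodies of `train` and `evaluate` are assumptions made for illustration.

```python
from types import SimpleNamespace


def train(cfg) -> str:
    # in a real script this would build the model, constraints and solver from cfg
    return f"train for {cfg.TRAIN.epochs} epochs with seed {cfg.seed}"


def evaluate(cfg) -> str:
    # in a real script this would load a pretrained model and run the validators
    return f"evaluate with batch size {cfg.EVAL.batch_size}"


def main(cfg) -> str:
    """Dispatch on cfg.mode; in viv.py this function would carry @hydra.main(...)."""
    if cfg.mode == "train":
        return train(cfg)
    if cfg.mode == "eval":  # assumed name of the evaluation mode
        return evaluate(cfg)
    raise ValueError(f"cfg.mode should be 'train' or 'eval', but got '{cfg.mode}'")


# stand-in for the configuration object built from conf/viv.yaml
cfg = SimpleNamespace(
    mode="train",
    seed=42,
    TRAIN=SimpleNamespace(epochs=100000),
    EVAL=SimpleNamespace(batch_size=32),
)
print(main(cfg))
```

With hydra in place, `main` takes no explicit argument at the call site; the decorator reads the yaml file and passes the parsed configuration in.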
After all transformations are completed, `viv.py` is controlled entirely through `viv.yaml`. The entries of `viv.yaml` have the following meanings.
- defaults: - Starts the list of default configuration items.
  - ppsci_default - Use the default configuration named ppsci_default.
  - TRAIN: train_default - Apply the train_default default configuration under the TRAIN namespace.
  - TRAIN/ema: ema_default - Apply the ema_default configuration under the ema sub-namespace of TRAIN.
  - TRAIN/swa: swa_default - Apply the swa_default configuration under the swa sub-namespace of TRAIN.
  - EVAL: eval_default - Apply the eval_default default configuration under the EVAL namespace.
  - INFER: infer_default - Apply the infer_default default configuration under the INFER namespace.
  - _self_ - Include the configuration of the current file itself, which overrides or extends the defaults.
- dir: outputs_VIV/... - Set a dynamic output directory based on the current time and the override names.
- init_callback: - Define the settings of the initialization callback.
  - _target_: ppsci.utils.callbacks.InitCallback - Specify the concrete class or function that implements the initialization callback.
- mode: train - Set the current running mode to training.
- seed: 42 - Set the global random seed to 42 for reproducibility.
- output_dir: ${hydra:run.dir} - Set the output directory to the run directory configured by Hydra.
- log_freq: 20 - Log once every 20 iterations.
- use_tbd: false - Disable TensorBoard.
- VIV_DATA_PATH: "./VIV_Training_Neta100.mat" - Path of the VIV data file (other datasets or hyperparameter values can be defined in the same way).
- MODEL: - Starts the model settings.
  - input_keys: ["t_f"] - Set the model input keys to t_f.
  - output_keys: ["eta"] - Set the model output keys to eta.
  - num_layers: 5 - Set the number of model layers to 5.
  - hidden_size: 50 - Set the hidden layer size to 50.
  - activation: "tanh" - Use the hyperbolic tangent as the activation function.
- TRAIN: - Starts the training settings.
  - epochs: 100000 - Set the total number of training epochs to 100000.
  - iters_per_epoch: 1 - Each epoch contains 1 iteration.
  - save_freq: 10000 - Save the model (checkpoint) every 10000 iterations.
  - eval_during_train: true - Enable evaluation during training.
  - eval_freq: 1000 - Evaluate every 1000 iterations.
  - batch_size: 100 - Set the training batch size to 100.
  - lr_scheduler: - Starts the learning rate scheduler settings.
    - epochs: ${TRAIN.epochs} - The scheduler uses the same number of epochs as training.
    - iters_per_epoch: ${TRAIN.iters_per_epoch} - The scheduler uses the same number of iterations per epoch as training.
    - learning_rate: 0.001 - Set the initial learning rate to 0.001.
    - step_size: 20000 - Adjust the learning rate every 20000 steps.
    - gamma: 0.9 - Multiply the learning rate by 0.9 at each adjustment.
  - pretrained_model_path: null - Path of the initial pre-trained model, either a local path or a URL; the default null means no pre-trained model is used.
  - checkpoint_path: null - Path of the checkpoint to load; the default null means no checkpoint is loaded.
- EVAL: - Starts the evaluation settings.
  - pretrained_model_path: null - Path of the pre-trained model used during evaluation; the default null means no pre-trained model is used.
  - batch_size: 32 - Batch size during evaluation.
- INFER: - Starts the inference settings.
  - pretrained_model_path: "https://..." - Path of the pre-trained model used for inference, either a local path or a URL.
  - export_path: ./inference/viv - Model export path, i.e. where the model files required for inference are saved.
  - pdmodel_path: ${INFER.export_path}.json - Path of the Paddle model structure file; the suffix can be .json or .pdmodel.
  - pdiparams_path: ${INFER.export_path}.pdiparams - Path of the Paddle model parameter file.
  - input_keys: ${MODEL.input_keys} - Input keys used during inference, the same as those of the model definition.
  - output_keys: ["eta", "f"] - Output keys during inference, "eta" and "f".
  - device: gpu - Run inference on the GPU.
  - engine: native - Set the inference engine to native (may refer to the default engine of a specific library or framework).
  - precision: fp32 - Run inference in 32-bit floating point precision.
  - onnx_path: ${INFER.export_path}.onnx - Path of the ONNX model file, if supported.
  - ir_optim: true - Enable Intermediate Representation (IR) optimization.
  - min_subgraph_size: 10 - Minimum subgraph size for subgraph optimization, used to improve inference performance.
  - gpu_mem: 4000 - Amount of GPU memory available during inference (the unit may be MB).
  - gpu_id: 0 - GPU used for inference, here the first GPU.
  - max_batch_size: 64 - Maximum batch size supported during inference.
  - num_cpu_threads: 4 - Number of CPU threads used for inference.
  - batch_size: 16 - Actual batch size during inference.
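Two ideas in the configuration above are easier to see in code: the ${...} references that reuse another key's value, and the effect of step_size and gamma on the learning rate. The following is a minimal, standard-library sketch rather than the Hydra or PaddleScience implementation. The CONFIG dictionary is a hypothetical miniature of the file, resolve is an illustrative helper, and step_decay_lr assumes a staircase (step) decay schedule, which the description of step_size and gamma suggests but does not state.

```python
import re

# Hypothetical miniature of the configuration described above (only a few keys kept).
CONFIG = {
    "TRAIN": {
        "epochs": 100000,
        "lr_scheduler": {
            "epochs": "${TRAIN.epochs}",
            "learning_rate": 0.001,
            "step_size": 20000,
            "gamma": 0.9,
        },
    },
    "INFER": {
        "export_path": "./inference/viv",
        "pdiparams_path": "${INFER.export_path}.pdiparams",
    },
}

# Matches references such as ${TRAIN.epochs}; resolver-style references
# like ${hydra:run.dir} are deliberately not handled in this sketch.
_REFERENCE = re.compile(r"\$\{([A-Za-z_][\w.]*)\}")


def lookup(cfg, dotted_key):
    """Walk a nested dict along a dotted key such as 'TRAIN.epochs'."""
    node = cfg
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(cfg, value):
    """Replace ${a.b} references in `value` with the values they point to."""
    if not isinstance(value, str):
        return value
    whole = _REFERENCE.fullmatch(value)
    if whole:
        # The value is a single reference: keep the referenced type (e.g. int).
        return resolve(cfg, lookup(cfg, whole.group(1)))
    # The reference is embedded in a longer string: substitute it as text.
    return _REFERENCE.sub(
        lambda m: str(resolve(cfg, lookup(cfg, m.group(1)))), value
    )


def step_decay_lr(base_lr, gamma, step_size, step):
    """Learning rate after `step` steps, assuming a staircase decay."""
    return base_lr * gamma ** (step // step_size)
```

Under these assumptions, resolve(CONFIG, "${INFER.export_path}.pdiparams") yields "./inference/viv.pdiparams", and the learning rate stays at 0.001 for the first 20000 steps, then is multiplied by 0.9 at every further multiple of 20000.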
2.13 Training¶
Training a PaddleScience model takes a single call on the Solver instance: solver.train().
2.14 Evaluation¶
Evaluating a PaddleScience model takes a single call on the Solver instance: solver.eval().
2.15 Visualization [Optional]¶
If the visualizer parameter is passed when instantiating Solver, visualizing the PaddleScience model takes a single call: solver.visualize().
Visualization Solution
For some complex cases, writing a Visualizer is costly, and not every data type can be visualized easily. In such cases, after training is complete, you can manually construct a data dictionary for prediction, call solver.predict to obtain the model's predictions, and then use a third-party library such as matplotlib to visualize and save the results.
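The manual route can be sketched as follows. This is an illustration under stated assumptions, not code from the PaddleScience repository. The helper names make_time_inputs and plot_prediction are hypothetical, and the keys t_f and eta are borrowed from the VIV configuration. The sketch also assumes that solver.predict accepts a dictionary of arrays and, with return_numpy=True, returns a dictionary of numpy arrays keyed by output name. numpy and matplotlib are imported inside the plotting function, so the input builder runs without them.

```python
def make_time_inputs(t_start, t_end, num_points, key="t_f"):
    """Build a prediction input dict: evenly spaced times, shape (N, 1).

    `num_points` must be at least 2.
    """
    step = (t_end - t_start) / (num_points - 1)
    return {key: [[t_start + i * step] for i in range(num_points)]}


def plot_prediction(solver, inputs, in_key="t_f", out_key="eta",
                    out_path="prediction.png"):
    """Predict with a trained solver and save a line plot of the result."""
    # Imported lazily so that make_time_inputs works without these packages.
    import numpy as np
    import matplotlib

    matplotlib.use("Agg")  # render to a file; no display is required
    import matplotlib.pyplot as plt

    input_dict = {k: np.asarray(v, dtype="float32") for k, v in inputs.items()}

    # Assumption: predict returns {output_key: numpy array} with return_numpy=True.
    pred = solver.predict(input_dict, return_numpy=True)

    fig, ax = plt.subplots()
    ax.plot(input_dict[in_key].ravel(), np.asarray(pred[out_key]).ravel())
    ax.set_xlabel(in_key)
    ax.set_ylabel(out_key)
    fig.savefig(out_path, dpi=150)
    plt.close(fig)
    return out_path
```

A call such as plot_prediction(solver, make_time_inputs(0.0, 10.0, 200)) would then save the predicted curve to prediction.png. The time range here is arbitrary and should be chosen to match your data.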
3. Write Documentation¶
In addition to case code, PaddleScience also stores detailed documentation for corresponding cases, written and rendered using Markdown + Mkdocs + Mkdocs-Material. The steps for writing documentation are as follows.
3.1 Install Necessary Dependent Packages¶
Writing documentation requires live rendering so that you can preview the content and check it for errors. Therefore, install the mkdocs-related dependencies with the following command.
3.2 Write Document Content¶
PaddleScience documentation is written with plugins such as Mkdocs-Material and PyMdown. On top of Markdown syntax, these plugins support a variety of extensions that can greatly improve the appearance and readability of a document. It is recommended to consult the documentation of these plugins and select appropriate features to assist in writing.
3.3 Use markdownlint to Format Document [Optional]¶
If your development environment is VSCode, it is recommended to install the markdownlint extension. After installation, inside the written document: right click --> Format Document.
3.4 Preview Document¶
Assuming the document is located at PaddleScience/docs/en/examples/your_example.md, modify PaddleScience/mkdocs.yml so that it appears in the left-hand navigation of the PaddleScience official website under Classic Cases (经典案例 in the excerpt below). Add the relative path of your_example.md to the list under - Classic Cases:, following the other cases, as in the EXAMPLE_NAME line below.
...
- 学习资料: tutorials.md
- 经典案例:
    - " ":
        - 数学(AI for Math):
            - AllenCahn: examples/allen_cahn.md
            - DeepHPMs: examples/deephpms.md
            - DeepONet: examples/deeponet.md
            - Euler_Beam: examples/euler_beam.md
            - Laplace2D: examples/laplace2d.md
            - Lorenz_transform_physx: examples/lorenz.md
            - PIRBN: examples/pirbn.md
            - Rossler_transform_physx: examples/rossler.md
            - Volterra_IDE: examples/volterra_ide.md
            - NLSMB: examples/nlsmb.md
            - SPINN: examples/spinn.md
            - XPINN: examples/xpinns.md
            - NeuralOperator: examples/neuraloperator.md
            - Brusselator3D: examples/brusselator3d.md
            - Transformer4SR: examples/transformer4sr.md
            - LatentNO: examples/latent_no.md
        - 技术科学(AI for Technology):
            - EXAMPLE_NAME: docs/en/examples/your_example.md
...
Then run mkdocs serve in the PaddleScience/ directory. After the build completes, open the displayed link to preview the document in a local web page.
# ====== Terminal printing information is as follows ======
# INFO - Building documentation...
# INFO - Cleaning site directory
# INFO - Documentation built in 20.95 seconds
# INFO - [07:39:35] Watching paths for changes: 'docs', 'mkdocs.yml'
# INFO - [07:39:35] Serving on http://127.0.0.1:8000/PaddlePaddle/PaddleScience/
# INFO - [07:39:41] Browser connected: http://127.0.0.1:58903/PaddlePaddle/PaddleScience/
# INFO - [07:40:41] Browser connected: http://127.0.0.1:58903/PaddlePaddle/PaddleScience/en/development/
Manually Specify Service Address and Port Number
If the default port number 8000 is occupied, you can manually specify the address and port for service deployment. Examples are as follows.
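One way to handle an occupied port is sketched below, using only the Python standard library. The helper names port_is_free, pick_port and build_serve_command are hypothetical. The only mkdocs feature relied on is the -a option (short for --dev-addr) of mkdocs serve, which takes an IP:PORT value. Searching upward from port 8000 is a choice made for this sketch, not a PaddleScience convention.

```python
import socket


def port_is_free(host, port):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_port(host="127.0.0.1", preferred=8000, attempts=20):
    """Return the first free port at or after `preferred`."""
    for port in range(preferred, preferred + attempts):
        if port_is_free(host, port):
            return port
    raise RuntimeError(
        f"no free port in range {preferred}-{preferred + attempts - 1}"
    )


def build_serve_command(host, port):
    """Build the `mkdocs serve` command line with an explicit address."""
    return ["mkdocs", "serve", "-a", f"{host}:{port}"]
```

For example, " ".join(build_serve_command("127.0.0.1", 8687)) gives the command mkdocs serve -a 127.0.0.1:8687, which can be run in the PaddleScience/ directory. The list returned by build_serve_command(host, pick_port(host)) can also be passed directly to subprocess.run.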
4. Organize Code and Submit¶
4.1 Organize Code¶
After completing example writing and training, and confirming that the results are correct, you can organize the code.
Use git commands to submit all new and modified code files as well as necessary documents, pictures, etc. to the local dev_model branch together.
4.2 Sync Upstream Code¶
During development, the upstream code may be updated, so execute the following commands to fetch the latest upstream code and merge it into your current branch.
git remote add upstream https://github.com/PaddlePaddle/PaddleScience.git
git fetch upstream develop:upstream_develop
git merge upstream_develop
If a conflict occurs, you need to resolve the conflict, then use git add and git commit -m "merge code of upstream" commands to commit the code to the local repository, and finally execute git push origin dev_model to push the code to your remote repository.
4.3 Submit Pull Request¶
Switch to the dev_model branch on the GitHub webpage, click "Contribute", then click the "Open pull request" button to submit the dev_model branch, containing your code, documents, pictures and other content, to PaddleScience as a pull request.