
A Transformer Model for Symbolic Regression towards Scientific Discovery

# Model training
# linux
wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer4sr/data_generated.tar.gz
# windows
# curl https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer4sr/data_generated.tar.gz -o data_generated.tar.gz
# extract the archive
tar -xzvf data_generated.tar.gz
python transformer4sr.py

# Model evaluation
# download the SRSD datasets from Hugging Face
git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_easy/
git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_medium/
git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_hard/
# run evaluation with the pretrained model
python transformer4sr.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/transformer4sr/transformer4sr_pretrained.pdparams

# Model export
python transformer4sr.py mode=export

# Model inference
python transformer4sr.py mode=infer
Pretrained Model Metrics

Pretrained model: transformer4sr_pretrained.pdparams

Dataset               Mean ZSS distance   Hit rate
srsd-feynman_easy     0.658 +- 0.390      8/30
srsd-feynman_medium   0.674 +- 0.331      8/37
srsd-feynman_hard     0.737 +- 0.188      1/39

1. Background Introduction

Symbolic regression (SR) searches for mathematical expressions that best describe a numerical dataset. SR methods fall roughly into three categories: genetic-programming-based (GP-based) SR, machine-learning-based (ML-based) SR, and deep-learning-based (DL-based) SR. GP-based SR algorithms are usually computationally expensive, so this case instead trains a Transformer model for symbolic regression and applies the best model to the SRSD datasets (Symbolic Regression for Scientific Discovery datasets) for inference and testing.

2. Problem Definition

The authors propose a Transformer-based SR model called Transformer4SR (A Transformer Model for Symbolic Regression towards Scientific Discovery). It addresses the closed-library problem: the predefined vocabulary of symbolic regression is converted into tokens using a specific method. In this case, the input data is generated by a program and passed through the model, and the predicted token sequence is then converted back to a symbolic representation, which is the symbolic regression result for that data.

The figure below shows the network structure of this method. It is based on the Transformer and has two main parts: an encoder and a decoder. The authors propose three encoder architectures: MLP, Att, and Mix. This case mainly implements the Mix encoder. The decoder is a standard Transformer decoder. During training, the encoder receives a tabular dataset and the decoder receives the ground-truth token sequence. During inference, the decoder receives no ground-truth tokens and predicts tokens autoregressively.

Figure: Transformer4SR model structure diagram
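The autoregressive inference described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the PaddleScience implementation: `next_token` is a hypothetical stand-in for the decoder, and the token index layout (0 padding, 1 `<SOS>`, 2 to 3 binary operators, 4 to 12 unary operators, 13 to 19 leaves) is taken from the temporary completeness check in the export code of section 4.

```python
from typing import Callable, List

PAD, SOS = 0, 1
BINARY = range(2, 4)   # binary operators (two children)
UNARY = range(4, 13)   # unary operators (one child)
LEAF = range(13, 20)   # leaves: no children


def is_complete(tokens: List[int]) -> bool:
    """Return True if the prefix-notation tokens form exactly one full tree."""
    open_slots = 1  # the root position still has to be filled
    for t in tokens:
        if t in (PAD, SOS):
            continue
        if t in BINARY:
            open_slots += 1  # fills one slot, opens two
        elif t in LEAF:
            open_slots -= 1  # fills one slot, opens none
        # a unary operator fills one slot and opens one: no change
    return open_slots == 0


def greedy_decode(next_token: Callable[[List[int]], int], max_len: int = 30) -> List[int]:
    """Append one predicted token at a time until the tree is complete."""
    tokens = [SOS]
    while len(tokens) < max_len and not is_complete(tokens):
        tokens.append(next_token(tokens))
    return tokens


# Scripted "decoder" that emits: binary operator, leaf, leaf.
script = iter([2, 14, 15, 16])
print(greedy_decode(lambda toks: next(script)))  # [1, 2, 14, 15]
```

Decoding stops as soon as every operator has received all of its arguments, so the fourth scripted token is never requested.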

3. Problem Solving

Next, we explain step by step how to express this problem in PaddleScience code and solve it with deep learning. To help you get familiar with PaddleScience quickly, only key steps such as data reading, model construction, and constraint construction are described below. For other details, please refer to the API Documentation.

3.1 Dataset Generation and Download

3.1.1 Data Generation

The training data used in this case is generated by the case itself. First, symbols from the symbol library are combined at random to generate a large number of formulas. Next, invalid or non-compliant formulas are filtered out by a screening mechanism. Finally, values are sampled from the remaining formulas to obtain the input datasets and the label data.
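The three stages (random generation, filtering, sampling) can be sketched as follows. This is a toy illustration only: the operator library, the sampling ranges, and the helper names `random_prefix`, `evaluate`, and `generate` are assumptions made for the example, and the actual logic lives in generate_datasets.py. The node limit and magnitude limit mirror the `num_nodes` and `order_of_mag_limit` filters of the configuration.

```python
import numpy as np

# Toy symbol library (assumption); the real one is cfg.DATA.vocab_library.
BINARY = {"add": np.add, "mul": np.multiply}
UNARY = {"sin": np.sin, "exp": np.exp}
LEAVES = ["C", "x1", "x2"]


def random_prefix(rng, max_nodes=15):
    """Grow a random expression in prefix notation, e.g. ['add', 'x1', 'C']."""
    tokens, open_slots = [], 1
    while open_slots > 0:
        # The smallest tree still reachable has len(tokens) + open_slots nodes.
        room = max_nodes - (len(tokens) + open_slots)
        kind = int(rng.integers(3)) if room >= 2 else 2  # 0 binary, 1 unary, 2 leaf
        if kind == 0:
            tokens.append(str(rng.choice(list(BINARY))))
            open_slots += 1
        elif kind == 1:
            tokens.append(str(rng.choice(list(UNARY))))
        else:
            tokens.append(str(rng.choice(LEAVES)))
            open_slots -= 1
    return tokens


def evaluate(tokens, env):
    """Evaluate a prefix expression; returns (value, unused tokens)."""
    head, rest = tokens[0], tokens[1:]
    if head in BINARY:
        left, rest = evaluate(rest, env)
        right, rest = evaluate(rest, env)
        return BINARY[head](left, right), rest
    if head in UNARY:
        arg, rest = evaluate(rest, env)
        return UNARY[head](arg), rest
    return env[head], rest


def generate(num_trials=200, sampling_times=50, limit=1.0e9, seed=0):
    rng = np.random.default_rng(seed)
    kept = []
    for _ in range(num_trials):
        tokens = random_prefix(rng)
        if not any(t.startswith("x") for t in tokens):
            continue  # filter: a formula must depend on at least one variable
        env = {
            "C": rng.uniform(-10.0, 10.0),
            "x1": rng.uniform(0.1, 10.0, sampling_times),
            "x2": rng.uniform(0.1, 10.0, sampling_times),
        }
        with np.errstate(all="ignore"):
            y, _ = evaluate(tokens, env)
        # filter: drop formulas whose values overflow or exceed the magnitude limit
        if np.all(np.isfinite(y)) and np.max(np.abs(y)) < limit:
            kept.append((tokens, np.column_stack([y, env["x1"], env["x2"]])))
    return kept


dataset = generate()
print(len(dataset), "formulas kept; first:", dataset[0][0])
```

Each kept entry pairs a formula (the label) with a table of observations whose first column is y, matching the (y, x1, x2, ...) layout named in the `var_type` comment.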

The parameter information for generating data is as follows:

DATA_GENERATE:
  # output path
  data_path: "./data_generated/"
  # filters
  num_nodes: [2, 15] # number of nodes
  num_nested_max: 6 # multiple levels of nesting
  num_consts: [1, 1] # number of constants(C)
  num_vars: [1, 6] # number of variables(x1,x2,...)
  seq_length_max: 30
  order_of_mag_limit: 1.0e+9 # magnitude of value
  # others
  num_init_trials: 100000 # number of initial trials
  num_sampling_per_eq: 25 # number of times to evaluate constants for each unique equation
  sampling_times: 50 # the number of observations
  var_type: "normal" # variable representation, 'normal' is (y, x1, x2, ...), 'log' is log(abs(y, x1, x2, ...)), or 'both'
  num_zfill: 8
DATA:

Here, num_init_trials is the number of randomly generated initial equations. The larger this value, the more data is generated. The original paper uses 1000000.

After setting relevant parameters, you can use the following command to generate the dataset:

python generate_datasets.py

3.1.2 Data Download

We also provide a pre-generated dataset that is 10 times smaller than the original training data (i.e., num_init_trials is 100000) for quick model training. It can be downloaded as follows:

wget -c https://paddle-org.bj.bcebos.com/paddlescience/datasets/transformer4sr/data_generated.tar
tar -xvf data_generated.tar

This case validates the model on the open-source symbolic regression benchmark SRSD (Rethinking Symbolic Regression Datasets and Benchmarks for Scientific Discovery), so these datasets must be downloaded in advance. They are hosted on Hugging Face as srsd-feynman_easy, srsd-feynman_medium, and srsd-feynman_hard, and can be downloaded from the corresponding web pages or with git:

git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_easy/
git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_medium/
git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_hard/

3.2 Data Reading

Since data reading and conversion are relatively complicated, dataset-related functions are defined in the functions_data.py file and called during model training, validation, etc.
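As a rough picture of what the conversion produces, the sketch below packs one tabular dataset into the 4-D layout that the export and inference code of section 4 use for the encoder input: (batch, sampling_times, number of variable slots, 1), with y in the first column and unused variable columns left at zero. The helper name `pack_encoder_input` is hypothetical and is not part of functions_data.py.

```python
import numpy as np


def pack_encoder_input(y, xs, num_var_max=7):
    """Stack y and the variables column-wise and zero-pad the unused columns."""
    if len(xs) > num_var_max - 1:
        raise ValueError("too many variables for the available columns")
    table = np.zeros((len(y), num_var_max), dtype=np.float32)
    table[:, 0] = y
    for k, x in enumerate(xs, start=1):
        table[:, k] = x
    # add a batch axis in front and a trailing axis of size 1
    return table[np.newaxis, :, :, np.newaxis]


rng = np.random.default_rng(0)
x1 = rng.uniform(0.1, 10.0, size=50)
x2 = rng.uniform(0.1, 10.0, size=50)
encoder_input = pack_encoder_input(25 * x1 + x2 * np.log(x1), [x1, x2])
print(encoder_input.shape)  # (1, 50, 7, 1)
```

Here 50 corresponds to `sampling_times` and 7 to `len(response_variable)` in the configuration.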

Reading the self-generated dataset during training:

# data
data_funcs = DataFuncs(
    cfg.DATA.data_path,
    cfg.DATA.vocab_library,
    cfg.DATA.seq_length_max,
    cfg.DATA.ratio,
    shuffle=True,
)

Reading the SRSD dataset during validation:

# data
data_funcs = SRSDDataFuncs(
    cfg.DATA.data_path_srsd,
    cfg.DATA.sampling_times,
    cfg.DATA.response_variable,
    cfg.DATA.vocab_library,
    cfg.DATA.seq_length_max,
    shuffle=True,
)

Data related parameters are defined in the yaml file:

data_path_srsd: ["./srsd-feynman_easy/"]
ratio: [0.8, 0.1, 0.1]
sampling_times: ${DATA_GENERATE.sampling_times}
seq_length_max: 30 # ${DATA_GENERATE.seq_length_max}
response_variable: ["y", "x1", "x2", "x3", "x4", "x5", "x6"] # maximum number of variables is len(response_variable)=7
vocab_library: [
    "add",
    "mul",
    "sin",

3.3 Model Construction

In this problem, we use a Transformer neural network as the model.

# set model
num_var_max = len(cfg.DATA.response_variable)
vocab_size = len(cfg.DATA.vocab_library) + 2
model = ppsci.arch.Transformer(
    **cfg.MODEL,
    num_var_max=num_var_max,
    vocab_size=vocab_size,
    seq_length=data_funcs.seq_length_max,
)

To access specific variables accurately and quickly during computation, the input variable names of the network are set to ("input", "target_seq") and the output variable name to ("output",). These names are used consistently in the subsequent code.
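To make the `+ 2` in `vocab_size` concrete, here is a minimal tokenization sketch. It assumes that index 0 is padding and index 1 is `<SOS>`, which is what the temporary completeness check in the export code of section 4 suggests, and that symbols are numbered from 2 in library order. The shortened `vocab_library` and the `encode` helper are hypothetical.

```python
PAD, SOS = 0, 1

# Shortened, hypothetical library; the real one is cfg.DATA.vocab_library.
vocab_library = ["add", "mul", "sin", "cos", "C", "x1", "x2"]
vocab_size = len(vocab_library) + 2  # two extra ids: padding and <SOS>

# Symbols are shifted by 2 so that ids 0 and 1 stay reserved.
token_to_id = {tok: i + 2 for i, tok in enumerate(vocab_library)}


def encode(prefix_tokens, seq_length_max=30):
    """Map a prefix-notation formula to a fixed-length row of token ids."""
    ids = [SOS] + [token_to_id[t] for t in prefix_tokens]
    if len(ids) > seq_length_max:
        raise ValueError("formula is longer than seq_length_max")
    return ids + [PAD] * (seq_length_max - len(ids))


# x1 + C in prefix notation is: add x1 C
print(encode(["add", "x1", "C"], seq_length_max=6))  # [1, 2, 7, 6, 0, 0]
```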

3.4 Optimizer Construction

This case uses the LambdaDecay learning rate strategy, which accepts a user-defined decay function. During training, the optimizer is called to update the model parameters. The commonly used Adam optimizer is selected here.

# set optimizer
def lr_lambda(step, d_model=cfg.MODEL.d_model, warmup=cfg.TRAIN.lr_warmup):
    if step == 0:
        step = 1
    lr = d_model ** (-0.5) * min(step ** (-0.5), step * warmup ** (-1.5))
    return lr

lr_scheduler = ppsci.optimizer.lr_scheduler.LambdaDecay(
    **cfg.TRAIN.lr_scheduler,
    lr_lambda=lr_lambda,
)()
optimizer = ppsci.optimizer.Adam(lr_scheduler, **cfg.TRAIN.adam)(model)
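The shape of this schedule is easier to see with numbers. The sketch below re-implements `lr_lambda` without PaddleScience and uses placeholder values d_model=256 and warmup=4000, which are assumptions for illustration and not necessarily the values in transformer4sr.yaml. The factor grows linearly during warmup, peaks at step == warmup, and then decays as the inverse square root of the step, which is the warmup schedule used in the original Transformer paper.

```python
def lr_factor(step, d_model=256, warmup=4000):
    """Warmup-then-decay factor: linear up to `warmup`, then step ** -0.5."""
    step = max(step, 1)  # avoid 0 ** -0.5
    return d_model ** (-0.5) * min(step ** (-0.5), step * warmup ** (-1.5))


for step in (1, 1000, 2000, 4000, 16000):
    print(f"step {step:>6}: factor {lr_factor(step):.3e}")

# Both branches of min() meet at step == warmup, where the factor is
# (d_model * warmup) ** -0.5.
peak = lr_factor(4000)
assert abs(peak - (256 * 4000) ** (-0.5)) < 1e-12
```

With these placeholder values, quadrupling the step after warmup (4000 to 16000) halves the factor.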

3.5 Constraint Construction

This case trains the model on a supervised dataset, so we construct a supervised constraint, SupervisedConstraint:

# set constraint
sup_constraint = ppsci.constraint.SupervisedConstraint(
    {
        "dataset": {
            "name": "NamedArrayDataset",
            "input": {
                "input": data_funcs.values_train.astype(paddle.get_default_dtype()),
                "target_seq": data_funcs.targets_train[:, :-1],
            },
            "label": {"output": data_funcs.targets_train[:, 1:]},
        },
        "batch_size": cfg.TRAIN.batch_size,
        "sampler": {
            "name": "BatchSampler",
            "drop_last": False,
            "shuffle": True,
        },
        "num_workers": 1,
    },
    ppsci.loss.FunctionalLoss(cross_entropy_loss_func),
    name="sup_constraint",
)

The first parameter of SupervisedConstraint is the data-loading configuration of the supervised constraint. Its dataset field describes the training dataset, with the following sub-fields:

  1. name: Dataset type; here NamedArrayDataset denotes an array-based dataset;
  2. input: Input data;
  3. label: Label data;

The batch_size field is the batch size.

The sampler field describes the sampling method, with the following sub-fields:

  1. name: Sampler type; here BatchSampler denotes a batch sampler;
  2. drop_last: Whether to discard the final samples that do not fill a complete mini-batch;
  3. shuffle: Whether to shuffle the sample indices;

The second parameter is the loss function. FunctionalLoss is a PaddleScience loss class that wraps a user-defined loss computation. Here the loss is implemented by its argument cross_entropy_loss_func, a function defined in the functions_loss_metric.py file, as shown below:

def cross_entropy_loss_func(output_dict, label_dict, *args):
    custom_loss = paddle.nn.CrossEntropyLoss(ignore_index=0, label_smoothing=0.0)
    loss = custom_loss(output_dict["output"], label_dict["output"])
    return {"ce_loss": loss}
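The effect of `ignore_index=0` is that padded positions contribute nothing to the loss. The NumPy sketch below illustrates that behavior under the assumption that the mean is taken over the non-ignored positions only. It is an illustration of the idea, not Paddle's implementation, and `masked_cross_entropy` is a hypothetical name.

```python
import numpy as np


def masked_cross_entropy(logits, labels, ignore_index=0):
    """Mean negative log-likelihood over positions whose label != ignore_index."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    mask = labels != ignore_index
    return -(picked * mask).sum() / mask.sum()


labels = np.array([[2, 3, 0]])  # the last position is padding
logits = np.zeros((1, 3, 4))    # uniform prediction over 4 tokens
loss = masked_cross_entropy(logits, labels)
print(loss, np.log(4))          # both are log(4): padding is not counted

# Changing the logits at the padded position leaves the loss unchanged.
logits[0, 2] = [5.0, -1.0, 0.0, 2.0]
assert np.isclose(masked_cross_entropy(logits, labels), loss)
```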

The third parameter is the name of the constraint. Each constraint needs a name so that it can be indexed later.

After the constraint is constructed, wrap it in a dictionary keyed by that name for later access:

# wrap constraints together
constraint = {sup_constraint.name: sup_constraint}
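The slicing of `targets_train` in the constraint above shifts each token row by one position: the decoder input drops the last token and the label drops the first, so the model learns to predict the next token at every position. A tiny NumPy illustration with made-up token ids (1 standing for `<SOS>` and 0 for padding, both assumptions):

```python
import numpy as np

# One padded target row: <SOS>, three formula tokens, two padding tokens.
targets = np.array([[1, 2, 7, 6, 0, 0]])

decoder_input = targets[:, :-1]  # starts with <SOS>
labels = targets[:, 1:]          # the same row shifted left by one

print(decoder_input.tolist())  # [[1, 2, 7, 6, 0]]
print(labels.tolist())         # [[2, 7, 6, 0, 0]]
```

Positions whose label is 0 are padding and are skipped by the loss through `ignore_index=0`.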

3.6 Validator Construction

During training, the current model is usually evaluated on the validation set (test set) at a fixed epoch interval, so ppsci.validate.SupervisedValidator is used to construct the validator. The construction process is similar to the constraint construction in section 3.5.

# set validator
sup_validator = ppsci.validate.SupervisedValidator(
    {
        "dataset": {
            "name": "NamedArrayDataset",
            "input": {
                "input": data_funcs.values_val.astype(paddle.get_default_dtype()),
                "target_seq": data_funcs.targets_val[:, :-1],
            },
            "label": {"output": data_funcs.targets_val[:, 1:]},
        },
        "batch_size": cfg.TRAIN.batch_size,
        "num_workers": 1,
    },
    ppsci.loss.FunctionalLoss(cross_entropy_loss_func),
    metric={"metric": ppsci.metric.FunctionalMetric(compute_inaccuracy)},
    name="sup_validator",
)

# wrap validator together
validator = {sup_validator.name: sup_validator}

The evaluation metric is the custom function compute_inaccuracy, defined in the functions_loss_metric.py file:

def compute_inaccuracy(
    output_dict: Dict[str, paddle.Tensor],
    label_dict: Dict[str, paddle.Tensor],
    *args,
) -> Dict[str, paddle.Tensor]:
    """Calculate the ratio of incorrectly matched tokens to the total number."""
    preds = output_dict["output"]
    labels = label_dict["output"]
    padding_not_mask = labels != 0
    correct_bool = paddle.equal(paddle.argmax(preds, axis=-1), labels)
    correct_bool = paddle.logical_and(
        correct_bool,
        padding_not_mask,
    )
    inacc = 1 - paddle.sum(correct_bool) / paddle.sum(padding_not_mask)
    return {"inaccuracy_mean": inacc}
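The same computation can be checked by hand on a tiny example. The NumPy sketch below mirrors `compute_inaccuracy` without Paddle. The helper name `token_inaccuracy` and the token ids are made up for illustration.

```python
import numpy as np


def token_inaccuracy(logits, labels, pad_id=0):
    """Share of non-padding positions where the argmax token is wrong."""
    not_pad = labels != pad_id
    correct = (logits.argmax(axis=-1) == labels) & not_pad
    return 1.0 - correct.sum() / not_pad.sum()


labels = np.array([[2, 3, 0, 0]])     # two real tokens, two padding tokens
predicted = np.array([[2, 1, 0, 0]])  # first token right, second token wrong
logits = np.eye(4)[predicted]         # one-hot scores, shape (1, 4, 4)

print(token_inaccuracy(logits, labels))  # 0.5
```

The two padded positions match their labels but are excluded, so the result is 1 wrong token out of 2 real ones.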

3.7 Hyperparameter Setting

Set training epochs and other parameters, as shown below.

      "x4",
      "x5",
      "x6",
    ] # vocab_size=len(vocab_library)+2 (indices 0 and 1 are reserved for padding and <SOS>)
# model settings
MODEL:
  input_keys: ["input", "target_seq"]

3.8 Model Training and Evaluation

After completing the above settings, pass the instantiated objects to ppsci.solver.Solver in order, and then start training and evaluation.

# initialize solver
solver = ppsci.solver.Solver(
    model,
    constraint,
    optimizer=optimizer,
    validator=validator,
    cfg=cfg,
)

# train model
solver.train()

# evaluate after finished training
solver.eval()

3.9 Model Verification and Result Visualization

This case validates the model on the open-source symbolic regression benchmark SRSD (Rethinking Symbolic Regression Datasets and Benchmarks for Scientific Discovery). During verification, the decoder runs autoregressively from the encoder input and is not given ground-truth tokens. The verification metric is the normalized ZSS distance, which is based on the tree edit distance between the predicted and the true expression trees.

num_repeat = cfg.EVAL.num_repeat if isinstance(data_funcs, SRSDDataFuncs) else 1
num_samples = data_funcs.values_test.shape[0]
zss_dist = np.zeros((num_repeat, num_samples))
for i in tqdm(range(num_repeat), desc="Evaluating"):
    encoder_input = paddle.to_tensor(
        data_funcs.values_test, dtype=paddle.get_default_dtype()
    )
    preds = model.decode_process(encoder_input, is_tree_complete)
    labels = paddle.to_tensor(data_funcs.targets_test)

    for j in range(num_samples):
        try:
            pred_simplify = simplify_output(preds[j], "tensor")
            zss_dist[i][j] = compute_norm_zss_dist(pred_simplify[0], labels[j])
        except Exception:
            zss_dist[i][j] = np.nan

    if i != num_repeat - 1:
        # reload data to increase randomness
        data_funcs.init_data("test")

zss_dist_mean = np.nanmean(zss_dist, axis=0)
zss_dist_std = np.nanstd(zss_dist, axis=0)
zss_dist_min = np.nanmin(zss_dist, axis=0)
zss_dist_max = np.nanmax(zss_dist, axis=0)

try:

The visualization code is defined in the file functions_vis.py. In addition to visualizing results on the validation data, it also visualizes a demo with the formula \(25*x1+x2*log(x1)\):

visualizer = VisualizeFuncs(model)
visualizer.visualize_valid_data(data_funcs.targets_test, data_funcs.values_test, 10)
visualizer.visualize_demo()
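The reported numbers are aggregated from the `zss_dist` array of shape (num_repeat, num_samples), in which NaN marks a prediction that could not be simplified. The sketch below shows that aggregation on made-up values. A formula counts as a hit if any repeat reaches a distance of exactly 0, as in the complete code of section 4. The helper name `summarize` is hypothetical.

```python
import numpy as np


def summarize(zss_dist):
    """Aggregate distances of shape (num_repeat, num_samples), ignoring NaNs."""
    per_formula_mean = np.nanmean(zss_dist, axis=0)
    overall_mean = float(np.nanmean(zss_dist))
    # NaN == 0 is False, so failed predictions never count as hits.
    hits = int(np.sum(np.any(zss_dist == 0, axis=0)))
    return per_formula_mean, overall_mean, hits


# 2 repeats x 3 formulas; values are invented for the example.
zss_dist = np.array(
    [
        [0.0, 0.5, np.nan],
        [0.2, 0.5, 0.8],
    ]
)
per_formula_mean, overall_mean, hits = summarize(zss_dist)
print(per_formula_mean)                   # [0.1 0.5 0.8]
print(f"hit rate: {hits}/{zss_dist.shape[1]}")  # hit rate: 1/3
```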

4. Complete Code

transformer4sr.py
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
Reference: https://github.com/omron-sinicx/transformer4sr
"""

import hydra
import numpy as np
import paddle
from functions_data import DataFuncs
from functions_data import SRSDDataFuncs
from functions_loss_metric import compute_inaccuracy
from functions_loss_metric import cross_entropy_loss_func
from functions_vis import VisualizeFuncs
from omegaconf import DictConfig
from tqdm import tqdm
from utils import compute_norm_zss_dist
from utils import is_tree_complete
from utils import simplify_output

import ppsci


def train(cfg: DictConfig):
    # data
    data_funcs = DataFuncs(
        cfg.DATA.data_path,
        cfg.DATA.vocab_library,
        cfg.DATA.seq_length_max,
        cfg.DATA.ratio,
        shuffle=True,
    )

    # set model
    num_var_max = len(cfg.DATA.response_variable)
    vocab_size = len(cfg.DATA.vocab_library) + 2
    model = ppsci.arch.Transformer(
        **cfg.MODEL,
        num_var_max=num_var_max,
        vocab_size=vocab_size,
        seq_length=data_funcs.seq_length_max,
    )

    # set optimizer
    def lr_lambda(step, d_model=cfg.MODEL.d_model, warmup=cfg.TRAIN.lr_warmup):
        if step == 0:
            step = 1
        lr = d_model ** (-0.5) * min(step ** (-0.5), step * warmup ** (-1.5))
        return lr

    lr_scheduler = ppsci.optimizer.lr_scheduler.LambdaDecay(
        **cfg.TRAIN.lr_scheduler,
        lr_lambda=lr_lambda,
    )()
    optimizer = ppsci.optimizer.Adam(lr_scheduler, **cfg.TRAIN.adam)(model)

    # set constraint
    sup_constraint = ppsci.constraint.SupervisedConstraint(
        {
            "dataset": {
                "name": "NamedArrayDataset",
                "input": {
                    "input": data_funcs.values_train.astype(paddle.get_default_dtype()),
                    "target_seq": data_funcs.targets_train[:, :-1],
                },
                "label": {"output": data_funcs.targets_train[:, 1:]},
            },
            "batch_size": cfg.TRAIN.batch_size,
            "sampler": {
                "name": "BatchSampler",
                "drop_last": False,
                "shuffle": True,
            },
            "num_workers": 1,
        },
        ppsci.loss.FunctionalLoss(cross_entropy_loss_func),
        name="sup_constraint",
    )

    # wrap constraints together
    constraint = {sup_constraint.name: sup_constraint}

    # set validator
    sup_validator = ppsci.validate.SupervisedValidator(
        {
            "dataset": {
                "name": "NamedArrayDataset",
                "input": {
                    "input": data_funcs.values_val.astype(paddle.get_default_dtype()),
                    "target_seq": data_funcs.targets_val[:, :-1],
                },
                "label": {"output": data_funcs.targets_val[:, 1:]},
            },
            "batch_size": cfg.TRAIN.batch_size,
            "num_workers": 1,
        },
        ppsci.loss.FunctionalLoss(cross_entropy_loss_func),
        metric={"metric": ppsci.metric.FunctionalMetric(compute_inaccuracy)},
        name="sup_validator",
    )

    # wrap validator together
    validator = {sup_validator.name: sup_validator}

    # initialize solver
    solver = ppsci.solver.Solver(
        model,
        constraint,
        optimizer=optimizer,
        validator=validator,
        cfg=cfg,
    )

    # train model
    solver.train()

    # evaluate after finished training
    solver.eval()


def evaluate(cfg: DictConfig):
    # data
    data_funcs = SRSDDataFuncs(
        cfg.DATA.data_path_srsd,
        cfg.DATA.sampling_times,
        cfg.DATA.response_variable,
        cfg.DATA.vocab_library,
        cfg.DATA.seq_length_max,
        shuffle=True,
    )

    # set model
    num_var_max = len(cfg.DATA.response_variable)
    vocab_size = len(cfg.DATA.vocab_library) + 2
    model = ppsci.arch.Transformer(
        **cfg.MODEL,
        num_var_max=num_var_max,
        vocab_size=vocab_size,
        seq_length=data_funcs.seq_length_max,
    )
    ppsci.utils.save_load.load_pretrain(model, path=cfg.EVAL.pretrained_model_path)
    model.eval()

    # evaluate
    num_repeat = cfg.EVAL.num_repeat if isinstance(data_funcs, SRSDDataFuncs) else 1
    num_samples = data_funcs.values_test.shape[0]
    zss_dist = np.zeros((num_repeat, num_samples))
    for i in tqdm(range(num_repeat), desc="Evaluating"):
        encoder_input = paddle.to_tensor(
            data_funcs.values_test, dtype=paddle.get_default_dtype()
        )
        preds = model.decode_process(encoder_input, is_tree_complete)
        labels = paddle.to_tensor(data_funcs.targets_test)

        for j in range(num_samples):
            try:
                pred_simplify = simplify_output(preds[j], "tensor")
                zss_dist[i][j] = compute_norm_zss_dist(pred_simplify[0], labels[j])
            except Exception:
                zss_dist[i][j] = np.nan

        if i != num_repeat - 1:
            # reload data to increase randomness
            data_funcs.init_data("test")

    zss_dist_mean = np.nanmean(zss_dist, axis=0)
    zss_dist_std = np.nanstd(zss_dist, axis=0)
    zss_dist_min = np.nanmin(zss_dist, axis=0)
    zss_dist_max = np.nanmax(zss_dist, axis=0)

    try:
        keys = data_funcs.keys_test
        assert len(keys) == num_samples
    except Exception:
        keys = [f"sample_{i}" for i in range(num_samples)]

    print(
        f"zss_distance and accuracy in {num_repeat} attempts of {num_samples} samples with format: name => mean +- std | min ~ max"
    )
    for i in range(num_samples):
        key = keys[i]
        print(
            f"{key} => {zss_dist_mean[i]:.3f} +- {zss_dist_std[i]:.3f} | {zss_dist_min[i]:.3f} ~ {zss_dist_max[i]:.3f}"
        )

    print("-----------")
    print(
        f"=> Mean ZSS distance: {np.nanmean(zss_dist):.3f} +- {np.nanstd(zss_dist):.3f}"
    )
    print(f"=> Hit rate: {np.sum(np.any(zss_dist==0, axis=0))}/{zss_dist.shape[1]}")

    # visualize prediction
    visualizer = VisualizeFuncs(model)
    visualizer.visualize_valid_data(data_funcs.targets_test, data_funcs.values_test, 10)
    visualizer.visualize_demo()


def export(cfg: DictConfig):
    def temporary_complete_func(seq_indices):
        """Temporary replacement: utils.is_tree_complete does not work in static graph mode yet."""
        arity = 1  # number of tree positions that still need a token
        for n in seq_indices:
            n = n.item()
            if n == 0 or n == 1:
                # padding or <SOS>: not part of the expression tree
                continue
            if n == 2 or n == 3:
                # binary operator: fills one position, opens two
                arity = arity + 2 - 1
            elif n in range(4, 13):
                # unary operator: fills one position, opens one
                arity = arity + 1 - 1
            elif n in range(13, 20):
                # leaf: fills one position, opens none
                arity = arity + 0 - 1
        return arity == 0

    class WarppedModel(ppsci.arch.Transformer):
        def __init__(self, *args, complete_func, **kwargs):
            super().__init__(*args, **kwargs)
            self.complete_func = complete_func

        def forward(self, x):
            return {"output": self.decode_process(x["input"], self.complete_func)}

    # set model
    num_var_max = len(cfg.DATA.response_variable)
    vocab_size = len(cfg.DATA.vocab_library) + 2
    warpped_model = WarppedModel(
        **cfg.MODEL,
        num_var_max=num_var_max,
        vocab_size=vocab_size,
        seq_length=cfg.DATA.seq_length_max,
        complete_func=temporary_complete_func,
    )
    warpped_model.eval()

    # initialize solver
    solver = ppsci.solver.Solver(
        warpped_model,
        pretrained_model_path=cfg.INFER.pretrained_model_path,
    )

    # export model
    from paddle.static import InputSpec

    input_spec = [
        {
            "input": InputSpec(
                [None, cfg.DATA.sampling_times, len(cfg.DATA.response_variable), 1],
                "float32",
                name="input",
            )
        }
    ]
    solver.export(input_spec, cfg.INFER.export_path)


def inference(cfg: DictConfig):
    import sympy

    from deploy.python_infer import pinn_predictor

    predictor = pinn_predictor.PINNPredictor(cfg)

    C, y, x1, x2, x3, x4, x5, x6 = sympy.symbols(
        "C, y, x1, x2, x3, x4, x5, x6", real=True, positive=True
    )
    y = 25 * x1 + x2 * sympy.log(x1)
    print("The ground truth is:", y)

    x1_values = np.power(10.0, np.random.uniform(-1.0, 1.0, size=50))
    x2_values = np.power(10.0, np.random.uniform(-1.0, 1.0, size=50))
    f = sympy.lambdify([x1, x2], y)
    y_values = f(x1_values, x2_values)
    dataset = np.zeros((50, 7))
    dataset[:, 0] = y_values
    dataset[:, 1] = x1_values
    dataset[:, 2] = x2_values
    encoder_input = dataset[np.newaxis, :, :, np.newaxis].astype(np.float32)
    output_dict = predictor.predict({"input": encoder_input}, cfg.INFER.batch_size)
    output_dict = {
        store_key: output_dict[infer_key]
        for store_key, infer_key in zip(("output",), output_dict.keys())
    }
    sympy_pred = simplify_output(output_dict["output"][0], "sympy")
    print("The prediction is:", sympy_pred)


@hydra.main(version_base=None, config_path="./conf", config_name="transformer4sr.yaml")
def main(cfg: DictConfig):
    if cfg.mode == "train":
        train(cfg)
    elif cfg.mode == "eval":
        evaluate(cfg)
    elif cfg.mode == "export":
        export(cfg)
    elif cfg.mode == "infer":
        inference(cfg)
    else:
        raise ValueError(
            f"cfg.mode should be in ['train', 'eval', 'export', 'infer'], but got '{cfg.mode}'"
        )


if __name__ == "__main__":
    main()

5. Result Display

The figure below shows the model's prediction result on the formula \(25*x1+x2*log(x1)\).

Figure: Prediction result on the demo formula

Here \(C\) denotes a constant. The model's prediction is basically consistent with the true formula.

6. References

@inproceedings{lalande2023,
    title = {A Transformer Model for Symbolic Regression towards Scientific Discovery},
    author = {Florian Lalande and Yoshitomo Matsubara and Naoya Chiba and Tatsunori Taniai and Ryo Igarashi and Yoshitaka Ushiku},
    booktitle = {NeurIPS 2023 AI for Science Workshop},
    year = {2023},
    url = {https://openreview.net/forum?id=AIfqWNHKjo},
}