Random NN models

This tutorial shows how to build neural network models. We will learn how to calculate compositional descriptors using xenonpy.descriptor.Compositions calculator and train our model using xenonpy.model.training modules.

In this tutorial, we will use some inorganic sample data from materials project. If you don’t have it, please see https://github.com/yoshida-lab/XenonPy/blob/master/samples/build_sample_data.ipynb.

useful functions

Running the following cell will load some commonly used packages, such as numpy, pandas, and so on. It will also import some in-house functions used in this tutorial. See samples/tools.ipynb to check what will be imported.

[1]:
%run tools.ipynb

Sequential linear model

We will use xenonpy.model.SequentialLinear to build a sequential linear model. The basic layer in SequentialLinear model is xenonpy.model.LinearLayer

[2]:
from xenonpy.model import SequentialLinear, LinearLayer
SequentialLinear?
Init signature:
SequentialLinear(
    in_features: int,
    out_features: int,
    bias: bool = True,
    *,
    h_neurons: Union[Tuple[float, ...], Tuple[int, ...]] = (),
    h_bias: Union[bool, Tuple[bool, ...]] = True,
    h_dropouts: Union[float, Tuple[float, ...]] = 0.1,
    h_normalizers: Union[float, NoneType, Tuple[Union[float, NoneType], ...]] = 0.1,
    h_activation_funcs: Union[Callable, NoneType, Tuple[Union[Callable, NoneType], ...]] = ReLU(),
)
Docstring:
Sequential model with linear layers and configurable other hype-parameters.
e.g. ``dropout``, ``hidden layers``
Init docstring:
Parameters
----------
in_features
    Size of input.
out_features
    Size of output.
bias
    Enable ``bias`` in input layer.
h_neurons
    Number of neurons in hidden layers.
    Can be a tuple of floats. In that case,
    all these numbers will be used to calculate the neuron numbers.
    e.g. (0.5, 0.4, ...) will be expanded as (in_features * 0.5, in_features * 0.4, ...)
h_bias
    ``bias`` in hidden layers.
h_dropouts
    Probabilities of dropout in hidden layers.
h_normalizers
    Momentum of batched normalizers in hidden layers.
h_activation_funcs
    Activation functions in hidden layers.
File:           ~/projects/XenonPy/xenonpy/model/sequential.py
Type:           type
Subclasses:

[3]:
LinearLayer?
Init signature:
LinearLayer(
    in_features: int,
    out_features: int,
    bias: bool = True,
    *,
    dropout: float = 0.0,
    activation_func: Callable = ReLU(),
    normalizer: Union[float, NoneType] = 0.1,
)
Docstring:
Base NN layer. This is a wrap around PyTorch.
See here for details: http://pytorch.org/docs/master/nn.html#
Init docstring:
Parameters
----------
in_features:
    Size of each input sample.
out_features:
    Size of each output sample
dropout: float
    Probability of an element to be zeroed. Default: 0.5
activation_func: func
    Activation function.
normalizer: func
    Normalization layers
File:           ~/projects/XenonPy/xenonpy/model/sequential.py
Type:           type
Subclasses:

Following the official suggestion from PyTorch, SequentialLinear is a python class that inherit the torch.nn.Module class. Users can specify the hyperparameters to get a fully customized model object. For example:

[4]:
model = SequentialLinear(290, 1, h_neurons=(0.8, 0.7, 0.6))
model
[4]:
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=232, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(232, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=232, out_features=203, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(203, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_2): LinearLayer(
    (linear): Linear(in_features=203, out_features=174, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(174, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=174, out_features=1, bias=True)
)

fully randomized model generation

Process of random model generation:

  1. using a random parameter generator to generate a set of parameter.

  2. using the generated parameter set to setup a model object.

  3. loop step 1 and 2 as many times as needed.

We provided a general parameter generator xenonpy.utils.ParameterGenerator to do all the 3 steps for any callable object.

[5]:
from xenonpy.utils import ParameterGenerator
ParameterGenerator?
Init signature:
ParameterGenerator(
    seed: Union[int, NoneType] = None,
    **kwargs: Union[Any, Sequence, Callable, Dict],
)
Docstring:      Generator for parameter set generating.
Init docstring:
Parameters
----------
seed
    Numpy random seed.
kwargs
    Parameter candidate.
File:           ~/projects/XenonPy/xenonpy/utils/parameter_gen.py
Type:           type
Subclasses:

Calling an instance of ParameterGenerator will return a generator. This generator can randomly select parameters from parameter candidates and yield them as a dict. This is what we want in step 1.

[6]:
from math import ceil
from random import uniform

generator = ParameterGenerator(
    in_features=290,
    out_features=1,
    h_neurons=dict(
        data=[ceil(uniform(0.1, 1.2) * 290) for _ in range(100)],
        repeat=(2, 3)
    )
)

Because generator is a generator, for ... in ... statement can be applied. For example, we can use for parameters in generator(num_of_models) to loop all these generated parameter sets.

[7]:
for parameters in generator(num=10):
    print(parameters)
{'in_features': 290, 'out_features': 1, 'h_neurons': (70, 109)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (135, 45)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (216, 261, 79)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (111, 88, 49)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (216, 79, 47)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (161, 45)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (193, 161, 78)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (90, 36, 234)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (230, 83, 295)}
{'in_features': 290, 'out_features': 1, 'h_neurons': (171, 247)}

For step 2, we can give a model class to the factory parameter as a factory function. If factory parameter is given, generator will feed generated parameters to the factory function automatically and yield the result as a second return in each loop. For example:

[8]:
for parameters, model in generator(2, factory=SequentialLinear):
    print('parameters: ', parameters)
    print(model, '\n')
parameters:  {'in_features': 290, 'out_features': 1, 'h_neurons': (182, 335, 204)}
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=182, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(182, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=182, out_features=335, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(335, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_2): LinearLayer(
    (linear): Linear(in_features=335, out_features=204, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(204, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=204, out_features=1, bias=True)
)

parameters:  {'in_features': 290, 'out_features': 1, 'h_neurons': (108, 255)}
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=108, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(108, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=108, out_features=255, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(255, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=255, out_features=1, bias=True)
)

We also enable users to generate parameters in a functional way by given a function as candidate parameters. In this case, the function accept an int number as vector length and a vector of generated parameters. For example, generate a n length vector with float numbers sorted in ascending.

[9]:
generator = ParameterGenerator(
    in_features=290,
    out_features=1,
    h_neurons=dict(
        data=lambda n: sorted(np.random.uniform(0.2, 0.8, size=n), reverse=True),
        repeat=(2, 3)
    )
)
[10]:
for parameters, model in generator(2, factory=SequentialLinear):
    print('parameters: ', parameters)
    print(model, '\n')
parameters:  {'in_features': 290, 'out_features': 1, 'h_neurons': (0.7954011465554711, 0.6993610045591458)}
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=231, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(231, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=231, out_features=203, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(203, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=203, out_features=1, bias=True)
)

parameters:  {'in_features': 290, 'out_features': 1, 'h_neurons': (0.798810246990777, 0.38584535382368323, 0.29569841893337045)}
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=232, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(232, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=232, out_features=112, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(112, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_2): LinearLayer(
    (linear): Linear(in_features=112, out_features=86, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(86, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=86, out_features=1, bias=True)
)

Model training

We provide a general and extendable model training system for neural network models. By customizing the extensions, users can fully control their model training process, and save the training results following the XenonPy.MDL format automatically.

All these modules and extensions are under the xenonpy.model.training.

[11]:
import torch
import matplotlib.pyplot as plt
from pymatgen import Structure

from xenonpy.model.training import Trainer, SGD, MSELoss, Adam, ReduceLROnPlateau, ExponentialLR, ClipValue

from xenonpy.datatools import preset, Splitter
from xenonpy.descriptor import Compositions

prepare training/testing data

If you followed the tutorial in https://github.com/yoshida-lab/XenonPy/blob/master/samples/build_sample_data.ipynb, the sample data should be save at ~/.xenonpy/userdata with name mp_samples.pd.xz. You can use pd.read_pickle to load this file but we suggest that you use our xenonpy.datatools.preset module.

We chose volume as the example to demonstrate how to use our training system. We only use the data which volume are smaller than 2,500 to avoid the disparate data. we select data in pandas.DataFrame and calculate the descriptors using xenonpy.descriptor.Compositions calculator. It is noticed that the input for a neuron network model in pytorch must has shape (N x M x …), which mean if the input is a 1-D vector, it should be reshaped to 2-D, e.g. (N) -> (N x 1).

[12]:
# if you have not have the samples data
# preset.build('mp_samples', api_key=<your materials project api key>)

from xenonpy.datatools import preset

data = preset.mp_samples
data.head(3)
[12]:
band_gap composition density e_above_hull efermi elements final_energy_per_atom formation_energy_per_atom pretty_formula structure volume
mp-1008807 0.0000 {'Rb': 1.0, 'Cu': 1.0, 'O': 1.0} 4.784634 0.996372 1.100617 [Rb, Cu, O] -3.302762 -0.186408 RbCuO [[-3.05935361 -3.05935361 -3.05935361] Rb, [0.... 57.268924
mp-1009640 0.0000 {'Pr': 1.0, 'N': 1.0} 8.145777 0.759393 5.213442 [Pr, N] -7.082624 -0.714336 PrN [[0. 0. 0.] Pr, [1.57925232 1.57925232 1.58276... 31.579717
mp-1016825 0.7745 {'Hf': 1.0, 'Mg': 1.0, 'O': 3.0} 6.165888 0.589550 2.424570 [Hf, Mg, O] -7.911723 -3.060060 HfMgO3 [[2.03622802 2.03622802 2.03622802] Hf, [0. 0.... 67.541269

In case of the system did not automatically download some descriptor data in XenonPy, please run the following code to sync the data.

[13]:
from xenonpy.datatools import preset
preset.sync('elements')
preset.sync('elements_completed')
fetching dataset `elements` from https://github.com/yoshida-lab/dataset/releases/download/v0.1.3/elements.pd.xz.
fetching dataset `elements_completed` from https://github.com/yoshida-lab/dataset/releases/download/v0.1.3/elements_completed.pd.xz.
[14]:
prop = data[data.volume <= 2500]['volume'].to_frame()  # reshape to 2-D
desc = Compositions(featurizers='classic').transform(data.loc[prop.index]['composition'])

desc.head(3)
prop.head(3)
[14]:
ave:atomic_number ave:atomic_radius ave:atomic_radius_rahm ave:atomic_volume ave:atomic_weight ave:boiling_point ave:bulk_modulus ave:c6_gb ave:covalent_radius_cordero ave:covalent_radius_pyykko ... min:num_s_valence min:period min:specific_heat min:thermal_conductivity min:vdw_radius min:vdw_radius_alvarez min:vdw_radius_mm3 min:vdw_radius_uff min:sound_velocity min:Polarizability
mp-1008807 24.666667 174.067140 209.333333 25.666667 55.004267 1297.063333 72.868680 1646.90 139.333333 128.333333 ... 1.0 2.0 0.360 0.02658 152.0 150.0 182.0 349.5 317.5 0.802
mp-1009640 33.000000 137.000000 232.500000 19.050000 77.457330 1931.200000 43.182441 1892.85 137.000000 123.500000 ... 2.0 2.0 0.192 0.02583 155.0 166.0 193.0 360.6 333.6 1.100
mp-1016825 21.600000 153.120852 203.400000 13.920000 50.158400 1420.714000 76.663625 343.82 102.800000 96.000000 ... 2.0 2.0 0.146 0.02658 152.0 150.0 182.0 302.1 317.5 0.802

3 rows × 290 columns

[14]:
volume
mp-1008807 57.268924
mp-1009640 31.579717
mp-1016825 67.541269
[15]:
from xenonpy.datatools import preset

Use the xenonpy.datatools.Splitter to split data into training and test sets.

[16]:
Splitter?
Init signature:
Splitter(
    size: int,
    *,
    test_size: Union[float, int] = 0.2,
    k_fold: Union[int, Iterable, NoneType] = None,
    random_state: Union[int, NoneType] = None,
    shuffle: bool = True,
)
Docstring:      Data splitter for train and test
Init docstring:
Parameters
----------
size
    Total sample size.
    All data must have same length of their first dim,
test_size
    If float, should be between ``0.0`` and ``1.0`` and represent the proportion
    of the dataset to include in the test split. If int, represents the
    absolute number of test samples. Can be ``0`` if cv is ``None``.
    In this case, :meth:`~Splitter.cv` will yield a tuple only contains ``training`` and ``validation``
    on each step. By default, the value is set to 0.2.
k_fold
    Number of k-folds.
    If ``int``, Must be at least 2.
    If ``Iterable``, it should provide label for each element which will be used for group cv.
    In this case, the input of :meth:`~Splitter.cv` must be a :class:`pandas.DataFrame` object.
    Default value is None to specify no cv.
random_state
    If int, random_state is the seed used by the random number generator;
    Default is None.
shuffle
    Whether or not to shuffle the data before splitting.
File:           ~/projects/XenonPy/xenonpy/datatools/splitter.py
Type:           type
Subclasses:

[17]:
sp = Splitter(prop.shape[0])
x_train, x_val, y_train, y_val = sp.split(desc, prop)

x_train.shape
y_train.shape
x_val.shape
y_val.shape
[17]:
(738, 290)
[17]:
(738, 1)
[17]:
(185, 290)
[17]:
(185, 1)

model training

Model training in pytorch is very flexible. The official document explains the concept with examples. In short, training a neuron network model includes: 1. calculating loss of training data for the model. 2. executing the backpropagation to update the weights between each neuron. 3. looping over step 1 and 2 until convergence.

In step 1, a calculator, often called loss function, is needed to calculate the loss. In step 2, an optimizer will be used. xenonpy.model.training.Trainer covers step 3.

Here is how you will do it in XenonPy:

[18]:
model = SequentialLinear(290, 1, h_neurons=(0.8, 0.6, 0.4, 0.2))
model
[18]:
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=232, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(232, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=232, out_features=174, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(174, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_2): LinearLayer(
    (linear): Linear(in_features=174, out_features=116, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(116, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_3): LinearLayer(
    (linear): Linear(in_features=116, out_features=58, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(58, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=58, out_features=1, bias=True)
)
[19]:
trainer = Trainer(
    model=model,
    optimizer=Adam(lr=0.01),
    loss_func=MSELoss()
)
trainer
[19]:
Trainer(clip_grad=None, cuda=None, epochs=200, loss_func=MSELoss(),
        lr_scheduler=None,
        model=SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=232, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(232, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Li...
    (linear): Linear(in_features=116, out_features=58, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(58, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=58, out_features=1, bias=True)
),
        non_blocking=False,
        optimizer=Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.01
    weight_decay: 0
))
[20]:
trainer.fit(
    x_train=torch.tensor(x_train.values, dtype=torch.float),
    y_train=torch.tensor(y_train.values, dtype=torch.float),
    epochs=400
)
Training: 100%|██████████| 400/400 [00:06<00:00, 60.86it/s]
[21]:
_, ax = plt.subplots(figsize=(10, 5), dpi=100)
trainer.training_info.tail(3)
trainer.training_info.plot(y=['train_mse_loss'], ax=ax)
[21]:
total_iters i_epoch i_batch train_mse_loss
397 397 398 1 3684.717041
398 398 399 1 3058.761719
399 399 400 1 3039.683350
[21]:
<matplotlib.axes._subplots.AxesSubplot at 0x1a2ce256d0>
../_images/tutorials_4-random_nn_model_and_training_37_2.png
[22]:
y_pred = trainer.predict(x_in=torch.tensor(x_val.values, dtype=torch.float)).detach().numpy().flatten()
y_true = y_val.values.flatten()

y_fit_pred = trainer.predict(x_in=torch.tensor(x_train.values, dtype=torch.float)).detach().numpy().flatten()
y_fit_true = y_train.values.flatten()

draw(y_true, y_pred, y_fit_true, y_fit_pred, prop_name='Volume ($\AA^3$)')
Missing directory and/or file name information!
../_images/tutorials_4-random_nn_model_and_training_38_1.png

You can see that although trainer can keep training info automatically for us, we still need to convert DataFrame to torch.Tensor and convert dtype from np.double to torch.float during training, and eventually convert them back during prediction ourselves. Also, overfitting could be observed in the prediction vs observation plot. One possible caused is using all training data in each epoch of the training process. We can use a technique called mini-batch to avoid overfitting, but that will require more coding.

To skip these tedious coding steps, we provide an extension system.

First, we use ArrayDataset to wrap our data, and use torch.utils.data.DataLoader to a build mini-batch loader.

[23]:
from xenonpy.model.training.dataset import ArrayDataset
from torch.utils.data import DataLoader
[24]:
train_dataset = DataLoader(ArrayDataset(x_train, y_train), shuffle=True, batch_size=100)
val_dataset = DataLoader(ArrayDataset(x_val, y_val), batch_size=1000)

Second, we use TensorConverter extension to automatically convert data between numpy and torch.Tensor. Additionally, we want to trace some stopping criterion and use early stopping when these criterion stop improving. This can be done by using Validator.

At last, to save our model for latter use, just add Persist to the trainer and saving will be done in slient.

[25]:
from xenonpy.model.training.extension import Validator, TensorConverter, Persist
from xenonpy.model.training.dataset import ArrayDataset
from xenonpy.model.utils import regression_metrics

We use trainer.extend method to stack up extensions. Note that extensions will be executed in the order of insertion, so TensorConverter should always go first and Persist be the last.

[26]:
trainer = Trainer(
    optimizer=Adam(lr=0.01),
    loss_func=MSELoss(),
).extend(
    TensorConverter(),
    Validator(metrics_func=regression_metrics, early_stopping=30, trace_order=5, mae=0.0, pearsonr=1.0),
)
[27]:
Persist?
Init signature:
Persist(
    path: Union[pathlib.Path, str] = '.',
    *,
    model_class: Callable = None,
    model_params: Union[tuple, dict, <built-in function any>] = None,
    increment=False,
    sync_training_step=False,
    **describe: Any,
)
Docstring:      Trainer extension for data persistence
Init docstring:
Parameters
----------
path
    Path for model saving.
model_class
    A factory function for model reconstructing.
    In most case this is the model class inherits from :class:`torch.nn.Module`
model_params
    The parameters for model reconstructing.
    This can be anything but in general this is a dict which can be used as kwargs parameters.
increment
    If ``True``, dir name of path will be decorated with a auto increment number,
    e.g. use ``model_dir@1`` for ``model_dir``.
sync_training_step
    If ``True``, will save ``trainer.training_info`` at each iteration.
    Default is ``False``, only save ``trainer.training_info`` at each epoch.
describe:
    Any other information to describe this model.
    These information will be saved under model dir by name ``describe.pkl.z``.
File:           ~/projects/XenonPy/xenonpy/model/training/extension/persist.py
Type:           type
Subclasses:

Persist need a path to save model. For convenience, we prepare the path name by concatenating the number of neurons in a model.

[28]:
def make_name(model):
    name = []
    for n, m in model.named_children():
        if 'layer_' in n:
            name.append(str(m.linear.in_features))
        else:
            name.append(str(m.in_features))
            name.append(str(m.out_features))
    return '-'.join(name)
[29]:
model = SequentialLinear(290, 1, h_neurons=(0.8, 0.6, 0.4, 0.2))
model

model_name = make_name(model)
persist = Persist(
    f'trained_models/{model_name}',
    # -^- required -^-

    # -v- optional -v-
    increment=False,
    sync_training_step=True,
    author='Chang Liu',
    email='liu.chang.1865@gmail.com',
    dataset='materials project',
)
_ = trainer.extend(persist)
trainer.reset(to=model)

trainer.fit(training_dataset=train_dataset, validation_dataset=val_dataset, epochs=400)
persist(splitter=sp, data_indices=prop.index.tolist())  # <-- calling of this method only after the model training
[29]:
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=232, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(232, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=232, out_features=174, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(174, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_2): LinearLayer(
    (linear): Linear(in_features=174, out_features=116, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(116, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_3): LinearLayer(
    (linear): Linear(in_features=116, out_features=58, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(58, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=58, out_features=1, bias=True)
)
Training:  12%|█▏        | 47/400 [00:15<02:06,  2.78it/s]
Early stopping is applied: no improvement for ['mae', 'pearsonr'] since the last 31 iterations, finish training at iteration 372

[30]:
_, ax = plt.subplots(figsize=(10, 5), dpi=150)
trainer.training_info.plot(y=['train_mse_loss', 'val_mse'], ax=ax)
[30]:
<matplotlib.axes._subplots.AxesSubplot at 0x1a2d100e10>
../_images/tutorials_4-random_nn_model_and_training_51_1.png
[31]:
y_pred, y_true = trainer.predict(dataset=val_dataset, checkpoint='pearsonr_1')
y_fit_pred, y_fit_true = trainer.predict(dataset=train_dataset, checkpoint='pearsonr_1')
draw(y_true, y_pred, y_fit_true, y_fit_pred, prop_name='Volume ($\AA^3$)')
Missing directory and/or file name information!
../_images/tutorials_4-random_nn_model_and_training_52_1.png

Combine random model generating and training

[32]:
generator = ParameterGenerator(
    in_features=290,
    out_features=1,
    h_neurons=dict(
        data=lambda n: sorted(np.random.uniform(0.2, 0.8, size=n), reverse=True),
        repeat=(3, 4)
    )
)
[33]:
for paras, model in generator(num=2, factory=SequentialLinear):
    print(model)
    model_name = make_name(model)
    persist = Persist(
        f'trained_models/{model_name}',
        # -^- required -^-

        # -v- optional -v-
        increment=False,
        sync_training_step=True,
        model_class=SequentialLinear,
        model_params=paras,
        author='Chang Liu',
        email='liu.chang.1865@gmail.com',
        dataset='materials project',
    )
    _ = trainer.extend(persist)
    trainer.reset(to=model)

    trainer.fit(training_dataset=train_dataset, validation_dataset=val_dataset, epochs=400)
    persist(splitter=sp, data_indices=prop.index.tolist())  # <-- calling of this method only after the model training
Training:   0%|          | 0/400 [00:00<?, ?it/s]
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=195, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(195, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=195, out_features=141, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(141, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_2): LinearLayer(
    (linear): Linear(in_features=141, out_features=131, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(131, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_3): LinearLayer(
    (linear): Linear(in_features=131, out_features=61, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(61, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=61, out_features=1, bias=True)
)
Training:  12%|█▏        | 48/400 [00:15<02:06,  2.79it/s]
Training:   0%|          | 0/400 [00:00<?, ?it/s]
Early stopping is applied: no improvement for ['mae', 'pearsonr'] since the last 31 iterations, finish training at iteration 380
SequentialLinear(
  (layer_0): LinearLayer(
    (linear): Linear(in_features=290, out_features=214, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(214, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_1): LinearLayer(
    (linear): Linear(in_features=214, out_features=181, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(181, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_2): LinearLayer(
    (linear): Linear(in_features=181, out_features=141, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(141, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (layer_3): LinearLayer(
    (linear): Linear(in_features=141, out_features=130, bias=True)
    (dropout): Dropout(p=0.1)
    (normalizer): BatchNorm1d(130, eps=0.1, momentum=0.1, affine=True, track_running_stats=True)
    (activation): ReLU()
  )
  (output): Linear(in_features=130, out_features=1, bias=True)
)
Training:  10%|█         | 42/400 [00:14<02:18,  2.59it/s]
Early stopping is applied: no improvement for ['mae', 'pearsonr'] since the last 31 iterations, finish training at iteration 333

[ ]: