The StructureSelector class

Objects of this class are used to select optimal structures for training cluster expansion models. They can also be used to evaluate the quality of training sets.

Initialization and methods

class clusterx.structure_selector.StructureSelector(cluster_pool, training_set, candidate_set=None, correlations_calculator=None)

Structure selector class

Objects of this class are used to select optimal structures to enhance the predictive performance of the current cluster expansion model.

Parameters

cluster_pool: ClustersPool() object

The optimal set of clusters for the training data.

training_set: StructuresSet() object

The current training set.

candidate_set: StructuresSet() object

Pool of new structures that may be selected to enter the training set. If left as ‘None’, the class can only assess the training set.

correlations_calculator: CorrelationsCalculator object

Used to calculate the correlations of structures and clusters. If this is left empty (or set to None), the correlations are calculated with a trigonometric basis.

calculate_population_variance(domain_calculation_method='averagedConcentration', concentration=None)

Calculate the prediction variance of the cluster expansion for the structural property, averaged over all possible structures.

The input parameter ‘domain_calculation_method’ selects the method used to calculate the domain matrix (see Mueller2010, PRB 82, 184107).

The options for this input parameter are ‘averagedConcentration’, ‘infiniteCrystalFiniteClusters’, ‘byConcentration’, and ‘vdWalleAndCeder’. If ‘byConcentration’ is chosen, the second input parameter, ‘concentration’, must also be supplied.

‘concentration’ must lie in the range [0, 1] and is the concentration of one of the two components of the binary alloy.

Parameters:

domain_calculation_method: string

Defines the method used to calculate the domain matrix.

concentration: float

Defines the concentration for which the domain matrix should be evaluated.
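The argument rules described above can be checked before calling calculate_population_variance. The following is a minimal sketch: the option names and the [0, 1] range come from the description above, while the helper check_variance_arguments and the constant DOMAIN_METHODS are hypothetical and not part of clusterx. Whether clusterx performs such checks itself is not stated here.

```python
# Option names and the [0, 1] range are taken from the description above.
# The helper itself is hypothetical and is not part of clusterx.
DOMAIN_METHODS = (
    "averagedConcentration",
    "infiniteCrystalFiniteClusters",
    "byConcentration",
    "vdWalleAndCeder",
)


def check_variance_arguments(domain_calculation_method="averagedConcentration",
                             concentration=None):
    """Return the arguments unchanged, or raise ValueError if they conflict."""
    if domain_calculation_method not in DOMAIN_METHODS:
        raise ValueError(f"unknown method: {domain_calculation_method!r}")
    if domain_calculation_method == "byConcentration":
        # 'byConcentration' is the only option that needs a concentration.
        if concentration is None:
            raise ValueError("'byConcentration' requires a concentration")
        if not 0.0 <= concentration <= 1.0:
            raise ValueError("concentration must lie in [0, 1]")
    return domain_calculation_method, concentration
```

The returned pair can then be passed on as the ‘domain_calculation_method’ and ‘concentration’ arguments.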

select_structure(method=None, concentration=None)

Select a structure to enter the training set

This method needs a candidate set. If no candidate set was passed to the class object, set it with set_candidate_set().

The routine takes each candidate structure in turn and adds it temporarily to the training set. The population variance is then computed for the enlarged training set; for this, supply a ‘method’ parameter and, if the method requires it, a ‘concentration’ (see calculate_population_variance). Afterwards the structure is removed from the training set and the next candidate is tried. The routine selects the structure that minimizes the population variance.

Parameters:

method: string

Defines the method used to calculate the domain matrix. Options are ‘global_averagedConcentration’, ‘global_infiniteCrystalFiniteClusters’, ‘global_byConcentration’, and ‘global_vdWalleAndCeder’.

concentration: float

Defines the concentration for which the domain matrix should be evaluated.
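The selection loop described for select_structure can be illustrated with plain NumPy arrays. This is a sketch under stated assumptions, not the clusterx implementation: each structure is represented by a row of cluster correlations, and the figure of merit is taken to be trace((XᵀX)⁻¹ D) for a correlation matrix X and a domain matrix D, which is an assumed stand-in for the population variance. The function names population_variance and select_candidate are hypothetical.

```python
import numpy as np


def population_variance(correlations, domain_matrix):
    """Stand-in figure of merit: trace((X^T X)^-1 D).

    This formula is an assumption made for illustration; it is not
    necessarily the expression clusterx evaluates.
    """
    gram = correlations.T @ correlations
    try:
        inverse = np.linalg.inv(gram)
    except np.linalg.LinAlgError:
        # A singular Gram matrix is treated as infinitely uncertain.
        return float("inf")
    return float(np.trace(inverse @ domain_matrix))


def select_candidate(training, candidates, domain_matrix):
    """Return (index, variance) of the candidate row minimizing the variance."""
    best_index, best_variance = None, float("inf")
    for index, row in enumerate(candidates):
        # Build an enlarged copy; the original training array is untouched,
        # which mirrors adding and then removing the candidate.
        enlarged = np.vstack([training, row])
        variance = population_variance(enlarged, domain_matrix)
        if variance < best_variance:
            best_index, best_variance = index, variance
    return best_index, best_variance


# Toy data: two nearly identical training rows and two candidates.
training = np.array([[1.0, 1.0], [1.0, 0.9]])
candidates = np.array([[1.0, 1.0], [1.0, -1.0]])
domain = np.eye(2)
chosen, variance = select_candidate(training, candidates, domain)
```

In this toy case the second candidate is chosen, because it adds information the nearly collinear training rows lack, while the first candidate repeats an existing row.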

set_candidate_set(new_candidate_set)

Set the candidate set

Replaces the set of candidate structures that can be selected for the training set. The input parameter is a StructuresSet object.

Parameters:

new_candidate_set: StructuresSet() object

The new set of candidate structures.