xenonpy.inverse package

class xenonpy.inverse.base.BaseLogLikelihood[source]

Abstract class to calculate the likelihood of candidates from their descriptor data, a pandas.DataFrame generated by BaseFeaturizer or BaseDescriptor.

Using a BaseLogLikelihood Class

BaseLogLikelihood() requires the user to define only the log_likelihood method. Working with log values is recommended because products of small probability values easily underflow to zero in floating-point arithmetic.
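The underflow problem can be illustrated without XenonPy. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and are not part of the xenonpy API.

```python
import math

# 100 independent events, each with probability 1e-5.
probs = [1e-5] * 100

# Multiplying raw probabilities: the true value (1e-500) is below the
# smallest positive double, so the running product collapses to 0.0.
product = 1.0
for p in probs:
    product *= p

# Summing log-probabilities stays well inside the floating-point range.
log_sum = sum(math.log(p) for p in probs)

print(product)   # 0.0
print(log_sum)   # roughly -1151.29
```

Once the product has collapsed to zero, candidates can no longer be ranked against each other, whereas their log-likelihoods remain comparable.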

__call__(X, **targets)[source]: Call self as a function.

fit(X, y, **kwargs)[source]

log_likelihood(X, **targets)[source]

Log likelihood

Parameters:

X (list[object]) – Input samples for likelihood calculation.
targets (tuple[float, float]) – Target region. Each target should be a tuple of lower and upper bounds, e.g. target1=(10, 20) means target1 should lie in the range [10, 20].

Returns:

log_likelihood – Estimated log-likelihood of each sample’s property values. Must be a pd.DataFrame, not a pd.Series.

Return type:

pd.DataFrame of float (columns: properties, rows: samples)
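One way to turn a target range into a log-likelihood is to assume each predicted property value is Gaussian and take the log of the probability mass inside the range. The sketch below is an illustration under that assumption, not the library's implementation. It uses only the standard library, the function names are hypothetical, and a plain dict of lists stands in for the pd.DataFrame (keys as property columns, list entries as sample rows).

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2); sigma must be > 0."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def range_log_likelihood(predictions, **targets):
    """Log-probability that each predicted property lies in its target range.

    predictions: {property_name: [(mean, std), ...]}, one tuple per sample
                 (hypothetical input layout for this sketch).
    targets:     property_name=(lower, upper), as in the parameter list above.
    """
    floor = 1e-300  # avoids log(0) when the probability underflows
    result = {}
    for name, (low, up) in targets.items():
        column = []
        for mu, sigma in predictions[name]:
            prob = normal_cdf(up, mu, sigma) - normal_cdf(low, mu, sigma)
            column.append(math.log(max(prob, floor)))
        result[name] = column
    return result

# Two samples: one predicted inside the target range, one far outside it.
preds = {"target1": [(15.0, 1.0), (30.0, 1.0)]}
llh = range_log_likelihood(preds, target1=(10, 20))
```

The first sample receives a log-likelihood close to zero and the second a large negative value, so the two can still be ranked even though the second probability is numerically zero.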

property timer

class xenonpy.inverse.base.BaseLogLikelihoodSet(*, loglikelihoods='all')[source]

Bases: BaseEstimator

Abstract class to organize log-likelihoods.

Examples

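class MyLogLikelihood(BaseLogLikelihoodSet):
    def __init__(self):
        super().__init__()

        # Assigning to the same attribute name twice is intentional here:
        # each name appears to act as a group (see "group name" under
        # log_likelihood), so loglike1 and loglike2 each hold two objects.
        self.loglike1 = SomeFeature1()
        self.loglike1 = SomeFeature2()
        self.loglike2 = SomeFeature3()
        self.loglike2 = SomeFeature4()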

Parameters:

loglikelihoods (list[str] or 'all') – Log-likelihoods that will be used. Default is 'all'.

__call__(X, **kwargs)[source]: Call self as a function.

log_likelihood(X, **kwargs)[source]

Log likelihood

Parameters:

X (list-like[object] or pd.DataFrame) – Input samples for likelihood calculation. For pd.DataFrame, if any column name matches any group name, the matched group(s) will be calculated with corresponding column(s); otherwise, the pd.DataFrame will be passed on as is.
kwargs (list[string]) – The specified BaseLogLikelihood objects.

Returns:

log_likelihood – Estimated log-likelihood of each sample’s property values.

Return type:

pd.DataFrame of float (columns: properties, rows: samples)

property all_loglikelihoods

property elapsed

property timer

class xenonpy.inverse.base.BaseProposal[source]

Bases: BaseEstimator

__call__(X)[source]: Call self as a function.

fit(X, y, **kwargs)[source]

on_errors(error)[source]

proposal(X)[source]

Propose new samples based on the input samples.

Parameters:

X (list of object) – Samples used to generate the next samples.

Returns:

samples – Generated samples from input samples.

Return type:

list of object
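As an illustration of the contract (a list of samples in, a list of new samples out), here is a minimal sketch of a proposal step for numeric candidates. It is a standalone function rather than a BaseProposal subclass, and the Gaussian perturbation, the scale and seed arguments, and the function name are all assumptions made for this example.

```python
import random

def gaussian_proposal(X, scale=1.0, seed=None):
    """Return one perturbed sample per input sample.

    Each numeric sample is shifted by zero-mean Gaussian noise with
    standard deviation `scale`. The input list is left unmodified.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, scale) for x in X]

current = [1.0, 2.0, 3.0]
proposed = gaussian_proposal(current, scale=0.5, seed=42)
```

For structured candidates such as molecules, the perturbation would be replaced by a domain-specific modification, but the input and output shapes stay the same.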

property timer

class xenonpy.inverse.base.BaseResample[source]

Bases: BaseEstimator

Abstract class to resample candidates.

Using a BaseResample Class

BaseResample() requires the user to define only the resample method. BaseSMC uses this method, but a separate BaseResample subclass can often be skipped because BaseSMC may implement resampling directly inside the class.

__call__(X, freq, size, p)[source]: Call self as a function.

fit(X, y=None, **kwargs)[source]

resample(X, freq, size, p)[source]

Re-sample from given samples.

Parameters:

X (list of object) – Input samples for likelihood calculation.
freq (list or np.array of int) – Frequency of each input sample
size (int) – Resample size.
p (np.ndarray of float) – The probabilities associated with each entry in X. If not given, the sample assumes a uniform distribution over all entries.

Returns:

new_sample – Re-sampling result.

Return type:

list of object
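A resample step with this signature can be sketched as weighted sampling with replacement. The version below uses only the standard library and is a standalone function, not a BaseResample subclass. The seed argument is an addition for reproducibility, and freq is accepted only to mirror the documented signature; this sketch does not use it.

```python
import random

def resample(X, freq, size, p=None, seed=None):
    """Draw `size` samples from X with replacement.

    p holds one probability per entry of X; when p is None the draw is
    uniform over all entries. freq is unused in this sketch.
    """
    rng = random.Random(seed)
    return rng.choices(X, weights=p, k=size)

candidates = ["A", "B", "C"]
# All probability mass on "B": every draw returns "B".
picked = resample(candidates, freq=[1, 1, 1], size=5, p=[0.0, 1.0, 0.0], seed=0)
# No probabilities given: uniform draw over the three candidates.
uniform = resample(candidates, freq=[1, 1, 1], size=4, seed=0)
```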

property timer

class xenonpy.inverse.base.BaseSMC[source]

Bases: BaseEstimator

Abstract class to iteratively generate and select high-likelihood candidates based on BaseProposal, BaseLogLikelihood, and/or BaseResample classes.

Using a BaseSMC Class

BaseSMC() provides a basic loop structure for implementing algorithms in the form of sequential Monte Carlo or a genetic algorithm. To avoid recalculating the log-likelihood of identical candidates, the loop calls a unique function to pick out the distinct candidates; this function may need to be adapted to the data type of the candidates. The default unique function of BaseSMC assumes the candidates are a list or np.array.

The set of candidates is assumed to be list-like, but other data types are allowed if they are compatible with other components (modifier, estimator, resample and unique functions).

__call__(samples, beta, *, size=None, yield_lpf=False)[source]

Run SMC

Parameters:

samples (list of object) – Initial samples. This variable can be of another data type as long as it is compatible with the other components, such as the estimator, modifier, resample and unique functions.
beta (list/1D-numpy of float or pd.DataFrame) – Annealing parameters for each step. If a pd.DataFrame, column names should follow the keys of mdl in BaseLogLikelihood or BaseLogLikelihoodSet.
size (int) – Sample size for each draw. Default is None, which means the sample size will be the same as the length of samples.
yield_lpf (bool) – Yield the estimated log-likelihood, probability and frequency of each sample. Default is False.

Yields:

samples (list of object) – New samples in each SMC iteration. This variable can also be other data type, consistent with the input samples.
llh (np.ndarray of float) – Estimated log-likelihood of each sample. Only yielded when yield_lpf=True.
p (np.ndarray of float) – Estimated probability of each sample. Only yielded when yield_lpf=True.
freq (np.ndarray of float) – The number of times each unique sample occurs in the original samples. Only yielded when yield_lpf=True.
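The loop described above can be sketched in a few lines. This is a simplified illustration using only the standard library, not the BaseSMC implementation. It assumes hashable, sortable samples, a scalar log-likelihood per sample, and a scalar beta per step; the function name and its log_likelihood, proposal and seed arguments are hypothetical.

```python
import math
import random
from collections import Counter

def smc_sketch(samples, beta, log_likelihood, proposal, size=None, seed=0):
    """Yield (samples, llh, p, freq) once per annealing step in beta."""
    rng = random.Random(seed)
    size = len(samples) if size is None else size
    for step in beta:
        # Evaluate each distinct candidate once and remember its frequency.
        counts = Counter(samples)
        unique = sorted(counts)
        freq = [counts[u] for u in unique]
        llh = [log_likelihood(u) for u in unique]
        # Log-weight: annealed log-likelihood plus log-frequency.
        w = [step * l + math.log(f) for l, f in zip(llh, freq)]
        # Normalize with the log-sum-exp trick to avoid underflow.
        w_max = max(w)
        log_norm = w_max + math.log(sum(math.exp(x - w_max) for x in w))
        p = [math.exp(x - log_norm) for x in w]
        # Resample by weight, then propose the next generation.
        resampled = rng.choices(unique, weights=p, k=size)
        samples = [proposal(s, rng) for s in resampled]
        yield samples, llh, p, freq

# Toy problem: integer candidates, log-likelihood peaked at 10.
history = list(smc_sketch(
    samples=list(range(20)),
    beta=[0.2, 0.5, 1.0, 1.0, 1.0],
    log_likelihood=lambda x: -0.5 * (x - 10) ** 2,
    proposal=lambda s, rng: s + rng.choice((-1, 0, 1)),
))
```

Small beta values early in the schedule flatten the weights so that more candidates survive resampling; beta values near 1 apply the full likelihood.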

log_likelihood(X)[source]

Log likelihood

Parameters:

X (list of object) – Input samples for likelihood calculation. Can be changed to accept other data types.

Returns:

log_likelihood – Estimated likelihood of each sample.

Return type:

np.ndarray of float

on_errors(ite, samples, error)[source]

proposal(X)[source]

Propose new samples based on the input samples.

Parameters:

X (list of object) – Samples used to generate the next samples. Can be changed to accept other data types.

Returns:

samples – Generated samples from input samples.

Return type:

list of object

resample(X, freq, size, p)[source]

Re-sample from given samples.

Parameters:

X (list[object]) – Input samples for likelihood calculation. Can be changed to accept other data types.
freq (list[int]) – Frequency of each input sample.
size (int) – Resample size.
p (numpy.ndarray[float]) – The probabilities associated with each entry in X. If not given, the sample assumes a uniform distribution over all entries.

Returns:

re-sample – Re-sampling result.

Return type:

list of object

unique(X)[source]

Parameters:

X (list of object) – Input samples. Can be changed to accept other data types.

Returns:

unique (list of object) – The sorted unique values.
unique_counts (np.ndarray of int) – The number of times each of the unique values comes up in the original array
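The two return values can be illustrated with a small standard-library sketch. It assumes hashable, sortable samples (strings here) and returns plain lists; for array-like input, np.unique with return_counts=True yields the same pair of sorted values and counts. The function name is illustrative only.

```python
from collections import Counter

def unique_with_counts(X):
    """Return the sorted distinct samples and how often each one occurs."""
    counts = Counter(X)
    values = sorted(counts)
    return values, [counts[v] for v in values]

samples = ["CCO", "CC", "CCO", "C", "CC", "CCO"]
values, unique_counts = unique_with_counts(samples)
print(values)         # ['C', 'CC', 'CCO']
print(unique_counts)  # [1, 2, 3]
```

Candidates that are not hashable or sortable, such as custom molecule objects, would need a different comparison key, which is why the unique function may have to be adapted to the candidate type.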

property timer
