dgbpy.dgbscikit

Attributes

tot_cpu

n_cpu

platform

mse_criterion

regmltypes

classmltypes

lineartypes

logistictypes

clustertypes

ensembletypes

nntypes

svmtypes

clustermethods

solvertypes

linkernel

kerneltypes

savetypes

defsavetype

xgboostjson

defstoragetype

scikit_dict

settings_mltrain_path

settings_mltrain

Classes

RangedScaler

Custom scaler for normalizing data.

Functions

hasScikit()

isVersionAtLeast(version)

isClustering(model)

hasXGBoost()

getMLPlatform()

getUIMLPlatform()

getUiModelTypes(isclassification, ismultiregression, ...)

getUiLinearTypes()

getUiLogTypes()

getUiClusterTypes()

getUiClusterMethods()

getUiEnsembleTypes(ismultiregression)

getUiNNTypes()

getUiSVMTypes()

getUiSolverTypes()

getUiNNKernelTypes()

getDefaultSolver([uiname])

getDefaultNNKernel(isclass[, uiname])

getClusterDistances(model, samples)

Calculate the minimum normalized Euclidean distance from each sample to the nearest cluster center.

getClusterParsKMeans(methodname, nclust, ninit, maxiter)

getClusterParsMeanShift(methodname, maxiter)

getClusterParsSpectral(methodname, nclust, ninit)

getLinearPars([modelname])

getLogPars([modelname, solver])

getEnsembleParsXGDT([modelname, maxdep, est, lr])

getEnsembleParsXGRF([modelname, maxdep, est, lr])

getEnsembleParsRF([modelname, maxdep, est])

getEnsembleParsGB([modelname, maxdep, est, lr])

getEnsembleParsAda([modelname, est, lr])

getNNPars([modelname, maxitr, lr, lay1, lay2, lay3, ...])

getSVMPars([modelname, kernel, degree])

getNewScaler(mean, scale)

Gets a new scaler object for standardization

getNewMinMaxScaler(data[, minout, maxout])

Gets a new scaler object for normalization

getNewRangeScaler(data[, std])

Gets a new scaler object for range normalization

getScaler(x_train, byattrib)

Extract scaler for standardization of features.

transform(samples, mean, stddev)

transformBack(samples, mean, stddev)

scale(samples, scaler)

Applies a scaler transformation to an array of features.

unscale(samples, scaler)

Applies an inverse scaler transformation to an array of features.

getDefaultModel(setup[, params])

train(model, trainingdp)

assessQuality(model, trainingdp)

onnx_from_sklearn(model)

save(model, outfnm[, save_type])

load(modelfnm)

apply(model, samples, scaler, isclassification, ...)

Module Contents

dgbpy.dgbscikit.tot_cpu
dgbpy.dgbscikit.n_cpu
dgbpy.dgbscikit.hasScikit()
dgbpy.dgbscikit.isVersionAtLeast(version)
dgbpy.dgbscikit.isClustering(model)
dgbpy.dgbscikit.hasXGBoost()
dgbpy.dgbscikit.platform
dgbpy.dgbscikit.mse_criterion = 'mse'
dgbpy.dgbscikit.regmltypes = (('linear', 'Linear'), ('ensemble', 'Ensemble'), ('neuralnet', 'Neural Network'), ('svm', 'SVM'))
dgbpy.dgbscikit.classmltypes = (('logistic', 'Logistic'), ('ensemble', 'Ensemble'), ('neuralnet', 'Neural Network'), ('svm', 'SVM'))
dgbpy.dgbscikit.lineartypes = [('oslq', 'Ordinary Least Squares')]
dgbpy.dgbscikit.logistictypes = [('log', 'Logistic Regression Classifier')]
dgbpy.dgbscikit.clustertypes = [('cluster', 'Clustering')]
dgbpy.dgbscikit.ensembletypes = []
dgbpy.dgbscikit.nntypes = [('mlp', 'Multi-Layer Perceptron')]
dgbpy.dgbscikit.svmtypes = [('svm', 'Support Vector Machine')]
dgbpy.dgbscikit.clustermethods = [('kmeans', 'K-Means'), ('meanshift', 'Mean Shift'), ('spec', 'Spectral Clustering')]
dgbpy.dgbscikit.solvertypes = [('newton-cg', 'Newton-CG'), ('lbfgs', 'Lbfgs'), ('liblinear', 'Liblinear'), ('sag', 'Sag'),...
dgbpy.dgbscikit.linkernel = 'linear'
dgbpy.dgbscikit.kerneltypes
dgbpy.dgbscikit.savetypes = ('onnx', 'joblib', 'pickle')
dgbpy.dgbscikit.defsavetype = 'onnx'
dgbpy.dgbscikit.xgboostjson = 'xgboostjson'
dgbpy.dgbscikit.defstoragetype
dgbpy.dgbscikit.scikit_dict
dgbpy.dgbscikit.settings_mltrain_path
dgbpy.dgbscikit.settings_mltrain
dgbpy.dgbscikit.getMLPlatform()
dgbpy.dgbscikit.getUIMLPlatform()
dgbpy.dgbscikit.getUiModelTypes(isclassification, ismultiregression, issegmentation)
dgbpy.dgbscikit.getUiLinearTypes()
dgbpy.dgbscikit.getUiLogTypes()
dgbpy.dgbscikit.getUiClusterTypes()
dgbpy.dgbscikit.getUiClusterMethods()
dgbpy.dgbscikit.getUiEnsembleTypes(ismultiregression)
dgbpy.dgbscikit.getUiNNTypes()
dgbpy.dgbscikit.getUiSVMTypes()
dgbpy.dgbscikit.getUiSolverTypes()
dgbpy.dgbscikit.getUiNNKernelTypes()
dgbpy.dgbscikit.getDefaultSolver(uiname=True)
dgbpy.dgbscikit.getDefaultNNKernel(isclass, uiname=True)
dgbpy.dgbscikit.getClusterDistances(model, samples)

Calculate the minimum normalized Euclidean distance from each sample to the nearest cluster center.

This function computes the distance from each sample to the nearest cluster center for a given clustering model. It supports the dgbpy.dgbscikit unsupervised models, including KMeans, MeanShift, and SpectralClustering. The distances are normalized to the range [0, 1] using MinMaxScaler.

Parameters

model : object

The clustering model.

samples : array-like of shape (n_samples, n_features)

The input data samples to be clustered. Each row corresponds to a sample, and each column corresponds to a feature.

Returns

min_distances_normalized : ndarray of shape (n_samples,)

The minimum normalized Euclidean distance from each sample to the nearest cluster center. The distances are scaled to a range between 0 and 1.
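The computation described above can be sketched in plain NumPy. `min_center_distances` is a hypothetical stand-in for illustration, not the library function itself: it finds each sample's Euclidean distance to its nearest center, then min-max normalizes the result to [0, 1].

```python
import numpy as np

def min_center_distances(centers, samples):
    # Pairwise Euclidean distances, shape (n_samples, n_centers)
    diffs = samples[:, None, :] - centers[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Distance to the nearest center per sample
    mind = dists.min(axis=1)
    # Min-max normalize to [0, 1], guarding against a constant column
    rng = mind.max() - mind.min()
    return (mind - mind.min()) / rng if rng > 0 else np.zeros_like(mind)

centers = np.array([[0.0, 0.0], [10.0, 10.0]])
samples = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
print(min_center_distances(centers, samples))  # [0. 1. 0.]
```

Samples sitting exactly on a center map to 0; the sample farthest from any center maps to 1.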

dgbpy.dgbscikit.getClusterParsKMeans(methodname, nclust, ninit, maxiter)
dgbpy.dgbscikit.getClusterParsMeanShift(methodname, maxiter)
dgbpy.dgbscikit.getClusterParsSpectral(methodname, nclust, ninit)
dgbpy.dgbscikit.getLinearPars(modelname='Ordinary Least Squares')
dgbpy.dgbscikit.getLogPars(modelname='Logistic Regression Classifier', solver=None)
dgbpy.dgbscikit.getEnsembleParsXGDT(modelname='XGBoost: (Decision Tree)', maxdep=scikit_dict['ensemblepars']['xgdt']['maxdep'], est=scikit_dict['ensemblepars']['xgdt']['est'], lr=scikit_dict['ensemblepars']['xgdt']['lr'])
dgbpy.dgbscikit.getEnsembleParsXGRF(modelname='XGBoost: (Random Forests)', maxdep=scikit_dict['ensemblepars']['xgrf']['maxdep'], est=scikit_dict['ensemblepars']['xgrf']['est'], lr=scikit_dict['ensemblepars']['xgrf']['lr'])
dgbpy.dgbscikit.getEnsembleParsRF(modelname='Random Forests', maxdep=scikit_dict['ensemblepars']['rf']['maxdep'], est=scikit_dict['ensemblepars']['rf']['est'])
dgbpy.dgbscikit.getEnsembleParsGB(modelname='Gradient Boosting', maxdep=scikit_dict['ensemblepars']['gb']['maxdep'], est=scikit_dict['ensemblepars']['gb']['est'], lr=scikit_dict['ensemblepars']['gb']['lr'])
dgbpy.dgbscikit.getEnsembleParsAda(modelname='Adaboost', est=scikit_dict['ensemblepars']['ada']['est'], lr=scikit_dict['ensemblepars']['ada']['lr'])
dgbpy.dgbscikit.getNNPars(modelname='Multi-Layer Perceptron', maxitr=scikit_dict['nnpars']['maxitr'], lr=scikit_dict['nnpars']['lr'], lay1=scikit_dict['nnpars']['lay1'], lay2=scikit_dict['nnpars']['lay2'], lay3=scikit_dict['nnpars']['lay3'], lay4=scikit_dict['nnpars']['lay4'], lay5=scikit_dict['nnpars']['lay5'], nb=scikit_dict['nnpars']['nb'])
dgbpy.dgbscikit.getSVMPars(modelname='Support Vector Machine', kernel=scikit_dict['svmpars']['kernel'], degree=scikit_dict['svmpars']['degree'])
dgbpy.dgbscikit.getNewScaler(mean, scale)

Gets a new scaler object for standardization.

Parameters:
  • mean (ndarray of shape (n_features,) or None): mean value to be used for scaling

  • scale (ndarray of shape (n_features,) or None): per-feature relative scaling of the data to achieve zero mean and unit variance (from sklearn docs)

Returns:
  • object: scaler (an instance of sklearn.preprocessing.StandardScaler())
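A minimal sketch of what the parameters above imply: constructing a StandardScaler directly from precomputed statistics rather than fitting it on data. `make_scaler` is a hypothetical stand-in for getNewScaler; setting `mean_` and `scale_` by hand is an assumption about the approach, based on sklearn's documented fitted attributes.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def make_scaler(mean, scale):
    # Assign the fitted attributes directly instead of calling fit()
    scaler = StandardScaler()
    scaler.mean_ = np.asarray(mean)
    scaler.scale_ = np.asarray(scale)
    return scaler

scaler = make_scaler([2.0, 0.0], [2.0, 1.0])
# transform applies (x - mean_) / scale_ per feature
out = scaler.transform(np.array([[4.0, 1.0]]))
print(out)  # [[1. 1.]]
```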

dgbpy.dgbscikit.getNewMinMaxScaler(data, minout=0, maxout=1)

Gets a new scaler object for normalization.

Parameters:
  • data (ndarray): data used to fit the MinMaxScaler object

  • minout (int): desired minimum value of the transformed data

  • maxout (int): desired maximum value of the transformed data

Returns:
  • object: scaler (an instance of sklearn.preprocessing.MinMaxScaler())
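The description above maps directly onto sklearn's MinMaxScaler, whose `feature_range` argument takes the desired output bounds. A hedged sketch, with `make_minmax_scaler` as a hypothetical stand-in for getNewMinMaxScaler:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def make_minmax_scaler(data, minout=0, maxout=1):
    # feature_range controls the output interval of transform()
    scaler = MinMaxScaler(feature_range=(minout, maxout))
    scaler.fit(data)
    return scaler

data = np.array([[0.0], [5.0], [10.0]])
scaler = make_minmax_scaler(data, minout=-1, maxout=1)
# The fitted min/max of the data map to the requested range
print(scaler.transform(data).ravel())  # [-1.  0.  1.]
```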

dgbpy.dgbscikit.getNewRangeScaler(data, std=4)

Gets a new scaler object for range normalization.

dgbpy.dgbscikit.getScaler(x_train, byattrib)

Extract scaler for standardization of features. The scaler is such that, when applied to the samples, they get a mean of 0 and a standard deviation of 1, either globally or per channel.

Parameters:
  • x_train (ndarray): data used to fit the StandardScaler object

  • byattrib (bool): sets a per-channel scaler if True

Returns:
  • object: scaler (an instance of sklearn.preprocessing.StandardScaler())
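The global vs. per-channel distinction can be shown with a small NumPy illustration (stand-in arithmetic, not dgbpy's API): a per-channel scaler uses one mean/std pair per column, a global scaler one pair for the whole array.

```python
import numpy as np

# Two samples, two channels with very different magnitudes
x = np.array([[1.0, 10.0], [3.0, 30.0]])

# Global statistics: one mean/std for everything
global_mean, global_std = x.mean(), x.std()

# Per-channel statistics: one mean/std per column (byattrib=True case)
chan_mean, chan_std = x.mean(axis=0), x.std(axis=0)

x_per = (x - chan_mean) / chan_std
# After per-channel scaling, every channel has mean 0 and std 1
print(x_per.mean(axis=0), x_per.std(axis=0))
```

With global scaling the large-magnitude channel would dominate the statistics, which is why a per-channel scaler is usually preferred when channels have different units.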

dgbpy.dgbscikit.transform(samples, mean, stddev)
dgbpy.dgbscikit.transformBack(samples, mean, stddev)
dgbpy.dgbscikit.scale(samples, scaler)

Applies a scaler transformation to an array of features. If the scaler is a StandardScaler, the returned samples have the mean and standard deviation set in the scaler. If the scaler is a MinMaxScaler, the returned samples have the min/max values of the range set in that scaler. Scaling is applied to the input array directly.

Parameters:
  • samples (ndarray): input/output values to be scaled

  • scaler: an sklearn.preprocessing scaler object (see sklearn docs)

dgbpy.dgbscikit.unscale(samples, scaler)

Applies an inverse scaler transformation to an array of features. Scaling is applied to the input array directly.

Parameters:
  • samples (ndarray): input/output values to be unscaled

  • scaler: an sklearn.preprocessing scaler object (see sklearn docs)
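The scale/unscale pair described above corresponds to sklearn's transform/inverse_transform round trip, sketched below. One difference to note: per the text, dgbpy applies scaling to the input array directly, whereas sklearn's methods return a new array.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

samples = np.array([[1.0], [2.0], [3.0]])
scaler = StandardScaler().fit(samples)

scaled = scaler.transform(samples)           # mean 0, std 1
restored = scaler.inverse_transform(scaled)  # back to the originals

print(np.allclose(restored, samples))  # True
```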

class dgbpy.dgbscikit.RangedScaler(std=4)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Custom scaler for normalizing data.

std = 4
fit(X, y=None)

Compute the mean and standard deviation to be used for later scaling.

Parameters

X : array-like of shape (n_samples, n_features)

y : Ignored

Not used, present for API consistency.

Returns

self : object

Fitted scaler.

transform(X, y=None)

Perform standardization by centering and scaling, then clip the data based on the specified range.

Parameters

X : array-like of shape (n_samples, n_features)

The data to transform based on the computed mean and standard deviation.

y : Ignored

Not used, present here for API consistency by convention.

Returns

X_scaled : array-like of shape (n_samples, n_features)

The transformed data.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Parameters

X : array-like of shape (n_samples, n_features)

The data to fit, then transform.

y : Ignored

Not used, present here for API consistency by convention.

**fit_params : dict

Additional fit parameters.

Returns

X_scaled : array-like of shape (n_samples, n_features)

The transformed data.
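RangedScaler's transform ("standardization by centering and scaling, then clip the data based on the specified range") can be sketched as below. The clipping rule, to the interval [-std, +std] in standardized units, is an assumption inferred from the `std` parameter, not confirmed by the source; `RangedScalerSketch` is a hypothetical illustration, not the library class.

```python
import numpy as np

class RangedScalerSketch:
    def __init__(self, std=4):
        self.std = std

    def fit(self, X, y=None):
        # Compute per-feature mean and standard deviation
        self.mean_ = X.mean(axis=0)
        self.scale_ = X.std(axis=0)
        return self

    def transform(self, X, y=None):
        # Standardize, then clip to +/- std (assumed clipping rule)
        z = (X - self.mean_) / self.scale_
        return np.clip(z, -self.std, self.std)

X = np.array([[0.0], [1.0], [100.0]])
sc = RangedScalerSketch(std=1).fit(X)
out = sc.transform(X)
print(out.ravel().max())  # the outlier is clipped to 1.0
```

Clipping after standardization bounds the influence of extreme outliers, which a plain StandardScaler would pass through unchanged.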

dgbpy.dgbscikit.getDefaultModel(setup, params=scikit_dict)
dgbpy.dgbscikit.train(model, trainingdp)
dgbpy.dgbscikit.assessQuality(model, trainingdp)
dgbpy.dgbscikit.onnx_from_sklearn(model)
dgbpy.dgbscikit.save(model, outfnm, save_type=defsavetype)
dgbpy.dgbscikit.load(modelfnm)
dgbpy.dgbscikit.apply(model, samples, scaler, isclassification, withpred, withprobs, withconfidence, doprobabilities)