dgbpy.mlio

Attributes

nladbdirid

mlinpgrp

mltrlgrp

dgbtrl

dblistall

Functions

getInfo(filenm[, quick])

Gets information from an example file

datasetCount(dsets)

Gets count of dataset

getDatasetNms(dsets[, validation_split, valid_inputs])

Gets train and validation indices of dataset

getCrossValidationIndices(dsets[, seed, valid_inputs, ...])

Gets train and validation data for cross validation.

getChunks(dsets, nbchunks)

Splits dataset object into smaller chunks

hasScaler(infos[, inputsel])

Checks whether an example file has a scaler, based on its info

getDatasetsByGroup(dslist, groupnm)

getSomeDatasets(dslist[, decim])

getTrainingData(filenm[, decim])

Gets training data from file name

getTrainingDataByInfo(info[, dsetsel])

Gets training data from file info

getClasses(info, y_vectors)

normalize_class_vector(arr, classes)

unnormalize_class_vector(arr, classes)

saveModel(model, inpfnm, platform, infos, outfnm, ...)

Saves trained model for any platform workflow

getModel(modelfnm[, fortrain, pars])

Get model and model information

getApplyInfoFromFile(modelfnm[, outsubsel])

Gets model apply info from file name

getApplyInfo(infos[, outsubsel])

Gets model apply info from example file info

modelNameIsFree(modnm, type, args[, reload])

modelNameExists(modnm, type, args[, reload])

dbInfoForModel(modnm, args[, reload])

getModelType(infos)

Gets model type

getSaveLoc(outnm, ftype, args)

announceShowTensorboard()

announceTrainingFailure()

announceTrainingSuccess()

Module Contents

dgbpy.mlio.nladbdirid = '100060'
dgbpy.mlio.mlinpgrp = 'Deep Learning Example Data'
dgbpy.mlio.mltrlgrp = 'Deep Learning Model'
dgbpy.mlio.dgbtrl = 'dGB'
dgbpy.mlio.getInfo(filenm, quick=False)

Gets information from an example file

Parameters:
  • filenm (str): file name/path in hdf5 format

  • quick (bool): when set to True, info is retrieved quickly, leaving out some information (e.g. datasets); defaults to False and loads all information

Returns:
  • dict: information from data file or model file
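
A minimal usage sketch; the file name below is hypothetical:

    import dgbpy.mlio as mlio

    # Quick scan first (skips e.g. the datasets), then a full load.
    info = mlio.getInfo('examples.h5', quick=True)
    info = mlio.getInfo('examples.h5')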

dgbpy.mlio.datasetCount(dsets)

Gets count of dataset

Parameters:
  • dsets (dict): dataset

Returns:
  • dict: counts for target attribute(s) or well(s) for project
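
A sketch of retrieving the counts; it assumes the dataset dictionary is exposed under a 'datasets' key of the info dict, which may differ in practice:

    import dgbpy.mlio as mlio

    info = mlio.getInfo('examples.h5')
    counts = mlio.datasetCount(info['datasets'])  # 'datasets' key is an assumption
    print(counts)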

dgbpy.mlio.getDatasetNms(dsets, validation_split=None, valid_inputs=None)

Gets train and validation indices of dataset

Parameters:
  • dsets (dict): dataset

  • validation_split (float): size of validation data (between 0-1)

  • valid_inputs (iter):

Returns:
  • dict: train and validation indices
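
A sketch that holds out 20% of the examples for validation; the 'datasets' key is again an assumption:

    import dgbpy.mlio as mlio

    info = mlio.getInfo('examples.h5')
    split = mlio.getDatasetNms(info['datasets'], validation_split=0.2)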

dgbpy.mlio.getCrossValidationIndices(dsets, seed=None, valid_inputs=1, nbfolds=5)

Gets train and validation data for cross validation.

Parameters:
  • dsets (dict): dictionary of survey names and datasets

  • seed (int): random seed

  • valid_inputs (int): number of wells to use as the validation set

  • nbfolds (int): number of cross-validation folds

Returns:
  • list: list of dictionaries containing train and validation data for each fold
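
A sketch of a 5-fold setup with a fixed seed; the 'datasets' key is an assumption:

    import dgbpy.mlio as mlio

    info = mlio.getInfo('examples.h5')
    folds = mlio.getCrossValidationIndices(info['datasets'], seed=42,
                                           valid_inputs=1, nbfolds=5)
    for fold in folds:  # one dictionary of train/validation data per fold
        ...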

dgbpy.mlio.getChunks(dsets, nbchunks)

Splits dataset object into smaller chunks

Parameters:
  • dsets (dict): dataset

  • nbchunks (int): number of data chunks to be created

Returns:
  • dict: chunks from dataset stored as dictionaries
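
A sketch of chunked processing, useful when the full dataset does not fit in memory; iterating over the returned chunks is assumed to be supported:

    import dgbpy.mlio as mlio

    info = mlio.getInfo('examples.h5')
    chunks = mlio.getChunks(info['datasets'], nbchunks=4)
    for chunk in chunks:
        ...  # train incrementally on each chunk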

dgbpy.mlio.hasScaler(infos, inputsel=None)

Checks whether an example file has a scaler, based on its info

Parameters:
  • infos (dict): information about example file

  • inputsel (bool or NoneType):

Returns:
  • bool: True if the dataset info has a scaler key, False otherwise
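
A minimal check sketch; the file name is hypothetical:

    import dgbpy.mlio as mlio

    infos = mlio.getInfo('examples.h5')
    if mlio.hasScaler(infos):
        ...  # the example file stores a scaler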

dgbpy.mlio.getDatasetsByGroup(dslist, groupnm)
dgbpy.mlio.getSomeDatasets(dslist, decim=None)
dgbpy.mlio.getTrainingData(filenm, decim=False)

Gets training data from file name

Parameters:
  • filenm (str): path to file in hdf5 format

  • decim (bool):

Returns:
  • dict: train, validation datasets as arrays, and info on example file
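
A minimal loading sketch; the file name is hypothetical:

    import dgbpy.mlio as mlio

    training = mlio.getTrainingData('examples.h5')
    # 'training' holds the train/validation arrays plus the example file info.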

dgbpy.mlio.getTrainingDataByInfo(info, dsetsel=None)

Gets training data from file info

Parameters:
  • info (dict): information about example file

  • dsetsel ():

Returns:
  • dict: train, validation datasets as arrays, and info on example file
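
A sketch pairing this with getInfo, which avoids re-reading the file when the info dict is already at hand:

    import dgbpy.mlio as mlio

    info = mlio.getInfo('examples.h5')
    training = mlio.getTrainingDataByInfo(info)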

dgbpy.mlio.getClasses(info, y_vectors)
dgbpy.mlio.normalize_class_vector(arr, classes)
dgbpy.mlio.unnormalize_class_vector(arr, classes)
dgbpy.mlio.saveModel(model, inpfnm, platform, infos, outfnm, params, **kwargs)

Saves trained model for any platform workflow

Parameters:
  • model (obj): trained model on any platform

  • inpfnm (str): example file name in hdf5 format

  • platform (str): machine learning platform (options: keras, scikit-learn, torch)

  • infos (dict): example file info

  • outfnm (str): name of model to be saved or S3 folder URI

  • params (dict): parameters to be used when saving the model
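
A save sketch; it assumes 'model' is a trained Keras model obtained from a training step (not shown) and that an empty params dict is acceptable, both of which are assumptions:

    import dgbpy.mlio as mlio

    infos = mlio.getInfo('examples.h5')
    # 'model' comes from a prior training run (not shown).
    mlio.saveModel(model, 'examples.h5', 'keras', infos,
                   'trained_model.h5', params={})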

dgbpy.mlio.getModel(modelfnm, fortrain=False, pars=None, **kwargs)

Get model and model information

Parameters:
  • modelfnm (str): model file path/name in hdf5 format

  • fortrain (bool): specifies if the model might be further trained

  • pars (dict): parameters to be used when restoring the model if needed

Returns:
  • tuple: (trained model and model/project info)
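
A sketch that restores a model for further training, relying on the documented tuple order; the file name is hypothetical:

    import dgbpy.mlio as mlio

    model, model_info = mlio.getModel('trained_model.h5', fortrain=True)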

dgbpy.mlio.getApplyInfoFromFile(modelfnm, outsubsel=None)

Gets model apply info from file name

Parameters:
  • modelfnm (str): model file path/name in hdf5 format

  • outsubsel ():

Returns:
  • dict: apply information
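
A minimal sketch; the model file name is hypothetical:

    import dgbpy.mlio as mlio

    applyinfo = mlio.getApplyInfoFromFile('trained_model.h5')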

dgbpy.mlio.getApplyInfo(infos, outsubsel=None)

Gets model apply info from example file info

Parameters:
  • infos (dict): example file info

  • outsubsel ():

Returns:
  • dict: apply information
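
The same result can be obtained from an info dict that is already loaded:

    import dgbpy.mlio as mlio

    infos = mlio.getInfo('examples.h5')
    applyinfo = mlio.getApplyInfo(infos)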

dgbpy.mlio.dblistall = None
dgbpy.mlio.modelNameIsFree(modnm, type, args, reload=True)
dgbpy.mlio.modelNameExists(modnm, type, args, reload=True)
dgbpy.mlio.dbInfoForModel(modnm, args, reload=True)
dgbpy.mlio.getModelType(infos)

Gets model type

Parameters:
  • infos (dict): example file info

Returns:
  • str: type of model/workflow
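
A sketch that reads the type from a model file's info; getInfo is documented to work on model files as well, and the file name is hypothetical:

    import dgbpy.mlio as mlio

    infos = mlio.getInfo('trained_model.h5')
    mltype = mlio.getModelType(infos)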

dgbpy.mlio.getSaveLoc(outnm, ftype, args)
dgbpy.mlio.announceShowTensorboard()
dgbpy.mlio.announceTrainingFailure()
dgbpy.mlio.announceTrainingSuccess()