learn_ml.generators package

Submodules

learn_ml.generators.generator_utils module

Utility functions for code generators.

These functions support the object formats created specifically for the PythonGenerator and JsonGenerator class hierarchies.

learn_ml.generators.generator_utils._is_valid_arg_dict(arg_dict)[source]

Check if arguments are formatted correctly in the argument dictionary.

An arg dict defines all arguments to a function as param:value pairs. They must follow the rules of standard Python arguments: args with default parameters must come after args without. All keys in the dict must be of type string.

Values may take the forms::

    string: Represents any object. Will be generated as code, not as a string literal.
    int/float: Represents a number value.
    list of strings, ints, or floats: Used when an arg requires multiple values.
    dict: Represents a function being passed as a value.
    None: Signifies there is no default parameter.

This function raises exceptions if any of these rules are not followed.

Parameters

arg_dict (dict) – Method arguments in the form {“param” : “value”}. Refer to above for how to format an argument dictionary.

Raises
  • TypeError – Arguments must be formatted as a dictionary.

  • ValueError – All None values must come before all string values.

  • TypeError – All keys must be strings and all values must be strings or None.
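The rules above can be sketched as a small validator. The following is a hypothetical re-implementation of the documented checks, not the library's actual code; the real _is_valid_arg_dict may differ in error messages and edge cases.

```python
def is_valid_arg_dict(arg_dict):
    """Sketch of the documented arg-dict rules (illustrative only)."""
    if not isinstance(arg_dict, dict):
        raise TypeError("Arguments must be formatted as a dictionary.")
    allowed = (str, int, float, list, dict)
    seen_default = False  # becomes True once a non-None value appears
    for key, value in arg_dict.items():
        if not isinstance(key, str):
            raise TypeError("All keys must be strings.")
        if value is None:
            # Args without defaults (None) must come before defaulted args.
            if seen_default:
                raise ValueError("All None values must come before all string values.")
        elif isinstance(value, allowed):
            seen_default = True
        else:
            raise TypeError("Unsupported value type: %r" % type(value))
    return True
```

For example, `{"input_shape": None, "activation": "relu"}` passes, while reversing the order raises ValueError.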

learn_ml.generators.generator_utils._is_valid_fn_dict(fn_dict)[source]

Check if function is formatted correctly in a dictionary.

The function dictionary must be correctly formatted so it can be parsed. It must have two keys: “name” and “args”. The value of “name” must be a string, and args must be formatted as an arg_dict (see _is_valid_arg_dict for that format). Any input deviating from this will raise an exception.

Parameters

fn_dict (dict) – Represents a function name and its arguments.

Raises
  • TypeError – Input must be of type dict.

  • ValueError – Must have a “name” key in the dict.

  • ValueError – Must have an “args” key in the dict.

  • ValueError – The only keys must be “name” and “args”.

  • TypeError – Function name must be a string.
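The checks above can be summarized in a short sketch. This is a hypothetical illustration of the documented rules, not the library's implementation.

```python
def is_valid_fn_dict(fn_dict):
    """Sketch of the documented fn-dict rules (illustrative only)."""
    if not isinstance(fn_dict, dict):
        raise TypeError("Input must be of type dict.")
    if "name" not in fn_dict:
        raise ValueError('Must have a "name" key in the dict.')
    if "args" not in fn_dict:
        raise ValueError('Must have an "args" key in the dict.')
    if set(fn_dict) != {"name", "args"}:
        raise ValueError('The only keys must be "name" and "args".')
    if not isinstance(fn_dict["name"], str):
        raise TypeError("Function name must be a string.")
    return True
```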

learn_ml.generators.generator_utils.create_fn_dict(name, args=None)[source]

Creates a dictionary representation of a function.

This function exists so the user doesn’t have to worry about the internal representation of a function in the JSON file.

Parameters
  • name (str) – Name of the function.

  • args (dict) – Function arguments as {“parameter” : “value”} elements. Refer to _is_valid_arg_dict() docstring for formatting.

Returns

The formatted function.

Return type

dict

Example

>>> function_name = "flatten"
>>> args = {
>>>    "input_shape" : [28, 28, 1],
>>>    "parameter" : "value"
>>> }
>>> print(create_fn_dict(function_name, args))
{
    "name" : "flatten",
    "args" : {
        "input_shape" : [28, 28, 1],
        "parameter" : "value"
    }
}

learn_ml.generators.json_generators module

Contains classes for generating pipeline and model configuration files.

These classes are used by the front-end application to define a dataset pipeline and machine learning model in the form of a JSON file. Options are stored in roughly the same format in which the user selects them. After these JSONs are generated, the actual DatasetPipeline and Model classes can be generated from them.

Classes::

    JsonGenerator: Base class for writing to a JSON file.
    PipelineJsonGenerator: Writes JSON entries specific to the Pipeline class.
    ModelJsonGenerator: Writes JSON entries specific to the Model class.

class learn_ml.generators.json_generators.JsonGenerator(out_file)[source]

Bases: object

Base class that adds content to a JSON file.

out_file

The file to write to.

Type

str

out

The file object that performs write operations.

root

Stores all the data before writing.

Type

dict

index

Keeps track of the current indentation in the root dict.

Type

dict

_close()[source]

Close the file object.

_indent(key)[source]

Set the entry point for writing to the JSON.

After indenting, new entries are added at this indented level. Indents can only be set to dictionaries; a warning is raised when attempting to indent to anything else.

Parameters

key (str) – Dictionary key in the JSON. If the key does not already exist in the JSON, then it is added with an empty dictionary value.

_unindent(key=None)[source]

Sets the entry point back to the root of the JSON file.
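The indent/unindent mechanics above can be sketched with a plain dict. This is a hypothetical illustration, assuming the index attribute is simply a reference to a nested dict inside root; the real JsonGenerator may track its state differently.

```python
import warnings

class JsonLevels:
    """Illustrative stand-in for JsonGenerator's root/index bookkeeping."""

    def __init__(self):
        self.root = {}
        self.index = self.root  # current entry point for new entries

    def _indent(self, key):
        if key not in self.index:
            self.index[key] = {}  # new keys get an empty dict value
        if not isinstance(self.index[key], dict):
            warnings.warn("Can only indent into a dictionary value.")
            return
        self.index = self.index[key]

    def _unindent(self, key=None):
        self.index = self.root  # reset entry point to the root
```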

add_entry(key, value)[source]

Adds a key-value entry to the file.

The entry is added at the current indent level. If the entry is being added to a key that already exists in the JSON, then it will overwrite the current value. If the existing key has a list as its value, then the entry is appended to the end of the list.

Parameters
  • key (str) – Dictionary key to add to the file.

  • value – Dictionary value to add to the file. Can be of any type allowed in Python dictionaries.
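The overwrite-versus-append behavior described above can be sketched as follows. This is an illustrative stand-in using a plain dict for the current indent level, not the library's implementation.

```python
def add_entry(level, key, value):
    """Sketch of add_entry semantics: overwrite, unless the value is a list."""
    existing = level.get(key)
    if isinstance(existing, list):
        existing.append(value)  # append to an existing list value
    else:
        level[key] = value      # add a new key or overwrite the old value
```

For example, repeated writes to a scalar key overwrite it, while writes to a list key accumulate.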

add_fn(key, fn_name, args=None)[source]

Adds a function to the file.

A function is defined as a dictionary with “name” and “args” keys. The name is a string and the args is a dictionary.

Parameters
  • key (str) – Key value in the JSON file.

  • fn_name (str) – Function name.

  • args (dict) – Function arguments as a dictionary.

Example usage:
>>> json_gen = JsonGenerator("out.json")
>>> args = {
>>>     "parameter_1" : "value_1",
>>>     "parameter_2" : [28, 28, 1]
>>> }
>>> json_gen.add_fn("key", "function", args)
>>> print(json_gen.root)
{
    "key" : {
        "name" : "function",
        "args" : {
            "parameter_1" : "value_1",
            "parameter_2" : [28, 28, 1]
        }
    }
}
write()[source]

Writes the root dictionary to the JSON file and closes the file.

class learn_ml.generators.json_generators.ModelJsonGenerator(out_file)[source]

Bases: learn_ml.generators.json_generators.JsonGenerator

Generates a configuration file representing a machine learning model.

add_compile(args=None)[source]

Adds compiler options to the file.

Refer to model_options.json for all compiler arg options.

Parameters

args (dict) – Arguments passed into the compile function.

add_layer(layer_name, args=None)[source]

Adds a neural net layer to the file.

Parameters
  • layer_name (str) – Layer name. Refer to the model_options.json for possible layer names.

  • args (dict) – Args {param : value} to pass into the layer function.

Example usage:
>>> model = ModelJsonGenerator("out.json")
>>> args={
>>>     "units" : 10,
>>>     "activation" : "softmax"
>>> }
>>> add_layer("dense", args)
>>> print(model.root["layers"])
[
    {
        "name" : "dense",
        "args" : {
            "units" : 10,
            "activation" : "softmax"
        }
    }
]
add_model(model_name)[source]

Adds a model type to the file.

Parameters

model_name (str) – Model name. Refer to model_options.json for possible model names.

get_compile_list()[source]

Get the list of all possible compile arguments.

get_layers_list()[source]

Get the list of all possible layer functions.

get_model_list()[source]

Get the list of all possible model types.

class learn_ml.generators.json_generators.PipelineJsonGenerator(out_file)[source]

Bases: learn_ml.generators.json_generators.JsonGenerator

Generates a dataset pipeline configuration file based on user selections.

_add_operation(key, op_name, args=None)[source]

Add a preprocessing operation.

A list of all allowable operations can be found as methods of the tf.data.Dataset class (https://www.tensorflow.org/api_docs/python/tf/data/Dataset).

Parameters
  • key (str) – Key in the JSON under which the operation is stored.

  • op_name (str) – Name of the operation. Equivalent to a tf.data.Dataset method name.

  • args (dict) – The method arguments {“param” : “value”}. The value doesn’t always correspond to the actual argument, so that functionality can be abstracted from specific machine learning libraries. Check variable_map.json for all values and their representations.

Example usage:
>>> pipeline = PipelineJsonGenerator("out.json")
>>> args = {
>>>     "map_func" : "normalize_img",
>>>     "num_parallel_calls" : "autotune"
>>> }
>>> pipeline.add_operation("map", args)
>>> print(pipeline.root["operations"])
[
    {
        "name" : "map",
        "args" : {
            "map_func" : "normalize_img",
            "num_parallel_calls" : "autotune"
        }
    }
]

add_dataset(label)[source]

Add a dataset source.

All available datasets can be found in the Tensorflow Datasets catalog.

(https://www.tensorflow.org/datasets/catalog/overview)

Parameters

label (str) – Dataset identifier. Equivalent to the Tensorflow dataset name.

add_test_operation(op_name, args=None)[source]

Add an operation to the test dataset.

add_train_operation(op_name, args=None)[source]

Add an operation to the training dataset.

get_dataset_list()[source]

Return a list of all possible dataset sources.

get_operations_list()[source]

Return a list of all possible preprocessing operations.

learn_ml.generators.python_generators module

Classes to generate Python scripts from JSON configuration files.

The purpose of these classes is to generate a class-based Python script from user-set parameters stored in a JSON file. The layout is as follows:

Classes::

    PythonGenerator: Base class with Python script writing methods.
    ClassGenerator: Includes methods for writing class definitions and methods.
    PipelineGenerator: Generates a dataset pipeline object script.
    ModelGenerator: Generates a machine learning model object script.

class learn_ml.generators.python_generators.ClassGenerator(class_config, map_config, out)[source]

Bases: learn_ml.generators.python_generators.PythonGenerator

Abstract class that defines methods for writing object-oriented strings to a python file.

class_

Class data loaded from the JSON file.

Type

dict

map

A mapping from class dict representations to their actual values. For example, a “dense” value in the class may be mapped to “tf.keras.layers.Dense”.

Type

dict
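As an illustration of such a mapping, the fragment below shows a hypothetical map dict and lookup. The actual keys and values live in the library's map JSON config file; here, unmapped terms pass through unchanged, which is one plausible behavior.

```python
# Hypothetical fragment of a variable map; the real entries are defined
# in the library's map JSON config file.
variable_map = {
    "dense": "tf.keras.layers.Dense",
    "flatten": "tf.keras.layers.Flatten",
    "autotune": "tf.data.experimental.AUTOTUNE",
}

def map_term(term):
    """Look up the language-specific form of a term, or pass it through."""
    return variable_map.get(term, term)
```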

_arg_str(arg_dict)[source]

Converts function arguments to a string.

Parameters

arg_dict (dict) – The function arguments. Can be None if there are no args.

Returns

Arguments formatted as they would be passed into a function.

Return type

str

Example usage:
>>> arg_dict = {
>>>     "param_1" : None,
>>>     "param_2" : "val_2"
>>> }
>>> print(_arg_str(arg_dict))
"param_1, param_2=val_2"
_check_configs(class_config, map_config)[source]

Check that the config files exist.

Raises

FileNotFoundError – The class or map config files cannot be found.

abstract _class_def()[source]

Define a class.

This method should call _start_class() and should define an __init__ method.

_end_class()[source]

Signify end of class and remove indents.

_end_method()[source]

Signify end of method and remove indents.

_fn(fn_dict)[source]

Maps a function dictionary and returns it as a string.

Parameters

fn_dict (dict) – The function.

Returns

The mapped function formatted as a string.

Return type

str

Example usage:
>>> fn_dict = {
>>>     "name" : "function",
>>>     "args" : {
>>>         "param" : "value"
>>>     }
>>> }
>>> print(_fn(fn_dict))
"mapped_function(mapped_param=mapped_value)"
_fn_str(fn_dict)[source]

Convert a function from dictionary form to a string.

Parameters

fn_dict (dict) – The function.

Example usage:
>>> fn_dict = {
>>>     "name" : "function",
>>>     "args" : {
>>>         "param" : "val"
>>>     }
>>> }
>>> print(_fn_str(fn_dict))
"function(param=val)"
abstract _imports()[source]

Write imports to file.

_map(input_)[source]

Converts an input to its language-specific representation.

This function allows the model JSON language to be independent of implementation syntax. For example, the model may have a “dense” layer. The Tensorflow code for this is tf.keras.layers.Dense(). This function returns that language-specific syntax.

Parameters

input_ (str) – The term to be mapped.

_map_fn(fn_dict)[source]

Convert the elements of a function dictionary to their real values.

The function name, arg param, and arg value can be mapped if they have a mapping defined in the map JSON config file.

Parameters

fn_dict (dict) – The function.

Returns

A mapped dictionary of the same form as the input.

Return type

dict

Example usage:
>>> function = {
>>>     "name" : "function"
>>>     "args" : {
>>>         "param" : "value"
>>>     }
>>> }
>>> print(_map_fn(function))
{
    "name" : "mapped_function",
    "args" : {
        "mapped_param" : "mapped_value"
    }
}
_start_class(name, base='object', docstring=None)[source]

Write the class definition to file.

Parameters
  • name (str) – The class name.

  • base (str) – The base class. Default is “object”.

  • docstring (str, List[str], optional) – The docstring.

_start_class_method(name, arg_dict=None, docstring=None)[source]

Writes a class method definition to file.

Parameters
  • name (str) – The method name.

  • args (dict, optional) – The args passed to the method. Don’t include “self”.

  • docstring (str, List[str], optional) – The docstring.

_start_method(name, args=None, docstring=None)[source]

Writes the method definition to file.

Parameters
  • name (str) – The method name.

  • args (dict, optional) – The args passed to the method.

  • docstring (str, List[str], optional) – The docstring.

_write_imports(imports_dict)[source]

Write import statements to file.

Parameters

imports_dict (dict) – A dictionary of strings containing the import packages and their abbreviations. Currently, an abbreviation must be specified.

Example

>>> imports_dict = {
>>>    "tensorflow" : "tf",
>>>    "numpy" : "np"
>>> }
>>> _write_imports(imports_dict)
In the out file::

    import tensorflow as tf
    import numpy as np

Raises

TypeError – imports_dict is not of type dictionary.

class learn_ml.generators.python_generators.ModelGenerator(model_config='generators/model.json', map_config='generators/model_variable_map.json', out='project/models/model.py')[source]

Bases: learn_ml.generators.python_generators.ClassGenerator

Generates a machine learning model based on a JSON config file.

Use gen_model() to create the model. The model is stored as a Model class with its layers and compilation features as its internal state.

model

Model configuration loaded from the JSON.

Type

dict

model_name

The name that represents the model type.

Type

str

_build_model()[source]

Generate code for building the model.

_class_def()[source]

Generate Model class definition.

_compile_model()[source]

Generate code for the model compiler.

_helper_funcs()[source]

Generates get/set helper functions.

_imports()[source]

Generate import statements.

_train()[source]

Generate code to train the model.

gen_model()[source]

Generate all code for the Model class.

class learn_ml.generators.python_generators.PipelineGenerator(pipeline_config='generators/pipeline.json', map_config='generators/pipeline_map.json', out='project/pipelines/pipeline.py')[source]

Bases: learn_ml.generators.python_generators.ClassGenerator

Generates a dataset preprocessing pipeline based on a JSON config file.

Most of the class methods are internal and do not need to be modified. All changes should be made to the JSON pipeline file used for configuration. Call gen_pipeline() to write the script.

The pipeline is represented as a Pipeline object with dataset and preprocessing operations in its internal state.

pipeline

The pipeline configuration loaded from the JSON.

Type

dict

dataset

A subset of the pipeline config that represents the dataset options.

Type

dict

_class_def()[source]

Generate DatasetPipeline class definition.

_helper_funcs()[source]

Generates get/set helper functions.

_imports()[source]

Generate import statements.

_load_dataset()[source]

Generate code for loading the dataset.

_operations(variable, operations)[source]

Write operation code for a dataset variable and list of operations.

An operation is a dict that defines a function to be applied to a variable. This method writes a line of code to apply a generic operation to a generic variable.

Parameters
  • variable (str) – The variable the function/operation is called on.

  • operations (list[dict]) – The operations to be individually applied to the variable.

Example

>>> operations = [
    {
        "name" : "map",
        "args" : {
            "param" : "value"
        }
    },
    {
        "name" : "cache",
        "args" : None
    }
]
>>> _operations("dataset", operations)
'self.dataset = self.dataset.map(param=value)'
'self.dataset = self.dataset.cache()'
_preprocess()[source]

Generate code for all preprocessing operations.

The training and test datasets may have preprocessing operations applied to them. This function will write each operation in the pipeline JSON as an individual line of code.

gen_pipeline()[source]

Generate all code for the dataset pipeline.

class learn_ml.generators.python_generators.PythonGenerator(out)[source]

Bases: object

Generates a Python script and writes code to it.

This class can be used to write lines of code to a python file and keep track of indent levels.

out_file_name

Name of the file to write to.

Type

str

out

File object used to write to file.

name

An identifier for the file, taken from the out path. For example, for out = “./project/models/model.py”, name is “model”.

Type

str

indent_level

Keep track of current writing indentation.

Type

int

indent_str

4 spaces to represent a single indent.

Type

str

_indent(inc=1)[source]

Increments the indent level of the output string.

_spaces()[source]

Returns a string of the correct number of spaces for the current indent level.

_unindent(dec=1)[source]

Decrements the indent level of the output string.
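The indent bookkeeping described by indent_level, indent_str, _indent, _unindent, and _spaces can be sketched as follows. This is a minimal stand-in for illustration; the real PythonGenerator also manages the output file object.

```python
class IndentTracker:
    """Illustrative sketch of PythonGenerator's indent bookkeeping."""

    def __init__(self):
        self.indent_level = 0
        self.indent_str = "    "  # 4 spaces represent a single indent

    def _indent(self, inc=1):
        self.indent_level += inc

    def _unindent(self, dec=1):
        self.indent_level = max(0, self.indent_level - dec)

    def _spaces(self):
        # Spaces to prefix a line at the current indent level.
        return self.indent_str * self.indent_level
```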

_write(lines)[source]

Writes lines of code to the file.

Parameters

lines (str, List[str]) – Line or lines to be written to the file.

Example

>>> pygen = PythonGenerator("out.py")
>>> pygen._write("This is a line.\n")
>>> pygen._indent()
>>> pygen._write(["This is line 1\n",
>>>               "This is line 2\n"])
In out.py:
    This is a line.
        This is line 1
        This is line 2
_write_docstring(line=None)[source]

Writes a line of docstring to the file.

Parameters

line (str, optional) – Docstring comment without docstring quotes. Defaults to None and will do nothing.

get_gen_file_name()[source]

Returns the name of the generated Python script.

Module contents