learn_ml.generators package
Submodules
learn_ml.generators.generator_utils module
Utility functions for code generators.
These functions help with using object formats that were created specifically for the PythonGenerator and JsonGenerator class hierarchies.
-
learn_ml.generators.generator_utils.
_is_valid_arg_dict
(arg_dict)[source]¶ Check if arguments are formatted correctly in the argument dictionary.
An arg dict defines all arguments to a function as param:value pairs. The pairs must follow the rules of standard Python arguments: args with default values must come after args without. All keys in the dict must be of type string.
- Values may take the forms:
    string: To represent any object. Will be generated as code, not a string.
    int/float: To represent a number value.
    list of strings, ints, or floats: When an arg requires multiple values.
    dict: To represent a function being passed as a value.
    None: This can be used to signify there is no default parameter.
This function raises exceptions if any of these rules are not followed.
- Parameters
arg_dict (dict) – Method arguments in the form {“param” : “value”}. Refer to above for how to format an argument dictionary.
- Raises
TypeError – Arguments must be formatted as a dictionary.
ValueError – All None values must come before all string values.
TypeError – All keys must be strings and all values must be strings or None.
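The rules above can be sketched as a small standalone check. This is a minimal illustration under stated assumptions, not the library's implementation: the name check_arg_dict is hypothetical, and the ordering rule is read as "a None value may not follow a non-None value".

```python
def check_arg_dict(arg_dict):
    """Hypothetical stand-in for _is_valid_arg_dict (illustration only)."""
    if not isinstance(arg_dict, dict):
        raise TypeError("Arguments must be formatted as a dictionary.")

    seen_default = False  # becomes True once an arg with a value is seen
    for param, value in arg_dict.items():
        if not isinstance(param, str):
            raise TypeError("All keys must be strings.")
        if value is None:
            # None marks "no default", so it must not follow an arg with a value.
            if seen_default:
                raise ValueError("Args without defaults must come first.")
        else:
            seen_default = True
    return True


# Valid: the arg without a default precedes the arg with one.
check_arg_dict({"inputs": None, "units": 10})
```

Python dictionaries preserve insertion order, which is what makes an ordering rule like this checkable on a plain dict.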
-
learn_ml.generators.generator_utils.
_is_valid_fn_dict
(fn_dict)[source]¶ Check if function is formatted correctly in a dictionary.
The function dictionary must be correctly formatted so it can be parsed. It must have two keys: “name” and “args”. The value of “name” must be a string, and args must be formatted as an arg_dict (see _is_valid_arg_dict for that format). Any input deviating from this will raise an exception.
- Parameters
fn_dict (dict) – Represents a function name and its arguments.
- Raises
TypeError – Input must be of type dict.
ValueError – Must have a “name” key in the dict.
ValueError – Must have an “args” key in the dict.
ValueError – The only keys must be “name” and “args”.
TypeError – Function name must be a string.
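The checks listed under Raises can be sketched in the same order. This is a hedged illustration, not the module's code: check_fn_dict is a hypothetical name, and validation of the nested arg dict is left out to keep the sketch short.

```python
def check_fn_dict(fn_dict):
    """Hypothetical stand-in for _is_valid_fn_dict (illustration only)."""
    if not isinstance(fn_dict, dict):
        raise TypeError("Input must be of type dict.")
    if "name" not in fn_dict:
        raise ValueError('Must have a "name" key in the dict.')
    if "args" not in fn_dict:
        raise ValueError('Must have an "args" key in the dict.')
    if set(fn_dict) != {"name", "args"}:
        raise ValueError('The only keys must be "name" and "args".')
    if not isinstance(fn_dict["name"], str):
        raise TypeError("Function name must be a string.")
    return True


# A well-formed function dictionary passes every check.
check_fn_dict({"name": "flatten", "args": {"input_shape": [28, 28, 1]}})
```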
-
learn_ml.generators.generator_utils.
create_fn_dict
(name, args=None)[source]¶ Creates a dictionary representation of a function.
This function exists so the user doesn’t have to worry about the internal representation of a function in the JSON file.
- Parameters
name (str) – Name of the function.
args (dict) – Function arguments as {“parameter” : “value”} elements. Refer to _is_valid_arg_dict() docstring for formatting.
- Returns
The formatted function.
- Return type
dict
Example
>>> function_name = "flatten"
>>> args = {
...     "input_shape" : [28, 28, 1],
...     "parameter" : "value"
... }
>>> print(create_fn_dict(function_name, args))
{
    "name" : "flatten",
    "args" : {
        "input_shape" : [28, 28, 1],
        "parameter" : "value"
    }
}
learn_ml.generators.json_generators module
Contains classes for generating pipeline and model configuration files.
These classes are used by the front-end application to define a dataset pipeline and machine learning model in the form of a JSON file. Options are stored in about the same format as they would be selected by the user. After these JSONs are generated, the actual DatasetPipeline and Model classes can be generated based on the JSONs.
- Classes:
    JsonGenerator: Base class for writing to a JSON.
    PipelineJsonGenerator: Writes JSON entries specific to the Pipeline class.
    ModelJsonGenerator: Writes JSON entries specific to the Model class.
-
class
learn_ml.generators.json_generators.
JsonGenerator
(out_file)[source]¶ Bases:
object
Base class that adds content to a JSON file.
-
out_file
¶ The file to write to.
- Type
str
-
out
¶ The file object that performs write operations.
-
root
¶ Stores all the data before writing.
- Type
dict
-
index
¶ Keeps track of the current indentation in the root dict.
- Type
dict
-
_indent
(key)[source]¶ Set the entry point for writing to the JSON.
After indenting, new entries will be added to this indented level. Indents can only be set to dictionaries. Will raise a warning if trying to indent to anything else.
- Parameters
key (str) – Dictionary key in the JSON. If the key does not already exist in the JSON, then it is added with an empty dictionary value.
-
add_entry
(key, value)[source]¶ Adds a key-value entry to the file.
The entry is added at the current indent level. If the entry is being added to a key that already exists in the JSON, then it will overwrite the current value. If the existing key has a list as its value, then the entry is appended to the end of the list.
- Parameters
key (str) – Dictionary key to add to the file.
value – Dictionary value to add to the file. Can be of any type allowed in Python dictionaries.
-
add_fn
(key, fn_name, args=None)[source]¶ Adds a function to the file.
A function is defined as a dictionary with “name” and “args” keys. The name is a string and the args is a dictionary.
- Parameters
key (str) – Key value in the JSON file.
fn_name (str) – Function name
args (dict) – Function arguments as a dictionary.
- Example usage:
>>> json_gen = JsonGenerator("out.json")
>>> args = {
...     "parameter_1" : "value_1",
...     "parameter_2" : [28, 28, 1]
... }
>>> json_gen.add_fn("key", "function", args)
>>> print(json_gen.root)
{
    "key" : {
        "name" : "function",
        "args" : {
            "parameter_1" : "value_1",
            "parameter_2" : [28, 28, 1]
        }
    }
}
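The overwrite-versus-append behaviour of add_entry and the role of the indent level can be illustrated with an in-memory stand-in. This is a sketch under assumptions, not the real class: JsonBuilderSketch is a hypothetical name, it performs no file I/O, and the warning on a non-dict indent target follows the description of _indent above.

```python
import warnings


class JsonBuilderSketch:
    """Hypothetical in-memory stand-in for JsonGenerator (illustration only)."""

    def __init__(self):
        self.root = {}          # all data collected so far
        self.index = self.root  # dict that new entries are currently added to

    def indent(self, key):
        # Create an empty dict on first use, then make it the write target.
        target = self.index.setdefault(key, {})
        if not isinstance(target, dict):
            warnings.warn("Can only indent into a dictionary.")
            return
        self.index = target

    def add_entry(self, key, value):
        existing = self.index.get(key)
        if isinstance(existing, list):
            existing.append(value)   # list values grow
        else:
            self.index[key] = value  # anything else is overwritten

    def add_fn(self, key, fn_name, args=None):
        self.add_entry(key, {"name": fn_name, "args": args})


builder = JsonBuilderSketch()
builder.add_entry("version", 1)
builder.add_entry("version", 2)      # overwrites the previous value
builder.indent("model")
builder.add_entry("layers", [])
builder.add_fn("layers", "flatten", {"input_shape": [28, 28, 1]})  # appended
```

After these calls, builder.root holds "version" set to 2 and a nested "model" dict whose "layers" list contains one function dictionary.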
-
-
class
learn_ml.generators.json_generators.
ModelJsonGenerator
(out_file)[source]¶ Bases:
learn_ml.generators.json_generators.JsonGenerator
Generates a configuration file representing a machine learning model.
-
add_compile
(args=None)[source]¶ Adds compiler options to the file.
Refer to model_options.json for all compiler arg options.
- Parameters
args (dict) – Arguments passed into the compile function.
-
add_layer
(layer_name, args=None)[source]¶ Adds a neural net layer to the file.
- Parameters
layer_name (str) – Layer name. Refer to the model_options.json for possible layer names.
args (dict) – Args {param : value} to pass into the layer function.
- Example usage:
>>> model = ModelJsonGenerator("out.json")
>>> args = {
...     "units" : 10,
...     "activation" : "softmax"
... }
>>> model.add_layer("dense", args)
>>> print(model.root["layers"])
[
    {
        "name" : "dense",
        "args" : {
            "units" : 10,
            "activation" : "softmax"
        }
    }
]
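For orientation, the sketch below builds a model configuration of the same shape by hand and round-trips it through the standard json module. Only the "layers" key and the "name"/"args" layout come from the example above; the "compile" key and the optimizer and loss values are assumptions made for illustration and should be checked against model_options.json.

```python
import json


def layer(name, args=None):
    # Same "name"/"args" layout as the add_layer example.
    return {"name": name, "args": args}


config = {"layers": []}
config["layers"].append(layer("flatten", {"input_shape": [28, 28, 1]}))
config["layers"].append(layer("dense", {"units": 10, "activation": "softmax"}))

# Assumed key and values; the real file layout may differ.
config["compile"] = {
    "optimizer": "adam",
    "loss": "sparse_categorical_crossentropy",
}

text = json.dumps(config, indent=4)  # what would be written to disk
restored = json.loads(text)          # what a generator would read back
```

Because every value is a plain dict, list, string, or number, the configuration survives the round trip unchanged.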
-
-
class
learn_ml.generators.json_generators.
PipelineJsonGenerator
(out_file)[source]¶ Bases:
learn_ml.generators.json_generators.JsonGenerator
Generates a dataset pipeline configuration file based on user selections.
-
_add_operation
(key, op_name, args=None)[source]¶ Add a preprocessing operation.
- A list of all allowable operations can be found as methods for the tf.data.Dataset class
(https://www.tensorflow.org/api_docs/python/tf/data/Dataset)
- Parameters
op_name (str) – Name of the operation. Equivalent to a tf.data.Dataset method name.
args (dict) – The method arguments {“param” : “value”}. The value doesn’t always correspond to the actual argument so that functionality can be abstracted from specific machine learning libraries. Check variable_map.json for all values and their representations.
- Example usage:
>>> pipeline = PipelineJsonGenerator("out.json")
>>> args = {
...     "map_func" : "normalize_img",
...     "num_parallel_calls" : "autotune"
... }
>>> pipeline.add_operation("map", args)
>>> print(pipeline.root["operations"])
[
    {
        "name" : "map",
        "args" : {
            "map_func" : "normalize_img",
            "num_parallel_calls" : "autotune"
        }
    }
]
-
learn_ml.generators.python_generators module
Classes to generate Python scripts based off JSON configuration files.
The purpose of these classes is to generate a class-based Python script based off parameters set by a user stored in a JSON file. The layout is as follows:
- Classes:
    PythonGenerator: Base class with python script writing methods.
    ClassGenerator: Includes methods for writing class definitions and methods.
    PipelineGenerator: Generates a dataset pipeline object script.
    ModelGenerator: Generates a machine learning model object script.
-
class
learn_ml.generators.python_generators.
ClassGenerator
(class_config, map_config, out)[source]¶ Bases:
learn_ml.generators.python_generators.PythonGenerator
Abstract class that defines methods for writing object-oriented strings to a python file.
-
class_
¶ Class data loaded from the JSON file.
- Type
dict
-
map
¶ A mapping from class dict representations to their actual values. For example, a “dense” value in the class may be mapped to “tf.keras.layers.Dense”.
- Type
dict
-
_arg_str
(arg_dict)[source]¶ Converts function arguments to a string.
- Parameters
arg_dict (dict) – The function arguments. Can be None if there are no args.
- Returns
Arguments formatted as they would be passed into a function.
- Return type
str
- Example usage:
>>> arg_dict = {
...     "param_1" : None,
...     "param_2" : "val_2"
... }
>>> print(_arg_str(arg_dict))
"param_1, param_2=val_2"
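The conversion shown in the example can be sketched as a free function. This is a minimal illustration, not the method's source: arg_str_sketch is a hypothetical name, and nested function dictionaries as values are not handled here.

```python
def arg_str_sketch(arg_dict):
    """Hypothetical stand-in for _arg_str (illustration only)."""
    if not arg_dict:  # None or an empty dict means "no arguments"
        return ""
    parts = []
    for param, value in arg_dict.items():
        if value is None:
            parts.append(param)              # no default: bare parameter name
        else:
            parts.append(f"{param}={value}")  # value is emitted as code
    return ", ".join(parts)


signature = arg_str_sketch({"param_1": None, "param_2": "val_2"})
# signature == "param_1, param_2=val_2"
```

String values are written without quotes, which matches the rule that a string in an arg dict is generated as code rather than as a string literal.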
-
_check_configs
(class_config, map_config)[source]¶ Check that the config files exist.
- Raises
FileNotFoundError – The class or map config files cannot be found.
-
abstract
_class_def
()[source]¶ Define a class.
This method should call _start_class() and should define an __init__ method.
-
_fn
(fn_dict)[source]¶ Maps a function dictionary and returns it as a string.
- Parameters
fn_dict (dict) – The function.
- Returns
The mapped function formatted as a function call.
- Return type
str
- Example usage:
>>> fn_dict = {
...     "name" : "function",
...     "args" : {
...         "param" : "value"
...     }
... }
>>> print(_fn(fn_dict))
"mapped_function(mapped_param=mapped_value)"
-
_fn_str
(fn_dict)[source]¶ Convert a function from dictionary form to a string.
- Parameters
fn_dict (dict) – The function.
- Example usage:
>>> fn_dict = {
...     "name" : "function",
...     "args" : {
...         "param" : "val"
...     }
... }
>>> print(_fn_str(fn_dict))
"function(param=val)"
-
_map
(input_)[source]¶ Converts an input to its language-specific representation.
This function allows the model JSON language to be independent of implementation syntax. For example, the model may have a “dense” layer. The Tensorflow code for this is tf.keras.layers.Dense(). This function returns that language-specific syntax.
- Parameters
input_ (str) – The term to be mapped.
-
_map_fn
(fn_dict)[source]¶ Convert the elements of a function dictionary to their real values.
The function name, arg param, and arg value can be mapped if they have a mapping defined in the map JSON config file.
- Parameters
fn_dict (dict) – The function.
- Returns
A mapped dictionary of the same form as the input.
- Return type
dict
- Example usage:
>>> function = {
...     "name" : "function",
...     "args" : {
...         "param" : "value"
...     }
... }
>>> print(_map_fn(function))
{
    "name" : "mapped_function",
    "args" : {
        "mapped_param" : "mapped_value"
    }
}
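The mapping step can be sketched with a plain dictionary lookup that falls back to the original term when no mapping is defined. The helper names and the explicit mapping argument are hypothetical; the real class reads its mapping from the map JSON config file, whose exact layout is not shown here. The "dense" entry reuses the example given for the map attribute.

```python
def map_term(term, mapping):
    # Only strings are looked up; numbers, lists, and None pass through.
    if isinstance(term, str):
        return mapping.get(term, term)
    return term


def map_fn_sketch(fn_dict, mapping):
    """Hypothetical stand-in for _map_fn (illustration only)."""
    args = fn_dict.get("args") or {}
    return {
        "name": map_term(fn_dict["name"], mapping),
        "args": {
            map_term(param, mapping): map_term(value, mapping)
            for param, value in args.items()
        },
    }


mapping = {"dense": "tf.keras.layers.Dense"}
mapped = map_fn_sketch(
    {"name": "dense", "args": {"units": 10, "activation": "softmax"}},
    mapping,
)
```

Terms without an entry, such as "units" and "softmax" here, are returned unchanged, so a partial map file still produces a usable function dictionary.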
-
_start_class
(name, base='object', docstring=None)[source]¶ Write the class definition to file.
- Parameters
name (str) – The class name.
base (str) – The base class. Default is “object”.
docstring (str, List[str], optional) – The docstring.
-
_start_class_method
(name, arg_dict=None, docstring=None)[source]¶ Writes a class method definition to file.
- Parameters
name (str) – The method name.
arg_dict (dict, optional) – The args passed to the method. Don’t include “self”.
docstring (str, List[str], optional) – The docstring.
-
_start_method
(name, args=None, docstring=None)[source]¶ Writes the method definition to file.
- Parameters
name (str) – The method name.
args (dict, optional) – The args passed to the method.
docstring (str, List[str], optional) – The docstring.
-
_write_imports
(imports_dict)[source]¶ Write import statements to file.
- Parameters
imports_dict (dict) – A dictionary of strings mapping each package to import to its abbreviation. Currently, an abbreviation must be specified.
Example
>>> imports_dict = {
...     "tensorflow" : "tf",
...     "numpy" : "np"
... }
>>> _write_imports(imports_dict)
- In out file:
import tensorflow as tf
import numpy as np
- Raises
TypeError – imports_dict is not of type dictionary.
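A minimal sketch of how the import lines could be derived from the dictionary, assuming each entry becomes one "import package as alias" statement. The function name import_lines is hypothetical, and it returns the lines instead of writing them to a file.

```python
def import_lines(imports_dict):
    """Hypothetical helper mirroring _write_imports (illustration only)."""
    if not isinstance(imports_dict, dict):
        raise TypeError("imports_dict must be a dictionary.")
    return [
        f"import {package} as {alias}\n"
        for package, alias in imports_dict.items()
    ]


lines = import_lines({"tensorflow": "tf", "numpy": "np"})
# lines == ["import tensorflow as tf\n", "import numpy as np\n"]
```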
-
-
class
learn_ml.generators.python_generators.
ModelGenerator
(model_config='generators/model.json', map_config='generators/model_variable_map.json', out='project/models/model.py')[source]¶ Bases:
learn_ml.generators.python_generators.ClassGenerator
Generates a machine learning model based off a JSON config file.
Use gen_model() to create the model. The model is stored as a Model class with its layers and compilation features as its internal state.
-
model
¶ Model configuration loaded from the JSON.
- Type
dict
-
model_name
¶ The name that represents the model type.
- Type
str
-
-
class
learn_ml.generators.python_generators.
PipelineGenerator
(pipeline_config='generators/pipeline.json', map_config='generators/pipeline_map.json', out='project/pipelines/pipeline.py')[source]¶ Bases:
learn_ml.generators.python_generators.ClassGenerator
Generates a dataset preprocessing pipeline based off a JSON config file.
Most of the class methods are internal and do not need to be modified. All changes should be made to the JSON pipeline file used for configuration. Call gen_pipeline() to write the script.
The pipeline is represented as a Pipeline object with dataset and preprocessing operations in its internal state.
-
pipeline
¶ The pipeline configuration loaded from the JSON.
- Type
dict
-
dataset
¶ A subset of the pipeline config that represents the dataset options.
- Type
dict
-
_operations
(variable, operations)[source]¶ Write operation code for a dataset variable and list of operations.
An operation is a dict that defines a function to be applied to a variable. This method writes a line of code to apply a generic operation to a generic variable.
- Parameters
variable (str) – The variable the function/operation is called on.
operations (list[dict]) – The operations to be individually applied to the variable.
Example
>>> operations = [
...     {
...         "name" : "map",
...         "args" : {
...             "param" : "value"
...         }
...     },
...     {
...         "name" : "cache",
...         "args" : None
...     }
... ]
>>> _operations("dataset", operations)
'self.dataset = self.dataset.map(param=value)'
'self.dataset = self.dataset.cache()'
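The line-per-operation output in the example can be reproduced with a short helper. This is a sketch under assumptions: operation_lines is a hypothetical name, it returns the generated lines rather than writing them, and it skips the mapping step that the real generator applies to names and values.

```python
def operation_lines(variable, operations):
    """Hypothetical stand-in for _operations (illustration only)."""
    lines = []
    for op in operations:
        args = op.get("args") or {}  # "args": None means no arguments
        arg_text = ", ".join(
            param if value is None else f"{param}={value}"
            for param, value in args.items()
        )
        lines.append(
            f"self.{variable} = self.{variable}.{op['name']}({arg_text})"
        )
    return lines


generated = operation_lines(
    "dataset",
    [
        {"name": "map", "args": {"param": "value"}},
        {"name": "cache", "args": None},
    ],
)
```

Each generated line reassigns the variable to the result of the call, which is how successive operations are chained onto the same dataset.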
-
-
class
learn_ml.generators.python_generators.
PythonGenerator
(out)[source]¶ Bases:
object
Generates a Python script and writes code to it.
This class can be used to write lines of code to a python file and keep track of indent levels.
-
out_file_name
¶ Name of the file to write to.
- Type
str
-
out
¶ File object used to write to file.
-
name
¶ An identifier for the file. Taken from the out name. For example, for out = “./project/models/model.py”, name = “model”
- Type
str
-
indent_level
¶ Keep track of current writing indentation.
- Type
int
-
indent_str
¶ 4 spaces to represent a single indent.
- Type
str
-
_write
(lines)[source]¶ Writes lines of code to the file.
- Parameters
lines (str, List[str]) – Line or lines to be written to the file.
Example
>>> pygen = PythonGenerator("out.py")
>>> pygen._write("This is a line.\n")
>>> pygen._indent()
>>> pygen._write(["This is line 1\n",
...               "This is line 2\n"])
In out.py:
This is a line.
    This is line 1
    This is line 2
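The interaction between indent_level, indent_str, and _write can be sketched with an in-memory buffer. IndentWriterSketch and its method names are hypothetical, and io.StringIO stands in for the output file; the real class may differ in detail.

```python
import io


class IndentWriterSketch:
    """Hypothetical stand-in for PythonGenerator (illustration only)."""

    def __init__(self, out):
        self.out = out            # any object with a write() method
        self.indent_level = 0     # current nesting depth
        self.indent_str = "    "  # 4 spaces per indent

    def indent(self):
        self.indent_level += 1

    def unindent(self):
        self.indent_level = max(0, self.indent_level - 1)

    def write(self, lines):
        if isinstance(lines, str):  # accept a single line or a list of lines
            lines = [lines]
        for line in lines:
            self.out.write(self.indent_str * self.indent_level + line)


buffer = io.StringIO()
writer = IndentWriterSketch(buffer)
writer.write("class Model:\n")
writer.indent()
writer.write(["def __init__(self):\n"])
writer.indent()
writer.write("pass\n")
```

Keeping the indent level on the writer means callers emit unindented lines and the nesting is applied in one place.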
-