Analyze images, videos, time-series and more with the Generalized Neural Network bundle (GNN)

Evolving Neural Networks

Neural Networks have become a key mechanism for the analysis of many types of data.  In particular, they have been found to be very effective for the analysis of complex datasets such as images, video, and time-series, where classical methods have proven inadequate.  Additionally, these networks have shown success in optimization problems such as text vectorization and feature detection.  Some of these breakthroughs have been demonstrated with classical neural networks, but others are best handled with more sophisticated flavors of neural network, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Generative Adversarial Networks (GANs).  Additionally, endless variations and combinations of these techniques have been employed in order to push the state of the art in machine learning and optimization.  One can safely assume that this explosion of Neural Network techniques will continue for the foreseeable future, resulting in a continued procession of new techniques.

Google created Tensorflow in order to provide a single core technology that can incorporate current and future types of neural network layers, and allow a user to assemble these layers in arbitrary combinations to achieve their particular application goals.  Tensorflow has become the most widely used Neural Network framework in the industry.  Fundamental to the use of Tensorflow is the use of Tensors as a primary interface structure.  Tensors can be thought of as N-dimensional arrays.

Traditional Machine Learning specifies the data as a set of observations, each with a set of features, representing a two-dimensional dataset (nObservations x nFeatures).  This is not optimal, however, when dealing with more complex data such as images, video, or time-series.  Images, for example, require for each observation the set of pixels, each with a row index, a column index, and three color values (i.e. Red, Green, Blue).  Thus the dataset would be four-dimensional: (observation, row-index, column-index, color).  A video stream would add another dimension: the time sequence.  Tensors are, therefore, an excellent generalized form for handling input and output of any dimensionality.  Furthermore, Tensorflow supports complex neural network layers that have higher-dimensional sets of weights.  Access to these weights is also provided via Tensors.

Keras defines a high-level interface for creating neural networks, layer by layer.  It is the most widely used interface for Tensorflow, and is also capable of running with other neural network frameworks, insulating a user from strict dependence on Tensorflow.  The most commonly used combination, however, remains a Keras Interface into Tensorflow.

GNN Bundle Overview

The Generalized Neural Network Bundle (GNN) allows the ECL programmer to combine the parallel processing power of HPCC Systems with the powerful Neural Network capabilities of Keras and Tensorflow.  Each node in the HPCC Systems cluster is attached to an independent Keras / Tensorflow environment.  These environments may each contain various hardware acceleration capabilities such as Graphical Processing Units (GPUs) or Tensor Processing Units (TPUs).  GNN coordinates among those environments to provide a distributed environment that can parallelize all phases of Keras / Tensorflow usage.  This coordination is transparent to the GNN user, who can program as if running on a single node.

GNN also provides a Tensor module, allowing users to efficiently encode, decode, and manipulate Tensors within ECL. These Tensor data sets are used as the primary data interface to the GNN functions.

GNN is designed to be easy to use and familiar for people who have used the Keras interface.

Using GNN

We assume that the reader has some experience with Keras or a similar neural network interface.  Since we use the Keras approach, and in fact use some Keras syntax, we refer you to the Keras documentation for details on the types of layers and other parameters available within Keras.

Once your data is in the form of Tensors, GNN is straightforward to use.  Getting your data into Tensor form requires a little more background.  We first illustrate how to use GNN once your data is available in Tensor form.  Then we provide a tutorial on how to use Tensors.  This is analogous (for Keras users) to understanding the Keras interface and then needing to learn the details of Numpy N-Dimensional Arrays (ndarray) in order to provide the data.

Installing GNN

From your clienttools area:

 ecl bundle install https://github.com/hpcc-systems/GNN.git

This will download the bundle from Github, and install it into your ECL bundle repository.

Before running GNN, make sure that Python3 and Tensorflow are installed on each server running HPCC Systems software.  Instructions for installing Python3 and Tensorflow are given below.

The GNNI Interface

The GNNI Module provides the primary interface into neural network functionality.  It provides methods to:

  • Initialize the environment
  • Build Keras / Tensorflow models
  • Serialize / Deserialize built models
  • Compile the model to define its training parameters, evaluation metrics, and other Keras Compile information
  • Train the model using distributed training data
  • Retrieve the model’s weights
  • Test the model’s efficacy
  • Predict (i.e. Classify or Regress) using the model

GNNI Example

So let's get started with an example.  The full executable example on which this walkthrough is based can be found here.

First we import GNNI into our test module

 IMPORT GNN.GNNI;

Now we need to initialize the Keras / Tensorflow environment by calling GetSession()

 s := GNNI.GetSession();

Note that GetSession must be called before any other GNNI functions.

GetSession returns a session id (i.e. “s”) to be used in subsequent calls.

Now we can define the neural network model as a set of layers using the (Python) Keras syntax.  We create these definitions as a set of strings, one per neural network layer.

Note the triple quote (''') syntax.  This ignores any special characters in the text, and allows any characters until the ending triple quote.  This makes it easier to, e.g., use quoted strings within the text without the need for escape characters.

ldef := ['''layers.Dense(256, activation='tanh', input_shape=(5,))''',
         '''layers.Dense(256, activation='relu')''',
         '''layers.Dense(1, activation=None)'''];

There are two special Python module imports that can be used within the strings: “tf” for tensorflow, and “layers” for tensorflow.keras.layers.

The above layer definition defines a three-layer neural network, where each layer is dense (i.e. classical fully-connected layers).  We could, of course, have defined an arbitrarily complex network using any of the myriad layer types supported by Keras and Tensorflow.  Note that the first layer includes an input_shape parameter.  Per Keras, this must always be included, since the first layer otherwise doesn't know how many inputs it has.  The number of inputs for subsequent layers is implicit, since it is the number of neurons in the previous layer.

In addition to specifying the layers, Keras also requires a definition as to how the network is to be trained and evaluated.  In Keras, this is done via the “compile” function. 

compileDef := '''compile(optimizer=tf.keras.optimizers.SGD(.05),
                 loss=tf.keras.losses.mean_squared_error,
                 metrics=[tf.keras.metrics.mean_squared_error]) ''';

GNNI allows the compile definition to be passed along with the layer definition in the DefineModel() function.  We pass the session-id returned from GetSession(), the Layer Definition and the Compile Definition, and get back a model-id.  This model-id will be passed to subsequent calls.

 mod := GNNI.DefineModel(s, ldef, compileDef);

Now that we have defined our model, we can train it using our Tensor formatted training data.

The Fit() function is used to train the network.  We pass the model-id, the independent training data (as a Tensor), the dependent training data (as a Tensor), and then a batch size and a number of epochs to train.

Each epoch processes the entire training set once.  It is typically necessary to pass through the training data many times in order to fully train the network.  Batch Size is the number of training records to process on each HPCC node before synchronizing weights among nodes.  Values of 32 to 1024 are typically used.  Larger values cause the epochs to be processed faster (less synchronization), but less progress is made during each epoch.  A value of 128 is usually a good compromise to start with.  During training, progress messages are output to the ECLWatch workunit status window, since training can take an arbitrarily long time depending on the complexity of the network and training set.

 mod2 := GNNI.Fit(mod, trainX, trainY, batchSize := 128, numEpochs := 5);

The output from the Fit() function is another model-id.  This is a different model-id than was returned from DefineModel().  To illustrate, we can retrieve the weights at this point by calling GetWeights().  The model-id that we pass to GetWeights() determines whether we get the weights from the untrained model (the output from DefineModel()) or the trained weights as output from Fit().

initialWts := GNNI.GetWeights(mod);  // The untrained model weights

trainedWts := GNNI.GetWeights(mod2);  // The trained model weights

Now that we have our trained model, we can evaluate the effectiveness of the model against test data using EvaluateMod()

 metrics := GNNI.EvaluateMod(mod2, testX, testY);

The testX and testY are the testing data formatted as Tensors.  The returned metrics are the set of metrics that were defined within the Compile Definition, evaluated against the test data.
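For example, a minimal way to inspect those metrics is to write them to the workunit output (the NAMED label here is just illustrative):

 OUTPUT(metrics, NAMED('EvalMetrics'));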

We can also use the trained model to make predictions for a new set of independent data.

 preds := GNNI.Predict(mod2, predX);

The returned predictions represent the outputs from the model given the inputs.  Both the inputs and outputs are in Tensor format.  Note that we used our trained model (i.e. mod2) in this call.  Otherwise, our results would be very disappointing.

This example provided a quick walk-through of using GNNI.  You can see that it is fairly easy to interface to GNNI once you have your data available and in the right Tensor format.  Now we'll look at how to use the Tensor module to format your training and test data and to retrieve predictions.

Using Tensors

Tensor Basics

The GNN Tensor module provides a way to construct and manage N-dimensional datasets.

Tensors have a shape.  The shape determines how the data within the Tensor is to be interpreted.  The shape is stored as a set of integers.  For example the shape [2,2] indicates a 2 x 2 two-dimensional Tensor. The shape [1000, 10, 4, 3] represents a four-dimensional 1000 x 10 x 4 x 3 Tensor.

Training and test data utilize a specific form of Tensor we call a Record-oriented Tensor.  This is a Tensor whose first dimension represents the observation number and whose remaining dimensions describe the shape of a single observation.  This is identified by a first shape term of 0.  The shape [0, 5] indicates any number of observations, each with 5 features.  The shape [0, 50, 50, 3] represents a set of observations, each with a shape of 50 x 50 x 3.  This could represent, for example, a 50 x 50 color image.  Weights, on the other hand, are stored in Rectangular Tensors.  Unlike Record-oriented Tensors, where the first index is arbitrary, Rectangular Tensors have a fixed shape along all dimensions.  Therefore, the first dimension of the shape is never zero for Rectangular Tensors.  It is important that Record-oriented Tensors are used for training and testing data, so their shape should always begin with a zero term.

There is a two step process for creating an ECL Tensor.

  1. Create the data that is to populate the tensor.  This uses the TensData record type.
  2. Create a Tensor using that data and any provided meta-data such as its shape.  This is packed into an efficient block-oriented form.

For example, I will create a 2 x 2 two-dimensional tensor:

[[0, 1],
 [2, 3]]

First we import the Tensor module:

 IMPORT GNN.Tensor;

Now we use the TensData record type to format our data.  GNN currently only supports tensors of type REAL4, but the interface is set up for future support of other tensor types.  So we use the Tensor.R4.TensData (REAL4 type) record format.

TensData is a sparse form, and only non-zero cells need to be specified.  Therefore, we skip the first cell, which has a zero value.

tensData1 := DATASET([{[1,2], 1}, // This is the second cell
                         {[2,1], 2}, // Third cell
                         {[2,2], 3}], // Fourth cell 
                         Tensor.R4.TensData);

Note that we don’t yet know how big the Tensor is.  It could be a 2 x 2 tensor, but it might also be a 1000 x 1000 tensor with only a few non-zero cells.

myTensor := Tensor.R4.MakeTensor([2,2], tensData1);

Now we have a Rectangular Tensor with exactly four cells as above, because we gave it a 2 x 2 shape.

For efficiency, Tensors are packed into slices using either sparse or dense formatting, depending on which yields the smaller data size.  This is transparent to the user.

Now if I want to get my original tensor data back, I can extract it from the tensor using the GetData() method.

 tensData2 := Tensor.R4.GetData(myTensor);

This returns the data in the Tensor.R4.TensData format. 
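Putting the pieces together, the complete round trip looks like this (a minimal sketch; the final OUTPUT is just there to view the recovered cells):

IMPORT GNN.Tensor;

// Sparse cell data for a 2 x 2 tensor; the zero-valued first cell is omitted
tensData1 := DATASET([{[1,2], 1}, // Second cell
                      {[2,1], 2}, // Third cell
                      {[2,2], 3}], // Fourth cell
                      Tensor.R4.TensData);

// Pack the cell data into a 2 x 2 Rectangular Tensor
myTensor := Tensor.R4.MakeTensor([2,2], tensData1);

// Unpack the tensor back into sparse cell form
tensData2 := Tensor.R4.GetData(myTensor);

OUTPUT(tensData2, NAMED('RoundTrip'));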

Tensor Examples

Tensor Example 1

In the GNNI example above, the classical neural network requires an input shape of [5], meaning that each record has 5 independent (X) features.  The output and dependent training data (Y) have a single value, as indicated by the single node in the final layer.

So, what we want to do is construct two-dimensional tensors for the X data and the Y data, with shapes of [0,5] and [0,1] respectively.  Note that the first term of the shape should always be 0 for record-oriented tensors.

Now let's assume that our source data is in a record format with the X values and Y values in the same record.  We define our input format below:

datFormat := RECORD
   UNSIGNED id;
   REAL X1;
   REAL X2;
   REAL X3;
   REAL X4;
   REAL X5;
   REAL Y;
END;

Now assume that myTrainData is a dataset of datFormat records representing the training data.  I can format my tensor data as follows:

// Use NORMALIZE to change each record into 5 tensor data records
myXTensDat := NORMALIZE(myTrainData, 5,
                      TRANSFORM(Tensor.R4.TensData,
                             SELF.indexes := [LEFT.id, COUNTER],
                              SELF.value := MAP(COUNTER = 1 => LEFT.X1,
                                                COUNTER = 2 => LEFT.X2,
                                                COUNTER = 3 => LEFT.X3,
                                                COUNTER = 4 => LEFT.X4,
                                                COUNTER = 5 => LEFT.X5)));

// Since Y only has 1 tensor cell per source record, we just use PROJECT.  We could, of course,
// have used NORMALIZE here with a count of 1, but PROJECT makes more sense.
myYTensDat:= PROJECT(myTrainData,
                      TRANSFORM(Tensor.R4.TensData,
                             SELF.indexes := [LEFT.id, 1],
                             SELF.value := LEFT.Y));

// Now we convert the Tensor Data to a Tensor dataset by calling MakeTensor()
myXTensor := Tensor.R4.MakeTensor([0,5], myXTensDat);
myYTensor := Tensor.R4.MakeTensor([0,1], myYTensDat);

Note that the record ids must start with 1 and be sequential (i.e. 1-N).  If the original data had non-sequential ids, we would have used a PROJECT to make the ids sequential before doing the above.
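For example, a minimal sketch of such a re-sequencing step, using PROJECT's COUNTER to assign ids 1-N in dataset order (myRawData is a hypothetical dataset of datFormat records with arbitrary ids):

// Renumber the records so the ids run sequentially from 1 to N
myTrainData := PROJECT(myRawData,
                      TRANSFORM(datFormat,
                             SELF.id := COUNTER,
                             SELF := LEFT));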

Tensor Example 2

Let’s try a somewhat more complex example.  Suppose we have a neural network that expects an input shape of [5, 20].  Let’s say the input data represents a time series with five features and 20 time steps for each feature.  Suppose further that our output is one of three classes:  0, 1, or 2.  This would typically be modeled as 3 outputs each containing the probability of one of the classes.

Our source data might be formatted as follows:

datFormat := RECORD
   UNSIGNED id; // Sequential record id (1-N), as required for Record-oriented Tensors
   SET OF REAL X; // 100 values -- 20 per feature
   UNSIGNED Y; // The class number 0-2
END;

The following code could build the X and Y tensors:

// Each source record becomes 100 X tensor cells (5 features x 20 time-steps)

myXTensData := NORMALIZE(myTrainData, 100,
                     TRANSFORM(Tensor.R4.TensData,
                       SELF.indexes := [LEFT.id, (COUNTER-1) DIV 20 + 1, (COUNTER-1) % 20 +1],
                        SELF.value := LEFT.X[COUNTER]));
// Each source record becomes 3 Y tensor cells (one per class value)
// using One-Hot encoding.
// But only the record associated with the class (i.e. value of Y)
// will be 1.  The others will be zero.  Since the TensData format
// is sparse, we just skip the zero cells.
myYTensData := NORMALIZE(myTrainData, 3,
                     TRANSFORM(Tensor.R4.TensData,
                       SELF.indexes := [LEFT.id, COUNTER],
                       SELF.value := IF(LEFT.Y != COUNTER - 1, SKIP, 1)));

Alternatively, we could have taken advantage of one of the GNN utilities to format our Y data and handled it as follows:

IMPORT GNN.Utils;
myYTDTemp := PROJECT(myTrainData,
                      TRANSFORM(Tensor.R4.TensData,
                        SELF.indexes := [LEFT.id, 1], // Only one Y feature -- the class number
                        SELF.value := LEFT.Y));
myYTensData := Utils.ToOneHot(myYTDTemp, 3); // 3 is the number of classes

Now when I use GNNI.Predict() to classify some new samples, I will get back a tensor of shape [0,3].  I can use another utility function to convert the probabilities for each class to a single class value:

myPredictTensData := Tensor.R4.GetData(myPredictTensor); // Shape [0,3]
myPredictClassData := Utils.FromOneHot(myPredictTensData); // Shape [0,1]

Using GNN with NumericField data

If you have used other HPCC Systems Machine Learning Library bundles, you may be familiar with the NumericField data format.  This is the standard way of handling two-dimensional ML data.  GNNI provides methods to work directly with NumericField data for compatibility.  ML_Core provides ToField() and FromField() macros to easily convert record-oriented data to the cell-based NumericField format.  If your input and output data are two-dimensional then you can optionally use the NumericField functions within GNNI.  These functions use NumericField matrices rather than Tensors as input, and end in “NF”.  For example, you could use FitNF() rather than Fit().  All of the parameters are the same except input and output is in NumericField format rather than Tensor format.  For details on using NumericField data, see Using HPCC Systems Machine Learning.
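As a minimal sketch (assuming trainXRecs and trainYRecs are record-oriented datasets with a leading id field; the variable names here are purely illustrative), the NumericField path might look like this:

IMPORT ML_Core;
IMPORT GNN.GNNI;

// Convert record-oriented data to the cell-based NumericField format.
// ToField is a macro: it defines trainXNF and trainYNF as new attributes.
ML_Core.ToField(trainXRecs, trainXNF);
ML_Core.ToField(trainYRecs, trainYNF);

// FitNF takes the same parameters as Fit(), but with NumericField
// matrices in place of Tensors.
mod2 := GNNI.FitNF(mod, trainXNF, trainYNF, batchSize := 128, numEpochs := 5);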

Installing Tensorflow

Before running GNN, Tensorflow must be installed on each server running HPCC Systems software.  The instructions for installing on Ubuntu are detailed below.  Other Linux systems are similar, but may use slightly different commands.  As we gain experience with other platforms, we will add instructions for those.

We recommend installing Tensorflow universally for all users.  Otherwise, Tensorflow will have to be explicitly installed for the “hpcc” user.  We do not recommend the use of virtual environments as that complicates the process of making it available to the HPCC Systems Platform.

Installing Tensorflow on Ubuntu

Refresh APT repository

 sudo apt update

Install python3 if not already installed

 sudo apt install python3

Install pip3  (Python 3 package installer)

 sudo apt install python3-pip

Install tensorflow for all users.  This is the recommended approach, since it needs to be available to the “hpcc” user as well as the current user.  The -H sudo option is necessary in order to have it installed globally.

 sudo -H pip3 install tensorflow

Additional Resources

There are a number of additional tests within the GNN distribution that are written as tutorials for handling different types of neural networks and using various types of data as input.

Additionally, the code contains a good amount of overview documentation as well as parameter descriptions for each attribute.  The GNN code base is available here.

Conclusion

GNN gives the ECL programmer the ability to perform arbitrarily complex deep-learning tasks using the power of Keras and Tensorflow.  These neural-network-based methods are the preferred way of analyzing challenging types of data such as images, video, and time-series.