This depends on Metal.framework.
The MPSRNNMatrixTrainingLayer specifies a recurrent neural network layer
for training on MPSMatrices.
An MPSRNNMatrixTrainingLayer is initialized using an MPSRNNDescriptor, which
further specifies the recurrent network layer.
The input and output vectors in encode calls are stored as rows of the input
and output matrices. MPSRNNMatrixTrainingLayer supports matrices with a
decreasing number of rows: the row indices identify the different sequences,
which may be of different lengths. For example, if we have three sequences
( x1, x2, x3 ), ( y1, y2, y3, y4 ) and ( z1, z2 )
of vectors xi, yi and zi, then these can be inserted together as a batch to
the sequence encoding kernel by using the matrices:
     ( y1 )       ( y2 )       ( y3 )       ( y4 )
m1 = ( x1 ), m2 = ( x2 ), m3 = ( x3 ), m4 =
     ( z1 )       ( z2 )
Here m1 and m2 have three rows, m3 has the two rows ( y3 ) and ( x3 ), and m4
has the single row ( y4 ).
The gradient computation pass is then achieved by passing the corresponding
gradient sequences from the previous layer, ( dx1, dx2, dx3 ), ( dy1, dy2,
dy3, dy4 ) and ( dz1, dz2 ), as the matrices
      ( dy1 )        ( dy2 )        ( dy3 )        ( dy4 )
dm1 = ( dx1 ), dm2 = ( dx2 ), dm3 = ( dx3 ), dm4 =
      ( dz1 )        ( dz2 )
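The batching scheme described above can be sketched in plain Python. This is only an illustration of the row layout, not MPS code: the function name pack_sequences is hypothetical, and strings stand in for the vectors. It assumes the sequences are ordered longest first, so that the row count never increases from one time step to the next.

```python
def pack_sequences(sequences):
    """Pack variable-length sequences into one matrix (list of rows) per time step."""
    # Longest sequence first, so later matrices can only lose rows at the bottom.
    ordered = sorted(sequences, key=len, reverse=True)
    steps = len(ordered[0]) if ordered else 0
    matrices = []
    for t in range(steps):
        # Only sequences that still have an element at time t contribute a row.
        matrices.append([seq[t] for seq in ordered if len(seq) > t])
    return matrices


x = ["x1", "x2", "x3"]
y = ["y1", "y2", "y3", "y4"]
z = ["z1", "z2"]

m = pack_sequences([x, y, z])
for t, rows in enumerate(m, start=1):
    print(f"m{t} =", rows)
```

The same packing applies to the gradient sequences, so that row r of dm_t holds the gradient for row r of m_t.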
The mathematical operation described in the linear transformations
of MPSRNNSingleGateDescriptor, MPSLSTMDescriptor and
MPSGRUDescriptor is y^T = W x^T <=> y = x W^T, where x is the
matrix containing the input vectors as rows, y is the matrix containing the
output vectors as rows and W is the weight matrix.
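As a small check of the row-vector convention, the following plain Python sketch computes both forms of the product and confirms that they agree. The helper names matmul and transpose and the example values are made up for illustration; nothing here calls into MPS.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


# W has one row per output feature and one column per input feature.
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
# x holds two input vectors, stored as rows.
x = [[1.0, 0.0, -1.0],
     [2.0, 1.0, 0.0]]

y_rows = matmul(x, transpose(W))              # y = x W^T
y_cols = transpose(matmul(W, transpose(x)))   # (y^T = W x^T), transposed back
```

Both y_rows and y_cols hold one output vector per row, matching the layout used for the destination matrices.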
- (nonnull instancetype) copyWithZone: (nullable NSZone *)
zone device: (nullable id< MTLDevice >) device
Make a copy of this kernel for a new device.
See also:
MPSKernel
Parameters:
zone The NSZone in which to allocate the object
device The device for the new MPSKernel. If nil, then use
self.device.
Returns:
a pointer to a copy of this MPSKernel. This will
fail, returning nil if the device is not supported. Devices must be
MTLFeatureSet_iOS_GPUFamily2_v1 or later.
Reimplemented from MPSKernel.
- (void) createTemporaryWeightGradientMatrices:
(NSMutableArray< MPSMatrix * > *__nonnull)
matricesOut dataType: (MPSDataType) dataType commandBuffer: (nonnull id<
MTLCommandBuffer >) commandBuffer
Same as createWeightGradientMatrices, but the matrices will be
temporary with readCount = 1, which means that they become invalid after the
first encode call that reads them. Note also that because the matrices are
temporary, their storage mode will be private, which means that you can only
access the data using a kernel on the GPU.
Parameters:
matricesOut An array where the newly created matrices will be stored. The
matrices will be initialized to zero.
dataType Data type for the entries. Currently MPSDataTypeFloat32 and
MPSDataTypeFloat16 are supported.
commandBuffer The command buffer that the temporary matrices will live
on.
- (void) createWeightGradientMatrices: (NSMutableArray<
MPSMatrix * > *__nonnull) matricesOut dataType: (MPSDataType)
dataType
Initializes a set of matrices that can be used in training for
weight and bias gradient outputs in
encodeGradientSequenceToCommandBuffer. Can also be used to easily create
auxiliary matrices, for example for ADAM and other advanced optimization
schemes. The layout and number of matrices is the same as for the outputs of
initWithDevice, but the data type may differ. NOTE: These matrices cannot be
used as weight matrices in the forward and backward encode calls; matrices
from initWithDevice() or createWeightMatrices() should be used instead.
Parameters:
matricesOut An array where the newly created matrices will be stored. The
matrices will be initialized to zero.
dataType Data type for the entries. Currently MPSDataTypeFloat32 and
MPSDataTypeFloat16 are supported.
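To illustrate why an optimizer such as ADAM needs auxiliary matrices with the same layout as the weights, here is a minimal plain Python sketch of one ADAM step. It is not MPS code: zeros_like and adam_step are hypothetical helpers, each matrix is modeled as a list of rows, and the hyperparameter defaults are the commonly used ones rather than anything prescribed by this class.

```python
import math


def zeros_like(matrices):
    """Create zero-initialized matrices with the same layout as the given set."""
    return [[[0.0] * len(row) for row in m] for m in matrices]


def adam_step(weights, grads, first_moment, second_moment, t,
              lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Apply one ADAM update in place; t is the 1-based step count."""
    for w, g, m, v in zip(weights, grads, first_moment, second_moment):
        for i in range(len(w)):
            for j in range(len(w[i])):
                m[i][j] = b1 * m[i][j] + (1 - b1) * g[i][j]
                v[i][j] = b2 * v[i][j] + (1 - b2) * g[i][j] ** 2
                m_hat = m[i][j] / (1 - b1 ** t)   # bias-corrected mean
                v_hat = v[i][j] / (1 - b2 ** t)   # bias-corrected variance
                w[i][j] -= lr * m_hat / (math.sqrt(v_hat) + eps)


weights = [[[1.0, -1.0]]]        # one 1x2 weight matrix
grads = [[[0.5, -0.5]]]          # matching gradient matrix
first_moment = zeros_like(weights)
second_moment = zeros_like(weights)
adam_step(weights, grads, first_moment, second_moment, t=1)
```

The two moment sets play the role of the auxiliary matrices: one zero-initialized matrix per weight matrix, with identical dimensions.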
- (void) createWeightMatrices: (NSMutableArray<
MPSMatrix * > *__nonnull) matricesOut
Initializes a set of matrices that can be used in training for
weight and bias matrices in the forward and backward passes. The layout,
data type and number of matrices is the same as for the outputs of
initWithDevice.
Parameters:
matricesOut An array where the newly created matrices will be stored. The
matrices will be initialized to zero.
- (void) encodeCopyWeightsToCommandBuffer: (nonnull id<
MTLCommandBuffer >) commandBuffer weights: (NSArray< MPSMatrix * >
*__nonnull) weights matrixId: (MPSRNNMatrixId) matrixId matrix: (MPSMatrix
*__nonnull) matrix copyFromWeightsToMatrix: (BOOL) copyFromWeightsToMatrix
matrixOffset: (MTLOrigin) matrixOffset
Encode a copy kernel that copies one matrix from the trainable
weight set to a matrix with standard layout, where the column index is the
input feature channel index (in forward direction) and row index is the
output feature channel index.
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded filter
weights An array of weights from initWithDevice or createWeightMatrices.
matrixId Which matrix to copy. Has to be a valid ID based on the inputs
defined in the rnnDescriptor of initWithDevice.
matrix The destination or source matrix that is used in the copy.
copyFromWeightsToMatrix If YES, the copy direction is from the set of
trainable 'weights' to 'matrix'; otherwise the copy is done from 'matrix' to
'weights'.
matrixOffset A (valid) offset into matrix to be applied to the
copy operation.
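The semantics of the copy direction flag and the offset can be pictured with a toy model in plain Python. This is an assumption-laden sketch, not the MPS kernel: the real trainable weight set has an opaque layout, whereas here a single weight matrix is modeled as a list of rows in the standard layout (row = output feature channel, column = input feature channel), and copy_weights with its row_offset and col_offset arguments is a hypothetical stand-in for the encoded copy.

```python
def copy_weights(weight, matrix, copy_from_weights_to_matrix,
                 row_offset=0, col_offset=0):
    """Copy between a weight matrix and a region of a larger matrix, in place."""
    for r, row in enumerate(weight):
        for c in range(len(row)):
            if copy_from_weights_to_matrix:
                # Export: weights -> matrix, starting at the given offset.
                matrix[row_offset + r][col_offset + c] = weight[r][c]
            else:
                # Import: matrix -> weights, reading from the given offset.
                weight[r][c] = matrix[row_offset + r][col_offset + c]


weight = [[1, 2],
          [3, 4]]                       # 2 outputs x 2 inputs
matrix = [[0] * 3 for _ in range(3)]    # larger staging matrix

copy_weights(weight, matrix, True, row_offset=1, col_offset=1)
exported = [row[:] for row in matrix]   # snapshot after the export

matrix[1][1] = 9                        # edit the staged copy
copy_weights(weight, matrix, False, row_offset=1, col_offset=1)
```

Exporting with the flag set and importing with it cleared gives a round trip, which is the pattern suggested for serializing and restoring weights.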
- (void) encodeForwardSequenceToCommandBuffer: (nonnull id<
MTLCommandBuffer >) commandBuffer sourceMatrices: (NSArray< MPSMatrix * >
*__nonnull) sourceMatrices destinationMatrices: (NSArray< MPSMatrix * >
*__nonnull) destinationMatrices trainingStates: (NSMutableArray<
MPSRNNMatrixTrainingState * > *__nonnull) trainingStates weights:
(NSArray< MPSMatrix * > *__nonnull) weights
Encode an MPSRNNMatrixTrainingLayer forward pass kernel for
a sequence of inputs into a command buffer.
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded filter
sourceMatrices An array of valid MPSMatrix objects containing the
sequence of source matrices.
destinationMatrices An array of valid MPSMatrix objects to be overwritten by
the result matrix sequence. destinationMatrices may not alias sourceMatrices.
trainingStates An array containing the training states to be passed to
the gradient computation encode function.
weights An array of valid MPSMatrix objects containing the weights. Should
be the array that was produced either by initWithDevice or
createWeightMatrices.
- (void) encodeForwardSequenceToCommandBuffer: (nonnull id<
MTLCommandBuffer >) commandBuffer sourceMatrices: (NSArray< MPSMatrix * >
*__nonnull) sourceMatrices sourceOffsets: (NSUInteger *__nullable)
sourceOffsets destinationMatrices: (NSArray< MPSMatrix * > *__nonnull)
destinationMatrices destinationOffsets: (NSUInteger *__nullable)
destinationOffsets trainingStates: (NSMutableArray<
MPSRNNMatrixTrainingState * > *__nonnull) trainingStates
recurrentInputState: (MPSRNNRecurrentMatrixState *__nullable)
recurrentInputState recurrentOutputStates: (NSMutableArray<
MPSRNNRecurrentMatrixState * > *__nullable) recurrentOutputStates
weights: (NSArray< MPSMatrix * > *__nonnull) weights
Encode an MPSRNNMatrixTrainingLayer forward pass kernel for
a sequence of inputs into a command buffer.
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded filter
sourceMatrices An array of valid MPSMatrix objects containing the
sequence of source matrices.
sourceOffsets An array of byte offsets into the sourceMatrices. If nil,
zeros are assumed; if not nil, it must contain an offset for every matrix in
sourceMatrices.
destinationMatrices An array of valid MPSMatrix objects to be overwritten by
the result matrix sequence. destinationMatrices may not alias sourceMatrices.
destinationOffsets An array of byte offsets into the destinationMatrices.
If nil, zeros are assumed; if not nil, it must contain an offset for every
matrix in destinationMatrices.
trainingStates An array containing the training states to be passed to
the gradient computation encode function.
recurrentInputState An optional state containing the output matrices and
memory cells (for LSTMs) of the layer obtained from the previous input
matrices in a sequence of inputs. Has to be the output of a previous call to
this function or nil (assumed zero).
recurrentOutputStates An array that will be appended with the recurrent
output states. May not be nil. If recurrentOutputIsTemporary is YES, then
all returned recurrent states will be temporary. See MPSState:isTemporary.
weights An array of valid MPSMatrix objects containing the weights. Should
be the array that was produced either by initWithDevice or
createWeightMatrices.
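The role of recurrentInputState when a long sequence is encoded in several calls can be sketched with a toy scalar recurrence in plain Python. This is not the MPS kernel: forward_sequence, the tanh update and the scalar weights w and u are illustrative assumptions, and the sketch assumes one recurrent output state is appended per input step.

```python
import math


def forward_sequence(xs, w, u, recurrent_input_state=None):
    """Run h_t = tanh(w * x_t + u * h_{t-1}) over xs; return outputs and states."""
    # A missing input state is treated as zero, as described for nil above.
    h = 0.0 if recurrent_input_state is None else recurrent_input_state
    outputs, recurrent_output_states = [], []
    for x in xs:
        h = math.tanh(w * x + u * h)
        outputs.append(h)
        recurrent_output_states.append(h)   # one state per step (assumption)
    return outputs, recurrent_output_states


xs = [0.5, -1.0, 0.25, 2.0]

# Whole sequence in one call.
full, _ = forward_sequence(xs, w=0.8, u=0.3)

# Same sequence split into two subsequences, chained through the state.
first, states = forward_sequence(xs[:2], w=0.8, u=0.3)
second, _ = forward_sequence(xs[2:], w=0.8, u=0.3,
                             recurrent_input_state=states[-1])
```

Passing the last output state of the first call as the input state of the second reproduces the single-call result; omitting it would restart the second subsequence from zero.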
- (void) encodeGradientSequenceToCommandBuffer: (nonnull id<
MTLCommandBuffer >) commandBuffer forwardSources: (NSArray< MPSMatrix * >
*__nonnull) forwardSources forwardSourceOffsets: (NSUInteger *__nullable)
forwardSourceOffsets sourceGradients: (NSArray< MPSMatrix * > *__nonnull)
sourceGradients sourceGradientOffsets: (NSUInteger *__nullable)
sourceGradientOffsets destinationGradients: (NSArray< MPSMatrix * >
*__nullable) destinationGradients destinationOffsets: (NSUInteger
*__nullable) destinationOffsets weightGradients: (NSArray< MPSMatrix * >
*__nullable) weightGradients trainingStates: (NSArray<
MPSRNNMatrixTrainingState * > *__nonnull) trainingStates
recurrentInputState: (MPSRNNRecurrentMatrixState *__nullable)
recurrentInputState recurrentOutputStates: (NSMutableArray<
MPSRNNRecurrentMatrixState * > *__nullable) recurrentOutputStates
weights: (NSArray< MPSMatrix * > *__nonnull) weights
Encode an MPSRNNMatrixTrainingLayer gradient pass kernel
for a sequence of input gradients into a command buffer. NOTE: The time
sequence indexing follows the array indexing in the inputs:
sourceGradients[0] has to contain the gradients corresponding to the first
matrix in the forward pass corresponding to the current subsequence, which
is typically sourceMatrices[0].
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded filter
forwardSources An array of MPSMatrix objects containing the
sequence of source matrices of the forward pass.
forwardSourceOffsets An array of byte offsets into the forwardSources. If
nil, zeros are assumed; if not nil, it must contain an offset for every
matrix in forwardSources.
sourceGradients An array of valid MPSMatrix objects containing the
sequence of source gradient matrices.
sourceGradientOffsets An array of byte offsets into the sourceGradients.
If nil, zeros are assumed; if not nil, it must contain an offset for every
matrix in sourceGradients.
destinationGradients An array of valid MPSMatrix objects that will
receive the backpropagated gradients. May be nil if not needed (for example
for the first layer in a graph).
destinationOffsets An array of byte offsets into the
destinationGradients. If nil, zeros are assumed; if not nil, it must contain
an offset for every matrix in destinationGradients.
weightGradients An array of valid MPSMatrix objects that will
receive the gradient with respect to the weights and biases of the layer.
Should be the array that was produced either by initWithDevice or
createWeightMatrices. May be nil, in which case the gradients for
the weights are not computed.
trainingStates An array containing the training states from the forward
pass. The array must contain the states corresponding to the input gradients
in sourceGradients.
recurrentInputState An optional state containing the output matrices and
memory cells (for LSTMs) of the layer obtained from the previous input
gradients in a sequence of inputs. Has to be the output of a previous call to
this function or nil (assumed zero).
recurrentOutputStates An array that will be appended with the recurrent
output states. Can be nil. If recurrentOutputIsTemporary is YES, then all
returned recurrent states will be temporary. See MPSState:isTemporary.
weights An array of valid MPSMatrix objects containing the weights. Should
be the array that was produced either by initWithDevice or
createWeightMatrices.
- (void) encodeGradientSequenceToCommandBuffer: (nonnull id<
MTLCommandBuffer >) commandBuffer forwardSources: (NSArray< MPSMatrix * >
*__nonnull) forwardSources sourceGradients: (NSArray< MPSMatrix * >
*__nonnull) sourceGradients destinationGradients: (NSArray< MPSMatrix * >
*__nullable) destinationGradients weightGradients: (NSArray< MPSMatrix * >
*__nullable) weightGradients trainingStates: (NSArray<
MPSRNNMatrixTrainingState * > *__nonnull) trainingStates weights:
(NSArray< MPSMatrix * > *__nonnull) weights
Encode an MPSRNNMatrixTrainingLayer gradient pass kernel
for a sequence of input gradients into a command buffer. NOTE: The time
sequence indexing follows the array indexing in the inputs:
sourceGradients[0] has to contain the gradients corresponding to the first
matrix in the forward pass corresponding to the current subsequence, which
is typically sourceMatrices[0].
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded filter
forwardSources An array of MPSMatrix objects containing the
sequence of source matrices of the forward pass.
sourceGradients An array of MPSMatrix objects containing the
sequence of source gradient matrices.
destinationGradients An array of valid MPSMatrix objects that will
receive the backpropagated gradients. May be nil if not needed (for example
for the first layer in a graph).
weightGradients An array of valid MPSMatrix objects that will receive
the gradient with respect to the weights and biases of the layer. Should be
the array that was produced either by initWithDevice or
createWeightMatrices. May be nil, in which case the gradients for
the weights are not computed. NOTE: The weight gradients are accumulated on
top of existing values.
trainingStates An array containing the training states from the forward
pass. The array must contain the states corresponding to the input gradients
in sourceGradients.
weights An array of valid MPSMatrix objects containing the
weights. Should be the array that was produced either by initWithDevice or
createWeightMatrices.
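The note that weight gradients are accumulated on top of existing values, together with the time-index pairing of sourceGradients and forwardSources, can be illustrated with a plain Python sketch. It is not the MPS kernel: it models only a single linear transformation y = x W^T, for which the weight gradient is the sum over time of dy^T x, and accumulate_weight_gradient is a hypothetical helper.

```python
def accumulate_weight_gradient(weight_grad, source_gradients, forward_sources):
    """Add the sum over time of dy^T x onto weight_grad, in place."""
    # Index t in source_gradients pairs with index t in forward_sources.
    for dy, x in zip(source_gradients, forward_sources):
        # One row per sequence in the batch.
        for dy_row, x_row in zip(dy, x):
            for o, g in enumerate(dy_row):
                for i, v in enumerate(x_row):
                    weight_grad[o][i] += g * v


# Two time steps, batch of one sequence, 2 input features, 1 output feature.
forward_sources = [[[1.0, 2.0]], [[3.0, 4.0]]]
source_gradients = [[[1.0]], [[0.5]]]

weight_grad = [[0.0, 0.0]]              # zero-initialized, 1 output x 2 inputs
accumulate_weight_gradient(weight_grad, source_gradients, forward_sources)
after_first_pass = [row[:] for row in weight_grad]

# A second pass without re-zeroing adds on top of the existing values.
accumulate_weight_gradient(weight_grad, source_gradients, forward_sources)
after_second_pass = [row[:] for row in weight_grad]
```

In this toy model, gradient matrices that are reused across independent training steps would need to be reset to zero between steps to avoid mixing contributions.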
- (nullable instancetype) initWithCoder: (NSCoder
*__nonnull) aDecoder device: (nonnull id< MTLDevice >) device
NSSecureCoding compatibility. See
MPSKernel::initWithCoder.
Parameters:
aDecoder The NSCoder subclass with your serialized
MPSRNNMatrixTrainingLayer
device The MTLDevice on which to make the
MPSRNNMatrixTrainingLayer
Returns:
A new MPSRNNMatrixTrainingLayer object, or
nil if failure.
Reimplemented from MPSKernel.
- (nonnull instancetype) initWithDevice: (nonnull id<
MTLDevice >) device
Standard init with default properties per filter type.
Parameters:
device The device that the filter will be used on.
May not be NULL.
Returns:
a pointer to the newly initialized object. This will
fail, returning nil if the device is not supported. Devices must be
MTLFeatureSet_iOS_GPUFamily2_v1 or later.
Reimplemented from MPSKernel.
- (nonnull instancetype) initWithDevice: (nonnull id<
MTLDevice >) device rnnDescriptor: (nonnull const MPSRNNDescriptor *)
rnnDescriptor trainableWeights: (NSMutableArray< MPSMatrix * > *__nonnull)
trainableWeights
Initializes a linear (fully connected) RNN kernel for training.
Parameters:
device The MTLDevice on which this MPSRNNMatrixTrainingLayer filter will
be used.
rnnDescriptor The descriptor that defines the RNN layer.
trainableWeights An array in which to store the weights of the layer as
MPSMatrices. NOTE: The exact layout and number of matrices may vary between
platforms, so you should not save out these weights directly; instead, use
encodeCopyWeightsToCommandBuffer to identify the weights and biases for
serialization. Typically you should pass an initialized but empty
NSMutableArray here. When this function returns, the array will have been
populated with the weight matrices needed in the encode calls, using
initial values from the data sources in rnnDescriptor.
Returns:
A valid MPSRNNMatrixTrainingLayer object or
nil, if failure.