MPSCNNConvolutionGradientState(3) MetalPerformanceShaders.framework MPSCNNConvolutionGradientState(3)

MPSCNNConvolutionGradientState

#import <MPSCNNConvolution.h>

Inherits MPSNNGradientState, and conforms to <MPSImageSizeEncodingState>.


__nonnull id< MTLBuffer > gradientForWeights
__nonnull id< MTLBuffer > gradientForBiases
MPSCNNConvolution * convolution

The MPSCNNConvolutionGradientState is returned by the resultStateForSourceImage:sourceStates:destinationImage: method on the MPSCNNConvolution object. Note that resultStateForSourceImage:sourceStates:destinationImage: creates the object on the autorelease pool. The state is consumed by MPSCNNConvolutionGradient. It is also used by the MPSCNNConvolutionTranspose encode call that returns an MPSImage on the left-hand side, in order to correctly size the destination. Note that state objects are not usable across batches, i.e. when a batch is done you should discard the state object and create a new one for the next batch.
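A minimal sketch of this flow (assuming conv is an MPSCNNConvolution, convGradient is an MPSCNNConvolutionGradient created from the same data source, source is the forward-pass input image, and lossGradient is the incoming gradient image; the encode variants shown are one way, among several, to obtain and consume the state):

    id<MTLCommandBuffer> cmdBuf = [commandQueue commandBuffer];

    // Forward pass: ask the convolution to also produce its gradient state.
    MPSState *gradientState = nil;
    MPSImage *forward = [conv encodeToCommandBuffer: cmdBuf
                                        sourceImage: source
                                   destinationState: &gradientState
                        destinationStateIsTemporary: NO];

    // Backward pass: the same state is consumed by the gradient kernel, which
    // fills gradientForWeights and gradientForBiases.
    MPSImage *inputGradient = [convGradient encodeToCommandBuffer: cmdBuf
                                                   sourceGradient: lossGradient
                                                      sourceImage: source
                                                    gradientState: gradientState];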

This state exposes the gradients with respect to the weights and biases, as computed by the MPSCNNConvolutionGradient kernel, as Metal buffers to be used during the weights and biases update. The standard weights and biases update formula is:


weights(t+1) = f(weights(t), gradientForWeights(t)) and
biases(t+1) = f(biases(t), gradientForBiases(t)),

where weights(t)/biases(t) are the weights and biases at step t, as provided by the data source provider used to create the MPSCNNConvolution and MPSCNNConvolutionGradient objects, and f is an update rule of the application's choosing. There are several ways the user can apply this update, described in 1) through 3) below.
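For illustration only, plain stochastic gradient descent with learning rate eta corresponds to choosing f as:

weights(t+1) = weights(t) - eta * gradientForWeights(t) and
biases(t+1) = biases(t) - eta * gradientForBiases(t).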

1) For checkpointing, i.e. updating the weights/biases and storing them: once the command buffer on which MPSCNNConvolutionGradient is enqueued has completed (e.g. in the command buffer completion callback), the application can simply use float* delta_w = (float*)((char*)[gradientForWeights contents]); and float* delta_b = (float*)((char*)[gradientForBiases contents]); to update the weights and biases in the data provider directly. Alternatively, the application can provide a Metal kernel that reads from the gradientForWeights and gradientForBiases buffers and from a buffer created using data provided by the data source, performs whatever update it likes, and then reads back the updated weights/biases and stores them to the data source. Note that the lifetime of the gradientForWeights and gradientForBiases buffers is the same as that of the MPSCNNConvolutionGradientState, so it is the application's responsibility to make sure the buffers are alive (retained) while the update kernel is running, if the command buffer does not retain them. Also, in order to guarantee that the buffers are correctly synchronized for CPU-side access, it is the application's responsibility to call [gradientState synchronizeOnCommandBuffer:] before accessing data from them.
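A minimal sketch of this checkpointing path, assuming gradientState is the MPSCNNConvolutionGradientState produced by the gradient pass, cmdBuf is the training command buffer, and learningRate, weightCount, biasCount, cpuWeights and cpuBiases are hypothetical application-side values owned by the data source:

    // Make the gradient buffers visible to the CPU before the completion handler
    // reads them (the synchronizeOnCommandBuffer: call required above).
    [gradientState synchronizeOnCommandBuffer: cmdBuf];

    [cmdBuf addCompletedHandler: ^(id<MTLCommandBuffer> _Nonnull buffer) {
        float *delta_w = (float *)gradientState.gradientForWeights.contents;
        float *delta_b = (float *)gradientState.gradientForBiases.contents;

        // Illustrative SGD-style update applied to CPU-side copies held by the data source.
        for (NSUInteger i = 0; i < weightCount; i++) { cpuWeights[i] -= learningRate * delta_w[i]; }
        for (NSUInteger i = 0; i < biasCount;  i++) { cpuBiases[i]  -= learningRate * delta_b[i]; }
    }];
    [cmdBuf commit];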

2) For a CPU-side update: once the weights and biases in the data source provider have been updated as above, the original MPSCNNConvolution and MPSCNNConvolutionGradient objects need to be updated with the new weights and biases by calling the -(void) reloadWeightsAndBiasesFromDataSource method. Again, the application needs to call [gradientState synchronizeOnCommandBuffer:] before touching the data on the CPU side.
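Continuing the sketch above, with conv and convGradient being the MPSCNNConvolution and MPSCNNConvolutionGradient objects created from that data source (names illustrative):

    // After the data source provider has been updated on the CPU, push the new
    // weights/biases into the existing kernel objects before encoding the next iteration.
    [conv reloadWeightsAndBiasesFromDataSource];
    [convGradient reloadWeightsAndBiasesFromDataSource];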

3) The above CPU-side update requires the command buffer to be complete. If the application does not want to update its data source provider object and would prefer to directly enqueue an update of the internal MPSCNNConvolution and MPSCNNConvolutionGradient weights/biases buffers on the GPU, without CPU-side involvement, it needs to do the following (see the sketch after this list):
i) get the gradientForWeights and gradientForBiases buffers from this gradient state object and set them as sources of the update kernel,
ii) create a temporary buffer, dest, of the same size and set it as the destination of the update kernel,
iii) enqueue the update kernel on the command buffer,
iv) call reloadWeightsAndBiasesWithCommandBuffer:dest:weightsOffset:biasesOffset on the MPSCNNConvolution and MPSCNNConvolutionGradient objects. This reloads the weights produced by the application's update kernel in dest on the GPU, without CPU-side involvement.
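A rough sketch of this GPU-only path. It assumes updatePipeline is an application-provided compute pipeline state whose kernel reads the gradient buffers (plus whatever current-weights buffer the application binds) and writes the updated values into dest; buffer indices, sizes, offsets and thread counts are illustrative, and the reload selector is used exactly as named above:

    id<MTLBuffer> dest = [device newBufferWithLength: weightsAndBiasesSizeInBytes
                                             options: MTLResourceStorageModePrivate];

    id<MTLComputeCommandEncoder> enc = [cmdBuf computeCommandEncoder];
    [enc setComputePipelineState: updatePipeline];
    [enc setBuffer: gradientState.gradientForWeights offset: 0 atIndex: 0];
    [enc setBuffer: gradientState.gradientForBiases  offset: 0 atIndex: 1];
    [enc setBuffer: dest                             offset: 0 atIndex: 2];
    [enc dispatchThreadgroups: threadgroupCount threadsPerThreadgroup: threadsPerGroup];
    [enc endEncoding];

    // Hand the freshly written values to the convolution objects without a CPU round trip.
    [conv reloadWeightsAndBiasesWithCommandBuffer: cmdBuf
                                             dest: dest
                                    weightsOffset: 0
                                     biasesOffset: biasesOffsetInDest];
    [convGradient reloadWeightsAndBiasesWithCommandBuffer: cmdBuf
                                                     dest: dest
                                            weightsOffset: 0
                                             biasesOffset: biasesOffsetInDest];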

- convolution [read], [nonatomic], [retain]

The convolution filter that produced the state.

- gradientForBiases [read], [nonatomic], [assign]

A buffer that contains the loss function gradients with respect to biases.

- gradientForWeights [read], [nonatomic], [assign]

A buffer that contains the loss function gradients with respect to the weights. Each value in the buffer is a float. The layout of the gradients with respect to the weights is the same as the weights layout provided by the data source, i.e. it can be interpreted as the 4D array


gradientForWeights[outputFeatureChannels][kernelHeight][kernelWidth][inputFeatureChannels/groups]
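For illustration, the flat index of the gradient for output channel o, kernel row ky, kernel column kx and input channel c within its group (all names hypothetical) under this layout can be computed as:

    // Illustrative index math for the [o][ky][kx][c] layout described above.
    NSUInteger icPerGroup = inputFeatureChannels / groups;
    NSUInteger idx = ((o * kernelHeight + ky) * kernelWidth + kx) * icPerGroup + c;
    float g = ((float *)gradientForWeights.contents)[idx];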

Generated automatically by Doxygen for MetalPerformanceShaders.framework from the source code.

Mon Jul 9 2018 Version MetalPerformanceShaders-119.3