- (void) encodeToCommandBuffer:(__nonnull id<MTLCommandBuffer>)commandBuffer
      batchNormalizationGradientState:(MPSCNNBatchNormalizationState *__nonnull)batchNormalizationGradientState
      batchNormalizationSourceState:(MPSCNNBatchNormalizationState *__nonnull)batchNormalizationSourceState
      inputMomentumVectors:(nullable NSArray<MPSVector *> *)inputMomentumVectors
      inputVelocityVectors:(nullable NSArray<MPSVector *> *)inputVelocityVectors
      resultState:(nonnull MPSCNNNormalizationGammaAndBetaState *)resultState
Encode an MPSNNOptimizerAdam object to a command buffer to
perform an out-of-place update.
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded kernel.
batchNormalizationGradientState A valid
MPSCNNBatchNormalizationState object which specifies the input state
with gradients for this update.
batchNormalizationSourceState A valid
MPSCNNBatchNormalizationState object which specifies the input state
with original gamma/beta for this update.
inputMomentumVectors An array of MPSVector objects which specifies the
gradient momentum vectors that will be updated and overwritten. Index 0
corresponds to gamma and index 1 to beta; the array may be of size 1, in
which case beta is not updated.
inputVelocityVectors An array of MPSVector objects which specifies the
gradient velocity vectors that will be updated and overwritten. Index 0
corresponds to gamma and index 1 to beta; the array may be of size 1, in
which case beta is not updated.
resultState A valid MPSCNNNormalizationGammaAndBetaState
object which specifies the resultValues state that will be updated and
overwritten.
The following operations are applied:
t = t + 1
lr[t] = learningRate * sqrt(1 - beta2^t) / (1 - beta1^t)
m[t] = beta1 * m[t-1] + (1 - beta1) * g
v[t] = beta2 * v[t-1] + (1 - beta2) * (g ^ 2)
variable = variable - lr[t] * m[t] / (sqrt(v[t]) + epsilon)
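The update above can be sketched for a single scalar parameter in plain Python. This is only an illustration of the math the kernel performs, not Apple's GPU implementation; the default values for learningRate, beta1, beta2, and epsilon below are the conventional Adam defaults, assumed here rather than taken from this class:

```python
import math

def adam_step(variable, g, m, v, t,
              learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam update for a scalar parameter, mirroring the operations
    listed above. Returns the updated (variable, m, v, t)."""
    t += 1
    # Bias-corrected learning rate for this time step.
    lr_t = learning_rate * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
    # First-moment (momentum) and second-moment (velocity) estimates.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * (g ** 2)
    variable = variable - lr_t * m / (math.sqrt(v) + epsilon)
    return variable, m, v, t

# With a constant positive gradient the parameter decreases.
w, m, v, t = 1.0, 0.0, 0.0, 0
w, m, v, t = adam_step(w, g=0.5, m=m, v=v, t=t)
```

Note that m and v start at zero, which is exactly the bias the sqrt(1 - beta2^t) / (1 - beta1^t) factor in lr[t] compensates for.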
- (void) encodeToCommandBuffer:(__nonnull id<MTLCommandBuffer>)commandBuffer
      batchNormalizationState:(MPSCNNBatchNormalizationState *__nonnull)batchNormalizationState
      inputMomentumVectors:(nullable NSArray<MPSVector *> *)inputMomentumVectors
      inputVelocityVectors:(nullable NSArray<MPSVector *> *)inputVelocityVectors
      resultState:(nonnull MPSCNNNormalizationGammaAndBetaState *)resultState
Encode an MPSNNOptimizerAdam object to a command buffer to
perform an out-of-place update.
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded kernel.
batchNormalizationState A valid
MPSCNNBatchNormalizationState object which specifies the input state
with gradients and original gamma/beta for this update.
inputMomentumVectors An array of MPSVector objects which specifies the
gradient momentum vectors that will be updated and overwritten. Index 0
corresponds to gamma and index 1 to beta; the array may be of size 1, in
which case beta is not updated.
inputVelocityVectors An array of MPSVector objects which specifies the
gradient velocity vectors that will be updated and overwritten. Index 0
corresponds to gamma and index 1 to beta; the array may be of size 1, in
which case beta is not updated.
resultState A valid MPSCNNNormalizationGammaAndBetaState
object which specifies the resultValues state that will be updated and
overwritten.
The following operations are applied:
t = t + 1
lr[t] = learningRate * sqrt(1 - beta2^t) / (1 - beta1^t)
m[t] = beta1 * m[t-1] + (1 - beta1) * g
v[t] = beta2 * v[t-1] + (1 - beta2) * (g ^ 2)
variable = variable - lr[t] * m[t] / (sqrt(v[t]) + epsilon)
- (void) encodeToCommandBuffer:(__nonnull id<MTLCommandBuffer>)commandBuffer
      convolutionGradientState:(MPSCNNConvolutionGradientState *__nonnull)convolutionGradientState
      convolutionSourceState:(MPSCNNConvolutionWeightsAndBiasesState *__nonnull)convolutionSourceState
      inputMomentumVectors:(nullable NSArray<MPSVector *> *)inputMomentumVectors
      inputVelocityVectors:(nullable NSArray<MPSVector *> *)inputVelocityVectors
      resultState:(nonnull MPSCNNConvolutionWeightsAndBiasesState *)resultState
Encode an MPSNNOptimizerAdam object to a command buffer to
perform an out-of-place update.
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded kernel.
convolutionGradientState A valid
MPSCNNConvolutionGradientState object which specifies the input state
with gradients for this update.
convolutionSourceState A valid
MPSCNNConvolutionWeightsAndBiasesState object which specifies the input
state with values to be updated.
inputMomentumVectors An array of MPSVector objects which specifies the
gradient momentum vectors that will be updated and overwritten. Index 0
corresponds to weights and index 1 to biases; the array may be of size 1,
in which case biases are not updated.
inputVelocityVectors An array of MPSVector objects which specifies the
gradient velocity vectors that will be updated and overwritten. Index 0
corresponds to weights and index 1 to biases; the array may be of size 1,
in which case biases are not updated.
resultState A valid MPSCNNConvolutionWeightsAndBiasesState
object which specifies the resultValues state that will be updated and
overwritten.
The following operations are applied:
t = t + 1
lr[t] = learningRate * sqrt(1 - beta2^t) / (1 - beta1^t)
m[t] = beta1 * m[t-1] + (1 - beta1) * g
v[t] = beta2 * v[t-1] + (1 - beta2) * (g ^ 2)
variable = variable - lr[t] * m[t] / (sqrt(v[t]) + epsilon)
- (void) encodeToCommandBuffer:(nonnull id<MTLCommandBuffer>)commandBuffer
      inputGradientVector:(nonnull MPSVector *)inputGradientVector
      inputValuesVector:(nonnull MPSVector *)inputValuesVector
      inputMomentumVector:(nonnull MPSVector *)inputMomentumVector
      inputVelocityVector:(nonnull MPSVector *)inputVelocityVector
      resultValuesVector:(nonnull MPSVector *)resultValuesVector
Encode an MPSNNOptimizerAdam object to a command buffer to
perform an out-of-place update.
Parameters:
commandBuffer A valid MTLCommandBuffer to
receive the encoded kernel.
inputGradientVector A valid MPSVector object which
specifies the input vector of gradients for this update.
inputValuesVector A valid MPSVector object which specifies
the input vector of values to be updated.
inputMomentumVector A valid MPSVector object which
specifies the gradient momentum vector which will be updated and overwritten.
inputVelocityVector A valid MPSVector object which
specifies the gradient velocity vector which will be updated and overwritten.
resultValuesVector A valid MPSVector object which specifies
the resultValues vector which will be updated and overwritten.
The following operations are applied:
t = t + 1
lr[t] = learningRate * sqrt(1 - beta2^t) / (1 - beta1^t)
m[t] = beta1 * m[t-1] + (1 - beta1) * g
v[t] = beta2 * v[t-1] + (1 - beta2) * (g ^ 2)
variable = variable - lr[t] * m[t] / (sqrt(v[t]) + epsilon)
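The lr[t] line in the update above is Adam's bias correction for the zero-initialized moment estimates: the factor sqrt(1 - beta2^t) / (1 - beta1^t) starts well below 1 and approaches 1 as t grows, so lr[t] approaches the configured learningRate. A small self-contained Python check illustrates this; the beta1/beta2 values are the conventional Adam defaults, assumed here for illustration:

```python
import math

beta1, beta2 = 0.9, 0.999

def correction(t):
    # Bias-correction factor applied to learningRate in the lr[t] equation.
    return math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)

early = correction(1)      # far from 1 at the first step
late = correction(10000)   # essentially 1 after many steps
```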
- (nonnull instancetype) initWithDevice: (nonnull id<
MTLDevice >) device
Standard init with default properties per filter type
Parameters:
device The device that the filter will be used on.
May not be NULL.
Returns:
A pointer to the newly initialized object. This will
fail, returning nil, if the device is not supported. Devices must be
MTLFeatureSet_iOS_GPUFamily2_v1 or later.
Reimplemented from MPSNNOptimizer.
- (nonnull instancetype) initWithDevice:(nonnull id<MTLDevice>)device
      beta1:(double)beta1
      beta2:(double)beta2
      epsilon:(float)epsilon
      timeStep:(NSUInteger)timeStep
      optimizerDescriptor:(nonnull MPSNNOptimizerDescriptor *)optimizerDescriptor
Full initialization for the Adam update
Parameters:
device The device on which the kernel will
execute.
beta1 The exponential decay rate for the first-moment (momentum) estimates.
beta2 The exponential decay rate for the second-moment (velocity) estimates.
epsilon A small value added to the denominator for numerical stability.
timeStep The time step at which values will start updating.
optimizerDescriptor The MPSNNOptimizerDescriptor specifying the common
optimizer properties to be applied.
Returns:
A valid MPSNNOptimizerAdam object, or nil on failure.
- (nonnull instancetype) initWithDevice:(nonnull id<MTLDevice>)device
      learningRate:(float)learningRate
Convenience initialization for the Adam update
Parameters:
commandBuffer-free; only the device and learning rate are required.
device The device on which the kernel will
execute.
learningRate The learning rate at which values are updated.
Returns:
A valid MPSNNOptimizerAdam object, or nil on failure.