Namespace Keras.Layers

Classes

Activation

Applies an activation function to an output.

ActivityRegularization

Layer that applies an update to the cost function based on input activity.

AlphaDropout

Applies Alpha Dropout to the input. Alpha Dropout is a variant of Dropout that keeps the mean and variance of its inputs at their original values, in order to preserve the self-normalizing property even after dropout is applied. Alpha Dropout fits well with Scaled Exponential Linear Units because it randomly sets activations to the negative saturation value.

AveragePooling1D

Average pooling for temporal data.

AveragePooling2D

Average pooling operation for spatial data.

AveragePooling3D

Average pooling operation for 3D data (spatial or spatio-temporal).

BaseLayer

BatchNormalization

Batch normalization layer (Ioffe and Szegedy, 2014). Normalizes the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1.
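As a rough illustration of the normalization step only, the sketch below standardizes each feature column of a batch in plain Python. The function name batch_normalize and the epsilon default are assumptions made for this example; the learned scale and offset and the moving averages used at inference time are deliberately left out.

```python
import math

def batch_normalize(batch, epsilon=1e-3):
    """Standardize each feature column of `batch` (a list of rows).

    Simplified sketch: no learned scale/offset, no moving statistics.
    """
    n = len(batch)
    n_features = len(batch[0])
    normalized = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        variance = sum((x - mean) ** 2 for x in column) / n
        for i in range(n):
            # epsilon keeps the division stable when the variance is tiny
            normalized[i][j] = (batch[i][j] - mean) / math.sqrt(variance + epsilon)
    return normalized

batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_normalize(batch)
```

After the call, each column of normalized has a mean of about 0 and a standard deviation slightly below 1 (because of epsilon), regardless of the original scale of that feature.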

Bidirectional

Bidirectional wrapper for RNNs.

Conv1D

1D convolution layer (e.g. temporal convolution). This layer creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well. When using this layer as the first layer in a model, provide an input_shape argument (tuple of integers or None, does not include the batch axis), e.g. input_shape=(10, 128) for time series sequences of 10 time steps with 128 features per step in data_format="channels_last", or (None, 128) for variable-length sequences with 128 features per step.
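To make the sliding-kernel idea concrete, here is a minimal plain-Python sketch of a 1D convolution over one sample. It assumes a stride of 1, no padding, channels_last layout, and no activation; conv1d_valid is a hypothetical helper written for this example, not part of the library.

```python
def conv1d_valid(inputs, kernel, bias=None):
    """Slide `kernel` along the time axis of `inputs`.

    inputs: [steps][in_channels]
    kernel: [kernel_size][in_channels][filters]
    bias:   [filters] or None
    """
    kernel_size = len(kernel)
    in_channels = len(kernel[0])
    filters = len(kernel[0][0])
    out_steps = len(inputs) - kernel_size + 1
    outputs = []
    for t in range(out_steps):
        row = []
        for f in range(filters):
            total = bias[f] if bias is not None else 0.0
            for k in range(kernel_size):
                for c in range(in_channels):
                    total += inputs[t + k][c] * kernel[k][c][f]
            row.append(total)
        outputs.append(row)
    return outputs

inputs = [[1.0], [2.0], [3.0], [4.0]]   # 4 time steps, 1 feature per step
kernel = [[[1.0]], [[-1.0]]]            # kernel_size=2, 1 input channel, 1 filter
outputs = conv1d_valid(inputs, kernel, bias=[0.5])
# outputs == [[-0.5], [-0.5], [-0.5]]
```

With 4 input steps and a kernel of size 2, the output has 3 steps; each value is the difference of two neighbouring inputs plus the bias.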

Conv2D

2D convolution layer (e.g. spatial convolution over images). This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the batch axis), e.g. input_shape=(128, 128, 3) for 128x128 RGB pictures in data_format="channels_last".

Conv2DTranspose

Transposed convolution layer (sometimes called Deconvolution). The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input while maintaining a connectivity pattern that is compatible with said convolution. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the batch axis), e.g. input_shape=(128, 128, 3) for 128x128 RGB pictures in data_format="channels_last".

Conv3D

3D convolution layer (e.g. spatial convolution over volumes). This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the batch axis), e.g. input_shape=(128, 128, 128, 1) for 128x128x128 volumes with a single channel, in data_format="channels_last".

Conv3DTranspose

Transposed convolution layer (sometimes called Deconvolution). The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input while maintaining a connectivity pattern that is compatible with said convolution. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the batch axis), e.g. input_shape=(128, 128, 128, 3) for a 128x128x128 volume with 3 channels if data_format="channels_last".

ConvLSTM2D

Convolutional LSTM. It is similar to an LSTM layer, but the input transformations and recurrent transformations are both convolutional.

ConvLSTM2DCell

Cell class for the ConvLSTM2D layer.

Cropping1D

Cropping layer for 1D input (e.g. temporal sequence). It crops along the time dimension (axis 1).
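A minimal sketch of cropping along the time axis, written in plain Python for a single sample. The helper crop_1d and its (1, 1) default are assumptions for illustration; the pair gives the number of time steps removed from the start and from the end.

```python
def crop_1d(sequence, cropping=(1, 1)):
    """Drop `cropping[0]` steps from the start and `cropping[1]` from the end."""
    left, right = cropping
    return sequence[left:len(sequence) - right]

sequence = [[0.0], [1.0], [2.0], [3.0], [4.0]]   # 5 time steps, 1 feature
cropped = crop_1d(sequence, cropping=(1, 2))
# cropped == [[1.0], [2.0]]
```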

Cropping2D

Cropping layer for 2D input (e.g. picture). It crops along spatial dimensions, i.e. height and width.

Cropping3D

Cropping layer for 3D data (e.g. spatial or spatio-temporal).

CuDNNGRU

Fast GRU implementation backed by CuDNN. Can only be run on GPU, with the TensorFlow backend.

CuDNNLSTM

Dense

Just your regular densely-connected NN layer. Dense implements the operation: output = activation(dot(input, kernel) + bias) where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True). Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel.
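The operation output = activation(dot(input, kernel) + bias) can be sketched in plain Python as follows. The helper names dense and relu and the example weights are hypothetical; a real layer creates and trains kernel and bias itself.

```python
def relu(x):
    """Element-wise rectifier used as the example activation."""
    return max(0.0, x)

def dense(inputs, kernel, bias=None, activation=None):
    """inputs: [batch][in_dim]; kernel: [in_dim][units]; bias: [units] or None."""
    units = len(kernel[0])
    outputs = []
    for row in inputs:
        out_row = []
        for j in range(units):
            # dot(input, kernel) for output unit j
            total = sum(row[i] * kernel[i][j] for i in range(len(row)))
            if bias is not None:
                total += bias[j]
            out_row.append(activation(total) if activation else total)
        outputs.append(out_row)
    return outputs

inputs = [[1.0, 2.0]]
kernel = [[1.0, -1.0], [0.5, 0.5]]
bias = [0.0, -1.0]
outputs = dense(inputs, kernel, bias, activation=relu)
# outputs == [[2.0, 0.0]]
```

The second unit evaluates to -1.0 before the activation, so the rectifier clips it to 0.0.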

DepthwiseConv2D

Depthwise separable 2D convolution. Depthwise convolutions consist in performing just the first step of a separable convolution, namely the depthwise spatial convolution (which acts on each input channel separately). The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step.

Dropout

Applies Dropout to the input. Dropout consists in randomly setting a fraction rate of input units to 0 at each update during training time, which helps prevent overfitting.
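A minimal sketch of the idea in plain Python, assuming the common "inverted dropout" convention in which the kept units are scaled by 1 / (1 - rate) so that the expected sum is unchanged. The function dropout and its rng parameter are hypothetical names for this example.

```python
import random

def dropout(values, rate, rng, training=True):
    """Zero each value with probability `rate` during training; scale the rest."""
    if not training or rate == 0.0:
        return list(values)              # inactive at inference time
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() >= rate else 0.0 for v in values]

rng = random.Random(0)                   # seeded so the example is repeatable
values = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(values, rate=0.5, rng=rng)
unchanged = dropout(values, rate=0.5, rng=rng, training=False)
```

Each entry of dropped is either 0.0 or twice the original value, while unchanged equals the input because dropout is only applied during training.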

ELU

Embedding

Turns positive integers (indexes) into dense vectors of fixed size, e.g. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]. This layer can only be used as the first layer in a model.
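Conceptually the layer is a lookup table from integer index to vector. The sketch below shows that lookup in plain Python; embed and the two-row table are hypothetical, with the vectors borrowed from the example above, whereas a real layer learns one row per vocabulary index.

```python
def embed(indices, table):
    """Replace every integer index with its row from the lookup table."""
    return [[table[index] for index in sequence] for sequence in indices]

# Hypothetical table holding only the two rows needed for this example.
table = {4: [0.25, 0.1], 20: [0.6, -0.2]}
vectors = embed([[4], [20]], table)
# vectors == [[[0.25, 0.1]], [[0.6, -0.2]]]
```

Each index becomes a vector, so in this sketch a batch of two length-1 sequences yields a nested list with one extra trailing axis of size 2.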

Flatten

Flattens the input. Does not affect the batch size.

GaussianDropout

Apply multiplicative 1-centered Gaussian noise. As it is a regularization layer, it is only active at training time.

GaussianNoise

Apply additive zero-centered Gaussian noise. This is useful to mitigate overfitting (you could see it as a form of random data augmentation). Gaussian Noise (GS) is a natural choice as corruption process for real valued inputs. As it is a regularization layer, it is only active at training time.

GlobalAveragePooling1D

Global average pooling operation for temporal data.

GlobalAveragePooling2D

Global average pooling operation for spatial data.

GlobalAveragePooling3D

Global average pooling operation for 3D data.

GlobalMaxPooling1D

Global max pooling operation for temporal data.

GlobalMaxPooling2D

Global max pooling operation for spatial data.

GlobalMaxPooling3D

Global max pooling operation for 3D data.

GRU

Gated Recurrent Unit - Cho et al. 2014. There are two variants. The default one is based on 1406.1078v3 and has the reset gate applied to the hidden state before matrix multiplication. The other one is based on the original 1406.1078v1 and has the order reversed. The second variant is compatible with CuDNNGRU (GPU-only) and allows inference on CPU. Thus it has separate biases for kernel and recurrent_kernel. Use reset_after=True and recurrent_activation='sigmoid'.

GRUCell

Cell class for the GRU layer.

Input

Input() is used to instantiate a Keras tensor. A Keras tensor is a tensor object from the underlying backend (Theano, TensorFlow or CNTK), which we augment with certain attributes that allow us to build a Keras model just by knowing the inputs and outputs of the model. For instance, if a, b and c are Keras tensors, it becomes possible to do: model = Model(input=[a, b], output=c). The added Keras attributes are: _keras_shape: integer shape tuple propagated via Keras-side shape inference. _keras_history: last layer applied to the tensor; the entire layer graph is retrievable from that layer, recursively.

Lambda

Wraps arbitrary expression as a Layer object.

LeakyReLU

LocallyConnected1D

Locally-connected layer for 1D inputs. The LocallyConnected1D layer works similarly to the Conv1D layer, except that weights are unshared, that is, a different set of filters is applied at each different patch of the input.

LocallyConnected2D

Locally-connected layer for 2D inputs. The LocallyConnected2D layer works similarly to the Conv2D layer, except that weights are unshared, that is, a different set of filters is applied at each different patch of the input.

LSTM

Long Short-Term Memory layer - Hochreiter 1997.

LSTMCell

Cell class for the LSTM layer.

Masking

Masks a sequence by using a mask value to skip timesteps. If all features for a given sample timestep are equal to mask_value, then the sample timestep will be masked (skipped) in all downstream layers (as long as they support masking). If any downstream layer does not support masking yet receives such an input mask, an exception will be raised.
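The masking rule can be sketched for a single sample in plain Python. The helper compute_mask is a hypothetical name for this example; it returns True for timesteps that are kept and False for timesteps whose features all equal mask_value.

```python
def compute_mask(sequence, mask_value=0.0):
    """sequence: [steps][features]. A step is kept if any feature differs."""
    return [any(feature != mask_value for feature in timestep)
            for timestep in sequence]

sequence = [[1.0, 2.0], [0.0, 0.0], [0.0, 3.0]]
mask = compute_mask(sequence)
# mask == [True, False, True]
```

Only the second timestep is masked: the third still contains a feature that differs from mask_value, so it is kept.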

MaxPooling1D

Max pooling operation for temporal data.

MaxPooling2D

Max pooling operation for spatial data.

MaxPooling3D

Max pooling operation for 3D data (spatial or spatio-temporal).

Merge

Permute

Permutes the dimensions of the input according to a given pattern. Useful for e.g. connecting RNNs and convnets together.

PReLU

ReLU

RepeatVector

Repeats the input n times.

Reshape

Reshapes an output to a certain shape.

RNN

Base class for recurrent layers. This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True.

You can set RNN layers to be 'stateful', which means that the states computed for the samples in one batch will be reused as initial states for the samples in the next batch. This assumes a one-to-one mapping between samples in different successive batches. To enable statefulness:
- Specify stateful=True in the layer constructor.
- Specify a fixed batch size for your model. For a sequential model, pass batch_input_shape=(...) to the first layer in your model. For a functional model with one or more Input layers, pass batch_shape=(...) to all the first layers in your model. This is the expected shape of your inputs including the batch size. It should be a tuple of integers, e.g. (32, 10, 100).
- Specify shuffle=False when calling fit().

To reset the states of your model, call .reset_states() on either a specific layer, or on your entire model.
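The effect of statefulness can be illustrated with a toy recurrent layer in plain Python. ToyStatefulRNN is a hypothetical class written for this sketch (its "cell" simply accumulates the inputs); it only mirrors the behaviour described above: one state per sample position, a fixed batch size, and reset_states() to clear the carried state.

```python
class ToyStatefulRNN:
    """Toy recurrent layer: state_t = state_{t-1} + x_t, one scalar per sample."""

    def __init__(self, batch_size, stateful=False):
        self.batch_size = batch_size
        self.stateful = stateful
        self.states = [0.0] * batch_size

    def reset_states(self):
        self.states = [0.0] * self.batch_size

    def __call__(self, batch):
        if len(batch) != self.batch_size:
            raise ValueError("a stateful layer needs a fixed batch size")
        if not self.stateful:
            self.reset_states()          # stateless: every batch starts fresh
        outputs = []
        for i, sequence in enumerate(batch):
            state = self.states[i]       # sample i continues from sample i
            for x in sequence:
                state = state + x
            self.states[i] = state
            outputs.append(state)
        return outputs

layer = ToyStatefulRNN(batch_size=2, stateful=True)
first = layer([[1.0, 2.0], [10.0, 20.0]])    # [3.0, 30.0]
second = layer([[3.0], [30.0]])              # continues: [6.0, 60.0]
layer.reset_states()
third = layer([[3.0], [30.0]])               # fresh start: [3.0, 30.0]
```

The second call picks up where the first left off because sample i of one batch is treated as the continuation of sample i of the previous batch, which is also why shuffling must be disabled during training.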

You can specify the initial state of RNN layers symbolically by calling them with the keyword argument initial_state. The value of initial_state should be a tensor or list of tensors representing the initial state of the RNN layer. You can specify the initial state of RNN layers numerically by calling reset_states with the keyword argument states. The value of states should be a numpy array or list of numpy arrays representing the initial state of the RNN layer.

You can pass "external" constants to the cell using the constants keyword argument of the RNN.__call__ (as well as RNN.call) method. This requires that the cell.call method accepts the same keyword argument constants. Such constants can be used to condition the cell transformation on additional static inputs (not changing over time), a.k.a. an attention mechanism.

SeparableConv1D

Depthwise separable 1D convolution. Separable convolutions consist in first performing a depthwise spatial convolution (which acts on each input channel separately) followed by a pointwise convolution which mixes together the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step. Intuitively, separable convolutions can be understood as a way to factorize a convolution kernel into two smaller kernels, or as an extreme version of an Inception block.

SeparableConv2D

Depthwise separable 2D convolution. Separable convolutions consist in first performing a depthwise spatial convolution (which acts on each input channel separately) followed by a pointwise convolution which mixes together the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step. Intuitively, separable convolutions can be understood as a way to factorize a convolution kernel into two smaller kernels, or as an extreme version of an Inception block.
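One way to see the factorization is to compare weight counts. The sketch below assumes square kernels and ignores bias terms; both helper functions are hypothetical and exist only for this comparison.

```python
def conv2d_weights(kernel_size, in_channels, filters):
    """Weights in a standard 2D convolution kernel (bias excluded)."""
    return kernel_size * kernel_size * in_channels * filters

def separable_conv2d_weights(kernel_size, in_channels, filters, depth_multiplier=1):
    """Weights in the depthwise step plus the 1x1 pointwise step (bias excluded)."""
    depthwise = kernel_size * kernel_size * in_channels * depth_multiplier
    pointwise = in_channels * depth_multiplier * filters
    return depthwise + pointwise

standard = conv2d_weights(3, 64, 128)              # 73728
separable = separable_conv2d_weights(3, 64, 128)   # 576 + 8192 = 8768
```

For a 3x3 kernel mapping 64 channels to 128 filters, the separable form needs roughly an eighth of the weights of the standard convolution in this count.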

SimpleRNN

Fully-connected RNN where the output is to be fed back to input.

SimpleRNNCell

Cell class for SimpleRNN.

Softmax

Softmax activation function.

SpatialDropout1D

Spatial 1D version of Dropout. This version performs the same function as Dropout, however it drops entire 1D feature maps instead of individual elements. If adjacent frames within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout1D will help promote independence between feature maps and should be used instead.
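The difference from regular dropout is that one keep/drop decision is made per feature map (channel) and shared across all time steps. A plain-Python sketch for a single sample, assuming channels_last layout and the usual 1 / (1 - rate) rescaling of kept values; spatial_dropout_1d is a hypothetical helper.

```python
import random

def spatial_dropout_1d(sequence, rate, rng):
    """sequence: [steps][channels]. Drops whole channels, not single elements."""
    channels = len(sequence[0])
    keep_prob = 1.0 - rate
    # One decision per channel, reused for every time step.
    keep = [rng.random() >= rate for _ in range(channels)]
    return [[value / keep_prob if keep[c] else 0.0
             for c, value in enumerate(step)]
            for step in sequence]

rng = random.Random(1)                   # seeded so the example is repeatable
sequence = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
dropped = spatial_dropout_1d(sequence, rate=0.5, rng=rng)
```

In the result, every channel column is either entirely zero or entirely rescaled, never a mixture of the two.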

SpatialDropout2D

Spatial 2D version of Dropout. This version performs the same function as Dropout, however it drops entire 2D feature maps instead of individual elements. If adjacent pixels within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout2D will help promote independence between feature maps and should be used instead.

SpatialDropout3D

Spatial 3D version of Dropout. This version performs the same function as Dropout, however it drops entire 3D feature maps instead of individual elements. If adjacent voxels within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout3D will help promote independence between feature maps and should be used instead.

ThresholdedReLU

Thresholded Rectified Linear Unit. It follows: f(x) = x for x > theta, f(x) = 0 otherwise.
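The piecewise definition translates directly into code. In this sketch the function name and the theta default of 1.0 are assumptions chosen for illustration.

```python
def thresholded_relu(x, theta=1.0):
    """f(x) = x for x > theta, f(x) = 0 otherwise."""
    return x if x > theta else 0.0

activations = [thresholded_relu(x) for x in [0.5, 1.0, 1.5]]
# activations == [0.0, 0.0, 1.5]
```

Note that the comparison is strict, so an input exactly equal to theta maps to 0.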

TimeDistributed

This wrapper applies a layer to every temporal slice of an input. The input should be at least 3D, and the dimension of index one will be considered to be the temporal dimension. Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch input shape of the layer is then (32, 10, 16), and the input_shape, not including the samples dimension, is (10, 16). You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently:
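The behaviour can be sketched in plain Python rather than with the library itself. Here time_distributed is a hypothetical helper that applies a function to every timestep, and dense_16_to_8 is a hypothetical stand-in for a Dense layer with 8 units and constant weights; neither is the library's implementation.

```python
def dense_16_to_8(vector):
    """Stand-in for a Dense layer with 8 units: every weight is 0.1, no bias."""
    weight = 0.1
    return [weight * sum(vector) for _ in range(8)]

def time_distributed(layer_fn, batch):
    """Apply `layer_fn` independently to each timestep of each sample.

    batch: [samples][timesteps][features]; axis 1 is the temporal axis.
    """
    return [[layer_fn(timestep) for timestep in sample] for sample in batch]

# 32 samples, each a sequence of 10 vectors of 16 dimensions.
batch = [[[1.0] * 16 for _ in range(10)] for _ in range(32)]
outputs = time_distributed(dense_16_to_8, batch)
output_shape = (len(outputs), len(outputs[0]), len(outputs[0][0]))
# output_shape == (32, 10, 8)
```

The same function, with the same weights, is applied at all 10 timesteps, so only the last axis changes: (32, 10, 16) becomes (32, 10, 8).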