
Class SequenceUtil

Utilities for preprocessing sequence data: sampling tables, sequence padding, and skip-gram generation.

Inheritance
System.Object
Keras
Base
SequenceUtil
Implements
System.IDisposable
Inherited Members
Base.Parameters
Base.None
Base.Init()
Base.ToPython()
Base.InvokeStaticMethod(Object, String, Dictionary<String, Object>)
Base.InvokeMethod(String, Dictionary<String, Object>)
Base.Item[String]
Keras.Instance
Keras.keras
Keras.keras2onnx
Keras.tfjs
Keras.Dispose()
Keras.ToTuple(Array)
Keras.ToList(Array)
System.Object.Equals(System.Object)
System.Object.Equals(System.Object, System.Object)
System.Object.GetHashCode()
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.ToString()
Namespace: Keras.PreProcessing.sequence
Assembly: Keras.dll
Syntax
public class SequenceUtil : Base, IDisposable

Methods

MakeSamplingTable(Int32, Single)

Generates a word rank-based probabilistic sampling table. Used for generating the sampling_table argument for SkipGrams. sampling_table[i] is the probability of sampling the i-th most common word in a dataset (more common words should be sampled less frequently, for balance). The sampling probabilities are generated according to the sampling distribution used in word2vec.

Declaration
public static NDarray MakeSamplingTable(int size, float sampling_factor = 1E-05F)
Parameters
Type Name Description
System.Int32 size

Number of possible words to sample, which is also the length of the returned table.

System.Single sampling_factor

The sampling factor in the word2vec sampling formula. Defaults to 1e-5.

Returns
Type Description
Numpy.NDarray

A 1D Numpy array of length size where the ith entry is the probability that a word of rank i should be sampled.
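
To make the word2vec-style distribution concrete, the following sketch reproduces the calculation in plain Python rather than through the C# binding. It assumes the behavior of the upstream Python Keras function that this method wraps: word frequency is approximated from rank using Zipf's law, and each word is kept with probability min(1, sqrt(sampling_factor / frequency)). The function name, the frequency approximation, and the constant gamma = 0.577 all come from that assumption, not from this page.

```python
import math

def make_sampling_table(size, sampling_factor=1e-5):
    """Illustrative re-implementation; not the Keras.NET code path."""
    gamma = 0.577  # approximate Euler-Mascheroni constant (assumed)
    table = []
    for rank in range(size):
        r = max(rank, 1)  # treat rank 0 like rank 1 to avoid log(0)
        # Zipf-style approximation of 1 / frequency(rank).
        inv_freq = r * (math.log(r) + gamma) + 0.5 - 1.0 / (12.0 * r)
        # ratio = sampling_factor / frequency; keep probability is min(1, sqrt(ratio)).
        ratio = sampling_factor * inv_freq
        table.append(min(1.0, math.sqrt(ratio)))
    return table

table = make_sampling_table(5)
print(table)
```

Under these assumptions the entries grow with rank, so rare words are kept more often than frequent ones.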

PadSequences(NDarray, Nullable<Int32>, String, String, String, Single)

Pads sequences to the same length. This function transforms a list of num_samples sequences (lists of integers) into a 2D Numpy array of shape (num_samples, num_timesteps). num_timesteps is either the maxlen argument if provided, or the length of the longest sequence otherwise. Sequences shorter than num_timesteps are padded with value until they reach that length. Sequences longer than num_timesteps are truncated to fit. The position where padding or truncation happens is determined by the padding and truncating arguments, respectively. Both default to "pre", so by default padding is added at the start of a sequence and excess values are removed from the start.

Declaration
public static NDarray PadSequences(NDarray sequences, int? maxlen = default(int?), string dtype = "int32", string padding = "pre", string truncating = "pre", float value = 0F)
Parameters
Type Name Description
Numpy.NDarray sequences

The sequences to pad, where each sequence is a list of integers.

System.Nullable<System.Int32> maxlen

Maximum length of all sequences. If not provided, the length of the longest sequence is used.

System.String dtype

Type of the output sequences. Defaults to "int32".

System.String padding

"pre" or "post": pad either before or after each sequence.

System.String truncating

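"pre" or "post": remove values from sequences longer than maxlen, either at the beginning or at the end of the sequence.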

System.Single value

The padding value. Defaults to 0.

Returns
Type Description
Numpy.NDarray

Numpy array with shape (len(sequences), maxlen)
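
The interaction between maxlen, padding, and truncating is easiest to see on a small input. The sketch below mimics the described behavior with plain Python lists instead of NDarray; the function is a hypothetical stand-in for illustration and assumes that padding and truncating accept "pre" or "post".

```python
def pad_sequences(sequences, maxlen=None, padding="pre",
                  truncating="pre", value=0):
    """Illustrative list-based version of the padding rules."""
    if maxlen is None:
        # Fall back to the length of the longest sequence.
        maxlen = max((len(s) for s in sequences), default=0)
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            if truncating == "pre":
                seq = seq[len(seq) - maxlen:]  # drop values from the start
            else:
                seq = seq[:maxlen]             # drop values from the end
        fill = [value] * (maxlen - len(seq))
        padded.append(fill + seq if padding == "pre" else seq + fill)
    return padded

data = [[1, 2, 3], [4, 5], [6]]
print(pad_sequences(data))                  # defaults: pre-padding
print(pad_sequences(data, padding="post"))  # pad at the end instead
print(pad_sequences(data, maxlen=2))        # truncate from the start
```

Every row of the result has the same length, which is what allows the real method to return a 2D array.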

SkipGrams(NDarray, Int32, Int32, Single, Boolean, Boolean, NDarray, Nullable<Int32>)

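Generates skip-gram word pairs. This function transforms a sequence of word indices into couples of the form (word, word in the same window) with label 1 (positive samples) and (word, random word from the vocabulary) with label 0 (negative samples).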

Declaration
public static NDarray SkipGrams(NDarray sequence, int vocabulary_size, int window_size = 4, float negative_samples = 1F, bool shuffle = true, bool categorical = false, NDarray sampling_table = null, int? seed = default(int?))
Parameters
Type Name Description
Numpy.NDarray sequence

A word sequence (sentence), encoded as a list of word indices (integers). If using a sampling_table, word indices are expected to match the rank of the words in a reference dataset (e.g. 10 would encode the 10-th most frequently occurring token). Note that index 0 is expected to be a non-word and will be skipped.

System.Int32 vocabulary_size

Int, maximum possible word index + 1

System.Int32 window_size

Int, size of sampling windows (technically half-window). The window of a word w_i covers the half-open index range [i - window_size, i + window_size + 1), that is, up to window_size positions on either side of i.

System.Single negative_samples

Float >= 0. 0 for no negative (i.e. random) samples. 1 for same number as positive samples.

System.Boolean shuffle

Whether to shuffle the word couples before returning them.

System.Boolean categorical

Bool. If false, labels are integers (e.g. [0, 1, 1, ...]); if true, labels are categorical, e.g. [[1, 0], [0, 1], [0, 1], ...].

Numpy.NDarray sampling_table

1D array of size vocabulary_size where the entry i encodes the probability to sample a word of rank i.

System.Nullable<System.Int32> seed

Random seed.

Returns
Type Description
Numpy.NDarray

couples, labels: where couples are int pairs and labels are either 0 or 1.
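
The parameters above can be illustrated with a simplified pure-Python sketch. It is not the Keras.NET code path: it uses plain lists, omits sampling_table and categorical, and assumes that negative couples pair a word from the sequence with a random index in [1, vocabulary_size - 1], as in the upstream Python Keras function this method wraps.

```python
import random

def skipgrams(sequence, vocabulary_size, window_size=4,
              negative_samples=1.0, shuffle=True, seed=None):
    """Simplified illustration of skip-gram couple generation."""
    rng = random.Random(seed)
    couples, labels = [], []
    for i, wi in enumerate(sequence):
        if wi == 0:
            continue  # index 0 is a non-word and is skipped
        start = max(0, i - window_size)
        end = min(len(sequence), i + window_size + 1)
        for j in range(start, end):
            if j != i and sequence[j] != 0:
                couples.append([wi, sequence[j]])
                labels.append(1)  # positive sample
    # Negative samples: an existing word paired with a random word index.
    num_negative = int(len(labels) * negative_samples)
    targets = [c[0] for c in couples]
    rng.shuffle(targets)
    for k in range(num_negative):
        noise = rng.randint(1, vocabulary_size - 1)
        couples.append([targets[k % len(targets)], noise])
        labels.append(0)
    if shuffle:
        order = list(range(len(couples)))
        rng.shuffle(order)
        couples = [couples[k] for k in order]
        labels = [labels[k] for k in order]
    return couples, labels

couples, labels = skipgrams([1, 2, 3], vocabulary_size=4, window_size=1,
                            negative_samples=0.0, shuffle=False)
print(couples, labels)
```

With negative_samples set to 0 and shuffle disabled, the output contains only the in-window couples in sequence order, which makes the effect of window_size easy to inspect.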

Implements

System.IDisposable