Experimenting with pretrained embedding vectors
The GloVe word vectors from Stanford are a good place to start with using pretrained weights in an embedding layer. They cover a large vocabulary of about 400,000 words, with one vector per word, and come in several vector lengths from 50 to 300 dimensions.
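As a rough sketch of what loading them looks like - assuming the 200-dimensional file glove.6B.200d.txt has been downloaded and unzipped into the working directory - each line of the file is a word followed by its vector:

import numpy as np

embeddings_index = {}
with open('glove.6B.200d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        word = values[0]                                                  # the token itself
        embeddings_index[word] = np.asarray(values[1:], dtype='float32')  # its 200-d vector

print(f'Loaded {len(embeddings_index)} word vectors.')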
When you add an embedding layer in Keras you can specify the weights to use and set the layer as non-trainable.
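The weights are passed in as a matrix with one row per word index. Here is a minimal sketch of how the word_index and embeddings_matrix used below might be built; the names train_texts (a list of raw IMDB review strings) and max_len are assumptions, and embeddings_index is the GloVe dictionary from above:

import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer

max_len = 200                                # assumed padding length for the reviews

tokenizer = Tokenizer()
tokenizer.fit_on_texts(train_texts)          # train_texts: hypothetical list of review strings
word_index = tokenizer.word_index            # maps each word to an integer index, starting at 1

embedding_dim = 200
embeddings_matrix = np.zeros((len(word_index) + 1, embedding_dim))  # row 0 stays zero for padding
for word, i in word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:                   # words missing from GloVe keep an all-zero row
        embeddings_matrix[i] = vector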
So this is my model:
from tensorflow import keras
from tensorflow.keras import layers

def get_pretrained_model():
    model = keras.Sequential()
    model.add(layers.Embedding(len(word_index) + 1, 200, input_length=max_len,
                               weights=[embeddings_matrix], trainable=False))
    model.add(layers.Bidirectional(layers.LSTM(32)))
    model.add(layers.Dense(6, activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))
    return model
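A hedged usage sketch for compiling and training it follows; x_train, y_train and train_labels are assumed names, with the sequences padded to max_len by the same tokenizer that produced word_index, and the epoch and batch size values are only illustrative:

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

x_train = pad_sequences(tokenizer.texts_to_sequences(train_texts), maxlen=max_len)
y_train = np.asarray(train_labels)           # train_labels: hypothetical 0/1 sentiment labels

model = get_pretrained_model()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.2)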
I am using the IMDB sentiment dataset and may be coming up against the limits of the size of that dataset.