jaclearn.embedding.word_embedding

Functions

load(path[, word_index_only, filter, format])

Loads pre-trained embeddings from the specified path.

load_word_index(path[, filter, format])

Loads only the word index from the embeddings file.

map(word, word2idx)

Get the word index for the given word.

map_sequence(word_sequence, word2idx)

Get embedding indices for the given word sequence.

Functions

load(path, word_index_only=False, filter=None, format='glove')

Loads pre-trained embeddings from the specified path.

load_word_index(path, filter=None, format='glove')

Loads only the word index from the embeddings file.

Returns:

a dictionary mapping each word to its index
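As a rough illustration of what a word index is, the sketch below builds one from GloVe-style text lines, where each line holds a word followed by its space-separated vector components. It is not jaclearn's implementation: `build_word_index` is a hypothetical helper, and assigning indices in file order is an assumption.

```python
def build_word_index(lines):
    """Map the first field of each GloVe-style line to a running index.

    Hypothetical helper for illustration; not part of jaclearn.
    """
    word2idx = {}
    for line in lines:
        fields = line.rstrip().split(" ")
        if len(fields) < 2:
            # Skip blank or malformed lines that carry no vector.
            continue
        word = fields[0]
        if word not in word2idx:
            # Assumption: indices follow the order of first appearance.
            word2idx[word] = len(word2idx)
    return word2idx


sample = ["the 0.1 0.2 0.3", "cat 0.4 0.5 0.6", "", "sat 0.7 0.8 0.9"]
word2idx = build_word_index(sample)
```

The vector components are ignored here, which is why loading only the index can be cheaper than loading the full embeddings.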

map(word, word2idx)

Get the word index for the given word. Maps all numbers to 0, lowercases if necessary.

Parameters:
  • word – the word in question

  • word2idx – dictionary constructed from an embeddings file

Returns:

integer index of the word
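The lookup described above can be sketched as follows. This is a guess at the behavior and not the library's code: `lookup_word` and the `<unk>` entry are hypothetical, and reading "maps all numbers to 0" as "replace every digit with the character 0 before the lookup" is an assumption.

```python
import re


def lookup_word(word, word2idx, unknown_idx):
    """Look up a word, falling back to lowercase and digit-normalized forms.

    Hypothetical stand-in for illustration; not part of jaclearn.
    """
    if word in word2idx:
        return word2idx[word]
    lowered = word.lower()
    if lowered in word2idx:
        return word2idx[lowered]
    # Assumption: digits are replaced with "0" so that all numbers of the
    # same length share a single entry.
    normalized = re.sub(r"\d", "0", lowered)
    return word2idx.get(normalized, unknown_idx)


word2idx = {"the": 0, "cat": 1, "0000": 2, "<unk>": 3}
unk = word2idx["<unk>"]
the_idx = lookup_word("The", word2idx, unk)    # lowercase fallback
year_idx = lookup_word("1984", word2idx, unk)  # digit normalization
dog_idx = lookup_word("dog", word2idx, unk)    # out of vocabulary
```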

map_sequence(word_sequence, word2idx)

Get embedding indices for the given word sequence.

Parameters:
  • word_sequence – sequence of words to process

  • word2idx – dictionary mapping words to their embedding indices

Returns:

a sequence of embedding indices
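A minimal sketch of turning a tokenized sentence into embedding indices, assuming a per-word lookup with a lowercase fallback. `lookup_sequence` and the `<unk>` entry are hypothetical and only illustrate the idea; the library's own handling of unknown words is not documented here.

```python
def lookup_sequence(words, word2idx, unknown_idx):
    """Return one index per word, preserving the input order.

    Hypothetical stand-in for illustration; not part of jaclearn.
    """
    indices = []
    for word in words:
        if word in word2idx:
            indices.append(word2idx[word])
        else:
            # Assumption: try the lowercase form, then the unknown index.
            indices.append(word2idx.get(word.lower(), unknown_idx))
    return indices


word2idx = {"the": 0, "cat": 1, "<unk>": 2}
indices = lookup_sequence(["The", "cat", "purrs"], word2idx, word2idx["<unk>"])
```

The resulting list has the same length as the input, so it can be used directly to select rows from an embedding matrix.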