Recreating Python feature extraction code in JavaScript

March 01, 2021

TensorFlow's new Preprocessing Layers will make it much easier to use models trained in Python from TensorFlow.js. At the moment those layers are only available in Python, not JavaScript, so the preprocessing has to be transcribed by hand.

The original model was trained on a windowed dataset with a sequence length of 5. To get prediction working in the browser, that windowing code needs to be replicated:

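// contextSize is assumed to be defined elsewhere (the sequence length, 5 in training).
function createWindowedDataset(data){
    const windowed = [];
    // "<=" so that the final window, ending on the last element, is included.
    for(let i = 0; i <= data.length - contextSize; i++){
        windowed.push(data.slice(i, i + contextSize));
    }
    return windowed;
}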
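
As a quick check of the windowing logic, here is a minimal sketch that takes the window size as a parameter rather than reading it from an outer variable. The name `slidingWindows` and its `size` parameter are made up for this illustration, and it assumes every full window should be produced, including the one that ends on the last element.

```javascript
// Hypothetical helper: returns every contiguous window of `size` items.
function slidingWindows(items, size) {
    const windows = [];
    // The last valid start index is items.length - size, hence "<=".
    for (let i = 0; i <= items.length - size; i++) {
        windows.push(items.slice(i, i + size));
    }
    return windows;
}

const sentence = ['the', 'cat', 'sat', 'on', 'the', 'mat'];
console.log(slidingWindows(sentence, 5));
// Two windows: ['the','cat','sat','on','the'] and ['cat','sat','on','the','mat']
```

Note that an input shorter than the window size yields no windows at all, so short sentences would need padding or some other special handling.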

My model loading function now includes a loop that loads a set of JSON files. Four of these are needed to get the model predicting and to make sense of its predictions. That code looks like this:

    const jsonToLoad = ['word_to_index.json', 'pos_to_index.json', 'most_common_tag_for_word.json', 'index_to_pos.json'];
    for (const f of jsonToLoad){
        // $.getJSON is asynchronous: results[f] is only set once the file has arrived.
        $.getJSON(f, (data) => {
            results[f] = data;
        });
    }
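
One thing to watch with the loop above is that `$.getJSON` is asynchronous, so `results` is only filled in some time after the loop itself has finished. Here is a sketch of one way to wait for all four files before predicting, using standard promises. `loadJsonFiles` and its `fetchJson` parameter are names invented here, and the default loader assumes a runtime with the standard `fetch` API rather than jQuery.

```javascript
// Hypothetical helper: loads every file and resolves once all are available.
// `fetchJson` is injectable so the function can be tried without a server.
function loadJsonFiles(files, fetchJson = (url) => fetch(url).then((r) => r.json())) {
    return Promise.all(files.map((f) => fetchJson(f)))
        .then((loaded) => {
            const results = {};
            // Key each parsed object by its file name, as in the loop above.
            files.forEach((f, i) => { results[f] = loaded[i]; });
            return results;
        });
}
```

Calling `loadJsonFiles(jsonToLoad).then((results) => { ... })` would then give a single point at which all of the lookup tables are known to be present.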

My first pass at preparing the dataset just used the words as elements of a list. However, I need to support (word, POS) pairs, so I changed the data code to look like this:

    let words = [];
    for (const word of this.value.split(' ')){
        words.push({word: word, pos: null});
    }
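
Splitting on a single space produces empty strings when the input has repeated or trailing spaces, and those would end up as empty "words". Here is a small sketch of the same structure with that case handled. `toTokens` is a hypothetical name, and the `pos` field is left as `null`, as in the snippet above.

```javascript
// Hypothetical helper: turns raw text into {word, pos} pairs.
function toTokens(text) {
    return text
        .split(/\s+/)                  // split on runs of whitespace
        .filter((w) => w.length > 0)   // drop empties from leading/trailing whitespace
        .map((w) => ({ word: w, pos: null }));
}

console.log(toTokens(' the  cat sat '));
// Three tokens (the, cat, sat), each with pos set to null
```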

Then I went through each of the 11 input features and converted the Python code to JavaScript. That is quite tricky and there are still some bugs in there, so I will fix that up tomorrow and hopefully have a working part-of-speech tagger.
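
The 11 features are not listed here, so the following is only an illustration of the general shape one such port might take, not one of the actual features. It assumes `word_to_index.json` parses to a plain object mapping each known word to an integer, and that unknown words fall back to a reserved index. The function name `wordIndexFeature`, the lowercasing, and the fallback value `0` are all assumptions made for the example.

```javascript
// Hypothetical feature: look up a word's integer index, with a fallback
// for words that were not in the training vocabulary.
function wordIndexFeature(word, wordToIndex, unknownIndex = 0) {
    const key = word.toLowerCase(); // assumption: the vocabulary is lowercased
    // hasOwnProperty avoids accidental hits on inherited keys like "constructor".
    return Object.prototype.hasOwnProperty.call(wordToIndex, key)
        ? wordToIndex[key]
        : unknownIndex;
}

// Toy vocabulary standing in for the contents of word_to_index.json.
const toyVocab = { the: 1, cat: 2 };
console.log(['The', 'cat', 'sat'].map((w) => wordIndexFeature(w, toyVocab)));
// [1, 2, 0]
```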
