Converting a Python Keras NLP model to TensorFlow.js

I'm trying to learn more about TensorFlow.js, but I'm stuck converting my Keras NLP model to TensorFlow.js.

This is what I'm trying to convert:

from keras.models import load_model

from keras.preprocessing.sequence import pad_sequences

import pickle

list_classes = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

model = load_model('Keras_Model/m.hdf5')
with open('Keras_Model/tokenizer.pkl', 'rb') as handler:
    tokenizer = pickle.load(handler)

list_sentences_train = ["I need help Stackoverflow"]

list_tokenized_train = tokenizer.texts_to_sequences(list_sentences_train)
maxlen = 200
X_t = pad_sequences(list_tokenized_train, maxlen=maxlen)

pred = model.predict(X_t)[0]

Tensorflowjs side:

const tf = require('@tensorflow/tfjs-node');

async function processModel() {
  // tfjs-node reads models from local disk through the file:// scheme
  const model = await tf.loadLayersModel('file://Server_Model/model.json');
  return model;
}

How can I get the Tokenizer running and make correct predictions?


Actually, I ran into the same problem while classifying text on Android. I had the model (tflite) ready to use, but I needed a way to tokenize the sentences just as Keras did in Python.

I found a simple solution which I have discussed here (for Android).

The idea is to export the keras.preprocessing.text.Tokenizer vocabulary to a JSON file. This JSON file can be parsed in any programming language, including JavaScript.

The Tokenizer holds an object called word_index.

index = tokenizer.word_index

The word_index object is a dict, which can be written to JSON like this:

import json

# Write the word -> index mapping so it can be read outside Python
with open('word_dict.json', 'w') as file:
    json.dump(tokenizer.word_index, file)

The JSON file contains pairs of words and indexes. You can parse it in JavaScript as mentioned in this link.
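On the JavaScript side, the tokenizing and padding steps from the question could be reproduced roughly as follows. This is only a sketch, not the Keras implementation: the names KERAS_FILTERS, textToSequence and padSequence are made up for illustration, and it assumes the Tokenizer was created with its default settings (lowercasing, the default filter characters, splitting on spaces, no oov_token) and that pad_sequences was called with its defaults (pad and truncate at the front).

```javascript
// Default filter characters of keras.preprocessing.text.Tokenizer (assumed unchanged).
const KERAS_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n';

// Hypothetical helper: mirrors texts_to_sequences for one sentence.
// wordIndex is the parsed content of word_dict.json.
// numWords is optional and should match the Tokenizer's num_words, if one was set.
function textToSequence(text, wordIndex, numWords = null) {
  const filterSet = new Set(KERAS_FILTERS);
  // Lowercase, then replace every filtered character with a space.
  const cleaned = Array.from(text.toLowerCase())
    .map((ch) => (filterSet.has(ch) ? ' ' : ch))
    .join('');
  const sequence = [];
  for (const word of cleaned.split(' ')) {
    if (word === '') continue;
    // Only accept real vocabulary entries, not inherited keys such as "toString".
    if (!Object.prototype.hasOwnProperty.call(wordIndex, word)) continue;
    const index = wordIndex[word];
    // With num_words set, Keras drops indexes at or above that limit.
    if (numWords !== null && index >= numWords) continue;
    sequence.push(index);
  }
  return sequence;
}

// Hypothetical helper: mirrors pad_sequences(..., maxlen) with default
// padding='pre' and truncating='pre'. Expects maxlen > 0.
function padSequence(sequence, maxlen) {
  const truncated = sequence.slice(-maxlen); // keep the last maxlen tokens
  const padding = new Array(maxlen - truncated.length).fill(0);
  return padding.concat(truncated);
}

// Tiny made-up vocabulary, standing in for JSON.parse of word_dict.json.
const exampleIndex = { i: 1, need: 2, help: 3 };
const exampleSeq = textToSequence('I need help Stackoverflow', exampleIndex);
console.log(padSequence(exampleSeq, 5)); // [ 0, 0, 1, 2, 3 ]
```

With the real vocabulary loaded from word_dict.json and maxlen set to 200 as in the question, the padded array could then be wrapped in a tensor, for example tf.tensor2d([padded], [1, 200]), and passed to model.predict. If the Tokenizer was built with non-default options, the helpers would need to be adjusted to match.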

