python script slowdown

Hi,
I have a Python script that does some word2vec operations. It works great in the textport, and it works OK in a textDAT. But the model and modules are big and slow to load. That is fine for the first run, but I'd like to avoid reloading everything every time I send a new pair of words.

So is there a way to import the modules and load the model so that they stay accessible, as they do in the textport?

here’s a small snippet:

[code]import numpy
import gensim

model_loc = r'D:_Projects\word2vec\GoogleNews-vectors-negative300.bin'

# Load Google's pre-trained Word2Vec model.
model = gensim.models.KeyedVectors.load_word2vec_format(model_loc, binary=True)

# Execute an analogy task like king - man + woman = queen.
print(model.most_similar(positive=['woman', 'king'], negative=['man']))[/code]
It's pretty standard word2vec stuff, but I'll want to interactively compare word similarities and vectors (like the analogy example above) without loading the model and modules every time. It works fine within the textport (similar to a Python console), so I guess I'm just missing a simple step?

You might be better off running another instance of Touch that you send and receive from - using that other instance as your processing backend to keep it from slowing down your project.

Well, what I need to do is to keep sending commands like the last one in the snippet:

print(model.most_similar(positive=['woman', 'king'], negative=['man']))

where “woman, king, man” are changing interactively and I get back results from the model using op() references.

In the textport, I can send these and get results back very quickly. However, if I update the textDAT, it basically reloads the model and the modules, creating a hang of about 3 seconds (the model is pretty big).

The way word2vec is designed, the model gets loaded once at the start and then just keeps getting referenced. That is what happens in the textport, so I'd like to understand how to get the same behavior in a DAT.

You may want to investigate storage.
You can store python objects in operators to be referenced as needed.

In your case, the import statements should be lightweight,
but the load is likely slow.

Place it in storage, and fetch as needed:

derivative.ca/wiki099/index … ss#Storage

Be sure to use 'storeStartupValue' on the object with a value of None, so it doesn't try to load it up
when the toe restarts.

Hope that helps,
Rob
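
For anyone landing here later, this is a rough sketch of the storage approach Rob describes: load the model once, keep it in an operator's storage, and fetch it on later runs. It assumes the operator methods store, fetch and storeStartupValue behave as described in the storage docs linked above. The key name 'w2v_model', the helper get_model, the operator name in the usage comment, and the FakeOp class are all made up for illustration. FakeOp only exists so the sketch can run outside Touch; inside Touch you would pass a real operator instead.

```python
MODEL_KEY = 'w2v_model'  # hypothetical storage key, pick any name you like


def get_model(holder, loader):
    """Return the model cached in holder's storage, calling loader only if needed."""
    model = holder.fetch(MODEL_KEY, None)
    if model is None:
        model = loader()  # the slow step, should only run once per session
        holder.store(MODEL_KEY, model)
        # Per the advice above: keep the startup value as None so the
        # model is not loaded when the toe restarts.
        holder.storeStartupValue(MODEL_KEY, None)
    return model


def load_word2vec(path):
    """Slow loader, same call as in the original snippet."""
    import gensim
    return gensim.models.KeyedVectors.load_word2vec_format(path, binary=True)


class FakeOp:
    """Stand-in for an operator (NOT part of TouchDesigner), for testing outside Touch."""

    def __init__(self):
        self.storage = {}
        self.startup = {}

    def fetch(self, key, default=None):
        return self.storage.get(key, default)

    def store(self, key, value):
        self.storage[key] = value

    def storeStartupValue(self, key, value):
        self.startup[key] = value


# Demo with a cheap fake loader so the caching behaviour is visible.
calls = []


def fake_loader():
    calls.append(1)
    return {'name': 'fake model'}


holder = FakeOp()
first = get_model(holder, fake_loader)
second = get_model(holder, fake_loader)
print(first is second, len(calls))  # True 1

# Inside Touch the call might look like this (operator name is hypothetical):
# model = get_model(op('model_holder'), lambda: load_word2vec(model_loc))
# print(model.most_similar(positive=['woman', 'king'], negative=['man']))
```

The point of passing the loader in as a function is that the expensive load only happens when storage is empty, so re-running the DAT with new words should just fetch the existing object.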