Optimizing in general and scripts specifically

Hi All!

I’m working on a project where there are 64 textured rocks that react according to simple webcam-based motion detection and VERY simple physics. After much brainstorming and rewriting I’m down to 133ms per frame, which is not good enough.

The things that seem to take up the most time are the script (~55ms) and the TopToCHOPs (~50ms).

The script handles the physics and detected-motion resolution.

The TopToCHOPs determine if a rock has been “touched” and if so from where and how hard.
There are 3 TopToCHOPS per rock (192 in total).
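
The thread does not show how the touch values are actually derived, so the following is only a minimal sketch of one way "touched, from where, and how hard" could be computed from a small motion mask once it is on the CPU. The function name `touch_from_mask`, the `threshold` value, and the centroid approach are all assumptions for illustration, not the poster's method.

```python
def touch_from_mask(mask, threshold=0.05):
    """Summarize a motion mask (rows of floats in [0, 1]).

    Returns (touched, (dx, dy), strength). The threshold is hypothetical.
    """
    height = len(mask)
    width = len(mask[0])
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            total += value
            sum_x += value * x
            sum_y += value * y

    # "How hard": mean motion over the whole mask.
    strength = total / (width * height)
    if strength < threshold:
        return False, (0.0, 0.0), strength

    # "From where": offset of the motion centroid from the mask centre,
    # normalized to roughly [-1, 1] on each axis.
    half_w = max((width - 1) / 2, 1.0)
    half_h = max((height - 1) / 2, 1.0)
    dx = (sum_x / total - (width - 1) / 2) / half_w
    dy = (sum_y / total - (height - 1) / 2) / half_h
    return True, (dx, dy), strength
```

For example, a mask whose only motion is in the rightmost column gives a positive `dx` and a `dy` of zero.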

I’ve read the article on optimization (derivative.ca/wiki/index.php?title=Optimize) and have moved the positional translations and rotations into the Geometry as SOPs. I’ve also examined the network for unnecessary cooks.

Some experimentation has shown that script writes to tables take less time than writes to Geometry COMPs, so all of the rock data is stored in a table and the Transform SOP accesses the table for its data.

I’m wondering if there are any optimization pointers in the scripting part and if there is a way to speed up the TopToCHOPs.

Thank you very much.

A

TOP to CHOPs are very slow, because they require sending data from the GPU back to the CPU, which is far slower than the opposite operation (uploading from the CPU to the GPU, which is what GPU marketing material is talking about when it says PCI-Express is super fast).

Reducing the resolution of the TOP will give significant speedups. Are you able to do this, or do you need all of the pixel data?

Hi Malcolm,

The resolution is already 64x64 pixels. I tried 32x32 and it didn’t buy me any noticeable frame rate increase.

I’m beginning to think that I might need to go with CUDA for this :frowning:

btw … Is there a way to get an average ms-per-frame over a span of time (I seem to recall a “batch” button on the Performance Monitor in older versions of TD)? My project’s frame rate is pretty dynamic (~60ms when the screen is clear and ~133ms when all rocks are present), so it’s hard to judge any changes.

Thanks,

A

You can get an average using the Perform CHOP followed by something like the Filter CHOP.
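
To show what averaging over a window buys you when the frame time jumps between ~60ms and ~133ms, here is a minimal sketch of a plain moving average in Python. It is not the Filter CHOP's implementation, and the class name `FrameTimeAverager` and the `window` size are assumptions for illustration.

```python
from collections import deque


class FrameTimeAverager:
    """Moving average over the most recent `window` frame times (in ms)."""

    def __init__(self, window=120):
        # deque with maxlen drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def add(self, frame_ms):
        """Record one frame time and return the current average."""
        self.samples.append(frame_ms)
        return self.average()

    def average(self):
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)
```

Feeding it one value per frame and comparing the averaged figure before and after a change should be easier to read than the raw, jumpy per-frame number.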

Ya so the next issue with readback from the GPU is that it stalls the CPU, which has to wait for the GPU to finish its work.
When a TOP cooks, it sends some work to the GPU for it to do, but the work doesn’t get done right away, as the GPU will be busy doing other work that was given to it previously. So it may be 5-10ms (or more) between the time that the TOP to CHOP asks for the texture data and the time it’s ready, and during this time the CPU is stuck waiting. This will really hurt performance. I’ll be adding a feature to the TOP to CHOP that’ll make it download the previous frame’s data instead of the current frame’s, which will remove a lot of this cost, but it means that the data is 1 frame late, which may not be acceptable to you.
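
The one-frame-late trade-off can be sketched in plain Python. This models only the ordering of the data (ask for this frame, receive the previous one), not any real GPU or TouchDesigner API, and the class name `DelayedReadback` is hypothetical.

```python
class DelayedReadback:
    """Hands back the data submitted on the previous frame.

    In a real pipeline the previous frame's GPU work has had a whole frame
    to finish, so fetching it should not stall. Here that is only modeled
    by holding one pending item.
    """

    def __init__(self):
        self._pending = None  # data requested last frame, not yet returned

    def submit_and_fetch(self, current_frame_data):
        """Queue this frame's data and return last frame's (None at first)."""
        previous = self._pending
        self._pending = current_frame_data
        return previous
```

The first call returns `None` because nothing has been queued yet, and every later call returns what was submitted one frame earlier, which is the latency cost described above.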

CUDA or GLSL is probably the best way to do this though. You need a Pro version for the CUDA TOP, so you may need to do this with a GLSL TOP.

Cool … will try the Perform CHOP.

I did some testing with the Performance Monitor, and while the first TopToCHOP costs the most, all the others after it seem MUCH less expensive, so I don’t know how much this new TopToCHOP would buy me (best way to find out is to try it I guess :slight_smile:).

As for CUDA - yeah - I remember from the documentation that it’s a “pro” thing only.

Hmm - I’ll see what else I can come up with in the meantime.

Thank you very much for your time.

A