CPlusPlus CHOP vs TOP

Angry_Primate · December 11, 2018, 4:01am

I have written a DLL for use with the CPlusPlus CHOP that integrates Nvidia’s PhysX engine into Touch.

Works well, but the thing is, the CHOP accessing the DLL is a resource hog. I am thinking that a lot of the overhead is the simple fact that it is passing thousands of XYZ values into Touch. Or am I wrong here, and it’s the actual running of the PhysX back end?

If it is the passing of so much info, I was thinking that using a CPlusPlus TOP may be more efficient. The plan was to store the XYZ values in the RGB values of each pixel. Hopefully that moves some of the load to the GPU? Otherwise I need to find a more efficient way of passing that much info…

Anyone have any thoughts here??

malcolm · December 11, 2018, 4:04am

I’d suggest timing parts of your code to see where it’s taking so long. It could be either, or there could just be efficiencies to be found.
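As a rough illustration of the timing suggestion, here is a minimal sketch using std::chrono. The helper name timeMs is hypothetical (it is not part of PhysX or TouchDesigner), and the stages you wrap with it are up to you, for example the simulation step versus the copy into the output.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs the callable once and returns the elapsed
// wall-clock time in milliseconds, measured with a monotonic clock.
template <typename Fn>
double timeMs(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Wrapping the simulation step in one call and the output copy in another, then printing both numbers each frame, should show which of the two dominates.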

For the GPU side, it depends how you plan on using the data. If you plan to use it with shaders, then keeping it on the GPU in a TOP will definitely be the fastest no matter what.

Angry_Primate · December 11, 2018, 5:53am

Thanks Malcolm. Yes, I was planning on writing a GLSL shader to drive particles or instanced geometry, depending on which way I go…

Angry_Primate · December 14, 2018, 6:59am

Ok. I have written a DLL as described above.

I based it on the OpenGL TOP example. It seems to be slower than my equivalent CPlusPlus CHOP version. Basically the same inherent structure, just outputting different info.

This is where my knowledge falls down. Should I have based it on the CUDA example? To be honest, I have not gone down that rabbit hole and would appreciate some advice…

Anyone have any thoughts??

malcolm · December 14, 2018, 6:30pm

How are you getting the data out from PhysX?

Angry_Primate · December 14, 2018, 9:59pm

Broadly speaking, outputting a pixel per element. XYZ positions are stored in the RGB values. At the moment, it is one long image (36000 pixels wide, I think). I will break it up later if it works.
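On breaking the long image up: OpenGL caps texture dimensions at GL_MAX_TEXTURE_SIZE, which is often 16384, so a 36000-pixel-wide image may not be usable as is. A minimal sketch of mapping an element index to a row and column follows. The names PixelCoord, rowsFor and pixelFor are made up for illustration, and the width of 256 in the note below is just an example value.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type: position of one element's pixel in the image.
struct PixelCoord {
    std::size_t x;  // column
    std::size_t y;  // row
};

// Number of rows needed to hold 'count' elements at 'width' pixels per row.
inline std::size_t rowsFor(std::size_t count, std::size_t width) {
    assert(width > 0);
    return (count + width - 1) / width;  // ceiling division
}

// Row-major mapping from element index to pixel coordinate.
inline PixelCoord pixelFor(std::size_t index, std::size_t width) {
    assert(width > 0);
    return { index % width, index / width };
}
```

With 36000 elements and a width of 256, rowsFor gives 141 rows, and the last element lands at column 159 of row 140. A shader reading the image would need the same width to do the reverse lookup.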

I have started (as a proof of concept) by calling glColor3f with the appropriate RGB values, then simply drawing a pixel. I do this for every entry in the buffer that stores the current position values for each element (using a “for” statement).
e.g.
glBegin(GL_POINTS);
glVertex2f(pointnumber, 1);
glEnd();

Does this answer your question??

I don’t know too much about this side of things, especially what is most optimal, so I am going to try assigning the values using glDrawArrays(GL_POINTS… Maybe an array will be more efficient. I think I read somewhere that textures are, but I can’t find where I saw it again.

Any direction would be helpful. Thanks.

Angry_Primate · December 14, 2018, 11:36pm

Ok. GL arrays don’t work the way I thought they did. Does anyone know if there is a more efficient way to write a large number of pixels?

malcolm · December 15, 2018, 4:15pm

This sounds like you have the information on the CPU. In this case you’ll want the CPU memory example to more quickly upload the data to the GPU. It’ll be slightly faster than the CHOP method, but not by too much since it’s just avoiding one CPU->CPU copy.
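To make the CPU-side idea concrete, here is a minimal sketch of packing positions into a tightly packed RGBA float image in system memory, one pixel per element. It is plain C++ and deliberately leaves out the TouchDesigner side: how the CPU memory example hands you its destination buffer is not shown or assumed here. Vec3 and packPositions are hypothetical names, Vec3 stands in for whatever position type the PhysX code produces, and a 32-bit float RGBA pixel format is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the simulation's position type (assumption).
struct Vec3 {
    float x, y, z;
};

// Packs one position per pixel into a row-major RGBA float image that is
// 'width' pixels wide. Unused pixels in the last row stay at zero.
std::vector<float> packPositions(const std::vector<Vec3>& positions,
                                 std::size_t width) {
    assert(width > 0);
    const std::size_t height = (positions.size() + width - 1) / width;
    std::vector<float> pixels(width * height * 4, 0.0f);
    for (std::size_t i = 0; i < positions.size(); ++i) {
        float* p = &pixels[i * 4];  // pixel i in row-major order
        p[0] = positions[i].x;      // R
        p[1] = positions[i].y;      // G
        p[2] = positions[i].z;      // B
        p[3] = 1.0f;                // A marks the pixel as in use
    }
    return pixels;
}
```

A single loop over one contiguous buffer like this replaces the per-point draw calls, and the whole image can then be uploaded in one go.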

Angry_Primate · December 18, 2018, 11:47pm

Thanks Malcolm,

I may upgrade to the latest generation of graphics cards to see if that gives me the bump I require for this job. Based on how I am reading your first sentence below, I am unsure where to even start looking to get the info onto the GPU instead of the CPU. To be honest, most of my programming achievements come from banging my head against the monitor until it works.

Thanks again.