Hitting bottleneck controlling close to 4,000 LEDs!

As far as my findings go with my project, I’ve definitely decided on going with Pixel Pusher.

I got mine in the mail today. It’s really optimized for communicating with large numbers of LEDs and works smoothly for the most part.

Good News:

Pixel Pusher’s Processing sketch for Spout (Touch has a Spout Out TOP) works incredibly well, with cook times of around 0.5 ms in TouchDesigner and way above 30 fps in Processing. The whole thing is really robust.
This is good news for scalability…

The Bad News:

According to Jas @ heroicrobotics, NeoPixels are extremely slow (comparatively) and perform badly with their device. They claim that the driver is about an order of magnitude slower than anything else they support, and that depending on when you order them, the LEDs will actually behave differently, sometimes not working at all, because of manufacturing differences.

I confirmed this with the 4 panels I’ve built so far: 1 of them bugged out in a totally different way than the other 3, and they were all built within weeks of each other.

Anyways,
To further complicate my own situation at least, the 60 LED per meter flavor of the strip that works well (APA102) is hardly sold anywhere, so fun times! I get to redesign quite a lot of 3D printed parts.
:cry:

I’m sure I’ll have more to report on this soon.

Well I’m happy to say I’ve made some progress in the frame rate department:

TL/DR
I managed to get my fps @ 25 using Touch’s Spout Out → Processing Spout In → Processing Serial Out.

Here’s a video showing it all working:
25 FPS @ 3,840 LEDs - TouchDesigner, Spout, and Processing! (YouTube)

I was fooling around with PixelPusher and ended up tearing the Spout DLL and code out of the Processing sketch they had been incorporated into. I repurposed them to receive frames from Touch over Spout but send that data out over serial to the LEDs.

Since Processing is able to batch send the entire set of data at once (or at least it appears to from a higher level), the FPS was a solid 25 the whole time. Not quite 30, but it looks pretty good to the eye.

That’s with touch and processing moving 3,840 pixels worth of data too. I’ll post some more info on the how’s for those who are interested soon. Going to clean up some code first.
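For a sense of scale, the per-frame payload can be worked out directly. This is a rough sketch, assuming 3 bytes per LED (one each for R, G and B) and ignoring any framing or sync bytes the actual sketches may add; the function name is made up for illustration.

```python
BYTES_PER_LED = 3  # assumption: one byte each for R, G and B


def serial_payload(num_leds, fps):
    """Return (bytes per frame, bytes per second) for raw RGB data."""
    bytes_per_frame = num_leds * BYTES_PER_LED
    return bytes_per_frame, bytes_per_frame * fps


per_frame, per_second = serial_payload(3840, 25)
print(per_frame, per_second)  # 11520 288000
```

So 3,840 LEDs at 25 fps means moving roughly 288 KB of pixel data per second over the serial link.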

Hey everyone,

I’ve put together a collection of attachments with all the things you might need to get rolling with a large number of LEDs.

  1. Touch example (barebones)

  2. Processing middle man sketch

  3. Arduino / teensy 3.1 sketch

If you’re driving 3,840 LEDs you can expect roughly 25 fps. The fewer you drive, the faster it will run, provided you configure the Processing and Arduino sketches accordingly!

I’m sure these aren’t anywhere near perfect, but they work and are quite robust in my experience. Hope it helps!
touch_to_serialArduino_11_octows8211Lib.zip (1.56 KB)
ReceiveFrames_R3_03.zip (34.3 KB)
simpleLedMappingAndDrivingWorkflow.toe (20 KB)


Wooooow,
thanks a lot for sharing this stuff, it helps me so much! The way you deal with the LED mapping is so much more effective and smarter than mine; I’ll redraw the whole project :slight_smile: .
Thanks to your advice I finally managed to drive all the LEDs @45-50fps, but I think I could optimize even more. Like in your example, I’m using a Shuffle CHOP to split all the samples in order to send them to the DMX CHOP, but I must use a Reorder CHOP for each DMX output (18x) to put the channels in the right order (G0 R0 B0 G1 R1 B1…).
Each Reorder CHOP takes 0.35 ms to cook this task. Isn’t that a bit long for such a simple operation?
If I convert the data to a DAT to reorder the channels, it’s really quick, but then the conversion back to CHOP is soooooooo slow.
I’m wondering about a solution to directly split the samples in the right order, or maybe use a CPlusPlus CHOP to write an optimized reorder…
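To make the target order concrete, here is a minimal plain-Python sketch of that reorder. It is not a CHOP network and not an optimized solution, just the G0 R0 B0 G1 R1 B1… ordering written out. It assumes the input is already interleaved as R0 G0 B0 R1 G1 B1…, and the function name is hypothetical.

```python
def rgb_to_grb(samples):
    """Swap each R,G,B triple to G,R,B while keeping the pixel order."""
    if len(samples) % 3 != 0:
        raise ValueError("expected a whole number of RGB triples")
    out = []
    for i in range(0, len(samples), 3):
        r, g, b = samples[i:i + 3]
        out.extend((g, r, b))
    return out


print(rgb_to_grb([10, 20, 30, 40, 50, 60]))  # [20, 10, 30, 50, 40, 60]
```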

Glad it helped!

Sounds like a lot of splitting and reordering. If you can isolate the part of your TOE that you’re talking about, I could take a look at it. I’m not hugely experienced at optimizing, but there might be a few things you could do.

For example, if your RGB data is coming from a single TOP at some point in your network before splitting, you could try a Reorder TOP and switch the channels there; it might be more efficient.

For any step you’re doing multiple times on multiple streams, see if there’s an alternate way of doing it in another OP type further back in your stream, where things are still merged.

FM64,
Actually, do you mind if I ask what your hardware setup is like?

What kind of Art-Net controllers are you using? Which LEDs?

Hi Lucas,
thanks for your support, here is a very simplified and commented .toe. It would be great if you have the time to take a look at it!
I’m currently using WS2812B LED strips. I know they’re kind of slow and inaccurate, but they do the job.
This is the controller : [url]http://www.electrondes.com/an6u1903_00/an6u1903_00.pdf[/url]
It looks cheap but works incredibly well!
LEDmapper_v2.10.toe (9.03 KB)

Hi Everyone.
Thanks for the great analysis and working solutions.

Some thoughts on the python script:

Lucasm, can you describe your issue with:
"A side note here, I have to split the led bytes into chunks or packets of 255 or less due to a built in limit that sendBytes has… "

Looking at your sample script, I notice you do things like:

exec('op("%s").sendBytes(%s)' % (serialConnectorName, str(subArray).strip('[]')))

Instead of:

op(serialConnectorName).sendBytes(subArray)

Though I suspect: .sendBytes(*subArray) may work in your case.
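For reference, the chunking from that side note can be done without building strings at all. This is a minimal sketch in plain Python, assuming the 255-values-per-call limit mentioned in the quote. `send` is a stand-in for whatever actually transmits the bytes (for example a small wrapper around the Serial DAT's sendBytes); it is not a TouchDesigner API, and both function names are made up.

```python
def send_in_chunks(values, send, chunk_size=255):
    """Call send(*chunk) for consecutive chunks of at most chunk_size values."""
    calls = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        send(*chunk)  # the list is unpacked into separate arguments
        calls += 1
    return calls


# Stand-in sender that only records how many values each call received.
sent = []


def fake_send(*byte_values):
    sent.append(len(byte_values))


print(send_in_chunks([0] * 600, fake_send), sent)  # 3 [255, 255, 90]
```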

Minimizing the amount of scripting altogether would likely improve performance.
(ie, using a Quantize CHOP to avoid the int() calls, and a shuffle CHOP to get everything in the right format).

It might boil down to:

n = op('some_resulting_chop')  # everything interleaved and with shuffle CHOPs
v = n[0].vals  # put everything into one float array
op('serial_dat').sendBytes(v)

Also, agreed, FM64, those Reorder CHOPs seem unnecessarily slow.
Thanks for the examples, we will try to optimize them further.

We’re also researching the idea of changing the formats accepted by the DMX CHOPs to avoid the interleaving altogether.

Cheers,
Rob.

A couple more points:

The Shuffle CHOP in the latest official build has a relatively recent option (August 11th):
“Sequence All Samples”
which takes three long channels (example: r, g, b) and reshuffles them into one
single interleaved channel: (r0, g0, b0, r1, g1, b1, …)
This may be of use in some networks.
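The resulting sample order can be illustrated in plain Python. This is only a sketch of the output ordering described above, not how the Shuffle CHOP is implemented, and `interleave` is a made-up name.

```python
def interleave(*channels):
    """Interleave equal-length channels sample by sample."""
    if len({len(c) for c in channels}) > 1:
        raise ValueError("all channels must have the same length")
    return [value for samples in zip(*channels) for value in samples]


r = [1, 2, 3]
g = [10, 20, 30]
b = [100, 200, 300]
print(interleave(r, g, b))  # [1, 10, 100, 2, 20, 200, 3, 30, 300]
```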

Also, we’ve just optimized the Reorder CHOP to be 5 to 10 times faster when dealing with a few hundred channels, which is often the case with these LED networks.
That will be in build 25440 or later.

Cheers,
Rob.

It’s awesome Rob, thanks a lot for your responsiveness, can’t wait to see my network run @60FPS :wink:

Rob, thanks a bunch for sharing your thoughts on that part of the send code. I’ll be honest, I was looking to optimize everywhere else but there… Everything you suggested did LOADS of good…

Especially the bit about using a Shuffle CHOP to get the samples ordered correctly.
Building that ordered array via a loop in Python was also eating up a ton of time. The append function was probably being used WAY too much.

My FPS is around the 60 range and solidly sticking. All of it is split between two instances of TouchDesigner without the help of any other programs. Woo!

Not quite sure why I was approaching it the way I was with strings, but using a direct array with that asterisk in front of it worked perfectly… and loads faster.
What’s with the asterisk anyways? I noticed it stripped away the commas when sent through a print statement. I’m curious, is that all it does?


All that being said, here’s the new Serial send code, it’s LOADS faster and WAY simpler.
You’re the man Rob:

[code]
def valueChange(channel, sampleIndex, val, prev):
    n = op('topto1')  # CHOP to pull RGB data from
    serialConnector = op('serialConnector')  # Serial DAT

    vals = [int(i) for i in n[0].vals]  # channel values converted to ints
    serialConnector.sendBytes(*vals)  # send data to the LEDs!
    return
[/code]

I’ve attached a new TOE file to reflect the changes and updates.
simpleLedMappingAndDrivingWorkflow.6.toe (9.35 KB)

FM64,

I took a look at your file, made a few changes and attached it.
I switched the order of operations up a bit and moved a few things to your ledstrips container.

I was able to cut it down by almost half, but there might be a way to squeeze a few more ms out of it. It’s getting by at just over 60 though.

I suspect that won’t hold if you have more going on in that file but if you split it off to another touch process I imagine it would hold fine!

You may already be doing this, but when you test fps, open your performance monitor, then go into performance mode, and then hit analyze. Touch’s UI takes a good chunk itself, understandably, and that can make it seem worse than it is.
LEDmapper_v2.12.toe (8.98 KB)

That’s awesome. I’m glad all our bits are making one fast system.
One more thing:

vals = [int(i) for i in n[0].vals] # returns a list with values converted to ints.
serialConnector.sendBytes(*vals) # Send data to leds!

The .sendBytes will cast each entry to int anyways, but to be extra sure, you can use the Limit CHOP to quantize/round a channel to whole numbers, eliminating that line entirely.

As for the Python * operator, it’s apparently called the ‘splat’ operator, and it expands a list or tuple into separate arguments in a function call.
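A tiny plain-Python illustration of the difference, which also shows why the commas disappear when printing. `count_args` is a made-up helper, not part of TouchDesigner.

```python
def count_args(*args):
    """Return how many positional arguments were received."""
    return len(args)


vals = [255, 0, 128]
print(count_args(vals))   # 1  (one argument: the list itself)
print(count_args(*vals))  # 3  (three separate arguments)
print(vals)               # [255, 0, 128]
print(*vals)              # 255 0 128
```

With the asterisk, print receives three separate values and joins them with spaces, so the brackets and commas of the list never appear.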

You can save a bit more by not using a select CHOP but a constant CHOP with a replace CHOP in the dmx_output COMPs.
The expression you are using in the select CHOP can be copied and pasted into the name parameter of the constant CHOP, then use the constant CHOP as the first input to the replace and the select2 CHOP as the second input. This little trick saves you another ~0.18 ms per dmx output…

cheers
Markus

Hey Rob,

That makes sense. I added a Limit CHOP and set the quantize step to 1, however it doesn’t seem to be changing the data type? (I had this as the last CHOP in the chain, so it was sampling directly from it.)

Also, I’ve had that issue before with other node types where I try to reference a channel or something and it gives me the <class variableType … thing. Is there a different way to reference touch arrays to avoid that?

Snaut,

That’s a pretty awesome trick, would you mind explaining a bit more about why that works?
It brought frame time down to mostly under 10 ms ! well within the 60 fps range.

Thanks Lucas for your example. Definitely, using one Shuffle CHOP (sequence all channels) upstream instead of the Reorder CHOP before each output was the best idea.
But look at this .toe, it’s even quicker: I reused your first idea to trim the channels and split them up inside of each dmx_outputs COMP.

@Snaut, thanks for the tip, but I didn’t get better performance with it; the Replace CHOP cook time was around 0.15 ms…
LEDmapper_v2.16.toe (8.99 KB)

Nice! You got that frame time WAY down.
Your file is registering 2-5 ms per frame on my end. Glad that’s working out, and good to know an Art-Net workflow can push the same speed, if not better.

In the spirit of optimization, I set up some nodes that capture the time in milliseconds just before serial data is sent to the Arduino and just after the Arduino finishes its draw loop (it sends back a single byte to Touch signifying it finished a frame update).

Anyways, I’m sitting nicely at about 17 ms (0.017 s) between the before and after. That’s pretty much 60 FPS.
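As a quick sanity check on numbers like these, frame time and frame rate convert as follows. The function names are made up for illustration.

```python
def frame_time_ms_from_fps(fps):
    """Time budget per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def fps_from_frame_time_ms(ms):
    """Frame rate reached if every frame takes the given number of milliseconds."""
    return 1000.0 / ms


print(round(frame_time_ms_from_fps(60), 2))  # 16.67
print(round(fps_from_frame_time_ms(17), 1))  # 58.8
```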

I think the WS2811 LEDs flicker a little when updating that many LEDs with a fast-moving pattern, so it has an ever so slightly jumpy look, but I can only see it with a really narrow ramp animating really quickly across the panel.

The replace is a fairly simple operation compared to the select, as it literally just has to check whether the channel is in the second input and replace the first input’s channel value. The select has to do a bit more work and watch for more changes in its input.

Compared to the trim, the replace is still slow, but it took only about 1/3 of the cook time required by the Select CHOP.

cheers
Markus

I’ve also just changed the .sendBytes methods to allow passing in an array of floats, so no conversion to ints necessary.
ie: n.sendBytes( *channel.vals ) will work now.

(build 25680+)

Cheers

Thanks Rob! I’m going to snag that update soon then :smiley:

Lucas