I have an SDI video input that contains song lyrics. The goal is to generate boxes behind the text that are sized appropriately and output the Key/Fill from a second DeckLink card.
There are 3 states that I need to be able to detect and handle: 1 line of text, 2 lines of text, or none (there should be no boxes when there is no text). The text is always white and in the same region.
The part that I am not sure how to do is detect the presence and location of the text. I don’t need OCR or anything like that. I just need the boundaries of the non-black pixels. The rest of this seems pretty simple.
You could use an Analyze TOP to fetch the maximum for Rows and Columns.
Convert that to a CHOP and now use the Analyze CHOP to find the Index of the First Peak.
Reversing the channel from the TOP to CHOP lets you run through the same process to retrieve the last white pixel, counted from the bottom/end of the image.
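To make the idea concrete outside of the network, here is a minimal plain-Python sketch of the same logic: take the maximum of each row and each column, then find the first and last index above a threshold. The function names, the 0.5 threshold and the list-of-rows input are assumptions for illustration, not TouchDesigner API. It also assumes row 0 is the top of the image, while a TOP counts rows from the bottom.

```python
def first_last_above(values, threshold):
    """Return (first, last) indices whose value exceeds threshold, or None."""
    hits = [i for i, v in enumerate(values) if v > threshold]
    return (hits[0], hits[-1]) if hits else None


def text_bounds(pixels, threshold=0.5):
    """pixels: list of rows, each a list of luminance values in 0..1.

    Returns (left, top, right, bottom) as inclusive pixel indices,
    or None when no pixel exceeds the threshold (the "no text" state).
    """
    row_max = [max(row) for row in pixels]        # one value per row
    col_max = [max(col) for col in zip(*pixels)]  # one value per column
    rows = first_last_above(row_max, threshold)
    cols = first_last_above(col_max, threshold)
    if rows is None or cols is None:
        return None
    return (cols[0], rows[0], cols[1], rows[1])


# Tiny hypothetical frame: 1 = white text pixel, 0 = black background.
frame = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(text_bounds(frame))  # (1, 1, 4, 2)
```

Searching from the front for the first hit and from the back for the last hit is the same thing as the "reverse the channel and find the first peak again" step.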
Alternatively, you could write a little GLSL TOP that uses atomic counters to track the leftmost/rightmost and topmost/bottommost white pixels and writes the values into the RGBA channels of the last pixel processed (keep track of that with another atomic counter). Then just fetch the last value again with an Analyze TOP…
I am running into a synchronization problem with these approaches. Basically the CHOP is a frame or 2 behind the video so you see the video change and then the box moves a frame or 2 later. How can I delay the video to keep them in sync?
How would I figure out how much delay to add? It appears to vary with CPU load. Is there a way to get the processing time from an OP? Would that even work or would it be too late?
To avoid the delay it would be best to set the TOP to CHOP to Immediate (Slow). This will get the texture into the CHOP right away instead of waiting for an extra frame.
Attached now is also a version that works with up to 4 lines, and it comes with a border offset (left, right, top, bottom). It’s a bit crazy but seems to work.
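For anyone who can't open the attachment, the multi-line case can be sketched in plain Python like this. This is not the attached network, just one possible way to express the idea: treat each run of consecutive bright rows as one line of text, take the column extent inside that run, then pad by the border offset and clamp to the image. The function name, the threshold and the assumption that the rows of one text line are contiguous are all hypothetical.

```python
def line_boxes(pixels, threshold=0.5, border=(0, 0, 0, 0)):
    """Return one (left, top, right, bottom) box per run of bright rows.

    pixels: list of rows of luminance values, row 0 at the top (assumed).
    border: (left, right, top, bottom) padding in pixels.
    """
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    pad_l, pad_r, pad_t, pad_b = border
    boxes = []
    start = None  # first row of the text line currently being scanned
    for y in range(height + 1):  # one extra step closes a run at the edge
        lit = y < height and max(pixels[y]) > threshold
        if lit and start is None:
            start = y
        elif not lit and start is not None:
            # Rows start..y-1 form one line; find its column extent.
            cols = [x for x in range(width)
                    if any(pixels[r][x] > threshold for r in range(start, y))]
            boxes.append((
                max(cols[0] - pad_l, 0),
                max(start - pad_t, 0),
                min(cols[-1] + pad_r, width - 1),
                min(y - 1 + pad_b, height - 1),
            ))
            start = None
    return boxes


# Hypothetical frame with two lines of "text" separated by a black row.
two_lines = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(line_boxes(two_lines, border=(1, 1, 0, 0)))
# [(1, 1, 5, 1), (0, 3, 6, 3)]
```

An empty result list corresponds to the "no text, no boxes" state, and the length of the list tells you how many lines are on screen.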