Lesson 3: Streaming to the X window system

One camera to one window

Download lesson [here]

Let’s consider the following filtergraph with streaming, decoding and presentation:

Streaming part
(LiveThread:livethread)---+
                          |
Decoding part             |
(AVThread:avthread) <<----+
|
|       Presentation part
+--->> (OpenGLThread:glthread)

Compared to the previous lesson, we’re continuing the filterchain from AVThread to OpenGLThread. OpenGLThread is responsible for sending the frames to designated X windows.

Note

OpenGLThread uses OpenGL texture streaming. YUV to RGB interpolation is done on the GPU, using the shader language.

Start constructing the filterchain from end-to-beginning:

# presentation part
glthread        =OpenGLThread ("glthread")
gl_in_filter    =glthread.getFrameFilter()

We requested a framefilter from the OpenGLThread. It is passed to the AVThread:

# decoding part
avthread        =AVThread("avthread",gl_in_filter)
av_in_filter    =avthread.getFrameFilter()

# streaming part
livethread      =LiveThread("livethread")

Define the connection to the IP camera as usual, with slot number “1”:

# ctx =LiveConnectionContext(LiveConnectionType_rtsp, "rtsp://admin:nordic12345@192.168.1.41", 1, av_in_filter)
ctx =LiveConnectionContext(LiveConnectionType_rtsp, "rtsp://admin:12345@192.168.0.157", 1, av_in_filter)

Start all threads, start decoding, and register the live stream. Starting the threads should be done in end-to-beginning order (in the same order we constructed the filterchain).

glthread.startCall()
avthread.startCall()
livethread.startCall()

# start decoding
avthread.decodingOnCall()

livethread.registerStreamCall(ctx)
livethread.playStreamCall(ctx)

Now comes the new bit. First, we create a new X window on the screen:

window_id =glthread.createWindow()

We could also use the window id of an existing X window.

Next, we create a new “render group” in the OpenGLThread. A render group is a place where we can render bitmaps - in this case, it’s just the X window.

glthread.newRenderGroupCall(window_id)

We still need a “render context”. A render context is a mapping from a frame source (in this case, the IP camera) to a certain render group (X window) on the screen:

context_id=glthread.newRenderContextCall(1,window_id,0) # slot, render group, z

The first argument to newRenderContextCall is the slot number. We defined the slot number for the IP camera when we created the LiveConnectionContext.

Now, each time a frame with slot number “1” arrives at OpenGLThread, it will be rendered to render group “window_id”.
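The window creation, render group and render context calls come in matched create/delete pairs, so it can be convenient to bundle them. The following helper class is not part of the libValkka API - it is only a sketch wrapping the glthread calls used in this lesson:

```python
# Hypothetical helper - NOT part of the libValkka API.
# It only bundles the glthread calls shown above.

class View:
    """Creates an X window (or adopts an existing one), registers it as a
    render group and maps a slot number to it; close() undoes both."""

    def __init__(self, glthread, slot, z=0, window_id=None):
        self.glthread = glthread
        # adopt an existing X window id if given, otherwise create a new one
        if window_id is None:
            window_id = glthread.createWindow()
        self.window_id = window_id
        glthread.newRenderGroupCall(self.window_id)
        self.context_id = glthread.newRenderContextCall(slot, self.window_id, z)

    def close(self):
        # teardown in reverse order: render context first, then render group
        self.glthread.delRenderContextCall(self.context_id)
        self.glthread.delRenderGroupCall(self.window_id)
```

With such a wrapper, the example above reduces to `view = View(glthread, 1)` followed later by `view.close()`.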

Stream for a while, and finally, close all threads:

time.sleep(10)

glthread.delRenderContextCall(context_id)
glthread.delRenderGroupCall(window_id)

# stop decoding
avthread.decodingOffCall()

Stop the threads in beginning-to-end order (i.e., following the filtergraph from left to right).

livethread.stopCall()
avthread.stopCall()
glthread.stopCall()

print("bye")

So, all nice and simple with the API.

However, here it is important to understand what’s going on under the hood. Similar to AVThread, OpenGLThread manages a stack of YUV bitmap frames. These are pre-reserved on the GPU (for details, see the OpenGLFrameFifo class in the cpp documentation).

The number of pre-reserved frames you need depends on the buffering time used to queue the frames.

You can adjust the number of pre-reserved frames for different resolutions and the buffering time like this:

gl_ctx =OpenGLFrameFifoContext()
gl_ctx.n_720p    =20
gl_ctx.n_1080p   =20
gl_ctx.n_1440p   =20
gl_ctx.n_4K      =20

glthread =OpenGLThread("glthread", gl_ctx, 300)

Here we have reserved 20 frames for each available resolution. A buffering time of 300 milliseconds is used.

For example, if you are going to use two 720p cameras, each at 20 fps, with a 300 millisecond buffering time, then you should reserve

2 * 20 fps * 0.3 sec = 12 frames

for 720p. If this math is too hard for you, just reserve several hundred frames for each frame resolution (or until you run out of GPU memory). :)

If you’re an extremely ambitious libValkka user who wants to use that brand-new 8K camera running at 80 frames per second, then read this first.

One camera to several windows

Download lesson [here]

Streaming the same camera to several X windows is trivial; we just need to add more render groups (aka X windows) and render contexts (mappings):

id_list=[]

for i in range(10):
  window_id =glthread.createWindow()
  glthread.newRenderGroupCall(window_id)
  context_id=glthread.newRenderContextCall(1,window_id,0)
  id_list.append((context_id,window_id)) # save context and window ids

time.sleep(10)

for ids in id_list:
  glthread.delRenderContextCall(ids[0])
  glthread.delRenderGroupCall(ids[1])

Presenting the same stream in several windows is a typical situation in video surveillance applications, where the same stream is shown simultaneously in various “views”.

Keep in mind that here we have connected to the IP camera only once - and that the H264 stream has been decoded only once.

Note

When streaming video (from multiple sources) to multiple windows, OpenGL rendering synchronization to vertical refresh (“vsync”) should be disabled, as it will limit your total framerate to the refresh rate of your monitor (i.e. to around 50 frames per second). On MESA based X.org drivers (intel, nouveau, etc.), this can be achieved from command line with “export vblank_mode=0”. With nvidia proprietary drivers, use the nvidia-settings program. You can test if vsync is disabled with the “glxgears” command (in package “mesa-utils”). Glxgears should report 1000+ frames per second with vsync disabled.

Decoding multiple streams

Download lesson [here]

Let’s consider decoding the H264 streams from multiple RTSP cameras. For that, we’ll be needing several decoding AVThreads. Let’s take another look at the filtergraph:

Streaming part
(LiveThread:livethread)---+
                          |
Decoding part             |   [This part of the filtergraph should be replicated]
(AVThread:avthread) <<----+
|
|       Presentation part
+--->> (OpenGLThread:glthread)

LiveThread and OpenGLThread can deal with several simultaneous media streams, while for decoding, we need one thread per decoder. Take a look at the library architecture page.

It’s a good idea to encapsulate the decoding part into its own class. This class takes as input the framefilter where it writes the decoded frames, and the stream’s RTSP address:

class LiveStream:

  def __init__(self, gl_in_filter, address, slot):
    self.gl_in_filter =gl_in_filter

    self.address      =address
    self.slot         =slot

    # decoding part
    self.avthread        =AVThread("avthread", self.gl_in_filter)
    self.av_in_filter    =self.avthread.getFrameFilter()

    # define connection to camera
    self.ctx =LiveConnectionContext(LiveConnectionType_rtsp, self.address, self.slot, self.av_in_filter)

    self.avthread.startCall()
    self.avthread.decodingOnCall()


  def close(self):
    self.avthread.decodingOffCall()
    self.avthread.stopCall()

Construct the filtergraph from end-to-beginning:

# presentation part
glthread        =OpenGLThread ("glthread")
gl_in_filter    =glthread.getFrameFilter()

# streaming part
livethread      =LiveThread("livethread")

# start threads
glthread.startCall()
livethread.startCall()

Instantiate LiveStreams. This will also start the AVThreads. Frames from the first camera are tagged with slot number 1, while frames from the second camera are tagged with slot number 2:

stream1 = LiveStream(gl_in_filter, "rtsp://admin:nordic12345@192.168.1.41", 1) # slot 1
stream2 = LiveStream(gl_in_filter, "rtsp://admin:nordic12345@192.168.1.42", 2) # slot 2

Register streams to LiveThread and start playing them:

livethread.registerStreamCall(stream1.ctx)
livethread.playStreamCall(stream1.ctx)

livethread.registerStreamCall(stream2.ctx)
livethread.playStreamCall(stream2.ctx)

Create X windows, and map the slot numbers to them:

# stream1 uses slot 1
window_id1 =glthread.createWindow()
glthread.newRenderGroupCall(window_id1)
context_id1 =glthread.newRenderContextCall(1, window_id1, 0)

# stream2 uses slot 2
window_id2 =glthread.createWindow()
glthread.newRenderGroupCall(window_id2)
context_id2 =glthread.newRenderContextCall(2, window_id2, 0)

Render video for a while, stop threads and exit:

time.sleep(10)

glthread.delRenderContextCall(context_id1)
glthread.delRenderGroupCall(window_id1)

glthread.delRenderContextCall(context_id2)
glthread.delRenderGroupCall(window_id2)

# Stop threads in beginning-to-end order
livethread.stopCall()
stream1.close()
stream2.close()
glthread.stopCall()

print("bye")

There are many ways to organize threads, render contexts (slot to X window mappings) and complex filtergraphs into classes. It’s all quite flexible and left to the API user.
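As one sketch of such an organization (these class names do not exist in libValkka; the code is plain bookkeeping around the glthread calls used in this lesson), a registry can track every slot-to-window mapping so that teardown becomes a single call:

```python
class ViewRegistry:
    """Tracks (window_id, context_id) pairs per slot so that all render
    contexts and render groups can be torn down in one call.
    Hypothetical helper - not part of the libValkka API."""

    def __init__(self, glthread):
        self.glthread = glthread
        self.views = {}  # slot -> list of (window_id, context_id)

    def add_view(self, slot, z=0):
        # same three steps as in the lessons above
        window_id = self.glthread.createWindow()
        self.glthread.newRenderGroupCall(window_id)
        context_id = self.glthread.newRenderContextCall(slot, window_id, z)
        self.views.setdefault(slot, []).append((window_id, context_id))
        return window_id

    def close_all(self):
        # undo creation: render context first, then render group
        for pairs in self.views.values():
            for window_id, context_id in pairs:
                self.glthread.delRenderContextCall(context_id)
                self.glthread.delRenderGroupCall(window_id)
        self.views.clear()
```

Such a registry would replace the ad hoc `id_list` bookkeeping used earlier in this lesson.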

One could even opt for an architecture where there is a LiveThread and an OpenGLThread for each individual stream (this is, however, not recommended).

The level 2 API provides ready-made filtergraph classes for different purposes (similar to class LiveStream constructed here).