vf_overlay - Shared memory overlay image buffer
===============================================

Introduction
------------

vf_overlay allows an application controlling MPlayer in slave mode
(referred to as "the application") to overlay an arbitrary image on top of
the video. Its features and interface are similar to the on-screen display
(OSD) functionality found in hardware such as the Hauppauge PVR-350.

Here is a brief list of features:

  * Image buffer stored in 32bpp BGRA.
  * Supports both per-pixel alpha and global alpha.
  * Compositing is MMX accelerated and performs quite well. On a 2GHz
    Sempron, it is possible to render a second, translucent video onto the
    overlay with no frame loss in either video (and with CPU to spare).
    For an unchanging overlay image, vf_overlay can composite an 800x600
    overlay at 24 fps with about 7% CPU overhead (on the same 2GHz CPU).
  * Fixed overlay update frame rate (of about 30 fps), irrespective of the
    video's frame rate or pause state (currently only with -nodouble; see
    the NOTE under Technical Details).
  * Overlay buffer accessible over shared memory.
  * Overlay controlled via slave commands, rather than a fifo.
  * Filter state maintained across 'loadfile' commands and loops.
  * A slice region can be defined to limit what portion of the overlay
    image is drawn. (This will improve performance for an overlay with
    little content, such as a line of text.)

Technical Details
-----------------

The filter is initialized with a single integer argument that is used as
the key for the shared memory object. The dimensions of the overlay image
are those of the display size at the point in the filter chain where
vf_overlay appears. If the display dimensions differ from the frame
dimensions, the overlay will be scaled to fit when it is composited. For
example, an anamorphic widescreen NTSC DVD has a frame size of 720x480 and
a display size of 854x480. The overlay image will therefore be 854x480, and
will be implicitly scaled to 720x480.
If the application requires a fixed or predictable overlay image size, it
can use the scale, expand, and/or dsize filters before specifying overlay
in the filter chain. Otherwise, the application should parse MPlayer's
output to determine the overlay size. In this case, if the filter is
reinitialized (due to a loadfile command) and the display size differs
from that of the previous file, the previous shared memory segment will be
destroyed and a new one created.

For example, consider this command line:

    mplayer -vf overlay=1234567 movie.avi

vf_overlay will create a System V shared memory buffer with the key 1234567
that holds a BGRA image whose size is the display size of movie.avi. The
application will then access this buffer using the given key. In order to
determine the dimensions of the overlay image, the application will need
to look for a line in MPlayer's output such as:

    overlay: 854x480 BGRA; shmem key: 1234567; MMX accelerated.

Applications that intend to use the overlay for more complicated displays
(such as menus) may require a fixed and predictable overlay image size,
regardless of the video's display size. In this situation, it makes more
sense for the controlling application to explicitly use software scaling:

    mplayer -vf scale=800:-2,expand=800:600,dsize=800:600,overlay=1234567 movie.avi

When the overlay filter is initialized, it will output a line such as:

    overlay: 800x600 BGRA; shmem key: 1234567; MMX accelerated.

At this point, the application can attach to the shared memory and render
to the overlay image. The actual buffer will be (16 + width * height * 4)
bytes, corresponding to 1 byte for locking (the "lock byte"), 15 bytes of
padding, followed by 4 bytes (32bpp) for each pixel of the overlay's BGRA
image. For the 800x600 overlay above, that is 16 + 800 * 600 * 4 = 1920016
bytes.

The lock byte is used as a very simple synchronization mechanism.
It can hold two possible values depending on the state of the buffer, and
its purpose is to ensure that the application does not alter the buffer
while vf_overlay is in the middle of reading from it. The possible values
for the lock byte are:

    BUFFER_UNLOCKED (0x10)
        The application is free to write to the overlay buffer. vf_overlay
        sets this value when it is not reading from the buffer. The overlay
        buffer is initialized to this state.

    BUFFER_LOCKED (0x20)
        The application is not free to write to the overlay buffer. When
        the application writes to the overlay buffer, it sets this value.
        It must then wait for vf_overlay to reset the lock byte to
        BUFFER_UNLOCKED before writing to the buffer again. After the
        application writes to the buffer and sets this value, it must
        follow with an "overlay invalidate" slave command. (See below.)

Internally, vf_overlay maintains two buffers. The first buffer is the
shared BGRA buffer, which the application will render to. The second
buffer is a planar YUVA buffer (YV12 plus alpha channel). This "YV12A"
buffer is what vf_overlay reads when compositing. Therefore, merely
writing to the shared BGRA buffer isn't sufficient for the overlay to be
updated. The application must follow up with an "overlay invalidate" slave
command to cause vf_overlay to synchronize its internal YV12A buffer with
the BGRA buffer.

NOTE: Currently, the only time the overlay update rate is independent of
the video frame rate is when double buffering is disabled (with the
-nodouble switch). When double buffering is enabled (which is the default
behaviour in MPlayer) and the video is playing, the overlay can only be
updated as fast as the video frame rate.

Slave Command
-------------

In addition to modifying the overlay image buffer, the application will
control the overlay by issuing an "overlay" slave command, which takes the
following form:

    overlay cmd=args[,cmd=args[,cmd=args[, ...]]]

Multiple overlay commands "cmd" can be specified using only one "overlay"
slave command.
Only one "overlay" slave command is evaluated between each pair of adjacent
video frames, so if you want to perform multiple operations on the overlay
before the next frame is drawn (for example, invalidating two different
rectangles and adjusting the global alpha), you must list those operations,
separated by commas, in one "overlay" slave command. Otherwise, if you
issue N separate "overlay" commands, it will take N frames for all of those
changes to take effect.

The values for "args" are dependent on "cmd". Below are the supported
commands "cmd" for the "overlay" slave command.

    invalidate=x:y:w:h
        Indicates that the specified rectangle of the overlay image has
        been updated by the application, and therefore this region of the
        overlay as it is currently displayed is invalid. This forces an
        internal BGRA -> YV12A colorspace conversion and will cause any
        changes made to that region of the overlay image to be updated the
        next time the overlay is drawn. Arguments are specified in pixels,
        and refer to the left, top, width, and height respectively.

    slice=y:h
        Clip the overlay viewport to the specified slice, where y is the
        top of the slice and h is the height. The arguments are specified
        in pixels and are relative to the whole overlay image. This causes
        only the specified region of the overlay buffer to be composited
        onto the video. If only small regions of the overlay are used, the
        application can use this as a hint to vf_overlay to speed up
        rendering. vf_overlay is initialized with a slice region covering
        the entire overlay image.

    visible=[0|1]
        Draw overlay if 1, or don't draw overlay if 0. vf_overlay is
        initialized with visible=0, so in order to display the overlay,
        the application will first have to set visible=1.

    alpha=[0..256]
        Sets the global alpha level for the overlay. A value of 0 is
        semantically equivalent to setting visible=0. 255 is fully opaque.
        A value of 256 is a special value that forces vf_overlay to ignore
        the per-pixel alpha channel of the overlay image, thus causing the
        video to be fully obstructed within the defined slice region. For
        an overlay that is fully opaque and fully obstructs the video,
        setting the global alpha to 256 will improve performance.
        vf_overlay is initialized to alpha=255.

Here are some examples:

    overlay visible=1,invalidate=0:0:640:480

        Sets the overlay visible, and causes the region of the BGRA buffer
        starting at coordinates (0,0) (top left corner) with a width of
        640 pixels and a height of 480 pixels to be updated on the next
        frame draw.

    overlay invalidate=10:10:50:25,invalidate=200:200:50:25

        Causes the two regions (10,10 50x25) and (200,200 50x25) to be
        updated on the next frame draw.

    overlay alpha=255
    overlay alpha=200
    overlay alpha=150
    overlay alpha=75
    overlay alpha=0

        Causes the overlay to fade out over 5 frames, starting at 255
        (fully opaque) and progressing to 0 (fully transparent).

    overlay alpha=255,alpha=200,alpha=150,alpha=75,alpha=0

        Causes the overlay to become immediately transparent. Because
        these commands are chained together, they are all evaluated before
        the next frame is drawn, and therefore only the last alpha applies.

    overlay alpha=150,invalidate=0:0:640:100,slice=0:100

        Sets the global overlay alpha to 150, causes the overlay to be
        updated at the region (0,0 640x100), and causes the overlay to
        render only at the same region. Any pixels subsequently changed
        outside this slice region will not be drawn until the slice region
        is expanded to include those pixels.

    overlay visible=0,invalidate=0:0:640:480

        Hides the overlay and updates the region at (0,0 640x480). Note
        that "updating" only means an internal synchronization between the
        BGRA and YV12A buffers. Even though the overlay is hidden and this
        update won't be seen (until visible=1 is later set), the colorspace
        conversion is still performed.
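Because only one "overlay" slave command is evaluated per frame, an
application will usually want to batch its operations into a single command
line. The following is a minimal Python 3 sketch of such a helper; the
function name overlay_command and its tuple-based calling convention are
illustrative assumptions, not part of vf_overlay or MPlayer.

```python
def overlay_command(*ops):
    """Build one "overlay" slave command line from several operations.

    Each operation is a tuple: the command name followed by its arguments,
    e.g. ("invalidate", 0, 0, 640, 480). The helper name and calling
    convention are hypothetical; only the output format comes from the
    documentation above.
    """
    parts = []
    for name, *args in ops:
        # Arguments are joined with ':' and commands with ','.
        parts.append(name + "=" + ":".join(str(a) for a in args))
    return "overlay " + ",".join(parts) + "\n"


# Compose the alpha/invalidate/slice example from the list above.
line = overlay_command(
    ("alpha", 150),
    ("invalidate", 0, 0, 640, 100),
    ("slice", 0, 100),
)
print(line, end="")
```

The returned string already ends with a newline, so it can be written
directly to MPlayer's standard input in slave mode.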
Example
-------

Here is a small example in Python (Python 2, using the shm module) that
demonstrates how to use vf_overlay:

    import shm, os, sys

    # os.popen runs the command through /bin/sh, so use the portable
    # ">/dev/null 2>&1" redirection rather than the bash-only "&>".
    p = os.popen("mplayer -slave -vf scale=640:-2,expand=640:480," + \
                 "dsize=640:480,overlay=123456 movie.avi >/dev/null 2>&1",
                 "w", 0)
    print "Hit ENTER once the movie is playing ..."
    sys.stdin.readline()

    # Set up the shared memory with key 123456.
    mem = shm.memory(shm.getshmid(123456))
    mem.attach()

    # The first byte is the lock byte, which is BUFFER_LOCKED, followed by
    # 15 padding bytes. The remaining bytes represent the BGRA pixels; the
    # same 4-byte pixel is repeated over the whole overlay.
    mem.write("\x20" + "\x00" * 15 + "\xa8\x77\x80\x88" * 640*480)
    p.write("overlay visible=1,invalidate=0:0:640:480\n")

    print "Overlay (translucent filled blue box) should be visible; ENTER to quit."
    sys.stdin.readline()
    p.write("quit\n")

This example is quick and dirty. Of course, a real application would want
to use an image library to render images and/or text to the buffer.
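
The status line, buffer layout, and lock byte handshake described under
Technical Details can also be sketched in Python 3 without a running
MPlayer. In the sketch below, a bytearray stands in for the attached shared
memory segment, and the names parse_status, buffer_size, and try_write are
hypothetical helpers. The order of operations (check for BUFFER_UNLOCKED,
write the pixels, then set BUFFER_LOCKED) is one reading of the protocol
described above, not code taken from vf_overlay.

```python
import re

BUFFER_UNLOCKED = 0x10
BUFFER_LOCKED = 0x20
HEADER_SIZE = 16  # 1 lock byte + 15 padding bytes

# Matches e.g. "overlay: 854x480 BGRA; shmem key: 1234567; MMX accelerated."
STATUS_RE = re.compile(r"overlay: (\d+)x(\d+) BGRA; shmem key: (\d+)")


def parse_status(line):
    """Return (width, height, key) from the filter's status line, or None."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    width, height, key = (int(g) for g in m.groups())
    return width, height, key


def buffer_size(width, height):
    """Total size in bytes: 16-byte header plus 4 bytes per BGRA pixel."""
    return HEADER_SIZE + width * height * 4


def try_write(buf, pixels):
    """Write a full BGRA frame if the buffer is unlocked.

    Returns False without touching the buffer if vf_overlay has not yet
    reset the lock byte to BUFFER_UNLOCKED. On success, the caller is
    expected to send an "overlay invalidate" slave command afterwards.
    """
    if len(pixels) != len(buf) - HEADER_SIZE:
        raise ValueError("pixel data must cover the whole overlay")
    if buf[0] != BUFFER_UNLOCKED:
        return False
    buf[HEADER_SIZE:] = pixels
    buf[0] = BUFFER_LOCKED
    return True


# Stand-in for the shared memory segment (an assumption for this demo).
width, height, key = parse_status(
    "overlay: 854x480 BGRA; shmem key: 1234567; MMX accelerated.")
buf = bytearray(buffer_size(width, height))
buf[0] = BUFFER_UNLOCKED  # the state vf_overlay initializes the buffer to

pixels = bytes([0xA8, 0x77, 0x80, 0x88]) * (width * height)
first = try_write(buf, pixels)   # succeeds and locks the buffer
second = try_write(buf, pixels)  # refused until the lock byte is reset
```

In a real application, buf would be the attached System V shared memory
segment rather than a local bytearray, and a refused write would be retried
once vf_overlay has set the lock byte back to BUFFER_UNLOCKED.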