Video streaming

Video streaming is awkward: it seems to be fairly common, I know at least a few people who have worked on software for it (and always wondered why no existing tools sufficed), and it's still tricky to set up. Although the situation is similar with many other things: somehow there's not much usable software, despite all the programming being done; even XKCD #949 is still relevant.

A video-on-demand setup with RTSP inputs

Sometime in 2017, I needed to read RTSP streams from IP cameras, making them available to web browsers via HTML5 Video. The container formats commonly supported by web browsers are:

HLS / MP4
Usually H.264 video and AAC audio; non-free/patent-encumbered; by Apple. There is the OpenH264 implementation, 2-clause-BSD-licensed, for which Cisco pays the royalties.
WebM
Usually VP9 video and Vorbis or Opus audio; free; by On2, Xiph, Matroska, Google.
Ogg
Usually Theora video and Vorbis or Opus audio; free; by Xiph.

I'd ignore H.264 completely if it weren't the most widely supported format in web browsers (in particular, Safari and IE/Edge don't support anything else). As for software, there are:

nginx-rtmp-module
Supports FLV/MP4, but requires rebuilding nginx, which makes maintenance rather painful (manual rebuilds on each update, since it's not in the Debian or CentOS repositories). Has plenty of features, though.
ffserver
No config reload without a restart, and apparently to be discontinued soon. Supports plenty of formats, but doesn't have many features as a server.
icecast
By Xiph; supports Ogg and WebM. MP4 is not supported because it's non-free (though it sort of works for a single client).
proprietary, half-baked, and/or abandoned software
That'd be a pain to work with.

After the initial investigation, I set up an Icecast-based system. To make Icecast create streams only on demand, using RTSP streams as sources, one has to resort to hacks: either turn the RTSP stream into an HTTP one somehow and use it as an on-demand relay, or set up pseudo-authentication (listener_add and listener_remove, to track when a source stream is needed), running and killing ffmpeg instances as needed. I did the latter, with another program updating and reloading icecast.xml, and it worked, though duplicate users should be allowed for that, or it won't be reliable. To support IE/Edge and Safari there is ogv.js, but using JS for decoding is quite laggy. A custom libav-based MPEG-4 streaming server worked in Edge only occasionally, which probably had something to do with installed codecs or drivers (this can be tested with nc and ffmpeg rather easily); it worked in Firefox only with particular ffmpeg settings (not clear which, probably something to do with framerate), while it always worked in VLC and ffplay. Apparently only HLS (with hls.js) is reliable for playback in Edge, so I made the same program that handles Icecast pseudo-authentication serve HLS files too, so that it can easily keep track of all the clients and streams. Oh, and Chromium doesn't support HTTP basic authentication for embedded videos. There's actually more to it, since many things around WWW are broken, but that's a rough outline.
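The pseudo-authentication hack above can be sketched roughly like this: Icecast's URL-based listener authentication calls out to a small service on listener_add and listener_remove, which starts an ffmpeg relay for a mount when its first listener arrives and kills it when the last one leaves. The class, the mount names, and the exact ffmpeg command line here are illustrative assumptions, not the original setup.

```python
# Sketch of on-demand Icecast sources via listener_add/listener_remove
# callbacks. The ffmpeg command line (Theora/Vorbis into Icecast) and all
# names here are illustrative assumptions.
import subprocess


class StreamTracker:
    def __init__(self, rtsp_sources, spawn=None, kill=None):
        self.rtsp_sources = rtsp_sources      # mount -> RTSP URL
        self.listeners = {}                   # mount -> listener count
        self.procs = {}                       # mount -> running source process
        self.spawn = spawn or self._spawn_ffmpeg
        self.kill = kill or (lambda p: p.terminate())

    def _spawn_ffmpeg(self, mount):
        # Transcode the RTSP source and push it to Icecast as a source client.
        url = self.rtsp_sources[mount]
        return subprocess.Popen([
            "ffmpeg", "-rtsp_transport", "tcp", "-i", url,
            "-codec:v", "libtheora", "-codec:a", "libvorbis",
            "-f", "ogg", f"icecast://source:hackme@localhost:8000{mount}",
        ])

    def listener_add(self, mount):
        # Called from Icecast's listener_add URL hook.
        n = self.listeners.get(mount, 0)
        if n == 0:
            self.procs[mount] = self.spawn(mount)
        self.listeners[mount] = n + 1

    def listener_remove(self, mount):
        # Called from Icecast's listener_remove URL hook.
        n = self.listeners.get(mount, 0) - 1
        self.listeners[mount] = max(n, 0)
        if n <= 0 and mount in self.procs:
            self.kill(self.procs.pop(mount))
```

The duplicate-listener caveat from the text applies here too: if Icecast rejects a second listener with the same credentials, the counts get out of sync.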

I guess things would have been a bit nicer if we weren't using protocols designed for hypertext transfer to transfer video streams, along with programs that are supposed to render hypertext documents but are used to watch video; they may not be the best fit. Although it works, mostly, with some hacks.

Later it turned out that the computational cost of real-time transcoding was too high for the hardware in use, while the source streams were H.264 already and HLS is supported in all major web browsers, so I had to disable the Ogg/Icecast/transcoding bits and use HLS only. That's how some of the bad setups happen.
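The trick that makes the HLS-only setup cheap is that ffmpeg can repackage an H.264 RTSP stream into HLS segments with "-c copy", with no re-encoding at all. A sketch of building such a command (the paths, segment duration, and window size are illustrative assumptions):

```python
# Build an ffmpeg command that remuxes an RTSP stream into HLS without
# transcoding. Option values here are illustrative, not a recommendation.
import subprocess


def hls_remux_cmd(rtsp_url, out_dir):
    return [
        "ffmpeg", "-rtsp_transport", "tcp", "-i", rtsp_url,
        "-c", "copy",                    # copy codecs: repackage only
        "-f", "hls",
        "-hls_time", "4",                # ~4-second segments
        "-hls_list_size", "5",           # keep a short live window
        "-hls_flags", "delete_segments", # drop segments that left the window
        f"{out_dir}/index.m3u8",
    ]

# To actually run it (requires ffmpeg and a reachable camera):
# subprocess.Popen(hls_remux_cmd("rtsp://camera/stream", "/srv/hls/cam1"))
```

Any static file server (or the tracking program mentioned above) can then serve the .m3u8 playlist and .ts segments to hls.js or Safari.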

Cameras for streaming

Many configurations are possible for video capture and delivery, to match different goals and preferences. Mine (in the described cases) are those of home/private property surveillance and video chats: mostly on-demand streaming plus constant processing (movement/human detection and alarms), with the overall system being open, customizable, secure, and inexpensive. That suggests free video codecs (Theora, VP8/VP9, maybe AV1), and perhaps uncompressed video: if the interfaces allow it, I think it makes sense to read uncompressed video from cameras, process/check it for movement, and only compress it when it is viewed (streamed over the Internet) or stored. Though it is common to compress video in the camera itself, similarly to how radio signals are extracted and compressed at the source, among other examples, since that greatly reduces the required channel capacity.
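The "process uncompressed, compress only on demand" idea can be illustrated with the simplest possible motion check: mean absolute difference between consecutive grayscale frames, with frames as flat byte sequences. The threshold is an arbitrary illustrative value; real detectors are considerably smarter.

```python
# Naive motion detection on uncompressed grayscale frames: flag motion when
# the mean absolute per-pixel difference exceeds a threshold. Frames are
# flat byte sequences; the threshold of 10.0 is an arbitrary assumption.
def motion_detected(prev_frame, cur_frame, threshold=10.0):
    diff = sum(abs(a - b) for a, b in zip(prev_frame, cur_frame))
    return diff / len(cur_frame) > threshold


still = bytes([100] * 64)                    # a tiny 8x8 "frame"
moved = bytes([100] * 32 + [160] * 32)       # half the pixels changed

print(motion_detected(still, moved))   # True: mean diff is 30, above 10
print(motion_detected(still, still))   # False: identical frames
```

Only frames around detected motion would then be handed to an encoder or a streaming pipeline; the rest is discarded without ever being compressed.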

Hardware codec implementation or acceleration is likely to be useful for an inexpensive (and perhaps fanless) setup. Embedded AMD GPUs tend to support H.264 and H.265 specifically, Broadcom ones seem to just provide more general acceleration, and the general acceleration may suffice for Theora, VP8/VP9, and future (versions of) codecs.

For webcams there is the USB video device class, which supports uncompressed video (YUV) and VP8, in addition to H.264 and others. Apparently USB 3.0 is sufficient for 1080p uncompressed video at 60 Hz, USB 2.0 -- for 720p at 15 Hz. Many regular digital cameras can be used for video streaming too, possibly with the help of gphoto2, though they (at least some relatively old Canon ones) seem to stream Motion JPEG in "live view" mode. Better quality may be achievable with their HDMI outputs and HDMI-to-USB UVC converters. Perhaps for such a setup it'd only make sense to hunt down used cameras and older models.
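Those USB figures can be checked with back-of-the-envelope arithmetic: raw YUV 4:2:0 video uses 1.5 bytes per pixel, so the bitrate is width * height * 1.5 * fps * 8 bits. Comparing against the nominal signalling rates of USB 2.0 (480 Mbit/s) and USB 3.0 (5 Gbit/s):

```python
# Raw YUV 4:2:0 bitrate: 1.5 bytes per pixel per frame.
def raw_yuv420_mbit(width, height, fps):
    return width * height * 1.5 * fps * 8 / 1e6

USB2_MBIT = 480    # USB 2.0 nominal signalling rate
USB3_MBIT = 5000   # USB 3.0 nominal signalling rate

print(raw_yuv420_mbit(1920, 1080, 60))  # ~1493 Mbit/s: fits in USB 3.0
print(raw_yuv420_mbit(1280, 720, 15))   # ~166 Mbit/s: fits in USB 2.0
```

Effective throughput is lower than the signalling rate, but both cases leave a comfortable margin, which is consistent with the claims above.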

Non-USB options include mostly the same ones as for display connection: DisplayPort, HDMI, HDBaseT (complete with Power-over-Ethernet, though apparently otherwise it's pretty much just HDMI over twisted-pair cabling), DVI, VGA, old composite video; those can be looked up, for instance, in the list of video connectors. Cameras with HDMI output are relatively easy to find, and not the most expensive of the digital and relatively modern ones, but still pretty expensive. HDMI, and probably DisplayPort too, are patent-encumbered; HDMI LA collects royalties. Frame grabbers/capture cards tend to be expensive as well, and PCI ones would require larger computers.

Raspberry Pi supports the Camera Serial Interface (MIPI CSI) for video input, and so do Orange Pi, Banana Pi, BeagleBone, and HummingBird; some camera modules (e.g., OV7670) support the Serial Camera Control Bus (SCCB). ESP32-CAM uses CSI too, but it is aimed at ML processing and apparently isn't powerful enough to encode video beyond MJPEG, while Wi-Fi is likely to be rather slow for uncompressed video at any decent resolution and framerate. Single-board computers are more suitable for use as a programmable camera (and maybe some other purposes along with it), even though it feels like overkill (but apparently that's what many IP cameras are anyway, just locked down).

As additional and potentially useful features to consider, there are pan–tilt–zoom (PTZ) cameras, and fiberscopes.

And then there are common consumer surveillance cameras: Wi-Fi and/or Ethernet, bundled with a small computer with a web server on it, providing H.264/H.265-encoded RTSP streams and "smart" stuff integration, but also some useful features: PTZ, infrared lights. Plenty of those around, and they are almost as cheap as web cameras, though with webcams (particularly USB 3+ ones) there seems to be a better chance of finding ones with uncompressed video. OpenIPC (see the OpenIPC HN discussion) is an open IP camera firmware project, which may make some consumer IP cameras more tolerable; though it is partially closed, uses Telegram for communication, and its features are unclear.

I haven't tried setting it up yet, but it seems that once again it's easier (and cheaper) to give up and go with patent-encumbered H.264/H.265, though uncompressed video with compression done by software-defined codecs is achievable and affordable.