Looking for Deep Machine Learning video / image resolution/frame-rate applications for upscaling?


I was curious if anyone can point to a good, not-difficult-to-install-and-configure GNU/Linux app for video resolution up-scaling and also frame rate up-scaling that uses deep machine learning, AND that does not require a GPU (many machine learning apps require one)? Not too difficult is important, as (per my comments below) I struggled with more complex apps.


To explain, I have some old home digicam videos from the 2003 to 2005 timeframe, taken at 320x240 resolution and 15fps (I even have some at 160x120 resolution). I would like to improve the quality of these, increasing the frame rate to say 30fps, and increasing the resolution to 640p. This is not as impossible as it may sound with deep machine learning, although obviously there are major quality limitations.

I can do this with ffmpeg, but of course the quality is not the same as that achievable with deep machine learning. One ffmpeg command line I use is:

Converts to 854x640 and then increases frame rate to 120 fps:

videofile="input"; ffmpeg -i "$videofile.AVI" -vf "scale=854:640" temp_640p.mp4; ffmpeg -i temp_640p.mp4 -filter:v minterpolate -r 120 "${videofile}-640p-120fps.mp4"; rm temp_640p.mp4

where input.AVI is the original video. This outputs a video at 854x640 resolution and 120fps named “input-640p-120fps.mp4”.

or sometimes I find it better to reverse the order and increase the frames per second first, ie:

Converts to 120 fps, and then increases resolution to 854x640 :

videofile="input"; ffmpeg -i "$videofile.AVI" -filter:v minterpolate -r 120 temp-120fps.mp4; ffmpeg -i temp-120fps.mp4 -vf "scale=854:640" "${videofile}-120fps-640p.mp4"; rm temp-120fps.mp4

where input.AVI is the original video. This outputs a video at 854x640 resolution and 120fps named “input-120fps-640p.mp4”.
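
For what it's worth, both steps can also be chained in a single ffmpeg filter graph, which avoids the intermediate file and its extra lossy re-encode (a sketch using the same assumed file names as above; minterpolate's fps option sets the target frame rate directly):

```shell
videofile="input"
# scale first, then motion-interpolate to 120 fps, in one filter chain
ffmpeg -i "$videofile.AVI" -vf "scale=854:640,minterpolate=fps=120" "${videofile}-640p-120fps.mp4"
```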

This actually gives a better quality video than if I simply played the original video with vlc or smplayer, or if I simply performed resolution up-scaling with handbrake.

Here is a comparison of handbrake (on the left) to ffmpeg (on the right). The original video was only 160x120, and I up-scaled this to 854x640.



I also stabilized the video (on the right) with ffmpeg.

However, this is no deep machine learning creation, which would apply AI techniques to create a higher quality output video. I believe deep machine learning can do better than ffmpeg.

So can anyone offer any advice here?

I researched this a bit, and the best I could find were apps that required a container and knowledge of python.

vapoursynth: For example, “vapoursynth” might provide some functionality in this area, but I find the instructions for basic users (like myself) on how to use it lacking. The VapourSynth instructions immediately dive into ‘script’ files, which clearly are not bash scripts, and that leaves me lost.

Image-super-resolution (ISR): An app for image resolution upscaling is called “Image super-resolution” (ISR), but I can not see (1) if it can be adapted for videos, nor (2) if not having a GPU is an issue. I also note from reading that it requires python knowledge I don’t have, and recommends container use, of which I have no experience (nor have I conducted any research).

SVT - I could not figure out how to use this app to create videos as good in quality as I could with ffmpeg. As far as I can tell, it does not use effective deep machine learning, albeit I could be wrong. It’s been a year since I played with SVT.

Further, neither vapoursynth nor ISR is simple for me to use. I failed to figure out how to use vapoursynth, as its reference to scripts (that were not bash shell scripts) totally lost me.

Any references / guidance as to a basic/simple correct direction would be appreciated.

(I did read part of the manuals and also some blogs/user-guides for vapoursynth and ISR, but I believe they assumed a level of knowledge of python and containers which I do not have, as they totally lost me when they went into non-bash scripts.)

I am currently testing “rife-ncnn-vulkan”, which is on GitHub: https://github.com/nihui/rife-ncnn-vulkan

Rather than build it for openSUSE, I am using the precompiled version for Ubuntu that also works for openSUSE LEAP-15.3. Note this is a command line tool, and the github page explains how to use it.

I note that RIFE ( https://arxiv.org/abs/2011.06294 ) stands for Real-Time Intermediate Flow Estimation for Video Frame Interpolation, and that rife-ncnn-vulkan uses the ncnn project ( https://github.com/Tencent/ncnn ) as its universal neural network inference framework.

As a test, I have taken a video at a low resolution of 160x120 @ 15fps, and I am upscaling it to 854x640 @ 120fps. This is a very demanding test (possibly too demanding), and there is only so much one can do.

I am comparing the output I get from rife-ncnn-vulkan to that which I obtained using only ffmpeg. At present I can’t say rife-ncnn-vulkan gives a superior result (ie ffmpeg is just as good as ‘rife-ncnn-vulkan’), and I note ffmpeg alone is significantly quicker on an Intel CPU.

However I am still testing this, and my opinion could change as I learn some more.

I discovered other apps that will also increase frame rate (and apps that increase image resolution), however they are far, far less user friendly and require knowledge of running python scripts and of setting up libraries/directories of supporting functions also created with python … all of which at present significantly exceeds my knowledge. I may post some links to guides (which I failed to implement) for setting up the appropriate python app.

Can you post the video and command you’re using? I can build it tomorrow for Leap and Tumbleweed.

I had some success with “rife-ncnn-vulkan”, but clearly it’s not the right app for my hardware. To use this app to increase the frame rate of a video, one must:

  1. deconstruct the video into MANY frames saved as .png image files, placing them in a dedicated directory, using ffmpeg (this is quick to do)
  2. use rife-ncnn-vulkan to take the .png image files and double the number of image files, with a new interpolated image between each pair of previous and next images (ie in essence creating the images needed to double the number of frames and hence the frame rate). This is slow, dependent on the number of frames.
  3. with twice as many .png files (ie twice as many frames), use ffmpeg to recombine the .png files into a video (this is also quick to do).
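
The three steps above, as I understand them, can be sketched roughly as follows (the directory and file names are my assumptions, and the rife-ncnn-vulkan binary path depends on where the precompiled package was unpacked):

```shell
mkdir -p input_frames output_frames
# 1. deconstruct the video into numbered .png frames (quick)
ffmpeg -i input.AVI input_frames/%08d.png
# 2. interpolate a new frame between each pair of frames (slow on a CPU)
./rife-ncnn-vulkan -i input_frames -o output_frames
# 3. recombine the doubled set of frames at double the frame rate (quick)
ffmpeg -framerate 30 -i output_frames/%08d.png -c:v libx264 -pix_fmt yuv420p input-30fps.mp4
```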

For my test, taking a video that was originally recorded at 15 fps and increasing it to 120 fps, I ran ‘rife-ncnn-vulkan’ a few times: starting with frames from 15 fps to create enough frames for 30 fps, running ‘rife-ncnn-vulkan’ again to create enough frames for 60 fps, and again to finally create enough frames for 120 fps, and only then recombining the frames into a video at 120 fps.
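
Since each pass in essence doubles the frame rate, going from 15 fps to 120 fps needs three passes. A small sketch of the arithmetic (the 240-frame starting count is an assumed example for a 16-second clip; inserting one new frame between each adjacent pair yields 2N-1 frames per pass):

```shell
fps=15; target=120; frames=240     # assumed: a 16 s clip at 15 fps
passes=0
while [ "$fps" -lt "$target" ]; do
    fps=$((fps * 2))               # each pass doubles the frame rate
    frames=$((frames * 2 - 1))     # one interpolated frame per adjacent pair
    passes=$((passes + 1))
done
echo "passes=$passes fps=$fps frames=$frames"
```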

Surprisingly this worked well, but on a core-i7 (with no special GPU) it took about 4 hours of processing. The video resolution was 854x640 … so it was far from HD.

In contrast, with ffmpeg only, from the same original 15fps video, I can create a video ‘almost’ as good in quality (almost but not quite) in about 5 minutes. The extra quality from ‘rife-ncnn-vulkan’ is mostly not worth 4 hours of processing (with my hardware) - which of course is why people pay money for fast GPUs.

original video: 160x120 @ 15fps
increased to 854x640 @ 120fps
ffmpeg (LEFT ) and rife-ncnn-vulkan (RIGHT)


I’ll post next my research on vapoursynth (a python method), where I did not get past 1st base in trying to use it to resize videos to a larger size or increase the frame rate.

Thanks for the offer - I suspect though I did not make enough progress in my research to give you what you need for packaging … I need to learn more so I can better propose something definitive. Overall, this stumped me.

Here is the link for the software setup of vapoursynth plugins I used (although I used an openSUSE-packaged vapoursynth):

l33tmeatwad Guide - software setup for VapourSynth

BEFORE you read the above link … I’m not too sure how amenable this is to making into a package, nor for that matter if it is really necessary to follow that guide. The guide includes many plugins that have to be added for vapoursynth to work for rescaling.

I also note vapoursynth and vapoursynth tools are packaged for openSUSE:

in fact there is a lot of vapoursynth packaging for openSUSE, but I could not tell if any of the additional packages were needed to do what I was attempting (which is take a video and increase its resolution … and optionally increase its frame rate as well). Here is a link to the openSUSE apps:

However there is no GUI, nor command line instructions, so I was left puzzled as to what was needed.

So going back to the “l33tmeatwad Guide”, after going through all the steps I still could not get vapoursynth to work in order to modify a video … possibly because I ‘mixed’ the openSUSE-packaged vapoursynth with the instructions of that guide … or possibly I had the path wrong for python apps … or most likely because I had no clue as to what I was doing. :cry:

The tabs on the top of the l33tmeatwad page give more information … but I don’t know how relevant it is, other than to provide guidance to script writers.

I tried looking in the vapoursynth python reference but it was focused more on developer info (I believe).

I tried also to run a simple script such as the one described here, but it failed. I also tried a more complex script from here (see code below), and it also failed.

import vapoursynth as vs
import havsfunc as haf           # HAvsFunc script collection (provides QTGMC)
core = vs.get_core()             # deprecated on newer VapourSynth; use vs.core instead
clip = core.ffms2.Source(source=r'E:\Archive\input video.avi')  # load via the FFMS2 plugin
clip = core.resize.Point(clip, format=vs.YUV422P10)  # convert to 10-bit YUV 4:2:2 for QTGMC
clip = haf.QTGMC(clip, Preset='Slower', TFF=False)   # motion-compensated deinterlace/smooth
clip = core.resize.Spline36(clip, 720, 540)          # resize with the Spline36 kernel
clip.set_output()                # required so vspipe can read the result
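
As I later gathered, such .vpy scripts are not run with bash at all: they are python scripts read by VapourSynth's vspipe tool, which pipes the decoded frames to an encoder. A hedged sketch of how one would be run (file names are my assumptions, and the script must end with a set_output() call):

```shell
# vspipe evaluates the python script and writes Y4M video to stdout;
# ffmpeg then encodes that stream to a normal video file
vspipe --y4m upscale.vpy - | ffmpeg -i pipe: -c:v libx264 upscaled.mp4
```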

I noted a ‘basic’ guide here (Native Resolutions and Scaling: https://blog.kageru.moe/legacy/resolutions.html) and a newer version here with individual lines for a script to change resolutions.

I think I am missing something fundamental in what vapoursynth does, and possibly this is not the correct app for me to increase the resolution of a video and increase the framerate of a video.

Anything you can teach on that would be appreciated … but I am a bit saddened to say that I do not know enough to give you better directions as to where to go from here.

So no worries if my research is still FAR too immature to be of much help to you. Maybe eventually I will figure this out.

I think I was (and I am) simply wandering all over the place, and not grasping what is/was needed.

Increasing frame rate: https://www.svp-team.com/wiki/Manual:SVPcode


I went through this same effort of resolution up-scaling with frame rate conversion of videos, somewhat unsuccessfully, a few years ago … although at that time machine learning was more in its infancy and not yet implemented as well in video-resolution/frame-rate up-scaling.

At that time, I installed/played with an earlier (?) version of SVP:mpv, which at the time I think required a custom build/install on my PC, as there was not a suitable openSUSE package at the time.

I was not happy with it, and it was an annoyance to remove. In the end, when the next openSUSE release came out, I elected not to re-install it.

But time has gone by. I don’t recall if the older version used vapoursynth, and I note both vapoursynth and ffmpeg (which SVP:mpv uses) have improved over the past few years, so perhaps it’s worth setting aside my annoyance/previous negative impressions and looking at it again.

I note according to that wiki it can re-encode for frame rate conversions, but its video resolution up-scaling is for real-time use, which I don’t want. Still, frame rate conversion is useful.

Thus far, I have found when taking a video (for example 160x120 @ 15 fps => 854x640 @ 120 fps) that I obtain better quality if I do the resolution increase first, followed by the frame rate increase, as opposed to the other way around (ie frame rate increase followed by resolution increase yields poorer quality). Unfortunately, the better quality order (resolution increase then frame rate increase) is MUCH slower than the reverse. Such is the ‘time price’ of better quality encoding.

I forgot to mention, I of course modified the script for my openSUSE LEAP-15.2, using my /home/oldcpu/Video directory (which I note did not work - it never got past the 3rd line of the script):

import vapoursynth as vs
import havsfunc as haf           # provides QTGMC
core = vs.core                   # vs.get_core() is deprecated and fails on newer VapourSynth - likely why the 3rd line choked
clip = core.ffms2.Source(source='/home/oldcpu/Video/MVI_1002.AVI')
clip = core.resize.Point(clip, format=vs.YUV422P10)  # 10-bit YUV 4:2:2 for QTGMC
clip = haf.QTGMC(clip, Preset='Slower', TFF=False)
clip = core.resize.Spline36(clip, 720, 540)
clip.set_output()                # needed so vspipe can pipe the frames out

So I installed SVP:mpv again and played with it. It has improved a bit, and it’s moderately fast (just as fast as ffmpeg), but I recall better now why I did not like it. Whenever it is running in a window on my KDE, say processing a video, if I open another window and play a video or something with smplayer, SVP will intercede and throw a watermark on the video. The only way I have found to remove the realtime watermark is to close SVP. I find that intrusive behaviour highly annoying.
Below is a video comparing ffmpeg (LEFT) to SVP (right):

  1. original video 160x120 @15fps increased to 854x640 @ 15fps with ffmpeg
  2. frame rate then increased from 15 fps to 120 fps with ffmpeg(left) and SVP (right)
    ffmpeg (LEFT) vs SVP (right)



Below is a video comparing rife-ncnn-vulkan (LEFT) to SVP (right):

  1. original video 160x120 @15fps increased to 854x640 @ 15fps with ffmpeg
  2. frame rate then increased from 15 fps to 120 fps with rife-ncnn-vulkan (left) and SVP (right)
    rife-ncnn-vulkan (LEFT) vs SVP (right)


EDIT : As far as I can tell, the SVP I installed is not accessing vapoursynth … which may be limiting its quality.

There is an interesting article here that provides a ‘top level’ explanation as to how it is possible to obtain a reasonably high resolution image/video from a low quality image/video. For increasing resolution, I don’t believe ffmpeg, svp, nor rife-ncnn-vulkan use information/configuration from machine learning. Instead they use ‘trusted’ and ‘reasonably ok’ (until now) methods for increasing resolution, with algorithms such as nearest-neighbor, bilinear or lanczos.

Superior output has recently become possible via “Deep Learning Super Resolution” (DLSR) … however I have yet to find a built app (for users) with such DLSR. As to how this up-scaling is possible from data that is not there in the 1st place, I like this explanation:

Super Resolution

An image’s resolution may be reduced due to lower spatial resolution (for example to reduce bandwidth) or due to image quality degradation such as blurring.

Super-resolution (SR) is a technique for constructing a high-resolution (HR) image from a collection of observed low-resolution (LR) images. SR increases high frequency components and removes compression artifacts.

The HR and LR images are related via the equation:

LR = degradation(HR)

By applying the degradation function, we obtain the LR image from the HR image. If we knew the degradation function in advance, we could apply its inverse to the LR image to recover the HR image. Unfortunately, we usually do not know the degradation function beforehand. The problem is thus ill-posed, and the quality of the SR result is limited.

DLSR solves this problem by learning image prior information from HR and/or LR example images, thereby improving the quality of the LR to HR transformation.

The key to DLSR success is the recent rapid development of deep convolutional neural networks (CNNs). Recent years have witnessed dramatic improvements in the design and training of CNN models used by Super-Resolution.

Of course, even with the best ‘machine learning’ up-scaling there are inherent assumptions in the trained model (despite the training done to improve the DLSR). Details in the actual scene that were not captured in the low resolution image/video will of course not appear when upscaled. Still, the end product can be much better than if not upscaled.
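
The ill-posedness in the quoted explanation can be seen with a trivial numeric sketch (the pixel values are made up): a 2x downscale that averages neighbouring pixels maps two different HR inputs to the identical LR output, so no inverse function can pick the right original.

```shell
# two different pairs of HR pixel values
hr1_a=100; hr1_b=120
hr2_a=90;  hr2_b=130
lr1=$(( (hr1_a + hr1_b) / 2 ))   # average-downscale of the first pair
lr2=$(( (hr2_a + hr2_b) / 2 ))   # average-downscale of the second pair
echo "lr1=$lr1 lr2=$lr2"         # both pairs collapse to the same LR value
```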

Ultimately an app that uses DLSR is what I hope to achieve/find, although for now I am happy to use the traditional nearest-neighbor, bilinear or lanczos algorithms.

Here is another good article, this one on an openCV implementation, where it shows images comparing the effects of DLSR (Deep Learning Super Resolution) being applied: https://www.pyimagesearch.com/2020/11/09/opencv-super-resolution-with-deep-learning/

openCV is available for openSUSE, but I note the nominal version (3.x) for LEAP-15.2 is too low a version, and one needs at least 4.3.0 for a python interface. One can get v.4.5.4 of openCV for LEAP-15.2 and can get v.4.4.0 for LEAP-15.3.

I still have not been able to wade through the openCV documentation on DLSR to figure out how to proceed …

This is all a challenge (IMHO) as it’s fairly new.

  1. Leap 15.2 is outdated with glibc 2.26 and near its end of support. Use Leap 15.3 or TW.

  2. About the “degradation function” LR = degradation(HR): the person who wrote this has little knowledge of mathematics. In converting HR to LR we lose some info, and we cannot restore HR after that (because we threw away the info that the “degradation function” removed).

  3. It is rather stupid to apply a technique intended for still images to video, because with video we have extra info. And we can enhance video before upscaling - see “AMD Fluid Motion” and similar software.


Custom – Allows you to adjust the following video settings:

  • Video Sharpness – Select from 3 levels of video sharpening or turn video sharpening off.
  • Video Color Vibrance – Select from 3 levels of color vibrance or turn video color vibrance off.
  • AMD Steady Video – Enable or disable AMD Steady Video. AMD Steady Video uses the driver to attempt to reduce camera shake in a video.
  • AMD Fluid Motion Video – Enable or disable AMD Fluid Motion Video. AMD Fluid Motion Video will adjust the frame rate of the video to smooth out video playback.
  • Custom Brightness – Adjust video brightness by dragging the slider. Brightness can be adjusted between +100% and -100%.


Don’t expect too much from Deep Learning - it is one possibility among many, no more.

That’s interesting I guess for those with AMD hardware - but I don’t have such, nor intend to buy such.

It works incredibly well in some areas. In the completely different area of chess, it has mastered the game - no one can come close to top computer chess programs that use machine learning (aka deep learning).

In the case of video improvement, Deep Learning also can work incredibly well in some areas. I have seen enough on line examples, and played around enough (without Deep Learning, using various tools) to become a believer that more is possible.

Take a look at the link I gave wrt the signs being carried by people … https://www.pyimagesearch.com/2020/11/09/opencv-super-resolution-with-deep-learning/ … an appropriately trained Deep Learning model can do that, while a typical upscaling algorithm can not.

The downside of course is in the model training/learning time, and in the processing time (especially given the need to select the correct trained model).

… but I understand you have a different view, so let me say that I prefer not to debate such on this thread - as I suspect we have divergent opinions.

My intent on this thread is to obtain help to get such working - not to debate its merits before it is working. Such a debate doesn’t help me. I have many videos from > 15 years ago that I want to enhance.

Feel free to start a new thread where you post on the topic, amplifying that “Don’t expect too much from Deep Learning”.

Here is an example of using AI software to improve the framerate of a video … it’s sufficient to watch the first couple of minutes, although one can watch the entire video if curious.

This was made by Scott Manley, who is a space/satellite enthusiast


Now, taking slow 1 fps videos of the moon and speeding them up to 25 fps is not my intent … but this did catch my eye and it illustrates what is possible. I seriously doubt that at present ffmpeg by itself could do this.

Of course Scott was using a high powered AMD GPU to do this, and it took several hours. … But it gives an idea as to what can be done.

My intent is to take basic home videos from 15 years ago, some at 15fps, and some at very low resolutions, and to improve their quality.

At present, I can’t point to any particular piece of software and say this is the way to go, as I have yet to get a single Deep Learning piece of software running for frame rate changes or for super resolution changes. Rather, I have the more traditional software running for such, which is impressive … but I suspect it may be possible to do better.

I am still bouncing around from app to app, not able to get past the basics to install, and even worse, not far enough to even pose a question.

I was trying an app whose install should be very basic (ie functionality very basic and thus not likely to yield good results), and it stumped me right at the bash/ksh level; I posted about it on our forum: https://forums.opensuse.org/showthread.php/564376-Bash-shell-script-path-execution-question

This super resolution / frame rate learning effort of mine (to use/learn about Deep Learning apps as opposed to the more classical apps) is proving for me to be a challenge.

This is a status update of my efforts - no help requests here in THIS specific post (maybe in later posts) … so perhaps skip this post if the status is not of interest.

I am still banging my head against the wall here … and possibly I should be blogging about this instead, as I am beginning to think few have gone down this path before me, and those who have are so far advanced in their efforts that they don’t have the inner-newb needed to provide suggestions to a newb (myself in this area). I can fully appreciate that.

I confess I have not yet learned enough to ask ‘pointed’ questions, and my efforts here are revealing major gaps in my GNU/Linux knowledge, which I guess is good … as learning of the knowledge gaps is possibly the 1st step in later filling these gaps with useful information.

I have recently read of an app called “DAIN” (Depth-Aware Video Frame Interpolation) which can be used for frame interpolation. However, my understanding is that ‘as made available’ on the specific GitHub page for DAIN, one needs an NVIDIA GPU.

I also noted apps based on that, which do not require such a GPU (although obviously they execute a dozen or more times slower). For example:

I tried running these on my Intel Core-i7-4770 CPU desktop machine, but when they ran, they failed to properly interpolate. I did receive a warning while they were running that the Haswell processor was not fully supported.

I was able to run these on my new laptop, a Lenovo X1 Carbon gen-9 with an Intel Core-i7-1165G7 CPU (which is a Tiger Lake processor). Nominally this Lenovo’s mobile 1165G7 CPU is about the same speed as my desktop’s Core-i7-4770, except that the 1165G7, being much newer, has more GPU functionality.

When I first ran dain-ncnn-vulkan

oldcpu@X1-Carbon-G9:~/dain/dain> ./dain-ncnn-vulkan -i input_frames -o output_frames

I obtained the errors

vkCreateInstance failed -9
vkCreateInstance failed -9
vkCreateInstance failed -9
invalid gpu device

Since I had vulkan installed, this suggested to me that I was missing a software package needed for vulkan to run on my Intel CPU. Hence I installed libvulkan_intel, and this time the app ran.

I ran these apps on a video of resolution 160x120 @ 15 fps which is about 16-seconds long (so to initially increase frame rate to 30 fps, but stay at same low 160x120 resolution).

  • dain was very slow to execute, creating frames at a speed of about 1.5 frames per second.
  • cain failed to run properly, not interpolating correctly and making every second frame 100% black.
  • rife was very fast to execute, creating new frames at a rate of about 50fps (ie much faster than dain).

When running dain and cain, I noted the error “MESA-INTEL: warning: Performance support disabled, consider sysctl dev.i915.perf_stream_paranoid=0”.

Rather than immediately go down the route of running sysctl, I first installed “Mesa-vulkan-device-select” and “Mesa-vulkan-overlay”. However, this made no difference and the MESA-INTEL warning still appeared. I also installed Mesa-libOpenCL, intel-opencl and libOpenCL1 (which added dependencies libclang11 and libclc), but those installs made no difference.

So I ran

sudo sysctl dev.i915.perf_stream_paranoid=0

and the next time I ran dain I did not see that MESA-INTEL warning.

However, I observed no speed or functional difference in dain (very slow), cain (won’t interpolate), or rife (same fast speed). Still, that warning is gone - which may or may not mean something.

I was curious and investigated this further, but I am not convinced I found the correct Intel page. I note this Intel page on Intel’s site on the topic, where it mentions that sysctl command.

I also note it gives instruction on how to make the command permanent:

This command makes a temporary change that is lost after reboot. To make a permanent change, enter:
echo dev.i915.perf_stream_paranoid=0 > /etc/sysctl.d/60-mdapi.conf

However I have yet to do that.
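
One thing I noticed about that quoted command: the plain `>` redirect only works in a root shell, because even with sudo in front of echo, the redirection would still be performed by the unprivileged shell. A sketch of an equivalent that works from a normal user account (the file name is taken from the Intel guidance quoted above):

```shell
# write the setting as root via tee, then reload all sysctl config files
echo 'dev.i915.perf_stream_paranoid=0' | sudo tee /etc/sysctl.d/60-mdapi.conf
sudo sysctl --system
```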

There was guidance given on how to install intel-metrics, but I have also not done that. I don’t know if it is needed.

I am testing ‘rife’ vs ‘dain’, but at the moment I have not seen a difference in the quality of their output after a frame rate increase. I do note ‘dain’ is incredibly slow and ‘rife’ is much faster (on my PC), so likely I will not use ‘dain’ unless I find it produces much higher quality frame rate increases.

If I get to the point where I understand this enough I will post a question - at the moment I think I am too much of a newb in this to even come up with a question. < sigh >

Further to this, I thought I would show the output of one frame (compared to the original, with the original increased in size but with no interpolation) where ffmpeg was used to increase the resolution of a video from 160x120 to 854x640:

FRAME Comparison between two videos:
LEFT original (160x120) - RIGHT ffmpeg (854x640)

I used bilinear in that upscaling (ie I did not use a neural network). The command for the video was:

ffmpeg -i input-video-120p.mp4 -vf "scale=854:640:flags=bilinear" output-video-640p.mp4

I hope eventually to get a deep learning super resolution app running, as I am especially curious to see how it performs with my own videos - especially in areas where there are pixelated/noisy signs in the low resolution frames. I know I could get rid of the sign noise/pixelation with very time consuming frame-by-frame efforts in gimp, but that would take weeks of manual effort for a single video. That is one of the areas where I hope neural networks can do better.

I have been playing with STARnet (from a blog here: http://technology.research-lab.ca/2021/01/upscale-and-interpolate-video-super-resolution-using-starnet/ ) to upscale a video.

I had some difficulty in getting it running initially, due to my lack of familiarity with some shell, scripting, and python aspects. Fortunately I had some help in this thread: https://forums.opensuse.org/showthread.php/564376-ksh-shell-(run-via-bash)-script-path-execution-question.

Further, fortunately my knowledge (albeit limited) of ffmpeg allowed me to take the STARnet output resized image frames and create a video.

Here are some comparisons, where the original video was 160x120 @ 15 fps and only 16 seconds in duration (about 250 frames):

LEFT (original to 640x480, kept pixelization) - RIGHT (STARnet)



= = = = to be continued = = =

=== continued ===

LEFT (original to 640x480, kept pixelization) - RIGHT (STARnet)
STARnet (on right) stabilized with ffmpeg


I find, depending on the amount of shake in the original video, that adding software stabilization can make a significant difference.
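
For reference, the stabilization was done with ffmpeg. As a hedged sketch (file names are my assumptions, and this requires an ffmpeg built with the libvidstab library), the usual two-pass vid.stab approach looks like:

```shell
# pass 1: analyse camera shake and write the motion data to transforms.trf
ffmpeg -i shaky.mp4 -vf vidstabdetect=result=transforms.trf -f null -
# pass 2: apply smoothed transforms to produce the stabilized video
ffmpeg -i shaky.mp4 -vf vidstabtransform=input=transforms.trf:smoothing=30 stabilized.mp4
```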
= = = to be continued = = =