google / compare-codecs
License: Apache License 2.0
This needs to verify that encodings are stable or improve for the ffmpeg-provided encoders.
AST parsing is an order of magnitude slower than JSON parsing, and parsing has turned out to be the heaviest part of report processing.
Codecs that support 2-pass encoding (vp8, vp9, x264, x265?) need to be allowed to use it in "unconstrained" mode. We should also have a separate comparison table that only shows 1-pass results.
The encoder version is important if we want to generate statistics comparing one version of an encoder to another version.
This bug tracks picking the encoder version out of the compiled binary (for instance, using "x264 --version").
Numpy gives warnings about poorly conditioned polynomials when given fewer than 3 points (?).
These curves are most likely badly shaped, and need to be inspected to see if we can generate sensible numbers from them at all.
Investigation needed.
compare_json will drop out results with negative scores.
Negative scores are quite common under criterion RT.
When all of a file's scores are negative in every config that has scored all files, compare_json --single_config will fail.
There needs to be a tool for making sure there exists a config with at least one positive score for each file in the test set - one can do it manually by picking a likely config and running force_run_config, but this should be automatable.
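The core of such a tool could look like the sketch below; the data shape (a mapping from filename to per-config scores) is a hypothetical stand-in for whatever the score store actually returns:

```python
def files_without_positive_score(scores):
    """Given {filename: {config_id: score}}, return the files for
    which no config achieved a positive score.

    These are the files for which force_run_config would currently
    have to be run by hand.
    """
    return sorted(
        name for name, per_config in scores.items()
        if not any(score > 0 for score in per_config.values())
    )
```

An automated wrapper would run a likely config on the returned files until the list is empty.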
Some indications are that PSNR rates dropped for many of the MPEG clips with the same settings, but settings' meanings might have shifted.
Not a high priority.
When choosing alternatives in graphs - especially sweepdata.html - the URL should be updated to follow the selection. This would allow sharing of links to specific graph displays.
The AST parsing is just for backwards compatibility, and should be removed.
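Assuming the legacy records were written as Python literals, the compatibility path presumably looks something like this sketch; removing the issue means deleting the except branch and the ast import:

```python
import ast
import json

def parse_result(text):
    """Parse a stored result record.

    JSON is the current format; ast.literal_eval is kept only so that
    old records written as Python literals still load.  literal_eval
    is far slower than json.loads, hence the cleanup request.
    """
    try:
        return json.loads(text)
    except ValueError:
        return ast.literal_eval(text)  # Legacy format; to be removed.
```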
The biggest repository of test files, the Derf filestore, uses .y4m files.
In order to use these, we need to:
a) read the .y4m format in scripts
b) compare the decode result with the .y4m file
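For step a), the YUV4MPEG2 stream header is a single space-separated line of tagged fields, so a reader could start with something like this sketch (frame payload handling omitted):

```python
def parse_y4m_header(header_line):
    """Parse the first line of a .y4m file, e.g.
    'YUV4MPEG2 W1920 H1080 F24:1 Ip A1:1 C420jpeg'.

    Returns a dict keyed by the one-letter tag: W/H for dimensions,
    F for frame rate, I for interlacing, A for aspect, C for
    chroma subsampling.
    """
    fields = header_line.strip().split(" ")
    if fields[0] != "YUV4MPEG2":
        raise ValueError("not a y4m stream")
    params = {}
    for field in fields[1:]:
        params[field[0]] = field[1:]
    return params
```

After this header, each frame is introduced by a "FRAME" line followed by raw planar YUV data, which is what step b) would compare against the decode result.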
In converting to use multiple score stores, it turns out that EncodingDiskCache.AllEncoderFilenames returns just the filename, not the path. This needs fixing.
The current Javascript uses UpperCamelCase for function names, while proper Javascript style is to use lowerCamelCase for functions, and UpperCamelCase only for constructors (~ class names).
This should be fixed.
The command line parameters "--codec", "--criterion" and so on are all over the place.
Their definitions should be collected into a single module for consistency.
(At the moment there are 62 calls to parser.add_argument, but only 21 distinct ones.)
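One way to collect them, sketched below with an illustrative flag set (the defaults and help strings are assumptions, not the project's actual definitions):

```python
import argparse

def add_common_arguments(parser):
    """Single place to define flags shared across tools, so that
    '--codec', '--criterion' and friends behave identically
    everywhere instead of being redefined 62 times."""
    parser.add_argument('--codec', default='vp8',
                        help='Codec under test')
    parser.add_argument('--criterion', default='psnr',
                        help='Scoring criterion')
    return parser

# Each tool would then do:
parser = add_common_arguments(argparse.ArgumentParser())
args = parser.parse_args(['--codec', 'vp9'])
```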
At the moment the verify_scores function uses the current scores to find the "best" scores
to evaluate whether they have changed or not. This is bad in the case where the "best" score
is worsened - the next run of the tool will pick a new configuration as "best".
The solution likely involves moving the choice of scoredir into the optimizer.
As part of the varying process, options can be added to the commandline, but they don't go away. They should - shorter command lines are easier to understand.
When running with a shared repo, and parameter combinations are made illegal, stored parameters may turn illegal. This causes an exception when trying to retrieve "the best config".
Exceptions when retrieving "the best config" from the path should just ignore the bad configs.
The RT mode needs to keep the lookahead parameter at zero to meet its requirements.
This means that the goal-seeker needs to know to avoid it. One way to fit this in is for the ConfigurationFixups function of "Codec" to take a "mode" parameter; that requires keeping the name of the mode around, which argues for encapsulating the scoring function into an object that carries both a mode name and a score function.
An alternative design is to have separate codec names for each mode, which control the underlying parameters.
Recommendation from Marco Panioni:
For RT, VP9 should use --rt --end-usage=cbr --aq-mode=3 - this is most like the mode used in WebRTC for realtime encoding.
The following gave an error:
/usr/local/google/home/hta/code/compare-codecs/tools/ffmpeg -loglevel warning -s 1920x1080 -i video/mpeg_video/Kimono1_1920x1080_24.yuv -codec:v mjpeg -qmin 58 -b:v 2500k -y /usr/local/google/home/hta/code/compare-codecs/workdir/mjpeg/c0a4fd36dd74/2500/Kimono1_1920x1080_24.mjpeg
Error message:
[rawvideo @ 0x2d792e0] Estimating duration from bitrate, this may be inaccurate
[swscaler @ 0x2d650a0] deprecated pixel format used, make sure you did set range correctly
[mjpeg @ 0x2d7bac0] qmin and or qmax are invalid, they must be 0 < min <= max
Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
Likely, qmin is larger than the default value for qmax.
The output format of pylint has changed to hide the filename.
Use a --msg-template option to get it back. Instructions here: http://docs.pylint.org/output.html
Since the parameters.ChangeValue function returns a new parameter block (and similar for its echoes up the stack), it ought to be called something that implies it doesn't change its argument. CreateVariant is one possible name - ModifiedObject would be another.
39 occurrences so far.
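The renamed method would make the copy-on-write behaviour explicit, roughly as in this sketch (the Parameters class here is a minimal stand-in for the real parameter block, not its actual implementation):

```python
import copy

class Parameters(object):
    """Minimal stand-in for the real parameter block."""
    def __init__(self, values=None):
        self.values = dict(values or {})

    def CreateVariant(self, name, value):
        """Return a copy with one value changed; 'self' is untouched.

        The name signals that a new object is returned, which
        'ChangeValue' does not."""
        variant = Parameters(copy.deepcopy(self.values))
        variant.values[name] = value
        return variant
```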
At almost a thousand lines, encoder.py is too big. Consider splitting it.
Removing the Encoding*Cache objects would take ~250 lines off the footprint.
The settings used for x264 were inherited from a test where the constrained baseline H.264 profile was of interest. Default x264 profile should use High, and baseline should be a separate target (if it's still of interest).
libyuv (open source) has a PSNR measurement tool as well as an SSIM measurement tool.
Using those tools instead of / in addition to our source-included PSNR needs to be investigated.
Due to the split between full and limited mode, the run_all_tests doesn't pick up all unittests.
In full mode, it should pick them all.
The setting -qmin 50 -qmax 722 seems to generate an error on BQMall for MJPEG @1200 kbps.
/home/hta/code/compare-codecs/tools/ffmpeg -loglevel warning -s 832x480 -i video/mpeg_video/BQMall_832x480_60.yuv -codec:v mjpeg -qmax 722 -qmin 50 -b:v 1200k -y /home/hta/code/compare-codecs/workdir/mjpeg/fd8f0e24053d/1200/BQMall_832x480_60.mjpeg
[rawvideo @ 0x1b48e20] Estimating duration from bitrate, this may be inaccurate
[swscaler @ 0x1b300c0] deprecated pixel format used, make sure you did set range correctly
Encode took 1.400000 CPU seconds 1.470000 clock seconds
/home/hta/code/compare-codecs/tools/ffmpeg -loglevel warning -codec:v mjpeg -i /home/hta/code/compare-codecs/workdir/mjpeg/fd8f0e24053d/1200/BQMall_832x480_60.mjpeg /home/hta/code/compare-codecs/workdir/mjpeg/fd8f0e24053d/1200/BQMall_832x480_60tempyuvfile.yuv
/home/hta/code/compare-codecs/workdir/mjpeg/fd8f0e24053d/1200/BQMall_832x480_60.mjpeg: Invalid data found when processing input
Settings that completed successfully:
7a2b7def4a12 -848.682010 -qmax 597
e3fef2f71b05 -530.719020 -qmax 597 -qmin 58
b4895ab262ea -499.714020 -qmax 598 -qmin 69
278ae022d480 -499.714020 -qmax 626 -qmin 69
366f29e2de05 -504.051020 -qmax 722 -qmin 67
456420d31ede -848.682010 -qmax 722
56c8aea054b6 -499.714020 -qmax 722 -qmin 69
f1306ea6b766 -499.714020 -qmax 827 -qmin 69
5499167d9533 -848.682010 -qmax 943
7df43d1fe8ce -499.714020 -qmax 943 -qmin 69
e8a589627343 -511.853020 -qmax 943 -qmin 64
d31668749543 -499.714020 -qmax 985 -qmin 69
1a5518512496 -499.714020 -qmax 1024 -qmin 69
Note (from Paul Wilkins): This vp9 control has no effect in 1-pass mode. So this only makes sense when 2-pass mode is fully supported.
Having code that accesses os.env['CODEC_WORKDIR'] and os.env['WORKDIR'] in many places is not good.
Add a module that centralizes access to these variables, and vends the paths when needed.
Note: needs setters, so that tests can substitute paths.
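A sketch of what that module might vend; the class and method names are illustrative, only the environment variable names come from the issue:

```python
import os

class PathConfig(object):
    """Central access point for the work-directory environment
    variables, with a test setter so unit tests can inject paths
    without touching os.environ."""
    def __init__(self):
        self._overrides = {}

    def WorkDir(self):
        return self._overrides.get('WORKDIR',
                                   os.environ.get('WORKDIR'))

    def CodecWorkDir(self):
        return self._overrides.get('CODEC_WORKDIR',
                                   os.environ.get('CODEC_WORKDIR'))

    def SetWorkDirForTest(self, path):
        self._overrides['WORKDIR'] = path
```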
Most of the code for this should be a common utility function VerifyEncodingScoreBetterThan.
At the moment, parsing from config strings, such as reading from storage, does not check the parameter values. They ought to be checked (numeric values are numeric, bounded-range values within ranges, choice values have a valid choice), and action taken (likely raise exception).
This needs tools to clean out improper configurations from storage, too.
This situation arises when the bounds of a parameter have been tightened.
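The check could be driven by a per-parameter spec, roughly as sketched here; the spec dict shape ('type', 'min'/'max', 'choices') is a hypothetical format, not the project's actual parameter definitions:

```python
class ParseError(Exception):
    """Raised when a stored config value fails validation."""

def validate_parameter(name, value, spec):
    """Check one parsed value against its spec and return the
    converted value, or raise ParseError."""
    if 'type' in spec:
        try:
            value = spec['type'](value)
        except ValueError:
            raise ParseError('%s: %r is not a valid %s' %
                             (name, value, spec['type'].__name__))
    if 'min' in spec and value < spec['min']:
        raise ParseError('%s: %r below minimum %r' %
                         (name, value, spec['min']))
    if 'max' in spec and value > spec['max']:
        raise ParseError('%s: %r above maximum %r' %
                         (name, value, spec['max']))
    if 'choices' in spec and value not in spec['choices']:
        raise ParseError('%s: %r not one of %r' %
                         (name, value, spec['choices']))
    return value
```

A cleanup tool would then walk storage, run each stored config through this check, and delete or quarantine the ones that raise.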
We should support video playback of encoded files - preferably side-by-side.
Most likely design:
x264-lossless encoding the resulting YUV file and placing it on the server.
Keep track of two last rate points clicked in a graph, and play them back when requested.
Please contact x264 developers to discuss further.
The results are pretty meaningless when settings are chosen without an understanding of what they do.
At the moment, if all the PSNR values for one codec are higher than the best PSNR values for another codec, the "size AVG" calculation returns 0, and the "size DRATE" calculation returns a very large number.
These cases should be treated sensibly (either clamped to some arbitrary large/small number, or omitted from the "average improvement" calculation altogether).
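The clamping variant could be as simple as this sketch; the clamp value is an arbitrary illustration, and the real fix would live wherever the "average improvement" is computed:

```python
def clamped_avg_improvement(improvements, clamp=500.0):
    """Average per-clip rate improvements (in percent), clamping
    each value to +/-clamp so that a non-overlapping pair of curves
    (which yields 0 or a huge DRATE) cannot dominate the average."""
    clamped = [max(-clamp, min(clamp, x)) for x in improvements]
    return sum(clamped) / len(clamped)
```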
Is the testing of hardware encoders something that falls under the goals of this project?
For example I believe that Intel QuickSync Video, nVidia NVENC and AMD VCE h.264 implementations are all widely available with open source drivers (and the right hardware).
A hover or popup for each point should show the command line and the resulting score components used for that particular encoding.
It would be much better if the file-size comparison could be made at similar encoding speeds, or if the results included speed data along with the filesize/PSNR data.
The current default setting of x264 is '-preset slow', which can give very different filesize/PSNR results than '-preset ultrafast' or '-preset superfast'.
For most codecs, the speed info can be read from the encoder's screen output; for example, x264 prints "encoded xxx frames, xxx.xx fps, xxx.xx kb/s" at the end of encoding.
Both VP9 and x264 claim higher performance numbers when multithreading; this is important for the RT case, where encoding speed is the limiting factor.
But target systems have very different numbers of (free) cores, so results at various threading levels might be important to show, not just "max threads".
There should be a place in the database to store results for the same parameter sets, but differing encoder versions.
Depends on #68
The "tuned results" display is not optimal for comprehension.
Suggested alternate graphic:
The graph generation is done purely in Javascript. The information needs to be easily accessible.
There should be a common repository for results, which allow you to generate graphs without running everything yourself.
Checking in should be guarded so that only improvements are entered by default, and that results are traceable.
VP9 forces 1-pass internally, but all other codecs in RT mode should also ensure 1-pass.
There needs to be parameters for switching between filesets.
Suggested design: Fileset names = directory names under "video".
Bitrates are taken from a fixed table based on width x height.
Name "local" should be reserved for non-global results (not uploaded or made consistent).
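The suggested design could be sketched like this; the bitrate values in the table are placeholders, not proposed numbers:

```python
import os

# Bitrates keyed by (width, height); the values here are illustrative.
BITRATES_KBPS = {
    (640, 360): [200, 400, 800],
    (1280, 720): [800, 1600, 3200],
    (1920, 1080): [1500, 3000, 6000],
}

def fileset_names(video_root='video'):
    """A fileset is simply a directory under 'video'."""
    return sorted(
        name for name in os.listdir(video_root)
        if os.path.isdir(os.path.join(video_root, name))
    )

def bitrates_for(width, height):
    """Look up the fixed bitrate ladder for a resolution."""
    return BITRATES_KBPS[(width, height)]
```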
There should be one.
I am trying to run the scripts to compare VP9 vs x264. The file website/contributing/index.md instructs one to run "install prerequisites". I am guessing this will also fetch the input YUV files, but I couldn't find such a script in the directory.
From @pzembrod : The style guide recommendation is to use a module's Error class only as base class for the actual exceptions thrown, so that you can catch either all exceptions from that module (Error) or any specific exception in a targeted manner.
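The recommended hierarchy, sketched with hypothetical exception names for illustration:

```python
class Error(Exception):
    """Base class for all exceptions raised by this module."""

class ParseError(Error):
    """A config string could not be parsed."""

class EncodeFailed(Error):
    """The encoder binary returned a failure."""

# Callers can catch everything from the module at once...
try:
    raise EncodeFailed('encoder returned exit code 1')
except Error:
    pass  # ...or catch ParseError / EncodeFailed individually.
```

The base Error class itself is never raised directly; it exists only so "except module.Error" works as a catch-all.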
On a page, we should (somehow) offer the opportunity to pick the encoder version for each side of a comparison.