Age | Commit message (Collapse) | Author |
|
Change-Id: Ic9956ddf1c2ddffcf7be7fdfc23ad9a2426fc47a
WIP: Fixing unsafe threading in VP8 encoder.
|
|
Fixing unsafe threading in VP8 encoder.
Change-Id: Ibf4c89a2043654834747811bc11eb283de0bb830
|
|
Change-Id: I76fe20ade099573997404b8733cf7f79e82fb21e
WIP: Fixing unsafe threading in VP8 encoder.
|
|
Multi-threaded code was not updated to disable background
refresh for non base-layer frames at the time it was
disabled in the main C-code.
Change-Id: Id6cc376130b7def046942121cfd0526b4f0a71d4
|
|
Change-Id: Ifa78c0a953fab3e5dd7af0446924846c7022cd09
|
|
|
|
Change-Id: I44e4e3869f231ae270cca98c9565f23c512e3ddf
|
|
Change-Id: I650a593162280ab40e71e527ec6518303e2d5723
|
|
Change-Id: I28ac1519d1594801fef9a623cb64598d3d751eb0
|
|
Change-Id: Ie22841d096f3c86694b95bd06fc3a8ce1f032a10
|
|
Change-Id: Ib73c7b2bee4cb2eb2528fa6b381fffe9503079a0
|
|
Change-Id: Ie9a26be7c9baa54a0e43a63ed6c77f2746477a9c
|
|
Change-Id: I289564a5a27f0d03ddc6f19c7838542ff22719be
|
|
Code cleanup
Change-Id: I82f9d787a2f511d39895fd8dfd5347a1676d9dbc
|
|
|
|
The current way of counting inter_zz_count doesn't work correctly
in multi-threaded encoding. Calculating it after the frame is
encoded fixed the problem.
Change-Id: Ifcb1972cde950b8cc194f75c6d7b6af09e8b0e65
|
|
Added checks for pthread_create() errors.
Change-Id: Ie198ef5c14314fe252d2e02f7fe5bfacc7e16377
|
|
The sync interval for the multithreaded encoder was considered as not changing
during the encoding. This is not true if picture size is changed.
The encoder could dead-lock because the main thread and the other threads were
using different sync interval.
Change-Id: I75232bbdbc6c02d77f830d870fd8b4e96697c64e
|
|
Precalculated block ptrs do not need updates during encoding.
Set these at init stage.
Moved the allocation of 'mt_current_mb_col' (last encoded MB on each
row) to vp8_alloc_compressor_data(), so that it is correctly
reallocated when frame size is changing.
Change-Id: Idcdaa2d0cf3a7f782b7d888626b7cf22a4ffb5c1
|
|
Allows building the library with the gcc -pedantic option, for improved
portabilty. In particular, this commit removes usage of C99/C++ style
single-line comments and dynamic struct initializers. This is a
continuation of the work done in commit 97b766a46, which removed most
of these warnings for decode only builds.
Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966
|
|
Conflicts:
vp8/common/entropymode.c
vp8/common/entropymode.h
vp8/encoder/encodeframe.c
vp8/vp8_cx_iface.c
Change-Id: I708b0f30449b9502b382e47b745d56f5ed2ce265
|
|
Change If4321cc5 fixed a bug caused by forward declarations not being
kept in sync across C files, resulting in a function call with the
wrong arguments. The commit moves the affected function declarations
into a header file, along with the other symbols from encodeframe.c
that were being sloppily shared.
Change-Id: I76a7b4c66d4fe175f9cbef7e52148655e4bb9ba1
|
|
mb_row and mb_col was not passed to vp8cx_encode_inter_macroblock in
threaded encoding.
Change-Id: If4321cc59bf91e991aa31e772f882ed5f2bbb201
|
|
mb_row and mb_col was not passed to vp8cx_encode_inter_macroblock in
threaded encoding.
Change-Id: If4321cc59bf91e991aa31e772f882ed5f2bbb201
|
|
RD costs were local to MACROBLOCK data and had to be copied all the
time to each thread's MACROBLOCK data. Tables moved to a common place
and only pointers are setup for each encoding thread.
vp8_cost_tokens() generates 'int' costs so changed all types to be
int (i.e. removed unsigned).
NOTE: Could do some more cleaning in vp8cx_init_mbrthread_data().
Change-Id: Ifa4de4c6286dffaca7ed3082041fe5af1345ddc0
|
|
Produce the token partitions on-the-fly, while processing each MB.
Context is updated at the beginning of each frame based on the
previoud frame's counters. Optimally encoder outputs partitions in
separate buffers. For frame based output, partitions are concatenated
internally.
Limitations:
- enabled just in combination with realtime-only mode
- number of encoding threads has to be equal or less than the
number of token partitions. For this reason, by default the encoder
will do 8 token partitions.
- vpxenc supports partition output (-P) just in combination with
IVF output format (--ivf)
Performance:
- Realtime encoder can be up to 13% faster (ARM) depending on the number
of threads and bitrate settings. Constant gain over the 5-16 speed
range.
- Token buffer reduced from one frame to 8 MBs
Quality:
- quality is affected by the delayed context updates. This again
dependents on input material, speed and bitrate settings. For VC
style input the loss seen is up to 0.2dB. If error-resilient=2
mode is used than the effect of this change is negligible.
Example:
./configure --enable-realtime-only --enable-onthefly-bitpacking
./vpxenc --rt --end-usage=1 --fps=30000/1000 -w 640 -h 480
--target-bitrate=1000 --token-parts=3 --static-thresh=2000
--ivf -P -t 4 -o strm.ivf tanya_640x480.yuv
Change-Id: I127295cb85b835fc287e1c0201a67e378d025d76
|
|
Second shot at this...
Sync with loopfilter thread as late as possible, usually just at the
beginning of next frame encoding. This returns control to application
faster and allows a better multicore scaling.
When PSNR packets are generated the final filtered frame is needed
imediatly so we cannot delay the sync. Same has to be done when
internal frame is previewed.
Change-Id: I64e110c8b224dd967faefffd9c93dd8dbad4a5b5
|
|
Change-Id: Ieb05270ac332a4cc38ec4b7b995fc0150e0fffdf
|
|
Change-Id: I10efa441d663fceb6bc97a3bfad518cd3d9a5128
|
|
This commit continues the process of converting to the new RTCD
system. It removes the last of the VP8_ENCODER_RTCD struct references.
Change-Id: I2a44f52d7cccf5177e1ca98a028ead570d045395
|
|
This commit continues the process of converting to the new RTCD
system.
Change-Id: I3f9c07db65eb206f6363d21bdb80e871570da767
|
|
This commit continues the process of converting to the new RTCD
system.
Change-Id: I9bfcf9bef65c3d4ba0fb9a3e1532bad1463a10d6
|
|
This patch removes the local copies of the dequantize
constants and implements John's idea as described
in "Make a local copy of the dequantized data" commit.
Change-Id: Ic6b7d681f00bf63263f71ff1e39ab2f80729e8b2
|
|
Change-Id: Ie2dc0d72363ff38e0f71b59f6e2d1a2d70c5266b
|
|
Change-Id: I72ed49ce14ca0124dd0d31bfcf4c7630a4681587
|
|
Remove BOOL, INTn, UINTn, etc, in favor of C99-style fixed width
types.
Change-Id: I396636212fb5edd6b347d43cc940186d8cd1e7b5
|
|
This value needs to be copied to each thread's data structure.
This fixed artifact problem in multi-thread encoder.
Change-Id: Iab6d9745a1d44846aa503184705376f63a505597
|
|
vp8cx_mb_init_quantizer() needs to be called at least once to get
all values calculated. This change added one check to decide if
we could skip initialization or not.
Change-Id: I3f65eb548be57580a61444328336bc18c25c085b
|
|
Change-Id: I4fcd6e4656d9823aead941616cd63501aecbd6e2
|
|
caused by the "Removed bmi copy to/from BLOCKD" commit.
Change-Id: I9fae71bdc34c8ecc07bb81cd3ccf498b91ce3ec7
|
|
I got this idea from Pascal (Thanks). Before encoding a macroblock,
copy it to a 16x16 buffer, and then read source data from there
instead. This will help keep the source data in cache, and help
with the performance.
Change-Id: Id05f4cb601299150511d59dcba0ae62c49b5b757
|
|
Some further re-structuring of activity masking code.
Still has various experimental switches.
Supports a metric based on intra encode.
Experimental comparison against a fixed activity target rather
than a frame average, for altering rd and zbin.
Overall the SSIM performance is similar to TT's original
code but there is a much smaller PSNR hit of circa
0.5% instead of 3.2%
Change-Id: I0fd53b2dfb60620b3f74d7415e0b81c1ac58c39a
|
|
|
|
Declared the bmi in BLOCKD as a union instead of B_MODE_INFO.
Then removed B_MODE_INFO completely.
Change-Id: Ieb7469899e265892c66f7aeac87b7f2bf38e7a67
|
|
vp8_fast_quantize_b_pair_neon function added to quantize
two adjacent blocks at the same time to improve performance.
- Additional 3-6% speedup compared to neon optimized fast
quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16)
Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e
|
|
Change-Id: I6e5e921f03dc15a72da89a457848d519647677a3
|
|
Declared the bmi in MODE_INFO as a union instead of B_MODE_INFO.
This reduced the memory footprint by 518,400 bytes for 1080
resolutions. The decoder performance improved by ~4% for the
clip used and the encoder showed very small improvements. (0.5%)
This reduction was first mentioned to me by John K. and in a
later discussion by Yaowu.
This is WIP.
Change-Id: I8e175fdbc46d28c35277302a04bee4540efc8d29
|
|
Various members that were either completely unreferenced or written
and not read.
Change-Id: Ie41ebac0ff0364a76f287586e4fe09a68907806e
|
|
This commit restructures the mb activity masking code
to better facilitate experimentation using different metrics
etc. and also allows for adjustment of the zero bin either
for encode only or both the encode and mode selection
stages
It also uses information from the current frame rather than
the previous frame and the default strength has been
reduced.
Change-Id: Id39b19eace37574dc429f25aae810c203709629b
|
|
|