Age | Commit message (Collapse) | Author |
|
This commit replace a hard coded macro with a macro defined by
a configure command.
Change-Id: Ib31354d61865314ed43e2c429c72b4ef2c8fa2a7
|
|
Change-Id: I094bca857f0fc2c067a4d08d1b36370fe61c25aa
|
|
For new VP9 only content type adjust the rate distortion and ARF
filter based on the relative spatial variance of the source and
reconstruction.
In regards to the RD loop the method favors modes where the
reconstruction variance is similar to the source variance. However it
is currently only applied to regions where the source variance is quite
low.
For very low variance blocks it applies a further bias against intra
coding and large prediction block sizes (the later in particular limit
the usefulness of the loop filter).
The final part of this change is to lower the strength of the ARF
filter for blocks where the source has very low spatial variance, to
encourage some low amplitude texture or noise to pass through
the filter.
This change improves the retention of film grain and fine noise /
texture in spatially flat regions, but as expected causes a significant
drop in PSNR on many clips. This is to be expected because similar
but misaligned noise or texture will give a lower PSNR than a flat
noise free reconstruction. However, it is worth noting that most clips
show a strong gain in FAST SSIM.
The features are enabled on the vpxenc command line by setting
--tune-content=film.
VPX_ENCODER_ABI_VERSION bumped for this change and cvbr.
Change-Id: I26a4e4edfa3dc5cacead82fa701fe7a9118ccd0a
|
|
The intra mode rd penalty was implemented as a rate penalty.
Code was added to scale the penalty according to block size but
this was not done correctly for the SB level or sub 8x8.
The code did a weird double scaling in regard to bit depth that
has been removed. Given that it is a rate penalty the bit depth
should not matter.
This bug fix improves average metrics on our standard test
sets by about 0.1%
Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e
|
|
This patch followed allow_exhaustive_searches feature modification and
continued to modify the encoder to achieve the determinism in the row
based multi-threaded encoding. While row-mt = 1 and using multiple
threads, the adaptive feature in encoder was disabled, which gave
BDRate gain(at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%),
but some encoder speed losses(7% ~ 10% at speed 1 and 3% ~ 6% at
speed 2). These speed losses were acceptable considering the speed
gains obtained from row-mt.
Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
|
|
Change it to row based array to avoid the slow down cause by sync.
row-mt on, speed 8, 2 threads: ~4% speedup for VGA on ARM benefited
from adaptive_rd_threshold.
Change-Id: I887e65a53af20a6c4f48d293daaee09dab3512cf
|
|
Add routine vp9_model_rd_from_var_lapndz_vec and call it from model_rd_for_sb
to model the rate and distortion for MAX_MB_PLANE Laplacian sources in
parallel. The caller ensures that all sources have non-zero variance.
Measured a 18% to 25% reduction in retired instructions, and 17% to 24%
reduction in instruction execution cost with different compilers for the
Laplacian modeling.
No change in behavior.
TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225
Change-Id: I6b76947f21c659a349adb896e13e99f6e3f951e6
|
|
(Yunqing Wang)
This patch implements the row-based multi-threading within tiles in
the encoding pass, and substantially speeds up the multi-threaded
encoder in VP9.
Speed tests at speed 1 on STDHD(using 4 tiles) set show that the
average speedups of the encoding pass(second pass in the 2-pass
encoding) is 7% while using 2 threads, 16% while using 4 threads,
85% while using 8 threads, and 116% while using 16 threads.
Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
|
|
(Yunqing)
This patch added the missing initialization in temporal filter.
Borg test BDRate results:
PSNR: -0.019%(lowres); -0.013%(hdres);
SSIM: -0.001%(lowres); -0.010%(hdres).
Other q values gave comparable but no better results.
Change-Id: I7ad0c18b39e6f558342688e2fe1e12fdb133ce9b
|
|
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
|
|
The bit to error transformation got doubled as a result of going from
8-bit to 9-bit costs (change d13385c).
Use defines to derive the scale numbers and comment some of the fields.
derf: -0.023 BDRATE
hevcmr: +0.067 BDRATE
stdhd: +0.098 BDRATE
(These are substantially smaller than than the original gains from 8 to
9 bit costing.)
Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b
|
|
In inter mode search skip all modes except NEARESTMV and DC_PRED.
10% less encode latency for large frames using the chromium remoting_perftests.
+0.313% BDRATE on the screencast set at speed -6.
Change-Id: Ib97a39dd8bcdeab545509e0e02d78ce7033f8c63
|
|
This is a pure-refactor in preparation to potentially raise the bit-cost
resolution.
Verified at good speed 0 and rt speed -6.
Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9
|
|
Removed unused tx_select_threshes and tx_select_diff.
Change-Id: I5e9e7ad170056efe14b5f071e94d0c5a36e4a34c
|
|
silences missing prototype warnings
Change-Id: Idaf68d83d2cb03847f3ee002c4d00c2ac79da604
|
|
While turning on "--aq_mode=3", the quantizers are updated by each
thread. Fixed the me consts initialization function to make sure
that the correct thread data are updated.
Change-Id: Ied27bb7bae76fc3fa2cda4f8c35ac0b46271bef4
|
|
Frame buffers are now allocated dynamically on-demand.
Entries in the reference frame map, cm->ref_frame_map,
may now be set to -1 (INVALID_IDX) to indicate that
there is not a valid reference buffer in that "slot".
All slots in the reference frame map are now initialized
to the empty state (-1) and each buffer is initialized
to have a reference count of 0.
Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582
|
|
This commit enables sub8x8 inter block coding for RTC mode. The
use of sub8x8 blocks can be turned on by allowing
choose_partitioning function to select 4x4/4x8/8x4 block sizes.
Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a
|
|
This patch modified struct VP9_COMP. Created a struct ThreadData
to include data that need to be copied for each thread. In
multiple thread case, one thread processes one tile. all threads
share one copy of VP9_COMP,
(refer to VP9_COMP *cpi in the code)
but each thread has its own copy of ThreadData,
(refer to ThreadData *td in the code).
Therefore, within the scope of encode_tiles(), both cpi and td
need to be passed as function parameters.
In single thread case, the FRAME_COUNTS pointer in ThreadData
points to "counts" in VP9_COMMON.
Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
|
|
Similar to mask_filter, the filter_cache in RD_OPT struct can be
moved out, and declared as a local variable since it is only
used in pick_inter_mode functions.
Change-Id: I412b99cca82bade07ac912064ec03dd1de6b2c17
|
|
The mask_filter in RD_OPT struct is used to record rd result in
filter decision. It is only used in pick_inter_mode functions,
and is removed from the struct and declared as a local variable.
Change-Id: I3c95c8632ba7241591ce00ef2ef5677b5e297d7b
|
|
These 2 members in RD_OPT were moved to TileDataEnc struct
already, and therefore were removed here.
Change-Id: I22fee3b67f96e473a58e194a7edc76dbd48bfa04
|
|
Several frame counters in encoder are updated at SB level. Combine
those counters and put them in a separate struct, which allows us
to allocate one copy for each thread.
Change-Id: I00366296a13c0ada4d8fa12f5e07728388b6cab7
|
|
Reduce the scope of function parameters.
Change-Id: Ifef2cfb559908a97498ffdbd6ea53da1cd45a73c
|
|
Adaptively adjust the mode thresholds after each mode search round
to skip checking less likely selected modes. Local tests indicate
5% - 10% speed-up in speed -5 and -6. Average coding performance
loss is -1.055%.
speed -5
vidyo1 720p 1000 kbps
16533 b/f, 40.851 dB, 12607 ms -> 16556 b/f, 40.796 dB, 11831 ms
nik 720p 1000 kbps
33229 b/f, 39.127 dB, 11468 ms -> 33235 b/f, 39.131 dB, 10919 ms
speed -6
vidyo1 720p 1000 kbps
16549 b/f, 40.268 dB, 10138 ms -> 16538 b/f, 40.212 dB, 8456 ms
nik 720p 1000 kbps
33271 b/f, 38.433 dB, 7886 ms -> 33279 b/f, 38.416 dB, 7843 ms
Change-Id: I2c2963f1ce4ed9c1cf233b5b2c880b682e1c1e8b
|
|
Change-Id: I2902de7051a883fd22e27a655209233733969cfd
|
|
This commit makes a struct that contains rate value, distortion
value, and the rate-distortion cost. The goal is to provide a
better interface for rate-distortion related operation. It is
first used in rd_pick_partition and saves a few RDCOST calculations.
Change-Id: I1a6ab7b35282d3c80195af59b6810e577544691f
|
|
Change-Id: I6f67b171022bbc8199c6d674190b57f6bab1b62f
|
|
This commit enables an adaptive mode search order scheduling scheme
in the rate-distortion optimization. It changes the compression
performance by -0.433% and -0.420% for derf and stdhd respectively.
It provides speed improvement for speed 3:
bus CIF 1000 kbps
24590 b/f, 35.513 dB, 7864 ms ->
24696 b/f, 35.491 dB, 7408 ms (6% speed-up)
stockholm 720p 1000 kbps
8983 b/f, 35.078 dB, 65698 ms ->
8962 b/f, 35.054 dB, 60298 ms (8%)
old_town_cross 720p 1000 kbps
11804 b/f, 35.666 dB, 62492 ms ->
11778 b/f, 35.609 dB, 56040 ms (10%)
blue_sky 1080p 1500 kbps
57173 b/f, 36.179 dB, 77879 ms ->
57199 b/f, 36.131 dB, 69821 ms (10%)
pedestrian_area 1080p 2000 kbps
74241 b/f, 41.105 dB, 144031 ms ->
74271 b/f, 41.091 dB, 133614 ms (8%)
Change-Id: Iaad28cbc99399030fc5f9951eb5aa7fa633f320e
|
|
Change-Id: I39a37335ba5b3a969d328afb1f425ddb2cf7ddda
|
|
vp9_rdopt is for making rd optimal mode decisions. vp9_rd is for all
other rd related routines. Anything used outside of making an rd optimal
decision belongs in rd.
Change-Id: I772a3073f7588bdf139f551fb9810b6864d8e64b
|