Age | Commit message | Author |
|
Change-Id: I3d2565572c2b905966d60bcaa6e5e6f057b1bd51
|
|
This reduces some regression when external RC
is used, for which avg_frame_low_motion is not
set/updated (=0).
Change-Id: I2408e62bd97592e892cefa0f183357c641aa5eea
|
|
Change-Id: I5c42013a08677cdef8d47f348458118338ff0138
|
|
Change-Id: Iba58e2aa2578964b5c8b48ab0acbee9b44bcdada
|
|
This refactoring is needed to allow the
RC_rtc library to support VBR.
Change-Id: I863a4a65096fed06b02307098febf7976360e0f3
|
|
this mirrors the change from libaom:
5b150b150 Update some comments for rc_target_bitrate
Change-Id: Iaabee5924e0320609a29dc8ab71327923fb4c5d2
|
|
Bug: webm:1731
Change-Id: I1db777c0c3a8784fb3dcf7cd39f78ebf833ab915
|
|
this allows the file to be located in LIBVPX_TEST_DATA_PATH similar to
other test sources.
Bug: webm:1731
Change-Id: I51606635d91871e7c179aa8d20d4841b0d60b6ad
|
|
Two pass rc parameters are only initialized in the second pass
in vp9 normal two pass encoding.
However, the simple_encode API queries the keyframe group, arf group,
and number of coding frames without going through the two pass
route.
Since recent libvpx rc changes, parameters in the TWO_PASS
struct have a great influence on the determination of the above
information.
We therefore need to properly init two pass rc parameters in
the simple_encode related environment.
Change-Id: Ie14b86d6e7ebf171b638d2da24a7fdcf5a15c3d9
|
|
Properly init and delete cpi struct in simple encode functions.
Change-Id: I6e66bcac852cbb3dec9b754ba3fb01a348ac98b8
|
|
Change-Id: Id56e03dc9cf6d4e70c4681896f29893a9b4c76f2
|
|
* changes:
Use 'ptrdiff_t' instead of 'int' for pointer offset parameters
Implement vpx_convolve8_avg_vert_neon using SDOT instruction
Merge transpose and permute in Neon SDOT vertical convolution
|
|
|
|
A number of the load/store functions in mem_neon.h use type 'int' for
the 'stride' pointer offset parameter. This causes Clang to generate
the following warning every time these functions are called with a
wider type passed in for 'stride':
warning: implicit conversion loses integer precision: 'ptrdiff_t'
(aka 'long') to 'int' [-Wshorten-64-to-32]
This patch changes all such instances of 'int' to 'ptrdiff_t'.
Bug: b/181236880
Change-Id: I2e86b005219e1fbb54f7cf2465e918b7c077f7ee
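
For illustration, a minimal portable sketch of the pattern. The helper name `sum_first_column` is hypothetical and is not one of the mem_neon.h functions; it only shows a 'stride' parameter typed as 'ptrdiff_t' so that a caller holding a 'ptrdiff_t' passes it without narrowing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from mem_neon.h): sums the first byte of each of
 * 'rows' rows. Because 'stride' is ptrdiff_t, a caller that already holds a
 * ptrdiff_t stride passes it unchanged, so -Wshorten-64-to-32 has nothing to
 * report. Negative strides are also representable. */
static int sum_first_column(const uint8_t *buf, ptrdiff_t stride, int rows) {
  int sum = 0;
  assert(buf != NULL && rows >= 0);
  for (int r = 0; r < rows; ++r) {
    sum += buf[(ptrdiff_t)r * stride];
  }
  return sum;
}
```

With an 'int' parameter the same call site would need either an explicit cast or the implicit conversion that Clang warns about.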
|
|
Add an alternative AArch64 implementation of
vpx_convolve8_avg_vert_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.
The existing MLA-based implementation of vpx_convolve8_avg_vert_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.
Bug: b/181236880
Change-Id: I971c626116155e1384bff4c76fd3420312c7a15b
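
The compile-time selection described above can be sketched roughly as follows. This is not the libvpx code: `dot8_s8` is a hypothetical helper that computes one 8-element signed dot product, using the `vdot_s32` intrinsic when `__ARM_FEATURE_DOTPROD` is defined on AArch64 and a scalar loop otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>

/* SDOT path: each 32-bit lane accumulates four int8 products, then the two
 * lanes are added together. */
static int32_t dot8_s8(const int8_t *a, const int8_t *b) {
  assert(a != NULL && b != NULL);
  const int32x2_t acc = vdot_s32(vdup_n_s32(0), vld1_s8(a), vld1_s8(b));
  return vaddv_s32(acc);
}
#else

/* Fallback path for targets without the dot-product feature macro. */
static int32_t dot8_s8(const int8_t *a, const int8_t *b) {
  int32_t sum = 0;
  assert(a != NULL && b != NULL);
  for (int i = 0; i < 8; ++i) {
    sum += (int32_t)a[i] * b[i];
  }
  return sum;
}
#endif
```

Both paths return the same value, so callers do not need to know which one was compiled in.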
|
|
The original dot-product implementation of vpx_convolve8_vert_neon
used a separate transpose before and after the convolution operation.
This patch merges the first transpose with the TBL permute (necessary
before using SDOT to compute the convolution) to significantly reduce
the amount of data re-arrangement. This new approach also allows for
more effective data re-use between loop iterations.
Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>
Bug: b/181236880
Change-Id: I87fe4dadd312c3ad6216943b71a5410ddf4a1b5b
|
|
Add an alternative AArch64 implementation of
vpx_convolve8_avg_horiz_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.
The existing MLA-based implementation of vpx_convolve8_avg_horiz_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.
Bug: b/181236880
Change-Id: Ib435107c47c485f325248da87ba5618d68b0c8ed
|
|
Implement sum of squared difference calculations in vpx_mse16x16_neon
and vpx_get4x4sse_cs_neon using the ABD and UDOT instructions -
instead of widening subtracts followed by a sequence of MLAs.
The existing implementation is retained for use on CPUs that do not
implement the Armv8.4-A UDOT instruction. This commit also updates
the variable names used in the existing implementations to be more
descriptive.
Bug: b/181236880
Change-Id: Id4ad8ea7c808af1ac9bb5f1b63327ab487e4b1c7
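
A scalar model of why ABD followed by UDOT works for squared differences (a sketch only; `sse_abd_dot_model` is a hypothetical name, and the real code operates on Neon vectors): the absolute difference of two 8-bit values always fits in 8 bits, and its square equals the square of the signed difference, so no widening subtract is needed before the multiply-accumulate.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model: abs_diff plays the role of the ABD result, and the
 * multiply-accumulate of abs_diff with itself plays the role of UDOT. */
static uint32_t sse_abd_dot_model(const uint8_t *a, const uint8_t *b, int n) {
  uint32_t sse = 0;
  assert(n >= 0);
  for (int i = 0; i < n; ++i) {
    const uint8_t abs_diff =
        (uint8_t)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    sse += (uint32_t)abs_diff * abs_diff;
  }
  return sse;
}
```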
|
|
Add an alternative AArch64 implementation of vpx_convolve8_vert_neon
for targets that implement the Armv8.4-A SDOT (signed dot product)
instruction.
The existing MLA-based implementation of vpx_convolve8_vert_neon is
retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.
Bug: b/181236880
Change-Id: Iebb8c77aba1d45b553b5112f3d87071fef3076f0
|
|
Accelerate Neon variance functions by implementing the sum of squares
calculation using the Armv8.4-A UDOT instruction instead of 4 MLAs.
The previous implementation is retained for use on CPUs that do not
implement the Armv8.4-A dot product instructions.
Bug: b/181236880
Change-Id: I9ab3d52634278b9b6f0011f39390a1195210bc75
|
|
Implementing sad16_neon using ABD, UDOT instead of ABAL, ABAL2 saves
a cycle and removes resource contention for a single SIMD pipe on
modern out-of-order Arm CPUs. The UDOT accumulation into 32-bit
elements also allows for a faster reduction at the end of each SAD
function.
The existing implementation is retained for CPUs that do not
implement the Armv8.4-A UDOT instruction, and CPUs executing in
AArch32 mode.
Bug: b/181236880
Change-Id: Ibd0da46e86751d2f808c7b1e424f82b046a1aa6f
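
As a rough scalar model of the accumulation pattern (hypothetical function name, not the libvpx implementation): a dot product of the absolute differences with an all-ones vector sums each group of four bytes into one 32-bit accumulator, leaving only four 32-bit values to reduce at the end.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of ABD + UDOT-with-ones over one 16-byte row. acc[] stands in
 * for the four 32-bit lanes of the accumulator vector. */
static uint32_t sad16_dot_model(const uint8_t a[16], const uint8_t b[16]) {
  uint32_t acc[4] = {0, 0, 0, 0};
  assert(a != 0 && b != 0);
  for (int i = 0; i < 16; ++i) {
    const uint8_t abs_diff =
        (uint8_t)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    acc[i / 4] += abs_diff * 1u; /* multiply by the all-ones operand */
  }
  /* Final reduction across the four 32-bit lanes. */
  return acc[0] + acc[1] + acc[2] + acc[3];
}
```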
|
|
Use the AArch64-only ADDV and ADDLV instructions to accelerate
reductions that add across a Neon vector in sum_neon.h. This commit
also refactors the inline functions to return a scalar instead of a
vector - allowing for optimization of the surrounding code at each
call site.
Bug: b/181236880
Change-Id: Ieed2a2dd3c74f8a52957bf404141ffc044bd5d79
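
A minimal sketch of a reduction helper that returns a scalar. `horizontal_add_u32x4_model` is a hypothetical name and takes a plain array so the example stays portable; on AArch64 it uses the `vaddvq_u32` intrinsic (ADDV), elsewhere a plain sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* AArch64 path: a single across-vector add yields the scalar result. */
static uint32_t horizontal_add_u32x4_model(const uint32_t v[4]) {
  assert(v != NULL);
  return vaddvq_u32(vld1q_u32(v));
}
#else

/* Portable path with the same scalar return type. */
static uint32_t horizontal_add_u32x4_model(const uint32_t v[4]) {
  assert(v != NULL);
  return v[0] + v[1] + v[2] + v[3];
}
#endif
```

Returning a scalar rather than a vector lets each call site use the value directly, without a separate lane extraction.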
|
|
quiets an integer sanitizer warning:
vpx/src/vpx_image.c:101:25: runtime error: implicit conversion from
type 'int' of value -2 (32-bit, signed) to type 'unsigned int' changed
the value to 4294967294 (32-bit, unsigned)
Change-Id: Ifeac31cc80811081c1ba10aadaa94dc36cd46efa
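
The expression at vpx_image.c:101 is not shown here. Assuming the value -2 came from a complement mask computed in 'int' (for example ~1) and then combined with an unsigned value, one way to avoid the implicit conversion is to keep the whole computation unsigned. The helper `align_up` below is hypothetical and only illustrates that idea.

```c
#include <assert.h>

/* Rounds 'value' up to a multiple of 'align' (a power of two). Every operand
 * is unsigned, so the mask is never materialized as a negative int and no
 * signed-to-unsigned conversion takes place. */
static unsigned int align_up(unsigned int value, unsigned int align) {
  assert(align != 0 && (align & (align - 1u)) == 0);
  const unsigned int mask = ~(align - 1u);
  return (value + align - 1u) & mask;
}
```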
|
|
Manually unrolling the inner loop is sufficient to stop the compiler
getting confused and emitting inefficient code.
Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>
Bug: b/181236880
Change-Id: I860768ce0e6c0e0b6286d3fc1b94f0eae95d0a1a
|
|
Implement AArch64-only paths for each of the Neon SAD reduction
functions, making use of a wider pairwise addition instruction only
available on AArch64.
This change removes the need for shuffling between high and low
halves of Neon vectors - resulting in a faster reduction that requires
fewer instructions.
Bug: b/181236880
Change-Id: I1c48580b4aec27222538eeab44e38ecc1f2009dc
|
|
|
|
Add an alternative AArch64 implementation of vpx_convolve8_horiz_neon
for targets that implement the Armv8.4-A SDOT (signed dot product)
instruction.
The existing MLA-based implementation of vpx_convolve8_horiz_neon is
retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.
Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>
Change-Id: I5337286b0f5f2775ad7cdbc0174785ae694363cc
|
|
this reduces the number of instructions to compute the sum
Change-Id: Icae4d4fb3e343d5b6e5a095c60ac6d171b3e7d54
|
|
this file uses GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST so it's
safe to enable unconditionally. the filter check fell out of sync with
the code; there are sse2 and neon implementations of the filter.
Change-Id: I2a3336ccef3fb524ca5d9b8f88279240c9a276aa
|
|
|
|
Change clamp to an assert so we are warned if changes to input
ranges or defaults in the future lead to an invalid value.
Change-Id: Idb4e0729f477a519bfff3083cdce3891e2fc6faa
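
The difference between the two approaches, sketched with hypothetical helper names (neither is taken from the libvpx source): a clamp silently repairs an out-of-range value, while an assert reports it in debug builds.

```c
#include <assert.h>

/* Clamp: an out-of-range value is silently corrected, hiding the bug. */
static int clamp_to_range(int value, int low, int high) {
  return value < low ? low : (value > high ? high : value);
}

/* Assert: an out-of-range value aborts in debug builds, so a change to the
 * input ranges or defaults that produces an invalid value is noticed.
 * Note that assert() compiles away when NDEBUG is defined. */
static int checked_in_range(int value, int low, int high) {
  assert(value >= low && value <= high);
  return value;
}
```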
|
|
Due to recent changes to command line options for rate control
parameters.
Change-Id: I1de7cb4ff2850a3ed19ec216dd9d07f64a118e92
|
|
* changes:
vpx_convolve_neon: prefer != 0 to > 0 in tests
vpx_convolve_avg_neon: prefer != 0 to > 0 in tests
vpx_convolve_copy_neon: prefer != 0 to > 0 in tests
|
|
|
|
this produces better assembly code; the horizontal convolve is called
with an adjusted intermediate_height where it may over process some rows
so the checks in those functions remain.
Change-Id: Iebe9842f2a13a4960d9a5addde9489452f5ce33a
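
A sketch of the loop shape in question, using a hypothetical row-copy helper rather than the actual Neon code. The `!= 0` test is only safe when the counter is guaranteed to reach exactly zero, which is why functions that may over-process rows keep their original checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies 'h' rows of 'w' bytes. The counter decreases by one per iteration
 * and starts positive, so it hits zero exactly and '!= 0' terminates. */
static void copy_rows(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst,
                      ptrdiff_t dst_stride, int w, int h) {
  assert(w > 0 && h > 0);
  do {
    memcpy(dst, src, (size_t)w);
    src += src_stride;
    dst += dst_stride;
  } while (--h != 0);
}
```

If the step did not divide the starting count evenly, the counter would skip past zero and a `!= 0` test would not stop the loop, whereas `> 0` would.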
|
|
this produces better assembly code
Change-Id: I174b67a595d7efeb60c921f066302043b1c7d84e
|
|
this produces better assembly code
Change-Id: I80ed1a165512e941b35a4965faa0c44403357e91
|
|
Imposed provisional upper and lower limits to each parameter
that can be adjusted in the Vizier ML experiment.
Also in some cases applied secondary limits on the
range of the final "used" values.
Defaults and limits may well require further tuning after
subsequent rounds of experimentation.
Re-factor get_sr_decay_rate().
Change-Id: I28e804ce3d3710f30cd51a203348e4ab23ef06c0
|
|
Change-Id: I63ffea52d079b0d50002526e209ae3fb64811bac
|
|
The overshoot_pct & undershoot_pct attributes for rate control
are expressed as a percentage of the target bitrate, so the range
should be 0-100.
Change-Id: I67af3c8be7ab814c711c2eaf30786f1e2fa4f5a3
|
|
Further changes to normalize the Vizier command line parameters.
The intent is that the default behavior for any given parameter
is signaled by the value 1.0 (expressed on the command line as a
rational).
The final values used in the two pass code are obtained by multiplying
the passed in factors by the default values if use_vizier_rc_params is 1.
Where use_vizier_rc_params is 0 the values are explicitly set to
the defaults.
This patch also changes the default value of each parameter to 1.0
even if not set explicitly. This should ensure safe/default behavior
if the user sets use_vizier_rc_params to 1 but does not set all
the individual parameters.
Change-Id: Ied08b3c22df18f42f446a4cc9363473cad097f69
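
The scheme described above can be sketched as follows. The type `rational_t` and the function `resolve_rc_param` are hypothetical names for illustration and are not the identifiers used in the two pass code.

```c
#include <assert.h>

/* Hypothetical rational type: a factor given on the command line as num/den. */
typedef struct {
  int num;
  int den;
} rational_t;

/* Returns the value actually used. With use_vizier_rc_params == 0 the
 * default is used as-is; with 1 the default is scaled by the factor, so a
 * factor of 1/1 reproduces the default behavior. */
static double resolve_rc_param(int use_vizier_rc_params, rational_t factor,
                               double default_value) {
  assert(factor.den != 0);
  if (!use_vizier_rc_params) return default_value;
  return default_value * ((double)factor.num / (double)factor.den);
}
```

Because every factor defaults to 1.0, enabling the flag without setting a particular parameter leaves that parameter at its default.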
|
|
Add command line options for three rd parameters.
They are controlled by --use_vizier_rc_params, together with
other rc parameters.
If not set from command line, current default values will be used.
Change-Id: Ie1b9a98a50326551cc1d5940c4b637cb01a61aa0
|
|
|
|
If --use-vizier-rc-params=1 is passed, the rc parameters are overwritten
by the passed-in values. If --use-vizier-rc-params=0, the rc parameters
remain at their default values.
Change-Id: I7a3e806e0918f49e8970997379a6e99af6bb7cac
|
|
|
|
Deleted #define that is no longer referenced.
Change-Id: If0b132c5a40dd8910f535fffdee7d2d1c7df4748
|
|
this avoids uninitialized values and potential misuse of them which
could lead to a crash should the function fail
this is the same fix that was applied in libaom:
d0cac70b5 Fix a free on invalid ptr when img allocation fails
Bug: webm:1722
Change-Id: If7a8d08c4b010f12e2e1d848613c0fa7328f1f9c
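
A minimal sketch of the general pattern, with a hypothetical `image_t` type and functions that do not correspond to the vpx_image API: the descriptor is zeroed before any step that can fail, so the cleanup path never sees an uninitialized pointer. The `simulate_failure` parameter exists only so the failure path can be exercised in this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical image descriptor, for illustration only. */
typedef struct {
  unsigned char *data;
  int owns_data;
} image_t;

/* Zero the descriptor first: if allocation fails, 'data' is NULL and
 * 'owns_data' is 0 rather than whatever was in the caller's memory. */
static int image_alloc(image_t *img, size_t size, int simulate_failure) {
  assert(img != NULL);
  memset(img, 0, sizeof(*img));
  unsigned char *buf = simulate_failure ? NULL : malloc(size);
  if (buf == NULL) return -1;
  img->data = buf;
  img->owns_data = 1;
  return 0;
}

/* Safe to call after a failed image_alloc(), because the fields are defined. */
static void image_free(image_t *img) {
  if (img == NULL) return;
  if (img->owns_data) free(img->data);
  memset(img, 0, sizeof(*img));
}
```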
|
|
|
|
|
|
|