Age | Commit message | Author |
|
When eob is less than or equal to 135 for high-bitdepth 32x32 idct,
call this function.
BUG=webm:1301
Change-Id: I8a5864f5c076e449c984e602946547a7b09c9fe6
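
A minimal sketch of the eob-based dispatch described above. The enum and function name are hypothetical; the cutoffs at 1 and 34 are assumptions read from the usual _1/_34/_135/_1024 function suffixes, and only the 135 cutoff is stated in this message.

```c
#include <assert.h>

/* Hypothetical labels for the four 32x32 inverse-transform variants. */
typedef enum {
  IDCT32X32_1,    /* eob == 1: DC coefficient only */
  IDCT32X32_34,   /* eob <= 34 */
  IDCT32X32_135,  /* eob <= 135 */
  IDCT32X32_1024  /* anything larger: full transform */
} Idct32x32Variant;

/* Picks the cheapest variant that still covers every nonzero coefficient.
 * eob is the end-of-block position, i.e. one past the last nonzero
 * coefficient in scan order. */
static Idct32x32Variant pick_idct32x32_variant(int eob) {
  assert(eob >= 1 && eob <= 1024);
  if (eob == 1) return IDCT32X32_1;
  if (eob <= 34) return IDCT32X32_34;
  if (eob <= 135) return IDCT32X32_135;
  return IDCT32X32_1024;
}
```

Checking the smaller cutoffs first keeps the common sparse blocks on the cheapest path.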
|
|
No speed changes and disassembly is almost identical.
Change-Id: Id07996237d2607ca6004da5906b7d288b8307e1f
|
|
Rename cospi_6_26_14_18N to cospi_6_26N_14_18N for consistency.
Change-Id: I00498b43bb612b368219a489b3adaa41729bf31a
|
|
|
|
Change-Id: I7a13b7e3a1e11ddbe4be3009edf03528e1bc7647
|
|
|
|
|
|
Clear the entire array on error. The size used previously was equal to
the number of elements rather than the array's size in bytes.
BUG=webm:1364
Change-Id: I2f2e16ed6e867f41d4774a5a8ac9cedaee11ce46
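
A minimal sketch of this class of bug, not the code touched by this change. The array, its element type, and all names are hypothetical; the assumption is that the clear was a memset() whose size argument was an element count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_ENTRIES 8 /* hypothetical element count */

/* Fills a local array with nonzero values, clears the first nbytes bytes,
 * and returns how many elements are still nonzero afterwards. */
static size_t leftover_after_clear(size_t nbytes) {
  int entries[NUM_ENTRIES];
  size_t i, nonzero = 0;
  assert(nbytes <= sizeof(entries));
  for (i = 0; i < NUM_ENTRIES; ++i) entries[i] = -1;
  memset(entries, 0, nbytes); /* memset() counts bytes, not elements */
  for (i = 0; i < NUM_ENTRIES; ++i) nonzero += (entries[i] != 0);
  return nonzero;
}
```

Passing NUM_ENTRIES clears only the first NUM_ENTRIES bytes and leaves stale elements behind whenever sizeof(int) > 1; passing NUM_ENTRIES * sizeof(int) clears the whole array.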
|
|
Saves 2688 bytes of rodata.
Change-Id: I46633b6e50c2845181c70fff6273a8e58fdd1e56
|
|
|
|
|
|
Reduce the level from 4 to 2.
This gives ~1-2% quality gain on RTC set, with a small decrease in speed (~1-2% on Mac).
Change-Id: I7d959731badcee3d45b2f4a08efe378765016a13
|
|
Enable row level multithreading for realtime encodes where non-rd
path is used (speed >= 5).
Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41
|
|
- Split the transform into first half and second half.
- Reschedule the instructions to avoid stack spillover.
- Function level speed improves ~16%.
Change-Id: I166889840d23aa8a273eca00f6fbdae8b4566f35
|
|
|
|
Moves the def from vpx_encoder.h -> vpx_codec.h. The defined value
is changed as part of this move.
Adds the value to decoder capabilities when CONFIG_VP9_HIGHBITDEPTH.
Change-Id: I7d61fc821cda29f1e32bb9b2b9ffd3d83966e419
|
|
This reverts commit d3db846cc50b1b0a9f6efcbe2b36c9c1943bc528.
This change causes a large drop in PSNR (4-5 dB) on low-framerate
difficult content (tested at 360/480p).
BUG=b/35804225
Change-Id: I8e90012d3b9c8a0cddb062ba93b01b36c0e0c0a0
|
|
Change-Id: I8c145337adf49d30b88a17ff31501b8751ed1fa0
|
|
vp9_stress_test now forces --row-mt=0 to cover both versions
Change-Id: I8d134879435bf1d8e76ab3fd89e698efba0e86b2
|
|
Change-Id: Iae45266cea86585f0935af4012335198cf93719f
|
|
Change-Id: I38e6c988f17c56fbfacd95378b27ef8d77c75f90
|
|
Added a comment.
Change-Id: I82f71c72598ad6f1eaa0b57b0b8ec56ab9658e81
|
|
Set row_mt to 0 for now.
Change-Id: I922536a6d71a765e435daeaf4d932ef14363d19a
|
|
From commit:
https://chromium-review.googlesource.com/c/441393/
On non-segment the set_vbp_thresholds() should be called
again to adjust thresholds based on content_state of superblock.
This was the intended behavior from 441393.
Small change in RTC metrics and speed.
Change-Id: I45e5fbdc4af74db76b3cb4f13074fcae0eb2219e
|
|
Rename leftover occurrences of new_mt.
Change-Id: Ib884e84c801fcd366ca4b57ec912ac5972023375
|
|
new_mt is a very generic name that will become obsolete soon enough.
Since this is exposed as a codec control, rename it to row_mt to
signify row level parallelism. Also rename the ETHREAD_BIT_MATCH
codec control to ROW_MT_BIT_EXACT.
Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
|
|
Removed an old comment that wasn't true anymore.
Change-Id: I286ad8d7cb2843070a55e45a599d26bc226d6bd7
|
|
Promote the unsigned int calculation to uint64_t rather than int64_t for
type consistency.
Change-Id: Ic34dee1dc707d9faf6a3ae250bfe39b60bef3438
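
A small sketch of why the promotion has to happen before the arithmetic. The function names and the width/height operands are hypothetical and not taken from the changed code; unsigned int is assumed to be 32 bits wide.

```c
#include <stdint.h>

/* The multiply is done in unsigned int, so the product wraps modulo 2^32
 * (assuming 32-bit unsigned int) before it is widened for the return. */
static uint64_t area_wrapped(unsigned int width, unsigned int height) {
  return width * height;
}

/* Casting one operand first makes the multiply itself 64-bit. uint64_t
 * matches the unsigned operands, so no sign conversion is involved. */
static uint64_t area_promoted(unsigned int width, unsigned int height) {
  return (uint64_t)width * height;
}
```

With both operands at 65536 the promoted version returns 2^32, while the wrapped version loses the high bits.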
|
|
|
|
Re-organized the encoder threading tests and grouped tests into
4 parts. Added PSNR checking test to make sure the PSNR variation
is within a small range.
BUG=webm:1376
Change-Id: I09edb990236a87a4d2b2b0e1ceaf6c6435a35eff
|
|
|
|
vp9_highbd_block_error_8bit_c was a very simple wrapper around
vp9_block_error_c. The SSE2 implementation was practically identical to
the non-HBD one. It was missing some minor improvements which only
went into the original version.
In quick speed tests, the AVX implementation showed minimal
improvement over SSE2 when it does not detect overflow. However, when
overflow is detected the function is run a second time. The
OperationCheck test seems to trigger this case and reverses any
speed benefits by running ~60% slower. AVX2 on the other hand is
always 30-40% faster.
Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1
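
For reference, a scalar sketch of what a block-error routine computes. It assumes the routine returns the sum of squared differences between the original and dequantized coefficients and reports the sum of squared original coefficients through an output pointer. The name block_error_sketch and the int32_t coefficient type are placeholders, not the library's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns sum((coeff[i] - dqcoeff[i])^2) and stores sum(coeff[i]^2) in *ssz.
 * Both sums are accumulated in 64 bits, so this version needs no
 * overflow check and no second pass. */
static int64_t block_error_sketch(const int32_t *coeff, const int32_t *dqcoeff,
                                  size_t block_size, int64_t *ssz) {
  int64_t error = 0;
  int64_t sqcoeff = 0;
  size_t i;
  for (i = 0; i < block_size; ++i) {
    const int64_t diff = (int64_t)coeff[i] - dqcoeff[i];
    error += diff * diff;
    sqcoeff += (int64_t)coeff[i] * coeff[i];
  }
  *ssz = sqcoeff;
  return error;
}
```

A SIMD version that accumulates in narrower lanes trades this headroom for speed, which is where an overflow check and a slower fallback pass come from.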
|
|
|
|
Only works for bitdepth = 8 when compiled with the high bitdepth flag.
4x speedup for handling 1:2 down/upsampling.
Validated manually for:
1) Dynamic resize for a single layer encoding
2) SVC encoding with 3 spatial layers
Results are bitexact with the patch and the speed gain (~4x) in the
scaling was verified.
BUG=webm:1371
Change-Id: I1bdb5f4d4bd0df67763fc271b6aa355e60f34712
|
|
Change-Id: Ib04990e4a7bda9fbf501f294da2057a2b2595deb
|
|
|
|
|
|
|
|
The reduction showed improvement on RTC when aq-mode=3 is on.
Add that (cyclic refresh enabled) to the condition.
Only affects 1 pass CBR.
Change-Id: I5d0843002d8e31d7c165098a62e7a71146b08664
|
|
For speed 8 only.
3% speed up for QVGA and 6.3% for VGA on Nexus 6.
~3% avgPSNR decrease on rtc_derf and 2.9% on rtc.
Disabled for now.
Change-Id: I70133f1f6c804d663d594df437bfe7fdb0030d6a
|
|
|
|
This prevents a possible reduction of cyclic refresh after a key frame.
Change-Id: Idd4e49b69cd95476e7eccfa31b2bd8669569e9e8
|
|
The output needs to be aligned. Input is read with 'movq', not 'movdqa',
so it is not expected to be aligned.
Change-Id: Ibd48a84c1785917a6a97c3689a05322abba486b4
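
A small sketch of the alignment condition involved. The aligned 16-byte move (movdqa) faults on an address that is not a multiple of 16, while the 8-byte movq has no such requirement. The helper name is hypothetical.

```c
#include <stdint.h>

/* Returns nonzero when ptr is a multiple of 16, which is what an aligned
 * 16-byte SSE load or store requires of its memory operand. */
static int is_aligned_16(const void *ptr) {
  return ((uintptr_t)ptr & 15u) == 0;
}
```

A test harness could assert this on the output buffer only, leaving the input free to start at any offset.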
|
|
Only affects speed 8. No obvious quality regression. Consistent speedup
of ~1% on Nexus 6.
Change-Id: Ia904ca28ea041c3281c532911ec38fb7d7f46a17
|
|
|
|
|
|
Increase the variance partition thresholds for superblocks that
have low sum-diff (from source analysis prior to encoding frame).
Use it for now only for speed >= 7 or for denoising on.
Small change in metrics on the RTC set: less than ~0.1 avgPSNR decrease
for both speed 7 and 8.
Change-Id: I38325046ebd5f371f51d6e91233d68ff73561af1
|
|
- vpx_idct8x8_12_add_ssse3
vpx_idct8x8_64_add_ssse3
vpx_idct32x32_34_add_ssse3
vpx_idct32x32_135_add_ssse3
vpx_idct32x32_1024_add_ssse3
- turn on unit tests.
Change-Id: I788b2b3b2074a6f3ab6a0e6f469c1327a123eff7
|
|
|
|
Re-enable the affected test.
BUG=webm:1374
Change-Id: I98cd49403927123546d1d0056660b98c9cb8babb
|