Age | Commit message (Collapse) | Author |
|
|
|
Change-Id: I7e96509ffa36899fcd2935749927a1e8aac8d025
|
|
On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30%
increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51
|
|
|
|
|
|
|
|
This commit replaces an integer divide with a table-lookup. It is
to improve decoding speed, and at the same time, to reduce possible
complications with a bug in AMD Family 12h processors:
"665 Integer Divide Instruction May Cause Unpredictable Behavior"
Change-Id: I678b707a538798a923850bac467e66e847e6def7
|
|
master branch."
This reverts commit bde04ce5039cbcf86c8b34bdb4127e18d7e1d0c7
Change-Id: I053dae04c761b04a36dc239558503905a14d2470
|
|
In frame parallel decode, libvpx decoder decodes several frames on all
cpus in parallel fashion. If not being flushed, it will only return frame
when all the cpus are busy. If getting flushed, it will return all the
frames in the decoder. Compare with current serial decode mode in which
libvpx decoder is idle between decode calls, libvpx decoder is busy
between decode calls. VP9 frame parallel decode is >30% faster than serial
decode with tile parallel threading which will makes devices play 1080P
VP9 videos more easily.
* frame-parallel:
Add error handling for frame parallel decode and unit test for that.
Fix a bug in frame parallel decode and add a unit test for that.
Add two test vectors to test frame parallel decode.
Add key frame seeking to webmdec and webm_video_source.
Implement frame parallel decode for VP9.
Increase the thread test range to cover 5, 6, 7, 8 threads.
Fix a bug in adding frame parallel unit test.
Add VP9 frame-parallel unit test.
Manually pick "Make the api behavior conform to api spec." from master branch.
Move vp9_dec_build_inter_predictors_* to decoder folder.
Add segmentation map array for current and last frame segmentation.
Include the right header for VP9 worker thread.
Move vp9_thread.* to common.
ctrl_get_reference does not need user_priv.
Seperate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
Conflicts:
test/codec_factory.h
test/decode_test_driver.cc
test/decode_test_driver.h
test/invalid_file_test.cc
test/test-data.sha1
test/test.mk
test/test_vectors.cc
vp8/vp8_dx_iface.c
vp9/common/vp9_alloccommon.c
vp9/common/vp9_entropymode.c
vp9/common/vp9_loopfilter_thread.c
vp9/common/vp9_loopfilter_thread.h
vp9/common/vp9_mvref_common.c
vp9/common/vp9_onyxc_int.h
vp9/common/vp9_reconinter.c
vp9/decoder/vp9_decodeframe.c
vp9/decoder/vp9_decodeframe.h
vp9/decoder/vp9_decodemv.c
vp9/decoder/vp9_decoder.c
vp9/decoder/vp9_decoder.h
vp9/encoder/vp9_encoder.c
vp9/encoder/vp9_pickmode.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_cx_iface.c
vp9/vp9_dx_iface.c
Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64
|
|
Change-Id: I78ef7f89586a329787f6bc4c58ec83af210989a3
|
|
For low spatial resolutions: bias partittion selection to smaller block sizes,
and base the variance computation on 4x4 down-sampling.
Also move the threshold computations into the choose_partitioning,
so they are computed once for each sb block.
On low-res clips (RTC_derf) PSNR/SSIMetrics increase by about 4-5%.
No change for resolutions above CIF.
Change-Id: I93f8ff742c8044786977bb6e31dcf8efda6dd1b0
|
|
|
|
Just before a forced key frame we often get a foreshortened
arf/gf group. In such a case, we do not want to update
rc->last_boosted_qindex, which is used to define the Q range
for the forced key frame itself.
This gives a small average metrics gain for the YT and YT-HD sets
(eg. YT SSIM +0.141%).
Change-Id: Ie06698bc4f249e87183b8f8fb27ff8f3fde216d9
|
|
|
|
The comparison of address in the condition is not necessary, since
they will constantly be non-null.
Change-Id: Id0b0075283f5af65215d5761a8160a4cb2a15c9b
|
|
Change-Id: I3d324e2baa4de2d266c5f7ca7b635b62372e90a7
|
|
|
|
|
|
Added non420 part back to make it consistent with single
thread code in vp9_loopfilter.c.
Change-Id: I8ca255d73bffebae294d2627d6655eafe535cb90
|
|
|
|
The SSE2 code is from VP8 MFQE, reuse it in VP9. No change on VP8
side. In our testing, we achieve 2X speed by adopting this change.
Change-Id: Ib2b14144ae57c892005c1c4b84e3379d02e56716
|
|
The 16 bit sum vector was overflowing.
Change-Id: I0fdf38e832ee99457ec8680a92691a6175ff8c3f
|
|
1. Added row-based loopfilter in encoder;
2. Moved common multi-threaded loopfilter functions from decoder
to common;
3. Merged multi-threaded loopfilter code, and made encoder/
decoder call same function to reduce code duplication.
Encoder tests showed that 1% - 2% speedup was seen for good-quality
2-pass mode(at speed 3); 1% - 3% speedup using 2 threads and 4% - 6%
speedup using 4 threads were seen for real-time mode(at speed 7).
Change-Id: I8a4ac51c2ad9bab9fa7b864e90743931c53ec1c4
|
|
|
|
This commit fixes a bug in denoiser reference frame buffer swap,
which disables frame buffer update.
Change-Id: I39a9427180fd18f9692602064ad821f7af4714c0
|
|
This is to make the usage of the variable name consistent across
the code base.
Change-Id: I698739e55841c59358d1c6e5cc97c96088772943
|
|
Change-Id: I78ecc8ec3fa3ba5f69bb23813e68a5255d0534e1
|
|
On Nexus 7 speed -5, -6, -7, and -8 saw about a 1% increase
in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 1.5%
increase in perf for 720p.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee
|
|
On some platforms, such as 32bit Windows and 32bit Mac, the allocated
memory isn't aligned automatically. The thread data is aligned to
ensure the correct access in SIMD code.
Change-Id: I1108c145fe982ddbd3d9324952758297120e4806
|
|
|
|
|
|
|
|
This commit adds encoder side control for vp9 to set color space info
in the output compressed bitstream.
It also amends the "vp9_encoder_params_get_to_decoder" test to verify
the correct color space information is passed from the encoder end to
decoder end.
Change-Id: Ibf5fba2edcb2a8dc37557f6fae5c7816efa52650
|
|
|
|
On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase
in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10%
increase in perf for 720p.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334
|
|
Saves 5 instructions on 8x8 and 16x16 and 8 instructions
on 32x32, when compiled with 4.9.
Change-Id: Id3da613a36a9d27d8c5169c59ba45d247c920c6c
|
|
|
|
Don't put small empty frame in front of a key frame. We will
put key frame flag in webm container if there's a visible key
frame. But there will be decoding error when we seek to here
if we put the small empty frame, which will be inter frame,
in front of it.
Change-Id: Id50c2c1fd31da0405ff6faa7375cc2f49c55402d
|
|
This commit added a field to vpx_image_t for indicating color space,
the field is also added to YUV_BUFFER_CONFIG. This allows the color
space information pass through the decoder from input stream to the
output buffer.
The commit also updated compare_img() function with added verification
of matching color space to ensure the color space information to be
correctly passed from encode to decoder in compressed vp9 streams.
Change-Id: I412776ec83defd8a09d76759aeb057b8fa690371
|
|
Add optimized Neon functions of:
vp9_variance32x64
vp9_variance64x32
vp9_variance64x64
On Nexus 7 speed -5 and -6 saw about a 4% increase in perf.
Speeds -7 and -8 saw about a 6% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: I5a81f13c9897eb927fa39662530f5524a0f768fa
|
|
|
|
|
|
Change-Id: If64052cc6e404abc8a64a889f42930d14fad21d3
|
|
Replaced "color space" with "color format" in comments where color
sampling format is concerned, so to differentiate from the concept
defined in COLOR_SPACE.
Change-Id: I8c935034c166b24307a99352dab1686531276bb8
|
|
|
|
|
|
|
|
|
|
|
|
|