path: root/vp9
Age  Commit message  Author
2014-01-13Merge "Cleaning up and fixing psnr calculation code."Dmitry Kovalev
2014-01-13Enable reference frame masking for rt modeYaowu Xu
Reference frame masking helped good quality mode gain about 5% in encoding speed; this commit enables it for rt mode to get the speed improvement there as well. In addition, this commit moves the speed feature setup to a separate function. Change-Id: I015e8f78bbb21dd43ae183b9b9355bea2ccda9c5
2014-01-13No arf right before real scene cut.Paul Wilkins
To reduce pulsing we now allow an arf just before forced key frames and at the end of a clip or section (which may be stitched to another clip or section). However, this does not make sense for key frames arising from real scene cuts. Change from original patch reflects other recent changes in regard to alignment of gf/arf and kf groups. Change-Id: I074a91d1207e9b3e28085af982f6718aa599775f
2014-01-13Further rate control tweaks and fixes.Paul Wilkins
Further fixes regarding min and max rate. Bug fixes re kf group bits and last kf group. Change-Id: Iaafd719d30a489e135a3c55851ce8c632091a436
2014-01-11Merge "cosmetics: vp9_reconinter.h: make some variables const"James Zern
2014-01-10Merge "Cleaning up vp9_rc_postencode_update() function."Dmitry Kovalev
2014-01-10Cleaning up and fixing psnr calculation code.Dmitry Kovalev
Introducing calc_psnr() which calculates psnr between two yv12 buffers. Previously we incorrectly used width/height instead of crop_width/crop_height to calculate number of samples -- fixed. Change-Id: Iecda01980555de55ad347e0276e6641c793fa56c
2014-01-10Merge "Cleaning up vp9_dx_iface.c."Dmitry Kovalev
2014-01-10Merge "Declare setup_buffer_inter in vp9_rdopt.h"Jingning Han
2014-01-10Merge "Enable skipping reference frame check in rd loop"Jingning Han
2014-01-10Merge "Removing mi_height_log2_lookup table."Dmitry Kovalev
2014-01-10explain speed featuresJim Bankoski
Added comments to explain what the various speed features do, and removed one that was clearly unused. Change-Id: Icd37a536072ddafedbfaefcecbe48979f6d10faf
2014-01-10Declare setup_buffer_inter in vp9_rdopt.hJingning Han
This function initializes buffer pointers and first stage motion vector prediction. It will be needed by both the regular rate-distortion optimization loop and the non-RD mode decision. Hence move its declaration to vp9_rdopt.h. Change-Id: I64e8b6316c9d05f20756a62721533a2e4d158235
2014-01-10Removing mi_height_log2_lookup table.Dmitry Kovalev
Change-Id: I1f0ae2edc3a96b33c0494d165ae756a8feba6184
2014-01-10Merge "Don't use gf_update by default for 1-pass CBR."Marco Paniconi
2014-01-10Cleaning up vp9_dx_iface.c.Dmitry Kovalev
Change-Id: I6a0dfb95c55ee6cadc7b1675782c7830e5c7caaf
2014-01-10Cleaning up vp9_rc_postencode_update() function.Dmitry Kovalev
Change-Id: I02e44c10660fdb9201a802ad19ceb64756feeebe
2014-01-10Don't use gf_update by default for 1-pass CBR.Marco Paniconi
Change-Id: I5df6abceb0a2a69706feadeb820b593cae88f573
2014-01-10Merge "Adding {get, set}_rate_correction_factor() functions."Dmitry Kovalev
2014-01-10Merge "Keep buffer clipped to maximum in change_config."Marco Paniconi
2014-01-10Revert "SSSE3 convolution optimization"Paul Wilkins
This reverts commit 511d218c60b9b6c1ab9383db746815e907af0359. In their current form, the intrinsics break the borg build. Change-Id: Ied37936af841250ecff449802e69a3d3761c91b9
2014-01-09Enable skipping reference frame check in rd loopJingning Han
This commit allows the encoder to compare, per reference frame, the SAD cost associated with the best motion vector predictor. If this cost for one reference frame is more than 4 times the best SAD cost given by the other reference frames, the NEARESTMV, NEARMV, and ZEROMV mode checks are skipped for that reference frame. This setting is turned on in speed 2 and above. Compression quality change in speed 2: derf -0.014%, yt -0.097%, hd -0.023%, stdhd 0.046%. It reduces the speed 2 runtime of test sequences: pedestrian_area_1080p at 4000 kbps, 310763 ms -> 303595 ms; bluesky_1080p at 6000 kbps, 259852 ms -> 251920 ms. Change-Id: I7f59cf79503d51836d61d56d50dc5bdf0e502e22
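The 4x SAD rule above can be sketched as follows. This is an illustration of the stated threshold only, not the encoder's code: `NUM_REFS`, `ref_skip_mask`, and the bitmask return convention are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical count of reference frames (e.g. last, golden, alt-ref). */
enum { NUM_REFS = 3 };

/*
 * pred_sad[i] is the SAD of the best motion vector predictor for
 * reference frame i. Bit i of the result is set when that frame's SAD
 * exceeds 4 times the best SAD among the other reference frames, which
 * is the condition under which its NEARESTMV, NEARMV, and ZEROMV checks
 * would be skipped.
 */
static unsigned ref_skip_mask(const unsigned pred_sad[NUM_REFS]) {
  unsigned mask = 0;
  assert(pred_sad != 0);
  for (int i = 0; i < NUM_REFS; ++i) {
    unsigned best_other = UINT_MAX;
    for (int j = 0; j < NUM_REFS; ++j) {
      if (j != i && pred_sad[j] < best_other) best_other = pred_sad[j];
    }
    /* Widen before multiplying so the 4x threshold cannot overflow. */
    if ((unsigned long long)pred_sad[i] > 4ULL * best_other) mask |= 1u << i;
  }
  return mask;
}
```

The comparison is strict, so a frame whose SAD is exactly 4 times the best of the others is still checked.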
2014-01-09Merge "Optimze inv 16x16 DCT with 10 non-zero coeffs - P2"Jingning Han
2014-01-09Merge "Optimze inv 16x16 DCT with 10 non-zero coeffs - P1"Jingning Han
2014-01-09Merge "Cleanups on refresh flags"Deb Mukherjee
2014-01-09Cleanups on refresh flagsDeb Mukherjee
Cleanups on frame refresh flags and external overrides. Change-Id: Ia6a56fe1bde906b1dc3fcbf4ef1c7b207cd2df2d
2014-01-09Merge "Use the correct member for initialization"Johann
2014-01-09Merge "Simplify set_rt_speed_feature()"Yaowu Xu
2014-01-09Keep buffer clipped to maximum in change_config.Marco Paniconi
Under a configuration change, where the bitrate suddenly decreases, the buffer level may be larger than maximum allowed (for that first frame to be encoded after change_config). This change keeps it clipped to its maximum level. Change-Id: I4d0b5b3d1fd8148600dd39e02bd630c9464baba5
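A minimal sketch of the clipping described above, assuming a buffer whose maximum size is derived from the target bitrate and a duration in milliseconds. The `rc_state_t` struct and `change_bitrate` function are hypothetical and do not reproduce the encoder's change_config code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rate control state, in bits. */
typedef struct {
  int64_t buffer_level;
  int64_t maximum_buffer_size;
} rc_state_t;

/*
 * Recompute the maximum buffer size for a new target bitrate, then clip
 * the current level so the first frame after the change never sees a
 * level above the new maximum.
 */
static void change_bitrate(rc_state_t *rc, int64_t target_bps,
                           int64_t max_buffer_ms) {
  assert(target_bps >= 0 && max_buffer_ms >= 0);
  rc->maximum_buffer_size = target_bps * max_buffer_ms / 1000;
  if (rc->buffer_level > rc->maximum_buffer_size) {
    rc->buffer_level = rc->maximum_buffer_size;
  }
}
```

Raising the bitrate leaves the level untouched; only a decrease that shrinks the maximum below the current level triggers the clip.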
2014-01-09Merge "Renaming 'Sharpness' to 'sharpness'."Dmitry Kovalev
2014-01-09Simplify set_rt_speed_feature()Yaowu Xu
1. Made speed choices progressive. 2. Adjusted rt speed settings to achieve a better speed/quality trade-off. Overall, rt-5 gained 2.5% in compression/quality, and the encoding time of the 720p niklas clip went from 137,052 ms to 121,874 ms. Change-Id: Ia6e7e1e15225395a868a2f1059c3db8e266e1600
2014-01-09Optimze inv 16x16 DCT with 10 non-zero coeffs - P2Jingning Han
This commit further optimizes SSE2 operations in the second 1-D inverse 16x16 DCT, with (<10) non-zero coefficients. The average runtime of this module goes down from 779 cycles -> 725 cycles. Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
2014-01-09Merge "SSSE3 convolution optimization"Yunqing Wang
2014-01-09SSSE3 convolution optimizationlevytamar82
Optimizing all SSSE3 assembly for convolution: 1. vp9_filter_block1d4_h8_sse2 2. vp9_filter_block1d8_h8_sse2 3. vp9_filter_block1d16_h8_sse2 4. vp9_filter_block1d4_v8_sse2 5. vp9_filter_block1d8_v8_sse2 6. vp9_filter_block1d16_v8_sse2. The optimizations include: processing 2x8 elements in one 128-bit register instead of 8 elements in one 128-bit register, and removing unnecessary loads. This optimization gives a user-level gain of 2.4% for 480p input and 1.6% for 720p. It is done only for 64-bit. Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969
2014-01-09Merge "Using VP9_COMMON instead of VP9_COMP."Dmitry Kovalev
2014-01-09Merge "Fix rate allocation bug."Paul Wilkins
2014-01-08Use the correct member for initializationJohann
On Windows this fails with: error C2440: 'initializing': cannot convert from int_mv to uint32_t Change-Id: I51630efd0e83a0ce620c91aa7859dd6fc1572e99
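The class of error quoted above can be reproduced with a small sketch. The type definitions here are assumptions modeled on a motion vector union with an integer view. They are illustrative and are not copied from the libvpx headers, and the commit's actual change is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical motion vector types for illustration. */
typedef struct {
  int16_t row;
  int16_t col;
} mv_t;

typedef union {
  uint32_t as_int; /* packed view of the whole vector */
  mv_t as_mv;      /* component view */
} int_mv_t;

static uint32_t packed_mv(const int_mv_t *mv) {
  assert(mv != 0);
  /*
   * Writing `uint32_t v = *mv;` would try to initialize an integer from
   * the union itself, which a compiler rejects. Naming the integer
   * member selects a value of the right type.
   */
  return mv->as_int;
}
```

Under these assumptions, the fix amounts to initializing from the matching member rather than from the union as a whole.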
2014-01-08Using VP9_COMMON instead of VP9_COMP.Dmitry Kovalev
Change-Id: If7d3958653104f3e170853e931f8489de3ecf3cc
2014-01-08Merge "Removing direct references to {lst_fb, gld_fb, alt_fb}_idx fields."Dmitry Kovalev
2014-01-08Merge "Cleanups around cpi->common."Dmitry Kovalev
2014-01-08Adding {get, set}_rate_correction_factor() functions.Dmitry Kovalev
Change-Id: Ib3212832953a3445fc5f021af0e1de7886f09b4f
2014-01-08Merge "Renaming 'Mode' to 'mode'."Dmitry Kovalev
2014-01-08Optimze inv 16x16 DCT with 10 non-zero coeffs - P1Jingning Han
This commit is the first patch optimizing the SSE2 implementation of the inverse 16x16 DCT with <10 non-zero coefficients. It focuses on the first 1-D (row) transformation. It exploits the fact that only the top-left 4x4 block contains non-zero coefficients in a 2-D inverse 16x16 DCT with <10 coefficients. The average runtime of the idct16x16_10 unit is reduced from 883 cycles -> 779 cycles (12% faster). For pedestrian_area_1080p, 300 frames at 4000 kbps, the speed 2 runtime goes down from 310651 ms -> 305910 ms. The decoding speed goes up from 80.37 fps -> 80.87 fps. Change-Id: Ic6f3ac5a637a76c07ba73ddaafe318a699fea645
2014-01-08Removing direct references to {lst_fb, gld_fb, alt_fb}_idx fields.Dmitry Kovalev
Change-Id: Ib1d9628d2b538b6dc27b0db1fa7f40f70ff2072f
2014-01-08Cleanups around cpi->common.Dmitry Kovalev
Change-Id: I0c42a729038d0f4cb7bc07f587d066fcb1dfe9d9
2014-01-08Merge "Add a C fallback for get_msb() and change inline to INLINE."Alex Converse
2014-01-08Merge "Add initial intra frame neon optimization. 1~2% gain."hkuang
2014-01-08Renaming 'Mode' to 'mode'.Dmitry Kovalev
Change-Id: I6cdd670d66288dbd66228f38bba6b30502d25362
2014-01-08Renaming 'Sharpness' to 'sharpness'.Dmitry Kovalev
Change-Id: I54513dc3b3321e0c0bb6b15ea5c34085ed80b4a4
2014-01-08Merge "Using struct twopass_rc* instead of VP9_COMP*."Dmitry Kovalev