path: root/vp9/common
Age Commit message Author
2013-11-18Improve vp9_iht4x4_16_add_sse2 (x1.341)Abo Talib Mahfoodh
This rebase is a better implementation than the previous ones. Modifications were made to reduce the total number of clock cycles. Speedup: 1.341. Compiled with -O3. Tested with: park_joy_420_720p50.y4m. Change-Id: I940eaf283f60597ca0d9d2e13d518878d55ff02d
2013-11-18Merge "Move vp9_extend.{h,c} from common to encoder"Yaowu Xu
2013-11-18Move vp9_extend.{h,c} from common to encoderYaowu Xu
Since they are used in the encoder only. This commit also reorders the includes in the files that include vp9_extend.h. Change-Id: I929fc113f2135d3198cd1fc6a17434e5a2f8a459
2013-11-18Merge "Do horizontal loopfiltering in parallel"Yunqing Wang
2013-11-17partition context update speedupJim Bankoski
This removes a lot of operations in setting partition context... Change-Id: I365e6f5607ece85190cb21443988816dfa510ce3
2013-11-15Do horizontal loopfiltering in parallelYunqing Wang
This patch followed "Rewrite filter_selectively_horiz for parallel loopfiltering" commit, and added x86 SSE2 optimization to do 16-pixel filtering in parallel. Also, corrected the declaration of aligned arrays. For 8-pixel-in-parallel case, improved the calculation of the masks and filters. Updated the threshold loading since the thresholds were already duplicated. Updated neon C functions to call neon loopfilters twice. Using tulip clip, tests showed it gave a ~1.5% decoder speed gain. Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35
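The mask checking that lets two adjacent 8-pixel units be filtered as one 16-pixel unit can be sketched roughly as follows. This is a minimal illustration, not the libvpx code: `FilterStats` and `filter_row_sketch` are hypothetical names, and the assumption that a 16-pixel call is chosen when two neighbouring mask bits are both set is made for the sake of the example. The real filter calls are replaced by counters.

```c
#include <stdint.h>

/* Hypothetical counters standing in for the real filter calls. */
typedef struct {
  int wide_calls;   /* 16-pixel (two units at once) filter invocations */
  int narrow_calls; /* 8-pixel (single unit) filter invocations */
} FilterStats;

/* Walk a per-row bit mask, one bit per 8-pixel unit (bit 0 = leftmost).
 * Assumption: when two adjacent bits are set, both units are handled by
 * a single 16-pixel call; otherwise a set bit gets an 8-pixel call. */
static void filter_row_sketch(unsigned int mask, FilterStats *stats) {
  while (mask) {
    int count = 1; /* number of units consumed in this step */
    if (mask & 1) {
      if ((mask & 3) == 3) {
        stats->wide_calls++; /* filter 16 pixels in one call */
        count = 2;
      } else {
        stats->narrow_calls++; /* filter 8 pixels */
      }
    }
    mask >>= count;
  }
}
```

With this shape, a fully set mask needs half as many filter calls as the one-unit-at-a-time loop, which is where a SIMD implementation of the 16-pixel case can pay off.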
2013-11-15Let the idct vp9_idct32x32_34_add = vp9_idct32x32_1024_addhkuang
on arm until we implement a real vp9_idct32x32_34_add_neon. This issue is due to commit 47665452f0da3c11427ecb4852535e1787bb0c5b Merge "Add 32x32 idct function for eob<=34 case". Change-Id: I56b5f0abc20e7dd1bba521f78a995e85d65ea296
2013-11-15Merge "Cleaning up vp9_loopfilter.c file."Dmitry Kovalev
2013-11-15Merge "Fix coding format in vp9_idct"Jingning Han
2013-11-15partition plane context speed upJim Bankoski
Removes silly operations inside loop. Change-Id: I9eeab1e914e715a887f86cf1089de508e2364165
2013-11-15Merge "loop filter assert cleanout"Jim Bankoski
2013-11-14Merge "Cleaning up vp9_tile_common.{h, c} files."Dmitry Kovalev
2013-11-14Fix coding format in vp9_idctJingning Han
Change-Id: If97ae16a4478717933345b6b9d5bc1b417b8dd84
2013-11-14fix scaling bug by buffer auto-reallocationAdrian Grange
Change-Id: Ib748eb287520c794631697204da6ebe19523ce95
2013-11-14Cleaning up vp9_loopfilter.c file.Dmitry Kovalev
Change-Id: Ic6770072f80dfb54d2725ed96370d4f243a9f474
2013-11-14Cleaning up vp9_tile_common.{h, c} files.Dmitry Kovalev
Change-Id: I9d18f351abe7614107f34f47eeb38a234a9937c9
2013-11-14loop filter assert cleanoutJim Bankoski
Change-Id: I4e2ad4b7342681e6ac236356ef3a4927a54f105b
2013-11-13Simplifies band-getting with a static arrayDeb Mukherjee
Simplifies the code by implementing band mapping with static arrays. A lot of the code complexity introduced in a previous patch disappears. Change-Id: Ia3fac36e594fb5ad2d55ae141c58bba4c55c2d28
2013-11-13Merge "Removing function pointers from inter prediction."Dmitry Kovalev
2013-11-13Merge "Optimizing set_contexts() function."Dmitry Kovalev
2013-11-13Merge "Use 1D array to store super block filter levels"Yunqing Wang
2013-11-13Merge "mips dsp-ase r2 vp9 decoder intra module optimizations (rebase)"Johann
2013-11-13mips dsp-ase r2 vp9 decoder intra module optimizations (rebase)Parag Salasakar
Change-Id: Ib27fc4f3dbe01fe8adfa04a61aaba21b3480e75c
2013-11-13mips dsp-ase r2 vp9 decoder loopfilter module optimizations (rebase)Parag Salasakar
Change-Id: Ia7f640ca395e8deaac5986f19d11ab18d85eec2d
2013-11-12Moving q_index from MACROBLOCKD to MACROBLOCK.Dmitry Kovalev
Moving because q_index is used only by encoder. Change-Id: I0b96175614ed4fd3d76ee56a0ba36258e1e896f6
2013-11-12Merge "Using max_tx_size instead of bsize when possible."Dmitry Kovalev
2013-11-12Merge "Moving {sb, mb, b, ab}_index from MACROBLOCKD to MACROBLOCK."Dmitry Kovalev
2013-11-12Merge "Adding const to tree pointer inside vp9_extra_bit struct."Dmitry Kovalev
2013-11-12Merge "Added optimized vp9_idct32x32_34_add_dspr2"Johann
2013-11-12Adding const to tree pointer inside vp9_extra_bit struct.Dmitry Kovalev
Change-Id: I60e02fa3de930ff1f969687ab5af93dee40d86ad
2013-11-12Use 1D array to store super block filter levelsYunqing Wang
As Jim suggested, a 1D array is used to store filter levels instead of a 2D array. This uses shift_y in setup_mask directly, and saves a few cycles. Change-Id: If61ab298784861f1806b1cd396d4e4e2e0f097b9
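A minimal sketch of the 1D layout, assuming a 64x64 superblock divided into an 8x8 grid of 8x8 blocks. The names `FilterLevels1D`, `set_level` and `get_level` are hypothetical and are not taken from libvpx; the point is only that a precomputed row offset (a shift of the row index) indexes the flat array directly.

```c
#include <stdint.h>

#define UNITS_PER_SIDE 8 /* assumed: 64x64 SB split into 8x8 blocks */

/* Hypothetical container: one filter level per 8x8 block, stored flat. */
typedef struct {
  uint8_t lfl[UNITS_PER_SIDE * UNITS_PER_SIDE];
} FilterLevels1D;

/* Store a level; the row offset is just the row index shifted by 3. */
static void set_level(FilterLevels1D *f, int row, int col, uint8_t level) {
  const int row_offset = row << 3; /* row * UNITS_PER_SIDE */
  f->lfl[row_offset + col] = level;
}

/* Read a level back using the same flat index. */
static uint8_t get_level(const FilterLevels1D *f, int row, int col) {
  return f->lfl[(row << 3) + col];
}
```

A caller that already holds the row offset can reuse it for every column in the row instead of recomputing a two-level index each time.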
2013-11-12Merge "Removes conditional statements from band getting"Deb Mukherjee
2013-11-12Use lowercase 'b' to branchJohann
iOS doesn't recognize B: bad instruction `B idct32_pass_loop' Change-Id: I3cf6aede4639f1d9efa97f7962fa287ba6feaaef
2013-11-12Merge "Rewrite filter_selectively_horiz for parallel loopfiltering"Yunqing Wang
2013-11-12Merge "Improve loopfilter function"Yunqing Wang
2013-11-12Removes conditional statements from band gettingDeb Mukherjee
Implements scan order to band map with arrays in both the encoder and decoder to remove conditional statements. Encoding seems to be about 1% faster at speed 0, tested on football. Decoding seems to be about 0.5-1% faster on a set of 25 videos. Change-Id: Idb233ca0b9e0efd790e30880642e8717e1c5c8dd
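The idea of replacing conditional statements with an array lookup can be sketched as below. The band boundaries and the names `band_lookup_4x4`, `get_band_branchy` and `get_band_table` are illustrative assumptions, not values or functions copied from libvpx.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical band assignment for the 16 scan positions of a 4x4
 * transform; the boundaries are chosen only for illustration. */
static const uint8_t band_lookup_4x4[16] = {
  0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5
};

/* Conditional version: a chain of comparisons per coefficient. */
static int get_band_branchy(int scan_pos) {
  if (scan_pos == 0) return 0;
  if (scan_pos <= 2) return 1;
  if (scan_pos <= 5) return 2;
  if (scan_pos <= 9) return 3;
  if (scan_pos <= 12) return 4;
  return 5;
}

/* Table version: one bounds assertion and one load, no branches. */
static int get_band_table(int scan_pos) {
  assert(scan_pos >= 0 && scan_pos < 16);
  return band_lookup_4x4[scan_pos];
}
```

Both functions return the same band for every scan position; the table version trades a small amount of read-only memory for the removal of the comparison chain from the per-coefficient loop.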
2013-11-11Removing function pointers from inter prediction.Dmitry Kovalev
Removing foreach_predicted_block_visitor and calling build_inter_predictors directly. Change-Id: I11bb3c872b99b47c2680b01b0dbcc01c558c4a2b
2013-11-11Rewrite filter_selectively_horiz for parallel loopfilteringYunqing Wang
Added loop filter mask checking, and made the caller function ready for implementation of parallel loopfiltering in horizontal direction. Next, we need to go through the loopfilter functions (both c and optimized versions), and provide 16-byte wide loopfiltering for each filter type. Change-Id: Ifef47e7ef9086ebc2fd6ca7ede8f27c9bbf79e66
2013-11-11Moving {sb, mb, b, ab}_index from MACROBLOCKD to MACROBLOCK.Dmitry Kovalev
We use {sb, mb, b, ab}_index only inside encoder, so moving them into appropriate data structure. Change-Id: Ib5c1036716354d9d321e11a60c1634c1cb8f9716
2013-11-11Decouple macroblockd_plane buffer usageJingning Han
Make the macroblockd_plane contain dynamic buffer pointers instead of static pointers to the memory space allocated therein. The decoder uses the buffer allocated in pbi, while the encoder will use a dual buffer approach for rate-distortion optimization search. Change-Id: Ie6f24be2dcda35df7c15b4014e5ccf236fb3f76c
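One way to picture the decoupling is a plane structure that only holds a pointer, with the storage owned elsewhere. The sketch below is an assumption about what a "dual buffer approach" could look like, not the encoder's actual implementation; `PlaneSketch`, `DualBufferSketch`, `use_scratch` and `keep_candidate` are hypothetical names.

```c
#include <stdint.h>

#define COEFF_COUNT 16 /* illustrative buffer size */

/* Hypothetical plane: holds a pointer, not the storage itself. */
typedef struct {
  int16_t *dqcoeff; /* points at caller-owned memory */
} PlaneSketch;

/* Hypothetical encoder-side storage: two buffers, one holding the best
 * candidate found so far and one used as scratch space. */
typedef struct {
  int16_t buf[2][COEFF_COUNT];
  int best; /* index (0 or 1) of the buffer holding the best candidate */
} DualBufferSketch;

/* Point the plane at the scratch buffer before trying a new candidate. */
static void use_scratch(PlaneSketch *p, DualBufferSketch *d) {
  p->dqcoeff = d->buf[d->best ^ 1];
}

/* The candidate in the scratch buffer won: swap roles without copying. */
static void keep_candidate(DualBufferSketch *d) {
  d->best ^= 1;
}
```

Because the plane only stores a pointer, a decoder-style caller could point it at a single buffer it owns, while a search loop can flip between two buffers and keep the winner without a memcpy.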
2013-11-11Fix a bug in the assembly code.hkuang
Change-Id: Ic416e3f8a11e82ee298e6f709b2119a9ddf1e2f8
2013-11-11Merge "Localizing NEARESTMV special cases in the code."Dmitry Kovalev
2013-11-08Optimizing set_contexts() function.Dmitry Kovalev
Inlining set_contexts_on_border() into set_contexts(). The only difference is the additional check that "has_eob != 0" in addition to "xd->mb_to_right_edge < 0" and "xd->mb_to_bottom_edge < 0". If has_eob == 0 then memset does the right thing and works faster. Change-Id: I5206f767d729f758b14c667592b7034df4837d0e
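The reasoning about has_eob can be illustrated with a small sketch. It is not the libvpx function: `set_context_row` and its `n_visible` parameter are hypothetical, standing in for the border test on the frame edge. Entries inside the frame receive the flag and entries past the edge are cleared; when the flag is zero, both regions receive zero anyway, so a single memset over the whole span is enough.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill n context entries for one block.
 * n_visible is the number of entries inside the frame (0 <= n_visible);
 * entries past the frame edge must always be zero. */
static void set_context_row(uint8_t *ctx, int n, int n_visible, int has_eob) {
  if (has_eob && n_visible < n) {
    /* Border case with a nonzero flag: split the write in two. */
    memset(ctx, 1, (size_t)n_visible);
    memset(ctx + n_visible, 0, (size_t)(n - n_visible));
  } else {
    /* Either fully inside the frame, or has_eob == 0, in which case
     * every entry is zero regardless of the border. */
    memset(ctx, has_eob ? 1 : 0, (size_t)n);
  }
}
```

The extra has_eob test therefore only steers the zero case onto the cheaper single-memset path; the values written are the same as with the border-only test.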
2013-11-08Merge "Improve vp9_idct4x4_1_add_sse2"Yunqing Wang
2013-11-08Improve loopfilter functionYunqing Wang
This patch continues the work done in "Rewrite loop_filter_info_n struct" (commit: 00dbd369c70270428d56da6d15ea5486fc821c52) to further improve the loopfilter function.
1. Instead of storing pointers to thresholds, store loopfilter levels within the 64x64 SB.
2. Since loopfilter levels are already calculated in setup_mask, we don't need to call build_lfi to look them up again; just save the loopfilter levels in setup_mask.
3. Reorganized and simplified filter_block_plane().
Tests showed a ~0.8% decoder speedup. Change-Id: I723c7779738bbc2afcb9afa2c6f78580ee6c3af7
2013-11-07Merge "Add back vp9_short_idct32x32_1_add_neon which is deleted in cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3."hkuang
2013-11-06Merge "Move SVC per-frame loop from sample app into libvpx proper"Ivan Maltz
2013-11-06Move SVC per-frame loop from sample app into libvpx properIvan Maltz
SVC multiple layer per frame encoding is invoked with vpx_svc_init and vpx_svc_encode. These interfaces are designed to be invoked from ffmpeg.
Additional improvements:
- make dummy frame handling a bit more explicit
- fixed bug with single layer encodes
- track individual frame sizes and psnrs instead of averages
- parameterized quantizer, 16th scalefactors, more logging
- enabled single layer encodes to generate baseline
- include new mode for 3 layer I frame with 5 total layers
Change-Id: I46cfa600d102e208c6af8acd6132e0cc25cda8d4
2013-11-06Replacing mi_{width,height}_log2 with num_8x8_blocks_{wide,high}_lookup.Dmitry Kovalev
Change-Id: I04c55daef89bca2b85cb7db0850f9b052abc5a7c
2013-11-06Merge "Missing _ means no sse3 for vp9_h_predictor_32x32."Yaowu Xu