Age | Commit message | Author
2013-06-21 | Remove unused vp9_build_intra_predictors_sb{y,uv}_s | John Koleszar
These functions are no longer referenced. Change-Id: If2705dfbc607f79ec8ec2242d5e03bec27a35aaf
2013-06-21 | Remove unused vp9_model_to_full_probs_sb() | John Koleszar
This function is never referenced. Change-Id: I1c42cd355bfa88e17d169f7335a44be682af58cc
2013-06-20 | Merge "Get some speed back for cpuused 1" | Yaowu Xu
2013-06-20 | Get some speed back for cpuused 1 | Yaowu Xu
and remove unused code. Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
2013-06-20 | Merge "rename variables to avoid build error in MSVC" | Yaowu Xu
2013-06-20 | rename variables to avoid build error in MSVC | Yaowu Xu
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
2013-06-20 | Merge "Implement sse2 and ssse3 versions for all sub_pixel_variance sizes." | Yaowu Xu
2013-06-20 | Merge "clean out libvpx-srcs.txt if built" | Jim Bankoski
2013-06-20 | clean out libvpx-srcs.txt if built | Jim Bankoski
Change-Id: Idfd69e66e8982275eb00d8007a55efd1a4f86a98
2013-06-20 | Merge "Revert "test_libvpx: disable pthreads in gtest"" | James Zern
2013-06-20 | Fix win64 warning. | Frank Galligan
- size_t vs int. Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
2013-06-20 | Revert "test_libvpx: disable pthreads in gtest" | James Zern
This reverts commit 90a9900abb79fabfd44189a959d14ca677c2777a
Seems to break the Mac build:
src/include/gtest/internal/gtest-port.h:1208:: pthread_mutex_lock(&mutex_)failed with error 22
Abort trap: 6
Change-Id: Icbe31161d7c27f1b0a28d33409e7712430bbf0ae
2013-06-20 | Merge "Add unit tests for 4x4 ADST" | Jingning Han
2013-06-20 | Merge "Cast value to avoid size_t/int warning on win64" | Johann
2013-06-20 | Merge "Renaming 'nmv' to 'mv' for several functions." | Dmitry Kovalev
2013-06-20 | Merge "Function decomposition inside vp9_decodemv.c file." | Dmitry Kovalev
2013-06-20 | Improving model rd with variance and quant step | Deb Mukherjee
Improves the rd modeling function and implements it using interpolation from a table, which is a little faster. Also uses sse as input to the modeling function rather than var, since there is no dc prediction used and as a result the sse works a little better. derfraw300: +0.05% Speedup: ~1% Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff
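The table-interpolation idea in the commit above can be illustrated with a small sketch. This is not the libvpx rd model: the function name `interp_table`, the Q8 fixed-point index, and any table contents passed to it are assumptions made purely for illustration.

```c
#include <assert.h>

/* Hypothetical helper: linearly interpolate a lookup table at a fractional
 * index given in Q8 fixed point (256 == 1.0). Indices outside the table are
 * clamped to the first or last entry. */
static int interp_table(const int *tab, int n, int x_q8) {
  int i, frac;
  assert(n > 0);
  if (x_q8 <= 0) return tab[0];
  i = x_q8 / 256;    /* integer part of the index */
  frac = x_q8 % 256; /* fractional part, 0..255 */
  if (i >= n - 1) return tab[n - 1];
  /* Blend the two neighbouring entries; division truncates toward zero. */
  return tab[i] + ((tab[i + 1] - tab[i]) * frac) / 256;
}
```

A lookup plus one multiply replaces evaluating the model directly, which is presumably where the reported speedup comes from.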
2013-06-20 | Cast value to avoid size_t/int warning on win64 | Johann
dboolhuff.c(50) : warning C4267: 'initializing' : conversion from 'size_t' to 'int' Change-Id: I6b85759efb2fa19f362f406623d8a7583a55c036
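Warning C4267 appears on win64 because `size_t` is 64 bits wide there while `int` stays at 32 bits. A minimal sketch of the kind of explicit narrowing that silences it follows; the helper `size_to_int` is hypothetical and is not the code in dboolhuff.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: narrow a size_t to int with an explicit cast,
 * checking in debug builds that the value actually fits. */
static int size_to_int(size_t n) {
  assert(n <= (size_t)INT_MAX);
  return (int)n;
}
```

A bare `(int)` cast is enough to quiet the compiler. The assertion is an extra safeguard in this sketch, not something the commit message mentions.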
2013-06-20 | adds force partitioning greater than or less than block size | Jim Bankoski
adds a new speed feature to force partitioning to be greater than or less than a certain size Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
2013-06-20 | adds a set partitioning to speed features | Jim Bankoski
this feature lets you set a partitioning size to be used by the entire frame. Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
2013-06-20 | partition by variance using var from last frame | Jim Bankoski
This uses variance to split partition. Variance is calculated using nearest mv, always from last ref frame. Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
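As a rough sketch of a variance-driven split decision: the functions below measure the variance of the difference between a source block and a reference block, then compare it with a threshold. The names `residual_variance` and `should_split`, the threshold, and the assumption that `ref` already points at the motion-compensated block from the last frame are all illustrative; the actual encoder logic is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Variance of the difference between two w x h blocks, computed as
 * sse - sum^2 / N (not divided by the pixel count). */
static unsigned int residual_variance(const uint8_t *src, int src_stride,
                                      const uint8_t *ref, int ref_stride,
                                      int w, int h) {
  int64_t sum = 0;
  uint64_t sse = 0;
  int r, c;
  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += d;
      sse += (uint64_t)(d * d);
    }
  }
  return (unsigned int)(sse - (uint64_t)(sum * sum) / (uint64_t)(w * h));
}

/* Hypothetical decision rule: split the block when its variance exceeds
 * a caller-chosen threshold. */
static int should_split(unsigned int variance, unsigned int threshold) {
  return variance > threshold;
}
```

A block that the reference predicts well has low residual variance and can stay large. A poorly predicted block is a candidate for splitting.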
2013-06-20 | convert all speed things to speed features | Jim Bankoski
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
2013-06-20 | new partition via variance | Jim Bankoski
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
2013-06-20 | fix to set up new speed feature | Jim Bankoski
This uses the speed feature functionality for code. Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
2013-06-20 | don't copy partitions for key frames or altrefs | Jim Bankoski
force us to go through slow partitioning for keyframes, altref and overlays. Change-Id: I1a286361bf74083e71973575a7296be46eb98742
2013-06-20 | Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. | Ronald S. Bultje
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 -> 3min58). Specific changes to timings for each function compared to original assembly-optimized versions (or just new version timings if no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
2013-06-20 | disable speed > 1 speed corrections in firstpass | Jim Bankoski
need to rework these Change-Id: I17dc2c88d2faadd2f8fb117c52c25f04ea2e9856
2013-06-20 | new debug modes code | Jim Bankoski
The new print out includes skips and has prefixed sections so you can grep to find things like transforms chosen on each frame. Change-Id: I195043424647d9514cfc3ff6720a5b20d010fa1b
2013-06-20 | Merge "copy partitioning from last fame" | Jim Bankoski
2013-06-20 | copy partitioning from last fame | Jim Bankoski
Change-Id: I26e80ede80cb4389378a95afa95d229092a9859a
2013-06-20 | Add unit tests for 4x4 ADST | Jingning Han
Enable sign bias check and round-trip error unit tests for 4x4 hybrid transform modules. Change-Id: Icd3d839f098d4b92b00ff76eac146765b039d0d3
2013-06-20 | Merge "test_libvpx: disable pthreads in gtest" | John Koleszar
2013-06-19 | Removed a number of unnecessary check on ref_frame | Yaowu Xu
Since intra block decoding is handled by decode_sb_intra() separately. Change-Id: I42d757884714084c92fc23ec5d35d4dc946f4b15
2013-06-19 | Function decomposition inside vp9_decodemv.c file. | Dmitry Kovalev
Change-Id: Iab96e6a50aec543c63e15cd134f9d5f01ca7ceff
2013-06-19 | test_libvpx: disable pthreads in gtest | James Zern
currently threading is internal to libvpx so thread safety is unneeded in libgtest -- visual studio builds already operate in this way as they do not have pthread.h available by default. this removes an unconditional link to libpthread using $(extralibs) should libvpx require it. Change-Id: Ieae1d693406653a54b54fba818c598836797d33b
2013-06-19 | Merge "Add two-pass quantization" | Yunqing Wang
2013-06-19 | Add two-pass quantization | Yunqing Wang
Optimized the quantization function by making it a two-pass process. The first pass does a quick check of the transform coefficients against the base ZBIN, and only keeps the good enough set of coefficients for quantization. A skipping check is added. If all coefficients are within the base ZBIN, no quantization is needed. The second pass is the actual quantization pass, which only processes the coefficient subset determined in the first pass. This reduces the computation. Furthermore, an alternative method is used for large transform sizes, which often have sparse nonzero quantized coefficients. Overall, the encoder speedup is about 4%. The quantization function itself gets 20% faster. Change-Id: I3a9dd0da6db030260b6d9c314a9fa48ecae89f22
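The two passes described above can be sketched roughly as follows. This is a simplified illustration, not the libvpx quantizer: the function name `quantize_two_pass`, the plain division by `step`, the single `zbin` threshold, and the caller-provided `scratch` index buffer are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical two-pass quantizer. Returns the number of nonzero quantized
 * coefficients. scratch must have room for n indices. */
static int quantize_two_pass(const int16_t *coeff, int n, int zbin, int step,
                             int16_t *qcoeff, int *scratch) {
  int kept = 0, nonzero = 0, i;
  assert(step > 0);

  /* Pass 1: zero the output and record only coefficients outside the
   * zero bin. */
  for (i = 0; i < n; ++i) {
    qcoeff[i] = 0;
    if (abs(coeff[i]) >= zbin) scratch[kept++] = i;
  }

  /* Skip check: everything fell inside the zero bin, so nothing to do. */
  if (kept == 0) return 0;

  /* Pass 2: quantize only the recorded subset. */
  for (i = 0; i < kept; ++i) {
    const int k = scratch[i];
    const int mag = abs(coeff[k]) / step;
    qcoeff[k] = (int16_t)(coeff[k] < 0 ? -mag : mag);
    if (mag != 0) ++nonzero;
  }
  return nonzero;
}
```

When most coefficients are small, the second loop touches only a handful of entries, which matches the commit's explanation of the reduced computation.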
2013-06-18 | Remove unnecessary copying of probs. | Yaowu Xu
Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c
2013-06-18 | Renaming 'nmv' to 'mv' for several functions. | Dmitry Kovalev
Change-Id: I183a38997a9d01e4a1b869e92509f6915216fa09
2013-06-18 | Merge "tests: clear system state after non-API calls" | John Koleszar
2013-06-18 | Merge "Make fdct32 computation flow within 16bit range" | Jingning Han
2013-06-18 | tests: clear system state after non-API calls | James Zern
add ClearSystemState() to reset MMX registers avoiding corrupting subsequent tests. Change-Id: I668deb09aa7aa467709776e5819f936910698bc0
2013-06-18 | Merge "Code cleanup inside the decoder code." | Dmitry Kovalev
2013-06-18 | Merge "Removing vp9_invtrans.{c, h} files." | Dmitry Kovalev
2013-06-18 | Make fdct32 computation flow within 16bit range | Jingning Han
This commit makes use of dual fdct32x32 versions for rate-distortion optimization loop and encoding process, respectively. The one for rd loop requires only 16 bits precision for intermediate steps. The original fdct32x32 that allows higher intermediate precision (18 bits) was retained for the encoding process only. This allows speed-up for fdct32x32 in the rd loop. No performance loss observed. Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
2013-06-18 | Merge "Move subpixel variance function from common/ to encoder/." | Ronald S. Bultje
2013-06-18 | Merge "Use assembly-optimized variance functions in sub_pixel_{avg}_var()." | Ronald S. Bultje
2013-06-18 | Merge "vpx_ports/x86.h: de-dup #elif block" | John Koleszar
2013-06-17 | convolve_test: align filter arrays | James Zern
fixes issue #583 Change-Id: I4b855a5b5b168c8961410cef6ab5e6d86f14d301
2013-06-17 | vpx_ports/x86.h: de-dup #elif block | James Zern
Change-Id: I052647e13dd24354888c890f6b4a987d989552ae