Age  Commit message  Author
2010-10-22Improve handling of invalid frames.Timothy B. Terriberry
The code was not checking for frame sizes smaller than 3 bytes, and the partition size checks might have failed if the input buffer was within 16MB of the top of the heap. In addition, the reference count on the current frame buffer was not being decremented on error, so after a small number of errors, no new frame buffer could be found and it would run off the list of them. Change-Id: I0c60dba6adb1e2a29df39754f72a56ab6c776b46
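A minimal sketch of the two kinds of checks described above, not the actual libvpx code. The helper names `frame_size_ok` and `partition_fits` are hypothetical. The point is that comparing a size against the bytes remaining avoids forming `start + size`, which can wrap around when the buffer sits near the top of the address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the commit says frames smaller than 3 bytes
 * were not being rejected, so 3 is used as the minimum here. */
static int frame_size_ok(size_t data_sz)
{
    return data_sz >= 3;
}

/* Hypothetical helper: returns 1 if a partition of `size` bytes that
 * begins at `start` ends at or before `end`, otherwise 0. */
static int partition_fits(const uint8_t *start, const uint8_t *end,
                          size_t size)
{
    if (start == NULL || end == NULL || start > end)
        return 0;

    /* Compare against the remaining length instead of computing
     * start + size, which could overflow the pointer. */
    return size <= (size_t)(end - start);
}
```

A check written as `start + size > end` looks equivalent but can pass for a huge `size` if the addition wraps.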
2010-09-21Merge "Fix typo"Johann
2010-09-21Fix typoJohann
Also, move with other ppc32 options Change-Id: I0b97413c767909c5682afc9bdd954f3d43401f6c
2010-09-21Merge "Don't reset mb clamping state during splitmv decoding"John Koleszar
2010-09-21Don't reset mb clamping state during splitmv decodingJohn Koleszar
The MV decoding changes in c5fb0eb introduced a bug where the macroblock clamping state was reset for each partition, so if an earlier partition needed clamping but a subsequent one didn't, the MB wouldn't receive clamping. Instead, the state is only set during splitmv decoding, never cleared. Change-Id: I224fe258493405ee0f6a04596acdb622c475e845
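The bug pattern can be illustrated with a short sketch. This is an assumption-laden model, not the decoder's code: `mv_t`, `mv_needs_clamp` and `mb_needs_clamp` are hypothetical names, and the clamp range is simplified to a single `lo`/`hi` pair.

```c
#include <stddef.h>

/* Hypothetical motion vector type for illustration only. */
typedef struct {
    int row;
    int col;
} mv_t;

/* Returns 1 if either component lies outside [lo, hi]. */
static int mv_needs_clamp(mv_t mv, int lo, int hi)
{
    return mv.row < lo || mv.row > hi || mv.col < lo || mv.col > hi;
}

/* Accumulates the clamping state over all partitions of a macroblock. */
static int mb_needs_clamp(const mv_t *mvs, size_t n, int lo, int hi)
{
    int need = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* A plain assignment here (need = ...) would reset the state
         * for each partition; OR-ing only ever sets it. */
        need |= mv_needs_clamp(mvs[i], lo, hi);
    }
    return need;
}
```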
2010-09-21Merge "gitignore: initial version"John Koleszar
2010-09-21Merge "configure: support for ppc32-linux-gcc"John Koleszar
2010-09-21Merge "Add high limit check for unsigned parameters"John Koleszar
2010-09-21Merge "Restructure multi-threaded decoder"Yunqing Wang
2010-09-20Use movq instead of movdqu.Fritz Koenig
Movdqu is more expensive (throughput, uops) than movq. Minimal impact for newer big cores, but ~2.25% gain on Atom. Change-Id: I62c80bb1cc01d8a91c350c4c7719462809a4ef7f
2010-09-20Merge "Better choice of instruction filter mask comparision."Fritz Koenig
2010-09-20Merge "reorder data to use wider instructions"Johann
2010-09-20Merge "Update NEON wide idcts"Johann
2010-09-20Better choice of instruction filter mask comparision.Fritz Koenig
Use pmaxub instead of a combination of psubusb/por to determine if any comparisons go over the limit. Change-Id: I3f0bd7d2aabe5fee9ba6620508e2b60605abcb82
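A scalar model of why the two instruction sequences can give the same answer, written per byte lane. This is a sketch under assumptions, not the assembly from the commit: the function names are hypothetical, and the exact operands used in the loop filter are not shown in the message. In SSE2 intrinsics the corresponding operations are `_mm_subs_epu8` (psubusb), `_mm_or_si128` (por) and `_mm_max_epu8` (pmaxub).

```c
#include <stdint.h>

/* Unsigned saturating subtract, as psubusb does per byte. */
static uint8_t subs_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Unsigned maximum, as pmaxub does per byte. */
static uint8_t max_u8(uint8_t a, uint8_t b)
{
    return a > b ? a : b;
}

/* Older pattern: one saturating subtract per value, results OR-ed. */
static int over_limit_por(uint8_t d0, uint8_t d1, uint8_t limit)
{
    return (subs_u8(d0, limit) | subs_u8(d1, limit)) != 0;
}

/* Newer pattern: take the maximum first, then subtract once. */
static int over_limit_max(uint8_t d0, uint8_t d1, uint8_t limit)
{
    return subs_u8(max_u8(d0, d1), limit) != 0;
}
```

Both functions report whether any value exceeds the limit; the second needs one subtract for the pair instead of one per value.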
2010-09-20Add high limit check for unsigned parametersGuillermo Ballester Valor
The patch for issue #55 (5a72620) fixed some warnings, but the fix was not optimal: it was a trick to confuse the compiler rather than a real fix. This patch fixes it properly by adding a new macro for cases that need only a high-limit check on an unsigned value. Change-Id: I94b322e0f7fb07604b3b1df1f9321185f48cfcb5
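A sketch of the idea, assuming the warnings came from comparing an unsigned value against a lower bound of zero (always true, and flagged by some compilers). The macro and function names below are hypothetical and are not the ones added by the patch.

```c
/* Hypothetical two-sided check: fine for signed values, but with an
 * unsigned v and lo == 0 the (v) >= (lo) half is always true. */
#define IN_RANGE(v, lo, hi) ((v) >= (lo) && (v) <= (hi))

/* Hypothetical one-sided check for unsigned values: only the upper
 * bound is tested, so no always-true comparison is generated. */
#define IN_RANGE_HI(v, hi) ((v) <= (hi))

/* Validates an unsigned parameter against its upper bound. */
static int check_unsigned_param(unsigned int v, unsigned int hi)
{
    return IN_RANGE_HI(v, hi);
}

/* Validates a signed parameter against both bounds. */
static int check_signed_param(int v, int lo, int hi)
{
    return IN_RANGE(v, lo, hi);
}
```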
2010-09-17reorder data to use wider instructionsJohann
The previous commit laid the groundwork by doing two sets of idcts together. This moves that further by grouping the interesting data (q[0], q+16[0]) together to allow using wider instructions. Also managed to drop a few instructions by recognizing that the constant for sinpi8sqrt2 could be downshifted all the time, which avoided a downshift as well as workarounds for a function which only accepted signed data. Looks like a modest gain for performance: at qcif, went from ~180 fps to ~183. Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf
2010-09-17Restructure multi-threaded decoderYunqing Wang
On each MB, loop filtering is done right after MB decoding. This combines two loops in the multi-threaded code into one, which halves the number of synchronizations. The above-row/left-col data are saved in temp buffers for next-row/next-MB decoding. Tests on a 4-core gLucid machine showed a 10% decoder performance gain with threads=4 (tulip clip). Testing on other platforms isn't done yet. Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9
2010-09-16cleanup: remove unused xprintfJohn Koleszar
These files aren't currently used, and we can get them back if we need them. Change-Id: I62aa3bff828e491a80c80eeb84a7c44903df29b5
2010-09-16Reduce size of tokenizer tablesJohn Koleszar
This patch reduces the size of the global tables maintained by the tokenizer to 16k from 80k-96k. See issue #177. Change-Id: If0275d5f28389af11ac83c5d929d1157cde90fbe
2010-09-15Modify GET_GOT macro for performance.Fritz Koenig
GET_GOT was producing a zero-length call. This resulted in pipeline flushes occurring when returning from the assembly functions. Masked on out-of-order cores, but evident on Atom cores. Change-Id: I8c375af313e8a169c77adbaf956693c0cfeb5ccd
2010-09-13Removed unnecessary pxor.Fritz Koenig
There is no need to make sure that the lower byte of the register is 0 because the downshift by 11 overwrites that byte. Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1
2010-09-13Merge "Make block access to frame buffer sequential"Fritz Koenig
2010-09-13configure: support for ppc32-linux-gccJohn Koleszar
Fixes issue 89. Thanks to josejx for the patch. Change-Id: I7e664fed703b49f2fb3af4c5e6ce1173742000c2
2010-09-13cosmetics: expand tabs in configureJohn Koleszar
Change-Id: I88ddb0afb56ef2be8184b56fe125ad938ead7a84
2010-09-10Make block access to frame buffer sequentialFritz Koenig
Sequentially accessing memory from a low address to a high address should make it easier for the processor to predict the cache. Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d
2010-09-09Merge "Improved subset block search"Scott LaVarnway
2010-09-09Improved subset block searchScott LaVarnway
Improved the subset block search and fill. (about 3% improvement for 32 bit) Modified/merged the code in order to create vp8_read_mb_modes_mv which can decode the modes/mvs on a macroblock level. This will allow the decode loop (in the future) to decode modes/mvs on a frame, row, or mb level. Change-Id: If637d994b508792f846d39b5d44a7bf9aa5cddf3
2010-09-09Update NEON wide idctsJohann
Extend 93c32a55, which used SSE2 instructions to do two idct/dequant/recons at a time, to NEON. Initial working commit. More work needs to be put into rearranging and interlacing the data to take advantage of quadword operations, which is when we'll hopefully see a much better boost. Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1
2010-09-09Fix GF interval for non-lagged ARFsJohn Koleszar
When ARFs are enabled in non-lagged compress modes, the GF interval was being reset to zero. Non-lagged ARF updates were enabled in commit 63ccfbd, but this incorrect GF interval caused a quality regression. Change-Id: I615c3b493f4ce2127044f4e68d0bcb07d6b730c3
2010-09-09Merge branch 'master' of git://review.webmproject.org/libvpxFritz Koenig
2010-09-09Use WebM in copyright notice for consistencyJohn Koleszar
Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories. Fixes issue #97. Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
2010-09-08Skip unnecessary search of identical framesJim Bankoski
vp8_get_compressed_data() was defeating logic in encode_frame_to_datarate() that determined the reference buffers to search and forcing all frames to be eligible to search. In cases where buffers have identical contents, this is unnecessary extra work. Change-Id: I9e667ac39128ae32dc455a3db4c62e3efce6f114
2010-09-08Enable ARFs for non-lagged compressJim Bankoski
ARFs were explicitly disabled except in lagged compress mode. New ARF logic allows for the ARF buffer to hold an older golden frame, which does not require lagged compress. Change-Id: I1dff82b6f53e8311f1e0514b1794ae05919d5f79
2010-09-07Bilinear subpixel optimizations for ssse3.Fritz Koenig
Used pmaddubsw for multiply and add of two filter taps at once for 16x16 and 8x8 blocks. Change-Id: Idccf2d6e094561624407b109fa7e80ba799355ea
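A scalar model of what one pmaddubsw lane computes, to show how two filter taps are applied in a single step. This is a sketch, not the assembly from the commit: the function names are hypothetical, and it assumes taps that sum to 128 with rounding by +64 and a shift of 7.

```c
#include <stdint.h>

/* Scalar model of one pmaddubsw lane: two unsigned pixels are
 * multiplied by two signed taps and the products are added with
 * signed 16-bit saturation. */
static int16_t maddubs_lane(uint8_t p0, uint8_t p1, int8_t t0, int8_t t1)
{
    int32_t sum = (int32_t)p0 * t0 + (int32_t)p1 * t1;

    if (sum > INT16_MAX)
        sum = INT16_MAX;
    if (sum < INT16_MIN)
        sum = INT16_MIN;
    return (int16_t)sum;
}

/* Two-tap bilinear filter of neighbouring pixels p0 and p1.
 * Assumes non-negative taps with t0 + t1 == 128, so the result
 * always fits in 8 bits. */
static uint8_t bilinear_2tap(uint8_t p0, uint8_t p1, int8_t t0, int8_t t1)
{
    return (uint8_t)((maddubs_lane(p0, p1, t0, t1) + 64) >> 7);
}
```

Because the taps are signed bytes, a tap value of 128 cannot be represented in this model; a full-pixel case would need separate handling.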
2010-09-03Reduced the size of MB_MODE_INFOScott LaVarnway
Moved partition_bmi and partition_count out of MB_MODE_INFO and placed into MACROBLOCK. Also reduced the size of other members of the MB_MODE_INFO struct. For 1080p, the memory was reduced by 1,209,516 bytes. The decoder performance appeared to improve by 3% for the clip used. Note: The main goal for this change is to improve the decoder performance. The encoder will be revisited at a later date for further structure cleanup. Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613
2010-09-02Update CHANGELOG for v0.9.2 releaseJohn Koleszar
Change-Id: I184e927987544e9f34f890249b589ea13a93a330
2010-09-02Update AUTHORSJohn Koleszar
Change-Id: I0395ffa107651a773fd11d12682ab9372f76a90b
2010-09-02Whitespace: nuke CRLFsJohn Koleszar
Change-Id: I8b9fdf9875a8fcff4cb49a3357ce44f18108c2e7
2010-09-02Use native win32 timers on mingwJohn Koleszar
Changed to use QueryPerformanceCounter on Windows rather than only when building with MSVC, so that MSVC can link libs built with MinGW. Fixes issue #149. Change-Id: Ie2dc7edc8f4d096cf95ec5ffb1ab00f2d67b3e7d
2010-09-02Fix target detection on mingw32John Koleszar
gcc -dumpmachine returns only 'mingw32' Change-Id: I774d05a97c5131fc12009e436712c319e54490a5
2010-09-02Use -fno-common for mingwJohn Koleszar
Fixes http://code.google.com/p/webm/issues/detail?id=112 Thanks to Ramiro Polla for the issue/fix. Change-Id: I7f7b547a4ea3270e183f59280510066cc29a619e
2010-09-02encoder: remove postproc dependencyJames Zern
Remove the dependency on postproc.c for the encoder in general; the only unchecked need for it is when CONFIG_PSNR is enabled. All other cases are already wrapped in CONFIG_POSTPROC. In the CONFIG_PSNR case the file will still be included. Additionally, when VP8_SET_POSTPROC is used with the encoder and post processing has been disabled, an error will be returned. This addresses issue #153. Change-Id: Ia6dfe20167f7077734a6058cbd1d794550346089
2010-09-02Merge "added separate rounding/zbin constants for 2nd order"John Koleszar
2010-09-02Merge "Disable frame dropping by default"John Koleszar
2010-09-02added separate rounding/zbin constants for 2nd orderYaowu Xu
This allows experiments with different rounding and zerobin constants for 2nd order blocks. Change-Id: Idd829adba3edd1f713c66151a8d29bb245e33a71
2010-09-02Disable frame dropping by defaultJohn Koleszar
This is not the behavior that most users expect. Change-Id: I226126ea400c22cf1f7918e80ea7fe0771c569cb
2010-09-01Fix rare deadlock before loop filterFrank Galligan
There was an extremely rare deadlock that happened when one thread was waiting to start the loop filter on frame n while the other threads were starting to work on frame n+1. Change-Id: Icc94f728b3b6663405435640d9a2996735ba19ef
2010-09-01Merge "Improved Force Key Frame Behaviour"Paul Wilkins
2010-08-31Replace sleep(0) calls in multi-threaded decoderYunqing Wang
This is a workaround for a gLucid problem. Change-Id: I188a016a07e4c2ea212444c5a6284ff3c48a5caa
2010-08-31Improved Force Key Frame BehaviourPaul Wilkins
These changes improve the behaviour of the code with forced key frames sent in by a calling application. The sizing of the frames is still suboptimal for two pass in particular but the behaviour is much better than it was. Change-Id: I35fae610c67688ccc69d11f385e87dfc884e65a1