path: root/volk
Age / Commit message / Author
2012-07-04volk: use loadu for unaligned volk_32f_x2_dot_prod_32f_u_sse*Josh Blum
2012-07-04volk: fix volk_32f_x2_dot_prod_32f_u_sse tail caseJosh Blum
2012-07-03Merge branch 'maint'Johnathan Corgan
2012-07-03volk: don't initialize phase in rotatorJohnathan Corgan
2012-06-25volk: replace (__m128) with volk cast for portabilityJosh Blum
2012-06-22volk: fixing some volk kernels.Tom Rondeau
This should fix some problems with gr-filter QA tests. Also removes some warnings.
2012-06-20volk: added missing avx header includeJosh Blum
2012-06-20Merge branch 'gr_filter'Johnathan Corgan
2012-06-15volk: adding new kernels to test and profile.Tom Rondeau
2012-06-15filter: adding ssc and fsf versions of filter with associated new Volk kernels.Tom Rondeau
These routines work and pass QA. They could use some performance work. The FSF version is just slightly slower than before; the SSC version is more noticeably slower. Both could probably benefit from using SSE2 intrinsics to handle the shorts.
2012-06-14filter: added a ccf Volk dot product to use with ccf filters and used it in fir_filter_ccf.Tom Rondeau
Produces improved results compared to the previous version.
2012-06-14volk: fixes for 32f dot_prodTom Rondeau
Accepts num_points like everything else and handles splitting up numbers itself, not expected to be done externally. Adds AVX version, both aligned and unaligned.
2012-06-13volk: dot_prod for floats does 16 at a time.Tom Rondeau
This was done to match the performance of the earlier float_dotprod, and it makes all flavors of the 32f dotprod work the same way. Because the kernel now expects the input to have 4x more samples than specified, the QA tests for these kernels fail.
2012-06-13filter: process 4 vectors each time in volk dot_prod to speed up fir filters.Tom Rondeau
This makes the volk version of the SSE FIR filter the same speed as using the hand-crafted float_dotprod from before.
2012-06-07volk: have an alignment even for unknown (generic) machines.Tom Rondeau
2012-05-12volk: fix some signedness and unused variable warningsJohnathan Corgan
2012-05-12volk: fix profiler comparisonJohnathan Corgan
2012-05-11volk: add SIMD implementation for fixed phase rotationNick McCarthy
2012-05-07volk: fixed popcnt.Moritz Fischer
2012-04-23volk: force kwargs keys to be of type str, not unicode for py25Josh Blum
2012-04-19volk: code simplification, overrule macro and python optsJosh Blum
2012-04-19volk: avoid sse2 saturation issue 32768->32767Josh Blum
2012-04-19volk: added set_float_rounding to volk_cpu_initJosh Blum
2012-04-19volk: avx overrule is gcc4.4, make prints matchJosh Blum
2012-04-19volk: disable AVX if GCC version < 4.6.0Nick Foster
2012-04-19volk: gcc version check without __GNUC_PREREQJosh Blum
2012-04-19volk: added gcc version check to xgetbvJosh Blum
Reference https://code.google.com/p/pcsx2/issues/detail?id=1195
2012-04-19volk: remove norc, implement machine overruleJosh Blum
2012-04-19volk: use archs.xml to specify compiler flags + supportJosh Blum
2012-04-19volk: fix volk_profile install ruleJosh Blum
2012-04-19volk: move avx cpuid_x86_bit check in archs.xmlJosh Blum
2012-04-19volk: fix msvc __cpuid pointer castJosh Blum
2012-04-19Volk: redo the archs.xml language to make checks generic. no more "type", no more piles of #if crap in the template.Nick Foster
2012-04-19volk: fix for cpuid_eax check with hardcoded valuesJosh Blum
2012-04-19volk: removed old generator python codeJosh Blum
2012-04-19volk: updated build system for avx checking supportJosh Blum
updated copy of cpuid.h with the latest from gcc 4.6
2012-04-19volk: build system work, can build stand-alone msvcJosh Blum
2012-04-19volk: python checks and build system stuffJosh Blum
2012-04-19volk: make orc a normal arch with overruleJosh Blum
2012-04-19volk: added compile utils and cleanup cmakelistsJosh Blum
2012-04-19volk: working build w/ cmakelistsJosh Blum
2012-04-19volk: created other templates for runtime + machinesJosh Blum
2012-04-19volk: added kernel defs and typedefsJosh Blum
2012-04-19volk: work on template stuffJosh Blum
2012-04-19Merge branch 'maint'Johnathan Corgan
2012-04-18volk: gcc version check without __GNUC_PREREQJosh Blum
2012-04-18volk: added xgetbv stuff from volk_work to maintJosh Blum
This ensures that the compiler has support for xgetbv. This also fixes MSVC by checking for _xgetbv. Also restored the copy of cpuid.h; this file should not be modified.
2012-04-16Merge branch 'maint'Johnathan Corgan
Conflicts: volk/gen/make_cpuid_c.py
2012-04-16Volk: also check to make sure OSXSAVE is enabled so you don't check XGETBV when OS has it disabled.Nick Foster
2012-04-16Volk: add support for checking AVX enable state of OS.Nick Foster
Some systems (notably Xen hypervisor) appear to use XSETBV to disable AVX. This causes SIGILL when running AVX instructions. This commit makes Volk check XCR0 on the AVX arch before proceeding.
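
The two 2012-04-16 entries describe a two-step check: confirm that the OS has enabled OSXSAVE, then read XCR0 with XGETBV to see whether AVX state is enabled. The following is a minimal sketch of that idea, not the code from these commits. It assumes GCC or Clang on x86 with the compiler-provided cpuid.h header; the names read_xcr0 and avx_usable are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Read XCR0 with XGETBV (ECX = 0). The raw opcode bytes are used so that
 * assemblers without the xgetbv mnemonic still accept the code.
 * Only call this after confirming OSXSAVE, otherwise it raises SIGILL. */
static uint64_t read_xcr0(void)
{
    uint32_t eax, edx;
    __asm__ volatile(".byte 0x0f, 0x01, 0xd0"
                     : "=a"(eax), "=d"(edx)
                     : "c"(0));
    return ((uint64_t)edx << 32) | eax;
}

/* Hypothetical helper: returns 1 only if the CPU reports AVX, the OS has
 * enabled OSXSAVE, and XCR0 shows SSE and AVX state enabled. */
static int avx_usable(void)
{
    unsigned int eax, ebx, ecx, edx;
    const unsigned int osxsave_bit = 1u << 27; /* CPUID.1:ECX bit 27 */
    const unsigned int avx_bit = 1u << 28;     /* CPUID.1:ECX bit 28 */

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    if ((ecx & osxsave_bit) == 0 || (ecx & avx_bit) == 0)
        return 0;

    /* XCR0 bit 1 = SSE (XMM) state, bit 2 = AVX (YMM) state. */
    return (read_xcr0() & 0x6) == 0x6;
}
#else
/* Non-x86 or unsupported compiler: report AVX as unavailable. */
static int avx_usable(void)
{
    return 0;
}
#endif
```

A caller would test avx_usable() once at startup and fall back to an SSE or generic kernel when it returns 0. Checking OSXSAVE before executing XGETBV matters because XGETBV itself faults when the OS has not enabled it.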