gsoc2010-fftw-neon:hatchrads-gsoc2010-fftw-neon.git
Christopher Friedt [Mon, 16 Aug 2010 12:01:27 +0000 (14:01 +0200)] (master)
final gsoc commit. did several fixups in simd-neon.h. disabled ffmpeg rdft and reodft by default since I haven't tested them thoroughly enough. added beginnings of copy acceleration features to kernel, but again, those are disabled by default since I haven't tested them thoroughly enough. fixed several bugs to do with vector strides in simd-neon.h that were really killing my project updates. simd works reliably now.

Christopher Friedt [Thu, 12 Aug 2010 08:58:41 +0000 (10:58 +0200)]
changed volatile to __volatile__ for consistency with __asm__

Christopher Friedt [Thu, 12 Aug 2010 08:39:38 +0000 (10:39 +0200)]
removed unused FFTComplex *buf from P structure

Christopher Friedt [Thu, 12 Aug 2010 08:38:05 +0000 (10:38 +0200)]
removed undefined fflags_t from P structure

Christopher Friedt [Thu, 12 Aug 2010 08:10:07 +0000 (10:10 +0200)]
do not need FFMPEG_ALLOC at all.... discovered that I was neglecting several important optimized memory copy routines which could be the primary reason neon simd performance is not that fast

Christopher Friedt [Tue, 10 Aug 2010 14:44:50 +0000 (16:44 +0200)]
added #ifdef HAVE_FFMPEG to proper location in rdft and reodft. added some debugging to neon-simd

Christopher Friedt [Tue, 10 Aug 2010 08:35:42 +0000 (10:35 +0200)]
minor changes to configure.ac . started adding code for out-of-place, misalignment, and strided data handling in dft/dft-ffmpeg.c.

Christopher Friedt [Tue, 10 Aug 2010 07:21:54 +0000 (09:21 +0200)]
removed AM_CONDITIONAL HAVE_FFMPEG. reorganized automake and source files accordingly. began addition of code to handle out-of-place, misaligned, and strided I/O for ffmpeg.
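
One common way to handle out-of-place and strided I/O for a routine that only accepts contiguous, in-place buffers is to gather the input into a scratch buffer, transform it there, and scatter the result back out. The following is only a sketch of that idea, not the code from dft/dft-ffmpeg.c; the type `cpx` and the functions `gather_strided` and `scatter_strided` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved complex type (not FFTW's or ffmpeg's). */
typedef struct { float re, im; } cpx;

/* Copy n elements, spaced `stride` elements apart in `in`,
 * into the contiguous buffer `scratch`. */
static void gather_strided(const cpx *in, ptrdiff_t stride,
                           size_t n, cpx *scratch)
{
    size_t i;
    assert(in != NULL && scratch != NULL);
    for (i = 0; i < n; ++i)
        scratch[i] = in[(ptrdiff_t)i * stride];
}

/* Copy n contiguous elements from `scratch` back out to `out`,
 * writing them `stride` elements apart. */
static void scatter_strided(const cpx *scratch, size_t n,
                            cpx *out, ptrdiff_t stride)
{
    size_t i;
    assert(scratch != NULL && out != NULL);
    for (i = 0; i < n; ++i)
        out[(ptrdiff_t)i * stride] = scratch[i];
}
```

The scratch buffer would be allocated with whatever alignment the contiguous routine requires, so the same two copies also cover the misaligned case, at the cost of extra memory traffic.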

Christopher Friedt [Mon, 9 Aug 2010 19:44:11 +0000 (21:44 +0200)]
fixed cycle counter (again)... this time it is definitely not stuck in an infinite loop

Christopher Friedt [Mon, 9 Aug 2010 13:22:22 +0000 (15:22 +0200)]
corrected alignments in simd/neon-simd.h, added --enable-neon-intrinsics option to configure, added a few comments to ffmpeg interfaces.

Christopher Friedt [Mon, 9 Aug 2010 09:56:44 +0000 (11:56 +0200)]
added FIXME notes to support out-of-place and misaligned input for all ffmpeg routines

Christopher Friedt [Mon, 9 Aug 2010 09:25:46 +0000 (11:25 +0200)]
fixed asm and invalid call to ccnt_read() in kernel/cycle.h. reverted configure.ac due to shell errors in the generated configure. fixed compile errors in *dft-ffmpeg.c

Christopher Friedt [Sun, 8 Aug 2010 22:37:39 +0000 (00:37 +0200)]
minor changes

Christopher Friedt [Sun, 8 Aug 2010 22:36:07 +0000 (00:36 +0200)]
some minor changes

Christopher Friedt [Sun, 8 Aug 2010 18:25:15 +0000 (20:25 +0200)]
removed junk under dft/simd/nonportable

Christopher Friedt [Sun, 8 Aug 2010 18:11:05 +0000 (20:11 +0200)]
Changed strategy about ffmpeg fft usage. Using system ffmpeg rather than bundled library for various reasons: 1) architecture agnostic benefits, 2) bundled libraries are evil, 3) less maintenance required. Currently ffmpeg's dft and dct transforms are working well. Need to add something to ffmpeg's avfft api so that other programs (that might not use av_malloc) can check to see if their buffers are aligned properly.
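
The alignment check mentioned above does not exist in the avfft api as far as this log shows, so the following is only a sketch of what a caller could do on its own side. The function name `buffer_is_aligned` is hypothetical, and the 16-byte figure in the note below is an assumption about what the SIMD code expects, not something stated in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ffmpeg's avfft API or of FFTW.
 * Returns nonzero if ptr is a multiple of `alignment` bytes.
 * `alignment` must be a nonzero power of two. */
static int buffer_is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (uintptr_t)(alignment - 1)) == 0;
}
```

A caller that did not allocate with av_malloc could test `buffer_is_aligned(buf, 16)` before handing `buf` to the transform, and fall back to a copy into an aligned scratch buffer when the test fails.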

Christopher Friedt [Mon, 2 Aug 2010 07:22:50 +0000 (09:22 +0200)]
minor changes to automake files, added a few more lines to neon-ffmpeg.c

Christopher Friedt [Mon, 2 Aug 2010 02:22:38 +0000 (04:22 +0200)]
fixed some build issues to do with automake and linking

Christopher Friedt [Sun, 1 Aug 2010 15:29:51 +0000 (17:29 +0200)]
initial check-in of neon-ffmpeg algorithm

Christopher Friedt [Thu, 29 Jul 2010 21:39:45 +0000 (23:39 +0200)]
added comment to clarify why libtoolflags were necessary for ffmpeg fft. corrected declaration of PUBLIK in publik.h .

Christopher Friedt [Thu, 29 Jul 2010 01:22:07 +0000 (03:22 +0200)]
added ffmpeg_fft lib to fftw build system. Eventually, the static lib should be taken out and a shared library should be used instead. TODO: write up neon-ffmpeg.c / neon-ffmpeg.h files to register this algorithm with the planner.

Christopher Friedt [Tue, 27 Jul 2010 13:47:47 +0000 (15:47 +0200)]
added !defined(HAVE_NEON_INTRINSICS) to ensure nonportable simd code would not improve runtimes of codelet simd code

Christopher Friedt [Tue, 27 Jul 2010 13:41:19 +0000 (15:41 +0200)]
corrected dft/simd/nonportable/README.txt, dft/simd/nonportable/arm/Makefile.am

Christopher Friedt [Tue, 27 Jul 2010 12:51:06 +0000 (14:51 +0200)]
modified ./Makefile.am to include simd/nonportable/**.la

Christopher Friedt [Tue, 27 Jul 2010 12:50:05 +0000 (14:50 +0200)]
modified ./Makefile.am to include simd/nonportable/**.la

Christopher Friedt [Tue, 27 Jul 2010 12:19:18 +0000 (14:19 +0200)]
fixed dft/simd/Makefile.am to identify nonportable as subdir

Christopher Friedt [Tue, 27 Jul 2010 12:17:44 +0000 (14:17 +0200)]
added basic structure for using architecture-specific, codelet-free, simd-capable fft routines with fftw's planner

Christopher Friedt [Wed, 7 Jul 2010 14:54:28 +0000 (10:54 -0400)]
moved VFMNMSI declaration to proper spot

Christopher Friedt [Wed, 7 Jul 2010 14:52:36 +0000 (10:52 -0400)]
comment correction

Christopher Friedt [Wed, 7 Jul 2010 14:49:32 +0000 (10:49 -0400)]
modified BYTW2, BYTWJ2 to use the same, emulated FMA functions

Christopher Friedt [Wed, 7 Jul 2010 13:54:21 +0000 (09:54 -0400)]
modified BYTW1 and BYTWJ1 to use the same, emulated FMA functions

Christopher Friedt [Wed, 7 Jul 2010 13:44:41 +0000 (09:44 -0400)]
reorganized simd/simd-neon.h and made fixes according to yesterday's comments

Christopher Friedt [Tue, 6 Jul 2010 16:53:56 +0000 (12:53 -0400)]
asm fixups for simd/simd-neon.h

Christopher Friedt [Tue, 6 Jul 2010 16:52:49 +0000 (12:52 -0400)]
changed visibility of symbol armv7_ticker_started to hidden

Christopher Friedt [Mon, 5 Jul 2010 19:44:50 +0000 (15:44 -0400)]
modified simd/simd-neon.h to use inlines + asm instead of inlines + intrinsics

Christopher Friedt [Mon, 5 Jul 2010 04:56:32 +0000 (00:56 -0400)]
modified PREC_SUFFIX so that libfftw3f.so would be named libfftw3fn.so, to facilitate benchmarks. fixed typo. ran several benchmarks using benchfft (see misc repository). need to rewrite simd-neon.h in __asm__ blocks if codelets are to be useful at all. need to use proper alignment and auto-increment for loads and stores.

Christopher Friedt [Thu, 1 Jul 2010 23:47:33 +0000 (19:47 -0400)]
fixed up arm7 -> armv7

Christopher Friedt [Thu, 1 Jul 2010 23:39:23 +0000 (19:39 -0400)]
fixed up double-underscore names. removed sse-type function names and emulative behaviour. found one compilation bug that says "n1fv_128.c:3513: error: unable to find a register to spill in class CORE_REGS", so flip_ri needed to be implemented using d-regs.

Christopher Friedt [Wed, 30 Jun 2010 20:07:44 +0000 (16:07 -0400)]
added cycle counter

Christopher Friedt [Wed, 30 Jun 2010 20:05:40 +0000 (16:05 -0400)]
added intrinsics (should keep upstream happy). Still need to perform a couple of small tests to ensure that the output is correct. cleaned up configure.ac.

Christopher Friedt [Tue, 22 Jun 2010 14:07:44 +0000 (10:07 -0400)]
this commit should be visible via the gitorious web interface

Christopher Friedt [Tue, 22 Jun 2010 14:06:05 +0000 (10:06 -0400)]
re-based