FK20 CUDA
Functions
__device__ void | fr_fft (fr_t *output, const fr_t *input)
FFT over Fr.
__device__ void | fr_ift (fr_t *output, const fr_t *input)
Inverse FFT for fr_t[512].
__global__ void | fr_fft_wrapper (fr_t *output, const fr_t *input)
Wrapper for fr_fft: FFT for fr_t[512].
__global__ void | fr_ift_wrapper (fr_t *output, const fr_t *input)
Wrapper for fr_ift: inverse FFT for fr_t[512].
Variables
__shared__ fr_t | fr_smem []
Workspace in shared memory. Must be 512*sizeof(fr_t) bytes.
__device__ void fr_fft (fr_t *output, const fr_t *input)
FFT over Fr.
Performs one FFT-512 per thread block. This function must be called with 256 threads per block, i.e. dim3(256,1,1). The input and output arrays may overlap without side effects. Data for different FFTs is not interleaved (the stride is 1).
[out] | output
[in] | input
Definition at line 26 of file fr_fft.cu.
__global__ void fr_fft_wrapper (fr_t *output, const fr_t *input)
Wrapper for fr_fft: FFT for fr_t[512].
Executes an FFT over many fr_t[512] arrays, one array per block. The input and output arrays may overlap without side effects. Data for different FFTs is not interleaved.
[out] | output
[in] | input
Definition at line 316 of file fr_fft.cu.
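The launch geometry described above (one block per array, 256 threads per block, 512*sizeof(fr_t) bytes of dynamic shared memory for fr_smem) can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the fr_t struct, smem_stub, fft_wrapper_stub, fft_launch_config and run_stub_batch are hypothetical stand-ins so that the example compiles on its own, and the stub kernel only copies data instead of transforming it. It also assumes that array b occupies elements 512*b to 512*b+511 of the buffers, and that the wrapper needs the same 256-thread block size as fr_fft.

```cuda
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cuda_runtime.h>

// ASSUMPTION: placeholder for the library's fr_t (the real type comes from the FK20 headers).
struct fr_t { uint64_t limb[4]; };

constexpr unsigned FFT_SIZE          = 512; // elements per array, as documented
constexpr unsigned THREADS_PER_BLOCK = 256; // block size required by fr_fft, as documented

struct LaunchConfig { dim3 grid; dim3 block; size_t smem_bytes; };

// One block per array, 256 threads per block, 512*sizeof(fr_t) bytes of shared memory.
inline LaunchConfig fft_launch_config(unsigned num_arrays) {
    return { dim3(num_arrays, 1, 1),
             dim3(THREADS_PER_BLOCK, 1, 1),
             FFT_SIZE * sizeof(fr_t) };
}

// HYPOTHETICAL stand-in for fr_smem: dynamic shared memory sized at launch time.
extern __shared__ fr_t smem_stub[];

// HYPOTHETICAL stub with the same signature as fr_fft_wrapper.
// It copies each block's 512-element array through shared memory; no FFT is computed.
__global__ void fft_wrapper_stub(fr_t *output, const fr_t *input) {
    const fr_t *in  = input  + FFT_SIZE * blockIdx.x; // this block's array (stride 1)
    fr_t       *out = output + FFT_SIZE * blockIdx.x;
    const unsigned t = threadIdx.x;                   // 256 threads, 2 elements each
    smem_stub[t]                     = in[t];
    smem_stub[t + THREADS_PER_BLOCK] = in[t + THREADS_PER_BLOCK];
    __syncthreads();                                  // all reads done before any write
    out[t]                     = smem_stub[t];
    out[t + THREADS_PER_BLOCK] = smem_stub[t + THREADS_PER_BLOCK];
}

// Allocates num_arrays arrays of 512 elements, launches the stub, and reports any CUDA error.
inline cudaError_t run_stub_batch(unsigned num_arrays) {
    assert(num_arrays > 0);
    const size_t bytes = size_t(num_arrays) * FFT_SIZE * sizeof(fr_t);
    fr_t *d_in = nullptr, *d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_in, bytes);
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&d_out, bytes);
    if (err != cudaSuccess) { cudaFree(d_in); return err; }
    cudaMemset(d_in, 0, bytes);

    const LaunchConfig cfg = fft_launch_config(num_arrays);
    fft_wrapper_stub<<<cfg.grid, cfg.block, cfg.smem_bytes>>>(d_out, d_in);
    err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    cudaFree(d_in);
    cudaFree(d_out);
    return err;
}
```

With the real library, the stub would be replaced by fr_fft_wrapper (or fr_ift_wrapper) and fr_t by the library's own type. Since input and output may overlap, passing the same device pointer for both arguments should give an in-place transform.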
__device__ void fr_ift (fr_t *output, const fr_t *input)
Inverse FFT for fr_t[512].
Performs one inverse FFT-512 per thread block. This function must be called with 256 threads per block, i.e. dim3(256,1,1). The input and output arrays may overlap without side effects. Data for different inverse FFTs is not interleaved (the stride is 1).
[out] | output
[in] | input
Definition at line 170 of file fr_fft.cu.
__global__ void fr_ift_wrapper (fr_t *output, const fr_t *input)
Wrapper for fr_ift: inverse FFT for fr_t[512].
Executes an inverse FFT over many fr_t[512] arrays, one array per block. The input and output arrays may overlap without side effects. Data for different inverse FFTs is not interleaved.
[out] | output
[in] | input
Definition at line 345 of file fr_fft.cu.
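To illustrate what a forward and inverse transform pair over a finite field computes, here is a toy sketch. It is deliberately not the library's algorithm: it uses a naive O(N^2) transform of size 8 over the integers modulo 17 instead of a 512-point FFT over Fr, and every name in it (toy_fft, toy_ift, toy_fft_kernel, dft_term, pow_mod) is hypothetical. The root of unity, output ordering and scaling used by fr_fft and fr_ift are not specified on this page, so the conventions below are assumptions.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

constexpr uint32_t P     = 17; // toy prime modulus (NOT the Fr modulus)
constexpr uint32_t N     = 8;  // toy transform size (the library uses 512)
constexpr uint32_t W     = 2;  // primitive 8th root of unity mod 17: 2^4 = 16 = -1
constexpr uint32_t W_INV = 9;  // 2 * 9  = 18  = 1 (mod 17)
constexpr uint32_t N_INV = 15; // 8 * 15 = 120 = 1 (mod 17)

// Modular exponentiation by repeated squaring.
__host__ __device__ inline uint32_t pow_mod(uint32_t base, uint32_t exp) {
    uint32_t r = 1;
    base %= P;
    while (exp) {
        if (exp & 1) r = r * base % P;
        base = base * base % P;
        exp >>= 1;
    }
    return r;
}

// One output element: scale * sum_j in[j] * root^(j*k)  (mod P).
__host__ __device__ inline uint32_t dft_term(const uint32_t *in, uint32_t k,
                                             uint32_t root, uint32_t scale) {
    uint32_t acc = 0;
    for (uint32_t j = 0; j < N; ++j)
        acc = (acc + in[j] * pow_mod(root, j * k)) % P;
    return acc * scale % P;
}

// Host reference: forward transform. The temporary lets in and out overlap.
inline void toy_fft(uint32_t *out, const uint32_t *in) {
    uint32_t tmp[N];
    for (uint32_t k = 0; k < N; ++k) tmp[k] = dft_term(in, k, W, 1);
    for (uint32_t k = 0; k < N; ++k) out[k] = tmp[k];
}

// Host reference: inverse transform (inverse root, scaled by 1/N).
inline void toy_ift(uint32_t *out, const uint32_t *in) {
    uint32_t tmp[N];
    for (uint32_t k = 0; k < N; ++k) tmp[k] = dft_term(in, k, W_INV, N_INV);
    for (uint32_t k = 0; k < N; ++k) out[k] = tmp[k];
}

// Device version of the forward transform: one block of N threads.
// Staging the input in shared memory makes overlapping in and out safe.
__global__ void toy_fft_kernel(uint32_t *out, const uint32_t *in) {
    __shared__ uint32_t buf[N];
    buf[threadIdx.x] = in[threadIdx.x];
    __syncthreads();
    out[threadIdx.x] = dft_term(buf, threadIdx.x, W, 1);
}
```

Applying toy_ift to the output of toy_fft returns the original array, which is the relationship one would expect between fr_ift and fr_fft. The kernel would be launched as toy_fft_kernel<<<1, N>>>(d_out, d_in) on device buffers of N elements.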
extern __shared__ fr_t fr_smem []
Workspace in shared memory. Must be 512*sizeof(fr_t) bytes.