FK20 CUDA
g1p_fft.cu File Reference
#include <stdio.h>
#include "g1.cuh"
#include "fk20.cuh"

Functions

__device__ void g1p_fft (g1p_t *output, const g1p_t *input)
 512-point FFT over G1 in projective coordinates. The input and output arrays may overlap. Must be called with 256 threads per block, i.e. dim3(256,1,1). Data belonging to different FFTs is not interleaved.
 
__device__ void g1p_ift (g1p_t *output, const g1p_t *input)
 512-point inverse FFT over G1 in projective coordinates. The input and output arrays may overlap. Must be called with 256 threads per block, i.e. dim3(256,1,1). Data belonging to different FFTs is not interleaved.
 
__global__ void g1p_fft_wrapper (g1p_t *output, const g1p_t *input)
 Wrapper for g1p_fft: FFT over arrays of g1p_t of length 512.
 
__global__ void g1p_ift_wrapper (g1p_t *output, const g1p_t *input)
 Wrapper for g1p_ift: inverse FFT over arrays of g1p_t of length 512.
 

Variables

__shared__ g1p_t g1p_tmp []
 

Function Documentation

◆ g1p_fft()

__device__ void g1p_fft ( g1p_t * output,
const g1p_t * input
)

512-point FFT over G1 in projective coordinates. The input and output arrays may overlap. This function must be called with 256 threads per block, i.e. dim3(256,1,1). Data belonging to different FFTs is not interleaved.

Parameters
[out]  output  pointer to an array of 512 g1p_t elements
[in]   input   pointer to an array of 512 g1p_t elements
Returns
void

Definition at line 24 of file g1p_fft.cu.


◆ g1p_fft_wrapper()

__global__ void g1p_fft_wrapper ( g1p_t * output,
const g1p_t * input
)

Wrapper for g1p_fft: FFT over arrays of g1p_t of length 512.

Executes an FFT over many arrays of g1p_t, each of length 512, with one array per block. input and output may overlap without side effects. Data belonging to different FFTs is not interleaved.

Parameters
[out]  output  pointer to an array of 512 × (number of blocks) g1p_t elements, one 512-element array per block
[in]   input   pointer to an array of 512 × (number of blocks) g1p_t elements, one 512-element array per block
Returns
void

Definition at line 336 of file g1p_fft.cu.
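
The launch requirements described above (one block per 512-element array, 256 threads per block) can be sketched from the host side as follows. This is a minimal sketch under stated assumptions, not the project's own launch code: run_fft_batch is a hypothetical helper, the dynamic shared memory size of 512 * sizeof(g1p_t) per block is an assumption based on the extern g1p_tmp[] declaration in this file, and one of the two headers is assumed to declare g1p_fft_wrapper.

```cuda
#include <cuda_runtime.h>
#include "g1.cuh"    // g1p_t; listed among the includes of g1p_fft.cu
#include "fk20.cuh"  // assumed to declare g1p_fft_wrapper

// Hypothetical helper, not part of FK20 CUDA: runs `nblocks` independent
// 512-point FFTs. d_in and d_out are device pointers to 512 * nblocks
// g1p_t elements each.
cudaError_t run_fft_batch(g1p_t *d_out, const g1p_t *d_in, unsigned nblocks) {
    // Assumption: each block needs one 512-element g1p_t scratch buffer.
    const size_t shared_bytes = 512 * sizeof(g1p_t);

    // Requests above 48 KiB of dynamic shared memory need an explicit opt-in.
    cudaError_t err = cudaFuncSetAttribute(
        g1p_fft_wrapper,
        cudaFuncAttributeMaxDynamicSharedMemorySize,
        static_cast<int>(shared_bytes));
    if (err != cudaSuccess) return err;

    // One block per FFT, 256 threads per block, as the documentation requires.
    g1p_fft_wrapper<<<dim3(nblocks, 1, 1), dim3(256, 1, 1), shared_bytes>>>(d_out, d_in);

    err = cudaGetLastError();          // launch-configuration errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();    // errors raised while the kernel ran
}
```

Since input and output may overlap, the same device pointer can be passed for both arguments to transform the data in place.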


◆ g1p_ift()

__device__ void g1p_ift ( g1p_t * output,
const g1p_t * input
)

512-point inverse FFT over G1 in projective coordinates. The input and output arrays may overlap. This function must be called with 256 threads per block, i.e. dim3(256,1,1). Data belonging to different FFTs is not interleaved.

Parameters
[out]  output  pointer to an array of 512 g1p_t elements
[in]   input   pointer to an array of 512 g1p_t elements
Returns
void

Definition at line 178 of file g1p_fft.cu.


◆ g1p_ift_wrapper()

__global__ void g1p_ift_wrapper ( g1p_t * output,
const g1p_t * input
)

Wrapper for g1p_ift: inverse FFT over arrays of g1p_t of length 512.

Executes an inverse FFT over many arrays of g1p_t, each of length 512, with one array per block. input and output may overlap without side effects. Data belonging to different FFTs is not interleaved.

Parameters
[out]  output  pointer to an array of 512 × (number of blocks) g1p_t elements, one 512-element array per block
[in]   input   pointer to an array of 512 × (number of blocks) g1p_t elements, one 512-element array per block
Returns
void

Definition at line 349 of file g1p_fft.cu.


Variable Documentation

◆ g1p_tmp

extern __shared__ g1p_t g1p_tmp[]
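
An unsized extern __shared__ array is CUDA's dynamic shared memory form: the declaration fixes only the element type, and the size in bytes is supplied as the third kernel launch parameter. The self-contained sketch below illustrates that mechanism with a stand-in buffer. The names scratch, reverse_slices and reverse_demo are hypothetical and use int rather than g1p_t, so the example does not depend on FK20 CUDA and says nothing about how g1p_tmp is actually used.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a per-block scratch buffer such as g1p_tmp.
// The empty brackets mean the size is chosen at launch time.
extern __shared__ int scratch[];

// Each block reverses its own n-element slice of data via shared memory.
__global__ void reverse_slices(int *data, int n) {
    int *slice = data + n * blockIdx.x;           // one slice per block
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        scratch[i] = slice[i];
    __syncthreads();                              // all loads finish before any store
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        slice[i] = scratch[n - 1 - i];
}

// Returns true if both slices come back reversed.
bool reverse_demo() {
    const int n = 512, nblocks = 2;
    static int h[n * nblocks];
    for (int i = 0; i < n * nblocks; i++) h[i] = i;

    int *d = nullptr;
    if (cudaMalloc(&d, sizeof h) != cudaSuccess) return false;
    cudaMemcpy(d, h, sizeof h, cudaMemcpyHostToDevice);

    // Third launch parameter: bytes of dynamic shared memory per block.
    reverse_slices<<<nblocks, 256, n * sizeof(int)>>>(d, n);

    cudaError_t err = cudaMemcpy(h, d, sizeof h, cudaMemcpyDeviceToHost);
    cudaFree(d);
    return err == cudaSuccess && h[0] == n - 1 && h[n] == 2 * n - 1;
}

int main() {
    printf("%s\n", reverse_demo() ? "ok" : "failed");
    return 0;
}
```

The sketch reuses the shape documented on this page (512 elements per block, 256 threads per block), so each thread handles two elements in each loop.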