|
opm-simulators
|
Contains wrappers to make the CuBLAS library behave as a modern C++ library with function overlading. More...
Classes | |
| class | CuBlasHandle |
| The CuBlasHandle class provides a singleton for the simulator universal cuBlasHandle. More... | |
| class | CuSparseHandle |
| The CuSparseHandle class provides a singleton for the simulator universal cuSparseHandle. More... | |
| class | CuSparseResource |
| The CuSparseResource class wraps a CuSparse resource in a proper RAII pattern. More... | |
| class | FlexibleSolverWrapper |
| FlexibleSolverWrapper is compilational trick to reduce compile time overhead. More... | |
| class | has_should_call_pre |
| The has_should_call_pre class detects the presence of the method shouldCallPre. More... | |
| class | has_should_call_post |
| The has_should_call_post class detects the presence of the method shouldCallPost. More... | |
| class | has_communication |
| The has_communication class checks if the type has the member function getCommunication. More... | |
| class | is_a_well_operator |
| The is_a_well_operator class tries to guess if the operator is a well type operator. More... | |
Typedefs | |
| using | GpuSparseMatrixDescription = CuSparseResource<cusparseMatDescr_t> |
| GpuSparseMatrixDescription holder. | |
| using | GpuSparseMatrixDescriptionPtr = std::shared_ptr<CuSparseResource<cusparseMatDescr_t>> |
| Pointer to GpuSparseMatrixDescription holder. | |
Functions | |
| template<typename func> | |
| int | tuneThreadBlockSize (func &f, std::string descriptionOfFunction) |
| Function that tests the best thread block size, assumes the provided function depends on threadblock-size. | |
| std::vector< int > | createReorderedToNatural (const Opm::SparseTable< size_t > &levelSets) |
| std::vector< int > | createNaturalToReordered (const Opm::SparseTable< size_t > &levelSets) |
| template<class M, class field_type, class GPUM> | |
| std::unique_ptr< GPUM > | createReorderedMatrix (const M &naturalMatrix, const std::vector< int > &reorderedToNatural) |
| template<class M, class field_type, class GPUM> | |
| std::tuple< std::unique_ptr< GPUM >, std::unique_ptr< GPUM > > | extractLowerAndUpperMatrices (const M &naturalMatrix, const std::vector< int > &reorderedToNatural) |
| template<typename T, bool transpose> | |
| void | getQuasiImpesWeights (const GpuSparseMatrixWrapper< T > &matrix, std::size_t pressureVarIndex, GpuVector< T > &weights, const GpuVector< int > &diagonalIndices) |
| Calculates quasi-IMPES weights for CPR preconditioner on GPU. | |
| template<typename T, bool transpose> | |
| void | calculateCoarseEntries (const GpuSparseMatrixWrapper< T > &fineMatrix, GpuSparseMatrixWrapper< T > &coarseMatrix, const GpuVector< T > &weights, std::size_t pressureVarIndex) |
| Calculates the coarse level matrix entries based on the fine level matrix and weights. | |
| template<typename T, bool transpose> | |
| void | restrictVector (const GpuVector< T > &fine, GpuVector< T > &coarse, const GpuVector< T > &weights, std::size_t pressureVarIndex) |
| Restricts a fine level vector to a coarse level vector based on pressure index. | |
| template<typename T, bool transpose> | |
| void | prolongateVector (const GpuVector< T > &coarse, GpuVector< T > &fine, const GpuVector< T > &weights, std::size_t pressureVarIndex) |
| Prolongs a coarse level vector to a fine level vector based on pressure index. | |
| std::string | getCublasErrorMessage (cublasStatus_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| getCublasErrorMessage generates the error message to display for a given error. | |
| void | cublasSafeCall (cublasStatus_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| cublasSafeCall checks the return type of the CUBLAS expression (function call) and throws an exception if it does not equal CUBLAS_STATUS_SUCCESS. | |
| cublasStatus_t | cublasWarnIfError (cublasStatus_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| cublasWarnIfError checks the return type of the CUBLAS expression (function call) and issues a warning if it does not equal CUBLAS_STATUS_SUCCESS. | |
| cublasStatus_t | cublasScal (cublasHandle_t handle, int n, const double *alpha, double *x, int incx) |
| cublasStatus_t | cublasScal (cublasHandle_t handle, int n, const float *alpha, float *x, int incx) |
| cublasStatus_t | cublasScal (cublasHandle_t handle, int n, const int *alpha, int *x, int incx) |
| cublasStatus_t | cublasAxpy (cublasHandle_t handle, int n, const double *alpha, const double *x, int incx, double *y, int incy) |
| cublasStatus_t | cublasAxpy (cublasHandle_t handle, int n, const float *alpha, const float *x, int incx, float *y, int incy) |
| cublasStatus_t | cublasAxpy (cublasHandle_t handle, int n, const int *alpha, const int *x, int incx, int *y, int incy) |
| cublasStatus_t | cublasDot (cublasHandle_t handle, int n, const double *x, int incx, const double *y, int incy, double *result) |
| cublasStatus_t | cublasDot (cublasHandle_t handle, int n, const float *x, int incx, const float *y, int incy, float *result) |
| cublasStatus_t | cublasDot (cublasHandle_t handle, int n, const int *x, int incx, const int *y, int incy, int *result) |
| cublasStatus_t | cublasNrm2 (cublasHandle_t handle, int n, const double *x, int incx, double *result) |
| cublasStatus_t | cublasNrm2 (cublasHandle_t handle, int n, const float *x, int incx, float *result) |
| cublasStatus_t | cublasNrm2 (cublasHandle_t handle, int n, const int *x, int incx, int *result) |
| GpuSparseMatrixDescriptionPtr | createMatrixDescription () |
| createMatrixDescription creates a default matrix description | |
| GpuSparseMatrixDescriptionPtr | createLowerDiagonalDescription () |
| createLowerDiagonalDescription creates a lower diagonal matrix description | |
| GpuSparseMatrixDescriptionPtr | createUpperDiagonalDescription () |
| createUpperDiagonalDescription creates an upper diagonal matrix description | |
| std::string | getCusparseErrorCodeToString (int code) |
| getCusparseErrorCodeToString Converts an error code returned from a cusparse function a human readable string. | |
| std::string | getCusparseErrorMessage (cusparseStatus_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| getCusparseErrorMessage generates the error message to display for a given error. | |
| void | cusparseSafeCall (cusparseStatus_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| cusparseSafeCall checks the return type of the CUSPARSE expression (function call) and throws an exception if it does not equal CUSPARSE_STATUS_SUCCESS. | |
| cusparseStatus_t | cusparseWarnIfError (cusparseStatus_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| cusparseWarnIfError checks the return type of the CUSPARSE expression (function call) and issues a warning if it does not equal CUSPARSE_STATUS_SUCCESS. | |
| cusparseStatus_t | cusparseBsrilu02_analysis (cusparseHandle_t handle, cusparseDirection_t dirA, int mb, int nnzb, const cusparseMatDescr_t descrA, double *bsrSortedVal, const int *bsrSortedRowPtr, const int *bsrSortedColInd, int blockDim, bsrilu02Info_t info, cusparseSolvePolicy_t policy, void *pBuffer) |
| cusparseStatus_t | cusparseBsrsv2_analysis (cusparseHandle_t handle, cusparseDirection_t dirA, cusparseOperation_t transA, int mb, int nnzb, const cusparseMatDescr_t descrA, const double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsv2Info_t info, cusparseSolvePolicy_t policy, void *pBuffer) |
| cusparseStatus_t | cusparseBsrsv2_analysis (cusparseHandle_t handle, cusparseDirection_t dirA, cusparseOperation_t transA, int mb, int nnzb, const cusparseMatDescr_t descrA, const float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsv2Info_t info, cusparseSolvePolicy_t policy, void *pBuffer) |
| cusparseStatus_t | cusparseBsrilu02_analysis (cusparseHandle_t handle, cusparseDirection_t dirA, int mb, int nnzb, const cusparseMatDescr_t descrA, float *bsrSortedVal, const int *bsrSortedRowPtr, const int *bsrSortedColInd, int blockDim, bsrilu02Info_t info, cusparseSolvePolicy_t policy, void *pBuffer) |
| cusparseStatus_t | cusparseBsrsv2_solve (cusparseHandle_t handle, cusparseDirection_t dirA, cusparseOperation_t transA, int mb, int nnzb, const double *alpha, const cusparseMatDescr_t descrA, const double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsv2Info_t info, const double *f, double *x, cusparseSolvePolicy_t policy, void *pBuffer) |
| cusparseStatus_t | cusparseBsrsv2_solve (cusparseHandle_t handle, cusparseDirection_t dirA, cusparseOperation_t transA, int mb, int nnzb, const float *alpha, const cusparseMatDescr_t descrA, const float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsv2Info_t info, const float *f, float *x, cusparseSolvePolicy_t policy, void *pBuffer) |
| cusparseStatus_t | cusparseBsrilu02_bufferSize (cusparseHandle_t handle, cusparseDirection_t dirA, int mb, int nnzb, const cusparseMatDescr_t descrA, double *bsrSortedVal, const int *bsrSortedRowPtr, const int *bsrSortedColInd, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes) |
| cusparseStatus_t | cusparseBsrilu02_bufferSize (cusparseHandle_t handle, cusparseDirection_t dirA, int mb, int nnzb, const cusparseMatDescr_t descrA, float *bsrSortedVal, const int *bsrSortedRowPtr, const int *bsrSortedColInd, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes) |
| cusparseStatus_t | cusparseBsrsv2_bufferSize (cusparseHandle_t handle, cusparseDirection_t dirA, cusparseOperation_t transA, int mb, int nnzb, const cusparseMatDescr_t descrA, double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsv2Info_t info, int *pBufferSizeInBytes) |
| cusparseStatus_t | cusparseBsrsv2_bufferSize (cusparseHandle_t handle, cusparseDirection_t dirA, cusparseOperation_t transA, int mb, int nnzb, const cusparseMatDescr_t descrA, float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsv2Info_t info, int *pBufferSizeInBytes) |
| cusparseStatus_t | cusparseBsrilu02 (cusparseHandle_t handle, cusparseDirection_t dirA, int mb, int nnzb, const cusparseMatDescr_t descrA, double *bsrSortedVal, const int *bsrSortedRowPtr, const int *bsrSortedColInd, int blockDim, bsrilu02Info_t info, cusparseSolvePolicy_t policy, void *pBuffer) |
| cusparseStatus_t | cusparseBsrilu02 (cusparseHandle_t handle, cusparseDirection_t dirA, int mb, int nnzb, const cusparseMatDescr_t descrA, float *bsrSortedVal, const int *bsrSortedRowPtr, const int *bsrSortedColInd, int blockDim, bsrilu02Info_t info, cusparseSolvePolicy_t policy, void *pBuffer) |
| cusparseStatus_t | cusparseBsrmv (cusparseHandle_t handle, cusparseDirection_t dirA, cusparseOperation_t transA, int mb, int nb, int nnzb, const double *alpha, const cusparseMatDescr_t descrA, const double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, const double *x, const double *beta, double *y) |
| cusparseStatus_t | cusparseBsrmv (cusparseHandle_t handle, cusparseDirection_t dirA, cusparseOperation_t transA, int mb, int nb, int nnzb, const float *alpha, const cusparseMatDescr_t descrA, const float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, const float *x, const float *beta, float *y) |
| template<class Matrix> | |
| const Matrix | makeMatrixWithNonzeroDiagonal (const Matrix &matrix, const typename Matrix::field_type replacementValue=std::numeric_limits< typename Matrix::field_type >::epsilon()) |
| makeMatrixWithNonzeroDiagonal creates a new matrix with the zero diagonal elements (when viewed as a matrix of scalrars) set to replacementValue | |
| template<class Operator, class BlockType> | |
| Dune::BCRSMatrix< BlockType > | makeCPUMatrix (const Operator &op) |
| Utility functions for GPU preconditioner creation. | |
| std::string | getCudaErrorMessage (cudaError_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| getCudaErrorMessage generates the error message to display for a given error. | |
| void | cudaSafeCall (cudaError_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| cudaSafeCall checks the return type of the GPU expression (function call) and throws an exception if it does not equal cudaSuccess. | |
| void | cudaWarnIfError (cudaError_t error, const std::string_view &expression, const std::string_view &filename, const std::string_view &functionName, size_t lineNumber) |
| cudaWarnIfError checks the return type of the GPU expression (function call) and issues a warning if it does not equal cudaSuccess. | |
| template<class T, int blocksize> | |
| void | copyMatDataToReordered (const T *srcMatrix, const int *srcRowIndices, T *dstMatrix, int *dstRowIndices, int *naturalToReordered, size_t numberOfRows, int threadBlockSize) |
| Reorders the elements of a matrix by copying them from one matrix to another using a permutation list. | |
| template<class T, int blocksize> | |
| void | copyMatDataToReorderedSplit (const T *srcMatrix, const int *srcRowIndices, const int *srcColumnIndices, T *dstLowerMatrix, int *dstLowerRowIndices, T *dstUpperMatrix, int *dstUpperRowIndices, T *dstDiag, int *naturalToReordered, size_t numberOfRows, int threadBlockSize) |
| Reorders the elements of a matrix by copying them from one matrix to a split matrix using a permutation list. | |
| auto | makeSafeVectorDescriptor () |
| Create RAII-managed cuSPARSE dense vector descriptor. | |
| auto | makeSafeMatrixDescriptor () |
| Create RAII-managed cuSPARSE sparse matrix descriptor. | |
| template<class T, class M> | |
| std::vector< T > | extractNonzeroValues (const M &matrix) |
| Extract non-zero values from a DUNE matrix in the order expected by cuSPARSE. | |
| template<class M> | |
| std::pair< std::vector< int >, std::vector< int > > | extractSparsityPattern (const M &matrix) |
| Extract sparsity pattern from a DUNE matrix. | |
| template<class T, class M> | |
| T * | findFirstElementPointer (const M &matrix) |
| Find pointer to first matrix element for direct memory access. | |
| template<class M> | |
| void | validateMatrixConversion (const M &matrix, const std::vector< int > &columnIndices, const std::vector< int > &rowIndices) |
| Validate matrix conversion parameters. | |
| void | validateMatrixCompatibility (size_t nonzeroes1, size_t blockSize1, size_t N1, size_t nonzeroes2, size_t blockSize2, size_t N2) |
| Validate that two matrices have compatible dimensions. | |
| void | validateVectorMatrixSizes (size_t vectorSize, size_t matrixBlockSize, size_t matrixColumns) |
| Validate vector-matrix size compatibility for operations. | |
| constexpr size_t | getThreads (size_t numberOfRows) |
| size_t | getBlocks (size_t numberOfRows) |
| template<class Kernel> | |
| int | getCudaRecomendedThreadBlockSize (Kernel k, int suggestedThrBlockSize=-1) |
| int | getNumberOfBlocks (int wantedThreads, int threadBlockSize) |
| template<class T> | |
| bool | isGPUPointer (const T *ptr) |
| Checks whether the given pointer is associated with GPU device memory. | |
| template<class T, class D> | |
| bool | isGPUPointer (const std::unique_ptr< T, D > &ptr) |
| Checks if the given std::unique_ptr with custom deleter refers to GPU memory. | |
| template<class T> | |
| bool | isGPUPointer (const std::shared_ptr< T > &ptr) |
| Checks if the given std::shared_ptr refers to GPU memory. | |
| bool | isValidMatrixStorageMPScheme (int scheme) |
| __host__ __device__ constexpr bool | storeDiagonalAsFloat (MatrixStorageMPScheme scheme) |
| __host__ __device__ constexpr bool | storeOffDiagonalAsFloat (MatrixStorageMPScheme scheme) |
| __host__ __device__ constexpr bool | usingMixedPrecision (MatrixStorageMPScheme scheme) |
| template<class PreconditionerType> | |
| constexpr bool | shouldCallPreconditionerPre () |
| Tests (compile time) if the preconditioner type needs to call pre() before a call to apply(). | |
| template<class PreconditionerType> | |
| constexpr bool | shouldCallPreconditionerPost () |
| Tests (compile time) if the preconditioner type needs to call post() after a call to apply(...). | |
| template<class T> | |
| void | computeDiagIndicesNoReorder (const int *rowIndices, const int *colIndices, const size_t *indexConversion, int rows, size_t *diagIndices) |
| Computes indices of diagonal elements for non-reordered GPU preconditioner. | |
| template<class T> | |
| void | computeDiagIndices (const int *rowIndices, const int *colIndices, const int *reorderedToNatural, int rows, size_t *diagIndices) |
| Computes indices of diagonal elements for reordered GPU preconditioner. | |
| int | to_int (std::size_t s) |
| to_int converts a (on most relevant platforms) 64 bits unsigned size_t to a signed 32 bits signed int | |
| __host__ __device__ std::size_t | to_size_t (int i) |
| to_size_t converts a (on most relevant platforms) a 32 bit signed int to a 64 bits unsigned int | |
| template<class T> | |
| void | setVectorValue (T *deviceData, size_t numberOfElements, const T &value) |
| setVectorValue sets every element of deviceData to value | |
| template<class T> | |
| void | setZeroAtIndexSet (T *deviceData, size_t numberOfElements, const int *indices) |
| setZeroAtIndexSet sets deviceData to zero in the indices of contained in indices | |
| template<class T> | |
| T | innerProductAtIndices (cublasHandle_t cublasHandle, const T *deviceA, const T *deviceB, T *buffer, size_t numberOfElements, const int *indices) |
| innerProductAtIndices computes the inner product between deviceA[indices] and deviceB[indices] | |
| template<class T> | |
| void | prepareSendBuf (const T *deviceA, T *buffer, size_t numberOfElements, const int *indices) |
| template<class T> | |
| void | syncFromRecvBuf (T *deviceA, T *buffer, size_t numberOfElements, const int *indices) |
| template<class T> | |
| void | weightedDiagMV (const T *squareBlockVector, const size_t numberOfRows, const size_t blocksize, T relaxationFactor, const T *srcVec, T *dstVec) |
| Compue the weighted matrix vector product where the matrix is diagonal, the diagonal is a vector, meaning we compute the Hadamard product. | |
Variables | |
| constexpr cudaStream_t | DEFAULT_STREAM = 0 |
| The default GPU stream (stream 0). | |
| constexpr auto | CUSPARSE_MATRIX_ORDER = CUSPARSE_DIRECTION_ROW |
| Default matrix ordering for CuSparse operations. | |
Contains wrappers to make the CuBLAS library behave as a modern C++ library with function overlading.
Provides various utilities for doing signed to unsigned conversion, unsigned to signed, 32 bits to 64 bits and 64 bits to 32 bits.
Simple utility structs to test for the existence of functions in types.
Contains wrappers to make the CuSPARSE library behave as a modern C++ library with function overlading.
In simple terms, this allows one to call say cublasScal on both double and single precisision, instead of calling cublasDscal and cublasSscal respectively.
In simple terms, this allows one to call say cusparseBsrilu02_analysis on both double and single precisision, instead of calling cusparseDbsrilu02_analysis and cusparseDbsrilu02_analysis respectively.
Note that there are alternatives to this, see for instance https://stackoverflow.com/questions/257288/templated-check-for-the-existence-of-a-class-member-function , however, this is by far the cleanest approach for where this is going to be used for now.
TODO: Use the requires-keyword once C++20 becomes availble (https://en.cppreference.com/w/cpp/language/constraints ). With C++20 this file can be removed.
The main use case within cuistl is that the cusparse library requires signed int for all its size parameters, while Dune::BlockVector (and relatives) use unsigned size_t.
| using Opm::gpuistl::detail::GpuSparseMatrixDescription = CuSparseResource<cusparseMatDescr_t> |
GpuSparseMatrixDescription holder.
This is internal information needed for most calls to the CuSparse API.
| using Opm::gpuistl::detail::GpuSparseMatrixDescriptionPtr = std::shared_ptr<CuSparseResource<cusparseMatDescr_t>> |
Pointer to GpuSparseMatrixDescription holder.
This is internal information needed for most calls to the CuSparse API.
| void Opm::gpuistl::detail::calculateCoarseEntries | ( | const GpuSparseMatrixWrapper< T > & | fineMatrix, |
| GpuSparseMatrixWrapper< T > & | coarseMatrix, | ||
| const GpuVector< T > & | weights, | ||
| std::size_t | pressureVarIndex ) |
Calculates the coarse level matrix entries based on the fine level matrix and weights.
| fineMatrix | The fine level matrix |
| coarseMatrix | The coarse level matrix to fill |
| weights | The weights vector for pressure calculation |
| pressureVarIndex | The index of the pressure variable |
| transpose | Whether to use the transpose representation or not |
| void Opm::gpuistl::detail::computeDiagIndices | ( | const int * | rowIndices, |
| const int * | colIndices, | ||
| const int * | reorderedToNatural, | ||
| int | rows, | ||
| size_t * | diagIndices ) |
Computes indices of diagonal elements for reordered GPU preconditioner.
| T | Matrix element type |
| [in] | rowIndices | Row indices of the BCSR matrix |
| [in] | colIndices | Column indices of the BCSR matrix |
| [in] | reorderedToNatural | Table for reordering to natural numbering |
| [in] | rows | Number of rows in the matrix |
| [out] | diagIndices | Array to store the indices of diagonal elements |
| void Opm::gpuistl::detail::computeDiagIndicesNoReorder | ( | const int * | rowIndices, |
| const int * | colIndices, | ||
| const size_t * | indexConversion, | ||
| int | rows, | ||
| size_t * | diagIndices ) |
Computes indices of diagonal elements for non-reordered GPU preconditioner.
| T | Matrix element type |
| [in] | rowIndices | Row indices of the BCSR matrix |
| [in] | colIndices | Column indices of the BCSR matrix |
| [in] | indexConversion | Mapping from levelset-row index to matrix row |
| [in] | rows | Number of rows in the matrix |
| [out] | diagIndices | Array to store the indices of diagonal elements |
| void Opm::gpuistl::detail::copyMatDataToReordered | ( | const T * | srcMatrix, |
| const int * | srcRowIndices, | ||
| T * | dstMatrix, | ||
| int * | dstRowIndices, | ||
| int * | naturalToReordered, | ||
| size_t | numberOfRows, | ||
| int | threadBlockSize ) |
Reorders the elements of a matrix by copying them from one matrix to another using a permutation list.
| srcMatrix | The source matrix we will copy data from | |
| srcRowIndices | Pointer to vector on GPU containing row indices for the source matrix compliant wiht bsr format | |
| [out] | dstMatrix | The destination matrix that we copy data to |
| dstRowIndices | Pointer to vector on GPU containing riw indices for the destination matrix compliant wiht bsr format | |
| naturalToReordered | Permuation list that converts indices in the src matrix to the indices in the dst matrix | |
| numberOfRows | The number of rows in the matrices | |
| threadBlockSize | Block size for threads |
| void Opm::gpuistl::detail::copyMatDataToReorderedSplit | ( | const T * | srcMatrix, |
| const int * | srcRowIndices, | ||
| const int * | srcColumnIndices, | ||
| T * | dstLowerMatrix, | ||
| int * | dstLowerRowIndices, | ||
| T * | dstUpperMatrix, | ||
| int * | dstUpperRowIndices, | ||
| T * | dstDiag, | ||
| int * | naturalToReordered, | ||
| size_t | numberOfRows, | ||
| int | threadBlockSize ) |
Reorders the elements of a matrix by copying them from one matrix to a split matrix using a permutation list.
| srcMatrix | The source matrix we will copy data from | |
| srcRowIndices | Pointer to vector on GPU containing row indices for the source matrix compliant with bsr format | |
| srcColumnIndices | Pointer to vector on GPU containing column indices for the source matrix compliant with bsr format | |
| [out] | dstLowerMatrix | The destination of entries that originates from the strictly lower triangular matrix |
| dstLowerRowIndices | Pointer to vector on GPU containing rww indices for the destination lower matrix compliant wiht bsr format | |
| [out] | dstUpperMatrix | The destination of entries that originates from the strictly upper triangular matrix |
| dstUpperRowIndices | Pointer to vector on GPU containing riw indices for the destination upper matrix compliant wiht bsr format | |
| [out] | dstDiag | The destination buffer for the diagonal part of the matrix |
| naturalToReordered | Permuation list that converts indices in the src matrix to the indices in the dst matrix | |
| numberOfRows | The number of rows in the matrices | |
| threadBlockSize | Block size for threads |
|
inline |
createLowerDiagonalDescription creates a lower diagonal matrix description
|
inline |
createMatrixDescription creates a default matrix description
|
inline |
createUpperDiagonalDescription creates an upper diagonal matrix description
|
inline |
cublasSafeCall checks the return type of the CUBLAS expression (function call) and throws an exception if it does not equal CUBLAS_STATUS_SUCCESS.
| error | the error code from cublas |
| expression | the expresison (say "cublasCreate(&handle)") |
| filename | the code file the error occured in (typically FILE) |
| functionName | name of the function the error occured in (typically func) |
| lineNumber | the line number the error occured in (typically LINE) |
Example usage:
|
inline |
cublasWarnIfError checks the return type of the CUBLAS expression (function call) and issues a warning if it does not equal CUBLAS_STATUS_SUCCESS.
| error | the error code from cublas |
| expression | the expresison (say "cublasCreate(&handle)") |
| filename | the code file the error occured in (typically FILE) |
| functionName | name of the function the error occured in (typically func) |
| lineNumber | the line number the error occured in (typically LINE) |
Example usage:
|
inline |
cudaSafeCall checks the return type of the GPU expression (function call) and throws an exception if it does not equal cudaSuccess.
Example usage:
|
inline |
cudaWarnIfError checks the return type of the GPU expression (function call) and issues a warning if it does not equal cudaSuccess.
| error | the error code from cublas |
| expression | the expresison (say "cudaMalloc(&pointer, 1)") |
| filename | the code file the error occured in (typically FILE) |
| functionName | name of the function the error occured in (typically func) |
| lineNumber | the line number the error occured in (typically LINE) |
Example usage:
|
inline |
cusparseSafeCall checks the return type of the CUSPARSE expression (function call) and throws an exception if it does not equal CUSPARSE_STATUS_SUCCESS.
Example usage:
|
inline |
cusparseWarnIfError checks the return type of the CUSPARSE expression (function call) and issues a warning if it does not equal CUSPARSE_STATUS_SUCCESS.
| error | the error code from cublas |
| expression | the expresison (say "cublasCreate(&handle)") |
| filename | the code file the error occured in (typically FILE) |
| functionName | name of the function the error occured in (typically func) |
| lineNumber | the line number the error occured in (typically LINE) |
Example usage:
| std::vector< T > Opm::gpuistl::detail::extractNonzeroValues | ( | const M & | matrix | ) |
Extract non-zero values from a DUNE matrix in the order expected by cuSPARSE.
This function iterates through the matrix and extracts the non-zero values in the Block Compressed Sparse Row (BSR) format order that cuSPARSE expects.
| T | the element type (float or double) |
| M | the matrix type (assumed to be DUNE::BCRSMatrix compatible) |
| matrix | the input matrix |
| std::pair< std::vector< int >, std::vector< int > > Opm::gpuistl::detail::extractSparsityPattern | ( | const M & | matrix | ) |
Extract sparsity pattern from a DUNE matrix.
This function extracts the row and column indices that define the sparsity pattern of a DUNE matrix in the CSR format expected by cuSPARSE.
| M | the matrix type (assumed to be DUNE::BCRSMatrix compatible) |
| matrix | the input matrix |
| T * Opm::gpuistl::detail::findFirstElementPointer | ( | const M & | matrix | ) |
Find pointer to first matrix element for direct memory access.
This function finds a pointer to the first element in the matrix data for cases where direct memory copying is desired.
| T | the element type |
| M | the matrix type |
| matrix | the input matrix |
|
inline |
getCublasErrorMessage generates the error message to display for a given error.
| error | the error code from cublas |
| expression | the expresison (say "cublasCreate(&handle)") |
| filename | the code file the error occured in (typically FILE) |
| functionName | name of the function the error occured in (typically func) |
| lineNumber | the line number the error occured in (typically LINE) |
|
inline |
getCudaErrorMessage generates the error message to display for a given error.
| error | the error code from cublas |
| expression | the expresison (say "cudaMalloc(&pointer, 1)") |
| filename | the code file the error occured in (typically FILE) |
| functionName | name of the function the error occured in (typically func) |
| lineNumber | the line number the error occured in (typically LINE) |
|
inline |
getCusparseErrorCodeToString Converts an error code returned from a cusparse function a human readable string.
| code | an error code from a cusparse routine |
|
inline |
getCusparseErrorMessage generates the error message to display for a given error.
| error | the error code from cublas |
| expression | the expresison (say "cusparseCreate(&handle)") |
| filename | the code file the error occured in (typically FILE) |
| functionName | name of the function the error occured in (typically func) |
| lineNumber | the line number the error occured in (typically LINE) |
| void Opm::gpuistl::detail::getQuasiImpesWeights | ( | const GpuSparseMatrixWrapper< T > & | matrix, |
| std::size_t | pressureVarIndex, | ||
| GpuVector< T > & | weights, | ||
| const GpuVector< int > & | diagonalIndices ) |
Calculates quasi-IMPES weights for CPR preconditioner on GPU.
| matrix | The matrix used for weight calculation |
| pressureVarIndex | The index of the pressure variable |
| weights | Output vector to store the calculated weights |
| diagonalIndices | The diagonal indices of the matrix |
| transpose | Whether to use the transpose mode for calculation |
| T Opm::gpuistl::detail::innerProductAtIndices | ( | cublasHandle_t | cublasHandle, |
| const T * | deviceA, | ||
| const T * | deviceB, | ||
| T * | buffer, | ||
| size_t | numberOfElements, | ||
| const int * | indices ) |
innerProductAtIndices computes the inner product between deviceA[indices] and deviceB[indices]
| cublasHandle | a valid (initialized) cublas handle |
| deviceA | data A (device memory) |
| deviceB | data B (device memory) |
| buffer | a buffer with number of elements equal to numberOfElements (device memory) |
| numberOfElements | number of indices |
| indices | the indices to compute the inner product over (device memory) |
|
inline |
Checks if the given std::shared_ptr refers to GPU memory.
| T | The type stored within the pointer. |
| ptr | The std::shared_ptr object to inspect. |
|
inline |
Checks if the given std::unique_ptr with custom deleter refers to GPU memory.
| T | The type stored within the pointer. |
| D | The custom deleter type. |
| ptr | The std::unique_ptr object to inspect. |
|
inline |
Checks whether the given pointer is associated with GPU device memory.
This function retrieves CUDA pointer attributes for the provided pointer and determines whether it references device memory. It returns true if the pointer corresponds to GPU memory; otherwise, it returns false.
| T | Type of elements pointed to by the input pointer. |
| ptr | Pointer to the memory that needs to be checked. |
| Dune::BCRSMatrix< BlockType > Opm::gpuistl::detail::makeCPUMatrix | ( | const Operator & | op | ) |
Utility functions for GPU preconditioner creation.
These functions are temporary workarounds needed because some GPU preconditioners require CPU matrices for initial setup (e.g., graph coloring). They will be removed once GPU preconditioners have native GPU constructors. This function creates a CPU matrix from the operator holding a GPU matrix.
This is a workaround for now since some of the GPU preconditioners need a CPU matrix for the initial setup (graph coloring). The CPU matrix is only used in the constructor, not in the update function or the apply function.
| const Matrix Opm::gpuistl::detail::makeMatrixWithNonzeroDiagonal | ( | const Matrix & | matrix, |
| const typename Matrix::field_type | replacementValue = std::numeric_limits<typename Matrix::field_type>::epsilon() ) |
makeMatrixWithNonzeroDiagonal creates a new matrix with the zero diagonal elements (when viewed as a matrix of scalrars) set to replacementValue
| matrix | the matrix to replace |
| replacementValue | the value to set in the diagonal elements that are zero |
|
inline |
Create RAII-managed cuSPARSE sparse matrix descriptor.
|
inline |
Create RAII-managed cuSPARSE dense vector descriptor.
| void Opm::gpuistl::detail::prolongateVector | ( | const GpuVector< T > & | coarse, |
| GpuVector< T > & | fine, | ||
| const GpuVector< T > & | weights, | ||
| std::size_t | pressureVarIndex ) |
Prolongs a coarse level vector to a fine level vector based on pressure index.
| coarse | The coarse level vector |
| fine | The fine level vector to fill |
| weights | The weights vector for pressure calculation |
| pressureVarIndex | The index of the pressure variable |
| transpose | Whether to use the transpose representation or not |
| void Opm::gpuistl::detail::restrictVector | ( | const GpuVector< T > & | fine, |
| GpuVector< T > & | coarse, | ||
| const GpuVector< T > & | weights, | ||
| std::size_t | pressureVarIndex ) |
Restricts a fine level vector to a coarse level vector based on pressure index.
| fine | The fine level vector |
| coarse | The coarse level vector to fill |
| weights | The weights vector for pressure calculation |
| pressureVarIndex | The index of the pressure variable |
| transpose | Whether to use the transpose representation or not |
| void Opm::gpuistl::detail::setVectorValue | ( | T * | deviceData, |
| size_t | numberOfElements, | ||
| const T & | value ) |
setVectorValue sets every element of deviceData to value
| deviceData | pointer to GPU memory |
| numberOfElements | number of elements to set to value |
| value | the value to use |
| void Opm::gpuistl::detail::setZeroAtIndexSet | ( | T * | deviceData, |
| size_t | numberOfElements, | ||
| const int * | indices ) |
setZeroAtIndexSet sets deviceData to zero in the indices of contained in indices
| deviceData | the data to operate on (device memory) |
| numberOfElements | number of indices |
| indices | the indices to use (device memory) |
|
constexpr |
Tests (compile time) if the preconditioner type needs to call post() after a call to apply(...).
|
constexpr |
Tests (compile time) if the preconditioner type needs to call pre() before a call to apply().
|
inline |
to_int converts a (on most relevant platforms) 64 bits unsigned size_t to a signed 32 bits signed int
| s | the unsigned integer |
| std::invalid_argument | exception if s is out of range for an int |
|
inline |
to_size_t converts a (on most relevant platforms) a 32 bit signed int to a 64 bits unsigned int
| i | the signed integer |
| std::invalid_argument | if i is negative. |
| int Opm::gpuistl::detail::tuneThreadBlockSize | ( | func & | f, |
| std::string | descriptionOfFunction ) |
Function that tests the best thread block size, assumes the provided function depends on threadblock-size.
| The | type of the function to tune |
| f | the function to tune, which takes the thread block size as the input |
| descriptionOfFunction | Description of function |
|
inline |
Validate that two matrices have compatible dimensions.
| nonzeroes1 | number of nonzeros in first matrix |
| blockSize1 | block size of first matrix |
| N1 | number of rows in first matrix |
| nonzeroes2 | number of nonzeros in second matrix |
| blockSize2 | block size of second matrix |
| N2 | number of rows in second matrix |
| void Opm::gpuistl::detail::validateMatrixConversion | ( | const M & | matrix, |
| const std::vector< int > & | columnIndices, | ||
| const std::vector< int > & | rowIndices ) |
Validate matrix conversion parameters.
Performs sanity checks on the extracted sparsity pattern to ensure it's valid for cuSPARSE operations.
| M | the matrix type |
| matrix | the original matrix |
| columnIndices | the extracted column indices |
| rowIndices | the extracted row indices |
|
inline |
Validate vector-matrix size compatibility for operations.
| vectorSize | size of the vector |
| matrixBlockSize | block size of the matrix |
| matrixColumns | number of matrix columns |
| void Opm::gpuistl::detail::weightedDiagMV | ( | const T * | squareBlockVector, |
| const size_t | numberOfRows, | ||
| const size_t | blocksize, | ||
| T | relaxationFactor, | ||
| const T * | srcVec, | ||
| T * | dstVec ) |
Compue the weighted matrix vector product where the matrix is diagonal, the diagonal is a vector, meaning we compute the Hadamard product.
| squareBlockVector | A GpuVector whose elements are NxN matrix blocks | |
| numberOfRows | The number of rows in the vector | |
| blocksize | The sidelength of the square block elements in the vector | |
| relaxationFactor | Relaxation factor to use | |
| srcVec | A pointer to the data of the GpuVector we multiply the blockvector with | |
| [out] | dstVec | A pointer to the data of the GpuVector we store the result in |