Note that if you define your own float4 type as a plain struct of four floats, it will have only 4-byte alignment and therefore not qualify for vectorized loads on the device side, which may reduce performance. CUDA's built-in float4 type is a struct with a 16-byte alignment attribute, applied identically in host and device code.
What won't work (in the general case) is using CUDA's aligned float4 in device code while interfacing it with your own unaligned float4 on the host side. This kind of mix-and-match might work under carefully constrained circumstances, but generally speaking you want to use the same type for both host and device code, so your original concerns along those lines were justified.
Note that while CUDA does provide a built-in float3 type, it carries no additional alignment attribute (there is no 12-byte vector access on the GPU), so a float3 struct you define yourself behaves identically.