GPUv2 segfaults on split-head attention CLIP model #66721

Open
gustavla opened this issue Apr 30, 2024 · 3 comments
Labels: comp:lite (TF Lite related issues), TF 2.16, TFLiteGpuDelegate (TFLite Gpu delegate issue), type:support (Support issues)

Comments

@gustavla
Contributor
gustavla commented Apr 30, 2024

System information

  • Google Pixel 7 / Android 13 / Google Tensor G2
  • TFLite 2.16.1 (stock)

Standalone code to reproduce the issue

Model asset: tflite_66721_sha_clip_gpuv2_segfault.tflite

Run the model through TFLite with the GPUv2 delegate on an Android device (for instance, through the benchmark tool).
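For a local run outside AI Hub, the step above could be sketched as follows. This is a minimal sketch, not the command used for the log below: it assumes the TFLite `benchmark_model` binary and the model file have already been pushed to `/data/local/tmp` on the device, and it only builds the `adb shell` command using the benchmark tool's `--graph` and `--use_gpu` flags. The directory and the helper name `benchmark_command` are illustrative.

```python
import shlex
from typing import List

# Assumed on-device location; adjust to wherever the files were pushed.
DEVICE_DIR = "/data/local/tmp"


def benchmark_command(model_name: str, binary_name: str = "benchmark_model") -> List[str]:
    """Build an `adb shell` command that runs the model with the GPU delegate."""
    on_device = " ".join([
        shlex.quote(f"{DEVICE_DIR}/{binary_name}"),
        shlex.quote(f"--graph={DEVICE_DIR}/{model_name}"),
        "--use_gpu=true",  # request the GPU delegate instead of the CPU path
    ])
    return ["adb", "shell", on_device]


if __name__ == "__main__":
    cmd = benchmark_command("tflite_66721_sha_clip_gpuv2_segfault.tflite")
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to `subprocess.run` once a device is attached.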

Any other info / logs

Runtime log (executed on https://aihub.qualcomm.com/)

[30/Apr/2024:10:26:55 -07:00: profiler/info] -=- Tungsten Initializing -=-
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.board.platform = gs201
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.boot.hardware = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.boot.hardware.platform = gs201
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.system.build.id = TQ1A.221205.011
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.system.build.version.release = 13
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.hardware = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.hardware.chipname = 
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.board = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.brand = google
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.device = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.build.fingerprint = google/panther/panther:13/TQ1A.221205.011/9244662:user/release-keys
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.manufacturer = Google
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.model = Pixel 7
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.name = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.soc.manufacturer = Google
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.soc.model = GS201
[30/Apr/2024:10:26:55 -07:00: profiler/info] [Manager] DeviceManager::DeviceManager
[30/Apr/2024:10:26:55 -07:00: profiler/info] [Manager] findAvailableDevices
[30/Apr/2024:10:26:55 -07:00: profiler/info] [Manager] Found interface google-edgetpu (version = 2.0)
[30/Apr/2024:10:26:55 -07:00: profiler/info] [Manager] Found interface google-armnn (version = ArmNN)
[30/Apr/2024:10:26:55 -07:00: profiler/info] NNAPI devices: google-edgetpu,google-armnn,nnapi-reference
[30/Apr/2024:10:26:55 -07:00: profiler/info] GPU device: ARM Mali-G710
[30/Apr/2024:10:26:55 -07:00: profiler/info] OpenGL Version: OpenGL ES 3.2 v1.r36p0-01eac0.1f36dec337e44918d811de9a8a2acf4d
[30/Apr/2024:10:26:55 -07:00: profiler/info] OpenCL Version: OpenCL C 1.2 v1.r36p0-01eac0.1f36dec337e44918d811de9a8a2acf4d
[30/Apr/2024:10:26:55 -07:00: profiler/info] -=- Tungsten Running Task: Loading -=-
[30/Apr/2024:10:26:55 -07:00: profiler/info] Detected chipset 3101, made by 3000.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size before: 24632.0 kB, allocated: 13796.0 kB, slack: 10836.0 kB.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Current memory baseline range: 57552.0-68388.0 kB.
[30/Apr/2024:10:26:55 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:10:26:55 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loaded model. Minimum TF Lite version = 2.3.0.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] No delegates specified; using compute unit=cpu_and_gpu.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Initialized TensorFlow Lite runtime.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:10:26:55 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Enabling delegate cache in dir=/data/user/0/ai.tetra.tungsten/cache/1714498015468/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714498006500/gpuv2.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Created TensorFlow Lite delegate for GPU.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Replacing 2003 out of 2003 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:10:26:56 -07:00: profiler/warning] [job_id: jygz19nxp] [model.tflite] [tflite] File /data/user/0/ai.tetra.tungsten/cache/1714498015468/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714498006500/gpuv2/gpuv2_1297717803319390986.bin couldn't be opened for reading: No such file or directory
[30/Apr/2024:10:27:00 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Initialized OpenCL-based API.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Created 1 GPU delegate kernels.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Applied 1 delegates: GPUV2/OpenCL. Model is fully delegated=true.
[30/Apr/2024:10:27:01 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Saving delegate selection for subsequent steps.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size after: 256412.0 kB, allocated: 233690.0 kB, slack: 22722.0 kB.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Status Successfully Loaded Cold with t = 5381726 us and usage: before = 68388.0 kB; peakBefore = 68388.0 kB; mallocUnusedBefore = 10836.0 kB; after = 291732.0 kB; peakAfter = 805160.0 kB; mallocUnusedAfter = 22722.0 kB; increase = 200622.0-211458.0 kB; peak = 736772.0-747608.0 kB
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Saving results to /storage/emulated/0/Android/data/ai.tetra.tungsten/files/Results/job_jygz19nxp/job_jygz19nxp_results.bin
[30/Apr/2024:10:27:01 -07:00: profiler/info] -=- Tungsten Running Task: Loading -=-
[30/Apr/2024:10:27:01 -07:00: profiler/info] Detected chipset 3101, made by 3000.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading previously saved results in /storage/emulated/0/Android/data/ai.tetra.tungsten/files/Results/job_jygz19nxp/job_jygz19nxp_results.bin
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size before: 77880.0 kB, allocated: 16704.4 kB, slack: 61175.6 kB.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Current memory baseline range: 25732.4-86908.0 kB.
[30/Apr/2024:10:27:01 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:10:27:01 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loaded model. Minimum TF Lite version = 2.3.0.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:10:27:01 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Enabling delegate cache in dir=/data/user/0/ai.tetra.tungsten/cache/1714498015468/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714498006500/gpuv2.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Replacing 2003 out of 2003 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Found serialized data for model gpuv2 (175507208 B) at /data/user/0/ai.tetra.tungsten/cache/1714498015468/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714498006500/gpuv2/gpuv2_1297717803319390986.bin
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Initialized OpenCL-based API from serialized data.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Created 1 GPU delegate kernels.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Applied 1 delegates: GPUV2/OpenCL. Model is fully delegated=true.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size after: 252240.0 kB, allocated: 225091.0 kB, slack: 27149.0 kB.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Status Successfully Loaded Warm with t = 1281645 us and usage: before = 86908.0 kB; peakBefore = 86908.0 kB; mallocUnusedBefore = 61175.6 kB; after = 283312.0 kB; peakAfter = 785988.0 kB; mallocUnusedAfter = 27149.0 kB; increase = 169255.0-230430.6 kB; peak = 699080.0-760255.6 kB
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Saving results to /storage/emulated/0/Android/data/ai.tetra.tungsten/files/Results/job_jygz19nxp/job_jygz19nxp_results.bin
[30/Apr/2024:10:27:03 -07:00: profiler/info] -=- Tungsten Running Task: Performing inference by layer -=-
[30/Apr/2024:10:27:03 -07:00: profiler/info] Detected chipset 3101, made by 3000.
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading previously saved results in /storage/emulated/0/Android/data/ai.tetra.tungsten/files/Results/job_jygz19nxp/job_jygz19nxp_results.bin
[30/Apr/2024:10:27:03 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Starting profiler
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size before: 77880.0 kB, allocated: 16961.8 kB, slack: 60918.2 kB.
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Current memory baseline range: 45341.8-106260.0 kB.
[30/Apr/2024:10:27:03 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:10:27:03 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loaded model. Minimum TF Lite version = 2.3.0.
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Replacing 2003 out of 2003 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:10:27:07 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Created 1 GPU delegate kernels.
[30/Apr/2024:10:27:07 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Applied 1 delegates: GPUV2/OpenCL. Model is fully delegated=true.
[30/Apr/2024:10:27:07 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size after: 259724.0 kB, allocated: 243057.6 kB, slack: 16666.4 kB.
[30/Apr/2024:10:27:07 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Status Successfully Loaded Warm with t = 3769509 us and usage: before = 106260.0 kB; peakBefore = 106260.0 kB; mallocUnusedBefore = 60918.2 kB; after = 300360.0 kB; peakAfter = 635204.0 kB; mallocUnusedAfter = 16666.4 kB; increase = 177433.6-238351.8 kB; peak = 528944.0-589862.2 kB

The process ended because of a segmentation fault. Consult the runtime log for more details.
The following is the suspected stack trace.
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * clEnqueueNDRangeKernel (/vendor/lib64/egl/libGLES_mali.so)
 * tflite::gpu::cl::CLCommandQueue::Dispatch() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::gpu::cl::ProfilingCommandQueue::DispatchNTimes() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::gpu::cl::InferenceContext::ProfileTime() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::gpu::cl::InferenceContext::Profile() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * ? (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * ? (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * ? (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::Subgraph::InvokeImpl() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::Subgraph::Invoke() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::impl::Interpreter::Invoke() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * backend::tflite::TfLiteModel::Run() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tungsten::Profiler::ProfileOrValidate() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tungsten::ProfilerRunner::ProfileModels() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tungsten::ProfilerRunner::RunTask() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * Java_ai_tetra_tungsten_ProfilerRunner_profileModels (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * ? (/apex/com.android.art/lib64/libart.so)
 * ? (/apex/com.android.art/lib64/libart.so)
 * ? (/apex/com.android.art/lib64/libart.so)
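To make the trace above easier to read, a small parser can pull out the innermost frame that still has a symbol name, along with the library it lives in. This is only a sketch for this particular text format (` * symbol (path)` per line); the helper names `parse_frames` and `innermost_named_frame` are hypothetical and not part of any tool mentioned here.

```python
import re
from typing import List, Optional, Tuple

# One frame per line: " * <symbol or ?> (<absolute library path>)"
FRAME_RE = re.compile(r"^\s*\*\s+(?P<symbol>.*?)\s+\((?P<library>/[^)]*)\)\s*$")


def parse_frames(trace: str) -> List[Tuple[str, str]]:
    """Return (symbol, library) pairs in the order they appear, innermost first."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group("symbol"), match.group("library")))
    return frames


def innermost_named_frame(frames: List[Tuple[str, str]]) -> Optional[Tuple[str, str]]:
    """Return the first frame whose symbol is known, skipping unsymbolized '?' entries."""
    for symbol, library in frames:
        if symbol != "?":
            return symbol, library
    return None
```

Applied to the trace above, the first named frame is `clEnqueueNDRangeKernel` in `libGLES_mali.so`, reached from `tflite::gpu::cl::CLCommandQueue::Dispatch()`.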
@gustavla gustavla added the comp:lite TF Lite related issues label Apr 30, 2024
@sawantkumar

Hi @gustavla,

Have you tried this model on other devices? Also, could you please share the inference code you used on Qualcomm AI Hub?

@sawantkumar sawantkumar added the stat:awaiting response Status - Awaiting response from author label May 21, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label May 29, 2024
@gustavla
Contributor Author

I just tried it on a few devices and it only repros on the Pixels that I tried:

  • Google Pixel 7: Segfaults
  • Google Pixel 8: Segfaults
  • Samsung Galaxy S23: OK (100% GPUv2 (OpenCL), 81.6 ms, 0 - 366 MB peak memory)
  • Xiaomi Redmi Note 10 5G: OK (100% GPUv2 (OpenCL), 376.2 ms, 0 - 352 MB peak memory)

Here is the submission:

qai-hub submit-profile-job --model tflite_66721_sha_clip_gpuv2_segfault.tflite --device "Google Pixel 7" --profile_options " --compute_unit gpu"

Or

import qai_hub as hub
hub.submit_profile_job(
    "tflite_66721_sha_clip_gpuv2_segfault.tflite", 
    device=hub.Device("Google Pixel 7"), 
    options="--compute_unit gpu")
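To repeat the comparison across several devices, the same call can be wrapped in a loop. This is a rough sketch: the device names are copied from this thread and may not match the exact names AI Hub accepts, and `submit_one` and `submit_for_devices` are hypothetical helpers built around the `hub.submit_profile_job` call shown above.

```python
from typing import Callable, Dict, Iterable

MODEL = "tflite_66721_sha_clip_gpuv2_segfault.tflite"
# Names as written in this thread; the exact AI Hub device names are an assumption.
DEVICES = [
    "Google Pixel 7",
    "Google Pixel 8",
    "Samsung Galaxy S23",
    "Xiaomi Redmi Note 10 5G",
]


def submit_one(model_path: str, device_name: str):
    """Submit a single GPU profile job, mirroring the call shown above."""
    import qai_hub as hub  # imported lazily so the loop helper has no hard dependency

    return hub.submit_profile_job(
        model_path,
        device=hub.Device(device_name),
        options="--compute_unit gpu",
    )


def submit_for_devices(
    model_path: str,
    device_names: Iterable[str],
    submit: Callable[[str, str], object] = submit_one,
) -> Dict[str, object]:
    """Submit one job per device and return the jobs keyed by device name."""
    return {name: submit(model_path, name) for name in device_names}


# Example (requires a configured qai_hub client):
# jobs = submit_for_devices(MODEL, DEVICES)
```

Passing `submit` as a parameter keeps the loop usable with a stub when no AI Hub credentials are available.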

@google-ml-butler google-ml-butler bot removed stale This label marks the issue/pr stale - to be closed automatically if no activity stat:awaiting response Status - Awaiting response from author labels May 29, 2024
@sawantkumar sawantkumar removed the WIP label Jun 11, 2024
@sawantkumar sawantkumar added the TFLiteGpuDelegate TFLite Gpu delegate issue label Jul 25, 2024