[Docs] Updated Profiling VTR Section in Developer Guide #2605

ueqri · 2024-06-11T13:38:20Z

Rewrote the existing Profiling VTR section, specifically the part that covers the GNU `gprof` tool. Added a subsection explaining how to profile VPR with the Linux `perf` tool and visualize its output.

ueqri · 2024-06-11T14:13:09Z

Expectations:

The "Test" and "Container" CI workflows are skipped as expected; see https://github.com/verilog-to-routing/vtr-verilog-to-routing/pull/2605/checks.
The Read the Docs build also succeeds: https://vtr-verilog-to-routing--2605.org.readthedocs.build/en/2605/

Remaining Issues:

"Some checks haven’t completed yet": This appears to be the issue Alex mentioned in #2598 (comment):

> A potential issue you may run into is that there are required tests in our PR flow (sets of tests which must succeed in order for the code to be merged in). If the CI does not run for documentation changes, the PR may never be merged since it did not pass the required tests. I am not sure the best way to resolve this issue, but it should be resolved.

Moreover, here is the relevant troubleshooting note from GitHub:

> If a workflow is skipped due to path filtering, branch filtering or a commit message, then checks associated with that workflow will remain in a "Pending" state. A pull request that requires those checks to be successful will be blocked from merging.
>
> If, however, a job within a workflow is skipped due to a conditional, it will report its status as "Success". For more information, see "Using conditions to control job execution."

Potential Solutions:

Solution 1: We may be able to fix it by disabling 'Require status checks to pass' in the Branch Protection Rule settings.
Solution 2: Instead of path filtering on the CI trigger, we can use job conditions to skip CI jobs that are unnecessary for a given change. Skipped jobs then report "Success" rather than staying "Pending". However, this requires the third-party dorny/paths-filter@v3 action as the first job step, rather than relying only on a GitHub CI primitive as we did before.
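
To make Solution 2 more concrete, here is a minimal shell sketch of the decision a first job step would have to make: are all changed files documentation? This is a plain-shell stand-in, not the dorny/paths-filter action itself, and the function name `is_docs_only` and the path patterns (`doc/*`, `*.md`) are assumptions for illustration that would need to match the repository layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) only if every path read from stdin
# looks like documentation. An empty change list is treated as "not docs-only".
# Assumption: documentation lives under doc/ or in *.md files.
is_docs_only() {
    local path seen=0
    while IFS= read -r path; do
        [ -z "$path" ] && continue      # ignore blank lines
        seen=1
        case "$path" in
            doc/*|*.md) ;;              # documentation path: keep checking
            *) return 1 ;;              # any other path: CI should run
        esac
    done
    [ "$seen" -eq 1 ]
}

# Demo with a fixed list of paths instead of a real diff.
printf 'doc/src/dev/index.rst\nREADME.developers.md\n' | is_docs_only && echo "docs-only"
```

In a workflow, the path list could come from something like `git diff --name-only <base>...HEAD`, and the result could be written to `$GITHUB_OUTPUT` so that later jobs can be gated with an `if:` condition instead of a trigger-level path filter.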

vaughnbetz · 2024-06-11T18:39:53Z

README.developers.md

+        ```
+   - **Option 2** (Recommended): Record and offline analysis
+
+        Use `perf record` to record the profile data and the call graph. (Note: The argument `lbr` for `--call-graph` only works on Intel platforms. If you encounter issues with call graph recording, please refer to the [`perf record` manual](https://perf.wiki.kernel.org/index.php/Latest_Manual_Page_of_perf-record.1) for more information.)


If you are on a non-Intel platform, what should you do? Just leave out `--call-graph lbr`? Also describe what leaving it out does -- I believe perf still works but becomes more resource intensive.

> If you are on a non-Intel platform what should you do?

For non-Intel platforms, the argument can be set to `fp`, which uses the frame pointer to produce the call graph (and can sometimes be inaccurate), or to `dwarf`, which uses the debugging information generated by the compiler (and is resource-intensive, since call graphs are built from it during profiling).

Q: Would the following changes work? I was worried that it might be too long to read.

Edited:

Use `perf record` to record the profile data and the call graph. (Note: By default, `perf` uses the frame pointer to generate call graphs, which might produce inaccurate results if the program is highly optimized by the compiler. On Intel platforms, it is recommended to specify `lbr` for `--call-graph`, as it is less affected by compiler optimizations, does not require specific compiler options, and uses fewer resources, e.g., less disk storage for storing profiling results. On other platforms, use `--call-graph dwarf` if available. This requires the compiler to produce debugging information in DWARF format and is resource-intensive. For more information on call graph recording, please refer to the `perf record` manual and StackOverflow discussion.)

sudo perf record --call-graph <mode> -p <vpr pid> # set <mode> to `lbr` on Intel platforms or `dwarf` on other platforms
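
As a rough illustration of the platform choice described above, the following sketch picks a `--call-graph` mode from the CPU vendor string and prints the resulting command instead of running it. The helper names `pick_call_graph_mode` and `cpu_vendor` are hypothetical, and treating `GenuineIntel` as "LBR is available" is an assumption: older or virtualized Intel CPUs may not expose LBR, in which case `dwarf` or `fp` is still needed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a CPU vendor string to a perf --call-graph mode.
# Assumption: GenuineIntel implies LBR support; everything else falls back to dwarf.
pick_call_graph_mode() {
    local vendor="$1"
    case "$vendor" in
        GenuineIntel) echo "lbr" ;;
        *)            echo "dwarf" ;;
    esac
}

# Hypothetical helper: print the first vendor_id entry from /proc/cpuinfo.
# Prints nothing when the field is missing (e.g., on many non-x86 systems).
cpu_vendor() {
    awk -F': *' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo 2>/dev/null
}

mode="$(pick_call_graph_mode "$(cpu_vendor)")"
# Only print the command; <vpr pid> still has to be filled in by hand.
echo "sudo perf record --call-graph ${mode} -p <vpr pid>"
```

Printing rather than executing keeps the sketch safe to run anywhere; the actual recording still requires `perf` to be installed and a running VPR process.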

vaughnbetz · 2024-06-11T18:40:17Z

Thanks! Looks good -- just a couple of suggestions.

[Docs] Updated Profiling VTR Section in Developer Guide

4c516ab

github-actions bot added the docs Documentation label Jun 11, 2024

vaughnbetz requested changes Jun 11, 2024
