
[Docs] Updated Profiling VTR Section in Developer Guide #2605

Open: wants to merge 1 commit into base: master


ueqri (Contributor) commented Jun 11, 2024

Close #2545.

Rewrote the existing Profiling VTR section, specifically the part that uses the GNU `gprof` tool.

Added a subsection explaining how to use the Linux `perf` tool to profile VPR and visualize its output.

github-actions bot added the docs (Documentation) label on Jun 11, 2024
ueqri (Contributor, Author) commented Jun 11, 2024

Expectations:

[image not preserved]

Remaining Issues:

"Some checks haven’t completed yet": It appears to be the issue Alex mentioned in #2598 (comment).

A potential issue you may run into is that there are required tests in our PR flow (sets of tests which must succeed in order for the code to be merged in). If the CI does not run for documentation changes, the PR may never be merged since it did not pass the required tests. I am not sure the best way to resolve this issue, but it should be resolved.

Moreover, here is the relevant troubleshooting note from the GitHub documentation:

If a workflow is skipped due to path filtering, branch filtering or a commit message, then checks associated with that workflow will remain in a "Pending" state. A pull request that requires those checks to be successful will be blocked from merging.

If, however, a job within a workflow is skipped due to a conditional, it will report its status as "Success". For more information, see "Using conditions to control job execution."

Potential Solutions:

  • Solution 1: We may be able to fix it by disabling the 'Require status checks to pass' option in the branch protection rule settings.
  • Solution 2: Instead of path filtering on the CI trigger, we can use job conditions to skip unnecessary CI jobs. However, this requires the third-party dorny/paths-filter@v3 action as the first job step, rather than relying only on a GitHub CI primitive as we did before.
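
To make Solution 2 more concrete, here is a minimal shell sketch of the decision a path-filter step would make before the remaining jobs run. It is not the dorny/paths-filter implementation. The function name `needs_full_ci` and the documentation patterns (`doc/*`, `*.md`) are assumptions chosen for illustration. The idea is that the workflow always starts, so the required checks report a status, and a job condition then skips the expensive jobs when this step prints `false`.

```shell
#!/bin/sh
# Hypothetical helper: reads changed file paths on stdin (one per line) and
# prints "true" if any path is NOT documentation, "false" otherwise.
# The patterns below (doc/*, *.md) are assumptions for illustration only.
needs_full_ci() {
    while IFS= read -r path; do
        case "$path" in
            ''|doc/*|*.md) ;;            # blank line or docs-only path: keep looking
            *) echo "true"; return 0 ;;  # found a non-documentation file
        esac
    done
    echo "false"
}

# Example input; in CI the list could come from `git diff --name-only`.
printf '%s\n' "README.developers.md" "doc/src/index.rst" | needs_full_ci
```

For the two documentation paths in the example, the function prints `false`. A job guarded by that result would be skipped by a conditional and, per the GitHub note quoted above, would report "Success" rather than stay "Pending".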

README.developers.md
- **Option 2** (Recommended): Record and offline analysis

Use `perf record` to record the profile data and the call graph. (Note: The argument `lbr` for `--call-graph` only works on Intel platforms. If you encounter issues with call graph recording, please refer to the [`perf record` manual](https://perf.wiki.kernel.org/index.php/Latest_Manual_Page_of_perf-record.1) for more information.)
Contributor

If you are on a non-Intel platform, what should you do? Just leave out `--call-graph lbr`? Also describe what leaving it out does; I believe perf still works but becomes more resource intensive.

Contributor Author

If you are on a non-Intel platform what should you do?

For non-Intel platforms, the argument can be set to `fp`, which uses the frame pointer to produce the call graph (sometimes inaccurate), or to `dwarf`, which uses the debugging information generated by the compiler (building call graphs from it during profiling is resource-intensive).

Q: Would the following change work? I am worried it might be too long to read.

Edited:

Use `perf record` to record the profile data and the call graph. (Note: By default, `perf` uses the frame pointer to generate call graphs, which may produce inaccurate results if the compiler has heavily optimized the program. On Intel platforms, `--call-graph lbr` is recommended: it is less affected by compiler optimizations, requires no specific compiler options, and uses fewer resources, e.g., less disk space for the profiling results. On other platforms, use `--call-graph dwarf` if available; this requires the compiler to emit debugging information in DWARF format and is resource-intensive. For more information on call graph recording, please refer to the `perf record` manual and the StackOverflow discussion.)

sudo perf record --call-graph lbr -p <vpr pid> # `lbr` is for Intel platforms; use `--call-graph dwarf` on other platforms
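
The platform-dependent choice described above could be scripted roughly as follows. This is a sketch under stated assumptions, not part of the proposed README text. The function names `choose_call_graph_mode` and `detect_cpu_vendor` are hypothetical, and keying on the `vendor_id` field of `/proc/cpuinfo` is only a heuristic: LBR also depends on hardware and kernel support, so `lbr` can still fail on some Intel machines (for example inside virtual machines). The script only prints the suggested command and does not run `perf`.

```shell
#!/bin/sh
# Hypothetical helper: map a CPU vendor string to a --call-graph mode.
# Intel -> lbr (as recommended above); anything else -> dwarf.
choose_call_graph_mode() {
    case "$1" in
        GenuineIntel) echo "lbr" ;;
        *)            echo "dwarf" ;;
    esac
}

# Hypothetical helper: read the CPU vendor from /proc/cpuinfo (Linux only).
# Prints nothing if the file or the vendor_id field is missing.
detect_cpu_vendor() {
    awk -F': *' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo 2>/dev/null
}

mode=$(choose_call_graph_mode "$(detect_cpu_vendor)")

# Print the command instead of running it; replace <vpr pid> by hand.
echo "sudo perf record --call-graph $mode -p <vpr pid>"
```

If the vendor cannot be determined, the sketch falls back to `dwarf`, which needs a build with debugging information. `--call-graph fp` remains the third option mentioned in the discussion.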

vaughnbetz (Contributor)

Thanks! Looks good -- just a couple of suggestions.

Successfully merging this pull request may close these issues.

Document how to use Perf for profiling with vpr