Add support for splitting if's with barriers in parallel ops without interchanging them #240

Merged on Aug 18, 2022 (28 commits)

ivanradanov (Collaborator)

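This PR adds two new ways to handle ifs that contain barriers inside parallel regions.

The current transformation splits the parallel op around the if and interchanges the two, so that the if ends up outside the parallel ops: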

parallel {
  A()
  if {
    B()
    barrier
    C()
  }
  D()
}

->

parallel {
  A()
}
if {
  parallel {
    B()
  }
  parallel {
    C()
  }
}
parallel {
  D()
}
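
To make the semantics concrete, here is a minimal Python sketch (not Polygeist code) that compares a barrier-based reference with the current transformation. The bodies chosen for A, B, C and D, the array names, and the neighbour access in C are all hypothetical; the sketch also assumes the if condition has the same value for every thread.

```python
import threading

def run_with_barrier(n, cond):
    """Reference semantics: n real threads, with a barrier inside the if."""
    a, b, c, d = ([0] * n for _ in range(4))
    barrier = threading.Barrier(n)

    def body(tid):
        a[tid] = 10 * tid                  # A()
        if cond:                           # assumed uniform across threads
            b[tid] = a[tid] + 1            # B()
            barrier.wait()                 # barrier
            c[tid] = b[(tid + 1) % n]      # C() reads a neighbour's B() result
        d[tid] = a[tid] + c[tid]           # D()

    threads = [threading.Thread(target=body, args=(t,)) for t in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return d

def run_interchanged(n, cond):
    """Current transformation: each loop stands in for one parallel op."""
    a, b, c, d = ([0] * n for _ in range(4))
    for tid in range(n):                   # parallel { A() }
        a[tid] = 10 * tid
    if cond:                               # the if now sits outside the parallel ops
        for tid in range(n):               # parallel { B() }
            b[tid] = a[tid] + 1
        for tid in range(n):               # parallel { C() }
            c[tid] = b[(tid + 1) % n]
    for tid in range(n):                   # parallel { D() }
        d[tid] = a[tid] + c[tid]
    return d
```

Because every thread finishes B() before any thread starts C() in both versions, the two functions return the same list for either value of cond.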

The first one (ifsplit) splits an if with a directly nested barrier at the barrier itself. The if no longer needs to be isolated with barriers and interchanged with the parallel op:

parallel {
  A()
  if {
    B()
    barrier
    C()
  }
  D()
}

->

parallel {
  A()
  if {
    B()
  }
}
parallel {
  if {
    C()
  }
  D()
}

This should hopefully improve performance, since A and B stay together in one parallel region and C and D stay together in another.
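
As a rough illustration, the split form can be sketched in Python with one loop per parallel region. This is a hypothetical model rather than the pass itself: the bodies of A to D and the array names are made up, and it assumes the condition is uniform across threads and evaluates to the same value in both regions.

```python
def run_ifsplit(n, cond):
    """Model of the split form; returns the result and the number of regions run."""
    a, b, c, d = ([0] * n for _ in range(4))
    regions = 0

    regions += 1
    for tid in range(n):                 # parallel { A(); if { B() } }
        a[tid] = 10 * tid                # A()
        if cond:
            b[tid] = a[tid] + 1          # B()

    # The end of the first region plays the role of the barrier.
    regions += 1
    for tid in range(n):                 # parallel { if { C() }; D() }
        if cond:                         # evaluated again; must match the first region
            c[tid] = b[(tid + 1) % n]    # C() reads a neighbour's B() result
        d[tid] = a[tid] + c[tid]         # D()

    return d, regions
```

In this model the split form always runs two regions, whereas the current transformation runs four when the condition is true (A, B, C, D) and two when it is false (A, D).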

The second one (ifhoist) hoists the if out of the parallel op and, for each outcome of the condition (true or false), joins the blocks that execute in that case:

parallel {
  A()
  if {
    B()
    barrier
    C()
  }
  D()
}

->

if {
  parallel {
    A()
    B()
  }
  parallel {
    C()
    D()
  }
} else {
  parallel {
    A()
    D()
  }
}

This removes the branch from inside the parallel op at the cost of increased code size.
In fact, code size grows exponentially with the number of barriers, so this second way will probably be useful only in limited cases, guided by heuristics (not yet implemented) that decide when to apply it.
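
The growth can be sketched with a small Python model. It assumes a parallel region of the hypothetical shape S0, if1, S1, ..., ifk, Sk, where each if contains `Bi; barrier; Ci`, and that every if is hoisted with no merging of duplicated code afterwards. The function name and statement labels are illustrative only.

```python
from itertools import product

def hoisted_paths(k):
    """Map each combination of k condition values to its list of parallel regions."""
    paths = {}
    for conds in product([True, False], repeat=k):
        regions, current = [], ["S0"]
        for i, taken in enumerate(conds, start=1):
            if taken:
                current.append(f"B{i}")
                regions.append(current)   # the barrier ends the current region
                current = [f"C{i}"]       # code after the barrier starts a new one
            current.append(f"S{i}")
        regions.append(current)
        paths[conds] = regions
    return paths
```

For k = 1 this reproduces the example above, with S0 as A and S1 as D: the true case yields the regions [S0, B1] and [C1, S1], and the false case yields [S0, S1]. Under these assumptions the number of distinct paths is 2^k, so five such ifs already give 32 copies of S0.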

wsmoses (Member) commented Aug 11, 2022

Can this not alternatively become the following, avoiding code duplication?

parallel {
  A()
  if {
    B()
  }
}
parallel {
  if {
    C()
  }
  D()
}

ivanradanov (Collaborator, Author)

One can choose between

parallel {
  A()
  if {
    B()
  }
}
parallel {
  if {
    C()
  }
  D()
}

and

if {
  parallel {
    A()
    B()
  }
  parallel {
    C()
    D()
  }
} else {
  parallel {
    A()
    D()
  }
}

by specifying --cpuify="distribute.ifsplit" or --cpuify="distribute.ifhoist" respectively.

(The default is still the original transformation.)

Both new ways make close to no overall performance difference across all of Rodinia combined. Individual benchmark speedups relative to the current transformation seem to range from -7% to +4% for ifsplit and from -2% to +2% for ifhoist; some of this could be attributed to run-to-run noise.

ivanradanov merged commit 028114d into main on Aug 18, 2022