Describe the feature and the current behavior/state.
Right now we have MirroredStrategy and MultiWorkerMirroredStrategy for synchronous training, where variables and ops are replicated on every replica and all-reduce is used for gradient aggregation. We also have ParameterServerStrategy, where gradient updates from workers are purely asynchronous, since SyncReplicasOptimizer is buggy and has been deprecated.
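For context, a minimal sketch of how the existing all-reduce strategies are used today (assuming TensorFlow 2.x and a toy Keras model; the dataset is a placeholder, not part of any proposal in this issue):

```python
import tensorflow as tf

# Synchronous training: variables are mirrored on each replica and
# gradients are aggregated with all-reduce at every step.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    model.compile(optimizer="sgd", loss="mse")

# Placeholder data for illustration only.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([64, 4]), tf.random.normal([64, 1]))).batch(8)

model.fit(dataset, epochs=1)
```

The request here is for an analogous synchronous mode where variables live on parameter servers instead of being mirrored on every worker.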
We would like to first collect use cases where synchronous training with MirroredStrategy and MultiWorkerMirroredStrategy is not ideal and synchronous training with parameter servers is necessary.
If this feature is necessary and important enough, we will then use this issue to track the progress of the development of this feature.
We have a separate feature request to support large embeddings with MirroredStrategy and MultiWorkerMirroredStrategy: #27726
Will this change the current API? How?
Yes.
Who will benefit from this feature?
Those who use distributed training.
Any other info.
N/A