Dev apply transformer #131
Conversation
- added option to cache step output in memory and caching mask_resize output
- create make_apply_transformer function in utils.py that creates a transformer applying a given function over the input
- removed redundant transformers and replaced them with make_apply_transformer
- added docstrings for functions in postprocessing.py
- removed stream mode
- removed CRF from postprocessing pipeline
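The make_apply_transformer factory described above can be sketched roughly like this. This is a minimal stand-in, assuming element-wise application over equal-length keyword inputs; the real implementation in this PR subclasses steppy's BaseTransformer and its internals may differ:

```python
def make_apply_transformer(func, output_name='output', apply_on=None):
    """Wrap a plain per-element function as a transformer-like object (sketch)."""
    class StaticApplyTransformer:
        def transform(self, **kwargs):
            # Optionally restrict to a subset of the inputs.
            inputs = kwargs if apply_on is None else {k: kwargs[k] for k in apply_on}
            # Apply `func` element-wise across the zipped input sequences.
            results = [func(*values) for values in zip(*inputs.values())]
            return {output_name: results}
    return StaticApplyTransformer()

# Usage: one-off transformers without writing a new class each time.
doubler = make_apply_transformer(lambda x: 2 * x)
doubler.transform(images=[1, 2, 3])  # {'output': [2, 4, 6]}
```

The point of the factory is exactly what the PR description says: many small transformer classes collapse into single function calls.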
…-solution-mapping-challenge into dev-apply_transformer
src/models.py
Outdated
@@ -42,7 +42,16 @@

class BasePyTorchUNet(Model):
    """Base class for PyTorch U-net models.

    Args:
@taraspiotr todo?
@jakubczakon my bad, I've just started documenting the models, but this PR should contain only the postprocessing docstrings
@@ -200,29 +200,35 @@ def _generate_prediction(self, cache_dirpath, outputs):

def postprocessing_pipeline_simplified(cache_dirpath):
    mask_resize = Step(name='mask_resize',
-                      transformer=post.Resizer(),
+                      transformer=make_apply_transformer(post.resize_image,
@taraspiotr I like how nice and easy to use it is.
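The swap shown in this hunk — a bespoke post.Resizer() class replaced by the factory — can be illustrated with a toy, self-contained sketch. Both make_apply_transformer and resize_image here are simplified stand-ins, not the repo's actual code:

```python
def make_apply_transformer(func, output_name='output'):
    # Simplified stand-in for the factory under review.
    class StaticApplyTransformer:
        def transform(self, **kwargs):
            # Apply `func` element-wise across the zipped input lists.
            return {output_name: [func(*vals) for vals in zip(*kwargs.values())]}
    return StaticApplyTransformer()

def resize_image(image, target_size):
    # Toy stand-in: report the requested size instead of really resizing.
    return ('resized', target_size)

mask_resize = make_apply_transformer(resize_image, output_name='resized_images')
result = mask_resize.transform(images=['img0', 'img1'],
                               target_sizes=[(10, 10), (20, 20)])
# result['resized_images'] == [('resized', (10, 10)), ('resized', (20, 20))]
```

With this pattern, a whole transformer class shrinks to the per-element function plus one factory call at the Step definition site.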
@@ -56,6 +56,7 @@ def _prep_cache(self, cache_dirpath):
    self.cache_filepath_step_transformer = os.path.join(self.cache_dirpath_transformers, self.name)
    self.save_filepath_step_output = os.path.join(self.save_dirpath_outputs, '{}'.format(self.name))
    self.save_filepath_step_tmp = os.path.join(self.save_dirpath_tmp, '{}'.format(self.name))
+   self._cached_output = None
@taraspiotr so now we don't have the load/save cache option available at all. It is OK for this project, but for other projects (general use) I'd rather have it available, so when you make a PR to steppy, make sure to deal with both options. That suggests this is probably a good spot to deal with both here as well, so as not to have vastly different codebases.
@jakubczakon we still have an option to save output on disk, but yes, cache in memory is implemented in place of cache on disk, with the same interface. Right now in steppy, cache and save are pretty much the same, the only difference being that the cache is saved in a temp directory. I think it is worth rethinking whether caching on disk alongside saving on disk isn't a bit redundant.
@taraspiotr we've discussed it and figured that the difference between cache and save is that a cache lives for one run (cleaned afterwards) while a save persists between runs (no cleaning). If you do decide to save, you need to load it on purpose as well. You can also load those pickled objects somewhere else (e.g. a notebook) to investigate. So I think we have all of the cases dealt with (cache on disk, cache in memory, save on disk).
I will merge it here but PR for steppy would need to be a bit more general.
@jakubczakon ok, sure
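The cache-versus-save behavior discussed in this thread can be sketched minimally as follows. Step, fit_transform, cache_output, and clean_cache are simplified stand-ins here, not steppy's actual API:

```python
class Step:
    def __init__(self, name, transformer, cache_output=False):
        self.name = name
        self.transformer = transformer
        self.cache_output = cache_output
        self._cached_output = None  # filled on first run when caching is on

    def fit_transform(self, data):
        if self.cache_output and self._cached_output is not None:
            return self._cached_output  # reuse the in-memory result
        output = self.transformer.transform(**data)
        if self.cache_output:
            self._cached_output = output
        return output

    def clean_cache(self):
        # Caches live only for a single run; saved outputs would persist
        # across runs (and be loaded on purpose, as discussed above).
        self._cached_output = None
```

Within one run, repeated calls to fit_transform hit the in-memory copy instead of recomputing; clean_cache at the end of the run restores the uncached behavior.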
src/utils.py
Outdated
def make_apply_transformer(func, output_name='output', apply_on=None):
    class StaticApplyTransformer(BaseTransformer):
        def transform(self, *args, **kwargs):
            iterator_length = self.check_input(*args, **kwargs)
@taraspiotr I don't like the naming. check_input can return True or do some assertion, but I don't like that it returns a length. I would rather have two methods, check_input and get_length, that we run one after the other.
@jakubczakon I fully agree. At first this function was just checking the input, and then I added the length on the spot without rethinking the naming.
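The split agreed on above could look something like this. The method bodies are hypothetical — the actual validation logic isn't shown in the diff:

```python
class StaticApplyTransformer:
    def transform(self, **kwargs):
        self.check_input(**kwargs)                   # validation only
        iterator_length = self.get_length(**kwargs)  # measurement, run afterwards
        return iterator_length

    def check_input(self, **kwargs):
        # Assert that all inputs have equal length; return nothing.
        lengths = {len(value) for value in kwargs.values()}
        if len(lengths) > 1:
            raise ValueError('All inputs must have equal length, got {}'.format(lengths))

    def get_length(self, **kwargs):
        # Length of any one input (all are equal once check_input has passed).
        return len(next(iter(kwargs.values())))
```

Keeping validation and measurement separate means each method's name says exactly what it does, which was the reviewer's point.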
@taraspiotr I think CRF is still left in the
* initial restructure
* clean structure (#126)
* clean structure
* correct readme
* further cleaning
* Dev apply transformer (#131)
* clean structure
* correct readme
* further cleaning
* resizer docstring
* couple docstrings
* make apply transformer, memory cache
* fixes
* postprocessing docstrings
* fixes in PR
* Dev repo cleanup (#132)
* cleanup
* remove src.
* Dev clean tta (#134)
* added resize padding, refactored inference pipelines
* refactored piepliens
* added color shift augmentation
* reduced caching to just mask_resize
* updated config
* Dev-repo_cleanup models and losses docstrings (#135)
* models and losses docstrings
* small fixes in docstrings
* resolve conflicts in with TTA PR (#137)
* added gmean tta, experimented with thresholding (#125)
* Dev repo cleanup (#138)
* initial restructure
* clean structure (#126)
* clean structure
* correct readme
* further cleaning
* Dev apply transformer (#131)
* clean structure
* correct readme
* further cleaning
* resizer docstring
* couple docstrings
* make apply transformer, memory cache
* fixes
* postprocessing docstrings
* fixes in PR
* Dev repo cleanup (#132)
* cleanup
* remove src.
* Dev clean tta (#134)
* added resize padding, refactored inference pipelines
* refactored piepliens
* added color shift augmentation
* reduced caching to just mask_resize
* updated config
* Dev-repo_cleanup models and losses docstrings (#135)
* models and losses docstrings
* small fixes in docstrings
* resolve conflicts in with TTA PR (#137)
* refactor in stream mode (#139)
* hot fix of mask_postprocessing in tta with new make transformer
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* local
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Dev preparation path fix (#140)
* local
* cleaned up paths in the masks and metadata generation
* dropped debug stuff
* Dev non trainable transformer flag (#141)
* local
* added is_trainable flag to models
* initial restructure
* thresholds on unet output
* added gmean tta, experimented with thresholding (#125)
* feature exractor and lightgbm
* pipeline is running ok
* tmp commit
* lgbm ready for tests
* tmp
* faster nms and feature extraction
* small fix
* cleaning
* Dev repo cleanup (#138)
* initial restructure
* clean structure (#126)
* clean structure
* correct readme
* further cleaning
* Dev apply transformer (#131)
* clean structure
* correct readme
* further cleaning
* resizer docstring
* couple docstrings
* make apply transformer, memory cache
* fixes
* postprocessing docstrings
* fixes in PR
* Dev repo cleanup (#132)
* cleanup
* remove src.
* Dev clean tta (#134)
* added resize padding, refactored inference pipelines
* refactored piepliens
* added color shift augmentation
* reduced caching to just mask_resize
* updated config
* Dev-repo_cleanup models and losses docstrings (#135)
* models and losses docstrings
* small fixes in docstrings
* resolve conflicts in with TTA PR (#137)
* refactor in stream mode (#139)
* hot fix of mask_postprocessing in tta with new make transformer
* finishing merge
* finishing merge v2
* finishing merge v3
* finishing merge v4
* tmp commit
* lgbm train and evaluate pipelines run correctly
* something is not yes
* fix
* working lgbm training with ugly train_mode=True
* back to pipelines.py
* small fix
* preparing PR
* preparing PR v2
* preparing PR v2
* fix
* fix_2
* fix_3
* fix_4