
Dev apply transformer #131

Merged

Conversation

taraspiotr
Contributor
  • added an option to cache step output in memory and enabled caching of the mask_resize output
  • created a make_apply_transformer function in utils.py that builds a transformer applying a given function over the input (see the sketch below)
  • removed redundant transformers and replaced them with make_apply_transformer
  • added docstrings for functions in postprocessing.py
  • removed stream mode
  • removed CRF from the postprocessing pipeline
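
To make the make_apply_transformer idea concrete, here is a minimal sketch, not the code from this PR: BaseTransformer below is a stand-in for the project's own transformer base class, and resize_image is a placeholder rather than the repo's real function.

class BaseTransformer:
    """Stand-in for the project's transformer base class (fit/transform interface)."""

    def fit(self, *args, **kwargs):
        return self

    def transform(self, *args, **kwargs):
        raise NotImplementedError


def make_apply_transformer(func, output_name='output', apply_on=None):
    """Build a transformer that applies func element-wise over its keyword inputs."""

    class StaticApplyTransformer(BaseTransformer):
        def transform(self, *args, **kwargs):
            # Select the inputs the function should be applied to; by default use all of them.
            inputs = kwargs if apply_on is None else {name: kwargs[name] for name in apply_on}
            # All selected inputs are expected to be equal-length iterables;
            # apply the function element-wise and collect the results.
            results = [func(*values) for values in zip(*inputs.values())]
            return {output_name: results}

    return StaticApplyTransformer()


# Hypothetical usage with a placeholder resize function (not the repo's real resize_image):
def resize_image(image, target_size):
    return image  # a real implementation would resize `image` to `target_size`


resizer = make_apply_transformer(resize_image,
                                 output_name='resized_images',
                                 apply_on=['images', 'target_sizes'])
output = resizer.transform(images=[[0, 1], [1, 0]], target_sizes=[(2, 2), (2, 2)])
# output == {'resized_images': [[0, 1], [1, 0]]}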

src/models.py Outdated
@@ -42,7 +42,16 @@


class BasePyTorchUNet(Model):
    """Base class for PyTorch U-net models.

    Args:

Collaborator

@taraspiotr todo?

Contributor Author

@jakubczakon my bad, I've just started documenting the models, but this PR should only include the postprocessing docstrings.

@@ -200,29 +200,35 @@ def _generate_prediction(self, cache_dirpath, outputs):

def postprocessing__pipeline_simplified(cache_dirpath):
    mask_resize = Step(name='mask_resize',
-                      transformer=post.Resizer(),
+                      transformer=make_apply_transformer(post.resize_image,

Collaborator

@taraspiotr I like how nice and easy to use it is.
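
To illustrate why this reads so nicely, here is the before/after contrast sketched under the same assumptions as the earlier snippet (the stand-in BaseTransformer, the placeholder resize_image, and the make_apply_transformer sketch above); it is not the repo's actual Resizer code.

# Before: a dedicated transformer class per operation (illustrative, not the repo's Resizer).
class Resizer(BaseTransformer):
    def transform(self, images, target_sizes):
        resized_images = [resize_image(image, size)
                          for image, size in zip(images, target_sizes)]
        return {'resized_images': resized_images}


# After: the same behaviour from a single call, reusing make_apply_transformer from the sketch above.
mask_resizer = make_apply_transformer(resize_image,
                                      output_name='resized_images',
                                      apply_on=['images', 'target_sizes'])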

@@ -56,6 +56,7 @@ def _prep_cache(self, cache_dirpath):
        self.cache_filepath_step_transformer = os.path.join(self.cache_dirpath_transformers, self.name)
        self.save_filepath_step_output = os.path.join(self.save_dirpath_outputs, '{}'.format(self.name))
        self.save_filepath_step_tmp = os.path.join(self.save_dirpath_tmp, '{}'.format(self.name))
+        self._cached_output = None

Collaborator

@taraspiotr so now we don't have the load/save cache option available at all. It is OK for this project, but for other projects (general use) I'd rather have it available, so when you make a PR to steppy make sure to deal with both options. That suggests it is probably a good idea to deal with both here as well, so that the codebases don't diverge too much.

Contributor Author

@jakubczakon we still have an option to save output on disk, but yes, caching in memory is implemented in place of caching on disk, with the same interface. Right now in steppy cache and save are pretty much the same, with the only difference being that the cache is saved in a temp directory. I think it is worth rethinking whether caching on disk alongside saving on disk isn't a bit redundant.

Collaborator

@taraspiotr we've discussed it and figured that the difference between cache and save is that cache lives within one run (cleaned afterwards) while save persists between runs (no cleaning). If you do decide to save, you need to load it on purpose as well. You can also load those pickle objects somewhere else (e.g. a notebook) to investigate. So I think we have potentially all of the cases dealt with (cache on disk, cache in memory, save on disk).

I will merge it here, but the PR for steppy will need to be a bit more general.

Contributor Author

@jakubczakon ok, sure
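
A rough sketch of the semantics agreed on in this thread, using hypothetical attribute and method names rather than the real Step implementation: the in-memory cache is checked first and only lives for one run, saving writes the output to disk so it persists between runs, and a saved output is loaded explicitly (for example to investigate it in a notebook).

import os
import pickle


class CachingStepSketch:
    """Illustrative only: the cache-in-memory vs save-on-disk split discussed above."""

    def __init__(self, name, transformer, save_dirpath, cache_output=False, save_output=False):
        self.name = name
        self.transformer = transformer
        self.cache_output = cache_output
        self.save_output = save_output
        self.save_filepath = os.path.join(save_dirpath, name)
        self._cached_output = None  # in-memory cache, lives for one run only

    def fit_transform(self, data):
        if self._cached_output is not None:  # cache hit: reuse the output computed in this run
            return self._cached_output
        output = self.transformer.transform(**data)
        if self.cache_output:
            self._cached_output = output
        if self.save_output:  # save: persists between runs, no automatic cleaning
            with open(self.save_filepath, 'wb') as handle:
                pickle.dump(output, handle)
        return output

    def load_saved_output(self):
        # loading a saved output is a deliberate action, e.g. for investigation in a notebook
        with open(self.save_filepath, 'rb') as handle:
            return pickle.load(handle)

    def clean_cache(self):
        # cached outputs are cleaned once the run ends
        self._cached_output = None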

src/utils.py Outdated
def make_apply_transformer(func, output_name='output', apply_on=None):
    class StaticApplyTransformer(BaseTransformer):
        def transform(self, *args, **kwargs):
            iterator_length = self.check_input(*args, **kwargs)

Collaborator

@taraspiotr I don't like the naming. check_input can return True or do some assertion, but I don't like that it returns the length. I would rather have 2 methods, check_input and get_length, that we run one after the other.

Contributor Author

@jakubczakon I fully agree. At first this function was just checking the input, and then I added the length on the spot without rethinking the naming.
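
The agreed refactor, sketched with hypothetical method bodies on top of the stand-in BaseTransformer from the earlier snippet: check_input only validates, get_length only measures, and transform runs one after the other.

class ApplyTransformerSketch(BaseTransformer):
    """Illustrative split of the original check_input into two single-purpose methods."""

    def __init__(self, func, output_name='output'):
        self.func = func
        self.output_name = output_name

    def transform(self, *args, **kwargs):
        self.check_input(*args, **kwargs)
        iterator_length = self.get_length(*args, **kwargs)
        results = [self.func(*values) for values in zip(*kwargs.values())]
        assert len(results) == iterator_length
        return {self.output_name: results}

    @staticmethod
    def check_input(*args, **kwargs):
        # validation only: every input must be a sized iterable and all lengths must match
        lengths = {len(value) for value in kwargs.values()}
        assert len(lengths) == 1, 'All inputs must have the same length'

    @staticmethod
    def get_length(*args, **kwargs):
        # measurement only: return the common length of the inputs
        return len(next(iter(kwargs.values())))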

@jakubczakon
Collaborator

@taraspiotr I think CRF is still left in the neptune.yaml

@jakubczakon jakubczakon merged commit 8c4e12c into neptune-ai:dev-repo_cleanup Jun 14, 2018
jakubczakon added a commit that referenced this pull request Jun 15, 2018
* initial restructure

* clean structure (#126)

* clean structure

* correct readme

* further cleaning

* Dev apply transformer (#131)

* clean structure

* correct readme

* further cleaning

* resizer docstring

* couple docstrings

* make apply transformer, memory cache

* fixes

* postprocessing docstrings

* fixes in PR

* Dev repo cleanup (#132)

* cleanup

* remove src.

* Dev clean tta (#134)

* added resize padding, refactored inference pipelines

* refactored pipelines

* added color shift augmentation

* reduced caching to just mask_resize

* updated config

* Dev-repo_cleanup models and losses docstrings (#135)

* models and losses docstrings

* small fixes in docstrings

* resolve conflicts in with TTA PR (#137)
jakubczakon added a commit that referenced this pull request Jun 19, 2018
* added gmean tta, experimented with thresholding (#125)

* Dev repo cleanup (#138)

* initial restructure

* clean structure (#126)

* clean structure

* correct readme

* further cleaning

* Dev apply transformer (#131)

* clean structure

* correct readme

* further cleaning

* resizer docstring

* couple docstrings

* make apply transformer, memory cache

* fixes

* postprocessing docstrings

* fixes in PR

* Dev repo cleanup (#132)

* cleanup

* remove src.

* Dev clean tta (#134)

* added resize padding, refactored inference pipelines

* refactored pipelines

* added color shift augmentation

* reduced caching to just mask_resize

* updated config

* Dev-repo_cleanup models and losses docstrings (#135)

* models and losses docstrings

* small fixes in docstrings

* resolve conflicts in with TTA PR (#137)

* refactor in stream mode (#139)

* hot fix of mask_postprocessing in tta with new make transformer

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* local

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Dev preparation path fix (#140)

* local

* cleaned up paths in the masks and metadata generation

* dropped debug stuff

* Dev non trainable transformer flag (#141)

* local

* added is_trainable flag to models
jakubczakon pushed a commit that referenced this pull request Jun 21, 2018
* initial restructure

* thresholds on unet output

* added gmean tta, experimented with thresholding (#125)

* feature extractor and lightgbm

* pipeline is running ok

* tmp commit

* lgbm ready for tests

* tmp

* faster nms and feature extraction

* small fix

* cleaning

* Dev repo cleanup (#138)

* initial restructure

* clean structure (#126)

* clean structure

* correct readme

* further cleaning

* Dev apply transformer (#131)

* clean structure

* correct readme

* further cleaning

* resizer docstring

* couple docstrings

* make apply transformer, memory cache

* fixes

* postprocessing docstrings

* fixes in PR

* Dev repo cleanup (#132)

* cleanup

* remove src.

* Dev clean tta (#134)

* added resize padding, refactored inference pipelines

* refactored pipelines

* added color shift augmentation

* reduced caching to just mask_resize

* updated config

* Dev-repo_cleanup models and losses docstrings (#135)

* models and losses docstrings

* small fixes in docstrings

* resolve conflicts in with TTA PR (#137)

* refactor in stream mode (#139)

* hot fix of mask_postprocessing in tta with new make transformer

* finishing merge

* finishing merge v2

* finishing merge v3

* finishing merge v4

* tmp commit

* lgbm train and evaluate pipelines run correctly

* something is not yes

* fix

* working lgbm training with ugly train_mode=True

* back to pipelines.py

* small fix

* preparing PR

* preparing PR v2

* preparing PR v2

* fix

* fix_2

* fix_3

* fix_4
jakubczakon added a commit that referenced this pull request Jun 21, 2018