Releases: tensorflow/text
2.2.0 release
Release 2.2
Major Features and Improvements
Breaking Changes
Bug Fixes and Other Changes
- Update version
Thanks to our Contributors
v2.2.0-rc2
Bug fixes
- Force macOS builds to target OS X 10.9 so they can be installed on a wider range of macOS versions.
v2.2.0-rc1
Release 2.2.0-rc1
Major Features and Improvements
- Add op for solving max-spanning-tree (MST) problems. The code here is intended for NLP applications, but attempts to remain agnostic to particular NLP tasks (such as dependency parsing).
- Add max_spanning_tree_gradient.
- Add support for 'preserve_unused_tokens' options in BertTokenizer.
Bug Fixes and Other Changes
- Documentation updates.
- Reorganize the BUILD file for keras layers.
- Update model server testing. The test script now generates a model that integrates into tf serving's testing infra.
- Remove unneeded heavy dependencies in regex_split library.
- Turn TF text's ConstrainedSequence implementations into standalone callable functions.
- Fix bug in ViterbiAnalysis computation triggered when not using transition_weights.
- Remove testing_utils run_tf_function, which is now enabled by default.
- Update patch params to work with Bazel >=1.0.0
- Remove circular dependencies by removing submodule imports from ragged package.
- Prevent TF.Text from breaking when ragged_ops.py is missing from the released TF.
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
Hyunwoo Cho
v2.1.1
v2.1.0-rc0
Major Updates
- Added SplitMergeTokenizer.
- Add support for token offsets to BertTokenizer.
Minor Updates
- Give BertTokenizer ability to read in a vocab file directly.
- Migrate from std::string to tensorflow::tstring.
- Many build script improvements.
- Update ToDense layer with ragged support attribute.
Bug Fixes
- Update SentencePiece to inherit from TokenizerWithOffsets.
- Fix ICU data linking issue.
v2.0.1
v1.15.1
v2.0.0
Major Updates
- Added a regex_split op.
- Fixed a bug in the case_fold_utf8 and normalize_utf8 ops where they were unable to locate the ICU data file.
- Fixed a problem with the BertTokenizer where it was using merge_dims which is unreleased for the corresponding version of TensorFlow.
- Updated the BertTokenizer to use regex_split to match the exact regex used by original BERT.
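To illustrate what splitting on a delimiter regex means, here is a minimal pure-Python sketch of the idea. It is not the regex_split op itself: the function name, the parameter names, and the rule that a delimiter is kept only when it fully matches a second pattern are assumptions made for this example.

```python
import re


def regex_split_sketch(text, delim_pattern, keep_delim_pattern=None):
    """Split text on regex delimiters, optionally keeping some of them.

    Illustrative stand-in only; names and semantics are assumptions,
    not the signature of the TF.Text op.
    """
    delim = re.compile(delim_pattern)
    keep = re.compile(keep_delim_pattern) if keep_delim_pattern else None
    tokens = []
    pos = 0
    for match in delim.finditer(text):
        if match.start() == match.end():
            continue  # ignore empty matches so we always make progress
        if match.start() > pos:
            # Text between the previous delimiter and this one is a token.
            tokens.append(text[pos:match.start()])
        if keep is not None and keep.fullmatch(match.group()):
            # Delimiters that match the keep pattern become tokens too.
            tokens.append(match.group())
        pos = match.end()
    if pos < len(text):
        tokens.append(text[pos:])
    return tokens


# Whitespace is dropped, punctuation is kept as separate tokens.
print(regex_split_sketch("hello, world!", r"\s|[,!]", r"[,!]"))
```

Keeping punctuation while dropping whitespace is the kind of behavior a BERT-style basic tokenizer needs, which is why a split-with-kept-delimiters primitive is useful there.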
v1.15.0
Major Updates
- Added a regex_split op.
- Fixed a bug in the case_fold_utf8 and normalize_utf8 ops where they were unable to locate the ICU data file.
- Fixed a problem with the BertTokenizer where it was using merge_dims which is unreleased for the corresponding version of TensorFlow.
- Updated the BertTokenizer to use regex_split to match the exact regex used by original BERT.
v2.0.0-rc0
Please note that moving forward our releases and branches will match the major & minor versions of core TensorFlow. This should prevent future confusion. As such, this (previously 1.0) release is 2.0, and we will be skipping straight to 1.15 for the next 1.x release to support TF 1.15.
Major Updates:
- SentencepieceTokenizer has been added. Please see https://github.com/google/sentencepiece for more information on Sentencepiece.
- New ToDense Keras layer for RaggedTensor conversion
- Pipeline for generating a Wordpiece Vocabulary has been added to tools.
- New Rouge-L metric op for measuring text similarity. A new colab has been added to the examples directory which provides usage examples.
- New BertTokenizer which mimics the preprocessing performed in the original BERT model.
- New Detokenizer abstract class has been added to the TF.Text Tokenizer API.
- Many previously released ops have been added to the TensorFlow Serving model server. Please see https://github.com/tensorflow/serving for more information.
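For readers unfamiliar with Rouge-L, the following pure-Python sketch shows the standard quantity it measures: the longest common subsequence (LCS) between a hypothesis and a reference, turned into precision, recall, and an F-measure. It is an illustration of the metric, not the op added in this release; the function names and the alpha weighting parameter are assumptions for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_sketch(hypothesis, reference, alpha=0.5):
    """Return (f_measure, precision, recall) based on the LCS.

    alpha weights precision against recall (hypothetical parameter);
    alpha=0.5 gives the harmonic mean of the two.
    """
    lcs = lcs_length(hypothesis, reference)
    if lcs == 0:
        return 0.0, 0.0, 0.0
    precision = lcs / len(hypothesis)
    recall = lcs / len(reference)
    f_measure = precision * recall / ((1 - alpha) * precision + alpha * recall)
    return f_measure, precision, recall


hyp = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(rouge_l_sketch(hyp, ref))
```

Because the LCS does not require matched tokens to be adjacent, the score rewards word order that is preserved even when the two texts differ in between.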
Minor Updates:
- API docs have received an update that should make finding relevant information easier.
- Wordpiece: Add support for splitting unknown characters
- Wordpiece: Add support for max characters per token
- Wordshape: Fix finding of currency symbols.
- Update Whitespace & UnicodeScript Tokenizers to accept scalar values.
- Build includes CC library targets. Useful for statically linking in TF.Text custom ops. Specifically useful for building into TF.Serving's model server.
- Build environment: Updated to match core TF's update.
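As a rough illustration of the Wordpiece item about max characters per token, the sketch below implements greedy longest-match-first subword splitting in pure Python. It is not the TF.Text implementation: the function name, the parameter names, the "##" suffix indicator, the "[UNK]" token, and the choice not to count the suffix indicator toward the character limit are all assumptions for this example. Splitting out individual unknown characters is not modeled here.

```python
def wordpiece_sketch(word, vocab, suffix_indicator="##",
                     unknown_token="[UNK]", max_chars_per_token=None):
    """Greedy longest-match-first split of one word into vocab pieces.

    Illustrative only; names and defaults are assumptions.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        if max_chars_per_token is not None:
            # Cap how many characters a single piece may cover.
            end = min(end, start + max_chars_per_token)
        match = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                # Non-initial pieces carry the suffix indicator.
                candidate = suffix_indicator + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits here, so the whole word is unknown.
            return [unknown_token]
        pieces.append(match)
        start = end
    return pieces


vocab = {"un", "##aff", "##able", "##affable"}
print(wordpiece_sketch("unaffable", vocab))
print(wordpiece_sketch("unaffable", vocab, max_chars_per_token=4))
```

With no limit the longest piece "##affable" wins; capping pieces at four characters forces the shorter "##aff" and "##able" pieces instead.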