v4.30.0: 100k, Agents improvements, Safetensors core dependency, Swiftformer, Autoformer, MobileViTv2, timm-as-a-backbone

@LysandreJik LysandreJik released this 08 Jun 18:07
· 3677 commits to main since this release
fe861e5

100k

Transformers has just reached 100k stars on GitHub. To celebrate, we wanted to highlight 100 projects in the vicinity of transformers, so we created an awesome-transformers page to do just that.

We accept PRs to add projects to the list!

4-bit quantization and QLoRA

By leveraging the bitsandbytes library by @TimDettmers, we add 4-bit support to transformers models!
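
A minimal sketch of what loading a model in 4-bit can look like, assuming bitsandbytes, accelerate, and a CUDA GPU are available. The helper name and the specific settings (NF4, double quantization, bfloat16 compute) are illustrative assumptions rather than recommendations from these notes. Imports are deferred into the function so the snippet can be imported on a machine without the GPU stack.

```python
# Illustrative 4-bit settings (assumed values, not prescribed by the release notes).
FOUR_BIT_SETTINGS = {
    "load_in_4bit": True,               # store linear-layer weights in 4 bits
    "bnb_4bit_quant_type": "nf4",       # NormalFloat4 data type from the QLoRA paper
    "bnb_4bit_use_double_quant": True,  # also quantize the quantization constants
}


def load_model_in_4bit(model_id: str):
    """Load a causal language model with 4-bit weights via bitsandbytes."""
    # Deferred imports: torch, transformers and bitsandbytes are only needed here.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls at run time
        **FOUR_BIT_SETTINGS,
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place the layers on available devices
    )
```

Calling the helper with any causal language model id from the Hub downloads the checkpoint and quantizes its weights at load time.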

Agents

The Agents framework has been improved and continues to be stabilized. Alongside bug fixes, the following important features were added:

  • Local agent capabilities, to load a generative model directly from transformers instead of relying on APIs.
  • Prompts are now hosted on the Hub, so anyone can fork them, adapt them, and share the result for other community contributors to reuse.
  • We add an AzureOpenAiAgent class to support Azure OpenAI agents.

Safetensors

The safetensors library is a safe serialization framework for machine learning tensors. It has been audited and will become the default serialization framework for several organizations (Hugging Face, EleutherAI, Stability AI).

It has now become a core dependency of transformers.
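
To illustrate why the format is considered safe, here is a standard-library sketch of the file layout as described in the safetensors documentation: an 8-byte little-endian header size, a JSON header, then raw tensor bytes. Both function names are hypothetical, and the writer exists only to produce a demo file. In practice you would use the safetensors library itself.

```python
import json
import struct


def write_minimal_safetensors(path, name, float_values):
    """Write one 1-D float32 tensor in the safetensors layout (demo only)."""
    data = struct.pack(f"<{len(float_values)}f", *float_values)
    header = {
        name: {
            "dtype": "F32",
            "shape": [len(float_values)],
            "data_offsets": [0, len(data)],  # relative to the start of the data buffer
        }
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte little-endian header size
        f.write(header_bytes)
        f.write(data)


def read_safetensors_header(path):
    """Return the JSON header: tensor names, dtypes, shapes and byte offsets."""
    with open(path, "rb") as f:
        (header_size,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_size))
```

Reading a file only involves parsing JSON and slicing bytes, so unlike pickle-based checkpoints, loading does not execute code stored in the file.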

New models

Swiftformer

The SwiftFormer paper introduces a novel efficient additive attention mechanism that effectively replaces the quadratic matrix multiplication operations in the self-attention computation with linear element-wise multiplications. A series of models called ‘SwiftFormer’ is built on this mechanism and achieves state-of-the-art performance in terms of both accuracy and mobile inference speed. Even the small variant achieves 78.5% top-1 ImageNet-1K accuracy with only 0.8 ms latency on iPhone 14, which is more accurate and 2× faster than MobileViT-v2.
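
As a rough illustration of the idea, the sketch below pools all queries into one global query vector and combines it with each key element-wise, so the cost grows linearly with the number of tokens. It is a simplification under stated assumptions: the learned input projections, normalization, and the final linear layer are omitted, and the function and argument names are hypothetical, not the library's implementation.

```python
import math


def additive_attention(queries, keys, w):
    """Simplified efficient additive attention over n tokens of dimension d.

    queries, keys: lists of n vectors (each a list of d floats).
    w: weight vector of length d (learned in a real model, supplied by the caller here).
    Cost is O(n * d); no n x n attention matrix is formed.
    """
    d = len(w)
    # 1. One scalar score per token: alpha_i = (q_i . w) / sqrt(d).
    scores = [sum(qi * wi for qi, wi in zip(q, w)) / math.sqrt(d) for q in queries]
    # 2. Pool all queries into a single global query vector weighted by the scores.
    global_query = [sum(s * q[j] for s, q in zip(scores, queries)) for j in range(d)]
    # 3. Element-wise interaction between the global query and every key.
    context = [[g * kj for g, kj in zip(global_query, k)] for k in keys]
    # 4. Residual connection back to the queries.
    return [[c + qj for c, qj in zip(ctx, q)] for ctx, q in zip(context, queries)]
```

Standard self-attention would instead compare every query with every key, which is where the quadratic cost comes from.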

Autoformer

Autoformer augments the Transformer into a deep decomposition architecture that progressively decomposes the trend and seasonal components during the forecasting process.
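
The decomposition step can be sketched with a moving average: the smoothed series is treated as the trend and the remainder as the seasonal part. This is a minimal pure-Python illustration of that building block, assuming edge padding by repeating the first and last values. The function name is hypothetical and the snippet is not the library's implementation.

```python
def decompose_series(values, window):
    """Split a series into (trend, seasonal) with a moving average.

    The series is padded by repeating its first and last values so that the
    trend has the same length as the input. `window` should be odd.
    """
    half = (window - 1) // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    # Trend: average over a sliding window centered on each time step.
    trend = [sum(padded[i:i + window]) / window for i in range(len(values))]
    # Seasonal: whatever the trend does not explain.
    seasonal = [v - t for v, t in zip(values, trend)]
    return trend, seasonal
```

In the model, a block like this is applied repeatedly inside the encoder and decoder, which is what makes the decomposition progressive.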

MobileViTv2

MobileViTv2 is the second version of MobileViT, constructed by replacing the multi-headed self-attention in MobileViT with separable self-attention.

PerSAM

PerSAM proposes a minimal modification to SAM that allows DreamBooth-like personalization, making it possible to segment a concept in new images from just one example.

Timm backbone

We add support for loading timm weights within the AutoBackbone API in transformers. timm models can be instantiated through the TimmBackbone class, and then used with any vision model that needs a backbone.
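
A minimal sketch of loading a timm model as a backbone, assuming timm and torch are installed alongside transformers. The helper names and the default checkpoint name are illustrative assumptions. The import is deferred so the snippet can be imported without those packages.

```python
def load_timm_backbone(name: str = "resnet18"):
    """Load a timm model through the AutoBackbone API."""
    # Deferred import: requires transformers, torch and timm to be installed.
    from transformers import AutoBackbone

    # use_timm_backbone=True tells AutoBackbone to build the model with timm.
    return AutoBackbone.from_pretrained(name, use_timm_backbone=True)


def extract_feature_maps(backbone, pixel_values):
    """Run the backbone on a batch of images and return its feature maps."""
    outputs = backbone(pixel_values)
    return outputs.feature_maps
```

Here `pixel_values` is expected to be an image batch tensor of shape (batch, channels, height, width).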

Image to text pipeline conditional support

We add conditional text generation to the image-to-text pipeline, allowing the model to continue an initial text prompt conditioned on an image.

  • [image-to-text pipeline] Add conditional text support + GIT by @NielsRogge in #23362
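
A minimal sketch of prompting the pipeline, assuming a GIT checkpoint such as microsoft/git-base-coco and network access to download it. The helper name, default prompt, and default model id are illustrative assumptions.

```python
def caption_with_prompt(image, prompt: str = "a photo of",
                        model_id: str = "microsoft/git-base-coco"):
    """Generate a caption that continues `prompt`, conditioned on `image`.

    `image` may be a local path, a URL, or a PIL image.
    """
    # Deferred import: requires transformers, torch and Pillow to be installed.
    from transformers import pipeline

    captioner = pipeline("image-to-text", model=model_id)
    # The pipeline returns a list of dicts, each with a "generated_text" entry.
    return captioner(image, prompt=prompt)
```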

TensorFlow implementations

Accelerate Migration

A major rework of the internals of the Trainer is underway, leveraging accelerate rather than reimplementing the same functionality in transformers. This should unify both frameworks and lead to increased interoperability and more efficient development.

Bugfixes and improvements

Significant community contributions

The following contributors have made significant changes to the library over the last release:

  • @shehanmunasinghe
  • @TimDettmers
    • Bugfix: LLaMA layer norm incorrectly changes input type and consumes lots of memory (#23535)
    • 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479)
    • Paged Optimizer + Lion Optimizer for Trainer (#23217)
  • @elisim
    • [Time-Series] Autoformer model (#21891)
    • Added time-series blogs to the models (#23857)
  • @kihoon71
    • 🌐 [i18n-KO] Translated fast_tokenizers.mdx to Korean (#22956)
    • [i18n-KO] Translated video_classification.mdx to Korean (#23026)
    • 🌐 [i18n-KO] Translated object_detection.mdx to Korean (#23164)
  • @D-Roberts
    • Add TensorFlow implementation of EfficientFormer (#22620)
  • @soongbren