
A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph


rhysdg/whisper-onnx-python



Whisper ONNX: An Optimized Speech-to-Text Python Package


Explore the docs »



Report Bug . Request Feature

Table of Contents

About The Project

Built With

The Story So Far

Coming soon

Getting Started:

  • Getting started is currently as simple as a pip install, either from the repository root or directly from the upstream repo:

    pip install .
    
    #or 
    
    pip install git+https://github.com/rhysdg/whisper-onnx-python.git
    
  • For Jetpack 5 support with Python 3.11, run the installation script first to grab a pre-built onnxruntime-gpu wheel for aarch64 and a few extra dependencies:

    sh jetson_install.sh 
    
    pip install .
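
After either install route, it can be useful to confirm that the onnxruntime build you ended up with actually exposes a GPU execution provider. The following is a minimal sketch rather than part of this package: `available_providers` and `pick_provider` are hypothetical helper names, and the preference order is an assumption. Only `onnxruntime.get_available_providers()` and the provider names are taken from ONNX Runtime itself.

```python
def available_providers():
    """Return the execution providers reported by onnxruntime, or [] if it is not installed."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return ort.get_available_providers()


def pick_provider(available, preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Return the first provider from `preferred` that appears in `available`.

    The default preference order (CUDA first, CPU as fallback) is an assumption
    made for this sketch, not something the package documents.
    """
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError(f"none of {preferred} found in {list(available)}")


if __name__ == "__main__":
    providers = available_providers()
    print("onnxruntime providers:", providers)
    if providers:
        print("would use:", pick_provider(providers))
```

If only `CPUExecutionProvider` is listed on a Jetson device, the GPU wheel from the installation script was probably not the one that got installed.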
    

Example usage:

  • Usage currently follows the official package closely, but adds a trt switch (still being debugged, so trt=False is recommended) and expects either an audio file or a NumPy array:

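    import numpy as np
    import whisper
    
    # Options passed to both load_model() and transcribe()
    args = {"language": "English",
            "name": "small.en",
            "precision": "fp32",
            "disable_cupy": False}
    
    # Temperatures 0.0, 0.2, ..., 1.0 (the 1e-6 keeps the 1.0 endpoint in the range)
    temperature = tuple(np.arange(0, 1.0 + 1e-6, 0.2))
    
    # trt=False is recommended while the TensorRT path is being debugged
    model = whisper.load_model(trt=False, **args)
    result = model.transcribe(
        "data/test.wav",
        temperature=temperature,
        **args
    )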
  • You can also find an example voice transcription assistant at examples/example_assistant.py

    • Hold down the space bar in the terminal to start recording
    • Release it to start transcription
    • This has been tested on Ubuntu 22.04 and on Jetpack 5 on an AGX Xavier, but feel free to open an issue so we can work through any problems!

    python examples/example_assistant.py
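
For the NumPy-array input mentioned above, the audio has to be decoded into an array first. Here is a minimal sketch using only the standard library `wave` module and NumPy. `wav_to_float32` is a hypothetical helper, not part of this package, and the mono float32 layout in [-1, 1] is an assumption carried over from the official Whisper package.

```python
import wave

import numpy as np


def wav_to_float32(source):
    """Read a 16-bit PCM WAV (path or file-like object) into a mono float32 array.

    Returns (samples, sample_rate), with samples scaled to the range [-1, 1].
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    # Interpret the raw bytes as little-endian int16 and scale to [-1, 1)
    samples = np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0

    # Average interleaved channels down to mono
    if channels > 1:
        samples = samples.reshape(-1, channels).mean(axis=1)
    return samples, rate
```

The resulting array could then stand in for the file path, for example `model.transcribe(samples, temperature=temperature, **args)`. The official Whisper models work on 16 kHz audio, so if `rate` differs it is safest to resample before transcribing.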

Customisation:

  • Coming soon

Notebooks

  • Coming soon

Tools and Scripts

  • Coming soon

Testing

  • Ubuntu 22.04 - RTX 3080, 8-core, Python 3.11 - passing

  • AGX Xavier, Jetpack 5.1.3, Python 3.11 - passing

  • CI/CD will be expanded as we go. All general instantiation tests pass so far.

Models & Latency benchmarks

  • Coming soon

Similar projects

Latest Updates

  • Finished the core Python package
  • Added an example assistant
  • Added Jetpack support

Future updates

  • CI/CD
  • PyPI release
  • Benchmarks for Jetson devices

Contact
