py: add python sdk
- Bumping versions to 0.9.3

- Added autogenerated files from protoc and shim layer

- Added tests for client and TLS

- Added examples and usage for the python sdk in the documentation.

Co-Authored-By: Julie Sheffield <julie@cobaltspeech.com>
shahruk10 and Julie Sheffield committed Feb 6, 2020
1 parent ae15cd9 commit 14761b1
Showing 22 changed files with 1,837 additions and 25 deletions.
12 changes: 10 additions & 2 deletions Makefile
@@ -11,11 +11,12 @@ TOP := $(shell pwd)
DEPSBIN := ${TOP}/deps/bin
DEPSGO := ${TOP}/deps/go
DEPSTMP := ${TOP}/deps/tmp
DEPSVENV := ${TOP}/deps/venv
$(shell mkdir -p $(DEPSBIN) $(DEPSGO) $(DEPSTMP))

export PATH := ${DEPSBIN}:${DEPSGO}/bin:$(PATH)

deps: deps-protoc deps-hugo deps-gendoc deps-gengo deps-gengateway deps-dotnet
deps: deps-protoc deps-hugo deps-gendoc deps-gengo deps-gengateway deps-dotnet deps-py

deps-protoc: ${DEPSBIN}/protoc
${DEPSBIN}/protoc:
@@ -49,8 +50,15 @@ ${DEPSBIN}/dotnet:
"https://download.visualstudio.microsoft.com/download/pr/d731f991-8e68-4c7c-8ea0-fad5605b077a/49497b5420eecbd905158d86d738af64/dotnet-sdk-3.1.100-linux-x64.tar.gz"
cd ${DEPSBIN} && tar -C ./ -xzvf dotnet-sdk-3.1.100-linux-x64.tar.gz

deps-py: ${DEPSVENV}/.done
${DEPSVENV}/.done:
virtualenv -p python3 ${DEPSVENV}
source ${DEPSVENV}/bin/activate && pip install grpcio-tools==1.20.0 googleapis-common-protos==1.5.9 && deactivate
touch $@

gen: deps
@ PROTOINC=${DEPSGO}/pkg/mod/github.com/grpc-ecosystem/grpc-gateway@v1.9.0/third_party/googleapis \
@ source ${DEPSVENV}/bin/activate && \
PROTOINC=${DEPSGO}/pkg/mod/github.com/grpc-ecosystem/grpc-gateway@v1.9.0/third_party/googleapis \
$(MAKE) -C grpc
@ pushd docs-src && hugo -d ../docs && popd

4 changes: 4 additions & 0 deletions README.md
@@ -21,12 +21,15 @@ repository.
Code generation has the following dependencies:
- The protobuf compiler itself (protoc)
- The protobuf documentation generation plugin (protoc-gen-doc)
- The python plugins (grpcio-tools and googleapis-common-protos)
- The golang plugins (protoc-gen-go and protoc-gen-grpc-gateway)
- The static website generator (hugo)

A few system dependencies are required:
- Go >= 1.12
- git
- python3
- virtualenv
- wget

The top level Makefile can set up all other dependencies.
@@ -80,6 +83,7 @@ git checkout -b version-update-v$NEW_VERSION

sed -i 's|grpc/go-juzu v[0-9.]*|grpc/go-juzu v'$NEW_VERSION'|g' grpc/go-juzu/juzupb/gw/go.mod
sed -i 's|<Version>[0-9.]*</Version>|<Version>'$NEW_VERSION'</Version>|g' grpc/csharp-juzu/juzu.csproj
sed -i 's|version='\''[0-9.]*'\''|version='\'$NEW_VERSION\''|g' grpc/py-juzu/setup.py
sed -i 's|^VERSION="[0-9.]*"|VERSION="'$NEW_VERSION'"|g' grpc/Makefile

git commit -m "Update version to v$NEW_VERSION"
10 changes: 5 additions & 5 deletions docs-src/content/_index.md
@@ -4,7 +4,7 @@ title: "Juzu SDK Documentation"

# Juzu API Overview

Juzu is Cobalt's speaker diarization engine. It can be deployed on-prem and accessed over the network or on your local machine via an API. We currently support C# and are adding support for more languages.
Juzu is Cobalt's speaker diarization engine. It can be deployed on-prem and accessed over the network or on your local machine via an API. We currently support C# and Python, and are adding support for more languages.

Once running, Juzu's API provides a method to which you can stream audio. This audio can either be from a microphone or a file. We recommend uncompressed WAV or lossless compression such as FLAC as the encoding, but we can support other formats as well upon request.

@@ -265,7 +265,7 @@ Cubic as well for transcription and aiding the diarization process. This server
exports Juzu's functionality over the gRPC protocol. The
https://github.com/cobaltspeech/sdk-juzu repository contains the SDK that you
can use in your application to communicate with the Juzu server. This SDK is
currently available for C# and we would be happy to talk to you if you need
support for other languages. Most of the core SDK is generated automatically
using the gRPC tools, and Cobalt provides a top level package for more
convenient API calls.
currently available for C# and Python, and we would be happy to talk to you if
you need support for other languages. Most of the core SDK is generated
automatically using the gRPC tools, and Cobalt provides a top level package for
more convenient API calls.
35 changes: 33 additions & 2 deletions docs-src/content/using-juzu-sdk/connecting.md
@@ -17,10 +17,26 @@ those to point to your server instance.

The following code snippet connects to the server and queries its version. It
uses our recommended default setup, expecting the server to be listening on a
TLS encrypted connection, as the demo server does.
TLS encrypted connection. An example showing how to connect to a server that
does not use TLS is given in the [Insecure Connection](#insecure-connection) section.

{{%tabs %}}

{{% tab "Python" %}}

```py
import juzu

serverAddress = "127.0.0.1:2727"

client = juzu.Client(serverAddress)

resp = client.Version()
print(resp)
```

{{% /tab %}}

{{% tab "C#" %}}

``` csharp
@@ -51,6 +67,14 @@ can use:

{{%tabs %}}

{{% tab "Python" %}}

```py
client = juzu.Client(serverAddress, insecure=True)
```

{{% /tab %}}

{{% tab "C#" %}}

``` csharp
@@ -80,8 +104,15 @@ authenticated TLS. This can be done with:

{{%tabs %}}

{{% tab "C#" %}}
{{% tab "Python" %}}

```py
client = juzu.Client(serverAddress, clientCertificate=certPem, clientKey=keyPem)
```
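
The snippet above assumes `certPem` and `keyPem` already hold the PEM-encoded client certificate and private key. A minimal sketch of loading them from disk follows. The helper name `load_pem` and the file names are hypothetical, and it assumes the client accepts the PEM contents as raw bytes (as gRPC's Python credential helpers do); check this against the SDK before relying on it.

```python
def load_pem(path):
    """Return the raw bytes of a PEM file (hypothetical helper, not part of the SDK)."""
    with open(path, "rb") as f:
        return f.read()

# Hypothetical file names; substitute the paths issued for your deployment.
# certPem = load_pem("client.crt")
# keyPem = load_pem("client.key")
# client = juzu.Client(serverAddress, clientCertificate=certPem, clientKey=keyPem)
```

Reading in binary mode keeps the PEM contents byte-for-byte intact, without newline translation or text decoding.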

{{% /tab %}}

{{% tab "C#" %}}

#### Authenticating Server Certificate

9 changes: 9 additions & 0 deletions docs-src/content/using-juzu-sdk/installation.md
@@ -8,6 +8,15 @@ Instructions for installing the SDK are language specific.

<!--more-->

### Python

The Python SDK depends on Python >= 3.5. You may use pip to perform a system-wide install, or use virtualenv for a local install.

```bash
pip install --upgrade pip
pip install "git+https://github.com/cobaltspeech/sdk-juzu#egg=cobalt-juzu&subdirectory=grpc/py-juzu"
```
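
Because the SDK needs Python 3.5 or newer, it can help to check the interpreter before installing. The following is only a sketch; the function name `meets_minimum` is illustrative and not part of the SDK.

```python
import sys

MIN_VERSION = (3, 5)  # minimum stated in the installation notes above

def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum

if not meets_minimum():
    raise SystemExit("Python >= 3.5 is required for the Juzu SDK")
```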

### C\#

The C# SDK uses the [NuGet package manager](https://www.nuget.org). The package is called `Juzu-SDK`, under the owner name `CobaltSpeech`.
183 changes: 183 additions & 0 deletions docs-src/content/using-juzu-sdk/streaming.md
@@ -24,6 +24,72 @@ transcription).

{{%tabs %}}

{{% tab "Python" %}}

``` py
import juzu

serverAddress = '127.0.0.1:2727'

# set insecure=True for connecting to server not using TLS
client = juzu.Client(serverAddress, insecure=False)

# get list of available models
modelResp = client.ListModels()
for model in modelResp.models:
    print("ID = {}\t Name = {}\t [SampleRate = {} Hz]".format(model.id, model.name, model.attributes.sample_rate))

# use the first available model
juzuModel = modelResp.models[0]

# Using a Cubic model to transcribe; Cubicsvr must also be
# running, with its address:port provided in the Juzu server
# config file. The Cubic models and their IDs on Cubicsvr can
# be found in cubicsvr.cfg.toml or obtained via sdk-cubic.
cubicModelID = "1"

cfg = juzu.DiarizationConfig(
    model_id = juzuModel.id,
    cubic_model_id = cubicModelID,
    num_speakers = 2,        # number of speakers expected in the audio file
    audio_encoding = "WAV",  # supported : "RAW_LINEAR16", "FLAC", "WAV"
    sample_rate = 16000,     # must match juzu model's expected sample rate
)

# client.StreamingDiarize takes any binary
# stream object that has a read(nBytes) method.
# The method should return nBytes from the stream.

# open audio file stream
audio = open('test.wav', 'rb')

# helper function to convert protobuf duration objects
# (which store the time split into integer seconds
# and integer nanoseconds) into a single floating point
# value in seconds
def protoDurToSec(dur):
    return float(dur.seconds) + float(dur.nanos) * 1e-9

# defining function to print speaker segments and transcripts to screen
def handleResults(diarizationResp):
    for result in diarizationResp.results:
        for segment in result.segments:
            print("{start:.3f} - {end:.3f}\t{speaker}:\t{transcript}\n".format(
                start = protoDurToSec(segment.start_time),
                end = protoDurToSec(segment.end_time),
                speaker = segment.speaker_label,
                transcript = segment.transcript,
            ))

# sending streaming request to Juzu and
# waiting for results to return
for resp in client.StreamingDiarize(cfg, audio):
    handleResults(resp)

```
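
The `protoDurToSec` helper in the example above can be tried without a running server. The sketch below uses a hypothetical stand-in, `FakeDuration`, that carries only the `seconds` and `nanos` fields the helper reads; real responses carry protobuf duration objects instead.

```python
from collections import namedtuple

# Hypothetical stand-in for a protobuf duration: only the two fields used here.
FakeDuration = namedtuple("FakeDuration", ["seconds", "nanos"])

def protoDurToSec(dur):
    # whole seconds plus the fractional part stored as nanoseconds
    return float(dur.seconds) + float(dur.nanos) * 1e-9

# 12 s + 340,000,000 ns, formatted the same way as the segment printout
print("{:.3f}".format(protoDurToSec(FakeDuration(seconds=12, nanos=340000000))))
```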

{{% /tab %}}

{{% tab "C#" %}}

#### Program.cs
@@ -115,6 +181,123 @@ chosen correctly.

{{%tabs %}}

{{% tab "Python" %}}

This example requires the [pyaudio](http://people.csail.mit.edu/hubert/pyaudio/)
module to stream audio from a microphone. Instructions for installing pyaudio
for different systems are available at the link. On most platforms, this is
simply `pip install pyaudio`.

``` py
import juzu
import pyaudio
import threading

serverAddress = '127.0.0.1:2727'

# set insecure=True for connecting to server not using TLS
client = juzu.Client(serverAddress, insecure=True)

# get list of available models
modelResp = client.ListModels()

# use the first available model
juzuModel = modelResp.models[0]

# creating diarization config to transcribe + diarize
# audio stream from microphone
cfg = juzu.DiarizationConfig(
    model_id = juzuModel.id,
    cubic_model_id = "1",
    num_speakers = 2,
    audio_encoding = "RAW_LINEAR16",
    sample_rate = juzuModel.attributes.sample_rate,
)

# client.StreamingDiarize takes any binary stream object that has a read(nBytes)
# method. The method should return nBytes from the stream. So pyaudio is a suitable
# library to use here for streaming audio from the microphone. Other libraries or
# modules may also be used as long as they have the read method or have been wrapped
# to do so.

# defining class to wrap around microphone stream from pyaudio
class MicStream(object):

    def __init__(self, sampleRate):
        self._p = pyaudio.PyAudio()
        # opening mic stream, recording 16 bit little endian integer samples, mono channel
        self._stream = self._p.open(format=pyaudio.paInt16, channels=1, rate=sampleRate, input=True)
        self._stopped = False

    def __del__(self):
        self._stream.close()
        self._p.terminate()

    # StreamingDiarize requires a read(nBytes) method
    # that returns nBytes from the stream. An empty
    # list signals the end of the stream.
    def read(self, nBytes):
        # if the stream is stopped, return an empty list
        # to signal end of stream to Juzu
        if self._stopped:
            return []
        return self._stream.read(nBytes)

    def pause(self):
        self._stream.stop_stream()

    def resume(self):
        self._stream.start_stream()

    def stop(self):
        self._stopped = True


audio = MicStream(juzuModel.attributes.sample_rate)

# helper function to convert protobuf duration objects
# (which store the time split into integer seconds
# and integer nanoseconds) into a single floating point
# value in seconds
def protoDurToSec(dur):
    return float(dur.seconds) + float(dur.nanos) * 1e-9

# function run in a separate thread: sends the streaming
# request to Juzu and processes the results once they
# come back after the stream ends.
def streamToJuzu(cfg, audio):
    try:
        for resp in client.StreamingDiarize(cfg, audio):
            for result in resp.results:
                for segment in result.segments:
                    print("{start:.3f} - {end:.3f}\t{speaker}:\t{transcript}\n".format(
                        start = protoDurToSec(segment.start_time),
                        end = protoDurToSec(segment.end_time),
                        speaker = segment.speaker_label,
                        transcript = segment.transcript,
                    ))
    except Exception as ex:
        print("[error]: streaming diarization failed: {}".format(ex))

streamThread = threading.Thread(target=streamToJuzu, args=(cfg, audio))
streamThread.daemon = True
streamThread.start()

# waiting for user to end mic stream
print("\nStreaming audio to Juzu server ...\n")
k = input("-- Press Enter key to stop stream --")

print("\nStopping Stream ...")
audio.stop()

print("Waiting for results ...")
streamThread.join()

```
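
Any object with a `read(nBytes)` method can stand in for the microphone stream. As a rough sketch, the class below wraps an iterable of byte chunks (for example, buffers handed over by another audio library) so that it exposes that method. The name `ChunkReader` is hypothetical, and it returns empty bytes at end of stream on the assumption that the SDK treats any empty result as the end of the stream, as the comments above describe for an empty list.

```python
class ChunkReader(object):
    """Expose read(nBytes) over an iterable of bytes chunks (hypothetical helper)."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buf = b""

    def read(self, nBytes):
        # pull chunks until nBytes are buffered or the source is exhausted
        while len(self._buf) < nBytes:
            try:
                self._buf += next(self._chunks)
            except StopIteration:
                break
        # hand back at most nBytes and keep the remainder for the next call
        data, self._buf = self._buf[:nBytes], self._buf[nBytes:]
        return data


reader = ChunkReader([b"abc", b"de", b"f"])
first = reader.read(4)   # spans the first two chunks
second = reader.read(4)  # fewer than 4 bytes remain
third = reader.read(4)   # source exhausted: empty bytes
```

For audio that is already fully in memory, the standard library's `io.BytesIO` offers the same `read` interface without a custom class.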

{{% /tab %}}

{{% tab "C#" %}}

We do not currently have example C# code for streaming from a microphone. Simply
10 changes: 5 additions & 5 deletions docs/index.html
@@ -160,7 +160,7 @@

<h1 id="juzu-api-overview">Juzu API Overview</h1>

<p>Juzu is Cobalt&rsquo;s speaker diarization engine. It can be deployed on-prem and accessed over the network or on your local machine via an API. We currently support C# and are adding support for more languages.</p>
<p>Juzu is Cobalt&rsquo;s speaker diarization engine. It can be deployed on-prem and accessed over the network or on your local machine via an API. We currently support C# and Python, and are adding support for more languages.</p>

<p>Once running, Juzu&rsquo;s API provides a method to which you can stream audio. This audio can either be from a microphone or a file. We recommend uncompressed WAV or lossless compression such as FLAC as the encoding, but we can support other formats as well upon request.</p>

@@ -417,10 +417,10 @@ <h2 id="obtaining-juzu">Obtaining Juzu</h2>
exports Juzu&rsquo;s functionality over the gRPC protocol. The
<a href="https://github.com/cobaltspeech/sdk-juzu">https://github.com/cobaltspeech/sdk-juzu</a> repository contains the SDK that you
can use in your application to communicate with the Juzu server. This SDK is
currently available for C# and we would be happy to talk to you if you need
support for other languages. Most of the core SDK is generated automatically
using the gRPC tools, and Cobalt provides a top level package for more
convenient API calls.</p>
currently available for C# and Python, and we would be happy to talk to you if
you need support for other languages. Most of the core SDK is generated
automatically using the gRPC tools, and Cobalt provides a top level package for
more convenient API calls.</p>


