Smart Reply Model

What is the On-Device Smart Reply Model?

Smart Replies are contextually relevant, one-touch responses that help the user to reply to an incoming text message (or email) efficiently and effortlessly. Smart Replies have been highly successful across several Google products including Gmail, Inbox and Allo.

The On-device Smart Reply model is targeted towards text chat use cases. It has a completely different architecture from its cloud-based counterparts, and is built specifically for memory-constrained devices such as phones and watches. It has been successfully used to provide Smart Replies on Android Wear to all first- and third-party apps.

The on-device model comes with several benefits. It is:

  • Faster: The model resides on the device and does not require internet connectivity. Thus, inference is very fast, with an average latency of only a few milliseconds.
  • Resource-efficient: The model has a small memory footprint on the device.
  • Privacy-friendly: User data never leaves the device, which eliminates privacy concerns.

A caveat, though, is that the on-device model has a lower triggering rate than its cloud counterparts (the triggering rate is the percentage of incoming messages for which the model suggests a response).

When to Use This Model?

The On-Device Smart Reply model is aimed at improving the messaging experience for day-to-day conversational chat messages. We recommend using this model for similar use cases. Some sample messages on which the model does well are provided in this tsv file for reference. The file format is:

   {incoming_message  smart_reply1   [smart_reply2]   [smart_reply3]}
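For illustration only, a line in this tab-separated format might look like the following (a hypothetical example, not taken from the actual file):

   want to grab lunch?  Sure   Sounds good   What time?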

For the current model, we see a triggering rate of about 30-40% for messages similar to those provided in the tsv file above.

If the model does not trigger any response, the system falls back to suggesting replies from a fixed back-off set that was compiled from popular response intents observed in chat conversations. Some of the fallback responses are Ok, Yes, No, 👍, ☺.
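As a minimal sketch of this fallback behavior (the function name and wiring below are hypothetical, for illustration only):

    #include <string>
    #include <vector>

    // Hypothetical illustration: if the model triggers no suggestion for
    // the incoming message, serve replies from the fixed back-off set.
    std::vector<std::string> SuggestReplies(
        const std::vector<std::string>& model_predictions) {
      static const std::vector<std::string> kBackoffResponses = {
          "Ok", "Yes", "No", "👍", "☺"};
      return model_predictions.empty() ? kBackoffResponses : model_predictions;
    }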

The model can only be used for inference at this time (i.e., it cannot be custom-trained). If you are interested in how the model was trained, please refer to this blog post and research paper.

How to Use This Model?

We have provided a pre-built demo APK that you can download, install and test on your phone (demo APK here).

The On-Device Smart Reply demo app works in the following way:

  1. The Android app links to the JNI binary, which includes a predictor library.

  2. In the predictor library, GetSegmentPredictions is called with a list of input strings.

    2.1 The input can be the 1-3 most recent messages of the conversation, in the form of a string vector. The model will run on these input sentences and provide Smart Replies corresponding to them.

    2.2 The function performs some preprocessing on input data which includes:

    • Sentence splitting: The input message will be split into sentences if the message has more than one sentence. E.g., a message like “How are you? Want to grab lunch?” will be broken down into two sentences.

    • Normalization: The individual sentences will be normalized by converting them to lowercase, removing unnecessary punctuation, etc. E.g., “how are you????” will be converted to “how are you?” (refer to the NORMALIZE op below for more details).

      The input string content will be converted to tensors.

    2.3 The function then runs the prediction model on the input tensors.

    2.4 The function also performs some post-processing, which includes aggregating the model predictions for the individual input sentences from 2.2 and returning the appropriate responses.

  3. Finally, it reads the response(s) from std::vector<PredictorResponse> and returns them to the Android app. Responses are sorted in descending order of confidence score. A minimal usage sketch of this flow follows.
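For reference, here is a minimal C++ sketch of the flow above. GetSegmentPredictions and PredictorResponse are named in this document, but the exact parameter list, header path, and accessors used here are assumptions; consult the predictor library in the open-source code for the authoritative API.

    #include <cstdio>
    #include <string>
    #include <vector>

    #include "predictor.h"  // assumed header exposing the predictor API

    // Sketch only. The parameter list and the GetText()/GetScore()
    // accessors below are assumptions, not the authoritative API.
    // `Model` and `Config` stand in for whatever model handle and
    // configuration object the predictor library actually takes.
    void ShowSmartReplies(const Model& model, const Config& config) {
      // The 1-3 most recent messages of the conversation (step 2.1).
      std::vector<std::string> messages = {"How are you? Want to grab lunch?"};

      // The predictor splits and normalizes the input, runs the model,
      // and aggregates the predictions (steps 2.2-2.4).
      std::vector<PredictorResponse> responses;
      GetSegmentPredictions(messages, model, config, &responses);

      // Responses come back sorted in descending order of confidence score.
      for (const PredictorResponse& response : responses) {
        std::printf("%s (score %.3f)\n", response.GetText().c_str(),
                    response.GetScore());
      }
    }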

Ops and Functionality Supported

The following ops are supported for using the On-Device Smart Reply model:

  • NORMALIZE

    This is a custom op which normalizes the sentences by:

    • Converting all sentences into lowercase.
    • Removing unnecessary punctuation (e.g., “how are you????” → “how are you?”).
    • Expanding contractions where necessary (e.g., “I’m home” → “I am home”).
  • SKIP_GRAM

    This is an op inside TensorFlow Lite that converts sentences into a list of skip-grams. The configurable parameters are ngram_size and max_skip_size. For the model provided, the values for these parameters are set to 3 and 2, respectively (see the worked sketch after this list).

  • EXTRACT_FEATURES

    This is a custom op that hashes skip grams to features represented as integers. Longer skip-grams are allocated higher weights.

  • LSH_PROJECTION

    This is an op inside TensorFlow Lite that projects input features to a corresponding bit vector space using Locality Sensitive Hashing (LSH).

  • PREDICT

    This is a custom op that runs the input features through the projection model (details here), computes the appropriate response labels along with weights for the projected features, and aggregates the response labels and weights together.

  • HASHTABLE_LOOKUP

    This is an op inside TensorFlow Lite that takes the label id produced by the PREDICT op and looks up the corresponding response text.
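These ops roughly correspond to the preprocessing, prediction, and post-processing steps described earlier. To make the skip-gram step concrete, below is a small standalone sketch of skip-gram extraction with ngram_size = 3 and max_skip_size = 2, the configuration mentioned above. This is a conceptual illustration, not the TensorFlow Lite SKIP_GRAM implementation; it emits every n-gram of up to 3 words, allowing up to 2 words to be skipped between consecutive tokens.

    #include <cstdio>
    #include <string>
    #include <vector>

    // Conceptual sketch of skip-gram extraction, NOT the TensorFlow Lite
    // SKIP_GRAM op. Emits every n-gram of up to `max_len` words, where up
    // to `max_skip` words may be skipped between consecutive picks.
    void Grow(const std::vector<std::string>& words, size_t next,
              size_t max_skip, size_t max_len,
              std::vector<std::string>* cur, std::vector<std::string>* out) {
      if (cur->size() == max_len) return;
      for (size_t i = next; i < words.size() && i <= next + max_skip; ++i) {
        cur->push_back(words[i]);
        std::string gram = (*cur)[0];
        for (size_t j = 1; j < cur->size(); ++j) gram += " " + (*cur)[j];
        out->push_back(gram);
        Grow(words, i + 1, max_skip, max_len, cur, out);
        cur->pop_back();
      }
    }

    int main() {
      const std::vector<std::string> words = {"want", "to", "grab", "lunch"};
      std::vector<std::string> cur, grams;
      for (size_t start = 0; start < words.size(); ++start) {
        cur.push_back(words[start]);
        grams.push_back(words[start]);  // the unigram itself
        Grow(words, start + 1, /*max_skip=*/2, /*max_len=*/3, &cur, &grams);
        cur.pop_back();
      }
      // Prints, e.g.: "want", "want to", "want to grab", "want to lunch",
      // "want grab", "want grab lunch", "want lunch", ...
      for (const std::string& g : grams) std::printf("%s\n", g.c_str());
      return 0;
    }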

How to Run the App?

We have released a pre-built demo APK (here) for you to download, install, and test on your phone.

If you want to build the app yourself, please refer to how-to-build.md.

Further Information

  • Open source code here.