Screen casts: UIKit version | SwiftUI version
This is an end-to-end example of a BERT Question & Answer application built with TensorFlow 2.0 and tested on SQuAD dataset version 1.1. The demo app provides 48 passages from the dataset for users to choose from, and returns the five most likely answers for the selected passage and query.
This example includes two applications and a test set: one application is developed with UIKit, and the other with SwiftUI. Either application can be run from this Xcode project by choosing the corresponding target to build. Both applications share the core logic needed to run the BertQA model, and the test set covers this core logic.
Question input to the application is discarded after inference.
BERT, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing tasks.
This app uses MobileBERT, a compressed version of BERT that runs 4x faster and has a 4x smaller model size.
For more information, refer to the BERT GitHub page.
During preprocessing, the app tokenizes the input query and passage using the given vocabulary data. The tokenization process mostly follows the BERT tokenization, except for the handling of Chinese characters, since none of the provided passages are written in Chinese.
It hands over the token IDs together with two other arrays. One is the segment ID, which indicates whether each token belongs to the question or the passage. The other is a mask, which indicates whether a given token is valid input to be processed or just padding added to fit the input tensor. Since the model requires a fixed-size array of token IDs, some entries of the array may hold invalid data.
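The three arrays described above can be sketched as follows. This is a minimal illustration, not the app's actual code: the sequence length, special-token IDs, and function name are all assumptions.

```swift
// Sketch of building the three fixed-size input arrays (all names and the
// sequence length of 384 are assumptions for illustration).
func makeInputs(queryTokenIDs: [Int32], passageTokenIDs: [Int32],
                clsID: Int32, sepID: Int32, maxSeqLen: Int = 384)
    -> (ids: [Int32], segments: [Int32], mask: [Int32]) {
    // Layout: [CLS] query [SEP] passage [SEP], then zero-padding up to maxSeqLen.
    var ids: [Int32] = [clsID] + queryTokenIDs + [sepID] + passageTokenIDs + [sepID]
    // Segment ID 0 marks the question part, 1 marks the passage part.
    var segments = [Int32](repeating: 0, count: 2 + queryTokenIDs.count)
        + [Int32](repeating: 1, count: passageTokenIDs.count + 1)
    // Mask 1 marks valid tokens; the padding entries get 0.
    var mask = [Int32](repeating: 1, count: ids.count)
    let padding = maxSeqLen - ids.count
    ids += [Int32](repeating: 0, count: padding)
    segments += [Int32](repeating: 0, count: padding)
    mask += [Int32](repeating: 0, count: padding)
    return (ids, segments, mask)
}
```

The mask is what lets the model distinguish real tokens from the padding entries that merely fill the tensor to its fixed size.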
The app assigns the preprocessed data to the input tensors and runs the model. The results are written to the output tensors.
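With the TensorFlowLiteSwift `Interpreter`, this step might look like the sketch below. The input order (token IDs, mask, segment IDs) and the output order are assumptions; check the model's actual signature before relying on them.

```swift
import Foundation
import TensorFlowLite  // from the TensorFlowLiteSwift pod

// Converts a raw output tensor's bytes into a Float32 array.
func float32Array(from data: Data) -> [Float32] {
    data.withUnsafeBytes { Array($0.bindMemory(to: Float32.self)) }
}

// Minimal inference sketch; tensor ordering here is an assumption.
func runBertQA(modelPath: String,
               tokenIDs: [Int32], inputMask: [Int32], segmentIDs: [Int32])
    throws -> (startLogits: [Float32], endLogits: [Float32]) {
    let interpreter = try Interpreter(modelPath: modelPath)
    try interpreter.allocateTensors()

    // Copy each preprocessed Int32 array into its input tensor.
    for (index, array) in [tokenIDs, inputMask, segmentIDs].enumerated() {
        let data = array.withUnsafeBufferPointer { Data(buffer: $0) }
        try interpreter.copy(data, toInputAt: index)
    }
    try interpreter.invoke()

    // Read the start and end logits back from the output tensors.
    let endLogits = float32Array(from: try interpreter.output(at: 0).data)
    let startLogits = float32Array(from: try interpreter.output(at: 1).data)
    return (startLogits, endLogits)
}
```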
During postprocessing, the app retrieves the original string from the arrays of start and end logits in the output tensor. The score of a candidate answer spanning tokens `start` through `end` is the sum of the start logit array's `start`-th value and the end logit array's `end`-th value. The higher the sum of the two logits, the more likely the tokens between the start and end positions form the answer.
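The span search described above can be sketched as follows. The function name is illustrative, and capping the answer length is an assumption that mirrors the BERT reference implementation:

```swift
// Sketch of the best-span search over start/end logits. The maximum answer
// length cap is an assumption borrowed from the BERT reference code.
func bestSpan(startLogits: [Float], endLogits: [Float], maxAnswerLength: Int = 32)
    -> (start: Int, end: Int, score: Float) {
    var best = (start: 0, end: 0, score: -Float.infinity)
    for start in startLogits.indices {
        for end in start..<min(start + maxAnswerLength, endLogits.count) {
            // The score of span [start, end] is the sum of the two logits.
            let score = startLogits[start] + endLogits[end]
            if score > best.score { best = (start, end, score) }
        }
    }
    return best
}
```

Requiring `end >= start` in the inner loop rules out spans that end before they begin, which the raw logits alone would not prevent.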
After finding an answer, the app retrieves the original string in the given range and calculates its score. The score of the answer is computed with the softmax function.
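A numerically stable softmax over the candidate spans' summed logits might look like this sketch; the function name is illustrative:

```swift
import Foundation

// Sketch of softmax scoring: converts the candidates' summed logits into
// probabilities that add up to 1.
func softmax(_ logits: [Float]) -> [Float] {
    // Subtract the maximum logit before exponentiating for numerical stability.
    let maxLogit = logits.max() ?? 0
    let exps = logits.map { exp($0 - maxLogit) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}
```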
- Xcode 11.0 or above
- Valid Apple Developer ID
- Real iOS device

  Note: You can also use an iOS simulator, but some of the functionality may not be fully supported.

- iOS version 12.0 or above
- Xcode command line tools (to install, run `xcode-select --install`)
- CocoaPods (to install, run `sudo gem install cocoapods`)
1. Clone the TensorFlow examples GitHub repository to your computer to get the demo application:

   `git clone https://github.com/tensorflow/examples`

2. Install the pod to generate the workspace file:

   `cd examples/lite/examples/bert_qa/ios && pod install`

   Note: If you have installed this pod before and that command doesn't work, try `pod update`. At the end of this step you should have a directory called `BertQA.xcworkspace`.

3. Open the project in Xcode with the following command:

   `open BertQA.xcworkspace`

   This launches Xcode and opens the BertQA project.
4. In the Menu bar, select Product → Destination and choose your device.

5. Follow the directions below if you want to:

   - Run the application:
     1. In the Menu bar, select Product → Scheme and choose BertQA-UIKit or BertQA-SwiftUI.
     2. In the Menu bar, select Product → Run to install the app on your device.

   - Test the core logic:
     1. In the Menu bar, select Product → Scheme and choose BertQA-UIKit.
     2. In the Menu bar, select Product → Test.