TalkingStage is a conversational bot that uses machine learning to predict responses to various questions about personal preferences and characteristics. The bot is designed to assist users in gathering and responding to personal information during the "talking stage" of a relationship.
- Predefined responses for direct keyword matches.
- Machine learning-based predictions for non-direct keyword matches.
- Model training and saving functionality.
- Uses a saved model for making predictions in production.
- Supports multiple platforms, including macOS and Windows. Android and iOS support may follow in future updates, either through API access to the model or if Microsoft makes ML.NET compatible with mobile devices.
- Implemented a more advanced machine learning algorithm (LightGBM) for improved prediction accuracy.
- Fine-tuned model parameters for enhanced performance and reliability.
- Optimized data preprocessing steps to ensure consistent and effective training.
- Updated data loading and preprocessing to handle text data more efficiently.
- Streamlined the training pipeline for faster model training and deployment.
- Enhanced model robustness with increased iterations and improved feature handling.
- Integrated comprehensive error handling and logging for better debugging capabilities.
- Improved overall code readability and maintainability.
- TalkingStageBot.cs: Main class for the bot functionality.
- TrainingModel.cs: Class responsible for training and saving the machine learning model.
- Clone the repository:
    git clone https://github.com/Brainydaps/TalkingStage.git
    cd TalkingStage
- Ensure you have the .NET SDK installed:
  - You can download it from the official Microsoft .NET website.
- Restore the required packages:
    dotnet restore
- Build the project:
    dotnet build
- Run the project:
    dotnet run
- Place your training data in a CSV file named training_data.csv in the project directory.
- Ensure the CSV file has two columns: Text and Label.
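Before training, it can help to confirm that the file really has the two expected columns. The following is a minimal sketch, not code from this repo: the class name TrainingDataCheck and the method LooksValid are hypothetical, and it assumes that no field contains a quoted comma, as in the sample rows of training_data.csv.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the repository.
public static class TrainingDataCheck
{
    // Returns true when the first line is the "Text,Label" header and every
    // non-empty data row splits into exactly two comma-separated fields.
    // Assumption: fields never contain commas or quotes.
    public static bool LooksValid(string[] lines)
    {
        if (lines.Length == 0 || lines[0].Trim() != "Text,Label")
        {
            return false;
        }

        return lines.Skip(1)
                    .Where(line => !string.IsNullOrWhiteSpace(line))
                    .All(line => line.Split(',').Length == 2);
    }
}
```

It could be called as TrainingDataCheck.LooksValid(File.ReadAllLines("training_data.csv")) before the model is trained. If your questions contain commas, a proper CSV parser would be a safer choice than Split.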
When you run the project for the first time, the bot will check for a pre-existing model file (model.zip) in the project directory:
- If the model file is not found, it will train a new model using the provided training_data.csv and save it as model.zip.
- If the model file is found, it will load the existing model and create a PredictionEngine from it.
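The startup check described above can be sketched roughly as follows. This is an illustration under assumptions rather than the repo's actual code: QuestionInput, QuestionPrediction, ModelBootstrap, and the trainAndSave callback are hypothetical names, and the string-typed PredictedLabel assumes the trained pipeline converts the predicted key back to its label text. The Model.Load and CreatePredictionEngine calls are standard ML.NET APIs.

```csharp
using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input shape matching the Text and Label columns.
public class QuestionInput
{
    [LoadColumn(0)] public string Text { get; set; } = string.Empty;
    [LoadColumn(1)] public string Label { get; set; } = string.Empty;
}

// Hypothetical output shape; assumes the pipeline emits PredictedLabel as text.
public class QuestionPrediction
{
    [ColumnName("PredictedLabel")] public string PredictedLabel { get; set; } = string.Empty;
}

public static class ModelBootstrap
{
    // Loads the model at modelPath if the file exists; otherwise calls the
    // trainAndSave callback (a stand-in for the logic in TrainingModel.cs).
    // Either way, a PredictionEngine is built from the resulting model.
    public static PredictionEngine<QuestionInput, QuestionPrediction> CreateEngine(
        MLContext mlContext, string modelPath, Func<ITransformer> trainAndSave)
    {
        ITransformer model = File.Exists(modelPath)
            ? mlContext.Model.Load(modelPath, out _)
            : trainAndSave();

        return mlContext.Model.CreatePredictionEngine<QuestionInput, QuestionPrediction>(model);
    }
}
```

Passing the training step in as a callback keeps the file check separate from the training logic, so deleting model.zip is enough to force a retrain on the next run.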
You can get responses by calling the GetResponse method with a question string:
var bot = new TalkingStageBot();
var response = bot.GetResponse("What is your favorite color?");
Console.WriteLine(response);
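Internally, a call like this first tries a direct keyword match and only then falls back to the model. The sketch below illustrates that two-step lookup. It is an assumption about the flow, not the actual body of GetResponse: ResponseRouter, the predictLabel delegate, and the fallback message are hypothetical, and the case-insensitive substring match is one possible way to detect a keyword.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of keyword-first, model-second routing.
public class ResponseRouter
{
    private readonly Dictionary<string, string> _responses;
    private readonly Func<string, string> _predictLabel;

    // responses maps a keyword/label (e.g. "your name") to the reply text.
    // predictLabel stands in for the ML.NET PredictionEngine call.
    public ResponseRouter(Dictionary<string, string> responses, Func<string, string> predictLabel)
    {
        _responses = responses;
        _predictLabel = predictLabel;
    }

    public string GetResponse(string question)
    {
        // Step 1: direct keyword match against the dictionary keys.
        foreach (var pair in _responses)
        {
            if (question.Contains(pair.Key, StringComparison.OrdinalIgnoreCase))
            {
                return pair.Value;
            }
        }

        // Step 2: no keyword found, so ask the model for a label and
        // look up the response aligned with that label.
        string label = _predictLabel(question);
        return _responses.TryGetValue(label, out var response)
            ? response
            : "Sorry, I do not have an answer for that yet."; // hypothetical fallback text
    }
}
```

With a dictionary containing "your name" and "your age", the question "What is your name" is answered by the keyword match, while "How old are you" contains neither key and is routed through the predictor.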
- TalkingStageBot.cs: Contains the bot logic, predefined responses, and model prediction methods.
- TrainingModel.cs: Contains the logic for training the machine learning model and saving it to a file.
- Added logging to confirm paths for the model and training data.
- Text data is featurized using the FeaturizeText method, converting text into numerical vectors.
- The labels are mapped to keys using MapValueToKey.
- The SdcaMaximumEntropy trainer from the ML.NET library is used for multiclass classification.
- Regularization parameters and the number of iterations were adjusted to improve model performance:
  - L2 Regularization: 0.1
  - L1 Regularization: 0.01
  - Maximum Number of Iterations: 1000
- The trained model is used to create a PredictionEngine that takes user input, processes it, and predicts the most appropriate label.
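Put together, the featurization, label mapping, trainer settings, and prediction steps described above might look like the sketch below. It follows the SdcaMaximumEntropy description and the listed parameter values, but it is not copied from TrainingModel.cs: the class names, column names, and the final MapKeyToValue step (added so the predicted key is returned as label text) are assumptions.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row types; the repo's own class names may differ.
public class QuestionInput
{
    [LoadColumn(0)] public string Text { get; set; } = string.Empty;
    [LoadColumn(1)] public string Label { get; set; } = string.Empty;
}

public class QuestionPrediction
{
    [ColumnName("PredictedLabel")] public string PredictedLabel { get; set; } = string.Empty;
}

public static class TrainingSketch
{
    // Reads training_data.csv-style files: comma separated, with a header row.
    public static IDataView LoadCsv(MLContext mlContext, string csvPath) =>
        mlContext.Data.LoadFromTextFile<QuestionInput>(csvPath, separatorChar: ',', hasHeader: true);

    public static ITransformer TrainAndSave(MLContext mlContext, IDataView trainingData, string modelPath)
    {
        var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")       // label text -> key
            .Append(mlContext.Transforms.Text.FeaturizeText("Features", "Text"))   // text -> numeric vector
            .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy(
                labelColumnName: "Label",
                featureColumnName: "Features",
                l2Regularization: 0.1f,
                l1Regularization: 0.01f,
                maximumNumberOfIterations: 1000))
            // Assumption: convert the predicted key back to the original label string.
            .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

        ITransformer model = pipeline.Fit(trainingData);
        mlContext.Model.Save(model, trainingData.Schema, modelPath);
        return model;
    }

    // Creates a PredictionEngine from the model and predicts a label for one question.
    public static string PredictLabel(MLContext mlContext, ITransformer model, string question)
    {
        var engine = mlContext.Model.CreatePredictionEngine<QuestionInput, QuestionPrediction>(model);
        return engine.Predict(new QuestionInput { Text = question }).PredictedLabel;
    }
}
```

In a real bot, the PredictionEngine would normally be created once and reused rather than rebuilt for every question. Note also that PredictionEngine is not thread-safe, so a shared instance needs external synchronization.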
The training data consists of user queries and corresponding labels. The labels are aligned with the predefined responses in the bot. The training data was populated by mapping the various ways users might ask a question to a specific label. Below is a sample of the training_data.csv format, which is also found in this repo:
Text,Label
What is your name,your name
Where do you live,your location
How old are you,your age
What is your job,your job
...
- TalkingStageBot.cs: The core file containing the TalkingStageBot class, responsible for initializing the ML context, training the model, and providing responses.
- training_data.csv: The CSV file containing the training data used to train the machine learning model.
- Responses Dictionary: A predefined dictionary mapping keywords to responses, used for quick responses to known queries. Before compiling the code, edit the placeholders in the responses dictionary values to your actual information. For example, change "your name" to "My name is Adedapo Adeniran", "your age" to "I am 29 years old, born in October 1994", etc.
- ML Model: The ML.NET model trained using the SdcaMaximumEntropy trainer for multiclass classification.
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License. For more details, please refer to the LICENSE file.
Contributions are welcome! Please feel free to submit a pull request or open an issue on GitHub.
For any questions or inquiries, please contact Brainydaps via GitHub.