We name the proposed method pylogsentiment.
To set up the pylogsentiment tool, follow these steps.
- Clone the repository:
git clone https://github.com/studiawan/pylogsentiment.git
- Change directory to pylogsentiment:
cd pylogsentiment
- Create a virtual environment using Anaconda:
conda create --name pylogsentiment python=3.5
and then activate it:
conda activate pylogsentiment
If you do not have Anaconda, use any other tool to create a virtual environment. We highly recommend installing pylogsentiment in a virtual environment.
- Install pylogsentiment:
pip install -e .
Before running pylogsentiment, download the model file and the word index file. Both files are included when you download the datasets using the megadl command. Please read here for instructions.
Note that both files should be placed in the pylogsentiment/datasets/ directory.
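Before running the tool, it can help to confirm that both files are in place. The following is a minimal sketch and not part of pylogsentiment. The two file names are placeholders, since the actual names depend on what the download provides.

```python
from pathlib import Path


def missing_files(directory, required_names):
    """Return the names in required_names that are not regular files in directory."""
    base = Path(directory)
    return [name for name in required_names if not (base / name).is_file()]


# Hypothetical file names: substitute the real model and word index file names.
required = ["model-file.hdf5", "word-index-file.pickle"]
missing = missing_files("pylogsentiment/datasets", required)
if missing:
    print("Missing from pylogsentiment/datasets/:", ", ".join(missing))
else:
    print("Both files found.")
```

Run it from the project root so that the relative path pylogsentiment/datasets resolves correctly.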
To run pylogsentiment, type the command:
python pylogsentiment/pylogsentiment.py -i log_file.log -o results_file.csv
where log_file.log is the input log file and results_file.csv is the CSV file that receives the anomaly detection results.
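To process several log files in one go, the command above can be wrapped in a short script. This is a sketch under assumptions, not a feature of pylogsentiment: the helper names `build_command` and `run_on_directory` are hypothetical, and only the `-i` and `-o` options shown above are used. It uses `sys.executable` instead of `python` so that the interpreter of the active virtual environment is the one invoked.

```python
import subprocess
import sys
from pathlib import Path


def build_command(log_file, results_file):
    """Build the pylogsentiment command line shown above for one log file."""
    return [
        sys.executable,
        "pylogsentiment/pylogsentiment.py",
        "-i", str(log_file),
        "-o", str(results_file),
    ]


def run_on_directory(log_dir, output_dir):
    """Run pylogsentiment on every *.log file in log_dir, writing one CSV per log."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for log_file in sorted(Path(log_dir).glob("*.log")):
        results_file = out / (log_file.stem + ".csv")
        # check=True raises CalledProcessError at the first failing run.
        subprocess.run(build_command(log_file, results_file), check=True)
```

From the project root, a call such as `run_on_directory("logs", "results")` would process every .log file in a directory named logs; both directory names here are examples only.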
Follow the instructions here: Download the datasets
If you want to build the ground truth on your own, follow these steps. In the project root directory, run the script groundtruth.py followed by the dataset name. Dataset names include casper-rw, dfrws-2009-jhuisi, dfrws-2009-nssal, and honeynet-challenge7. For example:
python pylogsentiment/groundtruth/groundtruth.py casper-rw
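To build the ground truth for all of the dataset names listed above, the same command can be repeated in a loop. The sketch below assumes the datasets have already been downloaded and that it is run from the project root. The helper names `groundtruth_command` and `build_all` are hypothetical.

```python
import subprocess
import sys

# Dataset names as listed above.
DATASETS = [
    "casper-rw",
    "dfrws-2009-jhuisi",
    "dfrws-2009-nssal",
    "honeynet-challenge7",
]


def groundtruth_command(dataset):
    """Build the groundtruth.py command line for one dataset name."""
    return [sys.executable, "pylogsentiment/groundtruth/groundtruth.py", dataset]


def build_all(datasets=DATASETS):
    """Run groundtruth.py once per dataset, stopping at the first failure."""
    for dataset in datasets:
        subprocess.run(groundtruth_command(dataset), check=True)
```

Calling `build_all()` runs the script once for each name. Pass a shorter list to rebuild only selected datasets.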
To train your own model, first download the datasets and build the ground truth as described above. Next, download and extract the GloVe word embeddings as described here. Then run this command:
python pylogsentiment/experiment/experiment.py pylogsentiment
The final model is saved to the file datasets/best-model-pylogsentiment.hdf5.