STAT5003

UK Accidents 10 years history with many variables

Dataset from Kaggle.

The original data comes as three large tables with 1.6 million, 3 million, and 2.2 million rows and 32, 22, and 15 features, respectively. The Vehicles table is first merged with Accidents on “Accident_Index”, and the result is then merged with Casualties on (“Accident_Index”, “Vehicle_Reference”). The merged dataset has 3.3 million rows and 66 columns. “Accident_Severity” is the target variable to predict.
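
The two-step merge described above can be sketched with pandas. This is a minimal illustration, not the project's actual code: the tiny frames and the columns "Vehicle_Type" and "Casualty_Class" are made-up stand-ins, and the inner join is an assumption, since the join type is not stated here.

```python
import pandas as pd

# Tiny made-up stand-ins for the three tables (values are illustrative only).
accidents = pd.DataFrame({
    "Accident_Index": ["A1", "A2"],
    "Accident_Severity": [3, 2],
})
vehicles = pd.DataFrame({
    "Accident_Index": ["A1", "A1", "A2"],
    "Vehicle_Reference": [1, 2, 1],
    "Vehicle_Type": [9, 9, 1],
})
casualties = pd.DataFrame({
    "Accident_Index": ["A1", "A1", "A2"],
    "Vehicle_Reference": [1, 1, 1],
    "Casualty_Class": [1, 2, 3],
})

# Step 1: attach accident-level fields to every vehicle record.
# how="inner" is an assumption; the join type is not given in this README.
vehicles_accidents = vehicles.merge(accidents, on="Accident_Index", how="inner")

# Step 2: attach casualties, matched on both the accident and the vehicle.
merged = vehicles_accidents.merge(
    casualties, on=["Accident_Index", "Vehicle_Reference"], how="inner"
)

# Separate the target from the predictors.
X = merged.drop(columns=["Accident_Severity"])
y = merged["Accident_Severity"]
```

Because one vehicle can carry several casualties, the second merge can produce more rows than the Vehicles table has, which would be consistent with the merged row count exceeding 3 million.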

Several models were trained, including random forest, bagged decision trees, XGBoost, and naive Bayes.
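
As a rough sketch of how such a comparison might be set up, the snippet below fits three of these model families with scikit-learn on synthetic data. Everything here is an assumption for illustration: the data comes from `make_classification` rather than the accident tables, the hyperparameters are arbitrary, and the original analysis may have used a different language or library. XGBoost is left out only to keep the example's dependencies to scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the merged accident table.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters are arbitrary illustrative choices, not the project's settings.
models = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "bagged_trees": BaggingClassifier(
        DecisionTreeClassifier(), n_estimators=20, random_state=0
    ),
    "naive_bayes": GaussianNB(),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
```

Accuracy alone can be misleading when severity classes are imbalanced, so per-class metrics may be worth reporting alongside it.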


