Your approach to dealing with the unbalanced dataset is pretty good but there is a subtelty which may be missed.

It may be worth emphasising that the process you took to create a new dataset had a 50:50 split between resignations and not rather than the original 6% and therefore you didn't need to take a stratified splitting of the data.

Data Scientist and Chartered Aeronautical Engineer (MEng CEng EUR ING MRAeS) with over 15 years experience in the Aerospace, Defence and Rail Industry.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Ashraf Miah

Data Scientist and Chartered Aeronautical Engineer (MEng CEng EUR ING MRAeS) with over 15 years experience in the Aerospace, Defence and Rail Industry.