What kind of a comparison or benchmark doesn't state what hardware was used? You state `pandas` single core usage is it's limiting factor but most of the issues you've encounter are memory based.

You don't make clear where `pandas` was being run. Locally on a 8, 16, GB laptop or a remote server? Or a free environment like Google Colab?

The comparison isn't useful for practical purposes for two reasons:

1) What was the loading time to get the data into pandas Vs gigasheet?

2) Gigasheet may have run on much more capable hardware Vs local laptop -, then wouldn't you expect this result?

Data Scientist and Chartered Aeronautical Engineer (MEng CEng EUR ING MRAeS) with over 15 years experience in the Aerospace, Defence and Rail Industry.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Ashraf Miah

Data Scientist and Chartered Aeronautical Engineer (MEng CEng EUR ING MRAeS) with over 15 years experience in the Aerospace, Defence and Rail Industry.