Distilling the lessons learnt from over a hundred Data Science intern applications where the majority of the candidates were not selected for interview. The focus of this article is on what and how to communicate rather than what technical skills to acquire.
For avid users of Jupyter Notebook and/or JupyterLab, this guide shows how to start a Jupyter server automatically, so that you can select your desired Python environment from the web interface and start coding.
If, like me, your typical workflow is to start a terminal and activate your desired Python virtual environment before running jupyter notebook or jupyterlab, then this guide should provide an easier and more integrated experience. …
Flight tickets state the departure and arrival dates and times in their respective local time zones, which can make working out how long the flight is a bit tricky. In this article, the Delorean library is introduced and, along with it, datetime objects with time zone information are manipulated. As an example, the flight time between London Heathrow, UK and Kuala Lumpur, Malaysia is calculated.
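The time zone arithmetic described above can be sketched with the standard library alone. This is not the article's Delorean code: it swaps in plain datetime objects with fixed UTC offsets, and the ticket times, the offsets (London on summer time at UTC+1, Kuala Lumpur at UTC+8) and the helper name flight_duration are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets, assumed for illustration only.
# A real application would use a time zone database so that
# daylight saving changes are handled automatically.
LONDON_BST = timezone(timedelta(hours=1), "BST")
KUALA_LUMPUR = timezone(timedelta(hours=8), "MYT")


def flight_duration(departure: datetime, arrival: datetime) -> timedelta:
    """Return the elapsed time between two timezone-aware datetimes."""
    if departure.tzinfo is None or arrival.tzinfo is None:
        raise ValueError("both datetimes must carry time zone information")
    # Subtracting aware datetimes compares the underlying instants,
    # so the differing local offsets are accounted for.
    return arrival - departure


# Hypothetical ticket times, each in the local time of its airport.
departure = datetime(2020, 7, 1, 21, 0, tzinfo=LONDON_BST)
arrival = datetime(2020, 7, 2, 17, 45, tzinfo=KUALA_LUMPUR)

duration = flight_duration(departure, arrival)
print(duration)  # 13:45:00
```

Naively subtracting the printed local times would give 20 hours 45 minutes; attaching the offsets first removes the seven-hour difference between the two airports.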
Kaggle Grandmasters tackle character recognition for the fifth most popular native language, how to choose the right colour scheme for your plot and useful Python newsletters.
The Nvidia Machine Learning (ML) Grandmasters took part in the Kaggle competition for the character recognition of the world’s fifth most popular native language, Bengali. The team addresses some of the “unwritten” rules of the language in tuning their models. Rather than relying on either the default colour scheme from your favourite plotting library or personal preference, you can find a more scientific basis in the Colorcet library. …
python functions to prepare for “Big Data” processing, the advantages of Nomad over Kubernetes and Visual Studio Code’s next-generation code completion for Python.
Links to three interesting and wide-ranging topics on Data Science, starting with understanding key features of python as a means of preparing to process “Big Data” with libraries such as Dask and PySpark. Next is Nomad as an alternative to Kubernetes as a container orchestrator, which covers the full range of Data Science activities from Data Ingestion to Model deployment to Data Visualisation. …
Selecting articles from a Pocket list curated over the last 5 years on Data Science.
Utilising a wide range of articles saved on Pocket and their application to Data Science.
The purpose of the Data Science Shorts series is to summarise articles from a vast library I’ve stored and tracked over the past 5 years using Pocket, a service to curate and collect interesting reads on the internet. A similar capability is provided by Instapaper; both services store links and copies of articles and provide a unified interface for reading them again later.
Additional plugins for git to make using repositories more manageable for Data Science.
Introduction to the open source git-extras package, its installation on an Ubuntu (GNU/Linux) environment and its application in general and for Data Science.
It’s common practice for most Data Scientists to interact with Version Control Systems (VCS) like Git. Beyond the basic commands like log, you might wonder what else you need. …