NYC taxi fare prediction

[Figure: map of taxi drop-offs]

In this project you will participate in a Kaggle competition. If this is your first competition, don’t despair: thousands of other engineers have learned to excel in competitions such as this one, and a good ranking can lift your resume out of the noise and draw the attention of recruiters. In this class we will not judge your performance against the global Kaggle leaderboard; we will focus only on your performance relative to the other teams in this class.

Now, head over to the competition page and go over the description, the evaluation metrics and the getting started guide. You don’t need to develop your notebook in Kaggle (please note that Kaggle is owned by Google as well): you can download the data to your Google Drive and use Google’s Colab. To test the waters, you may want to run through some existing notebooks authored by other participants in the competition. These are already hosted in Kaggle (Notebooks tab) and run within the Kaggle runtime environment.

If you find that the Colab runtime environment runs out of memory when loading the competition dataset, please consider sampling the dataset down to a smaller number of rows. The original dataset has 55M rows and the starter code reduces this to 10M rows. Colab is constantly upgrading its hardware, so this may not be needed.
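One possible way to do the sampling without ever loading the full file into memory is reservoir sampling over the CSV rows. The sketch below is only an illustration and is not the starter code: the function name `sample_csv_rows`, the file names and the sample size are all assumptions made for this example. It uses only the Python standard library.

```python
import csv
import random


def sample_csv_rows(src_path, dst_path, k, seed=0):
    """Write a uniform random sample of up to k data rows to dst_path.

    Hypothetical helper for illustration. It uses reservoir sampling
    (Algorithm R), so at most k rows are held in memory at any time,
    however large the source file is. Returns the number of rows written.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    reservoir = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # keep the column names
        for i, row in enumerate(reader):
            if i < k:
                # Fill the reservoir with the first k rows.
                reservoir.append(row)
            else:
                # Replace a stored row with probability k / (i + 1).
                j = rng.randint(0, i)  # inclusive on both ends
                if j < k:
                    reservoir[j] = row
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(reservoir)
    return len(reservoir)
```

Assuming the training file has been downloaded as `train.csv`, a call such as `sample_csv_rows("train.csv", "train_sample.csv", 1_000_000)` would write a one-million-row sample that can then be loaded in the notebook. If a random sample is not required, a simpler alternative is the `nrows` parameter of pandas `read_csv`, which reads only the first N rows of the file.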

You may find the visualizations and data exploration shown here useful.