A Pythonic Journey

Background

One of the great things about Python is how easy it is to learn from traditional books, online materials, and websites. Here is the path that I have been on for the better part of a year.

Current Course

Harvard CS109 is a great course for learning more about Data Science and its supporting methods: data engineering, data analysis, and data visualization. Taking this course not only broadens your knowledge of Python, but also teaches you good patterns for applying the tools and solving real-world problems. I would highly recommend this one. The lectures and labs are available online from Harvard, but the homework assignments were only posted for the 2013 course.

This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to be able to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries.

Massive Open Online Courses (MOOCs)

This Coursera course is taught by Dr. Charles Severance from the University of Michigan and is based on the book Python for Informatics. This is a great place to start your journey in Python. The course covers the basics, from syntax and data types to conditionals. If you are new to programming or Python, I would highly recommend starting here. There are some data types, like tuples, and patterns, like how to iterate over a dictionary, that you can use over and over again in your journey. This course also covers some advanced topics, like how to work with an API and write data to a SQLite database.
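To give a flavor of those reusable patterns, here is a minimal sketch of my own (not taken from the course materials) that counts words with a dictionary, iterates over it, sorts the results as tuples, and stores them in an in-memory SQLite database. The function names, the sample sentence, and the `Counts` table are all made up for illustration.

```python
import sqlite3


def count_words(text):
    """Count how often each lowercase word appears in text."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def top_words(counts, n=3):
    """Return the n most frequent words as (count, word) tuples."""
    # Sort by descending count, then alphabetically to break ties.
    pairs = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(count, word) for word, count in pairs[:n]]


def save_counts(conn, counts):
    """Store the word counts in a hypothetical 'Counts' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Counts (word TEXT PRIMARY KEY, count INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO Counts (word, count) VALUES (?, ?)",
        list(counts.items()),
    )
    conn.commit()


sample = "the quick fox jumps over the lazy dog and the other fox"
counts = count_words(sample)

# Iterating over a dictionary: items() yields (key, value) tuples.
for word, count in counts.items():
    print(word, count)

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
save_counts(conn, counts)
rows = conn.execute(
    "SELECT word, count FROM Counts ORDER BY count DESC, word LIMIT 2"
).fetchall()
print(rows)
```

Swapping `":memory:"` for a file name would persist the table to disk instead.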

Although this course is titled introduction, it is certainly not an introductory course. You should have a firm grasp of working with data, cleaning data, and processing data before taking it. The best part of this course was the later weeks, where the statistical concepts were introduced, using Python and pandas to summarize the data and perform more complex operations like t-tests.
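For readers who have not met a t-test yet, the idea can be sketched with nothing but the standard library. This is my own illustration, not the course's code: the course works with pandas, and in practice a library routine such as `scipy.stats.ttest_ind` would also give you a p-value. The sketch below only computes Welch's t statistic for two made-up samples; `welch_t`, `group_a`, and `group_b` are hypothetical names.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples."""
    mean_a, mean_b = mean(sample_a), mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = variance(sample_a), variance(sample_b)
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Made-up measurements for two groups.
group_a = [2, 4, 6, 8]
group_b = [1, 2, 3, 4]

t_stat = welch_t(group_a, group_b)
print(f"means: {mean(group_a)} vs {mean(group_b)}, t = {t_stat:.3f}")
```

A larger absolute t value means the difference between the group means is large relative to the variation within the groups.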

Once you have the basics under your belt, this is a good second step. It follows a use-case-style approach and puts a heavy emphasis on building your code with functions for repeatability and extensibility.

This edX course is one of the first courses I took that really emphasised the algorithms and math used in Data Science. After a couple of weeks introducing some basic concepts and the numpy library, it dives into use cases such as translating a DNA sequence and correlating whiskies based on their flavor profiles.
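The DNA use case is a nice example of how far a dictionary lookup can take you. The following is a rough sketch of the idea, not the course's solution: it reads a sequence three letters at a time and looks each codon up in a table. `CODON_TABLE` here is a deliberately tiny subset of the standard genetic code, with `_` as my own placeholder for a stop codon; a real table has all 64 codons.

```python
# A small subset of the standard genetic code, for illustration only.
CODON_TABLE = {
    "ATG": "M", "TTT": "F", "TTC": "F", "GGC": "G",
    "GCT": "A", "AAA": "K", "TGG": "W",
    "TAA": "_", "TAG": "_", "TGA": "_",  # stop codons
}


def translate(dna):
    """Translate a DNA string into a string of one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    protein = []
    # Step through the sequence one codon (three bases) at a time.
    for start in range(0, len(dna), 3):
        codon = dna[start:start + 3]
        # Raises KeyError for codons missing from this partial table.
        protein.append(CODON_TABLE[codon])
    return "".join(protein)


print(translate("ATGTTTGGCAAATGGTAA"))
```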

Not directly related to Data Science or Machine Learning, but courses that get back to the fun!
- Coursera: An Introduction to Interactive Programming in Python (Part 1)
- Coursera: An Introduction to Interactive Programming in Python (Part 2)

Books and other Learning Material

Python for Informatics - This is the book that the Coursera course Python for Everybody uses.

Think Python 2e - How to Think Like a Computer Scientist

Anaconda - Open Data Science Platform - Download

A few words about markdown - Using GitHub Flavored Markdown

Future Courses

Andrew Ng's course built the curriculum for teaching Machine Learning. https://www.coursera.org/learn/machine-learning

In this module, we introduce the core idea of teaching a computer to learn concepts using data—without being explicitly programmed.
