As a developer, I use Vim as my code editor for personal projects and Visual Studio Code at work. Occasionally I use Emacs because I really like Org mode, especially Org Babel for reproducible research and literate programming. It provides a computing environment for authoring documents that mix natural language and computer language.
For those who are not familiar with literate programming, here is the definition from Wikipedia:
Literate programming is a programming paradigm introduced by Donald Knuth in which a computer program is given an explanation of its logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which compilable source code can be generated.
For data science projects, I use Jupyter notebooks for exploratory data analysis, data preparation and wrangling, and training and testing models, eventually arriving at the final model for production. During this process, the code I write in the notebooks sometimes needs to be extracted and packaged as a Python module or library for reuse.
I have been using the fast.ai library in a few of my projects. It is a cool library that makes machine learning accessible and easy for everyone to get started with. The fast.ai library itself is developed through a notebook approach using nbdev. After watching a recent video by Jeremy, I decided to try this approach.
Python Programming using nbdev
nbdev is a library that lets you develop a Python library in Jupyter notebooks, keeping all your code, tests and documentation in one place. Detailed instructions and tutorials are provided on the nbdev website, so I will not repeat them here.
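To make the notebook-to-library idea more concrete, here is a minimal sketch of the general technique: walk over the cells of a notebook and collect only the code cells that are flagged for export. This is not nbdev's actual implementation. The `EXPORT_MARKER` value, the `exported_source` function and the `demo_notebook` data are all assumptions made up for this illustration, and the real directives are described in the nbdev documentation.

```python
# Illustrative sketch only: a toy version of "export marked notebook cells
# to a module". The marker and all names here are hypothetical.

EXPORT_MARKER = "#export"  # assumed flag placed on the first line of a cell


def exported_source(notebook: dict) -> str:
    """Return the joined source of every code cell that starts with the marker."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown cells are documentation, not module code
        source = cell.get("source", [])
        # The notebook JSON stores source as a list of lines or a single string.
        text = "".join(source) if isinstance(source, list) else source
        lines = text.splitlines()
        if lines and lines[0].strip() == EXPORT_MARKER:
            # Drop the marker line and keep the rest of the cell.
            chunks.append("\n".join(lines[1:]).strip())
    return "\n\n".join(chunks) + "\n"


# A tiny in-memory notebook: one prose cell, one exported cell, one test cell.
demo_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Helper functions\n"]},
        {
            "cell_type": "code",
            "source": ["#export\n", "def add(a, b):\n", "    return a + b\n"],
        },
        {"cell_type": "code", "source": ["assert add(1, 2) == 3\n"]},
    ]
}

module_text = exported_source(demo_notebook)
print(module_text)
```

In this sketch the unmarked test cell stays in the notebook and never reaches the generated module, which is the essence of keeping code, tests and documentation side by side while still shipping a clean library.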
For my experiment, I used the nbdev_new CLI command to create my library. Alternatively, you can clone the nbdev_template repository. I rewrote the generic data ingestion routines that I talked about in the previous article, adding support for CSV and MySQL. You can find the notebooks in the following repository:
Data engineering and data science library developed using Jupyter notebook through literate programming. …
The idea is simple. As you can see, I am using several custom annotations in my notebook…