Data analysis with scikit-learn and pandas

Using scikit-learn to mine a small dataset of clinical exams.

Download the 64-bit Miniconda installer for Python 3.6.

Installation

Install Miniconda as the current user (no sudo).

cd ~/Downloads/
bash Miniconda3-latest-Linux-x86_64.sh
# reload env (or restart terminal)
source ~/.bashrc

This toolset is my only use for Anaconda, so everything is installed in the default environment.

Install all the tools.

# this will also install numpy
conda install scikit-learn
conda install ipython
conda install matplotlib
# this will also install pandas
conda install seaborn
conda install jupyter
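
To confirm that the packages above ended up in the default environment, a quick check like the following can help. This is only a sketch: the list of import names is my assumption about what the commands above install (scikit-learn is imported as `sklearn`, and the Jupyter notebook server as `notebook`), and `check_packages` is a helper name made up for this example.

```python
import importlib

# Import names assumed to correspond to the conda packages installed above.
# Note that scikit-learn is imported as "sklearn".
PACKAGES = ["numpy", "sklearn", "pandas", "matplotlib", "seaborn", "IPython", "notebook"]


def check_packages(names):
    """Map each import name to its version string, or to None if the import fails."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module exposes __version__, so fall back to a placeholder.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in check_packages(PACKAGES).items():
        print(f"{name:12s} {version if version is not None else 'MISSING'}")
```

Save it to a file and run it with the `python` from the Miniconda install. Any line marked MISSING points at a package that still needs a `conda install`.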

Workspace

Create a folder for notebooks and data files, then start Jupyter.

mkdir -p ~/datascience/tesigiulia
conda activate
# use $HOME here: the shell does not expand ~ inside quotes
jupyter notebook --notebook-dir="$HOME/datascience/tesigiulia"

A new browser window will then open at http://localhost:8888/tree
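
If the browser window does not open by itself, it can be useful to check from another terminal whether the server is answering at all. Below is a minimal sketch using only the standard library. It assumes the default address shown above, and `server_is_up` is a hypothetical helper name, not part of Jupyter.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def server_is_up(url="http://localhost:8888/tree", timeout=2.0):
    """Return True if something answers HTTP requests at url, False otherwise."""
    try:
        with urlopen(url, timeout=timeout):
            return True
    except HTTPError:
        # An HTTP error status (e.g. 403) still means a server is listening.
        return True
    except (URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    print("notebook server reachable:", server_is_up())
```

A True result only says that some HTTP server is listening at that address. It does not prove that it is this particular notebook instance.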

Notebook