Commit 61efa6a

Add files via upload
1 parent cdb8001 commit 61efa6a

9 files changed (+6124, -0 lines)

01-Preliminaries.ipynb

+190
@@ -0,0 +1,190 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"<small><i>This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com). Source and license info is on [GitHub](https://github.com/jakevdp/sklearn_tutorial/).</i></small>"
8+
]
9+
},
10+
{
11+
"cell_type": "markdown",
12+
"metadata": {},
13+
"source": [
14+
"# An Introduction to scikit-learn: Machine Learning in Python"
15+
]
16+
},
17+
{
18+
"cell_type": "markdown",
19+
"metadata": {},
20+
"source": [
21+
"## Goals of this Tutorial"
22+
]
23+
},
24+
{
25+
"cell_type": "markdown",
26+
"metadata": {},
27+
"source": [
28+
"- **Introduce the basics of Machine Learning**, and some skills useful in practice.\n",
29+
"- **Introduce the syntax of scikit-learn**, so that you can make use of the rich toolset available."
30+
]
31+
},
32+
{
33+
"cell_type": "markdown",
34+
"metadata": {},
35+
"source": [
36+
"## Schedule:"
37+
]
38+
},
39+
{
40+
"cell_type": "markdown",
41+
"metadata": {},
42+
"source": [
43+
"**Preliminaries: Setup & introduction** (15 min)\n",
44+
"* Making sure your computer is set up\n",
45+
"\n",
46+
"**Basic Principles of Machine Learning and the Scikit-learn Interface** (45 min)\n",
47+
"* What is Machine Learning?\n",
48+
"* Machine learning data layout\n",
49+
"* Supervised Learning\n",
50+
" - Classification\n",
51+
" - Regression\n",
52+
" - Measuring performance\n",
53+
"* Unsupervised Learning\n",
54+
" - Clustering\n",
55+
" - Dimensionality Reduction\n",
56+
" - Density Estimation\n",
57+
"* Evaluation of Learning Models\n",
58+
"* Choosing the right algorithm for your dataset\n",
59+
"\n",
60+
"**Supervised learning in-depth** (1 hr)\n",
61+
"* Support Vector Machines\n",
62+
"* Decision Trees and Random Forests\n",
63+
"\n",
64+
"**Unsupervised learning in-depth** (1 hr)\n",
65+
"* Principal Component Analysis\n",
66+
"* K-means Clustering\n",
67+
"* Gaussian Mixture Models\n",
68+
"\n",
69+
"**Model Validation** (1 hr)\n",
70+
"* Validation and Cross-validation"
71+
]
72+
},
73+
{
74+
"cell_type": "markdown",
75+
"metadata": {},
76+
"source": [
77+
"## Preliminaries"
78+
]
79+
},
80+
{
81+
"cell_type": "markdown",
82+
"metadata": {},
83+
"source": [
84+
"This tutorial requires the following packages:\n",
85+
"\n",
86+
"- Python version 2.7 or 3.4+\n",
87+
"- `numpy` version 1.8 or later: http://www.numpy.org/\n",
88+
"- `scipy` version 0.15 or later: http://www.scipy.org/\n",
89+
"- `matplotlib` version 1.3 or later: http://matplotlib.org/\n",
90+
"- `scikit-learn` version 0.15 or later: http://scikit-learn.org\n",
91+
"- `ipython`/`jupyter` version 3.0 or later, with notebook support: http://ipython.org\n",
92+
"- `seaborn` version 0.5 or later (used mainly for plot styling)\n",
93+
"\n",
94+
"The easiest way to get these is to use the [conda](http://store.continuum.io/) environment manager.\n",
95+
"I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).\n",
96+
"\n",
97+
"The following command will install all required packages:\n",
98+
"```\n",
99+
"$ conda install numpy scipy matplotlib scikit-learn seaborn ipython-notebook\n",
100+
"```\n",
101+
"\n",
102+
"Alternatively, you can download and install the (very large) Anaconda software distribution, found at https://store.continuum.io/."
103+
]
104+
},
105+
{
106+
"cell_type": "markdown",
107+
"metadata": {},
108+
"source": [
109+
"### Checking your installation\n",
110+
"\n",
111+
"You can run the following code to check the versions of the packages on your system:\n",
112+
"\n",
113+
"(In the IPython/Jupyter notebook, press `shift` and `return` together to execute the contents of a cell.)"
114+
]
115+
},
116+
{
117+
"cell_type": "code",
118+
"execution_count": 1,
119+
"metadata": {},
120+
"outputs": [
121+
{
122+
"name": "stdout",
123+
"output_type": "stream",
124+
"text": [
125+
"IPython: 6.2.1\n",
126+
"numpy: 1.13.3\n",
127+
"scipy: 1.0.0\n",
128+
"matplotlib: 2.2.2\n",
129+
"scikit-learn: 0.19.1\n"
130+
]
131+
}
132+
],
133+
"source": [
134+
"from __future__ import print_function\n",
135+
"\n",
136+
"import IPython\n",
137+
"print('IPython:', IPython.__version__)\n",
138+
"\n",
139+
"import numpy\n",
140+
"print('numpy:', numpy.__version__)\n",
141+
"\n",
142+
"import scipy\n",
143+
"print('scipy:', scipy.__version__)\n",
144+
"\n",
145+
"import matplotlib\n",
146+
"print('matplotlib:', matplotlib.__version__)\n",
147+
"\n",
148+
"import sklearn\n",
149+
"print('scikit-learn:', sklearn.__version__)"
150+
]
151+
},
152+
{
153+
"cell_type": "markdown",
154+
"metadata": {},
155+
"source": [
156+
"## Useful Resources"
157+
]
158+
},
159+
{
160+
"cell_type": "markdown",
161+
"metadata": {},
162+
"source": [
163+
"- **scikit-learn:** http://scikit-learn.org (see especially the narrative documentation)\n",
164+
"- **matplotlib:** http://matplotlib.org (see especially the gallery section)\n",
165+
"- **Jupyter:** http://jupyter.org (also check out http://nbviewer.jupyter.org)"
166+
]
167+
}
168+
],
169+
"metadata": {
170+
"kernelspec": {
171+
"display_name": "Python [default]",
172+
"language": "python",
173+
"name": "python3"
174+
},
175+
"language_info": {
176+
"codemirror_mode": {
177+
"name": "ipython",
178+
"version": 3
179+
},
180+
"file_extension": ".py",
181+
"mimetype": "text/x-python",
182+
"name": "python",
183+
"nbconvert_exporter": "python",
184+
"pygments_lexer": "ipython3",
185+
"version": "3.6.6"
186+
}
187+
},
188+
"nbformat": 4,
189+
"nbformat_minor": 1
190+
}
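
The code cell in this notebook only prints the installed versions; it does not compare them against the minimums listed under "Preliminaries". A minimal sketch of such a comparison is below. It is not part of the commit: the names `MINIMUMS`, `version_tuple`, and `check_requirements` are hypothetical, the version parsing is deliberately simplistic (it only reads leading integers), and it assumes Python 3.8 or later for `importlib.metadata`, so it will not run on the Python 2.7 setups the notebook also supports.

```python
import re
from importlib import metadata  # standard library, Python 3.8+

# Minimum versions copied from the requirements list in the notebook.
# The keys are distribution names as they would be installed (assumption).
MINIMUMS = {
    "numpy": "1.8",
    "scipy": "0.15",
    "matplotlib": "1.3",
    "scikit-learn": "0.15",
    "ipython": "3.0",
    "seaborn": "0.5",
}


def version_tuple(version):
    """Turn '1.13.3' or '0.19.1rc1' into a 3-tuple of integers, zero-padded."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return {package: status}; status is 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        is_ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = "ok" if is_ok else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Reading versions through `importlib.metadata` avoids importing each package just to inspect `__version__`, which keeps the check quick and lets it report a missing package instead of raising `ImportError`.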

02.1-Machine-Learning-Intro.ipynb

+580
Large diffs are not rendered by default.
