Commit e6e16f6

committed May 31, 2020
Initial Commit
1 parent 7a9faa1 commit e6e16f6

3 files changed: +100 -0 lines changed


‎.gitignore

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# Ignore Jupyter checkpoint folders (.ipynb_checkpoints)
+*/.ipynb*
+*Untitled*
+

‎notebook/data/people.json

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+{"name":"Prashant"}
+{"name":"Abdul", "age":30}
+{"name":"Justin", "age":19}
+{"name":"Andy", "age":43}
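
The data file above holds one JSON object per line, and the first record has no `age` key. The following is a minimal standard-library sketch, not Spark itself, of why the notebook output later shows `null` for that row: the column set is the union of keys across all records, and a record that lacks a key gets an empty value. The names `records`, `columns`, and `table` are illustrative and not part of the commit; sorting the columns by name is an assumption chosen to match the `age`, `name` order in the notebook output.

```python
import json

# The four records from notebook/data/people.json, one JSON object per line.
records = """\
{"name":"Prashant"}
{"name":"Abdul", "age":30}
{"name":"Justin", "age":19}
{"name":"Andy", "age":43}
"""

# Parse each non-empty line as an independent JSON document.
rows = [json.loads(line) for line in records.splitlines() if line.strip()]

# The column set is the union of keys over all rows, sorted by name.
columns = sorted({key for row in rows for key in row})

# A key that is absent from a row becomes None (shown as null by Spark).
table = [[row.get(col) for col in columns] for row in rows]

print(columns)
for values in table:
    print(values)
```

Run on its own, this prints the column list followed by one list per record, with `None` in the `age` position for the first record.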

‎notebook/pyspark-test.ipynb

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## How to run pyspark in Anaconda Jupyter Notebook"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    " 1. Set the following environment variables\n",
+    " ```bat\n",
+    " setx HADOOP_HOME C:\\demo\\hadoop27\n",
+    " setx SPARK_HOME C:\\demo\\spark3\n",
+    " setx PYSPARK_PYTHON C:\\Users\\prashant\\Anaconda3\\python \n",
+    " \n",
+    " ```\n",
+    " 2. Start Anaconda Prompt\n",
+    " 3. Install findspark using the following command\n",
+    " ```bat \n",
+    " python -m pip install findspark\n",
+    " \n",
+    " ```\n",
+    " 4. Start Jupyter Notebook"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import findspark\n",
+    "findspark.init()\n",
+    "import pyspark\n",
+    "from pyspark.sql import SparkSession"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "+----+--------+\n",
+      "| age|    name|\n",
+      "+----+--------+\n",
+      "|null|Prashant|\n",
+      "|  30|   Abdul|\n",
+      "|  19|  Justin|\n",
+      "|  43|    Andy|\n",
+      "+----+--------+\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "spark = SparkSession.builder.getOrCreate()\n",
+    "spark.read.json(\"C:/demo/notebook/data/people.json\").show()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
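
The notebook's setup step relies on three environment variables set with `setx`, which only affects processes started afterwards, so a Jupyter session launched from an already-open prompt may not see them. The following is a minimal sketch of a check that could run in a cell before `findspark.init()`. The helper name `missing_spark_vars` and the constant `REQUIRED_VARS` are hypothetical and not part of the commit; only the three variable names come from the notebook.

```python
import os

# Variable names taken from the notebook's setup step.
REQUIRED_VARS = ("HADOOP_HOME", "SPARK_HOME", "PYSPARK_PYTHON")


def missing_spark_vars(environ=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


# Report the result for the current process environment.
missing = missing_spark_vars()
if missing:
    print("Not set in this session:", ", ".join(missing))
else:
    print("All Spark-related variables are set.")
```

If the check reports missing names, opening a new Anaconda Prompt and restarting Jupyter from it is the likely remedy, since the new process picks up the values written by `setx`.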
