Commit 0a06023

author
Tania Allard
committed
Add solutions
1 parent 03cbe8e commit 0a06023

4 files changed

Lines changed: 86 additions & 8 deletions

03_ProcessData.ipynb

Lines changed: 36 additions & 8 deletions
@@ -8,7 +8,7 @@
     }
    },
    "source": [
-    "# Writing the code to perform the analysis\n",
+    "# Adding the code that performs the analysis\n",
     "\n",
     "Now for the real work: writing the code that will perform our analysis.\n",
     "\n",
@@ -22,17 +22,45 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
    "source": [
     "Don't worry, you do not have to write all of the scripts: we have provided a base for you to start working from.\n",
-    "You should now have a directory called `SupportScripts`"
+    "You should now have a directory called `SupportScripts`\n",
+    "\n",
+    "Make sure the scripts are in the appropriate directories inside your newly created project:\n",
+    "- Notebooks\n",
+    "- src/data\n",
+    "- src/visualization\n",
+    "\n",
+    "Once this is done, commit your changes to git:\n",
+    "```bash\n",
+    "$ git add .\n",
+    "$ git commit -m \"Add processing scripts\"\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Documentation\n",
+    "\n",
+    "Documentation is an important part of a reproducible workflow.\n",
+    "\n",
+    "Take 5 minutes to identify which scripts/notebooks have the best documentation. What makes it good documentation?\n",
+    "\n",
+    "A good starting point is the [Google Python style guidelines](https://google.github.io/styleguide/pyguide.html#Comments)"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {
     "slideshow": {
-     "slide_type": "-"
+     "slide_type": "slide"
     }
    },
    "source": [
@@ -47,11 +75,11 @@
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
-   "source": []
+   "source": [
+    "![](./assets/dates_ISO.png)"
+   ]
   },
   {
    "cell_type": "code",
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+#!/usr/bin/env python
+
+import sys
+import datetime
+
+import pandas as pd
+import numpy as np
+import matplotlib.pyplot as plt
+
+
+def process_data_GBP(filename):
+    """
+    Get only the needed subset from the data.
+    Args:
+        filename: str
+            Path to the file containing the wine data
+
+    Returns:
+
+        fname: str
+            Path to the created data set
+    """
+
+    # Load the table; stop with a message if the file cannot be read
+    try:
+        wine = pd.read_csv(filename)
+    except IOError:
+        sys.exit('That file does not seem to exist')
+
+    # Subset of data to keep
+    wine_keep = wine.loc[:, ['country', 'designation', 'points', 'price']]
+
+    # Add column with prices in GBP (vectorised multiplication)
+    wine_keep['price_GBP'] = wine_keep['price'] * 1.2
+
+    # Construct the output file name, prefixed with today's ISO date
+    today = datetime.datetime.today().strftime('%Y-%m-%d')
+    fname = f'data/interim/{today}-winemag_priceGBP.csv'
+
+    # Save the csv
+    wine_keep.to_csv(fname)
+
+    return fname
+
+
+if __name__ == '__main__':
+    filename = sys.argv[1]
+    print(filename)
+    print(process_data_GBP(filename))
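
The script above prefixes its output file with today's date in `YYYY-MM-DD` form. As a rough sketch of why that naming is convenient, the snippet below builds the same kind of path and shows that ISO-dated names sort chronologically as plain strings. The helper name `dated_filename` and its parameters are hypothetical and not part of this commit; the docstring follows the Google style linked in the notebook.

```python
import datetime


def dated_filename(stem, day=None, folder="data/interim"):
    """Build a CSV path prefixed with an ISO 8601 date.

    Args:
        stem: Descriptive part of the file name, without extension.
        day: A datetime.date; defaults to today's date.
        folder: Directory prefix (assumed to exist already).

    Returns:
        The path as a string, in the form '<folder>/<YYYY-MM-DD>-<stem>.csv'.
    """
    if day is None:
        day = datetime.date.today()
    # date.isoformat() yields YYYY-MM-DD, the same text as strftime('%Y-%m-%d')
    return f"{folder}/{day.isoformat()}-{stem}.csv"


# ISO dates sort chronologically even when compared as plain strings
names = [
    dated_filename("winemag_priceGBP", datetime.date(2018, 11, 2)),
    dated_filename("winemag_priceGBP", datetime.date(2018, 3, 15)),
]
print(sorted(names))
```

Passing the date explicitly, rather than always reading the clock inside the function, also makes the naming step easy to check in a test.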

solutions/test_workflow.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+#!/usr/bin/python
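
In this commit `solutions/test_workflow.py` contains only a shebang. As a hedged sketch of what a first test for the price conversion could look like, the snippet below checks a small pure function using plain `assert` statements, which a runner such as pytest could collect. The helper `to_gbp` is hypothetical: it only mirrors the `1.2` factor used in the processing script, and the commit does not say which test runner is intended.

```python
import math


def to_gbp(price, rate=1.2):
    """Hypothetical helper mirroring the factor applied in the processing script."""
    return price * rate


def test_scales_by_rate():
    # Compare floats with a tolerance rather than with ==
    assert math.isclose(to_gbp(10.0), 12.0)


def test_zero_stays_zero():
    assert to_gbp(0.0) == 0.0


def test_custom_rate():
    assert math.isclose(to_gbp(4.0, rate=0.5), 2.0)
```

Keeping the conversion in a small function that does not touch the file system means it can be tested without a copy of the wine data set.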
