@@ -3,17 +3,19 @@ title: Data Ingest and Visualization - Matplotlib and Pandas
33teaching : 20
44exercises : 25
55questions :
6- - " What other tools can I use to create plots apart from ggplot? "
7- - " Why should I use Python to create plots?"
6+ - " What other tools can I use to create plots apart from ggplot?"
7+ - " Why should I use Python to create plots?"
88objectives :
9- - Import the pyplot toolbox to create figures in Python.
9+ - " Import the pyplot toolbox to create figures in Python."
10+ keypoints :
11+ - " FIXME"
1012---
1113
1214
1315## Putting it all together
1416
1517Up to this point, we have walked through tasks that are often
16- involved in handling and processing data using the workshop ready cleaned
18+ involved in handling and processing data using the workshop ready cleaned
1719files that we have provided. In this wrap-up exercise, we will perform
1820many of the same tasks with real data sets. This lesson also covers data
1921visualization.
@@ -44,10 +46,11 @@ If you are still having trouble importing the data as a table using Pandas,
4446check the documentation. You can open the docstring in an ipython notebook using
4547a question mark. For example:
4648
47- ``` python
49+ ~~~
4850 import pandas as pd
4951 pd.read_csv?
50- ```
52+ ~~~
53+ {: .language-python}
5154
5255Look through the function arguments to see if there is a default value that is
5356different from what your file requires (Hint: the problem is most likely the
@@ -59,7 +62,7 @@ you. In the streamgage file, those values might be the date, time, and discharge
5962measurements. Convert any measurements in imperial units into SI units. You can
6063also change the name of the columns in the DataFrame like this:
6164
62- ``` python
65+ ~~~
6366 df = pd.DataFrame({'1stcolumn':[100,200], '2ndcolumn':[10,20]}) # this just creates a DataFrame for the example!
6467 print('With the old column names:\n') # the \n makes a new line, so it's easier to see
6568 print(df)
@@ -80,7 +83,8 @@ also change the name of the columns in the DataFrame like this:
8083 FirstColumn SecondColumn
8184 0 100 10
8285 1 200 20
83- ```
86+ ~~~
87+ {: .language-python}
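
The renaming step itself is elided in the view above, so the following is only a minimal sketch of one way to do it, assuming `DataFrame.rename` with a `columns` mapping is acceptable here (the lesson's own code may use a different approach). The variable name `renamed` is hypothetical.

```python
import pandas as pd

# Same example DataFrame as in the lesson
df = pd.DataFrame({'1stcolumn': [100, 200], '2ndcolumn': [10, 20]})

# Map each old column name to its new name.
# rename() returns a new DataFrame and leaves df unchanged.
renamed = df.rename(columns={'1stcolumn': 'FirstColumn',
                             '2ndcolumn': 'SecondColumn'})

print(renamed)
```

Because `rename` returns a copy, the original `df` keeps its old column names unless the result is assigned back to `df`.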
8488
8589## Make a line plot of your data
8690
@@ -111,27 +115,30 @@ line plots using pyplots.
111115
112116First, import the pyplot toolbox:
113117
114- ``` python
118+ ~~~
115119 import matplotlib.pyplot as plt
116- ```
120+ ~~~
121+ {: .language-python}
117122
118123By default, matplotlib will create the figure in a separate window. When using
119124ipython notebooks, we can make figures appear in-line within the notebook by
120125writing:
121126
122- ``` python
127+ ~~~
123128 %matplotlib inline
124- ```
129+ ~~~
130+ {: .language-python}
125131
126132We can start by plotting the values of a list of numbers (matplotlib can handle
127133many types of numeric data, including numpy arrays and pandas DataFrames - we
128134are just using a list as an example!):
129135
130- ``` python
136+ ~~~
131137 list_numbers = [1.5, 4, 2.2, 5.7]
132138 plt.plot(list_numbers)
133139 plt.show()
134- ```
140+ ~~~
141+ {: .language-python}
135142
136143The command ` plt.show() ` prompts Python to display the figure. Without it,
137144matplotlib creates the figure in memory but doesn't display it. The ipython
@@ -146,22 +153,24 @@ function `plot()` receives two lists, it assumes the first one is the x-values
146153and the second the y-values. The line connecting the points will follow the list
147154in order:
148155
149- ``` python
156+ ~~~
150157 plt.plot([6.8, 4.3, 3.2, 8.1], list_numbers)
151158 plt.show()
152- ```
159+ ~~~
160+ {: .language-python}
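
Because the line follows the order of the lists, unsorted x-values produce a line that doubles back on itself. The sketch below shows one possible way to draw the same points from left to right, assuming it is acceptable to reorder both lists together with `numpy.argsort`. The names `order`, `x_sorted`, and `y_sorted` are hypothetical, and the `Agg` backend line is only there so the sketch runs without a display (drop it in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running as a script
import matplotlib.pyplot as plt
import numpy as np

x = np.array([6.8, 4.3, 3.2, 8.1])
y = np.array([1.5, 4, 2.2, 5.7])  # same values as list_numbers

# Indices that would sort x in increasing order
order = np.argsort(x)

# Apply the same ordering to both arrays so each (x, y) pair stays together
x_sorted = x[order]
y_sorted = y[order]

# The line now runs from the smallest x-value to the largest
lines = plt.plot(x_sorted, y_sorted, 'ro-')
# In an interactive session, plt.show() would display the figure here
```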
153161
154162A third, optional argument in ` plot() ` is a string of characters that indicates
155163the line type and color for the plot. The default value is a continuous blue
156164line. For example, we can make the line red (` 'r' ` ), with circles at every data
157165point (` 'o' ` ), and a dot-dash pattern (` '-.' ` ). Look through the matplotlib
158166gallery for more examples.
159167
160- ``` python
168+ ~~~
161169 plt.plot([6.8, 4.3, 3.2, 8.1], list_numbers, 'ro-.')
162170 plt.axis([0,10,0,6])
163171 plt.show()
164- ```
172+ ~~~
173+ {: .language-python}
165174
166175The command ` plt.axis() ` sets the limits of the axes from a list of `[ xmin,
167176xmax, ymin, ymax] ` values (the square brackets are needed because the argument
@@ -173,7 +182,7 @@ A single figure can include multiple lines, and they can be plotted using the
173182same ` plt.plot() ` command by adding more pairs of x values and y values (and
174183optionally line styles):
175184
176- ``` python
185+ ~~~
177186 import numpy as np
178187
179188 # Create a numpy array between 0 and 10, with values evenly spaced every 0.5
@@ -187,7 +196,8 @@ optionally line styles):
187196 plt.title('This is the figure title')
188197
189198 plt.show()
190- ```
199+ ~~~
200+ {: .language-python}
191201
192202We can include a legend by adding the optional keyword argument ` label='' ` in
193203` plot() ` . Caution: We cannot add labels to multiple lines that are plotted
@@ -196,7 +206,7 @@ won't know to which line to assign the value of the argument label. Multiple
196206lines can also be plotted in the same figure by calling the ` plot() ` function
197207several times:
198208
199- ``` python
209+ ~~~
200210 # Red dashes with no symbols, blue squares with a solid line, and green triangles with a dotted line
201211 plt.plot(t, t, 'r--', label='linear')
202212 plt.plot(t, t**2, 'bs-', label='square')
@@ -209,7 +219,8 @@ several times:
209219 plt.title('This is the figure title')
210220
211221 plt.show()
212- ```
222+ ~~~
223+ {: .language-python}
213224
214225The function ` legend() ` adds a legend to the figure, and the optional keyword
215226arguments change its style. By default (typing just ` plt.legend() `), the legend
@@ -223,7 +234,7 @@ plotting area, and any plotting functions are directed to those axes. To make
223234more than one figure, we use the command ` plt.figure() ` with an increasing
224235figure number inside the parentheses:
225236
226- ``` python
237+ ~~~
227238 # This is the first figure
228239 plt.figure(1)
229240 plt.plot(t, t, 'r--', label='linear')
@@ -241,13 +252,14 @@ figure number inside the parentheses:
241252 plt.title('This is figure 2')
242253
243254 plt.show()
244- ```
255+ ~~~
256+ {: .language-python}
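
Calling `plt.figure()` with a number that already exists makes that figure current again instead of creating a new one. The following is a small sketch of that behavior, assuming `t` is built with `np.arange(0., 10., 0.5)` (the lesson's own definition is not shown in this view). The `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running as a script
import matplotlib.pyplot as plt
import numpy as np

t = np.arange(0., 10., 0.5)  # assumed definition of t

# Create two separate figures
fig1 = plt.figure(1)
plt.plot(t, t, 'r--', label='linear')

fig2 = plt.figure(2)
plt.plot(t, t**2, 'bs-', label='square')

# Re-select figure 1; later pyplot calls apply to it again
plt.figure(1)
plt.title('Added after figure 2 was created')

# List the numbers of all open figures
print(plt.get_fignums())
```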
245257
246258A single figure can also include multiple plots in a grid pattern. The
247259` subplot() ` command specifies the number of rows, the number of columns, and
248260the number of the space in the grid that particular plot is occupying:
249261
250- ``` python
262+ ~~~
251263 plt.figure(1)
252264
253265 plt.subplot(2,2,1) # Two rows, two columns, position 1
@@ -260,7 +272,8 @@ the number of the space in the grid that particular plot is occupying:
260272 plt.plot(t, t**3, 'g^:', label='cubic')
261273
262274 plt.show()
263- ```
275+ ~~~
276+ {: .language-python}
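
To see how the position number maps onto the grid, the sketch below fills every cell of a 2x2 grid and titles each plot with its position. Positions are counted along each row first, starting at 1 in the top-left corner. The definition of `t` and the loop are assumptions made for illustration, and the `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running as a script
import matplotlib.pyplot as plt
import numpy as np

t = np.arange(0., 10., 0.5)  # assumed definition of t

fig = plt.figure()

# Positions 1 and 2 fill the top row; 3 and 4 fill the bottom row
for position in range(1, 5):
    ax = plt.subplot(2, 2, position)
    ax.plot(t, t**position)
    ax.set_title('position {}'.format(position))

# Adjust spacing so titles and tick labels do not overlap
plt.tight_layout()
# In an interactive session, plt.show() would display the figure here
```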
264277
265278## Make other types of plots:
266279