_episodes/00-Before-we-start.md: 26 additions & 26 deletions
@@ -7,13 +7,13 @@ questions:
 - "Why should I learn Python?"
 objectives:
 - "Describe the purpose of the editor, console, help, variable explorer and file explorer panes in Spyder."
-- "Organize files and directories for a set of analyses as a python project, and understand the purpose of the working directory."
+- "Organize files and directories for a set of analyses as a Python project, and understand the purpose of the working directory."
 - "Know where to find help."
-- "Demonstrate how to provide sufficient information for troubleshooting with the python user community."
+- "Demonstrate how to provide sufficient information for troubleshooting with the Python user community."
 keypoints:
 - "Python is an open source and platform independent programming language"
-- "SciPy ecosystem for python provides the tools necessary for scientific computing"
-- "Jupyter Notebook and the Spyder IDE are great tools to code in and interact with python with its large community it is easy to find help in the internet"
+- "SciPy ecosystem for Python provides the tools necessary for scientific computing"
+- "Jupyter Notebook and the Spyder IDE are great tools to code in and interact with Python with its large community it is easy to find help in the internet"
 ---
 
 # Before we start
@@ -32,7 +32,7 @@ Python's main advantages:
 * interpreted language, i.e. direct execution of commands, no compilation neccessary
 
 
-## Why learn python for data analysis?
+## Why learn Python for data analysis?
 ### Python does not involve lots of pointing and clicking, and that’s a good thing
 The learning curve might be steeper than with other software, but with python, the results of your analysis does not rely on remembering a succession of pointing and clicking, but instead on a series of written commands, and that’s a good thing! So, if you want to redo your analysis because you collected more data, you don’t have to remember which button you clicked in which order to obtain your results, you just have to run your script again.
 
@@ -45,10 +45,10 @@ Reproducibility is when someone else (including your future self) can obtain the
 
 Python integrates with other tools to generate manuscripts from your code. If you collect more data, or fix a mistake in your dataset, the figures and the statistical tests in your manuscript are updated automatically.
 
-An increasing number of journals and funding agencies expect analyses to be reproducible, so knowing python will give you an edge with these requirements.
+An increasing number of journals and funding agencies expect analyses to be reproducible, so knowing Python will give you an edge with these requirements.
 
 ### Python is interdisciplinary and extensible
-With `numpy`, `pandas`, `sciPy`, `matplotlib` and many other modules that can be installed to extend its capabilities, python provides a framework that allows you to combine approaches from many scientific disciplines to best suit the analytical framework you need to analyze your data.
+With `numpy`, `pandas`, `sciPy`, `matplotlib` and many other modules that can be installed to extend its capabilities, Python provides a framework that allows you to combine approaches from many scientific disciplines to best suit the analytical framework you need to analyze your data.
 
 #### Scientific Computing Tools for Python - the SciPy ecosystem
 To do scientific computions in python, [SciPy](https://scipy.org) offers a collection of open source software for mathematics, science, and engineering:
@@ -59,22 +59,22 @@ To do scientific computions in python, [SciPy](https://scipy.org) offers a colle
 *[Pandas](http://pandas.pydata.org) provides high-performance, easy to use data structures.
 *[SymPy](http://www.sympy.org/en/index.html) is for symbolic mathematics and computer algebra.
 
-These software can be installed as libraries in python and comes pre-installed with Anaconda.
+These software can be installed as libraries in Python and comes pre-installed with Anaconda.
 
 ### Python works on data of all shapes and sizes
-The skills you learn with python scale easily with the size of your dataset. Whether your dataset has hundreds or millions of lines, it won’t make much difference to you.
-[pandas](http://pandas.pydata.org), a python data analysis library, comes with special data structures and data types that make handling of missing data and categorial data convenient.
+The skills you learn with Python scale easily with the size of your dataset. Whether your dataset has hundreds or millions of lines, it won’t make much difference to you.
+[pandas](http://pandas.pydata.org), a Python data analysis library, comes with special data structures and data types that make handling of missing data and categorial data convenient.
 Python can read text files, connect to databases, and many other data formats, on your computer or on the web.
 
 
 ### Python produces high-quality graphics
-The plotting functionalities in python are endless, and allow you to adjust any aspect of your graph to convey most effectively the message from your data. [matplotlib](http://matplotlib.org) is a popular plotting library in python. [plotnine](http://plotnine.readthedocs.io/en/stable/) is an implementation of a _grammar of graphics_ in Python and builds upon matplotlib. _Grammar of graphics_ is a general scheme for data visualization which breaks up graphs into semantic components such as scales and layers
+The plotting functionalities in Python are endless, and allow you to adjust any aspect of your graph to convey most effectively the message from your data. [matplotlib](http://matplotlib.org) is a popular plotting library in python. [plotnine](http://plotnine.readthedocs.io/en/stable/) is an implementation of a _grammar of graphics_ in Python and builds upon matplotlib. _Grammar of graphics_ is a general scheme for data visualization which breaks up graphs into semantic components such as scales and layers
 
 ### Python has a large and welcoming community
-Thousands of people use python daily. Many of them are willing to help you through mailing lists and websites such as [Stack Overflow](https://stackoverflow.com), or on the [Anaconda community](https://www.anaconda.com/community/).
+Thousands of people use Python daily. Many of them are willing to help you through mailing lists and websites such as [Stack Overflow](https://stackoverflow.com), or on the [Anaconda community](https://www.anaconda.com/community/).
 
-### Not only is python free, but it is also open-source and cross-platform
-Anyone can inspect the source code to see how python works. Because of this transparency, there is less chance for mistakes, and if you (or someone else) find some, you can report and fix bugs.
+### Not only is Python free, but it is also open-source and cross-platform
+Anyone can inspect the source code to see how Python works. Because of this transparency, there is less chance for mistakes, and if you (or someone else) find some, you can report and fix bugs.
 
 ## Knowing your way around Anaconda
 
@@ -83,7 +83,7 @@ Have a quick look around the Anaconda Navigator. You can launch programs from th
 
 The [Jupyter Notebook](https://jupyter.org) is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
 [Spyder](https://spyder-ide.github.io
-) is a popular way to write python scripts and also to interact with the python software; it is an interactive development enviroment, a so called IDE.
+) is a popular way to write Python scripts and also to interact with the Python software; it is an interactive development enviroment, a so called IDE.
 
 Anaconda also comes with a package manager, [Conda](https://conda.io/docs/), which makes it easy to install and update additional packages.
 
@@ -99,44 +99,44 @@ Using a consistent folder structure across your projects will help keep things o
 
 **`documents/`** This would be a place to keep outlines, drafts, and other text.
 
-**`scripts/`** This would be the location to keep your python scripts for different analyses or plotting, and potentially a separate folder for your functions (more on that later).
+**`scripts/`** This would be the location to keep your Python scripts for different analyses or plotting, and potentially a separate folder for your functions (more on that later).
 You may want additional directories or subdirectories depending on your project needs, but these should form the backbone of your working directory. For this workshop, we will need a `data/` folder to store our raw data, and we will create later a `data_output/` folder when we learn how to export data as CSV files.
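The folder layout described in this hunk could be set up from Python itself. The following is only a minimal sketch: the function name `create_project_skeleton` and the project name `my_project` are hypothetical, and the folder list simply collects the directory names mentioned in the lesson text (`data/`, `data_output/`, `documents/`, `scripts/`).

```python
from pathlib import Path
import tempfile


def create_project_skeleton(root):
    """Create the lesson's folders under `root` and return their paths.

    `create_project_skeleton` is a hypothetical helper, not part of the lesson.
    """
    root = Path(root)
    # Folder names taken from the lesson text; adjust to your project needs.
    folders = ["data", "data_output", "documents", "scripts"]
    created = []
    for name in folders:
        folder = root / name
        # parents=True creates `root` if needed; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Use a temporary directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_skeleton(Path(tmp) / "my_project"):
            print(folder.name)
```

Because `exist_ok=True` is set, running the helper twice on the same project does not raise an error or touch existing files.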
 
 
 ## Interacting with Python
-The basis of programming is that we write down instructions for the computer to follow, and then we tell the computer to follow those instructions. We write, or code, instructions in python because it is a common language that both the computer and we can understand. We call the instructions commands and we tell the computer to follow the instructions by executing (also called running) those commands.
+The basis of programming is that we write down instructions for the computer to follow, and then we tell the computer to follow those instructions. We write, or code, instructions in Python because it is a common language that both the computer and we can understand. We call the instructions commands and we tell the computer to follow the instructions by executing (also called running) those commands.
 
 ##### If you are working with Jupyter notebook:
-You can type python code into a code cell and then execute the code by pressing <kbd>Shift</kbd> + <kbd>Enter</kbd>. Any output will be printed directly under the input cell. You can recognise a code cell by the `In[ ]:` at the beginning of the cell and output is marked by `Out[ ]:`. Pressing the __+__ button in the menu bar will add a new cell. All your commands as well as any output will be saved with the notebook.
+You can type Python code into a code cell and then execute the code by pressing <kbd>Shift</kbd> + <kbd>Enter</kbd>. Any output will be printed directly under the input cell. You can recognise a code cell by the `In[ ]:` at the beginning of the cell and output is marked by `Out[ ]:`. Pressing the __+__ button in the menu bar will add a new cell. All your commands as well as any output will be saved with the notebook.
 
 ##### If you are working with Spyder:
 
-You can either use the console or use script files (plain text files that contain your code). The console pane (in Spyder, the bottom right panel) is the place where commands written in the python language can be typed and executed immediately by the computer. It is also where the results will be shown for commands that have been executed. You can type commands directly into the console and press Enter to execute those commands, but they will be forgotten when you close the session. Spyder uses the [IPython](http://ipython.org) console by default.
+You can either use the console or use script files (plain text files that contain your code). The console pane (in Spyder, the bottom right panel) is the place where commands written in the Python language can be typed and executed immediately by the computer. It is also where the results will be shown for commands that have been executed. You can type commands directly into the console and press Enter to execute those commands, but they will be forgotten when you close the session. Spyder uses the [IPython](http://ipython.org) console by default.
 
 Because we want our code and workflow to be reproducible, it is better to type the commands we want in the script editor, and save the script. This way, there is a complete record of what we did, and anyone (including our future selves!) can easily replicate the results on their computer.
 
 Spyder allows you to execute commands directly from the script editor by using the run buttons on top or keyboard shortcuts. To run the entire script click _Run file_ or press <kbd>F5</kbd>, to run the current line click _Run selection or current line_ or press <kbd>F9</kbd>, other run buttons allow to run script cells or go into debug mode. When using <kbd>F9</kbd>, the command on the current line in the script (indicated by the cursor) or all of the commands in the currently selected text will be sent to the console and executed.
 
 At some point in your analysis you may want to check the content of a variable or the structure of an object, without necessarily keeping a record of it in your script. You can type these commands and execute them directly in the console. Spyder provides the <kbd>Ctrl</kbd> + <kbd>Shift</kbd> + <kbd>E</kbd> and <kbd>Ctrl</kbd> + <kbd>Shift</kbd> + <kbd>I</kbd> shortcuts to allow you to jump between the script and the console panes.
 
-If python is ready to accept commands, the IPython console shows an `In [..]:` prompt with the current console line number in `[]`. If it receives a command (by typing, copy-pasting or sent from the script editor), python will try to execute it, and when ready, will show the results with an `Out [..]:` prompt and come back with a new `In [..]:` prompt to wait for new commands.
+If Python is ready to accept commands, the IPython console shows an `In [..]:` prompt with the current console line number in `[]`. If it receives a command (by typing, copy-pasting or sent from the script editor), Python will try to execute it, and when ready, will show the results with an `Out [..]:` prompt and come back with a new `In [..]:` prompt to wait for new commands.
 
-If python is still waiting for you to enter more data because it isn’t complete yet, the console will show a `...:` prompt. It means that you haven’t finished entering a complete command. This can be because you have not ‘closed’ a parenthesis or quotation, i.e. you don’t have the same number of left-parentheses as right-parentheses, or the same number of opening and closing quotation marks. When this happens, and you thought you finished typing your command, click inside the console window and press Esc; this will cancel the incomplete command and return you to the `In [..]:` prompt.
+If Python is still waiting for you to enter more data because it isn’t complete yet, the console will show a `...:` prompt. It means that you haven’t finished entering a complete command. This can be because you have not ‘closed’ a parenthesis or quotation, i.e. you don’t have the same number of left-parentheses as right-parentheses, or the same number of opening and closing quotation marks. When this happens, and you thought you finished typing your command, click inside the console window and press Esc; this will cancel the incomplete command and return you to the `In [..]:` prompt.
 
 ## How to learn more after the workshop?
-The material we cover during this workshop will give you an initial taste of how you can use python to analyze data for your own research. However, you will need to learn more to do advanced operations such as cleaning your dataset, using statistical methods, or creating beautiful graphics. The best way to become proficient and efficient at python, as with any other tool, is to use it to address your actual research questions. As a beginner, it can feel daunting to have to write a script from scratch, and given that many people make their code available online, modifying existing code to suit your purpose might make it easier for you to get started.
+The material we cover during this workshop will give you an initial taste of how you can use Python to analyze data for your own research. However, you will need to learn more to do advanced operations such as cleaning your dataset, using statistical methods, or creating beautiful graphics. The best way to become proficient and efficient at python, as with any other tool, is to use it to address your actual research questions. As a beginner, it can feel daunting to have to write a script from scratch, and given that many people make their code available online, modifying existing code to suit your purpose might make it easier for you to get started.
 
 ## Seeking help
 
 * check under the _Help_ menu
 * type `help()`
 * type `?object` or `help(object)` to get information about an object
-Finally, a generic Google or internet search “python task” will often either send you to the appropriate module documentation or a helpful forum where someone else has already asked your question.
+Finally, a generic Google or internet search "Python task" will often either send you to the appropriate module documentation or a helpful forum where someone else has already asked your question.
 
 I am stuck… I get an error message that I don’t understand
-Start by googling the error message. However, this doesn’t always work very well because often, package developers rely on the error catching provided by python. You end up with general error messages that might not be very helpful to diagnose a problem (e.g. “subscript out of bounds”). If the message is very generic, you might also include the name of the function or package you’re using in your query.
+Start by googling the error message. However, this doesn’t always work very well because often, package developers rely on the error catching provided by python. You end up with general error messages that might not be very helpful to diagnose a problem (e.g. "subscript out of bounds"). If the message is very generic, you might also include the name of the function or package you’re using in your query.
 
 However, you should check Stack Overflow. Search using the [python] tag. Most questions have already been answered, but the challenge is to use the right words in the search to find the answers: <http://stackoverflow.com/questions/tagged/python>
 
@@ -151,7 +151,7 @@ If possible, try to reduce what doesn’t work to a simple reproducible example.
 * The person sitting next to you during the workshop. Don’t hesitate to talk to your neighbor during the workshop, compare your answers, and ask for help. You might also be interested in organizing regular meetings following the workshop to keep learning from each other.
 * Your friendly colleagues: if you know someone with more experience than you, they might be able and willing to help you.
 *[Stack Overflow](http://stackoverflow.com/questions/tagged/python): if your question hasn’t been answered before and is well crafted, chances are you will get an answer in less than 5 min. Remember to follow their guidelines on how to ask a good question.
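When asking for help, it usually saves a round of follow-up questions to state which Python version and operating system you are running. The sketch below shows one way to collect that with the standard library only; the helper name `environment_summary` is hypothetical and the choice of fields is an assumption, not something the lesson prescribes.

```python
import platform
import sys


def environment_summary():
    """Return a short, paste-able description of the running Python setup.

    `environment_summary` is a hypothetical helper for illustration.
    """
    return "\n".join([
        f"Python version: {platform.python_version()}",
        f"Implementation: {platform.python_implementation()}",
        f"Operating system: {platform.system()} {platform.release()}",
        f"Executable: {sys.executable}",
    ])


if __name__ == "__main__":
    # Paste this output alongside your reproducible example and error message.
    print(environment_summary())
```

The output differs from machine to machine, which is exactly why it is worth including with a question.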
_episodes/02-index-slice-subset.md: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ surveys_df['speciess']
 ~~~
 {: .language-python}
 
-Python tells us what type of error it is in the traceback, at the bottom it says `KeyError: 'speciess'` which means that `speciess` is not a column name (or Key in the related python data type dictionary).
+Python tells us what type of error it is in the traceback, at the bottom it says `KeyError: 'speciess'` which means that `speciess` is not a column name (or Key in the related Python data type dictionary).
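The parenthetical about the dictionary data type can be illustrated without pandas. This is a minimal sketch using a plain `dict`; the dictionary contents and the helper name `lookup` are made up for illustration and are not the lesson's `surveys_df`.

```python
# Made-up data standing in for column names; not the lesson's surveys_df.
columns = {"species_id": "NL", "plot_id": 2}


def lookup(mapping, key):
    """Return mapping[key], or a message naming the missing key."""
    try:
        return mapping[key]
    except KeyError as err:
        # str(err) is the repr of the missing key, including its quotes.
        return f"KeyError: {err}"


if __name__ == "__main__":
    print(lookup(columns, "species_id"))  # existing key
    print(lookup(columns, "speciess"))    # misspelled key
```

A misspelled key produces the same kind of message as the misspelled column name in the lesson, which is why the traceback's last line is the first thing to read.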