Commit b4773dd

committed
lesson 3 proposal
1 parent 31c6a6c commit b4773dd

1 file changed

Lines changed: 8 additions & 2 deletions

_episodes/03-index-slice-subset.md

@@ -118,7 +118,7 @@ the related Python data type dictionary).
118118
> the names of built-in data structures and methods. For example, a _list_ is a built-in
119119
> data type. It is possible to use the word 'list' as an identifier for a new object,
120120
> for example `list = ['apples', 'oranges', 'bananas']`. However, you would then
121-
> be unable to create an empty list using `list()` or convert a tuple to a
121+
> be unable to create an empty list using `list()` or convert a tuple to a
122122
> list using `list(sometuple)`.
123123
{: .callout}
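
The shadowing problem described in the callout can be reproduced in a few lines. This is a minimal sketch, not part of the lesson: the function `shadow_demo` and the variable names are hypothetical, and the shadowing is confined to a function so that the built-in `list` stays usable elsewhere.

```python
def shadow_demo():
    """Rebind the name `list` locally and try to call it (hypothetical helper)."""
    list = ['apples', 'oranges', 'bananas']  # `list` now names this list object
    try:
        list(('a', 'b'))  # attempts to call a list object, not the built-in type
    except TypeError as err:
        return str(err)
    return None


shadow_message = shadow_demo()
print(shadow_message)

# Outside the function the name `list` still refers to the built-in type.
empty = list()
converted = list(('a', 'b'))
print(empty, converted)
```

Choosing another identifier, such as `fruits`, avoids the problem entirely.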
124124

@@ -358,22 +358,28 @@ gives the **output**
358358
Remember that Python indexing begins at 0. So, the index location [2, 6]
359359
selects the element that is 3 rows down and 7 columns over in the DataFrame.
360360
361+
Note that `loc` with a single list of labels (or `iloc` with a single list of integers) selects rows. Indexing a DataFrame directly behaves differently: labels select columns, while a range (slice) of integers selects rows. Direct slicing of rows duplicates what `iloc` already does, and direct indexing raises a `KeyError` if it is given a single integer or a list of integers that are not column labels. The same error occurs if index labels are used without `loc`, or if column labels are passed to `loc` as the row selector.
362+
A useful rule of thumb is the following: use `iloc` for integer-based slicing, which avoids these errors and is generally consistent with the indexing of NumPy arrays; use `loc` for label-based slicing of rows; and select columns by indexing the DataFrame directly with column names.
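
The selection rules above can be checked on a small example. This is a sketch under stated assumptions: `df` is a hypothetical stand-in for `surveys_df` with made-up values, it uses string row labels so that labels and positions cannot be confused, and `raises_key_error` is a helper introduced only for this illustration.

```python
import pandas as pd

# Hypothetical stand-in for surveys_df; the values are invented for illustration.
df = pd.DataFrame(
    {
        "species_id": ["NL", "DM", "PF", "PE"],
        "weight": [40.0, 48.0, 7.0, 22.0],
        "hindfoot_length": [32.0, 33.0, 15.0, 20.0],
    },
    index=["a", "b", "c", "d"],
)

rows_by_label = df.loc[["a", "c"]]           # list of labels with loc selects rows
rows_by_position = df.iloc[[0, 2]]           # list of integers with iloc selects rows
cols_direct = df[["species_id", "weight"]]   # direct indexing with labels selects columns
rows_direct = df[0:2]                        # direct indexing with an integer range selects rows


def raises_key_error(action):
    """Return True if calling `action` raises KeyError (illustrative helper)."""
    try:
        action()
    except KeyError:
        return True
    return False


single_int_fails = raises_key_error(lambda: df[0])               # single integer, no such column
int_list_fails = raises_key_error(lambda: df[[0, 1]])            # list of integers, no such columns
row_label_direct_fails = raises_key_error(lambda: df["a"])       # index label without loc
col_label_loc_fails = raises_key_error(lambda: df.loc["weight"])  # column label used as a row label

print(single_int_fails, int_list_fails, row_label_direct_fails, col_label_loc_fails)
```

Note that in a DataFrame whose column labels are themselves integers, `df[0]` would select the column labelled `0` rather than fail, so the errors shown here depend on the column names being strings.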
361363
362364
363365
> ## Challenge - Range
364366
>
365367
> 1. What happens when you execute:
366368
>
367369
> - `surveys_df[0:1]`
370+
> - `surveys_df[0]`
368371
> - `surveys_df[:4]`
369372
> - `surveys_df[:-1]`
370373
>
371374
> 2. What happens when you call:
372375
>
376+
> - `surveys_df.iloc[0:1]`
377+
> - `surveys_df.iloc[0]`
378+
> - `surveys_df.iloc[:4, :]`
373379
> - `surveys_df.iloc[0:4, 1:4]`
374380
> - `surveys_df.loc[0:4, 1:4]`
375381
>
376-
> - How are the two commands different?
382+
> - How are the last two commands different?
377383
{: .challenge}
378384
379385
