@@ -118,7 +118,7 @@ the related Python data type dictionary).
118118> the names of built-in data structures and methods. For example, a _list_ is a built-in
119119> data type. It is possible to use the word 'list' as an identifier for a new object,
120120> for example ` list = ['apples', 'oranges', 'bananas'] ` . However, you would then
121- > be unable to create an empty list using ` list() ` or convert a tuple to a
121+ > be unable to create an empty list using ` list() ` or convert a tuple to a
122122> list using ` list(sometuple) ` .
123123 {: .callout}
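
The effect described in the callout can be reproduced with a minimal sketch. The function name `shadowing_demo` and the variable `fruits` are hypothetical, introduced only for illustration; the shadowing is kept inside a function so that the built-in `list` stays usable in the rest of the session.

```python
def shadowing_demo():
    """Return the error message raised once `list` has been reassigned."""
    # Reusing the name `list` hides the built-in type inside this function.
    list = ['apples', 'oranges', 'bananas']
    try:
        # `list` now refers to the list object above, which cannot be called.
        list((1, 2, 3))
    except TypeError as err:
        return str(err)


message = shadowing_demo()

# A descriptive identifier avoids the clash, so the built-in keeps working.
fruits = ['apples', 'oranges', 'bananas']
converted = list((1, 2, 3))
```

Here `message` holds the text of the `TypeError`, while `converted` is an ordinary list built from the tuple.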
124124
@@ -364,22 +364,38 @@ gives the **output**
364364Remember that Python indexing begins at 0. So, the index location [2, 6]
365365selects the element that is 3 rows down and 7 columns over in the DataFrame.
366366
367+ Note that `loc` with a single list of labels (or `iloc` with a single list of
368+ integers) selects rows. Indexing a DataFrame directly behaves differently: a
369+ column label or a list of column labels selects columns (e.g.
370+ `surveys_df[['species_id', 'plot_id', 'weight']]`), while a slice of integers
371+ selects rows (e.g. `surveys_df[0:13]`). Direct row slicing is redundant with
372+ `iloc`, and direct indexing raises a `KeyError` if a single integer or a list of
373+ integers is used; the same error occurs if index labels are used without `loc` (or
374+ if column labels are used with it).
375+ A useful rule of thumb: do integer-based slicing with `iloc`, which avoids these
376+ errors (and is generally consistent with indexing of NumPy arrays); do
377+ label-based slicing of rows with `loc`; and select columns by indexing directly
378+ with column names.
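
The rule of thumb can be checked with a short sketch. It assumes a small stand-in for `surveys_df` with made-up values and a default integer index; the real survey data has more rows and columns.

```python
import pandas as pd

# Hypothetical stand-in for surveys_df (values invented for illustration).
surveys_df = pd.DataFrame({
    'species_id': ['NL', 'DM', 'PF', 'PE'],
    'plot_id': [2, 3, 2, 7],
    'weight': [40.0, 48.0, 7.0, 21.0],
})

# Rows by integer position, and rows by index label.
rows_by_position = surveys_df.iloc[[0, 2]]
rows_by_label = surveys_df.loc[[0, 2]]

# Direct indexing with a list of column labels selects columns.
columns = surveys_df[['species_id', 'weight']]

# Direct indexing with a slice of integers selects rows, like iloc.
first_rows = surveys_df[0:2]

# A single integer is looked up as a column label, so it fails here.
try:
    surveys_df[0]
    raised_key_error = False
except KeyError:
    raised_key_error = True
```

With the default integer index, the `loc` and `iloc` selections above return the same rows; they would differ if the index labels did not match the row positions.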
367379
368380
369381> ## Challenge - Range
370382>
371383> 1. What happens when you execute:
372384>
373385> - `surveys_df[0:1]`
386+ > - `surveys_df[0]`
374387> - `surveys_df[:4]`
375388> - `surveys_df[:-1]`
376389>
377390> 2. What happens when you call:
378391>
392+ > - `surveys_df.iloc[0:1]`
393+ > - `surveys_df.iloc[0]`
394+ > - `surveys_df.iloc[:4, :]`
379395> - `surveys_df.iloc[0:4, 1:4]`
380396> - `surveys_df.loc[0:4, 1:4]`
381397>
382- > - How are the two commands different?
398+ > - How are the last two commands different?
383399{: .challenge}
384400
385401