Commit 64fb4bf

Merge pull request #584 from datacarpentry/improveExpandJoinsDiscussion
Update join types in pandas merge function
2 parents 00173b8 + 18e3e2f commit 64fb4bf

1 file changed

Lines changed: 8 additions & 4 deletions

episodes/05-merging-data.md

@@ -434,16 +434,20 @@ case, `PF`) does not occur in `species_sub`.
 
 ### Other join types
 
-The pandas `merge` function supports two other join types:
+The pandas `merge` function supports other join types:
 
 - Right (outer) join: Invoked by passing `how='right'` as an argument. Similar
 to a left join, except *all* rows from the `right` DataFrame are kept, while
 rows from the `left` DataFrame without matching join key(s) values are
 discarded.
 - Full (outer) join: Invoked by passing `how='outer'` as an argument. This join
-type returns the all pairwise combinations of rows from both DataFrames; i.e.,
-the result DataFrame will `NaN` where data is missing in one of the dataframes. This join type is
-very rarely used.
+type returns the all pairwise combinations of rows from both DataFrames; i.e., the
+*Cartesian product* and the result DataFrame will use `NaN` where data is missing in one
+of the dataframes. This join type is very rarely used, but can be helpful to see all
+the qualities of both tables, including each common and duplicate column.
+- Self-join: Joins a data frame with itself. Self-joins can be useful when you want to, for
+instance, compare records within the same dataset based on a given criteria. A fuller discussion
+of how and when it might be useful to do so can be found in [Self-Join and Cross Join in Pandas DataFrame](https://blog.devgenius.io/self-join-and-cross-join-in-pandas-dataframe-b30bfbc0e52a)
 
 ## Final Challenges
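The join types named in this hunk can be compared side by side in a small sketch. The two tables below are hypothetical miniatures invented for illustration (they are not the lesson's `surveys` and `species` data), and the column names are assumptions. In pandas, `how='outer'` keeps the union of join keys and fills gaps with `NaN`; the Cartesian product of rows is what `how='cross'` (available in pandas 1.2 and later) produces, so the sketch includes it for contrast.

```python
import pandas as pd

# Hypothetical miniature tables, invented for this sketch.
surveys = pd.DataFrame({
    "record_id": [1, 2, 3],
    "plot_id": [2, 2, 3],
    "species_id": ["NL", "DM", "PF"],
})
species = pd.DataFrame({
    "species_id": ["NL", "DM", "PE"],
    "genus": ["Neotoma", "Dipodomys", "Peromyscus"],
})

# Right join: every row of `species` is kept; the unmatched survey (PF) is dropped,
# and the species with no survey (PE) gets NaN in the survey columns.
right = pd.merge(surveys, species, how="right", on="species_id")

# Full outer join: the union of keys from both tables (NL, DM, PF, PE),
# with NaN wherever one side has no matching row.
outer = pd.merge(surveys, species, how="outer", on="species_id")

# Cross join: the Cartesian product, pairing every left row with every right row.
cross = pd.merge(surveys, species, how="cross")

# Self-join: merge the table with itself on `plot_id`, then keep each pair of
# distinct records once to compare records taken in the same plot.
same_plot = pd.merge(surveys, surveys, on="plot_id", suffixes=("_a", "_b"))
same_plot = same_plot[same_plot["record_id_a"] < same_plot["record_id_b"]]

print(len(right), len(outer), len(cross), len(same_plot))
```

With three rows on each side, the outer join yields four rows (one per distinct key) while the cross join yields nine, which makes the difference between the two easy to see. The `suffixes` argument in the self-join is one way to tell the two copies of each column apart.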
