Commit 8c4705e

Correct bug in args.md
Where all the edges of a huge ts are accidentally listed. Also tidy some text phrasing
1 parent 62e7034 commit 8c4705e

1 file changed

Lines changed: 17 additions & 15 deletions

args.md

@@ -291,13 +291,13 @@ its simplified version:
 ```{code-cell}
 large_sim_parameters = parameters.copy()
 large_sim_parameters["sequence_length"] *= 1000
-ts_arg = msprime.sim_ancestry(**large_sim_parameters, record_full_arg=True)
-ts = ts_arg.simplify()
+large_ts_arg = msprime.sim_ancestry(**large_sim_parameters, record_full_arg=True)
+large_ts = large_ts_arg.simplify()
 
 print(
     "Non-coalescent nodes take up "
-    f"{(1-ts.num_nodes/ts_arg.num_nodes) * 100:0.2f}% "
-    f"of this {ts.sequence_length/1e6:g} megabase {ts.num_samples}-tip ARG"
+    f"{(1-large_ts.num_nodes/large_ts_arg.num_nodes) * 100:0.2f}% "
+    f"of this {large_ts.sequence_length/1e6:g} megabase {large_ts.num_samples}-tip ARG"
 )
 ```
 
@@ -314,18 +314,21 @@ vastly larger number of nodes than even the ARGs simulated here, and using such
 structures for simulation or inference is therefore infeasible.
 :::
 
-## Working with the ARG
+## Working with the tree sequence graph
 
-All the normal tskit functions can be used to analyse an ARG stored in tskit form. However, some
-operations are naturally though of in terms of the tree sequence as a graph.
+All tree sequences, including, but not limited to full ARGs, can be treated as
+directed (acyclic) graphs. Although many tree sequence operations operate from left to
+right along the genome, some are more naturally though of as passing from node
+to node via the edges, regardless of the genomic position of the edge. This section
+describes some of these fundamental graph operations.
 
 ### Graph traversal
 
 The standard edge iterator, {meth}`TreeSequence.edge_diffs()`, goes from left to
-right along the genome, matching the {meth}`TreeSequence.trees()` iterator. This
-means that unlike most conventional graph traversal methods, the returned edges
-are *not* necessarily grouped by node ID (either
-the edge's parent node or the edge's child node).
+right along the genome, matching the {meth}`TreeSequence.trees()` iterator. Although
+this will visit all the edges in the graph, these will *not* necessarily be grouped
+by the node ID either of the edge parent or the edge child. To do this, an alternative
+traversal (from top-to-bottom or bottom-to-top of the tree sequence) is required.
 
 To traverse the graph by node ID, the {meth}`TreeSequence.nodes()` iterator can be
 used. In particular, because parents are required to be strictly older than their
@@ -334,10 +337,9 @@ are always visited before their parents (similar to a breadth-first or "level or
 search). However, using {meth}`TreeSequence.nodes()` is inefficient if you also
 want to access the *edges* associated with each node.
 
-The examples below show how to efficiently traverse the connected nodes in a tree
-sequence graph, visiting each
-only once, ensuring children are visited before parents (or vice versa), while
-simultaneously giving access to the edges associated with each node.
+The examples below show how to efficiently visit the all the edges in a
+tree sequence, grouped by the nodes to which they are connected, while
+also ensuring that children are visited before parents (or vice versa).
 
 #### Traversing parent nodes
 
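For readers of this diff, the "edges grouped by node" idea in the rewritten paragraph can be illustrated with a minimal sketch. This is not the code from args.md. It assumes the edges arrive ordered so that all edges sharing a parent are adjacent and older parents come later, which is the ordering tskit requires of a valid edge table. The `Edge` namedtuple, the `edges_grouped_by_parent` helper, and the toy edge list are hypothetical stand-ins; with a real tree sequence, `ts.edges()` could presumably be passed in instead.

```python
import itertools
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for tskit's Edge objects (only the fields used here).
Edge = namedtuple("Edge", ["left", "right", "parent", "child"])


def edges_grouped_by_parent(edges):
    """
    Yield (parent_id, [edges]) pairs from an iterable of edges.

    Assumes edges with the same parent are adjacent in the input (as in a
    sorted tskit edge table), because itertools.groupby only merges runs
    of consecutive items with equal keys.
    """
    for parent_id, group in itertools.groupby(edges, key=attrgetter("parent")):
        yield parent_id, list(group)


# Toy graph: samples 0, 1, 2; node 3 is the parent of 0 and 1;
# node 4 (the oldest node) is the parent of 2 and 3.
toy_edges = [
    Edge(0, 10, 3, 0),
    Edge(0, 10, 3, 1),
    Edge(0, 10, 4, 2),
    Edge(0, 10, 4, 3),
]

grouped = list(edges_grouped_by_parent(toy_edges))
for parent_id, child_edges in grouped:
    print(parent_id, [e.child for e in child_edges])
```

Because the grouping relies on adjacency rather than a dictionary, each parent is visited once and in input order, so younger parents are reached before older ones when the input follows the assumed ordering.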