
Commit e1ceb0e

jiyooonp and nevalsar authored
Add robotic state machine guide (#163)
- This PR introduces a new guide on designing robotic state machines and refactors the ROS GUI documentation for better standards compliance.
- New Content:
  - Added 'How to design a robotic state machine' to the System Design & Development section.
  - Integrated the new article into the site navigation.
- Documentation Refactoring & Remediation:
  - Standardized header hierarchy (H1 -> H2) in the state machine guide.
  - Centralized PyQt assets from 'wiki/tools/assets/' to 'assets/images/tools/'.
  - Updated 'ros-gui.md' to use absolute asset paths.
  - Removed duplicate 'ROSgui.md' in favor of the existing 'ros-gui.md'

---------

Co-authored-by: Nevin Valsaraj <nevin.valsaraj32@gmail.com>
1 parent 0d07e74 commit e1ceb0e

3 files changed

Lines changed: 170 additions & 0 deletions

File tree

_data/navigation.yml

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,8 @@ wiki:
 url: /wiki/system-design-development/subsystem-interface-modeling/
 - title: In Loop Testing
 url: /wiki/system-design-development/In-Loop-Testing/
+- title: How to design a robotic state machine
+url: /wiki/system-design-development/how-to-design-a-robotic-state-machine/
 - title: Project Management
 url: /wiki/project-management/
 children:

assets/images/tools/PyQt-init.png

40 KB
Lines changed: 168 additions & 0 deletions
@@ -0,0 +1,168 @@
---
# Jekyll 'Front Matter' goes here. Most are set by default, and should NOT be
# overwritten except in special circumstances.
# You should set the date the article was last updated like this:
date: 2023-12-04 # YYYY-MM-DD
# This will be displayed at the bottom of the article
# You should set the article's title:
title: How to design a robotic state machine
# The 'title' is automatically displayed at the top of the page
# and used in other parts of the site.
---
## Motivation

One key aspect of designing complex robotic systems is the effective
implementation of state machines. This article explores best practices
and design considerations for state machines tailored to robotic
applications. We begin by describing an example state machine, then
compare linear and asynchronous state machines, weigh publishers and
subscribers against services in ROS, and finally discuss strategies for
error handling in state machines. To learn more about state machines,
read these articles:

- [Finite-state machine](https://en.wikipedia.org/wiki/Finite-state_machine)
- [State Machine Basics](https://www.freecodecamp.org/news/state-machines-basics-of-computer-science-d42855debc66/)

## Example

Consider a green pepper harvesting robot that consists of a manipulator
arm and a custom end-effector. The major subsystems are perception,
motion planning, and the end-effector. The system is structured in ROS
with the following nodes:

- Perception: takes an image and outputs a 6D pose for the point of interaction (POI)
- Kalman filtering: filters the noisy 6D poses of the POI to provide a filtered 6D pose
- Motion planning: generates a plan for the manipulator arm and actuates it to the POI
- End-effector: actuates a gripper and cutter mechanism to harvest a pepper

Two other nodes jointly control these subsystems. A state machine node
defines the states and the transitions between them. It reads the
information published or made available by the other nodes to decide
when to transition from one state to another. A planner node, on the
other hand, constantly listens to the current state and executes the
actions required in that state. This node may use any data that it
needs.

- State machine: defines the states and the transitions between them
- Planner: listens to the current state and executes the actions required in that state
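
The split between the two controlling nodes can be sketched in plain Python, without ROS. This is only an illustration under assumptions: the state names, the keys in `info`, and the planner's action strings are hypothetical and do not come from the original robot code.

```python
from enum import Enum, auto


class State(Enum):
    """Hypothetical states for the pepper harvester; names are illustrative."""
    IDLE = auto()
    SCAN = auto()
    MOVE_TO_POI = auto()
    HARVEST = auto()
    DONE = auto()


def next_state(state, info):
    """State machine role: pick the next state from data other nodes report."""
    if state is State.IDLE and info.get("start"):
        return State.SCAN
    if state is State.SCAN and info.get("poi") is not None:
        return State.MOVE_TO_POI
    if state is State.MOVE_TO_POI and info.get("arm_at_poi"):
        return State.HARVEST
    if state is State.HARVEST and info.get("pepper_cut"):
        return State.DONE
    return state  # no transition condition met, so stay in the same state


class Planner:
    """Planner role: run the action that belongs to the current state."""

    def __init__(self):
        self.log = []
        self.actions = {
            State.SCAN: lambda: self.log.append("move arm to next viewpoint"),
            State.MOVE_TO_POI: lambda: self.log.append("plan and move to POI"),
            State.HARVEST: lambda: self.log.append("grip and cut"),
        }

    def on_state(self, state):
        action = self.actions.get(state)
        if action is not None:
            action()


planner = Planner()
state = State.IDLE
reports = [{"start": True}, {"poi": (0.4, 0.1, 0.3)},
           {"arm_at_poi": True}, {"pepper_cut": True}]
for info in reports:
    state = next_state(state, info)
    planner.on_state(state)
```

Keeping the transition logic (`next_state`) apart from the actions (`Planner`) mirrors the two-node layout: one piece decides where the system is, the other decides what to do there.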

## Linear vs Asynchronous

The easiest way to structure a state machine is to move linearly from
one state to the next once the tasks in the current state are complete.
Doing so, however, can limit the system's decision-making abilities and
its scalability to complex architectures. In the case of the pepper
harvesting robot, a linear state machine moves the arm to multiple
positions one by one, captures images to estimate noisy 6D poses for the
POI, obtains a filtered pose, moves to the POI, and harvests the pepper
with the end-effector. All these actions are completed one at a time:
the state machine waits for one state to complete its actions before
moving on.

A direct consequence is that harvesting time grows with the number of
positions the arm is moved to. Furthermore, this system cannot scale to
include a mobile base that transports the arm to a new pepper plant
location by detecting peppers. With a linear state machine, base motion
and pepper detection cannot be performed at the same time.
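
To make the growth in harvesting time concrete, here is a rough timing model of the linear case. The function name and all durations are hypothetical placeholders, not measurements from the robot; the point is only that every per-view cost is paid in sequence.

```python
def linear_harvest_time(n_views, t_move, t_capture, t_process, t_approach, t_cut):
    """Total time (seconds) when every step waits for the previous one."""
    per_view = t_move + t_capture + t_process  # nothing overlaps in a linear machine
    return n_views * per_view + t_approach + t_cut


# Hypothetical durations in seconds, chosen only for illustration.
times = {n: linear_harvest_time(n, 3.0, 0.5, 1.5, 4.0, 2.0) for n in (2, 4, 8)}
print(times)  # {2: 16.0, 4: 26.0, 8: 46.0}
```

Under these assumed numbers, doubling the number of viewpoints adds the full per-view cost each time, because image capture and processing never run while the arm is moving.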

In an asynchronous state machine, some nodes are structured to process
their inputs and publish outputs continuously. The perception node can
continuously generate POIs from the images published by a camera. This
allows the Kalman filtering node to keep track of existing POIs and
update them at the same frequency as the perception node. One immediate
advantage of this structure is that images are processed in parallel
with arm motion, which can substantially reduce the overall harvesting
time. Similarly, a mobile base can now be added to the system easily.
Another advantage is that such a state machine can make decisions
asynchronously. Because some nodes are always publishing data while
others perform actuation based on the state, the state machine can use
the latest data to update the next action quickly instead of waiting
for a state to complete.
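
The idea of deciding from the latest data can be sketched with a toy tick loop. Everything here is an assumption made for illustration: the arm moves along one axis by one unit per tick, `poi_updates` stands in for the filtered POI stream, and the state names are hypothetical.

```python
def run_async(poi_updates, ticks):
    """Simulate a state machine that re-evaluates its inputs on every tick.

    poi_updates maps a tick index to a new filtered POI (1-D position).
    """
    state = "SEARCH"
    arm = 0
    poi = None
    for t in range(ticks):
        # Perception and filtering keep running regardless of the state.
        if t in poi_updates:
            poi = poi_updates[t]
        # The state machine decides from the latest data on every tick.
        if state == "SEARCH" and poi is not None:
            state = "MOVE_TO_POI"
        elif state == "MOVE_TO_POI" and arm == poi:
            state = "HARVEST"
        # The planner acts according to the current state.
        if state == "MOVE_TO_POI" and arm != poi:
            arm += 1 if poi > arm else -1
    return state, arm


# The POI estimate is refined from 5 to 3 while the arm is already moving.
final_state, final_arm = run_async({1: 5, 3: 3}, ticks=8)
```

In this sketch the arm ends at the refined POI (3) rather than the first estimate (5), because the target is re-read during the motion instead of being fixed when the state was entered.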

## Choosing a Communication Paradigm for State Machines in ROS

When implementing a state machine in ROS, a crucial decision is whether
to use ROS publishers and subscribers or ROS services for state
communication. Let's explore the pros and cons of each approach.

### Publisher & Subscriber
#### Pro
1. Decentralized Communication: Enables asynchronous communication, allowing multiple nodes to be informed of the state independently.
2. Real-time Suitability: Non-blocking communication keeps state updates timely for decisions that must be made quickly.
3. Scalability: Easily scalable by adding more nodes as subscribers due to the decentralized system.

#### Con
1. No Guarantee of Delivery: The publisher receives no confirmation that a message was received, so any acknowledgment has to be implemented manually.
2. Synchronization Issues: Because nodes run asynchronously, additional synchronization mechanisms may be needed to make sure a node acts on up-to-date information.

### ROS Service
#### Pro

1. Blocking Communication: The caller waits for the response, which keeps the two nodes coordinated and confirms that the request was handled.

#### Con
1. Limited to Point-to-Point Communication: Unlike a topic, each service call is an exchange between one client and one server, so other nodes are not informed of the state.
2. Encourages a Linear State Machine: Because the caller blocks until the response arrives, the state machine tends toward a linear model.
3. Overhead in Communication: Frequent state changes with request-response interactions introduce performance overhead.

As many networking errors can arise in a robotic system, it is usually
better to have a decentralized communication system than a centralized,
linear one. A ROS service call blocks the caller, which can slow down
the entire system, and it is challenging to debug where the blockage is
happening. Although publishers and subscribers do not guarantee
delivery, modern hardware is fast enough that we generally do not have
to worry about whether a message reaches the nodes.
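
The difference between the two paradigms can be illustrated with toy in-process stand-ins. `Topic` and `Service` below are hypothetical classes written for this sketch, not ROS APIs; in rospy the real counterparts are `rospy.Publisher` / `rospy.Subscriber` and `rospy.Service` / `rospy.ServiceProxy`. Note one simplification: these callbacks run inside `publish`, whereas real subscribers run in their own processes and do not hold up the publisher.

```python
class Topic:
    """Toy stand-in for a topic: one publish call fans out to all subscribers."""

    def __init__(self):
        self._callbacks = []

    def subscribe(self, callback):
        self._callbacks.append(callback)

    def publish(self, msg):
        # Nothing is returned, so the publisher gets no delivery confirmation.
        for callback in self._callbacks:
            callback(msg)


class Service:
    """Toy stand-in for a service: one handler, and the caller waits for its reply."""

    def __init__(self, handler):
        self._handler = handler

    def call(self, request):
        return self._handler(request)  # the caller is blocked until this returns


# Topic: several nodes learn the state from a single publish.
state_topic = Topic()
planner_seen, logger_seen = [], []
state_topic.subscribe(planner_seen.append)
state_topic.subscribe(logger_seen.append)
state_topic.publish("MOVE_TO_POI")

# Service: only the called node is involved, and the caller receives a reply.
set_state = Service(lambda request: {"accepted": request == "HARVEST"})
reply = set_state.call("HARVEST")
```

Adding another listener to the topic needs no change on the publishing side, while the service caller has to know its server and wait for it.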

## Frequent errors that occur in state machines

As mentioned, there are two ways of implementing the state machine. The
errors that occur vary depending on the implementation.

### Publishing & Subscribing method

- The publisher and subscriber frequencies may differ. If the publisher is publishing at 500 Hz and the subscriber is processing messages at 1 Hz, the subscriber's queue fills up and some messages (states) are lost on the listener end.
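
A small simulation shows how many messages such a mismatch can lose. It is a sketch under stated assumptions: the subscriber has a bounded queue that discards the oldest message when full, it handles exactly one message per wake-up, and the publishing rate is an integer multiple of the subscriber rate. The queue size of 10 is a hypothetical value.

```python
from collections import deque


def simulate(pub_hz, sub_hz, queue_size, seconds):
    """Return (published, handled, dropped) for a slow subscriber."""
    queue = deque(maxlen=queue_size)  # a full deque silently discards the oldest item
    published = 0
    handled = []
    for _wakeup in range(seconds * sub_hz):
        # Messages that arrive between two subscriber wake-ups.
        for _msg in range(pub_hz // sub_hz):
            queue.append(published)
            published += 1
        handled.append(queue.popleft())  # the subscriber handles a single message
    dropped = published - len(handled) - len(queue)
    return published, handled, dropped


published, handled, dropped = simulate(pub_hz=500, sub_hz=1, queue_size=10, seconds=2)
```

With these numbers, 1000 messages are published in two seconds, only two are handled, and 989 are discarded. If the state passes through a short-lived value between two wake-ups, the subscriber never sees it.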

### ROS service method

- The system performance is slow. Because a ROS service call blocks the caller until the response arrives, your code can easily slow down while it waits for each call to complete. If such calls are chained across multiple nodes, the problem becomes worse. If you go down this route, ensuring that there are no unnecessary blocking calls is crucial to system performance.

### How to separate scripts

Assuming you have decided to move forward with publishers and
subscribers, the question of how to design the scripts arises. We
suggest having a `state_machine.py` that only listens to the current
status of all the other nodes and moves from state to state. In this
case, all the subsystem nodes should also publish their status through
publishers.
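
A possible core for such a `state_machine.py` is sketched below without ROS so it can run anywhere. The subsystem keys, status strings, and state names are hypothetical; in a ROS node, `on_status` would be wired to one subscriber callback per subsystem status topic and `publish_state` to a publisher.

```python
class StateMachine:
    """Listens to subsystem statuses and moves from state to state."""

    def __init__(self, publish_state):
        self.publish_state = publish_state  # e.g. a publisher's publish method
        self.state = "IDLE"
        self.status = {"perception": None, "motion": None, "end_effector": None}

    def on_status(self, subsystem, status):
        """Callback for one subsystem's status topic."""
        self.status[subsystem] = status
        self._update()

    def _update(self):
        s = self.status
        new_state = self.state
        if self.state == "IDLE" and s["perception"] == "POI_READY":
            new_state = "MOVE_TO_POI"
        elif self.state == "MOVE_TO_POI" and s["motion"] == "AT_POI":
            new_state = "HARVEST"
        elif self.state == "HARVEST" and s["end_effector"] == "CUT_DONE":
            new_state = "IDLE"
        if new_state != self.state:
            self.state = new_state
            self.publish_state(new_state)  # announce only actual changes


published = []
sm = StateMachine(published.append)
sm.on_status("perception", "POI_READY")
sm.on_status("motion", "MOVING")
sm.on_status("motion", "AT_POI")
sm.on_status("end_effector", "CUT_DONE")
```

Because this script only reads statuses and announces states, all actuation stays in the subsystem and planner nodes. A fuller version would also clear stale statuses when a cycle restarts.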

### Handling Errors in State Machine Design

Consider whether a single error state requiring human intervention is
acceptable. Alternatively, separate error-handling states can automate
resolution for certain non-critical errors. Design your states
accordingly.

Here are some example cases we encountered.

1. When the depth image was sparse and had no value at the requested pixel, the system would go into an error state. Instead, you could handle this by fetching new depth images until the pixel has a non-null value.
2. `plt` (Matplotlib's pyplot) was throwing an error because it was not running in the main thread. The system can keep operating without handling this error, so it can be ignored. Not all errors are critical to the system.
3. The end-effector subsystem threw a motor failure error that halted the entire system. Instead of requiring human intervention in this particular error state, you can reset the motors using the motor API.
163+
164+
Some errors make sense for humans to intervene, however, there are many
165+
more cases that make sense to be handled separately through code. When
166+
designing states, especially error states, carefully decide on the
167+
number and nature of error states to optimize system resilience and
168+
efficiency.
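
The three cases above could be handled along the following lines. This is a sketch, not the code used on the robot: `get_depth`, the error kind strings, and the state names are hypothetical, and the retry count is capped (an added assumption) so the loop cannot run forever.

```python
import math


def read_valid_depth(get_depth, pixel, max_attempts=5):
    """Re-read the depth until the pixel has a usable value (case 1).

    `get_depth` is a hypothetical callable returning the depth at `pixel`,
    or None/NaN when the depth image is sparse there.
    """
    for _ in range(max_attempts):
        depth = get_depth(pixel)
        if depth is not None and not math.isnan(depth):
            return depth
    return None  # the caller decides which error state to enter


# Hypothetical mapping from error kind to the state that handles it.
ERROR_POLICY = {
    "plot_not_in_main_thread": "IGNORE",  # case 2: non-critical, keep running
    "motor_failure": "RESET_MOTORS",      # case 3: automated recovery
}


def error_state_for(error_kind):
    """Unknown errors fall back to a state that waits for a human."""
    return ERROR_POLICY.get(error_kind, "ERROR_NEEDS_HUMAN")


readings = iter([None, float("nan"), 0.82])
depth = read_valid_depth(lambda pixel: next(readings), pixel=(120, 64))
```

Here the third reading is accepted, a motor failure maps to an automated reset state, and anything not listed in `ERROR_POLICY` still ends in the single human-intervention state.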
