
Commit 6b0e37d

author
Phillip Kuznetsov
authored
Add OpenTelemetry tutorial (#164)
* Add otel documentation
  Signed-off-by: Phillip Kuznetsov <pkuznetsov@pixielabs.ai>
* Add indices and change the order of the tutorial
  Signed-off-by: Phillip Kuznetsov <pkuznetsov@pixielabs.ai>
* Address comments from review
  Signed-off-by: Phillip Kuznetsov <pkuznetsov@pixielabs.ai>
1 parent 3dd6e04 commit 6b0e37d

3 files changed

Lines changed: 168 additions & 0 deletions


Lines changed: 166 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,166 @@
1+
---
2+
title: "Export Data to OpenTelemetry"
3+
metaTitle: "Tutorials | Integrations and Alerts | Pixie <> OpenTelemetry"
4+
metaDescription: "Export Data to OpenTelemetry"
5+
order: 3
6+
redirect_from:
7+
- /tutorials/otel/
8+
---
9+
10+
11+
Pixie comes packaged with an OpenTelemetry exporter. You can write PxL scripts that define how Pixie DataFrames are transformed into OpenTelemetry data. This article walks through a script that exports HTTP data collected by Pixie to an OpenTelemetry endpoint. More detailed PxL documentation for the OpenTelemetry integration is available [here](/reference/pxl/otel-export).
12+
13+
14+
## Example OpenTelemetry Export PxL Script
15+
16+
The following [PxL script](/tutorials/pxl-scripts/write-pxl-scripts/#overview) calculates the rate of HTTP requests made to each pod in your cluster and exports that data as an OpenTelemetry Gauge metric.
17+
18+
19+
```python
20+
import px
21+
# Read in the http_events table
22+
df = px.DataFrame(table='http_events', start_time='-10s')
23+
24+
# Attach the pod and service metadata
25+
df.pod = df.ctx['pod']
26+
df.service = df.ctx['service']
27+
# Count the number of requests per pod, service, and request path
28+
df = df.groupby(['pod', 'service', 'req_path']).agg(
29+
throughput=('latency', px.count),
30+
time_=('time_', px.max),
31+
)
32+
33+
# Change the denominator if you change start_time above.
34+
df.requests_per_s = df.throughput / 10
35+
36+
px.export(df, px.otel.Data(
37+
# endpoint arg not required if run in a plugin that provides the endpoint
38+
endpoint=px.otel.Endpoint(
39+
url='0.0.0.0:55690',
40+
headers={
41+
'apikey': '12345',
42+
}
43+
),
44+
resource={
45+
# service.name is required by OpenTelemetry.
46+
'service.name' : df.service,
47+
'service.instance.id': df.pod,
48+
'k8s.pod.name': df.pod,
49+
},
50+
data=[
51+
px.otel.metric.Gauge(
52+
name='http.throughput',
53+
description='The number of messages sent per second',
54+
value=df.requests_per_s,
55+
attributes={
56+
'req_path': df.req_path,
57+
}
58+
)
59+
]
60+
))
61+
```
62+
63+
64+
65+
## The Data
66+
The first part of this script (everything before the `px.export` call) reads in the `http_events` data and counts the number of requests made to each pod over the last 10 seconds.
67+
68+
69+
```python
70+
import px
71+
72+
# Read in the http_events table
73+
df = px.DataFrame(table='http_events', start_time='-10s')
74+
75+
# Attach the pod and service metadata
76+
df.pod = df.ctx['pod']
77+
df.service = df.ctx['service']
78+
79+
# Count the number of requests per pod, service, and request path
80+
df = df.groupby(['pod', 'service', 'req_path']).agg(
81+
throughput=('latency', px.count),
82+
time_=('time_', px.max),
83+
)
84+
85+
# Calculate the rate for the time window
86+
df.requests_per_s = df.throughput / 10
87+
```
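
To make the aggregation concrete, here is a plain-Python sketch of the same computation: group rows by pod, service, and request path, count them, keep the latest timestamp, and divide by the window length. This is not PxL and not how Pixie executes the query. The sample rows, the `WINDOW_SECONDS` constant, and the `requests_per_second` helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Assumed window length; it must match start_time='-10s' in the PxL script.
WINDOW_SECONDS = 10


def requests_per_second(rows, window_seconds=WINDOW_SECONDS):
    """Group rows by (pod, service, req_path) and compute a per-group request rate."""
    counts = defaultdict(int)
    latest = {}
    for row in rows:
        key = (row["pod"], row["service"], row["req_path"])
        counts[key] += 1  # plays the role of px.count
        latest[key] = max(latest.get(key, row["time_"]), row["time_"])  # px.max
    return {
        key: {
            "throughput": count,
            "time_": latest[key],
            "requests_per_s": count / window_seconds,
        }
        for key, count in counts.items()
    }


# Hypothetical sample rows standing in for http_events records.
sample_rows = [
    {"pod": "default/cart-0", "service": "default/cart", "req_path": "/items", "time_": 1},
    {"pod": "default/cart-0", "service": "default/cart", "req_path": "/items", "time_": 3},
    {"pod": "default/cart-0", "service": "default/cart", "req_path": "/health", "time_": 2},
]

rates = requests_per_second(sample_rows)
```

Because the rate is a count divided by a fixed window, the divisor has to change whenever the query window changes, which is why the PxL script calls this out in a comment.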
88+
89+
90+
91+
## Exporting
92+
93+
To export the data, you’ll call `px.export` with the DataFrame as the first argument and the export target `px.otel.Data` as the second argument.
94+
95+
96+
```python
97+
px.export(df, px.otel.Data(...))
98+
```
99+
100+
101+
The export target (`px.otel.Data`) describes which columns to use for the corresponding OpenTelemetry fields. You specify a column using the same syntax as in a regular query: `df.column_name` or `df['column_name']`. Every column you reference must exist in the DataFrame passed as the `df` argument, or the PxL compiler will throw an error.
102+
103+
104+
## Specifying a Collector Endpoint and Authentication
105+
106+
The PxL OpenTelemetry exporter needs to talk with a collector. You specify the collector's address and any authentication headers via the `endpoint` parameter:
107+
108+
109+
```python
110+
endpoint=px.otel.Endpoint(
111+
url='0.0.0.0:55690',
112+
headers={
113+
'api-key': '12345',
114+
}
115+
),
116+
```
117+
118+
119+
The endpoint URL must be an OpenTelemetry gRPC endpoint and must be secured with SSL. Don’t specify a protocol prefix. Optionally, you can also specify the headers passed to the endpoint. Some OpenTelemetry collector providers look for authentication tokens or API keys in the connection context. The `headers` field is where you can add this information.
120+
121+
Note that if you’re writing a [plugin script](/reference/plugins/plugin-system), this information should be passed in from the plugin context.
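
As a rough illustration of the URL rules above, the following plain-Python sketch checks a candidate endpoint string before you paste it into a script. It is not part of Pixie. The `check_endpoint_url` helper is hypothetical, and it assumes the `host:port` form used in the example, with a port in the valid TCP range of 1 to 65535.

```python
def check_endpoint_url(url):
    """Return (host, port) if url looks like 'host:port' with no protocol prefix."""
    if "://" in url:
        raise ValueError("do not include a protocol prefix such as https://")
    # Split on the last colon so the host part stays intact.
    host, sep, port_text = url.rpartition(":")
    if not sep or not host:
        raise ValueError("expected the form host:port")
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("port must be a number")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return host, port


host, port = check_endpoint_url("0.0.0.0:55690")
```

A check like this only catches formatting mistakes. It cannot tell you whether the collector actually speaks gRPC or is secured with SSL.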
122+
123+
124+
## Transforming Data
125+
126+
The core idea of the PxL OpenTelemetry export is that you’re converting columnar data from a Pixie DataFrame into the fields of the OpenTelemetry data you wish to capture. You can reference a column by using the attribute syntax `df.column_name`. Under the hood, Pixie will convert the values for each row into a new OpenTelemetry message. The columns must match up with the DataFrame that you are exporting (the first argument to `px.export`); otherwise you will receive a compiler error.
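
The column-to-field mapping can be pictured with a small plain-Python sketch. It is a simplified model under stated assumptions, not Pixie's implementation: the `rows_to_messages` helper, the columnar `columns` dictionary, and the shape of the output dictionaries are all hypothetical and do not represent the OpenTelemetry wire format.

```python
def rows_to_messages(columns, resource, metric_name, value, attributes):
    """Turn columnar data into one message dict per row.

    `columns` maps a column name to its list of values. `resource` and
    `attributes` map an output key to the name of the column that fills it.
    """
    referenced = [value, *resource.values(), *attributes.values()]
    missing = sorted(set(referenced) - set(columns))
    if missing:
        # Loosely analogous to the compiler error for unknown columns.
        raise KeyError(f"unknown columns: {missing}")

    num_rows = len(columns[value])
    return [
        {
            "resource": {key: columns[col][i] for key, col in resource.items()},
            "metric": {
                "name": metric_name,
                "value": columns[value][i],
                "attributes": {key: columns[col][i] for key, col in attributes.items()},
            },
        }
        for i in range(num_rows)
    ]


# Hypothetical two-row DataFrame in columnar form.
columns = {
    "service": ["default/cart", "default/cart"],
    "pod": ["default/cart-0", "default/cart-1"],
    "req_path": ["/items", "/health"],
    "requests_per_s": [0.2, 0.1],
}

messages = rows_to_messages(
    columns,
    resource={"service.name": "service", "k8s.pod.name": "pod"},
    metric_name="http.throughput",
    value="requests_per_s",
    attributes={"req_path": "req_path"},
)
```

Each row yields its own message, and a reference to a column that is not in the DataFrame fails before any message is produced.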
127+
128+
129+
## Specifying a Resource
130+
131+
The `resource` parameter defines the entity producing the [telemetry data](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/sdk.md). Users define the `resource` argument as a dictionary mapping attribute keys to the STRING columns that populate the attribute values. The PxL configuration expects `service.name` to be set; all other attributes are optional.
132+
133+
When creating new attribute keys, keep in mind OpenTelemetry has a [recommended pattern](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/semantic_conventions/README.md#document-conventions) that you should follow to maintain broad compatibility with OpenTelemetry collectors.
134+
135+
```python
136+
resource={
137+
# service.name is required by OpenTelemetry.
138+
'service.name' : df.service,
139+
'service.instance.id': df.pod,
140+
'k8s.pod.name': df.pod,
141+
},
142+
```
143+
144+
145+
146+
## Specifying Data
147+
148+
The `data` parameter allows you to specify a list of metrics or traces that are generated from the DataFrame. In the example script, we specify a single Gauge metric for the `df.requests_per_s` column. We also supply an attribute for the metric, `req_path`. Each Metric and Trace type supports a custom attribute field. Metric/Trace attributes work similarly to Resource attributes, but they are scoped only to the specific metric or trace.
149+
150+
151+
```python
152+
data=[
153+
px.otel.metric.Gauge(
154+
name='http.throughput',
155+
description='The number of messages sent per second',
156+
value=df.requests_per_s,
157+
attributes={
158+
'req_path': df.req_path,
159+
}
160+
)
161+
]
162+
```
163+
164+
165+
We currently support a limited set of OpenTelemetry signal types: `metric.Gauge`, `metric.Summary` and `trace.Span`. We also support a subset of the available fields for each instrument. You can see the full set of features [in our API documentation](/reference/pxl/otel-export). If you want support for other fields, please [open an issue](https://github.com/pixie-io/pixie).
166+

content/en/04-tutorials/03-integrations/index.md

Lines changed: 1 addition & 0 deletions
@@ -9,3 +9,4 @@ Tutorials:
99

1010
- [Slack Alerts using the Pixie API](/tutorials/integrations/slackbot-alert)
1111
- [Grafana Datasource Plugin](/tutorials/integrations/grafana)
12+
- [Export Data to OpenTelemetry](/tutorials/integrations/otel)

content/en/04-tutorials/index.md

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ order: 40
2525

2626
- [Slack Alerts using the Pixie API](/tutorials/integrations/slackbot-alert)
2727
- [Grafana Datasource Plugin](/tutorials/integrations/grafana)
28+
- [Export Data to OpenTelemetry](/tutorials/integrations/otel)
2829

2930
#### Collecting Custom Data
3031
