observability quick tidy-up

RichardSmedley · RichardSmedley · commit 34714f9f8fb5 · 2025-02-18T15:23:52.000Z
diff --git a/modules/concept-docs/pages/durability-replication-failure-considerations.adoc b/modules/concept-docs/pages/durability-replication-failure-considerations.adoc
@@ -1,30 +1,60 @@
 = Failure Considerations
 :description: Data durability refers to the fault tolerance and persistence of data in the face of software or hardware failure.
-:page-topic-type: concept
-// :page-aliases: ROOT:failure-considerations,ROOT:durability,ROOT:enhanced-durability,7.6@server:developer-guide:durability.adoc
+:page-toclevels: 2
+// :page-aliases: ROOT:failure-considerations.adoc,ROOT:durability.adoc,ROOT:enhanced-durability.adoc,7.6@server:developer-guide:durability.adoc
 
 include::project-docs:partial$attributes.adoc[]
 
 [abstract]
 {description}
+Prepare your app for the inevitable challenges of working in a distributed network environment.
+
+
+
 Even the most reliable software and hardware might fail at some point, and along with the failures, introduce a chance of data loss.
-Couchbase’s durability features include Synchronous Replication, and the possibility to use distributed, multi-document ACID transactions.
+Couchbase's durability features include Synchronous Replication, and the possibility to use distributed, multi-document ACID transactions.
 It is the responsibility of the development team and the software architect to evaluate the best choice for each use case.
 
+This page covers the durability options offered by Couchbase Server, 
+with the rest of this section covering logging, health check, and observability --
+all key to understanding the health of a complex, distributed environment.
+
+
+
 include::{version-common}@sdk:shared:partial$durability-replication-failure-considerations.adoc[tag=intro]
 
 include::{version-common}@sdk:shared:partial$durability-replication-failure-considerations.adoc[tag=syncrep]
 include::{version-common}@sdk:shared:partial$durability-replication-failure-considerations.adoc[tag=syncrep2]
 include::{version-common}@sdk:shared:partial$durability-replication-failure-considerations.adoc[tag=syncrep3]
 
-include::{version-common}@sdk:shared:partial$durability-replication-failure-considerations.adoc[tag=older]
-
 include::{version-common}@sdk:shared:partial$durability-replication-failure-considerations.adoc[tag=performance]
 
+
+=== Legacy Durability
+
+Early versions of Couchbase Server used client-verified durability.
+This is still available in the SDK --
+see the https://docs.couchbase.com/sdk-api/couchbase-scala-client/com/couchbase/client/scala/durability/index.html[API documentation on durability] for details of `PersistTo` and `ReplicateTo` --
+but in almost every case with current Couchbase Server versions it's best to use the guarantees offered by the Server.
+
+
+
+
 include::{version-common}@sdk:shared:partial$durability-replication-failure-considerations.adoc[tag=txns]
 
+
+// placeholder for discussions about what happens when a node goes down.
+
 // include::{version-common}@sdk:shared:partial$durability-replication-failure-considerations.adoc[tag=failover]
 
+
+
+
+
+
+
+////
 == Further Reading
 
 For now, much of the discussion (concept-level documentation) can still be found interleaved in the xref:howtos:error-handling.adoc#exception-handling[practical error handling howto doc].
+////
diff --git a/modules/concept-docs/pages/response-time-observability.adoc b/modules/concept-docs/pages/response-time-observability.adoc
@@ -1,10 +1,9 @@
 = Tracing
 :description: Tracing and Metrics provide fine-grained insight into how an application is performing, and help to diagnose when it is not.
-:nav-title: Request Tracing and Metrics
-:page-topic-type: concept
+// :nav-title: Request Tracing and Metrics
 :page-aliases: ROOT:threshold-logging.adoc
+:page-toclevels: 2
 
-include::project-docs:partial$attributes.adoc[]
 
 [abstract]
 {description}
diff --git a/modules/howtos/pages/collecting-information-and-logging.adoc b/modules/howtos/pages/collecting-information-and-logging.adoc
@@ -1,7 +1,7 @@
 = Logging
 :description: Configuring logging; working with the event bus; and log redaction for data security.
-:page-topic-type: howto
-:page-aliases: ROOT:logging
+:page-toclevels: 3
+:page-aliases: ROOT:logging.adoc
 
 [abstract]
 {description}
@@ -84,7 +84,7 @@ NOTE: Gradle automatically uses the correct SLF4J API 2.x dependency required by
 ====
 
 [configuring-log4j]
-==== Configuring Log4j 2 output
+==== Configuring Log4j 2 Output
 
 Log4j 2 needs a configuration file to tell it which messages to log, where to write them, and how each message should be formatted.
 
@@ -163,6 +163,7 @@ Add these as children of the `dependencies` element.
 TIP: An alternate way to ensure Maven uses the correct version of the SLF4J API is to declare the dependency on `slf4j-jdk14` *before* the dependency on the Couchbase SDK.
 See the Maven documentation on https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Transitive_Dependencies[Transitive Dependencies] to learn more about how Maven resolves transitive dependency version conflicts.
 --
+
 Gradle::
 +
 --
@@ -176,6 +177,7 @@ NOTE: Gradle automatically uses the correct SLF4J API 2.x dependency required by
 --
 ====
 
+
 [configuring-the-jdk-logger]
 ==== Configuring a JUL Logger
 
diff --git a/modules/howtos/pages/health-check.adoc b/modules/howtos/pages/health-check.adoc
@@ -1,24 +1,42 @@
-= Diagnosing and preventing Network Problems with Health Check
-:description: In today's distributed and virtual environments, users will often not have full administrative control over their whole network.
-:navtitle: Health Check
-:page-topic-type: howto
+= Health Check
+:description: pass:q[Health Check provides `ping()` and `diagnostics()` tests for the health of the network and the cluster.]
+:page-aliases: concept-docs:health-check.adoc
+// :page-aliases: ROOT:health-check.adoc
+:page-toclevels: 2
+
+
 
 [abstract]
 {description}
-Health Check introduces _Ping_ to check nodes are still healthy, and to force idle connections to be kept alive in environments with eager shutdowns of unused resources.
-_Diagnostics_ requests a report from all the connected sockets against the cluster (from a client point of view), giving instant, but passive health check information.
 
 
-Diagnosing problems in distributed environments is far from easy, so Couchbase provides a _Health Check API_ with `Ping()` for active monitoring, and `Diagnostics()` for a look at what the client believes is the current state of the cluster.
-More extensive discussion of the uses of Health Check can be found in the xref:concept-docs:health-check.adoc[Health Check Concept Guide].
+
+In today's distributed and virtual environments, users will often not have full administrative control over their whole network.
+Working in distributed environments is hard. Latencies come and go, and so do connections in their entirety.
+Is it a network glitch, or is the remote cluster down?
+Sometimes just knowing the likely cause is enough to get a good start on a workaround, or at least avoid hours wasted on an inappropriate solution.
+
+Health Check features _Ping_ to check nodes are still healthy, and to force idle connections to be kept alive in environments with eager shutdowns of unused resources.
+_Diagnostics_ requests a report from a node, giving instant health check information.
+
+
+
+// Uses
+include::{version-common}@sdk:pages:partial$health-check.adoc[tag="uses"]
+
+
 
 == Ping
 
+
+`Ping` _actively_ queries the status of the specified services, giving status and latency information for every node reachable.
+In addition to its use as a monitoring tool, a regular `Ping` can be used in an environment which does not respect keep-alive values for a connection.
+
 At its simplest, `ping` provides information about the current state of the connections in the Couchbase Cluster, by actively polling:
 
 [source,java]
 ----
-include::example$HealthCheck.java[tag=ping-basic]
+include::devguide:example$java/HealthCheck.java[tag=ping-basic]
 ----
 
 This will print the latency for each socket (endpoint) connected per service. More information is available on the classes. 
@@ -27,24 +45,25 @@ This is made easy by the `exportToJson` method:
 
 [source,java]
 ----
-include::example$HealthCheck.java[tag=ping-json-export]
+include::devguide:example$java/HealthCheck.java[tag=ping-json-export]
 ----
 
 By default the SDK will ping all services available on the target cluster. 
 You can customize the type of services to ping through the `PingOptions`:
 
 [source,java]
 ----
-include::example$HealthCheck.java[tag=ping-options]
+include::devguide:example$java/HealthCheck.java[tag=ping-options]
 ----
 
 In this example, only the Query service is included in the ping report.
 
-Note that `ping` is available both on the `Cluster` and the `Bucket` level. 
-The difference is that at the cluster level, the key-value service might not be
+Note that `ping` is available both at the `Cluster` and the `Bucket` level. 
+The difference is that at the cluster level, the key-value (Data) service might not be
 included based on the Couchbase Server version in use. 
 If you want to make sure the key-value service is included, perform it at the bucket level.
 
+
 == Diagnostics
 
 Diagnostics works in a similar fashion to `ping` in the sense that it returns a report of how all the sockets/endpoints are doing, but the main difference is that it is passive. 
@@ -53,17 +72,18 @@ This makes it much cheaper to call on a regular basis, but does not provide any
 
 [source,java]
 ----
-include::example$HealthCheck.java[tag=diagnostics-basic]
+include::devguide:example$java/HealthCheck.java[tag=diagnostics-basic]
 ----
 
 Because it is passive, diagnostics are only available at the `Cluster` level and cover everything in the current SDK state. Also, because it does not perform any I/O, you cannot proactively filter the list of services that are returned; instead, look only at the ones that are interesting to you.
 
-A `DiagnosticsResult` has one interesting property over a ping result: It provides a cumulative `ClusterState` through the `state()` method. 
-The state can be `ONLINE`, `DEGRADED` or `OFFLINE`. This allows to give a single, although simplistic, view on how your cluster is doing from a client point of view. 
+A `DiagnosticsResult` has one interesting property over a ping result -- it provides a cumulative `ClusterState` through the `state()` method. 
+The state can be `ONLINE`, `DEGRADED` or `OFFLINE`.
+This gives a single, although simplistic, view of how your cluster is doing from a client point of view.
 The state is determined as follows:
 
  * If at least one socket is open and all of them are connected, it is `ONLINE`
  * If at least one is connected but not all are, it is `DEGRADED`
  * If none are connected, it is `OFFLINE`
 
-Of course you can iterate over the individual states and apply a different algorithm if needed.
+You can iterate over the individual states and apply a different algorithm if needed.
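The aggregation rule in the list above can be sketched in plain Java. This is only an illustration of the rule as described, not the SDK's implementation: `ClusterStateSketch`, `SocketState`, and `AggregateState` are hypothetical names introduced here, and the real per-endpoint states in the SDK are assumed to be richer than the two used below.

```java
import java.util.List;

/** Sketch of the aggregation rule described above. All names are hypothetical, not SDK types. */
final class ClusterStateSketch {

    /** Hypothetical, simplified per-socket state. */
    enum SocketState { CONNECTED, DISCONNECTED }

    /** Mirrors the three cumulative states named in the text. */
    enum AggregateState { ONLINE, DEGRADED, OFFLINE }

    /** Applies the three rules: all connected, some connected, none connected. */
    static AggregateState aggregate(List<SocketState> sockets) {
        long connected = sockets.stream()
                .filter(s -> s == SocketState.CONNECTED)
                .count();
        if (connected == 0) {
            // None connected; this also covers the case of no sockets at all.
            return AggregateState.OFFLINE;
        }
        // At least one socket is connected: ONLINE only if every socket is.
        return connected == sockets.size() ? AggregateState.ONLINE : AggregateState.DEGRADED;
    }

    public static void main(String[] args) {
        System.out.println(aggregate(List.of(SocketState.CONNECTED, SocketState.CONNECTED)));
        System.out.println(aggregate(List.of(SocketState.CONNECTED, SocketState.DISCONNECTED)));
        System.out.println(aggregate(List.of(SocketState.DISCONNECTED)));
    }
}
```

A different policy, for example treating a cluster as degraded only when a particular service has no connected sockets, could replace the body of `aggregate` while keeping the same inputs.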
diff --git a/modules/howtos/pages/observability-metrics.adoc b/modules/howtos/pages/observability-metrics.adoc
@@ -1,6 +1,7 @@
 = Metrics Reporting
 :description: Individual request tracing presents a very specific (though isolated) view of the system.
-:page-topic-type: howto
+:page-toclevels: 2
+
 
 [abstract]
 {description}
@@ -21,7 +22,7 @@ By default the metrics will be emitted every 10 minutes, but you can customize t
 
 [source,java]
 ----
-include::example$Metrics.java[tag=metrics-enable-custom,indent=0]
+include::devguide:example$java/Metrics.java[tag=metrics-enable-custom,indent=0]
 ----
 
 Once enabled, there is no further configuration needed. The `LoggingMeter` will emit the collected request statistics every interval.
@@ -128,7 +129,7 @@ For metrics, add this logic to the application:
 
 [source,java]
 ----
-include::example$Metrics.java[tag=metrics-otel-prometheus,indent=0]
+include::devguide:example$java/Metrics.java[tag=metrics-otel-prometheus,indent=0]
 ----
 
 
@@ -242,7 +243,7 @@ See the Micrometer documentation for details.
 
 [source,java]
 ----
-include::example$MetricsMicrometer.java[tag=metrics-micrometer-prometheus,indent=0]
+include::devguide:example$java/MetricsMicrometer.java[tag=metrics-micrometer-prometheus,indent=0]
 ----
 
 
diff --git a/modules/howtos/pages/observability-tracing.adoc b/modules/howtos/pages/observability-tracing.adoc
@@ -1,6 +1,6 @@
 = Request Tracing
 :description: Collecting information about an individual request and its response is an essential feature of every observability stack.
-:page-topic-type: howto
+:page-toclevels: 2
 :page-aliases: ROOT:tracing-from-the-sdk.adoc
 
 [abstract]
@@ -18,7 +18,7 @@ It is possible to customize this behavior by modifying the configuration:
 
 [source,java]
 ----
-include::example$Tracing.java[tag=tracing-configure,indent=0]
+include::devguide:example$java/Tracing.java[tag=tracing-configure,indent=0]
 ----
 
 In this case the emit interval is one minute and Key/Value requests will only be considered if their latency is greater than or equal to two seconds.
@@ -69,9 +69,11 @@ More information will be provided as we get closer to stabilization.
 
 
 == OpenTelemetry Integration
+
 The built-in tracer is great if you do not have a centralized monitoring system, but if you already plug into the OpenTelemetry ecosystem we want to make sure to provide first-class support.
 
 === Exporting to OpenTelemetry
+
 This method exports tracing telemetry in OpenTelemetry's standard format (OTLP), which can be sent to any OTLP-compatible receiver such as Jaeger, Zipkin or `opentelemetry-collector`.
 
 Add this to your Maven configuration, or the equivalent for your build tool of choice:
diff --git a/modules/howtos/pages/slow-operations-logging.adoc b/modules/howtos/pages/slow-operations-logging.adoc
@@ -1,6 +1,6 @@
 = Slow Operations Logging
 :description: Tracing information on slow operations can be found in the logs as threshold logging, orphan logging, and other span metrics.
-:page-topic-type: howto
+:page-toclevels: 2
 
 [abstract]
 {description}