What You’ll Do
Start InsureWatch on the lab/3-collector branch. All services are running and generating telemetry — but nothing appears in Grafana. A Collector container is running, but its pipeline config is empty. Complete collector/skeleton.yml so the Collector receives OTLP, processes it, and exports to the LGTM stack.
Branch: lab/3-collector
Primary file: collector/skeleton.yml
Setup
cd insurewatch
git checkout lab/3-collector
docker compose up --build
Submit a claim after everything starts:
curl -s -X POST http://localhost:3000/api/claims \
-H "Content-Type: application/json" \
-d '{
"customer_id": "CUST001",
"policy_number": "POL-001",
"claim_type": "medical",
"amount": 500,
"description": "Lab 3 test",
"incident_date": "2026-03-01"
}'
Open Grafana at http://localhost:3100 → Explore → Tempo. No traces. The collector container logs will show errors — it can’t validate the config because the pipeline is incomplete.
The Architecture
In the main branch, services send OTLP directly to the LGTM stack:
services → lgtm:4318 → Tempo / Prometheus / Loki
In this branch, a Collector sits in the middle:
services → collector:4318 → [pipeline] → lgtm:4318 → Tempo / Prometheus / Loki
The OTEL_EXPORTER_OTLP_ENDPOINT env var in each service’s docker-compose entry now points to http://collector:4318 instead of http://lgtm:4318. The Collector is the only gateway to the backend.
Check docker-compose.yml to confirm — every service has:
OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4318
The Collector is running with this mount:
volumes:
- ./collector/skeleton.yml:/etc/otelcol/config.yml:ro
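For orientation, the relevant parts of docker-compose.yml might look roughly like the fragment below. Only the environment variable and the volume mount come from this lab; the image, the command flag, and the layout of the claims-service entry are assumptions for illustration, so treat the real file in the repository as authoritative.

```yaml
# Sketch only, not a complete compose file.
# Everything except the env var and the volume mount is assumed.
services:
  claims-service:                      # illustrative entry; build/image settings omitted
    environment:
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4318
    depends_on:
      - collector

  collector:
    image: otel/opentelemetry-collector-contrib:latest   # assumed image
    command: ["--config=/etc/otelcol/config.yml"]        # assumed flag, matching the mount path
    volumes:
      - ./collector/skeleton.yml:/etc/otelcol/config.yml:ro
```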
Your job: fill in collector/skeleton.yml.
The Collector Config Structure
Every Collector config has five top-level sections:
receivers: # where data comes in
processors: # what happens to data in transit
exporters: # where data goes out
extensions: # optional: health checks, pprof, etc.
service:
  pipelines: # wires receivers → processors → exporters per signal
Open collector/skeleton.yml. It has the TODO structure. You’ll fill it in step by step.
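The extensions section is not needed for this lab. As a rough illustration of how it fits into the structure, the sketch below enables a health check endpoint. It assumes the Collector distribution in use bundles the health_check extension. Note that an extension only runs once it is also listed under service.extensions.

```yaml
extensions:
  health_check:
    endpoint: 0.0.0.0:13133    # default port for this extension

service:
  extensions: [health_check]   # defining an extension is not enough; enable it here
  # pipelines: ...             # pipelines still have to be defined alongside this
```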
Step 1: Define the Receiver
The services send OTLP over HTTP to port 4318. You need an otlp receiver that listens on both gRPC (4317) and HTTP (4318):
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
The 0.0.0.0 binding makes the Collector accept connections from any Docker network interface — required for container-to-container communication.
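Because the receiver also listens on gRPC, a service could switch transports without any change to the Collector. The fragment below is a sketch of what that might look like in docker-compose.yml. It assumes the service's SDK honors the standard OTEL_EXPORTER_OTLP_PROTOCOL variable, and the service entry shown is illustrative.

```yaml
services:
  claims-service:                      # illustrative entry
    environment:
      - OTEL_EXPORTER_OTLP_PROTOCOL=grpc
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4317   # gRPC port from the receiver above
```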
Step 2: Define Processors
Processors transform or gate data between receiver and exporter. Two processors are recommended for almost every production Collector deployment:
memory_limiter — prevents the Collector process from consuming unbounded memory when a spike of telemetry arrives. It must always be the first processor in the pipeline.
batch — buffers spans/metrics/logs and sends them in batches. This dramatically reduces the number of export requests and improves throughput.
processors:
  memory_limiter:
    check_interval: 5s
    limit_mib: 256
    spike_limit_mib: 64
  batch:
    timeout: 5s
    send_batch_size: 1024
memory_limiter being first is enforced by convention, not by the Collector itself. If you put it after batch, the Collector will start — but during a memory pressure event, the batch buffer will grow before the limiter fires, potentially causing an OOM. Always: memory_limiter → batch.
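To make the numbers concrete: the limiter derives a soft limit of limit_mib minus spike_limit_mib. With the values used in this step it starts refusing data at roughly 192 MiB and treats 256 MiB as the hard limit. The annotated sketch below restates this and shows the percentage-based alternative. The instance name memory_limiter/percent and the percentages are illustrative, not tuned values.

```yaml
processors:
  memory_limiter:
    check_interval: 5s
    limit_mib: 256          # hard limit
    spike_limit_mib: 64     # soft limit = 256 - 64 = 192 MiB

  # Alternative sketch: size the limits relative to the memory available
  # to the container instead of using fixed MiB values.
  memory_limiter/percent:   # hypothetical instance name
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 20
```

Only one of the two instances would be referenced in a pipeline.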
Step 3: Define the Exporter
The LGTM stack accepts OTLP/HTTP on port 4318 (inside the Docker network, lgtm:4318). Use the otlphttp exporter:
exporters:
  otlphttp/lgtm:
    endpoint: http://lgtm:4318
The /lgtm suffix is an arbitrary name — it’s how the Collector supports multiple instances of the same component type. You could have otlphttp/staging and otlphttp/production pointing to different backends. The base type is otlphttp; the /lgtm part is the instance name.
Note: Component names only need to be unique within their own section, so a receiver and an exporter may share a name; a config with both an otlp receiver and an otlp exporter is valid. The exporter here is otlphttp rather than otlp because the otlp exporter speaks gRPC, while this lab sends OTLP over HTTP to lgtm:4318.
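The exporter accepts more than an endpoint. As a sketch, the settings below make compression, retry, and queueing behavior explicit. The values are illustrative rather than recommendations, and leaving them unset is fine for this lab.

```yaml
exporters:
  otlphttp/lgtm:
    endpoint: http://lgtm:4318
    compression: gzip
    retry_on_failure:          # retry failed exports with backoff
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s   # give up on a batch after this long
    sending_queue:             # buffer batches while the backend is slow or down
      enabled: true
      queue_size: 1000
```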
Step 4: Wire the Pipelines
The service.pipelines section connects receivers, processors, and exporters for each signal:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/lgtm]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/lgtm]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/lgtm]
Three separate pipelines — one per signal type. Each pipeline is independent: you could filter logs differently than traces, or fan out metrics to a second backend without affecting traces.
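As a sketch of that independence, the fragment below drops low-severity log records in the logs pipeline only and leaves traces untouched. It assumes the Collector distribution includes the filter processor, and the instance name filter/drop_low_severity is made up for this example. The memory_limiter and batch definitions from Step 2 are omitted for brevity.

```yaml
processors:
  filter/drop_low_severity:      # hypothetical instance name
    error_mode: ignore
    logs:
      log_record:
        # Drops records below INFO; records with no severity set also match.
        - 'severity_number < SEVERITY_NUMBER_INFO'

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/lgtm]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, filter/drop_low_severity, batch]
      exporters: [otlphttp/lgtm]
```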
The Complete Config
Your finished collector/skeleton.yml should look like this:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  memory_limiter:
    check_interval: 5s
    limit_mib: 256
    spike_limit_mib: 64
  batch:
    timeout: 5s
    send_batch_size: 1024
exporters:
  otlphttp/lgtm:
    endpoint: http://lgtm:4318
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/lgtm]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/lgtm]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/lgtm]
Verification
Restart the Collector to pick up the new config (you don’t need to rebuild — the config file is mounted as a volume):
docker compose restart collector
Watch the Collector logs for a clean startup:
docker compose logs -f collector
You should see lines like:
Everything is ready. Begin running and processing data.
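If that line does not appear, raising the Collector's own log level can make the cause easier to spot. A minimal sketch using the service.telemetry settings, to be merged into the existing service section and removed again once things work:

```yaml
service:
  telemetry:
    logs:
      level: debug   # default is info
  # pipelines: ...   # unchanged
```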
Submit a claim:
curl -s -X POST http://localhost:3000/api/claims \
-H "Content-Type: application/json" \
-d '{
"customer_id": "CUST002",
"policy_number": "POL-002",
"claim_type": "auto",
"amount": 3500,
"description": "Lab 3 verified",
"incident_date": "2026-03-01"
}'
Open Grafana → Explore → Tempo. You should now see the full trace.
Check metrics in Prometheus:
claims_approved_total
Check logs in Loki with filter: {service_name="claims-service"}.
All three signals (traces, metrics, and logs) are now flowing through the Collector to Grafana.
What You Learned
The Collector is a pipeline, not a proxy. Receivers, processors, and exporters are distinct stages. The receiver accepts data in a specific format (OTLP, Jaeger, Zipkin, Prometheus scrape). Processors transform it in flight. Exporters send it to a backend in potentially a different format. This separation is what makes the Collector the right place to add sampling, filtering, enrichment, and fan-out without touching service code.
memory_limiter goes first. It’s the circuit breaker for the pipeline. When the Collector’s memory usage crosses the limiter’s soft limit, it starts refusing incoming data before the process runs out of memory. Putting it first means it can shed load before batch accumulates it.
Named component instances enable multi-backend fan-out. If you have two backends:
exporters:
  otlphttp/staging:
    endpoint: http://staging-backend:4318
  otlphttp/production:
    endpoint: http://prod-backend:4318
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/staging, otlphttp/production]
This fan-out happens in the Collector — zero changes to service code. This is one of the most valuable properties of the Collector architecture: centralize routing and backend configuration, keep services simple.
The Collector is the right place for cross-cutting concerns. Sampling decisions, PII scrubbing, resource attribute enrichment, routing to different backends by environment — none of these belong in service code. They’re infrastructure policy, not application logic.
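A rough sketch of what such policy could look like in a Collector config follows. It assumes a distribution that ships the resource, attributes, and probabilistic_sampler processors. The instance names, the customer.email attribute, and the 25% sampling rate are all hypothetical.

```yaml
processors:
  resource/env:                      # hypothetical instance name
    attributes:
      - key: deployment.environment  # enrichment: tag everything with its environment
        value: lab
        action: upsert
  attributes/scrub:                  # hypothetical instance name
    actions:
      - key: customer.email          # hypothetical PII attribute
        action: delete
  probabilistic_sampler:
    sampling_percentage: 25          # keep roughly a quarter of traces

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource/env, attributes/scrub, probabilistic_sampler, batch]
      exporters: [otlphttp/lgtm]
```

None of this requires a code change or redeploy of the services themselves.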
Bonus Challenge
Add a debug exporter to the traces pipeline to print full span details to the Collector’s log output:
exporters:
  otlphttp/lgtm:
    endpoint: http://lgtm:4318
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/lgtm, debug]
Run docker compose logs -f collector while submitting claims. You’ll see each span printed to the console: span name, trace ID, attributes, and start and end timestamps. Useful for debugging Collector config changes without opening Grafana.
Then add a filter processor to drop health check spans:
processors:
  filter/drop_health:
    traces:
      span:
        - 'attributes["http.route"] == "/health"'
Add filter/drop_health between memory_limiter and batch in the traces pipeline. Restart the Collector and verify that GET /health spans no longer appear in Tempo.
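Put together, the bonus wiring might look like the sketch below. It assumes the Collector distribution includes the filter processor, and it adds error_mode: ignore so that a condition that fails to evaluate is logged rather than aborting processing. The memory_limiter, batch, and exporter definitions from earlier steps are omitted.

```yaml
processors:
  filter/drop_health:
    error_mode: ignore
    traces:
      span:
        - 'attributes["http.route"] == "/health"'   # spans matching this are dropped

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, filter/drop_health, batch]
      exporters: [otlphttp/lgtm, debug]
```

Filtering before batch keeps dropped spans out of the batch buffer entirely.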