
I spent six years thinking I was bad at mobile observability. I wasn’t.

It took joining Embrace to figure out the problem wasn’t me. The tools I was using just weren’t making it easy.

I know how that sounds coming from someone who works at an observability company. But I spent six years as an iOS engineer before this, and I remember exactly what it felt like — chasing bugs I couldn’t reproduce, staring at dashboards that said everything was fine while users were clearly hitting something, and assuming I was missing what everyone else had figured out.

I hadn’t missed anything. The tooling had a structural problem. Nobody was really talking about it directly.

Mobile breaks the assumptions backend observability is built on

When people say mobile observability is hard, they usually mean complicated. I mean something more specific: mobile breaks the foundational assumptions backend observability tools were built on. Not harder — structurally different.

Three things in particular.

Network reliability. Backend systems assume a stable connection to ship telemetry. Mobile doesn’t have that. Devices lose connectivity, switch networks, go into airplane mode mid-session. Data queues for hours before it transmits — or gets dropped when the OS cleans up background processes. Default OTel exporters assume the batch span processor can flush on a timer to a reachable collector. On mobile, neither the timer nor the collector is reliably there.
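The pattern mobile SDKs tend to converge on is persist-first: write telemetry to durable storage before any network attempt, and treat export as an opportunistic, retryable step. Here’s a minimal sketch of that idea in Python — illustrative only; `DiskBackedExporter` is a made-up name, not Embrace’s or OpenTelemetry’s API:

```python
import json
import os
import tempfile

class DiskBackedExporter:
    """Persist-first telemetry buffer: spans hit disk before the network.

    Illustrative sketch only. A production SDK would also cap the spool
    size and age out stale batches so the buffer can't grow unbounded.
    """

    def __init__(self, queue_dir, send):
        self.queue_dir = queue_dir   # durable spool directory
        self.send = send             # callable(span) -> True on success
        os.makedirs(queue_dir, exist_ok=True)

    def record(self, span):
        # Write each span to disk *before* any network attempt, so an
        # OS-killed process or a dropped connection loses nothing.
        fd, _path = tempfile.mkstemp(dir=self.queue_dir, suffix=".json")
        with os.fdopen(fd, "w") as f:
            json.dump(span, f)

    def flush(self):
        # Opportunistic export: call when connectivity returns or the app
        # foregrounds. A file is deleted only after a confirmed send.
        sent = 0
        for name in sorted(os.listdir(self.queue_dir)):
            path = os.path.join(self.queue_dir, name)
            with open(path) as f:
                span = json.load(f)
            if not self.send(span):  # still offline: keep it on disk
                break
            os.remove(path)
            sent += 1
        return sent
```

The key property is that a failed flush is a no-op rather than data loss: spans queued in airplane mode simply wait on disk for the next attempt.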

Device fragmentation. “P50 latency is 200 milliseconds” sounds useful until you ask: for which devices? What’s fast on a flagship is slow on a three-year-old budget Android — and a real portion of your users are on those devices. Aggregated metrics flatten all of that into something that doesn’t represent the users actually struggling.
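To make that concrete, here’s a toy illustration — the launch times are invented — of how a blended P50 can look healthy while an entire device tier struggles:

```python
from statistics import median

# Invented launch times in milliseconds, illustrative only.
# Flagships dominate the traffic, so they dominate the blended number.
samples = {
    "flagship": [180, 185, 190, 195, 200, 205, 210, 215],
    "budget":   [700, 750],  # older low-end devices: fewer, but real, users
}

blended = median([t for times in samples.values() for t in times])
per_tier = {tier: median(times) for tier, times in samples.items()}

print(f"blended P50: {blended} ms")   # ~200 ms: "everything is fine"
for tier, p50 in per_tier.items():
    print(f"{tier:>9} P50: {p50} ms")
```

The blended P50 sits right on top of the flagship numbers, so the budget-device users waiting 700+ ms are invisible until you segment by device tier.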

Resource constraints. Mobile operating systems kill background processes in ways backend infrastructure doesn’t. When that happens, you lose whatever telemetry hadn’t shipped yet. Instrument too aggressively and you hurt performance on the exact devices where you most need visibility.

None of this is an edge case. It’s the default environment for a significant portion of your users. And most of the tools engineers reach for were designed for environments where none of these constraints exist.

The vanity metric trap

Most mobile teams track the same things: crash rate, P50 launch time, network latency percentiles. Real numbers. They just don’t answer the question that actually matters — how did this technical change affect real users?

Your app startup improved from 3.2 seconds to 2.9 seconds. Sounds good. What was the user impact? Without user-level context, you’re optimizing a metric that may not move any needle the business cares about.

The version of this I lived through most often: a backend dashboard showing all green, support tickets climbing, and no clear link between the two. No crash. No error log. Users hitting something broken, giving up, not coming back. That’s not a rare failure mode. That’s what sampled, backend-first observability looks like from the user’s side.

What changes when you can see the full session

The difference hit me the first time I pulled up a session timeline for a user whose app had hit cascading API errors from a cold start.

Every network call was there in sequence, color-coded by what happened. You could see where the retry logic started stacking up. You could see the exact moment the app got overwhelmed and crashed — and because dSYM uploads are automated, you could jump straight to the line of code responsible. Everything that led up to it, in order, in the context of what that user was actually trying to do.

That’s not a crash report. It’s a full reconstruction. A crash report tells you something broke. A session timeline tells you why, for whom, and under what conditions. If you want to see this kind of end-to-end visibility in practice, we walked through it in depth in our recent OpenTelemetry session.

The other thing that changes: where the data lives. Because Embrace is built on OpenTelemetry, mobile telemetry forwards into whatever backend your team already uses — Grafana, Chronosphere, Honeycomb. Mobile engineers and backend SREs end up looking at the same dashboards. A crash that’s actually rooted in a backend latency issue becomes visible to both teams at once, instead of after a multi-day investigation where everyone’s working from different data.
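That forwarding step is standard OpenTelemetry plumbing. As a rough sketch — endpoint URLs are placeholders, and each real backend needs its own auth headers — an OpenTelemetry Collector can fan one OTLP stream out to multiple backends at once:

```yaml
# Sketch of a Collector config that receives OTLP telemetry and
# exports it to two backends simultaneously. Endpoints are placeholders.
receivers:
  otlp:
    protocols:
      http: {}

processors:
  batch: {}

exporters:
  otlphttp/backend_a:
    endpoint: https://otlp.backend-a.example.com
  otlphttp/backend_b:
    endpoint: https://otlp.backend-b.example.com

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/backend_a, otlphttp/backend_b]
```

With this shape, mobile spans land in the same backends the rest of your telemetry already flows into — which is what puts mobile engineers and SREs in front of the same dashboards.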

Why this took so long

Mobile has been the blind spot in observability for a long time. Not because the engineers working on it weren’t good. Because the ecosystem treated mobile as a subset of backend monitoring, when it’s actually a different problem.

The release cycle made it worse. Backend: deploy, get feedback. Mobile: build, test, app store review, gradual rollout, then finally start collecting meaningful data. A four- to six-week feedback loop. By the time you have answers about a performance change, you’ve already shipped three more.

A lot of senior mobile engineers I’ve talked to have some version of the story I had. Years of feeling like they were missing something obvious, working around tools that weren’t built for their environment, quietly absorbing blame for problems that were actually tooling problems.

That’s what’s changing. Not just better instrumentation. A different starting assumption about what mobile engineers should be able to see.

Last thing

The other shift is how you get at the data. Instead of learning a query language or clicking through filters, you can ask in plain English — “show me crashes in Illinois” — and go straight to the answer. We’ve also shipped an MCP server so you can query sessions, crashes, spans, network calls, and logs from whatever AI tool you already work in. Less time in the UI, more time fixing things.

If you want to see it hands-on — including where we’re taking it next — watch our webinar on AI-powered observability with Embrace: natural language querying, the MCP server, automated workflows, and a preview of what’s coming.
