Why Embrace is leading the way in OpenTelemetry for mobile

Embrace is modernizing full-stack observability by bringing open source, portable, and extensible mobile frameworks to OTel for site reliability and developer teams.

OpenTelemetry is built on the premise of transparent, portable, and extensible data collection. While these practices are changing the way development teams approach server-side infrastructure and application monitoring, the same principles have not yet been realized for the client-side layer, often labeled ‘RUM’ (real user monitoring) in legacy terminology. Embrace is leading the way to complete the picture for a modern observability practice by bringing OpenTelemetry to engineers building user experiences.

The drive for modern observability

We’ve heard loud and clear that modern, forward-thinking organizations that really value their users don’t want their observability practices to stop at the edge of the datacenter. They want them to extend all the way to the user, covering both how users behave and how technical issues affect them.

User outcomes connect directly to business KPIs, while SLAs really don’t. Businesses want their engineering teams to work on what matters, and that means measuring work based on how it affects the company. Engineering teams thus need to connect their observability directly to user outcomes so that they can understand the true business impact of technical failures.

For backend teams working in a silo of infrastructure and service health metrics, this requires better collaboration and data sharing with frontend teams. Engineering teams need to speak the same language and access the same datasets to draw insights directly from user experiences.

We saw an opportunity to help solve this challenge by collaborating with the OpenTelemetry community to drive the future of open standards for observability, specifically within our expertise in mobile.

Recently, we announced that our Android and iOS SDKs are now built on OpenTelemetry. Anyone can use Embrace’s iOS and Android SDKs to send logs and spans to any OTLP-capable tracing and logging backend, with metrics support soon to follow. Other mobile SDKs will follow as well, covering many platforms that the community is clamoring to support but that are not included in OpenTelemetry at all.

I’m going to briefly address the philosophy behind these changes and what the longer term vision to modernize observability practices looks like.

There’s a glaring problem that’s separating frontend and backend teams

One of the biggest challenges site reliability and developer teams consistently face is an inability to integrate insights from their user-facing web and mobile apps into their observability practice. Ideally, frontend teams collect data about the health of end-user experiences, and backend teams collect data about the health of infrastructure and services.

Today it’s common for these to be entirely separate tools that don’t share a common set of telemetry, don’t interoperate, and thus prevent engineering teams from speaking the same language.

Companies want to work on what matters, and that requires understanding where to invest engineering resources to deliver the best user experiences. Engineering teams are increasingly being judged on business KPIs, but they lack the shared visibility needed to collaborate effectively across frontend and backend.

Broadly speaking, we’ve heard some key needs from companies building best-in-class mobile apps and user experiences:

  • Prioritizing issues and outages by understanding the actual user impact, which is only possible by connecting backend observability data directly to end-user experiences.
  • Providing visibility into complete user experiences with deep context that highlights root causes among combinations of behavioral and technical factors.
  • Making decisions based on business impact by connecting backend and frontend issues directly to business KPIs.
  • Resolving issues with streamlined workflows thanks to connected data and collaboration among teams.

And yet, existing solutions leave most teams and companies wanting. They consist of limited crash and error reporting tools, or legacy real user monitoring tools. They might give some number of sampled stack traces with a set of breadcrumbs, or some highly sampled and extrapolated dataset that tries to answer a few key performance questions. What they don’t do is allow frontend teams to build observability into their everyday engineering practices with the same level of rigor as their peers who build services or manage infrastructure.

Enter open standards for instrumentation and language across best-in-class tools

There’s no shortage of options for observability tooling. However, traditional vendors favor proprietary, closed codebases, resulting in a lack of common standards. They may support OTel, but do they really adhere to its principles of being open, portable, and extensible? No. Everyone models and collects telemetry differently, which forces teams to invest in proprietary instrumentation across the stack. Changing vendors thus incurs significant engineering cost.

Open standards allow company-wide investment in instrumentation practices, so teams don’t have to re-instrument when changing vendors. Site reliability and developer teams can use the same language and semantics to create telemetry that’s accessible to everyone across the entire engineering org.
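To make the "instrument once, switch backends freely" idea concrete, here is a minimal sketch in plain Python. It deliberately does not use the OpenTelemetry SDK or Embrace's SDKs: `SpanRecord`, `Tracer`, `VendorAExporter`, `VendorBExporter`, and `instrumented_checkout` are all hypothetical names invented for illustration. The only point is that the instrumented application code never changes when the destination does.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class SpanRecord:
    """A simplified, hypothetical stand-in for a unit of telemetry."""
    name: str
    attributes: dict[str, str] = field(default_factory=dict)


class Exporter(Protocol):
    """Anything that can receive a span; the shared contract."""
    def export(self, span: SpanRecord) -> None: ...


class VendorAExporter:
    """Hypothetical backend A; here it just stores what it receives."""
    def __init__(self) -> None:
        self.received: list[SpanRecord] = []

    def export(self, span: SpanRecord) -> None:
        self.received.append(span)


class VendorBExporter:
    """Hypothetical backend B with the same contract."""
    def __init__(self) -> None:
        self.received: list[SpanRecord] = []

    def export(self, span: SpanRecord) -> None:
        self.received.append(span)


class Tracer:
    """Application code talks only to the tracer, never to a vendor."""
    def __init__(self, exporter: Exporter) -> None:
        self._exporter = exporter

    def record(self, name: str, **attributes: str) -> None:
        self._exporter.export(SpanRecord(name, dict(attributes)))


def instrumented_checkout(tracer: Tracer) -> None:
    # The instrumentation is written once, against the shared contract.
    tracer.record("checkout", screen="cart", app_version="1.2.0")


# Changing vendors means swapping the exporter, not re-instrumenting.
vendor_a = VendorAExporter()
vendor_b = VendorBExporter()
instrumented_checkout(Tracer(vendor_a))
instrumented_checkout(Tracer(vendor_b))
```

Both hypothetical backends end up with identical records, which is the property that a shared standard is meant to guarantee across real tools.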

In addition, with the rise of open standards solutions, teams are free to build their own tech stacks across any combination of supporting software. With OpenTelemetry becoming the telemetry standard for observability, software vendors now need to deliver real innovation, based on specialization, to avoid becoming commodity data ingest and visualization tools. After all, the push for open standards is in large part a reaction to the immense toil and lock-in created by existing vendors. Loyalty now requires building the best product, not just a platform that’s difficult to escape from.

The success of modern open source companies is a testament to sustainable business models built around providing services for non-proprietary software. By collaborating with the worldwide software community, open source software vendors benefit not just from a larger base of core code contributors, but also a healthy ecosystem of supporting products like plugins, extensions, and connectors to better interoperate with other software.

Finally, let’s not ignore the fact that our customers greatly outnumber us as vendors, and endlessly find new ways to innovate with their own SDKs, third-party libraries, and development patterns. Observability vendors cannot keep up with this endless complexity, and when they tell you that instrumentation for your custom library is on the roadmap and they’ll be delivering it soon, they’re probably lying to you. Without common standards, this often means teams either go without visibility into key functionality or are forced to build their own custom tooling.

Telemetry from user-facing apps doesn’t make sense when it looks like APM because your customers are not computers

Open standards alone are unfortunately not enough to ensure telemetry is actionable for site reliability and developer teams. You need investment from experts to ensure data modeling and collection truly capture the key signals and context needed to understand user experiences. That way, when backend teams investigate an outage, they can connect it to not just a raw number of affected users, but also the usage patterns and, ultimately, the negative business impact it’s causing.

Connecting frontend and backend observability is crucial because otherwise, site reliability and developer teams are operating from guesswork:

  • Site reliability teams are forced to assume user and business impact from service and infrastructure metrics.
  • Developer teams are tasked with magically solving issues that are out of their control.

Site reliability and developer teams need visibility into endless combinations of variables across user behaviors, network conditions, heterogeneous devices, operating systems, app versions, and more. While backend sessions take milliseconds and are mostly chains of individual service calls, mobile sessions can span minutes, hours, or even days, and the “state of the system” is often totally unique to that individual user and how they used your app.
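One way to picture the difference is to model a mobile session as a long-lived parent span that carries device and app context, with shorter child spans for individual activities. The sketch below is an assumption-laden illustration in plain Python, not Embrace's data model or the OpenTelemetry API: the `Span` class, the `slow_children` helper, the attribute keys, and all timings are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """A simplified, hypothetical span: a named interval plus context."""
    name: str
    start_s: float
    end_s: float
    attributes: dict[str, str] = field(default_factory=dict)
    children: list["Span"] = field(default_factory=list)

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def slow_children(session: Span, threshold_s: float) -> list[str]:
    """Names of child activities that ran longer than the threshold."""
    return [c.name for c in session.children if c.duration_s > threshold_s]


# A 90-minute session; the attribute keys and values are illustrative only.
session = Span(
    name="session",
    start_s=0.0,
    end_s=5400.0,
    attributes={
        "device.model": "Pixel 7",
        "os.version": "14",
        "app.version": "3.2.1",
        "network.type": "cellular",
    },
)
session.children.append(Span("app-startup", 0.0, 1.8))
session.children.append(Span("load-feed", 2.0, 2.4))
session.children.append(Span("checkout", 5300.0, 5306.5))

print(session.duration_s)                       # 5400.0
print(slow_children(session, threshold_s=1.0))  # ['app-startup', 'checkout']
```

Because every child activity hangs off the same session, a slow checkout near the end can be read alongside the device, network, and app version that were in play, rather than as an isolated metric.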

And finally, simply knowing you had a crash, or an error, or a simple increase in duration for a key activity in your app isn’t enough. Ultimately, you want to know the impact on users and the business, not just the impact on metrics in a vacuum. Did the longer duration of that activity materially decrease app engagement, and is the root cause a service getting incrementally worse? The difference between truly excellent engineering organizations and those checking the boxes is extending their field of vision to the user.

Where Embrace fits in the future of observability

Embrace collects the full technical and behavioral details of every user session, providing developer teams with the necessary context to identify and resolve issues quickly. With our OTel solution, customers can also extend instrumentation to any custom library in their app and leverage Embrace’s platform to contextualize the added instrumentation. This empowers the entire engineering org, from site reliability to developers, to explore insights and expedite issue resolution. Additionally, Embrace’s instrumentation is compatible with any OTel backend. So even if we’re not right for you today, or you don’t find value in what our commercial platform does, we hope you still see the value in the instrumentation we’ve built and how we model telemetry around the behavior of users.

We hope you’ll check out our OpenTelemetry page for more info, including our roadmap to full OTel adoption across all our mobile SDKs. You can also learn how to start using Embrace’s OTel-compliant Android and iOS SDKs today. We look forward to working with the OpenTelemetry community to expand the scope of what’s possible when it comes to user-focused telemetry.

Additional resources:

Embrace OpenTelemetry for mobile

Learn more about leveraging Embrace's open-source SDKs for mobile observability.
