An average person spends around 6 hours and 40 minutes on the Internet daily, using dozens of online products. We share our opinions and emotions via social media, rely on messengers to stay in touch with relatives and friends, shop on eCommerce platforms, make payments through digital wallets, and entertain ourselves with streaming services. Not to mention software that lets us work and earn our living!
Now imagine that you must create and remember a separate username-password pair for each app and website you regularly visit. Feels somewhat overwhelming, yeah? Luckily, today, billions of Internet users have another option — to employ the same credentials across numerous platforms. This became possible due to the Open Authorization or OAuth technology that paved the way for building trust between online services.
The article explains how OAuth works and what to consider if you want to implement the authorization process in your app to make the lives of your customers much easier.
What is OAuth?
OAuth, or Open Authorization, is the dominant authorization standard that lets us securely log into a new website or app without having to devise and remember another password. Most large vendors (such as Amazon, Twitter, Google, and Microsoft, to name just a few) support OAuth to share a portion of customer data without disclosing existing credentials.
For example, you want to register on Booking.com. Instead of manually creating a new account, you can sign in via a platform you already use — Facebook, Google, or Apple. In essence, you authorize Booking.com to fetch your data stored by another service and automatically generate a profile to let you in.
Booking.com uses OAuth 2.0 protocol to enable you to sign in via third-party services. Source: account.booking.com
The framework is maintained by the Internet Engineering Task Force (IETF), an organization that develops and documents standards for Internet technologies. Version 2, or OAuth 2.0, has been in force since 2012. Let’s look at how it works from the inside.
OAuth roles
The OAuth process involves four roles taking an active part in the authorization and defining its flow.
A resource owner is an entity that grants access to certain protected resources — such as personal data, pictures, etc. Typically, it’s an end user, but it can also be a machine.
A resource server keeps and manages protected resources. In other words, it’s an API you want to interact with. It can be a social media platform API that handles access to user profiles or a cloud storage service API controlling file availability.
A client is an application or service that requests access to protected resources. In the case mentioned above, it’s Booking.com. Sometimes, the client also acts as the resource owner. An example is a robotic process automation (RPA) tool that regularly updates a database. In this case, no end-user input is needed.
OAuth defines two types of clients: confidential and public. The former (apps running on web servers) can guarantee the safety of their credentials. The latter have no means of protecting secret data from misuse, which makes them especially vulnerable to data leaks. Public clients include programs running in the browser (single-page apps) and on individual devices (mobile phones, tablets, computers, and so on). The two types rely on different authorization flows.
An authorization server presents the interface where the user confirms or denies the client’s request and grants access to required data. It also validates the app’s credentials and issues access tokens. This entity can be a part of a resource server or a separate module. In large-scale deployments, multiple resource servers share the same authorization server. That’s how dozens of products owned by Google — Google Maps, Google Drive, YouTube, and others — work.
How to get started with OAuth
To become an OAuth client for a given authorization server and resource server, an app must register with the API it’s going to call. This process includes creating a developer account on the website of the resource server and providing basic information such as
- app name,
- logo,
- home page,
- link to the privacy policy page,
- a list of redirect URLs, and
- a short description of the project.
After registration, the app gets credentials in the form of a unique Client ID and, in some cases, Client Secret.
The Client ID, also known as an API, Consumer, or App Key, is not something to be hidden. The service API uses it to recognize the application and build authorization URLs presented to end users.
On the other hand, the Client Secret confirms the app’s identity each time it requests access to a user’s account, so only the client and the targeted API should know it. By their nature, public clients can’t keep credentials confidential, so they don’t receive a Client Secret. Such programs prove their identity by other means, which we’ll discuss later.
In any case, upon registration, app developers from the client’s side can proceed with implementing an OAuth flow.
OAuth 2.0 flow and OAuth grant type
An OAuth flow depends on various factors, such as the resource owner (end user or machine), the client’s type (confidential or public), or the number of resource servers to be accessed. The framework specifies several standard scenarios, each defined by a grant type: the method a client uses to acquire an access token. Different grant types serve different use cases.
Authorization code flow: for confidential web apps
The authorization code is the most common grant type for server-side applications that don’t expose source code and are considered confidential. Here is the flow it triggers.
Common OAuth flow for confidential apps
- The user visits the client app and chooses to log in via a certain resource server.
- The client redirects the user to the authorization server.
- The authorization server provides an interface (a consent page) to approve the authorization request.
- The authorization server redirects the user back to the client’s app via a pre-registered redirect URL with the authorization code.
- The client extracts the code from the redirect URL and forwards it alongside the app’s Client ID and Client Secret to the authorization server.
- The authorization server checks credentials and responds with the access token.
- The client presents the access token to the resource server.
- The resource server verifies the access token itself or with the authorization server and shares the required protected resources with the client.
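To make the steps above more tangible, here is a minimal Python sketch of the client’s side of the flow. It only builds the redirect URL, reads the callback, and prepares the token request; nothing is sent over the network. The endpoint, credentials, and function names are hypothetical placeholders, and the state parameter is an extra anti-forgery value that the OAuth spec recommends but the list above doesn’t mention. A real authorization server may require more parameters than shown.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values for illustration only.
AUTHORIZE_URL = "https://auth.example.com/authorize"
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"
REDIRECT_URI = "https://app.example.com/callback"


def new_state() -> str:
    """Random value the client remembers and expects to get back unchanged."""
    return secrets.token_urlsafe(16)


def build_authorization_url(state: str, scope: str = "profile") -> str:
    """URL the client redirects the user to (second step of the flow)."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def extract_code(callback_url: str, expected_state: str) -> str:
    """Pull the authorization code out of the redirect URL."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [""])[0] != expected_state:
        raise ValueError("state mismatch: possible forged callback")
    if "code" not in query:
        raise ValueError("authorization was denied or failed")
    return query["code"][0]


def build_token_request(code: str) -> dict:
    """Form fields the client would POST to the token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }
```

In a real app, the token request goes out server to server over HTTPS, so the Client Secret never reaches the user’s browser.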
The code can be used only once and expires shortly after issuance. The OAuth specs recommend a lifespan of no more than ten minutes. In practice, many authorization servers set an expiration time of 30 to 60 seconds.
But even with such time limits, the method doesn’t suit public clients since malicious software can easily acquire the Client Secret and intercept the authorization code. To prevent these risks, OAuth introduced a special extension to the regular flow.
PKCE or Proof Key for Code Exchange: for mobile and other public apps
Proof Key for Code Exchange (PKCE, an acronym pronounced pixie) was designed to enable the authorization code flow for mobile apps. But it’s also applicable for single-page apps and other public clients. Confidential apps may also use PKCE to enhance security.
In essence, PKCE extends the regular flow with additional steps.
- A legitimate client creates a new secret (proof key) each time it initiates the OAuth process and converts the key into a unique identifier (hash).
- The client sends the hash along with info on the transformation method to the authorization server, which saves them before responding with the authorization code.
- The client makes a token request. But instead of the pre-built Client Secret, it sends the proof key.
- The authorization server applies the stored transformation method to the key and compares the result with the stored hash. If they match, it responds with the access token.
In this scenario, if a malicious app steals the authorization code, it won’t be able to exchange it for a token since it doesn’t have the proof key. PKCE ensures the app that eventually receives a token is the same one that started the authorization flow.
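The proof key and its hash are easy to sketch in Python. In the PKCE specification (RFC 7636), the proof key is called a code verifier, the hash is called a code challenge, and the transformation method used here is S256: SHA-256 followed by URL-safe Base64 without padding. The function names are our own, and a real authorization server would store the challenge next to the authorization code it issued.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Client side: a fresh random proof key for every authorization attempt."""
    # 32 random bytes become a 43-character URL-safe string.
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """Client side: the S256 hash sent with the authorization request."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_accepts(stored_challenge: str, presented_verifier: str) -> bool:
    """Server side: re-hash the presented key and compare with what was stored."""
    recomputed = make_code_challenge(presented_verifier)
    return secrets.compare_digest(recomputed, stored_challenge)
```

The client keeps the verifier in memory (or session storage) and sends only the challenge up front, so an app that merely intercepts the authorization code has nothing valid to present at the token step.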
Client credentials flow: for machine-to-machine applications
The client credentials grant type and its corresponding flow are the simplest since they exclude the end user and serve machine-to-machine interactions. The client sends its credentials to the authorization server and receives a token that allows the app to access resources on its own behalf, without a customer’s consent.
This scenario applies to situations when a program needs to use its own credentials for authentication with another web server. Think of an online travel agency website calling a hotel booking API to update information on available rooms. Another example is a batch data pipeline: the client credentials flow ensures that only authorized apps trigger workflows, transmit data, or get access to information processed by the pipeline.
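As a rough illustration, this is what the single request of the client credentials flow might look like when prepared in Python. The sketch assumes the authorization server accepts the credentials in an HTTP Basic header, which is a common choice but not the only one; the function name and the sample values are made up. It returns the headers and body without sending them.

```python
import base64
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode


def build_client_credentials_request(
    client_id: str, client_secret: str, scope: Optional[str] = None
) -> Tuple[Dict[str, str], str]:
    """Return (headers, body) for a POST to the token endpoint."""
    # Simplification: IDs or secrets with special characters may need
    # extra encoding before being placed in the Basic header.
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(raw).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    fields = {"grant_type": "client_credentials"}
    if scope:
        fields["scope"] = scope
    return headers, urlencode(fields)
```

Note that there is no redirect, consent page, or authorization code here: the app’s own credentials are the only proof the server needs.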
Refresh token: for repeated access to resources
An access token typically has a limited lifespan. This approach helps safeguard applications but comes at the cost of user experience. Once the token expires, the visitor must log in again to restore the availability of protected resources.
Refresh tokens were introduced to address this problem. They are common if the app needs to reach protected resources repeatedly, making it critical to keep a user logged in.
Refresh tokens have a longer validity time and are issued alongside access tokens. Instead of asking a user to re-enter the system, the client app sends a refresh token to the authorization server in exchange for a new access token and another refresh token.
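On the client side, this logic often boils down to a small helper that checks the expiry time before each API call. The sketch below is one possible shape for it, not a standard API: TokenSet, TokenManager, and the refresh_fn callback are hypothetical names, and refresh_fn stands in for the real request that sends the refresh token to the authorization server.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class TokenSet:
    access_token: str
    refresh_token: str
    expires_at: float  # absolute expiry time, seconds since the epoch


class TokenManager:
    """Hands out a valid access token, refreshing it shortly before expiry."""

    def __init__(
        self,
        tokens: TokenSet,
        refresh_fn: Callable[[str], TokenSet],
        clock: Callable[[], float] = time.time,
        leeway: float = 30.0,
    ) -> None:
        self._tokens = tokens
        self._refresh_fn = refresh_fn  # exchanges a refresh token for new tokens
        self._clock = clock
        self._leeway = leeway  # refresh this many seconds ahead of expiry

    def get_access_token(self) -> str:
        if self._clock() >= self._tokens.expires_at - self._leeway:
            # The whole set is replaced, so a rotated refresh token is kept too.
            self._tokens = self._refresh_fn(self._tokens.refresh_token)
        return self._tokens.access_token
```

The user never notices the exchange: as long as the refresh token stays valid, the app keeps working without another login prompt.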
There are other grant types, but all of them pursue the same goal — to retrieve OAuth tokens.
OAuth tokens: formats, types, and how they work
An OAuth token is a string that permits a client application to access specific data on a user’s behalf. Its form is dictated by the authorization server, security requirements, app architecture, and other factors. Basically, tokens vary by
- purpose — access and ID;
- format — opaque and JWT; and
- security level — bearer and sender-constrained.
For implementing authorization with OAuth in an app, it’s critical to understand the difference between various tokens.
Access token vs ID token
An access token is a token of any type and format that the OAuth client receives to call the targeted API and get protected data. It doesn’t carry any user information intended for the client app; only the authorization or resource server is meant to interpret the access token.
An access token is often confused with an ID token, though they carry different content and pursue different goals. While the former authorizes the app to manipulate certain data on a user’s behalf, the latter serves to authenticate users or assert their identity. ID tokens store personal data (name, email, etc.) and don’t apply to calling APIs. Only the client app can read them and use their information to build an individual profile and personalize the user experience. An ID token is a central concept of the OpenID Connect standard we’ll talk about later.
In the regular flow with OpenID Connect in place, the authorization server provides the client with an ID token, an access token, and, sometimes, a refresh token in the same response in exchange for the authorization code and credentials.
JWT token vs opaque token
OAuth has no unified format for access tokens, so different authorization servers can generate them in various ways. There are two widely used options — opaque string and JWT. Unlike access tokens, ID tokens are JWT only, while refresh tokens are usually opaque strings.
An opaque token is a unique random sequence of characters that doesn’t contain any meaningful data about the user or the client. Instead, it acts as a reference to identification information stored by the issuer. The resource server calls the authorization server to interpret and validate the token. This format hides sensitive data and minimizes the request size, preventing server overload.
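One standardized way for a resource server to ask about an opaque token is token introspection (RFC 7662). The article doesn’t name a specific mechanism, so take this as one option rather than the required one. The Python sketch below only prepares the request body and interprets the JSON reply; the function names are ours, while the token, active, and scope fields come from the introspection spec.

```python
import json
from urllib.parse import urlencode


def build_introspection_body(token: str) -> str:
    """Form body the resource server would POST to the introspection endpoint."""
    return urlencode({"token": token})


def token_allows(introspection_json: str, required_scope: str) -> bool:
    """Decide whether the authorization server's reply permits the request."""
    data = json.loads(introspection_json)
    if not data.get("active", False):
        return False  # expired, revoked, or unknown token
    granted = set(data.get("scope", "").split())
    return required_scope in granted
```

The token itself stays meaningless to anyone who intercepts it; all the details live on the authorization server.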
A JWT, or JSON web token, holds enough data for the resource server to make permission decisions without calling the authorization server. It simplifies the app architecture and improves the speed of the OAuth process. Besides that, JWTs are flexible, extendable, and portable, meaning that they can be used across different platforms and systems.
On the downside, anybody who intercepts a JWT can read its content, so you need to take extra security measures, such as encryption. Even then, using JWTs for sensitive data introduces additional risks and complexity. Another problem is the token size. It can be really large if too much data is included, which leads to increased load times and bandwidth usage.
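A short Python experiment shows both sides of the JWT trade-off: the signature can be checked locally with a key, yet the claims can be decoded by anyone, with no key at all. This sketch uses the HS256 (shared-secret) algorithm only because it needs nothing beyond the standard library; production systems frequently rely on asymmetric signatures and a maintained JWT library instead. All names and sample claims are illustrative.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_demo_jwt(claims: dict, key: bytes) -> str:
    """Build a signed token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url_encode(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url_encode(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def read_claims_without_key(token: str) -> dict:
    """Anyone holding the token can do this: the payload is encoded, not encrypted."""
    return json.loads(b64url_decode(token.split(".")[1]))


def signature_is_valid(token: str, key: bytes) -> bool:
    """What a resource server does locally instead of calling the issuer."""
    signing_input, _, signature_segment = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url_decode(signature_segment), expected)
```

The signature protects the token from tampering, not from reading, which is why sensitive claims are better left out of the payload.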
Bearer token vs sender-constrained token
Depending on security requirements, an authorization server issues either a bearer or a sender-constrained token. Both types can use opaque or JWT formats.
A bearer token is not bound to a particular client. Whoever possesses (bears) it can access the required data, hence the name. This type predominates in OAuth and is most common in low-risk scenarios. It’s simple to implement, but vulnerable to breaches and leaks.
A sender-constrained token has an additional mechanism, such as a cryptographic key, restricting its use to the client that originally requested it. So, if an unauthorized party intercepts the token, it still won’t be able to access data on the resource server. Sender constraints are applied to mitigate risks in businesses with high-security standards — for example, online banking or e-health.
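One way to implement such a constraint is to bind the token to the client’s TLS certificate, as described in RFC 8705: the token carries a cnf claim with the certificate’s SHA-256 thumbprint, and the resource server compares it with the certificate actually presented on the connection. The Python sketch below shows only that comparison. It assumes the token’s claims have already been verified and decoded, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def cert_thumbprint(cert_der: bytes) -> str:
    """URL-safe Base64 (no padding) of the SHA-256 hash of a DER-encoded certificate."""
    digest = hashlib.sha256(cert_der).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def sender_matches(token_claims: dict, presented_cert_der: bytes) -> bool:
    """True only if the token is bound to the certificate used on this connection."""
    bound = token_claims.get("cnf", {}).get("x5t#S256")
    if bound is None:
        # No binding at all means a plain bearer token; this sketch rejects it.
        return False
    return hmac.compare_digest(bound, cert_thumbprint(presented_cert_der))
```

A stolen token is useless here unless the thief also holds the private key behind the original certificate.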
No matter the type or format, all access tokens come with a predefined scope which limits access granted to the client app.
OAuth scopes
OAuth scopes define what data is available to the client. When configuring the OAuth flow, developers choose one or several scopes provided by the resource server so that the app can perform its tasks. An end user sees the requested scopes on the authorization form (consent page).
In case of approval, the authorization server issues a token that reflects the degree of granted access. Usually, it coincides with what the app asks for. Yet OAuth enables users to modify scopes during initial consent or even after it, so the client may finally receive fewer permissions than indicated in the request.
The general recommendation here is to ask for the most restricted access possible and avoid permissions your product doesn’t require. There are reasons for that. First, people more readily agree to share limited data. Second, if a token is compromised, it will minimize potential damage.
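Because the granted scopes may differ from the requested ones, it’s worth comparing the two once the token arrives. OAuth passes scopes as a single space-separated string, so the check is a simple set operation. The Python helper names and the scope values below are invented for illustration.

```python
from typing import Iterable, Set


def parse_scopes(scope_string: str) -> Set[str]:
    """Split a space-separated OAuth scope string into a set."""
    return set(scope_string.split())


def missing_scopes(required: Iterable[str], granted_scope_string: str) -> Set[str]:
    """Scopes the app needs but the user or server did not grant."""
    return set(required) - parse_scopes(granted_scope_string)


# Ask only for what the feature needs (hypothetical scope names).
REQUESTED = "profile.read calendar.read"
```

If the result of missing_scopes is not empty, the app can disable the affected feature or explain to the user why the extra permission is needed, instead of failing later on an API call.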
Below are examples of scopes granted by popular services.
Google OAuth scopes
Google APIs employ the OAuth protocol for authorization and support all regular OAuth flows. Google classifies all available scopes into three large groups.
Non-sensitive scopes cover limited information relevant to the current task. To acquire them, an app needs only a basic verification. This means you must confirm that you give correct information about your product.
Sensitive scopes give access to personal data and can be requested only by approved apps for approved use cases. For example, the list of approved use cases for Gmail API includes applications that:
- allow customers to compose, send, read, and process email via a user interface;
- automatically back up email;
- enhance the email experience; and
- employ information from emails for reporting and monitoring that benefit users.
At the same time, Gmail doesn’t grant access to its API scopes to software that exports email on a manual basis or stores data other than email messages in Gmail.
Restricted scopes cover highly sensitive data and extensive user data. To request them, an app must be approved the same way as for sensitive scopes and, besides that, undergo an annual security assessment. The assessment validates the client’s ability to protect data and delete user information upon request.
GitHub OAuth scopes
GitHub, a platform where developers can store, manage, and share their code, provides multiple granular scopes to select from when creating an OAuth client.
By default, any app has read-only access to public information. Other scopes vary by the actions a client can perform (read, write, ping, delete, create, etc.) and the data it can access (code, statuses, security events, etc.). The complete list of scopes is available in GitHub’s documentation.
Microsoft identity platform scopes
Microsoft identity platform enables your app to have users sign in through their Microsoft accounts. It specifies
- the openid scope to acquire ID tokens for user authentication;
- the email scope that gives access to the primary email address associated with the user account;
- the profile scope to get the user’s name and surname, preferred username, and other information about the user;
- the offline_access scope for receiving refresh tokens; and
- the .default scope to ask for all needed permissions listed in the app registration.
It’s worth noting that each Microsoft service can have its own list of scopes, dividing data and functionality into smaller parts.
Other security protocols and when to use them
The OAuth 2.0 framework dramatically simplifies the authorization of apps when they need to manipulate data on behalf of a user. However, it doesn’t provide authentication, the process of verifying a user’s identity. Below, we’ll compare the two processes and see what other standards there are to extend or replace OAuth.
How three open standards differ from one another
Authorization vs authentication
Let’s again describe the authorization process in the simplest words. The client reaches out to the authorization server with a request like “I want to access certain data owned by Miranda. Ask Miranda whether she agrees to this.” The authorization server interacts with Miranda. If she says yes, it comes back with the access token.
Suppose the client feels concerned about the user and asks the server, “How do I know that it was Miranda, not some other person, who actually authorized me?” The OAuth standard can’t answer this question clearly since it doesn’t describe how authentication should be done. It focuses on protecting end users’ resources. But what about protecting the client itself? That’s where OpenID Connect with ID tokens comes onto the scene.
OpenID vs OAuth
OpenID Connect (OIDC) is an authentication protocol built on top of the OAuth framework and meant to work alongside it. In the OIDC scenario, a website or mobile app asks a trusted platform called an OpenID provider or identity provider (IdP) to confirm that users are who they claim to be. IdPs apply their own authentication methods and issue JSON-based ID tokens acting as a customer’s passport. Common OIDC providers are
- social media platforms (like Facebook, Twitter, or LinkedIn); and
- large tech companies (like Google, Microsoft, or Apple).
Together, OAuth and OIDC have become the gold standard for businesses that want to make the login process convenient. In particular, OIDC powers single sign-on (SSO), a mechanism that allows users to automatically log in to multiple connected applications with one set of credentials. For example, Google asks you to enter your username and password only once to access a range of Google products.
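Before trusting an ID token, the client is expected to check a few of its claims. The Python sketch below covers three basic checks from the OpenID Connect specification: the issuer (iss), the audience (aud), and the expiry time (exp). It assumes the token’s signature has already been verified and the payload decoded into a dictionary, and it leaves out other checks, such as the nonce. The function name and sample values are illustrative.

```python
import time
from typing import Optional


def validate_id_token_claims(
    claims: dict, expected_issuer: str, client_id: str, now: Optional[float] = None
) -> None:
    """Raise ValueError if the (already signature-verified) claims can't be trusted."""
    current = time.time() if now is None else now

    # The token must come from the identity provider we sent the user to.
    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")

    # The token must be meant for this app; aud may be a string or a list.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if client_id not in audiences:
        raise ValueError("token was issued for a different client")

    # The token must not be expired.
    if current >= claims.get("exp", 0):
        raise ValueError("token has expired")
```

Only after these checks pass should the app read claims like the user’s name or email and create a session.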
SAML vs OAuth
Security Assertion Markup Language, or SAML, is a separate standard, independent of OAuth, for both authorization and authentication through an identity provider. Designed for large work environments, it particularly suits organizations dealing with sensitive data.
The SAML flow is almost the same as with OAuth/OIDC, except for the fact that it relies on XML, not on JSON-based ID tokens. With SAML in place, employees can have single sign-on accounts to log into the corporate intranet once and then access numerous services throughout the workday. Besides that, SAML empowers network administrators to manage users from a central location.
Which standard is better for you?
Many identity providers — such as Google or Microsoft — support both SAML and OIDC/OAuth for authentication and authorization.
The rule of thumb here is to opt for SAML if you need to run identity management in government, healthcare, and enterprise software where the safety of sensitive information is paramount. This standard focuses on data protection, while OAuth/OpenID lacks in-built encryption and often doesn’t meet the security requirements of companies with thousands of employees.
At the same time, SAML was not designed with modern apps in mind. So, if you call JSON-based REST APIs or want to implement login via a social media platform in a mobile app, the OAuth/OIDC pair will work better.