Ensuring the Security of Your APIs
Learn about API types, types of clients, and techniques for protecting their information in this in-depth look at security practices for APIs.
This document discusses a common security framework for protecting REST APIs against different types of attacks and risks (e.g., the OWASP Top 10 threats, unauthorized access, denial of service attacks, and exposure of confidential data) while also ensuring that the APIs cater to the needs of an omni-channel experience.
Before we jump into security, we need to understand a few basic concepts about client types and API types, explained below:
Client Types
Clients typically are a chunk of code that application developers can add to their development projects to do the basic things an application needs to do in order to interact with the API. Clients are categorized into two types:
- Confidential Clients
- Public Clients
Confidential Clients
Confidential clients are capable of maintaining the confidentiality of their credentials (e.g., client implemented on a secure server with restricted access to the client credentials), or capable of securing client authentication using other means.
A web application is a confidential client running on a web server. Resource owners access the client via an HTML user interface rendered in a user-agent on the device used by the resource owner. The client credentials as well as any access token issued to the client are stored on the web server and are not exposed to or accessible by the resource owner.
Public Clients
Public clients are incapable of maintaining the confidentiality of their credentials (e.g., clients executing on the device used by the resource owner, such as an installed native application or a web browser-based application), and are incapable of secure client authentication via any other means.
- User-agent-based application: A user-agent-based application is a public client in which the client code is downloaded from a web server and executes within a user-agent (e.g., web browser) on the device used by the resource owner. Protocol data and credentials are easily accessible (and often visible) to the resource owner. Since such applications reside within the user-agent, they can make seamless use of the user-agent capabilities when requesting authorization.
- Native application: A native application is a public client installed and executed on the device used by the resource owner. Protocol data and credentials are accessible to the resource owner. It is assumed that any client authentication credentials included in the application can be extracted. On the other hand, dynamically issued credentials such as access tokens or refresh tokens can receive an acceptable level of protection. At a minimum, these credentials are protected from hostile servers with which the application may interact. On some platforms, these credentials might be protected from other applications residing on the same device.
API Types
APIs fall into three categories:
Public APIs
Organizations expose public APIs to make their information available to the public or to third parties that do not have any direct relationship with their business.
Public APIs are further categorized into two categories based on their access and the data they represent:
- Unprotected APIs: No security in place and data exposed through the API is non-sensitive and is read-only, e.g. Weather APIs.
- Protected APIs: Data is public but available only to registered users after passing the security restrictions implemented in the API, e.g. the NY Times movie review APIs available for use in developer apps only after a user has successfully registered him/herself on the NY Times developer website.
Partner APIs
These APIs facilitate communication and integration between an organization and its business partners, e.g. a credit card organization exposing APIs that let Uber accept payment via the loyalty points credited to a user’s credit card.
Internal APIs
Strictly for use within an organization for integration between applications and systems used by the organization, e.g. an organization developing APIs to fill timecards for employees and make the feature available on their company mobile app.
Managing API Security
API security involves securing data end to end: from the request originating at the client, passing through networks, and reaching the server/backend, to the response being prepared and sent by the server/backend, communicated across networks, and finally reaching the client. API security is therefore broadly divided into four categories, listed below and discussed in depth in the subsequent sections:
- Data in Transit/Data in Motion Security
- Securing Data in Motion between client & API gateway
- Securing Data in Motion between API gateway & Backend Services
- Access Control & Security against Denial of Service (DoS) Attacks
- Authentication & Authorization: Reliably identify end user information using OAuth 2.0 or OpenID Connect
- Data Confidentiality & Masking Personally Identifiable Information (PII)
The below diagram depicts the above categories and various ways to secure them end to end:
Data in Transit/Data in Motion Security
For all except public unprotected API implementations, the use of TLS should be a MUST HAVE requirement. Also, the overhead of TLS is negligible on modern hardware, with a minor latency increase that is more than compensated for by safety for the end user. Key considerations are:
- TLS should be implemented at both northbound and southbound endpoints.
- The TLS version in use should be the latest one supported by the client, API gateway, and target backend.
- The certificate key stores and trust stores should be highly protected and encrypted.
- Only authorized users should have access to certificate key stores and trust stores.
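The TLS considerations above can be illustrated from the client side. The following is a minimal sketch using Python's standard `ssl` module; the function name `make_client_tls_context` and the choice of TLS 1.2 as the floor are assumptions made for illustration, not settings prescribed by this article.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the server and refuses old protocol versions."""
    # create_default_context() turns on certificate and hostname verification
    # and loads the system trust store.
    context = ssl.create_default_context()
    # Assumption: TLS 1.2 is the oldest version that the client, gateway,
    # and backend are all allowed to negotiate.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_tls_context()
```

The same context could be used for both the client-to-gateway and the gateway-to-backend connections, so that neither hop silently falls back to an older protocol version.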
Access Control and Security Against Denial of Service (DoS) Attacks
- Network Level Defense: If the API gateway is hosted in the cloud, then the DDoS defense mechanism offered by the cloud provider should be leveraged, e.g. the Apigee Edge managed cloud platform, currently deployed and operated by Apigee (Google) on GCP (Google Cloud Platform) and AWS (Amazon Web Services), leverages the DDoS defenses offered by both cloud hosting providers at the network level.
- Content Delivery Network: CDNs like Akamai, Neustar, and Rackspace can be used to mitigate/minimize DDoS attacks on APIs.
- Bot Detection: Various API management platforms have already come up with services/bots which keep an eye on API traffic, identify any malicious/unwanted requests, and generate alerts/stop malicious requests from reaching the API gateway, e.g. Apigee (Google) offers a Bot Detection service called Apigee Sense. Sense is an intelligent data-driven API security product that detects and protects APIs from malicious or unwanted traffic. Sense provides another layer of protection by automatically identifying suspicious API client behaviors, upon which administrators can apply corrective actions in order to maintain user experience as well as protect backend systems.
- Policy Enforcement: Policies should be enforced on the API proxy that sits between an API client and the customer backend to restrict API access to legitimate users. The following policy enforcements are a MUST HAVE to protect APIs from malicious hackers:
- API Rate Limiting: API rate limits are applied to reduce the floods of API requests that cause denial of service, and also to mitigate potential brute-force attacks or misuse of services. The following rate-limiting mechanisms should be considered for the API proxy:
- API rate limits per application or per API: Every API or application can only access the services for a defined number of requests per rate limit window.
- API rate limits per GET or POST request: The allowed access requests may vary based on GET or POST requests per period.
- Regex Protection: The URI Path, Query Param, Header, Form Param, Variable, XML Payload, or JSON Payload of the incoming request should be evaluated against predefined regular expressions that match suspicious content, such as the SQL keywords DELETE, UPDATE, and EXECUTE. A match on any of these predefined expressions should be treated as a threat and the request should be rejected. For the kinds of patterns to check for, refer to the OWASP Top 10.
- JSON Input Validation: JSON validation on payloads for PUT/POST/DELETE requests should be performed to minimize the risk posed by content-level attacks by specifying limits on various JSON structures, such as maximum depth, maximum number of object entries, maximum string length of a name, maximum number of elements allowed in an array, etc.
- XML Input Validation: XML validation on payloads for PUT/POST/DELETE requests should be performed to detect XML payload attacks based on configured limits and to screen the API against XML threats using the following approaches:
- Validate messages against an XML schema (.xsd)
- Evaluate message content for specific blacklisted keywords or patterns
- Detect corrupt or malformed messages before those messages are parsed
- Request Validation
- Input HTTP Verb Validation: Properly restrict the allowable verbs such that only the allowed verbs work, while all others return a proper response code (for example, a 405 Method Not Allowed).
- Header Validation: Headers like Content-Type, Accept, and Content-Length should be explicitly validated against the functionality the API supports. Mandatory headers, such as Authorization and any API-specific headers, should also be validated.
- Validate incoming content-types: For PUT/POST/DELETE requests, the actual content of the incoming request (e.g. application/xml or application/json) and the Content-Type header value should match. A missing Content-Type header or an unexpected Content-Type header should result in the API rejecting the content with a 415 Unsupported Media Type response.
- Validate Response Types: Do NOT simply copy the Accept header to the Content-Type header of the response. Reject the request (ideally with a 406 Not Acceptable response) if the Accept header does not specifically contain one of the allowable types.
- Handle Unsupported Resources: Properly restrict the allowable resources such that only the exposed resources work, while all other, unimplemented resources return a proper response code (e.g., 404 Not Found for an unknown resource).
- Access Control: Policies can be configured to allow requests from specific IPs, domains, or regions. Requests which do not pass these criteria are rejected by the gateway.
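Two of the policies in the list above, rate limiting per application and JSON structural limits, can be sketched in a few lines. This is a simplified illustration rather than how any particular API gateway implements them: the class and function names, the fixed-window strategy, and the default limits are all assumptions. Note that the JSON check here runs after parsing, whereas a gateway would typically enforce such limits while reading the payload.

```python
import json
import time
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client key in each fixed time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        # (client key, window index) -> requests seen. Old windows are never
        # pruned in this sketch; a real gateway would need to expire them.
        self.counts = defaultdict(int)

    def allow(self, client_key):
        window_index = int(self.clock() // self.window_seconds)
        bucket = (client_key, window_index)
        if self.counts[bucket] >= self.limit:
            return False  # the caller would typically answer 429 Too Many Requests
        self.counts[bucket] += 1
        return True


def check_json_limits(value, max_depth=5, max_entries=50, max_string_length=256, depth=1):
    """Return True if a parsed JSON value stays within the structural limits."""
    if isinstance(value, str):
        return len(value) <= max_string_length
    if isinstance(value, dict):
        # Object member names are limited in length as well.
        if any(len(name) > max_string_length for name in value):
            return False
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        return True  # numbers, booleans, and null carry no structure
    # Only containers (objects and arrays) count toward depth and entry limits.
    if depth > max_depth or len(children) > max_entries:
        return False
    return all(
        check_json_limits(child, max_depth, max_entries, max_string_length, depth + 1)
        for child in children
    )


# Hypothetical usage: one application key, 100 requests per minute.
limiter = FixedWindowRateLimiter(limit=100, window_seconds=60)
payload = json.loads('{"order": {"items": [1, 2, 3]}}')
request_ok = limiter.allow("app-1") and check_json_limits(payload)
```

Keying the limiter on an application identifier gives the per-application limit described above; keying it on a tuple such as (application, HTTP method) would give separate limits for GET and POST requests.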
Authentication and Authorization
Generally, authentication and authorization are used together.
- Authentication is used to identify an end user.
- Authorization is used to grant access to the resources the identified user has access to.
In the API world, OAuth and OpenID Connect are the most commonly used mechanisms to secure API endpoints by leveraging the existing IAM infrastructure instead of building a separate system every time to store usernames and passwords. Both protocols authenticate users against the existing IAM system in exchange for an access token, which is then used to access the API resources.
OpenID & OAuth History
OAuth
OAuth = Delegated Access using an access token, generally an OPAQUE token.
The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service. If you're storing protected data on your users' behalf, they shouldn't be spreading their passwords around the web to get access to it. Use OAuth to give your users access to their data while protecting their account credentials.
- OAuth is not an authentication protocol. It is an authorization protocol. Since an authentication usually occurs ahead of the issuance of an access token, it is tempting to consider the reception of an access token of any type as proof that such an authentication has occurred. However, mere possession of an access token doesn't tell the client anything on its own. In OAuth, the token is designed to be opaque to the client, but in the context of a user authentication, the client needs to be able to derive some information from the token. This problem stems from the fact that the client is not the intended audience of the OAuth access token. Instead, it is the authorized presenter of that token, and the audience is, in fact, the protected resource. The protected resource is generally not in a position to tell from the token alone whether the user is still present, since by the very nature and design of the OAuth protocol the user is not available on the connection between the client and the protected resource.
- Since the access token can be traded for a set of user attributes, it is tempting to think that possession of a valid access token is enough to prove that a user is authenticated. This assumption turns out to be true in some cases, where the token was freshly minted in the context of a user being authenticated at the authorization server. However, that's not the only way to get an access token in OAuth. Refresh tokens and assertions can be used to get access tokens without the user being present, and in some cases, access grants can occur without the user having to authenticate at all.
- Since OAuth is a delegation protocol, this is fundamental to its design. This means that if a client wants to make sure that an authentication is still valid, it's not sufficient to simply trade the token for the user's attributes again because the OAuth-protected resource, the identity API, often has no way of telling if the user is there or not.
Opaque Token: Many OAuth 2.0 implementations return opaque strings, called access tokens, in exchange for user credentials. These tokens are then used to access API resources. Opaque tokens are literally what they sound like: instead of storing the user identity and claims in the token, the opaque token is simply a primary key that references a database entry holding the data. Fast key-value stores like Redis and NoSQL databases like Cassandra are well suited to this, since in-memory hash tables give O(1) lookup of the payload. Since the roles are read from a database directly, roles can be changed and the user will see the new roles as soon as the changes propagate through the backend.
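The opaque token idea can be sketched as follows. This is an illustrative model only: the class `OpaqueTokenStore` and its methods are hypothetical, and a plain dictionary stands in for the Redis or Cassandra store mentioned above.

```python
import secrets


class OpaqueTokenStore:
    """Server-side store mapping random token strings to the data they stand for."""

    def __init__(self):
        # token -> record; an in-memory dict stands in for a key-value store
        self._records = {}

    def issue(self, user_id, roles):
        # The token is random and carries no user data of its own.
        token = secrets.token_urlsafe(32)
        self._records[token] = {"user_id": user_id, "roles": set(roles)}
        return token

    def lookup(self, token):
        # Returns None for unknown or revoked tokens.
        return self._records.get(token)

    def set_roles(self, user_id, roles):
        # Role changes apply to tokens that were already issued.
        for record in self._records.values():
            if record["user_id"] == user_id:
                record["roles"] = set(roles)

    def revoke(self, token):
        self._records.pop(token, None)


store = OpaqueTokenStore()
token = store.issue("user-42", ["reader"])
store.set_roles("user-42", ["reader", "admin"])
current_roles = store.lookup(token)["roles"]
```

Because every request resolves the token against the store, a role change or a revocation takes effect on the next lookup, at the cost of one store round trip per request.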
OpenID Connect
OpenID Connect = User Identity + Delegated Access using an ID Token and Access Token.
OpenID Connect = A standard for user authentication using OAuth.
- OpenID Connect is built directly on OAuth 2.0 and in most cases is deployed right along with (or on top of) an OAuth infrastructure.
- In addition to OAuth access and refresh tokens, it also provides the client an OpenID Connect ID Token. An ID Token is a signed JSON Web Token (JWT) that is given to the client application alongside the regular OAuth access token.
JSON Web Token: A JWT is a JSON object that has been base64url-encoded and then signed, either with a symmetric shared key or with a public/private key pair. The JWT can contain information like the subject or user_id, when the token was issued, and when it expires. Signing with a secret ensures that only a system with access to the secret can generate the token. One thing to keep in mind, though: while the JWT is signed, JWTs are usually not encrypted (although you can optionally encrypt them). This means any data in the token can be read by anyone who has access to the token. It is therefore good practice to place identifiers such as a user_id in the token, but not personally identifiable information like an email address or social security number. Also, tokens should be passed over an encrypted channel using TLS.
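To make the signed-but-not-encrypted point concrete, here is a minimal sketch of HS256 signing and verification built only on Python's standard library. The function names are hypothetical and the code is for illustration; a production system would normally rely on a maintained JWT library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped during encoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_jwt(token: str, secret: bytes, now: float) -> dict:
    header_b64, payload_b64, signature_b64 = token.split(".")
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload_b64))
    # A token without an exp claim is treated as expired in this sketch.
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    return claims


def read_claims_without_key(token: str) -> dict:
    # No secret needed: the payload is only encoded, not encrypted.
    return json.loads(b64url_decode(token.split(".")[1]))
```

`read_claims_without_key` is the reason PII should stay out of the payload: anyone holding the token can decode it, even though only the holder of the secret can produce a signature that `verify_jwt` accepts.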
JWT Limitations: Banning users or adding/removing roles is a little harder because the change is not reflected immediately. Remember, the JWT has a predefined expiration date, which may be set a week into the future. Since the token is stored client side, there is no way to directly invalidate it, even if the user to whom the JWT was issued is marked as disabled in the database. Rather, you must wait until it expires. This can influence your architecture, especially if you are designing a public API that could be starved by one power user or an e-commerce app where fraudulent users need to be banned.
There are workarounds, though. For example, if all you care about is banning compromised tokens or users, you can keep a blacklist of tokens or user_ids, but this may reintroduce a database into your auth framework. A recommended way to blacklist is to ensure each token has a jti (JWT ID) claim, which can be stored in the DB. Assuming that the number of tokens you would like to invalidate is much smaller than the number of users in your application, this may scale pretty easily.
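A token blacklist keyed on the JWT ID can be sketched like this. The names `TokenDenylist` and `new_claims` are hypothetical, and an in-memory dictionary stands in for the database mentioned above.

```python
import uuid


class TokenDenylist:
    """Track revoked token IDs until the tokens would have expired anyway."""

    def __init__(self):
        self._revoked = {}  # jti -> expiry time of the revoked token

    def revoke(self, jti, expires_at):
        self._revoked[jti] = expires_at

    def is_revoked(self, jti, now):
        # Entries for expired tokens are dropped: those tokens already fail
        # the normal expiry check, so the list stays small.
        self._revoked = {j: exp for j, exp in self._revoked.items() if exp > now}
        return jti in self._revoked


def new_claims(user_id, issued_at, lifetime_seconds):
    # Every token gets a unique ID so it can be revoked individually.
    return {
        "sub": user_id,
        "jti": str(uuid.uuid4()),
        "iat": issued_at,
        "exp": issued_at + lifetime_seconds,
    }


denylist = TokenDenylist()
claims = new_claims("user-42", issued_at=1000, lifetime_seconds=3600)
denylist.revoke(claims["jti"], claims["exp"])
```

The resource server would call `is_revoked` after verifying the signature and expiry. Only revoked, unexpired tokens are stored, which is why the list is expected to stay far smaller than the user table.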
For an enterprise app with many roles, such as admin, project owner, and service account manager, switching user roles may not take immediate effect with JWTs. In particular, think of the case where an admin modifies someone else's authorized roles, such as those of his/her immediate reports. The modified user doesn't even know his/her roles have changed without refreshing the JWT.
Mentioned below are certain use cases for implementing OpenID Connect:
1) Use Case 1 - Outbound Web Single Sign-on: To provide enterprise users access to SaaS applications and partner applications without exposing enterprise usernames/passwords.
2) Use Case 2 - Inbound Web Single Sign-on via Social Login/Third Party Login: To allow social/third party logins without storing the passwords of external users.
3) Use Case 3 - Enable Native Single Sign-on for Native apps.
Both OAuth and OpenID Connect support the four grant types defined by the OAuth 2.0 specification, and the diagram below uses a flow chart to depict scenarios for selecting a specific grant type. The diagram tries to generalize the scenarios but does not restrict or force the choice of the grant type shown in the flowchart. API developers might opt for a different grant type based on the constraints or implementation scenarios of the project.
Data Confidentiality and Masking PII
Passwords, security tokens, and API keys should not appear in the URL, as URLs can be captured in web server logs, which makes those logs intrinsically valuable to attackers. Also, personally identifiable and sensitive information like user IDs, passwords, account numbers, credit card numbers, etc. should be masked everywhere, including transaction and audit logs.
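Masking before logging can be sketched as follows. The parameter names in `SENSITIVE_PARAMS`, the card-number pattern, and the function names are assumptions chosen for illustration; a real deployment would tune them to its own data. URL redaction here is a safety net for logs, not a substitute for keeping secrets out of URLs in the first place.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of query parameters that must never reach a log file.
SENSITIVE_PARAMS = {"password", "token", "access_token", "api_key"}

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact_url(url: str) -> str:
    """Replace the values of sensitive query parameters before the URL is logged."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SENSITIVE_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


def mask_card_numbers(text: str) -> str:
    """Mask card-like numbers in free text, keeping only the last four digits."""

    def mask(match):
        digits = re.sub(r"\D", "", match.group(0))
        return "*" * (len(digits) - 4) + digits[-4:]

    return CARD_PATTERN.sub(mask, text)


safe_url = redact_url("https://api.example.com/v1/orders?api_key=abc123&page=2")
safe_line = mask_card_numbers("card 4111 1111 1111 1111 declined")
```

Both functions would be applied at the point where a log record is built, so that transaction and audit logs never receive the raw values.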
Security Practices for Public APIs
Public APIs that are meant to expose data which is non-sensitive and read-only in nature (e.g. Weather APIs), and that do not add any user authentication/authorization (since the data is user independent), are recommended to incorporate the points below to make the API robust against threats/abuse:
- Apply a rate limit policy at the IP address level.
- Apply API key verification against a public API key. The API key can be stored at the gateway itself without exposing it to clients. The benefit of applying API key verification is that traffic to the gateway through this public API can be blocked at any time by invalidating the key, in case a DoS attack is attempted on the API and other policies to block the attacker have failed. This will block all API traffic, including valid calls, but the infrastructure is protected.
- A quota policy (single or multiple quotas) should be applied to limit API usage.
- IP filtering should be applied at the geography level (country/region, etc.) if the API is meant to serve traffic for a specific geography.
- The developer should still be pushed to go through a one-time registration and call the API using his/her own API key.
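The API key and IP filtering points above can be sketched together. Everything here is hypothetical: the class `PublicApiGate`, the key value, and the use of a fixed list of networks in place of a real geography lookup, which would normally come from a GeoIP service. The addresses are from ranges reserved for documentation.

```python
import hmac
import ipaddress


class PublicApiGate:
    """Admit a request only if its API key is valid and its IP is in an allowed network."""

    def __init__(self, valid_keys, allowed_networks):
        self.valid_keys = set(valid_keys)
        # Assumption: the allowed geography has already been translated
        # into a list of IP networks.
        self.allowed_networks = [ipaddress.ip_network(n) for n in allowed_networks]

    def invalidate_key(self, key):
        # Kill switch: cuts off all traffic that presents this key.
        self.valid_keys.discard(key)

    def allows(self, api_key, client_ip):
        key_ok = any(hmac.compare_digest(api_key, key) for key in self.valid_keys)
        address = ipaddress.ip_address(client_ip)
        ip_ok = any(address in network for network in self.allowed_networks)
        return key_ok and ip_ok


gate = PublicApiGate(valid_keys={"demo-public-key"}, allowed_networks=["203.0.113.0/24"])
admitted = gate.allows("demo-public-key", "203.0.113.10")
```

Calling `gate.invalidate_key("demo-public-key")` models the emergency option described above: every later call with that key is refused, valid callers included, until a new key is issued.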
Conclusion
APIs are a great way to integrate applications within and across enterprises. They are quick and easy to implement. At the same time, they can be dangerous if not secured properly, exposing the entire enterprise to hackers. Therefore, API security should be architected and designed well before the development of the APIs actually starts. There is no proven technique to make APIs foolproof, but a well-designed API can greatly mitigate the security risks so that enterprises can reap the benefits of APIs.