Secure API Design With OpenAPI Specification
Effective API security uses a Positive Security model with well-defined, tested, and enforced contracts throughout the API lifecycle.
Editor’s Note: The following is an article written for and published in DZone’s 2021 API Design and Management Trend Report.
API security is at the forefront of cybersecurity. Emerging trends and technologies like cloud-native applications, serverless, microservices, single-page applications, and mobile and IoT devices have led to the proliferation of APIs. Application components are no longer internal objects communicating with each other on a single machine within a single process — they are APIs talking to each other over a network.
This significantly increases the attack surface. Moreover, by discovering and attacking back-end APIs, attackers can often bypass the front-end controls and directly access sensitive data and critical internal components. This has led to the proliferation of API attacks. Every week, there are new API vulnerabilities reported in the news. OWASP now has a separate list of top 10 vulnerabilities specifically for APIs. And Gartner estimates that by 2022, APIs are going to become the number one attack vector.
Traditional web application firewalls (WAF) with their manually configured deny and allow rules are not able to determine which API call is legitimate and which one is an attack. For them, all calls are just GETs and POSTs with some JSON being exchanged.
Anomaly detection (AI, ML) solutions, which rely on large volumes of 100% legitimate traffic, are prone to false positives and false negatives; cannot deterministically explain why a specific call is “suspicious”; and frankly just cannot be expected to magically turn any insecure API into a secure one — no matter how poorly designed and implemented.
To solve this problem, companies increasingly turn to the Positive Security model. They define the expected API behavior, ensure that the definition is strict and detailed enough to be meaningful, test the implementation to conform to the definition, and then enforce that definition during API use. Any calls outside of what the API expects and any responses outside of what the API is supposed to return get automatically rejected.
Figure 1: Positive Security API protection
Unlike traditional applications, APIs actually have a way to define their expected inputs and outputs in a standard, machine-readable model. Specifically, for the most popular type of APIs today — REST APIs — this contract format is OpenAPI. It originated as the Swagger standard created for documentation purposes and later got adopted by the OpenAPI Initiative under the Linux Foundation. Most of the development and API tools on the market today either use OpenAPI natively or support import and export in that format.
Let’s look at the specific parts of the OpenAPI specification and their role in security.
Note: We will be using the YAML format and OpenAPI version 3 for the examples here, but JSON is equally fine as the format and one can find similar ways of specifying API behavior in version 2, also known as Swagger.
Paths
API paths are the most basic part of an API definition. Together with the server URL and base path, they document which URLs API clients should invoke:
paths:
  /activation:
    ...
  /books:
    ...
  /books/{id}:
    ...
Paths are mandatory in OpenAPI contracts and rightly so. Fully documenting all APIs and API paths exposed by your servers, and rejecting calls to anything else, helps block path traversal attacks and keeps attackers from reaching shadow endpoints (staging, non-production, and legacy), as described in OWASP API9:2019 (Improper assets management).
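To make the idea concrete, here is a minimal Python sketch of checking an incoming request path against the documented path templates. The `DECLARED_PATHS` list simply mirrors the example above, and the helper names (`template_to_regex`, `match_declared_path`) are hypothetical; a real API firewall or gateway would load the templates from the contract itself.

```python
import re

# Hypothetical contract fragment mirroring the paths example above.
DECLARED_PATHS = ["/activation", "/books", "/books/{id}"]


def template_to_regex(template):
    """Turn an OpenAPI path template into a compiled regex.

    Each {name} placeholder matches exactly one non-empty path segment,
    so a placeholder can never swallow a "/" and span several segments.
    """
    parts = re.split(r"(\{[^{}/]+\})", template)
    pattern = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part)
        for part in parts
    )
    return re.compile(pattern)


COMPILED = [(t, template_to_regex(t)) for t in DECLARED_PATHS]


def match_declared_path(request_path):
    """Return the matching template, or None if the path is undocumented."""
    for template, regex in COMPILED:
        if regex.fullmatch(request_path):
            return template
    return None


if __name__ == "__main__":
    for candidate in ["/books/42", "/admin", "/books/42/../../admin"]:
        print(candidate, "->", match_declared_path(candidate))
```

Anything that returns `None` here is a call to a path the contract does not know about, which is exactly the traffic a Positive Security model rejects.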
Operations
Operations are the HTTP verbs (get, post, put, patch, delete, head, options, and trace) that a path can support.
paths:
  /ping:
    get:
      ...
It is extremely important to document the operations that each path supports and reject anything else.
For example, there are multiple tools that generate quick APIs on top of databases. These can be very convenient for developers creating front-end user interfaces and thus wanting to get a REST API to write and retrieve back-end data.
This convenience can come back to bite later down the line when an attacker discovers that instead of a GET operation on a particular path (for example, /accounts), they can send a PUT or even a DELETE, thus actually changing your back-end data on a path that you intended to be read-only.
Another, more subtle example of a vulnerability that originated with operations not getting enforced happened to GitHub in 2019. GitHub’s authentication endpoint was supposed to only expose GET and POST operations. Unfortunately, unbeknownst to GitHub developers, the Rails framework that they were using also exposed a HEAD operation for each GET. GitHub’s code was expecting only a GET or a POST and thus had a simple if/else statement, so sending it an unexpected third option led to unexpected code behavior and an authentication bypass.
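A minimal Python sketch of the corresponding check might look like the following. The `CONTRACT` mapping and the `/login` path are assumptions made up for illustration; the point is only that an operation missing from the contract is rejected before it ever reaches application code.

```python
# Hypothetical contract: each path lists only the operations it documents.
CONTRACT = {
    "/ping": {"get"},
    "/accounts": {"get"},
    "/login": {"get", "post"},
}


def check_operation(path, method):
    """Return the HTTP status a contract-enforcing proxy could answer with.

    404 for an undocumented path, 405 for an undocumented operation on a
    documented path, 200 when the call may be forwarded to the backend.
    """
    operations = CONTRACT.get(path)
    if operations is None:
        return 404
    if method.lower() not in operations:
        return 405
    return 200


if __name__ == "__main__":
    print(check_operation("/accounts", "DELETE"))
    print(check_operation("/login", "HEAD"))
    print(check_operation("/ping", "GET"))
```

With a check like this in front of the backend, an unexpected HEAD on an endpoint that documents only GET and POST is answered with 405 and never reaches an if/else branch that was not written for it.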
Parameters
API operations can have parameters such as:
- Part of the path (e.g., /book/2951; see the example in the Paths section above)
- Part of a query in the URL (e.g., /book?id=2951)
- HTTP header
- Cookie
paths:
  /users/{id}:
    get:
      parameters:
        - in: path # alternatively can be query, header, cookie
          name: id
          required: true
          schema:
            type: string
            minLength: 1
            maxLength: 20
            pattern: '^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,5})$'
Parameters are not mandatory, and it is not mandatory to describe their format and limits in such detail as we did in the example above, but it is a very good idea to do so. Locking down strings to their exact pattern — with a strict regular expression — can save you from potential API8:2019 — Injection attacks. And even without injection, you never know how your back-end code will react when being sent something outside of the expected range (this happened to the license key API of the Steam marketplace, causing it to leak thousands of game license keys).
TL;DR: Document all your parameters and provide as much detail as possible.
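As a rough illustration of what enforcing such a parameter schema involves, the Python sketch below checks a string against the `minLength`, `maxLength`, and `pattern` keywords from the example above. The function name `validate_string` is hypothetical, and only these three keywords are handled; real deployments would normally rely on a full JSON Schema validator.

```python
import re

# Schema copied from the parameter example above.
ID_SCHEMA = {
    "type": "string",
    "minLength": 1,
    "maxLength": 20,
    "pattern": r"^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,5})$",
}


def validate_string(value, schema):
    """Return a list of violations; an empty list means the value is accepted."""
    if not isinstance(value, str):
        return ["not a string"]
    errors = []
    if len(value) < schema.get("minLength", 0):
        errors.append("too short")
    if "maxLength" in schema and len(value) > schema["maxLength"]:
        errors.append("too long")
    pattern = schema.get("pattern")
    # Schema patterns are unanchored searches, so re.search is used here.
    # Caveat: Python's "$" also matches just before a trailing newline,
    # which differs from the ECMAScript dialect the schema assumes.
    if pattern is not None and re.search(pattern, value) is None:
        errors.append("pattern mismatch")
    return errors


if __name__ == "__main__":
    for candidate in ["bob@example.com", "", "1 OR 1=1"]:
        print(repr(candidate), validate_string(candidate, ID_SCHEMA))
```

Returning every violation rather than stopping at the first one makes rejected calls easier to explain, which is one of the advantages of a contract over anomaly detection.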
Payloads
Both requests and responses may contain data in the body. REST APIs typically use JSON as the format to exchange data. While not mandatory, it is very much recommended that you strictly describe the schema of such requests and responses (just like we discussed with parameters above):
/book:
  post:
    requestBody:
      required: true
      content:
        application/json:
          schema:
            type: object
            properties:
              isbn:
                type: string
                pattern: 'ISBN\x20(?=.{13}$)\d{1,5}([- ])\d{1,7}\1\d{1,6}\1(\d|X)$'
                maxLength: 18
              quantity:
                type: integer
                minimum: 1
                maximum: 10
            required:
              - isbn
              - quantity
            additionalProperties: false
Note that you should also explicitly specify the type of payload as object and set the additionalProperties value to false. These prevent additional properties from being inserted into the call.
These extra steps serve as protection against API6:2019 (Mass assignment) attacks, in which a back-end implementation blindly saves to the database any properties it gets and inadvertently overwrites sensitive fields. This kind of API vulnerability recently happened to a popular container registry system, Harbor, allowing attackers to make themselves admins by simply including "has_admin_role":true in the API call to change user profile details.
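The effect of `additionalProperties: false` can be sketched in a few lines of Python. The schema below is a trimmed copy of the request body example above, and `check_object` is a hypothetical helper that looks only at `required` and `additionalProperties`; it does not validate types, ranges, or patterns.

```python
# Trimmed copy of the request body schema from the example above.
BOOK_ORDER_SCHEMA = {
    "type": "object",
    "properties": {
        "isbn": {"type": "string"},
        "quantity": {"type": "integer", "minimum": 1, "maximum": 10},
    },
    "required": ["isbn", "quantity"],
    "additionalProperties": False,
}


def check_object(payload, schema):
    """Report missing required properties and properties the schema never declared."""
    if not isinstance(payload, dict):
        return ["payload is not an object"]
    errors = []
    declared = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in payload:
            errors.append(f"missing required property: {name}")
    if schema.get("additionalProperties") is False:
        for name in payload:
            if name not in declared:
                errors.append(f"unexpected property: {name}")
    return errors


if __name__ == "__main__":
    attack = {"isbn": "ISBN 978-0-596-52068-7", "quantity": 1, "has_admin_role": True}
    print(check_object(attack, BOOK_ORDER_SCHEMA))
```

Because the injected property is rejected at the edge, it does not matter whether the backend would have blindly persisted it.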
Response Payloads
Surprisingly, response bodies are as important and should be defined just as rigorously. However, the primary reason for that is different. Documented and enforced responses are a great way to fight API3:2019 — Excessive data exposure flaws that happen when an API inadvertently leaks more data than the user is expected to see — be it extra properties, extra elements (as in the Steam example that we discussed earlier), or even just crash traces when the backend fails.
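The same kind of check applies on the way out. The following Python sketch assumes a hypothetical `DECLARED_RESPONSES` table listing, per documented status code, the only properties a response body may carry; the property names are invented for illustration.

```python
# Hypothetical contract fragment: for each documented status code, the only
# properties the response body is allowed to contain.
DECLARED_RESPONSES = {
    200: {"id", "title", "price"},
    404: {"message"},
}


def response_violations(status, body):
    """Return the reasons a response should be blocked; empty list means it may pass."""
    declared = DECLARED_RESPONSES.get(status)
    if declared is None:
        # For example, an undocumented 500 carrying a crash trace.
        return [f"undocumented status code: {status}"]
    extra = sorted(set(body) - declared)
    return [f"undeclared property: {name}" for name in extra]


if __name__ == "__main__":
    print(response_violations(200, {"id": 1, "title": "Dune", "supplier_cost": 3.5}))
    print(response_violations(500, {"trace": "..."}))
```

A response that fails this check can be replaced with a generic error before it leaves the network, so extra fields and stack traces stay internal.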
Authentication and Authorization
Authentication is important and OpenAPI has a way to define the security scheme that your API is supposed to have (basic, bearer, API keys, OAuth2, OpenID Connect, and so on):
components:
  securitySchemes:
    OAuth2:
      type: oauth2
      flows:
        authorizationCode:
          authorizationUrl: https://example.com/oauth/authorize
          tokenUrl: https://example.com/oauth/token
          scopes:
            read: Grants read access
            write: Grants write access
            admin: Grants access to admin operations
And then apply it at any level (the whole API or just a specific operation):
paths:
  /user:
    post:
      security:
        - OAuth2: [admin]
As you can also see in the examples above, scopes are natively supported by the standard.
Starting with version 3.1, OpenAPI also natively supports mutual TLS (mTLS) client-certificate authentication through the mutualTLS security scheme type. And there are third-party extensions that provide further authentication and authorization policies, for example, JWT, which we will discuss briefly below.
Needless to say, authentication and authorization are extremely important to API security. So many APIs are getting hacked because:
- They were meant to be “internal,” and developers never expected attackers to find a way to get to the network and invoke them (see recent Mercedes-Benz API hack).
- Authentication or authorization was poorly designed and didn’t follow industry best practices (see API2:2019 Broken authentication, API1:2019 Broken object level authorization, and API5:2019 Broken function level authorization).
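How a `security` requirement with scopes might be evaluated is sketched below in Python. It assumes the caller's token has already been verified and reduced to a mapping from scheme name to granted scopes; the function name `is_authorized` and that mapping are illustrative assumptions, not part of OpenAPI itself.

```python
def is_authorized(security, granted):
    """Evaluate an OpenAPI-style `security` list against verified credentials.

    security: list of requirement objects, e.g. [{"OAuth2": ["admin"]}].
              Alternatives in the list are ORed; schemes inside one
              requirement object are ANDed, and every listed scope is needed.
    granted:  mapping of scheme name to the set of scopes the caller holds.
    """
    if not security:
        # An empty list means the operation declares no security requirement.
        return True
    for requirement in security:
        if all(
            scheme in granted and set(scopes) <= granted[scheme]
            for scheme, scopes in requirement.items()
        ):
            return True
    return False


if __name__ == "__main__":
    create_user = [{"OAuth2": ["admin"]}]
    print(is_authorized(create_user, {"OAuth2": {"read", "write"}}))
    print(is_authorized(create_user, {"OAuth2": {"read", "admin"}}))
```

Note that this covers only function-level checks driven by the contract; object-level authorization (whether this caller may touch this particular record) still has to be implemented in the backend.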
Transport
In this day and age, you should always encrypt your traffic and use HTTPS rather than HTTP. In OpenAPI version 3, this is specified by the protocol prefix of the server URL:
servers:
  - url: https://api.example.com/v1
Use of HTTP opens you up to man-in-the-middle attacks in which your traffic gets intercepted and an attacker pretends to be your legitimate API consumer. See the recent ASUS and Dell attacks for examples of such flaws.
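A contract audit step can flag insecure transport automatically. The Python sketch below assumes the OpenAPI document has already been parsed into a dictionary; `non_https_servers` is a hypothetical helper, and the staging URL is invented for the example.

```python
from urllib.parse import urlsplit


def non_https_servers(spec):
    """Return every server URL in the contract that is not explicitly HTTPS.

    Relative URLs (no scheme) are flagged as well, since the contract alone
    cannot prove they will be served over TLS.
    """
    flagged = []
    for server in spec.get("servers", []):
        url = server.get("url", "")
        if urlsplit(url).scheme.lower() != "https":
            flagged.append(url)
    return flagged


if __name__ == "__main__":
    spec = {
        "servers": [
            {"url": "https://api.example.com/v1"},
            {"url": "http://staging.example.com/v1"},
        ]
    }
    print(non_https_servers(spec))
```

Running a check like this in a CI/CD pipeline catches an `http://` server entry before the contract is published.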
Tooling
One other aspect that we will cover briefly here is tooling support. This article is not meant to be a tooling review, so we will only provide a few pointers. Your API contract is only helpful if you actually use it, and the use is different depending on the stage of the lifecycle.
Figure 2: Stages of API lifecycle
Developers can use specialized OpenAPI tools or OpenAPI plugins for VS Code, IntelliJ, and Eclipse. Additionally, CI/CD pipelines can include static and dynamic API testing based on OpenAPI contracts. API firewalls and API gateways can enforce Positive Security based on the contracts, both on external (North-South) and internal (East-West) levels, as demonstrated in Figure 3:
Figure 3: API firewalls and gateways
Conclusion
Leveraging the advice from this article, you will be well covered for 95%+ of possible attacks. But realize that OpenAPI is still an evolving standard and its coverage of API security is also evolving. For example, we already mentioned that up until version 3.1, mTLS authentication was not something you could describe using OpenAPI, and there are still other aspects not covered out of the box, like JSON Web Token requirements, control of response headers, rate limiting, and so on.
Luckily, OpenAPI allows you to extend it using your own custom objects and properties. This is something that you can do yourself or by using custom extensions from the API security or API management vendor of your choice. With API attacks on the rise, the most effective way to ensure you’re protected is by using a Positive Security model with well-defined, tested, and enforced contracts throughout the API lifecycle.