How to Check Open Source Code for Vulnerabilities
As open source code becomes a greater part of the foundation of the tech we use every day, it's important that developers know how to check it for security vulnerabilities.
Humans are creating and sharing massive amounts of code, and the amount is increasing. You can get a sense of this by looking at this chart of GitHub's growth in repositories from 2008 to 2013:
This is classic Hockey Stick Growth.
"The first million repositories were created in just under 4 years; 3 years, 8 months and 15 days to be exact. This last million took just 48 days. In fact, over 5.5M repositories — more than half of the repositories on the site — were created this year alone." - Brian Doll, GitHub
This massive code sharing is both a blessing and a curse. On one hand, it means that developers aren't wasting a lot of time re-inventing the wheel. Once someone solves a problem with code, they can share that code with the world on sites like GitHub and BitBucket. On the other hand, software projects are increasingly reliant on open source code. The average application contains a staggering 46 components. This makes it difficult for developers to even know which components they're using, let alone keep up with vulnerabilities.
You may raise an eyebrow at the number of components cited above, but consider this: transitive dependencies. A software project may only depend on four or five libraries, but each of those libraries may have additional dependencies, and those dependencies may have still more, and so on. In the end, the amount of code for any given application which is unique is likely dwarfed by the amount of code in direct and transitive dependencies.
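To make the effect of transitive dependencies concrete, here is a minimal sketch in Python. The dependency map and every library name in it are hypothetical, invented for illustration; a real project would get this data from its build manager.

```python
from collections import deque

# Hypothetical dependency map: each key is a library, each value is the
# list of libraries it depends on directly. All names are made up.
DIRECT_DEPS = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["http-client", "template-engine"],
    "http-client": ["tls-lib"],
    "template-engine": ["json-parser"],
    "json-parser": [],
    "tls-lib": [],
}


def transitive_dependencies(root, direct_deps):
    """Return every library reachable from root, excluding root itself."""
    seen = set()
    queue = deque(direct_deps.get(root, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # already visited, also protects against cycles
        seen.add(name)
        queue.extend(direct_deps.get(name, []))
    seen.discard(root)  # a cycle could lead back to the root
    return seen


all_deps = transitive_dependencies("my-app", DIRECT_DEPS)
print(len(DIRECT_DEPS["my-app"]), "direct,", len(all_deps), "total")
```

Even in this toy graph, two direct dependencies turn into five components that need to be tracked.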
"Over 80% of the software in our handsets is open source."
Some estimates put the share of applications that contain open source components with vulnerabilities as high as 44%. The most recent and dramatic example of a company getting hacked because of an open source vulnerability is Equifax, whose breach was caused by a vulnerability in the Struts2 package. Before Equifax, there were malicious libraries found in Python's PyPI package repository. Before that, there was a similar case of package hijacking in JavaScript's npm repository. And let's not forget the two huge vulnerabilities in GnuTLS and OpenSSL (Heartbleed) that really got everyone thinking about open source code security.
Once you accept the premise that software projects contain many components and these components may have vulnerabilities, the question immediately arises: How do you inventory components and check for vulnerabilities? The answer can be boiled down into three steps:
Collect dependencies, including transitive dependencies.
Search for vulnerabilities for each dependency.
Remediate by either upgrading, patching your code, or patching dependency code.
This can be done manually, but it doesn't scale. It's only feasible if you have one or two projects with a dozen or so dependencies. If you're securing an entire organization's code, it's not cost-effective. Also, it's just plain hard to find vulnerabilities because most vulnerabilities are never reported. This is why I recommend using a component inventory and vulnerability checking tool such as SourceClear, BlackDuck, VeraCode, OWASP Dependency Check, or Nexus Repository Pro.
Option 1: Use a Tool
SourceClear is easy to use, and its free features are pretty good. They do all the work of monitoring vulnerability disclosure databases, searching repositories for undisclosed vulnerabilities, and analyzing code for security bugs.
When you scan a project using SourceClear, their tool enumerates all of your project's dependencies, tells you if any of them are vulnerable, and gives remediation instructions.
Paying for premium services gets you access to analysis which can determine if you're using the specific vulnerable code. It's somewhat common to depend on a vulnerable library in a way that doesn't actually expose you to the vulnerability. Knowing this can save you a lot of time and effort if you're a large organization with a lot of processes, since the cost of upgrading a single dependency can be quite high and you'll want to avoid upgrading unless necessary. However, it's always good hygiene to upgrade to a non-vulnerable version, and you have no excuse for not doing this if you're a single developer.
Here's an example of what their site shows after scanning an example Ruby application: srcclr/example-ruby.
You can see that it lists all the vulnerabilities found in the project as well as an inventory of the project's libraries and how they're licensed. Enterprises tend to be more concerned about licensing than developers because they need to avoid using open source components which have a copyleft license such as GPL.
SourceClear's vulnerability registry is browsable for free, and some vulnerabilities have full technical teardowns. For example, the entry for CVE-2015-3253, Remote Code Execution Through Object Deserialization, describes the vulnerability and the fix in detail.
In addition to complete auditing systems, there are specialized tools which work for single languages or build managers. Below is a list of several tools:
bundler audit - scans Ruby projects which use Bundler against Ruby Advisory DB
auditjs - scans JavaScript projects which use npm against OSS Index
OSS Index Gradle Plugin - scans Gradle projects against OSS Index
OSS Index Maven Plugin - scans Maven projects against OSS Index
Option 2: Check Manually
While checking manually is more work, it's a good exercise because it'll give you a better understanding of how many transitive dependencies you have as well as familiarity with open source vulnerability databases.
Collecting Dependencies
Most build managers such as Gradle, Maven, npm, Ruby Gems, and Python's Pip include a way of listing a project's dependencies. The table below shows commands for collecting dependency information for several popular build managers.
Build Manager | Command | Notes |
Gradle | gradle dependencies | Prints dependency tree |
npm | npm ls <package name> | Prints dependency tree |
Ruby Gems | gem dependency <gem name> | Create Gemfile.lock style dependency list |
Python pip | pip download <package name> -d <temp dir> --no-binary :all: | Downloads all dependencies to <temp dir> |
Maven | mvn dependency:tree | Prints dependency tree |
Table 1: Commands for collecting dependencies
If you've been using any of these build managers for some time, you may have already used these commands.
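The output of these commands is meant for people to read, so checking it against a database usually starts with turning it into a name-to-version inventory. Below is a minimal sketch in Python that assumes the dependencies are already listed one per line as name==version, which is the format pip freeze prints. The function name parse_pinned and the package names in the sample are hypothetical, and other build managers would need their own parsers.

```python
def parse_pinned(lines):
    """Parse 'name==version' lines into a {name: version} dict."""
    inventory = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # skip blank lines and anything that is not pinned
        name, version = line.split("==", 1)
        inventory[name.strip().lower()] = version.strip()
    return inventory


# Hypothetical sample input; the package names are made up.
SAMPLE = """\
# pinned dependencies
ExampleLib==1.4.2
another-lib==0.9.0

unpinned-lib
"""

inventory = parse_pinned(SAMPLE.splitlines())
print(inventory)
```

Lower-casing the names is a design choice that makes later lookups less sensitive to how a package name happens to be spelled in a given listing.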
Check Vulnerability Databases
After getting a list of your dependencies, you need to check the various vulnerability databases. A good place to start would be SourceClear's Knowledge Center which contains a lot of information aggregated from several free sources.
Additionally, there are open source vulnerability databases you can search:
Mitre's Common Vulnerabilities and Exposures (CVE) - Contains information on many types of vulnerabilities, including commercial applications and closed source projects. It's been around forever and everyone in application security knows what a CVE number is.
Ruby Advisory DB - Has vulnerability information for Ruby libraries, also known as gems.
OSS Index - Has vulnerability information for several types of build managers including npm, Maven, and others.
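Once you have advisory data from one of these sources, the check itself is a comparison between your inventory and the advisories. The Python sketch below shows the idea under strong simplifying assumptions: the advisory records, their IDs, and the package names are all hypothetical, versions are purely numeric and dotted, and every version before the fixed one is treated as affected. Real databases describe affected ranges in more detail.

```python
def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical advisories: package -> list of (advisory id, first fixed version).
ADVISORIES = {
    "examplelib": [("EXAMPLE-0001", "1.5.0")],
    "another-lib": [("EXAMPLE-0002", "0.8.1")],
}


def find_vulnerable(inventory, advisories):
    """Return (package, installed, advisory id, fixed version) for each hit."""
    hits = []
    for name, installed in sorted(inventory.items()):
        for advisory_id, fixed_in in advisories.get(name, []):
            # Simplification: anything older than the fix counts as affected.
            if parse_version(installed) < parse_version(fixed_in):
                hits.append((name, installed, advisory_id, fixed_in))
    return hits


inventory = {"examplelib": "1.4.2", "another-lib": "0.9.0"}
for hit in find_vulnerable(inventory, ADVISORIES):
    print(hit)
```

Comparing versions as tuples of integers avoids the classic string-comparison mistake where "1.10.0" sorts before "1.9.0".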
Once you identify a vulnerable dependency, you must somehow remediate it. For most vulnerabilities, remediation is as simple as upgrading to a non-vulnerable version. More popular dependencies sometimes have point releases with security fixes. This makes it easier to remediate without introducing breaking changes which often accompany upgrading to newer versions of dependencies. However, in some cases, especially if the project is no longer maintained, remediation may be more difficult. You may need to investigate the disclosure and understand the technical details of an exploit in order to patch the code yourself. You can read more about this process in Reversing an Open Source Vulnerability.
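The preference for point releases described above can be sketched as a small selection rule. This Python example is an illustration under assumptions, not a prescribed method: the function pick_upgrade is hypothetical, versions are assumed to be numeric and dotted, and "same release line" is taken to mean the same major and minor numbers.

```python
def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def pick_upgrade(installed, available, vulnerable):
    """Pick the closest non-vulnerable version newer than the installed one.

    Prefers a release in the same major.minor line (a point release),
    since that is less likely to bring breaking changes.
    Returns None when no suitable version exists.
    """
    current = parse_version(installed)
    candidates = sorted(
        parse_version(v)
        for v in available
        if v not in vulnerable and parse_version(v) > current
    )
    if not candidates:
        return None
    same_line = [c for c in candidates if c[:2] == current[:2]]
    chosen = same_line[0] if same_line else candidates[0]
    return ".".join(str(n) for n in chosen)


# Hypothetical release history for a made-up library.
available = ["1.4.2", "1.4.3", "1.5.0", "2.0.0"]
vulnerable = {"1.4.2"}
print(pick_upgrade("1.4.2", available, vulnerable))
```

When no point release exists, the sketch falls back to the nearest newer version, which is the case where you should expect to review the changelog for breaking changes.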
Check Git Commits
Many developers don't report vulnerabilities. This could be because they don't understand how publicly announcing vulnerabilities helps inform and motivate people to update, or maybe they're just lazy. In any case, disclosure means more work for the developer, and it's not good to rely on people doing extra work. This problem is explored more in Why Most Vulnerabilities Are Never Disclosed.
The bottom line is that important security bugs are sometimes fixed without being properly communicated. In order to find these fixes, it's sometimes necessary to audit source code commits. This can be done manually by going through commits looking for suspicious words like "security," "exploit," and "secret backdoor." This can be a lot of work, so I suggest using a tool like commit-watcher. It can audit multiple repositories by scanning commit code and messages according to flexible, user-defined rules. It's surprisingly easy to use the default rule set on popular projects to find undisclosed vulnerabilities.
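To show what a manual keyword pass over commit messages might look like, here is a minimal Python sketch. It is not how commit-watcher works internally; the keyword list is an assumption you would tune, and the sample commits are made up. The input format, one commit per line as hash, tab, subject, is what git log --pretty=format:"%H%x09%s" produces.

```python
import re

# Assumed keyword list; tune it for the projects you audit.
SUSPICIOUS = re.compile(
    r"\b(security|exploit|vulnerab\w*|backdoor|xss|overflow)\b",
    re.IGNORECASE,
)


def flag_commits(log_lines):
    """Return (commit hash, subject) pairs whose subject looks suspicious.

    Expects lines of 'hash<TAB>subject'.
    """
    flagged = []
    for line in log_lines:
        commit_hash, _, subject = line.partition("\t")
        if subject and SUSPICIOUS.search(subject):
            flagged.append((commit_hash, subject))
    return flagged


# Hypothetical commit log for illustration.
SAMPLE_LOG = [
    "a1b2c3d\tFix typo in README",
    "d4e5f6a\tSanitize input to prevent XSS exploit",
    "0f9e8d7\tBump version to 1.4.3",
]

for commit_hash, subject in flag_commits(SAMPLE_LOG):
    print(commit_hash, subject)
```

A keyword match is only a lead. Each flagged commit still needs a human to read the diff and decide whether it fixes a security bug.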
Summary
Understanding how to check your projects for open source vulnerabilities is an increasingly important topic as we continue to build software which relies on other people's code. Hopefully, after reading this article, you have a better understanding of the process as well as some names of tools you can investigate if you need to audit and inventory code at scale.