EclipseStore High-Performance-Serializer
I will introduce you to the serializer from the EclipseStore project and show you how to use it to take advantage of a new type of serialization.
Since I learned Java over 20 years ago, I have wanted a simple way to serialize Java object graphs without the security and performance issues that Java's built-in serialization brought us. It should be as simple as the following:
byte[] data = serializer.serialize(objectGraph);
Node objectGraphDeserialized = serializer.deserialize(data);
Do you want to know how this is doable with the new open-source project Eclipse Serializer? You are in the right place.
Before we look at Eclipse Serializer, I want to recap some of the challenges that come with Java Serialization itself. This background will help you see how powerful the project is.
Java Serialization in a Nutshell
Java Serialization is a mechanism provided by the Java programming language that allows you to convert the state of an object into a byte stream. This byte stream can be easily stored in a file, sent over a network, or otherwise persisted. Later, you can deserialize the byte stream to reconstruct the original object, effectively saving and restoring the object's state.
Here are some key points about Java Serialization:
Serializable Interface: To make a Java class serializable, it needs to implement the Serializable interface. This interface doesn't have any methods; it acts as a marker interface to indicate that objects of this class can be serialized.
import java.io.Serializable;
public class MyClass implements Serializable {
    // class members and methods
}
Serialization Process: To serialize an object, you typically use ObjectOutputStream. You create an instance of this class and write your object to it. For example:
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
// Write a MyClass instance to the file "object.ser"
try (FileOutputStream fileOut = new FileOutputStream("object.ser");
     ObjectOutputStream out = new ObjectOutputStream(fileOut)) {
    MyClass obj = new MyClass();
    out.writeObject(obj);
} catch (IOException e) {
    e.printStackTrace();
}
Deserialization Process: To deserialize an object, you use ObjectInputStream. You read the byte stream from a file or network and then use ObjectInputStream to recreate the object.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
// Read the MyClass instance back from the file "object.ser"
try (FileInputStream fileIn = new FileInputStream("object.ser");
     ObjectInputStream in = new ObjectInputStream(fileIn)) {
    MyClass obj = (MyClass) in.readObject();
    // Now 'obj' contains the deserialized object
} catch (IOException | ClassNotFoundException e) {
    e.printStackTrace();
}
Versioning Considerations: If you change a serialized class's structure (e.g., adding or removing fields or changing their types), the deserialization of old serialized objects may fail. Java provides mechanisms like serialVersionUID to help with versioning and compatibility.
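As a minimal sketch of the versioning point, a class can pin its stream version explicitly with serialVersionUID. The class Customer and the helper SerialVersionDemo are hypothetical names chosen for illustration; only the java.io calls are standard JDK API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical example class; not part of the original article.
class Customer implements Serializable {
    // Explicit version id: keep it stable while changes stay compatible
    // (e.g., adding a field); change it to reject old streams on purpose.
    private static final long serialVersionUID = 1L;

    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class SerialVersionDemo {
    // Returns the serialVersionUID the runtime associates with a serializable class.
    static long versionOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    // Serializes and deserializes an object in memory.
    static Object roundTrip(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}
```

If no serialVersionUID is declared, the runtime derives one from the class structure, which is why seemingly harmless edits to a class can make old streams unreadable.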
Security Considerations: Serialization can be a security risk, especially if you deserialize data from untrusted sources. Malicious code can be executed during deserialization. To mitigate this risk, you should carefully validate and sanitize any data you deserialize or consider alternative serialization mechanisms like JSON or XML.
Custom Serialization: In your class, you can customize the serialization and deserialization process by providing writeObject and readObject methods. These methods allow you to control how the object's state is written to and read from the byte stream.
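The custom serialization hooks can be sketched as follows. The class Measurement and the helper CustomSerializationDemo are hypothetical and only illustrate the mechanism: the transient field is skipped by the default mechanism and rebuilt in readObject, and one extra value is written by hand to show that both methods must agree on the order of the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example class; not part of the original article.
class Measurement implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double[] samples;
    private transient double average; // derived value, skipped by default serialization

    Measurement(double[] samples) {
        this.samples = samples.clone();
        this.average = computeAverage(this.samples);
    }

    double getAverage() {
        return average;
    }

    private static double computeAverage(double[] values) {
        if (values.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Called by ObjectOutputStream instead of the default mechanism.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();     // writes the non-transient field 'samples'
        out.writeInt(samples.length); // extra data: expected number of samples
    }

    // Called by ObjectInputStream; reads in the same order writeObject wrote.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int expected = in.readInt();
        if (samples == null || samples.length != expected) {
            throw new InvalidObjectException("sample count mismatch");
        }
        average = computeAverage(samples); // rebuild the transient field
    }
}

class CustomSerializationDemo {
    // Serializes and deserializes an object in memory.
    static Object roundTrip(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}
```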
In summary, Java Serialization is a valuable feature for persisting objects and sending them across a network. However, it comes with some challenges related to versioning and security, so it should be used cautiously, especially when dealing with untrusted data sources.
What Are the Security Issues?
Java Serialization can introduce several security issues, particularly when deserializing data from untrusted sources. Here are some of the security concerns associated with Java Serialization:
- Remote code execution: One of the most significant security risks with Java Serialization is the potential for remote code execution. During deserialization, an attacker-crafted byte stream can chain together classes that are already on the classpath (so-called gadget chains) so that their methods run with attacker-controlled state. Attackers can exploit this to execute malicious code on the target system, which can lead to serious security breaches.
- Denial of Service (DoS): An attacker can create a serialized object with a large size, causing excessive memory consumption and potentially leading to a denial of service attack. Deserializing large objects can consume significant CPU and memory resources, slowing down or crashing the application.
- Data tampering: Serialized data can be tampered with during transmission or storage. Attackers can modify the serialized byte stream to alter the state of the deserialized object or introduce vulnerabilities.
- Insecure deserialization: Deserializing untrusted data without proper validation can lead to security issues. For example, if a class that performs sensitive operations is deserialized from untrusted input, an attacker can manipulate the object's state to perform unauthorized actions.
- Information disclosure: When objects are serialized, sensitive information may be included in the serialized form. If this data is not adequately protected or encrypted, an attacker may gain access to sensitive information.
How To Mitigate Serialization Issues
To mitigate these security issues, consider the following best practices:
- Avoid deserializing untrusted data: If possible, avoid deserializing data from untrusted sources altogether. Instead, use safer data interchange formats like JSON or XML for untrusted data. (Or use the EclipseSerializer ;-) )
- Implement input validation: When deserializing data, validate and sanitize the input to ensure it adheres to expected data structures and doesn't contain unexpected or malicious data.
- Use security managers: Java's Security Manager can be used to restrict the permissions and actions of deserialized code. However, the Security Manager was deprecated for removal in Java 17 (JEP 411), so it is not a long-term solution.
- Whitelist classes: Limit the classes that can be deserialized to a predefined set of trusted classes. This can help prevent the deserialization of arbitrary and potentially malicious classes.
- Versioning and compatibility: Be cautious when making changes to serialized classes. Use serialVersionUID to manage versioning and compatibility between different versions of serialized objects.
- Security libraries: Consider look-ahead deserialization helpers such as ValidatingObjectInputStream from Apache Commons IO or SerialKiller, which check class names before objects are instantiated. Note that older releases of Apache Commons Collections are themselves a well-known source of exploitable gadget classes, so that library is a dependency to keep patched rather than a mitigation.
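The allow-list idea from the list above can be sketched with the JDK's own ObjectInputFilter, available since Java 9. The classes AllowedPayload, OtherPayload, and FilteredDeserializer are hypothetical names for this illustration; the pattern lists the permitted class and rejects everything else with "!*".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical payload classes; not part of the original article.
class AllowedPayload implements Serializable {
    private static final long serialVersionUID = 1L;
    final int value;

    AllowedPayload(int value) {
        this.value = value;
    }
}

class OtherPayload implements Serializable {
    private static final long serialVersionUID = 1L;
}

class FilteredDeserializer {
    // Serializes an object into a byte array.
    static byte[] write(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Deserializes only classes matched by the pattern, e.g. "com.example.Foo;!*".
    // A rejected class makes readObject throw java.io.InvalidClassException.
    static Object readWithAllowList(byte[] data, String pattern)
            throws IOException, ClassNotFoundException {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(pattern);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(filter);
            return in.readObject();
        }
    }
}
```

The filter is consulted before an instance of a class is created, so a class outside the allow list never gets the chance to run its deserialization logic.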
In summary, Java Serialization can introduce serious security risks, especially when dealing with untrusted data. It's essential to take precautions, validate inputs, and consider alternative serialization methods or libraries to enhance security. Additionally, keeping your Java runtime environment up to date is crucial, as newer versions of Java may include security improvements and fixes for known vulnerabilities.
Why Is JSON or XML Not the Perfect Solution for the JVM?
Many papers and lectures recommend circumventing the security risks of serialization by using XML or JSON, that is, a structured text representation of the data to be transferred. These formats have security problems of their own, but I will address those in a separate article. Two things should be addressed here, however. First, the data must be converted into a text representation, which usually requires more data volume than a pure binary model. In addition, binary data such as images must be encoded (for example, as Base64) so that only printable characters are transmitted. Transforming data into XML and back into the original format takes a lot of time and usually a lot of memory.
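To make the size overhead concrete, here is a small sketch that assumes Base64 as the text encoding for binary data (a common choice, though the article does not name one). The class name TextEncodingOverhead is hypothetical. Base64 turns every 3 bytes into 4 characters, so the payload grows by about one third before any XML or JSON markup is added.

```java
import java.util.Base64;

class TextEncodingOverhead {
    // Number of characters needed to carry the binary data as Base64 text.
    static int base64Length(byte[] binary) {
        return Base64.getEncoder().encodeToString(binary).length();
    }

    public static void main(String[] args) {
        byte[] image = new byte[3000]; // stand-in for binary image data
        // prints "3000 bytes -> 4000 Base64 characters"
        System.out.println(image.length + " bytes -> " + base64Length(image) + " Base64 characters");
    }
}
```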
The second point that causes problems in most cases is the data structure. In XML and JSON, references between objects can only be represented in a cumbersome way, for example through artificial IDs that have to be resolved again when reading. This makes processing many times more complicated, slower, and more resource-intensive. Even though many solid solutions exist for converting Java objects into XML or JSON, I recommend looking for new approaches occasionally.
Eclipse Serializer: Practical Part
Now, let's get to the practical part of this article. First, we need the dependency, so we add the following to the pom.xml. At the time of writing, the first release was still being prepared, and only a SNAPSHOT version (1.0.0-SNAPSHOT) was available from the repositories. If you use that version, you also have to add the SNAPSHOT repositories.
Definition inside the pom.xml.
<dependency>
    <groupId>org.eclipse.serializer</groupId>
    <artifactId>serializer</artifactId>
    <version>{maven-version-number}</version>
</dependency>
Once the dependency has been fetched, the rest goes quickly. For the first test, we create a class called Node. Each Node can have a left and a right child, which lets us build a tree.
public class Node {
    private String id;
    private Node leftNode;
    private Node rightNode;
    // constructor, addLeft, addRight, and toString omitted
}
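The listing above shows only the fields of Node. A complete version might look like the following sketch; the constructor, addLeft, addRight, and toString are inferred from how the class is used in the next listing, so their bodies are assumptions rather than the author's code.

```java
// Sketch of a complete Node class; method bodies are assumptions.
class Node {
    private String id;
    private Node leftNode;
    private Node rightNode;

    Node(String id) {
        this.id = id;
    }

    // Sets the left child of this node.
    void addLeft(Node node) {
        this.leftNode = node;
    }

    // Sets the right child of this node.
    void addRight(Node node) {
        this.rightNode = node;
    }

    // Prints the whole subtree recursively; safe here because a tree has no cycles.
    @Override
    public String toString() {
        return "Node{id='" + id + "', left=" + leftNode + ", right=" + rightNode + "}";
    }
}
```

Note that the class does not implement Serializable.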
As an example, I created the following construct and then serialized and de-serialized it once using the serializer.
Node rootNode = new Node("rootNode");
Node leftChildLev01 = new Node("Root-L");
Node rightChildLev01 = new Node("Root-R");
leftChildLev01.addRight(new Node("Root-L-R"));
leftChildLev01.addLeft(new Node("Root-L-L"));
rightChildLev01.addLeft(new Node("Root-R-L"));
rightChildLev01.addRight(new Node("Root-R-R"));
rootNode.addLeft(leftChildLev01);
rootNode.addRight(rightChildLev01);
Serializer<byte[]> serializer = Serializer.Bytes();
byte[] data = serializer.serialize(rootNode);
Node rootNodeDeserialized = serializer.deserialize(data);
System.out.println(rootNode.toString());
System.out.println(" ========== ");
System.out.println(rootNodeDeserialized.toString());
Now, let's see whether this also works with a general Java object graph. To do this, the node class is changed so that a parent node can also be defined. This makes it possible to build cycles.
public class GraphNode {
    private String id;
    private GraphNode parent;
    private List<GraphNode> childGraphNodes = new ArrayList<>();
    // constructor, setParent, addChildGraphNode, and toString omitted
}
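A fuller version of GraphNode might look like the sketch below. The constructor, setParent, and addChildGraphNode are inferred from the usage in the following listing; the getters and the toString body are assumptions. One design choice matters once cycles exist: toString prints only the ids of the parent and children instead of recursing into them, because a recursive toString would never terminate on a cyclic graph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of a complete GraphNode class; method bodies are assumptions.
class GraphNode {
    private String id;
    private GraphNode parent;
    private List<GraphNode> childGraphNodes = new ArrayList<>();

    GraphNode(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }

    GraphNode getParent() {
        return parent;
    }

    List<GraphNode> getChildGraphNodes() {
        return childGraphNodes;
    }

    void setParent(GraphNode parent) {
        this.parent = parent;
    }

    void addChildGraphNode(GraphNode child) {
        childGraphNodes.add(child);
    }

    // Prints ids only, so cycles in the graph cannot cause infinite recursion.
    @Override
    public String toString() {
        String parentId = (parent == null) ? "none" : parent.id;
        String childIds = childGraphNodes.stream()
                .map(GraphNode::getId)
                .collect(Collectors.joining(", "));
        return "GraphNode{id='" + id + "', parent=" + parentId + ", children=[" + childIds + "]}";
    }
}
```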
We take the graph listed here as an example.
GraphNode rootNode = new GraphNode("rootNode");
GraphNode child01Lev01 = new GraphNode("child01Lev01");
GraphNode child02Lev01 = new GraphNode("child02Lev01");
rootNode.addChildGraphNode(child01Lev01);
rootNode.addChildGraphNode(child02Lev01);
child01Lev01.setParent(rootNode);
child02Lev01.setParent(rootNode);
GraphNode child01Lev02 = new GraphNode("child01Lev02");
GraphNode child02Lev02 = new GraphNode("child02Lev02");
child01Lev01.addChildGraphNode(child01Lev02);
child01Lev01.addChildGraphNode(child02Lev02);
child01Lev02.setParent(child01Lev01);
child02Lev02.setParent(child01Lev01);
GraphNode child01Lev03 = new GraphNode("child01Lev03");
GraphNode child02Lev03 = new GraphNode("child02Lev03");
child01Lev03.setParent(child02Lev01);
child02Lev03.setParent(child02Lev01);
child02Lev01.addChildGraphNode(child01Lev03);
child02Lev01.addChildGraphNode(child02Lev03);
//creating cycles
rootNode.addChildGraphNode(child01Lev03);
rootNode.setParent(child02Lev03);
The serializer processes this graph as well; the cycles do not cause any problems.
Serializer<byte[]> serializer = Serializer.Bytes();
byte[] data = serializer.serialize(rootNode);
GraphNode rootNodeDeserialized = serializer.deserialize(data);
You can make the examples even more complex and try out the subtleties of inheritance. Newer data types up to JDK 17 are also supported. This gives me a potent tool for a variety of tasks. One use can be found in another Eclipse project, EclipseStore, which provides a persistence mechanism based on this serialization. But your own small projects can also benefit from it. Next, I will show how quickly it can be integrated into a REST service.
Building a Simple REST Service
If you want to create a simple REST service for transferring byte streams in Java without using Spring Boot, you can use the Java SE API and the HttpServer class from the com.sun.net.httpserver package, which allows you to create an HTTP server.
- We create an HTTP server on port 8080 and define a context for handling requests to /api/bytestream.
- The ByteStreamHandler class handles both POST requests for uploading byte streams and GET requests for downloading byte streams.
- For POST requests, it reads the incoming byte stream, processes it as needed, and sends a response.
- For GET requests, it sends a predefined byte stream as a response.
Remember that this is a simple example, and you can expand upon it to handle more complex use cases and error handling as needed for your specific application. Also, note that the com.sun.net.httpserver package is part of the JDK, but it may not be available in all Java distributions.
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import org.eclipse.serializer.Serializer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
public class ByteStreamHandler implements HttpHandler {
    @Override
    public void handle(HttpExchange exchange) throws IOException {
        String requestMethod = exchange.getRequestMethod();
        if (requestMethod.equalsIgnoreCase("POST")) {
            // Handle POST requests for uploading byte streams
            handleUpload(exchange);
        } else if (requestMethod.equalsIgnoreCase("GET")) {
            // Handle GET requests for downloading byte streams
            handleDownload(exchange);
        } else {
            // Any other method: answer 405 (Method Not Allowed) without a body
            exchange.sendResponseHeaders(405, -1);
            exchange.close();
        }
    }
    private void handleUpload(HttpExchange exchange) throws IOException {
        // Read the complete request body into a byte array
        byte[] data;
        try (InputStream inputStream = exchange.getRequestBody()) {
            data = inputStream.readAllBytes();
        }
        // Deserialize the uploaded byte stream
        Serializer<byte[]> serializer = Serializer.Bytes();
        GraphNode deserialized = serializer.deserialize(data);
        // Process the data
        System.out.println("deserialized = " + deserialized);
        // Send a response (you can customize this);
        // the content length is the number of bytes, not of characters
        byte[] response = "Byte stream uploaded successfully."
                .getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(200, response.length);
        try (OutputStream os = exchange.getResponseBody()) {
            os.write(response);
        }
    }
    private void handleDownload(HttpExchange exchange) throws IOException {
        // Simulate generating and sending a byte stream as a response
        String response = "Hello, Byte Stream!";
        Serializer<byte[]> serializer = Serializer.Bytes();
        byte[] data = serializer.serialize(response.getBytes(StandardCharsets.UTF_8));
        exchange.sendResponseHeaders(200, data.length);
        try (OutputStream os = exchange.getResponseBody()) {
            os.write(data);
        }
    }
}
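The server setup described in the list above (port 8080, context /api/bytestream) is not shown in the article. It could be sketched as follows; the class name ByteStreamServer and the start method are hypothetical, and the handler is passed in as a parameter so the sketch depends only on the JDK.

```java
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;

// Hypothetical bootstrap class; not part of the original article.
class ByteStreamServer {
    // Starts an HTTP server on the given port and registers the handler
    // for all requests to /api/bytestream. Port 0 picks a free port.
    static HttpServer start(int port, HttpHandler handler) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/api/bytestream", handler);
        server.setExecutor(null); // null selects the server's default executor
        server.start();
        return server;
    }
}
```

With the handler from the listing above, the call would be ByteStreamServer.start(8080, new ByteStreamHandler()), and server.stop(0) shuts the server down again.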
Conclusion
We've looked at the typical problems with Java's original serialization and how cumbersome it is to use safely. The detour via JSON and XML is unnecessary when communicating from JVM to JVM with the open-source Eclipse Serializer project. There are no restrictions when modeling a graph: the newer data types up to and including JDK 17 are already supported, and cycles within the graph are no problem either.
Implementing the Serializable interface is also unnecessary and does not influence processing. The easy handling allows the serializer to be used even in tiny projects such as the REST service shown here, which uses only on-board JDK resources. A larger project that uses the serializer is the open-source project EclipseStore, which offers a high-performance persistence mechanism for the JVM.
Happy Coding
Sven
Published at DZone with permission of Sven Ruppert. See the original article here.