Simple Java Program to Append to a File in HDFS
Get the code and instructions needed to build a Java program to append to a file in HDFS, using Maven as the build tool.
In this article, I will present you with a Java program to append to a file in HDFS.
I will be using Maven as the build tool.
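First, we need to add the Hadoop dependencies to pom.xml (for example, the org.apache.hadoop:hadoop-client artifact, in a version that matches your cluster).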
Now, we need to import the following classes:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.*;
We will use the org.apache.hadoop.conf.Configuration class to set up the file system according to the configuration of the installed Hadoop cluster.
Let's now start with configuring the file system:
public FileSystem configureFileSystem(String coreSitePath, String hdfsSitePath) {
    FileSystem fileSystem = null;
    try {
        Configuration conf = new Configuration();
        // Enable append support in the client-side configuration.
        conf.setBoolean("dfs.support.append", true);
        // Load the cluster settings from core-site.xml and hdfs-site.xml.
        conf.addResource(new Path(coreSitePath));
        conf.addResource(new Path(hdfsSitePath));
        fileSystem = FileSystem.get(conf);
    } catch (IOException ex) {
        System.err.println("Error occurred while configuring FileSystem: " + ex.getMessage());
    }
    // Null if the file system could not be configured.
    return fileSystem;
}
Make sure that the property dfs.support.append in hdfs-site.xml is set to true.
You can either set it manually by editing the hdfs-site.xml file or programmatically using:
conf.setBoolean("dfs.support.append", true);
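It helps to know what happens when the property is missing altogether, because the appendToFile method later in this article treats an unset dfs.support.append as false. The following is a minimal, standard-library-only sketch of that behavior. The class AppendFlag and the plain Map standing in for the Hadoop Configuration are assumptions made for illustration; they are not part of the original program.

```java
import java.util.HashMap;
import java.util.Map;

class AppendFlag {
    /** Reads a boolean flag from a plain map (hypothetical stand-in for Configuration). */
    static boolean isEnabled(Map<String, String> props, String key) {
        // parseBoolean returns false for null and for any value other than "true" (ignoring case).
        return Boolean.parseBoolean(props.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        System.out.println("unset: " + isEnabled(props, "dfs.support.append"));
        props.put("dfs.support.append", "true");
        System.out.println("set:   " + isEnabled(props, "dfs.support.append"));
    }
}
```

Because an unset flag reads as false, setting the property explicitly, as shown above, avoids a surprising "Failure" result from the append method.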
Now that the file system is configured, we can access the files stored in HDFS.
Let's start with appending to a file in HDFS.
public String appendToFile(FileSystem fileSystem, String content, String dest) throws IOException {
    Path destPath = new Path(dest);
    if (!fileSystem.exists(destPath)) {
        System.err.println("File doesn't exist");
        return "Failure";
    }
    // getBoolean falls back to false when the property is not set.
    boolean isAppendable = fileSystem.getConf().getBoolean("dfs.support.append", false);
    if (!isAppendable) {
        System.err.println("Please set the dfs.support.append property to true");
        return "Failure";
    }
    // try-with-resources closes the writer and the underlying stream, even if a write fails.
    try (FSDataOutputStream fsAppend = fileSystem.append(destPath);
         PrintWriter writer = new PrintWriter(fsAppend)) {
        writer.append(content);
        writer.flush();
        // hflush pushes the client-side buffer out so that new readers can see the data.
        fsAppend.hflush();
    }
    return "Success";
}
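One caveat about the PrintWriter used above: its write methods never throw IOException. A failed write is only recorded internally and has to be queried with checkError(). The sketch below illustrates this with the standard library alone. The names CheckedAppend, FailingStream, and writeChecked are hypothetical, and FailingStream merely simulates an output stream whose writes fail; in the HDFS program, the stream would be the FSDataOutputStream returned by fileSystem.append.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

class CheckedAppend {
    /** Hypothetical stand-in for an output stream whose writes fail. */
    static class FailingStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }
    }

    /** Writes content and reports whether the PrintWriter stayed error-free. */
    static boolean writeChecked(OutputStream out, String content) {
        PrintWriter writer = new PrintWriter(out);
        writer.append(content);
        // checkError() flushes the writer and returns true if any write has failed.
        return !writer.checkError();
    }

    public static void main(String[] args) {
        ByteArrayOutputStream healthy = new ByteArrayOutputStream();
        System.out.println("healthy stream ok: " + writeChecked(healthy, "hello"));
        System.out.println("failing stream ok: " + writeChecked(new FailingStream(), "hello"));
    }
}
```

If silent failures are a concern, one option is to check checkError() before reporting "Success", or to write the bytes directly to the output stream so that an IOException propagates to the caller.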
To check whether the data has been written to HDFS correctly, let's write a method that reads a file from HDFS and returns its content as a String.
public String readFromHdfs(FileSystem fileSystem, String hdfsFilePath) {
    Path hdfsPath = new Path(hdfsFilePath);
    StringBuilder fileContent = new StringBuilder();
    // try-with-resources closes the reader and the underlying HDFS stream.
    try (BufferedReader bfr = new BufferedReader(new InputStreamReader(fileSystem.open(hdfsPath)))) {
        String str;
        while ((str = bfr.readLine()) != null) {
            fileContent.append(str).append('\n');
        }
    } catch (IOException ex) {
        System.err.println("----------Could not read from HDFS---------\n");
    }
    // Contains whatever was read before a failure; empty if nothing could be read.
    return fileContent.toString();
}
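The stream returned by fileSystem.open is an ordinary java.io.InputStream, so the line-reading logic can be tried out without a running cluster. The following sketch does this with an in-memory stream. The class StreamText and the method readAll are hypothetical names, and decoding as UTF-8 is an assumption about the file's encoding rather than something the original program specifies.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamText {
    /** Reads the whole stream line by line, ending every line with '\n'. */
    static String readAll(InputStream in) {
        StringBuilder content = new StringBuilder();
        // try-with-resources closes the reader and the wrapped stream.
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                content.append(line).append('\n');
            }
        } catch (IOException ex) {
            // Surface the failure instead of returning partial content silently.
            throw new UncheckedIOException(ex);
        }
        return content.toString();
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
            "first line\nsecond line".getBytes(StandardCharsets.UTF_8));
        System.out.print(readAll(in));
    }
}
```

Note that reading line by line always ends the result with a newline, even when the file itself does not end with one.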
We have now written to and read from a file in HDFS. It's time to close the file system.
public void closeFileSystem(FileSystem fileSystem) {
    try {
        fileSystem.close();
    } catch (IOException ex) {
        System.out.println("----------Could not close the FileSystem----------");
    }
}
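FileSystem implements java.io.Closeable, so an alternative to an explicit close method is a try-with-resources block, which closes the resource even when the work inside it fails. The sketch below shows that guarantee using only the standard library. CloseDemo and TrackedResource are hypothetical names, and TrackedResource stands in for the FileSystem purely for illustration.

```java
import java.io.Closeable;
import java.io.IOException;

class CloseDemo {
    /** Hypothetical stand-in for FileSystem: records whether close() ran. */
    static class TrackedResource implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Uses the resource, fails midway, and returns it so the caller can inspect it. */
    static TrackedResource useAndFail() {
        TrackedResource resource = new TrackedResource();
        try (TrackedResource r = resource) {
            throw new IOException("simulated failure while using the resource");
        } catch (IOException ex) {
            // The resource has already been closed by the time this block runs.
            System.out.println("Caught: " + ex.getMessage());
        }
        return resource;
    }

    public static void main(String[] args) {
        System.out.println("closed after failure: " + useAndFail().closed);
    }
}
```

One design point to keep in mind: FileSystem.get may hand back a cached instance that is shared with other code in the same JVM, so closing it is best done once, when the application no longer needs HDFS.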
Before executing the code, make sure that Hadoop is running on your system.
To start it, go to HADOOP_HOME and run the following command:
./sbin/start-all.sh
For the complete program, refer to my GitHub repository.
Happy coding!
Published at DZone with permission of Simarpreet Kaur Monga, DZone MVB. See the original article here.