CSV File Writer Using Scala
Are you looking to generate your own CSV file using Scala? We've got you covered! Learn how to do it quickly and save yourself some time.
The other day, I was looking for a CSV file with some sample records and started asking around for one. Then I wondered: why not just write my own, instead of borrowing one from someone else? That led me to write a small piece of Scala code that generates a CSV file in a specified directory. You can generate a CSV file with any number of fields and any number of records, and you can adjust both the fields and the record count as required.
Creating a Scala Class
Today we're going to make an SBT project.
First, you will need to add the opencsv dependency to your build.sbt file:
libraryDependencies += "au.com.bytecode" % "opencsv" % "2.4"
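To give an idea of the full file, a minimal build.sbt might look like this (the project name and the Scala version below are just examples; use whatever your project already has):
name := "csv-writer"
version := "0.1"
scalaVersion := "2.11.8" // scala.collection.JavaConversions, used below, is available in 2.11; it is deprecated in 2.12 and removed in 2.13
libraryDependencies += "au.com.bytecode" % "opencsv" % "2.4"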
Now we will write the code in our class. In my case, it's an object called MakeCSV that extends App. You will need to import a few packages:
import java.io.{BufferedWriter, FileWriter}
import scala.collection.JavaConversions._
import scala.collection.mutable.ListBuffer
import scala.util.Random
import au.com.bytecode.opencsv.CSVWriter
Now We Will Start Writing Code in Our Class
val outputFile = new BufferedWriter(new FileWriter("PATH_TO_STORE_FILE/output.csv"))
This creates the output file, output.csv, in the specified directory.
val csvWriter = new CSVWriter(outputFile)
This creates a CSVWriter object that wraps outputFile.
val csvFields = Array("id", "name", "age", "city")
This is the header (schema) for the CSV file; in my case there are four fields. Including a header row is entirely optional.
val nameList = List("Deepak", "Sangeeta", "Geetika", "Anubhav", "Sahil", "Akshay")
This is the list of values for the name field.
val ageList = (24 to 26).toList
This is the list of values for the age field.
val cityList = List("Delhi", "Kolkata", "Chennai", "Mumbai")
This is the list of values for the city field.
val random = new Random()
This is the Random instance used to pick random items from the field lists.
var listOfRecords = new ListBuffer[Array[String]]()
This is the ListBuffer that holds all the records.
listOfRecords += csvFields
This adds the header row to the records.
for (i <- 1 to 10000) { // the upper bound controls how many records are generated; adjust it as needed
  listOfRecords += Array(i.toString, nameList(random.nextInt(nameList.length)), ageList(random.nextInt(ageList.length)).toString, cityList(random.nextInt(cityList.length)))
}
This loop adds the records to the ListBuffer, using the Random instance to pick a random item from each field list.
csvWriter.writeAll(listOfRecords.toList)
This writes all the records to the CSV file.
outputFile.close()
Finally, this closes the file after all the records have been written.
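With the header row included, the generated file might look something like this (the exact values will differ on every run since they are picked at random, and CSVWriter quotes every field by default):
"id","name","age","city"
"1","Sangeeta","25","Delhi"
"2","Akshay","24","Mumbai"
"3","Deepak","26","Chennai"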
The Final Code
import java.io.{BufferedWriter, FileWriter}
import scala.collection.JavaConversions._
import scala.collection.mutable.ListBuffer
import scala.util.Random
import au.com.bytecode.opencsv.CSVWriter
object MakeCSV extends App {
  val outputFile = new BufferedWriter(new FileWriter("/home/deepak/Desktop/deepak19.csv")) // replace the path and filename with your own
  val csvWriter = new CSVWriter(outputFile)
  val csvFields = Array("id", "name", "age", "city")
  val nameList = List("Deepak", "Sangeeta", "Geetika", "Anubhav", "Sahil", "Akshay")
  val ageList = (24 to 26).toList
  val cityList = List("Delhi", "Kolkata", "Chennai", "Mumbai")
  val random = new Random()
  var listOfRecords = new ListBuffer[Array[String]]()
  listOfRecords += csvFields
  for (i <- 1 to 10000) { // the upper bound controls how many records are generated; adjust it as needed
    listOfRecords += Array(i.toString, nameList(random.nextInt(nameList.length)),
      ageList(random.nextInt(ageList.length)).toString, cityList(random.nextInt(cityList.length)))
  }
  csvWriter.writeAll(listOfRecords.toList)
  outputFile.close()
}
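If you want to sanity-check the generated file, you can read it back with opencsv's CSVReader. The snippet below is just a quick verification sketch; the ReadCSV object name and the hard-coded path are mine, not part of the original program:
import java.io.FileReader
import scala.collection.JavaConversions._
import au.com.bytecode.opencsv.CSVReader
object ReadCSV extends App {
  val reader = new CSVReader(new FileReader("/home/deepak/Desktop/deepak19.csv")) // same path the writer used
  val rows = reader.readAll() // java.util.List[Array[String]], converted implicitly via JavaConversions
  println(s"Read ${rows.size} rows (including the header)")
  rows.take(5).foreach(row => println(row.mkString(", "))) // print the header and the first few records
  reader.close()
}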
I have tested the code by generating 9 million records in a CSV file. It took 2 minutes and 22 seconds on my machine, which has an i5 processor and 8 GB of RAM. I'm going to follow up with a new post where I write the same code with Spark so we can compare the performance. I really hope the performance improves when we use Spark.
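If you want to reproduce that measurement on your own machine, one simple approach is to wrap the generator in a timer. This is only a rough sketch; TimedMakeCSV is a hypothetical wrapper I'm adding for illustration, not part of the original code:
object TimedMakeCSV extends App {
  val start = System.nanoTime()
  MakeCSV.main(Array.empty) // run the generator defined above
  val elapsedSeconds = (System.nanoTime() - start) / 1e9
  println(f"Generated the file in $elapsedSeconds%.1f seconds")
}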
If you have any challenges, please let me know in the comments. If you enjoyed this post, I’d be very grateful if you’d help it spread. Keep smiling, keep coding!