Loading CSV file into Database can be cumbersome task if your Database provider does not offer an out of box feature for this. Most of the time you’ll spend up in creating valid insert statements and putting up values escaping all special characters. Importing CSV files gets a bit complicated when you start doing things like importing files with description fields that can contain punctuation (such as commas or single-double quotation marks). So here’s a simple Java Utility class that can be used to load CSV file into Database. Note how we used some of the best practices for loading data. The CSV file is parsed line by line and SQL insert query is created. The values in query are binded and query is added to SQL batch. Each batch is executed when a limit is reached (in this case 1000 queries per batch).
Let’s us check an example. Below is the sample CSV file that I want to upload in database table Customer. employee.csv – Sample CSV file:
EMPLOYEE_ID,FIRSTNAME,LASTNAME,BIRTHDATE,SALARY
1,Dean,Winchester,27.03.1975,60000
2,John,Winchester,01.05.1960,120000
3,Sam,Winchester,04.01.1980,56000
Code language: CSS (css)
The Table customer contains few fields. We added fields of different types like VARCHAR, DATE, NUMBER to check our load method works properly. Table: Customer – Database table
CREATE TABLE Customer (
EMPLOYEE_ID NUMBER,
FIRSTNAME VARCHAR2(50 BYTE),
LASTNAME VARCHAR2(50 BYTE),
BIRTHDATE DATE,
SALARY NUMBER
)
Code language: SQL (Structured Query Language) (sql)
Following is a sample Java class that will use CSVLoader utility class (we will come to this shortly). Main.java – Load sample.csv to database
package net.viralpatel.java;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
public class Main {
private static String JDBC_CONNECTION_URL =
"jdbc:oracle:thin:SCOTT/TIGER@localhost:1500:MyDB";
public static void main(String[] args) {
try {
CSVLoader loader = new CSVLoader(getCon());
loader.loadCSV("C:\\employee.sql", "CUSTOMER", true);
} catch (Exception e) {
e.printStackTrace();
}
}
private static Connection getCon() {
Connection connection = null;
try {
Class.forName("oracle.jdbc.driver.OracleDriver");
connection = DriverManager.getConnection(JDBC_CONNECTION_URL);
} catch (ClassNotFoundException e) {
e.printStackTrace();
} catch (SQLException e) {
e.printStackTrace();
}
return connection;
}
}
Code language: Java (java)
In above Main class, we created an object of class CSVLoader
using parameterized constructor and passed java.sql.Connection
object. Then we called the loadCSV
method with three arguments. First the path of CSV file, second the table name where data needs to be loaded and third boolean parameter which decides whether table has to be truncated before inserting new records. Execute this Java class and you’ll see the records getting inserted in table.
The CSV is successfully loaded in database. Let’s check the Utility class now. I strongly recommend you to go through below tutorials as the Utility class combines the idea from these tutorials.
The utility class uses OpenCSV library to load and parse CSV file. Then it uses the idea of Batching in JDBC to batch insert queries and execute them. Each CSV value is checked if it is valid date before inserting. CSVLoader.java – Utility class to load CSV into Database
package net.viralpatel.java;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.Date;
import org.apache.commons.lang.StringUtils;
import au.com.bytecode.opencsv.CSVReader;
/**
*
* @author viralpatel.net
*
*/
public class CSVLoader {
private static final
String SQL_INSERT = "INSERT INTO ${table}(${keys}) VALUES(${values})";
private static final String TABLE_REGEX = "\\$\\{table\\}";
private static final String KEYS_REGEX = "\\$\\{keys\\}";
private static final String VALUES_REGEX = "\\$\\{values\\}";
private Connection connection;
private char seprator;
/**
* Public constructor to build CSVLoader object with
* Connection details. The connection is closed on success
* or failure.
* @param connection
*/
public CSVLoader(Connection connection) {
this.connection = connection;
//Set default separator
this.seprator = ',';
}
/**
* Parse CSV file using OpenCSV library and load in
* given database table.
* @param csvFile Input CSV file
* @param tableName Database table name to import data
* @param truncateBeforeLoad Truncate the table before inserting
* new records.
* @throws Exception
*/
public void loadCSV(String csvFile, String tableName,
boolean truncateBeforeLoad) throws Exception {
CSVReader csvReader = null;
if(null == this.connection) {
throw new Exception("Not a valid connection.");
}
try {
csvReader = new CSVReader(new FileReader(csvFile), this.seprator);
} catch (Exception e) {
e.printStackTrace();
throw new Exception("Error occured while executing file. "
+ e.getMessage());
}
String[] headerRow = csvReader.readNext();
if (null == headerRow) {
throw new FileNotFoundException(
"No columns defined in given CSV file." +
"Please check the CSV file format.");
}
String questionmarks = StringUtils.repeat("?,", headerRow.length);
questionmarks = (String) questionmarks.subSequence(0, questionmarks
.length() - 1);
String query = SQL_INSERT.replaceFirst(TABLE_REGEX, tableName);
query = query
.replaceFirst(KEYS_REGEX, StringUtils.join(headerRow, ","));
query = query.replaceFirst(VALUES_REGEX, questionmarks);
System.out.println("Query: " + query);
String[] nextLine;
Connection con = null;
PreparedStatement ps = null;
try {
con = this.connection;
con.setAutoCommit(false);
ps = con.prepareStatement(query);
if(truncateBeforeLoad) {
//delete data from table before loading csv
con.createStatement().execute("DELETE FROM " + tableName);
}
final int batchSize = 1000;
int count = 0;
Date date = null;
while ((nextLine = csvReader.readNext()) != null) {
if (null != nextLine) {
int index = 1;
for (String string : nextLine) {
date = DateUtil.convertToDate(string);
if (null != date) {
ps.setDate(index++, new java.sql.Date(date
.getTime()));
} else {
ps.setString(index++, string);
}
}
ps.addBatch();
}
if (++count % batchSize == 0) {
ps.executeBatch();
}
}
ps.executeBatch(); // insert remaining records
con.commit();
} catch (Exception e) {
con.rollback();
e.printStackTrace();
throw new Exception(
"Error occured while loading data from file to database."
+ e.getMessage());
} finally {
if (null != ps)
ps.close();
if (null != con)
con.close();
csvReader.close();
}
}
public char getSeprator() {
return seprator;
}
public void setSeprator(char seprator) {
this.seprator = seprator;
}
}
Code language: Java (java)
The class looks complicated but it is simple :) The loadCSV
methods combines the idea from above three tutorials and create insert queries. Following is the usage of this class if you want to use it in your project: Usage
CSVLoader loader = new CSVLoader(connection);
loader.loadCSV("C:\\employee.csv", "TABLE_NAME", true);
Code language: Java (java)
Load file with semicolon as delimeter:
CSVLoader loader = new CSVLoader(connection);
loader.setSeparator(';');
loader.loadCSV("C:\\employee.csv", "TABLE_NAME", true);
Code language: Java (java)
Load file without truncating the table:
CSVLoader loader = new CSVLoader(connection);
loader.loadCSV("C:\\employee.csv", "TABLE_NAME", false);
Code language: Java (java)
Hope this helps.
Load_CSV_Database_Java_example.zip (2.05 MB)
Java URL Encoder/Decoder Example - In this tutorial we will see how to URL encode/decode…
Show Multiple Examples in OpenAPI - OpenAPI (aka Swagger) Specifications has become a defecto standard…
Local WordPress using Docker - Running a local WordPress development environment is crucial for testing…
1. JWT Token Overview JSON Web Token (JWT) is an open standard defines a compact…
GraphQL Subscription provides a great way of building real-time API. In this tutorial we will…
1. Overview Spring Boot Webflux DynamoDB Integration tests - In this tutorial we will see…
View Comments
Hi,
is there or could there be a way by which we can show to the user the progress of the database operations performed so far ???
Hi, Viral
Please change argument from following line :
.loadCSV("C:\\employee.sql", "TABLE_NAME", false);
instead of C:\\employee.sql ,it should be C:\\employee.csv.
It will be very useful for other people to directly understand your code without confusion.
Opps :) Thanks for that. I updated the code.
What if the database had a password for it? I am using Postgresql as my database...
Hi Faud, The CSVLoader class takes Connection object as input. You can get connection object to Postgresql like below:
[code language="java"]
Class.forName("org.postgresql.Driver");
Connection connection = null;
connection = DriverManager.getConnection(
"jdbc:postgresql://hostname:port/dbname","username", "password");
//..
CSVLoader loader = new CSVLoader(connection);
loader.loadCSV("C:\\employee.csv", "TABLE_NAME", true);
[/code]
Hope this helps.
I am getting error (java.sql.BatchUpdateException: Batch entry 0 INSERT INTO ) when running the Main.java. Why is that happening?
Hi Zack, This exception comes whenever there is error in batch update operation. The count indicates how many queries were executed successfully. In this case 0. Please check if the database table is present and you are able to connect properly.
I couldn't get more details from this error message.
Lets say I have a CSV file that contain 100 records and 22 column. Is using batch update is suitable or is there another way to insert the data, 1 record per times,update it until it finish? I hope you understand what I want to say.Sorry for the trouble.
Here is my schema in the table database and the column inside the CSV files is the same.
[code language="sql"] CREATE TABLE patient_registration
(
id bigint NOT NULL,
fullname character varying(255),
icnumber character varying(100),
gender character varying(10),
placeofbirth character varying(100),
citizenship character varying(100),
religion character varying(50),
race character varying(50),
sector character varying(50),
occupation character varying(100),
maritalstatus character varying(50),
mobileno character varying(25),
officeno character varying(25),
phoneno character varying(25),
email character varying(255),
address1 character varying(100),
address2 character varying(100),
address3 character varying(100),
address4 character varying(100),
postcode character varying(100),
state character varying(100),
CONSTRAINT patient_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE patient_registration
OWNER TO postgres;
GRANT ALL ON TABLE patient_registration TO postgres; [/code]
I think I need to change a bit in this code in CSV Loader but I'm still cant figure it out.Hope you can help.
[code language="java"] final int batchSize = 1000;
int count = 0;
Date date = null;
while ((nextLine = csvReader.readNext()) != null) {
if (null != nextLine) {
int index = 1;
for (String string : nextLine) {
date = DateUtil.convertToDate(string);
if (null != date) {
ps.setDate(index++, new java.sql.Date(date
.getTime()));
} else {
ps.setString(index++, string);
}
}
ps.addBatch();
}
if (++count % batchSize == 0) {
ps.executeBatch();
}
}
ps.executeBatch(); // insert remaining records
con.commit();
} catch (Exception e) {
con.rollback();
e.printStackTrace();
throw new Exception(
"Error occured while loading data from file to database."
+ e.getMessage());
} finally {
if (null != ps)
ps.close();
if (null != con)
con.close();
csvReader.close();
}
}
[/code]
What if my primary key is a bigInt type? The Java error says that my column 'id' in the database is a type of bigInt, while the value that been passed in the Java code from the CSV file is recognised as character varying type... The value in the column id is like this: id, 1,2,3
Viral if i use Derby at back end... package my java application in JAR format and den deploy it in client environment den do we have to install Derby on systems on which we intend to use our java desktop application or will that go embedded into JAR...
I am getting error in reading the data when data field has "\ ". can you help in this ?
Hi Viral,
I am also getting same error . "\" get replace by blank while inserting into DB.
Can you pls help on this.
thanks for such good article.
Cant we create the table automatically from the java code itself by reading the first line of csv file? obviously taking all columns as strings. This would be generalized to all csv files then..
I am getting the following error while running the Main Class
Exception in thread "main" java.lang.UnsupportedClassVersionError: Main (Unsuppo
rted major.minor version 50.0)
at java.lang.ClassLoader.defineClass0(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
at java.security.SecureClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.access$100(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)
Pls help me