Let's see how we can perform batch inserts in Java using the JDBC API. Although you might already know this, I will try to cover everything from the basics up to a few more complex scenarios.
In this note, we will see how we can use JDBC APIs like Statement and PreparedStatement to insert data into any database in batches. We will also explore scenarios where the batch can run out of memory, and how to optimize the operation.
First, the basic API for inserting data into a database in batches using Java JDBC.
Simple Batch
I am calling this a simple batch because the requirement is simple: execute a list of inserts in a batch. Instead of hitting the database once for each insert statement, we will use the JDBC batch operation and optimize performance.
Consider the following code:
Bad Code
String [] queries = {
"insert into employee (name, city, phone) values ('A', 'X', '123')",
"insert into employee (name, city, phone) values ('B', 'Y', '234')",
"insert into employee (name, city, phone) values ('C', 'Z', '345')",
};
Connection connection = getConnection(); // assume a helper that returns a java.sql.Connection
Statement statement = connection.createStatement();
for (String query : queries) {
statement.execute(query);
}
statement.close();
connection.close();
This is the bad code: each query is executed separately, so the database is hit once per insert statement. Imagine inserting 1,000 records this way; it is not a good idea.
Well, below is the basic code to perform a batch insert. Check it out:
Good Code
Connection connection = getConnection();
Statement statement = connection.createStatement();
for (String query : queries) {
statement.addBatch(query);
}
statement.executeBatch();
statement.close();
connection.close();
Note how we used the addBatch() method of Statement instead of executing each query directly. After adding all the queries, we executed them in one go using statement.executeBatch(). Nothing fancy, just a simple batch insert.
Note that we have taken the queries from a String array. Instead, you may want to build them dynamically. For example:
import java.sql.Connection;
import java.sql.Statement;
//...
Connection connection = getConnection();
Statement statement = connection.createStatement();
for (Employee employee: employees) {
String query = "insert into employee (name, city) values('"
+ employee.getName() + "','" + employee.getCity() + "')";
statement.addBatch(query);
}
statement.executeBatch();
statement.close();
connection.close();
Note how we create the query dynamically using data from the Employee object and add it to the batch to insert in one go. Perfect, isn't it?
Wait... you must be thinking: what about SQL injection? Building queries like this by string concatenation is very prone to SQL injection, and the insert query also has to be compiled each time.
Why not use PreparedStatement instead of a simple Statement? Yes, that is the solution. Check out the SQL-injection-safe batch below.
SQL Injection Safe Batch
Consider the following code:
import java.sql.Connection;
import java.sql.PreparedStatement;
//...
String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
Connection connection = getConnection();
PreparedStatement ps = connection.prepareStatement(sql);
for (Employee employee: employees) {
ps.setString(1, employee.getName());
ps.setString(2, employee.getCity());
ps.setString(3, employee.getPhone());
ps.addBatch();
}
ps.executeBatch();
ps.close();
connection.close();
Check out the above code. Beautiful. We used java.sql.PreparedStatement and added the insert query to the batch. This is the solution you should implement in your batch insert logic, instead of the Statement version above.
Still, there is one problem with this solution. Consider a scenario where you want to insert half a million records using a single batch. That may generate an OutOfMemoryError:
java.lang.OutOfMemoryError: Java heap space
com.mysql.jdbc.ServerPreparedStatement$BatchedBindValues.<init>(ServerPreparedStatement.java:72)
com.mysql.jdbc.ServerPreparedStatement.addBatch(ServerPreparedStatement.java:330)
org.apache.commons.dbcp.DelegatingPreparedStatement.addBatch(DelegatingPreparedStatement.java:171)
This happens because you are trying to hold everything in one batch and insert it in one go. The best idea is to execute the batch itself in batches. Check out the solution below.
Smart Insert: Batch within Batch
This is the simplest solution: pick a batch size, say 1000, and execute the insert queries in batches of 1000 at a time.
String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
Connection connection = getConnection();
PreparedStatement ps = connection.prepareStatement(sql);
final int batchSize = 1000;
int count = 0;
for (Employee employee: employees) {
ps.setString(1, employee.getName());
ps.setString(2, employee.getCity());
ps.setString(3, employee.getPhone());
ps.addBatch();
if(++count % batchSize == 0) {
ps.executeBatch();
}
}
ps.executeBatch(); // insert remaining records
ps.close();
connection.close();
This would be the ideal solution: it avoids SQL injection and also takes care of the out-of-memory issue. Note how we increment the counter count and, once it reaches batchSize (1000 here), call executeBatch(). The final executeBatch() after the loop flushes whatever is left over when the total number of records is not an exact multiple of the batch size.
Hope this helps.
Well, this is great.
But what about managing exceptions that could happen on some of the inserted records?
I would obviously want to let the correct records pass and catch only the records with some problem.
Thanks
What if one of the columns is to be populated with sequence values?
For example:
String [] queries = {
"insert into employee (id, name, city, phone) values (seq.nextval, 'A', 'X', '123')",
"insert into employee (id, name, city, phone) values (seq.nextval, 'B', 'Y', '234')",
"insert into employee (id, name, city, phone) values (seq.nextval, 'C', 'Z', '345')", };
When I execute the above queries using a batch, it inserts the same id value in all the rows. How do I solve this problem? I do not want to use triggers.
Although you execute these queries in a batch, Oracle should resolve the sequence properly and advance its value for each row. Let me know if the problem is not resolved; it looks to me like some different issue.
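For what it is worth, a parameterized version usually sidesteps this kind of problem: keep seq.nextval in the SQL text and bind only the data columns. A minimal sketch, assuming an Oracle sequence named seq and the employee table from the article:
String sql = "insert into employee (id, name, city, phone) values (seq.nextval, ?, ?, ?)";
PreparedStatement ps = connection.prepareStatement(sql);
for (Employee employee : employees) {
ps.setString(1, employee.getName());
ps.setString(2, employee.getCity());
ps.setString(3, employee.getPhone());
ps.addBatch(); // Oracle evaluates seq.nextval once per inserted row
}
ps.executeBatch();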
Hi, I declare an int variable X, and in the constructor execute SELECT MAX(ID) FROM tableexample.
X = SELECT MAX(ID) FROM tableexample
gives the next ID.
insert into table (ID, name) values (X, "Michel");
This way you do not need an auto-increment or serial ID;
always in the constructor, an int x = SELECT MAX(ID) + 1 gives the id you use when you insert the next item in the database:
insert into table (x, "orange")
insert into table (x, "apple")
insert into table (x, "pineapple")
So I decided to post this; I hope it helps.
Superb :)
Awesome Tutorial. Really Great.
It has a problem: it only works for the same DML statement, so I cannot do an update and an insert at the same time.
I have faced the same problem. Did you find a solution for it?
Excellent tutorial. Concise and explained very well
Really, this is good... for learners.
Very concise and to the point. Thanks for this link.
Thanks Dhiraj :)
How do I create this kind of website? Please suggest. :)
Very interesting and to the point; it gave me a good idea of how batch inserts work.
good idea… thanx
Nice article. I learned batch insert today.
One problem exists in one code snippet though, so just mentioning it.
The problem is where you wrote:
if we have 5400 records and batchSize is 1000, then the above code will ignore the last 400 records. Please correct me if I am wrong...
Something of this type looks to solve this problem here -
Hi Gaurav, you should not worry: at the end of the 5400 records the loop completes, and the executeBatch() after the loop inserts the remaining 400.
Hi, this code is very good; I like it and I am using it. Congratulations.
I would like you to send me the complete code: the Employee class and the connection class.
Please send it to my email -> [email protected]
Hi,
How to handle the scenario where a row fails to insert? Suppose out of 1000 rows, the 100th row fails to insert and the program ends abruptly without inserting the remaining rows (from the 100th to the 1000th). How can I handle this kind of scenario so that even if the 100th row fails to insert, the process carries on for the remaining rows (from the 101st to the 1000th)?
I am also looking for the same: if something fails, how do I ignore it and process the rest?
Please help if you have got an answer.
I am also looking for the same. Please, someone help!
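One standard JDBC approach is sketched below: catch BatchUpdateException and inspect getUpdateCounts(). Be aware that drivers differ here: some stop at the first failure (so the array only covers the statements executed before it), while others continue and mark failed entries with Statement.EXECUTE_FAILED. Treat this as a starting point, not a guarantee:
import java.sql.BatchUpdateException;
import java.sql.Statement;
//...
try {
ps.executeBatch();
} catch (BatchUpdateException e) {
int[] counts = e.getUpdateCounts();
for (int i = 0; i < counts.length; i++) {
if (counts[i] == Statement.EXECUTE_FAILED) {
System.err.println("Batch entry " + i + " failed"); // log it, or re-queue the row for retry
}
}
}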
Helpful stuff…
Efforts appreciated..!!!
Thnx..:)
@Gaurav
Line no 19 will do that part (inserting remaining 400 records)
ps.executeBatch(); // insert remaining records
thanx dude
hi all
I have a problem with addBatch.
I want to commit every 200 rows, but...
“com.ibm.db2.jcc.b.vd: Non-atomic batch failure. The batch was submitted, but at least one exception occurred on an individual member of the batch. Use getNextException() to retrieve the exceptions for specific batched elements.”
Ex.
u_sql = "UPDATE unemploy SET STATUS = '8', D_UPDATE = current timestamp, E_UPDATE = 'xxxx' WHERE e_pid='1234' with ur";
prstUpdate.addBatch(u_sql);
if(count++==200)
prstUpdate.executeBatch();
db2Conn.commit();
...
help me please !!
Thanks for the tutorial, I am getting: SEVERE: BatchUpdateException Io exception: Checksum fail, when I try to insert the rest of the records.
Hi M, you need
if (++count % 200 == 0)
instead of
if (count++ == 200)
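Spelled out, the commit-every-200 pattern might look like the sketch below. Note that addBatch(String) is only available on a plain Statement, so the sketch uses one; updateQueries stands in for the commenter's list of UPDATE strings, and auto-commit is assumed to be off:
Statement st = db2Conn.createStatement();
int count = 0;
for (String u_sql : updateQueries) {
st.addBatch(u_sql);
if (++count % 200 == 0) { // fires on every 200th row, not just once
st.executeBatch();
db2Conn.commit();
}
}
st.executeBatch(); // flush the remainder
db2Conn.commit();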
Anita, do you have auto-commit on or off? And can you post the full exception stack trace?
Thanks for this Viral very helpful, good on you.
Hi, thanks for the tutorial... I have a question: benchmarked against the first approach, how much faster will this batch insertion be? Is there any option to bulk-load blocks of data into a database using JDBC or any other API? Thanks in advance.
nice work.
Thanks Viral for the best tutorials.
thanks a lot.
Viral,
Thanks a lot. Greatly presented. Saved me a lot of effort.
Thank you very much.
Please send me details about the actual procedure of storing data in the database using Java.
Simple and to-the-point explanation! Invariably Google brings me to your website whenever I have a Java query!
Superb job!
I used this program to do inserts and it took roughly 21 minutes to insert two and a half million records. Is there any way we can speed this up? I was originally using a JetEngine query, which takes about a second to insert all of those records, but the only problem I was having with that was that it doesn't handle leading and trailing spaces.
^^ I'm using Windows 7, an i7 2.9 GHz processor with 8 GB RAM to execute that.
thanks nice code it is very useful
Great posts here. However, I want to parameterize my insert so that I can dynamically add fields and values from a text file rather than hardcoding them. Can you help with this?
Clear Explanation. Thanks
Hello,
Suppose my data fails within the batch; how should I handle that?
On using executeBatch(), the data is getting inserted in reverse order. Sometimes it works fine.
Thank you... Please, would you tell me what I should do if I want the following:
without calling setString on the first field, I just want to setString the second and third fields.
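JDBC requires every placeholder in the prepared SQL to be bound before addBatch(), so you cannot simply skip the first one. Two options: remove that column from the INSERT statement itself, or bind NULL explicitly. A sketch of the latter, assuming the first column is a VARCHAR that allows NULL:
ps.setNull(1, java.sql.Types.VARCHAR); // explicitly bind NULL for the first column
ps.setString(2, employee.getCity());
ps.setString(3, employee.getPhone());
ps.addBatch();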
Hi VP,
Say I have a list of 10000 or more records that I want to send to Android using a web service,
so I decided to split the list into chunks of 100 records and send those to the mobile device.
Once all 100 records are inserted on the mobile side, it sends an ack back to the server.
I want to know: is this an efficient way
or not? Tell me...
Hi all,
We did exactly what you suggest here: using prepared statements and then doing a batch insert of all n records.
Anyhow, we discovered a memory leak, and after several million inserts an out-of-memory exception.
The reason seems to be that you have to call clearBatch() after each executeBatch(), or else the PreparedStatement will not release its internally used memory.
We tested this with MS SQL Server, I do not know if this is DB or JDBC-driver dependent.
regards
Udo
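Folding Udo's suggestion into the article's loop is a one-line change (a sketch; some drivers reset the batch on executeBatch() anyway, but an explicit clearBatch() is a cheap safeguard):
if (++count % batchSize == 0) {
ps.executeBatch();
ps.clearBatch(); // ask the driver to release its internal bind-value buffers
}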
It's a beautiful way to operate on the DB in batches.
I am trying to use this code to import a .csv file into an MS SQL server. I have added the "sqljdbc4-3.0.jar" file to the project and changed the required places, but I am getting this error:
Query: INSERT INTO new(NewLead_IDX,DateReceived,FirstName,LastName,Address,City,State,ZipCode,Email,Phone,PhoneExtension,Year,Make,Model,Trim,ExteriorColor,InteriorColor,TimeFrame,FinancialPreference,BestContactTime,Score,NewLeadSalePrice,LeadDescription,LeadStatus,RejectedDate,NewSource,Buyer,BuyerExternalId,AccountRepresentative,WholesaleId,ProviderId,) VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
sun.jdbc.odbc.JdbcOdbcBatchUpdateException: [Microsoft][SQL Server Native Client 10.0][SQL Server]Incorrect syntax near ')'.
at sun.jdbc.odbc.JdbcOdbcPreparedStatement.emulateExecuteBatch(Unknown Source)
at sun.jdbc.odbc.JdbcOdbcPreparedStatement.executeBatchUpdate(Unknown Source)
at sun.jdbc.odbc.JdbcOdbcStatement.executeBatch(Unknown Source)
at net.viralpatel.java.CSVLoader.loadCSV(CSVLoader.java:118)
at net.viralpatel.java.Main.main(Main.java:17)
java.lang.Exception: Error occured while loading data from file to database.[Microsoft][SQL Server Native Client 10.0][SQL Server]Incorrect syntax near ')'.
at net.viralpatel.java.CSVLoader.loadCSV(CSVLoader.java:126)
at net.viralpatel.java.Main.main(Main.java:17)
Please, can anybody help me out with this? I am a beginner in Java. Thanks in advance.
Nice articles for basic understanding of JDBC.
Hi Viral Patel,
Please guide me on bulk insert in Spring JDBC for parent-child tables.
Thanks & Regards
Krishna
Please help me... I am getting SQL Error: ORA-00001: unique constraint (TSMGR.PKMIMTX) violated while inserting data into the database. I have 11 SQL queries added to the batch update. My 1st query is giving this error, so I want to skip it whenever it gives an error and have my remaining 10 SQL queries execute.
String query1=q1;
String query2=q2;
String query3=q3;
String query4=q4;
String query5=q5;
String query6=q6;
String query7=q7;
String query8=q8;
String query9=q9;
String query10=q10;
String query11=q11;
Connection con=null;
try
{
con=DBUtil.getConnection();
con.setAutoCommit(false);
Statement st = con.createStatement();
st.addBatch(query1);
st.addBatch(query2);
st.addBatch(query3);
st.addBatch(query4);
st.addBatch(query5);
st.addBatch(query6);
st.addBatch(query7);
st.addBatch(query8);
st.addBatch(query9);
st.addBatch(query10);
st.addBatch(query11);
int count[] = st.executeBatch();
Please reply to me; I urgently need to solve this issue. [email protected]
thanks.
Thank you for this usefull article!
I tried this code:
String []edu_record = getEduDetail(edu_table, emp_code.getText().trim(),form_no.getText().trim());
for (String edu_record1 : edu_record) {
pstmt.addBatch(edu_record1);
}
The method getEduDetail() generates my queries and returns them as a string array.
While adding to the batch, it gives me the following error:
SEVERE: null
com.microsoft.sqlserver.jdbc.SQLServerException: The method addBatch() cannot take arguments on a PreparedStatement or CallableStatement.
Please help me.
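That exception matches the JDBC contract: the addBatch(String sql) overload exists only on plain Statement, while PreparedStatement and CallableStatement only support the no-argument addBatch() after binding parameters. If getEduDetail() really returns complete SQL strings, a plain Statement works (a sketch):
Statement stmt = connection.createStatement();
for (String edu_record1 : edu_record) {
stmt.addBatch(edu_record1); // addBatch(String) is legal on a plain Statement
}
stmt.executeBatch();
stmt.close();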
Hi ,
What is the for (Employee employee : employees) construct?
Is it the table object from which we get the records?
Hi ,
I am inserting multiple records into the database using a batch query successfully, but when the same value is inserted again it throws an error. Can you tell me how to resolve this issue? I do not want duplicate values in the database.
Do you know if connection.rollback() works with MySQL?
I have tested MySQL 5.6 and the latest Connector/J 5.1.33, and unfortunately the records that succeeded
are still in the database even though statement.executeBatch() throws an exception.
I was expecting connection.rollback() to remove the records that succeeded.
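rollback() can only undo work inside an open transaction, so auto-commit has to be switched off first, and the table needs a transactional storage engine (InnoDB rather than MyISAM in MySQL). A sketch of that pattern:
import java.sql.SQLException;
//...
connection.setAutoCommit(false); // otherwise every batch entry commits on its own
try {
ps.executeBatch();
connection.commit();
} catch (SQLException e) {
connection.rollback(); // undoes the rows that succeeded before the failure
throw e;
}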
Hi Viral,
I am using batch inserts with a batch size of 1000,
but the commit for 1000 rows is taking 4 to 5 minutes to complete.
Isn't that too long?
I am connected to a remote database server through a VPN.
Could this be the reason it is taking so much time for 1000 rows?
One more question:
can I share one single connection object across multiple threads (all the threads having different statements)?
What should be done ideally?
Waiting for your response.
Thanks,
Deepali
I enhanced some parts.
…
Dear Viral Patel, I read with great pleasure your publication about JDBC batch insert. It has really been very useful when it comes to an appreciable number of records (20000-70000), which is what I have tried. However, in one case of inserting 8000 records, on executing the statement ps.executeBatch() the program stops responding, regardless of the size of the batch; I tested with 1000, 100, and 10 records and it is the same. Now, if I try one by one with ps.executeUpdate(), the statement executes at what you could call a normal speed. Do you have any idea what might be going on? I attach the method code for your appreciation. Thank you very much in advance.
Raul from Argentina.
I am continuing the previous message.
The code that does not work in that case is:
Sorry for the inconvenience, but yesterday I sent a query related to this issue; however, I cannot see it in the list. Maybe this topic is now closed?
Thank you.
Raul.
Thanks, nice post.
for (Employee employee : employees)
I have a question about this:
what do I write in place of employees? My code is in a JSP;
the table name is msmnt in MySQL,
and the model class name is mesurement.
Please help me out fast.
Hey, please post the whole program once, please.
Thank you very much for this post! Very helpful and clearly explained!
Hi Patel,
I have one scenario: I need to update a column with a UUID in a table where that column is null.
In my table, 7 crore rows need to be updated, and it is taking 1 hour to update 1 lakh records.
I am using batchUpdate; is there any way to improve the update performance,
or is there any API to do this?
Thanks, nice post.
It's a really awesome one. Keep it up.
Shouldn't you be resetting count back to 0 in the 'if' condition after you executeBatch in the batch-within-batch example?
Oops, my bad; sorry, the count need not be reset to 0, since the modulo check takes care of it!
ps.executeBatch(); // insert remaining records
Can this line execute even if no new batch entries were added (exactly 1000/2000 records)?
Well, my need is to insert multiple rows with atomicity,
but the operation is not atomic:
if the database throws an exception in the middle of the insertion, the rows executed before the exception are still inserted.
What is employees in for (Employee employee : employees)?
Nice solution. You could try multithreaded execution with a ForkJoinPool; if you are using Java 8 you can run parallel tasks.
IntStream stream = IntStream.range(0, listOfValues.size()).sorted(); //Number of rows to insert
ForkJoinPool forkJoinPool = new ForkJoinPool(100); //Using 100 threads
try {
forkJoinPool.submit(() -> {
stream.parallel().forEach(x -> {
//System.out.println("Parallel: " + x);
loadData(x); // load your data here: an insert statement using a JDBC connection pool
try {
Thread.sleep(10);
} catch (InterruptedException e) { }
});
}).get();
forkJoinPool.shutdown();
} catch (InterruptedException | ExecutionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
What if one query in a batch fails?
It will not allow other queries in the batch to be executed, leaving that batch in an unexpected state. How will we handle such scenarios?
Nice tutorial. Just a word of caution: auto-commit true is not recommended while doing batch updates. Refer to
https://stackoverflow.com/questions/14625371/rollback-batch-execution-when-using-jdbc-with-autocommit-true
Hi Viral,
This post is very useful for beginners like me. I have a question regarding prepared statements: after executing the batch, do we need to clear that batch? I am working with 200 million lines of data, so while running the program the performance degrades. Can you please let me know ASAP? Can anyone help me out with this issue?
A great post, which helps me understand batchUpdate, thank you very much.
for (Employee employee: employees) {
ps.setString(1, employee.getName());
ps.setString(2, employee.getCity());
ps.setString(3, employee.getPhone());
ps.addBatch();
if(++count == batchSize) {
ps.executeBatch();
connection.commit();
ps.clearBatch();
count = 0;
}
}
if(count > 0){
ps.executeBatch(); // insert remaining records
connection.commit();
}
I think it is better to use it this way. I mean, why use % when we can check directly with ==?
Besides that, in order to send the remnants we should clear the PreparedStatement's batch.
This is a bit old, but it is gold, thanks for sharing !!
Thanks. Detailed one.