Win a copy of The Java Performance Companion this week in the Performance forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Inserting null or an empty String

 
Roel De Nijs
Sheriff
Posts: 10213
129
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If you use a PreparedStatement is there any difference (for a varchar column) in setting null or "" (empty string) as parameter value? Because our batch insert takes twice as long when using null values (instead of empty strings) as parameter values.

I'm not sure if it's a real JDBC issue, so I give some information about our environment. Our database is a Microsoft SQL Server 2008 R2, the JDBC driver we use is the Microsoft JDBC4 Driver for SQL Server driver. Our standalone application is Spring-enabled: to connect to the database we use a DriverManagerDataSource and to perform the batch inserts we use a BatchSqlUpdate instance.

Any ideas?

 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 34669
367
Eclipse IDE Java VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
One is null and one is empty string. Are their keys/constraints on this column? Because it is odd for null to take noticeably longer.
 
Roel De Nijs
Sheriff
Posts: 10213
129
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
No keys or constraints at all (these are added after batch insert is finished).
 
Jaikiran Pai
Marshal
Pie
Posts: 10447
227
IntelliJ IDE Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
How are you computing the time taken and at what level (DB or Java code)? And how big is the time difference?
 
Roel De Nijs
Sheriff
Posts: 10213
129
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Time is computed at Java code level: I log the time of each flush (of the BatchSqlUpdate) and of the complete process. The batch is using the default batch size of 5000.

Using null values:
begin import
flush finished in 63860 ms
flush finished in 20378 ms
finished processing table1 in 85957 ms
total[6280] added[6131] notValid[149]
end import [completed in 86340 ms]


Using empty strings:
begin import
flush finished in 15460 ms
flush finished in 3211 ms
finished processing table1 in 20334 ms
total[6280] added[6131] notValid[149]
end import [completed in 20702 ms]


So I would say the time difference is big, very big.
 
Roel De Nijs
Sheriff
Posts: 10213
129
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
As a little experiment I used jTDS JDBC Driver instead of Microsoft JDBC4 Driver for SQL Server and executed same tests. Here are the results.

Using null values:
begin import
flush finished in 11719 ms
flush finished in 2606 ms
finished processing table1 in 16019 ms
total[6280] added[6131] notValid[149]
end import [completed in 16099 ms]


Using empty strings:
begin import
flush finished in 18082 ms
flush finished in 4051 ms
finished processing table1 in 23786 ms
total[6280] added[6131] notValid[149]
end import [completed in 23868 ms]


So it seems to be a driver related issue...
 
Roel De Nijs
Sheriff
Posts: 10213
129
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I replaced BatchSqlUpdate with JdbcTemplate.batchUpdate which gives me more control about which methods to invoke on the PreparedStatement to set the parameter values. After playing a bit with all the possible alternatives, it's proven that invoking setNull and/or setObject are responsible for the performance issues when using Microsoft JDBC4 Driver for SQL Server.
 
Jaikiran Pai
Marshal
Pie
Posts: 10447
227
IntelliJ IDE Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
That's interesting. I'm just curious here, do you have the final SQL query that gets generated and run when using these different approaches? That might give us some hint.
 
Roel De Nijs
Sheriff
Posts: 10213
129
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Jaikiran Pai wrote:do you have the final SQL query that gets generated and run when using these different approaches?

The SQL query used by these different approaches is always the same one: INSERT INTO table1 (column1, column2, ..., columnN) VALUES (?, ?, ..., ?)

Jaikiran Pai wrote:That might give us some hint.

I hope to get some hints from Microsoft regarding these issues. So I opened a question on their SQL Server Data Access forum.
 
Roel De Nijs
Sheriff
Posts: 10213
129
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I experimented a bit more and discovered some interesting facts about this driver:

  • When you provide the sqlType of the column, you have a performance loss of factor 3-4. So don't use setNull-method (you are required to pass the sqlType). The setObject-method has a variant with sqlType and one without sqlType, use the 2-parameter variant (without sqlType).
  • If you pass a java.util.Date to the setObject-method, you'll get this exeception: The conversion from UNKNOWN to UNKNOWN is unsupported.. Use a java.sql.Date instead
  • If you create a BigDecimal instance using a double and pass this value to the setBigDecimal-method (e.g. ps.setBigDecimal(parameterIndex, new BigDecimal(123.456));), you'll get this exception: Error converting data type nvarchar to decimal.


  • In the end I created my own BatchSqlUpdate class (instead of using the one provided by String). When it comes down to setting parameter values on the PreparedStatement I just use the setObject(parameterIndex, value) (so no setNull or setObject with sqlType parameter).

    The results using null values:
    begin import
    flush finished in 10562 ms
    flush finished in 2723 ms
    finished processing table1 in 14841 ms
    total[6280] added[6131] notValid[149]
    end import [completed in 15201 ms]
     
    • Post Reply
    • Bookmark Topic Watch Topic
    • New Topic