This week's book giveaway is in the OCAJP 8 forum. We're giving away four copies of OCA Java SE 8 Programmer I Study Guide and have Edward Finegan & Robert Liguori on-line! See this thread for details.
I am trying to populate a table from a .data file. It works but when I have run it once and in the DB already exists a primary key as personId and I run it again it says that primary is duplicated, the expec is:
I need know if that primary key already axists in the DB and if it exist don´t insert
1) Just try to insert the rows and catch and gracefully handle exceptions caused by primary key violations. Recognizing the exception in various databases might be tricky. If you're using a Java7, JDBC 4.1 compliant driver, there is the SQLIntegrityConstraintViolationException exception. (Still, this exception would be thrown for any constraint violation, not just PK violation, so it might cause rows violating some other constraints to be silently ignored). Update batching might be problematic due to the thrown exceptions.
2) Check the row doesn't exist before each insert. Obviously requires one more roundtrip to the database and precludes update batching, so it won't perform as well. Moreover, if any other process is inserting new records to the table as well, you might still get constraint violations, depending on your isolation level.
3) You can upload all records into a temporary/staging table private to your session (so that there can't be any PK violations in this step), and then insert all non-existent record into the real table at once using an SQL statement. The data is moved twice, but can be processed in bulk in both cases (using update batching during the upload).
4) Use your database's ETL tool.
A note: you're setting all parameters as String, though clearly they aren't (see person_id). This causes implicit conversions, which is bad. You should properly parse the input file and ideally store individual records as instances of a single class created to represent them, with proper types for all fields.
Also, I've somehow assumed there are more than one record per file. If this is not the case (just one record per file), certainly go with the second solution - the other solutions would be an overkill.