
How to do Bulk Emailing when u have Millions of records????

nagesh vutukuru
Greenhorn

Joined: May 25, 2001
Posts: 10
Hi All,
I have four million records for bulk emailing. I use Oracle8i with a Type 4 driver.
I want to do batch processing, say 100k mails/hour. I used the setFetchSize method available in Statement, but I didn't get the expected result.
Does anyone have an idea or a clue about bulk emailing or setFetchSize?
Thanks
Bjarki Holm
Author
Ranch Hand

Joined: May 25, 2001
Posts: 65
Hmmm... what method are you using to send the e-mail messages?
------------------
Bjarki Holm
Author of Professional Java Data


nagesh vutukuru
Greenhorn

Joined: May 25, 2001
Posts: 10
I use the JavaMail API to send mails.
One way to do this batch job is to keep a counter while reading the records from the ResultSet.
I am looking for an alternative.
Jason Menard
Sheriff

Joined: Nov 09, 2000
Posts: 6450
Please don't take anything personally, but I have a very strong moral/philosophical problem with helping a spammer conduct business.
Exactly how many mail servers are you trying to crash with this application? How much space are you trying to take up on machines that don't belong to you? Do you have permission from all 4,000,000 recipients to store your data on their machine? Do you have permission of an ISP to use their mailhost to bulk send spam, or will a mailhost(s) be hijacked for this purpose? Will a legitimate return address be used on your emails, so that the sender and his ISP may be contacted, or will the headers be forged? Have you or your employers harvested the emails yourself, or have you purchased OUR email addresses from some low life who has harvested them for use without their owners' permission? Do the emails advertise a legal service or product and will that service or product be delivered, or is the purpose of the mass emailings to defraud people of their money?
Code responsibly. Be a part of the solution, not a part of the problem. Anybody providing help/advice should first examine their own conscience to see where they stand. Writing software of this type is similar to writing a virus or a worm. It is malicious code.
On the VERY off chance there is some legitimate purpose for this software, other than stealing space, crashing servers, and generally annoying 4,000,000 people, I apologize for jumping to conclusions and would be interested to know what legitimate purpose this software will be used for. Although you said bulk emailing, and that equals spam.
Jason Menard
[This message has been edited by Jason Menard (edited June 12, 2001).]
Bjarki Holm
Author
Ranch Hand

Joined: May 25, 2001
Posts: 65
Nagesh, you originally said:

I wanna do batch processing, say 100k mails/hour. I used setFetchSize method available in Statement. i didn't get the expected result.

Can you explain in more detail what you mean by this? What results did you get, and what were you expecting? Are you holding the cursor (ResultSet) open whilst supplying the content to Javamail?
From my experience with bulk e-mailing (500,000 to 1,000,000 records), it would be natural to split the transaction into small pieces, as you say. A possible approach would be to write a PL/SQL program that calls a JavaMail class loaded into the database. When the PL/SQL is first executed, it puts the message transaction on some sort of queue. A job in the database checks the queue every 30 minutes or so and sends the next chunk of messages, if there are any. When X more messages have been sent, some PL/SQL stores the current position in the message queue, which the job picks up in the next round. This means that you can have many concurrent message transactions running at the same time, although this will probably not be very wise.
This means, of course, that sending the message to the whole set of 4 million users will take some hours, which is probably wise anyway, since sending everything at once would put a lot of strain on the mail server.
If I'm confusing you here, or just misunderstanding things, let me know.
P.S. I did a chapter on that subject, JavaMail stored inside Oracle, executed from PL/SQL, in the title Professional Oracle 8i Application Programming, also published by Wrox (http://www.wrox.com/Books/Book_Details.asp?sub_section=1&isbn=1861004842).

------------------
Bjarki Holm
Author of Professional Java Data
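
The queue-and-job idea in the post above can be sketched in plain Java. This is only an illustration under assumptions, not the PL/SQL implementation the post refers to: the class name ChunkedMailJob, the in-memory list standing in for the queue table, and the sender callback are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the database job: each run sends the next chunk
// and remembers where it stopped, so the following run resumes from there.
final class ChunkedMailJob {
    private final List<String> queue; // stands in for the message queue table
    private final int chunkSize;      // the "X messages" sent per run
    private int position = 0;         // stands in for the stored queue position

    ChunkedMailJob(List<String> recipients, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.queue = new ArrayList<>(recipients);
        this.chunkSize = chunkSize;
    }

    // One scheduled run: send the next chunk, then store the new position.
    List<String> runOnce(Consumer<String> sender) {
        int end = Math.min(position + chunkSize, queue.size());
        List<String> chunk = new ArrayList<>(queue.subList(position, end));
        for (String recipient : chunk) {
            sender.accept(recipient);
        }
        position = end;
        return chunk;
    }

    int position() { return position; }

    boolean isDone() { return position >= queue.size(); }
}
```

A scheduler such as ScheduledExecutorService.scheduleWithFixedDelay could call runOnce every 30 minutes or so; in the approach from the post, a database job plays that role.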
Bjarki Holm
Author
Ranch Hand

Joined: May 25, 2001
Posts: 65
Jason,
this is not entirely fair. There are scenarios where you want to send e-mail messages to such a large number of people. For example, in my work with Icelandair, Iceland's largest airline company, we often had to send e-mail offers to some 500,000 people at the same time. Imagine the size of SAS's netclub!
Although your criticism is absolutely right, in many ways, Nagesh's purpose might be perfectly valid. Even though our experience is often quite the contrary.
------------------
Bjarki Holm
Author of Professional Java Data
Jason Menard
Sheriff

Joined: Nov 09, 2000
Posts: 6450
Originally posted by Bjarki Holm:
Jason,
this is not entirely fair. There are scenarios where you want to send e-mail messages to such a large number of people. For example, in my work with Icelandair, Iceland's largest airline company, we often had to send e-mail offers to some 500,000 people at the same time. Imagine the size of SAS's netclub!
Although your criticism is absolutely right, in many ways, Nagesh's purpose might be perfectly valid. Even though our experience is often quite the contrary.

You are absolutely right. I was thinking of a situation where a large company would send out a large amount of bulk email (like you said, around 500,000) to people it has had previous business contact with, which is a perfectly legitimate purpose. I am quite sure that many large corporations do this. But 4,000,000?! That's a lot of prior business contact!
But again, I apologize if I have jumped to a mistaken conclusion.
Jason
Bjarki Holm
Author
Ranch Hand

Joined: May 25, 2001
Posts: 65
Yes, I guess it's Nagesh's turn to make his case
------------------
Bjarki Holm
Author of Professional Java Data
nagesh vutukuru
Greenhorn

Joined: May 25, 2001
Posts: 10
Hi Jason and Bjarki,
Thanks for the replies. I am in sync with Jason. I don't like spamming either.
I work for a leading ISP. I want to make my mailing team's work easier by providing a facility to add new mail servers, capacity, number of channels, etc.
We use three mail servers (two working and one common standby) dedicated to bulk emailing, and each has the capacity to send 50k mails/hour.
I have evaluated some bulk emailing software. The packages are not flexible and don't meet our requirements.
I have done the following pilot test with 10k records.
1. Read 5k records from the ResultSet using a counter
2. Fork the email IDs to the mail servers
3. Get a callback from the mail server after the process completes
4. Update the table with a processed flag
5. Read the unprocessed records and process again
This works fine.
My question is about the setFetchSize method in Statement.
For example, I have 'x' records and I'd like to process them in 10 batches, each with 'x/10' records.
By using setFetchSize(x/10), I can read x/10 records at a time.
Is it possible to keep the ResultSet in a wait state until I get a callback from the mail server, so that I can read the next batch after the callback?
If yes, please give me a clue.
Thanks
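
The five pilot steps in the post above can be sketched roughly as follows. This is a sketch under stated assumptions rather than the poster's actual code: MailGateway, BatchMailer, and the in-memory map standing in for the table with the processed flag are hypothetical names, and the mail server callback is modeled as a CompletableFuture. The loop waits for the callback between batches without keeping any database cursor open; in a real application readUnprocessed and the flag update would be short JDBC calls.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical gateway: the future completes when the mail server reports back
// (the "callback") and carries the addresses that were actually sent.
interface MailGateway {
    CompletableFuture<List<String>> send(List<String> batch);
}

final class BatchMailer {
    // Stand-in for the recipients table: address -> processed flag.
    private final Map<String, Boolean> processed = new LinkedHashMap<>();

    BatchMailer(List<String> emails) {
        for (String email : emails) {
            processed.put(email, false);
        }
    }

    // Steps 1 and 5: read up to 'limit' records that are not yet flagged.
    List<String> readUnprocessed(int limit) {
        List<String> batch = new ArrayList<>();
        for (Map.Entry<String, Boolean> row : processed.entrySet()) {
            if (batch.size() == limit) break;
            if (!row.getValue()) batch.add(row.getKey());
        }
        return batch;
    }

    // Runs batches until nothing is left or maxRounds is hit; returns rounds used.
    int run(MailGateway gateway, int batchSize, int maxRounds) {
        int rounds = 0;
        while (rounds < maxRounds) {
            List<String> batch = readUnprocessed(batchSize);
            if (batch.isEmpty()) break;
            // Steps 2 and 3: hand the batch over and block until the callback.
            List<String> delivered = gateway.send(batch).join();
            // Step 4: flag only what was delivered, so failures are re-read.
            for (String email : delivered) {
                processed.put(email, true);
            }
            rounds++;
        }
        return rounds;
    }

    long remaining() {
        return processed.values().stream().filter(done -> !done).count();
    }
}
```

The maxRounds guard is there so that an address that keeps failing cannot make the loop run forever.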



Bjarki Holm
Author
Ranch Hand

Joined: May 25, 2001
Posts: 65
Nagesh,
setFetchSize() should be used with your ResultSet with a reasonable ballpark figure in order to minimize round trips to the database. However, this value should not be set to a variable size of X/10, or equivalent. I recommend a fixed value of 100-200 for your application. More than this, and you might get memory problems. Besides, I don't think raising this value much more will do you any good, since processing the data takes time anyway, and a round trip to the database every 200 records or so is not that bad.
You should not try to keep the ResultSet in a wait state whilst you pass the data to the mail server. The ResultSet, while open, keeps an open cursor to the database, as well as an open connection, since the original Connection object must be open as well. Instead, you should process the whole ResultSet at once, and put the results in some variables in memory (List, Vector, etc.). Then close the ResultSet and the Connection, and process the data from the variable list. If the amount of data is too much to be held in memory, you should keep track of the position in the list of recipients, and proceed from that position in the next batch, closing the database resources in the meantime.
I hope I'm understanding you correctly.
Regards,

Bjarki Holm
Author of Professional Java Data
[This message has been edited by Bjarki Holm (edited June 13, 2001).]
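
The advice in the post above (a fixed fetch size, read everything into memory, then close the database resources before sending) might look roughly like this in JDBC. It is a minimal sketch under assumptions: the table name recipients, the column email, and the class name RecipientLoader are hypothetical, and try-with-resources needs Java 7 or later.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class RecipientLoader {
    // Fixed fetch size in the 100-200 range suggested in the post.
    static final int FETCH_SIZE = 200;

    // Reads all addresses into memory; the statement and result set are
    // closed before this method returns, so no cursor stays open.
    static List<String> loadEmails(Connection conn) throws SQLException {
        // Hypothetical table and column names.
        String sql = "SELECT email FROM recipients";
        List<String> emails = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setFetchSize(FETCH_SIZE); // a hint: rows fetched per round trip
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    emails.add(rs.getString(1));
                }
            }
        }
        return emails;
    }
}
```

The caller would open the Connection in its own try-with-resources block, call loadEmails, and start handing addresses to the mail servers only after that block has closed the connection.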
nagesh vutukuru
Greenhorn

Joined: May 25, 2001
Posts: 10
Hi Bjarki,
Thanks for your answer. You understood my problem clearly. In my next test I'll measure memory consumption. After launching the application I'll share my experience with all of you.

Thanks
 