aspose file tools*
The moose likes OO, Patterns, UML and Refactoring and the fly likes Mail Server - Database Design Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Engineering » OO, Patterns, UML and Refactoring
Bookmark "Mail Server - Database Design" Watch "Mail Server - Database Design" New topic
Author

Mail Server - Database Design

ankur rathi
Ranch Hand

Joined: Oct 11, 2004
Posts: 3830
Hi,

With my little knowledge, I have created database design for Mail Server project. Please put any comments/suggestions on anyting (even on name of table, column etc)....

I have two tables at first: User_Info and Message_Info






Now, whenever a new user will sign-up, I will generate userId for this user(u1), add one row into User_Info table and create a u1_Info table (this table belongs to this user only)




Now suppose I have 2 registered users. So I have 4 tables in total, User_Info (has 2 rows), Message_Info (no row), u1_Info (no row), u2_Info (no row).

Now suppose u1 mails to u2 (compose), I will generate messageId for this message (m1), add one row into Message_Info table, add one row into u1_Info table with values (m1, false, true, false, false) and add one row into u2_Info table with values (m1, true, false, false, false).


Why (m1, false, true, false, false) in u1_Info

messageId is m1.
It should not be in inbox of u1, so inbox_flag is false.
It should be in outbox of u1, so outbox_flag is true.
It should not be in bulk folder of u1, so bulk_flag is false.
It has not been read by u1 so read_flag is false.





Why (m1, true, false, false, false) in u2_Info

messageId is m1.
u1 is not in the bulkList of u2, so bulk_flag is false and inbox_flag is true. Others are same as above.



Now I think it is clear that, what I will add where, when something will happen...

Please give your valuable suggestions about this design.

Thanks a lot.


[ November 26, 2005: Message edited by: rathi ji ]
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920


If the userId is unique, but the username is not, the userId should be the primary key. Remember that in the real world there are lots of people with the same names.

20 characters also seems very short for a user name. There are plenty of users here at the Ranch with usernames longer than that - why would your application be different?

If the userid is machine generated, why the varchar? Why not just an integer?



This seems short, too. I often use passwords longer than this.



What do you imagine would be in this field? If it is the name of the country, remember that different languages have different names for different countries ("Deustschland" vs "Germany" vs "Allemagne", for example). Unless you have a very strong reason, I would suggest using something like the international two-letter country code.



I don't quite understand what you mean by this, but it sounds like very poor design. Any relational database with long compound fields like this is dangerous and wrong.

If you mean that each user can be in multiple bulk groups, and each bulk group can have multiple members, what you have is a "many to many" relationship. The usual way to solve this is to create a table containing the ids (and optionally descriptions etc.) of all your groups, and then make a "linking table" where each row contains the id of a user and the id of a group. Then if you need to find out if a user is in a particular group, just ask the database for rows with that userid.



If this is just an id, why the varchar. Could it not just as easily be an integer?



This seems very short. Look at the length of the subjects here at the Ranch, or on some of your incoming email.



Again very short. I'd be tempted to make this a CLOB or MEMO field instead. Do you imagine you will be asking the database to look inside this field during queries?



Shouldn't these be user ids rather than names.



I'd suggest a more detailled timestamp would probably be useful. I certainly get more than one email message per day, and I like to be able to sort them in the order they came in.


Now, whenever a new user will sign-up, I will generate userId for this user(u1), add one row into User_Info table and create a u1_Info table (this table belongs to this user only)


Are you really suggesting creating a whole new table for each user? That's crazy.



Instead of a whole new table for each user, just add a column for the user id to this data, then the database can select out just the rows for that user. Much simpler. Let the database do the work.

Can a message be in both the inbox and outbox? If not, why have you got two flags? Surely one field indicating which "box" it is in would be a better choice.

When a user signs up, add a row to the user table.

When a new bulk group is created, add a row to the bulk table.

When a user joins a group, add a row to the bulk_link table.

When a user leaves a group, remove a row from the bulk_link table.

When a message is created, add a row to the message table, with a link to the id of the creating user

When a message is sent/received, add a row to the received table for each user that recieves it.

When a user deletes a message, remove the row for that message to that user from the received table.

Does that make more sense?


Read about me at frankcarver.me ~ Raspberry Alpha Omega ~ Frank's Punchbarrel Blog
ankur rathi
Ranch Hand

Joined: Oct 11, 2004
Posts: 3830
Thanks again Frank.


If the userId is unique, but the username is not, the userId should be the primary key. Remember that in the real world there are lots of people with the same names.


Actually both username and userId are unique, username is not the name of user, it is mail Id of user. Actually I forgot to add name column in User_Info, may be because of that, this confusion gets created...

Why I made username is unique and userId is not unique because, username will be entered by user (I can't make sure that user will enter unique always) and userId will be generated by system (so I will take care of uniqueness here).



What do you imagine would be in this field? If it is the name of the country, remember that different languages have different names for different countries ("Deustschland" vs "Germany" vs "Allemagne", for example). Unless you have a very strong reason, I would suggest using something like the international two-letter country code.


May be I didn't understand it completely, but I am not using 'country' for programming, it is just information about user that will be in database always....


I don't quite understand what you mean by this, but it sounds like very poor design. Any relational database with long compound fields like this is dangerous and wrong.


A sample value for bulkList is like this:

john_123,smith_passion,ramkumar,shankar.prasad

I am confuse, how it is dangerous, wrong and poor design? Could you please put some more light on this...



If you mean that each user can be in multiple bulk groups, and each bulk group can have multiple members, what you have is a "many to many" relationship. The usual way to solve this is to create a table containing the ids (and optionally descriptions etc.) of all your groups, and then make a "linking table" where each row contains the id of a user and the id of a group. Then if you need to find out if a user is in a particular group, just ask the database for rows with that userid.


could you please explain this with an example if possible.



Do you imagine you will be asking the database to look inside this field during queries?


probably No, no need to look into the value of any field except messageId. But I am not sure, what exactly you wanted to point out?




Instead of a whole new table for each user, just add a column for the user id to this data, then the database can select out just the rows for that user. Much simpler. Let the database do the work.


You mean, two column into Message_Info table. One is senderInfo and second one is receiverInfo. The type will be varchar(25), and the sample value will be something like this....


snderInfo -> false, true, false, false

receiverInfo -> true, false, false, false


Is this what you are suggesting???

Apart from these all, your all comments regarding size and type are really good and I will change fields according to that...

Thank you very much Frank.


[ November 27, 2005: Message edited by: rathi ji ]
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920

Actually both username and userId are unique, username is not the name of user, it is mail Id of user. Actually I forgot to add name column in User_Info, may be because of that, this confusion gets created...

Why I made username is unique and userId is not unique because, username will be entered by user (I can't make sure that user will enter unique always) and userId will be generated by system (so I will take care of uniqueness here).


The point is that a primary key must be unique. In most cases this means that the primary key should be machine generated. Anything a user can enter could be a duplicate.



May be I didn't understand it completely, but I am not using 'country' for programming, it is just information about user that will be in database always....


But if it will never be used, why is it in there? If it is in there, someone might be tempted to (for example) list all the users in Germany. At which point it might miss out users who entered Deutschland for their country.

Why risk this when a country code is both simpler and takes up less database space?

I don't quite understand what you mean by this, but it sounds like very poor design. Any relational database with long compound fields like this is dangerous and wrong.


A sample value for bulkList is like this:

john_123,smith_passion,ramkumar,shankar.prasad

I am confuse, how it is dangerous, wrong and poor design? Could you please put some more light on this...


Several reasons.
  • If there are too many names in the list it could run out of space in the field
  • Lots of field space is allocated for mostly empty fields
  • The database can't check that the names in the list represent valid users
  • The databsse can't add or remove names from the list
  • The database can't count the names in the list
  • The database can't read the list the other way (ie, show which lbulk groups a user is part of
  • and many more.


  • Trust me. Many years of database theory and development have shown that taking this route always leads to trouble.


    If you mean that each user can be in multiple bulk groups, and each bulk group can have multiple members, what you have is a "many to many" relationship. The usual way to solve this is to create a table containing the ids (and optionally descriptions etc.) of all your groups, and then make a "linking table" where each row contains the id of a user and the id of a group. Then if you need to find out if a user is in a particular group, just ask the database for rows with that userid.

    could you please explain this with an example if possible.




    When a new user signs up:


    When a new group is created:


    When a user is added to a group:


    Example: To find all the users in the sheriffs group:


    Example: To find all groups that Frank Carver is a member of:



    Do you imagine you will be asking the database to look inside this field during queries?

    probably No, no need to look into the value of any field except messageId. But I am not sure, what exactly you wanted to point out?


    My point was that if you (a) don't know the size of a chunk of text and (b) don't need the database to look inside it, you should consider using a CLOB or MEMO field. That's what these fields are for.


    You mean, two column into Message_Info table. One is senderInfo and second one is receiverInfo. The type will be varchar(25), and the sample value will be something like this....
    Is this what you are suggesting???


    Not Really. More Like:



    Example: To show all the subjects of messages in Frank Carver's INBOX:



    Is that any clearer?
    ankur rathi
    Ranch Hand

    Joined: Oct 11, 2004
    Posts: 3830




    I am still not clear, why I need this functionality. Let me explain first, what my functionality is, and how I was thinking to implement:

    If any user (u1) wants to add any other user (u2) into bulk list then he/she (u1) will select any of the message of that user (u2) and click on bulk button. At this time, I will add u2 into the bulklist of u1.

    Now from now onwards, if u2 will mail to u1. I will add one row into Message_Info table (as I explained in first mail), then I will add one row into u1's table and one row into u2's table.

    While adding one row into u1's table, I will check, whether u2 is in the bulk list of u1, if yes, then will make bulk_flag true, if no then will make inbox_flag true.

    In case of u2's table, I will keep outbox_flag true.


    This is how I was thinking to implement this functionality.

    How this group logic will help me to implement this???


    Whenever a user sends a message to other user. It will be displayed at two location (inbox of receiver and outbox of sender). So

    The table you have mentioned (Message_Location) has inbormation about receiver's location only. How could I know that wheather sender has read a message or not (sender can read message from outbox)...

    What you think about table like this:



    By this way, user can delete message and that will not affect other user (sender can delete message from outbox, it will not be deleted from inbox/bulk folder of receiver).

    Waiting for your comments.

    Thanks a lot.

     
    I agree. Here's the link: http://aspose.com/file-tools
     
    subject: Mail Server - Database Design