It's not a secret anymore!*
The moose likes OO, Patterns, UML and Refactoring and the fly likes Refactoring Databases vs Normalization Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of OCM Java EE 6 Enterprise Architect Exam Guide this week in the OCMJEA forum!
JavaRanch » Java Forums » Engineering » OO, Patterns, UML and Refactoring
Bookmark "Refactoring Databases vs Normalization" Watch "Refactoring Databases vs Normalization" New topic
Author

Refactoring Databases vs Normalization

Paul Michael
Ranch Hand

Joined: Jul 02, 2001
Posts: 697
how different / similar are these two concepts?


SCJP 1.2 (89%), SCWCD 1.3 (94%), IBM 486 (90%), SCJA Beta (96%), SCEA (91% / 77%), SCEA 5 P1 (77%), SCBCD 5 (85%)
Timothy Sam
Ranch Hand

Joined: Sep 18, 2005
Posts: 746
They are somewhat similiar


SCJP 1.5
http://devpinoy.org/blogs/lamia/ - http://everypesocounts.com/
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Traditionally, normalization is done up front.

With refactoring, you could certainly normalize an already existing database, but there are a lot of other things you can do with refactoring, too (denormalizing it, for example ).

With other words, normalization is one goal in designing a database. Refactoring is about changing the design of an existing database, whatever goal you follow by doing so.


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Francis Siu
Ranch Hand

Joined: Jan 04, 2003
Posts: 867
With refactoring, you could certainly normalize an already existing database, but there are a lot of other things you can do with refactoring, too (denormalizing it, for example ).

With other words, normalization is one goal in designing a database. Refactoring is about changing the design of an existing database, whatever goal you follow by doing so.

I wonder if you could tell me that is refactoring a kind of process re-engineering based on the database with some conditions such as performance?
thanks


Francis Siu
SCJP, MCDBA
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
To me refactoring doesn't actually come into performance, but can refactor for a performance gain. By refactoring, first I would go for usability and flexibilty of the design.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
from http://www.refactoring.com/

Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. Its heart is a series of small behavior preserving transformations. Each transformation (called a 'refactoring') does little, but a sequence of transformations can produce a significant restructuring. Since each refactoring is small, it's less likely to go wrong. The system is also kept fully working after each small refactoring, reducing the chances that a system can get seriously broken during the restructuring.


Consequently, Database Refactoring would be a disciplined technique for restructuring an existing database, altering its internal structure without changing its external behaviour.
Adam Hardy
Ranch Hand

Joined: Oct 09, 2001
Posts: 566
A database doesn't really have external behaviour though. At least you would have to include the persistence O/R mapping layer in the discussion, and talk about refactoring behind the API which that offers.

Otherwise you would be limited to refactoring indexing and other mechanisms which had no impact on the tables and columns (the externally visible part).


Adam


I have seen things you people would not believe, attack ships on fire off the shoulder of Orion, c-beams sparkling in the dark near the Tennhauser Gate. All these moments will be lost in time, like tears in the rain.
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
    5
Originally posted by Adam Hardy:
A database doesn't really have external behaviour though. At least you would have to include the persistence O/R mapping layer in the discussion, and talk about refactoring behind the API which that offers.

Otherwise you would be limited to refactoring indexing and other mechanisms which had no impact on the tables and columns (the externally visible part).

Well, sometimes there are stored procedures. And those are often, almost by definition, in need of serious refactoring...

Perhaps views might be something that could be considered externally visible while the underlying, real tables would be internal structure? I haven't personally used views almost at all so I'd be interested in hearing others' thoughts on whether this makes any sense.


Author of Test Driven (2007) and Effective Unit Testing (2013) [Blog] [HowToAskQuestionsOnJavaRanch]
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Adam Hardy:
A database doesn't really have external behaviour though.


That depends on how you define "behaviour"...

I think with regards to database refactorings, it includes what information is provided to the outside.

Splitting a table is an example for a database refactoring.
Scott Ambler
author
Ranch Hand

Joined: Dec 12, 2003
Posts: 608
You might want to visit The Process of Database Refactoring and DB Refactoring to learn a bit more about what the technique is about.

A database refactoring is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. For the sake of this discussion a database schema includes both structural aspects such as table and view definitions as well as functional aspects such as stored procedures and triggers. An interesting thing to note is that a database refactoring is conceptually more difficult than a code refactoring; code refactorings only need to maintain behavioral semantics while database refactorings also must maintain informational semantics.

You need to realize that there is a lot more to RDBs than simple CRUD of objects. Yes, there are tables and columns where you store data. But there is functionality (stored procs, stored functions, triggers), views, and so on that we should take into consideration. Just like your application code can evolve, so can your database schema.

- Scott


<a href="http://www-306.ibm.com/software/rational/bios/ambler.html" target="_blank" rel="nofollow">Scott W. Ambler</a><br />Practice Leader Agile Development, IBM Rational<br /> <br />Now available: <a href="http://www.ambysoft.com/books/refactoringDatabases.html" target="_blank" rel="nofollow">Refactoring Databases: Evolutionary Database Design</a>
Scott Ambler
author
Ranch Hand

Joined: Dec 12, 2003
Posts: 608
As to the normalization issue, one way to look at some refactorings are that they are "normalization" after the fact.

You could argue that you should get the DB design right in the first place, but as we all know from the often sad state of the existing production databases out there that this doesn't always happen in practice. Therefore we need a safe and simple technique such as database refactoring to fix things after the fact.

- Scott
Mike Farnham
Ranch Hand

Joined: Sep 25, 2001
Posts: 76
Welcome Scott!

I've been reading your stuff for years.

Since refactoring is in part re-design,
when does refactoring end and re-design begin.

We have two separate databases.
The first db is a "resource" db providing information to a multitude of applications. It is basically a collect and store point from external sources. The second db is specific to an application.

We have come to discover that the information in the application specific db is in reality a "resource" (used by other applications). So, we are looking to re-factor/re-design the application db and pull the information out of it and put in the "resource" db. One concern has been "Can the application db stand on its own?"

Is this change too radical to be considered a db refactoring?

How would the book (or your other articles) help address this change?

Thanks,
Mike
Paul Michael
Ranch Hand

Joined: Jul 02, 2001
Posts: 697
Originally posted by Ilja Preuss:


Consequently, Database Refactoring would be a disciplined technique for restructuring an existing database, altering its internal structure without changing its external behaviour.


Ahh... right. I remember now, refactoring should retain its external interfaces and the original intent or behavior of the system. But it also should in turn make the system more manageable/easily maintainable and remove some duplication of behavior or data in the process (which in database terms is equivalent to normalization).

Thanks guys.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mike Farnham:
Since refactoring is in part re-design,
when does refactoring end and re-design begin.


To me, refactoring is re-design done in very small, very controlled steps.


Is this change too radical to be considered a db refactoring?


It sounds to me as if this change, however radical, probably could be done in a lot of small, controlled steps, with a fully working system after each step - in which case it might not be *a* refactoring, but a sequence of refactorings.

I might even bet that it would be the *best* way to do it, but to be sure, I'd need to know much, much more about your situation...
manuel aldana
Ranch Hand

Joined: Dec 29, 2005
Posts: 308
what about dropping a table, because it is not needed in your horrible grown 200 tables schema? that means it is not used by any application (frontend, bussiness tier) bits.

could that be included in a database-refactoring because it is kind of dead code?


aldana software engineering blog & .more
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
    5
Originally posted by manuel aldana:
what about dropping a table, because it is not needed in your horrible grown 200 tables schema? that means it is not used by any application (frontend, bussiness tier) bits.

could that be included in a database-refactoring because it is kind of dead code?

Yes, "Drop Table" could very well be considered a database refactoring.
Peer Reynders
Bartender

Joined: Aug 19, 2005
Posts: 2922
    
    5
Originally posted by Lasse Koskela:

Yes, "Drop Table" could very well be considered a database refactoring.

"Refactoring Databases: Evolutionary Database Design" Drop Table - p.77
Francis Siu
Ranch Hand

Joined: Jan 04, 2003
Posts: 867
Yes, "Drop Table" could very well be considered a database refactoring.

Um.....could you tell us why "Drop Table" could very well be considered a database refactoring?
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by siu chung man:

Um.....could you tell us why "Drop Table" could very well be considered a database refactoring?


Well - why not?
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8919

RDBs

What is that ? Relational databases ?
[ July 26, 2006: Message edited by: Pradip Bhat ]

Groovy
Scott Ambler
author
Ranch Hand

Joined: Dec 12, 2003
Posts: 608
We have two separate databases.
The first db is a "resource" db providing information to a multitude of applications. It is basically a collect and store point from external sources. The second db is specific to an application.

We have come to discover that the information in the application specific db is in reality a "resource" (used by other applications). So, we are looking to re-factor/re-design the application db and pull the information out of it and put in the "resource" db. One concern has been "Can the application db stand on its own?"

Is this change too radical to be considered a db refactoring?


Yes. We include refactorings such as Move Column, Move Data, Use Official Data Source, and so on that would likely apply.

- Scott
Rajan Chinna
Ranch Hand

Joined: Jul 01, 2004
Posts: 320
I don't think droping un-used tables is considered as Refactoring.....
Does anyone has valid point to consider that as Refactoring?
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Rajan Chinna:
I don't think droping un-used tables is considered as Refactoring.....
Does anyone has valid point to consider that as Refactoring?


Well, it is a small, controlled change that doesn't change the perceived behaviour of the system.

Can you explain why you wouldn't call it a refactoring, please?
Pramod Sadalage
Greenhorn

Joined: Jul 26, 2006
Posts: 2
Drop Table is a refactoring since you are removing unused tables hence improving the maintainibility of the database similar to removing Methods that are not used.

If tables are kept around that are not used, your database becomes a difficult to understand and there are high chances of the Table being used by some one else adding more confusion. You will also have to maintain this table for eternity (space allocation, maintain the OR mappings, views etc)

<<Drop Table>> is a good refactoring to do.


- Pramod<br />--------------------<br />Pramod Sadalage<br />Consultant <br />ThoughtWorks Inc.<br /><a href="http://www.thoughtworks.com" target="_blank" rel="nofollow">www.thoughtworks.com</a><br /> <br />Now available: Refactoring Databases: Evolutionary Database Design
Mike Farnham
Ranch Hand

Joined: Sep 25, 2001
Posts: 76
The equivalent anti-pattern of leaving in a database table that is no longer used would be "Lava Flow" aka "Dead Code".

I could see one way of this happening would be if you "accidentally" moved a table from dev to prod.

As it was, we had a situation where we moved a view from dev of one instance (Oracle 8i) to another instance (Oracle 9). The code from dev that accessed the view was used and made its way thru test, until we saw problems on our way to production. Moral of the story was an old lesson: Always use production code before making changes.
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
Originally posted by Rajan Chinna:
I don't think droping un-used tables is considered as Refactoring.....
Does anyone has valid point to consider that as Refactoring?


Okay. What about removing a unused method from your class?

[Edited]
Or remove any of unused property?
[ July 27, 2006: Message edited by: Adeel Ansari ]
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Refactoring Databases vs Normalization