Win a copy of Svelte and Sapper in Action this week in the JavaScript forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Ron McLeod
  • Paul Clapham
  • Bear Bibeault
  • Junilu Lacar
Sheriffs:
  • Jeanne Boyarsky
  • Tim Cooke
  • Henry Wong
Saloon Keepers:
  • Tim Moores
  • Stephan van Hulst
  • Tim Holloway
  • salvin francis
  • Frits Walraven
Bartenders:
  • Scott Selikoff
  • Piet Souris
  • Carey Brown

Schema question for storing spreadsheet data

 
Ranch Hand
Posts: 120
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
What would be the best way to store a spreadsheet in a db? If the columns will be fairly static, is it appropriate to just use a table to mimic the spreadsheet with a db column for each spreadsheet column? The can be many different datasets in this table, and some will be overwritten at times. This schema seems a bit strange to me, but I am not sure of a better schema. Thoughts?
 
Marshal
Posts: 25814
69
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
"Best" of course is going to depend on whether your spreadsheets have some internal structure, and on how many features of your spreadsheet software you've taken advantage of. For the general case you could start with three columns, namely "RowNumber", "ColumnNumber", and "Data".

But that's excessively simple, since "Data" is much more complex than that. You might also want "DataType" and you might want columns for the cell formatting -- colour, font information, alignment, borders, fill effects, you could go on for quite a while. And then there's information which doesn't apply to cells, like row heights and column widths, and whether rows and columns are hidden. And then groups of cells can be merged. And... you get the picture, but it depends on whether you care about any of that sort of thing. So as I said, it depends on what information you're actually using from the spreadsheet.
 
Rj Ewing
Ranch Hand
Posts: 120
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The spreadsheet data is very simple. Just text and numbers. We have validation profiles which state the given columns datatype, etc. The only thing I need to store in the db are the column values, which I know of the datatype in advanced. So I'm not so sure it would be necessary to keep track of row and cell numbers. Each row in the spreadsheet is essentially metadata about a biological sample. The spreadsheet data will have been validated (ensuring correct datatypes for each column, etc) before I load to the db.

Maybe loading the spreadsheet into a RDBMS isn't the best option and using something like apache metamodel would be more appropriate?
 
Paul Clapham
Marshal
Posts: 25814
69
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Let's look at the other side then. Why do you want to store that data in a database, or in some other external form? Why not just leave it in the spreadsheets?

If you stored it in a DB table then conceivably other applications could read the data from that table, which is a great deal easier than extracting it from the spreadsheets. So if that's one of your goals then perhaps you should drive the design based on the needs of those other applications.
 
Rj Ewing
Ranch Hand
Posts: 120
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Yes, the uploaded datasets will be queryable. Majority of the time, the data is just regurgitated, but there are times when the user will query based on one or multiple column values, returning the entire row(s) that.

Other times, users will want to download the entire dataset as csv or excel files.

We currently have a fuseki triple store backend, but another client is requesting for their instance to mysql.

Thanks for helping me think through this.
 
Bartender
Posts: 598
26
Oracle Notepad Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Detail is important. Databases store data, not spreadsheets. Unless the spreadsheet itself is the datum to be stored.

For example: If the spreadsheet was a receipt-form to be stored in the Order table alongside the rest of the order, it would be a BLOB. The database would not know what is in the spreadsheet; it just knows it is there.
Conversely: If the data in the spreadsheet is important, and its being a spreadsheet a mere medium to present the data, the data would be stored in the database's columns and rows instead of (and akin to) the spreadsheet's columns and rows.
Similarly: If the database were to assume the role of the spreadsheet, it would likely require normalization, supporting lookups, and views to bring it all together.
Though: As spreadsheets often support saving as XML, and databases often support XML querying, it might make sense to store the XML in an XML column, and be queryable via an XMLQuery.

So many choices, so much fun.
 
This guy is skipping without a rope. At least, that's what this tiny ad said:
Building a Better World in your Backyard by Paul Wheaton and Shawn Klassen-Koop
https://coderanch.com/wiki/718759/books/Building-World-Backyard-Paul-Wheaton
    Bookmark Topic Watch Topic
  • New Topic