Schema in Calc

Hi,
   In a relational DB there is a facility to display all the tables,
showing how the tables are related, which fields are related, and how
they are related. This is extremely useful to get an
overall picture of how things fit together and, importantly in this case,
what the effects would be of changing something in one table on the data in
the other tables. Does such a feature exist in Calc? With a large number of
sheets and many cells looking up data in cells on other sheets it would be
great to be able to see how changing a sheet name or a column heading would
affect the rest of the data.

Thanks

Hi :)
I seriously doubt it! It is database functionality.

It would be like expecting a word-processor to have DTP functionality, or a
text-editor to do the same things as a word-processor.

One option might be to get Base to read the tables as its external
back-ends. That would get you to the same screen showing all the different
tables/work-sheets and their headings. However since the calculations take
place within the work-sheets Base wouldn't show any links between the
tables just yet.

Although you don't see the links directly it might help you figure out the
obvious ones and a print-out (or screen-shot) of that screen might help you
be able to draw in the rest as you figure them out by hunting through the
worksheets.

This process might help you figure out how to set-up Queries to do the work
that is mostly done within worksheets at the moment and that would probably
increase reliability quite a lot.

So, that route might help you migrate the whole work-book and all its
sheets into a proper database. One problem with spreadsheets is that a
column or row might not have every field doing the same calculation as all
the rest in the row or column. People sometimes put a lot of effort into
finding errant cells that misbehave in that way and there are a lot of
tools to help track such cells down. A database allows you to write (or
modify) a calculation (formula) in one place, usually in a Query. Then you
can be certain that exactly that formula is applied to every single line
with no exceptions (although IF type statements help deal with special
cases). The Query contains no actual data and only has the formula written
once so it's extremely light-weight but when viewed as a table it looks
like one of the tabs (a worksheet) within a spreadsheet. If you want it to
look pretty then set-up a Form or Report to present the output of the Query
in a more pleasant manner.
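
To make the "formula written once, in a Query" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Base. The table `orders`, the view `order_totals`, the column names, and the 10% bulk discount are all made up for illustration. The view plays the role of the Query: it stores no data, holds the calculation in one place, and uses a CASE expression as the "IF type statement" for the special case.

```python
import sqlite3

# In-memory database standing in for the migrated work-book (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER, unit_price REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("widget", 3, 2.5), ("gadget", 1, 10.0), ("bulk", 100, 1.0)],
)

# The view holds the calculation exactly once; it contains no data itself.
# CASE handles the special case (a hypothetical discount for large quantities).
conn.execute(
    """
    CREATE VIEW order_totals AS
    SELECT item,
           qty,
           unit_price,
           CASE WHEN qty >= 100 THEN qty * unit_price * 0.9
                ELSE qty * unit_price
           END AS total
    FROM orders
    """
)

# Reading the view looks like reading a worksheet, one row per order.
rows = conn.execute(
    "SELECT item, total FROM order_totals ORDER BY item"
).fetchall()
for item, total in rows:
    print(item, total)
```

Changing the calculation means editing the view definition once; every row picks up the change, so there are no errant cells to hunt for.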

Errr, part of the power of spreadsheets is that they do have the
flexibility to have very different calculations in a column or row, but then
each cell needs to be labeled, usually in an adjacent cell/field, so that you
know what it's for a few months or years later. Unfortunately many people
keep using spreadsheets to do what a database would do better, which is OK
as part of the planning process but often becomes unwieldy in the
longer term.
Regards from
Tom :)

The most of which I am aware is the ability to see which cells reference the current cell, and, even then, you won't see the cells if they are not visible on screen.

It requires a macro to turn on and off. It is rather obscure; I think it is meant for debugging.

While what Tom said about databases versus spreadsheets is largely
true, it is also true that a lot of people use spreadsheets as a sort
of report, without the database bit. This is often easier to put
together, especially for people that don't have the know-how to develop
a database application complete with reports, and sometimes easier to
maintain, especially in situations where the requirements change
often, even if only slightly.

And what the OP asked for should be fairly trivial to implement, at
least for a simple case. I don't think it'll ever be possible to see a
proper "schema", not given how flexible spreadsheets are, but seeing
some sort of cell dependencies should be possible.

And in fact, looking through Calc, I have found exactly that.
Under "Tools | Detective" you have both "Trace Precedents" and "Trace
Dependents", which show you which cells a given cell depends on, and which
cells depend on a given cell. This should be most of what the OP needs.
You can use "Fill Mode" to select multiple cells, but it doesn't seem
to work on a whole page at a time, nor very well across pages.

What would be better is a filterable list of formulae, and given that
all non-empty cells have to be saved in the .ods file, it must surely
be relatively trivial to pull out a list of formulae (and any other
dependencies that it might be possible to make), present that list in a
filterable manner, and allow the user to click to see the depender and
dependee of the formula in some fashion, by highlighting the cells or
double-clicking to go to them. At least that way you could, for example,
pull up a list of formulae in the current spreadsheet, filter that list
by one of the worksheets, and see all the formulae that depend on the
worksheet name, or on a specific column in the worksheet, etc. This may
not allow one to get a grand overview of how the data hangs together
like with a database schema, but that is the price you pay for having
the flexibility to have the data not hang together in a specific way.
It would allow one to check how certain pieces hang together, and to
establish before making changes what the effects might be.
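
As a rough illustration of the "pull out a list of formulae" idea, here is a minimal Python sketch that works outside Calc. It relies on an .ods file being a zip archive whose content.xml stores each formula in a `table:formula` attribute on the cell element. The function names are my own, the filter is a plain substring match on the stored formula text (which is in the file's OpenFormula notation, not exactly what Calc displays), and it makes no attempt to cover named ranges, charts, or other kinds of dependency.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used for tables, rows and cells in OpenDocument content.xml.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"


def _q(name):
    """Qualify a local name with the ODF table namespace."""
    return f"{{{TABLE_NS}}}{name}"


def column_letters(index):
    """Convert a 0-based column index to spreadsheet letters (A, B, ..., AA)."""
    letters = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def list_formulas(content_xml):
    """Return (sheet name, cell address, stored formula) for every formula cell."""
    root = ET.fromstring(content_xml)
    found = []
    for sheet in root.iter(_q("table")):
        sheet_name = sheet.get(_q("name"))
        row_no = 0
        for row in sheet.iter(_q("table-row")):
            row_repeat = int(row.get(_q("number-rows-repeated"), "1"))
            col_no = 0
            row_formulas = []
            for cell in row:
                if cell.tag not in (_q("table-cell"), _q("covered-table-cell")):
                    continue
                col_repeat = int(cell.get(_q("number-columns-repeated"), "1"))
                formula = cell.get(_q("formula"))
                if formula is not None:
                    for offset in range(col_repeat):
                        row_formulas.append((col_no + offset, formula))
                # Empty runs are skipped by counting, not by looping over them.
                col_no += col_repeat
            for offset in range(row_repeat if row_formulas else 0):
                for col, formula in row_formulas:
                    address = f"{column_letters(col)}{row_no + offset + 1}"
                    found.append((sheet_name, address, formula))
            row_no += row_repeat
    return found


def formulas_mentioning(formulas, text):
    """Keep only the entries whose stored formula text contains `text`."""
    return [entry for entry in formulas if text in entry[2]]


def list_formulas_in_file(path):
    """Read content.xml out of an .ods file and list its formulas."""
    with zipfile.ZipFile(path) as archive:
        return list_formulas(archive.read("content.xml"))
```

With a hypothetical file, `formulas_mentioning(list_formulas_in_file("budget.ods"), "Prices.")` would give a first approximation of "everything that refers to the sheet named Prices", which is the sort of check one would want before renaming that sheet.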

Just my thoughts

Paul

Short answer: no.

As Tom has pointed out, this is db design functionality, and not Calc,
although nothing is stopping anyone from developing, say, an Addon or
Extension that might be able to do what you are looking for - certainly not
trivial though, given that db drivers are notoriously prone to
misrepresenting their actual capabilities.

Alex

Unfortunately not trivial to implement because it relies on the driver
and the underlying db to tell the truth about what it actually supports,
in terms of relations, integrity constraints, etc.

As this is problematic for many drivers at the best of times, having
such a feature, which I would class as a db design tool add-on, is not
likely to be implemented any time soon.

Alex

Just out of interest, why would a db driver come into play? This is a
spreadsheet we're talking about.

The mention of db schema was merely to ask why Calc doesn't have
something like it.

And I do understand that a schema for Calc would be nigh impossible:
spreadsheets don't have to follow any basic design principles, so
the spreadsheet would be the schema, but I suggested a simple list of
formulae as being useful.

And now that I think about it, the detective features could possibly be
expanded, first to work better across worksheets, second to allow
people to show all dependencies on a sheet at a time instead of per
cell, and third as some sort of "print preview", allowing a birds-eye
overview of what's going on.

Paul

How else would one obtain the schema in the first place? As I see it
there are only two possibilities:

- import from an existing db connection using the db source/named
ranges functionality; this requires a db context/connection and
consequently a db driver;

- import from a file (XML/UML or other Calc-recognisable file type) that
has been obtained by exporting the schema from a database.

The second way wouldn't require a db driver, but is more cumbersome IMO
than being able to query the db directly for the schema because it
requires specific matching of the structure to the internal
representation of a Calc file, i.e. implementation of a separate
import/export filter.

Alex

I believe one of us is misunderstanding things. My understanding is
that the OP didn't actually want a database schema, he just wanted
something similar for normal spreadsheets.

So basically, create a normal spreadsheet from scratch, develop it over
time to do some fancy calculations on multiple worksheets (or maybe
even only one). Then, sometime later, want to make changes, but
being unsure of what effects those changes will have in terms of other
formulae relying on cells you want to modify, you would like some sort
of outline of how all the data in the spreadsheet interacts.

At no point in this is a database ever introduced. The word schema
makes one think of a database, but I think the OP only used it to say he
wanted something like it. He doesn't actually want a database schema.

Now for spreadsheets, as they store unstructured data whereas databases
store structured data, a schema-like outline would be well-nigh
impossible to generate. But some sort of filterable listing of formulae
would help a lot with the above scenario, and is what I have proposed.

I think this should be a feature request for Calc, personally.

Paul

Thank you, Paul. I have followed the convolutions of the discussion with
interest but you have more clearly presented what I tried to say in the
beginning. After much of the discussion I now see many of the difficulties
involved but I will follow your suggestion and see if I can phrase this
clearly enough to present it as a Calc feature request.
  I must express my thanks to all who contributed. It is this type of
interaction that can lead to a more useful product.

Paddy

Actually, the OP (me) started with a Calc database because changes are
easy. Changes are still under way (field names, new fields, etc.), so
I'm using Base to access Calc, which gives me queries.

The next step is a real db, but I'm still sorting that out. I'm not
interested in a RAM-resident, embedded db for reasons pointed out in this
thread. As the db requirements become more clear (size, number of
users, structure), I'll decide. Could it be a future version of Base?
Possibly, depending on how it progresses over the rest of the year. I'm
looking for robust, good performance, reliable, and straightforward
setup (not necessarily easy, but not requiring a db PhD).

Dave,