database or spreadsheet

Am 07.02.2012 16:52, Jay Lozier wrote:

Most people are scared off by having to develop queries, particularly if
they must use SQL. It is too much like programming to them and that
tends to scare them; seemingly arcane rules and syntax that does not
appear very flexible or easy to use.

Some of them are so desparately trying to organize their data in spreadsheets that they start learning an extinct programming lingo together with formulas like

The irony is that most SQL queries are easier to understand because they reference tables and columns by name (out of necessity) and usually have a selection criteria that is makes contextual sense, such as vendor name. The above is not easy to follow or debug, I know because I do similar in the past.

SELECT column names
FROM table
WHERE criteria for selection

is easier to follow, even when the query is much more complex

I hope to to condense and reply to all.

A person has spent arduous hours of entering/linking/formula-ing etc... on spreadsheets.
It is not the optimum of data entry, as would be a database form. But it does do every thing they want it to.

Now, if a translation Calc (spreadsheet) to Base (database SQL) generated the the database, they would only have to make the data entry form.

Yes, this is an overly simplistic view, but it is what is being asked for by most.

Exp:
1. make a spread sheet, add data, add formulas, add look up references (internal or external)
2. Save as BASE. ?Do you wish to keep Calc data or create an empty database? Yes or No.
3. Learn how to create your first entry form in Base and add more records.

It is at this point the original spread sheet comes to haunt us. Items 1 and 2 become reiterative until user tires out and learns SQL.

It is not the who is bigger than who, but who is trying to get in, how many are left out and how may we help them to get in.

Carrot, Stick or Sugar with Medicine?

Am 07.02.2012 21:57, Jay Lozier wrote:

The irony is that most SQL queries are easier to understand because they
reference tables and columns by name (out of necessity) and usually have
a selection criteria that is makes contextual sense, such as vendor
name. The above is not easy to follow or debug, I know because I do
similar in the past.

SELECT column names
FROM table
WHERE criteria for selection

is easier to follow, even when the query is much more complex

SQL is much closer to human language than spreadsheet formulas. There is no need to test for missing/invalid data since a database does not allow for missing, invalid or inconsistent data. If there is a relation, the look up in the detail table can not fail since you can not use any references to non existing items nor can you delete any items that are referred in other tables.
Databases provide much better functionality for text values where Calc does it all wrong.

=IF(A1="";"";IF(AND(ISNUMBER(A1);A1>0);ROUND(A1;2)*VLOOKUP(A1;TaxRates;2)) [and this bullshit has to be copied down to each and every row of the ever growing sales list].

If a database record requires a product price then there can not be an empty value nor text in the price column. If that price is defined as a number between 0.01 and 9,999.99 there won't be any value with more than 2 digits behind the comma and all calculations will use that level of precision without explicit rounding. And a user can not override this "validation" with a simple copy&paste. Nobody has to copy any calculation formula. They are defined by the database designer for the entire field. A spreadsheet has no records nor fields, not even tables. Tables with records and fields are only in the head of the user when he uses rectangles of cells as if they were organized like this. In fact you can use a sheet of cells beyond these rigid concepts.

Am 07.02.2012 21:58, Paul D. Mirowsky wrote:

I hope to to condense and reply to all.

A person has spent arduous hours of entering/linking/formula-ing etc...
on spreadsheets.
It is not the optimum of data entry, as would be a database form. But it
does do every thing they want it to.

The majority of today's computer/tablet/phone users will never be able to manage data properly in this way.

Now, if a translation Calc (spreadsheet) to Base (database SQL)
generated the the database, they would only have to make the data entry
form.

There is no "translation" from spreadsheet to SQL. It is an import process of the raw data into your own data structures which are analog to a user defined file format. I have done this dozends of times. It has nothing to do with plain copy&paste or save as <other file type>. A database knows a dozend of data types. Calc knows only two types of data.
Input forms are the second step.

Yes, this is an overly simplistic view, but it is what is being asked
for by most.

Yes, they only describe of what they want to have. They do not waste any thought on what they already have and what it takes to make something useful with what they have.
All these questions are about software features that exist in that software but not in this one.
They can not work with software tools. Instead they expect that the tools do their work automagically.
They never show their actual data material (lengthy complaints about Calc's csv import options without pasting a single line of the text data to import).

In your experience, when is it better to use the embedded database
instead of a client-server database?

A spreadsheet (Calc or Excel ) cab be used as a back end data
source and Base as the front end. How does this fit into this
discussion?
     Remember that Base will only create queries and reports from the
spreadsheet. Neither fields, views, nor tables can be created, modified,
or deleted. All of these types of changes must be done in the
spreadsheet.
     Also, what about saving each sheet of the spreadsheet as a dBase
file?

--Dan

The statement made by someone that databases requires planned design
_first_ is important to remember. But for a novice how do you make the
plan?

This is why the original poster was asked the question why did
spreadsheet names have to be changed, although he doesn't think it
important. It is important to know the objective. For example, we
could be asked how to manoeuvre a 10-tonne truck a distance of 50
metres, when if we knew the objective was to buy a drink from the shop
across the road, we could advise it would be easier to walk! A silly
analogy but hopefully understood.

Maybe the way to promote the option of designing databases is to
describe particular scenarios and show how particular tasks can be
performed using either a database or a spreadsheet. Users would then
see two tutorial scenarios to show the (dis)advantages.

Hopefully someone can suggest such a scenario or two please?

Not sure if this beyond the scope of LO, but an explanation of how to
perform basic data modelling, design, planning would be very useful,
especially as a "suggested reading guide" before explaining the
features of base and calc.

The comment " Tables with records and fields are only in the head of the user when he uses rectangles of cells as if they were organized like this. In fact you can use a sheet of cells beyond these rigid concepts." is often overlooked. Often another can override the implicit design of a spreadsheet. Spreadsheets do have tools to limit this but they must be activated by the designer.

Another flaw with spreadsheet database is that it is difficult to make sure the user enters all the proper data while a database will have implicit rules embedded for each table for checking that all the data is entered. If a table row has 8 elements and requires 5 must have data, this is can set in database table design but not so easily with a spreadsheet.

Am 07.02.2012 21:58, Paul D. Mirowsky wrote:

I hope to to condense and reply to all.

A person has spent arduous hours of entering/linking/formula-ing etc...
on spreadsheets.
It is not the optimum of data entry, as would be a database form. But it
does do every thing they want it to.

The majority of today's computer/tablet/phone users will never be able
to manage data properly in this way.

Now, if a translation Calc (spreadsheet) to Base (database SQL)
generated the the database, they would only have to make the data entry
form.

There is no "translation" from spreadsheet to SQL. It is an import
process of the raw data into your own data structures which are analog
to a user defined file format. I have done this dozends of times. It has
nothing to do with plain copy&paste or save as<other file type>. A
database knows a dozend of data types. Calc knows only two types of data.
Input forms are the second step.

Yes, this is an overly simplistic view, but it is what is being asked
for by most.

Yes, they only describe of what they want to have. They do not waste any
thought on what they already have and what it takes to make something
useful with what they have.
All these questions are about software features that exist in that
software but not in this one.
They can not work with software tools. Instead they expect that the
tools do their work automagically.
They never show their actual data material (lengthy complaints about
Calc's csv import options without pasting a single line of the text data
to import).

      A spreadsheet (Calc or Excel ) cab be used as a back end data
source and Base as the front end. How does this fit into this
discussion?

Which is the front end probably depends on the problem at hand. I would expect a good design will use Base for data storage, some preliminary data manipulation while reserving Calc to more complex caculations on the data.

      Remember that Base will only create queries and reports from the
spreadsheet. Neither fields, views, nor tables can be created, modified,
or deleted. All of these types of changes must be done in the
spreadsheet.
      Also, what about saving each sheet of the spreadsheet as a dBase
file?

Depends on the sheet layout, but it is possible. I often use a database to generate CSV files for import into other tables or into spreadsheets and do the reverse with Calc.

Am 07.02.2012 23:08, Dan Lewis wrote:

      A spreadsheet (Calc or Excel ) cab be used as a back end data
source and Base as the front end. How does this fit into this
discussion?

We were talking about the process to turn a collection of spreadsheet lists into a relational database. I tried to explain that this involves a lot more than copying data and adding forms.

A spreadsheet as data source of a Base document is not a database at all. It is a way to present spreadsheet data as if they were stored in a database. Some sheets may contain matrices, descriptive text or any mix of data rather than database-like lists.
I use this quite often for "fuzzy address collections" copied and dragged around from various other sources. Finally I collect everything in a Base query and tell "my users" to use this query for a certain serial letter.
The beauty of data sources is that the user who needs to compose a serial letter or label printing, does not even need to know where the data come from. All he/she needs to know is the name of the data source and the name of a query. The rest is always the same.

      Remember that Base will only create queries and reports from the
spreadsheet. Neither fields, views, nor tables can be created, modified,
or deleted. All of these types of changes must be done in the
spreadsheet.
      Also, what about saving each sheet of the spreadsheet as a dBase
file?

It is the best database format for flat lists. You can edit the data directly in the data source window. dBase is editable, can be indexed for fast lookups and it has no practically relevant row limits. But the built-in dBase driver does not provide relations and the query limitations are the same as with csv or sheets. If you save an inconsistent spreadsheet as dBase the resulting table will be text only. So you better check your data.

The idea behind client-server is that while one accesses a database and that others may be accessing the same database. The DMS is "intelligent" enough to process updates, etc. without causing problems. With Base, you are using the Base as the client interface with "server" database.

Embedded database engines are better for stand alone applications. Often they are easier to access, open and go, while a client-server database will require more steps such as logging in for access.

I usually use Base as a front end to MariaDB because I prefer it over the embedded engine as if MariaDB was actually embedded (yes it is more work initially)

As a consequence: It would help most, if the database set up barrier could
be
lowered.

We therefore should ask (and try to learn) what makes database creation so
difficult, and how could we contribute to ease the learing curve?

Or directly: how can we make database creation easy and a low-threshold
task?

The statement made by someone that databases requires planned design
_first_ is important to remember. But for a novice how do you make the
plan?

For example you wish to develop an inventory of all your books. You look at the data you have for each book such as title, author, subject, publisher, etc. and determine what data is needed. Next look at each group of data for a book. A book can have multiple authors and multiple subjects but only one title, publisher, ISBN number. This suggests that there should be three tables in the initial: title, author, and subject.

First design you have a main table of based on the title each book - primary key titleid, an author table with each author linked by a "foreign key" to the correct title id, and a subject table with each subject linked by a "foreign key" to the correct titileid. The author and subject tables have their own internal ids. I like to have an integer id for each row as the primary key. You can use any column or group of columns as a primary key as long as each generates a unique value. Using integer id fields makes implementing foreign keys easier.

To refine the design, you want every individual piece of data only inputted once and you may find it advisable to split say the title table into two tables,; one for titles and one for publishers.

Also, determine the best type for each data column, text, date, time, integer, floating point number, etc and determine which data must be entered, which has a default value you can override, and which can have no data entered. The database engine will enforce these rules when one enters the data. If any is the wrong type or required it will through an error.

Sorry - and without comment on the new discussion going on here - but it
was not important in the case of the question asked earlier.

The user asked how to rename sheets in Calc - end of story.

If you don't want to answer that question, then don't answer - but don't
open the door and show him two young men in white shirts offering to
save his soul..

or - answer his, perfectly valid, question first and then offer the
sermon, but what he got was a sermon and no answer to his question.

Respectfully,

//drew

AFAIK, a brief overview of data modeling is included in the Base guide (I think it is a hands on tutorial actually). Books on Access, MySQL, MSSQL Server, and MariaDB I have all have some information on data modeling because a good working knowledge is very important to proper database design.

Actually, for most people three key concepts are critical: determine what data you have and need, break up the data into logical groups, and design the data entry so that the data is entered only once. For your purposes, you may have more information available than you need or in some cases can easily generate the "data" from other data. The total price for an invoice could be generated other data included in the invoice, for example. Whether to include it depends on your needs and the details you will capture. How the data should be grouped depends on the type of data and how you need to use the data. You want table structures that allow entry of each piece of data without requiring used fields. On an invoice there are often multiple items but each invoice may a different number of items. Breaking the invoice into multiple tables; one for the basic invoice data and another for actual line items is recommended. The idea of enter once is critical, you want to reduce redundant data and eliminate as much data entry as possible, this makes locating, updating much easier.

The key modeling point is take a little time before starting to think about what you are doing and why you are doing it. You can use pen and paper or software tools and formal procedures to aid the process. For a simple project, could you write a few paragraphs clearly describing what you intend to do, how you intend to do it, and outline the data you plan to use? If, yes you have your solution outlined. The problem is that basic concept is simple but the execution can be difficult. I know often people will rush this step or skip it completely but it is very important.

There are massive tomes written about database design but do not lose sight that what you are doing is thinking about the problem and its solution before doing any detailed work on implementing it using the tools you have available.

> >
> > As a consequence: It would help most, if the database set up barrier could
> > be
> > lowered.
> >
> > We therefore should ask (and try to learn) what makes database creation so
> > difficult, and how could we contribute to ease the learing curve?
> >
> > Or directly: how can we make database creation easy and a low-threshold
> > task?
> >
>
> The statement made by someone that databases requires planned design
> _first_ is important to remember. But for a novice how do you make the
> plan?
>
> This is why the original poster was asked the question why did
> spreadsheet names have to be changed, although he doesn't think it
> important. It is important to know the objective.

Sorry - and without comment on the new discussion going on here - but it
was not important in the case of the question asked earlier.

The user asked how to rename sheets in Calc - end of story.

If you don't want to answer that question, then don't answer - but don't
open the door and show him two young men in white shirts offering to
save his soul..

or - answer his, perfectly valid, question first and then offer the
sermon, but what he got was a sermon and no answer to his question.

Just one more thing if I may.

That isn't the norm here, I know it's not
- I don't know how that got out of hand so quickly, maybe there was an
earlier exchange on the list that I didn't see
- nor am I trying to preach to you guys either, just asking that you
keep it in perspective, that's all.

ok - now I'll shut up.

Yes, in Calc row 1 is usually a set of fields defined (headers) by there type within a sheet (single tab). This is the table.

The essence already exists. When copying/importing data to Base from Calc, it request whether or not the first line
is a definition. In a "save as Base" scenario, it is assumed, but is checked for:
Field Undefined - Row 1 column X is blank - cannot save as Base
Field Undefined - Row 1 column X is formatted as (Date,Number,Text, Etc..) Are you sure you wish to continue?
If it passes the test, the table is defined.

The crux of the problem is an engine that redefines references, formulas and query ranges in Calc to SQL as they become the Base Queries.

I think this is my "bad".

I apologize to all.

I think this is my "bad".

I apologize to all.

No need - a lot of traffic happens here, and it all is generated by
humans.. stuff happens - my main point is - put it behind, a good
discussion came of it.

//drew