database or spreadsheet

Readers,

There was an interesting discussion which seemed to be about using
accounting principles/conventions with computer software.

Clearly the original poster forgot to dispense with traditional
thought processes and think critically about how new tools (first the
computer, then more specifically open source software) offers the
opportunity to develop new methods for solving problems.

We read the all-too-common scenario: an m$ fan wanting to use LO as an
m$ clone without learning anything new or assessing whether there is a
better way of doing things. As always, such people want open source
software users to help them for free. Please reconsider and revert to
using m$; the fact that the problem was solved using m$ proves this
option.

m$ users, please take the time to do your homework, research the
alternatives before asking for open source software to adopt the same
behaviour, mentality and sometimes inefficient process to performing
tasks as m$.

The example of organising identical data types into multiple
spreadsheets is so common, primarily because the average "office"
personnel is not introduced (nor willing to accept) to the power of
databases.

Realistically, we cannot expect someone due to retire soon to suddenly
change, but the next generation should be encouraged to be open minded
to using new ideas.

By the way, thank you for the hyperlink to the data pilot functions,
but should this type of functionality be performed by learning how to
use relational database queries?

Everybody knows how write simple math statement as a formula. Everybody doesn't know how to right an SQL statement.

The largest failure to using database vs. spreadsheet is the refusal to allow tables to set up formulas in databases as they are in spreadsheets.

A formula interpreter would quickly resolve inherent coding fear of database.

Reason for not doing this. I don't know.

Am 06.02.2012 18:24, Paul D. Mirowsky wrote:

Everybody knows how write simple math statement as a formula. Everybody
doesn't know how to right an SQL statement.

"My users" can not fill out spreadsheets properly. They have no idea how to write formulas, they don't understand cell formats, subtile input methods nor do they want to expand any formulas for growing lists.

Even for the most simple protocols and item inventories I provide them a set of input forms. They pick options from list boxes, write dates into date fields (or pick them from a calendar), check boolean options, type numbers into numeric fields and remarks into multi-line boxes.
Automatic time stamps and user names are not even visible.
I use to keep older versions of my forms so everybody can choose the preferred input form for the job.
For each virtual item they use, they see who has introduced it at which time, who else used that item today and how many items are left.
"My users" never have to think about storing any file in some file format. Every record is stored on a server when they move to another record unless the record is invalid.
"My users" update/insert/delete full records without bothering about references nor formattings.
When you pull the plug out of the wall the currently edited record is the maximum you lose.
When they want to print something, they hit a button which generates a spreadsheet from template with charts and formulas dumping the requested data into preformatted import range(s) and print areas.
The whole system works with a tiny amount of macro programming.

Or maybe the failure is the refusal to understand the purpose for
each part of a database. Tables and forms are for adding, modifying, and
deleting data. Queries and reports are for data output. If you want to
enter data, you use a table or form. If you want to manipulate data, you
use queries. (With Report Builder, you can also manipulate data in a
report.)
     In Calc spreadsheets, we give columns names; in Base tables, our
column names are called field names. In Calc spreadsheets, we have rows
of data; in Base tables, we call the rows of data, records. In Calc
spreadsheets, we format our columns; in Base tables, we select a field
type and field properties to format our fields (columns).
     The output of a query is a table: combination of rows and columns.
In this is much like a section of rows and columns of a spreadsheet
whose cells contained manipulated data.
     Calc formulas are based upon Algebra; so are the formulas that SQL
uses.
     I can clearly identify why most people don't use databases: they
don't think they can. Perhaps it is also fear of the unknown. Perhaps
they don't want to learn something new: "You can't teach an old dog new
tricks." And perhaps the biggest reason of all: the specialized language
that is used by many people who write about databases. Tuples and
Records are used instead of rows is one example. Columns are called
Attributes.
     I wonder how many people who use LO have read chapter 8 of the
Getting Started Gude, "Getting Started with Base"? I know that I have
very seldom seen a comment about its contents. (I wrote it and am
presently updating it. I'm also working on the Base Guide in its
entirety.)

Dan

Am 06.02.2012 19:22, Dan Lewis wrote:
        I wonder how many people who use LO have read chapter 8 of the

Getting Started Gude, "Getting Started with Base"? I know that I have
very seldom seen a comment about its contents. (I wrote it and am
presently updating it. I'm also working on the Base Guide in its
entirety.)

Dan

Sorry, that guide is too light weight. The matter is as abstract as a programming language. It takes an IT guy with some theoretical concepts and experience. Most parts of the helping Base tools are useless. They can not do what they pretend to be made for. Most importantly, the form wizard can not build forms. Most of the possible types of forms have to be drawn by hand and you need to know very well what you are going to do when you build up your own hierarchy in the form navigator.

A little bit of click experience with MS Access is not enough to design a relational database connected to Base just like some experience with the VBA macro recorder and code completion does not qualify anybody to program anything outside that specific context.

The problem is that too many new developers try to learn having their hands on the problems they are trying to solve right now. This is very inefficient.
In my opinion, the "mid level tutorial" by Mariano Casanova is the best guide to start learning about databases in general in the context of the Base component: http://openoffice.org/projects/documentation/downloads/directory/Base/Mid%20level%20Base%20tutorial. Most importantly, the term "macro" occurs only twice on 189 pages.

You are talking about me.
I and many many others have used the same methods of categorising and storing data in Excel, Lotus 123, Multiplan and various other spreadsheet applications for a great many years. It's nothing whatsoever to do with "dispensing with traditional thought processes" at all. It's all to do with EASE of manipulating the data and what you want eventually to do with that data.
In that respect there is absolutely NO DIFFERENCE between using Excel and Libre Office Calc - they do almost EXACTLY the same things so proprietary or open source has no relevance at all.
If I have specific needs to use a data base then I would use Access or Base. Spreadsheets are NOT databases in the strict sense of the term (although I do agree that in the latest versions some functions do overlap).
I suspect that you never had to do analyses using Analysis Paper and have come direct to computer-based solutions.
The point about spreadsheets is that the user can manipulate and store data in the way that THEY WANT TO and in the way that is most useful to them.
There is NO right way or wrong way to do this.
There seems to be a clique on this list that think the only right way is THEIR way.
WRONG and WRONG.

       Or maybe the failure is the refusal to understand the purpose for
each part of a database. Tables and forms are for adding, modifying, and
deleting data. Queries and reports are for data output. If you want to
enter data, you use a table or form. If you want to manipulate data, you
use queries. (With Report Builder, you can also manipulate data in a
report.)
      In Calc spreadsheets, we give columns names; in Base tables, our
column names are called field names. In Calc spreadsheets, we have rows
of data; in Base tables, we call the rows of data, records. In Calc
spreadsheets, we format our columns; in Base tables, we select a field
type and field properties to format our fields (columns).
      The output of a query is a table: combination of rows and columns.
In this is much like a section of rows and columns of a spreadsheet
whose cells contained manipulated data.
      Calc formulas are based upon Algebra; so are the formulas that SQL
uses.
      I can clearly identify why most people don't use databases: they
don't think they can. Perhaps it is also fear of the unknown. Perhaps
they don't want to learn something new: "You can't teach an old dog new
tricks." And perhaps the biggest reason of all: the specialized language
that is used by many people who write about databases. Tuples and
Records are used instead of rows is one example. Columns are called

I think part of the problem is that most people do not understand the basic difference between a spreadsheet and database. Spreadsheets are very good models of ledger books and basic data tables. Databases do not have a very good paper based analog; the closet is probably a library card catalog. However, databases are actually much better at relational analysis than a spreadsheet but one must learn SQL typically while with a spreadsheet you can start without much "programming" to use it. Also, databases take more initial work to set up than a spreadsheet.

A further problem is that MSO often does not include Access in all the bundles, such as Student/Teachers or some of the business bundles. This adds to the perception that the database is a "geek" thing not something that is for the general user. The thinking is that databases must be very hard or MS would have put in the bundle for more general users.

Personally, I have found databases to be more flexible than spreadsheets for many applications.

And it appears that you just explained why people will use
spreadsheets (incorrectly perhaps) instead of a database. You just
raised the bar too high for the average person. Why would anyone want to
learn something as abstract as a programming language before creating a
"simple" database?
     And the "mid-level tutorial" you mentioned contains this quote at
the beginning: "Step-by-step guide to producing fairly sophisticated
database applications with OpenOffice.org Base, from initial problem to
final product complete with
forms and reports."
     You obviously have a very in dept knowledges of database and can do
many things with them that others can not do. But that does not mean
others have to have a detailed knowledge of databases before they can
create them. Not everyone has a need for a "fairly sophisticated
database" to meet their needs.
     Yes, there are databases that need the DBMS to rely upon as much of
the SQL language as possible. (The latest Oracle program with its PL/SQL
might be what they need.) For help, you would be a good person to ask
advice from. And there are many databases that will run using Base with
its "infamous" 1.8 database engine. For some of these, the Getting
Started with Base is all the person needs. For others, the future Base
Guide will be sufficient.
     I think this thread came from a another thread in which a
spreadsheet was used as a database and the question involved changing
the names of the sheets of the spreadsheet. If a database were
constructed to do what this spreadsheet does, that would be time
consuming and require someone like yourself with a great deal of
knowledge of database theory. It would probably be very sophisticated.
     But the original question that began this thread seemed to broaden
the scope to include much simpler spreadsheets and simpler databases.
And that is why I answered as I did. It is also why I have written this.
The range of databases that can be created goes from the very simple to
the very complex.
     For some people, entering SQL statements in to the Execute SQL
Statement dialog is very easy to do. But doing this would be very
frustrating to to others because it is hard to tell whether they wrote
the write SQL or not. For them, using the dialogs of Base will provide
them with what they need.

--Dan

The problem appears to be when should a simple database be used in preference to a very messy spreadsheet. Spreadsheets are excellent for doing calculation intensive operations but not very good when dealing with scattered relational data. Databases are much better at handling data relationships with modest calculations.

For example one person company might initially find that using a spreadsheet to generate and track invoices fairly easy. Later it might become a problem to locate information in the spreadsheets but it would be much easier to generate the required data from a database. The question then becomes when does the spreadsheet fail as database. It is not question that always has an obvious answer.

Also, one should remember that many who do not use databases regularly fail to appreciate one major advantage; enter once and query and use as needed in many places and ways. Yes you can link to cells within a spreadsheet but if the data must be reused else where then you often cut & paste or worse yet retype the data in another spreadsheet.

Am 06.02.2012 20:45, Dan Lewis wrote:

  You obviously have a very in dept knowledges of database and can do
many things with them that others can not do. But that does not mean
others have to have a detailed knowledge of databases before they can
create them. Not everyone has a need for a "fairly sophisticated
database" to meet their needs.

No, I'm an entirely self-tought amateur. I know that I know almost nothing about databases. All I know is 100% of the Base features (while ignoring the report builder extension).
And yes, even a most simple database backend with one table requires knowledgable decisions on data types and that your single table surely describes the things to be stored.
You've got to know what the software tool expects you to do before you get anything back. At this point complete non-techies run in circles looking for the right trigger point where the software does the right thing for them. This is like holding the hammer and the nail close to the wall without hammering.
It is irrelevant how difficult it is to build up a database and its frontend tools. It's worth the effort if the result is easy enough for the next million records without further intervention.

But all this affects database design and Base input forms only. If you already have some relational database (or a csv file or a well formed spreadsheet list) then it is easy enough for anybody to connect a so called "Base document" with it, add some useful queries and reports in order to dump data into the layout and formatting capabilities of text and spreadsheet documents.
The data flow from the source into the documents is the main capability of the database capability. However, even the concept of a connection through an abstraction layer is beyond what todays users are able to digest, particularly when a ".odb Base document" comes into play. Many users seem to believe that the program converts their spreadsheet (or what else) into a database file format and that they can not edit their data anymore.
See recent thread "create mailmerge with spreadsheet librecalc without using libre base!". This misconception is very common and even some of the regular supporters on this list fail to put it straight.

"My users" never really have to struggle with that. When they complain about next serial letter's address pool, I dump and rearrange some data in a certain spreadsheet and add a query to the spreadsheet's connected database. The query collects the right columns and rows in a certain order of columns and rows with certain well known column labels from the improvised spreadsheet list. Then I call them back and tell them to pick up query "Invitation_03_April" from data source "Mixed Addresses".
Meanwhile the letter has been written and two of "my users" are familiar with the data source window, the placeholder fields in a Writer template and they know how to print serial letters. But none of them can handle more than a screenful of spreadsheet without panic.

      Or maybe the failure is the refusal to understand the purpose for
each part of a database. Tables and forms are for adding, modifying, and
deleting data. Queries and reports are for data output. If you want to
enter data, you use a table or form. If you want to manipulate data, you
use queries. (With Report Builder, you can also manipulate data in a
report.)

Shouldn't we as users contribute to encouraging other users to always
think: what is the most appropriate use for the job? How many
questions posted are the result of making a mistake to use a
spreadsheet when a database is more appropriate?

     I wonder how many people who use LO have read chapter 8 of the
Getting Started Gude, "Getting Started with Base"? I know that I have
very seldom seen a comment about its contents. (I wrote it and am
presently updating it. I'm also working on the Base Guide in its
entirety.)

Somebody mentioned tutorials about open office base; very good
information. Maybe the authors responsible for calc should add more
conceptual advice about whether certain tasks are best performed in
base, not calc.

You are talking about me.
I and many many others have used the same methods of categorising and
storing data in Excel, Lotus 123, Multiplan and various other
spreadsheet applications for a great many years. It's nothing whatsoever
to do with "dispensing with traditional thought processes" at all. It's
all to do with EASE of manipulating the data and what you want
eventually to do with that data.

I think the ability to dispense with concepts conceived when
paper-based accounting was prevalent is required. If decades of using
a fundamentally wrong method is applied to new tools (e.g. "I did this
in m$, therefore this process is the only process), the flexibility of
mind to achieve the same result via a new tool (e.g. relational
database queries) is never realised.

The point about spreadsheets is that the user can manipulate and store
data in the way that THEY WANT TO and in the way that is most useful to
them.
There is NO right way or wrong way to do this.

I think you contradicted yourself, like many m$ fans do: you solved a
problem using a process that was successful using m$ (I think it was a
spreadsheet macro). Personally it would be preferable to ask: "I want
to do X; what is the best tool?" Then knowledgeable users would make
their suggestions (use calc feature X, base feature Y, etc.) and you
(and others, don't forget!) would be able to make an informed choice
of which is most appropriate for each personal preference.

It does not seem apparent that you posted to LO mail-list to ask
whether there was a better/alternative method (e.g. using a different
tool such as base) to achieved your desired result: you asked for the
same process (use a spreadsheet macro) to be applied using calc.

Those m$ users who want LO to be a nothing more than a m$ clone
(instead of being a fundamentally better product, full stop) should
make a financial donation.

     And it appears that you just explained why people will use
spreadsheets (incorrectly perhaps) instead of a database. You just
raised the bar too high for the average person. Why would anyone want to
learn something as abstract as a programming language before creating a
"simple" database?

But surely learning the abstract concept of databases is useful for
the general user to help decide whether her/his problem is best solved
using a database/spreadsheet?

     I think this thread came from a another thread in which a
spreadsheet was used as a database and the question involved changing
the names of the sheets of the spreadsheet. If a database were
constructed to do what this spreadsheet does, that would be time
consuming and require someone like yourself with a great deal of
knowledge of database theory. It would probably be very sophisticated.

According to the responses, it seems that if a database structure was
chosen, the original question would not have been necessary?! It is
not known from the original poster _why_ the sheets needed to be
changed.

     But the original question that began this thread seemed to broaden
the scope to include much simpler spreadsheets and simpler databases.

That was deliberate: to gain knowledge from others to help me in the
future understand when databases and spreadsheets should be used most
appropriately.

     For some people, entering SQL statements in to the Execute SQL
Statement dialog is very easy to do. But doing this would be very
frustrating to to others because it is hard to tell whether they wrote
the write SQL or not. For them, using the dialogs of Base will provide
them with what they need.

I must admit, still trying to understand why base is required as a
"front end" to a database; shouldn't it be more important to learn SQL
first? As an analogy, it is easy to use LyX, but learning LaTeX (and
in turn, TeX I guess!) is beneficial to understanding better LyX.

You completely miss the point just like the others did.
LO is NOT some "wonder tool" that instantly produces a sea change in the way in which people use OFFICE applications. (Emphasis on Office - i.e. work). Yes LO IS an "MS Office clone" in many many ways. So is Lotus Symphony, so is Lotus Smartsuite. You could also say that MS Office is a "clone" of Lotus Smartsuite etc etc. All Office suits basically borrow and adapt ideas from each other, whether they are proprietary or not.
Office suites are designed so people can do a JOB in the best way that SUITS THEM, not necessarily to experiment, neither to bow down to the idea that there is a "best" way of doing the job, dictated by someone who is an evangelist in a different method of data manipulation and that has no insight into the particular user's needs or desires.
I don't know your background, but neither do you know mine. I have used almost every single spreadsheet application (with the exception of Quattro Pro) to an advanced level in reasonably large quoted organisations ever since Multiplan first came out on DOS, so I do NOT need you or your evangelistic buddies to tell ME that the way I am doing things is wrong.
I would also state that there is not an "MS Office" way of doing things nor an "LO" way of doing things either - the only way that is right is the way the USER decides they want to do things.

Jeeze. DOES IT MATTER? No of course it doesn't MATTER.

The problem is not the obvious database type applications but the smaller ones that are better in a database for technical reasons but are easier for most to set up in a spreadsheet. If one understands how database queries work and that they allow data to be selected to that is needed often a database is more flexible.

Most people are scared off by having to develop queries, particularly if they must use SQL. It is too much like programming to them and that tends to scare them; seemingly arcane rules and syntax that does not appear very flexible or easy to use.

I do not know if any guide will do any good most will probably use a spreadsheet or even worse a text table when even a simple database will be better suited to the problem.

Two related issues are a problem: spreadsheets are relatively easy to set up and enter data with some inflexibility while databases harder to set up and without the right front end tools some tricky to properly enter data but are much more flexible.

I have used both as databases and much prefer the flexibility of true databases.

As a consequence: It would help most, if the database set up barrier could be
lowered.

We therefore should ask (and try to learn) what makes database creation so
difficult, and how could we contribute to ease the learing curve?

Or directly: how can we make database creation easy and a low-threshold task?

(Just a thought)

Nino
who has never got familiar with databases "because of their complexity" (at
least from an innocent user's point of view). OTOH, I've never seen really
good tools for database creation (and change btw). So maybe, we could add to
database usability by creating Really Good Tools(TM) for creating, changing,
filling and quering databases.

Hello everybody – and especially Dan Lewis.
I happened to read this thread about a database vs. spreadsheet. First of all I will thank Dan for him mentioning the “Getting Started Guide/Getting Started with Base/chapter 8"! I cannot understand how I could have missed it when I for some time have been looking for info about Base (LibreO & OOo).
It is an excellent briefing into how to build up a database using Base.

I think that you LibreO-guys should take a look at the layout where and how you introduce Base, how to get the “Getting started with Base” more visible – sort of nearer her/him who (like me) is looking for a really handy database tool.
A week ago I asked your community “how get LibreO-Base working”. That is now OK (I found the JRE-file ”jxpiinstall.exe” and installed it before installing LO). Your Andreas then said that LO-Base is a front-end application, not a database. I’m not arguing but for an ordinary database consumer like me the front-end and the background database engine together mean one total database.
After reading Dan’s “Chapter 8” I realized that LO-Base is The solution when a database of medium size/complexity is needed, either using the embedded HSQLDB or connecting to an external “real” database engine (e.g. MySQL or SQLite). Very much like MSAccess but free – no costs.
After working some 10 years ago quite a lot with MSAccess I am a great fan of databases. Whenever I can I recommend to people not to limit to spreadsheets but build relational databases especially for their organizations but also at home.
One of the great benefits compared with a spreadsheet is that you write input data only once thus minimizing both writing and writing errors - and then using the same data multiple times in different kind of reports. Another is that when a nspreadsheet is flat a (relational) database can handle data multi-dimensional which gives you the possibility to mix/combine data and get the most interesting analyzing results.
All the best
Pertti Rönnberg

Two related issues are a problem: spreadsheets are relatively easy to
set up and enter data with some inflexibility while databases harder to
set up and without the right front end tools some tricky to properly
enter data but are much more flexible.

As a consequence: It would help most, if the database set up barrier could be
lowered.

We therefore should ask (and try to learn) what makes database creation so
difficult, and how could we contribute to ease the learing curve?

I think the biggest barrier is not the tools to set up a database but the design process of setting up a database. Databases generally are design first then use while spreadsheets tend to be more design as you go. Databases require one to think more clearly about the data and how to model relationships within the data before one actually creates any tables, forms, etc. What is it I am trying to do with the data and how is the data best represented/stored/manipulated are questions that are often addressed in database design. The thinking about the data model is often a good exercise but one that can be skipped when using a spreadsheet as a database.

One problem is that people often want to jump in and start doing something when a little thinking should be done or how it should be done. Spreadsheets are easier to jump in and start with so often the planning is not done. Thus often spreadsheet users have databases that are nightmares because of poor data design. These are problems that are less common with databases because databases force more thought on data modeling.

IMHO, the real problem is that good data modeling is not usually taught to users and unless they start working with databases they never learn data modeling. Most books on using databases will, if briefly, discuss the concepts of data modeling and database design. I do not remember any book on a spreadsheet discussing it or even mentioning it.

Or directly: how can we make database creation easy and a low-threshold task?

I think some of the tools make the actual creation of a database reasonably easy though data importing is often a pain.

>
>> Two related issues are a problem: spreadsheets are relatively easy to
>> set up and enter data with some inflexibility while databases harder to
>> set up and without the right front end tools some tricky to properly
>> enter data but are much more flexible.
> As a consequence: It would help most, if the database set up barrier could be
> lowered.
>
> We therefore should ask (and try to learn) what makes database creation so
> difficult, and how could we contribute to ease the learing curve?
I think the biggest barrier is not the tools to set up a database but
the design process of setting up a database. Databases generally are
design first then use while spreadsheets tend to be more design as you
go. Databases require one to think more clearly about the data and how
to model relationships within the data before one actually creates any
tables, forms, etc. What is it I am trying to do with the data and how
is the data best represented/stored/manipulated are questions that are
often addressed in database design. The thinking about the data model is
often a good exercise but one that can be skipped when using a
spreadsheet as a database.

One problem is that people often want to jump in and start doing
something when a little thinking should be done or how it should be
done. Spreadsheets are easier to jump in and start with so often the
planning is not done. Thus often spreadsheet users have databases that
are nightmares because of poor data design. These are problems that are
less common with databases because databases force more thought on data
modeling.

IMHO, the real problem is that good data modeling is not usually taught
to users and unless they start working with databases they never learn
data modeling. Most books on using databases will, if briefly, discuss
the concepts of data modeling and database design. I do not remember any
book on a spreadsheet discussing it or even mentioning it.
>
> Or directly: how can we make database creation easy and a low-threshold task?
I think some of the tools make the actual creation of a database
reasonably easy though data importing is often a pain.

     This is a link to the Base Guide Ch 2, Planning/designing your
database. It is a draft document for OOo. The LO version is being
written right now. At least it is a beginning point when planning and
designing a database.
http://www.odfauthors.org/openoffice.org/english/userguide3/db3/dbg3_draft/planning-designing-your-database/view

--Dan