Need Calc help

Hi,

Hopefully someone can help me out.

I maintain in Calc the regional section of a National membership file of Alumni. There are about 650 members in my regional Calc file.

Each record is keyed to an unique College number and consists of about 26 fields of data including First/Last names, address, telephone number, etc. On a monthly basis, the national office sends me a 'pull' in Excel format from their national database. Since changes, additions, and deletions are not annotated by National, I have to manually compare each record in the latest listing with the appropriate record in my files. As you can imagine, doing this manually takes a long time when there are about 650 records involved.

Assuming that the Column headers are the same in both files (not necessarily in the same order), is there a way that I can compare the two files to create a listing by College Number of fields that do not agree, and records that have been deleted or added?

I am computer savvy but failed programming miserably so I need the input of the many smart people on this list. I do follow instructions well so could be led by the nose to create what is required.

Thank you for your patience and help

Preston

Hi

  There are two programs that you can use for check the diferences
between both files:

1) Meld
2) Kdiff3

  Both programs are in the Ubuntu repository.

Regards,

Jorge Rodríguez

Am 06.08.2011 00:50, Preston Smith wrote:

Hi,

Hopefully someone can help me out.

I maintain in Calc the regional section of a National membership file of
Alumni. There are about 650 members in my regional Calc file.

Each record is keyed to an unique College number and consists of about

Calc is a calculator which may be misused as a database to some extend.

Your terminology describes a relational database. Deriving lists from other lists referring to unique records with keys and indices has always been the most natural thing to do in relational databases.

Hi Andreas,

Thanks for your suggestion

I know nothing about databases however I just successfully imported an xls file into LO's Base. I assume the next step is to import the second xls file to create a second table.

Now for the difficult bit - how do I compare the record associated with a College Number in one file against the record associated with the same College number in the second file (should I be saying Table?)? And how do I display the data that differs in the two entries for the same College Number?

Any help you can offer will be greatly appreciated.

Thank you,

Preston

Thank you Jorge,

My first quick glance suggests that these programmes may not work well with xls/ods files - I will look more closely at them

Regards,
Preston

Preston

Hi Andreas,

Thanks for your suggestion

I know nothing about databases however I just successfully imported an
xls file into LO's Base. I assume the next step is to import the second
xls file to create a second table.

Now for the difficult bit - how do I compare the record associated with
a College Number in one file against the record associated with the same
College number in the second file (should I be saying Table?)? And how
do I display the data that differs in the two entries for the same
College Number?

Any help you can offer will be greatly appreciated.

Thank you,

Preston

> Am 06.08.2011 00:50, Preston Smith wrote:
>> Hi,
>>
>> Hopefully someone can help me out.
>>
>> I maintain in Calc the regional section of a National membership file of
>> Alumni. There are about 650 members in my regional Calc file.
>>
>> Each record is keyed to an unique College number and consists of about
>
> Calc is a calculator which may be misused as a database to some extend.
>
> Your terminology describes a relational database. Deriving lists from
> other lists referring to unique records with keys and indices has
> always been the most natural thing to do in relational databases.
>
>

Retrieving information from a database is called querying the database.
Often it is done using Structured Query Language (SQL).

Once you have your tables in the database you can select INSERT >> QUERY
(Design View). This allows you to select the tables, columns from each
table, and the selection criteria for Base to generate the actual
database query. The selection criteria can be something like Table1.ID
NOT IN Table2.ID, which I think is the selection criteria you need. You
can have more than one criteria and the other criteria do not need to
reference your others. If you wanted to include only those in California
add Table1.State = 'CA' or Table2.State = 'CA'. If any of the tables
contain columns with the same name you use dot notation like I have been
to properly identify the column you want. I used it in my examples for
completeness. If the column name only occurs in one table you can just
use the column name alone. Your query can include as little or as much
data as you want to be included in the results table. Base allows to
save the query for reuse. If you do not know SQL this is the best way to
query the database.

If you know SQL you can write your own queries by using QUERY (SQL
View). Personally, I work with SQL daily so I am more comfortable with
this option.

If you have any questions, do hesitate to ask. Databases are very
powerful tools but the first time using one can be bit daunting to use
at first, at least it was for me.

How do the data get into spreadsheets? Can't you simply avoid all spreadsheets? Spreadsheets are the worst file format to store interrelated table data. Even plain text files can be more suitable for data processing as suggested by Jorge.
Databases are development tools which save you and you co-workers a lot of time over the years. Just like a programming language, a database won't do anything for you unless you do something with it proactively. You won't get anywhere by mere trial and error. It is not as simple as copying raw data from one file into another.
The way how you describe your problem is entirely non-technical. We do not see any of your data you are going to process nor can we really understand your aim.

I am computer savvy but failed programming miserably so I need the input of the many smart people on this list

In other words, you want us to do your work?

Hello Andreas,

In no way do I want you to do my work and I am sorry if that is the way you feel.

I am prepared to learn how to work with data bases and it is with that intent that I am asking questions. If you feel that I want you to do my work, then please do not respond to my queries.

The format and content of the files in question (which I described in my first message) is beyond my control. In my first message, I explained what I wanted to do with the data provided me.

Thanks for taking the time to respond.

Preston

Good morning Jay,

Thank you for your response and guidance

I started working through a Base tutorial that I found at http://sheepdogguides.com/fdb/fdb1main.htm and have slowly started to grasp some of the fundamentals of database. I am finding the whole idea of data manipulation via a database most intriguing. I will keep working at this in the hopes that I can create what I need

Again thanks for your help. Your short tutorial on field notation reinforced what I learned from the tutorial.

All of the best,
Preston

What is the best way to use Spreadsheets sef? Can someone explain (with examples). It seems it is a becoming a redundant tool for dBase pros. When is it right to use it?

The format and content of the files in question (which I described in my first message) is beyond my control.

Let me retry your request in this way:

0) Technical prerequisites: Running LibreOffice 3.3.3 under WinXP I have a spreadsheet with 2 tables imported from text files which I receive on a regular basis from other departments of our organization.

1) A list of persons with an ID number, name, forname, birthday, address info and more:
0 Smith Jennifer 1967-07-09 ...
1 Smith John 1988-07-29 ...
2 Doe Farina 1999-11-12 ...

2) A membership list with person ID, entry date, department
1 2003-12-03 Helsinki
2 2007-01-01 Stockholm
0 2010-11-11 London
2 2009-01-31 Berlin

3)Now I want to merge both lists so they look like

Smith John 1988-07-29 2003-12-03 Helsinki ...
Doe Farina 1999-11-12 2007-01-01 Stockholm ...
Smith Jennifer 1967-07-09 2010-11-11 London ...
Doe Farina 1999-11-12 2009-01-31 Berlin ...

where the membership's person-ID is merged with the membership information.

This entirely technical information about what you have and where you want to go could serve as a platform where anybody on this list could be involved with some hints, tips, suggestions how to process these concrete example data in a spreadsheet, in a database, in a scripting language. Example data can be copied into a spreadsheet very easily and from there into a database.

Preston

> Preston
>
>
>> Hi Andreas,
>>
>> Thanks for your suggestion
>>
>> I know nothing about databases however I just successfully imported an
>> xls file into LO's Base. I assume the next step is to import the second
>> xls file to create a second table.
>>
>> Now for the difficult bit - how do I compare the record associated with
>> a College Number in one file against the record associated with the same
>> College number in the second file (should I be saying Table?)? And how
>> do I display the data that differs in the two entries for the same
>> College Number?
>>
>> Any help you can offer will be greatly appreciated.
>>
>> Thank you,
>>
>> Preston
>>
>>> Am 06.08.2011 00:50, Preston Smith wrote:
>>>> Hi,
>>>>
>>>> Hopefully someone can help me out.
>>>>
>>>> I maintain in Calc the regional section of a National membership file of
>>>> Alumni. There are about 650 members in my regional Calc file.
>>>>
>>>> Each record is keyed to an unique College number and consists of about
>>> Calc is a calculator which may be misused as a database to some extend.
>>>
>>> Your terminology describes a relational database. Deriving lists from
>>> other lists referring to unique records with keys and indices has
>>> always been the most natural thing to do in relational databases.
>>>
>>>
> Retrieving information from a database is called querying the database.
> Often it is done using Structured Query Language (SQL).
>
> Once you have your tables in the database you can select INSERT>> QUERY
> (Design View). This allows you to select the tables, columns from each
> table, and the selection criteria for Base to generate the actual
> database query. The selection criteria can be something like Table1.ID
> NOT IN Table2.ID, which I think is the selection criteria you need. You
> can have more than one criteria and the other criteria do not need to
> reference your others. If you wanted to include only those in California
> add Table1.State = 'CA' or Table2.State = 'CA'. If any of the tables
> contain columns with the same name you use dot notation like I have been
> to properly identify the column you want. I used it in my examples for
> completeness. If the column name only occurs in one table you can just
> use the column name alone. Your query can include as little or as much
> data as you want to be included in the results table. Base allows to
> save the query for reuse. If you do not know SQL this is the best way to
> query the database.
>
> If you know SQL you can write your own queries by using QUERY (SQL
> View). Personally, I work with SQL daily so I am more comfortable with
> this option.
>
> If you have any questions, do hesitate to ask. Databases are very
> powerful tools but the first time using one can be bit daunting to use
> at first, at least it was for me.
>
Good morning Jay,

Thank you for your response and guidance

I started working through a Base tutorial that I found at
http://sheepdogguides.com/fdb/fdb1main.htm and have slowly started to
grasp some of the fundamentals of database. I am finding the whole idea
of data manipulation via a database most intriguing. I will keep working
at this in the hopes that I can create what I need

Again thanks for your help. Your short tutorial on field notation
reinforced what I learned from the tutorial.

All of the best,
Preston

If you hit a snag, just asks a question and we will try to point you the
right direction.

For me this is the ultimate tutorial on databases in the Base context:
http://openoffice.org/projects/documentation/downloads/directory/Base/Mid%20level%20Base%20tutorial

There are small tutorials and lots of example documents in the community forum:
http://user.services.openoffice.org/en/forum/viewforum.php?f=74

+1.

Andreas, very well put. A well structured question makes it very easy to conceptualise the problem and propose solutions.

Hello Andreas,

Thank you for your message and your patience.

I am attaching two files - the first one outlines what I am hoping to accomplish and the second one has three tables in a worksheet that provides a shortened typical entry of the files involved and what I hope to see as a result

Hope this provides enough info so that someone can point me in the right direction

Thanks again,
Preston

Am 08.08.2011 19:18, Betti Ann and Preston Smith wrote:

Hello Andreas,

Thank you for your message and your patience.

I am attaching two files - the first one outlines what I am hoping to
accomplish and the second one has three tables in a worksheet that
provides a shortened typical entry of the files involved and what I hope
to see as a result

Mailing lists do not support attachments. Upload your files somewhere or register at the communitiy forum

http://user.services.openoffice.org/en/forum/index.php

where you can upload files of a size up to 100kB each.

I have uploaded the information to the Calc Section of the OpenOffice Forum - it is titled 'Compare Two Files'

Thanks again for your assistance,
Preston

Am 09.08.2011 00:57, Betti Ann and Preston Smith wrote:

I have uploaded the information to the Calc Section of the OpenOffice
Forum - it is titled 'Compare Two Files'

Thanks again for your assistance,
Preston

Thank you for posting the example data. My rough estimate was not that far away from your actual requirement (in fact everybody tries the same trivial thing, assuming that a spreadsheet must be the right tool which it isn't).
Since we are going to automate the processing of data records using one software tool or the other, it is vital to have the exact structure of the incoming data, particularly when the structure is beyond your control.
When you are going to engage an architect for a reconstruction of your familiy home, an album of polaroids and some verbal description won't be enough for a rough estimate on possibilities and costs. You need detailed plans, measurements, a chronological history of the house and the materials involved.
This is about engineering, the ugly little sister of magic.

Am 09.08.2011 00:57, Betti Ann and Preston Smith wrote:

I have uploaded the information to the Calc Section of the OpenOffice
Forum - it is titled 'Compare Two Files'

Thanks again for your assistance,
Preston

There is a Calc Section of the forum? Where?

Thank you for posting the example data. My rough estimate was not that
far away from your actual requirement (in fact everybody tries the same
trivial thing, assuming that a spreadsheet must be the right tool which
it isn't).

I'm still looking forward to a good example of a scenario best suited
for Calc (Spreadsheets). Since a spreadsheet is about calculative
tables and database programs can achieve most things spreadsheets are
used for (these days) ... shouldn't we be pushing for deprecating the
likes of Calc so the vast majority can face the real tools?

Hi Onyeibo;

> Am 09.08.2011 00:57, Betti Ann and Preston Smith wrote:
>> I have uploaded the information to the Calc Section of the OpenOffice
>> Forum - it is titled 'Compare Two Files'
>>
>> Thanks again for your assistance,
>> Preston

There is a Calc Section of the forum? Where?

    There is no LO official forum but there are several OpenOffice and
LO forums you can use. OpenOffice being the older project has more
extensive documentation and I believe some forums.

>
> Thank you for posting the example data. My rough estimate was not that
> far away from your actual requirement (in fact everybody tries the same
> trivial thing, assuming that a spreadsheet must be the right tool which
> it isn't).

I'm still looking forward to a good example of a scenario best suited
for Calc (Spreadsheets). Since a spreadsheet is about calculative
tables and database programs can achieve most things spreadsheets are
used for (these days) ... shouldn't we be pushing for deprecating the
likes of Calc so the vast majority can face the real tools?

Calc or any spreadsheet are best used when you are primarily concerned
with using data for calculations beyond very basic arithmetic and
statistical summaries. They are very good for generating "what if"
scenarios and similar analyses as well as producing graphs. I have
extensively used spreadsheets for many calculations. If you do not need
relational analysis of the data they can be used as a poor man's
database. The most basic database just stores data in some logical
manner and a spreadsheet can do this.

Databases are best suited to collecting, storing, and organizing data in
logical groups as the users need. The allow users to explore possible
relationships within the data that are always obvious. The data
searching tools are much more powerful in a database than in a
spreadsheet, particularly if you know SQL or similar tools. However a
database can not be used as a poor man's spreadsheet. Strictly speaking
most modern databases are relational databases meaning they are designed
to sift through datasets that are somehow related.

An example of the use of both, I would use a spreadsheet to calculate my
project estimates for a proposal and would track the vendor bids for
each proposal and each proposal in a database. If I set the relationship
between correctly I could be entering the data once.