Filter algorithm question

Hi,
      This may sound like some ones homework but its really an algorithm
question I have.

Imagine you had a square plot of land. You could assign this to a grid and
create a two dimensional graph with an X and Y-axis. Now imagine an inner
square plot that's a wildlife sanctuary. Now you fly a camera drone over the
outer plot and photograph all the pelicans nesting there. So your covering
the outer plot and the inner sanctuary plot. Now you digitize the photo,
enabling you to turn the pelicans into points on the X & Y graph. Then you
take these X & Y points and either put them in a serial file or a
spreadsheet.

Now you want to filter the data, going through the X, Y pairs, and finding
only those within the boundaries of the inner sanctuary plot. If you could
do that you could count the pelicans, now represented by X, Y coordinates in
your database, within the sanctuary.

Say the outer plot is X = 20 and Y = 20. And the inner plot goes from 4 to 7
on the X axis and from 6 to 9 on the Y axis.

If you were reading your data into a BASIC program, you could use code like:

If (X > = 4 AND X < = 7) AND (Y > = 6 AND Y < = 9) THEN Count = Count + 1

This would look in the inner sanctuary boundaries, and if the data fell
within its boundaries, you'd increment a counter to count the pelicans.

This is great if the inner plot is rectangular. My question is, what if its
irregular like a kidney shaped swimming pool? What would the filter code
look like?

I'm not asking for you to solve the problem here for me. I'm asking for any
pointers to stuff on the web that may be applicable to this question. There
must be a name for this problem but I really don't know what it is or where
to start looking.

Any ideas appreciated.

This should be problem that has occurred before, how to map and count instead a non-geometrical shape. I am not sure if something like Google Maps or Mapquest may have an API for something like this.

Depending on your programming skills, you might solve using a mapping protocol/API using data file as the input.

Jay

One thought might be to reduce the size of the square until the squares that overlap are not material. If you are looking for elephants in an irregular spaced area, and the size of the squares are less than the size of an elephant, you're home free. Of course as the size of the square gets smaller, there is more memory involved that more cpu processing time required, but that's what computers are for.

John

OK, the algorithm I'm looking for is called "Flood Fill" and has quite a few
examples on the web in different programming languages. Its sort of a pixel
color search function. Probably could be adapted to the task I proposed.

Thanks for the replies,
                                          I like the superimposition idea.
If I had a known irregular shape I could map it onto the existing grid. A
brute force approach would be to just make a list of all the X-Y coordinates
in the irregular area, then just search through the entire grid and see if
any coordinates show up in the list, and count them if they do. If the grid
spacing was coarse enough I suppose I could do this on graph paper.
                                          Creating this kind of pre-defined
list with a grid that had a lot of coordinates would be tougher as it would
need to be done programmatically. I think a drawing program where you have a
grid and mark coordinates perhaps as the mouse cursor goes over them could
generate the boundry coordinates of the irregular inner area. The task then
would be to somehow programmatically create a list of all the coordinates
within that area and write them to a list. I've noticed the MS Paint app has
a "fill" function to color in some boundried figure you've drawn with the
mouse. So maybe theres an algorithm out there for this somewhere.

If I had a known irregular shape I could map it onto the existing grid. A brute force approach would be to just make a list of all the X-Y coordinates in the irregular area, then just search through the entire grid and see if any coordinates show up in the list, and count them if they do.

I think you are making work for yourself by suggesting doing this backwards. Don't create a list of cells in the shape and look to see if each contains a bird. Instead, use the defined boundary of the shape to determine directly whether each potential bird is in the shape. You need only one list - of birds. You don't need a list of cells that appear in the shape, just some way of determining whether any bird is inside it. Once you have the edges of the shape defined in some way, constructing a COUNTIF() condition to apply to the list of birds will give you what you need. Kidneys don't usually have holes (mine don't; I trust yours don't), so you probably need only a list of (two) limiting Y co-ordinates for each X value - or vice versa if this is easier.

Creating this kind of pre-defined list with a grid that had a lot of coordinates would be tougher as it would need to be done programmatically.

But you don't need this list: all you need is those co-ordinates (of the shape) in order to determine whether any bird in inside. How you get this depends on how your shape is defined (which you have not explained).

the algorithm I'm looking for is called "Flood Fill" ...

I really don't think it is - if your problem is as originally explained. If you want to colour in your imaginary diagram, by all mean use this. But once you have the definition of the shape (as you would need if you wanted to use flood fill), you can easily determine the status of each bird directly.

Brian Barker

I don't understand the COUNTIF() condition which is why I was looking at the
Flood Fill function to search within the sanctuary boundry. If I had a list
of X & Y coordinates defining the boundry of the sanctuary I suppose I could
search from left to right along the X-axis, then drop down one Y-axis
increment searching the inner area and looking for inner coordinates that
represented birds. I think I may be talking BASIC and you may be talking
LibreOffice.

Whether I filter the X & Y coordinates by reading a serial file in some
flavor of BASIC or process them in a spreadsheet like LibreOffice Calc is
irrelevant to me, its the algorithm to do it that I'm looking for. However I
understand this is a LibreOffice forum so I understand if that's where your
coming from. So I probably need to look up the COUNTIF() function.

It may be that you are looking for Z axis data at three levels of Z.

0 = All land
1 = All perimeter enclosed data
2 = All bird locations

If Z data = 0 + 1 + 2 = 3 (bird in perimeter) bird in perimeter counter +1
If Z data = 0 + 2 + 0 = 2 (bird outside perimeter) bird outside perimeter counter +1

If location of bird Z data = 2 read x,y for position on map
If location of bird Z data = 3 read x,y for position on map

Use GPS co-ordinates under overlay of image.

0 level is the number of defined points X, Y. Resolution dependent on GPS resolution.

Finding the edge alone is more complex because you have to test for X+1, X-1, Y+1 and Y-1. If any of them are 0 in the Z axis, it is a perimeter edge.

It should be do-able in Calc. But I'm sure there are better ways.

Hope this helps.

I assumed (for obvious reasons!) that you would be using a spreadsheet, but I think the question of whether you use a program or a spreadsheet to solve this problem is a red herring.

You have a list of bird locations. You are proposing to construct a separate list of cells that are located within your shape. Then it seems that you are going to step through your long list of cells and for each cell run through the entire list of bird locations to see how many birds are in that cell. That is surely a very inefficient way of doing what you ask? You would need something like this only if you needed to know separately how many bird locations were positioned in *each* cell.

But your declared problem is merely counting the single total number of bird locations within your shape. To do this, all you need to do is to create a list of the Y values of the shape boundary for each X coordinate in your grid (or vice versa). Provided that the shape is not re-entrant in at least one direction, you should be able to have just two values for each row (or column). So this list has only twice-N values instead of the N-squared values in yours. Now you step through the *birds* - not the cells - and for each bird you first determine the column (or row) in which it occurs and then simply whether it is within the limits of the shape in that row in order to decide whether to count it. That's far smaller data and far fewer calculations - and no need for the "fill" algorithm you mention.

Brian Barker

Correction

If I create a list of Y-values for the shape boundary for each X-coordinate
in the grid I end up with:

Y-value of the shape boundary Corresponding X-coordinate of the
grid
9.10,11,12,13,14 7
8,15 8
7,16 9
6,17 10
6,17 11
6,11,12,17 12
7,10,13,16 13
8,9,14,15 14

If I now look at my list of bird coordinates and take one example of a birds
X-coordinate being 12, its within the ranges of the shape boundary Y-values
above, but if I check its position (at X=12, Y=19) I can see its not within
the boundary. So I'm still missing something.

If I create a list of Y-values for the shape boundary for each X-coordinate in the grid I end up with:

Y-value of the shape boundary Corresponding X-coordinate of the grid
9,10,11,12,13,14 7

Does this mean that the shape is re-entrant and that 9 to 10, 11 to 12, and 13 to 14 are inside? Well, that's the same as saying that 9 to 14 is inside, and you need better resolution of your grid if you are to treat the shape appropriately. Note that it may be the case that if you chose, arbitrarily, to list the boundary X-values for each Y-value instead, it's possible that your shape might not be re-entrant and the shape could be expressed more simply.

...
6,11,12,17 12
...

If I now look at my list of bird coordinates and take one example of a birds X-coordinate being 12, its within the ranges of the shape boundary Y-values above, but if I check its position (at X=12, Y=19) I can see its not within the boundary. So I'm still missing something.

Again, if this means that 6 to 11 and 12 to 17 are within the shape, then that is the same as 6 to 17 and you need better resolution. But in any case, a Y-value of 19 is clearly outside the 6 to 17 range for X=12, which coincides with your claim that this bird is outside the shape. No problem.

Are you perhaps confusing X and Y?

Brian Barker

If you picture a kidney shaped swimming pool or a lima bean, its boundary is
not a circle, but a circle with one side pushed in. In the X & Y coordinates
I wrote earlier the pushed in area are the coordinates:
X Y
13,10
12,11
12,12
13,13

Maybe this is the "re-entrant" portion you mentioned. Describing problems
like this verbally is difficult. I don't suppose there is a way to post an
image of my graph here?

More efficient is good, but simpler might be better here.