SitePoint Sponsor

User Tag List

Results 1 to 6 of 6
  1. #1
    SitePoint Member
    Join Date
    Jan 2005
    Location
    redbourn
    Posts
    3
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Question Extracting overlapping data fields in MS Access.

    I have an an Access table which contains fields for min and max values of a variable. I need to find all records in which the data range falls either completely or partially in the range covered by any other record and print out the start and end values of the overlapping range.

    I actually need to do it for the data ranges of four variables but one will do to start with!

    I know it can be done in code but is there an SQL solution?

    Any help would be greatfully appreciated.

  2. #2
    SitePoint Addict danfran's Avatar
    Join Date
    Jan 2005
    Location
    New York City
    Posts
    244
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Kav,

    If I interpret your question correctly, you have this sort of situation..


    Code:
    1) xxxxxxxx
    2)          xxxxxxxxxxxxxx
    3)                         xxxxxxxxxxxxxxxxxx
    4)     xxxxxxxx
    5)           xxxx
                 ^m ^m
                  i  a
                  n  x
    Where row 1 partially overlaps with row 4.
    Where row 4 partially overlaps with rows 2 and 5
    Where row 5 completely overlaps with row 2 and partially with 4
    And where row 3 is independent.

    What you need to do is write 2 or 3 algorithms. One for partial overlaps and one for complete overlaps. -Maybe one for independent overlaps, too. There's definitely no way to get your answer(s) in a single statement without the use of algorithms. That the number of "answers" (records) returned by your query change with the min and max values indicates that you must get your answer iteratively.

    I think you'll have to go record-by-record, making your comparisons based on each row's set of values. Note that I mentioned rows 5 and 4 twice, above. Your iterations are going to return redundant answers, too. In fact, I'm not even sure how useful a set of results like this would be, but that's your problem!

    Maybe, you could approach the problem from a different angle. -Divide the graph above (your min & max values) into single values (vertical lines). For example, say that each "x" is equivalent to 1. Then, it might be easier to count the records that contain "1" (min is < 1 and max > 1). Repeat for 2. Then, 3, 4, 5... until you get to the highest max. Storing THAT data would allow you to find the records that "cross" certain numbers most often. Think of it as putting knifes through your data-set and seeing which knives cut the most records.

    Dan

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    finding a row that overlaps with any other row implies a self-join

    if row M overlaps with row N, then row N overlaps with row M

    therefore in the self-join, this fact will come up twice, so it is safe to filter out one of the two occurrences simply by specifying that the start and/or end of the M row will be less than the start and/or end of the N row

    in danfran's example, row 4 overlaps with row 5 and row 5 overlaps with row 4, but since row 4's start is earlier, we'd report this fact only for row 4 overlapping row 5

    with me so far?

    okay, now before we tackle the sql, may i ask if it's possible for the same start/end values to exist more than once? in other words, if danfran's example had a row 6 that had the exact same start/end as row 3?

    finally, please give the exact names of the table and columns and a few sample rows of data so that we can actually test the sql that i will write for you
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SitePoint Member
    Join Date
    Jan 2005
    Location
    redbourn
    Posts
    3
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Danfran, Rudy

    Thanks. Thats exactly what I am looking for. There is no constraint on the values, so yes any given value can be repeated. I suppose the question is what can I see, looking down from row 1 to row 5, in any of six dimensions. I do have a rough solution which addresses six different sets of parameters but it is extremely inelegant, long, and slow.

    Sample data in tblWorking

    ID RFMin RFMax
    10 400 500
    20 350 470
    30 300 320
    40 460 540
    50 310 360

  5. #5
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    my comment about if X overlaps with Y then Y overlaps with X was intended to clarify why the following self-join uses on t1.id < t2.id
    Code:
    select t1.id
         , t1.RFMin
         , t1.RFMax
         , t2.id
         , t2.RFMin
         , t2.RFMax
      from tblWorking as t1
    inner
      join tblWorking as t2
        on t1.id < t2.id
    order
        by t1.id
         , t2.id
    the above query is the basic self-join which returns everything, as follows --

    10 400 500 20 350 470
    10 400 500 30 300 320
    10 400 500 40 460 540
    10 400 500 50 310 360
    20 350 470 30 300 320
    20 350 470 40 460 540
    20 350 470 50 310 360
    30 300 320 40 460 540
    30 300 320 50 310 360
    40 460 540 50 310 360

    now we simply throw away the ones that don't overlap

    without a long winded explanation of ANDs and ORs and NOTs, let me just say that the easiest way to find the ones that do overlap is to throw away the ones that don't!

    and the ones that don't overlap are where both t1 values are either lower than the first t2 value or greater than the second t2 value
    Code:
    select t1.id
         , t1.RFMin
         , t1.RFMax
         , t2.id
         , t2.RFMin
         , t2.RFMax
      from tblWorking as t1
    inner
      join tblWorking as t2
        on t1.id < t2.id
     where not
           ( 
           t1.RFMin < t2.RFMin
       and t1.RFMax < t2.RFMin
        or t1.RFMin > t2.RFMax
       and t1.RFMax > t2.RFMax   
           )    
    order
        by t1.id
         , t2.id
    10 400 500 20 350 470
    10 400 500 40 460 540
    20 350 470 40 460 540
    20 350 470 50 310 360
    30 300 320 50 310 360

    sweet, eh?

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #6
    SitePoint Member
    Join Date
    Jan 2005
    Location
    redbourn
    Posts
    3
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks a lot. It just goes to show that it's worth finding an expert. The VBA code that I have been trying to sort out was over 300 lines long just to do this bit.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •