Comparing a range of IP addresses

Hi,

I have setup a few ‘test’ tables to (i) store IP range blocks and (ii) firewall log information. Using this SQL query select:


 $query = "SELECT * FROM firewall_logs where log_source_ip >= '218.214.1.3' and log_source_ip <= '218.214.128.128'";
  

there were 3 records returned as follows:

218.214.128.123
218.214.128.126
218.214.128.128

but if the query is:


  $query = "SELECT * FROM firewall_logs where log_source_ip >= '218.214.2.3' and log_source_ip <= '218.214.128.128'";
   

no records are returned. :slight_smile:

IP address 218.214.2.3 is lower than 218.214.128.128 , but of course, as the field is defined as VARCHAR, then positionally, “.2” is not lower than “.1”, hence no records (I assume this is how MySQL handles this data type).

What can I do to fix this ? Should I store the IP addresses with leading zeros, or is there some other method of doing this ? I can’t see that changing the data type would help, the IP must have the periods, so it can’t be anything else but a ‘char’ type.

Thanks,

Peter

yes, store it broken up as 4 tinyints

Hi,

Thanks, that is a good idea. I tried “googling” for quite a while, to find out how to use the 4 seperate fields in queries and how to ‘format’ them (patterns ?). Then I stumbled upon 2 MySQL functions from http://www.mysql.com/doc/en/Miscellaneous_functions.html

INET_ATON(expr)
Given the dotted-quad representation of a network address as a string, returns an integer that represents the numeric value of the address. Addresses may be 4 or 8 byte addresses:

mysql> SELECT INET_ATON(‘209.207.224.40’);
-> 3520061480

 The generated number is always in network byte order; for example the above number is calculated as 209*256^3 + 207*256^2 + 224*256 +40. As of MySQL 4.1.2, INET_ATON() also understands short-form IP addresses:

mysql> SELECT INET_ATON(‘127.0.0.1’), INET_ATON(‘127.1’);
-> 2130706433, 2130706433

INET_NTOA(expr)
Given a numeric network address (4 or 8 byte), returns the dotted-quad representation of the address as a string:

mysql> SELECT INET_NTOA(3520061480);
-> “209.207.224.40”

Would that be better doing it that way ??

Peter

Well, yes and no. If you store it as a string and use these functions when doing your comparisons, then no. You wouldn’t want to evaluate a function in a WHERE clause if you could avoid it. If store it as a decimal and use these functions only when inserting the values in the first place, then yes.

In the past I have had reasons to look for a range of IP addresses in the database, and I’ve pretty much always been able to use something like:

SELECT * FROM session WHERE session_IP LIKE ‘192.168%’

However, obviously this wouldn’t work with the ranges you’re using.

good one, mmj

yes, the answer is yes and no

why not store both

Does this place too much load on the server, hence result in long query execution ? The string is 15 chars, whereas apparently as a decimal, because they are bit settings, the whole IP address can be stored in 4 bytes (unless I’m missing something). Surely, a large db, searching on a 15 byte column, as opposed to a 4 byte column, would have a huge difference, wouldn’t it ?

The number of rows could well exceed 100,00 per year, possibly more, so I have to take the # of rows and overall db size into consideration. Query times are always at the mercy of these factors.

Anyway, considering storing the 15 char IP address, I assume I can easily explode an address like 212.28.23.16 to 212.028.023.016 for building the ‘sql/where’ clause, but then the data won’t match on ranges will it ? Or is that where I can use the “like” phrase ?

If store it as a decimal and use these functions only when inserting the values in the first place, then yes.

I need to check one table against another:

  1. Table 1 has all the IP addresses in it
  2. Table 2 has the IP ranges in it (IP blocks, and company details)

So, I need to pass through ‘table 1’, and then use the IP address from that, to (try and) find the row from table 2, that has the IP ranges.

Also, I may want to run other queries and reports from time to time, so my objectives wouldn’t be to only use the IP address for an insert.

In the past I have had reasons to look for a range of IP addresses in the database, and I’ve pretty much always been able to use something like:

SELECT * FROM session WHERE session_IP LIKE ‘192.168%’

However, obviously this wouldn’t work with the ranges you’re using.

Would the ‘like’ clause get the row that had the range of IP addresses ? For instance, table 1 has an IP address of 218.214.128.218 , and table 2 has one of its’ rows, where the block IP’s are:

IP_From: 218.214.86.1
IP_to: 218.214.130.255

Notice this, as a string compare would not get the row, if I just looked at partial string comparison, that is

‘218.214.8’ is NOT >= to ‘218.214.1’

whereas …

‘218.214.08’ is >= to ‘218.214.12’

Whether to store the Ip as 15 char, and put in the leading zeros (editing them out with html o/p, etc), or try the:

store it broken up as 4 tinyints

I don’t know ? I’m quite okay with the 4 tinyints idea, it saves on storage space, and should result in faster queries, but I don’t know how to use it, that is, “assemble” it, for building the sql query statement.

That is, if there were 4 tinyints, if one value was ‘13’, do I convert it to a string, add a leading zero, then convert back to numeric, so I have a numeric value of ‘013’, or is it already stored in this format ??

Thanks for all your help, :slight_smile:

Peter

Hi,

What impact would that place on the server (load), and what affect on query execution time ? I saw enough posts via Google to other forums where people discussed these issues at length, and the bottom line was, the smaller the db and the smaller the column/s you are using in your sql query, the faster the query will execute.

Peter

While the impact on query execution time will vary according to the size of your tables, etc, in general you don’t want to evaluate a function in a WHERE clause. This is because the function is evaluated for each row in the table. It also means that MySQL is unable to use an index.

Storing both would have no effect on query time if you use the decimal version to do the comparison.

yes, the smaller the table, the faster the query, but on the other hand, you could say that a 1500-lb car is a lot easier to push up a hill with the engine turned off than a 3200-lb car

what really matters is indexing

remember that when you perform a search, if the column in the WHERE clause is indexed, you will get substantial performance improvements

along those lines, you cannot expect an index to be used if you apply a function to the column

for example, WHERE month(dateadded)=2 will certainly find dates in february, but using a table scan

so what you want to do comes down to what your WHERE clause conditions will look like

and a lot of this will depend on how you define “blocks” and “ranges”

Hi Thomas,

Okay, so in general, functions in a where clause are out. Because the (smaller) table is the one I would be accessing the most, an index on it would be good. This smaller table is the range of IP addreses, and I called it "ipblocks’ for want of a better name. :slight_smile:

Storing both would have no effect on query time if you use the decimal version to do the comparison.

I tried using the data type tinyint, here is the structure of the table that stores the IP ranges:

CREATE TABLE `ipblocks` (
   `inetnum_from` varchar(15) NOT NULL default '',
   `ip_from_1_3` tinyint(2) NOT NULL default '0',
   `ip_from_4_6` tinyint(2) NOT NULL default '0',
   `ip_from_7_9` tinyint(2) NOT NULL default '0',
   `ip_from_10_12` tinyint(2) NOT NULL default '0',
   `inetnum_to` varchar(15) NOT NULL default '',
   `ip_to_1_3` tinyint(2) NOT NULL default '0',
   `ip_to_4_6` tinyint(2) NOT NULL default '0',
   `ip_to_7_9` tinyint(2) NOT NULL default '0',
   `ip_to_10_12` tinyint(2) NOT NULL default '0',
   `netname` varchar(35) NOT NULL default 'SWIFTEL',
   `company_descr` varchar(40) NOT NULL default 'Swiftel Communications, Perth WA',
   `assignment-descr` varchar(40) NOT NULL default 'Cust-Net assignmnet',
   `country` varchar(10) NOT NULL default 'AU'
 ) TYPE=MyISAM COMMENT='IP blocks - IP addresses';

I had to make the data type of ‘tinyint’ of size 2, because when it was size 1, the max value was 127. This obviously looked like a decimal value of a bit setting, the max that can be stored in 1 byte with all bits set is 127. So, I changed it to 2 bytes, still no go, obviously the max. size for tinyint is 1 byte. Anyway, here is the data:

INSERT INTO `ipblocks` VALUES ('202.154.64.0', 127, 127, 64, 0, '202.154.64.31', 127, 127, 64, 31, 'SWIFTEL', 'Swiftel Communications, Perth WA', 'Cust-Net assignmnet', 'AU');
 INSERT INTO `ipblocks` VALUES ('218.214.128.0', 127, 127, 127, 0, '218.214.159.255', 127, 127, 127, 127, 'SWIFTEL', 'Swiftel Communications, Perth WA', 'Infrastructure - Victoria ADSL users', 'AU');

I guess I have to move up to using a ‘smallint’ to store up to value 255.

This table above (ipblocks) is the table that would be referenced in the select statement, and the where clause. I can’t see any way of constructing the querry though, unless I used 8 “where …and” statements, …yuk. :frowning:

The other table is called “firewall_logs”


 CREATE TABLE `firewall_logs` (
   `log_datetime` datetime NOT NULL default '0000-00-00 00:00:00',
   `log_source_ip` varchar(15) NOT NULL default '',
   `log_source_port` int(11) NOT NULL default '0',
   `log_destination_ip` varchar(15) NOT NULL default '',
   `log_destination_port` int(11) NOT NULL default '0'
 ) TYPE=MyISAM COMMENT='Detailed firewall log info - from Kiwi logs';

This is the table that would be very large; at present I don’t see the need to also add the (decimal) IP address into this one, because I will be reading through the entire table, getting the “log_source_ip” column" , and then doing the select on the other table, to try and find if the IP address is within any of the ranges in the table ‘ipblocks’.

Hope this makes sense. Why am I doing this ? Our ADSL provider wants to know if any users are showing up in firewall logs, so they can contact them (most “attacks” are from people who do not know thye have trojans,etc on their computer, so it’s an education exercise). The other reasons are I need to brush up on MySQL and PHP skills. :slight_smile:

Thanks,

Peter

Hi,

So, considering the above post, where the table structure is, and what I need to do, adding an index to table ‘ipblocks’ would be better.

remember that when you perform a search, if the column in the WHERE clause is indexed, you will get substantial performance improvements

Hmm, that seems to hint that I put the index on the 15 char IP address, but that is not the column/s I need to select on ?

so what you want to do comes down to what your WHERE clause conditions will look like, and a lot of this will depend on how you define “blocks” and “ranges”

Hope the above structures help, it’s just a draft/test thing at present, I don’t want to go too far until the ‘model’ is good.

Peter

Yes, data type smallint(2) stores the values okay

Peter

use TINYINT UNSIGNED

the number in brackets, e.g. tinyint(2), integer(9), etc., merely indicates how many digits to show when using ZEROFILL

an integer column(tinyint, smallint, mediumint, integer, bigint) always takes the same storage size in bytes regardless of the zerofill number

Hi,

Thanks, that works just fine, in terms of storage.

Onto the actual query, the IP value I will be using to do the select is the 15 char column, so how can the qry be built, unless I do something like:

  1. Split the IP address I’m looking for into 4 parts (e.g. from table ‘firewall_logs’, the column ‘log_source_ip’ has a value of 218.214.128.123, so the split is to 4 variables:

$var1 = 218
$var2 = 214
$var3 = 128
$var4 - 123

all the variables are numeric values.)

  1. Then the actual query would have to be a ‘monster’ select/where like:

 select *
 [indent]from ip_blocks
 [/indent] [indent]   [indent]where ip_from_1_3 >= $var1
   [/indent] [/indent] [indent]   [indent]     [indent]and ip_from_4_6 >= $var2
     [/indent]   [/indent] [/indent] [indent]   [indent]     [indent]and ip_from_7_9 >= $var3
     [/indent]   [/indent] [/indent] [indent]   [indent]     [indent]and ip_from_10_12 >= $var4
     [/indent]   [/indent] [/indent] [indent]   [indent]     [indent]and ip_to_1_3 <= $var1
     [/indent]   [/indent] [/indent] [indent]   [indent]     [indent]and ip_to_4_6 <= $var2
     [/indent]   [/indent] [/indent] [indent]   [indent]     [indent]and ip_to_7_9 <= $var3
     [/indent]   [/indent] [/indent] [indent]   [indent]     [indent]and ip_to_10_12 <= $var4
 [/indent]   [/indent] [/indent]  

… to match the IP to be within the “from” and “to” ranges. I feel I’m missing something though. :smiley:

Peter

this is not a monster select at all

it is a natural consequence of the sad fact that xxx.yyy.zzz.2 is greater than xxx.yyy.zzz.12 in a string compare

what you want is a numeric compare

Hi,

Yes, that’s correct. I know what I need, but how do you do it with the IP address octets split into 4 numeric values ?

(in fact split into 8, because it is 2 IP addresses specifying a range (block) of IP addresses. )

Peter

Hi,

I found this function below at http://www.ip2location.com/articles/article2.htm , so I think it might be better to simply store the IP address as a long number, and use function to display the ‘dotted’ IP address. This will make the qry very simple also.


 // Function to convert IP address (xxx.xxx.xxx.xxx) to IP number (0 to 256^4-1) 
 function Dot2LongIP ($IPaddr) { 
  if ($IPaddr == "") 
  {
    return 0;
  } else { 
    $ips = split ("\\.", "$IPaddr"); 
    return ($ips[3] + $ips[2] * 256 + $ips[1] * 256 * 256 + $ips[0] * 256 * 256 * 256); 
  }
 } 
 ?> 

Peter

PHP has ip2long() and long2ip()
www.php.net/ip2long
Store the IP as an int is much more efficient in range searching.

“SELECT * FROM ips WHERE $iplong BETWEEN ipFROM AND ipTO”

Hi,

Thanks, I had read some threads on using that (on http://php.net), and it would appear to be the best solution, as you say, especially for comparing ranges. Some of the posts about using those functions seemed to be suggesting that storing the IP in this manner does not always resolve back to the correct IP address (i.e. the original one). But then again, I may have misunderstood what they were saying ??

Does the integer have to be signed, and how big does it need to be (max is 255.255.255.255 I guess) ? I will be storing it in a MySQL db, so would just INT be sufficient ?

Thanks,

Peter

Here is the schema used by an IP-country geomapping database


CREATE TABLE `ip2country` (
  `ipFROM` double NOT NULL default '0',
  `ipTO` double NOT NULL default '0',
  `country` char(2) NOT NULL default '',
  PRIMARY KEY  (`ipFROM`,`ipTO`)
) TYPE=MyISAM;