Check and map IP address to subnet lookup table with SQL

这一生的挚爱 提交于 2020-03-01 04:35:45

问题


I have a table that shows IP addresses assigned to equipment belonging to a customer. This table is used to measure user's activity and performance and generated on a regular basis. I would like to map the IP addresses of these customers to customer ID or name. Due to size and multiple IP addresses that can be assigned to the same customer we have to use subnet instead of individual IP addresses.

The report table contains an IPaddress field and the subnet mapping table is something like below.

+-------------+------+-------------------+
| Customer_ID | VLAN |      Subnet       |
+-------------+------+-------------------+
| HARRIS      | 1012 | 10.111.253.46/32  |
| HARRIS      | 1012 | 10.164.20.49/30   |
| HARRIS      | 1012 | 10.223.165.193/28 |
| HARRIS      | 1013 | 10.155.75.128/26  |
| HARRIS      | 1014 | 10.155.75.128/26  |
| HARRIS      | 1330 | 10.121.30.192/27  |
| HARRIS      | 1331 | 10.120.30.192/27  |
| HARRIS      | 1332 | 10.122.30.192/27  |
| HARRIS      | 3910 | 100.104.12.144/32 |
| HARRIS      | 3927 | 10.70.24.233/32   |
| HARRIS      | 3959 | 10.102.11.182/32  |
| HARRIS      | 3966 | 10.98.11.170/32   |
| STEPHANIE   | 1010 | 100.72.0.33/32    |
| STEPHANIE   | 3896 | 10.96.11.169/29   |
| JOE         | 1010 | 100.69.72.10/32   |
| NED         | 1010 | 100.72.255.2/32   |
| ESTHER      | 1010 | 100.72.255.66/32  |
| BEN         | 1010 | 100.72.255.30/32  |
| SHAWN       | 1010 | 100.72.254.230/32 |
| JACK        | 1010 | 1.1.1.7/32        |
| TONY        | 1010 | 100.72.255.242/32 |
| 335553339   | 1010 | 100.72.254.250/32 |
| 335553342   | 1010 | 100.72.254.186/32 |
| 335553343   | 1010 | 100.72.254.238/32 |
| 335553346   | 1010 | 100.72.255.182/32 |
| 335553347   | 1010 | 1.1.1.2/32        |
| 335553348   | 1010 | 100.72.255.82/32  |
| 335553349   | 1010 | 100.72.254.30/32  |
| 335553351   | 1010 | 1.1.1.1/32        |
| 335553352   | 1010 | 100.80.255.174/32 |
| 335553411   | 1010 | 100.72.255.18/32  |
| 335553412   | 1010 | 100.72.255.22/32  |
| 335553413   | 1010 | 100.72.255.253/32 |
| 335553414   | 1010 | 100.72.255.222/32 |
| 335553415   | 1010 | 100.72.255.202/32 |
| 335553416   | 1010 | 100.72.255.210/32 |
| 335553417   | 1010 | 100.72.255.10/32  |
| 335553418   | 1010 | 100.72.255.218/32 |
| 335553419   | 1010 | 100.72.255.206/32 |
| 335553420   | 1010 | 100.72.255.14/32  |
| 335553421   | 1010 | 100.72.255.114/32 |
| 335553431   | 1010 | 100.72.255.166/32 |
| 335553632   | 1010 | 100.72.4.176/32   |
| 335553632   | 3898 | 10.98.0.49/28     |
| RANDY       |  100 | 100.72.5.6/32     |
| RANDY       | 1012 | 10.111.230.41/32  |
| RANDY       | 1012 | 10.164.31.185/30  |
| RANDY       | 1012 | 10.228.5.1/28     |
| RANDY       | 1013 | 10.154.10.0/26    |
| RANDY       | 1014 | 10.154.10.0/26    |
| HASAN       | 1015 | 100.72.8.94/32    |
| HASAN       | 1015 | 100.73.13.56/32   |
| HASAN       | 3910 | 100.104.3.66/32   |
| HASAN       | 3927 | 10.70.7.157/32    |
| HASAN       | 3959 | 10.102.1.228/32   |
| HASAN       | 3966 | 10.98.2.68/32     |
| DICKY       |  100 | 100.72.10.203/32  |
| DICKY       | 3897 | 1.1.1.1/29        |
| RINA        |  100 | 100.73.2.12/32    |
| RINA        | 1430 | 10.64.9.0/27      |
| RINA        | 3910 | 100.104.2.12/32   |
| RINA        | 3927 | 10.70.2.155/32    |
| RINA        | 3959 | 10.102.1.15/32    |
| RINA        | 3959 | 10.104.8.24/29    |
| RINA        | 3966 | 10.98.1.13/32     |
+-------------+------+-------------------+

How do I map the addresses in my dynamic table (check if it belongs to which subnet) in an SQL query and then add a column (i.e. left join) showing the associated Customer name?

Thank you for the help


回答1:


Below example for BigQuery Standard SQL

#standardSQL
CREATE TEMP FUNCTION SubnetToRange(CIDR STRING)
RETURNS STRUCT<start_IP STRING, end_IP STRING>
LANGUAGE js AS """
  var beg = CIDR.substr(CIDR,CIDR.indexOf('/'));
  var end = beg;
  var off = (1<<(32-parseInt(CIDR.substr(CIDR.indexOf('/')+1))))-1; 
  var sub = beg.split('.').map(function(a){return parseInt(a)});
  var buf = new ArrayBuffer(4); 
  var i32 = new Uint32Array(buf);
  i32[0]  = (sub[0]<<24) + (sub[1]<<16) + (sub[2]<<8) + (sub[3]) + off;
  var end = Array.apply([],new Uint8Array(buf)).reverse().join('.');
  return {start_IP: beg, end_IP: end};
"""; 
SELECT e.*, t.* EXCEPT(start_IP,    end_IP)
FROM `project.dataset.equipments` e,
(
  SELECT s.*, 
    NET.IPV4_TO_INT64(NET.IP_FROM_STRING(start_IP)) AS start_IP,
    NET.IPV4_TO_INT64(NET.IP_FROM_STRING(end_IP)) AS end_IP
  FROM `project.dataset.subnets` s,
  UNNEST([SubnetToRange(Subnet)]) 
) t
WHERE NET.IPV4_TO_INT64(NET.IP_FROM_STRING(IP)) BETWEEN start_IP AND end_IP

You can test, play with above using sample/dummy data as in below example

#standardSQL
CREATE TEMP FUNCTION SubnetToRange(CIDR STRING)
RETURNS STRUCT<start_IP STRING, end_IP STRING>
LANGUAGE js AS """
  var beg = CIDR.substr(CIDR,CIDR.indexOf('/'));
  var end = beg;
  var off = (1<<(32-parseInt(CIDR.substr(CIDR.indexOf('/')+1))))-1; 
  var sub = beg.split('.').map(function(a){return parseInt(a)});
  var buf = new ArrayBuffer(4); 
  var i32 = new Uint32Array(buf);
  i32[0]  = (sub[0]<<24) + (sub[1]<<16) + (sub[2]<<8) + (sub[3]) + off;
  var end = Array.apply([],new Uint8Array(buf)).reverse().join('.');
  return {start_IP: beg, end_IP: end};
"""; 
WITH `project.dataset.subnets` AS (
  SELECT 'HARRIS' Customer_ID, 1012 VLAN, '10.111.253.46/32' Subnet UNION ALL
  SELECT 'HARRIS', 1012, '10.164.20.49/30' UNION ALL
  SELECT 'HARRIS', 1012, '10.223.165.193/28' UNION ALL
  SELECT '335553632', 3898, '10.98.0.49/28' UNION ALL
  SELECT 'RINA', 1430, '10.64.9.0/27' UNION ALL
  SELECT 'RINA', 3966, '10.98.1.13/32'
), `project.dataset.equipments` AS (
  SELECT 'equipment A' equipment, '10.164.20.50' IP UNION ALL
  SELECT 'equipment B', '10.64.9.0'
)
SELECT e.*, t.* EXCEPT(start_IP,    end_IP)
FROM `project.dataset.equipments` e,
(
  SELECT s.*, 
    NET.IPV4_TO_INT64(NET.IP_FROM_STRING(start_IP)) AS start_IP,
    NET.IPV4_TO_INT64(NET.IP_FROM_STRING(end_IP)) AS end_IP
  FROM `project.dataset.subnets` s,
  UNNEST([SubnetToRange(Subnet)]) 
) t
WHERE NET.IPV4_TO_INT64(NET.IP_FROM_STRING(IP)) BETWEEN start_IP AND end_IP   

with output

Row equipment   IP              Customer_ID VLAN    Subnet          
1   equipment A 10.164.20.50    HARRIS      1012    10.164.20.49/30      
2   equipment B 10.64.9.0       RINA        1430    10.64.9.0/27         



回答2:


A query that looks for IP addresses and subnets, and published in:

  • https://towardsdatascience.com/geolocation-with-bigquery-de-identify-76-million-ip-addresses-in-20-seconds-e9e652480bd2
# replace with your source of IP addresses

WITH source_of_ip_addresses AS (
  SELECT REGEXP_REPLACE(contributor_ip, 'xxx', '0')  ip, COUNT(*) c
  FROM `publicdata.samples.wikipedia`
  WHERE contributor_ip IS NOT null  
  GROUP BY 1
)

SELECT country_name, SUM(c) c
FROM (
  SELECT ip, country_name, c
  FROM (
    SELECT *, NET.SAFE_IP_FROM_STRING(ip) & NET.IP_NET_MASK(4, mask) network_bin
    FROM source_of_ip_addresses, UNNEST(GENERATE_ARRAY(9,32)) mask
    WHERE BYTE_LENGTH(NET.SAFE_IP_FROM_STRING(ip)) = 4
  )
  JOIN `fh-bigquery.geocode.201806_geolite2_city_ipv4_locs`  
  USING (network_bin, mask)
)
GROUP BY 1
ORDER BY 2 DESC

The math here is pretty fast, and doesn't require JS UDFs. Just replace country_name with customer_name.



来源:https://stackoverflow.com/questions/59866093/check-and-map-ip-address-to-subnet-lookup-table-with-sql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!