How to convert userip to an integer in Amazon Redshift

Time: 2023-01-13 23:06:47

I'm just starting to play around with and test Amazon Redshift. One thing I need to do, which is easy in MSSQL, is convert userip to an integer. There it is done with a scalar function that uses PARSENAME to break the IP into its octets and multiply them by constants.

 CAST(

       (CAST(PARSENAME(@IP,4) AS BIGINT) * 16777216) +
       (CAST(PARSENAME(@IP,3) AS BIGINT) * 65536) +
       (CAST(PARSENAME(@IP,2) AS BIGINT) * 256) +
        CAST(PARSENAME(@IP,1) AS BIGINT) 
  AS BIGINT)

This is what it looks like, for reference.
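As a sanity check on the arithmetic, the same calculation can be sketched outside the database. This is a minimal Python sketch, not part of the original T-SQL; `parsename` and `ip_to_int` are hypothetical helper names, and the input is assumed to be a well-formed dotted-quad IPv4 string. It mirrors the fact that PARSENAME numbers parts from the right, so part 4 is the first octet.

```python
def parsename(name: str, part: int) -> str:
    """Hypothetical stand-in for T-SQL PARSENAME: parts are numbered from the right."""
    pieces = name.split(".")
    return pieces[-part]


def ip_to_int(ip: str) -> int:
    """Weight each octet by a power of 256, exactly as the T-SQL expression does."""
    return (
        int(parsename(ip, 4)) * 16777216  # 256**3, leftmost octet
        + int(parsename(ip, 3)) * 65536   # 256**2
        + int(parsename(ip, 2)) * 256     # 256**1
        + int(parsename(ip, 1))           # rightmost octet
    )


print(ip_to_int("192.168.1.10"))  # 3232235786
```

Here 192 * 16777216 + 168 * 65536 + 1 * 256 + 10 gives 3232235786, which is the value any Redshift rewrite should reproduce for that address.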

As I expected, PARSENAME is not a function in Redshift, hence my question. Does anyone know of a way to achieve the same result?

Figured it out:

( LEFT(ip_address, STRPOS(ip_address, '.')-1) * 16777216) + (LEFT(SUBSTRING(ip_address, LEN(LEFT(ip_address, STRPOS(ip_address, '.')+1)), LEN(ip_address) - LEN(LEFT(ip_address, STRPOS(ip_address, '.')-1)) - LEN(LEFT(REVERSE(ip_address), STRPOS(REVERSE(ip_address), '.')-1)) - 2), STRPOS( SUBSTRING(ip_address, LEN(LEFT(ip_address, STRPOS(ip_address, '.')+1)), LEN(ip_address) - LEN(LEFT(ip_address, STRPOS(ip_address, '.')-1)) - LEN(LEFT(REVERSE(ip_address), STRPOS(REVERSE(ip_address), '.')-1)) - 2), '.')-1) * 65536) + (RIGHT( SUBSTRING(ip_address, LEN(LEFT(ip_address, STRPOS(ip_address, '.')+1)), LEN(ip_address) - LEN(LEFT(ip_address, STRPOS(ip_address, '.')-1)) - LEN(LEFT(REVERSE(ip_address), STRPOS(REVERSE(ip_address), '.')-1)) - 2), LEN(SUBSTRING(ip_address, LEN(LEFT(ip_address, STRPOS(ip_address, '.')+1)), LEN(ip_address) - LEN(LEFT(ip_address, STRPOS(ip_address, '.')-1)) - LEN(LEFT(REVERSE(ip_address), STRPOS(REVERSE(ip_address), '.')-1)) - 2)) - STRPOS(SUBSTRING(ip_address, LEN(LEFT(ip_address, STRPOS(ip_address, '.')+1)), LEN(ip_address) - LEN(LEFT(ip_address, STRPOS(ip_address, '.')-1)) - LEN(LEFT(REVERSE(ip_address), STRPOS(REVERSE(ip_address), '.')-1)) - 2), '.') ) * 256) + (REVERSE( LEFT(REVERSE(ip_address), STRPOS(REVERSE(ip_address), '.')-1) ) * 1 )

2 solutions

#1



Wow, my eyes are watering at the sight of that query, though I'm sure you don't have tons of choice given the restrictions imposed by Redshift.

I'm still amazed you have to do something quite that cumbersome. Can't you at least create an SQL function or two to tidy it up? Or does Redshift not even support CREATE FUNCTION ... LANGUAGE sql?

For reference, in proper PostgreSQL you'd do:

select (split_part(ip, '.', 1)::bigint << 24) +
       (split_part(ip, '.', 2)::bigint << 16) +
       (split_part(ip, '.', 3)::bigint << 8) +
       (split_part(ip, '.', 4)::bigint);
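The shifts by 24, 16 and 8 bits are the same as multiplying by 16777216, 65536 and 256. A minimal Python sketch can confirm this outside the database; it assumes well-formed IPv4 strings, and the helper names `split_part` and `ip_to_int_shift` are hypothetical stand-ins that only mimic the SQL above. The result is cross-checked against the standard library's `ipaddress` module.

```python
import ipaddress


def split_part(text: str, delimiter: str, n: int) -> str:
    """Mimic SQL split_part: fields are numbered from 1, starting at the left."""
    fields = text.split(delimiter)
    return fields[n - 1] if 1 <= n <= len(fields) else ""


def ip_to_int_shift(ip: str) -> int:
    """Combine the four octets with bit shifts, as in the PostgreSQL query."""
    return (
        (int(split_part(ip, ".", 1)) << 24)
        + (int(split_part(ip, ".", 2)) << 16)
        + (int(split_part(ip, ".", 3)) << 8)
        + int(split_part(ip, ".", 4))
    )


# Cross-check against the standard library's own integer conversion.
for ip in ("0.0.0.0", "127.0.0.1", "192.168.1.10", "255.255.255.255"):
    assert ip_to_int_shift(ip) == int(ipaddress.IPv4Address(ip))
```

Unlike PARSENAME, `split_part` counts fields from the left, so field 1 is the most significant octet.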

or using a simple-ish SQL function:

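CREATE OR REPLACE FUNCTION inet_to_bigint(inet) RETURNS bigint AS $$
-- host() drops the netmask suffix (e.g. '/32') that a plain ::text cast appends
SELECT sum(split_part(host($1),'.',octetnum)::bigint << (32 - octetnum*8))::bigint
FROM generate_series(1,4) octetnum;
$$ LANGUAGE sql;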

or, almost certainly most efficiently, by abusing the inet data type's subtraction operator:

SELECT (ip - '0.0.0.0')

(This one might even work in Redshift if it has retained the inet data type and if this feature already existed in PostgreSQL 8.0.2, the release Redshift is based on via ParAccel's fork.)

On a side note, I was quite astonished to see that there's no cast defined from inet to bigint, in PostgreSQL, as I expected to just be able to write '127.0.0.1'::inet::bigint, which would be shorthand for CAST(CAST('127.0.0.1' AS inet) AS bigint).

#2



split_part(ip, '.', n) should do it.
