PostgreSQL pipelinedb flow calculation plug-in - IoT application - real time trajectory aggregation

Keywords: Database PostgreSQL github SQL CentOS

Label

PostgreSQL, IOT, track aggregation, pipelinedb, flow calculation, real-time aggregation

background

IoT scenes, Internet of vehicles scenes, shared bicycle scenes, human behavior sites, etc. the terminal reports isolated sites in real time. We need to supplement them into tracks.

For example, sharing bicycles, placing orders, unlocking, generating orders, cycling, closing orders, and closing locks. In this process, there is a unique order number. Each reported site will contain time, order number and location.

Aggregate points into tracks based on the order number.

With the pipelinedb plug-in, aggregation can be implemented in real time.

Example

Take ECS (CentOS 7. X x64) and PostgreSQL 10 for example

1. Compile zeromq

wget https://github.com/zeromq/libzmq/releases/download/v4.2.5/zeromq-4.2.5.tar.gz  
tar -zxvf zeromq-4.2.5.tar.gz  
cd zeromq-4.2.5  
./configure  
make  
make install  

2. Compile pipelinedb

wget https://github.com/pipelinedb/pipelinedb/archive/1.0.0rev4.tar.gz  
tar -zxvf 1.0.0rev4.tar.gz   
cd pipelinedb-1.0.0rev4/  
  
vi Makefile  
  
SHLIB_LINK += /usr/local/lib/libzmq.so -lstdc++  
  
. /var/lib/pgsql/env.sh 1925  
  
USE_PGXS=1 make  
USE_PGXS=1 make install  

3. Configure postgresql.conf

max_worker_processes = 512  
shared_preload_libraries = 'pipelinedb'  
pipelinedb.stream_insert_level=async
pipelinedb.num_combiners=8
pipelinedb.num_workers=16
pipelinedb.num_queues=16
pipelinedb.fillfactor=75
pipelinedb.continuous_queries_enabled=true  

restart

pg_ctl restart -m fast  

4. Install plug-ins

postgres=# create extension pipelinedb;  

5. Create stream and write track points in real time

CREATE FOREIGN TABLE s1 ( order_id int8, ts timestamp, pos geometry )  
SERVER pipelinedb;  

6. Create a Continue view for real-time aggregation

CREATE VIEW cv1 WITH (action=materialize ) AS   
select order_id, min(ts) min_ts, array_agg(ts||','||st_astext(pos)) as seg  
from s1  
group by order_id;  

Activate view (activated by default)

select pipelinedb.activate('public.cv1');  

7. Pressure measurement

vi test.sql  
\set order_id random(1,100000)  
\set x random(70,90)  
\set y random(120,125)  
insert into s1 (order_id, ts, pos) values (:order_id, clock_timestamp(), st_makepoint(:x+10*random(), :y+10*random()));  
  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 256 -j 256 -T 120  

8. Pressure test results

transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 256  
number of threads: 256  
duration: 120 s  
number of transactions actually processed: 17614607  
latency average = 1.740 ms  
latency stddev = 1.730 ms  
tps = 146550.933776 (including connections establishing)  
tps = 146906.482277 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set order_id random(1,10000000)  
         0.001  \set x random(70,90)  
         0.000  \set y random(120,125)  
         1.742  insert into s1 (order_id, ts, pos) values (:order_id, clock_timestamp(), st_makepoint(:x+10*random(), :y+10*random()));  
postgres=# \x  
Expanded display is on.  
-[ RECORD 17 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
order_id | 8672585  
min_ts   | 2018-11-01 18:44:08.140027  
seg      | {"2018-11-01 18:44:08.140027,POINT(78.3615547642112 121.881739947945)","2018-11-01 18:44:11.739248,POINT(80.9645632216707 121.450987955555)"}  
-[ RECORD 18 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
order_id | 4011211  
min_ts   | 2018-11-01 18:44:08.166407  
seg      | {"2018-11-01 18:44:08.166407,POINT(87.126777020283 132.819293198176)","2018-11-01 18:44:11.524995,POINT(80.482944605872 126.906906872056)"}  
-[ RECORD 19 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
order_id | 2468486  
min_ts   | 2018-11-01 18:44:08.135136  
seg      | {"2018-11-01 18:44:08.135136,POINT(84.7732630362734 132.659516767599)","2018-11-01 18:44:20.603312,POINT(87.6352122295648 132.18647258915)","2018-11-01 18:44:19.447776,POINT(94.9817024609074 131.295661441982)"}  

9. Reservation of historical track

Set cv life cycle and automatically clean aging data

postgres=# select pipelinedb.set_ttl('cv1', interval '1 hour' , 'min_ts');  
-[ RECORD 1 ]-----  
set_ttl | (3600,2)  

Create target persistence table

create table cv1_persist (like cv1);  

Create time field index (CV1)

postgres=# create index idx_1 on cv1 (min_ts);  
CREATE INDEX  

ETL form, extract data from cv to target persistence table

postgres=# insert into cv1_persist select * from cv1 where min_ts <= '2018-01-01';  
INSERT 0 0  

Reference resources

https://github.com/pipelinedb/pipelinedb

http://zeromq.org/intro:get-the-software

https://www.pipelinedb.com/

Posted by haku87 on Thu, 21 Nov 2019 10:23:50 -0800