Tags
PostgreSQL, IOT, track aggregation, pipelinedb, flow calculation, real-time aggregation
Background
In IoT, Internet of Vehicles, shared-bicycle, human-behavior-tracking, and similar scenarios, terminals report isolated position points in real time, and these points need to be stitched together into tracks.
Take shared bicycles as an example: a user places an order, unlocks the bike, rides, closes the order, and locks the bike. Throughout this process there is a unique order number, and every reported point carries a timestamp, the order number, and a position.
Points are aggregated into tracks by order number.
With the PipelineDB plug-in, this aggregation can be performed in real time.
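Conceptually the aggregation is just a GROUP BY on the order number. A minimal batch sketch of the same query over an ordinary table (the table name track_points is hypothetical) shows what PipelineDB will maintain incrementally:

-- Batch equivalent of the real-time aggregation used later;
-- "track_points" is a hypothetical ordinary table of reported points.
select
  order_id,
  min(ts) as min_ts,
  array_agg(ts || ',' || st_astext(pos)) as seg
from track_points
group by order_id;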
Example
This example uses an ECS instance (CentOS 7.x x64) with PostgreSQL 10.
1. Compile zeromq
wget https://github.com/zeromq/libzmq/releases/download/v4.2.5/zeromq-4.2.5.tar.gz
tar -zxvf zeromq-4.2.5.tar.gz
cd zeromq-4.2.5
./configure
make
make install
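This installs libzmq under /usr/local/lib. If that directory is not on the runtime linker path in your environment, register it first; an environment-dependent step, not part of the original setup:

echo "/usr/local/lib" > /etc/ld.so.conf.d/zeromq.conf
ldconfig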
2. Compile pipelinedb
wget https://github.com/pipelinedb/pipelinedb/archive/1.0.0rev4.tar.gz
tar -zxvf 1.0.0rev4.tar.gz
cd pipelinedb-1.0.0rev4/

# Link against the zeromq built above: add this line to the Makefile
vi Makefile
SHLIB_LINK += /usr/local/lib/libzmq.so -lstdc++

# Load the PostgreSQL environment (1925 is an argument to this site-specific env.sh script)
. /var/lib/pgsql/env.sh 1925

USE_PGXS=1 make
USE_PGXS=1 make install
3. Configure postgresql.conf
max_worker_processes = 512
shared_preload_libraries = 'pipelinedb'
pipelinedb.stream_insert_level=async
pipelinedb.num_combiners=8
pipelinedb.num_workers=16
pipelinedb.num_queues=16
pipelinedb.fillfactor=75
pipelinedb.continuous_queries_enabled=true
Restart the database:
pg_ctl restart -m fast
4. Install the plug-in
postgres=# create extension pipelinedb;
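The example below also relies on PostGIS (the geometry type and the st_makepoint/st_astext functions), so the postgis extension is assumed to be installed as well:

create extension if not exists postgis;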
5. Create a stream and write track points in real time
CREATE FOREIGN TABLE s1 (
  order_id int8,
  ts timestamp,
  pos geometry
) SERVER pipelinedb;
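Writing a point into the stream is an ordinary INSERT; for example (the values here are made up):

insert into s1 (order_id, ts, pos)
values (1, clock_timestamp(), st_makepoint(120.5, 30.2));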
6. Create a continuous view for real-time aggregation
CREATE VIEW cv1 WITH (action=materialize) AS
select
  order_id,
  min(ts) as min_ts,
  array_agg(ts || ',' || st_astext(pos)) as seg
from s1
group by order_id;
Activate the view (it is activated by default):
select pipelinedb.activate('public.cv1');
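Should processing need to be paused later, the matching deactivate call exists as well:

select pipelinedb.deactivate('public.cv1');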
7. Stress test
vi test.sql
\set order_id random(1,100000)
\set x random(70,90)
\set y random(120,125)
insert into s1 (order_id, ts, pos) values (:order_id, clock_timestamp(), st_makepoint(:x+10*random(), :y+10*random()));

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 256 -j 256 -T 120
8. Stress test results
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 256
number of threads: 256
duration: 120 s
number of transactions actually processed: 17614607
latency average = 1.740 ms
latency stddev = 1.730 ms
tps = 146550.933776 (including connections establishing)
tps = 146906.482277 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         0.002  \set order_id random(1,10000000)
         0.001  \set x random(70,90)
         0.000  \set y random(120,125)
         1.742  insert into s1 (order_id, ts, pos) values (:order_id, clock_timestamp(), st_makepoint(:x+10*random(), :y+10*random()));
postgres=# \x
Expanded display is on.
-[ RECORD 17 ]-------------------------------------------------------------
order_id | 8672585
min_ts   | 2018-11-01 18:44:08.140027
seg      | {"2018-11-01 18:44:08.140027,POINT(78.3615547642112 121.881739947945)","2018-11-01 18:44:11.739248,POINT(80.9645632216707 121.450987955555)"}
-[ RECORD 18 ]-------------------------------------------------------------
order_id | 4011211
min_ts   | 2018-11-01 18:44:08.166407
seg      | {"2018-11-01 18:44:08.166407,POINT(87.126777020283 132.819293198176)","2018-11-01 18:44:11.524995,POINT(80.482944605872 126.906906872056)"}
-[ RECORD 19 ]-------------------------------------------------------------
order_id | 2468486
min_ts   | 2018-11-01 18:44:08.135136
seg      | {"2018-11-01 18:44:08.135136,POINT(84.7732630362734 132.659516767599)","2018-11-01 18:44:20.603312,POINT(87.6352122295648 132.18647258915)","2018-11-01 18:44:19.447776,POINT(94.9817024609074 131.295661441982)"}
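Note in RECORD 19 that seg elements are appended in arrival order, which is not necessarily timestamp order. If an ordered geometry track is wanted, the array can be unnested, sorted by the embedded timestamp, and reassembled at read time; a sketch (the alias p and the query shape are assumptions, not from the original):

-- Rebuild each track as a timestamp-ordered PostGIS LineString.
select
  order_id,
  st_makeline(
    st_geomfromtext(split_part(p, ',', 2))        -- the POINT(...) part
    order by split_part(p, ',', 1)::timestamp     -- the timestamp part
  ) as track_line
from cv1, unnest(seg) as p
group by order_id;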
9. Retain historical tracks
Set a TTL on the continuous view so that aged data is cleaned up automatically:
postgres=# select pipelinedb.set_ttl('cv1', interval '1 hour', 'min_ts');
-[ RECORD 1 ]-----
set_ttl | (3600,2)

The returned record shows the TTL in seconds (3600) and the attribute number of the TTL column (min_ts is column 2).
Create the target persistence table:
create table cv1_persist (like cv1);
Create an index on the time field of cv1:
postgres=# create index idx_1 on cv1 (min_ts);
CREATE INDEX
ETL: extract data from the continuous view into the target persistence table:
postgres=# insert into cv1_persist select * from cv1 where min_ts <= '2018-01-01';
INSERT 0 0
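In production this extraction would run on a schedule, with a cutoff comfortably inside the TTL window so rows are persisted before they age out. A sketch under those assumptions (the 30-minute cutoff is illustrative):

-- Hypothetical periodic ETL: persist rows old enough to be complete but
-- not yet removed by the 1-hour TTL. Re-runs will copy overlapping rows,
-- so the schedule/window (or a dedup step) must account for that.
insert into cv1_persist
select * from cv1
where min_ts <= now() - interval '30 minutes';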
References
https://github.com/pipelinedb/pipelinedb