Building a big data environment for user portrait -- building a real-time user portrait from scratch

In this chapter, we formally start building the big data environment, with the goal of a stable environment that can be operated and monitored. We will use Ambari to build the underlying Hadoop environment, and install Flink, Druid, Superset and the other real-time computing components natively. Use the combinatio ...

Posted by allinurl on Sun, 31 May 2020 20:43:34 -0700

phoenix-5.0.0 and CDH6.0.1 incompatibility makes secondary indexes unavailable

Today, while testing Phoenix's secondary index feature, the following exception occurred when writing data after the index was created: Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: org.apache.phoenix.hbase.index.builder.IndexBuildingFailureException: Failed to build index for unexpected reason! ...

Posted by superhaggis on Sat, 09 May 2020 07:38:49 -0700

Hadoop native library conflict error

Problem description: After the cluster upgrade, Hadoop cannot load its native libraries properly. $ hadoop checknative -a 20/05/08 14:32:11 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version 20/05/08 14:32:11 WARN zlib.ZlibFactory: Failed to load/initialize native-zlib library 20/05/08 1 ...

Posted by conker87 on Fri, 08 May 2020 07:41:16 -0700

Hive quick start series (12) | Introduction to and use of Hive data compression

Catalog: 1. Compiling the Hadoop source to support Snappy compression 1.1 Resource preparation 1.2 JAR package installation 1.3 Compiling the source code 2. Hadoop compression configuration 2 ...

Posted by pavanpuligandla on Tue, 05 May 2020 23:29:55 -0700

HBase API create-table error record for a cluster deployed in Docker containers

HBase API create-table error record. Demo method: import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.HColumnDescriptor; import org.apache.hadoop.hbase.HTableDescriptor; import org.apache.hadoop.hbase.TableName; import org.apache.hadoop.hbase.client.Admin; import ...

Posted by spartan7 on Tue, 05 May 2020 16:37:36 -0700

Hadoop Ecosphere-Ranger Data Security Management Framework

Hadoop Ecosphere-Ranger Data Security Management Framework. Author: Yin Zhengjie. Copyright statement: Original work; reprinting is not permitted. Otherwise, legal liability will be pursued. Introduction to Ranger: Apache Ranger is a data security management framework for enabling, monitoring and managing comprehensive data security across the Hadoop ecosystem. It provides a unified ...

Posted by mcirl2 on Sun, 03 May 2020 23:33:49 -0700

Demonstration of using Hive lateral view

Lateral view is used with split, explode and other UDTFs to split one row of data into multiple rows. The split data can then be aggregated. Lateral view first calls the UDTF for each row of the original table, and the UDTF splits that row into one or more rows. Lateral view then combines the results to produce a virtual table that supports alia ...

Posted by ayzee01 on Sat, 02 May 2020 19:41:55 -0700

HBase API create-table error record for a cluster deployed in Docker containers

HBase API create-table error record. Demo method: import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.HColumnDescriptor; import org.apache.hadoop.hbase.HTableDescriptor; import org.apache.hadoop.hbase.TableName; import org.apache.hadoop.hbase.client.Admin; import org.apa ...

Posted by fusionxn1 on Thu, 30 Apr 2020 02:57:56 -0700

HBase operations: sharing a demo of Spark reading HBase snapshots

Previously, I shared a small demo of Spark reading HBase directly through the interface: HBase-Spark-Read-Demo. However, if the amount of data is very large, having Spark scan the HBase table directly will inevitably put heavy pressure on the HBase cluster. For that reason, today I'd like to share with you the way Spark directly reads HBas ...

Posted by vapokerpro on Fri, 17 Apr 2020 08:28:05 -0700

Hive's wait is finally over: Flink

When did Apache Spark start supporting Hive integration? Anyone who has used Spark will say that was a long time ago. And when did Apache Flink start supporting Hive integration? Readers may be puzzled: is it not supported yet, or have they simply never used it? Or is it only supported in the latest version, but the fu ...

Posted by elfeste on Fri, 27 Mar 2020 03:09:45 -0700