YDB deployed in a mini CentOS environment based on Hadoop, ZooKeeper and Kafka
YDB introduction
YDB, full name Yan Yun YDB, is a real-time, multi-dimensional, interactive query, statistics and analysis engine based on the Hadoop distributed architecture. It offers second-level performance at trillion-row data scale and stable, reliable perform ...
Posted by narked on Wed, 13 Feb 2019 13:12:18 -0800
[Error] The node /hbase is not in ZooKeeper; HBase cannot start normally due to port occupancy
After starting ZooKeeper, I started HDFS and finally HBase, but found that the HMaster process did not appear in the jps output, as follows:
[hadoop@hadoop000 bin]$ jps
3936 NameNode
4241 SecondaryNameNode
6561 Jps
4041 DataNode
3418 QuorumPeerMain
Then running ./hbase shell produced the following error:
This is my previous hbase-site.xml confi ...
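A common cause of this error is a mismatch between the znode HBase registers in ZooKeeper and the one the client looks up. As a hedged sketch (the value `/hbase` is the HBase default and only an assumption here — it must match what your cluster actually registers), the relevant hbase-site.xml property looks like:

```xml
<!-- hbase-site.xml: pin the HBase root znode in ZooKeeper.
     /hbase is the default; adjust if your cluster registers
     under a different parent path. -->
<property>
  <name>zookeeper.znode.parent</name>
  <value>/hbase</value>
</property>
```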
Posted by intenseone345 on Thu, 07 Feb 2019 10:15:18 -0800
Yarn tuning
1. Common Yarn commands:
[rachel@bigdata-senior01 bin]$ ./yarn
Usage: yarn [--config confdir] COMMAND
where COMMAND is one of:
resourcemanager run the ResourceManager
nodemanager run a nodemanager on each slave
timelineserver run the timeline server
rmadmin admin tools
version ...
Posted by soloslinger on Sun, 03 Feb 2019 08:15:18 -0800
Spark textFile reads HDFS file partitions [compressed and uncompressed]
sc.textFile("/blabla/{*.gz}")
When we create a SparkContext and use textFile to read files, how are partitions determined, and what is the partition size?
Compressed format of files
File size and HDFS block size
textFile will create a HadoopRDD that uses ...
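The rule described above can be sketched numerically: a gzip file is not splittable and always yields a single partition, while an uncompressed file is split into roughly ceil(fileSize / blockSize) partitions. This is a simplification that ignores Hadoop's minPartitions parameter and split slack factor, so treat it as an estimate only:

```java
public class PartitionEstimate {
    // Rough partition count for an uncompressed HDFS file:
    // one input split per block (simplified; Hadoop's real split
    // logic also honors minPartitions and a ~10% slack factor).
    static long partitions(long fileSizeBytes, long blockSizeBytes) {
        return Math.max(1, (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes);
    }

    public static void main(String[] args) {
        long block = 128L * 1024 * 1024;  // default HDFS block size: 128 MB
        System.out.println(partitions(300L * 1024 * 1024, block)); // 300 MB -> 3
        System.out.println(partitions(1L, block));                 // tiny file -> 1
        // A .gz file is not splittable, so Spark always gives it
        // exactly 1 partition regardless of its size.
    }
}
```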
Posted by MK27 on Sat, 02 Feb 2019 17:54:15 -0800
I/O Operation of Hadoop: Serialization (I)
1. Serialization
(1) Basic definitions: Serialization refers to the conversion of structured objects into byte streams for transmission over the network or permanent storage on disk; deserialization refers to the conversion of byte streams back into structured objects.
(2) Applications
...
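Hadoop itself uses its compact Writable interface rather than Java's built-in mechanism, but the round trip defined in (1) can be illustrated with plain JDK serialization — a minimal sketch of the concept, not Hadoop's actual I/O path:

```java
import java.io.*;

public class SerializationDemo {
    // A structured object; Serializable marks it as convertible to a byte stream.
    static class Point implements Serializable {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static byte[] serialize(Point p) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(p);              // object -> byte stream
        }
        return buf.toByteArray();
    }

    static Point deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Point) in.readObject();  // byte stream -> object
        }
    }

    public static void main(String[] args) throws Exception {
        Point copy = deserialize(serialize(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y); // prints 3,4
    }
}
```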
Posted by cedricm on Sat, 02 Feb 2019 14:36:16 -0800
Java Operates HDFS (Common API)
Previous preparation
API operation
View files
Create a new folder
Upload files
Download files
Delete files
Internal copy and internal move (cut)
Rename
Create new files
Write files
Read files
Append writes
Get data locations
Previous preparation
Ensure that the HDFS cluster has been bu ...
Posted by StroiX on Sat, 02 Feb 2019 13:57:16 -0800
Java API for HDFS operations
1. Environment Setup
Configure the HADOOP_HOME environment variable
Add the bin directory of HADOOP_HOME to PATH
Permission issues: add the HADOOP_USER_NAME=root environment variable
Configuration of Eclipse
Copy the hadoop-eclipse-plugin.jar package into the dropins plugin folder of the Eclipse installation directory
...
Posted by naskoo on Sat, 02 Feb 2019 12:36:15 -0800
Operating HDFS with Java
After building a high-availability HDFS cluster, Java can be used in Eclipse to operate HDFS and read and write files.
High Availability HDFS Cluster Building Steps: https://blog.csdn.net/Chris_MZJ/article/details/83033471
Connecting HDFS with Eclipse
1. Place hadoop-eclipse-plugin-2.6.0.rar in the installation directory of ...
Posted by mfos on Sat, 02 Feb 2019 09:06:15 -0800
Hive-HBase Integration in Detail
Reproduced from: https://www.cnblogs.com/MOBIN/p/5704001.html
1. Create HBase tables from Hive
Create a Hive table pointing to HBase using an HQL statement
CREATE TABLE hbase_table_1(key int, value string) -- Table name hbase_table_1 in Hive
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' -- Designated Storage P ...
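For reference, the pattern this excerpt truncates follows the standard Hive-HBase mapping. A hedged sketch is below — the column family `cf1`, the mapping string, and the HBase table name `xyz` are illustrative values from the canonical Hive wiki example, not from this article:

```sql
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
```

The `hbase.columns.mapping` property ties the Hive `key` column to the HBase row key (`:key`) and the `value` column to the `cf1:val` qualifier.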
Posted by maxpagels on Sat, 02 Feb 2019 02:45:15 -0800
Spark Learning Notes (1) - Introduction to Spark, Cluster Installation
1 Spark Introduction
Spark is a fast, general-purpose and scalable big data analysis engine. It was born in 2009 at AMPLab, University of California, Berkeley, and was open-sourced in 2010. It entered the Apache Incubator in June 2013 and became a top-level Apache project in February 2014. At present, the Spark ecosystem has developed into a collecti ...
Posted by All4172 on Sat, 02 Feb 2019 01:21:15 -0800