Spark Java API: reduceByKey does not aggregate when the key is a custom type

Keywords: Big Data Spark Apache Java Eclipse

When writing a Spark program with the Java API, if the key of a pair RDD (JavaPairRDD) is a custom type, that type must override the hashCode and equals methods. Otherwise, records whose keys hold the same values are treated as different keys and are not aggregated together.

For example, with a User type as the key:

public class User {
	
	private String name;
	private String age;
	public String getName() {
		return name;
	}
	public void setName(String name) {
		this.name = name;
	}
	public String getAge() {
		return age;
	}
	public void setAge(String age) {
		this.age = age;
	}
	@Override
	public int hashCode() {
		final int prime = 31;
		int result = 1;
		result = prime * result + ((age == null) ? 0 : age.hashCode());
		result = prime * result + ((name == null) ? 0 : name.hashCode());
		return result;
	}
	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if (obj == null)
			return false;
		if (getClass() != obj.getClass())
			return false;
		User other = (User) obj;
		if (age == null) {
			if (other.age != null)
				return false;
		} else if (!age.equals(other.age))
			return false;
		if (name == null) {
			if (other.name != null)
				return false;
		} else if (!name.equals(other.name))
			return false;
		return true;
	}
	
}
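
To see why both methods matter, the effect can be reproduced without Spark. The sketch below is not Spark code: it uses a plain HashMap as a stand-in for hash-based aggregation, on the assumption that reduceByKey groups keys by hashCode and equals in a similar way. The class names KeyAggregationDemo, PlainKey, and ValueKey and the helper sumByKey are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class KeyAggregationDemo {

    // Key WITHOUT equals/hashCode: inherits identity-based comparison from Object.
    static final class PlainKey {
        final String name;
        final String age;

        PlainKey(String name, String age) {
            this.name = name;
            this.age = age;
        }
    }

    // Key WITH value-based equals/hashCode, in the spirit of the User class above.
    static final class ValueKey {
        final String name;
        final String age;

        ValueKey(String name, String age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            ValueKey other = (ValueKey) obj;
            return Objects.equals(name, other.name) && Objects.equals(age, other.age);
        }
    }

    // Counts occurrences per key using a hash map (a stand-in for a hash-based reduce).
    static <K> Map<K, Integer> sumByKey(List<K> keys) {
        Map<K, Integer> totals = new HashMap<>();
        for (K key : keys) {
            totals.merge(key, 1, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<PlainKey, Integer> plain = sumByKey(Arrays.asList(
                new PlainKey("tom", "20"), new PlainKey("tom", "20")));
        Map<ValueKey, Integer> value = sumByKey(Arrays.asList(
                new ValueKey("tom", "20"), new ValueKey("tom", "20")));

        System.out.println(plain.size()); // 2: the two keys are never merged
        System.out.println(value.size()); // 1: both records fall under one key
    }
}
```

The two PlainKey instances hold the same values but remain separate entries, which is the same symptom as a reduceByKey that appears not to aggregate.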


In most cases, Eclipse can generate the hashCode and equals methods for a type automatically (Source > Generate hashCode() and equals()...), so nothing needs to be written by hand.

For special cases, the HashCodeBuilder and EqualsBuilder utility classes from the commons-lang3 package can be used to implement the two methods. The reflection-based variants shown here compare all fields of the object without listing them explicitly:

package run.aaa.spark;

import org.apache.commons.lang3.builder.EqualsBuilder;
import org.apache.commons.lang3.builder.HashCodeBuilder;

public class User {
	
	private String name;
	private String age;
	public String getName() {
		return name;
	}
	public void setName(String name) {
		this.name = name;
	}
	public String getAge() {
		return age;
	}
	public void setAge(String age) {
		this.age = age;
	}
	@Override
	public int hashCode() {
		return HashCodeBuilder.reflectionHashCode(this);
	}
	@Override
	public boolean equals(Object obj) {
		return EqualsBuilder.reflectionEquals(this, obj);
	}
	
}
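
If adding commons-lang3 is not desirable, the same contract can be met with java.util.Objects from the standard library (Java 7 and later). The following is a minimal sketch under that assumption. The class name UserKey is hypothetical, chosen to avoid clashing with the User class above. It also implements java.io.Serializable, on the assumption that the job uses Spark's default Java serialization, which needs shuffled keys to be serializable; a Kryo setup may have different requirements.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical key type for illustration; mirrors the fields of User.
public class UserKey implements Serializable {

    private static final long serialVersionUID = 1L;

    private final String name;
    private final String age;

    public UserKey(String name, String age) {
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public String getAge() {
        return age;
    }

    @Override
    public int hashCode() {
        // Null-safe hash over both fields.
        return Objects.hash(name, age);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        UserKey other = (UserKey) obj;
        // Objects.equals handles null fields without extra branching.
        return Objects.equals(name, other.name) && Objects.equals(age, other.age);
    }
}
```

The fields are final here because a key whose fields change after it has been hashed can no longer be found under its original hash; this is a design choice of the sketch, not something the original User class does.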

Posted by harrisonad on Thu, 24 Jan 2019 20:18:13 -0800