According to the Spark Connector Java Guide, section Datasets and SQL, it is possible to define an explicit schema for a Dataset:
Dataset<Character> explicitDS = MongoSpark.load(jsc)
    .toDS(Character.class);
Is it possible to skip some of the declared fields in this Java Bean Class?
import java.io.Serializable;
import com.fasterxml.jackson.annotation.JsonIgnore;

public final class Character implements Serializable {
    private String name;

    @JsonIgnore
    private transient Integer age;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Integer getAge() {
        return age;
    }

    public void setAge(final Integer age) {
        this.age = age;
    }
}
I tried both transient and @JsonIgnore, but neither seems to work. I have a Java Bean class that contains many schema attributes and also a temporary variable, and I would like to skip this temporary variable. Currently, if the schema class declares a field that is not in the Dataset schema, I always get an empty Dataset from:
MongoSpark.load(jsc).toDS(SchemaWithFieldToSkip.class);
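
One possible explanation, stated as an assumption and not something the guide confirms: if toDS(Class) derives its schema through JavaBeans introspection (getter/setter pairs), then neither the transient keyword nor a Jackson annotation on the field would change which properties are found. The sketch below uses only the standard java.beans.Introspector to show that a transient field with accessors is still reported as a bean property, while the same field without accessors is not. The class names BeanPropertyCheck, WithAccessors and WithoutAgeAccessors are hypothetical and exist only for this illustration.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class BeanPropertyCheck {

    // Hypothetical bean mirroring the question: 'age' is transient
    // but still exposes a public getter and setter.
    public static final class WithAccessors implements Serializable {
        private String name;
        private transient Integer age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public Integer getAge() { return age; }
        public void setAge(Integer age) { this.age = age; }
    }

    // Hypothetical variant: same fields, but no accessors for 'age'.
    public static final class WithoutAgeAccessors implements Serializable {
        private String name;
        private transient Integer age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Lists the properties that have both a getter and a setter,
    // as reported by standard JavaBeans introspection.
    public static List<String> readableWritableProperties(Class<?> beanClass) {
        try {
            // Stop at Object so the synthetic 'class' property is excluded.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readableWritableProperties(WithAccessors.class));
        System.out.println(readableWritableProperties(WithoutAgeAccessors.class));
    }
}
```

If the assumption holds, removing the getter and setter for the field to skip (or moving it out of the bean class) would be the thing to try. Whether that also resolves the empty Dataset has not been verified here.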