How do we write our own custom SerDe?



In most cases, users want to write a Deserializer rather than a full SerDe, because they only need to read their own data format, not write to it.
  • For example, the RegexDeserializer deserializes the data using the configuration
    parameter ‘regex’ and, optionally, a list of column names.
  • If your SerDe supports DDL (basically, a SerDe with parameterized columns and column types), you probably want to implement a Protocol based on DynamicSerDe instead of writing a SerDe from scratch.
  • The reason is that the framework passes DDL to the SerDe in “thrift DDL” format, and it is non-trivial to write a “thrift DDL” parser.
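
The regex-driven approach described above can be sketched in plain Java with no Hive dependency. This is only an illustration of the idea, not the actual RegexDeserializer: the class name RegexRowDeserializer and its methods are hypothetical, and a real Hive Deserializer would additionally implement Hive's Deserializer interface and expose an ObjectInspector, both of which are left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch (not Hive's API): turns one line of text into a row of
// string columns, driven by a 'regex' setting and a list of column names.
class RegexRowDeserializer {
    private final Pattern pattern;
    private final List<String> columnNames;

    RegexRowDeserializer(String regex, List<String> columnNames) {
        this.pattern = Pattern.compile(regex);
        this.columnNames = new ArrayList<>(columnNames);
    }

    List<String> getColumnNames() {
        return columnNames;
    }

    // Returns one value per column, or null if the whole line does not match.
    List<String> deserialize(String line) {
        Matcher m = pattern.matcher(line);
        if (!m.matches()) {
            return null;
        }
        List<String> row = new ArrayList<>(columnNames.size());
        for (int i = 0; i < columnNames.size(); i++) {
            // Capture groups are 1-based; a column without a group becomes null.
            row.add(i + 1 <= m.groupCount() ? m.group(i + 1) : null);
        }
        return row;
    }

    public static void main(String[] args) {
        RegexRowDeserializer d = new RegexRowDeserializer(
                "(\\S+) (\\S+) (\\d+)",
                Arrays.asList("user", "method", "status"));
        System.out.println(d.getColumnNames());
        System.out.println(d.deserialize("alice GET 200"));
    }
}
```

Only the read path is implemented, which matches the point that most users need a Deserializer rather than a full SerDe. Returning null for a non-matching line is a design choice of this sketch; a real implementation might instead raise an error or emit a row of nulls.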