In most cases, users want to write a Deserializer instead of a SerDe, because users just want to read their own data format instead of writing to it.
- For example, the
RegexDeserializer
will deserialize the data using the configuration
parameter ‘regex’, and possibly a list of column names - If your SerDe supports DDL (basically, SerDe with parameterized columns and column types), you probably want to implement a Protocol based on DynamicSerDe, instead of writing a SerDe from scratch.
- The reason is that the framework passes DDL to SerDe through”thrift DDL” format, and it’s non-trivial to write a “thrift DDL” parser.