Great work on this lib! It's a great way to write Spark code!
As discussed here and in the docs, `withColumn` requires a full schema when a column is added.
Here's the example in the docs:
```scala
case class CityBedsOther(city: String, bedrooms: Int, other: List[String])

cityBeds.
  withColumn[CityBedsOther](lit(List("a","b","c"))).
  show(1).run()
```
Couldn't we just assume that the schema of the existing columns stays the same, and only supply the type of the column that's being added?
```scala
cityBeds.
  withColumn[List[String]](lit(List("a","b","c"))).
  show(1).run()
```
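To sketch the idea outside of Spark: one way a library could carry the existing columns over without a hand-written target schema is to pair the original row type with the new column's type, e.g. returning `(T, A)`. This is only an illustration of the proposal, not frameless's actual API; `withColumnTupled` here is a hypothetical name:

```scala
object WithColumnSketch {
  case class CityBeds(city: String, bedrooms: Int)

  // Today the caller must spell out the full output schema by hand:
  case class CityBedsOther(city: String, bedrooms: Int, other: List[String])

  // Proposal sketch (hypothetical): only the new column's type is supplied,
  // and the existing columns are carried over as-is, here as a (T, A) pair.
  def withColumnTupled[T, A](row: T, value: A): (T, A) = (row, value)

  def main(args: Array[String]): Unit = {
    val r = withColumnTupled(CityBeds("Paris", 2), List("a", "b", "c"))
    println(r) // (CityBeds(Paris,2),List(a, b, c))
  }
}
```

A tuple-shaped result avoids the boilerplate case class entirely, at the cost of positional rather than named access to the new column.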
I think this'd be a lot more user friendly. I'm often dealing with schemas that have tons of columns, and I add lots of columns with `withColumn`. Let me know your thoughts!