Remove records from a DataFrame where any column is null or empty

Question    Votes: 0    Answers: 2

Is there any way to remove the records in a DataFrame for which any column value is null or empty?

+---+-------+--------+-------------------+-----+----------+
|id |zipcode|type    |city               |state|population|
+---+-------+--------+-------------------+-----+----------+
|1  |704    |STANDARD|                   |PR   |30100     |
|2  |704    |        |PASEO COSTA DEL SUR|PR   |          |
|3  |76166  |UNIQUE  |CINGULAR WIRELESS  |TX   |84000     |
+---+-------+--------+-------------------+-----+----------+

I would like the output to be:

+---+-------+------+-----------------+-----+----------+
|id |zipcode|type  |city             |state|population|
+---+-------+------+-----------------+-----+----------+
|3  |76166  |UNIQUE|CINGULAR WIRELESS|TX   |84000     |
+---+-------+------+-----------------+-----+----------+
scala dataframe apache-spark record is-empty
2 Answers
0
votes

Try this:

df
  .na.replace(df.columns, Map("" -> null)) // replace empty strings with null (string columns only)
  .na.drop() // drop rows containing any null or NaN
  .show()
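
For reference, here is a minimal self-contained sketch of the same approach applied to the sample data from the question. It assumes that every column is a string, that the blank cells are empty strings (""), and that the Spark version in use accepts null as a replacement value in na.replace (older releases may reject it). The object name DropEmptyRows and the helper clean are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DropEmptyRows {
  // Hypothetical helper: turn "" into null in every string column,
  // then drop each row that contains at least one null or NaN.
  def clean(df: DataFrame): DataFrame =
    df.na.replace(df.columns.toSeq, Map[String, String]("" -> null))
      .na.drop()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-empty-rows")
      .getOrCreate()
    import spark.implicits._

    // Assumption: all columns are strings and blanks are empty strings.
    val df = Seq(
      ("1", "704", "STANDARD", "", "PR", "30100"),
      ("2", "704", "", "PASEO COSTA DEL SUR", "PR", ""),
      ("3", "76166", "UNIQUE", "CINGULAR WIRELESS", "TX", "84000")
    ).toDF("id", "zipcode", "type", "city", "state", "population")

    // Rows 1 and 2 each contain an empty string, so only the row with id 3 remains.
    clean(df).show(false)

    spark.stop()
  }
}
```

If the blanks are whitespace rather than empty strings, or the columns are not strings, this replacement will not match them and those rows will be kept.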

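0
votes

Try this (note that na.drop() removes only rows containing null or NaN, so rows with empty strings are kept unless those values are converted to null first):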

df.na.drop()
  .show(false)

Hope this helps...
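
Since na.drop() on its own only looks at null and NaN, one alternative is to build a filter that also rejects empty and whitespace-only values. The following is a rough sketch, not the only way to do it: the object BlankRows and the helper dropNullOrBlank are hypothetical names, each column is cast to string purely for the comparison, and it assumes the DataFrame has at least one column and that column names contain no dots.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, trim}

object BlankRows {
  // Hypothetical helper (not part of Spark): keep only the rows in which
  // every column, viewed as a string, is non-null and not blank.
  def dropNullOrBlank(df: DataFrame): DataFrame = {
    val allPresent: Column = df.columns
      .map { name =>
        val asString = col(name).cast("string")
        // A null fails isNotNull; "" and "   " fail the trimmed comparison.
        asString.isNotNull && trim(asString) =!= ""
      }
      .reduce(_ && _) // assumes at least one column
    df.filter(allPresent)
  }
}
```

Usage would be along the lines of BlankRows.dropNullOrBlank(df).show(false). Unlike the na.replace approach, this leaves the original values untouched and does not depend on the columns being string-typed.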
