0

How can I improve the performance of a PySpark script that processes a large amount of data?

I have written a PySpark script that extracts a few required fields from a nested JSON structure and creates a DataFrame from them. However, it takes too much time to process even a small number of records.

2nd Oct 2020, 11:24 AM
Sangram Patil
2 Answers
+ 7
Without having seen your code, we cannot help you. You can put your code in the playground and link it here. Thanks!
2nd Oct 2020, 11:30 AM
Lothar
+ 1
You may want to look into threading if you have too many processes going on.
2nd Oct 2020, 11:29 AM
Slick