I need complete guidance on restructuring data using a data streaming technique.
#BigData #datascience #machinelearning #spark #hadoop #HDFS #tech
It depends on the scale of the data and your SLA requirements. Say you want to restructure a small dataset in memory. If you are a Java or Scala developer, you can use Java 8 lambdas (the Stream API) or Scala collections to manipulate the data with functions like map, flatMap, and reduce.
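For the small, in-memory case, here is a minimal Java 8 sketch of what map, flatMap, and reduce can look like. The input format ("user:tag1,tag2"), the class name, and the method names are all made up for illustration and are not from any particular library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class RestructureExample {

    // Hypothetical input: each line looks like "user:tag1,tag2".
    // Restructures the per-user lines into a sorted tag -> count map.
    static Map<String, Long> countTags(List<String> lines) {
        return lines.stream()
                // map: keep only the part after the first ':'
                .map(line -> line.split(":", 2)[1])
                // flatMap: turn each comma-separated list into individual tags
                .flatMap(tags -> Arrays.stream(tags.split(",")))
                // group identical tags and count them
                .collect(Collectors.groupingBy(tag -> tag, TreeMap::new, Collectors.counting()));
    }

    // Uses reduce to sum the number of tags across all lines.
    static int totalTags(List<String> lines) {
        return lines.stream()
                .map(line -> line.split(":", 2)[1].split(",").length)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "alice:spark,kafka",
                "bob:kafka",
                "carol:hdfs,spark,kafka");
        System.out.println(countTags(lines));
        System.out.println(totalTags(lines));
    }
}
```

This only works while the whole dataset fits comfortably in one JVM's memory.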
But if your dataset is in the terabyte range and you need low-latency processing, consider building a large-scale, distributed streaming pipeline. The following tech stack may help you start your evaluation:
Kafka (for pub/sub)
Kinesis (for pub/sub)
Spark Streaming (for compute)
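To give a feel for how the pieces fit together, here is a toy, single-process sketch of the micro-batch idea in plain Java. It is not Kafka, Kinesis, or Spark code: a BlockingQueue stands in for the pub/sub topic, and a simple loop stands in for the compute layer. The class name, method name, and event values are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MicroBatchSketch {

    // Pulls up to batchSize records at a time from the queue (the stand-in
    // for a topic) and keeps a running count per record value.
    static Map<String, Integer> process(BlockingQueue<String> topic, int batchSize) {
        Map<String, Integer> counts = new HashMap<>();
        List<String> batch = new ArrayList<>();
        // drainTo returns the number of records moved; 0 means the queue is empty
        while (topic.drainTo(batch, batchSize) > 0) {
            for (String record : batch) {
                counts.merge(record, 1, Integer::sum);
            }
            batch.clear();
        }
        return counts;
    }

    public static void main(String[] args) {
        BlockingQueue<String> topic = new LinkedBlockingQueue<>();
        // the "producer" side: publish a few events
        for (String event : new String[] {"click", "view", "click", "purchase", "view", "click"}) {
            topic.add(event);
        }
        // the "consumer" side: process in batches of 2
        System.out.println(process(topic, 2));
    }
}
```

In a real pipeline the queue would be replaced by a distributed log and the loop by a cluster of workers, which is where the tools listed above come in.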
Hope it helps!