Home
problems and approaches (geo-distributed systems)
tags: #geo-distributed
- Problems and Approaches in [[geo-distributed-systems]]
- low response times(latency) for analytics queries issued across all geo-distributed sites
- Most common approach is loading relevant data to a datacenter and centrally executing it.
- Not ideal as this is slow and we might need (near) real time analysis
- Latency depends heavily on Query Execution Plan
- But query optimizers are not aware
- There also could be bottlenecks at the runtime
- lube - a system framework that detects and minimize bottlenecks
- Data transfer issue across sites & network Performance bottleneck
- running queries over geo-distributed inputs using the current intra-DC analytics frameworks
- also leads to high latency because these frameworks cannot cope with the relatively low and variable capacity of WAN links
- Also Expensive to transfer all data to a site
- Running parallel jobs across geo-distributed sites
- running queries over geo-distributed inputs using the current intra-DC analytics frameworks
- network transfer cost also need to be considered
- kimchi - a network cost aware geo-distributed analytics system
- Most common approach is loading relevant data to a datacenter and centrally executing it.
- low response times(latency) for analytics queries issued across all geo-distributed sites