Reading Notes - Large-scale Distributed Storage Systems: Principles Analysis and Architecture Practice: IX

This article was last updated on: February 7, 2024

13 Big Data

13.1 Concepts

Features: the 4 Vs

  1. Volume: The amount of data is very large
  2. Variety: There are many different types of data
  3. Velocity: Data grows very fast
  4. Value: Value density is low

13.2 MapReduce

The user only needs to write two functions, Map and Reduce.

The MapReduce framework includes 3 roles:

  • Master: Divides the job into tasks, schedules them, and coordinates between tasks
  • Map Worker processes
  • Reduce Worker processes
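
As a rough illustration of the programming model above, here is a minimal single-process sketch of a word count. The names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical and chosen for this example; `run_mapreduce` only imitates in one process what the Master and the Map/Reduce Workers would do across many machines, and is not the API of any real framework.

```python
from collections import defaultdict


def map_fn(_doc_id, text):
    """User-written Map: emit a (word, 1) pair for every word in the text."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """User-written Reduce: sum all counts collected for one word."""
    return word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy stand-in for the framework (hypothetical, single process)."""
    # Map phase: apply map_fn to every input record and
    # group the emitted values by key (the "shuffle" step).
    intermediate = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    # Reduce phase: apply reduce_fn once per distinct key.
    return dict(reduce_fn(k, vs) for k, vs in sorted(intermediate.items()))


docs = {"d1": "big data is big", "d2": "data grows fast"}
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
print(word_counts)
```

Only `map_fn` and `reduce_fn` are what the user would write; everything inside `run_mapreduce` is the part a real framework would distribute and schedule.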

13.3 Streaming Computing

Places greater emphasis on low latency in data processing.
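
To make the contrast with batch processing concrete, the following is a minimal sketch, assuming a simple in-memory stream, of handling each record as soon as it arrives rather than waiting for the full data set. The function `sliding_average` and the sample `readings` are hypothetical and only for illustration; they are not taken from any particular streaming system.

```python
from collections import deque


def sliding_average(stream, window=3):
    """Yield the mean of the most recent `window` values as each record arrives."""
    recent = deque(maxlen=window)  # old values fall out automatically
    for value in stream:
        recent.append(value)
        # A result is available immediately after every record,
        # instead of only after the whole input has been read.
        yield sum(recent) / len(recent)


readings = [10, 20, 30, 40]
averages = list(sliding_average(readings))
print(averages)
```

Because the function is a generator, it works equally well on an unbounded source, which is the typical situation in streaming workloads.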

13.5 Real-Time Analytics

13.5.1 MPP architecture

MPP (Massively Parallel Processing)

13.5.2 EMC Greenplum

An OLAP product built on top of the open-source PostgreSQL database.


https://e-whisper.com/posts/27007/
Author
east4ming
Posted on
September 18, 2021
Licensed under