What is Helix?

Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated, and distributed resources hosted on a cluster of nodes. Helix automates the reassignment of resources in the face of node failure and recovery, cluster expansion, and reconfiguration. It does so by modeling a distributed system as a state machine with constraints on states and transitions.
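To make "a state machine with constraints on states and transitions" more concrete, here is a minimal sketch in plain Python. It does not use the Helix API. The OFFLINE state, the transition table, the "at most one MASTER per partition" limit, and the helper name can_transition are all assumptions chosen for illustration.

```python
from collections import Counter

# Legal transitions of a hypothetical Master/Slave state model.
# OFFLINE is an assumed extra state for replicas that are not serving.
TRANSITIONS = {
    ("OFFLINE", "SLAVE"),
    ("SLAVE", "MASTER"),
    ("MASTER", "SLAVE"),
    ("SLAVE", "OFFLINE"),
}

# State constraint: upper bound on replicas per state within one partition.
MAX_PER_STATE = {"MASTER": 1}


def can_transition(replica_states, node, target):
    """Return True if moving `node` to `target` is a legal transition
    and keeps the partition within its state constraints."""
    current = replica_states[node]
    if (current, target) not in TRANSITIONS:
        return False
    # Count the states of all other replicas of this partition.
    counts = Counter(s for n, s in replica_states.items() if n != node)
    limit = MAX_PER_STATE.get(target)
    return limit is None or counts[target] < limit


# One partition with three replicas, keyed by node name.
partition = {"node1": "MASTER", "node2": "SLAVE", "node3": "OFFLINE"}
print(can_transition(partition, "node2", "MASTER"))  # False: node1 is already MASTER
print(can_transition(partition, "node3", "SLAVE"))   # True
print(can_transition(partition, "node3", "MASTER"))  # False: no direct OFFLINE -> MASTER
```

The transition table restricts how a replica may change roles, and the per-state limit restricts how many replicas may hold a role at once. A real system would likely add lower bounds and per-node limits as well.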

Terminology

  • Node: A single machine
  • Cluster: A set of nodes
  • Resource: A logical entity (e.g. database, index, task)
  • Partition: A subset of the resource (when the resource is a task, each subtask is referred to as a partition)
  • Replica: A copy of a partition. Replicas increase the availability of the system.
  • State: Describes the role of a replica (e.g. Master, Slave)
  • State Machine and Transitions: A transition is an action that allows a replica to move from one state to another, thus changing its role (e.g. Slave --> Master). The state machine defines the valid states and the legal transitions between them.
  • Spectators: The external clients. Helix provides an External View, which is an aggregated view of the Current State across all nodes.
  • Current State: Represents a resource's actual state at a participating node. Each node in the cluster has its own Current State, which includes:
    - INSTANCE_NAME: Unique name representing the process
    - SESSION_ID: ID that is automatically assigned every time a process joins the cluster
  • Rebalancer: The core component of Helix is the Controller, which runs the rebalancing algorithm on every cluster event.
  • Dynamic Ideal State: A powerful feature of Helix is that the Ideal State (the desired assignment of replicas and states to nodes) can be changed dynamically. Whenever a cluster event occurs, the Ideal State is adjusted according to one of three modes:
  1. FULL_AUTO
  2. SEMI_AUTO
  3. CUSTOMIZED
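The relationship between the per-node Current State and the External View seen by spectators can be sketched as a simple aggregation. This is an illustration in plain Python, not the Helix API. The dictionary layout, the node and partition names, and the helpers external_view and master_of are hypothetical.

```python
def external_view(current_states):
    """Merge per-node Current States into one view: partition -> {node: state}."""
    view = {}
    for node, partitions in current_states.items():
        for partition, state in partitions.items():
            view.setdefault(partition, {})[node] = state
    return view


def master_of(view, partition):
    """Return the node currently acting as MASTER for `partition`, or None."""
    for node, state in view.get(partition, {}).items():
        if state == "MASTER":
            return node
    return None


# Each node reports only what it hosts (its own Current State).
current_states = {
    "node1": {"db_0": "MASTER", "db_1": "SLAVE"},
    "node2": {"db_0": "SLAVE", "db_1": "MASTER"},
}

view = external_view(current_states)
print(view["db_0"])              # {'node1': 'MASTER', 'node2': 'SLAVE'}
print(master_of(view, "db_1"))   # node2
```

A spectator such as a request router would read only the aggregated view, so it never needs to contact individual nodes to learn which replica to send writes to.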

Cluster events can be one of the following:

  • Nodes start and/or stop
  • Nodes experience soft and/or hard failures
  • Nodes are added to or removed from the cluster
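As a rough sketch of how a controller might react to such an event, the example below recomputes an assignment of replicas whenever the set of live nodes changes. This is loosely in the spirit of a fully automatic mode, but the round-robin placement is an assumption made for illustration and is not the algorithm Helix uses. The function name compute_ideal_state and all node and partition names are hypothetical.

```python
def compute_ideal_state(partitions, live_nodes, replicas):
    """Assign each partition up to `replicas` nodes round-robin.
    The first chosen node is MASTER, the rest are SLAVE."""
    nodes = sorted(live_nodes)
    ideal = {}
    for i, partition in enumerate(partitions):
        count = min(replicas, len(nodes))
        chosen = [nodes[(i + k) % len(nodes)] for k in range(count)]
        ideal[partition] = {
            node: ("MASTER" if k == 0 else "SLAVE")
            for k, node in enumerate(chosen)
        }
    return ideal


partitions = ["db_0", "db_1", "db_2"]
live = {"node1", "node2", "node3"}

before = compute_ideal_state(partitions, live, replicas=2)

# Cluster event: node2 stops or fails, so the assignment is recomputed.
live.discard("node2")
after = compute_ideal_state(partitions, live, replicas=2)

for partition in partitions:
    print(partition, before[partition], "->", after[partition])
```

After the event, no partition references node2 and each partition still has exactly one MASTER. A production rebalancer would also try to minimize how many replicas move, which this sketch ignores.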


[1] http://helix.apache.org/Concepts.html
