Wednesday, June 11, 2014

Biggest Hadoop environments

Some interesting numbers found in posts about biggest hadoop clusters:
  • Yahoo: 
    • 32,000 nodes
    • benchmarking environment: 300 nodes
  • Twitter (just one of the many hadoop clusters): 
    • 3,500 nodes
    • 50PB total
    • 6PB per day
    • 30K jobs per day
  • Facebook: 
    • 300PB hive data
    • 600 TB per day
  • Linkedin: 
    • 170PB of logs (kafka)


sources:
http://www.datanami.com/2014/06/04/yahoo-run-whole-company-hadoop/
https://code.facebook.com/posts/229861827208629/scaling-the-facebook-data-warehouse-to-300-pb/
http://pt.slideshare.net/lohitvijayarenu/hadoop-2-twitter-elephant-scale-presented-at
http://www.enterprisetech.com/2013/11/08/cluster-sizes-reveal-hadoop-maturity-curve/
http://www.slideshare.net/JayKreps1/i-32858698

2 comments:

  1. Actually, you have explained the technology to the fullest. Thanks for sharing the information you have got. It helped me a lot. I experimented your thoughts in my training program.


    Hadoop Training in Chennai
    Big Data Training in Chennai

    ReplyDelete
  2. Thank you for your guide to with upgrade information about Hadoop
    Hadoop admin Online Training

    ReplyDelete