Category: Big Data

Analytics, Big Data, Technical Stuff

Storm, a brief exploration into it

What is Apache Storm? Storm is a free and open source distributed realtime computation system that makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. It was created by Nathan Marz and his team at BackType and is written predominantly in the Clojure programming language. The project …

Analytics, Big Data, Technical Stuff

Spark, a brief exploration into it

What is Spark? Spark is an open source cluster computing framework originally developed at UC Berkeley (2009), open sourced in 2010, and later donated to the Apache Software Foundation (2013). It is a Big Data analytics engine created after Hadoop that improves on it and adds more capabilities. Spark comes as an alternative to Hadoop MapReduce to make it easier to build …

Analytics, Big Data, Technical Stuff

Cassandra, a brief exploration into it

What is Cassandra? Cassandra is an open source Big Data technology; more specifically, it is a distributed NoSQL database. It started at Facebook but was mainly developed within the Apache Software Foundation. It is based on the Dynamo and Bigtable papers. Amazon’s Dynamo paper was created for distributed database technology and Google’s Bigtable paper was …

Big Data, Technical Stuff

Hadoop, a brief exploration into it

Hadoop is a clustering and big data processing system. It’s an open source, Java-based technology developed at Yahoo that now belongs to the Apache Software Foundation. Why Hadoop? It is a reality that all of us are more connected, hence we produce more data, and we will keep on that path, so Hadoop arises …