I’m sharing Big Data Fundamentals, Part II (Part I is here), an introduction to Big Data covering: Big Data processes: ingest, store, process/query, visualize; tools and technologies: Hadoop, Sqoop, Kafka, Mesos, Redis, CouchDB; document stores: MongoDB; column stores: HBase and Cassandra; Big Data analytics: Spark, Storm; and the Elastic Stack: Logstash, Elasticsearch and Kibana.
We’ll also look at machine learning techniques with Spark (MLlib, Spark Streaming) and TensorFlow. Continue reading «Big Data Fundamentals, Part II»
I’m sharing Big Data Fundamentals, Part I, an introduction to Big Data covering: the Big Data market and trends; definition and history; Big Data types (structured, unstructured, semi-structured); some use cases; best practices for Big Data analytics; and an overview of Apache Hadoop: HDFS, MapReduce, YARN. Continue reading «Big Data Fundamentals – Part I»
In this tutorial we’ll install the Big Data framework Apache Hadoop on a previously installed CentOS 8 virtual machine, using Docker containers to build the cluster.
This setup is for testing purposes only, not for production. Be careful and don’t expose it to the Internet, since no security measures are configured for it.
Continue reading «Hadoop installation on CentOS 8 Tutorial»
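As a taste of the container-based approach the tutorial uses, here is a minimal sketch with the official Apache Hadoop image from Docker Hub. The image tag and the commands are illustrative assumptions, not the tutorial's exact steps; check Docker Hub for current tags before pulling.

```shell
# Pull the official Apache Hadoop image (tag "3" is an assumption;
# verify the available tags on Docker Hub first)
docker pull apache/hadoop:3

# Start a throwaway interactive container and verify the Hadoop install
docker run -it --rm apache/hadoop:3 bash -c "hadoop version"

# For a multi-container cluster, a user-defined network lets the
# containers resolve each other by name (network name is hypothetical)
docker network create hadoop-net
```

A real cluster additionally needs HDFS and YARN configuration (core-site.xml, hdfs-site.xml) mounted or baked into the containers, which is what the full tutorial walks through.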