• Big Data Fundamentals, Part II

    I’m sharing Big Data Fundamentals, Part II (Part I is here), an introduction to Big Data covering: Big Data processes (ingest, store, process/query, visualize); tools and technologies (Hadoop, Sqoop, Kafka, Mesos, Redis, CouchDB); document stores (MongoDB); column stores (HBase, Cassandra); Big Data analytics (Spark, Storm); and the Elastic Stack (Logstash, Elasticsearch, Kibana).

    We’ll also look at machine learning techniques with Spark (MLlib, Streaming) and TensorFlow.

  • Big Data Fundamentals – Part I

    I’m sharing Big Data Fundamentals, Part I, an introduction to Big Data covering: the Big Data market and trends, definition and history, Big Data types (structured, unstructured, semi-structured), some use cases, and best practices for Big Data analytics; plus an overview of Apache Hadoop (HDFS, MapReduce, YARN).

  • Hadoop installation on CentOS 8 Tutorial

    In this tutorial we’ll install the Big Data framework Apache Hadoop on a previously installed CentOS 8 virtual machine, using Docker containers to create the cluster.

    This setup is for testing purposes, not for production. Be careful not to expose it to the Internet, since I’m not setting up any security measures for it.
