Java, Java EE & Java Script

Monday, January 25, 2016

Analyzing Apache Access Logs in Apache Hive

By Pratheeban, 9:46 PM

The following statistics are analyzed:

A count of the response codes returned from the server.
The content size of responses returned from the server to the host.
The top ten most popular URLs in the Apache log.
The average, minimum, and maximum content size of responses returned from the server.

The steps to process data with Apache Hive

Before proceeding with the steps below, install the Cloudera QuickStart VM 5.5 and VMware Player. Hadoop 2.6, Java 1.7, Eclipse Luna, Hive, HBase, Spark, and all required libraries are included in the Cloudera VM.

Download the Apache log file from http://www.monitorware.com/en/logsamples/apache.php and unzip it.
Create an input directory in HDFS.

hadoop fs -mkdir -p /user/cloudera/hive/input

Copy the log file from the local file system to that directory in HDFS.

hadoop fs -put access_log /user/cloudera/hive/input/

Create an appropriate table for storing the Apache logs.
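The original statement for this step is not shown, so the following is only a sketch of one common approach: a table backed by Hive's RegexSerDe, where each capture group of the regular expression becomes one STRING column. The table name `apache_log`, the column names, the file name `create_apache_log.hql`, and the regular expression itself (written for the common and combined log formats) are assumptions, not the author's exact definition.

```shell
#!/bin/sh
# Write the table definition to a file so it can be rerun with `hive -f`.
# Table name, column names, and the regex are assumptions for illustration.
cat > create_apache_log.hql <<'EOF'
CREATE TABLE IF NOT EXISTS apache_log (
  host          STRING,
  identity      STRING,
  remote_user   STRING,
  request_time  STRING,
  request       STRING,
  status        STRING,
  content_size  STRING,
  referer       STRING,
  agent         STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?"
)
STORED AS TEXTFILE;
EOF

# Run the statement only when the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f create_apache_log.hql
else
  echo "hive CLI not found; statement saved to create_apache_log.hql"
fi
```

RegexSerDe only supports STRING columns, which is why the status code and content size are declared as strings here and would need a cast before any numeric aggregation.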

Load the access_log file into the table, using the form of the load statement that matches the file's location (local file system or HDFS).
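The two forms of the load can be sketched as follows. This assumes a table named `apache_log` (a hypothetical name), that the HDFS copy sits at the path used in the `hadoop fs -put` step, and that the local copy is in the current directory. The `SOURCE` variable and the file name `load_access_log.hql` are also illustrative.

```shell
#!/bin/sh
# SOURCE selects where access_log currently lives: "hdfs" (default) or "local".
SOURCE="${SOURCE:-hdfs}"

if [ "$SOURCE" = "local" ]; then
  # With LOCAL, Hive copies the file from the local file system into the table.
  STATEMENT="LOAD DATA LOCAL INPATH 'access_log' INTO TABLE apache_log;"
else
  # Without LOCAL, Hive moves the file from its HDFS location into the table.
  STATEMENT="LOAD DATA INPATH '/user/cloudera/hive/input/access_log' INTO TABLE apache_log;"
fi

printf '%s\n' "$STATEMENT" > load_access_log.hql

# Run the statement only when the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f load_access_log.hql
else
  echo "hive CLI not found; statement saved to load_access_log.hql"
fi
```

Note that loading from HDFS moves the file, so it will no longer appear under /user/cloudera/hive/input/ afterwards.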

List the count of response codes returned from the server.
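A query along these lines would produce the count per response code. It assumes a table named `apache_log` with a `status` column holding the HTTP response code; both names, and the file name `response_codes.hql`, are hypothetical.

```shell
#!/bin/sh
# Count how many requests ended with each HTTP status code.
# Table and column names are assumptions for illustration.
cat > response_codes.hql <<'EOF'
SELECT status, COUNT(*) AS hits
FROM apache_log
GROUP BY status
ORDER BY hits DESC;
EOF

# Run the query only when the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f response_codes.hql
else
  echo "hive CLI not found; query saved to response_codes.hql"
fi
```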

List the top 10 most popular URLs in the Apache log.
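One way to rank URLs is sketched below. It assumes a table named `apache_log` whose `request` column holds the full request line (for example `"GET /index.html HTTP/1.1"`), so the URL is the second space-separated token. The table name, column name, and file name `top_urls.hql` are hypothetical.

```shell
#!/bin/sh
# Rank requested URLs by number of hits and keep the first ten.
# Table and column names are assumptions for illustration.
cat > top_urls.hql <<'EOF'
SELECT url, COUNT(*) AS hits
FROM (
  -- The request line looks like: "METHOD /path PROTOCOL"; index 1 is the path.
  SELECT split(request, ' ')[1] AS url
  FROM apache_log
) requests
WHERE url IS NOT NULL
GROUP BY url
ORDER BY hits DESC
LIMIT 10;
EOF

# Run the query only when the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f top_urls.hql
else
  echo "hive CLI not found; query saved to top_urls.hql"
fi
```

The `WHERE url IS NOT NULL` filter drops malformed request lines that contain no second token.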

List the content size of responses returned from the server.
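The original query is not shown, so this sketch takes one possible reading of the step: the total number of bytes sent to each client host. It assumes a table named `apache_log` with `host` and `content_size` columns stored as strings; those names and the file name `content_size_by_host.hql` are hypothetical.

```shell
#!/bin/sh
# Sum the response sizes per client host.
# Table and column names are assumptions for illustration.
cat > content_size_by_host.hql <<'EOF'
-- A size logged as "-" casts to NULL, which SUM ignores.
SELECT host, SUM(CAST(content_size AS BIGINT)) AS total_bytes
FROM apache_log
GROUP BY host
ORDER BY total_bytes DESC;
EOF

# Run the query only when the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f content_size_by_host.hql
else
  echo "hive CLI not found; query saved to content_size_by_host.hql"
fi
```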

List the average, min, and max content size of responses returned from the server.
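The three aggregates can be computed in a single pass, as in this sketch. It assumes a table named `apache_log` with a string `content_size` column; those names and the file name `content_size_stats.hql` are hypothetical.

```shell
#!/bin/sh
# Compute average, minimum, and maximum response size in one query.
# Table and column names are assumptions for illustration.
cat > content_size_stats.hql <<'EOF'
SELECT AVG(CAST(content_size AS BIGINT)) AS avg_size,
       MIN(CAST(content_size AS BIGINT)) AS min_size,
       MAX(CAST(content_size AS BIGINT)) AS max_size
FROM apache_log;
EOF

# Run the query only when the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f content_size_stats.hql
else
  echo "hive CLI not found; query saved to content_size_stats.hql"
fi
```

The cast matters because MIN and MAX on a STRING column compare values as text, so "9" would sort above "10000".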
