task-6

Tue Oct 14 2025 18:50:09 GMT+0000 (Coordinated Universal Time)

Saved by @rcb

STEP 1: Launch Hive

Open the Terminal in Cloudera and type:

hive


You should see:

Logging initialized using configuration in /etc/hive/conf/hive-log4j.properties
hive>

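STEP 2: Create or Use a Database

List the existing databases, create the company database if it does not exist yet, and switch to it:
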
SHOW DATABASES;
CREATE DATABASE IF NOT EXISTS company;
USE company;


Confirm:

SELECT current_database();

STEP 3: Create a Table

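Create a simple comma-delimited table to hold employee data. The skip.header.line.count property tells Hive to ignore the first line of each data file, which will be the CSV header row: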

CREATE TABLE employees (
  id INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
TBLPROPERTIES ("skip.header.line.count"="1");

STEP 4: Create a CSV File

Exit Hive by typing exit; and then run the following in the terminal:

cd /home/cloudera/
gedit employees.csv


Paste the data below:

id,name
101,satvik
102,rahul
103,rishi
104,nithish


Save and close the file.

STEP 5: Load Data into the Hive Table

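Reopen Hive, switch to the company database, and load the file. The LOCAL keyword tells Hive to read the file from the local filesystem rather than from HDFS: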

hive
USE company;
LOAD DATA LOCAL INPATH '/home/cloudera/employees.csv' INTO TABLE employees;


Check the data:

SELECT * FROM employees;


✅ Output:

101	satvik
102	rahul
103	rishi
104	nithish
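
The header line (id,name) is missing from the output because of skip.header.line.count. As a rough illustration of what the table definition asks Hive to do, here is a minimal plain-Java sketch that skips the header and splits each line on the delimiter. The class DelimitedRowSketch and its parse method are hypothetical names for this illustration only; this is not how Hive is implemented internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: mimics how the employees table reads the CSV.
final class DelimitedRowSketch {

    // Skips the first headerLines lines (the role of skip.header.line.count)
    // and splits every remaining line on the delimiter (the role of FIELDS TERMINATED BY).
    static List<String[]> parse(List<String> lines, String delimiter, int headerLines) {
        List<String[]> rows = new ArrayList<>();
        for (int i = headerLines; i < lines.size(); i++) {
            // Pattern.quote treats the delimiter literally; -1 keeps trailing empty fields.
            rows.add(lines.get(i).split(Pattern.quote(delimiter), -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList(
                "id,name", "101,satvik", "102,rahul", "103,rishi", "104,nithish");
        for (String[] row : parse(csv, ",", 1)) {
            // The first column is declared INT in the table, so parse it as a number.
            System.out.println(Integer.parseInt(row[0]) + "\t" + row[1]);
        }
    }
}
```

Running main prints the same four tab-separated rows shown above. Without the header skip, the text "id" would not parse as an INT; in Hive that value would typically show up as NULL in the first row.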

STEP 6: Create a Hive UDF (Java File)

Exit Hive and return to the terminal.

Create a new Java file:

gedit CapitalizeUDF.java


Paste the code:

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class CapitalizeUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) return null;
        String str = input.toString().trim();
        if (str.isEmpty()) return new Text("");
        String result = str.substring(0, 1).toUpperCase() + str.substring(1).toLowerCase();
        return new Text(result);
    }
}


Save and close.
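
Before compiling against the Hive libraries, it can help to check the capitalization rule on its own. The sketch below copies the logic of evaluate into a plain-Java class that uses String instead of Text, so it runs without Hadoop or Hive on the classpath. CapitalizeLogicCheck is a hypothetical name used only for this check; it is not needed for the remaining steps.

```java
// Hypothetical stand-alone check: mirrors CapitalizeUDF.evaluate, but on String instead of Text.
final class CapitalizeLogicCheck {

    static String capitalize(String input) {
        if (input == null) return null;          // NULL in, NULL out
        String str = input.trim();               // surrounding whitespace is dropped
        if (str.isEmpty()) return "";            // blank input becomes an empty string
        // Upper-case the first character, lower-case the rest.
        return str.substring(0, 1).toUpperCase() + str.substring(1).toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("satvik"));            // Satvik
        System.out.println(capitalize("  rAHUL "));          // Rahul
        System.out.println(capitalize("x"));                 // X
        System.out.println("[" + capitalize("   ") + "]");   // []
        System.out.println(capitalize(null));                // null
    }
}
```

Note that the function trims whitespace and lower-cases everything after the first character, so a value such as "McDonald" would come back as "Mcdonald".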

STEP 7: Compile the Java File

In the terminal:

javac -classpath "$(hadoop classpath):/usr/lib/hive/lib/*" -d . CapitalizeUDF.java


If compilation succeeds, the command prints no errors and creates CapitalizeUDF.class in the current directory.

STEP 8: Create a JAR File

Package the compiled class into a JAR:

jar -cvf CapitalizeUDF.jar CapitalizeUDF.class


Check:

ls


You should see:

CapitalizeUDF.java  CapitalizeUDF.class  CapitalizeUDF.jar
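
Seeing the JAR in the directory listing does not confirm what is inside it. Because CapitalizeUDF has no package declaration, Hive will look for CapitalizeUDF.class at the root of the archive. Running jar -tf CapitalizeUDF.jar lists the entries; the same check can be sketched in Java with the standard java.util.jar API. JarEntryCheck and containsEntry are hypothetical names for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper: checks that a JAR holds a given entry at the expected path.
final class JarEntryCheck {

    // Returns true if the JAR at jarPath contains an entry with exactly this name.
    static boolean containsEntry(String jarPath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the JAR built in Step 8, looked up in the current directory.
        String jarPath = args.length > 0 ? args[0] : "CapitalizeUDF.jar";
        if (!new File(jarPath).isFile()) {
            System.out.println("No JAR found at " + jarPath);
            return;
        }
        System.out.println(containsEntry(jarPath, "CapitalizeUDF.class")
                ? "CapitalizeUDF.class found at the JAR root"
                : "CapitalizeUDF.class is missing from the JAR root");
    }
}
```

If the class were later moved into a package, the entry name and the class name given to Hive would both need the package prefix.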

STEP 9: Add JAR to Hive

Open Hive again:

hive
USE company;
ADD JAR /home/cloudera/CapitalizeUDF.jar;


Hive should confirm with output similar to:

Added resources: /home/cloudera/CapitalizeUDF.jar

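STEP 10: Create a Temporary Function

In the same Hive session, register the class from the JAR under the name capitalize. A temporary function lasts only until the session ends:
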
CREATE TEMPORARY FUNCTION capitalize AS 'CapitalizeUDF';

STEP 11: Use the Function

Call the function in a query like any built-in function:

SELECT id, capitalize(name) AS capitalized_name FROM employees;
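
Assuming the table holds the four rows loaded in Step 5, the query should return each id next to its name with the first letter capitalized. The sketch below simulates that result in plain Java by applying the same rule as CapitalizeUDF to the sample rows. CapitalizeQuerySketch and run are hypothetical names; nothing here talks to Hive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical simulation of the Step 11 query over the sample rows; it does not connect to Hive.
final class CapitalizeQuerySketch {

    // Same rule as CapitalizeUDF.evaluate, applied to plain Strings.
    static String capitalize(String input) {
        if (input == null) return null;
        String str = input.trim();
        if (str.isEmpty()) return "";
        return str.substring(0, 1).toUpperCase() + str.substring(1).toLowerCase();
    }

    // Applies capitalize to every name, keeping the row order, as the SELECT does row by row.
    static Map<Integer, String> run(Map<Integer, String> employees) {
        Map<Integer, String> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> row : employees.entrySet()) {
            result.put(row.getKey(), capitalize(row.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> employees = new LinkedHashMap<>();
        employees.put(101, "satvik");
        employees.put(102, "rahul");
        employees.put(103, "rishi");
        employees.put(104, "nithish");
        // Prints: 101 Satvik, 102 Rahul, 103 Rishi, 104 Nithish (one tab-separated row per line)
        for (Map.Entry<Integer, String> row : run(employees).entrySet()) {
            System.out.println(row.getKey() + "\t" + row.getValue());
        }
    }
}
```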