PRACTICAL 5



AIM: 

Data Pre-processing and text analytics using Orange


THEORY:

Text Analytics

Text analytics is the automated process of translating large volumes of unstructured text into quantitative data to uncover insights, trends, and patterns.
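As a rough illustration of turning unstructured text into quantitative data, the sketch below builds a simple bag-of-words count matrix using only the Python standard library. The example sentences and the helper names (tokenize, bag_of_words) are made up for illustration and are not part of the practical's dataset or of Orange.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Return one Counter of token frequencies per document."""
    return [Counter(tokenize(doc)) for doc in documents]


# Hypothetical example documents (not from the practical's dataset).
docs = [
    "The battery life is great, really great.",
    "The screen is dull and the battery drains fast.",
]

counts = bag_of_words(docs)

# Shared vocabulary across all documents, in alphabetical order.
vocabulary = sorted(set().union(*counts))

# One row per document, one column per vocabulary word.
matrix = [[c[word] for word in vocabulary] for c in counts]

for row in matrix:
    print(row)
```

Each row of the matrix is a numeric representation of one document, which is the kind of input that downstream analysis can work with.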


Sentiment Analysis

Sentiment analysis is the process of determining whether a piece of writing is positive, negative, or neutral. A sentiment analysis system combines natural language processing (NLP) and machine learning techniques to assign weighted sentiment scores to the entities, topics, themes, and categories within a sentence or phrase.
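To make the idea of weighted sentiment scores concrete, here is a minimal sketch of a lexicon-based scorer. Note that this is a simpler technique than the NLP and machine learning systems described above: it just sums hand-assigned word weights. The lexicon, its weights, and the threshold are all hypothetical values chosen for illustration.

```python
import re

# Hypothetical, tiny sentiment lexicon with hand-picked weights.
# Real systems rely on much larger lexicons or trained models.
LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "love": 2.0,
    "bad": -1.0,
    "poor": -1.5,
    "terrible": -2.0,
}


def sentiment_score(text):
    """Sum the lexicon weights of all words in the text (unknown words count as 0)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(LEXICON.get(token, 0.0) for token in tokens)


def sentiment_label(text, threshold=0.5):
    """Map the numeric score to positive, negative, or neutral."""
    score = sentiment_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


for review in [
    "The camera is great and I love the screen",
    "Terrible battery, poor signal",
    "It has 4 GB of RAM",
]:
    print(review, "->", sentiment_label(review))
```

A scorer like this ignores negation and context ("not good" would still score as positive), which is one reason practical systems use trained models instead.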


WHY?

Sentiment analysis is especially useful in social media monitoring because it gives an overview of public opinion on a given topic. More generally, it is a quick way to extract insights from large volumes of text.


Dataset Description:

Based on mobile phone specifications such as battery power, 3G support, Wi-Fi, Bluetooth, and RAM, the task is to predict the price range of the phone.


Preprocessing is a key step in data science. Orange offers several ways to carry it out, including its preprocessing widgets and the Python Script widget.


Normalization: 

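In data preprocessing, normalization means rescaling numeric attributes to a common range so that attributes with large values (for example, battery power) do not dominate attributes with small values. This is different from database normalization, which decomposes tables to remove redundancy. The sample code uses Normalize.NormalizeBySpan, which rescales each numeric attribute by its span (maximum minus minimum).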

Sample Code:

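# Runs inside Orange's Python Script widget, which supplies in_data
# and reads the result from out_data.
from Orange.preprocess import Normalize

data = in_data

# Rescale each numeric attribute by its span (max - min).
normalizer = Normalize(norm_type=Normalize.NormalizeBySpan)
out_data = normalizer(data)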
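To show what span-based normalization does to a single numeric column, the sketch below reimplements the arithmetic in plain Python, without Orange. It assumes the target range is [0, 1]; the function name normalize_by_span and the RAM values are hypothetical and only mirror the idea, not Orange's actual implementation.

```python
def normalize_by_span(values, lower=0.0, upper=1.0):
    """Min-max scale a list of numbers into the range [lower, upper]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant column has no spread; map every value to the lower bound.
        return [lower for _ in values]
    return [lower + (v - lo) * (upper - lower) / span for v in values]


# Hypothetical RAM sizes in MB (illustrative values, not from the dataset).
ram_mb = [512, 1024, 2048, 4096]
scaled = normalize_by_span(ram_mb)
print(scaled)
```

The smallest value maps to the lower bound and the largest to the upper bound, with everything else placed proportionally in between.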


1. Preprocessing

Flow of Work



Preprocessing Widgets

Original Dataset

Preprocessed Dataset


2. Python Script in Orange


Python Scripts Workflow

Normalization Script

Normalized Data





REFERENCES:

https://docs.biolab.si//3/data-mining-library/reference/preprocess.html

https://orange-data-mining-library.readthedocs.io/en/latest/#tutorial


