Sentiment Analysis Using Doctors Dataset


Project period

06/23/2018 - 07/22/2018





Sentiment analysis is the process of determining the emotional tone behind a series of words, used to gain an understanding of the attitudes, opinions, and emotions expressed. Sentiment analysis is extremely useful in social media monitoring as it allows us to gain an overview of the wider public opinion behind certain topics. The applications of sentiment analysis are broad and powerful. The ability to extract insights from social data is a practice that is being widely adopted by organizations across the world.

Why: Problem statement

Many good doctors are not well known to the public, while patients are looking for doctors who are both good and familiar to them.

How: Solution description

To make good doctors better known, I analyze the feedback of their patients. Here, I used sentiment analysis to review patient feedback for a select group of doctors from across Pondicherry. I used Python and Jupyter Notebook to develop the system, relying on scikit-learn for the machine learning components.

Bag of words:

Classifiers and learning algorithms cannot directly process text documents in their original form, as most of them expect numerical feature vectors with a fixed size rather than raw text documents of variable length. Therefore, during the preprocessing step, the texts are converted to a more manageable representation.

One common approach for extracting features from the text is to use the bag of words model: a model where for each document, the patient feedback in our case, the presence of words is taken into consideration, but the order in which they occur is ignored.
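As a rough illustration of the bag of words idea, the sketch below counts words in two made-up feedback snippets with scikit-learn's CountVectorizer. The sample texts are invented for this example and are not taken from the project dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical feedback snippets, invented for illustration only.
feedback = [
    "good doctor good care",
    "care was poor",
]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(feedback)

# Terms listed in column order (CountVectorizer sorts them alphabetically).
vocabulary = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
dense_counts = counts.toarray()

# The same words in a different order give the same vector,
# because word order is ignored.
reordered = vectorizer.transform(["care good doctor good"]).toarray()

print(vocabulary)    # ['care', 'doctor', 'good', 'poor', 'was']
print(dense_counts)  # [[1 1 2 0 0], [1 0 0 1 1]]
```

Each row is one document and each column one word of the vocabulary, so every document ends up with a vector of the same fixed length.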

TF-IDF:

Specifically, for each term in our dataset, we calculate a measure that combines Term Frequency (TF) and Inverse Document Frequency (IDF). We use sklearn.feature_extraction.text.TfidfVectorizer to calculate a tf-idf vector for each piece of patient feedback. The IDF part reduces the weight of common words that occur in almost every document.

Now, each of the 570 patient feedback entries is represented by 4 features, representing the tf-idf scores for different unigrams and bigrams.
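The following is a minimal sketch of how tf-idf features over unigrams and bigrams might be computed with TfidfVectorizer. The three sample sentences and the ngram_range=(1, 2) setting are assumptions made for illustration; the exact parameters used in the project are not stated here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical feedback texts, invented for illustration only.
feedback = [
    "the doctor was very good",
    "the doctor was rude",
    "the staff was helpful",
]

# ngram_range=(1, 2) asks for unigrams and bigrams (an assumed setting).
tfidf = TfidfVectorizer(ngram_range=(1, 2))
features = tfidf.fit_transform(feedback)

# Map every term to its inverse document frequency.
terms = sorted(tfidf.vocabulary_, key=tfidf.vocabulary_.get)
idf_by_term = dict(zip(terms, tfidf.idf_))

# "the" occurs in every document, so it gets the lowest possible weight,
# while "rude" occurs in only one document and is weighted more heavily.
print(round(idf_by_term["the"], 3), round(idf_by_term["rude"], 3))
print(features.shape)  # (number of documents, number of unigrams and bigrams)
```

With scikit-learn's default smoothing, a term that appears in all documents has an IDF of exactly 1, which is how very common words end up contributing less to the feature vector.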

Scikit-learn also has a high-level component, CountVectorizer, which creates feature vectors for us. I then split the data into training and test sets using the train_test_split method, which gave 456 items in the training data and 114 in the test data.
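The 456/114 split is consistent with holding out 20% of the 570 entries. The sketch below reproduces those sizes on placeholder data; test_size=0.2 and random_state=0 are assumptions inferred from the numbers, not settings confirmed by the project.

```python
from sklearn.model_selection import train_test_split

# Placeholder data standing in for 570 feedback texts and their labels.
texts = ["feedback %d" % i for i in range(570)]
labels = [i % 2 for i in range(570)]

# Hold out 20% of the data for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0
)

print(len(X_train), len(X_test))  # 456 114
```

Fixing random_state makes the split repeatable, which helps when comparing several models on the same held-out data.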

Stopwords:

A stop word is a commonly used word that a search engine has been programmed to ignore, both when indexing entries for searching and when retrieving them as the result of a search query.

We do not want these words taking up space in our database or valuable processing time. We can remove them easily by storing a list of the words that we consider to be stop words. NLTK (Natural Language Toolkit) in Python has a list of stopwords stored in the nltk_data directory.
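Stop word removal can be sketched in a few lines of plain Python. The small STOP_WORDS set and the remove_stop_words helper below are hypothetical and exist only for this example; with NLTK installed and its stopwords corpus downloaded, nltk.corpus.stopwords.words('english') could supply a much fuller list instead.

```python
# A tiny hand-written stop list, for illustration only.
STOP_WORDS = {"the", "is", "a", "and", "was", "to"}


def remove_stop_words(text):
    """Lowercase the text, split it on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


tokens = remove_stop_words("The doctor was kind and the staff is helpful")
print(tokens)  # ['doctor', 'kind', 'staff', 'helpful']
```

Using a set for the stop list keeps each membership check cheap, which matters once the feedback collection grows.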

Machine Learning Models:

We are now ready to experiment with different machine learning models, evaluate their accuracy and find the source of any potential issues.

We benchmarked the following three models and obtained these accuracy scores:

Logistic Regression: 0.97

Bernoulli NB: 0.93

MultinomialNB: 0.86
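A comparison of this kind might be set up as sketched below. The texts and labels are invented toy data, and the use of CountVectorizer with default model settings is an assumption, so the scores printed by this sketch say nothing about the accuracies reported above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Hypothetical feedback and labels (1 = positive, 0 = negative).
train_texts = [
    "excellent doctor very caring",
    "very good treatment",
    "good and friendly staff",
    "excellent care",
    "poor service and rude staff",
    "bad experience very rude",
    "poor treatment",
    "bad doctor",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_texts = ["excellent treatment", "rude doctor"]
test_labels = [1, 0]

# Fit the vocabulary on the training texts only, then reuse it for the test texts.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

models = {
    "Logistic Regression": LogisticRegression(),
    "Bernoulli NB": BernoulliNB(),
    "Multinomial NB": MultinomialNB(),
}

# Train each model and record its accuracy on the held-out texts.
scores = {}
for name, model in models.items():
    model.fit(X_train, train_labels)
    scores[name] = accuracy_score(test_labels, model.predict(X_test))

for name, score in scores.items():
    print(name, round(score, 2))
```

Keeping the models in a dictionary makes it easy to add or remove candidates without touching the evaluation loop.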

Confusion Matrix:

Continuing with our best model (Logistic Regression), we look at the confusion matrix, which compares the predicted labels with the actual labels. The vast majority of the predictions end up on the diagonal (predicted label = actual label), where we want them to be.
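To make the diagonal reading concrete, here is a small sketch using scikit-learn's confusion_matrix on made-up labels. The "pos" and "neg" label names and the six sample predictions are assumptions for illustration and do not come from the project.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted sentiment labels.
actual = ["pos", "pos", "pos", "neg", "neg", "pos"]
predicted = ["pos", "pos", "neg", "neg", "neg", "pos"]

# Rows are actual labels and columns are predicted labels, in the order given.
matrix = confusion_matrix(actual, predicted, labels=["neg", "pos"])
print(matrix)
# [[2 0]
#  [1 3]]

# Predictions on the diagonal are correct, so accuracy is diagonal / total.
accuracy = matrix.trace() / matrix.sum()
print(round(accuracy, 2))  # 0.83
```

The single off-diagonal entry is the one positive review that was predicted as negative, which is the kind of error a confusion matrix makes visible.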

    

Result:

This is the result of testing with sample feedback from the patients.

How is it different from competition

I used Logistic Regression, which gave higher accuracy (0.97) than the other algorithms I tested, Bernoulli NB (0.93) and Multinomial NB (0.86).

Who are your customers

It is useful to patients suffering from any type of disorder.

Data scientists can use it for study and literature surveys.

Project Phases and Schedule

Phase 1: Data collection

Phase 2: Data Analysis

Phase 3: Sentiment Analysis and Prediction.

Resources Required

Tool required: Anaconda with Python 3.6

/* Your file Name : Doctors recommendation.ipynb */
/* Your coding Language : python */
/* Your code snippet start here */
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "column_names = ['User_ID','Name_ID', 'Rating_ID', 'Comments', 'Specialist_ID','Address']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv('dataset_doctor4.csv', sep=',', names=column_names)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>User_ID</th>\n",
       "      <th>Name_ID</th>\n",
       "      <th>Rating_ID</th>\n",
       "      <th>Comments</th>\n",
       "      <th>Specialist_ID</th>\n",
       "      <th>Address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>Puducherry Diabetes Foundat..</td>\n",
       "      <td>3.9</td>\n",
       "      <td>Verygood</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 86, Eswaran Kovil Street, Pondicherry - 6...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>L K Nursing Home</td>\n",
       "      <td>4.3</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No:25, Rajarajeswari nagar, Venkata Nagar, P...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>East Coast Hospitals</td>\n",
       "      <td>4.1</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 370, 2nd Main Road Mahaveer Nagar, Lawspe...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>Delta Q Labs</td>\n",
       "      <td>5.0</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 1, Perambai Road, Moolakulam, Pondicherry...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>Clinic Nallam</td>\n",
       "      <td>4.1</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 36/A, Cuddalore Main Road, Mudaliarpet, P...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   User_ID                        Name_ID  Rating_ID   Comments  \\\n",
       "0        0  Puducherry Diabetes Foundat..        3.9   Verygood   \n",
       "1        1               L K Nursing Home        4.3  Excellent   \n",
       "2        2           East Coast Hospitals        4.1  Excellent   \n",
       "3        3                   Delta Q Labs        5.0  Excellent   \n",
       "4        4                  Clinic Nallam        4.1  Excellent   \n",
       "\n",
       "                        Specialist_ID  \\\n",
       "0  Doctors For Dengue Fever Treatment   \n",
       "1  Doctors For Dengue Fever Treatment   \n",
       "2  Doctors For Dengue Fever Treatment   \n",
       "3  Doctors For Dengue Fever Treatment   \n",
       "4  Doctors For Dengue Fever Treatment   \n",
       "\n",
       "                                             Address  \n",
       "0    No 86, Eswaran Kovil Street, Pondicherry - 6...  \n",
       "1    No:25, Rajarajeswari nagar, Venkata Nagar, P...  \n",
       "2    No 370, 2nd Main Road Mahaveer Nagar, Lawspe...  \n",
       "3    No 1, Perambai Road, Moolakulam, Pondicherry...  \n",
       "4    No 36/A, Cuddalore Main Road, Mudaliarpet, P...  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Name_ID\n",
       "Dr. Kathimathi r                 5.0\n",
       "Sara Physiotherapy Clinic        5.0\n",
       "Dr. Ramasamy (Dabur Ayurved..    5.0\n",
       "Dr. Ravendran v                  5.0\n",
       "Senthil Womens & Child Hosp..    5.0\n",
       "Shakshi Dental Care              5.0\n",
       "Dr. Gowthaman m                  5.0\n",
       "Shifaya Multispeciality Den..    5.0\n",
       "Dr. Ravi Kannah (East Coast..    5.0\n",
       "Dr. Chidambaram (New Medica..    5.0\n",
       "Name: Rating_ID, dtype: float64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('Name_ID')['Rating_ID'].mean().sort_values(ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Name_ID\n",
       "Sri Venkateshwara Hospital       9\n",
       "Apollo Hospital Information..    9\n",
       "Rani Hospital                    8\n",
       "Sun Pharmacy                     8\n",
       "Aum Hospitals                    8\n",
       "Be Well Hospitals                7\n",
       "Pondicherry Institute Of Me..    6\n",
       "Sri Venkateshwaraa Medical ..    6\n",
       "Abishek Ortho Centre             6\n",
       "Clinic Nallam                    6\n",
       "Lavanya Ayurvedic Hospital ..    5\n",
       "Vasista Ayurveda                 5\n",
       "Jipmer Hospital                  5\n",
       "Neomed Hospital(Dr Sivabala..    5\n",
       "Herbocare Hospital Pvt Ltd       5\n",
       "Aum Speciality Clinics           4\n",
       "Dr. Kulandaivelu S               4\n",
       "Royal India Associates           4\n",
       "Sri Manakula Vinayagar Medi..    4\n",
       "Decibel Audiology Centre He..    4\n",
       "Dr. Narendra Agarawal            4\n",
       "L K Nursing Home                 4\n",
       "Sri Shruthi Clinic               4\n",
       "Simha Hearing Aids And Spee..    4\n",
       "United Care Hearing Aids & ..    4\n",
       "East Coast Hospitals             4\n",
       "Dr. N D Rajan(simha Hearing..    4\n",
       "Ever Smiles                      4\n",
       "Dental Orthodontic & Implan..    3\n",
       "Dr. K Narayanan (Kidney Cen..    3\n",
       "                                ..\n",
       "Dr. Vishnu Prasad T L            1\n",
       "Dr. Yuvaraj. V (Vijaya Dent..    1\n",
       "Dr.narendra Agarwal Clinic       1\n",
       "East Coast Hospitals Ltd         1\n",
       "Ever Smiles Dental Clinic        1\n",
       "Dr. Thiroumourougane Serane..    1\n",
       "Fortis East Coast Hospitals..    1\n",
       "Future Dental Art                1\n",
       "G.s. Dental Care                 1\n",
       "Gem Dental Care                  1\n",
       "Giggles Dental Care              1\n",
       "Gvan Clinic                      1\n",
       "Dr. Vijaya Balagandan            1\n",
       "Dr. Vijay Oza (East Coast H..    1\n",
       "Dr. Vijay Oza                    1\n",
       "Dr. Venkateswaralou N            1\n",
       "Dr. Veena K S                    1\n",
       "Dr. Vasudevan T l                1\n",
       "Dr. Vasudevan S                  1\n",
       "Dr. Vasudevan P l                1\n",
       "Dr. Vasudevan (East Coast H..    1\n",
       "Dr. Vasudevaiah V                1\n",
       "Dr. Varsha Murthy (Vijaya D..    1\n",
       "Dr. Varalakshmi                  1\n",
       "Dr. Uma Shankar v                1\n",
       "Dr. Uma Maheshwaran              1\n",
       "Dr. Udhayakumar                  1\n",
       "Dr. Thomas Edvin Raj B (ann..    1\n",
       "Dr. Thiru Selvam e               1\n",
       "Dr. Sivadhasan                   1\n",
       "Name: Rating_ID, Length: 348, dtype: int64"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('Name_ID')['Rating_ID'].count().sort_values(ascending=False).head(500)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "ratings =pd.DataFrame(df.groupby('Name_ID')['Rating_ID'].mean())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Rating_ID</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Name_ID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>A G Dental Clinic</th>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A G Padmavathi Hospital Pvt..</th>\n",
       "      <td>3.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A K Dental Clinic</th>\n",
       "      <td>4.7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ASSJ Chidambaram Dental Hos..</th>\n",
       "      <td>2.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aayush Health Care Centre</th>\n",
       "      <td>4.4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Rating_ID\n",
       "Name_ID                                 \n",
       "A G Dental Clinic                    4.6\n",
       "A G Padmavathi Hospital Pvt..        3.9\n",
       "A K Dental Clinic                    4.7\n",
       "ASSJ Chidambaram Dental Hos..        2.8\n",
       "Aayush Health Care Centre            4.4"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ratings.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "ratings['Rating_Numbers'] = pd.DataFrame(df.groupby('Name_ID')['Rating_ID'].count())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Rating_ID</th>\n",
       "      <th>Rating_Numbers</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Name_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>A G Dental Clinic</th>\n",
       "      <td>4.6</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A G Padmavathi Hospital Pvt..</th>\n",
       "      <td>3.9</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A K Dental Clinic</th>\n",
       "      <td>4.7</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ASSJ Chidambaram Dental Hos..</th>\n",
       "      <td>2.8</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aayush Health Care Centre</th>\n",
       "      <td>4.4</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Rating_ID  Rating_Numbers\n",
       "Name_ID                                                 \n",
       "A G Dental Clinic                    4.6               2\n",
       "A G Padmavathi Hospital Pvt..        3.9               2\n",
       "A K Dental Clinic                    4.7               1\n",
       "ASSJ Chidambaram Dental Hos..        2.8               3\n",
       "Aayush Health Care Centre            4.4               2"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ratings.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x1a18c75ef0>"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    },
     {
      "data": {
       "text/plain": [
        "<matplotlib.figure.Figure at 0x1a18c75a58>"
       ]
      },
      "metadata": {},
      "output_type": "display_data"
     }
   ],
   "source": [
    "ratings['Rating_Numbers'].hist(bins=70)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x1a21c2ba90>"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    },
     {
      "data": {
       "text/plain": [
        "<matplotlib.figure.Figure at 0x1a21be48d0>"
       ]
      },
      "metadata": {},
      "output_type": "display_data"
     }
   ],
   "source": [
    "ratings['Rating_ID'].hist(bins=70)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<seaborn.axisgrid.JointGrid at 0x1a21ce4b70>"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1a21ce47b8>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.jointplot(x='Rating_ID', y='Rating_Numbers', data=ratings, alpha=0.5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Name_ID</th>\n",
       "      <th>A G Dental Clinic</th>\n",
       "      <th>A G Padmavathi Hospital Pvt..</th>\n",
       "      <th>A K Dental Clinic</th>\n",
       "      <th>ASSJ Chidambaram Dental Hos..</th>\n",
       "      <th>Aayush Health Care Centre</th>\n",
       "      <th>Abishek Ortho Centre</th>\n",
       "      <th>Active Life</th>\n",
       "      <th>Adisha Clinic</th>\n",
       "      <th>Aeon Ayush Clinic</th>\n",
       "      <th>Agar Clinic</th>\n",
       "      <th>...</th>\n",
       "      <th>Vige.dr</th>\n",
       "      <th>Vijay Kidney &amp; Urology Clin..</th>\n",
       "      <th>Vijaya Dental Health Cure</th>\n",
       "      <th>Viji Clinic</th>\n",
       "      <th>Vikare Dental Clinic</th>\n",
       "      <th>Viniyoga Heealing Centre</th>\n",
       "      <th>Viroga Ayurvedic Clinic And..</th>\n",
       "      <th>Visalakshmi Dental Health &amp;..</th>\n",
       "      <th>Vishnu Dhanvanthri Dental C..</th>\n",
       "      <th>drmongaclinic.com</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>User_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 348 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "Name_ID  A G Dental Clinic  A G Padmavathi Hospital Pvt..  A K Dental Clinic  \\\n",
       "User_ID                                                                        \n",
       "0                      NaN                            NaN                NaN   \n",
       "1                      NaN                            NaN                NaN   \n",
       "2                      NaN                            NaN                NaN   \n",
       "3                      NaN                            NaN                NaN   \n",
       "4                      NaN                            NaN                NaN   \n",
       "\n",
       "Name_ID  ASSJ Chidambaram Dental Hos..  Aayush Health Care Centre  \\\n",
       "User_ID                                                             \n",
       "0                                  NaN                        NaN   \n",
       "1                                  NaN                        NaN   \n",
       "2                                  NaN                        NaN   \n",
       "3                                  NaN                        NaN   \n",
       "4                                  NaN                        NaN   \n",
       "\n",
       "Name_ID  Abishek Ortho Centre  Active Life  Adisha Clinic  Aeon Ayush Clinic  \\\n",
       "User_ID                                                                        \n",
       "0                         NaN          NaN            NaN                NaN   \n",
       "1                         NaN          NaN            NaN                NaN   \n",
       "2                         NaN          NaN            NaN                NaN   \n",
       "3                         NaN          NaN            NaN                NaN   \n",
       "4                         NaN          NaN            NaN                NaN   \n",
       "\n",
       "Name_ID  Agar Clinic        ...          Vige.dr  \\\n",
       "User_ID                     ...                    \n",
       "0                NaN        ...              NaN   \n",
       "1                NaN        ...              NaN   \n",
       "2                NaN        ...              NaN   \n",
       "3                NaN        ...              NaN   \n",
       "4                NaN        ...              NaN   \n",
       "\n",
       "Name_ID  Vijay Kidney & Urology Clin..  Vijaya Dental Health Cure  \\\n",
       "User_ID                                                             \n",
       "0                                  NaN                        NaN   \n",
       "1                                  NaN                        NaN   \n",
       "2                                  NaN                        NaN   \n",
       "3                                  NaN                        NaN   \n",
       "4                                  NaN                        NaN   \n",
       "\n",
       "Name_ID  Viji Clinic  Vikare Dental Clinic  Viniyoga Heealing Centre  \\\n",
       "User_ID                                                                \n",
       "0                NaN                   NaN                       NaN   \n",
       "1                NaN                   NaN                       NaN   \n",
       "2                NaN                   NaN                       NaN   \n",
       "3                NaN                   NaN                       NaN   \n",
       "4                NaN                   NaN                       NaN   \n",
       "\n",
       "Name_ID  Viroga Ayurvedic Clinic And..  Visalakshmi Dental Health &..  \\\n",
       "User_ID                                                                 \n",
       "0                                  NaN                            NaN   \n",
       "1                                  NaN                            NaN   \n",
       "2                                  NaN                            NaN   \n",
       "3                                  NaN                            NaN   \n",
       "4                                  NaN                            NaN   \n",
       "\n",
       "Name_ID  Vishnu Dhanvanthri Dental C..  drmongaclinic.com  \n",
       "User_ID                                                    \n",
       "0                                  NaN                NaN  \n",
       "1                                  NaN                NaN  \n",
       "2                                  NaN                NaN  \n",
       "3                                  NaN                NaN  \n",
       "4                                  NaN                NaN  \n",
       "\n",
       "[5 rows x 348 columns]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# User-by-clinic rating matrix; NaN marks clinics a user did not rate\n",
    "moviemat = df.pivot_table(index='User_ID', columns='Name_ID', values='Rating_ID')\n",
    "moviemat.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Rating_ID</th>\n",
       "      <th>Rating_Numbers</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Name_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Sri Venkateshwara Hospital</th>\n",
       "      <td>3.80</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Apollo Hospital Information..</th>\n",
       "      <td>3.90</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Rani Hospital</th>\n",
       "      <td>4.00</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aum Hospitals</th>\n",
       "      <td>4.30</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Sun Pharmacy</th>\n",
       "      <td>4.20</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Be Well Hospitals</th>\n",
       "      <td>3.70</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Clinic Nallam</th>\n",
       "      <td>4.25</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Abishek Ortho Centre</th>\n",
       "      <td>4.35</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Sri Venkateshwaraa Medical ..</th>\n",
       "      <td>4.10</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Pondicherry Institute Of Me..</th>\n",
       "      <td>4.00</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Rating_ID  Rating_Numbers\n",
       "Name_ID                                                 \n",
       "Sri Venkateshwara Hospital          3.80               9\n",
       "Apollo Hospital Information..       3.90               9\n",
       "Rani Hospital                       4.00               8\n",
       "Aum Hospitals                       4.30               8\n",
       "Sun Pharmacy                        4.20               8\n",
       "Be Well Hospitals                   3.70               7\n",
       "Clinic Nallam                       4.25               6\n",
       "Abishek Ortho Centre                4.35               6\n",
       "Sri Venkateshwaraa Medical ..       4.10               6\n",
       "Pondicherry Institute Of Me..       4.00               6"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ratings.sort_values('Rating_Numbers', ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "Aum_ratings = moviemat['Aum Hospitals']\n",
    "Sun_ratings = moviemat['Sun Pharmacy']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "User_ID\n",
       "0   NaN\n",
       "1   NaN\n",
       "2   NaN\n",
       "3   NaN\n",
       "4   NaN\n",
       "Name: Aum Hospitals, dtype: float64"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Aum_ratings.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Name_ID\n",
       "A G Dental Clinic               NaN\n",
       "A G Padmavathi Hospital Pvt..   NaN\n",
       "A K Dental Clinic               NaN\n",
       "ASSJ Chidambaram Dental Hos..   NaN\n",
       "Aayush Health Care Centre       NaN\n",
       "dtype: float64"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Pearson correlation of each clinic's ratings with those of Aum Hospitals\n",
    "similar_to_Aum = moviemat.corrwith(Aum_ratings)\n",
    "similar_to_Aum.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Name_ID\n",
       "A G Dental Clinic               NaN\n",
       "A G Padmavathi Hospital Pvt..   NaN\n",
       "A K Dental Clinic               NaN\n",
       "ASSJ Chidambaram Dental Hos..   NaN\n",
       "Aayush Health Care Centre       NaN\n",
       "dtype: float64"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Pearson correlation of each clinic's ratings with those of Sun Pharmacy\n",
    "similar_to_Sun = moviemat.corrwith(Sun_ratings)\n",
    "similar_to_Sun.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Correlation</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Name_ID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Empty DataFrame\n",
       "Columns: [Correlation]\n",
       "Index: []"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
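   ],
   "source": [
    "corr_Aum = pd.DataFrame(similar_to_Aum, columns=['Correlation'])\n",
    "# dropna(inplace=False) returns a new frame and leaves corr_Aum unchanged;\n",
    "# the empty result shows that every correlation is NaN.\n",
    "corr_Aum.dropna(inplace=False)"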
   ]
  },
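  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The empty frame above means that every correlation with `Aum Hospitals` is `NaN`. One plausible explanation, assuming each `User_ID` appears only once in `df` (the displayed rows hint at this, but it is not confirmed here), is that no user rated more than one clinic, so no two columns of `moviemat` share a rated row and `corrwith` has nothing to correlate. The cell below is a diagnostic sketch for checking that assumption; `ratings_per_user` is an illustrative name that is not part of the original analysis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Diagnostic sketch: assumes moviemat is defined as in the cells above.\n",
    "# Count how many clinics each user rated. A correlation between two clinics\n",
    "# needs users who rated both of them.\n",
    "ratings_per_user = moviemat.notnull().sum(axis=1)\n",
    "print(ratings_per_user.value_counts())\n",
    "\n",
    "# Number of users who rated more than one clinic\n",
    "print((ratings_per_user > 1).sum())"
   ]
  },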
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Correlation</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Name_ID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>A G Dental Clinic</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A G Padmavathi Hospital Pvt..</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A K Dental Clinic</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ASSJ Chidambaram Dental Hos..</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aayush Health Care Centre</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Correlation\n",
       "Name_ID                                   \n",
       "A G Dental Clinic                      NaN\n",
       "A G Padmavathi Hospital Pvt..          NaN\n",
       "A K Dental Clinic                      NaN\n",
       "ASSJ Chidambaram Dental Hos..          NaN\n",
       "Aayush Health Care Centre              NaN"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "corr_Aum.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Correlation</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Name_ID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>A G Dental Clinic</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A G Padmavathi Hospital Pvt..</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A K Dental Clinic</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ASSJ Chidambaram Dental Hos..</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aayush Health Care Centre</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Abishek Ortho Centre</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Active Life</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Adisha Clinic</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aeon Ayush Clinic</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Agar Clinic</th>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Correlation\n",
       "Name_ID                                   \n",
       "A G Dental Clinic                      NaN\n",
       "A G Padmavathi Hospital Pvt..          NaN\n",
       "A K Dental Clinic                      NaN\n",
       "ASSJ Chidambaram Dental Hos..          NaN\n",
       "Aayush Health Care Centre              NaN\n",
       "Abishek Ortho Centre                   NaN\n",
       "Active Life                            NaN\n",
       "Adisha Clinic                          NaN\n",
       "Aeon Ayush Clinic                      NaN\n",
       "Agar Clinic                            NaN"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "corr_Aum.sort_values('Correlation', ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Correlation</th>\n",
       "      <th>Rating_Numbers</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Name_ID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>A G Dental Clinic</th>\n",
       "      <td>NaN</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A G Padmavathi Hospital Pvt..</th>\n",
       "      <td>NaN</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A K Dental Clinic</th>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ASSJ Chidambaram Dental Hos..</th>\n",
       "      <td>NaN</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aayush Health Care Centre</th>\n",
       "      <td>NaN</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Correlation  Rating_Numbers\n",
       "Name_ID                                                   \n",
       "A G Dental Clinic                      NaN               2\n",
       "A G Padmavathi Hospital Pvt..          NaN               2\n",
       "A K Dental Clinic                      NaN               1\n",
       "ASSJ Chidambaram Dental Hos..          NaN               3\n",
       "Aayush Health Care Centre              NaN               2"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "corr_Aum = corr_Aum.join(ratings['Rating_Numbers'], how='left', lsuffix='_left', rsuffix='_right')\n",
    "corr_Aum.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Patient feedback labels, one per review\n",
    "sentiment_data = df['Comments']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0       Verygood\n",
       "1      Excellent\n",
       "2      Excellent\n",
       "3      Excellent\n",
       "4      Excellent\n",
       "5      Excellent\n",
       "6      Excellent\n",
       "7      Excellent\n",
       "8      Excellent\n",
       "9      Excellent\n",
       "10     Excellent\n",
       "11     Excellent\n",
       "12     Excellent\n",
       "13     Excellent\n",
       "14     Excellent\n",
       "15     Excellent\n",
       "16     Excellent\n",
       "17     Excellent\n",
       "18     Excellent\n",
       "19     Excellent\n",
       "20     Excellent\n",
       "21     Excellent\n",
       "22     Excellent\n",
       "23     Excellent\n",
       "24     Excellent\n",
       "25     Excellent\n",
       "26      Verygood\n",
       "27      Verygood\n",
       "28      Verygood\n",
       "29      Verygood\n",
       "         ...    \n",
       "540     Verypoor\n",
       "541     Verypoor\n",
       "542    Excellent\n",
       "543     Verygood\n",
       "544     Verypoor\n",
       "545    Excellent\n",
       "546    Excellent\n",
       "547    Excellent\n",
       "548    Excellent\n",
       "549    Excellent\n",
       "550    Excellent\n",
       "551    Excellent\n",
       "552    Excellent\n",
       "553    Excellent\n",
       "554    Excellent\n",
       "555    Excellent\n",
       "556    Excellent\n",
       "557    Excellent\n",
       "558     Verygood\n",
       "559    Excellent\n",
       "560    Excellent\n",
       "561     Verygood\n",
       "562    Excellent\n",
       "563    Excellent\n",
       "564     Verygood\n",
       "565     Verygood\n",
       "566     Verypoor\n",
       "567    Excellent\n",
       "568    Excellent\n",
       "569    Excellent\n",
       "Name: Comments, Length: 570, dtype: object"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sentiment_data[:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['User_ID', 'Name_ID', 'Rating_ID', 'Comments', 'Specialist_ID',\n",
       "       'Address'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.keys()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "sentiment_data[:].to_csv('sentiment_data.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Convert each feedback label to a binary sentiment: '1' = positive, '0' = negative.\n",
    "# Labels outside these two sets are skipped, exactly as in the original if/elif chain.\n",
    "positive_labels = {'Excellent', 'Verygood', 'Good'}\n",
    "negative_labels = {'Poor', 'Verypoor'}\n",
    "\n",
    "sen_data = []\n",
    "for sentiment in sentiment_data:\n",
    "    if sentiment in positive_labels:\n",
    "        sen_data.append('1')\n",
    "    elif sentiment in negative_labels:\n",
    "        sen_data.append('0')"
   ]
  },
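  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The loop above skips any comment that is not one of the five expected labels. If that ever happens, `sen_data` ends up shorter than `df` and assigning it as a column fails. As an alternative sketch, `Series.map` with a dictionary keeps the result aligned with `df` and leaves unmapped labels as `NaN` so they can be inspected first. The names `label_map` and `mapped` are illustrative and not part of the original notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch only: assumes sentiment_data is the 'Comments' column defined earlier.\n",
    "label_map = {\n",
    "    'Excellent': '1', 'Verygood': '1', 'Good': '1',\n",
    "    'Poor': '0', 'Verypoor': '0',\n",
    "}\n",
    "mapped = sentiment_data.map(label_map)\n",
    "\n",
    "# Labels with no entry in label_map come back as NaN; list them for inspection.\n",
    "print(sentiment_data[mapped.isnull()].unique())"
   ]
  },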
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "df['Sentiment_Data'] = sen_data\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>User_ID</th>\n",
       "      <th>Name_ID</th>\n",
       "      <th>Rating_ID</th>\n",
       "      <th>Comments</th>\n",
       "      <th>Specialist_ID</th>\n",
       "      <th>Address</th>\n",
       "      <th>Sentiment_Data</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>Puducherry Diabetes Foundat..</td>\n",
       "      <td>3.9</td>\n",
       "      <td>Verygood</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 86, Eswaran Kovil Street, Pondicherry - 6...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>L K Nursing Home</td>\n",
       "      <td>4.3</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No:25, Rajarajeswari nagar, Venkata Nagar, P...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>East Coast Hospitals</td>\n",
       "      <td>4.1</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 370, 2nd Main Road Mahaveer Nagar, Lawspe...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>Delta Q Labs</td>\n",
       "      <td>5.0</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 1, Perambai Road, Moolakulam, Pondicherry...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>Clinic Nallam</td>\n",
       "      <td>4.1</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 36/A, Cuddalore Main Road, Mudaliarpet, P...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   User_ID                        Name_ID  Rating_ID   Comments  \\\n",
       "0        0  Puducherry Diabetes Foundat..        3.9   Verygood   \n",
       "1        1               L K Nursing Home        4.3  Excellent   \n",
       "2        2           East Coast Hospitals        4.1  Excellent   \n",
       "3        3                   Delta Q Labs        5.0  Excellent   \n",
       "4        4                  Clinic Nallam        4.1  Excellent   \n",
       "\n",
       "                        Specialist_ID  \\\n",
       "0  Doctors For Dengue Fever Treatment   \n",
       "1  Doctors For Dengue Fever Treatment   \n",
       "2  Doctors For Dengue Fever Treatment   \n",
       "3  Doctors For Dengue Fever Treatment   \n",
       "4  Doctors For Dengue Fever Treatment   \n",
       "\n",
       "                                             Address Sentiment_Data  \n",
       "0    No 86, Eswaran Kovil Street, Pondicherry - 6...              1  \n",
       "1    No:25, Rajarajeswari nagar, Venkata Nagar, P...              1  \n",
       "2    No 370, 2nd Main Road Mahaveer Nagar, Lawspe...              1  \n",
       "3    No 1, Perambai Road, Moolakulam, Pondicherry...              1  \n",
       "4    No 36/A, Cuddalore Main Road, Mudaliarpet, P...              1  "
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "ratings =pd.DataFrame(df.groupby('Name_ID')['Sentiment_Data'].count())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Sentiment_Data</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Name_ID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>A G Dental Clinic</th>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A G Padmavathi Hospital Pvt..</th>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A K Dental Clinic</th>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ASSJ Chidambaram Dental Hos..</th>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aayush Health Care Centre</th>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Sentiment_Data\n",
       "Name_ID                                      \n",
       "A G Dental Clinic                           2\n",
       "A G Padmavathi Hospital Pvt..               2\n",
       "A K Dental Clinic                           1\n",
       "ASSJ Chidambaram Dental Hos..               3\n",
       "Aayush Health Care Centre                   2"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ratings.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x1a23edbb38>"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1a240a7c88>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df[:20].plot(x= 'Specialist_ID', y = 'Rating_ID', kind = 'bar')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Specialist_ID</th>\n",
       "      <th>Address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 86, Eswaran Kovil Street, Pondicherry - 6...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No:25, Rajarajeswari nagar, Venkata Nagar, P...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 370, 2nd Main Road Mahaveer Nagar, Lawspe...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 1, Perambai Road, Moolakulam, Pondicherry...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 36/A, Cuddalore Main Road, Mudaliarpet, P...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 86, Eswaran Kovil Street, Pondicherry - 6...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 4, 7th Street, Lawspet Pondicherry, Pondi...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>122, west car street, Villianur, Pondicherry...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 169D, Sundaraju Chettiar Complex Cuddalor...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 2/A, New Street, Puthupet, Rajaji Nagar, ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 315, Mission Street, Puducherry H O, Pond...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>Grr Complex, Maraimalai Adigal Salai, Pondic...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>Pondicherry-Villupuram Main Road, Pondicherr...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 34 1st Floor Plot No-146dr Roshni's Skin ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 36 A, Airport Main Road, Lawspet, Pondich...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 211, Bussy Street, Pondicherry - 605001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 34-B, Cuddalore HO, PONDICHERRY - 607001,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 90, Thiruvalluvar Salai, Saram, Pondicher...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 1, Cholai Nagar Main Road, Muthialpet, Po...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 130, Perumal Koil Street, Bharathi Street...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 230 Naveiranj Skin Clinic, Lenin Street, ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 44, Mettu Street, Kosapalayam, Pondicherr...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 77, Ambedkar Salai, Nellithope, Pondicher...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 159 &amp;161, Bussy Street, Pondicherry - 605...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 16, Roam Rolland Street, Pondicherry - 60...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 90, Vazhudavur Main Road, Thattanchavadi,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 216, Bussy Street, Pondicherry - 605001, ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>Pondicherry University, Pondicherry - 605014...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>Ambour Salai, Pondicherry - 605001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 6o, 4th Cross Street, Venkata Nagar, Pond...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>71</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 122, Bussy Street, Puducherry H O, Pondic...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 225/F, Chetty Street, Pondicherry - 605001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>73</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>Muthialpet, Pondicherry - 605003, Near Gover...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>74</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 18/2, Nellikuppam Road, Cuddalore HO, PON...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 430, Vazhudavur Main Road, Thattanchavadi...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>76</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No:49, Villainur Main Road, Pavazha Nagar, N...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>77</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>Pillaitottam, Pondicherry - 605013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>78</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 59, M M Koil Street, Nellithope, Pondiche...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>79</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>133, 100 Feet Road, Mudaliarpet, Pondicherry...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>80</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 63 1st Floor, Mariamman Koil Street, Laws...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>81</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 66, Kandappa Mudali Street, Puducherry H ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>82</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 401, M G Road, Puducherry H O, Pondicherr...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>83</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>A G Padmavathi Hospital R S 127/ 1 A, Villia...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>84</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>A G Padmavathi Hospital R S 127/ 1 A, Villia...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>85</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 2/2, Vallalar Salai 45 Feet Road, Venkata...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>86</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>Pondicherry - 605001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>87</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>Govt General Hospital, Pondicherry - 605001,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>88</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 9, St Louis Street, Pondicherry - 605001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 3, Rue Suffren Street, Pondicherry - 605003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 77, 3rd Cross 45 Feet Road Corner, Venkat...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>91</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 3, Vazhaikulam, Padmini Nagar, Pondicherr...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>92</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>120, West car street, Villianur, Pondicherry...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>93</th>\n",
       "      <td>General Physician Doctors Exp: 35 yrs Fee: 100...</td>\n",
       "      <td>No 2, Wigil Mier Street, Cuddalore Ho, PONDI...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>94</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 209-A, Nethaji Road, Cuddalore HO, PONDIC...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>95</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 169-B, Nethaji Road, Cuddalore HO, PONDIC...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>96</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 110, Bazaar Street, Chidambaram H O, POND...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>97</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 127, S P Koil Street, Chidambaram H O, PO...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>98</th>\n",
       "      <td>General Physician Doctors</td>\n",
       "      <td>No 180, Lalbahadur Sastri Street, Puducherry...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>99</th>\n",
       "      <td>Nephrologists</td>\n",
       "      <td>No 26, Perumal Koil Street, Puducherry H O, ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100</th>\n",
       "      <td>Nephrologists</td>\n",
       "      <td>No 32 A Sri Venkateshwara Hospital, Pondy Vi...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>101 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                         Specialist_ID  \\\n",
       "0                   Doctors For Dengue Fever Treatment   \n",
       "1                   Doctors For Dengue Fever Treatment   \n",
       "2                   Doctors For Dengue Fever Treatment   \n",
       "3                   Doctors For Dengue Fever Treatment   \n",
       "4                   Doctors For Dengue Fever Treatment   \n",
       "5                   Doctors For Dengue Fever Treatment   \n",
       "6                   Doctors For Dengue Fever Treatment   \n",
       "7                   Doctors For Dengue Fever Treatment   \n",
       "8                   Doctors For Dengue Fever Treatment   \n",
       "9                   Doctors For Dengue Fever Treatment   \n",
       "10                           General Physician Doctors   \n",
       "11                           General Physician Doctors   \n",
       "12                           General Physician Doctors   \n",
       "13                           General Physician Doctors   \n",
       "14                           General Physician Doctors   \n",
       "15                           General Physician Doctors   \n",
       "16                           General Physician Doctors   \n",
       "17                           General Physician Doctors   \n",
       "18                           General Physician Doctors   \n",
       "19                           General Physician Doctors   \n",
       "20                           General Physician Doctors   \n",
       "21                           General Physician Doctors   \n",
       "22                           General Physician Doctors   \n",
       "23                           General Physician Doctors   \n",
       "24                           General Physician Doctors   \n",
       "25                           General Physician Doctors   \n",
       "26                           General Physician Doctors   \n",
       "27                           General Physician Doctors   \n",
       "28                           General Physician Doctors   \n",
       "29                           General Physician Doctors   \n",
       "..                                                 ...   \n",
       "71                           General Physician Doctors   \n",
       "72                           General Physician Doctors   \n",
       "73                           General Physician Doctors   \n",
       "74                           General Physician Doctors   \n",
       "75                           General Physician Doctors   \n",
       "76                           General Physician Doctors   \n",
       "77                           General Physician Doctors   \n",
       "78                           General Physician Doctors   \n",
       "79                           General Physician Doctors   \n",
       "80                           General Physician Doctors   \n",
       "81                           General Physician Doctors   \n",
       "82                           General Physician Doctors   \n",
       "83                           General Physician Doctors   \n",
       "84                           General Physician Doctors   \n",
       "85                           General Physician Doctors   \n",
       "86                           General Physician Doctors   \n",
       "87                           General Physician Doctors   \n",
       "88                           General Physician Doctors   \n",
       "89                           General Physician Doctors   \n",
       "90                           General Physician Doctors   \n",
       "91                           General Physician Doctors   \n",
       "92                           General Physician Doctors   \n",
       "93   General Physician Doctors Exp: 35 yrs Fee: 100...   \n",
       "94                           General Physician Doctors   \n",
       "95                           General Physician Doctors   \n",
       "96                           General Physician Doctors   \n",
       "97                           General Physician Doctors   \n",
       "98                           General Physician Doctors   \n",
       "99                                       Nephrologists   \n",
       "100                                      Nephrologists   \n",
       "\n",
       "                                               Address  \n",
       "0      No 86, Eswaran Kovil Street, Pondicherry - 6...  \n",
       "1      No:25, Rajarajeswari nagar, Venkata Nagar, P...  \n",
       "2      No 370, 2nd Main Road Mahaveer Nagar, Lawspe...  \n",
       "3      No 1, Perambai Road, Moolakulam, Pondicherry...  \n",
       "4      No 36/A, Cuddalore Main Road, Mudaliarpet, P...  \n",
       "5      No 86, Eswaran Kovil Street, Pondicherry - 6...  \n",
       "6      No 4, 7th Street, Lawspet Pondicherry, Pondi...  \n",
       "7      122, west car street, Villianur, Pondicherry...  \n",
       "8      No 169D, Sundaraju Chettiar Complex Cuddalor...  \n",
       "9      No 2/A, New Street, Puthupet, Rajaji Nagar, ...  \n",
       "10     No 315, Mission Street, Puducherry H O, Pond...  \n",
       "11     Grr Complex, Maraimalai Adigal Salai, Pondic...  \n",
       "12     Pondicherry-Villupuram Main Road, Pondicherr...  \n",
       "13     No 34 1st Floor Plot No-146dr Roshni's Skin ...  \n",
       "14     No 36 A, Airport Main Road, Lawspet, Pondich...  \n",
       "15          No 211, Bussy Street, Pondicherry - 605001  \n",
       "16     No 34-B, Cuddalore HO, PONDICHERRY - 607001,...  \n",
       "17     No 90, Thiruvalluvar Salai, Saram, Pondicher...  \n",
       "18     No 1, Cholai Nagar Main Road, Muthialpet, Po...  \n",
       "19     No 130, Perumal Koil Street, Bharathi Street...  \n",
       "20     No 230 Naveiranj Skin Clinic, Lenin Street, ...  \n",
       "21     No 44, Mettu Street, Kosapalayam, Pondicherr...  \n",
       "22     No 77, Ambedkar Salai, Nellithope, Pondicher...  \n",
       "23     No 159 &161, Bussy Street, Pondicherry - 605...  \n",
       "24     No 16, Roam Rolland Street, Pondicherry - 60...  \n",
       "25     No 90, Vazhudavur Main Road, Thattanchavadi,...  \n",
       "26     No 216, Bussy Street, Pondicherry - 605001, ...  \n",
       "27     Pondicherry University, Pondicherry - 605014...  \n",
       "28                  Ambour Salai, Pondicherry - 605001  \n",
       "29     No 6o, 4th Cross Street, Venkata Nagar, Pond...  \n",
       "..                                                 ...  \n",
       "71     No 122, Bussy Street, Puducherry H O, Pondic...  \n",
       "72       No 225/F, Chetty Street, Pondicherry - 605001  \n",
       "73     Muthialpet, Pondicherry - 605003, Near Gover...  \n",
       "74     No 18/2, Nellikuppam Road, Cuddalore HO, PON...  \n",
       "75     No 430, Vazhudavur Main Road, Thattanchavadi...  \n",
       "76     No:49, Villainur Main Road, Pavazha Nagar, N...  \n",
       "77                  Pillaitottam, Pondicherry - 605013  \n",
       "78     No 59, M M Koil Street, Nellithope, Pondiche...  \n",
       "79     133, 100 Feet Road, Mudaliarpet, Pondicherry...  \n",
       "80     No 63 1st Floor, Mariamman Koil Street, Laws...  \n",
       "81     No 66, Kandappa Mudali Street, Puducherry H ...  \n",
       "82     No 401, M G Road, Puducherry H O, Pondicherr...  \n",
       "83     A G Padmavathi Hospital R S 127/ 1 A, Villia...  \n",
       "84     A G Padmavathi Hospital R S 127/ 1 A, Villia...  \n",
       "85     No 2/2, Vallalar Salai 45 Feet Road, Venkata...  \n",
       "86                                Pondicherry - 605001  \n",
       "87     Govt General Hospital, Pondicherry - 605001,...  \n",
       "88         No 9, St Louis Street, Pondicherry - 605001  \n",
       "89      No 3, Rue Suffren Street, Pondicherry - 605003  \n",
       "90     No 77, 3rd Cross 45 Feet Road Corner, Venkat...  \n",
       "91     No 3, Vazhaikulam, Padmini Nagar, Pondicherr...  \n",
       "92     120, West car street, Villianur, Pondicherry...  \n",
       "93     No 2, Wigil Mier Street, Cuddalore Ho, PONDI...  \n",
       "94     No 209-A, Nethaji Road, Cuddalore HO, PONDIC...  \n",
       "95     No 169-B, Nethaji Road, Cuddalore HO, PONDIC...  \n",
       "96     No 110, Bazaar Street, Chidambaram H O, POND...  \n",
       "97     No 127, S P Koil Street, Chidambaram H O, PO...  \n",
       "98     No 180, Lalbahadur Sastri Street, Puducherry...  \n",
       "99     No 26, Perumal Koil Street, Puducherry H O, ...  \n",
       "100    No 32 A Sri Venkateshwara Hospital, Pondy Vi...  \n",
       "\n",
       "[101 rows x 2 columns]"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[0:100,\"Specialist_ID\":\"Address\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Name_ID          False\n",
      "Rating_ID        False\n",
      "Comments         False\n",
      "Specialist_ID    False\n",
      "Address          False\n",
      "dtype: bool\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name_ID</th>\n",
       "      <th>Rating_ID</th>\n",
       "      <th>Comments</th>\n",
       "      <th>Specialist_ID</th>\n",
       "      <th>Address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Puducherry Diabetes Foundat..</td>\n",
       "      <td>3.9</td>\n",
       "      <td>Verygood</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 86, Eswaran Kovil Street, Pondicherry - 6...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>L K Nursing Home</td>\n",
       "      <td>4.3</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No:25, Rajarajeswari nagar, Venkata Nagar, P...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>East Coast Hospitals</td>\n",
       "      <td>4.1</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 370, 2nd Main Road Mahaveer Nagar, Lawspe...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Delta Q Labs</td>\n",
       "      <td>5.0</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 1, Perambai Road, Moolakulam, Pondicherry...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Clinic Nallam</td>\n",
       "      <td>4.1</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 36/A, Cuddalore Main Road, Mudaliarpet, P...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                         Name_ID  Rating_ID   Comments  \\\n",
       "0  Puducherry Diabetes Foundat..        3.9   Verygood   \n",
       "1               L K Nursing Home        4.3  Excellent   \n",
       "2           East Coast Hospitals        4.1  Excellent   \n",
       "3                   Delta Q Labs        5.0  Excellent   \n",
       "4                  Clinic Nallam        4.1  Excellent   \n",
       "\n",
       "                        Specialist_ID  \\\n",
       "0  Doctors For Dengue Fever Treatment   \n",
       "1  Doctors For Dengue Fever Treatment   \n",
       "2  Doctors For Dengue Fever Treatment   \n",
       "3  Doctors For Dengue Fever Treatment   \n",
       "4  Doctors For Dengue Fever Treatment   \n",
       "\n",
       "                                             Address  \n",
       "0    No 86, Eswaran Kovil Street, Pondicherry - 6...  \n",
       "1    No:25, Rajarajeswari nagar, Venkata Nagar, P...  \n",
       "2    No 370, 2nd Main Road Mahaveer Nagar, Lawspe...  \n",
       "3    No 1, Perambai Road, Moolakulam, Pondicherry...  \n",
       "4    No 36/A, Cuddalore Main Road, Mudaliarpet, P...  "
      ]
     },
     "execution_count": 61,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "permanent = df[['Name_ID', 'Rating_ID', 'Comments', 'Specialist_ID','Address']]\n",
    "print(permanent.isnull().any()) #Checking for null values\n",
    "permanent.head()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name_ID</th>\n",
       "      <th>Rating_ID</th>\n",
       "      <th>Comments</th>\n",
       "      <th>Specialist_ID</th>\n",
       "      <th>Address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Empty DataFrame\n",
       "Columns: [Name_ID, Rating_ID, Comments, Specialist_ID, Address]\n",
       "Index: []"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "check =  permanent[permanent[\"Rating_ID\"].isnull()]\n",
    "check.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name_ID</th>\n",
       "      <th>Rating_ID</th>\n",
       "      <th>Comments</th>\n",
       "      <th>Specialist_ID</th>\n",
       "      <th>Address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Puducherry Diabetes Foundat..</td>\n",
       "      <td>3.9</td>\n",
       "      <td>Verygood</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 86, Eswaran Kovil Street, Pondicherry - 6...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>L K Nursing Home</td>\n",
       "      <td>4.3</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No:25, Rajarajeswari nagar, Venkata Nagar, P...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>East Coast Hospitals</td>\n",
       "      <td>4.1</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 370, 2nd Main Road Mahaveer Nagar, Lawspe...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Delta Q Labs</td>\n",
       "      <td>5.0</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 1, Perambai Road, Moolakulam, Pondicherry...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Clinic Nallam</td>\n",
       "      <td>4.1</td>\n",
       "      <td>Excellent</td>\n",
       "      <td>Doctors For Dengue Fever Treatment</td>\n",
       "      <td>No 36/A, Cuddalore Main Road, Mudaliarpet, P...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                         Name_ID  Rating_ID   Comments  \\\n",
       "0  Puducherry Diabetes Foundat..        3.9   Verygood   \n",
       "1               L K Nursing Home        4.3  Excellent   \n",
       "2           East Coast Hospitals        4.1  Excellent   \n",
       "3                   Delta Q Labs        5.0  Excellent   \n",
       "4                  Clinic Nallam        4.1  Excellent   \n",
       "\n",
       "                        Specialist_ID  \\\n",
       "0  Doctors For Dengue Fever Treatment   \n",
       "1  Doctors For Dengue Fever Treatment   \n",
       "2  Doctors For Dengue Fever Treatment   \n",
       "3  Doctors For Dengue Fever Treatment   \n",
       "4  Doctors For Dengue Fever Treatment   \n",
       "\n",
       "                                             Address  \n",
       "0    No 86, Eswaran Kovil Street, Pondicherry - 6...  \n",
       "1    No:25, Rajarajeswari nagar, Venkata Nagar, P...  \n",
       "2    No 370, 2nd Main Road Mahaveer Nagar, Lawspe...  \n",
       "3    No 1, Perambai Road, Moolakulam, Pondicherry...  \n",
       "4    No 36/A, Cuddalore Main Road, Mudaliarpet, P...  "
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "senti= permanent[permanent[\"Rating_ID\"].notnull()]\n",
    "permanent.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [],
   "source": [
    "senti[\"senti\"] = senti[\"Rating_ID\"]>=4\n",
    "senti[\"senti\"] = senti[\"senti\"].replace([True , False] , [\"pos\" , \"neg\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x1a2305e0f0>"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEFCAYAAADt1CyEAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvNQv5yAAADJtJREFUeJzt3G+snvVdx/H3ZxSYixl/zxbS1hWlW0bMNliD1SXGgDHAlpUHsECmI4SkicFkujlFY+JmfAA+GHNGcVWInVnGyNSAk8QQ/szMBGYZg8HqQsWN1uIo4U9ZFqbA1wfnV3dSDpy77Tm94Hver+Tk3Nfv+vU+3yYn7169zn2fVBWSpL7eMPUAkqSVZeglqTlDL0nNGXpJas7QS1Jzhl6SmjP0ktScoZek5gy9JDW3ZuoBAE499dTasGHD1GNI0uvKfffd92RVzS217zUR+g0bNrBjx46px5Ck15Uk35tln7duJKk5Qy9JzRl6SWrO0EtSc4Zekpoz9JLUnKGXpOYMvSQ1Z+glqbnXxDtjXy82XP1PU4/Qynevef/UI0irglf0ktScoZek5gy9JDVn6CWpOUMvSc0ZeklqztBLUnOGXpKaM/SS1Jyhl6TmDL0kNWfoJak5Qy9JzRl6SWrO0EtSc4ZekpqbOfRJjklyf5KvjOPTk9yb5JEkX0py3Fg/fhzvGuc3rMzokqRZHMoV/UeBnQuOrwWuq6qNwNPAlWP9SuDpqjoDuG7skyRNZKbQJ1kHvB/463Ec4Fzgy2PLduCi8XjLOGacP2/slyRNYNYr+s8AvwO8NI5PAZ6pqhfG8R5g7Xi8FtgNMM4/O/ZLkiawZOiTfAB4oqruW7i8yNaa4dzC592aZEeSHfv27ZtpWEnSoZvliv59wAeTfBe4iflbNp8BTkyyZuxZB+wdj/cA6wHG+ROApw5+0qraVlWbqmrT3NzcEf0lJEmvbMnQV9XvVdW6qtoAXArcWVUfBu4CLh7bLgduGY9vHceM83dW1cuu6CVJR8eRvI7+d4GPJdnF/D34G8b6DcApY/1jwNVHNqIk6UisWXrLj1XV3cDd4/GjwDmL7HkeuGQZZpMkLQPfGStJzRl6SWrO0EtSc4Zekpoz9JLUnKGXpOYMvSQ1Z+glqTlDL0nNGXpJas7QS1Jzhl6SmjP0ktScoZek5gy9JDVn6CWpOUMvSc0ZeklqztBLUnOGXpKaM/SS1Jyhl6TmDL0kNWfoJak5Qy9JzRl6SWrO0EtSc4Zekpoz9JLUnKGXpOYMvSQ1Z+glqTlDL0nNGXpJas7QS1Jzhl6SmjP0ktScoZek5pYMfZI3Jvl6kgeSPJzkU2P99CT3JnkkyZeSHDfWjx/Hu8b5DSv7V5AkvZpZruh/BJxbVe8G3gOcn2QzcC1wXVVtBJ4Grhz7rwSerqozgOvGPknSRJYMfc37wTg8dnwUcC7w5bG+HbhoPN4yjhnnz0uSZZtYknRIZrpHn+SYJN8EngBuB/4DeKaqXhhb9gBrx+O1wG6Acf5Z4JTlHFqSNLuZQl9VL1bVe4B1wDnAOxfbNj4vdvVeBy8k2ZpkR5Id+/btm3VeSdIhOqRX3VTVM8DdwGbgxCRrxql1wN7xeA+wHmCcPwF4apHn2lZVm6pq09zc3OFNL0la0iyvuplLcuJ4/BPALwM7gbuAi8e2y4FbxuNbxzHj/J1V9bIreknS0bFm6S2cBmxPcgzz/zDcXFVfSfJt4KYkfwzcD9ww9t8A/G2SXcxfyV+6AnNLkma0ZOir6kHgrEXWH2X+fv3B688DlyzLdJKkI+Y7YyWpOUMvSc0ZeklqztBLUnOGXpKaM/SS1Jyhl6TmDL0kNWfoJak5Qy9JzRl6SWrO0EtSc4Zekpoz9JLUnKGXpOYMvSQ1Z+glqTlDL0nNGXpJas7QS1Jzhl6SmjP0ktScoZek5gy9JDVn6CWpOUMvSc0ZeklqztBLUnOG
XpKaM/SS1Jyhl6TmDL0kNWfoJak5Qy9JzRl6SWrO0EtSc4ZekppbMvRJ1ie5K8nOJA8n+ehYPznJ7UkeGZ9PGutJ8tkku5I8mOTslf5LSJJe2SxX9C8AH6+qdwKbgauSnAlcDdxRVRuBO8YxwAXAxvGxFbh+2aeWJM1sydBX1eNV9Y3x+DlgJ7AW2AJsH9u2AxeNx1uAz9e8e4ATk5y27JNLkmZySPfok2wAzgLuBd5aVY/D/D8GwFvGtrXA7gV/bM9YkyRNYObQJ/lJ4O+A36yq/a+2dZG1WuT5tibZkWTHvn37Zh1DknSIZgp9kmOZj/wXqurvx/L3D9ySGZ+fGOt7gPUL/vg6YO/Bz1lV26pqU1VtmpubO9z5JUlLmOVVNwFuAHZW1acXnLoVuHw8vhy4ZcH6R8arbzYDzx64xSNJOvrWzLDnfcCvAd9K8s2x9vvANcDNSa4EHgMuGeduAy4EdgE/BK5Y1oklSYdkydBX1ddY/L47wHmL7C/gqiOcS5K0THxnrCQ1Z+glqTlDL0nNGXpJas7QS1Jzhl6SmjP0ktScoZek5gy9JDVn6CWpOUMvSc0ZeklqztBLUnOGXpKaM/SS1Jyhl6TmDL0kNWfoJak5Qy9JzRl6SWrO0EtSc4Zekpoz9JLUnKGXpOYMvSQ1Z+glqTlDL0nNGXpJas7QS1Jzhl6Smlsz9QCSlsEnT5h6gl4++ezUEywrr+glqTlDL0nNGXpJas7QS1Jzhl6SmjP0ktScoZek5pYMfZIbkzyR5KEFaycnuT3JI+PzSWM9ST6bZFeSB5OcvZLDS5KWNssV/d8A5x+0djVwR1VtBO4YxwAXABvHx1bg+uUZU5J0uJYMfVX9C/DUQctbgO3j8XbgogXrn6959wAnJjltuYaVJB26w71H/9aqehxgfH7LWF8L7F6wb89Ye5kkW5PsSLJj3759hzmGJGkpy/3D2CyyVottrKptVbWpqjbNzc0t8xiSpAMON/TfP3BLZnx+YqzvAdYv2LcO2Hv440mSjtThhv5W4PLx+HLglgXrHxmvvtkMPHvgFo8kaRpL/priJF8Efgk4Ncke4A+Ba4Cbk1wJPAZcMrbfBlwI7AJ+CFyxAjNLkg7BkqGvqste4dR5i+wt4KojHUqStHx8Z6wkNWfoJak5Qy9JzRl6SWrO0EtSc4Zekpoz9JLUnKGXpOYMvSQ1Z+glqTlDL0nNGXpJas7QS1Jzhl6SmjP0ktScoZek5gy9JDVn6CWpOUMvSc0ZeklqztBLUnOGXpKaM/SS1Jyhl6TmDL0kNWfoJak5Qy9JzRl6SWrO0EtSc4Zekpoz9JLUnKGXpOYMvSQ1Z+glqTlDL0nNGXpJas7QS1JzKxL6JOcn+U6SXUmuXomvIUmazbKHPskxwJ8DFwBnApclOXO5v44kaTYrcUV/DrCrqh6tqv8BbgK2rMDXkSTNYCVCvxbYveB4z1iTJE1gzQo8ZxZZq5dtSrYCW8fhD5J8ZwVmWa1OBZ6ceoil5NqpJ9AEXhffm3xqsYy9Jr1tlk0rEfo9wPoFx+uAvQdvqqptwLYV+PqrXpIdVbVp6jmkg/m9OY2VuHXzb8DGJKcnOQ64FLh1Bb6OJGkGy35FX1UvJPkN4J+BY4Abq+rh5f46kqTZrMStG6rqNuC2lXhuzcRbYnqt8ntzAql62c9JJUmN+CsQJKk5Qy9JzRl6SWrO0EtSc4a+iSR/kuTNSY5NckeSJ5P86tRzSUmeS7L/oI/dSf4hyU9PPd9qYOj7+JWq2g98gPl3J78d+MS0I0kAfJr578W1zL9T/reBv2L+Fx7eOOFcq4ah7+PY8flC4ItV9dSUw0gLnF9Vn6uq56pq//j1JxdW1ZeAk6YebjUw9H38Y5J/BzYBdySZA56feCYJ4KUkH0ryhvHxoQXnfCPPUeAbphpJchKwv6peTPIm4M1V9d9Tz6XVbdyH/1Pg55kP+z3AbwH/Bby3qr424XirgqFvIsmxwK8DvziWvgr8
ZVX973RTSXot8NZNH9cD7wX+YnycPdakSSV5+3gl2EPj+F1J/mDquVYTr+ibSPJAVb17qTXpaEvyVeZfdfO5qjprrD1UVT877WSrh1f0fbyY5GcOHIz7oi9OOI90wJuq6usHrb0wySSr1Ir8mmJN4hPAXUkeHccbgCumG0f6f0+Oi5ACSHIx8Pi0I60u3rppIskbgY8D542l24HrqsqXWGpS43+X24BfAJ4G/hP4cFV9b9LBVhFD30SSm4H9wBfG0mXASVV1yXRTSZDkeOBi5v+XeTLz36dVVX805Vyribdu+njHQT94vSvJA5NNI/3YLcAzwDeAvRPPsioZ+j7uT7K5qu4BSPJzwL9OPJMEsK6qzp96iNXMWzdNJNkJvAN4bCz9FLATeIn5/ya/a6rZtLol2Qb8WVV9a+pZVitD30SSt73aeX/wpakk+TZwBvM/hP0RELz4OKoMvaQV9UoXIV58HD2GXpKa852xktScoZek5gy9JDVn6CWpOUMvSc39HwKAxZQGJcLMAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1a2365f518>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "senti[\"senti\"].value_counts().plot.bar()\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [],
   "source": [
    "import nltk.classify.util\n",
    "from nltk.classify import NaiveBayesClassifier\n",
    "import numpy as np\n",
    "import re\n",
    "import string\n",
    "import nltk\n",
    "\n",
    "cleanup_re = re.compile('[^a-z]+')\n",
    "def cleanup(sentence):\n",
    "    sentence = sentence.lower()\n",
    "    sentence = cleanup_re.sub(' ', sentence).strip()\n",
    "    #sentence = \" \".join(nltk.word_tokenize(sentence))\n",
    "    return sentence\n",
    "\n",
    "senti[\"Summary_Clean\"] = senti[\"Comments\"].apply(cleanup)\n",
    "check[\"Summary_Clean\"] = check[\"Comments\"].apply(cleanup)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [],
   "source": [
    "split = senti[[\"Summary_Clean\" , \"senti\"]]\n",
    "train=split.sample(frac=0.8,random_state=200)\n",
    "test=split.drop(train.index)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [],
   "source": [
    "def word_feats(words):\n",
    "    features = {}\n",
    "    for word in words:\n",
    "        features [word] = True\n",
    "    return features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NLTK Naive bayes Accuracy : 0.9122807017543859\n",
      "Most Informative Features\n",
      "               excellent = None              neg : pos    =      8.0 : 1.0\n",
      "                verygood = True              neg : pos    =      5.5 : 1.0\n",
      "                verygood = None              pos : neg    =      2.8 : 1.0\n",
      "                verypoor = None              pos : neg    =      1.2 : 1.0\n",
      "                    good = None              pos : neg    =      1.2 : 1.0\n"
     ]
    }
   ],
   "source": [
    "train[\"words\"] = train[\"Summary_Clean\"].str.lower().str.split()\n",
    "test[\"words\"] = test[\"Summary_Clean\"].str.lower().str.split()\n",
    "check[\"words\"] = check[\"Summary_Clean\"].str.lower().str.split()\n",
    "\n",
    "train.index = range(train.shape[0])\n",
    "test.index = range(test.shape[0])\n",
    "check.index = range(check.shape[0])\n",
    "prediction =  {} ## For storing results of different classifiers\n",
    "\n",
    "train_naive = []\n",
    "test_naive = []\n",
    "check_naive = []\n",
    "\n",
    "for i in range(train.shape[0]):\n",
    "    train_naive = train_naive +[[word_feats(train[\"words\"][i]) , train[\"senti\"][i]]]\n",
    "for i in range(test.shape[0]):\n",
    "    test_naive = test_naive +[[word_feats(test[\"words\"][i]) , test[\"senti\"][i]]]\n",
    "for i in range(check.shape[0]):\n",
    "    check_naive = check_naive +[word_feats(check[\"words\"][i])]\n",
    "\n",
    "\n",
    "classifier = NaiveBayesClassifier.train(train_naive)\n",
    "print(\"NLTK Naive bayes Accuracy : {}\".format(nltk.classify.util.accuracy(classifier , test_naive)))\n",
    "classifier.show_most_informative_features(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.metrics import roc_curve, auc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "No handles with labels found to put in legend.\n"
     ]
    },
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1a25dd1080>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def formatt(x):\n",
    "    if x == 'neg':\n",
    "        return 0\n",
    "    return 1\n",
    "vfunc = np.vectorize(formatt)\n",
    "\n",
    "cmp = 0\n",
    "colors = ['b', 'g', 'y', 'm', 'k']\n",
    "for model, predicted in prediction.items():\n",
    "    false_positive_rate, true_positive_rate, thresholds = roc_curve(test[\"senti\"].map(formatt), vfunc(predicted))\n",
    "    roc_auc = auc(false_positive_rate, true_positive_rate)\n",
    "    plt.plot(false_positive_rate, true_positive_rate, colors[cmp], label='%s: AUC %0.2f'% (model,roc_auc))\n",
    "    cmp += 1\n",
    "plt.legend(loc='lower right')\n",
    "plt.title('Classifiers comparaison with ROC')\n",
    "plt.plot([0,1],[0,1],'r--')\n",
    "plt.xlim([-0.1,1.2])\n",
    "plt.ylim([-0.1,1.2])\n",
    "plt.ylabel('True Positive Rate')\n",
    "plt.xlabel('False Positive Rate')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
