Jupyter Notebook – Lab Session 1 – Exploring a Dataset with Pandas and NumPy

Importing Libraries

import pandas as pd
import numpy as np

Explanation: Import Pandas for tabular data and NumPy for numerical operations, using their conventional aliases pd and np.

Loading the Dataset

df = pd.read_csv('/path_to_your_dataset.csv')

Explanation: Load the CSV file into a Pandas DataFrame. Replace the placeholder path with the location of your own file.
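To try the loading step without a file on disk, the sketch below feeds read_csv an in-memory text buffer instead of a path. The CSV content and its column names (name, age, city) are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content; in the lab you would pass a file path instead.
csv_text = "name,age,city\nAsha,31,Pune\nBen,,Delhi\n"

# read_csv accepts a path or any file-like object, such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.shape)
```

The empty age field for the second row is read as a missing value, which is handy for practicing the null-handling steps later in this lab.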

Display First Few Rows

df.head()

Explanation: Display the first five rows to understand the structure.

Display Last Few Rows

df.tail()

Explanation: Display the last five rows of the dataset.

Dataset Information

df.info()

Explanation: Get an overview of the DataFrame: column names, data types, non-null counts per column, and memory usage.

Descriptive Statistics

df.describe()

Explanation: Get summary statistics for each numeric column: count, mean, standard deviation, min, the 25th, 50th (median), and 75th percentiles, and max.
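As a small illustration, the sketch below runs describe() on a made-up table (the age and city columns are hypothetical) and checks that the 50% row is the median. By default only numeric columns are summarized; include="all" adds the others.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "age": [22, 35, 58, 41],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
})

# Default: numeric columns only.
stats = df.describe()

# The "50%" row is the median of the column.
median_from_describe = stats.loc["50%", "age"]

# include="all" also summarizes non-numeric columns (unique, top, freq).
stats_all = df.describe(include="all")

print(stats)
print(stats_all)
```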

Column Names

df.columns

Explanation: List all column names in the dataset.

Shape of the Dataset

df.shape

Explanation: Get the number of rows and columns as a (rows, columns) tuple.

Check for Null Values

df.isnull().sum()

Explanation: Count null values in each column.

Drop Rows with Null Values

df_cleaned = df.dropna()

Explanation: Create a new DataFrame without any row that contains a null value. The original df is left unchanged.

Fill Null Values

df.fillna(value='Unknown', inplace=True)

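Explanation: Replace every null value in the DataFrame with the placeholder 'Unknown', modifying df in place. This also writes the string into numeric columns, which can turn them into text (object) columns, so a numeric fill value is usually a better choice there.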
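The three null-handling steps can be tried together on a small made-up table. This is a minimal sketch: the city and sales columns are hypothetical, and filling the numeric column with its median is one possible choice, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value per column.
df = pd.DataFrame({
    "city": ["Pune", None, "Delhi", "Pune"],
    "sales": [100.0, 250.0, np.nan, 80.0],
})

# Count missing values in each column.
null_counts = df.isnull().sum()

# Drop every row that has at least one missing value.
df_cleaned = df.dropna()

# Fill each column with a value that matches its type:
# a text placeholder for city, the column median for sales.
df_filled = df.fillna({"city": "Unknown", "sales": df["sales"].median()})

print(null_counts)
print(df_cleaned)
print(df_filled)
```

Passing a dictionary to fillna keeps the sales column numeric, which matters for the statistics and plots later in the lab.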

Unique Values in a Column

df['column_name'].unique()

Explanation: Display unique values in a specific column.

Value Counts

df['column_name'].value_counts()

Explanation: Count the occurrences of each unique value in a column.
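A short sketch of both calls on a made-up color column (the data is hypothetical). It also shows normalize=True, which returns proportions instead of counts.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"color": ["red", "blue", "red", "green", "red", "blue"]})

# Distinct values, in order of first appearance.
unique_colors = df["color"].unique()

# Occurrences of each value, most frequent first.
counts = df["color"].value_counts()

# Share of each value instead of the raw count.
shares = df["color"].value_counts(normalize=True)

print(unique_colors)
print(counts)
print(shares)
```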

Filter Rows by Condition

df_filtered = df[df['column_name'] > some_value]

Explanation: Keep only the rows where the condition is true. Here some_value is a placeholder for your own threshold.
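To make the filter concrete, here is a minimal sketch on a made-up product table. The column names and the threshold of 100 are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "product": ["A", "B", "C", "D"],
    "price": [120, 45, 300, 80],
    "in_stock": [True, False, True, True],
})

some_value = 100  # hypothetical threshold

# Single condition: rows with a price above the threshold.
df_filtered = df[df["price"] > some_value]

# Combined conditions use & (and) or | (or), each wrapped in parentheses.
df_cheap_in_stock = df[(df["price"] < some_value) & (df["in_stock"])]

print(df_filtered)
print(df_cheap_in_stock)
```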

Selecting Multiple Columns

df[['column1', 'column2']]

Explanation: Select and display specific columns.

Add a New Column

df['new_column'] = df['column1'] + df['column2']

Explanation: Add a new column by combining values from other columns.

Rename Columns

df.rename(columns={'old_name': 'new_name'}, inplace=True)

Explanation: Rename columns for better readability.

Sorting Values

df.sort_values(by='column_name', ascending=False)

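Explanation: Sort the dataset by a specific column in descending order. The call returns a sorted copy and leaves df unchanged, so assign the result to a variable to keep it.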
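The sketch below sorts a made-up table (hypothetical name, dept, and salary columns) by one column and then by two, and shows that the original DataFrame keeps its order.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "dept": ["x", "y", "x", "y"],
    "salary": [50, 70, 90, 60],
})

# Sort by one column, highest salary first; df itself is not modified.
df_sorted = df.sort_values(by="salary", ascending=False)

# Sort by department (ascending), then salary (descending) within each one.
df_multi = df.sort_values(by=["dept", "salary"], ascending=[True, False])
df_multi = df_multi.reset_index(drop=True)  # renumber rows 0..n-1

print(df_sorted)
print(df_multi)
```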

Drop a Column

df.drop('column_name', axis=1, inplace=True)

Explanation: Remove a specific column.

Group By and Aggregate

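df.groupby('column_name').sum(numeric_only=True)

Explanation: Group rows by a column and apply an aggregate function such as sum to each group. Passing numeric_only=True keeps non-numeric columns out of the result.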
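A minimal sketch of grouping on a made-up sales table; the region, rep, and sales columns and the output column names are hypothetical. It shows a single aggregate on one column and several aggregates at once with agg.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "region": ["North", "South", "North", "South", "East"],
    "rep": ["Asha", "Ben", "Chen", "Dev", "Eli"],
    "sales": [100, 200, 150, 50, 300],
})

# Total sales per region (select the column before aggregating).
totals = df.groupby("region")["sales"].sum()

# Several aggregates at once, with explicit output column names.
summary = df.groupby("region").agg(
    total_sales=("sales", "sum"),
    mean_sales=("sales", "mean"),
    orders=("sales", "count"),
)

print(totals)
print(summary)
```

Selecting the column first avoids aggregating text columns such as rep, which rarely have a meaningful sum.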

Calculate Mean of a Column

df['column_name'].mean()

Explanation: Calculate the mean of a specific column.

Calculate Median of a Column

df['column_name'].median()

Explanation: Calculate the median of a specific column.

Standard Deviation of a Column

df['column_name'].std()

Explanation: Calculate the standard deviation of a specific column.

Detecting Outliers

df[(df['column_name'] > upper_limit) | (df['column_name'] < lower_limit)]

Explanation: Select the rows whose value lies above upper_limit or below lower_limit. Both names are placeholders that must be defined before this line runs.
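The text does not say how the limits are chosen. One common convention, used here purely as an assumption, is the interquartile range (IQR) rule: values more than 1.5 × IQR beyond the first or third quartile are flagged. The value column and its data are made up.

```python
import pandas as pd

# Hypothetical sample data with one obvious outlier.
df = pd.DataFrame({"value": [10, 12, 11, 13, 12, 11, 95]})

# Quartiles and interquartile range.
q1 = df["value"].quantile(0.25)
q3 = df["value"].quantile(0.75)
iqr = q3 - q1

# Limits from the 1.5 * IQR rule (an assumed convention, not the only option).
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Same filter as in the lab step, now with concrete limits.
outliers = df[(df["value"] > upper_limit) | (df["value"] < lower_limit)]

print(lower_limit, upper_limit)
print(outliers)
```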

Apply Custom Function

df['new_column'] = df['column_name'].apply(lambda x: x * 2)

Explanation: Apply a custom function to each value in a column.
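As an illustration, the sketch below applies a named function with branching logic to a made-up amount column; the function name size_label and its thresholds are hypothetical. For plain arithmetic such as doubling, operating on the column directly gives the same result as apply and is usually faster.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"amount": [5, 20, 75]})


def size_label(x):
    """Map a number to a coarse label (thresholds are made up)."""
    if x < 10:
        return "small"
    if x < 50:
        return "medium"
    return "large"


# apply calls the function once per value in the column.
df["size"] = df["amount"].apply(size_label)

# Simple arithmetic works on the whole column without apply.
df["doubled"] = df["amount"] * 2

print(df)
```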

Pivot Table

df.pivot_table(values='value_column', index='index_column', columns='column_name')

Explanation: Create a pivot table that summarizes value_column for each combination of index_column and column_name. The default aggregation is the mean.
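A minimal sketch of a pivot table on made-up sales data; the region, quarter, and sales columns are hypothetical. It compares the default aggregation (mean) with an explicit sum.

```python
import pandas as pd

# Hypothetical sample data; North/Q1 appears twice on purpose.
df = pd.DataFrame({
    "region": ["North", "North", "South", "South", "North"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1"],
    "sales": [100, 150, 200, 50, 300],
})

# Default aggregation: mean of sales per region and quarter.
pivot_mean = df.pivot_table(values="sales", index="region", columns="quarter")

# Explicit aggregation: total sales, with 0 for empty combinations.
pivot_sum = df.pivot_table(
    values="sales",
    index="region",
    columns="quarter",
    aggfunc="sum",
    fill_value=0,
)

print(pivot_mean)
print(pivot_sum)
```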

Correlation Matrix

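df.corr(numeric_only=True)

Explanation: Calculate the pairwise correlation matrix for the numeric columns. Without numeric_only=True, recent Pandas versions raise an error if the DataFrame contains non-numeric columns.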
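To see what the matrix looks like, here is a small sketch on made-up data in which score rises and errors falls exactly in step with hours. All column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: score rises and errors falls linearly with hours.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 54, 56, 58, 60],
    "errors": [10, 8, 6, 4, 2],
    "name": ["a", "b", "c", "d", "e"],
})

# Pearson correlation between every pair of numeric columns;
# the text column "name" is skipped.
corr = df.corr(numeric_only=True)

print(corr)
```

Values near 1 indicate a strong positive linear relationship, values near -1 a strong negative one, and values near 0 little linear relationship.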

Visualizing with Histograms

df['column_name'].hist()

Explanation: Plot a histogram of a numeric column to view its distribution. Pandas plotting requires Matplotlib to be installed.

Scatter Plot

df.plot.scatter(x='column_x', y='column_y')

Explanation: Create a scatter plot to see relationships between two columns.

Box Plot

df.boxplot(column='column_name')

Explanation: Generate a box plot to identify the spread and outliers.

Live Example Dataset Attached

Download the dataset from here.
