Today let’s explore the unemployment rate across different countries of the world. The dataset we will be using is "Unemployment data - World wide Figures", available on the Kaggle platform. It contains the unemployment rate of each country for 31 years, from 1991 to 2021. The columns of the dataset are Country Name, Country Code, and one column per year from 1991 to 2021.

Note: For a quick Pandas revision, you can refer to this blog: Tutorial: Pandas


The unemployment rate is the number of unemployed people in a country divided by the total number of people in the civilian labor force, usually expressed as a percentage.
Source
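To make the formula concrete, here is a minimal sketch. The function name and the sample figures are made up for this example and do not come from the dataset.

```python
def unemployment_rate(unemployed: float, labor_force: float) -> float:
    """Return the unemployment rate as a percentage of the labor force."""
    if labor_force <= 0:
        raise ValueError("labor_force must be positive")
    return 100 * unemployed / labor_force


# Hypothetical figures: 2 million unemployed in a labor force of 40 million.
rate = unemployment_rate(2_000_000, 40_000_000)
print(rate)  # 5.0
```

The dataset already stores this percentage for every country and year, so we never have to compute it ourselves in the rest of the blog.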

1. Import important libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

2. Reading the CSV file

df = pd.read_csv("/content/unemployment analysis.csv")
df.head()

3. Data Summary

I/P:
df.columns

O/P:
Index(['Country Name', 'Country Code', '1991', '1992', '1993', '1994', '1995',
       '1996', '1997', '1998', '1999', '2000', '2001', '2002', '2003', '2004',
       '2005', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013',
       '2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021'],
      dtype='object')

The info() method prints information about the DataFrame. The information contains the number of columns, column labels, column data types, memory usage, range index, and the number of cells in each column (non-null values).

df.info()

Observations:
* There are no missing values
* There are 33 columns in total
* There are 235 rows, each holding a country name, a country code, and the unemployment rate for every year from 1991 to 2021
* There are 2 categorical features, Country Name and Country Code; the rest are numeric features
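The observations above can also be checked in code instead of being read off the info() printout. This is only a sketch on a tiny made-up frame (the `toy` DataFrame and its values are assumptions for illustration); on the real data you would use `df` in its place.

```python
import pandas as pd

# Toy stand-in for the real dataset (names and values are made up).
toy = pd.DataFrame({
    "Country Name": ["A-land", "B-land", "C-land"],
    "Country Code": ["AAA", "BBB", "CCC"],
    "1991": [5.0, 7.5, 3.2],
    "1992": [5.5, 7.0, 3.0],
})

n_rows, n_cols = toy.shape
# Split the columns into non-numeric (categorical) and numeric ones.
categorical = toy.select_dtypes(exclude="number").columns.tolist()
numeric = toy.select_dtypes(include="number").columns.tolist()
total_missing = int(toy.isna().sum().sum())

print(n_rows, n_cols)   # 3 4
print(categorical)      # ['Country Name', 'Country Code']
print(numeric)          # ['1991', '1992']
print(total_missing)    # 0
```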

4. Finding and Removing the missing values

The isna() function is used to detect missing values.

I/P:
df.isna().sum()

O/P:
Country Name    0
Country Code    0
1991            0
1992            0
1993            0
1994            0
1995            0
1996            0
1997            0
1998            0
1999            0
2000            0
2001            0
2002            0
2003            0
2004            0
2005            0
2006            0
2007            0
2008            0
2009            0
2010            0
2011            0
2012            0
2013            0
2014            0
2015            0
2016            0
2017            0
2018            0
2019            0
2020            0
2021            0
dtype: int64

Observations:
* There are no missing values
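Our dataset has nothing to remove, but for completeness here is a small sketch of what the removal step could look like when values are missing. The `toy` frame and its values are made up, and dropping rows with dropna() is just one option among several (filling with fillna() is another).

```python
import numpy as np
import pandas as pd

# Toy frame with one missing value (names and values are made up).
toy = pd.DataFrame({
    "Country Name": ["A-land", "B-land", "C-land"],
    "1991": [5.0, np.nan, 3.2],
    "1992": [5.5, 7.0, 3.0],
})

missing_per_column = toy.isna().sum()   # count missing values per column
cleaned = toy.dropna()                  # drop every row that has a missing value

print(missing_per_column["1991"])  # 1
print(len(cleaned))                # 2
```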

Top 10 Countries with highest Unemployment rate

I/P:
top_10 = df.groupby(by='Country Name')['2021'].sum().sort_values(ascending=False).head(10)
top_10

O/P:
Country Name
South Africa                            33.56 
Djibouti                                28.39 
Eswatini                                25.76 
West Bank and Gaza                      24.90 
Botswana                                24.72 
Lesotho                                 24.60 
Congo, Rep.                             23.01 
Gabon                                   22.26 
Namibia                                 21.68 
St. Vincent and the Grenadines          21.62 
Name: 2021, dtype: float64

Observations:
In 2021, the countries with the highest unemployment rates were South Africa, Djibouti, and Eswatini.

Top 10 Countries with the lowest Unemployment rate

I/P:
bottom_10 = df.groupby(by='Country Name')['2021'].sum().sort_values(ascending=True).head(10)
bottom_10

O/P:
Country Name 
Qatar              0.26 
Cambodia           0.61 
Niger              0.75 
Solomon Islands    1.03 
Lao PDR            1.26 
Thailand           1.42 
Benin              1.57 
Rwanda             1.61 
Burundi            1.79 
Bahrain            1.87 
Name: 2021, dtype: float64

Observations:
In 2021, the countries with the lowest unemployment rates were Qatar, Cambodia, and Niger.

Just because a country has a low unemployment rate, does not mean its citizens are necessarily well-off. That is determined by GDP per capita. - Source
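As a side note, if every country appears in exactly one row (an assumption about the data, not something checked above), the two rankings can also be obtained without groupby. The sketch below uses nlargest() and nsmallest() on a made-up `toy` frame.

```python
import pandas as pd

# Toy frame with made-up names and values.
toy = pd.DataFrame({
    "Country Name": ["A-land", "B-land", "C-land", "D-land"],
    "2021": [12.4, 3.1, 25.0, 0.9],
})

# Index the 2021 column by country so the result is labelled by name.
rates_2021 = toy.set_index("Country Name")["2021"]
highest = rates_2021.nlargest(2)
lowest = rates_2021.nsmallest(2)

print(highest.index.tolist())  # ['C-land', 'A-land']
print(lowest.index.tolist())   # ['D-land', 'B-land']
```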

Visualizing the Unemployment rate of the world in the year 1991

# color must point at the '1991' column so the map matches the title
fig = px.choropleth(df, locations='Country Name', locationmode='country names',
                    color='1991', hover_name='Country Name',
                    title='1991 Unemployment rate', color_continuous_scale='aggrnyl')
fig.show()

Visualizing the Unemployment rate of the world in the year 2021

Switching rows and Columns

df = df.set_index("Country Name").transpose()
df.index.names = ["Year"]
df.head()

Note:
The set_index() method lets one or more columns become the row index.
Syntax: dataframe.set_index(keys, drop, append, inplace, verify_integrity)
The transpose() function swaps the index and the columns: it reflects the DataFrame over its main diagonal by writing rows as columns and vice versa.
Source: W3School
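To see what these two calls do, here is a small sketch on a made-up `toy` frame (the names and values are assumptions for illustration). Countries become columns and years become the row index.

```python
import pandas as pd

# Toy frame in the same wide layout as the dataset (made-up values).
toy = pd.DataFrame({
    "Country Name": ["A-land", "B-land"],
    "1991": [5.0, 7.5],
    "1992": [5.5, 7.0],
})

wide = toy.set_index("Country Name").transpose()
wide.index.names = ["Year"]

print(wide.shape)                  # (2, 2)
print(wide.index.tolist())         # ['1991', '1992']
print(wide.columns.tolist())       # ['A-land', 'B-land']
print(wide.loc["1992", "B-land"])  # 7.0
```

One thing to keep in mind: in the real dataset the Country Code column is still present when we transpose, so it ends up as the first row of the new frame and the columns are stored with the object dtype. Dropping Country Code before transposing is one way to avoid that.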

Number of Countries:

I/P:
print(df.columns.tolist())

O/P:
['Africa Eastern and Southern', 'Afghanistan', 'Africa Western and Central', 'Angola', 'Albania', 'Arab World', 'United Arab Emirates', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Burundi', 'Belgium', 'Benin', 'Burkina Faso', 'Bangladesh', 'Bulgaria', 'Bahrain', 'Bahamas, The', 'Bosnia and Herzegovina', 'Belarus', 'Belize', 'Bolivia', 'Brazil', 'Barbados', 'Brunei Darussalam', 'Bhutan', 'Botswana', 'Central African Republic', 'Canada', 'Central Europe and the Baltics', 'Switzerland', 'Channel Islands', 'Chile', 'China', "Cote d'Ivoire", 'Cameroon', 'Congo, Dem. Rep.', 'Congo, Rep.', 'Colombia', 'Comoros', 'Cabo Verde', 'Costa Rica', 'Caribbean small states', 'Cuba', 'Cyprus', 'Czech Republic', 'Germany', 'Djibouti', 'Denmark', 'Dominican Republic', 'Algeria', 'East Asia & Pacific (excluding high income)', 'Early-demographic dividend', 'East Asia & Pacific', 'Europe & Central Asia (excluding high income)', 'Europe & Central Asia', 'Ecuador', 'Egypt, Arab Rep.', 'Euro area', 'Eritrea', 'Spain', 'Estonia', 'Ethiopia', 'European Union', 'Fragile and conflict affected situations', 'Finland', 'Fiji', 'France', 'Gabon', 'United Kingdom', 'Georgia', 'Ghana', 'Guinea', 'Gambia, The', 'Guinea-Bissau', 'Equatorial Guinea', 'Greece', 'Guatemala', 'Guam', 'Guyana', 'High income', 'Hong Kong SAR, China', 'Honduras', 'Heavily indebted poor countries (HIPC)', 'Croatia', 'Haiti', 'Hungary', 'IBRD only', 'IDA & IBRD total', 'IDA total', 'IDA blend', 'Indonesia', 'IDA only', 'India', 'Ireland', 'Iran, Islamic Rep.', 'Iraq', 'Iceland', 'Israel', 'Italy', 'Jamaica', 'Jordan', 'Japan', 'Kazakhstan', 'Kenya', 'Kyrgyz Republic', 'Cambodia', 'Korea, Rep.', 'Kuwait', 'Latin America & Caribbean (excluding high income)', 'Lao PDR', 'Lebanon', 'Liberia', 'Libya', 'St. Lucia', 'Latin America & Caribbean', 'Least developed countries: UN classification', 'Low income', 'Sri Lanka', 'Lower middle income', 'Low & middle income', 'Lesotho', 'Late-demographic dividend', 'Lithuania', 'Luxembourg', 'Latvia', 'Macao SAR, China', 'Morocco', 'Moldova', 'Madagascar', 'Maldives', 'Middle East & North Africa', 'Mexico', 'Middle income', 'North Macedonia', 'Mali', 'Malta', 'Myanmar', 'Middle East & North Africa (excluding high income)', 'Montenegro', 'Mongolia', 'Mozambique', 'Mauritania', 'Mauritius', 'Malawi', 'Malaysia', 'North America', 'Namibia', 'New Caledonia', 'Niger', 'Nigeria', 'Nicaragua', 'Netherlands', 'Norway', 'Nepal', 'New Zealand', 'OECD members', 'Oman', 'Other small states', 'Pakistan', 'Panama', 'Peru', 'Philippines', 'Papua New Guinea', 'Poland', 'Pre-demographic dividend', 'Puerto Rico', "Korea, Dem. People's Rep.", 'Portugal', 'Paraguay', 'West Bank and Gaza', 'Pacific island small states', 'Post-demographic dividend', 'French Polynesia', 'Qatar', 'Romania', 'Russian Federation', 'Rwanda', 'South Asia', 'Saudi Arabia', 'Sudan', 'Senegal', 'Singapore', 'Solomon Islands', 'Sierra Leone', 'El Salvador', 'Somalia', 'Serbia', 'Sub-Saharan Africa (excluding high income)', 'South Sudan', 'Sub-Saharan Africa', 'Small states', 'Sao Tome and Principe', 'Suriname', 'Slovak Republic', 'Slovenia', 'Sweden', 'Eswatini', 'Syrian Arab Republic', 'Chad', 'East Asia & Pacific (IDA & IBRD countries)', 'Europe & Central Asia (IDA & IBRD countries)', 'Togo', 'Thailand', 'Tajikistan', 'Turkmenistan', 'Latin America & the Caribbean (IDA & IBRD countries)', 'Timor-Leste', 'Middle East & North Africa (IDA & IBRD countries)', 'Tonga', 'South Asia (IDA & IBRD)', 'Sub-Saharan Africa (IDA & IBRD countries)', 'Trinidad and Tobago', 'Tunisia', 'Turkiye', 'Tanzania', 'Uganda', 'Ukraine', 'Upper middle income', 'Uruguay', 'United States', 'Uzbekistan', 'St. Vincent and the Grenadines', 'Venezuela, RB', 'Virgin Islands (U.S.)', 'Vietnam', 'Vanuatu', 'World', 'Samoa', 'Yemen, Rep.', 'South Africa', 'Zambia', 'Zimbabwe']
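Notice that the list mixes individual countries with aggregates such as 'World', 'High income' and 'Arab World', so the number of columns is not the number of countries. The sketch below shows one way to count the entries and filter out aggregates. The `aggregates` set is a hypothetical, hand-picked sample (the real one would be much longer), and the `toy` frame is made up.

```python
import pandas as pd

# Toy transposed frame: columns are entries, rows are years (made-up values).
toy = pd.DataFrame(
    {"A-land": [5.0], "B-land": [7.5], "World": [6.0], "High income": [5.5]},
    index=pd.Index(["2021"], name="Year"),
)

# Hypothetical, hand-picked set of aggregate labels to exclude.
aggregates = {"World", "High income"}

n_entries = len(toy.columns)
countries_only = [name for name in toy.columns if name not in aggregates]

print(n_entries)       # 4
print(countries_only)  # ['A-land', 'B-land']
```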

Unemployment Rates for the World’s major Economies as of 1999

I/P:
Country = ["United States", "China", "Japan", "Germany", "India", "United Kingdom", "France", "Italy", "Canada"]
for i in Country:
    print(f'{i} ~~~~> {df[i]["1999"]}')

Observation:
At the end of the 20th century, China had the lowest unemployment rate among the major economies, while France and Italy had the highest.
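Rather than comparing the printed numbers by eye, the lowest and highest rates for a year can be picked out by sorting. This is a sketch on a made-up `toy` frame in the transposed layout; on the real data you would use `df` and the `Country` list in place of `toy` and `selected`.

```python
import pandas as pd

# Toy transposed frame: rows are years, columns are countries (made-up values).
toy = pd.DataFrame(
    {"A-land": [4.0, 4.2], "B-land": [9.5, 9.1], "C-land": [6.3, 6.0]},
    index=pd.Index(["1998", "1999"], name="Year"),
)
selected = ["A-land", "B-land", "C-land"]

# Take the 1999 row for the selected countries and sort it ascending.
ranked = toy.loc["1999", selected].sort_values()

print(ranked.idxmin())  # A-land
print(ranked.idxmax())  # B-land
```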

Effect of 2008 recession on World’s major economies

I/P:
Country = ["United States", "China", "Japan", "Germany", "India", "United Kingdom", "France", "Italy", "Canada"]
for i in Country:
    print(i)
    print("Before 2008 Recession", df[i]["2007"])
    print("After 2008 Recession", df[i]["2009"])
    print()

Observations:
* Among these economies, the USA was hit hardest by the 2008 recession
* Compared to the other major economies, India’s unemployment rate barely changed
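To back an observation like this with a number, we can subtract the two years directly. The sketch below does that on a made-up `toy` frame (names and values are assumptions for illustration); the result is the change in percentage points per country.

```python
import pandas as pd

# Toy transposed frame: rows are years, columns are countries (made-up values).
toy = pd.DataFrame(
    {"A-land": [4.5, 9.0], "B-land": [8.0, 8.5]},
    index=pd.Index(["2007", "2009"], name="Year"),
)

# Change in percentage points between the two years, per country.
change = toy.loc["2009"] - toy.loc["2007"]

print(change["A-land"])  # 4.5
print(change["B-land"])  # 0.5
print(change.idxmax())   # A-land
```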

Effect of COVID-19 pandemic on World’s major economies

for i in Country:
  print(i)
  print("Before Pandemic",df[f"{i}"]["2018"])
  print("After Pandemic",df[f"{i}"]["2021"]) 
  print()
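Comparing only 2018 and 2021 skips the years in between, so it can help to look at the whole window. Here is a sketch using label slicing on the transposed frame; the `toy` frame and its values are made up for illustration.

```python
import pandas as pd

# Toy transposed frame: rows are years, columns are countries (made-up values).
toy = pd.DataFrame(
    {"A-land": [4.0, 3.8, 8.1, 5.4], "B-land": [5.0, 5.2, 6.0, 5.9]},
    index=pd.Index(["2018", "2019", "2020", "2021"], name="Year"),
)

# Label slicing with .loc includes both end points.
window = toy.loc["2018":"2021"]
peak_year = window.idxmax()   # year with the highest rate, per country

print(window.shape)         # (4, 2)
print(peak_year["A-land"])  # 2020
print(peak_year["B-land"])  # 2020
```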

More EDA blogs are available below; check them out to pick up new concepts.

EDA ~ World Population

EDA ~ Top 100 Richest People in the World dataset

References:

W3School

Unemployment rate dataset
