How to Read a .CSV File in Pandas

import pandas as pd
df = pd.read_csv('downloads/adeshbhai.csv')  # load the CSV file into a DataFrame
df.head()  # show the first five rows

Out[1]:

	Region	Country	Item Type	Sales Channel	Order Priority	Order Date	Order ID	Ship Date	Units Sold	Unit Price	Unit Cost	Total Revenue	Total Cost	Total Profit
0	Australia and Oceania	Tuvalu	Baby Food	Offline	H	5/28/2010	669165933	6/27/2010	9925	255.28	159.42	2533654.00	1582243.50	951410.50
1	Central America and the Caribbean	Grenada	Cereal	Online	C	8/22/2012	963881480	9/15/2012	2804	205.70	117.11	576782.80	328376.44	248406.36
2	Europe	Russia	Office Supplies	Offline	L	5/2/2014	341417157	5/8/2014	1779	651.21	524.96	1158502.59	933903.84	224598.75
3	Sub-Saharan Africa	Sao Tome and Principe	Fruits	Online	C	6/20/2014	514321792	7/5/2014	8102	9.33	6.92	75591.66	56065.84	19525.82
4	Sub-Saharan Africa	Rwanda	Office Supplies	Offline	L	2/1/2013	115456712	2/6/2013	5062	651.21	524.96	3296425.02	2657347.52	639077.50
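read_csv also accepts optional arguments that limit what gets loaded. The following is a minimal sketch, assuming a small in-memory sample instead of the real file: the `csv_text` string and the `sample` name are made up for illustration, and only a few of the 14 columns are included.

```python
import io

import pandas as pd

# Hypothetical three-row sample laid out like the file above
# (only a few columns are kept to stay short).
csv_text = """Region,Country,Item Type,Units Sold,Unit Price
Australia and Oceania,Tuvalu,Baby Food,9925,255.28
Central America and the Caribbean,Grenada,Cereal,2804,205.70
Europe,Russia,Office Supplies,1779,651.21
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Country", "Item Type", "Units Sold"],  # load only these columns
    nrows=2,  # stop after the first two data rows
)
print(sample)
```

With a real file, the same `usecols` and `nrows` arguments can be passed next to the path, which may help when the file is large.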

In [2]:

df.tail()

Out[2]:

	Region	Country	Item Type	Sales Channel	Order Priority	Order Date	Order ID	Ship Date	Units Sold	Unit Price	Unit Cost	Total Revenue	Total Cost	Total Profit
95	Sub-Saharan Africa	Mali	Clothes	Online	M	7/26/2011	512878119	9/3/2011	888	109.28	35.84	97040.64	31825.92	65214.72
96	Asia	Malaysia	Fruits	Offline	L	11/11/2011	810711038	12/28/2011	6267	9.33	6.92	58471.11	43367.64	15103.47
97	Sub-Saharan Africa	Sierra Leone	Vegetables	Offline	C	6/1/2016	728815257	6/29/2016	1485	154.06	90.93	228779.10	135031.05	93748.05
98	North America	Mexico	Personal Care	Offline	M	7/30/2015	559427106	8/8/2015	5767	81.73	56.67	471336.91	326815.89	144521.02
99	Sub-Saharan Africa	Mozambique	Household	Offline	L	2/10/2012	665095412	2/15/2012	5367	668.27	502.54	3586605.09	2697132.18	889472.91

In [7]:

import matplotlib.pyplot as plt  # import the plotting library
x = df['Region']   # store the Region column in x
y = df['Country']  # store the Country column in y
plt.plot(x, y)     # simple line plot of the two columns
plt.show()         # display the figure (needed outside a Jupyter notebook)

In [8]:

plt.scatter(x, y)  # draw a scatter plot

Out[8]:

<matplotlib.collections.PathCollection at 0x1fe0742acc8>

In [11]:

plt.bar(x, y)  # draw a bar plot

Out[11]:

<BarContainer object of 100 artists>
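Region and Country are both text columns, so the plots above only mark which categories occur together. A bar chart may be more informative when a numeric column is aggregated first. The sketch below assumes a small hand-made frame (`df_small`, with four rows copied from the head() output) rather than the full file; the variable names are illustrative only.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical subset: four rows copied from the head() output above.
df_small = pd.DataFrame({
    "Region": ["Europe", "Sub-Saharan Africa",
               "Sub-Saharan Africa", "Australia and Oceania"],
    "Total Profit": [224598.75, 19525.82, 639077.50, 951410.50],
})

# Sum the profit per region so each bar has a numeric height.
profit_by_region = df_small.groupby("Region")["Total Profit"].sum()

fig, ax = plt.subplots()
bars = ax.bar(profit_by_region.index, profit_by_region.values)
ax.set_xlabel("Region")
ax.set_ylabel("Total Profit")
ax.tick_params(axis="x", labelrotation=45)  # keep long region names readable
fig.tight_layout()
# Outside a notebook, call plt.show() here to open the figure window.
```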

In [13]:

df.index[1]  # the index (row label) at position 1

Out[13]:

1

In [15]:

df.columns[:3]  # the first three column labels of the DataFrame

Out[15]:

Index(['Region', 'Country', 'Item Type'], dtype='object')

In [20]:

df.columns[0:]  # all column labels of the DataFrame (same as df.columns)

Out[20]:

Index(['Region', 'Country', 'Item Type', 'Sales Channel', 'Order Priority',
       'Order Date', 'Order ID', 'Ship Date', 'Units Sold', 'Unit Price',
       'Unit Cost', 'Total Revenue', 'Total Cost', 'Total Profit'],
      dtype='object')

In [24]:

df.dtypes  # return the dtype of each column in the DataFrame

Out[24]:

Region             object
Country            object
Item Type          object
Sales Channel      object
Order Priority     object
Order Date         object
Order ID            int64
Ship Date          object
Units Sold          int64
Unit Price        float64
Unit Cost         float64
Total Revenue     float64
Total Cost        float64
Total Profit      float64
dtype: object
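The dtypes output shows Order Date and Ship Date stored as object (text), so date arithmetic is not yet possible on them. One possible way to convert them is pd.to_datetime. This is a sketch under assumptions: `orders` is a hand-made three-row frame using dates from the head() output, the month/day/year format is inferred from those values, and "Days to Ship" is a made-up column name.

```python
import pandas as pd

# Hypothetical subset: date strings copied from the first three rows above.
orders = pd.DataFrame({
    "Order Date": ["5/28/2010", "8/22/2012", "5/2/2014"],
    "Ship Date": ["6/27/2010", "9/15/2012", "5/8/2014"],
})

# Convert the text columns to datetime64, assuming month/day/year order.
for col in ["Order Date", "Ship Date"]:
    orders[col] = pd.to_datetime(orders[col], format="%m/%d/%Y")

# With real dates, subtraction yields a timedelta; .dt.days extracts whole days.
orders["Days to Ship"] = (orders["Ship Date"] - orders["Order Date"]).dt.days
print(orders.dtypes)
print(orders)
```

When reading from the file directly, passing `parse_dates=["Order Date", "Ship Date"]` to read_csv is an alternative that does the conversion at load time.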

In [26]:

df.ftypes  # deprecated (removed in pandas 1.0): return the ftype (sparse/dense plus dtype) of each column

C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: DataFrame.ftypes is deprecated and will be removed in a future version. Use DataFrame.dtypes instead.
  """Entry point for launching an IPython kernel.

Out[26]:

Region             object:dense
Country            object:dense
Item Type          object:dense
Sales Channel      object:dense
Order Priority     object:dense
Order Date         object:dense
Order ID            int64:dense
Ship Date          object:dense
Units Sold          int64:dense
Unit Price        float64:dense
Unit Cost         float64:dense
Total Revenue     float64:dense
Total Cost        float64:dense
Total Profit      float64:dense
dtype: object

In [29]:

df.get_dtype_counts()  # deprecated: return counts of unique dtypes; use df.dtypes.value_counts() instead

C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: `get_dtype_counts` has been deprecated and will be removed in a future version. For DataFrames use `.dtypes.value_counts()
  """Entry point for launching an IPython kernel.

Out[29]:

float64    5
int64      2
object     7
dtype: int64
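The warning above points to `.dtypes.value_counts()` as the replacement. A minimal sketch, assuming a small hand-made frame (`frame` is illustrative, not the post's data file):

```python
import pandas as pd

# Hypothetical two-row frame with one text, one integer, and two float columns.
frame = pd.DataFrame({
    "Region": ["Europe", "Asia"],
    "Order ID": [341417157, 810711038],
    "Unit Price": [651.21, 9.33],
    "Unit Cost": [524.96, 6.92],
})

# dtypes is a Series of column dtypes, so value_counts() tallies each dtype.
dtype_counts = frame.dtypes.value_counts()
print(dtype_counts)
```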

In [30]:

df.get_ftype_counts()  # deprecated: return counts of unique ftypes in this object

C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: get_ftype_counts is deprecated and will be removed in a future version
  """Entry point for launching an IPython kernel.

Out[30]:

float64:dense    5
int64:dense      2
object:dense     7
dtype: int64

In [33]:

# apply get_value() (deprecated; see the warning below)
df.get_value(1, 'Order ID')  # get_value(index, col)

C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py:2: FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead

Out[33]:

963881480

In [34]:

# the positional index of the "Region" column is 0
# takeable=True makes get_value() interpret
# index and col as integer positions instead of labels
df.get_value(4, 0, takeable=True)

C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py:4: FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
  after removing the cwd from sys.path.

Out[34]:

'Sub-Saharan Africa'
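Both warnings recommend the `.at[]` and `.iat[]` accessors in place of get_value(). A minimal sketch of the equivalent lookups, assuming a two-row hand-made frame (`frame`) with values copied from the head() output:

```python
import pandas as pd

# Hypothetical subset: the first two rows of two columns from the output above.
frame = pd.DataFrame({
    "Region": ["Australia and Oceania", "Central America and the Caribbean"],
    "Order ID": [669165933, 963881480],
})

# .at looks up a single value by row label and column label.
by_label = frame.at[1, "Order ID"]

# .iat looks up a single value by integer row and column position,
# which is what get_value(..., takeable=True) did.
by_position = frame.iat[1, 0]

print(by_label)
print(by_position)
```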

In [40]:

df.groupby('Country').mean()

Out[40]:

	Order ID	Units Sold	Unit Price	Unit Cost	Total Revenue	Total Cost	Total Profit
Country
Albania	385383069.0	2269.000000	109.280000	35.840	2.479563e+05	8.132096e+04	166635.360000
Angola	135425221.0	4187.000000	668.270000	502.540	2.798046e+06	2.104135e+06	693911.510000
Australia	283189761.0	4331.666667	301.453333	224.620	8.299778e+05	6.377761e+05	192201.706667
Austria	868214595.0	2847.000000	437.200000	263.330	1.244708e+06	7.497005e+05	495007.890000
Azerbaijan	402861845.0	4627.500000	544.205000	394.145	2.239400e+06	1.482937e+06	756463.415000
...	...	...	...	...	...	...	...
The Gambia	800142168.5	3703.250000	387.785000	285.940	1.362379e+06	1.015909e+06	346470.817500
Turkmenistan	452012574.0	4420.000000	659.740000	513.750	2.911018e+06	2.277389e+06	633629.200000
Tuvalu	669165933.0	9925.000000	255.280000	159.420	2.533654e+06	1.582244e+06	951410.500000
United Kingdom	955357205.0	282.000000	668.270000	502.540	1.884521e+05	1.417163e+05	46735.860000
Zambia	122583663.0	4085.000000	152.580000	97.440	6.232893e+05	3.980424e+05	225246.900000

76 rows × 7 columns

In [41]:

df.groupby('Region').mean()

Out[41]:

	Order ID	Units Sold	Unit Price	Unit Cost	Total Revenue	Total Cost	Total Profit
Region
Asia	4.980493e+08	5451.545455	335.809091	239.587273	1.940645e+06	1.384840e+06	555804.170000
Australia and Oceania	4.012882e+08	6211.363636	222.672727	154.744545	1.281297e+06	8.520096e+05	429287.275455
Central America and the Caribbean	7.164449e+08	5110.142857	243.172857	157.817143	1.310055e+06	9.033539e+05	406701.121429
Europe	5.843770e+08	4459.863636	328.979545	223.166364	1.516770e+06	1.013000e+06	503769.937727
Middle East and North Africa	5.028923e+08	4867.800000	241.506000	152.450000	1.405271e+06	8.291515e+05	576119.186000
North America	6.589260e+08	6381.000000	277.243333	205.293333	1.881119e+06	1.395138e+06	485980.920000
Sub-Saharan Africa	5.758950e+08	5079.722222	259.618889	183.677500	1.102001e+06	7.635783e+05	338422.538889
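The tables above average every numeric column, including Order ID, whose mean carries no meaning. In pandas 2.0 and later, calling .mean() on a grouped frame that still contains text columns raises a TypeError unless numeric_only=True is passed, so selecting the columns of interest first is one way to keep the call working. A sketch, assuming a three-row hand-made frame (`frame`) with values copied from the outputs above:

```python
import pandas as pd

# Hypothetical subset: three rows copied from the head() and tail() outputs.
frame = pd.DataFrame({
    "Region": ["Sub-Saharan Africa", "Sub-Saharan Africa", "Europe"],
    "Country": ["Rwanda", "Mali", "Russia"],
    "Units Sold": [5062, 888, 1779],
    "Total Profit": [639077.50, 65214.72, 224598.75],
})

# Pick the numeric columns before aggregating so text columns are never averaged.
region_means = frame.groupby("Region")[["Units Sold", "Total Profit"]].mean()
print(region_means)
```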
