In [1]:
import pandas as pd
In [2]:
df = pd.read_csv('words.csv', index_col='Word')
In [3]:
df.head()
Out[3]:
        Char Count  Value
Word
aa               2      2
aah              3     10
aahed            5     19
aahing           6     40
aahs             4     29

Activities

How many elements does this dataframe have?
In [4]:
df.shape
Out[4]:
(172821, 2)
What is the value of the word microspectrophotometries?
In [29]:
df.loc["microspectrophotometries"]
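A minimal sketch of a label lookup, assuming the frame is indexed by Word as above. The `sample` frame and the `word_value` helper are hypothetical; `sample` is a stand-in built from the rows shown by `df.head()`, so it does not contain microspectrophotometries and does not answer the question itself.

```python
import pandas as pd

# Hypothetical stand-in for words.csv, copied from the df.head() output.
sample = pd.DataFrame(
    {"Char Count": [2, 3, 5, 6, 4], "Value": [2, 10, 19, 40, 29]},
    index=pd.Index(["aa", "aah", "aahed", "aahing", "aahs"], name="Word"),
)


def word_value(frame, word):
    """Return the Value of `word`, or None if the word is not in the index."""
    if word not in frame.index:
        return None
    # .at is the scalar accessor; frame.loc[word, "Value"] would work as well.
    return int(frame.at[word, "Value"])


print(word_value(sample, "aahed"))
print(word_value(sample, "microspectrophotometries"))
```

With the real data, `word_value(df, "microspectrophotometries")` would return the requested value.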
What is the highest possible value of a word?
In [7]:
df["Value"].max()
Out[7]:
319
Which of the following words have a Char Count of 15?
In [ ]:
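One possible approach, sketched under assumptions: the candidate words for this question are not listed in the notebook, so the list below is hypothetical, and `sample` is a stand-in built from rows that appear in later outputs. Because none of those rows has a Char Count of 15, the demo filters on 18 instead.

```python
import pandas as pd

# Hypothetical stand-in for words.csv, built from rows shown in this notebook.
sample = pd.DataFrame(
    {"Char Count": [17, 18, 18, 19], "Value": [260, 260, 260, 260]},
    index=pd.Index(
        [
            "hydroxytryptamine",
            "neuropsychologists",
            "psychophysiologist",
            "revolutionarinesses",
        ],
        name="Word",
    ),
)


def words_with_length(frame, candidates, length):
    """Return the candidate words whose Char Count equals `length`."""
    # reindex tolerates candidates that are missing from the index (NaN rows).
    counts = frame["Char Count"].reindex(candidates)
    return counts[counts == length].index.tolist()


# Hypothetical candidate list; with the real data, pass length=15.
candidates = ["hydroxytryptamine", "neuropsychologists", "notaword"]
matches = words_with_length(sample, candidates, 18)
print(matches)
```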
 
What is the highest possible length of a word?
In [8]:
df["Char Count"].max()
Out[8]:
28
What is the word with the value of 319?
In [ ]:
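A sketch of how the word behind a maximum could be recovered, assuming Word is the index. The `sample` frame is a hypothetical stand-in built from five rows shown elsewhere in this notebook, so its maximum is 111 rather than 319.

```python
import pandas as pd

# Hypothetical stand-in for words.csv, built from rows shown in this notebook.
sample = pd.DataFrame(
    {"Char Count": [2, 5, 3, 4, 3], "Value": [45, 111, 66, 88, 65]},
    index=pd.Index(["xu", "muzzy", "wry", "xyst", "tux"], name="Word"),
)

# idxmax returns the index label (the word) of the first maximum.
top_word = sample["Value"].idxmax()

# A boolean mask keeps every row that ties for the maximum.
top_rows = sample.loc[sample["Value"] == sample["Value"].max()]

print(top_word)
print(top_rows)
```

With the real data, `df.loc[df["Value"] == 319]` would list every word with that value.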
 
What is the most common value?
In [10]:
df['Value'].value_counts().head()
Out[10]:
Value
93     1965
100    1921
95     1915
99     1907
92     1902
Name: count, dtype: int64
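The most common value can also be read off directly instead of scanning the counts. This is a minimal sketch on a hypothetical six-element series, not on the real column.

```python
import pandas as pd

# Hypothetical values; the real column is df['Value'].
values = pd.Series([93, 100, 93, 95, 100, 93], name="Value")

# value_counts sorts by frequency, so idxmax gives the most frequent value.
counts = values.value_counts()
most_common = int(counts.idxmax())
how_often = int(counts.max())

# mode() returns every value tied for most frequent, as a Series.
modes = values.mode().tolist()

print(most_common, how_often, modes)
```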
What is the shortest word with value 274?
In [15]:
df.loc[
    (df['Value'] == 274) &
    (df['Char Count'] == 20)
]
Out[15]:
                      Char Count  Value
Word
overprotectivenesses          20    274
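The cell above hardcodes a Char Count of 20, which presupposes the answer. A sketch that derives the minimum length from the data is shown here; `shortest_with_value` is a hypothetical helper and `sample` is a stand-in built from rows shown in this notebook.

```python
import pandas as pd

# Hypothetical stand-in for words.csv, built from rows shown in this notebook.
sample = pd.DataFrame(
    {"Char Count": [20, 17, 18, 20], "Value": [274, 260, 260, 260]},
    index=pd.Index(
        [
            "overprotectivenesses",
            "hydroxytryptamine",
            "neuropsychologists",
            "countermobilizations",
        ],
        name="Word",
    ),
)


def shortest_with_value(frame, value):
    """Return all rows with the given Value that tie for the lowest Char Count."""
    subset = frame.loc[frame["Value"] == value]
    return subset.loc[subset["Char Count"] == subset["Char Count"].min()]


print(shortest_with_value(sample, 274))
print(shortest_with_value(sample, 260))
```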
Create a column Ratio which represents the 'Value Ratio' of a word
In [16]:
df['Ratio'] = df['Value'] / df['Char Count']
What is the maximum value of Ratio?
In [18]:
df['Ratio'].max()
Out[18]:
22.5
What word is the one with the highest Ratio?
In [20]:
df.sort_values(by='Ratio', ascending=False).head()
Out[20]:
       Char Count  Value      Ratio
Word
xu              2     45  22.500000
muzzy           5    111  22.200000
wry             3     66  22.000000
xyst            4     88  22.000000
tux             3     65  21.666667
How many words have a Ratio of 10?
In [23]:
df.loc[df['Ratio'] == 10].shape
Out[23]:
(2604, 3)
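Comparing a float column with `==` works here because a Value that is exactly ten times the Char Count divides to exactly 10.0. As a sketch of two alternatives that avoid relying on that, the count can be taken from an integer comparison or from a tolerance-based one. The `sample` frame is a hypothetical stand-in built from rows shown in this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for words.csv, built from rows shown in this notebook.
sample = pd.DataFrame(
    {"Char Count": [18, 18, 2, 3], "Value": [180, 180, 45, 10]},
    index=pd.Index(
        ["electrodesiccation", "phonocardiographic", "xu", "aah"], name="Word"
    ),
)
ratio = sample["Value"] / sample["Char Count"]

# Integer comparison: Ratio == 10 is the same as Value == 10 * Char Count.
exact_count = int((sample["Value"] == 10 * sample["Char Count"]).sum())

# Tolerance-based comparison on the float ratio.
close_count = int(np.isclose(ratio, 10).sum())

print(exact_count, close_count)
```

Summing a boolean mask counts the matching rows directly, without building the filtered frame first.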
What is the maximum Value of all the words with a Ratio of 10?
In [26]:
df.query("Ratio == 10").sort_values(by="Value", ascending=False).head()
Out[26]:
                          Char Count  Value  Ratio
Word
electrocardiographically          24    240   10.0
electroencephalographies          24    240   10.0
electroencephalographer           23    230   10.0
electrodesiccation                18    180   10.0
phonocardiographic                18    180   10.0
Of those words with a Value of 260, what is the lowest Char Count found?
In [27]:
df.query("Value == 260").sort_values(by="Char Count")
Out[27]:
                      Char Count  Value      Ratio
Word
hydroxytryptamine             17    260  15.294118
neuropsychologists            18    260  14.444444
psychophysiologist            18    260  14.444444
revolutionarinesses           19    260  13.684211
countermobilizations          20    260  13.000000
underrepresentations          20    260  13.000000
Based on the previous task, what word is it?
In [ ]:
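A minimal sketch of reading the word off programmatically, assuming Word is the index. The `sample` frame reproduces the six rows of Out[27] rather than loading words.csv.

```python
import pandas as pd

# The six rows with a Value of 260 shown in Out[27].
sample = pd.DataFrame(
    {"Char Count": [17, 18, 18, 19, 20, 20], "Value": [260] * 6},
    index=pd.Index(
        [
            "hydroxytryptamine",
            "neuropsychologists",
            "psychophysiologist",
            "revolutionarinesses",
            "countermobilizations",
            "underrepresentations",
        ],
        name="Word",
    ),
)

# idxmin returns the index label (the word) with the lowest Char Count.
shortest_word = sample.loc[sample["Value"] == 260, "Char Count"].idxmin()
lowest_count = int(sample.loc[shortest_word, "Char Count"])

print(shortest_word, lowest_count)
```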