In [1]:
import pandas as pd
In [2]:
df = pd.read_csv('words.csv', index_col='Word')
In [3]:
df.head()
Out[3]:
| Word | Char Count | Value |
|---|---|---|
| aa | 2 | 2 |
| aah | 3 | 10 |
| aahed | 5 | 19 |
| aahing | 6 | 40 |
| aahs | 4 | 29 |
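The notebook never says how `Value` is defined. The rows shown are consistent with summing each letter's position in the alphabet (a=1, ..., z=26), so the sketch below checks that inferred rule against the `df.head()` output. The helper name `word_value` is hypothetical, and the rule is an assumption drawn from the visible rows only.

```python
def word_value(word: str) -> int:
    """Sum of 1-based alphabet positions (a=1, ..., z=26).

    Inferred rule, not stated in the notebook; assumes only the letters a-z.
    """
    return sum(ord(ch) - ord("a") + 1 for ch in word.lower())


# Rows copied from the df.head() output: (Word, Char Count, Value)
shown = [
    ("aa", 2, 2),
    ("aah", 3, 10),
    ("aahed", 5, 19),
    ("aahing", 6, 40),
    ("aahs", 4, 29),
]

for word, char_count, value in shown:
    assert len(word) == char_count
    assert word_value(word) == value
```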
Activities¶
How many elements does this dataframe have?¶
In [4]:
df.shape
Out[4]:
(172821, 2)
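`df.shape` reports (rows, columns), so the output above means 172821 words and 2 columns. If "elements" is read as individual cells, `df.size` gives rows times columns, which would be 345642 for this frame. A minimal sketch on a toy stand-in for `words.csv` (the toy frame is an assumption built from rows shown earlier):

```python
import pandas as pd

# Toy stand-in for words.csv, using rows from the df.head() output
toy = pd.DataFrame(
    {"Char Count": [2, 3, 5], "Value": [2, 10, 19]},
    index=pd.Index(["aa", "aah", "aahed"], name="Word"),
)

n_rows, n_cols = toy.shape  # (rows, columns)
n_cells = toy.size          # rows * columns
n_words = len(toy)          # rows only
```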
What is the value of the word microspectrophotometries?¶
In [29]:
df.loc["microspectrophotometries"]
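Because `Word` is the index, a single value can be read with a label lookup. The sketch below uses a toy frame built from rows shown earlier (an assumption, not the real file) and shows `.loc[label, column]` plus a guard for a word that is not in the index.

```python
import pandas as pd

# Toy stand-in for words.csv, using rows from the df.head() output
toy = pd.DataFrame(
    {"Char Count": [2, 3, 5], "Value": [2, 10, 19]},
    index=pd.Index(["aa", "aah", "aahed"], name="Word"),
)

# Row label first, column label second: returns one scalar
aah_value = toy.loc["aah", "Value"]

# A missing label raises KeyError, so check membership first if unsure
word = "zzz"
missing_value = toy.loc[word, "Value"] if word in toy.index else None
```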
What is the highest possible value of a word?¶
In [7]:
df["Value"].max()
Out[7]:
319
Which of the following words have a Char Count of 15?¶
In [ ]:
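The cell above is empty and the candidate words are not listed in this export. One possible approach is to restrict the frame to the candidates and then filter on `Char Count`. The helper `candidates_with_length`, the toy frame, and the candidate list are all hypothetical; the toy rows come from a table later in the notebook and contain no 15-letter word, so the demo filters on 18 instead.

```python
import pandas as pd

# Toy stand-in built from rows shown in the Value == 260 table
toy = pd.DataFrame(
    {"Char Count": [17, 18, 18, 19], "Value": [260, 260, 260, 260]},
    index=pd.Index(
        [
            "hydroxytryptamine",
            "neuropsychologists",
            "psychophysiologist",
            "revolutionarinesses",
        ],
        name="Word",
    ),
)


def candidates_with_length(df, candidates, length):
    """Return the candidate words whose Char Count equals `length`."""
    present = df.loc[df.index.isin(candidates)]
    return present.loc[present["Char Count"] == length].index.tolist()


# Hypothetical candidate list; with the real data, pass length=15
hits = candidates_with_length(
    toy, ["hydroxytryptamine", "psychophysiologist"], 18
)
```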
What is the highest possible length of a word?¶
In [8]:
df["Char Count"].max()
Out[8]:
28
What is the word with the value of 319?¶
In [ ]:
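The cell above is empty. Since 319 is the maximum of `Value`, the word can be found either by filtering on that value or with `idxmax`, which returns the index label of the first maximum. A sketch on a toy frame (an assumption, built from the `df.head()` rows):

```python
import pandas as pd

# Toy stand-in for words.csv, using rows from the df.head() output
toy = pd.DataFrame(
    {"Char Count": [2, 3, 5, 6, 4], "Value": [2, 10, 19, 40, 29]},
    index=pd.Index(["aa", "aah", "aahed", "aahing", "aahs"], name="Word"),
)

top_value = toy["Value"].max()
top_word = toy["Value"].idxmax()  # label of the first row holding the maximum

# Filtering keeps every word that ties for the maximum
all_top_words = toy.loc[toy["Value"] == top_value].index.tolist()
```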
What is the most common value?¶
In [10]:
df['Value'].describe()
df['Value'].value_counts().head()
Out[10]:
Value
93     1965
100    1921
95     1915
99     1907
92     1902
Name: count, dtype: int64
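`value_counts()` sorts by frequency in descending order, so the first entry above (93, seen 1965 times) is the most common value. To get it directly rather than reading it off the table, `idxmax` on the counts or `mode()` both work. A sketch on a small made-up series (the numbers below are illustrative, not from the file):

```python
import pandas as pd

# Illustrative values only; not taken from words.csv
values = pd.Series([93, 93, 100, 95, 93, 100], name="Value")

counts = values.value_counts()    # frequency of each distinct value
most_common = counts.idxmax()     # the value with the highest count
modes = values.mode().tolist()    # all values tied for most common
```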
What is the shortest word with value 274?¶
In [15]:
df.loc[
(df['Value'] == 274) &
(df['Char Count'] == 20)
]
Out[15]:
| Word | Char Count | Value |
|---|---|---|
| overprotectivenesses | 20 | 274 |
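The cell above hard-codes a length of 20, which presumably came from earlier exploration. A sketch of one way to avoid the hard-coded number: filter on `Value`, then keep the rows with the smallest `Char Count`. The helper `shortest_with_value` and the toy frame are hypothetical; the toy rows are copied from tables in this notebook.

```python
import pandas as pd

# Toy stand-in built from rows shown in this notebook
toy = pd.DataFrame(
    {"Char Count": [17, 18, 20, 20], "Value": [260, 260, 260, 274]},
    index=pd.Index(
        [
            "hydroxytryptamine",
            "neuropsychologists",
            "countermobilizations",
            "overprotectivenesses",
        ],
        name="Word",
    ),
)


def shortest_with_value(df, value):
    """Rows with the given Value that have the smallest Char Count (ties kept)."""
    subset = df.loc[df["Value"] == value]
    return subset.nsmallest(1, "Char Count", keep="all")


shortest_274 = shortest_with_value(toy, 274)
```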
Create a column Ratio which represents the 'Value Ratio' of a word¶
In [16]:
df['Ratio']=df['Value']/df['Char Count']
What is the maximum value of Ratio?¶
In [18]:
df['Ratio'].max()
Out[18]:
22.5
What word is the one with the highest Ratio?¶
In [20]:
df.sort_values(by='Ratio',ascending=False).head()
Out[20]:
| Word | Char Count | Value | Ratio |
|---|---|---|---|
| xu | 2 | 45 | 22.500000 |
| muzzy | 5 | 111 | 22.200000 |
| wry | 3 | 66 | 22.000000 |
| xyst | 4 | 88 | 22.000000 |
| tux | 3 | 65 | 21.666667 |
How many words have a Ratio of 10?¶
In [23]:
df.loc[df['Ratio'] == 10].shape
Out[23]:
(2604, 3)
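The first element of the shape, 2604, is the number of matching words. Comparing a float column with `==` can be fragile in general, so an integer-only check (`Value == 10 * Char Count`) is a possible alternative that selects the same rows. A sketch on a toy frame built from rows shown in this notebook (the frame itself is an assumption):

```python
import pandas as pd

# Toy stand-in built from rows shown in this notebook
toy = pd.DataFrame(
    {"Char Count": [18, 24, 2, 5], "Value": [180, 240, 45, 111]},
    index=pd.Index(
        ["electrodesiccation", "electrocardiographically", "xu", "muzzy"],
        name="Word",
    ),
)
toy["Ratio"] = toy["Value"] / toy["Char Count"]

float_mask = toy["Ratio"] == 10                    # comparison on the float column
int_mask = toy["Value"] == 10 * toy["Char Count"]  # same test using integers only

# Summing a boolean mask counts the True entries
count = int(int_mask.sum())
```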
What is the maximum Value of all the words with a Ratio of 10?¶
In [26]:
df.query("Ratio == 10").sort_values(by="Value",ascending=False).head()
Out[26]:
| Word | Char Count | Value | Ratio |
|---|---|---|---|
| electrocardiographically | 24 | 240 | 10.0 |
| electroencephalographies | 24 | 240 | 10.0 |
| electroencephalographer | 23 | 230 | 10.0 |
| electrodesiccation | 18 | 180 | 10.0 |
| phonocardiographic | 18 | 180 | 10.0 |
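The table shows a maximum `Value` of 240 among words with a `Ratio` of 10. To get that number as a scalar instead of reading it from a sorted table, one option is to select the `Value` column under the mask and call `max()`. A sketch on a toy frame using three of the rows above (the frame is an assumption):

```python
import pandas as pd

# Toy stand-in using three rows from the table above
toy = pd.DataFrame(
    {"Char Count": [24, 23, 18], "Value": [240, 230, 180]},
    index=pd.Index(
        [
            "electrocardiographically",
            "electroencephalographer",
            "electrodesiccation",
        ],
        name="Word",
    ),
)
toy["Ratio"] = toy["Value"] / toy["Char Count"]

# Boolean mask for the rows, column label for the values, then reduce
max_value = toy.loc[toy["Ratio"] == 10, "Value"].max()
```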
Of those words with a Value of 260, what is the lowest Char Count found?¶
In [27]:
df.query("Value == 260").sort_values(by="Char Count")
Out[27]:
| Word | Char Count | Value | Ratio |
|---|---|---|---|
| hydroxytryptamine | 17 | 260 | 15.294118 |
| neuropsychologists | 18 | 260 | 14.444444 |
| psychophysiologist | 18 | 260 | 14.444444 |
| revolutionarinesses | 19 | 260 | 13.684211 |
| countermobilizations | 20 | 260 | 13.000000 |
| underrepresentations | 20 | 260 | 13.000000 |
Based on the previous task, what word is it?¶
In [ ]:
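The cell above is empty. In the sorted table from the previous task, the first row (Char Count 17) is hydroxytryptamine. A sketch of how to retrieve that label directly with `idxmin`, on a toy frame copied from that table (the frame is an assumption standing in for `df`):

```python
import pandas as pd

# Toy stand-in copied from the Value == 260 table
toy = pd.DataFrame(
    {"Char Count": [17, 18, 18, 19, 20, 20], "Value": [260] * 6},
    index=pd.Index(
        [
            "hydroxytryptamine",
            "neuropsychologists",
            "psychophysiologist",
            "revolutionarinesses",
            "countermobilizations",
            "underrepresentations",
        ],
        name="Word",
    ),
)

# idxmin returns the index label of the first row with the smallest Char Count
shortest_word = toy.query("Value == 260")["Char Count"].idxmin()
```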