Pandas: Selecting Rows (And Columns) With iloc[]

import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg',                  'Johanna',           'Caro',              'Philipp'          ],
    'lastname':  ['Faschingbauer',          'Faschingbauer',     'Faschingbauer',     'Lichtenberger'    ],
    'email':     ['jf@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com',    'philipp@email.com'],
    'age':       [56,                       27,                  25,                  37                 ],
})

Row By Number: iloc[]

  • Note the index column

  • … has no explicit column name

  • Default index (unless configured explicitly): row numbers

  • integers

  • iloc, for integer location

    persons.iloc[1]
    
    firstname              Johanna
    lastname         Faschingbauer
    email        johanna@email.com
    age                         27
    Name: 1, dtype: object
    
    type(persons.iloc[1])
    
    pandas.core.series.Series
    
  • Out-of-range access is not possible: it raises IndexError (see Pandas: Adding Rows)

persons.iloc[4]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 persons.iloc[4]

File ~/My-Environments/jfasch-home/lib64/python3.12/site-packages/pandas/core/indexing.py:1153, in _LocationIndexer.__getitem__(self, key)
   1150 axis = self.axis or 0
   1152 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1153 return self._getitem_axis(maybe_callable, axis=axis)

File ~/My-Environments/jfasch-home/lib64/python3.12/site-packages/pandas/core/indexing.py:1714, in _iLocIndexer._getitem_axis(self, key, axis)
   1711     raise TypeError("Cannot index by location index with a non-integer key")
   1713 # validate the location
-> 1714 self._validate_integer(key, axis)
   1716 return self.obj._ixs(key, axis=axis)

File ~/My-Environments/jfasch-home/lib64/python3.12/site-packages/pandas/core/indexing.py:1647, in _iLocIndexer._validate_integer(self, key, axis)
   1645 len_axis = len(self.obj._get_axis(axis))
   1646 if key >= len_axis or key < -len_axis:
-> 1647     raise IndexError("single positional indexer is out-of-bounds")

IndexError: single positional indexer is out-of-bounds
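
The bounds check visible in the traceback (key >= len_axis or key < -len_axis) suggests that negative positions are accepted and count from the end, much like Python lists. A minimal sketch, assuming a cut-down two-column version of persons; the helper row_or_none is hypothetical and only illustrates one way to guard against the IndexError:

```python
import pandas as pd

# Cut-down version of the persons DataFrame from above
persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'age':       [56, 27, 25, 37],
})

# With 4 rows, valid positions are -4 .. 3
last = persons.iloc[-1]          # last row, same as persons.iloc[3]

def row_or_none(df, pos):
    """Hypothetical helper: row at position pos, or None if out of bounds."""
    try:
        return df.iloc[pos]
    except IndexError:
        return None

missing = row_or_none(persons, 4)    # position 4 does not exist
```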

So What Is A Row, Then?

row = persons.iloc[1]
type(row)
pandas.core.series.Series
row
firstname              Johanna
lastname         Faschingbauer
email        johanna@email.com
age                         27
Name: 1, dtype: object
  • A row is a Series

  • Its index is not the default integer index: it consists of the DataFrame's column names

  • Elements are best accessed by column name, using loc[]

  • Or by position, using iloc[] (clumsy, though)

row.loc['firstname']
'Johanna'
row.iloc[0]
'Johanna'
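
To make the "column names become the index" point concrete, the following sketch inspects the row's index and compares label-based and position-based access. The variable names (labels, by_label, by_position, row_label) are chosen for illustration only:

```python
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'lastname':  ['Faschingbauer', 'Faschingbauer', 'Faschingbauer', 'Lichtenberger'],
    'email':     ['jf@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com', 'philipp@email.com'],
    'age':       [56, 27, 25, 37],
})

row = persons.iloc[1]

labels = list(row.index)          # the column names of persons
by_label = row.loc['firstname']   # access by column name
by_position = row.iloc[0]         # access by position within the row
row_label = row.name              # index label the row had in persons
```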

Selecting Multiple Rows

  • Using a list of row numbers as iloc[] subscript parameter

    persons.iloc[[0,1]]      # <--- single list [0,1] inside []
    
       firstname       lastname                   email  age
    0      Joerg  Faschingbauer  jf@faschingbauer.co.at   56
    1    Johanna  Faschingbauer       johanna@email.com   27
  • Note how the index is an integer again

  • ⟶ two rows selected

  • DataFrame, not Series
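
The list form always yields a DataFrame, even when the list holds a single position, and the rows come back in the order the list gives. A small sketch of both effects; it also shows that iloc[] counts positions in the result, not the original index labels:

```python
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'lastname':  ['Faschingbauer', 'Faschingbauer', 'Faschingbauer', 'Lichtenberger'],
    'email':     ['jf@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com', 'philipp@email.com'],
    'age':       [56, 27, 25, 37],
})

as_series = persons.iloc[1]       # plain integer: one row as a Series
as_frame = persons.iloc[[1]]      # one-element list: DataFrame with one row

reordered = persons.iloc[[3, 0]]  # rows in the order given by the list
first = reordered.iloc[0]         # position 0 of the result: the row labeled 3
```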

Slicing

  • [0,1] (contiguous range) is alternatively expressed as a slice [0:2]

  • 0 … inclusive

  • 2 … exclusive

    persons.iloc[0:2]        # <--- note: no double square brackets! 0:2 selects positions 0 and 1
    
       firstname       lastname                   email  age
    0      Joerg  Faschingbauer  jf@faschingbauer.co.at   56
    1    Johanna  Faschingbauer       johanna@email.com   27
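
Slices in iloc[] follow the usual Python slice rules. The sketch below tries a few common variants (omitted start, step, negative start) and shows that, unlike a single out-of-range integer, a slice reaching past the end is clipped rather than raising IndexError:

```python
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'age':       [56, 27, 25, 37],
})

first_two = persons.iloc[0:2]
same = persons.iloc[:2]           # omitted start means 0
every_other = persons.iloc[::2]   # positions 0 and 2
last_two = persons.iloc[-2:]      # positions 2 and 3
clipped = persons.iloc[2:100]     # end beyond the last row: no error
```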

Selecting Rows And Columns

  • iloc[] primarily selects rows

  • A second subscript, after a comma, selects columns from those rows in the same step

  • Example: row 1, column 2 (which is email)

    persons.iloc[1, 2]
    
    'johanna@email.com'
    
  • Example: rows 0 and 1 (i.e. two rows), column 2 (email)

    persons.iloc[[0,1], 2]
    
    0    jf@faschingbauer.co.at
    1         johanna@email.com
    Name: email, dtype: object
    
  • Example: rows 0 and 1, columns 0 and 2 (firstname and email)

    persons.iloc[[0,1], [0, 2]]
    
      firstname                   email
    0     Joerg  jf@faschingbauer.co.at
    1   Johanna       johanna@email.com
  • Example: slices … note that the end is exclusive with iloc[] (as opposed to loc[]; see Pandas: Selecting Rows (And Columns) With loc[])

    persons.iloc[0:2, 0:3]
    
      firstname       lastname                   email
    0     Joerg  Faschingbauer  jf@faschingbauer.co.at
    1   Johanna  Faschingbauer       johanna@email.com
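
The shape of the result depends on the kind of subscripts used: two integers give a scalar, one integer plus a list or slice gives a Series, and two lists or slices give a DataFrame. The sketch below collects these cases and also contrasts the exclusive slice end of iloc[] with the inclusive one of loc[]; it relies on persons having the default integer index:

```python
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'lastname':  ['Faschingbauer', 'Faschingbauer', 'Faschingbauer', 'Lichtenberger'],
    'email':     ['jf@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com', 'philipp@email.com'],
    'age':       [56, 27, 25, 37],
})

cell = persons.iloc[1, 2]               # integer, integer: scalar
column_part = persons.iloc[[0, 1], 2]   # list, integer: Series
block = persons.iloc[0:2, 0:3]          # slice, slice: DataFrame
ages = persons.iloc[:, 3]               # all rows, column 3 (age)

by_position = persons.iloc[0:2]         # positions 0 and 1 (end exclusive)
by_label = persons.loc[0:2]             # labels 0, 1 and 2 (end inclusive)
```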

Summary

  • Works with integer positions only (single integers, lists of integers, slices)

  • Columns cannot be specified by name, only by position

  • Efficient though
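
When only a column name is at hand, one possible workaround is to translate the name into a position first and then pass that to iloc[]. A sketch using Index.get_loc() and Index.get_indexer() on persons.columns; whether this is preferable to simply using loc[] depends on the situation:

```python
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'lastname':  ['Faschingbauer', 'Faschingbauer', 'Faschingbauer', 'Lichtenberger'],
    'email':     ['jf@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com', 'philipp@email.com'],
    'age':       [56, 27, 25, 37],
})

# One name to one position
email_pos = persons.columns.get_loc('email')
value = persons.iloc[1, email_pos]

# Several names to an array of positions
positions = persons.columns.get_indexer(['firstname', 'email'])
subset = persons.iloc[0:2, positions]
```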