# How do I get the row count of a pandas DataFrame?

### Question

I'm trying to get the number of rows of dataframe df with Pandas, and here is my code.

### Method 1:

```
total_rows = df.count
print total_rows +1
```

### Method 2:

```
total_rows = df['First_columnn_label'].count
print total_rows +1
```

Both the code snippets give me this error:

TypeError: unsupported operand type(s) for +: 'instancemethod' and 'int'

What am I doing wrong?

### Accepted Answer

You can use the `.shape` property or just `len(DataFrame.index)`. However, there are notable performance differences (`len(DataFrame.index)` is fastest).

Code to reproduce the plot:

```
import numpy as np
import pandas as pd
import perfplot

perfplot.save(
    "out.png",
    setup=lambda n: pd.DataFrame(np.arange(n * 3).reshape(n, 3)),
    n_range=[2**k for k in range(25)],
    kernels=[
        lambda data: data.shape[0],
        lambda data: data[0].count(),
        lambda data: len(data.index),
    ],
    labels=["data.shape[0]", "data[0].count()", "len(data.index)"],
    xlabel="data rows",
)
```

EDIT: As @Dan Allen noted in the comments, `len(df.index)` and `df[0].count()` are not interchangeable, as `count` excludes `NaN`s.
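To illustrate the difference, here is a minimal sketch. The frame and its column name `x` are made up for the example. It also hints at why the snippets in the question fail: `df.count` without parentheses is the method object itself, not a number, so adding an integer to it raises a `TypeError`.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column "x"
df = pd.DataFrame({"x": [1.0, np.nan, 3.0]})

n_rows = len(df.index)        # counts every row, including the NaN one
n_non_null = df["x"].count()  # counts only the non-null values

print(n_rows)      # 3
print(n_non_null)  # 2

# Without parentheses, `count` is a bound method rather than a result
print(callable(df.count))  # True
```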


Suppose `df` is your dataframe, then:

```
count_row = df.shape[0]  # number of rows
count_col = df.shape[1]  # number of columns
```

Or, more succinctly,

```
r, c = df.shape
```

Use `len(df)`. This works as of pandas 0.11 or maybe even earlier.

`__len__()` is currently (0.12) documented with `Returns length of index`. Timing info, set up the same way as in root's answer:

```
In [7]: timeit len(df.index)
1000000 loops, best of 3: 248 ns per loop
In [8]: timeit len(df)
1000000 loops, best of 3: 573 ns per loop
```

Due to one additional function call it is a bit slower than calling `len(df.index)` directly, but this should not play any role in most use cases.
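If you want to repeat this comparison outside IPython, a rough sketch with the standard `timeit` module could look like the following. The frame size and the number of repetitions are arbitrary choices, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Arbitrary 1000 x 3 frame, chosen only for illustration
df = pd.DataFrame(np.arange(3000).reshape(1000, 3))

repeats = 100_000
t_index = timeit.timeit(lambda: len(df.index), number=repeats)
t_len = timeit.timeit(lambda: len(df), number=repeats)

print(f"len(df.index): {t_index / repeats * 1e9:.0f} ns per call")
print(f"len(df):       {t_len / repeats * 1e9:.0f} ns per call")

# Both expressions return the same row count
same = len(df) == len(df.index) == 1000
```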

## How do I get the row count of a pandas DataFrame?

This table summarises the different situations in which you'd want to count something in a DataFrame (or Series, for completeness), along with the recommended method(s).

Footnotes

- `DataFrame.count` returns counts for each column as a `Series` since the non-null count varies by column.
- `DataFrameGroupBy.size` returns a `Series`, since all columns in the same group share the same row-count.
- `DataFrameGroupBy.count` returns a `DataFrame`, since the non-null count could differ across columns in the same group. To get the group-wise non-null count for a specific column, use `df.groupby(...)['x'].count()` where "x" is the column to count.

**Minimal Code Examples**

Below, I show examples of each of the methods described in the table above. First, the setup -

```
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': list('aabbc'), 'B': ['x', 'x', np.nan, 'x', np.nan]})
s = df['B'].copy()

df
#    A    B
# 0  a    x
# 1  a    x
# 2  b  NaN
# 3  b    x
# 4  c  NaN

s
# 0      x
# 1      x
# 2    NaN
# 3      x
# 4    NaN
# Name: B, dtype: object
```

### Row Count of a DataFrame: `len(df)`, `df.shape[0]`, or `len(df.index)`

```
len(df)
# 5
df.shape[0]
# 5
len(df.index)
# 5
```

It seems silly to compare the performance of constant time operations, especially when the difference is on the level of "seriously, don't worry about it". But this seems to be a trend with other answers, so I'm doing the same for completeness.

Of the 3 methods above, `len(df.index)` (as mentioned in other answers) is the fastest.

Note

- All the methods above are constant time operations as they are simple attribute lookups.
- `df.shape` (similar to `ndarray.shape`) is an attribute that returns a tuple of `(# Rows, # Cols)`. For example, `df.shape` returns `(5, 2)` for the example here.

### Column Count of a DataFrame: `df.shape[1]`, `len(df.columns)`

```
df.shape[1]
# 2
len(df.columns)
# 2
```

Analogous to `len(df.index)`, `len(df.columns)` is the faster of the two methods (but takes more characters to type).

### Row Count of a Series: `len(s)`, `s.size`, `len(s.index)`

```
len(s)
# 5
s.size
# 5
len(s.index)
# 5
```

`s.size` and `len(s.index)` are about the same in terms of speed. But I recommend `len(s)`.

Note

`size` is an attribute, and it returns the number of elements (= count of rows for any Series). DataFrames also define a `size` attribute, which returns the same result as `df.shape[0] * df.shape[1]`.
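As a quick check of the note above, the following sketch rebuilds the example frame from the setup and compares the two `size` attributes. Note that `size` counts every element, including the `NaN` entries.

```python
import numpy as np
import pandas as pd

# Same data as the setup used in this answer
df = pd.DataFrame({"A": list("aabbc"), "B": ["x", "x", np.nan, "x", np.nan]})
s = df["B"]

print(s.size)   # 5  (one element per row of the Series)
print(df.size)  # 10 (5 rows * 2 columns)
print(df.size == df.shape[0] * df.shape[1])  # True
```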

### Non-Null Row Count: `DataFrame.count` and `Series.count`

The methods described here only count non-null values (meaning NaNs are ignored).

Calling `DataFrame.count` will return non-NaN counts for *each* column:

```
df.count()
A 5
B 3
dtype: int64
```

For Series, use `Series.count` to similar effect:

```
s.count()
# 3
```

### Group-wise Row Count: `GroupBy.size`

For `DataFrames`, use `DataFrameGroupBy.size` to count the number of rows per group.

```
df.groupby('A').size()
A
a 2
b 2
c 1
dtype: int64
```

Similarly, for `Series`, you'll use `SeriesGroupBy.size`.

```
s.groupby(df.A).size()
A
a 2
b 2
c 1
Name: B, dtype: int64
```

In both cases, a `Series` is returned. This makes sense for `DataFrames` as well since all groups share the same row-count.

### Group-wise Non-Null Row Count: `GroupBy.count`

Similar to above, but use `GroupBy.count`, not `GroupBy.size`. Note that `size` always returns a `Series`, while `count` returns a `Series` if called on a specific column, or else a `DataFrame`.

The following methods return the same thing:

```
df.groupby('A')['B'].size()
df.groupby('A').size()
A
a 2
b 2
c 1
Name: B, dtype: int64
```

Meanwhile, for `count`, we have

```
df.groupby('A').count()
B
A
a 2
b 1
c 0
```

...called on the entire GroupBy object, versus

```
df.groupby('A')['B'].count()
A
a 2
b 1
c 0
Name: B, dtype: int64
```

...called on a specific column.
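The return types described above can be checked directly. This is a small sketch that rebuilds the example frame from the setup; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the setup used in this answer
df = pd.DataFrame({"A": list("aabbc"), "B": ["x", "x", np.nan, "x", np.nan]})
g = df.groupby("A")

sizes = g.size()           # rows per group, NaN rows included
counts_all = g.count()     # non-null values per group, for every other column
counts_b = g["B"].count()  # non-null values per group, for column B only

print(type(sizes).__name__)       # Series
print(type(counts_all).__name__)  # DataFrame
print(type(counts_b).__name__)    # Series

print(sizes.tolist())     # [2, 2, 1]
print(counts_b.tolist())  # [2, 1, 0]
```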

## TL;DR

Use `len(df)`.

**`len()`** is your friend: it can be used for row counts as `len(df)`. Alternatively, you can access all rows with `df.index` and all columns with `df.columns`, and since **`len(anyList)`** gives the length of a list, you can use `len(df.index)` for the number of rows and `len(df.columns)` for the column count.

Or, you can use `df.shape`, which returns the number of rows and columns together. If you want only the number of rows, use `df.shape[0]`, and for only the number of columns, use `df.shape[1]`.

Apart from the above answers, you can use `df.axes` to get a list with the row and column indexes, and then use the `len()` function:

```
total_rows = len(df.axes[0])
total_cols = len(df.axes[1])
```