Create an empty data.frame
I'm trying to initialize a data.frame without any rows. Basically, I want to specify the data types for each column and name them, but not have any rows created as a result.
The best I've been able to do so far is something like:
df <- data.frame(Date=as.Date("01/01/2000", format="%m/%d/%Y"), File="", User="", stringsAsFactors=FALSE)
df <- df[-1,]
Which creates a data.frame with a single row containing all of the data types and column names I wanted, but also creates a useless row which then needs to be removed.
Is there a better way to do this?
Just initialize it with empty vectors:
df <- data.frame(Date=as.Date(character()), File=character(), User=character(), stringsAsFactors=FALSE)
Here's another example with different column types:
df <- data.frame(Doubles=double(), Ints=integer(), Factors=factor(), Logicals=logical(), Characters=character(), stringsAsFactors=FALSE)
str(df)
# 'data.frame': 0 obs. of 5 variables:
#  $ Doubles   : num
#  $ Ints      : int
#  $ Factors   : Factor w/ 0 levels:
#  $ Logicals  : logi
#  $ Characters: chr
Note that initializing a data.frame with an empty column of the wrong type does not prevent later additions of rows whose columns have different types. This method is just a bit safer in the sense that you'll have the correct column types from the beginning, so if your code relies on column type checking, it will work even with a data.frame with zero rows.
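To illustrate the point about type checking, here is a minimal sketch that inspects the column classes of a zero-row data.frame and then appends a row. The names `types` and `new_row` and the row values are made up for the example.

```r
# Zero-row data.frame with the column types from the question
df <- data.frame(Date = as.Date(character()),
                 File = character(),
                 User = character(),
                 stringsAsFactors = FALSE)

# Column classes can be checked even though there are no rows yet
types <- vapply(df, function(col) class(col)[1], character(1))
print(types)

# Append a row (hypothetical values) whose columns have matching types
new_row <- data.frame(Date = as.Date("2000-01-01"),
                      File = "report.txt",
                      User = "alice",
                      stringsAsFactors = FALSE)
df <- rbind(df, new_row)
str(df)
```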
If you already have an existing data frame, say df, that has the columns you want, then you can create an empty data frame by removing all the rows:
empty_df = df[FALSE,]
Note that df still contains the data, but empty_df does not.
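A small sketch of this approach, using a made-up data frame (the column names `File` and `Size` and their values are only for illustration), shows that the column names and types carry over while the rows do not:

```r
# A small data frame with some rows (made-up values)
df <- data.frame(File = c("a.txt", "b.txt"),
                 Size = c(10L, 20L),
                 stringsAsFactors = FALSE)

# Indexing rows with FALSE selects no rows but keeps every column
empty_df <- df[FALSE, ]

nrow(df)        # the original still has its rows
nrow(empty_df)  # the copy has none
str(empty_df)   # column names and types are preserved
```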
I found this question looking for how to create a new instance with empty rows, so I think it might be helpful for some people.
You can do it without specifying column types:
df = data.frame(matrix(vector(), 0, 3, dimnames=list(c(), c("Date", "File", "User"))), stringsAsFactors=FALSE)
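One caveat with the matrix approach: since `vector()` defaults to an empty logical vector, the resulting columns should all be logical rather than Date or character. A quick check along these lines (the name `col_types` is just for the example) makes that visible:

```r
# Build the zero-row data frame from an empty matrix, as above
df <- data.frame(matrix(vector(), 0, 3,
                        dimnames = list(c(), c("Date", "File", "User"))),
                 stringsAsFactors = FALSE)

# Inspect the class of each column
col_types <- vapply(df, function(col) class(col)[1], character(1))
print(col_types)
```

If the column types matter later on, the empty-vector approach is probably the safer starting point.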
You could use read.table with an empty string for the input text as follows:
colClasses = c("Date", "character", "character")
col.names = c("Date", "File", "User")
df <- read.table(text = "", colClasses = colClasses, col.names = col.names)
Alternatively, specify the col.names as a string:
df <- read.csv(text="Date,File,User", colClasses = colClasses)
Thanks to Richard Scriven for the improvement
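As a sanity check on the read.table variant, the following sketch inspects the result. The variable names `col_classes` and `col_names` are chosen for this example to avoid confusion with the argument names.

```r
# Desired column classes and names
col_classes <- c("Date", "character", "character")
col_names <- c("Date", "File", "User")

# Reading an empty string yields zero rows; the classes come from colClasses
df <- read.table(text = "", colClasses = col_classes, col.names = col_names)

str(df)
```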
table = data.frame()
When you try to rbind the first row, it will create the columns.
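A minimal sketch of that behaviour follows. The name `log_table` is used here instead of `table` to avoid masking the base function, and the row values are made up.

```r
# Start from a data frame with no rows and no columns
log_table <- data.frame()

# The first row to add (hypothetical values)
first_row <- data.frame(Date = as.Date("2000-01-01"),
                        File = "report.txt",
                        User = "alice",
                        stringsAsFactors = FALSE)

# rbind picks up the columns and their types from the first row
log_table <- rbind(log_table, first_row)
str(log_table)
```

The trade-off is that the column names and types are not known until the first row arrives, so any type checks have to wait until then.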
The most efficient way to do this is to use structure to create a list that has the class "data.frame":
structure(list(Date = as.Date(character()), File = character(), User = character()), class = "data.frame")
#  Date File User
# <0 rows> (or 0-length row.names)
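One difference to keep in mind is that structure() only attaches the class attribute and skips the setup that data.frame() performs, for instance the row.names attribute. The sketch below compares the two; the names `s` and `d` are simply short labels for the two objects.

```r
# Built by hand: only names and class are set
s <- structure(list(Date = as.Date(character()),
                    File = character(),
                    User = character()),
               class = "data.frame")

# Built with the constructor
d <- data.frame(Date = as.Date(character()),
                File = character(),
                User = character(),
                stringsAsFactors = FALSE)

# Both report zero rows and the same column names
nrow(s)
nrow(d)
names(s)

# Only the constructor-built object carries a row.names attribute
is.null(attributes(s)[["row.names"]])
is.null(attributes(d)[["row.names"]])
```

Whether the missing attribute matters depends on what the downstream code does with the object, so treat the shortcut as an optimization to verify rather than a drop-in replacement.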
To put this into perspective compared to the presently accepted answer, here's a simple benchmark:
s <- function() structure(list(Date = as.Date(character()), File = character(), User = character()), class = "data.frame")
d <- function() data.frame(Date = as.Date(character()), File = character(), User = character(), stringsAsFactors = FALSE)
library("microbenchmark")
microbenchmark(s(), d())
# Unit: microseconds
#  expr     min       lq     mean   median      uq      max neval
#   s()  58.503  66.5860  90.7682  82.1735 101.803  469.560   100
#   d() 370.644 382.5755 523.3397 420.1025 604.654 1565.711   100