Data frames are fundamental structures in R for working with tabular data. They're similar to spreadsheets or database tables, making them essential for data analysis and manipulation.
A data frame is a two-dimensional structure that can hold multiple types of data (numeric, character, logical) in columns. All values within a single column must share one data type, but different columns can have different types. Every column must also have the same number of rows.
You can create a data frame using the data.frame() function:
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  city = c("New York", "London", "Paris")
)
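To confirm that each column holds a single type, you can inspect the data frame after creating it. The following is a minimal sketch that rebuilds the same example; stringsAsFactors = FALSE is set explicitly here as an assumption, so that text columns stay character vectors on R versions older than 4.0 as well.

```r
# Same example data frame as above; stringsAsFactors = FALSE keeps
# text columns as character vectors regardless of the R version.
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  city = c("New York", "London", "Paris"),
  stringsAsFactors = FALSE
)

# str() prints a compact summary: dimensions, column names, and column types.
str(df)

# sapply() applies class() to every column and returns a named character vector.
col_types <- sapply(df, class)
print(col_types)

# dim() returns the number of rows and the number of columns.
df_dim <- dim(df)
print(df_dim)
```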
Access columns using the $ operator or square brackets. Square brackets can also select individual cells by row and column position:
# Access the 'name' column
df$name
# Access the second column
df[, 2]
# Access a specific cell
df[1, 2] # First row, second column
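Beyond the forms shown above, a few other indexing patterns come up often. The sketch below is illustrative rather than exhaustive, and the variable names (ages, name_city, over_28, age_df) are made up for this example.

```r
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  city = c("New York", "London", "Paris"),
  stringsAsFactors = FALSE
)

# [[ ]] extracts a single column as a plain vector, much like $.
ages <- df[["age"]]

# Selecting several columns by name returns a smaller data frame.
name_city <- df[, c("name", "city")]

# A logical condition in the row position keeps only the matching rows.
over_28 <- df[df$age > 28, ]

# By default, selecting one column with [ , ] returns a vector;
# drop = FALSE keeps the result as a one-column data frame.
age_df <- df[, "age", drop = FALSE]
```

Selecting by column name is usually more robust than selecting by position, because it keeps working if the column order changes.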
R provides various functions for working with data frames:
head() and tail(): View the first or last few rows
nrow() and ncol(): Get the number of rows or columns
rbind() and cbind(): Add rows or columns
subset(): Filter data based on conditions
For larger data sets, consider using tibbles or the dplyr package for more efficient data manipulation.
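As a rough sketch of how the base R functions listed above fit together, the example below reuses the same data frame. The extra row ("Dana", "Berlin") and the employed column are invented purely for illustration.

```r
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  city = c("New York", "London", "Paris"),
  stringsAsFactors = FALSE
)

# head() and tail() take an optional number of rows to show.
head(df, 2)
tail(df, 1)

# nrow() and ncol() return the dimensions.
n_rows <- nrow(df)
n_cols <- ncol(df)

# rbind() appends rows; the new row needs the same column names.
new_row <- data.frame(name = "Dana", age = 28, city = "Berlin",
                      stringsAsFactors = FALSE)
df_more <- rbind(df, new_row)

# cbind() adds columns; the new vector needs one value per row.
df_wide <- cbind(df_more, employed = c(TRUE, FALSE, TRUE, TRUE))

# subset() filters rows by a condition and can also pick columns.
older <- subset(df_wide, age >= 30, select = c(name, age))
print(older)
```

Note that rbind() and cbind() return new data frames rather than modifying df in place, so the result has to be assigned to a variable to be kept.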
Once you are comfortable with these basics, move on to more advanced operations such as merging, reshaping, and aggregating data to extend your data manipulation skills in R.
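To give a first taste of two of these operations, here is a minimal sketch using base R's merge() and aggregate(). The people and scores data frames and their values are hypothetical, and reshaping is left out to keep the example short.

```r
# Two small, made-up data frames that share a "name" column.
people <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  city = c("New York", "London", "Paris"),
  stringsAsFactors = FALSE
)
scores <- data.frame(
  name = c("Alice", "Bob", "Bob"),
  score = c(90, 75, 85),
  stringsAsFactors = FALSE
)

# merge() joins on the shared column. By default it keeps only names
# present in both data frames, so Charlie (no scores) is dropped.
merged <- merge(people, scores, by = "name")
print(merged)

# aggregate() with a formula computes a summary per group,
# here the mean score for each name.
avg_scores <- aggregate(score ~ name, data = merged, FUN = mean)
print(avg_scores)
```

Passing all.x = TRUE to merge() would keep every row of people and fill the missing scores with NA.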
Data frames are crucial for data analysis in R. They provide a flexible and powerful way to work with structured data, making them indispensable for any R programmer dealing with tabular datasets.