CSV (Comma-Separated Values) files are a common format for storing tabular data. Base R reads and writes them with read.csv() and write.csv(), and packages such as data.table and dplyr help with larger files and with data manipulation.
To read a CSV file in R, use the read.csv() function:
data <- read.csv("filename.csv")
head(data) # View the first few rows
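The call above needs an existing file, so here is a minimal self-contained sketch that first writes a small sample file and then reads it back. The file contents and the column names (id, name, score) are made up for illustration.

```r
# Write a small sample file to a temporary location (hypothetical data).
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,Ana,9.5", "2,Ben,7.0", "3,Cy,8.25"), path)

# read.csv() treats the first line as the header by default.
data <- read.csv(path)

head(data, 2)  # first two rows
dim(data)      # number of rows and columns
names(data)    # column names taken from the header line
```

Reading from a temporary file keeps the example independent of the working directory.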
For larger files or improved performance, consider using the fread() function from the data.table package:
library(data.table)
data <- fread("filename.csv")
head(data)
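One detail worth knowing is that fread() returns a data.table rather than a plain data frame. The sketch below, which assumes the data.table package is installed and uses a made-up sample file, shows both return types.

```r
library(data.table)  # assumes the data.table package is installed

# Hypothetical sample file written to a temporary location.
path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,9.5", "2,7.25"), path)

dt <- fread(path)                      # returns a data.table
df <- fread(path, data.table = FALSE)  # returns a plain data.frame

class(dt)
class(df)
```

A data.table is also a data frame, so most code written for data frames keeps working with it.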
To save data as a CSV file, use the write.csv() function:
write.csv(data, "output.csv", row.names = FALSE)
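To see what row.names = FALSE changes, the following sketch writes the same made-up data frame twice and compares the header line of each file. The variable names are illustrative only.

```r
df <- data.frame(id = 1:3, label = c("a", "b", "c"))

with_names <- tempfile(fileext = ".csv")
without_names <- tempfile(fileext = ".csv")

write.csv(df, with_names)                        # default: row names are written
write.csv(df, without_names, row.names = FALSE)  # row names are left out

# Compare the header line of each file.
header_with <- readLines(with_names, n = 1)
header_without <- readLines(without_names, n = 1)
print(header_with)
print(header_without)

# Reading the second file back gives only the original columns.
round_trip <- read.csv(without_names)
names(round_trip)
```

With the default setting, the file gains an extra unnamed first column holding the row names, which is usually unwanted when the file is read back in.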
Once you've loaded your CSV data, you can manipulate it using various R functions and packages. The dplyr package is particularly useful for data manipulation tasks:
library(dplyr)
# Filter rows
filtered_data <- data %>% filter(column_name > 10)
# Select columns
selected_data <- data %>% select(column1, column2)
# Create new columns
modified_data <- data %>% mutate(new_column = column1 + column2)
# Summarize data
summary_data <- data %>% group_by(category) %>% summarize(mean_value = mean(value))
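The verbs above can also be chained into a single pipeline. The sketch below assumes dplyr is installed and uses a made-up data frame named sales with category and value columns.

```r
library(dplyr)  # assumes the dplyr package is installed

# Hypothetical data standing in for an imported CSV.
sales <- data.frame(
  category = c("a", "a", "b", "b"),
  value = c(5, 15, 20, 40)
)

# Keep rows with value above 10, then average the rest per category.
summary_data <- sales %>%
  filter(value > 10) %>%
  group_by(category) %>%
  summarize(mean_value = mean(value), n_rows = n())

print(summary_data)
```

Category a keeps one row (15) and category b keeps two (20 and 40), so the means are 15 and 30.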
Inspect the structure of imported data with str() or glimpse().
Handle missing values with na.omit() or is.na().
For large files, consider data.table or readr for improved performance.
For more complex CSV scenarios, consider these techniques:
If your CSV uses a different delimiter, specify it in the read.csv() function:
data <- read.csv("filename.csv", sep = ";") # For semicolon-separated files
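A quick way to see why the sep argument matters is to read the same semicolon-separated file with and without it. The file contents below are made up for illustration.

```r
# Hypothetical semicolon-separated file.
path <- tempfile(fileext = ".csv")
writeLines(c("city;temp", "Oslo;4.5", "Rome;18.2"), path)

wrong <- read.csv(path)             # default sep = ",": each line becomes one field
right <- read.csv(path, sep = ";")  # fields are split on semicolons

ncol(wrong)
ncol(right)
str(right)
```

Base R also provides read.csv2(), which defaults to a semicolon separator and a comma as the decimal mark.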
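When fields in a CSV file are wrapped in quotes, set the quote parameter to the quoting character. read.csv() already uses the double quote by default, so the call below makes that choice explicit: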
data <- read.csv("filename.csv", quote = "\"") # For double-quoted fields
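Quoting matters most when a field itself contains the delimiter. The sketch below uses a made-up file in which both fields include commas inside double quotes.

```r
# Hypothetical file whose fields contain commas inside double quotes.
path <- tempfile(fileext = ".csv")
writeLines(c("name,address", '"Lee, Kim","12 Main St, Apt 4"'), path)

quoted <- read.csv(path, quote = "\"")

ncol(quoted)    # the embedded commas do not create extra columns
quoted$name     # the quote characters are stripped from the value
quoted$address
```

Because the commas sit inside quoted fields, each line still splits into exactly two columns.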
To ensure correct data types, use the colClasses parameter, which takes one class name per column in file order:
data <- read.csv("filename.csv", colClasses = c("numeric", "factor", "character"))
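A common reason to set colClasses is to stop identifiers such as postal codes from being converted to numbers. The sketch below compares the default type guessing with explicit classes, using a made-up file with amount, group, and zip columns.

```r
# Hypothetical file: the zip column has a leading zero worth preserving.
path <- tempfile(fileext = ".csv")
writeLines(c("amount,group,zip", "1.5,a,02134", "2.0,b,10001"), path)

default_types <- read.csv(path)
typed <- read.csv(path, colClasses = c("numeric", "factor", "character"))

sapply(default_types, class)  # zip is guessed to be an integer
sapply(typed, class)          # zip stays a character string

default_types$zip[1]  # leading zero lost
typed$zip[1]          # leading zero kept
```

Checking the classes right after import catches this kind of silent conversion early.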
Working with CSV files is a fundamental skill in R data analysis. By mastering these techniques, you'll be well-equipped to handle various data import and export tasks in your R projects.
Remember to explore related concepts like R Data Frames and R Tibbles to further enhance your data manipulation skills in R.