😎Load Data

Load CSV Files in R

ObjectName <- read.csv("path-to-file/filename.csv", header = TRUE)

Code Breakdown:

  • read.csv() loads CSV files into a data frame.

  • header = TRUE tells R to treat the first row as the column names. This is also the default for read.csv().
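
To make the effect of the header argument concrete, here is a minimal sketch. It writes a tiny CSV to a temporary file so it runs without any external data; the file contents and the object names (csv_path, with_header, no_header) are made up for illustration.

```r
# Write a tiny CSV to a temporary file (hypothetical data, for illustration only)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,Ann", "2,Bob"), csv_path)

# header = TRUE: the first row supplies the column names
with_header <- read.csv(csv_path, header = TRUE)
names(with_header)   # "id" "name"
nrow(with_header)    # 2

# header = FALSE: the first row is treated as data; R names the columns V1, V2, ...
no_header <- read.csv(csv_path, header = FALSE)
names(no_header)     # "V1" "V2"
nrow(no_header)      # 3
```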

Get Data from the URL

-> Option 1: Directly save as an object.

If the data is in CSV format, read.csv() can read it straight from the URL and return it as a data frame.

polls <- read.csv("https://raw.githubusercontent.com/ds4stats/r-tutorials/master/tidying-data/data/rcp-polls.csv",
na.strings = "--", as.is = TRUE)

Code Breakdown:

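  • The URL goes inside double quotes as the first argument to read.csv().

  • na.strings = "--" : This dataset marks missing values with --, which R does not recognize as missing on its own. This argument tells R to read every -- as NA.

  • as.is = TRUE : Keeps character columns as character vectors instead of converting them to factors. Before R 4.0.0, read.csv() converted them to factors by default. From R 4.0.0 onward, character columns are kept as-is by default, so the argument mainly matters on older versions.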
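
The effect of na.strings can be checked on a small stand-in file. The sketch below assumes a made-up three-column layout (poll, sample, margin), not the real columns of rcp-polls.csv, and uses a temporary file so it runs offline.

```r
# Tiny stand-in for a poll file that marks missing values with "--"
# (hypothetical columns, not the real rcp-polls.csv layout)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("poll,sample,margin", "A,1200,3.5", "B,--,--"), csv_path)

# Without na.strings, the "--" markers force the numeric columns to be read as text
raw <- read.csv(csv_path, as.is = TRUE)
class(raw$sample)      # "character"

# With na.strings = "--", the markers become NA and the columns stay numeric
clean <- read.csv(csv_path, na.strings = "--", as.is = TRUE)
class(clean$sample)    # "integer"
is.na(clean$margin)    # FALSE TRUE
class(clean$poll)      # "character" (not a factor, because as.is = TRUE)
```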

-> Option 2: First save as a file and then load as an object.

# Set the URL as an object
url <- "https://raw.githubusercontent.com/ds4stats/r-tutorials/master/tidying-data/data/rcp-polls.csv"

# Download the file
download.file(url, "poll_dataset.csv")

# Load as an object
polls <- read.csv("poll_dataset.csv", header = TRUE, na.strings = "--",
as.is = TRUE)

Code Breakdown:

  • First, the URL is stored in the object url.

  • Second, download.file() downloads the dataset and saves it as poll_dataset.csv in the working directory.

  • Third, read.csv() loads the downloaded file into the object polls.
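
One reason to save the file first is that the download only has to happen once. The sketch below is one possible way to do that; download_if_missing() is a hypothetical helper written for this example, not a function from base R or any package.

```r
# Hypothetical helper: download `url` to `dest` only when `dest` does not exist yet
download_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    download.file(url, destfile = dest)
  }
  invisible(dest)  # return the local path so it can be passed on to read.csv()
}

# Example usage (requires an internet connection on the first run):
# url <- "https://raw.githubusercontent.com/ds4stats/r-tutorials/master/tidying-data/data/rcp-polls.csv"
# local_path <- download_if_missing(url, "poll_dataset.csv")
# polls <- read.csv(local_path, header = TRUE, na.strings = "--", as.is = TRUE)
```

On later runs the helper skips the download and the script reads the local copy, which also keeps it working offline.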

Load TSV Files in R

To load a TSV file in R, we can use either the read.delim() function or the read_tsv() function from the readr package.

Using the read.delim() function

The read.delim() function is a base R function for reading delimited text files. Its default delimiter is already the tab character, so sep = "\t" is optional, but writing it out makes the intent explicit.

# Load the TSV file
tsv_data <- read.delim("path/to/tsv_file.tsv", sep = "\t")

# Print the head of the data frame
head(tsv_data)

Using the read_tsv() function

The read_tsv() function is dedicated to tab-separated files, so no delimiter needs to be specified. On large files it is generally faster than read.delim().

To use the read_tsv() function, you need to install the readr package first.

# Install the readr package
install.packages("readr")

# Load the readr package
library(readr)

# Load the TSV file
tsv_data <- read_tsv("path/to/tsv_file.tsv")

# Print the head of the data frame
head(tsv_data)

Comparing the outputs, the main difference is that read_tsv() returns a tibble and reports the type of each column (for example, Student_Id as double and Student_Name as character), whereas read.delim() returns a plain data frame and prints only the data.

Both approaches load the same data. read_tsv() is the more convenient choice when you want the column types shown up front.
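
Although read.delim() does not print column types, they can still be inspected with base R. The sketch below assumes a tiny made-up TSV with the Student_Id and Student_Name columns mentioned above, written to a temporary file so it runs on its own.

```r
# Tiny TSV written to a temporary file (hypothetical data, for illustration only)
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("Student_Id\tStudent_Name", "1\tAnn", "2\tBob"), tsv_path)

# sep = "\t" is the default for read.delim(), so it can be left out here
tsv_data <- read.delim(tsv_path)

# Class of each column
sapply(tsv_data, class)

# Compact overview of the data frame: column names, types, and first values
str(tsv_data)
```

Note that read.delim() reads whole numbers as integer, while read_tsv() reports them as double, so the two functions can disagree on the exact numeric type of the same column.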

Load Excel Files in R

To load an .xlsx file in R, you can use the read_excel() function from the readxl package.

# Load the readxl package (install it first with install.packages("readxl") if needed)
library("readxl")

# Read the XLSX file
xlsx_data <- read_excel("data.xlsx")

# Print the head of the data frame to see the first few rows of data
head(xlsx_data)

Loading a specific sheet from an XLSX file

If you want to load a specific sheet from an XLSX file, you can use the sheet argument to the read_excel() function.

For example, to load the sheet named "Sheet1" from the XLSX file data.xlsx, you would use the following code:

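# Load a single sheet by name
xlsx_data <- read_excel("data.xlsx", sheet = "Sheet1")

# Load multiple sheets: read_excel() reads one sheet per call
# and has no sheets argument, so call it once per sheet
sheet1_data <- read_excel("data.xlsx", sheet = "Sheet1")
sheet2_data <- read_excel("data.xlsx", sheet = "Sheet2")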
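
Since read_excel() reads one sheet per call, reading every sheet of a workbook is usually done with a loop. The sketch below is one way to do it with excel_sheets() and lapply(). It assumes the readxl package is installed and uses the sample workbook that ships with readxl, so the path is a stand-in for your own file.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook bundled with readxl;
# replace it with your own file, e.g. "data.xlsx"
path <- readxl_example("datasets.xlsx")

# List the sheet names, then read each sheet into a named list of data frames
sheet_names <- excel_sheets(path)
all_sheets <- lapply(sheet_names, function(s) read_excel(path, sheet = s))
names(all_sheets) <- sheet_names

# Each sheet is now available by name or by position
head(all_sheets[[1]])
```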
