Pass the URL to the read.csv() function as a string inside double quotes.
na.strings = "--": This dataset marks missing values with --, which R does not recognize as missing. This argument converts every -- to NA.
as.is = TRUE: Before R 4.0.0, read.csv() converted character columns into factors by default. This argument keeps them as character vectors.
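To see what these two arguments do without a network connection, here is a minimal sketch that parses a small inline CSV. The sample rows and the column names poll, obama, and romney are made up for illustration and are not taken from the actual dataset.

```r
# Hypothetical sample data: "--" stands in for a missing value
csv_text <- "poll,obama,romney
Gallup,49,48
Rasmussen,--,47
Pew,50,--"

# text = lets read.csv() parse a string instead of a file or URL
sample_polls <- read.csv(text = csv_text, header = TRUE,
                         na.strings = "--", as.is = TRUE)

# The "--" entries are now NA, so the numeric columns stay numeric
str(sample_polls)
colSums(is.na(sample_polls))
```

Without na.strings = "--", the columns containing -- would be read as text, because R could not interpret every value as a number.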
-> Option 2: First save as a file and then load as an object.
Code Breakdown:
First, the URL is stored in an object named url.
Second, the download.file() function downloads the dataset and saves it as poll_dataset.csv.
Third, read.csv() loads the saved CSV file into the object polls.
Load TSV Files in R
To load a TSV file in R, we can use either the read.delim() function or the read_tsv() function from the readr package.
Using the read.delim() function
The read.delim() function reads delimited text files and uses the tab character ("\t") as its default delimiter, so it can read a TSV file without extra arguments. You can still pass sep = "\t" to make the delimiter explicit.
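As a self-contained illustration, the sketch below writes a tiny TSV to a temporary file and reads it back with read.delim(). The student records are made-up sample data, and the temporary path stands in for a real file location.

```r
# Write a small, made-up TSV to a temporary file (hypothetical data)
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("Student_Id\tStudent_Name",
             "1\tAsha",
             "2\tBen",
             "3\tChen"), tsv_path)

# read.delim() defaults to sep = "\t" and header = TRUE
students <- read.delim(tsv_path)

# Inspect the column names and types
str(students)
```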
Using the read_tsv() function
The read_tsv() function is designed specifically for TSV files. It is typically faster than read.delim(), especially on large files.
To use the read_tsv() function, you need to install the readr package first.
Comparing the outputs, the main difference is that read_tsv() returns a tibble and reports the type of each column [ Student_Id -> double, Student_Name -> character ], whereas read.delim() returns a plain data frame without that type summary.
For TSV files, read_tsv() is generally the more convenient choice.
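The difference can be checked directly by reading the same file with both functions. This is a rough sketch that assumes readr is already installed; the two student rows are made-up sample data.

```r
library(readr)  # assumes readr is already installed

# Made-up sample file, written to a temporary location
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("Student_Id\tStudent_Name",
             "1\tAsha",
             "2\tBen"), tsv_path)

base_df  <- read.delim(tsv_path)  # base R
readr_df <- read_tsv(tsv_path)    # readr, prints a column specification

class(base_df)              # a plain data.frame
class(readr_df)             # a tibble, which is also a data.frame
class(base_df$Student_Id)   # base R guesses integer for whole numbers
class(readr_df$Student_Id)  # readr guesses double ("numeric")
```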
Load Excel Files in R
To load an .xlsx file in R, you can use the read_excel() function from the readxl package.
Loading a specific sheet from an XLSX file
If you want to load a specific sheet from an XLSX file, you can use the sheet argument of the read_excel() function, which accepts either a sheet name or a sheet position.
For example, to load the sheet named "Sheet1" from the XLSX file data.xlsx, you would use the following code:
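The file data.xlsx is not available here, so the sketch below demonstrates the sheet argument on datasets.xlsx, an example workbook that ships with readxl. Its sheet named "mtcars" stands in for "Sheet1", and the equivalent call for data.xlsx appears as a comment. It assumes readxl is already installed.

```r
library(readxl)  # assumes readxl is already installed

# For the file described above, the call would be:
# xlsx_data <- read_excel("data.xlsx", sheet = "Sheet1")

# Runnable stand-in: an example workbook bundled with readxl
path <- readxl_example("datasets.xlsx")

# List the sheet names available in the workbook
excel_sheets(path)

# Load one sheet by name
sheet_data <- read_excel(path, sheet = "mtcars")
head(sheet_data)
```

Calling excel_sheets() first is a convenient way to confirm the exact sheet name before loading it.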
# Set the URL as an object
url <- "https://raw.githubusercontent.com/ds4stats/r-tutorials/master/tidying-data/data/rcp-polls.csv"
# Download the file
download.file(url, "poll_dataset.csv")
# Load as an object
polls <- read.csv("poll_dataset.csv", header = TRUE, na.strings = "--",
                  as.is = TRUE)
# Load the TSV file
tsv_data <- read.delim("path/to/tsv_file.tsv", sep = "\t")
# Print the head of the data frame
head(tsv_data)
# Install the readr package
install.packages("readr")
# Load the readr package
library(readr)
# Load the TSV file
tsv_data <- read_tsv("path/to/tsv_file.tsv")
# Print the head of the data frame
head(tsv_data)
# Load the readxl package
library("readxl")
# Read the XLSX file
xlsx_data <- read_excel("data.xlsx")
# Print the head of the data frame to see the first few rows of data
head(xlsx_data)