# Data Type

A **data type**, in programming, is a classification that specifies which type of value a variable has and what type of mathematical, relational, or logical operations can be applied to it without causing an error.
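As a small illustration of this definition, the sketch below applies the same arithmetic operation to a numeric value and to a character value. The variable names `x`, `y`, and `result` are only illustrative, and `tryCatch()` is used here just so the script keeps running after the error.

```r
x <- 10      # numeric value
y <- "10"    # character value (note the quotes)

# Arithmetic is defined for numeric values
x + 5

# Arithmetic is not defined for character values, so R raises an error.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(y + 5, error = function(e) conditionMessage(e))
result
```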

R has <mark style="background-color:orange;">5 main data types</mark> (more precisely, data structures):

1. **Vector**
2. **Matrix**
3. **Array**
4. **List**
5. **Data Frame**

<figure><img src="https://3681152927-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvUtrdiIkCrBX60yTgn1m%2Fuploads%2Ft1N4GtVLPadp2HZandCG%2FdataStructuresNew.png?alt=media&#x26;token=39bfb526-d25c-4c66-81a3-7800fe6e5f9a" alt=""><figcaption><p>Source:  <a href="http://venus.ifca.unican.es/Rintro/dataStruct.html">http://venus.ifca.unican.es/Rintro/dataStruct.html</a></p></figcaption></figure>

### 1. Vector

A vector is a sequence of data elements of the <mark style="color:red;background-color:orange;">**same basic type**</mark>. There are 5 classes of vectors.

> In R, vectors are created with <mark style="color:red;">`c()`</mark>.

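1. **Logical**

   Ex: TRUE or FALSE
2. **Integer**: The whole number values. A number needs the `L` suffix to be stored as an integer; without it, R stores the number as numeric.

   Ex: 1L, 2L, 5L, 100L, 20L, 15L, etc.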
3. **Numeric**: Both whole numbers and decimal values.

   Ex: 4, 3.1416, 0.534, etc.
4. **Complex**

   Ex: 3+4i, 5+2i, etc.
5. **Character:** Needs to be enclosed between single or double quotes.

   Ex: "M", "We", "Someone", etc.

{% hint style="info" %}
"We can use the <mark style="color:red;background-color:green;">**`L`**</mark> suffix to qualify any number with the intent of making it an explicit integer"
{% endhint %}

{% hint style="info" %}
To check the type/class of a vector use <mark style="color:red;">`class(vectorName)`</mark>.
{% endhint %}

**Code Example:**

```r
vectorName <- c("We", "love", "R", "programming")
```
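As a minimal sketch of how the five classes and the `L` suffix show up in practice, the following creates one vector of each class and checks it with `class()`. The variable names are illustrative only.

```r
# One vector per class (variable names are illustrative)
lgl <- c(TRUE, FALSE)
int <- c(1L, 2L, 5L)          # the L suffix makes each value an integer
num <- c(4, 3.1416, 0.534)
cpx <- c(3+4i, 5+2i)
chr <- c("M", "We", "Someone")

class(lgl)   # "logical"
class(int)   # "integer"
class(num)   # "numeric"
class(cpx)   # "complex"
class(chr)   # "character"

# Without the L suffix, a whole number is stored as numeric
class(5)     # "numeric"
class(5L)    # "integer"
```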

{% hint style="warning" %}
Do not mix more than one class in a single vector. R will not raise an error; it silently converts all elements to one common class.
{% endhint %}
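The warning above matters because R does not stop you from mixing classes. The short sketch below (with illustrative variable names) shows what happens to the elements when classes are mixed.

```r
# Numeric, character, and logical values mixed together:
# everything is converted to character
mixed1 <- c(1, "a", TRUE)
class(mixed1)   # "character"
mixed1          # "1" "a" "TRUE"

# Numeric and logical values mixed together:
# TRUE becomes 1 and FALSE becomes 0
mixed2 <- c(1.5, TRUE, FALSE)
class(mixed2)   # "numeric"
mixed2          # 1.5 1.0 0.0
```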

### 2. Matrix

A matrix is an R object in which the elements are arranged in a two-dimensional rectangular layout.

{% tabs %}
{% tab title="Syntax" %}

```r
matrix(data, nrow, ncol, byrow, dimnames)
```

**Syntax Breakdown:**

* **data**: is the input vector whose elements become the data elements of the matrix.
* **nrow**: is the number of rows to be created.
* **ncol**: is the number of columns to be created.
* **byrow**: is a logical value. If TRUE, the input vector elements are arranged by row; if FALSE (the default), they are arranged by column.
* **dimnames**: are the names assigned to the rows and columns.
  {% endtab %}

{% tab title="Code Example" %}

```r
# By default byrow = FALSE
matr <- matrix(c(5:29),5,5)

# When byrow = TRUE
matr <- matrix(c(5:29),5,5, byrow = TRUE)

```

**Code Breakdown:**

* `c(5:29)`: is the **data**, a sequence of the 25 whole numbers from 5 to 29.
* `5,5`: are the number of rows and the number of columns, so the result is a matrix with **5x5** dimension.
  {% endtab %}
  {% endtabs %}

**Code Example Output:**

<figure><img src="https://3681152927-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvUtrdiIkCrBX60yTgn1m%2Fuploads%2F8ymYYydV3wJFvZreoRJY%2Fimage140.png?alt=media&#x26;token=5de3ddc1-b9cd-4efa-aec4-f983af9ccee2" alt=""><figcaption><p>Example of a matrix</p></figcaption></figure>
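The `dimnames` argument from the syntax is not used in the example above. A minimal sketch of it, together with element access by name and by position, could look like this. The matrix `m` and the names `r1`, `c1`, and so on are illustrative only.

```r
# A 2 x 3 matrix filled row by row, with row and column names
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
m

dim(m)          # 2 3
m["r2", "c3"]   # element in row r2, column c3: 6
m[1, ]          # the whole first row: 1 2 3
```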

### 3. Array

Arrays are R data objects that can store data in more than two dimensions.

{% tabs %}
{% tab title="Syntax" %}

```r
array(data, dim, dimnames)
```

{% endtab %}

{% tab title="Code Example" %}

```r
arr <- array(c(0:15), dim = c(4,4,2,2))
```

**Code Breakdown:**

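* `c(0:15)`: is the **data**, which creates a sequence of numbers from 0 to 15.
* `c(4,4,2,2)`: is the dimension (dim). <mark style="color:red;">`4,4`</mark> denotes matrices of **4x4** dimension, then <mark style="color:red;">`2,2`</mark> arranges those matrices in a **2x2** layout, giving a four-dimensional array. The array has 4x4x2x2 = 64 cells but the data has only 16 values, so R recycles the data and every 4x4 matrix holds the same numbers.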
  {% endtab %}
  {% endtabs %}

<figure><img src="https://3681152927-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvUtrdiIkCrBX60yTgn1m%2Fuploads%2Fyixj9REPHH3TZcpiRynP%2Farray_2.png?alt=media&#x26;token=24c88fe1-49fa-43b7-b346-e735ffbb29a9" alt=""><figcaption><p>Source: <a href="https://www.neonscience.org/resources/learning-hub/tutorials/hsi-hdf5-r">https://www.neonscience.org/resources/learning-hub/tutorials/hsi-hdf5-r</a></p></figcaption></figure>
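A three-dimensional array is often easier to picture than a four-dimensional one. The sketch below builds one, accesses its parts by index, and then shows how R recycles the data when the data is shorter than the array. The names `arr3` and `arr4` are illustrative.

```r
# A 3-dimensional array: 2 rows, 3 columns, 2 layers (2 * 3 * 2 = 12 cells)
arr3 <- array(1:12, dim = c(2, 3, 2))
dim(arr3)        # 2 3 2
arr3[, , 1]      # first layer: a 2 x 3 matrix holding 1 to 6
arr3[2, 3, 2]    # row 2, column 3, layer 2: 12

# When the data is shorter than the array, R recycles it
arr4 <- array(0:15, dim = c(4, 4, 2, 2))
length(arr4)                                # 64
identical(arr4[, , 1, 1], arr4[, , 2, 2])   # TRUE
```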

### 4. List

Lists are R objects that can contain elements of different types, such as numbers, strings, vectors, and even other lists.

{% hint style="info" %}
In simple words, a list can contain more than one type of data.
{% endhint %}

{% tabs %}
{% tab title="Syntax" %}

```r
listName <- list(data)
```

{% endtab %}

{% tab title="Code Example" %}

```r
# Let's create some vectors with different data types
# Integer
vctr1 <- c(1:5)
# Character
vctr2 <- c("I", "love", "R", "programming")
# Logical
vctr3 <- c(TRUE, FALSE)
# Create a list
aList <- list(vctr1, vctr2, vctr3)
# View the list
aList
```

{% endtab %}
{% endtabs %}

**Code Output:**

<figure><img src="https://3681152927-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvUtrdiIkCrBX60yTgn1m%2Fuploads%2F4kXT3wQQjW9BroHfoAqT%2Fimage.png?alt=media&#x26;token=ca899882-b639-4fe4-bbb9-6c4f7b2a1198" alt=""><figcaption></figcaption></figure>
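List elements can also be given names, which makes them easier to access. The following is a minimal sketch; the list `person` and its element names are illustrative only.

```r
# A named list holding three different kinds of data
person <- list(name = "Rahim", scores = c(14, 16, 78), passed = TRUE)

person$name          # access by name: "Rahim"
person[["scores"]]   # same idea with double brackets: 14 16 78
person$scores[2]     # second value of the scores vector: 16

# Single brackets return a smaller list; double brackets return the element itself
class(person[1])     # "list"
class(person[[1]])   # "character"

# Compact overview of the structure of the list
str(person)
```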

### 5. Data Frame

A dataframe is a table or two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column.

{% tabs %}
{% tab title="Syntax" %}

```r
data.frame(data, row.names = NULL, stringsAsFactors = FALSE)
```

**Syntax Breakdown:**

* **data:** the values that become the columns; can be vectors, a matrix, a table, etc.
* <mark style="color:red;">`row.names = NULL`</mark>: Optionally gives the name of a column to be used as row names. When left as NULL, the row names are consecutive integers.
* <mark style="color:red;">`stringsAsFactors = FALSE`</mark>: If TRUE, the columns with character values are converted to factors. FALSE is the default since R 4.0.0.
  {% endtab %}

{% tab title="Code Example" %}

```r
# Let's create some vectors
vctr1 <- c(1:5)
vctr2 <- c("Rahim", "Karim", "Jodu", "Modu", "Neymar")
vctr3 <- c(14,16,78,23,24)

# Now, use these vectors to create a dataframe
df <- data.frame(vctr1, vctr2, vctr3)
# View the dataframe
df
```

{% endtab %}
{% endtabs %}

**Code Output:**

<figure><img src="https://3681152927-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvUtrdiIkCrBX60yTgn1m%2Fuploads%2FvAKGIuTzwSnQw8ctxn9r%2Fimage.png?alt=media&#x26;token=213b8d3b-2e10-4094-be22-19c31bed7992" alt=""><figcaption></figcaption></figure>
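Columns can be named when the data frame is created, which makes them easier to work with afterwards. The sketch below reuses the values from the example; the column names `id`, `name`, and `value` are assumptions made for illustration, since the source does not say what the numbers represent.

```r
# Same values as above, but with explicit (illustrative) column names
df2 <- data.frame(
  id    = 1:5,
  name  = c("Rahim", "Karim", "Jodu", "Modu", "Neymar"),
  value = c(14, 16, 78, 23, 24),
  stringsAsFactors = FALSE
)

str(df2)                    # column names and the class of each column
df2$name                    # one column, returned as a character vector
df2[df2$value > 20, ]       # only the rows where value is above 20
nrow(df2)                   # number of rows: 5
ncol(df2)                   # number of columns: 3
```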

{% hint style="info" %}
R contains a number of preloaded datasets. You can load and use them instantly.

* To view the built-in datasets in R, use the following command: <mark style="color:red;">`data()`</mark>
* To load these datasets in your **RStudio** environment use <mark style="color:red;">`data(datasetName)`</mark>.
  {% endhint %}
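As a short sketch of the hint above, the following loads `mtcars`, one of the datasets that ships with base R, and inspects it. Any other built-in dataset name could be used in its place.

```r
# Load a built-in dataset into the current environment
data(mtcars)

head(mtcars, 3)   # first three rows
dim(mtcars)       # 32 rows and 11 columns
class(mtcars)     # "data.frame"
```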

To learn more about data frames and their manipulation, see the following page:

{% content-ref url="../dataframe" %}
[dataframe](https://ar-riyaz.gitbook.io/r-for-bioinformatics/dataframe)
{% endcontent-ref %}

### Sources of the contents on this page:

1. [R for Data Science Full Course | Data Science Training | Edureka](https://youtu.be/ckdHNu4kfL0)
