Data format requirements for graphs and ANOVAs.

`grafify`

Eight data sets are included for practising with the package. To view these, type `data(package = "grafify")`. A description of each can be obtained by typing `?<name of data set>` in the console. They are all in long format, which is described further below.

All `plot_` functions require data in *long* format. Use `pivot_longer` and `pivot_wider` from the `tidyr` package to interchange data table formats. Examples on how to do this are available here.
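As a minimal sketch of the conversion, the example below reshapes a small wide table into long format and back again. The data frame `wide_df`, its column names and its values are hypothetical and chosen only for illustration; `pivot_longer` and `pivot_wider` are the `tidyr` functions mentioned above.

```r
library(tidyr)

# Hypothetical wide table: one row per experiment, one column per genotype
wide_df <- data.frame(
  Experiment = c("Exp_1", "Exp_2", "Exp_3"),
  WT   = c(25.0, 16.5, 31.1),
  KO_1 = c(1.8, 2.1, 1.9)
)

# Wide -> long: stack the genotype columns into a Genotype/Death pair
long_df <- pivot_longer(
  wide_df,
  cols = c(WT, KO_1),
  names_to = "Genotype",
  values_to = "Death"
)

# Long -> wide: the reverse operation
wide_again <- pivot_wider(
  long_df,
  names_from = Genotype,
  values_from = Death
)
```

In the long table each row holds a single measurement, with one column identifying the experiment, one identifying the genotype and one holding the measured value.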

Analyses as linear models for ANOVAs (i.e., `simple_model`, `simple_anova`, `mixed_model`, `mixed_anova`, `mixed_model_slopes` and `mixed_anova_slopes`) also require data to be supplied in long format.

In addition, keep in mind whether each independent variable is categorical/nominal or numeric.

Confirm that **categorical/discrete** fixed and random factors are converted into *factors* by using `as.factor()` (see here), and check the data frame with `str()`. Failing to do this for columns that use numbers to describe levels will lead to incorrect results (e.g., if a column describes “Experiments” as 1, 2, 3 and so on, R will treat these as numeric values when they should be analysed as categorical/discrete/nominal variables). Examples of numeric variables we might come across in biology are time, temperature, bodyweight (mass), lengths etc.
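The conversion can be sketched in base R as follows. The data frame `df` and its values are hypothetical; only `as.factor()` and `str()` come from the text above.

```r
# Hypothetical data frame in which experiments are coded as numbers
df <- data.frame(
  Experiment = c(1, 1, 2, 2, 3, 3),
  Genotype   = c("WT", "KO", "WT", "KO", "WT", "KO"),
  Death      = c(25.0, 1.8, 16.5, 2.1, 31.1, 1.9)
)

# Before conversion Experiment is num (Genotype is chr in R >= 4.0)
str(df)

# Convert the categorical columns to factors
df$Experiment <- as.factor(df$Experiment)
df$Genotype   <- as.factor(df$Genotype)

# Both columns should now be reported as Factor; Death stays num
str(df)
```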

ANOVA and model fitting functions do not check whether the variables are “factors”. This is because some users may actually want to fit lines through numeric variables. *Unless this is what you intended to do, convert fixed factors (also called discrete, categorical) to what R understands as ‘factors’ first.*

It is the user’s responsibility to check data frame structure.
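To illustrate why the conversion matters, here is a small sketch that fits the same hypothetical data twice. It uses base R's `lm()` and `anova()` rather than the `grafify` functions, so it shows only the general behaviour of linear models in R; the data frame and its values are made up for illustration.

```r
# Hypothetical data: three experiments coded as 1, 2, 3
df <- data.frame(
  Experiment = rep(c(1, 2, 3), each = 4),
  Death      = c(25, 22, 27, 24, 16, 18, 15, 17, 31, 29, 33, 30)
)

# Experiment left as numeric: a single slope is fitted through 1, 2, 3
fit_numeric <- lm(Death ~ Experiment, data = df)

# Experiment converted to a factor: one mean per level is estimated
df$Experiment_f <- as.factor(df$Experiment)
fit_factor <- lm(Death ~ Experiment_f, data = df)

anova(fit_numeric)  # Experiment uses 1 degree of freedom
anova(fit_factor)   # Experiment_f uses 2 degrees of freedom (3 levels - 1)
```

The differing degrees of freedom show that the two fits answer different questions: the numeric version asks whether Death changes linearly with the experiment number, while the factor version compares the experiments as separate groups.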

Here is an example of a data frame in `grafify`.

```
#10 rows of data_1w_death table
head(data_1w_death, n = 10)
```

```
#> Experiment Genotype Death
#> 1 Exp_1 WT 25.012173
#> 2 Exp_1 KO_1 1.824973
#> 3 Exp_1 KO_2 14.294956
#> 4 Exp_2 WT 16.542609
#> 5 Exp_2 KO_1 2.131199
#> 6 Exp_2 KO_2 23.749439
#> 7 Exp_3 WT 31.125802
#> 8 Exp_3 KO_1 1.916670
#> 9 Exp_3 KO_2 21.998527
#> 10 Exp_4 WT 21.596348
```

```
#structure of the data frame
str(data_1w_death)
```

```
#> 'data.frame': 15 obs. of 3 variables:
#> $ Experiment: Factor w/ 5 levels "Exp_1","Exp_2",..: 1 1 1 2 2 2 3 3 3 4 ...
#> $ Genotype : Factor w/ 3 levels "WT","KO_1","KO_2": 1 2 3 1 2 3 1 2 3 1 ...
#> $ Death : num 25.01 1.82 14.29 16.54 2.13 ...
```

Note that Experiment and Genotype are `Factor` columns, whereas Death is `num` (numeric).