For loops

This workshop goes through the different ways of writing and implementing for loops in R. From subsetting your data into different chunks, to generating individual plots, to bootstrapping iterations, hopefully by the end of this workshop you’ll be able to implement loops in your code and make it more efficient and less frustrating. We will also cover how to include if else statements in your code.

Now, many of these examples can also be done using the apply family or the purrr package. The apply family is part of base R, while the purrr package is more tidyverse-friendly as it works with %>%. In many instances, though, both require you to write out your own function(), even when you are just calling an existing function. So, for the purpose of this workshop, we’ll stick to for() loops.

What are for loops?

For loops are basically a way of repeating a task (or a series of tasks) over and over again. If you have ever had to copy-paste a chunk of code more than two times, then you would have probably been better off using a for loop.
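As a rough illustration (the `sites` vector is made up for this sketch and is not part of the workshop data), here is the copy-paste version of a task next to the loop version:

```r
# Hypothetical example values, not from the workshop data
sites <- c('A', 'B', 'C')

# Copy-paste version: one line per element
print(paste('Processing site', sites[1]))
print(paste('Processing site', sites[2]))
print(paste('Processing site', sites[3]))

# Loop version: one line, however long the vector gets
for (site in sites) print(paste('Processing site', site))
```

Both versions print the same three lines, but only the loop stays the same length when `sites` grows.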

Why do we need for loops?

Great question! As much fun as it is to copy-paste something 10000000000 times, it can introduce some problems.

  • What if you wanted to change a single element and apply it to everything? Now you have to rewrite the same line over and over and over again, which can very easily introduce mistakes that are hard to catch
  • For loops run iteratively, so you can just hit “run” and let it go! This saves you a bunch of time instead of running one iteration, waiting for it to finish, then running the next and so on.
  • It’s more reproducible-friendly. Nobody wants to go through your 39485728059824375 lines of code, especially if you have the same 200 lines repeating

Long story short, if you have ever had to copy-paste something a bunch of times, you were probably better off writing it into a loop. Of course, they are not obviously intuitive, but once you get the hang of it, you’ll love running them!

In this workshop, we’ll start off with some very easy and basic for loops, then we’ll get into some more everyday examples of when they’re useful, and we’ll end the workshop with some not-so-complicated nested for loops (i.e. a loop within a loop).

Basic loops

In this section, we’ll go through the very basics of a for loop and how they are typically coded. We call them “for loops” because we’re telling the computer “for every element in my list, do this task”. So, if we had a list of names and we wanted to print them, we’d say “for every name in my list, print it”

for (name in c('Isabelle', 'Juliano', 'Will')) print(name)
## [1] "Isabelle"
## [1] "Juliano"
## [1] "Will"

We can give it the list inside the parentheses, or we can specify a list outside of the loop and then feed it in

name_list <- c('Isabelle', 'Juliano', 'Will')

for (name in name_list) print(name)
## [1] "Isabelle"
## [1] "Juliano"
## [1] "Will"

We end up with the same outcome! This will be very helpful down the road when lists get longer.

You’ll notice that the function, in this case print(), came immediately after the for() call. That’s because we are only asking it to do a single function (in this case print()). If you want it to perform multiple functions, then you will need to add curly brackets {}

for (name in name_list) {
  
  name2 <- paste(name, 'rocks at coding!', sep = ' ')
  print(name2)
  
}
## [1] "Isabelle rocks at coding!"
## [1] "Juliano rocks at coding!"
## [1] "Will rocks at coding!"

In this example, we’ve created a second object called name2, where we added a pasted phrase of ‘rocks at coding’ and separated clauses with a space. Now, the important thing to note here is that each time the loop runs through an iteration, it is rewriting what name2 is. That means that when the loop is finished, name2 will take on the last value of the list.

name2
## [1] "Will rocks at coding!"

This means that we did not keep any of the previous versions of name2. What if we want access to all three? That’s where things get exciting. We’ll have to designate an empty list outside the loop, and then assign values to things from the loop.

# first, create an empty list that we will infill later
name2_list <- list()

# Now, we'll run the loop and assign each value to its own element from the list
for (name in name_list) {
  
  name2 <- paste(name, 'rocks at coding!', sep = ' ') # create the statement
  print(name2) # print it out to read
  name2_list[[name]] <- name2 # assign it to the list
  
}
## [1] "Isabelle rocks at coding!"
## [1] "Juliano rocks at coding!"
## [1] "Will rocks at coding!"

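We assign objects in the list using the double square brackets [[]]. Because we index with a character string (i.e. name), R stores each value as a named element of the list. The elements are added in order, so they can be pulled out later either by name or by position.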

Now, we can pull out individual values from the list we created

name2_list[[1]]
## [1] "Isabelle rocks at coding!"
name2_list[[2]]
## [1] "Juliano rocks at coding!"
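Because each element was assigned using a character string, the list also carries names. A minimal sketch (it rebuilds the same list so that it runs on its own) showing that indexing by name and by position returns the same element:

```r
# Rebuild the list from the loop above so this sketch runs on its own
name_list <- c('Isabelle', 'Juliano', 'Will')
name2_list <- list()

for (name in name_list) {
  name2_list[[name]] <- paste(name, 'rocks at coding!', sep = ' ')
}

names(name2_list)     # the element names come from name_list
name2_list[['Will']]  # pull out an element by name
name2_list[[3]]       # the same element by position
```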

Groovy! Ok, so we’ve assigned ‘rocks at coding!’ to everyone, but what if we wanted to say something different about Juliano? That’s when we can use if else statements. An if else statement is basically telling R “If the element of the list is equal to a specific value, do this task. For everything else, do this other task”.

for (name in name_list) {
  
  # We'll start with the condition if the name is Juliano
  if (name == 'Juliano') 
    name2 <- paste(name, 'is amazing at coding!', sep = ' ')
  # for everything else in the list
  else 
    name2 <- paste(name, 'rocks at coding!', sep = ' ')
  
  print(name2)
  
}
## [1] "Isabelle rocks at coding!"
## [1] "Juliano is amazing at coding!"
## [1] "Will rocks at coding!"

Just like the for() statement before, because we only have a single function in our if else statement, we don’t need curly brackets. If we had multiple functions, we would add curly brackets. We’ll go through some more complicated examples below.
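Here is a small sketch of what the curly-bracket version could look like. The `toupper()` step is only there to give one branch a second line of work; it is not part of the workshop example.

```r
name_list <- c('Isabelle', 'Juliano', 'Will')

for (name in name_list) {
  
  if (name == 'Juliano') {
    # Two steps in this branch, so curly brackets are needed
    name2 <- paste(name, 'is amazing at coding!', sep = ' ')
    name2 <- toupper(name2)
  } else {
    name2 <- paste(name, 'rocks at coding!', sep = ' ')
  }
  
  print(name2)
  
}
```

Writing `} else {` on one line is a safe habit: inside the curly brackets of a for loop an else on its own line works, but at the top level of a script R treats the if as finished before it ever reaches the else.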

Now, what if we wanted to say something different about each name? Then we would add more if statements in:

for (name in name_list) {
  
  # We'll start with the condition for Juliano
  if (name == 'Juliano') 
    name2 <- paste(name, 'is amazing at coding!', sep = ' ')
  # Now we'll change it for Isabelle too
  else if (name == 'Isabelle') 
    name2 <- paste(name, 'is super cool!', sep = ' ')
  # Now the last one left will take the else statement
  else 
    name2 <- paste(name, 'rocks at coding!', sep = ' ')
  
  print(name2)
  
}
## [1] "Isabelle is super cool!"
## [1] "Juliano is amazing at coding!"
## [1] "Will rocks at coding!"

You can layer as many if statements onto your block as you want. We’ll go through some examples below of how this can make things easier.

Ok, what if we actually didn’t want to say anything about Juliano at all? What if we wanted to skip Juliano? Then we can use the command next. We would use it inside an if statement block and basically tell R “If the element of the list is equal to a specific value, move on to the next element”.

for (name in name_list) {
  
  # We'll tell the for loop to move to the next value after Juliano
  if (name == 'Juliano') next
  
  # Create our name2 object and print it
  name2 <- paste(name, 'is amazing at coding!', sep = ' ')
  print(name2)
  
}
## [1] "Isabelle is amazing at coding!"
## [1] "Will is amazing at coding!"

In this example, we don’t need to specify an else portion to the if statement, because it knows to just move onto the next value. This can become very helpful when you’re iterating through a list of values in a dataframe and you just want to skip one (or multiple) levels. It will save you from having to create a whole separate dataframe or rewriting out an entire list and omitting those specific levels.
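To skip more than one level, one option is to test membership with %in%. In this sketch, `skip_names` and `kept` are hypothetical helper objects introduced only for illustration:

```r
name_list <- c('Isabelle', 'Juliano', 'Will')
skip_names <- c('Juliano', 'Will') # hypothetical set of levels to skip
kept <- character(0)               # records which names were processed

for (name in name_list) {
  
  # Move on to the next value if the name is in the skip set
  if (name %in% skip_names) next
  
  kept <- c(kept, name)
  print(paste(name, 'is amazing at coding!', sep = ' '))
  
}
```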

We can also tell the loop to break at a certain value. This is helpful when you’re generating a big loop and you want to troubleshoot any problems that you might have. For example, you can tell R things like “If an error occurs with this iteration, break the loop”. It’s also helpful in cases if, say, you want rows n and n+1 from a dataframe. When you get to the last row, then there is no more n+1, so you can tell the loop to break.

for (name in name_list) {
  
  # We'll tell the loop to break at Juliano
  if (name == 'Juliano') break 
  
  # Create the name2 object and print it
  name2 <- paste(name, 'is amazing at coding!', sep = ' ')
  print(name2)
  
}
## [1] "Isabelle is amazing at coding!"
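The rows n and n+1 case mentioned above could be sketched like this. The data frame `df` and its values are made up for the example:

```r
# Hypothetical toy data, made up for this sketch
df <- data.frame(day = 1:4, count = c(10, 12, 15, 11))
change <- numeric(0)

for (n in seq_len(nrow(df))) {
  
  # On the last row there is no row n + 1, so stop the loop
  if (n == nrow(df)) break
  
  # Difference between row n + 1 and row n
  change[n] <- df$count[n + 1] - df$count[n]
  
}

change
```

The loop fills three differences for four rows and stops before it would try to read a fifth row that does not exist.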

One thing to keep in mind: be careful if you decide to overwrite the iterating object (in this case, it’s name) inside the loop. For loops conventionally use i to denote numbers, and if you reassign i inside the loop body, it will change your output for that iteration.

for (i in 1:10) {
  
  i <- i + 1
  print(i) 
  
}
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 11
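Note that in the loop above R still ran exactly ten times. At the start of every iteration, i is reset to the next value of the sequence, so overwriting it changes what happens inside that iteration but not which values the loop visits. A short sketch (`seen` is just a helper vector for this example):

```r
seen <- c()

for (i in 1:3) {
  
  i <- i + 10        # overwrite the loop variable inside the body
  seen <- c(seen, i) # record the value used in this iteration
  
}

# Three iterations, each starting again from the original 1, 2, 3
seen
```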

Incorporating for loops into your everyday code

Alrighty, now that we have the basics of for loops, let’s get into the fun stuff and when this can actually be helpful for us biologists. I typically use for loops when I’m

  • Cleaning and wrangling data
  • Generating multiple plots that are similar (but different)
  • Bootstrapping model iterations

Obviously, there are also other times where it’s helpful to use for loops, but I find that these are the most common instances for which they are used. Luckily, there are more and more tidyverse solutions to data cleaning and wrangling, so you probably will not need for loops as often, but it’s still a handy trick to know.

For all the next examples, we’re going to use the palmerpenguins package data. This data measured a bunch of morphological traits from three species of penguin from three different islands in Antarctica. We’ll also load the tidyverse package for data wrangling and the patchwork package to bind plots together.

library(tidyverse) 
library(palmerpenguins) # the data
library(patchwork) # to patch multiple plots together

summary(penguins)
##       species          island    bill_length_mm  bill_depth_mm  
##  Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10  
##  Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60  
##  Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30  
##                                  Mean   :43.92   Mean   :17.15  
##                                  3rd Qu.:48.50   3rd Qu.:18.70  
##                                  Max.   :59.60   Max.   :21.50  
##                                  NA's   :2       NA's   :2      
##  flipper_length_mm  body_mass_g       sex           year     
##  Min.   :172.0     Min.   :2700   female:165   Min.   :2007  
##  1st Qu.:190.0     1st Qu.:3550   male  :168   1st Qu.:2007  
##  Median :197.0     Median :4050   NA's  : 11   Median :2008  
##  Mean   :200.9     Mean   :4202                Mean   :2008  
##  3rd Qu.:213.0     3rd Qu.:4750                3rd Qu.:2009  
##  Max.   :231.0     Max.   :6300                Max.   :2009  
##  NA's   :2         NA's   :2

Data wrangling

Let’s start with a data wrangling example. Let’s say, for each species, we want to know the minimum and maximum flipper length, but we want to keep all the data. If we used the classic dplyr::group_by() and summarise() functions, we’d lose all the other information from the data. If we used dplyr::slice, we would have to generate two different dataframes for the minimum and maximum and bind them together (and repeat this for each species). If we do it in a for loop, we can get it done all with one chunk of code.

When I give this as a live workshop, I like to show how I build loops. Basically, you start with your most basic command and check along the way that each piece is working. This way, you can tell exactly which part of the for loop might give you trouble. Another way of building for loops is to write out all the code outside the loop first (but instead of giving it a list to iterate through, you just give it a single level).
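The second approach might look something like this sketch, where `penguin_sp` is set by hand to a single species before any loop exists (it assumes the dplyr and palmerpenguins packages are installed):

```r
library(dplyr)
library(palmerpenguins)

# Stand-in for the loop variable: a single level instead of the full list
penguin_sp <- 'Adelie'

# The code that will later become the loop body
data_subset <-
  penguins %>%
  dplyr::filter(species == penguin_sp)

# Check that the result looks right before wrapping it in for () { }
unique(data_subset$species)
nrow(data_subset)
```

Once this works for one species, the same lines can be moved inside the loop unchanged.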

# We'll generate three different datasets, one for each species, from the for loop
# Then, we'll bind them together at the end

# First, create an empty list that we'll fill from the for loop
penguins_list <- list()

# Now we'll run through the loop.
for (penguin_sp in unique(penguins$species)) {
  
  # First, we'll filter the dataframe to only contain the species that we care about
  data_subset <- 
    penguins %>% 
    dplyr::filter(species == penguin_sp) 
  # note you don't need quotation marks around penguin_sp, because it is an object that
  # already holds the species name as a character string
  
  # Now we can specify the dataframe into our empty list
  penguins_list[[penguin_sp]] <- 
    # We'll use rbind() to bind the rows together of two separate dataframes
    rbind(data_subset %>% 
             dplyr::slice_min(flipper_length_mm), # takes the minimum value
          data_subset %>% 
            dplyr::slice_max(flipper_length_mm)) # takes the maximum value
  
  # We can also leave a nice message for ourselves to know that it's working
  cat(paste('Finishing species:', penguin_sp, '\n', sep = ' '))
  
}
## Finishing species: Adelie 
## Finishing species: Gentoo 
## Finishing species: Chinstrap

If we look at each of the dataframes we’ve created now:

penguins_list[[1]]
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Adelie     Biscoe               37.9           18.6                172         3150  female  2007
Adelie     Torgersen            44.1           18.0                210         4000  male    2009

penguins_list[[2]]
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Gentoo     Biscoe               48.4           14.4                203         4625  female  2009
Gentoo     Biscoe               54.3           15.7                231         5650  male    2008

penguins_list[[3]]
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Chinstrap  Dream                46.1           18.2                178         3250  female  2007
Chinstrap  Dream                49.0           19.6                212         4300  male    2009

And now, we can bind them all together into a single dataframe

penguins_species <- 
  do.call(rbind, penguins_list)

penguins_species
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Adelie     Biscoe               37.9           18.6                172         3150  female  2007
Adelie     Torgersen            44.1           18.0                210         4000  male    2009
Gentoo     Biscoe               48.4           14.4                203         4625  female  2009
Gentoo     Biscoe               54.3           15.7                231         5650  male    2008
Chinstrap  Dream                46.1           18.2                178         3250  female  2007
Chinstrap  Dream                49.0           19.6                212         4300  male    2009
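As an alternative to do.call(rbind, ...), dplyr::bind_rows() also accepts a list of dataframes. The sketch below uses a made-up `toy_list` rather than the real penguins_list, and the column name `list_name` is an arbitrary choice:

```r
library(dplyr)

# Hypothetical toy list standing in for penguins_list
toy_list <- list(
  Adelie = data.frame(flipper_length_mm = c(172, 210)),
  Gentoo = data.frame(flipper_length_mm = c(203, 231))
)

# .id adds a column holding the name of the list element each row came from
toy_bound <- dplyr::bind_rows(toy_list, .id = 'list_name')

toy_bound
```

This can be handy when the list names carry information that is not already stored as a column in each dataframe.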

Let’s see how we can incorporate an if else statement here. Let’s say that for Adelie penguins, we only want the minimum and maximum for males, but for Gentoo and Chinstrap we want females. The backbone of this loop is very similar to the one that we just did, but we’ll include an if else statement at the top

# First, create an empty list that we'll fill from the for loop
penguins_list <- list()

# Now we'll run through the loop.
for (penguin_sp in unique(penguins$species)) {
  
  # We'll put the if else statement here
  # We'll create a new object called penguin_sex, which we will feed into the filter() function below
  if (penguin_sp == 'Adelie') 
    penguin_sex <- 'male' # we want males for Adelie
  else 
    penguin_sex <- 'female' # females for Gentoo and Chinstrap
  
  # First, we'll filter the dataframe to only contain the species that we care about
  data_subset <- 
    penguins %>% 
    dplyr::filter(species == penguin_sp,
                  # now we'll filter out the sex
                  sex == penguin_sex) 
  
  # Now we can specify the dataframe into our empty list
  penguins_list[[penguin_sp]] <- 
    # We'll use rbind() to bind the rows together of two separate dataframes
    rbind(data_subset %>% 
             dplyr::slice_min(flipper_length_mm), # takes the minimum value
          data_subset %>% 
            dplyr::slice_max(flipper_length_mm)) # takes the maximum value
  
  # We can also leave a nice message for ourselves to know that it's working
  cat(paste('Finishing species:', penguin_sp, '\n', sep = ' '))
  
}
## Finishing species: Adelie 
## Finishing species: Gentoo 
## Finishing species: Chinstrap

penguins_species <- 
  do.call(rbind, penguins_list)

penguins_species
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Adelie     Dream                37.2           18.1                178         3900  male    2007
Adelie     Torgersen            44.1           18.0                210         4000  male    2009
Gentoo     Biscoe               48.4           14.4                203         4625  female  2009
Gentoo     Biscoe               46.9           14.6                222         4875  female  2009
Chinstrap  Dream                46.1           18.2                178         3250  female  2007
Chinstrap  Dream                43.5           18.1                202         3400  female  2009

Too easy! This is a pretty basic way of using for loops for data wrangling, but once you understand this, everything becomes much easier!

Plotting

Now, let’s go through an example of how to generate plots. The premise is basically the same, but using if else statements makes this process much easier. I typically use for loops for plots when I need to generate a bunch of plots (> 2) that are mostly the same, but different in their dimensions, colours, etc. There are instances where ggplot2::facet_wrap or facet_grid don’t work well with patchwork to bind multiple plots together, so I find it’s easier to just generate multiple plots. Also, when you use facet_wrap, you can’t easily add individual plot tags to each subplot (you have to do this roundabout method of creating a separate dataframe of x and y positions with the plot label and feeding it into geom_text), which may not conform to journal submission guidelines.

For each species, we’re going to generate scatterplots of bill length as a function of body mass, but we want to colour the points by year. We also want to fit a trend line, but we want the colour of each trend line to be different for each species as well. By default, ggplot2 only supports a single colour scale per plot, so it can’t easily handle two separate colour grouping factors (i.e. species and year). Instead, we’ll generate the plots individually.

In the end, let’s try to generate three plots stacked on top of each other. There are a couple of things we need to keep in mind (covered sequentially in the workshop, but not here).

  1. We only need an x axis title on the bottom plot and a y axis title in the middle plot.
  2. We only need a legend on one plot (not all)
  3. We need to separately specify the plot colours for the fit lines
  4. We need to separately specify the plot tag levels

# Again, we'll first specify an empty list outside the loop
plot_list <- list()

# Now, we'll go through the loop to generate the plots
for (penguin_sp in unique(penguins$species)) {
  
  # This is where we'll specify the plot-specific details
  if (penguin_sp == 'Adelie') {
    
    x_axis_title <- '' # x axis title name
    y_axis_title <- '' # y axis title name
    legend_position <- c(0.1, 0.85) # legend position
    penguin_colour <- 'dodgerblue' # colour of the trend line
    penguin_tag <- '(a) Adelie' # plot tag level
    
  } else if (penguin_sp == 'Gentoo') {
    
    x_axis_title <- '' 
    y_axis_title <- 'Bill length (mm)'
    legend_position <- 'none'
    penguin_colour <- 'orange'
    penguin_tag <- '(b) Gentoo'
    
  } else {
    
    x_axis_title <- 'Body mass (g)' 
    y_axis_title <- ''
    legend_position <- 'none'
    penguin_colour <- 'darkgreen'
    penguin_tag <- '(c) Chinstrap'
    
  }
  
  plot_list[[penguin_sp]] <- 
    penguins %>% 
    # First let's filter out the species we want
    dplyr::filter(species == penguin_sp) %>% 
    # remove missing values
    drop_na() %>% 
    # now let's generate the plot
    ggplot(aes(x = body_mass_g, y = bill_length_mm)) +
    # the points - convert year to a factor
    geom_point(aes(colour = factor(year))) +
    # the fitline - we'll just make it a straight line with no standard error
    geom_smooth(colour = penguin_colour, linewidth = 2, se = FALSE, 
                method = 'lm', formula = 'y ~ x') +
    # Specify the colours we want for year
    scale_colour_manual(values = c('black', 'grey40', 'grey80')) +
    # Add in the different titles
    labs(x = x_axis_title,
         y = y_axis_title,
         colour = 'Year',
         title = penguin_tag) +
    # Specify a theme
    theme_classic() +
    theme(plot.title = element_text(face = 'bold', # make title bold
                                    hjust = 0), # align text to the left
          legend.position = legend_position)
  
}
plot_list[[1]]

plot_list[[2]]

plot_list[[3]]

Now we’ll bind all the plots together using patchwork

p <- 
  plot_list[[1]] + plot_list[[2]] + plot_list[[3]] +
  plot_layout(ncol = 1)


p

And there you go! The best part about generating individual plots is that, when combined with patchwork, you can really customise how the plot will look. By controlling the relative sizes of all the plots, you can emphasise the importance of some over others (such as Fig. 3) or you can create overlapping plots with a negative plot_spacer() (such as Fig. 2).
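As a rough sketch of controlling relative sizes, the heights argument of plot_layout() sets the relative height of each row. The three placeholder plots below use the built-in mtcars data rather than the penguin plots, and the 2:1:1 ratio is an arbitrary choice:

```r
library(ggplot2)
library(patchwork)

# Three placeholder plots built from a built-in dataset (not the penguin plots)
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point()
p3 <- ggplot(mtcars, aes(x = disp, y = mpg)) + geom_point()

# heights gives the relative height of each row: the top plot is twice as tall
p_sized <- 
  p1 + p2 + p3 +
  plot_layout(ncol = 1, heights = c(2, 1, 1))

p_sized
```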

Bootstrapping

The final use of for loops that we’re going to cover here is bootstrapping. Bootstrapping is the process by which you are resampling a dataset/model/etc. to create many simulated samples. For example, if you wanted to build a predictive model, you could use a cross-validation technique where you build your model on, say, 80% of your data and you predict its accuracy against the remaining 20%. Ideally, the first 80% of your data would have been chosen randomly. To assess the predictive accuracy, you could bootstrap this process multiple times and assess how good your model is at predicting. Similarly, a lot of machine learning algorithms these days (e.g. boosted regression trees) have a stochastic model-building process and can produce slightly different results based on the randomly chosen starting point. Therefore, these models are also typically bootstrapped across multiple iterations. Finally, people who are running simulations (such as the calculation of fish productivity), would bootstrap their simulations to assess how robust a set of outcomes is.
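To make the resampling idea concrete, here is a minimal sketch of a classic bootstrap, which resamples with replacement. Note that this is a different flavour from the repeated 80/20 splits used in the rest of this section, and the values in `x` are made up for illustration.

```r
set.seed(42)

# Hypothetical measurements, made up for this sketch
x <- c(181, 186, 195, 193, 190, 181, 195, 182, 191, 198)

n_boot <- 1000
boot_means <- numeric(n_boot) # pre-allocate the results vector

for (i in seq_len(n_boot)) {
  
  # Resample the data with replacement, keeping the same sample size
  resample <- sample(x, size = length(x), replace = TRUE)
  boot_means[i] <- mean(resample)
  
}

# Spread of the resampled means
quantile(boot_means, probs = c(0.025, 0.975))
```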

Here, we’re going to go with the first example and just run multiple linear models with a subset of the penguins dataset and predict on the remaining amount. Because this is not a workshop about linear modelling (check out our Everything you need to know about linear modelling workshop), we are just going to run a linear model of: $Bill~length \sim Body ~ mass$

So, the steps that we’re going to follow are:

  1. Randomly sample 80% of the dataset
  2. Build a model on the 80%
  3. Predict the bill length of the remaining 20%
  4. Model the predicted against the observed
  5. Extract the $r^2$ from each iteration

We are also going to include a progress bar so we can keep track of the progress of our bootstraps. This shouldn’t take too long with our data here, but some models take quite a while to run and this will be very helpful.
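On its own, the progress bar pattern looks roughly like this. In this sketch, `n_iter` and `pb_demo` are hypothetical names and Sys.sleep() stands in for a slow model fit:

```r
n_iter <- 20 # hypothetical number of iterations

# Create the bar before the loop starts
pb_demo <- txtProgressBar(min = 0, max = n_iter, style = 3)

for (i in seq_len(n_iter)) {
  
  Sys.sleep(0.01)               # stand-in for a slow model fit
  setTxtProgressBar(pb_demo, i) # update the bar to the current iteration
  
}

close(pb_demo) # close the bar once the loop is done
```

The bar is created once outside the loop, updated once per iteration, and closed at the end.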

# First, we're going to generate a row number ID so we can easily separate the training set
# from the test set
penguins_data <- 
  penguins %>% 
  mutate(ID = row_number())

# This time, instead of creating an empty list, we'll create a dataframe with a column designating
# the bootstrap iteration and another for the predicted R2 value
r2_data <- 
  tibble(iteration = 1:1000,
         r2 = NA_real_)

# Now to code the progress bar
pb <- txtProgressBar(min = 0, max = nrow(r2_data), style = 3)
##   |                                                                      |   0%

# Ok, now let's run the for loop
for (i in 1:1000) {
  
  # Because we are randomly sampling a portion of the data, we set a seed at the beginning of
  # each iteration. That way, every iteration is reproducible on its own. Note: the seed has to
  # change between iterations (here we use i). If you set the same seed inside the loop every
  # time, all your samples (and R2 values) would be identical!
  set.seed(i)
  
  # First, we'll separate our data into a training set and a test set
  # The training set:
  data_train <- 
    penguins_data %>% 
    sample_frac(0.8)
  
  # The test set
  data_test <- 
    anti_join(penguins_data, data_train, by = 'ID')
  
  # Now we'll create the model
  penguin_model <- lm(bill_length_mm ~ body_mass_g, data_train)
  
  # Create a new dataframe where we'll add the predictions to the original data
  data_predicted <- 
    data_test %>% 
    mutate(predicted_bill = stats::predict(penguin_model,
                                           newdata = tibble(body_mass_g = data_test$body_mass_g)))
  
  # Now, we'll run a model between our predicted and our observed values and extract the R2
  pred_mod <- lm(predicted_bill ~ bill_length_mm, data_predicted)
  
  # Finally, we'll infill the dataframe with the r2
  r2_data$r2[i] <- summary(pred_mod)$r.squared
  
  # Don't forget the progress bar!
  setTxtProgressBar(pb, i)
  
}
                                                      |==========================                                            |  38%  |                                                                              |===========================                                           |  38%  |                                                                              |===========================                                           |  39%  |                                                                              |============================                                          |  39%  |                                                                              |============================                                          |  40%  |                                                                              |============================                                          |  41%  |                                                                              |=============================                                         |  41%  |                                                                              |=============================                                         |  42%  |                                                                              |==============================                                        |  42%  |                                                                              |==============================                                        |  43%  |                                                                              |==============================                                        |  44%  |                                                                              |===============================                                       |  44%  |                                                                              |===============================                  
                     |  45%  |                                                                              |================================                                      |  45%  |                                                                              |================================                                      |  46%  |                                                                              |=================================                                     |  46%  |                                                                              |=================================                                     |  47%  |                                                                              |=================================                                     |  48%  |                                                                              |==================================                                    |  48%  |                                                                              |==================================                                    |  49%  |                                                                              |===================================                                   |  49%  |                                                                              |===================================                                   |  50%  |                                                                              |===================================                                   |  51%  |                                                                              |====================================                                  |  51%  |                                                                              |====================================                                  |  52%  |                                                                          
    |=====================================                                 |  52%  |                                                                              |=====================================                                 |  53%  |                                                                              |=====================================                                 |  54%  |                                                                              |======================================                                |  54%  |                                                                              |======================================                                |  55%  |                                                                              |=======================================                               |  55%  |                                                                              |=======================================                               |  56%  |                                                                              |========================================                              |  56%  |                                                                              |========================================                              |  57%  |                                                                              |========================================                              |  58%  |                                                                              |=========================================                             |  58%  |                                                                              |=========================================                             |  59%  |                                                                              |==========================================                            |  59%  |                    
                                                          |==========================================                            |  60%  |                                                                              |==========================================                            |  61%  |                                                                              |===========================================                           |  61%  |                                                                              |===========================================                           |  62%  |                                                                              |============================================                          |  62%  |                                                                              |============================================                          |  63%  |                                                                              |============================================                          |  64%  |                                                                              |=============================================                         |  64%  |                                                                              |=============================================                         |  65%  |                                                                              |==============================================                        |  65%  |                                                                              |==============================================                        |  66%  |                                                                              |===============================================                       |  66%  |                                                                              
|===============================================                       |  67%  |                                                                              |===============================================                       |  68%  |                                                                              |================================================                      |  68%  |                                                                              |================================================                      |  69%  |                                                                              |=================================================                     |  69%  |                                                                              |=================================================                     |  70%  |                                                                              |=================================================                     |  71%  |                                                                              |==================================================                    |  71%  |                                                                              |==================================================                    |  72%  |                                                                              |===================================================                   |  72%  |                                                                              |===================================================                   |  73%  |                                                                              |===================================================                   |  74%  |                                                                              |====================================================                  |  74%  |                        
                                                      |====================================================                  |  75%  |                                                                              |=====================================================                 |  75%  |                                                                              |=====================================================                 |  76%  |                                                                              |======================================================                |  76%  |                                                                              |======================================================                |  77%  |                                                                              |======================================================                |  78%  |                                                                              |=======================================================               |  78%  |                                                                              |=======================================================               |  79%  |                                                                              |========================================================              |  79%  |                                                                              |========================================================              |  80%  |                                                                              |========================================================              |  81%  |                                                                              |=========================================================             |  81%  |                                                                              
|=========================================================             |  82%  |                                                                              |==========================================================            |  82%  |                                                                              |==========================================================            |  83%  |                                                                              |==========================================================            |  84%  |                                                                              |===========================================================           |  84%  |                                                                              |===========================================================           |  85%  |                                                                              |============================================================          |  85%  |                                                                              |============================================================          |  86%  |                                                                              |=============================================================         |  86%  |                                                                              |=============================================================         |  87%  |                                                                              |=============================================================         |  88%  |                                                                              |==============================================================        |  88%  |                                                                              |==============================================================        |  89%  |                        
                                                      |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              |===================================================================   |  96%  |                                                                              
|====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%
close(pb) # it's good practice to close the progress bar

So now, if we plot the output, r2_data, we can see the predictive accuracy of our model:

ggplot(r2_data, aes(x = r2)) +
  geom_histogram(bins = 40)

We have an $r^2$ of 0.36, which is not great for a predictive model… but that’s not our problem for this example!

To keep this example simple, we only extracted the $r^2$ from this model, but there are plenty of other metrics and values that you could pull out as well. A helpful tip, especially if your models take a while to run: when you’re first writing your code, don’t use the maximum number of bootstrapped iterations. You can usually find errors within 5-10 runs, and that way you’re not just sitting there waiting for your computer to finish running something incorrectly.
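
As a minimal sketch of that tip (this is not the model from this workshop), you can keep the number of iterations in a single variable and set it low while you debug. The example below assumes the built-in mtcars dataset and a simple lm() of mpg on wt purely for illustration; the names n_iter and r2_test are made up for this sketch.

```r
set.seed(42) # make the resampling reproducible

# Keep this small while debugging; raise it once the loop runs cleanly
n_iter <- 5

# Pre-allocate a vector to hold one r-squared value per iteration
r2_test <- numeric(n_iter)

for (i in seq_len(n_iter)) {

  # Resample the rows of the data with replacement
  boot_rows <- sample(nrow(mtcars), replace = TRUE)

  # Fit an illustrative model to the resampled data
  boot_fit <- lm(mpg ~ wt, data = mtcars[boot_rows, ])

  # Store the r-squared for this iteration
  r2_test[i] <- summary(boot_fit)$r.squared
}

r2_test
```

Once everything works with 5 iterations, changing n_iter is the only edit you need to make before the full run.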


Nested for loops

In this final section, we’re going to go through a pretty simple nested for loop (i.e. a loop within a loop - loopception if you will). These can get pretty ugly pretty fast, so it’s best to break things down into small tasks and troubleshoot as you code.

Let’s say we want the minimum and maximum bill length for each species of penguin, for each year that they were sampled. The key here is that every species was sampled in every year - if you have gaps, then you would need to introduce if statements with next or break. Remember when we would specify empty lists and empty dataframes outside the loop? Well, when you’re nesting, you basically do the same thing - we can designate objects inside the first loop but outside the second. Tip: make sure you keep track of which iterating object belongs to which loop - mixing up i with j can really mess things up.

# We'll designate an empty list to infill with species-level data
species_list <- list()

# Now we'll run the loops. The outer loop is the species-level loop.
# The inner loop is the year-level loop
for (i in unique(penguins$species)) {
  
  # First, we'll subset the data at the species level
  species_subset <- 
    penguins %>% 
    dplyr::filter(species == i)
  
  # We will also specify a separate list to infill with year-specific data
  year_list <- list()
  
  # now we'll run the year-specific loop
  for (j in unique(penguins$year)) {
    
    # Now we'll subset the species_subset dataframe by year
    year_subset <- 
      species_subset %>% # take the dataframe from the species loop
      dplyr::filter(year == j)
    
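    # Now we can store the min/max rows in the empty list above. We index by
    # name (as.character(j)) so the list isn't padded with empty slots up to 2009
    year_list[[as.character(j)]] <- 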
      rbind(year_subset %>% 
              dplyr::slice_min(bill_length_mm), # take the minimum
            year_subset %>% 
              dplyr::slice_max(bill_length_mm)) # take the maximum
  }
  
  # Now, we'll bind together the year-level dataframes into our empty list that
  # we set outside
  species_list[[i]] <- 
    do.call(rbind, year_list)
  
  # Let's leave a nice message for ourselves
  cat(paste('Finishing species:', i, '\n', sep = ' '))
  
}
## Finishing species: Adelie 
## Finishing species: Gentoo 
## Finishing species: Chinstrap
species_list[[1]]
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Adelie     Torgersen            34.1           18.1                193         3475  NA      2007
Adelie     Torgersen            46.0           21.5                194         4200  male    2007
Adelie     Dream                33.1           16.1                178         2900  female  2008
Adelie     Torgersen            45.8           18.9                197         4150  male    2008
Adelie     Dream                32.1           15.5                188         3050  female  2009
Adelie     Biscoe               45.6           20.3                191         4600  male    2009
species_list[[2]]
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Gentoo     Biscoe               40.9           13.7                214         4650  female  2007
Gentoo     Biscoe               59.6           17.0                230         6050  male    2007
Gentoo     Biscoe               42.6           13.7                213         4950  female  2008
Gentoo     Biscoe               54.3           15.7                231         5650  male    2008
Gentoo     Biscoe               41.7           14.7                210         4700  female  2009
Gentoo     Biscoe               55.9           17.0                228         5600  male    2009
species_list[[3]]
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Chinstrap  Dream                42.4           17.3                181         3600  female  2007
Chinstrap  Dream                58.0           17.8                181         3700  female  2007
Chinstrap  Dream                40.9           16.6                187         3200  female  2008
Chinstrap  Dream                54.2           20.8                201         4300  male    2008
Chinstrap  Dream                42.5           17.3                187         3350  female  2009
Chinstrap  Dream                55.8           19.8                207         4000  male    2009
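
One detail worth knowing when you fill a list inside a loop with a number like a year: indexing a list with a large number pads it with empty slots, while indexing with a character name does not. Here is a small sketch in base R; the object names by_number and by_name are made up for illustration.

```r
# Indexing an empty list by the number 2007 creates 2007 slots
by_number <- list()
by_number[[2007]] <- "first year"
length(by_number) # 2007 (every slot before the last one is NULL)

# Indexing by the character "2007" creates a single named slot
by_name <- list()
by_name[[as.character(2007)]] <- "first year"
length(by_name) # 1
names(by_name)  # "2007"
```

do.call(rbind, ...) skips the empty slots, so both approaches give the same bound dataframe, but the named version is easier to inspect and troubleshoot.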

And now, we can bind them all together into a single dataframe

penguins_year <- 
  do.call(rbind, species_list)

penguins_year
species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Adelie     Torgersen            34.1           18.1                193         3475  NA      2007
Adelie     Torgersen            46.0           21.5                194         4200  male    2007
Adelie     Dream                33.1           16.1                178         2900  female  2008
Adelie     Torgersen            45.8           18.9                197         4150  male    2008
Adelie     Dream                32.1           15.5                188         3050  female  2009
Adelie     Biscoe               45.6           20.3                191         4600  male    2009
Gentoo     Biscoe               40.9           13.7                214         4650  female  2007
Gentoo     Biscoe               59.6           17.0                230         6050  male    2007
Gentoo     Biscoe               42.6           13.7                213         4950  female  2008
Gentoo     Biscoe               54.3           15.7                231         5650  male    2008
Gentoo     Biscoe               41.7           14.7                210         4700  female  2009
Gentoo     Biscoe               55.9           17.0                228         5600  male    2009
Chinstrap  Dream                42.4           17.3                181         3600  female  2007
Chinstrap  Dream                58.0           17.8                181         3700  female  2007
Chinstrap  Dream                40.9           16.6                187         3200  female  2008
Chinstrap  Dream                54.2           20.8                201         4300  male    2008
Chinstrap  Dream                42.5           17.3                187         3350  female  2009
Chinstrap  Dream                55.8           19.8                207         4000  male    2009
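
The nested loop above relied on every species being sampled in every year. If your data has gaps, here is a rough sketch of how a next statement could skip the empty combinations. It uses base R and a tiny made-up dataframe (toy, where species "B" has no records for 2008), so the names and values are purely hypothetical.

```r
# Hypothetical toy data: species "B" was not sampled in 2008
toy <- data.frame(
  species = c("A", "A", "A", "A", "B", "B"),
  year    = c(2007, 2007, 2008, 2008, 2007, 2007),
  bill    = c(34.1, 46.0, 33.1, 45.8, 40.9, 59.6)
)

# Empty list to infill with one summary row per species-year combination
toy_list <- list()

for (i in unique(toy$species)) {
  for (j in unique(toy$year)) {

    # Subset to the current species (i) and year (j)
    toy_subset <- toy[toy$species == i & toy$year == j, ]

    # If there is no data for this combination, skip to the next year
    if (nrow(toy_subset) == 0) {
      next
    }

    # Otherwise, store the minimum and maximum under a unique name
    toy_list[[paste(i, j, sep = "_")]] <- data.frame(
      species  = i,
      year     = j,
      min_bill = min(toy_subset$bill),
      max_bill = max(toy_subset$bill)
    )
  }
}

# Bind everything together; there is no row for species "B" in 2008
toy_ranges <- do.call(rbind, toy_list)
toy_ranges
```

Swapping next for break would instead stop the inner loop entirely at the first gap, so the remaining years for that species would be skipped too.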

Congratulations! You have now learned how to implement for loops in R! There are many other places where you can use for loops, but hopefully this tutorial gave you a baseline understanding and some common implementation techniques!