The Impact of Solving Adaptive Parsons Problems with Common and Uncommon Solutions

CCS Concepts: • Human-centered computing → Field studies; • Applied computing → Interactive learning environments; • Applied computing → Computer-assisted instruction; • Human-centered computing → HCI theory, concepts and models; • Human-centered computing → Human computer interaction (HCI);

ACM Reference Format:
Carl Haynes-Magyar and Barbara Ericson. 2022. The Impact of Solving Adaptive Parsons Problems with Common and Uncommon Solutions. In Koli Calling '22: 22nd Koli Calling International Conference on Computing Education Research (Koli 2022), November 17–20, 2022, Koli, Finland. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3564721.3564736

1 INTRODUCTION

Computing education theorists hypothesize that novice programmers need explicit and incremental instruction to develop at least four basic skills: code-reading and tracing, code-writing, pattern comprehension, and pattern application [84]. Novice programmers should be taught explicitly about “stereotypical solutions to programming problems as well as strategies for coordinating and composing them” [75, p. 850]. To become proficient at computer programming, learners must recognize and apply programming patterns/solutions (i.e., higher-level, reusable abstractions of code) [53, 82, 84]. However, the acquisition and retention of these skills (i.e., academic growth or improvement and expertise) depend on the quality and quantity of deliberate practice [1, 26]. Time imposes limits on what can be learned, and the goal is to maintain desirable difficulties during programming practice [24]. It is therefore important to maximize learning efficiency for each learner. There is evidence that Parsons problems can be more efficient to solve than writing the equivalent code, but not always; it is important to investigate when a Parsons problem is more efficient to solve than writing the equivalent code.

Traditional forms of introductory computer programming practice, such as code-tracing, in which students use paper and pencil to hand trace the execution of a program [44], and code-writing, which requires students to write code from scratch, are time-intensive, frustrating, and can decrease students’ engagement and motivation [4, 72]. Instead, drag-and-drop block-based coding exercises such as Parsons problems, also called Parsons Programming Puzzles, are increasingly being used to introduce learners to computer programming concepts and patterns/solutions [18, 71]. Recently, Weinman et al. [82] found that students were more likely to acquire a pattern when first exposed to it as a Parsons problem (or code-tracing problem) rather than as a write-code problem. Furthermore, historically underrepresented minorities and females perform better on block-based than on text-based problems, likely because they have less prior programming experience [39, 83].

Our prior research revealed that an adaptive Parsons problem with an uncommon solution was not significantly more efficient to solve than writing the equivalent code [34]. However, students who solved the Parsons problem first were more likely to use the uncommon solution when they later wrote the equivalent code [34]. In this study, we tested hypotheses based on the commonality of Parsons problems solutions and the order in which they were solved. This work has implications for how Parsons problems are generated and how we can best support adaptive learning [14, 63].

Our prior research also revealed that self-reported cognitive load ratings were lower for Parsons problems than for equivalent write-code problems and that problem-solving efficiency positively correlated with cognitive load for some write-code problems [34]. Prior research shows that Parsons problems impact not only cognitive learning outcomes but also behavioral [17, 23, 60, 71] and affective [17, 21, 71] ones. In this study, we explored the impact of solving each problem type on cognitive load ratings. Analyzing the relationships between measures such as problem-solving efficiency and cognitive load ratings may improve how adaptive Parsons problems are sequenced [9, 20, 61]. Currently, inter-problem (between-problem) adaptation only changes the difficulty of a succeeding Parsons problem based on a learner's prior performance. Our research questions and hypotheses were:

RQ1. What are the effects on efficiency of solving adaptive Parsons problems created from the most common student-written solution or an uncommon solution versus writing the equivalent code? What are the order effects?
H1. If a Parsons problem with an unusual solution is modified to use the most common student-written solution, then students will be more efficient at solving it.
H2. If students are first presented with a Parsons problem that has an uncommon solution, then they will be more likely to use that solution to solve an equivalent write-code problem than students who write the code first.
RQ2. What is the effect on self-reported cognitive load ratings of solving adaptive Parsons problems versus solving equivalent write-code problems?
RQ3. Why do students struggle to solve Parsons problems with uncommon solutions?

To answer the research questions we conducted a mixed within-between-subjects experiment. We wanted to (1) investigate how common and uncommon Parsons problem solutions mediate problem-solving efficiency and (2) explore any order effects (i.e., how the order of the conditions affected students’ solution acquisition). We chose a within-subjects design since problem-solving efficiency and cognitive load are dependent on the learner and thus subject to intra-individual variation [50]. We used OverCode, a system for visualizing and clustering programming solutions, to determine commonality [31].

Our study resulted in several findings: 1) Students were significantly more efficient at solving Parsons problems with a common solution versus an uncommon solution, 2) Students first presented with a Parsons problem that had an uncommon solution tended to use that solution to solve the equivalent write-code problem, 3) There was a significant difference in cognitive load ratings for students who solved the modified Parsons problem first versus those who wrote the equivalent code first, and 4) Students may struggle to solve Parsons problems with uncommon solutions because they need help with planning, self-regulated learning, and understanding distractor blocks.

2 RELATED WORK

Our experiment draws on research about Parsons problems, problem-solving efficiency, and cognitive load.

2.1 Parsons Problems

Parsons problems are a type of code completion problem that require learners to place mixed-up code blocks in the correct order; they can vary by dimensions, fill-in-the-blank variable options, feedback, adaptation, and the use of distractors [18, 21, 82]. They can also be used as formative or summative assessments [16, 24]. There is evidence that these kinds of problems can improve problem-solving efficiency, lower cognitive load, maximize engagement, and help teachers identify where students are struggling [16, 18, 34, 60, 64, 88]. Yet, “Parsons problems can be perceived as difficult because they require students to read code written by others (using syntax and logic that might not be in their personal comfort zone)” [16, p. 7]. These problems typically only have one correct solution while there are many ways to write code [60]. Furthermore, some advanced learners want harder programming problems while novices struggle with the exact same problems [23, 24, 25]. Studies have provided evidence that solving Parsons problems can lead to learning gains similar to writing the equivalent code, but in significantly less time [22, 24, 88].
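The core mechanics described above, mixed-up code blocks that must be arranged in the single correct order while distractor blocks go unused, can be sketched as a toy model. This is an illustration only, not the Runestone implementation; the `ParsonsProblem` class and the diffMaxMin block text are stand-ins:

```python
from dataclasses import dataclass, field

@dataclass
class ParsonsProblem:
    """Toy model: each block is a (code, indent_level) pair; an
    attempt is correct iff it uses no distractors and matches the
    expected block order and indentation exactly."""
    solution: list                       # ordered (code, indent) pairs
    distractors: set = field(default_factory=set)

    def is_correct(self, attempt):
        # Any distractor block in the arrangement makes it wrong.
        if any(code in self.distractors for code, _ in attempt):
            return False
        # Otherwise the order and indentation must match exactly.
        return attempt == self.solution

problem = ParsonsProblem(
    solution=[("def diff_max_min(nums):", 0),
              ("return max(nums) - min(nums)", 1)],
    distractors={"return max(nums) + min(nums)"},
)

print(problem.is_correct([("def diff_max_min(nums):", 0),
                          ("return max(nums) - min(nums)", 1)]))  # True
```

The single-solution check above also illustrates the tension noted in [60]: there are many ways to write the code, but only one arrangement is accepted.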

Parsons problems with and without adaptation impact affective, behavioral, and cognitive (ABC) learning outcomes [16, 18, 21, 22, 24, 34, 60, 64, 71, 82, 88]. Cognitive learning outcomes correspond to changes in cognitive abilities and resources (e.g., efficiency or time-on-task and cognitive load) [66]. Behavioral learning outcomes correspond to changes in engagement, study skills, etc. [66]. Affective learning outcomes correspond to changes in attitudes and motivation (e.g., self-efficacy) [66]. With regard to cognitive learning outcomes, studies provide evidence that it is significantly more efficient to solve Parsons problems with adaptation than to solve equivalent write-code problems [34, 71, 88] and equally effective for learning gains [22]. However, our previous study provided evidence that students are not more efficient at solving a Parsons problem with an uncommon solution than at writing the equivalent code [34]. It also showed that students use uncommon Parsons problem solutions to solve equivalent write-code problems when they are first exposed to the uncommon Parsons solution before writing the equivalent code [34], similar to the results from [82]. In this study, we explored hypotheses based on our prior research to better understand the impact of problems designed with common versus uncommon solutions.

Ericson [22] developed two types of adaptation for Parsons problems to keep students in Vygotsky's zone of proximal development [81]. This zone represents the difference between what learners can do autonomously and what learners can do with support [81]. The adaptability of the system was designed to support learners’ individual differences in knowledge acquisition, support optimal cognitive load, and improve affect (i.e., emotions and self-efficacy while learning how to program) [21, 22]. The goal is to maintain desirable difficulties and reduce or eliminate undesirable difficulties during programming practice [22, 24, 85]. Adaptation can increase learning efficiency and engagement [12].

Intra-problem (same problem) adaptation is learner-initiated; it occurs when the learner clicks the “Help Me” button which then removes a distractor block or combines two blocks into one [21]. This button also used to provide indentation before combining blocks [21], but that was removed after several research studies provided evidence that this confused learners [21, 34]. Inter-problem (between problem) adaptation is system-initiated; it occurs when the system modifies the difficulty of the next Parsons problem based on the learner's preceding Parsons problem performance. It does this by removing distractors and pairing distractors with the correct code (making it easier) or by adding distractors and jumbling them with correct code blocks (making it harder) [21].
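The inter-problem adaptation just described can be sketched as follows. The function name, signature, and shuffling details are ours, a hypothetical illustration rather than the system's actual code:

```python
import random

def next_problem_blocks(correct_blocks, distractor_pool, solved_unaided):
    """Illustrative sketch of inter-problem adaptation (not the actual
    Runestone algorithm): after an unaided success, the next problem
    gets harder (distractors jumbled in with the correct blocks);
    after a struggle, it gets easier (distractors removed)."""
    if solved_unaided:
        blocks = correct_blocks + distractor_pool   # harder
    else:
        blocks = list(correct_blocks)               # easier: no distractors
    random.shuffle(blocks)   # learners always see the blocks mixed up
    return blocks
```

In the real system, easing a problem can also pair each distractor with its correct counterpart rather than removing it outright [21]; the sketch collapses that case for brevity.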

In contrast, intelligent tutoring systems can be described as having an inner loop and an outer loop [80]. The inner loop provides feedback and hints during a task while the outer loop uses information from the inner loop (i.e., performance on the task) to select the next task. This is known as problem-sequencing [43, 80]. In this study, we also explored the relationship between efficiency and cognitive load ratings. By exploring these relationships, we may find better ways to adapt Parsons problems [43].

Time is a limited human resource and it can take some students twice as long to learn what other students learn in less time [6]. Spending more time learning the same material to achieve the same performance as one's peers can also leave learners feeling frustrated and decrease their motivation to learn [6].

Educational researchers both within and outside of computing education have used the term ‘time-on-task’ to describe the amount of time students spend on learning [6, 45, 68, 76]. Both efficiency and time-on-task measures can be used to build models for adaptive learning technologies, but the different strategies for computing efficiency and estimating time-on-task affect the accuracy of how learning is measured [41]—especially for programming tasks [45]. Furthermore, learners often engage in activities that are not related to learning during a learning task and researchers account for these gaps differently [7, 37, 41, 67].

In this study, efficiency (or time-on-task) for each problem was measured as the time to the first correct solution. Our estimation heuristic removed any gap greater than five minutes between the last recorded timestamp and the current timestamp for a problem, as well as any time spent on other problems [8]. We chose five minutes because no interaction for more than five minutes likely meant that the student took a break, especially since the largest median time to solve any of the problems was less than seven minutes (420 seconds), as seen in Table 3. We controlled for effectiveness by only including in our analysis task completion times for solutions that were 100% correct. Adaptive Parsons problem solutions were 100% correct if all of the blocks were in the right order with the right level of indentation and did not include distractor blocks. Write-code solutions were 100% correct if they passed all of the unit tests. We did not assess the effectiveness/quality of write-code solutions using criteria such as those used by [28]; we used students’ final correct solutions regardless of how many attempts they made. We analyzed the commonality of written student solutions using OverCode, as shown in Figure 1. OverCode is a visualization system that clusters similar solutions to programming problems using both static and dynamic analyses [31].
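The gap-removal heuristic can be sketched as follows. The `(timestamp, is_correct)` event format is an assumption about the log schema, not the study's actual scripts:

```python
def time_to_first_correct(events, gap_limit=300):
    """Sketch of the efficiency heuristic: `events` is a time-ordered
    list of (timestamp_seconds, is_correct) interactions for ONE
    problem. Gaps longer than five minutes (300 s) are discarded as
    likely breaks, and timing stops at the first correct submission."""
    total, prev = 0, None
    for ts, is_correct in events:
        if prev is not None:
            gap = ts - prev
            if gap <= gap_limit:   # drop gaps > 5 min as off-task time
                total += gap
        prev = ts
        if is_correct:
            return total
    return None  # problem never solved correctly

# A two-minute gap is counted; an eleven-minute break is discarded:
print(time_to_first_correct([(0, False), (120, False), (780, True)]))  # 120
```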

Cognitive load describes the amount of information that working memory can hold and manipulate at one time; its capacity is limited (7 ± 2 items), especially for novel information [30]. Computing education researchers (CERs) and instructional design researchers interested in computer studies use knowledge of human cognitive architecture to optimize cognitive load and design effective instruction and assessment [15, 49, 64, 78]. The goal is to aid learners in the formation of accurate mental models and the development of computational thinking skills [64].

Educational psychologists and CERs recognize that cognitive load theory (CLT) supports either three categories (old CLT) or two categories (new CLT) of load, drawn from intrinsic, extraneous, and germane cognitive load; the first two are mediated by element interactivity, which germane resources are used to deal with [19, 55]. Other terms used to describe cognitive load include mental load and mental effort. “Mental load is imposed by the task or environmental demands” and “mental effort refers to the amount of capacity or resources that is actually allocated to accommodate the task demands” [57, p. 354].

Cognitive load can be measured indirectly, directly, subjectively, and through dual-task performance measures [42, 64]. Scientists have developed scales that are both unitary (combining categories) and differentiated (subscales for each category) [40, 77]. The most valid and reliable measure is the Paas scale [56]. Computing education researchers have developed a differentiated scale, the Cognitive Load Component Survey (CLCS) [51], but it has produced mixed results [33, 51, 87].

In this study, we used the Paas scale because prior research in computing education shows it is useful for understanding how much effort novice programmers invest in problem-solving [32, 33, 34]. For example, Harms et al. [32] found that novice programmers experienced higher cognitive load when solving programming puzzles versus when they worked through tutorials for identical problems. And, recently, we found self-reported cognitive load was less for solving Parsons problems with adaptation than solving equivalent write-code problems—although only significant for two problems without looking at order effects [34]. In this study, we analyze responses to the Paas scale with regard to order effects to explore how the order in which participants solve each problem type relates to their mental effort.

3 METHODS

3.1 Participants

We received institutional review board (IRB) approval to recruit participants from a post-secondary research institution in the northern Midwest of the United States. The participants were all enrolled in a data-oriented programming course in Python during the winter semester (between January and April) of 2021 (N = 144). This course is the second Python course for School of Information majors, although other majors take it as well. It requires prior programming experience. The course focuses on developing intermediate programming skills in Python and covers working with data from a variety of sources (strings, files, APIs, websites, and databases), object-oriented programming basics, regular expressions, debugging, testing, and SQL. Fifty-three percent of the students identified as female and 47% identified as male; 33% identified as Asian, 2% as Black, 6% as Hispanic, 46% as White, 4% as Multiracial, and 9% did not indicate their race. The ages ranged from 18 to 33 years old (M = 20 years old, SD = 1.49). Eleven percent were Computer Science (CS) majors, 3% were Data Science (DS) majors, 3% were Information Science (IS) majors, and 83% were majors in other disciplines. The average maximum American College Test (ACT) math score was 31 (SD = 3.42) on a scale ranging from 1 (low) to 36 (high). The average GPA was 3.721 (SD = 0.448).

3.2 Materials

We used two versions of a problem set from our previous study [34] with the exception of changing one Parsons problem solution that was unusual to match the most common student-written solution; this change is shown in Figure 2. The only difference between versions A and B was the problem type. The second problem in version A was a write-code problem as shown in Figure 3. In version B, the same problem was presented as a Parsons problem as shown in Figure 4. Each version included five sets of isomorphic problems in the order shown in Table 1. To alter the amount of expected cognitive load needed to solve each problem, the problems ranged in difficulty from easy to hard. The concepts covered include strings, lists, ranges, conditionals, loops, dictionaries, and functions. Some of the problems came from past Advanced Placement (AP) Computer Science (CS) A exams and some from the CodingBat website created by Nick Parlante of Stanford University [59]; they exemplify problems covered in introductory computer programming courses. The problems are available in Supplemental Material 1.1.

Table 1: Order of Problem Type by Version
Version Problem Type
A Parsons*, Write, Parsons, Write, Parsons
B Write, Parsons, Write, Parsons, Write
Note: The asterisk indicates the Parsons problem whose solution was changed to the most common student-written solution.

The Paas scale was administered after each problem in both versions [56]. It asked respondents to rate, on a 9-point Likert scale, how much mental effort they invested in solving the previous problem, from “very, very low mental effort” to “very, very high mental effort,” as shown in Supplemental Material 1.3.

4 MIXED WITHIN AND BETWEEN-SUBJECTS EXPERIMENT

4.1 Experimental Design

We conducted a mixed within and between-subjects experiment [10] to (1) test the hypothesis that students would be more efficient at solving adaptive Parsons problems with common solutions than writing the equivalent code, (2) explore how the order of completing each problem type affected students’ efficiency and solution acquisition, and (3) explore the relationships between efficiency and self-reported cognitive load ratings.

The first part of the experiment started on February 16th; students were randomly assigned to one version of the problem set (A or B). The second part was due March 10th; students completed the opposite version of the problem set. Students earned 10 lecture participation points for completing each version of the problem set and were asked to work individually. At the end of the course (week 15), participants were asked to complete both versions (A and B) of the problem set as an extra credit assignment. Students needed to earn 2,000 points or more during the course to receive an A+. Points were broken down as follows: homework (60 pts. each, 360 max), regular projects (not including the final project, 200 pts. each, 400 max), git commits (max 105 pts.), midterm exams (225 pts. each, 450 max), readings (max 80 pts.), final project (310 pts.), and discussion (100 pts.).

4.2 Participants

In this section, we report on data from the participants (n = 95) who correctly completed both the adaptive Parsons version and the equivalent write-code version of some or all of the problems; the number of such participants per problem ranged from 26 to 62, as shown in Tables 2 and 3.

4.3 Analysis

Task completion times were calculated for each problem using a Python script for adaptive Parsons problems (timeCorrectParsons.py) and a separate one for equivalent write-code problems (timeCorrectWriteCode.py).

Statistical analysis was performed using RStudio. We ran Wilcoxon Matched-Pairs Signed-Ranks tests to analyze the differences in the median times to the first correct solution within the groups, since this was a within-subjects study and the data violated assumptions of normality and equal variances [54]. We also report the mean and standard deviation for these differences in Supplemental Material 1.5 & 1.6 based on [48]. To do this, we used the ‘wilcox.test’ function from the stats R package. We also ran Mann–Whitney U tests between groups to explore order effects for certain problems. Finally, we (1) ran paired t-tests on cognitive load ratings to analyze the differences between problem types, (2) computed probability-based effect sizes [11, 69] using the canprot R package and Cohen's drm using the R package lsr, and (3) adjusted for multiple comparisons using Bonferroni's correction.
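The analysis was done in R; for readers unfamiliar with the V statistic that `wilcox.test` reports in Tables 2 and 3, a minimal standard-library Python sketch of that statistic (statistic only, no p-value) is:

```python
def wilcoxon_signed_ranks(x, y):
    """Compute V for the Wilcoxon matched-pairs signed-ranks test as
    R's wilcox.test(x, y, paired=TRUE) defines it: drop zero
    differences, rank the absolute differences (average ranks for
    ties), and sum the ranks of the positive differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current tie group in |d|.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1        # average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return sum(r for d, r in zip(diffs, ranks) if d > 0)

print(wilcoxon_signed_ranks([1, 2, 3], [0, 4, 1]))  # 3.5
```

A small V relative to n (as in Table 3) means few or low-ranked positive differences, i.e., the first series is systematically smaller than the second.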

4.4 Results and Discussion

First, we present both the within and between-subject study results to answer RQ1. Second, we explain how the within-subject study results answer RQ2. Then we discuss implications and future work related to these questions.

The task completion times for students who solved the adaptive Parsons problem version of a problem before solving the equivalent write-code version are shown in Table 2; the self-reported cognitive load ratings are shown in Table 4. The task completion times for students who solved the write-code version of a problem before solving the equivalent adaptive Parsons problem version are shown in Table 3; the self-reported cognitive load ratings are shown in Table 5. We used OverCode to confirm whether or not the solutions to the Parsons problems matched the most common student-written solutions; this is denoted by the equivalent symbol (≡). Problem one (has22), which we changed based on our previous study to represent the most common student-written solution, remained the most common. OverCode also confirmed that all of the Parsons problem solutions we presented to students matched the most common student-written solutions except for problem two (countInRange). Students who solved the write-code version of problem two (see Figure 1) before solving the equivalent Parsons problem used solutions that were different from the provided Parsons problem solution (see Figure 4). This problem asked students to finish creating a function that returned the number of times a specific number appeared in a list between the start and end indices (see Figure 3). There were several differences between the clusters of student solutions and the provided Parsons problem solution: 1) the student solutions used count += 1 instead of count = count + 1, 2) students tended not to declare a variable to hold the current value at the index, and 3) a large group of students (25) used a slice to get all the values between the start and end indices and then looped through those values instead of looping through the indices.
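The two solution styles can be reconstructed approximately from the differences listed above. The function names and the exact boundary semantics of `start` and `end` are our assumptions; the study's actual problem code is in its Supplemental Material:

```python
def count_in_range_parsons(nums, number, start, end):
    """Style of the provided Parsons solution: loop over the indices,
    name the current value, and use `count = count + 1`."""
    count = 0
    for i in range(start, end):
        value = nums[i]
        if value == number:
            count = count + 1
    return count

def count_in_range_common(nums, number, start, end):
    """Style of the largest student cluster: slice the list, loop over
    the values directly, and use `count += 1`."""
    count = 0
    for value in nums[start:end]:
        if value == number:
            count += 1
    return count

print(count_in_range_common([1, 2, 2, 3, 2], 2, 1, 4))  # 2
```

Both functions are behaviorally equivalent; the differences are purely stylistic, which is exactly what makes a Parsons solution "uncommon" relative to how most students write the code.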

4.4.1 RQ1: What are the effects on efficiency of solving adaptive Parsons problems made with the most common or uncommon student-written solution versus writing the equivalent code?. The results supported H1: Students were significantly more efficient at solving Parsons problems that used the most common student-written solution versus solving equivalent write-code problems. The median time to solve each Parsons problem was significantly less than the median time to write the equivalent code for all of the problems and both versions except for problem two of version B (see Figure 4). It took students significantly more time to solve problem two as a Parsons problem before solving it as a write-code problem (see Table 2). This was different from our previous results [34], which implies that the most common student solution may change over time. In addition, an analysis of the solutions using OverCode revealed several differences between the Parsons solution and the common student solutions. This result supports the inverse hypothesis: Students were significantly less efficient at solving a Parsons problem with an uncommon solution than solving the equivalent write-code problem.

The results also supported H2: Students first presented with a Parsons problem that had an uncommon solution tended to use that solution to solve the equivalent write-code problem. The largest OverCode cluster of solutions for which students solved problem two as an equivalent write-code problem after the Parsons problem showed that these students used the Parsons problem solution to solve the write-code problem (see Figure 5).

Table 2: Task Completion Times for Parsons → Write
Parsons Problem Write-Code Problem Wilcoxon Matched-Pairs Signed-Ranks Test
Problem (Diff.) n Mdn in seconds Mdn in seconds V p-value A
1 has22 (H) 57 50.0 96.0 279.0 p < 0.001*** 0.65
2 countInRange (M) 29 125.0 68.0 325.5 p = 0.020* 0.33
3 diffMaxMin (E) 59 11.0 37.0 87.5 p < 0.001*** 0.65
4 dictTotal (M) 30 24.0 30.0 89.5 p = 0.006** 0.69
5 dictNames (H) 50 82.0 186.0 174.5 p < 0.001*** 0.68
Notes: E = Easy, M = Medium, H = Hard; The equivalent symbol ≡ indicates that students who solved the adaptive Parsons problem first used the same solution to solve the equivalent write-code problem; * p <.05, ** p <.01, *** p <.001; A = probability-based effect size measure (nonparametric generalization of common language effect size statistic).
Table 3: Task Completion Times for Write → Parsons
Parsons Problem Write-Code Problem Wilcoxon Matched-Pairs Signed-Ranks Test
Problem (Diff.) n Mdn in seconds Mdn in seconds V p-value A
1 has22 (H) 30 39.5 205.0 5.0 p < 0.001*** 0.77
2 countInRange (M) 61 89.0 123.0 501.5 p = 0.002** 0.65
3 diffMaxMin (E) 30 10.0 128.5 0 p < 0.001*** 0.84
4 dictTotal (M) 62 25.0 56.5 163.0 p < 0.001*** 0.66
5 dictNames (H) 26 54.5 388.0 14.0 p < 0.001*** 0.82
Notes: E = Easy, M = Medium, H = Hard; The equivalent symbol ≡ indicates that students who solved the write the code problem first used the same solution as the adaptive Parsons problem; * p <.05, ** p <.01, *** p <.001; A = probability-based effect size measure (nonparametric generalization of common language effect size statistic).

This research implies that Parsons problems should be generated from the most common student-written code to be more efficient. We plan to mine the huge amount of student-written code from free and interactive eBooks on the Runestone eBook platform to automatically generate Parsons problems by using OverCode to determine the most common student solution to write-code problems. This research also provides evidence that students recall and use uncommon Parsons problem solutions to later solve equivalent write-code problems.

4.4.2 RQ2: What is the effect on self-reported cognitive load ratings of solving adaptive Parsons problems versus solving equivalent write-code problems?. The results are shown in Table 4 and Table 5. Students reported investing significantly less mental effort in solving problem one as a Parsons problem than in solving the equivalent write-code problem. This was the problem for which we changed the solution from our previous study. Although the difference was not statistically significant, students invested more mental effort in solving a Parsons problem with an uncommon solution (problem two) than in writing the equivalent code when they solved the Parsons problem first. But when they solved the write-code version of problem two first, they reported investing less mental effort in solving the equivalent Parsons problem even though its solution was uncommon. Solving a Parsons problem with an uncommon solution before solving an equivalent write-code problem resulted in slightly lower self-reported cognitive load ratings for the write-code problem. Students also reported investing significantly less mental effort in solving problem three, the easiest problem, as a Parsons problem regardless of the order in which they completed it.

Table 4: Cognitive Load Ratings for Parsons → Write
Parsons Problem Write-code Problem Paired t test
Problem (Diff.) M (SD) of ratings M (SD) of ratings t value df p-value Cohen's drm A
1 has22 (H) 2.60 (1.73) 3.28 (2.19) -2.5926 56 p = 0.012* 0.34 0.60
2 countInRange (M) 3.86 (1.71) 3.34 (1.99) 1.4648 28 p = 0.154 -0.27 0.42
3 diffMaxMin (E) 1.17 (1.52) 2.42 (1.83) -5.3916 58 p < 0.001*** 0.73 0.70
4 dictTotal (M) 1.93 (1.39) 2.43 (1.93) -1.4936 29 p = 0.146 0.28 0.58
5 dictNames (H) 3.42 (1.72) 3.34 (2.36) 0.20216 49 p = 0.841 -0.04 0.49
Notes: E = Easy, M = Medium, H = Hard; * p <.05, ** p <.01, *** p <.001, paired t-test; Likert scale: 1 = Very, very low mental effort; 2 = Very low mental effort; 3 = Low mental effort; 4 = Rather low mental effort; 5 = Neither low nor high mental effort; 6 = Rather high mental effort; 7 = High mental effort; 8 = Very high mental effort; 9 = Very, very high mental effort
Table 5: Cognitive Load Ratings for Write → Parsons
Parsons Problem Write-code Problem Paired t test
Problem (Diff.) M (SD) of ratings M (SD) of ratings t value df p-value Cohen's drm A
1 has22 (H) 2.57 (1.74) 3.30 (2.28) -1.8422 29 p = 0.076 0.35 0.60
2 countInRange (M) 3.11 (2.03) 3.39 (2.24) -1.2159 60 p = 0.229 0.13 0.54
3 diffMaxMin (E) 1.00 (1.17) 2.77 (1.85) -5.7071 29 p < 0.001*** 1.07 0.79
4 dictTotal (M) 2.15 (1.64) 2.56 (1.90) -1.7266 61 p = 0.089 0.23 0.57
5 dictNames (H) 3.04 (1.80) 3.00 (2.90) 0.058921 25 p = 0.954 -0.02 0.50
Notes: E = Easy, M = Medium, H = Hard; * p <.05, ** p <.01, *** p <.001, paired t-test; Likert scale: 1 = Very, very low mental effort; 2 = Very low mental effort; 3 = Low mental effort; 4 = Rather low mental effort; 5 = Neither low nor high mental effort; 6 = Rather high mental effort; 7 = High mental effort; 8 = Very high mental effort; 9 = Very, very high mental effort