Roles and strategies of learning analytics in the e-publication era

E-books have been introduced to educational institutions in many countries. The use of e-books in traditional classrooms enables the recording of learning logs. Recently, researchers have begun to carry out learning analytics on the learning logs of e-books. However, limited attention has been devoted to understanding the types of learning strategies that students employ when they read e-books. In this paper, we use e-book learning logs to examine these learning strategies, and we show how to identify them through two case studies: “Identifying Learning Strategies Using Clustering” and “Examining Learning Strategies Using Sequential Analysis.”


Introduction
With the development of online technologies and e-publishing standards, traditional textbooks are increasingly being replaced by electronic textbooks (i.e., e-books) or digital textbooks (Yin et al., 2014). E-books have been introduced into educational institutions (Nakajima, Shinohara, & Tamura, 2013; Yin et al., 2015a) in many countries (e.g., Japan, Korea, and Singapore). For example, in order to utilize ICT in education, the Japanese government planned to introduce e-books in elementary, middle, and high schools by 2020 (Ogata et al., 2015). The Korean government announced an e-book usage plan in 2007 (Shin, 2012).
In the last decade, much research has focused on the effectiveness of e-books for teaching and learning (Eden & Eshet-Alkalai, 2013; Kim & Jung, 2010), and some studies have specifically examined their functional features (Shepperd, Grace, & Koch, 2008). Recently, researchers have begun to pay attention to the utilization of the learning logs of e-books. Instructors' lecture materials, such as slides or other notes, can be posted to the e-book system, allowing the students' learning behaviors to be recorded when they use the e-book to read the learning content. The record of the students' learning behaviors is defined as a learning log (Yin et al., 2015b).
The use of e-books in traditional classrooms enables the recording of large amounts of data in learning logs, bringing changes to Learning Analytics (LA). LA aims to identify patterns and make predictions that characterize learners' behaviors and achievements, domain knowledge content, assessments, and educational applications (Luan, 2002).
Recently, many LA studies have paid attention to the prediction of learning outcomes. LA results can be used to optimize institutional processes and increase educational and monetary benefits for learners and educators (Colvin et al., 2015). Most recently, researchers have recognized that it is necessary to carry out LA with educational theory and learning strategies (Dawson, Drachsler, Rosé, Gašević, & Lynch, 2016). For example, Jovanović et al. (2017) used the clustering method to detect learning strategies from a university's Learning Management System in a flipped classroom. Researchers have indicated that teachers can make better decisions about supporting students and designing courses if they know the types of learning strategies that students employ in their learning activities (Steif & Dollár, 2009; Jovanović et al., 2017).
Although e-books are continually being introduced to educational institutions, limited attention has been paid to understanding the types of learning strategies that students employ when they read e-books. In this paper, by using e-book learning logs, we examined the learning strategies that students employed when they read e-books. An e-book system was developed to collect students' learning behavior logs, which recorded such behaviors as "open learning content," "turning to the next page," "returning to a previous page," "adding a bookmark," "adding a marker," "writing a memo," and so on. Using these logs, we carried out two case studies to identify the learning strategies that students used. One is "Identifying Learning Strategies Using Clustering" (Yin et al., 2015b), and the other is "Examining Learning Strategies Using Sequential Analysis." In the following sections, we introduce the data collection procedures and show how the learning strategies were identified in these two case studies.

Data collection for learning analytics
Collecting data is the first step in learning analysis (Yin, Sung, et al., 2013). Based on the data collection method, previous studies can be classified into three categories: Questionnaire-based Data Collection (QDC), Manual Data Collection (MDC), and Automatic Data Collection (ADC) (Yin et al., 2014; Ren et al., 2017; Yin et al., 2017).
• QDC. In this category, some questionnaires are predesigned to collect data and carry out analysis. The questionnaire is a tool for asking questions of the participants of the survey, and is a data-gathering method used to collect and analyze the feedback of a group of people from a target population.
• MDC. In this category, a manual data collection system is opened to users, who can employ the system and consciously provide data about their learning behaviors. If users encounter meaningful objects, such as images, audio, or animation, they can upload them to the system and share them with their friends or classmates. The advantage of this category is that it collects meaningful data; however, because the data are collected manually, their volume is limited.
• ADC. In this category, learning behavior log data are automatically recorded while reading e-documents, e-books, and so on. For example, Yin et al. (2015) identified learning behavior patterns using students' digital textbook reading log data, which were recorded automatically.
For categories QDC and MDC, the data are consciously collected. Therefore, data are affected by users' own subjective factors. For category ADC, the data are objectively collected, thereby removing the subjective factors that affect data authenticity. The present work falls under category ADC.

The benefits of learning analytics
Researchers have reported that LA can positively relate to student efforts (Campbell, DeBlois, & Oblinger, 2007), performance (Macfadyen & Dawson, 2012), and outcomes (Archer, Chetty, & Prinsloo, 2014; Hrastinski, 2009; Yin et al., 2017). Depending on the different teaching and learning roles, the objectives for LA can be different. As shown in Fig. 1, Romero and Ventura (2010) indicated that different roles can obtain different benefits from LA:

• For Learners, LA can help to improve and share their learning experience, to generate adaptive hints, and to recommend courses, relevant discussions, and books.
• For Teachers, LA can help to get feedback from learners, to verify and identify the learning strategies adopted in their course, to analyze students' learning and behavior, and to determine more effective activities.
• For Course Designers, LA can help to evaluate the structure of course content, and to evaluate teaching materials. Based on the LA results, they can develop a learning support tool to construct learning models.
• For Administrators of educational institutions, LA can help to organize resources, to enhance educational programs/plans, and to evaluate teachers, students, and institutions.

The goals and methods of learning analytics
As shown in Table 1, LA researchers have mostly focused on research goals such as Prediction, Structure Discovery, and Relationship Mining, and have used many methods to achieve those goals (Baker, 2011; Baker & Yacef, 2009).
• Prediction. A set of data is used to predict students' future learning behavior or learning outcomes. For example, prediction can help to identify who might fail a class; if a student spent the last half hour working in an online learning environment, prediction based on the learning log of that half hour can help to determine whether s/he has mastered the skill needed to solve the next problem. There are many prediction analysis methods, such as Classification, Regression, and Latent Knowledge Estimation.
• Structure Discovery. "Structure discovery attempts to find structure, patterns and data points in a set of data without any ground truth or a priori idea of what should be found" (Baker & Inventado, 2014). Clustering, Factor Analysis, Knowledge Inference, and Network Analysis are common analysis methods of structure discovery.
• Relationship Mining. It involves discovering relationships between variables in a dataset; these relationships are treated as rules of the data for later use (Bousbia & Belamri, 2013). There are many Relationship Mining methods, such as "Association rule mining," "Correlation mining," "Sequential pattern mining," and "Causal data mining."

The e-book based data collection
We used an e-book system to collect the data. As shown in Fig. 2, the instructors and students could access the e-book system by using their smartphone or laptop anywhere on or off campus. Through the e-book system, they could perform actions such as "open learning content," "turning to the next page," "returning to a previous page," "adding a bookmark," "adding a marker," "writing a memo," and so on. All actions using the e-book system were recorded in a database (Fig. 2).

Fig. 2. Collecting data from the e-book system

Table 2 shows a sample of a reading behavior log, which we call a learning log. One learning log contains the date, time, user ID, learning content ID, page number, user action, and other data. The following measures were derived from the e-book data:
1. NN: The number of times a student turns to the subsequent page.
2. NP: The number of times a student returns to the previous page.
3. PT: The number of times a student previews the lesson before class. All the teaching materials were uploaded to the e-book system, so students could preview the learning content before class.
4. RP: The total number of pages that a student read. The reading action logs for "Page No." and "Action Time" showed how many pages the students read. Many of them repeatedly read specific pages.
5. RT: The total time spent reading the learning content. The reading action logs for "Action Time" showed the length of time students spent reading the learning content. RT was calculated on an hourly basis.
6. HL: The number of times a student makes a mark using the highlight function.
7. UL: The number of times a student makes a mark using the underline function.
8. BM: The number of times a student adds a bookmark.
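The count-based measures can be derived by tallying actions per student. The following is a minimal sketch under stated assumptions: the log rows, the field order, and the action labels (such as "NEXT" and "ADD_BOOKMARK") are hypothetical and are not the actual codes used by the e-book system. RP, RT, and PT are omitted because they additionally require timestamps and the class schedule.

```python
from collections import Counter, defaultdict

# Hypothetical log rows: (user_id, content_id, page_no, action).
# The action labels are illustrative, not the system's real codes.
LOGS = [
    ("s01", "c1", 1, "OPEN"),
    ("s01", "c1", 1, "NEXT"),
    ("s01", "c1", 2, "ADD_HIGHLIGHT"),
    ("s01", "c1", 2, "NEXT"),
    ("s01", "c1", 3, "PREV"),
    ("s01", "c1", 2, "ADD_BOOKMARK"),
    ("s02", "c1", 1, "OPEN"),
    ("s02", "c1", 1, "NEXT"),
]

# Map each assumed action label to the measure it increments.
ACTION_TO_MEASURE = {
    "NEXT": "NN",
    "PREV": "NP",
    "ADD_HIGHLIGHT": "HL",
    "ADD_UNDERLINE": "UL",
    "ADD_BOOKMARK": "BM",
}


def count_measures(logs):
    """Return {user_id: Counter} holding NN, NP, HL, UL, and BM counts."""
    measures = defaultdict(Counter)
    for user_id, _content_id, _page_no, action in logs:
        measure = ACTION_TO_MEASURE.get(action)
        if measure is not None:  # skip actions that map to no measure
            measures[user_id][measure] += 1
    return measures


measures = count_measures(LOGS)
for user_id in sorted(measures):
    print(user_id, dict(measures[user_id]))
```

A Counter returns 0 for measures a student never triggered, so students with no bookmarks or highlights need no special handling.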

Case study 1: Identifying learning strategies using clustering
This case study aimed to find meaningful measures from e-book reading behaviors and to employ these measures in the analysis of students' learning behavioral patterns (Yin et al., 2015a). These patterns are the learning strategies which were employed by the students (Jovanović et al., 2017).
The data used in this case study were collected during an information science course at a university in Japan. The students were given the teaching materials for the next class and were asked to prepare the lesson before the next class. The data from the 98 students, aged 18 to 19, were analyzed.
In order to identify learning strategies from the learning logs, we visualized the learning logs as time series and grouped the students into clusters based on several meaningful measures of their learning behavior.

Learning log visualization
We visualized the reading log to identify learning strategies. Fig. 3 is a page translation graph of the learning behaviors. The graph visualizes the students' actions using the "Action Time: the time that the action happened," "Page No: the page on which the action happened," "Next: Turning to the subsequent page," and "Prev: Returning to a previous page" logs. The study found that a number of students recorded many "Prev" actions (Fig. 3A), indicating their frequent review of previous pages. That is, they often backtracked in their reading. Meanwhile, other students had more "Next" actions (Fig. 3B), indicating that they simply read the pages of the learning content in sequence.
We define the behavior of frequently returning to previous pages as Backtrack Reading (BR), and we compare the number of "Prev" and "Next" actions to calculate the Backtrack Reading Rate (BRR).
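The text does not give the exact BRR formula. As an illustration only, the sketch below assumes that BRR is the ratio of "Prev" actions (NP) to "Next" actions (NN); the function name and the handling of a zero denominator are likewise assumptions.

```python
def backtrack_reading_rate(prev_count, next_count):
    """Ratio of "Prev" to "Next" actions (an assumed definition of BRR).

    Returns None when no "Next" action was recorded, because the ratio
    is undefined in that case.
    """
    if next_count == 0:
        return None
    return prev_count / next_count


# A hypothetical student with 5 "Prev" and 20 "Next" actions.
print(backtrack_reading_rate(5, 20))
```

Under this assumed definition, a student who mostly reads in sequence has a BRR close to 0, while a student who frequently revisits earlier pages has a higher value.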

Fig. 3. Visualization of page translation (turning to subsequent pages or returning to previous pages)
A partial correlation analysis was conducted to identify the correlation of learning achievement with other variables, such as the number of pages read, the number of times a lesson was previewed before class, the total time spent reading the learning content, and the backtrack reading rates. By using correlation analysis, we found that some measures (e.g., BRR, PT, RP, and RT) had a significant positive correlation with the Final Examination Results (FER). Therefore, based on the results of partial correlation, a k-means clustering analysis was conducted to cluster the students into groups in order to analyze the features of the learning behaviors of those groups.
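The clustering step can be sketched as follows. This is a minimal pure-Python k-means under stated assumptions, not the implementation used in the study: the feature rows (BRR, RT) are invented, the z-score standardization is our own choice to keep measures on comparable scales, and the function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so that no single measure dominates distances."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sigmas))
            for row in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means: returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels, centroids


# Hypothetical (BRR, RT in hours) rows for six students.
features = [(0.10, 1.0), (0.12, 1.2), (0.09, 0.9),
            (0.50, 3.0), (0.55, 3.2), (0.52, 2.9)]
labels, centroids = kmeans(standardize(features), k=2)
print(labels)
```

In practice the number of clusters (four in this case study) and the set of input measures would be chosen from the data and the correlation results rather than fixed in advance as they are here.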

K-means results
Students were clustered into four groups. Clusters 1 to 4 (C1, C2, C3, C4) had 25, 29, 14, and 30 students, respectively. In order to examine the inter-cluster differences, a one-way analysis of variance (ANOVA) was conducted for each measure, with cluster as the between-subject factor (the data of the four clusters satisfied the ANOVA requirements). Table 3 presents the cluster comparison results from the post hoc tests (Scheffe). When comparing the students in C3 and C2, significant differences were observed in the backtrack reading rate (BRR: 3 > 2), pages read (RP: 3 < 2), and reading time (RT: 3 < 2), but not in their learning achievement (FER). The C3 students tended to frequently review previous pages, clocked a shorter time for reading, and obtained satisfactory learning achievement (similar to the C2 students). This finding suggests that backtrack reading has a positive influence on learning effectiveness and helps students manage their time to learn more efficiently. Because BRR is associated with learning efficiency, backtrack reading can be regarded as a "good" learning strategy.
A comparison of the C3 and C1 students shows significant differences in the backtrack reading rates (BRR: 3 > 1), read pages (RP: 3 < 1), reading times (RT: 3 < 1), and learning achievement (FER: 3 < 1). The findings show that, although the C3 students demonstrated an effective reading style, they still needed to spend more time reading the learning content to ensure better learning achievement. In other words, an effective reading style and sufficient learning time are simultaneously required.
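The inter-cluster comparison relies on a one-way ANOVA for each measure. The sketch below computes only the F statistic from its textbook definition; the group values are invented for illustration, the function name is hypothetical, and the Scheffe post hoc step reported in Table 3 is not reproduced.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA across groups of observations."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical reading times (hours) for three clusters of students.
clusters = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value = one_way_anova_f(clusters)
print(f_value)
```

A large F value indicates that the cluster means differ more than the within-cluster variation would suggest; a statistics package would normally be used to obtain the corresponding p-value and the pairwise post hoc comparisons.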

The results of identifying learning strategies
An important finding emerged from the analyses: The backtrack learning strategy was found to have merit as it can help students save time when studying.
It is interesting to note that the backtrack reading behavior can be linked to a reflection learning strategy of linking current knowledge to previous knowledge (Costa & Kallick, 2008). This finding can be used to improve the design of e-books. For example, teachers can link related pieces of knowledge within the e-book to help students perform backtrack reading.

Case study 2: Examining learning strategies using sequential analysis
To identify the learning strategies adopted when learning with digital textbooks, an experiment was designed using our e-book system to collect students' learning logs. The experiment was carried out in an Educational Technology course for graduate students. A total of 21 graduate students participated in this study. The participants were asked to read an academic paper via the digital textbook system. The average age of the participants was 23. The experiment took approximately 1.5 hours.
The aim of the study was to explore the learning strategies students adopted when reading academic papers. Progressive sequential analysis was used to infer the learning strategies of students when they were reading the academic papers. Many researchers have used the progressive sequential analysis method to perform learning analytics (Bakeman & Gottman, 1997; Hwang, Hsu, Lai, & Hsueh, 2017; Yang, Chen, & Hwang, 2015; Yin et al., 2017).
The analysis results identified many significant sequences that occurred while reading the digital textbooks. We then carried out interviews to ask the participants why they took such actions.
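The exact procedure of the progressive sequential analysis is not detailed here. As a rough illustration of the underlying idea, the sketch below counts lag-1 transitions between consecutive actions and computes an adjusted residual (z-score) for each pair; values above 1.96 are conventionally treated as significant at the 0.05 level. The action labels, the sequences, and the function name are hypothetical.

```python
from collections import Counter
from math import sqrt


def transition_z_scores(sequences):
    """Adjusted residuals for lag-1 transitions between coded actions."""
    counts = Counter()
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[(current, following)] += 1
    total = sum(counts.values())
    row_totals, col_totals = Counter(), Counter()
    for (current, following), n in counts.items():
        row_totals[current] += n
        col_totals[following] += n
    z_scores = {}
    for a in row_totals:
        for b in col_totals:
            # Expected count if the next action were independent of the current one.
            expected = row_totals[a] * col_totals[b] / total
            variance = (expected
                        * (1 - row_totals[a] / total)
                        * (1 - col_totals[b] / total))
            if variance > 0:
                z_scores[(a, b)] = (counts[(a, b)] - expected) / sqrt(variance)
    return z_scores


# Hypothetical action sequences for two students.
sequences = [
    ["NEXT", "ADD_HL", "DEL_HL", "NEXT", "ADD_HL", "DEL_HL", "NEXT", "NEXT"],
    ["NEXT", "NEXT", "ADD_HL", "DEL_HL", "NEXT"],
]
z = transition_z_scores(sequences)
significant = sorted(pair for pair, score in z.items() if score > 1.96)
print(significant)
```

Pairs that pass the threshold, such as adding a highlight followed by deleting it in this toy data, are the kind of sequences that would then be explored further through interviews.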

Use of the highlight (HL) learning strategy
It was found that after adding a HL, the students deleted it, or after deleting a HL, they added it again (Fig. 4). Some of the students who had these learning behavioral patterns stated their perceptions as follows:
a) I highlighted it because I thought it was the main idea of the paragraph, but I realized I was wrong, so I deleted it.
b) I highlighted some words; after that, I found more meaningful words.
c) Because I thought it was an important place; after I read the rest of the paper, I found it was not important.

Fig. 4. Highlight
From the interviews, it was found that students often changed which keywords they considered important while they were reading the textbook. This can be seen as a learning strategy of using highlights temporarily: when the students found other, more meaningful words, they deleted the earlier highlights.

Use of the Bookmark (BM) learning strategy
It was also found that, after adding a BM, the students would delete it, or after deleting a BM, they would add it again (Fig. 5). Some of the students who had this learning behavioral pattern stated their perceptions as follows:
a) I thought it was an important page, but after I read the rest of the paper, I found it was not important, and added a bookmark on another page.
b) I examined the importance of the pages again and removed those of less importance.
c) When I had some other things to do, which means I have to read the article later, I will add a new bookmark so that I can continue my work later.

Fig. 5. Bookmark
From the interview, it was also found that the students often changed the important page while they were reading the textbook. There are two kinds of learning strategies here: 1) the students used a bookmark temporarily, and then when they came back to reading, they deleted it, and 2) they added bookmarks to many pages; after that they examined the importance of the pages again and deleted the bookmarks on pages which were not important.

Use of the deleting marker learning strategy
It was also observed that, after "deleting highlight/underline," the students often used "delete bookmark" (Fig. 6). Some of the students who had this learning behavioral pattern shared the following comments:
a) When I completed the reading of the paper, I felt that I understood all of them.
b) I thought the part which had been highlighted was not important anymore, so I deleted the highlight or the bookmark.
c) When I had questions on the content I marked it; when I found the answer, I deleted all the marks.

Fig. 6. Deleting marker
From the interviews, we found a learning strategy in which the mark functions were sometimes used temporarily: when students had questions about some content, they added marks to that content, and after they found the answers, they deleted the marks.

Conclusions
LA is an emerging topic in Educational Technology. Different roles can gain different benefits from LA, such as optimizing students' learning outcomes, improving teachers' teaching methods, evaluating the structure of courses and teaching materials, and improving the learning environment (Greller & Drachsler, 2012).
To provide further suggestions to researchers, we list some potential research issues related to e-book based learning analytics as follows:

1. Strategies for e-book system promotion and data collection. Data collection is the first step of LA. It is important to promote the use of e-book systems for collecting learning log data. Therefore, there are several related research issues:
• Proposing promotion strategies to convince schools or teachers to use e-book systems in the existing curricula.
• Proposing effective coding methods and filtering algorithms to collect meaningful data from e-book learning systems.
• Investigating the issue of personal privacy protection when collecting data from e-book systems, including the privacy and security control policy and techniques for managing e-book learning logs.

2. Strategies for Learning Design (LD) using e-book systems. LD is highly relevant to the formation of educational data. It focuses on how to make the teaching processes visible, sharable, and consequently more effective and efficient. The research issues regarding LD are listed as follows:
• Proposing effective LD strategies for using e-books in school settings.
• Investigating the impact of LD on LA.
• Proposing strategies for utilizing the LA results to support LD.
• Proposing strategies for utilizing the LA and LD to improve teaching and learning.

3. Innovative usages of LA. Several potential research issues of LA for e-books are listed as follows:
1) Prediction.
• Providing personalized supports by analyzing students' e-book based learning logs and making predictions.
2) Structure Discovery.
• Identifying students' behavioral patterns from e-book-based learning logs.
• Investigating the impacts of different learning strategies on students' behavioral patterns.
• Using LA approaches to investigate the factors affecting students' learning performances.
• Comparing the behavioral patterns of students with different achievement levels and providing suggestions for low-achieving students.
• Investigating the correlations between students' behaviors, learning perceptions and performances.

4. Integration of theories and strategies. The research issues of LA:
• Integrating e-book based LA and pedagogical theories.
• Integrating e-book based learning logs with other learning data, such as educational game data.

5. LA applications. The research issues of LA application:
• Employing learning analytics approaches in various application domains.
• Practices for the adaptation of LA results to enhance teaching/learning environments.