Learning Analytics Platform in Higher Education in Japan

In recent years, learning analytics has become a hot topic with many institutes deploying learning management systems and learning analytics tools. In this paper, we introduce learning analytics platforms that have been established in two top national Japanese universities. These initiatives are part of a broader research project into creating wide-reaching learning analytics frameworks. The aim of the project is to support education and learning through research into educational big data accumulated on these platforms. We also discuss the future direction of our research into learning analytics platforms. This includes introducing a model in which learning analytics tools and the results of research can be shared between different education institutes.


Introduction
As the digitization of learning environments advances, interest in learning analytics and its effect on education is increasing. An important aspect is the infrastructure and platforms that act as the foundations to support the key activities of learning analytics, which have been defined as "the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs" on the LAK11 website (https://tekri.athabascau.ca/analytics/). Early research into learning analytics mainly focused on highly localized contexts with a very narrow scope of investigation. These limitations were imposed by a lack of infrastructure, data, and analysis tools available at the time. However, as the field continues to evolve and to draw on related disciplines, new methods of data collection and analysis have been created. As the collection of educational big data is increasing in many different facets of learning environments, research analyzing data from a wide range of learning contexts is a significant challenge (Ferguson, 2012). While many tools have been developed to meet specific needs, there has been little research into large-scale, wide-ranging infrastructure. Creating platforms to support the automatic analysis of educational data is fundamental to the continuing development of learning analytics.
The research presented in this paper is part of a broader project titled "Research on Cloud infrastructure to support Education and Learning using educational big data". This project aims not only to collect and analyze educational big data, but also to ensure that the results of the analysis are conveyed in a form that can be used to inform stakeholders in the education process. An important part of the project is to establish frameworks that can support the cycle of data collection, analysis, informing practice, and evaluating its effectiveness. The frameworks will be used to create infrastructure across different levels of educational institutes (primary, secondary, and higher education), with evaluation of its effectiveness in various scenarios. A fundamental part of this project is research into Learning Analytics (LA) platforms that can support frameworks to achieve the key project aims.
Firstly, in this paper we will introduce an LMS dependent LA platform that has been deployed in Kyushu University. Secondly, we will discuss an LMS independent LA platform that is currently deployed in Kyoto University. It is based on a modular design to ensure interoperability with other LMS in different educational institutions, and to abstract away private user information that is not required in the collection and analysis phase of learning analytics. Then we propose creating a collaborative LA platform framework in which system components and modules can be modified and shared through a version control system. Finally, we will introduce some of our recent research into a distributed learning record system based on blockchain technology.

Literature review
As learning analytics platforms evolve, there is an increasing push toward creating systems that can be reused and integrated in flexible ways to cater for a range of different needs. Siemens et al. (2011) initially proposed a shift towards open learning analytics, where systems are designed for interoperability. Since then, there have been several proposals of how open learning analytics could be designed. The non-profit company Jisc has proposed a conceptual architecture that centers around a learning records warehouse and learning analytics processor (Sclater, Peasgood, & Mullan, 2016). The Apereo Foundation's Learning Analytics Initiative (LAI) has taken a more hands-on approach by defining subsections in the architecture as individual software development projects (Ferguson et al., 2016). A learning record store that was developed by a project within LAI is used as one of the key modules in an LA platform introduced later in this paper.
In recent years, specifications have been proposed with the aim of standardizing the user experience and learning record data within virtual learning environments and LA platforms. In particular, the standardization of authentication between learning systems and of data structure and format is of concern when creating platforms for learning analytics.
Interfaces have been proposed to allow the seamless and secure integration of external tools to augment existing LMS experiences. Some of these interfaces have been proprietary and have thus limited the tools that can be integrated. IMS Global Learning Consortium (2016) published the Learning Tools Interoperability (LTI) standard, which defines the process of connecting two systems by OAuth authentication, and how users transition across these systems without having to authenticate again with the destination system. This process of transferring a user's authentication and session to an external tool is known as a launch process. During this process, basic information about the user and the context in which the external tool was launched can be transferred from the source system to the target system. In addition to this basic information, other information, such as the course roster and outcomes from the external tool, can be transferred using queries between the tools after the initial launch process. An LA platform introduced in this paper employs LTI as a standard interface between different systems to seamlessly connect and transfer both teacher and learner authentication and context information.
Standardized protocols to support the sharing of learning record event logs have also been developed, along with systems to collect and aggregate data from disparate systems. Currently, there are two main protocols for transferring learning record event logs: Advanced Distributed Learning's xAPI or Experience API, formerly known as Tin Can API (Advanced Distributed Learning, 2016), which is being developed in an open process, and IMS Global Learning Consortium's Caliper Analytics API (IMS Global Learning Consortium, 2015), which is being developed by a closed consortium with limited input from outside parties (Griffiths & Hoel, 2016). Both of these standards have been adopted to varying degrees by LMS and other educational tools, and therefore it would be advantageous to support both specifications in an LA platform.
There are a number of educational and psychological theories that can be informed by learning analytics platforms used within educational institutes through the collection of educational big data. Wide-spanning LA platforms with behavior sensors inside and outside the classroom can be used to inform Situated Learning Theory (Lave & Wenger, 1991) through the analysis of learning that occurs in different relevant contexts. Modeling of students' knowledge through long term learning analytics can inform Knowledge Construction Theory (Fosnot, 1996) by analyzing the creation of knowledge based on previous knowledge and experiences. Social Constructivist Theory (Palincsar, 1998) and Distributed Cognition Theory (Salomon, 1997; Dillenbourg, 1999) can be informed by the analysis of groups in which learning is occurring, and can also support the creation of optimal social learning environments through automated group formation based on previous student behavior. The evaluation of analysis from LA platforms that are used to inform education will form the basis of data driven evidence-based education.
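As a minimal sketch of the launch process described above, the following Python code builds and verifies an OAuth 1.0 (HMAC-SHA1) signed launch request of the kind used by LTI 1.x. The parameter names (lti_message_type, user_id, roles, context_id, and the oauth_* fields) follow the LTI 1.x launch convention, while the function names, the example URL, and the key and secret values are hypothetical and are not taken from the platforms described in this paper. The sketch assumes a launch URL without a query string.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def _enc(value) -> str:
    # OAuth 1.0 percent-encoding: only unreserved characters stay unescaped.
    return quote(str(value), safe="-._~")


def sign_launch(url: str, params: dict, consumer_secret: str) -> str:
    """Return the OAuth 1.0 HMAC-SHA1 signature for a form POST to `url`."""
    # Encode, sort, and join the parameters into the normalized string.
    pairs = sorted((_enc(k), _enc(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    base_string = "&".join(["POST", _enc(url), _enc(normalized)])
    # An LTI launch has no token secret, so the key ends with a bare "&".
    key = _enc(consumer_secret) + "&"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def build_launch(url, consumer_key, consumer_secret, user_id, context_id,
                 role, nonce, timestamp) -> dict:
    """Source system (e.g. the LMS): assemble and sign the launch parameters."""
    params = {
        "lti_message_type": "basic-lti-launch-request",
        "lti_version": "LTI-1p0",
        "resource_link_id": f"{context_id}-tool",  # hypothetical link id
        "user_id": user_id,
        "roles": role,
        "context_id": context_id,
        "oauth_consumer_key": consumer_key,
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_version": "1.0",
        "oauth_nonce": nonce,
        "oauth_timestamp": str(timestamp),
    }
    params["oauth_signature"] = sign_launch(url, params, consumer_secret)
    return params


def verify_launch(url: str, params: dict, consumer_secret: str) -> bool:
    """Target system (the external tool): recompute and compare the signature."""
    received = params.get("oauth_signature", "")
    unsigned = {k: v for k, v in params.items() if k != "oauth_signature"}
    expected = sign_launch(url, unsigned, consumer_secret)
    return hmac.compare_digest(received, expected)
```

Because the signature covers every launch parameter, the target system can detect whether the user identifier, role, or course context was altered in transit. A production tool would additionally check the timestamp and reject reused nonces.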

LA platform
Currently in the project there have been two different approaches to the construction of LA platforms: one that is dependent on the LMS, and one that is more modular in design and is independent of the LMS. Both of these approaches will be introduced in the following sections along with discussion of their respective advantages and disadvantages. A common aspect of both platforms is the use of a digital learning material reader called BookRoll that has also been developed as part of the project.

LMS dependent platform
Kyushu University introduced an LMS dependent learning analytics platform called Mitsuba (M2B) that was based around the Moodle learning management system, an e-portfolio system (Mahara), and an e-book system (BookRoll). This was supported as part of a project titled "Research and Development on Fundamental and Utilization Technologies for Social Big Data" by the National Institute of Information and Communications Technology. The project started on July 1, 2014, and ended in March 2018. As seen in Fig. 1, the M2B system was built on the Moodle LMS platform. In this particular platform, users transitioned between systems connected to Moodle via a proprietary peer to peer single sign on (SSO) mechanism called MNet (Büchner, 2016). Users can initially log in to one part of the platform using LDAP authentication and can then transition to another part of the system using MNet. This enables the secure and seamless transitioning of a user's session to and from different parts of the platform, and also consistency of data collection from independent systems. BookRoll has been modified to accept delegated access login via MNet, and an advantage of the setup is that users can click on links to learning materials within Moodle and transition directly to viewing the e-book in the same browser.

The visualization and feedback in the M2B system are directly integrated into Moodle as plugins that reside on the course sites. These plugins require direct access to the BookRoll database to preprocess data at intervals and therefore require close proximity. An advantage of the M2B platform is that, because the plugins reside within the LMS, they have access to information about the times of the classes in a course, the scores of tests and exams, and the class roster. However, a disadvantage of the system is that redevelopment is required to apply the plugins on the platform to other data, as the system uses proprietary formats. Also, as the plugins are platform specific, sharing tools with other institutions is hindered because there are many different LMS platforms in use.

LMS independent platform
In recent years, Learning Management Systems (LMS) have become an integral part of higher education. As these services are becoming increasingly important to education, LMS are being managed as production environments with stringent security and processes to safeguard the integrity of the system. While data from LMS and other virtual learning environments (VLE) are essential to learning analytics research, a particular concern is the protection of data and privacy throughout the analytics workflow (International Organization for Standardization, 2016). On one hand, researchers must ensure that the privacy of key stakeholders, such as students, teachers, and administrators, is protected. On the other hand, the protection of data privacy can sometimes limit access to data, which can hinder learning analytics research. This problem also raises issues when production and research learning environment systems are integrated during the development of new learning analytics research ideas, and when performing experiments to evaluate their effectiveness in the field. Ideally, research systems would pre-emptively protect data and privacy by only handling anonymized data that has been stripped of information that can identify a person. However, this solution also has limitations, as it can negatively impact personalized results, such as a student comparing their personal progress in a course with that of the whole student cohort. There are also possible secondary uses of data collected by these systems that should be investigated, such as the use of real data in learning analytics and data science education, community-based learning analytics where data is available to stakeholders to freely perform their own analysis, and facilitating 'data takeout' where the stakeholder can export their personal data and transfer it to another system.
Traditionally, there has been little distinction made between the different roles that systems perform, with LMS and learning analytics systems inhabiting the same environment without abstraction. However, as LMS and learning analytics research mature, systems are becoming increasingly modular with personal data being stored in numerous locations, and anonymity by design will play an increasingly important role in the protection of personal data in integrated systems.
The development of a modular learning analytics platform, designed to be independent of the LMS at the center of the system, started at Kyoto University in 2017. It was designed with the purpose of integrating production and research learning systems in a way that addresses the protection of stakeholder privacy, while trying to minimize the limitations of anonymized data analysis in research systems (Flanagan & Ogata, 2017). A live soft launch started in October 2017 with 8 courses using the LA platform on a volunteer basis. We are currently preparing to officially launch the system for wider use within the university from April 2018, and are also preparing to deploy the platform for use in a university in Taiwan.
As seen in the overview shown in Fig. 2, the platform comprises four main components: LMS, behavior sensors, learning record store (LRS), and analysis tool. Currently, the LMS at Kyoto University, called PandA, which is based on the open source Sakai LMS (www.apereo.org/projects/sakai-project), is run as a production service by the Institute for Information Management and Communication.

Learning management system (LMS)
External tools can be connected via LTI and other mechanisms to enable seamless transitioning to and from an LMS that is central to virtual learning environments in most education institutions. In many cases, personal information is transferred to the target system in this process. However, this can pose a problem when production systems are integrated with research systems. Personal information is usually handled in production systems that have been designed and secured to avoid breaches of user privacy. In contrast, research systems are generally not concerned with the design and security aspects required to ensure user privacy. This is influenced by various factors, including the purpose of the system, time and funding constraints, and the fact that the design and management is usually carried out by a wide range of users, from highly experienced professors to students who are just starting their first research project. For these reasons, it is important to consider how user privacy can be protected when integrating production and research systems.

Anonymized Id management
We propose that the information that is transferred when connecting external tools should be limited to attributes that cannot directly be used to identify a user as a particular person. Most modern LMS utilize an internal universally unique identifier (UUID) to which personal information, such as real name, student/teacher id, and email address, is attributed. As shown in Fig. 2, we propose that (1) UUID should be the only user identification information that is transferred to research systems. The relation between the LMS's internal UUID and personal information is only available within the production system, which reduces the risk of a user privacy breach. External tools will then attribute learner events to the LMS's internal UUID that is sent during the LTI launch process, therefore anonymizing (4) Event data collected in the research system side LRS (Learning Record Store).
Anonymized (2) Course and event data using the LMS internal UUIDs in place of personal information will also be exported from the LMS to an analysis tool and LRS. A simple plugin within the LMS translates the UUIDs displayed in research system analysis results into the real name, id, or email address of students and teachers. The plugin will act as an LTI Tool consumer reverse proxy, which involves both authentication using (3) UUID with the LTI Tool provider, and translating UUIDs by retrieving the contents from the provider instead of the user directly transitioning to the external tool. This ensures that the students and teachers will be able to meaningfully interpret research system analysis. This is particularly important for research into predicting at-risk students, as anonymized results would be difficult to use for intervention support.
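The anonymized id scheme can be illustrated with a short Python sketch. It assumes a hypothetical LMS-side class, IdentityTranslator, that keeps the UUID to person mapping inside the production system, sends only the UUID at launch, and substitutes real names back into analysis text returned by a research system. The class and method names are illustrative only and do not describe the actual plugin.

```python
import re
import uuid

# Matches the canonical lowercase text form produced by str(uuid.uuid4()).
UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


class IdentityTranslator:
    """Hypothetical LMS-side component; the mapping never leaves the LMS."""

    def __init__(self):
        self._names = {}  # UUID -> real name, held only in the production system

    def register(self, real_name: str) -> str:
        """Create an internal UUID for a user and remember who it belongs to."""
        user_uuid = str(uuid.uuid4())
        self._names[user_uuid] = real_name
        return user_uuid

    def launch_identity(self, user_uuid: str) -> dict:
        """Only the UUID is handed to the research system at launch."""
        return {"user_id": user_uuid}

    def translate(self, analysis_text: str) -> str:
        """Replace known UUIDs in research output with real names."""
        return UUID_PATTERN.sub(
            lambda m: self._names.get(m.group(0), m.group(0)), analysis_text
        )
```

In this arrangement the research system only ever stores and returns UUIDs, and the translation step happens inside the production LMS when results are shown to an authorized user. UUIDs that the LMS does not recognize are left unchanged.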

Alternative authentication
An alternative for the implementation of authentication in the LMS independent platform design would be to use a single sign on identity provider service such as Shibboleth (Morgan et al., 2004) to handle the authentication of users and access to personal identifying information. A UUID or hash could be generated either at the identity provider level for a federation-wide unique identifier, or at the service provider level to provide a more localized unique identifier at the institute level. This unique identifier would be used in the aggregation of user learning data from disparate systems within the LMS independent LA platform. The identity provider would also be able to identify the role of the user within the institute and allow for simplified administration of user permissions on the learning analytics platform.
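One way such a hash could be generated is with a keyed hash (HMAC), sketched below in Python. The function name, the secret, and the scope strings are hypothetical; this is not a description of how Shibboleth itself derives identifiers. An empty scope stands in for a federation-wide identifier, and a service provider name stands in for a localized one.

```python
import hashlib
import hmac


def pseudonymous_id(user_principal: str, secret: bytes, scope: str = "") -> str:
    """Derive a stable identifier that cannot be reversed without `secret`.

    `scope` is empty for a federation-wide identifier, or set to a
    (hypothetical) service provider name for an institute-level identifier.
    """
    message = f"{scope}|{user_principal}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

The same user always maps to the same identifier within a scope, so learning data from disparate systems can be aggregated, while identifiers generated for different scopes cannot be linked to each other without the secret.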

Behavior sensors
The actions that learners take during the course of their studies in tasks outside the LMS need to be captured by behavior sensors. These tasks can take place in both formal and informal learning situations in seamless learning environments (Uosaki et al., 2013), and therefore it is important to collect data on the events that occur in both of these environments. We have currently added two behavior sensor systems: a digital learning material reader called BookRoll, and an informal language learning tool called SCROLL (Ogata et al., 2011). The design of the system allows additional behavior sensors to be integrated into the proposed system. Initially, the behavior sensors were proprietary independent systems and did not support open interoperability with other systems. A standardized interface was developed based on LTI, for seamless authentication transition from the existing production LMS by anonymized (1) UUID, and xAPI (Advanced Distributed Learning, 2016), which is an open source statement API for outputting anonymized (4) Event data to a centralized independent Learning Record Store (LRS). As the main purpose of the data collected by behavior sensors is research analysis, all users of the systems are given the option to opt out on initial authentication; users who do not consent to participation will not have their actions logged.
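A behavior sensor's output can be sketched as follows. The statement layout (actor, verb, object, timestamp, with the actor identified by an account) follows the xAPI statement format, while the verb IRI, the activity id, the LMS home page URL, and the function name are hypothetical values chosen for illustration. The opt-out check is an assumption about how consent could be enforced at the point of logging.

```python
import json
from datetime import datetime, timezone


def make_statement(user_uuid, verb_id, verb_name, activity_id, lms_home,
                   opted_out=frozenset()):
    """Build an xAPI-style statement for an anonymized user.

    Returns None when the user has opted out, so that nothing is logged.
    """
    if user_uuid in opted_out:
        return None
    return {
        # The actor is identified only by the anonymized LMS UUID.
        "actor": {
            "objectType": "Agent",
            "account": {"homePage": lms_home, "name": user_uuid},
        },
        "verb": {"id": verb_id, "display": {"en-US": verb_name}},
        "object": {"objectType": "Activity", "id": activity_id},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    statement = make_statement(
        "3f2b8c1e-0000-4000-8000-000000000001",       # hypothetical UUID
        "https://example.org/verbs/next-page",        # hypothetical verb IRI
        "moved to next page",
        "https://example.org/bookroll/contents/42",   # hypothetical activity id
        "https://lms.example.org",
    )
    print(json.dumps(statement, indent=2))
```

The resulting dictionary can be serialized to JSON and sent to an LRS. Because the account name is the anonymized UUID, no personal information enters the research side of the platform.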

BookRoll
Digitized learning materials are a core part of modern formal education. In addition to serving as a learning material distribution platform, a digital learning material reader is also an important source of data for learning analytics into the reading habits of students. The action events of the readers are recorded, such as turning to the next or previous page, jumping to different pages, memos, comments, bookmarks, and markers indicating parts of the learning materials that are hard to understand or are of importance. The reading behavior of students has previously been used to visualize class preparation and review patterns (Yin et al., 2015; Ogata, Taniguchi, et al., 2017; Oi et al., 2015; Shimada et al., 2015). The digital learning material reader can be used not only to log the actions of students reading reference materials, but also to distribute lecture slides.
A key feature of the learning analytics platforms that are being researched in this project is the use of the BookRoll digital learning material reader. As shown in Fig. 3, the user interface supports a variety of functions, such as moving to the next or previous page, jumping to an arbitrary page, and marking sections of reading materials in yellow to indicate sections that were not understood, or in red for important sections. Memos can also be created at the page level or with a marker to attach them to a specific section of the page. Users can also bookmark pages or use the full text search function to find the information they are looking for later when revising. Currently, learning material content can be uploaded to BookRoll in PDF format, and it supports a wide range of devices, including notebook computers, tablets, and smartphones, as it can be accessed through a standard web browser. Initially, user behavior was logged in a local database, and analysis had to be performed either by connecting directly to the database or by exporting data from it.
Table 1 presents a sample of e-book logs extracted from BookRoll. The logs contain many types of operations; for example, OPEN means that the student opened the e-book file and NEXT means that he or she clicked the next button to move to the subsequent page. The logs that are collected in BookRoll are quantitative education data and can be used to pursue various objectives, such as (Ogata, Oi, et al., 2017):

• Analyze the behavior of "active learners" for use in encouraging students to be more active.
• Observe and analyze the details of the behavior of "active learners" to make students more active.
Teaching:
• Based on the logs made during a class session, improve course designs, which include collaborative learning and flipped classroom approaches.
• Based on the students' patterns of viewing e-books (e.g., understanding which page was frequently viewed), improve teaching materials and the structure of the e-books.
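As an illustration of the last objective, finding frequently viewed pages, the following Python sketch counts page views from a list of log rows. The row layout is an assumption: each row is taken to hold a user id, an operation name, and the page displayed after the operation. OPEN and NEXT are operation names mentioned above, while PREV is a hypothetical name for moving to the previous page.

```python
from collections import Counter

# Operations assumed to result in a page being displayed (PREV is hypothetical).
PAGE_VIEW_OPERATIONS = {"OPEN", "NEXT", "PREV"}


def page_view_counts(logs):
    """Count how often each page was displayed across all students."""
    counts = Counter()
    for row in logs:
        if row["operation"] in PAGE_VIEW_OPERATIONS:
            counts[row["page"]] += 1
    return counts


if __name__ == "__main__":
    sample_logs = [
        {"user": "u1", "operation": "OPEN", "page": 1},
        {"user": "u1", "operation": "NEXT", "page": 2},
        {"user": "u1", "operation": "PREV", "page": 1},
        {"user": "u2", "operation": "OPEN", "page": 1},
        {"user": "u2", "operation": "ADD MARKER", "page": 1},  # not a page view
    ]
    for page, views in page_view_counts(sample_logs).most_common():
        print(page, views)
```

Pages with unusually high view counts may point to content that students revisit because it is difficult or important, which is one possible input when revising teaching materials.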

Informal language learning tool
In addition to collecting data on user behavior in formal learning situations, we also plan to deploy the SCROLL ubiquitous learning log system that was reported in Ogata et al. (2011) to collect data on user behavior in informal learning environments. SCROLL can be used to support the sharing and reuse of ubiquitous learning logs that are collected in the context of language learning. The addition of behavior sensors that capture event information outside traditional formal classroom contexts enables the support of research into seamless learning analytics of language learners. As the proposed system will collect data from both formal and informal learning environments, this will enable linking of knowledge learnt in either context in addition to information from the LMS, and the data can be analyzed to predict and extract behaviors of overachieving and underachieving language learners. For further details on seamless learning analytics, please refer to Flanagan and Ogata (2018).

Learning record store (LRS)
The LRS is an integral part of the proposed system, as it is a central independent point to collect all event data from both the production LMS and the behavior sensors, which are still in the research phase of the development cycle. While we have chosen to adopt xAPI as the mode of transporting event data from other systems to the LRS, this is not a strict limitation. We have deployed a version of Apereo Foundation's OpenLRS (Apereo Foundation, 2017), which has the ability to support the storing and querying of event data from both xAPI (Advanced Distributed Learning, 2016) and IMS Global Learning Consortium's Caliper Analytics API (IMS Global Learning Consortium, 2015). Data from both interfaces are stored in a unified format within the LRS, which will aid data analysis as researchers will not have to spend as much time extracting, transforming, and loading (ETL) data. The collection of data in an LRS also reduces information silos, where data is only stored locally in a number of different modular systems, and has the potential to increase the availability of data for analysis. In this platform, we plan to automate the ETL process by taking incremental (5) Event log dumps from the LRS database, as seen in Fig. 2, and sending them to the Learning Analytics Tool for automated processing.
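An incremental dump can be sketched with a simple watermark: the analysis side remembers the id of the last event it received and asks only for newer ones. The Python example below uses an in-memory SQLite database as a stand-in for the LRS database; the table name, column names, and function names are hypothetical and do not reflect the OpenLRS schema.

```python
import sqlite3


def create_store(conn):
    """Create a stand-in event table with a monotonically increasing id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, statement TEXT NOT NULL)"
    )


def add_event(conn, statement):
    """Store one serialized event statement."""
    conn.execute("INSERT INTO events (statement) VALUES (?)", (statement,))


def incremental_dump(conn, last_id):
    """Return events newer than `last_id`, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, statement FROM events WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    new_last_id = rows[-1][0] if rows else last_id
    return rows, new_last_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_store(conn)
    for text in ("event a", "event b"):
        add_event(conn, text)
    batch, watermark = incremental_dump(conn, 0)
    print(len(batch), watermark)
```

Each run transfers only the events added since the previous run, so the analysis tool can be kept up to date without repeatedly exporting the full event log.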

Learning analytics tool
The Learning Analytics Tool acts as a dashboard portal system to display actionable results and outcomes of learning analytics in the form of visualizations. The portal is intended to serve a number of different stakeholders, from students comparing their individual progress against that of their anonymous peers, and teachers checking the overall progress of the classes under their care, to administrators surveying the effectiveness of the education they are offering in their institution. It is proposed that students and teachers will access the portal via a plugin within an LMS that will both authenticate the user and translate the UUIDs that are displayed in the portal into their corresponding real identities, depending on the user's role in the LMS. Teachers who are in charge of a class will be able to view the identities of all students within that specific class. However, students will only be able to view their own identity, and the identities of their peers will remain anonymous in the results of the analysis. Administrators log in to the portal through a local authentication system, and the visualizations will only contain anonymized results that protect the identities of individuals.
This tool is split into two main parts. The first part is a processing system that will analyze raw (5) Event log dumps from the LRS along with (2) Event and course data from the LMS. This process will extract and calculate relevant metrics for actionable results and outcomes and store these in a local database for analyzed data. The second part is a visualization system platform which will host customizable visualizations of the analyzed data.
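The role-dependent visibility rules described above can be written as a small decision function. The sketch below is a hypothetical illustration: the viewer dictionary, the role names, and the roster structure are assumptions, not the data model of the actual plugin.

```python
ANONYMOUS = "anonymous"


def display_name(viewer, subject_uuid, names, rosters):
    """Decide what name a viewer sees for the user identified by subject_uuid.

    viewer  : dict with "role", and "uuid" (students) or "classes" (teachers)
    names   : UUID -> real name mapping held inside the LMS
    rosters : class id -> set of student UUIDs
    """
    role = viewer["role"]
    if role == "teacher":
        # Teachers see identities only for students in classes they teach.
        taught = set()
        for class_id in viewer.get("classes", []):
            taught |= rosters.get(class_id, set())
        if subject_uuid in taught:
            return names.get(subject_uuid, ANONYMOUS)
    elif role == "student" and viewer.get("uuid") == subject_uuid:
        # Students see only their own identity.
        return names.get(subject_uuid, ANONYMOUS)
    # Administrators and everyone else see anonymized results.
    return ANONYMOUS
```

Keeping this decision in one place, on the LMS side, means the visualizations themselves never need to handle real identities.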
The learning analytics tool provides a dashboard in which visualization, feedback, and actions can be displayed. The dashboard provides functions to view learning records relating to individual contents over a specific period of time. In the current stage of development, it shows four graphs as shown in Fig. 4:
• A bar graph of comments made by readers. Individual comments can also be shown by hovering over the bar. Teachers can gain insight into the behavior of active learners in the class. Another possible use is for class time question and answer.
• A bar graph of markers drawn by readers. In BookRoll there are two types of markers: red to indicate important materials, and yellow for identifying contents that are difficult to understand. As with the comment graph, the text of individual markers can be viewed by hovering over the bar. This can alert teachers to areas of learning materials that may require revision. This can also be used to inform the design of flipped classes by seeing what sections students marked while reviewing learning materials before the class.
• A chord graph showing page reading transitions. This can identify whether students are reading linearly, and sections of the learning material that are skipped or jumped. The color of the transition chord indicates the frequency of transitions, and in the example, it can be seen that not all students are reading the final pages of the learning material.
• A bar graph showing the percentage of the learning material each student has read. This can be used to identify the amount of pre-class preparation students have done before a flipped class.
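The data behind the last two graphs can be derived from reading logs along the following lines. This is a minimal sketch under stated assumptions: log rows are taken to be in chronological order and to hold a user id and the page displayed, and the function names are hypothetical rather than part of the learning analytics tool.

```python
from collections import Counter, defaultdict


def page_transitions(logs):
    """Count (from_page, to_page) transitions per reader, summed over readers."""
    last_page = {}
    transitions = Counter()
    for row in logs:  # rows are assumed to be in chronological order
        user, page = row["user"], row["page"]
        if user in last_page and last_page[user] != page:
            transitions[(last_page[user], page)] += 1
        last_page[user] = page
    return transitions


def percent_read(logs, total_pages):
    """Percentage of distinct pages each student has displayed at least once."""
    seen = defaultdict(set)
    for row in logs:
        seen[row["user"]].add(row["page"])
    return {user: 100.0 * len(pages) / total_pages for user, pages in seen.items()}


if __name__ == "__main__":
    sample_logs = [
        {"user": "a", "page": 1}, {"user": "a", "page": 2},
        {"user": "b", "page": 1}, {"user": "a", "page": 3},
        {"user": "b", "page": 4},
    ]
    print(page_transitions(sample_logs))
    print(percent_read(sample_logs, total_pages=4))
```

Transitions between non-adjacent pages, such as (1, 4) in the sample, correspond to jumps, while a low percentage for a student suggests limited pre-class reading.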
The collaborative LA platform framework as shown in Fig. 6 would support the development of individual branches of the master template by separate institutions for conducting localized research and analysis. The Collaborative LA Platform works in two main ways: the creation of a new branch for customization based on the master template, and the check-in and merging of newly developed features back into the shared master template.
An example of how the framework would work is as follows: University A in Fig. 6 joins the Collaborative LA Platform and requests that a new branch of the master template be created so it can be customized for their particular needs. They decide to select Moodle as their LMS, BookRoll as a behavior sensor, Learning Locker as their main LRS, and the Analysis and Visualization components from the Analysis and Results pool. These individual components of the platform can be easily connected using interoperable standards that have been defined and tested in the master template. Meanwhile, University B has been conducting research into a new behavior sensor component and wants to share the results and conduct wider evaluation in other institutions. They start by checking in their new component to their branch template, and then request to merge the branch into the master template. Once the merge has been tested and approved, the new component is then available for use by other institutions in their own branch of the master template.
One of the major advantages of the framework is that it provides a selection of LA tools in the master template that have been tested for interoperability through the version control process. It also allows for the collaborative development of LA tools by offering a common platform on which additional tools are added and distributed. We believe this will not only help the development of interoperable LA tools, but also support the furthering of learning analytics as a whole.


Distributed learning record based on blockchain
Localized inter-system authentication and learning record collection have become standardized in recent years; however, there still remains a problem when transferring learning records between institutions. This problem can occur in the following situations: when a student changes school or continues on to higher education, and when a teacher moves to a different educational institution. In these circumstances, it would be ideal to transfer previous learning records to the new institution for long term learning analytics.
As seen in Fig. 7, we have proposed that a blockchain based distributed learning record could facilitate connecting separate LA platforms for learning record transfer. An advantage of the proposed system is that it gives students and teachers control over the transfer process of their personal learning record data through an immutable transaction ledger. For more details, please refer to Ocheja, Flanagan, and Ogata (2018).
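The idea of an immutable transaction ledger can be illustrated with a minimal hash chain, in which each block stores the hash of the previous block so that later tampering is detectable. This is a deliberately simplified sketch and not the design of the proposed system: it has no distribution, consensus, or access control, and the record fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def block_hash(block):
    """Hash the block contents (excluding the stored hash itself)."""
    payload = json.dumps(
        {key: block[key] for key in ("index", "prev_hash", "record")},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, record):
    """Append a transfer record, linking it to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    block = {"index": len(chain), "prev_hash": prev_hash, "record": record}
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain):
    """Check that every block is intact and correctly linked to its predecessor."""
    prev_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if (block["index"] != index
                or block["prev_hash"] != prev_hash
                or block["hash"] != block_hash(block)):
            return False
        prev_hash = block["hash"]
    return True
```

For example, a learner approving the transfer of a record from one institution to another could be stored as one block; changing that record afterwards would alter its hash and break the link to every later block.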

Conclusion
In this paper, we introduced learning analytics platforms that have been established in two top national Japanese universities. Research into these LA platforms has been part of a broader research project into creating wide-reaching learning analytics platforms to support education and learning through research into educational big data. We discussed the advantages and disadvantages of creating an LA platform that is LMS dependent or independent, and looked at several unique issues that can arise. An inherent problem with researching and developing learning analytics tools on proprietary systems is that it is difficult to replicate research in other institutions. One of the major issues that impede replication is that proprietary systems often lack interoperability with other tools or systems being used at various institutes. In an effort to overcome this problem and support the replication of LA research across various institutes, we have created a standards-based LMS independent LA platform to support the collection and analysis of educational big data.
Also, we propose a future direction for our research into learning analytics platforms and introduce a model in which learning analytics tools and the results of research can be shared between different education institutes through a version control framework.
Another problem faced in learning analytics research is that of data continuity. Currently, when a student or teacher moves to a different education institute to either change or continue their education, only simple records of their achievements are usually transferred in the form of a transcript. This poses a problem in learning analytics research, as there is little information about the learning behavior of the student at the previous institution, and the process of collection, analysis, and informing practice has to be started from scratch again. To overcome this problem, we have proposed that research into a blockchain based distributed learning record can enable the secure transfer of data between different institutions while giving students and teachers control over transactions.

References
Psychology, 49, 345-375.
Salomon, G. (1997). Distributed cognitions: Psychological and educational considerations. Cambridge University Press.
Shimada, A., Okubo, F., Yin, C., Kojima, K., Yamada, M., & Ogata, H. (2015). Informal learning behavior analysis using action logs and slide features in e-textbooks.
Yin, C., Okubo, F., Shimada, A., Oi, M., Hirokawa, S., Yamada, M., Kojima, K., & Ogata, H. (2015). Analyzing the features of learning behaviors of students using ebooks. In Proceedings of the International Conference on Computers in Education (pp. 617-626).

Fig. 1. Overview of the LMS dependent platform design at Kyushu University

Fig. 2. Overview of the LMS independent platform design

Fig. 3. A screenshot of the BookRoll digital learning material reader

Fig. 4. Visualization in the dashboard view of the learning analytics tool