How do secondary students engage in complex problem-solving processes in a STEM project?

Abstract: STEM education emphasizes improving student learning by linking abstract knowledge with real-world problems and engaging students in authentic projects to solve those problems. Accordingly, project-based learning has been widely promoted in STEM programs and has shown a promising impact on student learning. However, solving real-world problems in STEM projects involves complex processes. It remains unclear how students engage in complex problem-solving processes in STEM projects and how these processes differ among students. This study was conducted with secondary school students who engaged in a design-based STEM project in small groups. The findings show that questioning and responding appeared most frequently and connected with other elements in group discourse, while argumentation and justification appeared least frequently. The findings also reveal distinctive discourse patterns that differ among high-, medium-, and low-performing groups. Implications of the findings are discussed.


Introduction
STEM education has been widely promoted to prepare future citizens to meet the global challenges of the modern world. Different from traditional approaches to teaching and learning, STEM education emphasizes learning by linking abstract knowledge with real-world problems and engaging students in authentic projects to solve real-world problems. In this context, project-based learning has been widely promoted in STEM education and has shown promising effects on improving student learning. However, solving real-world problems in STEM projects involves complex processes. It remains unclear how students engage in complex problem-solving processes in STEM projects and how their processes may differ among students. To address the gap, this study investigated how secondary school students engaged in problem-solving processes in a design-based STEM project in small groups and how their processes differ among high-, medium-, and low-performing groups.

STEM education with problem-solving projects
STEM education has received global interest from educators, policy makers, and researchers to meet the growing demand for human capital in STEM fields and maintain economic competitiveness (Martín-Páez et al., 2019). Although the acronym STEM was once referred to as a single discipline, it is now generally recognized as an integration of STEM disciplines (English, 2016). To facilitate integrated STEM education, new approaches to STEM teaching and learning such as design-based learning (Bozkurt Altan & Tan, 2021; Chen et al., 2023), project-based learning (Hanif et al., 2019; Lou et al., 2017), and maker-centered learning (Chen & Lin, 2019) have been increasingly promoted, along with specific instructional strategies such as the 5E model (engagement, exploration, explanation, elaboration, and evaluation) proposed for STEM education (Eroğlu & Bektaş, 2022).
The key feature of these approaches to STEM education is that students are expected to learn by linking abstract knowledge to real-world contexts through working on authentic projects to solve real-world problems. In STEM projects, students often need to apply subject knowledge to explore real-world problems through inquiry-based learning activities (e.g., Chen et al., 2018) and/or design solutions to solve real-world problems through design-based learning activities (e.g., Cunningham et al., 2020). Research indicated that integrated STEM education through authentic projects provides students with opportunities not only for the acquisition and application of multidisciplinary knowledge, but also for obtaining relevant, holistic, and engaging experience to develop higher-order thinking skills (Bozkurt Altan & Tan, 2021; Chen & Lin, 2019; Hanif et al., 2019; Ugras, 2018; Yalçın & Erden, 2021).
Previous empirical studies revealed that solving real-world problems involves complex processes such as framing the problem, analyzing the problem, formulating and justifying hypotheses, and taking actions to design and develop solutions (Peng et al., 2023; Wang et al., 2018; Wu & Wang, 2012). Moreover, problem-solving processes involve not only cognitive components, but also metacognitive and social communication-related components. Tan et al. (2018) conceptualized the discourse of collective problem-solving in three dimensions: a cognitive dimension (e.g., problem analysis and defining, solution generation and evaluation), a metacognitive dimension (e.g., planning, monitoring, and regulation), and a social dimension (e.g., questioning and responding). However, there is a lack of knowledge regarding how students go through complex problem-solving processes in STEM projects and how their processes may influence their performance (Chen et al., 2021).

Discourse analysis of complex thinking and learning processes
To understand how students go through complex thinking and learning processes, discourse analysis has been increasingly used by researchers (Chinn et al., 2000; Oshima et al., 2020; Wieselmann et al., 2021). Discourse is a sequence of cognition and talk that reflects humans' underlying cognitive structure. Interpersonal dialog reveals a social mode of thinking or cognition. Research on classroom dialog has shifted from traditional teacher-centered instruction to more student-centered learning activities; it pays more attention to group discussion in collaboration (Howe & Abedin, 2013). For example, Mercer (1995) identified three types of talk in collaborative learning in the classroom: cumulative talk, disputational talk, and exploratory talk. Further, specific types of educationally productive talk, such as argumentation (Chin & Osborne, 2010), collaborative reasoning (Reznitskaya et al., 2009), knowledge building (Oshima et al., 2020), and socially shared regulation (Zheng et al., 2019), were examined to uncover critical features, distinctive discourse patterns, and underlying mechanisms in favor of effective learning and high-quality performance.
To analyze student thinking through discourse analysis, multiple methods have been applied. They include ethnography, network analysis, sequential analysis, and deep learning algorithms, which offer more nuanced, multilevel, multidimensional perspectives (Oshima et al., 2020; Rojas-Drummond et al., 2006; Song et al., 2021; Zheng et al., 2019). Among them, epistemic network analysis (ENA) is a modeling technique that can identify and quantify temporal co-occurrence relationships between different components of discourse in a network model or graph (Shaffer et al., 2016). ENA has been widely applied to explore higher-order thinking and learning processes in a variety of contexts, such as creative thinking (Sun et al., 2022), design thinking (Wu et al., 2019b), computational thinking (Wu et al., 2019a), self-regulation in collaborative learning (Wu et al., 2020), TPACK development (Zhang et al., 2019), and knowledge building (Hod et al., 2020). STEM learning activities provide an ideal context for analyzing group discourse that involves complex thinking and learning processes (Wieselmann et al., 2021; Zheng et al., 2019). According to the epistemic frame theory, student thinking cannot be reduced to isolated components. Rather, student thinking and learning are a set of relationships among cognitive, metacognitive, and social components that change over time during collaborative learning (Shaffer, 2006). For this reason, this study adopted the ENA method to visualize discourse patterns in STEM project-based learning.

The present study
In STEM education, students are often expected to learn by collaboratively working on authentic projects to solve real-world problems. Student learning in such contexts often involves complex processes in multiple dimensions. However, there is inadequate research investigating how students engage in complex problem-solving processes in STEM projects and how their processes may differ among students of different levels. To address the gap, this study aims to answer the following research questions (RQs).
RQ1: How do secondary students engage in problem-solving processes in a design-based STEM project?
RQ2: How do high-, medium-, and low-performing groups differ in their problem-solving processes in a design-based STEM project?

Participants
The participants were 24 Grades 6-7 students from a secondary school in East China, including 15 males and 9 females. They participated in a school-based STEM program taught by a teacher with seven years of teaching experience in science and STEM subjects. The participants were randomly assigned to six gender-balanced small groups, with four members in each group.

Learning materials
The STEM program in this study focused on the scientific concepts of density and buoyancy, aligned with the K6 science curriculum standards. Before attending the STEM program, students had learnt the basics of density and buoyancy in their physics course. In the STEM program, students were expected to connect the learnt knowledge to real-world problems by working on a design-based STEM project. They were asked to use the given materials (including Kraft paper, tin foil, straw, tape, and ice cream sticks) to build a paper boat with load capacity. The expected size of the paper boat was about 20 cm × 12 cm × 10 cm. Students were also given plastic pieces to test the load capacity of the boat. The more pieces a floating boat can hold, the better the performance of the boat.

Procedure
At the beginning of the study, students received an introduction to the study and signed a consent form to confirm their participation in the study. In the following five weeks, students attended one STEM lesson per week in a school classroom. Each lesson lasted one and a half hours. In Lesson 1, students used plasticine to explore the factors that determine whether an object floats or sinks in water. In Lesson 2, students used four different liquids (tap water, vegetable oil, honey, and medical alcohol) to explore the relationship between liquid density and buoyancy. In Lesson 3, students designed paper boats of different shapes to investigate factors that affect buoyancy. In Lesson 4, they tested how a boat's material affects its load capacity. In Lesson 5, each group refined their product, presented it in class, and created a poster showing the detailed design. Fig. 1 shows a paper boat generated by one group of students and their poster.

Fig. 1. Learning artifacts
During each lesson, students received brief instruction from the teacher and then worked in small groups on the project by engaging in conceptual design, prototype modeling, and product refinement, and by dealing with challenges such as the stability of the boat and the center of gravity.

Measures and instruments
Learning artifacts. The quality of student-generated paper boats was evaluated in terms of the maximum load capacity of the boat. It was tested by counting the number of plastic pieces that a floating paper boat can hold, with one point per piece. The more plastic pieces a paper boat can hold, the better its performance is. According to the instructor's experience, a paper boat of the expected size made of the given materials can hold 30 to 70 plastic pieces, i.e., the raw scores range between 30 and 70. The performance scores were obtained by normalizing the raw scores, i.e., performance score = (raw score - 30)/(70 - 30), ranging between 0 and 1. If the raw score is above 70 or below 30, the normalized score is 1 or 0, respectively; neither case occurred in this study.
Group discourse. The conversations of all groups of students during the project were audio-recorded for analysis of their problem-solving processes.
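The normalization of raw scores described under Learning artifacts can be sketched in a few lines of Python. This is only an illustration: the function name `performance_score` and its parameter names are assumptions introduced here, while the bounds 30 and 70 are taken from the text.

```python
def performance_score(raw_score, lower=30, upper=70):
    """Min-max normalize a raw load-capacity score to the range [0, 1].

    `lower` and `upper` are the expected minimum and maximum numbers of
    plastic pieces (30 and 70 in the text). Scores outside that range
    are clamped to 0 or 1.
    """
    scaled = (raw_score - lower) / (upper - lower)
    return max(0.0, min(1.0, scaled))


if __name__ == "__main__":
    # A boat holding 50 pieces sits halfway between the two bounds.
    print(performance_score(50))
```

Clamping is written explicitly so that the out-of-range rule is visible in the code, even though no such scores were observed in the study.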

Data analysis
Based on the performance scores (ranging from 0 to 1) of the learning artifacts, the performance level of each group was identified as high (if the score is more than 0.70), medium (if the score is between 0.40 and 0.70), or low (if the score is less than 0.40) for further analysis.
The recorded group conversations were transcribed and coded in Chinese for analysis. The examples of group conversations or episodes illustrated in this article were translated into English for presentation purposes only. Verbatim transcription of group discussions was generated automatically using the iFlytek natural language processing service and corrected manually. All conversations were segmented into separate turns of student talk and organized in a well-structured data table including group number, lesson number, student name, and utterance, as shown in Table 1.
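The mapping from normalized score to performance level can be written as a small function. The sketch below assumes that higher scores indicate better performance and that the boundary values 0.40 and 0.70 fall in the medium band; the function name `performance_level` is an illustrative assumption.

```python
def performance_level(score):
    """Map a normalized performance score in [0, 1] to a level label.

    Assumption: scores above 0.70 are "high", scores below 0.40 are
    "low", and everything in between (boundaries included) is "medium".
    """
    if score > 0.70:
        return "high"
    if score >= 0.40:
        return "medium"
    return "low"


if __name__ == "__main__":
    for s in (0.2, 0.55, 0.9):
        print(s, performance_level(s))
```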

Coding scheme for group discourse
The transcribed conversation data were coded to identify featured categories occurring in student conversations. We adapted the coding scheme of Tan et al. (2018), revising it to fit the STEM project-based learning context. For example, the problem definition and problem analysis categories were replaced by the categories of information and knowledge and argumentation and justification. This is because when analyzing problems in a STEM project, students often discuss problem-related information and knowledge and get involved in arguments or justifications. Some categories with very low frequency (e.g., affectivity, dis-affective) were removed. Some similar categories with relatively low frequency were merged; for example, solution generation and solution evaluation were merged into solution exploration. The purpose of such revisions was to obtain a parsimonious epistemic network model for effective analysis (Wang et al., 2021). The revised coding scheme includes eight categories in three dimensions, the details of which are presented in Table 2. The first two authors coded a sample of 300 utterances; the interrater reliability (Cohen's Kappa) of the eight categories ranged between 0.68 and 0.75, which is generally considered satisfactory. After the differences in their coding results were discussed and resolved, the second author coded all the remaining data.
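For readers unfamiliar with Cohen's Kappa, the following sketch shows how the statistic can be computed for one category from two coders' judgments. It is a generic textbook computation, not the authors' analysis script; the function name `cohens_kappa` and the toy labels (1 = category present, 0 = absent) are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)
    # Observed agreement: share of items on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement from each coder's label proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical judgments for one category over four utterances.
    a = [1, 1, 0, 0]
    b = [1, 0, 0, 0]
    print(cohens_kappa(a, b))
```

Kappa corrects raw agreement for agreement expected by chance, which is why it is usually preferred over simple percent agreement when reporting coding reliability.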

Epistemic network analysis (ENA) of discourse data
The coded conversation data were analyzed with the ENA method using the ENA Web Tool (version 1.7.0) (Marquart et al., 2018). The purpose was to analyze the connections between featured categories or elements in group conversations, in addition to the quantities and percentages of each featured category or element. The main output of ENA is a network model presented in a graph, which includes a set of nodes representing featured categories or elements; the edges connecting the nodes represent the co-occurrence of two categories or elements; the thickness of the edges indicates the frequency of co-occurrence of two categories or elements. In this way, a network model represents the structures of co-occurrence relationships or connections between the categories or elements in discourse within a temporal context. The temporal context in this study was defined as a window of five utterances, since we found that the most meaningful connections occurred within five utterances in student conversations in this study.
To answer RQ1, the frequency of each category occurring in student conversations of all six groups (i.e., the whole class) was reported; the mean epistemic network model of all six groups' discourse was generated and elaborated. To answer RQ2, the mean epistemic network models of high-, medium-, and low-performing groups' discourse were subtracted and compared. Qualitative analysis of group discourse by presenting typical excerpts was used to justify the identified patterns from the epistemic network models.
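The idea of counting co-occurrences within a window of five utterances can be illustrated with a simplified Python sketch. It covers only the counting step under stated assumptions: each utterance is a set of category codes, and a connection is counted once per utterance when a code in that utterance co-occurs with a different code in the same utterance or in the preceding utterances of the window. It does not reproduce the normalization and dimensional reduction performed by the ENA Web Tool, and the function name `cooccurrence_counts` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(coded_utterances, window=5):
    """Count code co-occurrences within a moving window of utterances.

    `coded_utterances` is a list of sets of category codes, in temporal
    order. The window covers the current utterance plus the preceding
    `window - 1` utterances.
    """
    counts = Counter()
    for i, current in enumerate(coded_utterances):
        start = max(0, i - window + 1)
        # Codes that appeared in the earlier utterances of the window.
        previous = set().union(*coded_utterances[start:i])
        pairs = set()
        # Connections among codes within the current utterance.
        for a, b in combinations(sorted(current), 2):
            pairs.add((a, b))
        # Connections between the current utterance and earlier ones.
        for a in current:
            for b in previous:
                if a != b:
                    pairs.add(tuple(sorted((a, b))))
        counts.update(pairs)
    return counts


if __name__ == "__main__":
    # Hypothetical coded talk, not data from the study.
    talk = [{"Question & Response"}, {"Planning"}, {"Solution Exploration"}]
    for pair, k in sorted(cooccurrence_counts(talk, window=5).items()):
        print(pair, k)
```

A larger window links codes that are further apart in the conversation, so the choice of five utterances directly shapes which connections appear in the resulting network.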
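Comparing two mean networks by subtraction can be pictured as taking the difference of their edge weights. The sketch below assumes each network is a dictionary from a pair of categories to a mean connection weight; the function name `subtract_networks` and the weights in the example are hypothetical and are not results from this study.

```python
def subtract_networks(net_a, net_b):
    """Return edge-wise differences (net_a minus net_b).

    Each network maps a (category, category) pair to a mean connection
    weight. Edges missing from one network are treated as weight 0.
    A positive difference means the connection is stronger in net_a.
    """
    edges = set(net_a) | set(net_b)
    return {edge: net_a.get(edge, 0.0) - net_b.get(edge, 0.0) for edge in edges}


if __name__ == "__main__":
    # Hypothetical mean edge weights for two groups of learners.
    group_a = {("Planning", "Question & Response"): 0.5,
               ("Information & Knowledge", "Solution Exploration"): 0.25}
    group_b = {("Planning", "Question & Response"): 0.25,
               ("Monitoring & Reflection", "Regulation"): 0.5}
    for edge, diff in sorted(subtract_networks(group_a, group_b).items()):
        print(edge, diff)
```

In a subtracted network graph, the sign of each difference determines the color of the edge and its magnitude determines the thickness.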

Product performance
The evaluation result of students' product performance is presented in Table 3.

Overview of featured categories in group discourse
The frequency of each category occurring in student conversations of all six groups is presented in Table 4. Among the six groups, the number of utterances in student talk varied from 557 in Group 4 to 1271 in Group 5 (Mean = 981.17, SD = 253.63). Group 4 had the fewest utterances, well below the average. The discussion record shows that most members of this group made long talks, and the students in this group spent more time working on the worksheets silently.
Note. K = number of utterances in each category; N = number of utterances in all categories; % = the percentage of utterances

Mean epistemic network of the whole class
The conversations of all six groups were accumulated to produce the mean network model presented in Fig. 2. The graph shows the connections (co-occurrence relationships) between the featured categories identified in the conversations of all groups. Each point (in green) represents the centroid of the network of a piece of discourse. The square represents the centroid of the mean network of the discourse of six groups.

Salient properties of the network
The produced network model was projected in a two-dimensional graph over the X-axis and Y-axis, which can capture the salient properties of the network. The categories distributed around the X-axis are related to cognitive and social communicative processes. The right side focused on Solution Exploration, i.e., it was solution-focused. The left side focused on Question & Response, which is not directly solution-focused but is an important prerequisite for creating or polishing solutions. The categories around the Y-axis are related to metacognitive processes, varying from Planning (upper part of the graph) to Regulation (lower part of the graph). The former focused on planning at the initial stage of a task, while the latter focused on regulation during a task. The mean network of the whole class showed that Question & Response was strongly connected with Planning, Information & Knowledge, Monitoring & Reflection, and Regulation. Besides, Information & Knowledge was strongly connected with Solution Exploration. On the other hand, argumentation and justification occurred less often and had weak connections with other categories in student conversations.

Comparison of network graphs among high-, medium-, and low-performing groups
Fig. 3 presents subtracted network graphs comparing the mean network graphs for the high-performing groups (Groups 1 and 5), the medium-performing groups (Groups 2, 3, and 6), and the low-performing group (Group 4). It reveals the differences in discourse patterns among the high-performing groups (in red), the medium-performing groups (in blue), and the low-performing group (in green).
In the ENA, the network of one piece of discourse can be represented as a point in the two-dimensional space over the X-axis and Y-axis, i.e., the centroid of a network graph. Points located close to each other indicate that the corresponding pieces of discourse have a similar pattern of node connections. In Fig. 3, the red square represents the centroid of the networks for the high-performing groups; the blue square represents the centroid of the networks for the medium-performing groups; and the green square represents the centroid of the network for the low-performing group. The boxes surrounding the means stand for the 95% confidence intervals for the location of the means.
The centroids in Fig. 3 show that the high- and medium-performing groups kept a balance between exploring solutions and receiving and responding to requests for help or cooperation, while the low-performing group mainly focused on exploring solutions with less conversation on requests or responses for in-depth exploration. Besides, the medium-performing groups balanced their conversations between task planning and task regulation, while the high- and low-performing groups focused their conversations on task planning more than on task regulation.
As shown in the left part of Fig. 3, Question & Response was more connected with Planning and Solution Exploration in the high-performing groups' discourse than in the medium-performing groups' discourse. The discourse of the medium-performing groups showed more connections among Regulation, Monitoring & Reflection, and Information & Knowledge; these categories, however, had weak connections with Solution Exploration.

Discussion
Receiving and responding to requests for help or cooperation appeared most frequently in student conversations, followed by exploring problem-related information and knowledge, regulating task progress, task planning, and monitoring and reflecting on task progress and group members' performance. Among them, receiving and responding to requests for help or cooperation was strongly connected with other categories, and exploring problem-related information and knowledge was strongly connected with exploring solutions. On the other hand, argumentation and justification appeared least frequently in group discourse and had weak connections with other categories.
The group discourse patterns varied among high-, medium-, and low-performing groups. The high- and medium-performing groups kept a balance between exploring solutions and dealing with group members' questions and requests, while the low-performing group mainly focused on exploring solutions with less conversation on requests or responses for in-depth exploration. Besides, the medium-performing groups put more focus on task regulation than the high- and low-performing groups did.
The above findings from the network graphs are consistent with the discourse records. For example, the excerpt in Table 5 shows that students in a high-performing group (Group 5) kept scrutinizing and improving their solutions by dealing with requests and comparing alternative plans. Table 6 shows that students in a medium-performing group (Group 2) co-regulated the task and group members' behavior to establish a collective understanding of problem-related information.
Yes. I did a small test with FYC before. Its limit is 3.5, the maximum of a piece of A4 paper is 3.5

Information & Knowledge
Table 7 shows that the conversation in a low-performing group (Group 4) focused on exploring solutions and problem-related information and knowledge, with little conversation about the questions the students encountered during the task. They failed to deepen conceptual understanding and apply the knowledge to generate sound solutions.

Conclusion
Project-based learning has been widely promoted in STEM programs in school contexts. Students are expected to learn by working with authentic projects, often in a collaborative way. In most STEM projects, students are requested to explore real-world problems through inquiry-based learning activities and/or design solutions to solve real-world problems through design-based learning activities. While learning by working with authentic projects to solve real-world problems has shown promising effects on STEM teaching and learning, it remains to be seen how students engage in complex problem-solving processes in STEM projects and how the processes differ among students of different levels.
This study was conducted with secondary students who engaged in a design-based STEM project. The findings show that questioning and responding appeared most frequently in group discourse, while argumentation and justification appeared least frequently. The high-performing groups strongly connected questioning and responding with exploring solutions, and focused more on task planning than task regulation. While the medium-performing groups kept a balance between exploring solutions and questioning and responding, they put more focus on task regulation than the high-performing groups did. The low-performing group focused on solution exploration, which, however, was not well connected with questioning and responding; the latter is crucial to stimulating in-depth exploration of the problem and solution during the task. The findings imply that when promoting project-based learning in STEM education, students can be guided on how to engage in productive processes of problem-solving by encouraging questioning and responding to stimulate in-depth exploration of the problem and the solution; further, students can be encouraged to focus more on task planning than on task regulation during the project.
The limitations of the study should be noted. The small sample size may limit the generalizability of the findings to some extent. The participants of this study were from one secondary school, which may constrain the generalization of the findings. Further studies will be conducted to address the limitations.

Fig. 3. Subtracted networks of group discourse of high- versus medium-performing groups (left part) as well as medium- versus low-performing groups (right part)

As shown in the right part of Fig. 3, Solution Exploration was more connected with Planning and Information & Knowledge in the low-performing group's discourse than in the medium-performing groups' discourse. Nevertheless, these categories had weak connections with Question & Response in the low-performing group.

Table 1
Excerpt of discourse transcript
on, and then there is another one. Why? Because it is easy to float if it's hollow. It says to use a whole piece of plasticine.

Table 3
Evaluation of students' product performance

Table 4
Frequency of each category in group discourse

Table 5
Excerpt of a high-performing group's discourse

Table 6
Excerpt of a medium-performing group's discourse
t crossed the waterline yet, wait, wait, put it over there. It's almost at the waterline here. Come and look, HXL, come and look! This corner is okay, put it here, 15 pieces, 16... ... 18 pieces, wow, great, great! Don't put it there, put it in this corner, 20, this is close to the limit, 21 pieces. ZLZ, well done. Hey, do you remember the size of this? Because I remember you drew a line on it, do you remember the size?

Table 7
Excerpt of a low-performing group's discourse