AI Ethics Education and Evaluation Framework Based on Case Method Using Network Motif and Sentiment Analysis

Tengfei Shao, Global Education Center, Waseda University, Tokyo, Japan, tengfei@aoni.waseda.jp
Hidehiro Kanemitsu, Global Education Center, Waseda University, Tokyo, Japan, kanemih@waseda.jp
Xu Wang, Faculty of Informatics, Gunma University, Gunma, Japan, xu.wang@gunma-u.ac.jp
Akinobu Sakata, Graduate School of Creative Science and Engineering, Waseda University, Tokyo, Japan, tsakata.a@aoni.waseda.jp
Shingo Takahashi, Graduate School of Creative Science and Engineering, Waseda University, Tokyo, Japan, shingo@waseda.jp
Goto Masayuki, Graduate School of Creative Science and Engineering, Waseda University, Tokyo, Japan, masagoto@waseda.jp

As AI plays an expanding role in consequential decision making, pedagogical models that promote both moral understanding and emotional sensitivity become increasingly important. To address this need, this paper proposes a new pedagogical model, the AI Ethics Education and Evaluation (AEEE) framework. It combines case-based teaching, network motif analysis, and sentiment analysis to capture multiple dimensions of moral growth. The framework was first applied to the Japanese job market, where algorithmic screening techniques raise culturally specific ethical concerns. The framework has three cumulative stages: it exposes students to real-world moral dilemmas, tracks how the structure of their discourse evolves through motif patterns, and observes changes in emotion through sentiment scores. Results from graduate students showed an increase in motifs that signal intellectual agreement, together with a more positive overall emotional tone. These findings suggest that the AEEE framework not only prompts students to think more critically about ethics but also improves how they engage emotionally with ethical questions. The approach thus supports a broader assessment of AI ethics pedagogy and its societal consequences.

CCS Concepts: Social and professional topics → Management of computing and information systems;

Additional Keywords: AI ethics education, Case method, Job hunting, Network motif, Sentiment analysis

ACM Reference Format:
Tengfei Shao, Hidehiro Kanemitsu, Xu Wang, Akinobu Sakata, Shingo Takahashi and Goto Masayuki. 2025. AI Ethics Education and Evaluation Framework Based on Case Method Using Network Motif and Sentiment Analysis. In 2025 International Conference on Education, Knowledge and Information Management (ICEKIM 2025), June 20-22, 2025, Cambridge, United Kingdom. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3756580.3756677

1 Introduction

The rapid spread of artificial intelligence is having a profound impact on many industries, prompting calls for stricter ethical oversight to address increasingly urgent threats such as algorithmic bias and societal inequality [1] [2]. Among the dominant pedagogies, the Case Method (CM) stands out because it ties ethical dilemmas to practical scenarios, thereby encouraging deeper, contextual thinking [3] [4]. However, there is still little in the way of a standardized structure for systematically incorporating and evaluating case-based methods in AI ethics teaching [5]. This work presents the AI Ethics Education and Evaluation (AEEE) framework, which integrates network motif analysis and sentiment analysis to measure how students' moral reasoning evolves. The framework is applied to Japan's culturally distinctive job-hunting process, known as shukatsu, in which the use of AI-based screening systems raises substantial ethical concerns. Because purely structural analysis cannot capture emotional involvement, an important dimension of ethics learning, the framework adds sentiment analysis to monitor how emotions change. Combining the two analytical views allows a more detailed and systematic measurement of participants' moral development. The study has two objectives:

  • Develop an AEEE framework that incorporates network motifs and sentiment analysis.
  • Test its efficacy with a case study on AI-based recruitment in Japan.

2 Related Work

2.1 Case Method and AI Ethics

The Case Method (CM) is increasingly recognized as an effective means of teaching AI ethics because it prompts critical consideration of complex, multi-dimensional moral challenges. Quinn and Coghlan describe its success in medical schools, where it raises clinicians' awareness of moral dimensions and supports the careful adoption of AI tools [6]. Kooli and others point to its applicability in both academic and community programmes and to the value of developing ethics-focused teaching models tailored to local settings [7]. Although much work has been done, most of it is limited almost exclusively to work-based cultural and social contexts and builds mainly on earlier studies. Recent studies have called for a comprehensive ethics code covering the entire AI development process [8] [9]. Ashok et al. advocate flexible guidelines that adapt as technology advances and stress the importance of embedding such innovations in education systems more broadly [10]. Many of these proposals, however, lack an empirical foundation and do not address culture-specific applications. To address this limitation, our framework applies the case method to Japan's particular job-hunting tradition, shukatsu, and grounds the moral discussion in students' everyday lives. By incorporating network analysis of students' reflections, which gives the assessment an empirical basis, it offers a culturally sensitive approach to ethics teaching that fills a gap noted in existing research.

2.2 Network Motif and Sentiment Analysis

Network motifs, recurring subgraph patterns that appear more frequently than statistically expected, are useful analytical tools for investigating complex networks. They have been used to study behavioral patterns in areas ranging from tourism, where they reveal how tourists move between attractions, to trade, where they reveal the common commercial exchanges that underlie strategic thinking [11]. Shao et al. have shown that these patterns can help track how educational interventions affect thinking, revealing how changes in conversation relate to shifts in ethical views. Structural network analysis is complemented by sentiment analysis, which extracts emotional cues from text data. Advances in deep learning over the past few years have greatly expanded its applicability in areas such as education, marketing, and public discourse. Liu et al. examined students' emotional responses to online learning settings [12], whereas Yadav and Vishwakarma used transformer architectures to visualize emotions in socio-political discourse [13]. Zhao et al. combined sentiment analysis and network analysis to identify how emotional indicators shape consumption patterns [14]. Building on these advances, the present framework integrates sentiment analysis with motif-based network analysis to identify both the structural and the emotional transformation of ethical reflection.

3 Method

The AEEE Framework, shown in Figure 1, has three connected phases: the first phase involves teaching using the case method and preparing initial data, the second phase focuses on structural analysis and extracting network motifs, and the third phase deals with emotional perception through sentiment analysis.

Figure 1: AI Ethics Education and Evaluation Framework.

3.1 Education by Case Method

The first phase of the framework has four components. It begins with setting pedagogical goals, which entails identifying the moral dimensions of the AI technology under review and the intended participants. Next, case studies are developed to fit the desired instructional focus. These cases then form the basis of instruction under the case method, which invites participants to take positions on the moral dilemmas in an orderly, dialogical setting. The final component is the systematic collection of data over the course of the educational intervention. Various collection methods may be used, ranging from surveys and structured interviews to sensor-based monitoring. Data must be collected at least twice during the instruction process. The data may take different forms, such as quantitative scores, written accounts, recorded discussions, or biometric measurements, and all are converted into discrete or continuous formats amenable to further analysis. Importantly, the relationship between participants and data points is retained so that relational graphs, the building blocks of the subsequent network modeling, can be generated.

3.2 Network Motif Analysis

The technique integrates network construction with motif analysis by mapping all relevant textual data into a network and evaluating the significance of its motifs, as shown in Figure 2. The algorithm takes as input the textual data (T), a cohort of individuals (I), specified relations (R), a motif size (s), and a number of random graphs (M) for statistical analysis. A morphological analysis of the original textual dataset T, denoted MA(T), extracts morphemes from the text. The morphemes are then filtered by a Filter(M, O) function with a specified goal O, yielding a filtered subset S of morphemes used to represent the textual data in the network [15]. The procedure starts by initializing empty sets of vertices (V) and edges (E). For each individual-morpheme pair (i, m) in the set I×S, the algorithm checks whether a relation defined by R exists. If it does, both the individual and the morpheme are added as vertices to V and an edge is added to E, forming the realized network G = (V, E). The motif analysis then enumerates all connected subgraphs of size s in G, which are counted, classified by topology, and organized into a frequency set F. The algorithm then generates M random graphs that preserve the degree of each node in G; these serve as a basis for statistical comparison rather than representing actual or realized networks. Motif frequencies are computed for the random graphs in the same manner to establish the comparison set C.

Figure 2: Algorithm for network construction and motif analysis.
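
As a minimal illustration of the construction step, the following Python sketch builds G = (V, E) from the pairs in I×S that satisfy R and counts connected three-node subgraphs. The function names, the toy responses, and the choice to label three-node paths by the type of their center node are assumptions made for illustration; they are not taken from the algorithm in Figure 2. Because participants connect only to morphemes, the network is bipartite, so every connected three-node subgraph is a path and can be counted directly from node degrees.

```python
from math import comb


def build_network(individuals, morphemes, relation):
    """Build G = (V, E) from the pairs (i, m) in I x S for which R holds."""
    vertices, edges = set(), set()
    for i in individuals:
        for m in morphemes:
            if relation(i, m):
                person, word = ("person", i), ("morpheme", m)
                vertices.update((person, word))
                edges.add((person, word))
    return vertices, edges


def count_three_node_paths(edges):
    """Count three-node paths, grouped by the type of the center node.

    In a bipartite graph a node of degree d is the center of C(d, 2) paths.
    """
    degree = {}
    for person, word in edges:
        degree[person] = degree.get(person, 0) + 1
        degree[word] = degree.get(word, 0) + 1
    counts = {"person": 0, "morpheme": 0}
    for (kind, _name), d in degree.items():
        counts[kind] += comb(d, 2)
    return counts


# Hypothetical toy data: which morphemes each participant used.
responses = {"A": {"privacy", "bias"}, "B": {"privacy"}}
V, E = build_network(responses, {"privacy", "bias"},
                     lambda i, m: m in responses[i])
motifs = count_three_node_paths(E)
print(len(V), len(E), motifs)
```

A person-centered path corresponds to one participant using two morphemes, and a morpheme-centered path to two participants sharing one morpheme. Larger motif sizes would need a general subgraph enumeration, which this sketch does not attempt.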

For each motif type t, the method computes the mean (r_t) and standard deviation (σ_t) of its frequency across the random graphs, and then uses a Z-score (Z_t) based on these values to test whether the frequency of the motif in G differs from what the random graphs would predict [16]. The algorithm returns the set of significant subgraphs (S), together with their frequencies (F) and Z-scores (Z), providing a full motif analysis that highlights the key structural and statistical features of the network.
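
The comparison against random graphs can be sketched as follows, using the standard form Z_t = (f_t - r_t) / σ_t, where f_t (a symbol introduced here) is the frequency of motif t in G. The sketch assumes that the random graphs are produced by repeated edge swaps, one common way to preserve every node's degree; the text does not state which randomization procedure was used. It counts four-cycles (two participants sharing two morphemes) because three-node path counts are fixed by the degree sequence and therefore do not vary across such random graphs. All function names and the toy edge set are hypothetical.

```python
import random
from itertools import combinations
from math import comb
from statistics import mean, pstdev


def count_four_cycles(edges):
    """Count four-cycles: pairs of persons that share two morphemes."""
    neighbors = {}
    for person, word in edges:
        neighbors.setdefault(person, set()).add(word)
    return sum(comb(len(neighbors[a] & neighbors[b]), 2)
               for a, b in combinations(sorted(neighbors), 2))


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize a bipartite edge set by swapping the endpoints of edge pairs."""
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (p1, w1), (p2, w2) = edge_list[i], edge_list[j]
        new1, new2 = (p1, w2), (p2, w1)
        if new1 in edge_set or new2 in edge_set:
            continue  # skip swaps that would duplicate an existing edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_set


def motif_z_score(edges, n_random=100, seed=0):
    """Return (observed count, r_t, sigma_t, Z_t) for the four-cycle motif."""
    rng = random.Random(seed)
    observed = count_four_cycles(edges)
    null = [count_four_cycles(
                degree_preserving_shuffle(edges, 10 * len(edges), rng))
            for _ in range(n_random)]
    r_t, sigma_t = mean(null), pstdev(null)
    # Z_t is undefined when the random graphs show no variation.
    z_t = (observed - r_t) / sigma_t if sigma_t > 0 else float("nan")
    return observed, r_t, sigma_t, z_t


# Hypothetical toy network: (participant, morpheme) edges.
edges = {("A", "x"), ("A", "y"), ("B", "x"), ("B", "y"), ("C", "z")}
print(motif_z_score(edges, n_random=20))
```

The number of swaps per random graph (ten per edge here) and the use of the population standard deviation are illustrative choices rather than values taken from the study.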

3.3 Sentiment Analysis

To complement the structural insights derived from network motif analysis, we integrated sentiment analysis to explore the emotional undertones present in participants’ written reflections. While motif analysis maps the structural logic of ethical reasoning, sentiment analysis sheds light on how learners emotionally engage with ethical dilemmas related to AI. Taken together, these methods offer a dual perspective—one that captures both the cognitive architecture and the affective dimensions of ethical reflection. This integrative approach resonates with emerging research emphasizing the critical role of emotional engagement in effective ethics education [17].

Our analytical workflow began with conventional text preprocessing procedures, including the removal of extraneous characters and morphological analysis, followed by the extraction of semantically meaningful terms. The processed textual data were then analyzed using a transformer-based sentiment model fine-tuned on Japanese language corpora. Each response was assigned a sentiment score on a scale from 0 (strongly negative) to 1 (strongly positive). By aggregating these scores—for instance, comparing pre- and post-intervention responses to Question 6—we were able to observe shifts in emotional tone over the course of the learning process, in line with established practices in AI education and the broader social sciences [18].
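
The aggregation step can be illustrated with a short Python sketch that averages per-response scores before and after the intervention and reports the shift. The scores below are hypothetical placeholders, not data from the study, and the sentiment model itself is outside the sketch because the text does not name it; the function name is likewise an assumption.

```python
from statistics import mean


def mean_sentiment_shift(before, after):
    """Return (mean before, mean after, shift) for scores in [0, 1]."""
    for score in (*before, *after):
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"sentiment score out of range: {score}")
    mean_before, mean_after = mean(before), mean(after)
    return mean_before, mean_after, mean_after - mean_before


# Hypothetical per-response scores for one question.
pre_scores = [0.10, 0.20, 0.15]
post_scores = [0.30, 0.25, 0.20]
pre_mean, post_mean, shift = mean_sentiment_shift(pre_scores, post_scores)
print(f"before={pre_mean:.3f} after={post_mean:.3f} shift={shift:+.3f}")
```

A positive shift indicates that responses moved toward the positive end of the 0 to 1 scale.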

4 Experiment

Japan's distinctive cultural and social mores have a large role to play when it comes to its employment practices. This is particularly evident with shukatsu, the traditional Japanese job-seeking process, which prioritizes seniority and lifelong commitment. The introduction of AI into this traditional structure has raised new ethical dilemmas that will have far-reaching implications for society [19]. These developments require robust ethical oversight in relation to the use of AI in hiring processes to protect the rights of job applicants and to ethically employ AI in hiring.

One extreme example was a recent recruitment platform that used data collected from personal social media accounts to forecast, and profit from, candidates' chances of not receiving a job. This was one of the more extreme ethical issues we encountered during our investigation. In response, we incorporated a purpose-built ethical survey, combined with contextualized case studies, into our AEEE framework. The survey consists of seven items (Q1 to Q7; Table 1) that ask participants for their views on privacy, fairness, acceptance, benefits, and overall support regarding AI in hiring contexts.

Table 1: Ethical Awareness Questionnaire Items.
Question | Category | Content | Eval.
Q1 | Privacy | Do you believe that the use of data by AI should be curtailed from a privacy perspective? | 1-7
Q2 | Fairness | Do you consider AI's assessment of human capabilities to be fair? | 1-7
Q3 | Acceptance | Are you comfortable with your job applications being evaluated by AI systems? | 1-7
Q4 | Benefit | Do you think it is beneficial for companies to employ AI in evaluating students? | 1-7
Q5 | Support | Do you agree or disagree with the use of AI to evaluate applicants for a job? | 1-7
Q6 | Problem | Describe the problem of the case and why it is a problem in logical detail. | Text
Q7 | Reason | Do you support the future implementation of AI systems for evaluating job applicants once legal issues have been addressed? | Text

We conducted a structured study with 20 graduate students from Waseda University to evaluate how their ethical reasoning was expressed in the ethical survey. First, participants completed the baseline ethics survey (Q1-Q5) and wrote their responses to the ethical scenarios (Q6-Q7). Next, participants joined facilitators in 30-minute discussions in groups of four. The discussions were designed to surface divergent viewpoints. Participants then completed the same task again 7 days later, which allowed us to assess changes in both cognitive reasoning and emotional engagement.

5 Results

5.1 Network Motif Analysis

In the final step of the AEEE framework, participants' answers to Q6 and Q7, before and after the intervention, were converted into four text-based networks (Figure 3). Morphological parsing identified 1,491 morphemes, from which 16 punctuation marks were excluded, and produced 5,918 valid word tokens in total. Each response was represented as a graph connecting participants to individual morphemes. Following the intervention, the Q6 network grew from 515 nodes and 1,418 edges to 1,023 nodes and 2,789 edges, indicating greater lexical diversity and more complex discourse. The level of engagement did not diminish, as the average node degree and network density remained consistent. The Q7 network showed a similar increase, indicating that the intervention increased expressive complexity without disrupting the overall linguistic balance.
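
The reported Q6 node and edge counts can be checked with simple arithmetic, sketched below in Python. Average degree follows directly from the counts. The density calculation is an assumption: the text does not state which density formula was used, so the sketch uses the bipartite form (edges divided by participant-morpheme pairs) and further assumes that all 20 participants appear as nodes in each network.

```python
def average_degree(n_nodes, n_edges):
    """Average node degree of an undirected graph: 2E / N."""
    return 2 * n_edges / n_nodes


def bipartite_density(n_nodes, n_edges, n_participants):
    """Edges divided by the number of possible participant-morpheme pairs."""
    n_morphemes = n_nodes - n_participants
    return n_edges / (n_participants * n_morphemes)


# Q6 node and edge counts reported in Section 5.1.
q6_before = (515, 1418)
q6_after = (1023, 2789)
N_PARTICIPANTS = 20  # assumption: every participant is a node in both networks

for label, (nodes, edges) in (("before", q6_before), ("after", q6_after)):
    print(label,
          round(average_degree(nodes, edges), 3),
          round(bipartite_density(nodes, edges, N_PARTICIPANTS), 3))
```

Under these assumptions both quantities change only slightly between the two Q6 networks, which is consistent with the stability described above.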

Figure 3: Complex networks of Q6 and Q7 (Before and After).

Next, we analyzed the frequency of motifs of size three and four (Figure 4) and found six patterns (M1-M6). M1 and M3 reflected individual lexical variety; M2 and M6 highlighted shared morphemes that pointed to topical consensus or focal participants; M4 reflected convergence (partial overlap); and M5 indicated agreement (strong consensus). After the intervention, the frequencies of M2, M4, and M5 increased. For example, in Q6 the frequency of M2 rose from 0.056 to 0.072, and M4 rose by an estimated 40-60% overall. Although occurrences of M5 were rare, their increase indicated a higher degree of inclusion among the participants. M1 and M3 decreased slightly, with Q6's M1 declining from 0.944 to 0.928; this decrease may reflect more precise lexical choices related to task structure and student expression rather than narrative fragmentation. The statistical significance of the motifs was confirmed through Z-scores computed against randomized null models (Figure 5). Significantly higher Z-scores confirmed that M4 and M5 gained structural relevance as their counts increased, while the Z-scores of M3 and M6 decreased, indicating less fragmentation. Q6 responses displayed greater precision and Q7 responses appeared more unified, further supporting the effectiveness of the case method as a pedagogical vehicle for task-specific ethical expression. The AEEE framework thus provided insight into the structural and pedagogical changes observed.

Figure 4: Frequency of network motifs.
Figure 5: Z-score of Network Motifs.

5.2 Sentiment Analysis

Following the investigation of sentiment at the level of words, we determined the average sentiment score for each dataset to better understand participants' overall emotional framing. As can be seen in Table Y, the average sentiment for Q6 responses increased from 0.129 before the intervention to 0.245 following the intervention. This suggests movement towards more positively framed language about ethical dilemmas, in other words, decreased reliance on negations and negatively framed concerns.

Sentiment polarity trends (Figure 6) similarly showed a general post-intervention increase in keyword scores. In Q6, for instance, "Problem" increased from 0.067 to 0.181, and in Q7, "Decision" increased from 0.289 to 0.643. The highest sentiment score, 0.938, occurred in the Q7 responses, marking a substantial contribution to positive emotional framing. By contrast, "Utilization" scored comparatively low at 0.070 in the post-intervention Q6 responses, suggesting possible uncertainty or continued caution. The sentiment values in the bar chart indicated fewer terms focused on deviations and a shift towards more strategic and evaluative language.

Figure 6: Sentiment Score Comparison.

Although it is difficult to determine how much of the cumulative shift in word choice stems from the use of emotive terms about ethical decision-making in the "after" responses, sentiment for recurring words showed a clear upward trend overall, particularly in the post-intervention Q7 responses (Figure 6), which supports the influence of case-based discussion on emotional framing. Following the intervention, the five highest-rated words were Approval (0.938), Implementation (0.672), Decision (0.643), Student (0.672), and Problem (0.591). These results indicate not only changes in word choice but also a more general shift in how participants think and feel about ethical dilemmas.

6 Discussion

This study addressed a critical deficiency in AI ethics education: the absence of evidence-informed frameworks that integrate both cognitive and emotional outcomes of educational interventions. The AEEE framework, which enables a comprehensive assessment of moral reasoning through network motif and sentiment analysis, was evaluated in Japan's distinctive job-hunting context (shukatsu).

Network motif analysis demonstrated significant changes in the structural configuration of post-intervention responses. An improved alignment of ethical beliefs was reflected in the increase of consensus-related motifs (e.g., M2, M4, M5). These motifs increased without a decline in lexical diversity, and the stability of network density and average degree points to improved coherence of reasoning and conceptual engagement. The change in motifs was further substantiated by Z-scores, establishing both statistical and educational significance. The sentiment analysis likewise pointed to emotional growth. For instance, the mean sentiment score for Q6 rose from 0.129 to 0.245, which suggests a shift from considering employment decisions through a problem-focused lens to a more solution-oriented one. In Q7, responses showed a more hopeful attitude toward the use of AI in hiring, provided that sufficient ethical considerations were made. Representing sentiment in our network graphs revealed nodes that significantly influenced the collective reasoning and carried emotional implications.

The synthesis of structural and emotional patterns highlights the value of conducting cognitive and affective analyses together. Not only did the cohort develop a common ethical vocabulary, but they also developed an emotional commitment to the process, consistent with research in moral education. The increase in consensus motifs, along with the appearance of more emotionally positive words (e.g., "implementation" and "decision"), also supports the influence of integrative learning within the activity. From an educational standpoint, the case method proved successful in making abstract ethics tangible, as the case presented the realities of moral choices. Additional research is required to establish the applicability of the AEEE framework to a broader population, but these outcomes suggest that the approach could be scaled beyond this case study to assess students' ethical growth more widely.

7 Conclusion

This study introduced and validated the AEEE framework, which integrates network motif and sentiment analysis to assess students' ethical reasoning in AI education. When applied in the context of Japan's job-hunting culture, the framework provided evidence of both structural alignment and emotional growth in student responses after the intervention. Examining cognitive and emotional aspects together proved an effective way to evaluate ethical engagement. These outcomes point to the potential of using the framework more widely in contexts where ethical reflection matters.

Acknowledgments

This paper is part of the outcome of research performed under a Waseda University Grant for Special Research Projects (Project number: 2025C-633). We also offer our sincere gratitude to Prof. Reiko Hishiyama.

References

[1] J.-M. Flores-Vivar and F.-J. García-Peñalvo. 2023. Reflections on the ethics, potential, and challenges of artificial intelligence in the framework of quality education (SDG4). Comunicar 31, 74 (2023), 37–47.
[2] W. Holmes et al. 2022. Ethics of AI in education: Towards a community-wide framework. Int. J. Artif. Intell. Educ. (2022), 1–23.
[3] M. Burawoy. 1998. The extended case method. Sociol. Theory 16, 1 (1998), 4–33.
[4] M. Ryan et al. 2021. Research and Practice of AI Ethics: A Case Study Approach Juxtaposing Academic Discourse with Organisational Reality. Sci. Eng. Ethics 27, 2 (2021), Article 16.
[5] T. Shao et al. 2024. A Study of AI Ethics Education in the Context of Japanese Job-Hunting Based on Case Method Using Network Analysis. In Proceedings of the 2024 4th International Conference on Artificial Intelligence, Big Data and Algorithms (CAIBDA 2024), ACM.
[6] T. P. Quinn and S. Coghlan. 2021. Readying medical students for medical AI: The need to embed AI ethics education. arXiv preprint arXiv:2109.02866.
[7] C. Kooli. 2023. Chatbots in education and research: A critical examination of ethical implications and solutions. Sustainability 15, 7 (2023), 5614.
[8] F. Li, N. Ruijs, and Y. Lu. 2022. Ethics & AI: A systematic review on ethical concerns and related strategies for designing with AI in healthcare. AI 4, 1 (2022), 28–53.
[9] L. Floridi and J. Cowls. 2022. A unified framework of five principles for AI in society. Machine Learn. City: Appl. Archit. Urban Des. (2022), 535–545.
[10] M. Ashok et al. 2022. Ethical framework for Artificial Intelligence and Digital technologies. Int. J. Inf. Manag. 62 (2022), 102433.
[11] T. Shao et al. 2024. Multiple Clusters Discovery Utilizing Network Motifs for Community Improvement: Insights from Tourism and Goods' Transactions. J. Inf. Process. 32 (2024), 308–318.
[12] A. Yadav and D. K. Vishwakarma. 2020. Sentiment analysis using deep learning architectures: A review. Artif. Intell. Rev. 53, 6 (2020), 4335–4385.
[13] R. Feldman. 2013. Techniques and applications for sentiment analysis. Commun. ACM 56, 4 (2013), 82–89.
[14] B. Liu. 2012. Sentiment analysis and opinion mining. Synth. Lect. Hum. Lang. Technol. 5, 1 (2012), 1–167.
[15] G. Booij. 2012. The Grammar of Words: An Introduction to Linguistic Morphology. Oxford University Press.
[16] U. Alon. 2007. Network motifs: theory and experimental approaches. Nat. Rev. Genet. 8, 6 (2007), 450–461.
[17] B. Pang and L. Lee. 2008. Opinion mining and sentiment analysis. Found. Trends Inf. Retr. 2, 1–2 (2008), 1–135.
[18] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT 2019, 4171–4186.
[19] T. Y. Zhuo et al. 2023. Exploring AI ethics of ChatGPT: A diagnostic analysis. arXiv preprint arXiv:2301.12867.

Footnote

Corresponding author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

ICEKIM 2025, Cambridge, United Kingdom

© 2025 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-1562-4/2025/06
DOI: https://doi.org/10.1145/3756580.3756677