Analysis and Measurement of the Similarity of Virtual Teaching Experiment Schemes
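Huazhong University of Science and Technology, Master's Thesis
Abstract
Virtual experiment teaching software lowers the cost of offering laboratory courses, but it also makes it easier for students to copy one another's experiment schemes, which creates a new problem for teaching management.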
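Detecting such plagiarism is an important part of managing laboratory teaching under the new experiment model, and a copy detection system for virtual teaching experiment schemes can assist teachers with this task.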
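For the hardware course series in computer science, virtual teaching experiment scheme files generally fall into two categories: virtual experiment process files and assembly language source files.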
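Based on a careful analysis of the characteristics of these scheme files and of similarity measurement algorithms for electronic documents, a copy detection method is given for each of the two file types.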
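The similarity of two virtual experiment process files is defined as a four-dimensional vector of device similarity, wire similarity, sequence similarity, and location similarity. Device similarity and wire similarity are obtained by extracting fingerprints and computing the overlap of the fingerprint sets; sequence similarity is obtained from the edit distance between the two operation sequences; location similarity is obtained by summing the distances between the placements of identical devices in the two process files.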
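The three kinds of computation named above (fingerprint-set overlap, edit distance, and summed placement distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not say how fingerprints are derived, how overlap is normalized, or how distances are scaled, so the hash-based `fingerprints` helper, the Jaccard form of `overlap`, the length-normalized `sequence_similarity`, and the Euclidean metric in `location_distance` are all hypothetical choices.

```python
import hashlib
from math import hypot


def fingerprints(items):
    """Hash each item description into a short fingerprint (assumed scheme)."""
    return {hashlib.sha1(item.encode("utf-8")).hexdigest()[:8] for item in items}


def overlap(fp_a, fp_b):
    """Overlap of two fingerprint sets, here taken as the Jaccard index (assumption)."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def edit_distance(seq_a, seq_b):
    """Levenshtein distance between two operation sequences."""
    prev = list(range(len(seq_b) + 1))
    for i, a in enumerate(seq_a, start=1):
        curr = [i]
        for j, b in enumerate(seq_b, start=1):
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1,          # delete a
                            curr[j - 1] + 1,      # insert b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def sequence_similarity(seq_a, seq_b):
    """Map edit distance into [0, 1] by dividing by the longer length (assumption)."""
    longest = max(len(seq_a), len(seq_b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(seq_a, seq_b) / longest


def location_distance(pos_a, pos_b):
    """Sum of Euclidean distances between placements of devices found in both files."""
    shared = pos_a.keys() & pos_b.keys()
    return sum(hypot(pos_a[d][0] - pos_b[d][0], pos_a[d][1] - pos_b[d][1])
               for d in shared)


if __name__ == "__main__":
    # Hypothetical data for two submissions.
    devices_a, devices_b = ["74LS00", "74LS74", "LED"], ["74LS00", "LED", "SWITCH"]
    ops_a, ops_b = ["place", "place", "wire", "run"], ["place", "wire", "run"]
    pos_a, pos_b = {"U1": (10, 20), "U2": (40, 20)}, {"U1": (10, 20), "U2": (43, 24)}
    print(overlap(fingerprints(devices_a), fingerprints(devices_b)))
    print(sequence_similarity(ops_a, ops_b))
    print(location_distance(pos_a, pos_b))
```

Note that `location_distance` returns a distance rather than a similarity: a small sum means the two layouts are close, so it would need to be inverted or thresholded in the opposite direction from the other three components.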
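Plagiarism transformations of assembly language source programs are divided into three levels of difficulty. The source programs are preprocessed to eliminate the effect of the lowest-level transformations on the similarity value, and a multiple longest common subsequence algorithm is then used to compute the similarity of each pair of preprocessed programs.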
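The preprocess-then-compare idea can be sketched as below. The abstract does not list which transformations count as lowest-level, nor does it define the multiple longest common subsequence algorithm, so this sketch makes two loudly labeled substitutions: `preprocess` assumes the lowest-level changes are edits to comments, letter case, and whitespace, and `program_similarity` uses a single ordinary pairwise LCS over normalized lines in place of the thesis's multiple-LCS procedure. All function names are hypothetical.

```python
from typing import List


def preprocess(source: str) -> List[str]:
    """Normalize an assembly source so that trivial edits do not affect matching.

    Assumed lowest-level transformations: comment changes, case changes, and
    whitespace or blank-line changes. Splitting on ';' is a simplification
    and would mishandle a ';' inside a string literal.
    """
    lines = []
    for raw in source.splitlines():
        code = raw.split(";", 1)[0]            # drop ';' comments
        code = " ".join(code.upper().split())  # unify case and spacing
        if code:                               # skip blank lines
            lines.append(code)
    return lines


def lcs_length(a, b) -> int:
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def program_similarity(src_a: str, src_b: str) -> float:
    """Share of normalized lines covered by the LCS (assumed normalization)."""
    a, b = preprocess(src_a), preprocess(src_b)
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


if __name__ == "__main__":
    original = "MOV AX, 1 ; load\nADD AX, 2\n"
    disguised = "mov  ax, 1\n\nadd ax, 2 ; sum\n"
    print(program_similarity(original, disguised))
```

Comparing whole normalized lines keeps the sketch short; comparing instruction tokens instead would be more tolerant of operand renaming, which is presumably among the higher-level transformations.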
Files whose measured similarity exceeds a threshold are flagged as suspected plagiarism.
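These methods have been validated experimentally in copy detection systems for the virtual experiment schemes of two courses, Digital Logic and Principles and Applications of Single-Chip Microcomputers. The results show that measuring the similarity of the two types of submitted scheme files detects students' experiment plagiarism fairly accurately.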
The proposed methods may serve as a reference for managing virtual experiment teaching in other disciplines.
Keywords: copy detection, similarity, virtual teaching experiment, measurement
Abstract
The emergence of virtual experiment teaching software reduces the cost of offering laboratory courses, but it also makes it easier for students to plagiarize others' experiment schemes, which creates a new problem for teaching management. Under the new experiment model, detecting plagiarism is an essential part of teaching management, and a plagiarism detection system can help teachers do so. The virtual teaching experiment schemes used in computer science generally fall into two categories: virtual experiment process documents and assembly language source programs. Based on a careful analysis of the characteristics of these two kinds of documents and of algorithms for measuring the similarity of electronic documents, methods for measuring scheme similarity are given. The similarity between process documents is defined as a four-dimensional vector consisting of device similarity, wire similarity, sequence similarity, and location similarity. Device similarity and wire similarity are both calculated by extracting fingerprints and measuring the overlap of the fingerprint sets. Sequence similarity is calculated from the edit distance between the two operation sequences. Location similarity is calculated by summing the distances between the two placements of each identical device. The transformations applied to plagiarized assembly programs are divided into three levels of difficulty. To measure the similarity between programs, the programs are first preprocessed to eliminate the effect of the lowest-level transformations, and a multiple longest common subsequence algorithm is then used to calculate the similarity between the preprocessed programs. A pair of documents whose similarity exceeds the threshold is judged to be suspected plagiarism. These