Outreach Programs
Jovan's Project Page - RISE summer 2000

Jovan Rivera-Montes, Computer Science, Universidad Metropolitana, P.R.
Faculty Supervisor: Ming Li
Department: Computer Science

Web-based interface for a plagiarism detection program

Our goal for this project is to create an automatic program that determines the similarity of each submitted file and returns the result to the user by e-mail. It is more sophisticated than systems based on counting the frequency of certain words in the program text, since it examines the program structure. We were inspired by Moss, a program that compares programs written in several programming languages. Our research attempts to determine the similarity between a submitted file and previously submitted work.
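The workflow described above (score a submitted file against previously submitted work, then report by e-mail) can be sketched as follows. This is only an illustration under stated assumptions: Python's difflib stands in for the similarity measure, which is not the structural comparison the project aims for, and the function names and the report format are hypothetical.

```python
from difflib import SequenceMatcher
from email.message import EmailMessage


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]. difflib is only a stand-in measure."""
    return SequenceMatcher(None, a, b).ratio()


def compare_to_previous(submitted: str, previous: dict) -> list:
    """Score the submitted text against each earlier file, highest first.

    `previous` maps a (hypothetical) file name to that file's text.
    """
    scores = [(name, similarity(submitted, text)) for name, text in previous.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def build_report(recipient: str, scores: list) -> EmailMessage:
    """Compose (but do not send) the e-mail that carries the results."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = "Similarity report"
    msg.set_content("\n".join(f"{name}: {score:.0%}" for name, score in scores))
    return msg
```

Sending the message is left out on purpose; it could be handed to smtplib once a mail server is available.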

My part of this research is to create the website that hosts this program and to make it user-friendly and interactive for everyone who wants to use it. To use the web-based program, one must have an account on our secure website. The website will keep all personal accounts in a database on our own computer, so that we can do the processing in our local campus lab.

Background information:

What is Moss?
Moss (for a Measure Of Software Similarity) is an automatic system for determining the similarity of C, C++, Java, Pascal, Ada, ML, Lisp, or Scheme programs. To date, the main application of Moss has been in detecting plagiarism in programming classes. Since its development in 1994, Moss has been very effective in this role. The algorithm behind Moss is a significant improvement over other cheating detection algorithms (at least, over those known to us).

An Internet Service
Moss is being provided as an Internet service. The service has been designed to be very easy to use--you supply a list of files to compare and Moss does the rest.
In response to a query the Moss server produces HTML pages listing pairs of programs with similar code. Moss also highlights individual passages in programs that appear the same, making it easy to quickly compare the files. Finally, Moss can automatically eliminate matches to code that one expects to be shared (e.g., libraries or instructor-supplied code), thereby eliminating false positives that arise from legitimate sharing of code.
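The idea of ignoring code that is expected to be shared can be illustrated with a small sketch. It is not how Moss does it (the source does not say); it simply assumes a line-by-line comparison and drops every line that also appears in the instructor-supplied code. The function names are hypothetical.

```python
def strip_shared_lines(submission: str, shared: str) -> list:
    """Drop blank lines and lines that also occur in the shared code."""
    shared_lines = {line.strip() for line in shared.splitlines() if line.strip()}
    return [
        line.strip()
        for line in submission.splitlines()
        if line.strip() and line.strip() not in shared_lines
    ]


def common_lines(a: str, b: str, shared: str) -> set:
    """Lines two submissions have in common after shared code is removed."""
    return set(strip_shared_lines(a, shared)) & set(strip_shared_lines(b, shared))
```

With this filter, two students who both start from the same instructor-supplied header no longer match on that header, only on lines they each wrote.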

Registering for Moss
Moss is being provided in the hope that it will benefit the educational community. Moss is fast, easy to use, and free. Access to Moss is restricted to instructors and staff of programming courses. To obtain a Moss account, send a request to moss-request@cs.berkeley.edu. Processing requests for accounts may take up to a day; once you have an account, queries are processed as soon as they are received.

How Does it Work?
While there is a big difference between a good cheating detection algorithm and a bad one, all such algorithms can be fooled if one knows how they work. It's best if we don't say too much here about the ideas behind Moss. Suffice it to say that it is more sophisticated than systems based on counting the frequency of certain words in the program text (a widely used cheating-detection heuristic); Moss actually examines program structure.
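For contrast, the simpler word-frequency heuristic mentioned above might look like the sketch below. This is not Moss's algorithm, which is not described here. The choice of cosine similarity over word counts and the function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def word_counts(source: str) -> Counter:
    """Count identifier-like words in the program text."""
    return Counter(re.findall(r"[A-Za-z_]\w*", source))


def frequency_similarity(a: str, b: str) -> float:
    """Cosine similarity of the two word-count vectors (0.0 to 1.0)."""
    ca, cb = word_counts(a), word_counts(b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text has no words to compare
    return dot / (norm_a * norm_b)
```

Because only word counts are compared, reordering statements leaves the score unchanged, while renaming every variable in a copied program drives it toward zero. A method that examines program structure is harder to fool in this way.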
