Downloading Detectives: Searching for On-Line Plagiarism

Robin Satterwhite – Social Sciences Librarian – Tutt Library – Colorado College

Marla Gerein – Social Sciences Academic Technology Specialist – Colorado College

 

 

Introduction

 

The use of the Internet as a research tool is now a “given” in the academic work of our students and in the demands for information placed on our libraries.  But with the rise of high-quality on-line resources and increased computer access comes the challenge of how to prevent and deal with new methods and more sophisticated levels of plagiarism.  The Internet now hosts thousands of sites offering essays, term papers, reports and other ways to “beat the system”.  It also offers a plenitude of products and services meant to detect and nab those nasty plagiarists.

 

At Colorado College, as at most post-secondary institutions, we suspect, the awareness that our professors have of the good, the bad, and the ugly on the Internet in terms of academic research runs the full spectrum.  For example, one professor told us that he was certain no student could ever use Internet sources to plagiarize an assignment, because his area of study was too “obscure” (he teaches African and South American studies).  At the other extreme, professors have approached our library staff wanting the library to purchase the latest plagiarism detection software or service they’ve just heard about, convinced it will solve all their digital woes.

 

In an effort to understand the limits and abilities of plagiarism detection software and on-line services, we decided to run an “experiment” in order to be able to proactively respond to our faculty as to the effectiveness of such products.   This is the story of how we became “downloading detectives” and our experiences in this capacity thus far.

 

Our experiment has had four objectives:

  1. To assess and understand students’ access and ability to plagiarize, using the Internet (e.g., buying papers from “papermill” sites, getting essays free on the web, using on-line sources to plagiarize, etc.).
  2. To identify and assess the software and on-line resources that exist for plagiarism detection.
  3. To assess these detection products, in terms of their effectiveness at identifying various forms and degrees of plagiarism.
  4. To discuss how the success and/or failure of these detection methods fits into the larger context of enriched research and post-secondary assignments for students in the Information Age.

 

 

The architecture of our experiment has been structured as follows:

  1. We conducted a literature review.
  2. We have defined four variations of on-line plagiarism to try to “detect” in our experiment:
    1. essays students can download for free from the Internet
    2. essays students can buy and then download or receive via email
    3. “cut and paste” plagiarism, where students take bits and pieces from on-line sources such as journals, encyclopedias, etc.
    4. the “miscellaneous” paper, where parts are original, and parts are plagiarized from one or many of the aforementioned sources.

 

  3. We have secured the use of purchased software programs, purchased on-line services, free on-line services/programs and then simple on-line searching techniques to “put through the paces” of our plagiarism detection challenge.

 

  4. Our “subject pool” of papers used to “test” these products, services and techniques includes:
    1. papers bought on the internet for between $10 and $90.
    2. papers downloaded for free
    3. papers we cut and pasted together from various on-line sources
    4. papers that were actually handed in to Colorado College and suspected of being plagiarized.
    5. “unplagiarized” papers that were submitted to Colorado College, and received high marks.

 

 

Based upon the results of running the pool of papers through our selected anti-plagiarism products, services and techniques, we have made some preliminary assessments about the “cat and mouse” game of on-line plagiarism and its detection.  We have also applied our findings to effective models for research-based assignments and for promoting information literacy in our digital culture.

 

 

The Literature Review

 

Being somewhat honest and naive, we knew the age-old conundrum of plagiarism, but neither of us was prepared for the slew of on-line tools and tricks of the trade that we found discussed in the literature we reviewed.  An article by Julie Ryan was particularly interesting, considering that both she and her husband teach an introductory information security concepts course at George Washington University.[1]  Ryan was rather shocked to find that, in a class of 42 students, 7 had plagiarized most if not all of their papers from an online source, and 4 others had misrepresented their sources in footnotes.  The next year, she found the same proportion, one in six students, to have plagiarized as well.  And this is an information security concepts course!

 

Article after article brought to the forefront strategies, sites and interesting anecdotes of students in their bids to successfully plagiarize on-line sources.  Many of the articles provided strategies for professors to try to identify plagiarized Internet material.  But very few of the articles discussed more than one or two examples of successful strategies or offered products that really “got the job done” in a bid to successfully nab a cheater every time.  Did such a product or service exist “out there”?  That is what we set out to determine.

 

 

The Papers

 

Make no mistake about it…plagiarizing papers with on-line source materials can be tempting and downright amusing!  In the vast sea of on-line information, how possible would it really be for a professor to nail down a misused source?  Or identify a bought or downloaded paper?  How easy would it be for a student to get a complete paper?

 

Many paper mill sites offer disclaimers that their products are to be used for “research purposes only” and that plagiarism is a serious offense.  But then again, how valid can such admonishments be from sites called “schoolsucks.com” or “The Evil House of Cheat”?  An interesting comparative look could be made between the “hacker” mentality and on-line plagiarism.

 

Paper mills make it incredibly convenient to cheat.  There are hundreds of sites that post papers that can be downloaded for free.  They make no claim to quality, but then again, you get what you pay for!  It took us five minutes to find enough sites to choose five free papers from, of various lengths and on various topics.

 

Paper mills that actually charge for papers claim that their papers are of a higher quality, and usually reflect such claims in their price.  The cost of papers on these sites ranges from $9.95 for access to the entire database for a year, to upwards of $200 apiece.  We selected papers from a $9.95 all-you-can-cheat buffet site, a $90 paper, and three other papers priced in between.  All purchases required a credit card, sometimes with follow-up telephone confirmation, and usually service with a smile.  One site even had a representative contact us, claiming the paper we chose was somewhat out-of-date and of a lesser quality, and wanting to recommend a better selection!  And the fee comes across so legitimately on your credit card bill – one paper appeared on our American Express statement as “The Paper Store – Office Supplies, etc”.

 

Claims or no, we found the papers from both the paid and the free sites to be of terrible quality – in the bad, lower-level high school quality range, for the most part.  Bibliographies were often lacking, or were of extremely low quality when provided.  If we were going to make recommendations to fledgling online plagiarists, we would say “take your chances on the free papers”.

 

Our “cut and paste” papers were pulled from commonly accessed on-line sources: one was a Britannica on-line article, one an e-book excerpt from our Net Library collection, and the others were selected from on-line periodical indexes, such as ABI Inform, Ebsco, and JSTOR.  To round out the group of papers, we used two essays that were submitted to faculty at Colorado College and suspected of plagiarism, and excerpts from highly-graded papers that were also submitted to the College.

Anti-Plagiarism Products, Services and Techniques

 

We were quite mystified by the claims made by both the free and at-cost anti-plagiarism products and services we found on the Web.  Many claimed to search vast databases, and to be able to return contextually sophisticated and comprehensive results.  We very much doubted the ability of these products and services to “get the job done”, when paper mill sites shift, change, add and appear every day, with databases of papers that range from the hundreds to the thousands.  And what of the fee-based paper sites?  Did the plagiarism detection companies pay for access to these thousands of papers?  Did they run constantly updated queries of new sites and new papers?  We didn’t think so, and ventured forth to find out.

 

All the products and services are based on the same concept:  string searches of key phrases.  The key to success seems to be how and why each program selects certain phrases…is it random?  Is it predetermined?  We would still like to do more research on how these programs are actually designed, and will do so as we continue our “experiment”.  Below are the programs and services, both free and paid-for, that we have used:

 

EVE2

A “thorough” program, but very slow.  You submit an entire paper, and must be connected to the Internet during the processing.  Our computer locked up once, and we hesitated to "multi-task" until EVE2 was through processing.  It took several hours to process 13 papers.  (We recommend starting the process before you leave for the weekend!)  A class of 25 students and 10-page papers would easily take 24 hours.

Positive aspects of the program:

o    Easy to transfer files. 

o    A window opens and directs you to select the files off of your computer or a disk that you want to check.  

o    You get to submit the entire document, not a portion.

o    Gives results in percentages. 

o    Note, though, that if no plagiarism is detected, no report is issued.

 

HowOriginal.com

Very ineffective.   It uses a meta search engine to search the web.  It only accepts 100 words, or 1000 characters, as submitted material to test for plagiarism (about 1 paragraph).  It randomly picks out phrases to search.  Sometimes the phrase is 4 sentences.  Other times two or three words.  One time it chose the numeral "6" for us!  When we eliminated all of the common phrase detections that this program marked (the word "woman", "examples are", etc.), all of the submitted texts passed.  The selection process is far too random.  The text sample is not large enough.  None of the known cases of plagiarism were detected by this program!!!   The price is right for this one -  free.

 

Paperbin (Integriguard paid version)

A little bit better than the free version, but not by much.   Still basically uses the all-the-web meta search engine.   However, it does look at the entire document instead of 100 words.   Once again, the randomly selected phrases used to check for plagiarism were very poor choices - common phrases, single words - nothing that was unique to that subject matter.  Results were emailed to the instructor.  A document "failed" if one of the 5 random searches brought up a possible web site.  Only three papers actually failed.  Two other papers were reported "failed", but after checking the web site that was marked as having similar text, we changed the result to pass.   Sometimes, web sites that were indicated in the results had nothing to do with the topic!  When using the "find" feature in our browser to search the reported sites, we often couldn't locate any of the highlighted words!  We wouldn't recommend this program.

 

 

Turn-it-in (plagiarism.org)

This was our favorite thus far, if we had to recommend one...but we’re still not sold on this method.  It’s easy to have students submit their text on the specified website and there is no software to download.  The student submits their entire paper, not just a portion of it, and the submission is secure.  Very few irrelevant error detections were returned and their website is well-designed.  Turn-it-in also provides a summary page and then details for each paper that was submitted.  Color-coded results display text highlighted in the same color as the web site where matches were found.  Levels of plagiarism are reported…no one simply “fails” or “passes”.   This software accurately detected the most papers, but still not all papers were detected – only 5 out of 8 papers were successfully detected.  However, Turn-it-in did detect the e-book and two of the articles.  When we checked the links, the Project Muse article was posted on a faculty class website.  The Ebsco article was listed as coming from a database that was not available to us.  The one downside to this service is that it can be slow - results sometimes take as long as 24 hours.

 

 

GLATT

The whole process using this software/service was far too intrusive.  It reminded us of taking a lie detector test before you are hired for employment - you have to prove your innocence.  Basically, the test for plagiarism in this case works something like this:  the student submits their paper, and then is “tested” on their writing – GLATT takes random portions of the submitted paper, and asks the student to “fill in the blanks” where it has removed words and phrases.  The underlying premise is that if you wrote the words, you should be able to remember them.  This form of testing gave us chills.  We don't think the campus culture of Colorado College would accept a form of plagiarism detection like this.

 

WORDCheck

This program is designed for larger files than a student essay.  First, you need to have identified the possible source that was plagiarized for the paper, then you can download that text together with the student paper in question.  The software program will compare the two files.  WORDCheck is designed for authors who want to protect their literary works - you keep your texts in a file, and can download and compare other texts that might have plagiarized/stolen your intellectual property.  Not an effective tool for student plagiarism detection.
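To illustrate the kind of two-file comparison described above, here is a minimal Python sketch.  It is not WORDCheck's actual method, which we have not examined; it simply uses the standard-library difflib module to list word-for-word passages that a known source and a student paper have in common.  The function names and the six-word threshold are assumptions chosen for illustration.

```python
import difflib
import re


def words(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shared_passages(source, paper, min_words=6):
    """Return runs of at least `min_words` consecutive words found in both texts.

    `min_words` is an illustrative threshold: short runs are usually
    ordinary phrasing rather than copying.
    """
    a, b = words(source), words(paper)
    # autojunk=False stops difflib from ignoring very frequent words.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]


if __name__ == "__main__":
    source = "The paper mills make it incredibly convenient to cheat on any assignment."
    paper = "In my view, paper mills make it incredibly convenient to cheat, which is sad."
    for passage in shared_passages(source, paper):
        print(passage)
```

As with WORDCheck itself, a comparison like this only helps once the suspected source is already in hand.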

 

In addition to, or as a substitute for, paid-for programs and services, many of the sources in our literature review recommended using certain search engine techniques to catch plagiarists red-handed.   We entered three key phrases from each paper into both AltaVista and Metacrawler and looked at the first two pages of results to see if the plagiarized words sent cyber alarm bells ringing!  Altogether, these search engines found only 3 of the papers.  Most times, the search was so narrow that we only had a page of results or no results.  This blows apart the hypothesis promoted by many authors on student plagiarism that search engines can find many cases of plagiarism.
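The search-engine technique above can be sketched in a few lines of Python.  This is a minimal illustration only: the rule for choosing phrases (the first eight words of three sentences spread evenly through the paper) is an assumption, not a rule taken from our experiment, and the function names are hypothetical.  The sketch only prepares quoted, URL-encoded phrases; it does not contact any search engine.

```python
import re
from urllib.parse import quote_plus


def pick_phrases(text, count=3, words_per_phrase=8):
    """Pick up to `count` phrases spread evenly through the text.

    Only sentences with at least `words_per_phrase` words qualify, so
    single words and very short stock phrases are never chosen.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    long_enough = [s.split() for s in sentences
                   if len(s.split()) >= words_per_phrase]
    if not long_enough:
        return []
    count = min(count, len(long_enough))
    # Evenly spaced positions, e.g. beginning, middle and end of the paper.
    step = len(long_enough) / count
    chosen = [long_enough[int(i * step)] for i in range(count)]
    return [" ".join(w[:words_per_phrase]).rstrip(".,;:!?") for w in chosen]


def exact_phrase_query(phrase):
    """Wrap the phrase in double quotes (exact match) and URL-encode it."""
    return quote_plus('"' + phrase + '"')


if __name__ == "__main__":
    sample = ("Short one. The quick brown fox jumps over the lazy dog again today. "
              "Tiny. A second long sentence appears here with many more words indeed.")
    for phrase in pick_phrases(sample):
        print(phrase, "->", exact_phrase_query(phrase))
```

Quoting the phrase asks the engine for an exact match, which is also why such searches tend to return either a direct hit or nothing at all.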

 

 

 

Summary of Our Observations and Test Results Thus Far

 

Not surprisingly, we were not very impressed with the results provided by either the paid or the free plagiarism detection services and software.  Using search engines returned even more dismal results.   No one source caught all the papers.  In fact, the rate of detection across the board was dismal, ranging from a high of 56% to a whopping low of 0%.  We were disappointed in the seemingly random nature of the passage checking in some of the programs/services – we observe a clear correlation between poorly selected passages and a low rate of detection.  Quotations, even with footnotes, seem to cause a real “glitch” in these programs, and are often “detected” as possible points of plagiarism.
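Why passage selection matters can be shown with a small Python sketch.  It does not reproduce how any of the products we tested choose their phrases, which we do not know; it is a hypothetical heuristic that ranks candidate phrases by how many longer, uncommon words they contain, so that a phrase like "examples are" would never be picked over subject-specific wording.  The word list, the scoring rule and the function name are all assumptions for illustration.

```python
import re

# A tiny illustrative list of common words; a real tool would need far more.
COMMON_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "in", "is",
    "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def distinctive_phrases(text, phrase_len=6, count=3):
    """Return the `count` non-overlapping phrases with the most uncommon words."""
    tokens = re.findall(r"[A-Za-z']+", text)
    # Cut the text into consecutive, non-overlapping windows of phrase_len words.
    windows = [tokens[i:i + phrase_len]
               for i in range(0, len(tokens) - phrase_len + 1, phrase_len)]

    def score(window):
        # Count words that are longer than three letters and not on the common list.
        return sum(1 for w in window
                   if len(w) > 3 and w.lower() not in COMMON_WORDS)

    ranked = sorted(windows, key=score, reverse=True)
    return [" ".join(w) for w in ranked[:count]]


if __name__ == "__main__":
    sample = ("It is one of the best. Mitochondrial ribosomes translate "
              "thirteen essential proteins. And that is all of it to be.")
    print(distinctive_phrases(sample, count=1))
```

A random picker is as likely to choose the first or last window of this sample as the middle one, and only the middle one is worth searching for.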

 

In terms of the papers themselves, we were surprised to observe that a few of the bought papers were indeed detected (perhaps they had become available, thanks to some “generous” soul, on a free site as well).  But then again, not one program detected them all, nor did each of the programs/services detect the same papers.  It was also interesting to note that the free and purchased papers were detected at the same random rate.  No one product/service or technique caught only paid or free papers.  Only one software program detected the e-book paper, and much to our chagrin, no program or service caught the Britannica article!  And lastly, all the real Colorado College papers avoided any detection – including those suspected of plagiarism.

 

 

Preliminary Conclusions and Suggestions

 

Based on our findings thus far, we are fairly confident in our ability to relate to our faculty that available detection software and services, as they currently exist, are not effective tools with which to identify on-line plagiarism.  They are neither reliable nor sophisticated enough to warrant the investment of college funds.  Not only are they ineffective, but some of the products/services promote a real lack of trust, and resentment, between professor and student that, especially given their lack of success, makes such a purchase undesirable.  If anything, faculty can take consolation in the fact that, from our observations, we doubt that any plagiarist is going to “get away” with submitting a canned paper and receiving an “A” grade.

 

Like many of the sources we consulted in our literature review, we recommend instead spending time and energy on proactively avoiding plagiarism in the first place, rather than trying to detect it after the fact.  We recommend to our faculty that they anticipate and proactively attempt to deter the different “breeds” of cheaters.

 

Lisa Renard, in her article “Cut and Paste 101: Plagiarism and the Net”,[2] has identified three types of student cheaters: the unintentional cheater, the sneaky cheater, and the all-or-nothing cheater.  Unintentional cheaters make up a large portion of the cyber-plagiarists professors encounter – they have never learned to properly document their sources, and very much do not consider the Internet as a “source” in that regard.  The sneaky cheater is the student who knows that they’re cheating and knows the tactics for getting away with it.  As they create their finely-crafted and highly-plagiarized essay, they change words or sentences, cut and paste blocks of text from a variety of sources, and even create bibliographic entries.  The fact that they could have easily written a legitimate paper for the amount of effort they’ve expended to cheat successfully doesn’t seem to faze them.  And lastly, the all-or-nothing cheater is the lazy and/or last-minute, panic-driven student plagiarist who downloads and submits, in its entirety, a paper they’ve bought or found on the Internet.

 

The first step in deterring on-line plagiarism is education and perhaps a “declaration of intent”.  We refer to this declaration as the “You should know that I know that you know how to plagiarize on-line” statement.  When faculty are introducing a research paper assignment to a class, there should be a discussion around proper documentation of on-line sources and the fact that, as professors, they are aware of on-line paper mills and other such sources for plagiarism.  Perhaps forewarning could also be given to students that on-line plagiarism will be looked for.  In addition to such a declaration, essay assignments and topics need to be given that provide specific context and require individual thought, analysis and perhaps reflection on unique personal or context-specific experience.

 

Our adventures as “downloading detectives” will continue this year, evaluating more programs, assessing their methods and architectures, and testing more papers.  We will also be doing more interviews, panel discussions and forums to assess the “human” side of the cat-and-mouse game of detecting on-line plagiarism.  At this point in the venture, however, it seems the old adage of “an ounce of prevention is worth a pound of cure”, still holds true in terms of on-line programs and services to sniff out cyber-cheaters.  We will keep you posted as to our future work via our website at:

 

 

 

 

 

 

Bibliography

 

Selected Bibliography of Sources- Plagiarism Detectives

Anderson, Gregory L.
Cyberplagiarism. College & Research Libraries News, May99, Vol. 60 Issue 5, p371, 4p.

Bodi, Sonia
Ethics and Information Technology: Some Principles To Guide Students. Journal of Academic Librarianship, v24 n6 p459-63 Nov 1998

Bunting, Chris. ; Hook, Steve.
An A for simply surfing the Net?. student run essay service in Great Britain The Times Educational Supplement, no4365 (Feb. 25 2000) p.

Bugeja, Michael
Bust a plagiarist in 30 minutes or less. Quill, Apr 2000, Vol. 88 Issue 3, p44.

Bushweller, Kevin.
Digital deception. The American School Board Journal, v. 186 no3 (Mar.1999) p. A18-A19+

Callister, T.A. Jr & Burbules, N.C.
Public spaces and cyberspace: issues of credibility in educational technologies, Insights, 32, 1996 pp. 11-14.

Carnevale, Dan.
Web services help professors detect plagiarism. The Chronicle of Higher Education v. 46 no12 (Nov. 12 1999) p. A49.

Clayton, Mark
Term papers at the click of a mouse. Christian Science Monitor, October 27, 1997, Vol. 89 Issue 232, p1.

Dalton, R.
Professors use web to catch students who plagiarize. Nature, 11/18/99, Vol. 402 Issue 6759, p222.

Denning, Peter J.
Plagiarism in the Web. Communications of the ACM, Dec95, Vol. 38 Issue 12, p29.

Freedman, Morris.
Don't blame the Internet for plagiarism. Education Week v. 18 no14 (Dec. 2 1998) p36.

Greatorex, Maggie.
From research revolution to web of deceit. The Times Higher Education Supplement no.1429 (Mar.31,2000) p.II.

Guernsey, Lisa.
Web site will check papers against data base to detect plagiarism. The Chronicle of Higher Education v. 45 no16 (Dec. 11 '98) p. A38.

Haas, Molly Flaherty
The Undergraduate Research Paper: Teaching Ethical Relationships. Paper presented at the Annual Meeting of the Conference on College Composition and Communication (46th, Washington, DC, March23-25, 1995). ERIC ED387810

Harris, Ian.
The national grid for cheating. The Times Educational Supplement, no4306 (Jan. 8 '99 supp) pp.62-63.

Harris, Robert.
Anti-Plagiarism Strategies for Research Papers. Vanguard University of Southern California. September 1, 2000. Viewed September 3, 2000. http://www.vanguard.edu/rharris/antiplag.htm

Hickman, John N.
Cybercheats. New Republic, 03/23/98, Vol. 218 Issue 12, p14.

Isakson, Carol.
Caught on the Web. Education Digest, Mar2000, Vol. 65 Issue 7, p79.

Klausman, Jeffrey W.
Teaching about plagiarism in the age of the Internet. Teaching English in the Two-Year College, v. 27 no2 (Dec. 1999) pp.209-12

Kock, Ned.
A case of academic plagiarism. Communications of the ACM, Jul99, Vol. 42 Issue 7, pp96-105.

Kopytoff, Verne G.
Brilliant or Plagiarized? Colleges Use Sites to Expose Cheaters. New York Times, (January 20, 2000), Vol. 149 Issue 51273, pG7.

Magid, Larry.
Are Your Kids Stealing From the Web? FamilyPC, June 2000, Vol. 7 Issue 6, p88, 2p.

Manzo, Kathleen Kennedy.
'Literary' Web sites aimed at students irk, prod teachers. Education Week, v. 19 no2 (Sept. 15 1999) p.7.

Marshall, Eliot.
Medline searches turn up cases of suspected plagiarism. Science, January 23, 1998, Vol. 279 Issue 5350, pp.473-474.

McCollum, Kelly.
One way to get into college: buy an essay that worked for someone else: IvyEssays. The Chronicle of Higher Education, v. 43 (Feb. 28 '97)pp.A25-A26.

McCollum, Kelly.
Web site where students share term papers has professors worried about plagiarism. The Chronicle of Higher Education, v. 42 (August 2, 1996)p.A28.

McKenzie, Jamie.
The New Plagiarism : Seven Antidotes to Prevent Highway Robbery in an Electronic Age From Now On: The Educational Technology Journal, Vol. 7, No. 8, May 1998. Viewed September 3, 2000. http://www.fno.org/may98/cov98may.html

Plagiarism Prevention Web Page. University of Wisconsin - Platteville. Elton S. Karrmann Library. August 30, 2000. Viewed September 3, 2000. http://www.uwplatt.edu/~library/reference/plaagiarism.htmlx

Renard, Lisa.
Cut and paste 101: plagiarism and the Net. Educational Leadership, v. 57 no4 (Dec. 1999/Jan. 2000) pp.38-42.

Roach, Ronald.
High-tech cheating:fighting student use of online term paper mills. Black Issues in Higher Education, v. 15 no22 (Dec. 24, 1998) pp.26-27.

Ryan, Julie.
Student plagiarism in an online world. ASEE Prism, v.8 no4 (Dec. 1998) pp. 20-24.

Simkins, Michael B.
Problems and solutions for the digital age. Technology & Learning, v. 18 no9 (May '98) p.58.

Stebelman, Scott.
Cybercheating: dishonesty goes digital. American Libraries, v. 29 no8 (Sept. 1998) pp.48-50.

World Wide Web Makes Copying Easy. USA Today Magazine, v. 127 (April 1999)#2647, pp.15-16.

Walker, Janice R.
Copyrights and Conversations: Intellectual Property in the Classroom. Computers and Composition, v. 15 (1998) n2 pp.243-51.

Wood, Julie M.
10 ways to take charge of the Web: easy strategies or Internet smarts. Instructor, v. 109(Jan./Feb. 2000) no5 pp.69-72.

Utley, Alison.
Techno cheats bedevil sector. The Times Higher Education Supplement, no.1393 (July 16,1999) p.1.

Ware, Justin.
Cheat wave: busting online plagiarism. Zdnet: Yahoo Internet Life. Viewed September 3, 2000. http://www.zdnet.com/yil/content/college/colleges99/cheaters.html

Van Hartesveldt, Fred R.
The Undergraduate Research Paper and Electronic Resources: A Cautionary Tale. Teaching History: A Journal of Methods. v. 23 (Fall 1998) n.2 pp.51-59.

 

 



[1] Julie J.C.H. Ryan, “Student Plagiarism in an Online World”, ASEE Prism, v. 8 no.4 (Dec. 1998), pp. 20-24.

[2] Lisa Renard, “Cut and Paste 101: Plagiarism and the Net”, Educational Leadership, v. 57 no.4 (Dec. 1999/Jan. 2000), pp. 38-42.