I’ll have another post on acceptable and unacceptable levels of overlap in a similarity report, but first, I think that I should go through a report, step by step, to make sure that we are all familiar with what these reports actually look like and what they tell us.
I’m going to use one from a recent submission of mine. This isn’t a published paper yet, but I’m okay with showing you snippets of the content so that you can see what this report looks like.
Here is the top of the report:
At the very top is a percentage value called the Similarity Index. This is the total amount of overlap that the iThenticate software found.
Next comes the text of the checked manuscript. Overlapping phrases are highlighted in color and put in a separate box, like the affiliation listing above. Note that the color of the source, listed at the end of the report, matches the color of the overlapping text, and the number in the small box to the right gives the number of the source in the ranking list. Assuming you haven’t moved, affiliations should be the same from paper to paper, so iThenticate will probably flag them as identical text. If an affiliation is marked for overlap, we simply ignore it and mentally discount it from the total.
The iThenticate software is usually good about removing the reference list from its check. That text should be identical from paper to paper, of course, and only very rarely is this section mistakenly included in a check. If the reference list is left in, then those overlaps are also discounted from the total. However, iThenticate usually highlights citation callouts as similar to previous papers. Here’s an example:
This is unavoidable and every paper will have many of its citations highlighted as duplicative text that’s identical to that found in a previous paper. We don’t worry about these overlaps, either, and neither should you. Remember, not only are citations/references not counted against you in this Similarity Report but also they no longer count against the Publication Unit tally (formerly the page count total) and therefore do not add to the publication cost. So, please cite all of the papers that you feel are necessary to put your new study into proper historical perspective.
In addition to citations, commonly used phrases are also regularly highlighted in these reports, like this:
Again, there is no way around this and we disregard these in our assessment. If you have a paper rejected due to a high cross check value, these are not the places we need you to change.
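If it helps to see that mental discounting as arithmetic, here is a minimal Python sketch. It is only an illustration: the category labels, the word counts, and the assumption that a percentage is simply flagged words divided by total words are all mine, not something iThenticate exposes or documents in this form. In practice the sorting into "ignore" and "look closer" is done by eye.

```python
from dataclasses import dataclass

# Kinds of overlap this post says we disregard (labels are hypothetical).
IGNORED = {"affiliation", "reference_list", "citation_callout", "common_phrase"}

@dataclass
class Overlap:
    source_rank: int  # number shown in the small box beside the highlighted text
    words: int        # words flagged in this highlighted passage
    kind: str         # label assigned by hand; the report does not provide it

def adjusted_similarity(overlaps, total_words):
    """Return (raw percent, percent after dropping the ignored kinds)."""
    raw = sum(o.words for o in overlaps)
    kept = sum(o.words for o in overlaps if o.kind not in IGNORED)
    return 100 * raw / total_words, 100 * kept / total_words

# Made-up numbers, for illustration only.
overlaps = [
    Overlap(1, 30, "affiliation"),
    Overlap(2, 45, "citation_callout"),
    Overlap(3, 25, "common_phrase"),
    Overlap(1, 60, "body_text"),
]
raw_pct, adjusted_pct = adjusted_similarity(overlaps, total_words=4000)
print(f"raw: {raw_pct:.1f}%  adjusted: {adjusted_pct:.1f}%")
```

With these invented numbers, most of the flagged words fall into the categories we ignore, so the overlap that actually matters is well below the headline value.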
Unfortunately for the sake of this tutorial, my paper did not have any full sentences that were verbatim from another source. You get the idea of what it will look like, though; it would be multiple lines of colored text in the box, perhaps with a few unique words sprinkled in it, like the image above, but much longer.
To finish up, at the end of the report is the “sources” listing: the existing publications with which iThenticate found some overlap in the manuscript. Here’s a screenshot of the top few from my manuscript:
They are given in the order of most overlap to least, with the percent overlap and the number of words of overlap listed. When we reject a manuscript and tell you that we are particularly concerned about the first ## items in the list, this is the list to which we are referring.
I’ve gone through one format for these reports, but I know that there are actually several other versions that authors sometimes get. One of the formats has the sources at the top, one version doesn’t separate the overlapping text in a box but leaves it in line with the rest of the manuscript (just colors it), and another version has the Similarity Index near the end just before the sources listing. Whichever one you get, I hope that this explanation helps you find your way around the Similarity Report.
Even without the images, this is a very long blog post, so I won’t go into any interpretation here. I will write that and post it in the next few days.