Six Months of Submissions

The first half of 2014 is over already. Let me recap a few key statistics for you for the last six months.

The big number is 535. This is the number of new manuscript submissions to JGR-Space Physics so far this year (by AGU’s count). We are on pace for a record-setting year of submissions to the journal. Last year, there were 1014 new manuscripts received; doubling this year’s half-year count would put us at 1070, just a bit over that.

Some other interesting numbers from AGU: submissions by month. Here are the stats:

Month          2014 (this year)    2013 (last year)

January               78                  82
February              79                  79
March                 91                  90
April                105                  85
May                   84                  69
June                  98                  84

 

As you can see, the numbers are fairly close, with April, May, and June accounting for most of the difference. I cannot explain this. I would have thought January would be a high-submission month, with people converting their AGU talks into papers, but looking across the numbers for the other AGU journals, this isn’t the case in any field. The other thing I thought would happen is that June would be a slow month, with the GEM, CEDAR, and SHINE Workshops all occurring around the same time. The June submission rate, however, is not significantly lower than that of the other months, and is in fact the second highest for 2014.
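For those who like to check the arithmetic, here is a minimal Python sketch of the tallies; the monthly numbers are simply copied from the table above, and the projection just doubles the half-year total as described.

    # Monthly new-manuscript counts copied from the table: (2014, 2013).
    submissions = {
        "January":  (78, 82),
        "February": (79, 79),
        "March":    (91, 90),
        "April":    (105, 85),
        "May":      (84, 69),
        "June":     (98, 84),
    }

    half_2014 = sum(y2014 for y2014, _ in submissions.values())   # 535
    half_2013 = sum(y2013 for _, y2013 in submissions.values())   # 489

    # Doubling the half-year count gives the naive full-year projection.
    projected_2014 = 2 * half_2014                                 # 1070
    print(half_2014, half_2013, projected_2014)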

I am very thankful that I have four other editors working with me on this task. I would like to thank them here, just in case you don’t know them: Yuming Wang, from the University of Science and Technology of China; Alan Rodger, most recently from the British Antarctic Survey; Larry Kepko, who is at NASA Goddard Space Flight Center; and Michael Balikhin, from the University of Sheffield. These four have been fantastic in their efforts to serve the community, and I greatly appreciate their time and effort in working with me for JGR-Space Physics. The other two I would like to thank here are the staff at AGU HQ who work most closely with this journal: Brian Sedora, the JGR-Space Physics journal program manager, and Mike Crowner, our journal assistant. They work very hard to keep our journal running smoothly, and I am very glad we have such an efficient and dedicated crew.

 

Self-Reviewing Scandal at JVC

The Journal of Vibration and Control just retracted 60 articles because of peer review misconduct. A story about it is here:

http://retractionwatch.com/2014/07/08/sage-publications-busts-peer-review-and-citation-ring-60-papers-retracted/

Apparently, a professor in Taiwan was able to create aliases in the SAGE Track peer review system and was then able to suggest himself, and often serve, as a peer reviewer for his own papers. The investigation called it a “peer review ring” that was manipulating the system to get easy passage for their manuscripts, and in the end five dozen papers published from 2010 to 2014 were retracted.

This is one of those “what if” nightmares that I face as Editor-in-Chief of JGR-Space Physics. I understand that there is pressure to publish and that, sometimes, referees can cause much consternation for authors. Please, don’t sacrifice your ethical integrity to fulfill your need to publish. You will most likely be caught, and it will end up causing far more harm to your career than the few papers you managed to slip through the system with easy peer review.

I think that AGU has an excellent system in place to defend against this. The biggest thing is that the journal editors are active scientists with extensive knowledge of the community of potential authors and reviewers. I don’t know everyone, of course, and I send manuscripts out for review to people I don’t know. We mitigate this, however, by having five editors for JGR-Space Physics and dividing the papers up mostly by discipline, so that the assigned editor has some familiarity with the pool of potential reviewers.

Our use of two referees is also a very effective bulwark against this kind of ethical violation. As editor, I occasionally receive disparate reviews, making the editorial decision difficult. I’ll sometimes call on a third or even fourth referee for additional assessments of the paper. We try to be very thorough in our judgments of your manuscripts.

Another big defense against this is the GEMS submission and review system. Even though authors suggest potential referees on the submission pages, I think it is very rare that both referees come from this list, and it is a fairly regular occurrence for neither reviewer to be from the list. The “editor’s only” sections of GEMS are quite extensive in helping us find qualified and knowledgeable experts on the topic of a submitted manuscript. Research community members are quite good about updating their contact information, and duplicate entries are regularly merged. To me, at least, your reviewer suggestions just get my mind moving in the right direction for potential reviewers.

My Email Inbox

As my term as Editor-in-Chief rolls on, I will probably end up annoying many in the community with my editorial decisions on particular manuscripts by either rejecting yours or accepting one you consider unworthy. But there is another way in which I might be annoying you: when I ignore your emails.

Email is one of our primary means of direct communication, and I make time for it every day. That said, it can overtake my life if I let it, so I intentionally limit my time on email. I get somewhere between 20 and 80 non-spam emails a day, and sometimes I don’t get around to answering all of them with specific replies the same day they come in. Sometimes I need to think about the reply and can’t do it immediately. I try to remember those emails that demand quick attention, but my work time is very fragmented, and such emails are sometimes pushed off my mental to-do list. Furthermore, once they scroll off the first page of my inbox, the chance of a reply is greatly diminished.

For me, people in my office have my full attention. I will rarely answer the phone when I am meeting with someone. Email is an even lower priority. I make time for it, but only a limited amount of time, and it could be that I don’t get around to answering you that day, that week, or at all.

What I am saying is this: please know that you are not alone in your frustration with me about a lost, forgotten, or ignored email. Please don’t take it personally; I just didn’t get around to responding to it and my multi-tasked life made me inadvertently drop it off my list. I am requesting that you simply resend it with a gentle reminder.

I will leave you with this humorous comic from someecards.com:

[Image: “SixMonthReminder” comic from someecards.com]

While funny, it’s not applicable to me: I am actually asking you to please remind me, on whatever cadence you feel is necessary.

Publications Versus Presentations

I am at a meeting this week, the GEM Workshop in Portsmouth, Virginia, watching many people give talks in the sessions. Many of the talks are very clear and concise, but some hit on one of my pet peeves. Regarding those, I just have to say: ARGH.

My point is this: the content, format, and style of a scientific presentation are fundamentally different from those of a scientific publication. I get annoyed with people giving presentations that look like a copy-paste hack job of converting their paper into a PowerPoint document. Let me explain.

In a paper, you start by citing the relevant literature to build up to an unresolved question to be addressed by your study. You then step through the details of your methodology, covering it in sufficient detail to allow others to trust your analysis and even reproduce your work. This is followed by an objective analysis of your results and a discussion of the new scientific findings and their implications, and you conclude with a brief summary. You can make multiple points in a paper, as long as they are justified by the results and analysis. Readers of the paper can spend as much time as they want on any section, zoom in on the details of any figure, and flip back and forth through the document as much as they like. That is, the reader is in control of how they absorb the material.

With a presentation, the opposite is true: the viewer has no input into how the material is absorbed, and the presenter is in control of the flow of the content. It takes some work to be effective, and it is not a simple translation of Word into PowerPoint. For instance, you do not have to include many (if any) citations, and you only have to give a cursory explanation of your methodology. Some people spend a lot of time on these sections, and it isn’t the best use of your time at the podium. In addition, please don’t show a paragraph of text on a slide; the audience will read it instead of listening to you.

Because the audience cannot go back to a figure, the presentation and interpretation of the results should be intermixed. In showing a figure, though, don’t leave a lot of white space around it. Conference rooms are big, and there is no reason not to maximize the figure size and make it readable from the back. If the axes aren’t readable, then please overwrite them in a larger font. Another ineffective presentation technique is to squeeze many plots onto a single slide; the audience will be distracted by the other plots and not listen to you. It is better to break it up, put each plot on its own slide, and walk the audience through the plots one at a time. Even more importantly, the audience is hearing a lot of talks at the meeting, so a presentation should be very focused, usually on just a single main point. The results should all be directed at this one main point, and extraneous material should be removed.

So, please, presenters, think about your audience when you formulate your talk. Specifically, maximize your figure sizes, minimize unused space on the slides, enlarge your axis labels and annotations to make them readable, avoid full paragraphs of text, and focus the content on a single main finding. It requires some work to prepare an effective talk, but it will be worth it.

Conversely, a manuscript submission is more than just a cut-and-paste reformatting of your presentation. A lot more work needs to go into the Introduction and Methodology sections, in particular. The Introduction needs to adequately set the stage, showing that the question being posed is not only unaddressed in the present literature but also worthy of attention, and the approach should be presented thoroughly enough to allow reproduction of the results (or cover this topic with citations to such comprehensive presentations). You can get away with a less polished study in a presentation, but not in a manuscript submission to JGR-Space Physics.

 

 

Models And The Data Policy

Hopefully you know about AGU’s new Data Policy. I have a previous post about it. A big question that has come up about this is the availability of code.

The policy, located here:

http://publications.agu.org/author-resource-center/publication-policies/data-policy/

has this line in it:

  • “New code/computer software used to generate results or analyses reported in the paper.”

The availability of “code” comes at several levels: it could mean the code used to actually make the plot (e.g., a short IDL script), or it could mean the first-principles code that the authors used to make the “numerical data” (e.g., a complicated set of FORTRAN files). While both ends of the spectrum pose issues, it is the latter, larger request that raises the most concern. This is a very sensitive issue for some in our community.

For me, it is about reproducibility. Scientific advancement is based on the ability of others to reproduce your results and verify the accuracy of the analysis, and therefore the robustness of the finding. Toward this end, it would be great if everything were open source and available for public scrutiny. In reality, though, your code is your intellectual property, and the copyright on it is probably held by your company, university, or institute. The code might be (and usually is, these days) a large collaborative effort rather than a single person’s work, making it awkward for one developer (the author of a paper using the code) to provide access without the other code authors’ consent. Plus, the authors of the code might want to restrict access to the source files so that they have a competitive advantage in proposals for funding. In addition, if you are basing your paper on results from the Community Coordinated Modeling Center (CCMC), then you might not even have access to the original code.

It has been argued to me that a scientific code is a lot like a scientific instrument. It is not required to share the actual instrument that collected the data, only the data itself. Therefore, the numerical code used to produce the “numerical data” should not be required to be public, just the output from the code. This point is made especially clear for those using CCMC output for their study. In general, I agree with this assessment.

However, before the data analysis papers appear in journals, the instrument is usually written up in a detailed description paper, or series of papers, and perhaps also covered by patents for specific parts. Furthermore, the calibration and processing software is extensively tested and usually available to those who ask for it.

What I am getting at is this: like a scientific instrument, a numerical model on which scientific findings are based needs to be thoroughly tested and verified. Such a model description and verification could appear within a regular JGR-Space Physics research article as a lengthy methodology section, as its own paper in JGR-Space Physics within the “Technical Reports: Methods” category, or in another venue such as the Journal of Computational Physics, Geoscientific Model Development, or, in the near future, AGU’s new journal Earth and Space Science. That is, the “instrument” used to produce the “data” needs to be adequately described and tested so that readers know the “data” are trustworthy. I think that the code itself, though, does not need to be made public, just as the actual instrument (or its technical design documents) does not need to be made public.

The best verification, however, is to make the model available and let others examine and assess the source code. So, I urge you to release your model to the world. At the very least, share it with a colleague and have another set of eyes scrutinize it.

Until we are told or decide otherwise, this will be the implementation of the “code” part of the Data Policy for JGR-Space Physics.

Press Releases for Papers

Space physics is full of exciting new discoveries about how nature works. Within the community, some papers have a big impact on the field, and we can even quantify this with various metrics, from views and downloads on the journal website to citations in top-tier journals. Beyond our field, however, we are often cautious about sharing our findings with a broader audience. This is understandable, because we usually formulate our findings in our jargon-filled, field-specific nomenclature. It takes a serious effort to convert our cool results into language that non-experts can easily comprehend.

Yes, I am talking about the new-paper-related press release. In general, I think that if the authors are excited enough about their research findings to consider a press release, then they should follow through with it and do the press release. I don’t want to discourage anyone from pursuing publicity about their latest findings.

Press releases can be done at the local level, which is especially easy if your institution has a press office. Pretty much all universities, large companies, and government labs have such an office, with media specialists who know exactly how to construct a press release about your scientific finding and how to distribute it to the right journalists. If you are excited about your result, your press office crew can help you formulate it into a story of relevance to the general public.

In addition, you can work with the press office at the funding source for your work, like NASA or NSF. Your program manager will be able to direct you to a press office within that agency.

I think you should do a press release even if there are others in the field who might not yet be convinced of the veracity of the claim. If we waited for a majority consensus, then we wouldn’t call the result new and it wouldn’t warrant a press release. So, go for it, even if others in the field might cringe. They can have their own press release on a competing finding.

As for the timing of it, many press offices want the release to come out simultaneously with publication. So, if you think your results are worthy of a press release, then start talking with your press office at the time of submission or after the first round of reviews, so that it is ready to go soon after acceptance.

 

Interpreting a Similarity Report

This is a continuation of the last two posts, the first on the general concept of a Similarity Report and the second walking through an example report. In this one, I will discuss our method of interpreting these reports and making a decision based on them (that is, whether to send a manuscript out for review or reject it because of high cross-check overlap).

First off, I would just like to say that evaluating a Similarity Report is a subjective task because we do not want to impose rigid rules on ourselves regarding the numeric values within the report. That said, we have some guidelines that we follow. Here’s a quick overview.

In a Similarity Report, there are several numbers that we check. The first is the Similarity Index, the total overlap with other sources. A Similarity Index number greater than 15% is unusual and requires special assessment by the editor. A value of over 25% will almost certainly have the paper sent back to the authors for revision.

When we open the Similarity Report, in our “Document Viewer” display of it we see the highlighted manuscript in one frame and a listing of the sources in another. For each source, two numbers are given: the total number of words of overlap and the percentage of the new manuscript that was found to overlap with this source. Any source with less than 50 words of overlap is nothing to worry about. However, overlap values for individual sources over 100 words or over 5% are unusual and require special assessment by the editor. Sometimes these are just an accumulation of small matches (affiliations, citations, and commonly used phrases), and there is nothing to worry about.

Other times, however, entire sentences or even paragraphs are copied verbatim from a previous paper. It is overlap of this kind for which we are scanning the document. Even so, one big chunk of overlapping text usually isn’t enough to make us send the paper back to you. It rises to the level of rejection when the large-block overlaps exceed a page or so.

Here is an example of a page with serious overlap:

[Image: a page from a Similarity Report with highlighted overlap covering nearly the entire page (resolution lowered)]

I’ve lowered the resolution so that you cannot read the text. You can still see the highlighting, however, and you can see that the colored text covers nearly the entire page. A typical double-spaced manuscript page has 350-400 words, so this represents hundreds of words of overlap. At this point, we really start to consider rejection.

For a typical paper of 5000 to 10,000 words, a single-source value over 10% will probably have the paper sent back to the authors for revision. Let me reiterate, though, that there is no hard threshold for this: single-source values as low as 3% could warrant rejection, while a value of 15% might actually be okay. It all depends on how the overlap is distributed through the manuscript, and it is, in the end, a judgment call by the editor.
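To pull the guideline numbers from this and the previous paragraphs together, here is a rough Python sketch of how the triage might look. To be clear, this is only an illustration of the guidelines, not an actual algorithm we run; the function and variable names are mine, and the final decision is always a human judgment call.

    # Illustration only: guideline thresholds for assessing a Similarity Report.
    def triage_similarity_report(similarity_index, sources):
        """similarity_index: total overlap with all sources, in percent.
        sources: list of (words_of_overlap, percent_of_manuscript) per source."""
        flags = []

        if similarity_index > 25:
            flags.append("total overlap > 25%: almost certainly returned for revision")
        elif similarity_index > 15:
            flags.append("total overlap > 15%: unusual, needs special assessment")

        for words, percent in sources:
            if words < 50:
                continue  # small matches: affiliations, citations, stock phrases
            if words > 100 or percent > 5:
                flags.append(f"source with {words} words ({percent}%): needs a closer look")

        # In practice, the editor then reads the highlighted text itself;
        # large verbatim blocks approaching a page or more are what push a
        # manuscript toward rejection.
        return flags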

I understand that authors only see these reports if the editor deems it to be a problem. So, authors don’t have experience viewing and interpreting these files. This is why I am spending several blog posts on this topic. If you would like to know more, please just ask and I’ll keep posting about it.

One more point of confusion is that we, the editors, perhaps haven’t been as clear as we could be about what exactly in the Similarity Report made us reject the manuscript and therefore what specifically the authors need to change to make it acceptable. We will try to be much better about this in the future, adding a paragraph to our decision letters that itemizes the places that we want changed in the text.

Details of a Similarity Report

I’ll have another post on acceptable and unacceptable levels of overlap in a Similarity Report, but first, I think that I should go through a report, step by step, to make sure that we are all familiar with what these reports actually look like and what they tell us.

I’m going to use one from a recent submission of my own. It isn’t a published paper yet, but I’m okay with showing you snippets of the content so that you can see what such a report looks like.

Here is the top of the report:

[Image: the top of the Similarity Report, showing the manuscript title and the Similarity Index]

At the very top is a percentage value called the Similarity Index. This is the total amount of overlap that the iThenticate software found.

Next comes the text of the checked manuscript. Overlapping phrases are highlighted in color and put in a separate box, like the affiliation listing above. Note that the color of the source, listed at the end of the report, matches the color of the overlapping text, and the number in the small box to the right gives the rank of that source in the list. Assuming you haven’t moved, affiliations should be the same from paper to paper, but iThenticate will probably flag them as identical text. If an affiliation is marked as overlap, then we simply ignore it and mentally discount it from the total.

The iThenticate software is usually good about removing the reference list from its check. That text should be identical from paper to paper, of course, and only very rarely is this section mistakenly included in a check. If the reference list is left in, then those overlaps are also discounted from the total. However, iThenticate usually highlights citation callouts as similar to previous papers. Here’s an example:

[Image: example of citation callouts highlighted as overlapping text]

This is unavoidable, and every paper will have many of its citations highlighted as text identical to that found in a previous paper. We don’t worry about these overlaps, and neither should you. Remember, not only are citations and references not counted against you in this Similarity Report, but they also no longer count against the Publication Unit tally (formerly the page count total) and therefore do not add to the publication cost. So, please cite all of the papers that you feel are necessary to put your new study into proper historical perspective.

In addition to citations, commonly used phrases are also regularly highlighted in these reports, like this:

[Image: example of commonly used phrases highlighted as overlapping text]

Again, there is no way around this and we disregard these in our assessment. If you have a paper rejected due to a high cross check value, these are not the places we need you to change.

Unfortunately for the sake of this tutorial, my paper did not have any full sentences that were verbatim from another source. You get the idea of what it will look like, though; it would be multiple lines of colored text in the box, perhaps with a few unique words sprinkled in it, like the image above, but much longer.

To finish up, at the end of the report is the “sources” listing, which gives the existing publications for which iThenticate found some overlap with the manuscript. Here’s a screenshot of the top few from my manuscript:

[Image: the top few entries of the sources list from my manuscript’s report]

They are given in the order of most overlap to least, with the percent overlap and the number of words of overlap listed. When we reject a manuscript and tell you that we are particularly concerned about the first ## items in the list, this is the list to which we are referring.

I’ve gone through one format for these reports, but I know that there are actually several other versions that authors sometimes get. One of the formats has the sources at the top, one version doesn’t separate the overlapping text in a box but leaves it in line with the rest of the manuscript (just colors it), and another version has the Similarity Index near the end just before the sources listing. Whichever one you get, I hope that this explanation helps you find your way around the Similarity Report.

Even without the images, this is a very long blog post, so I won’t go into any interpretation here. I will write that and post it in the next few days.

Similarity Reports

Every manuscript that is submitted to JGR-Space Physics undergoes a cross-check for identical text in previously published scientific content on the web. Specifically, we send it to a company called iThenticate to generate a “similarity report.” I have a previous post on self-plagiarism. This post is about understanding and interpreting those reports.

The iThenticate software scans the document for strings of characters that match those found in other scholarly publications. It specifically excludes the reference list, for which the entries should be identical between publications. However, the scan unfortunately includes the affiliations, which are also always the same.
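As a toy illustration of the general idea (this is certainly not iThenticate’s actual algorithm, just a sketch of the concept in Python), a text cross-check can be thought of as finding word sequences that a manuscript shares with previously published sources:

    import re

    def ngrams(text, n=8):
        # Break the text into lowercase words and collect every run of n
        # consecutive words as a candidate phrase.
        words = re.findall(r"[a-z']+", text.lower())
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def overlapping_phrases(manuscript, source, n=8):
        # Phrases of n words that appear verbatim in both texts.
        return ngrams(manuscript, n) & ngrams(source, n)

    # A real system compares the manuscript against a huge database of
    # publications, excludes the reference list, and reports, per source,
    # the number of overlapping words and the percentage of the manuscript
    # involved.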

We (the editors) look at the similarity report for every paper that we manage in the system. This is part of our initial assessment of the manuscript: determining whether to send it out for review. The other big points we assess are whether the English usage in the text is adequate (too many errors and we will send it back to the authors for revision) and whether the paper meets the bar of an original contribution to the field.

Within the similarity report, the big issue that we are looking for is overlap of entire sentences or even paragraphs with a previous source. This is the thing that will get your paper rejected without review. Little things, like the affiliations, specific paper callouts, or standard phrases, are also identified by the software as identical overlap. These small things are discounted by us and not included in our assessment. I have never asked AGU about altering the iThenticate cross-check settings, but perhaps that could be done to omit these small/meaningless overlaps from being highlighted in the report. Until then, however, we all will just have to ignore those places and focus only on the big overlap sections.

So, when you get a manuscript rejected due to a high cross-check, please look through the similarity report and find those places where entire sentences are highlighted. These are the places that we want you to rewrite in new wording. We realize that there is often significant overlap with previous papers, especially in the methodology section, where the description of the instrument, model, and/or processing scheme is the same as that used in another study. Regardless, you have to rewrite it. It can (in many cases, should) say the same thing as before, but the sentences cannot be taken verbatim from another paper. We hope that you are able to revise these places in the text very quickly and resubmit within a week or two.

Technical Reports Manuscripts

AGU just revamped, updated, and consolidated its list of “paper types,” making it much more uniform across its ~20 journals. This was needed because AGU had accumulated many different paper types over the years as each journal’s editorial board created and deleted classifications.

I don’t see the list posted on the web, yet, but I could be missing where it’s located in the directory tree of author guideline pages. I will make sure that AGU posts the full list for all to see.

One of the bigger changes for JGR-Space Physics is that there are now two “Technical Reports” categories for papers. Actually, these are available as options for all journals except GRL.

One is Technical Reports: Methods. The formal description is as follows.

Technical Reports: Methods provide new analytical or experimental methods, data, and other technical advances, including computer programs and instrumentation, if applicable, that represent a significant advance and enable new science. These papers should not exceed 13 Publication Units and will typically include at least one illustrative example application.

The other is Technical Reports: Data. Its formal description goes like this.

Technical Reports: Data present new data sets with original and innovative features in terms of the monitored processes and/or extensions of the observation period and accuracy, therefore providing an opportunity to support innovative research and theoretical development. The paper may provide an example of a relevant scientific application to demonstrate the usefulness of the data. The data set may refer to experimental sites or virtual laboratories and environments. These papers should not exceed 13 Publication Units.

For JGR-Space Physics, we (the editorial board) have decided that these two types of papers must have an “example application” to space science. That is, the ambiguous wording of “typically include” and “may provide” should be read as “will definitely have.” However, the paper does not need to make a revolutionary, original space science contribution to the field; the technique or data set itself is the original contribution worthy of publication.

Also note the 13 PU limit on these papers, beyond which you will incur excess-length fees. This is shorter than the standard 25 PU limit for JGR. These papers are intended to be detailed but concisely written descriptions of the new aspects of the model, observation, or processing technique, accessible to a wide audience who might find the new methodology or data relevant to their work. Please take advantage of the “electronic supplementary material” feature of JGR to provide additional plots, movies, or overly technical aspects of the design. Also, remember that a Publication Unit is 500 words (not counting the title, author list, affiliations, acknowledgments, and references) or one figure or one table. You can calculate exactly how many PUs your manuscript is before submission and adjust accordingly, if you feel the need. I had a previous post on this.
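As a quick illustration of that arithmetic (a minimal sketch; the function name and the lack of rounding are my own simplifications, so check AGU’s official guidance for the exact count), the Publication Unit tally can be estimated like this:

    # One PU is 500 countable words (title, authors, affiliations,
    # acknowledgments, and references excluded); each figure or table is one PU.
    def publication_units(countable_words, n_figures, n_tables):
        return countable_words / 500.0 + n_figures + n_tables

    # Example: a 5000-word manuscript with 6 figures and 1 table comes to
    # 10 + 6 + 1 = 17 PU -- within the regular 25 PU limit, but over the
    # 13 PU limit for a Technical Reports paper.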

We (the AGU Editors in Chief and AGU HQ publications staff) just adopted these in mid-May. Over the last six months, though, I have fielded numerous emails about “techniques” papers, and I am very glad that these new paper type descriptions are finalized and ready for broad dissemination and usage. I hope that AGU staff will officially roll out the list in the very near future, and I will probably have more blog posts on the new paper type definitions in the coming weeks.