Citations become slow with 100+ sources in document

Feb 7, 2010 at 4:18 PM
Edited Feb 7, 2010 at 4:19 PM

I am writing a thesis using Word 2007. Everything was fine when there were only a few dozen sources in my document.

Currently, my thesis is around 150 pages and contains around 500 sources, almost all of which are cited. Refreshing the citations (select all, then press F9) takes several hours. Any ideas what could be causing the problem? Is there any workaround?

Regards,

Martin

Coordinator
Feb 7, 2010 at 5:14 PM

The more sources you have, the slower it becomes, due to the 'bad' way XSL gets loaded and processed in Office. It looks as if the same XSLT is loaded, executed, and then unloaded again for every in-text citation and bibliography, causing a huge overhead. Also, Office uses an old XSLT processor (which doesn't compile stylesheets for faster execution) compared to the one in .NET. And then there is the preprocessing of in-text citations, where Word calculates some extra fields and therefore has to check all other in-text citations before it can format a single citation. But still, several hours seems amazingly long. If it really takes that long, I'm surprised you didn't kill Word, assuming it had become unresponsive.

Anyway, does this happen with all the styles? Or just with the BibWord ones? BibWord is running on top of XSLT and requires extra parsing, so it will be slower than the direct processing. But the difference shouldn't be that much. I doubt there is anything you can do to speed it up.

Suggestions:

  • Turn off any virus scanner you have running. I'm not sure if it will help at all, but the constant loading of the XSL could cause your virus scanner to go into overdrive.
  • Only update the fields when you need to. So wait until you have finished your entire document. If you want an updated bibliography, you could just update that field by itself.
  • Run a macro which updates one field at a time. That might give you a feeling for whether something in particular is slow.
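
The thread does not include such a macro. As a rough illustration of the last suggestion, here is a minimal sketch in Python (not VBA) that updates fields one at a time and records how long each one takes. It only assumes each field object has an `Update()` method, which Word's `Field` object does; `time_field_updates` and `FakeField` are hypothetical names introduced here, and `FakeField` exists only so the sketch runs without Word.

```python
import time


def time_field_updates(fields):
    """Update each field in turn; return (position, seconds) pairs, slowest first."""
    timings = []
    for position, field in enumerate(fields, start=1):
        started = time.perf_counter()
        field.Update()  # any object with an Update() method works here
        timings.append((position, time.perf_counter() - started))
    return sorted(timings, key=lambda pair: pair[1], reverse=True)


class FakeField:
    """Stand-in for a Word field (hypothetical), so the sketch runs anywhere."""

    def __init__(self, delay):
        self.delay = delay
        self.updated = False

    def Update(self):
        time.sleep(self.delay)  # simulate a slow field refresh
        self.updated = True


if __name__ == "__main__":
    demo = [FakeField(0.0), FakeField(0.05), FakeField(0.0)]
    for position, seconds in time_field_updates(demo):
        print(f"field {position}: {seconds:.3f} s")
```

On Windows with the pywin32 package, the same function could in principle be handed the `Fields` collection of the open document (for example via `win32com.client.GetActiveObject("Word.Application").ActiveDocument.Fields`). That route is an untested assumption, not something confirmed in this thread.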
Feb 8, 2010 at 7:52 AM

You are right about something 'bad' in Office :). Yesterday I discovered that it helps to refresh fields chapter by chapter. A refresh per chapter took around 30 minutes (a chapter is about 40-50 pages). It looks like it has memory leaks. During a refresh, the RAM consumed by Word goes from 50 MB to 500-800 MB, and it is not released until Word is closed. After each chapter was refreshed, I saved the document and closed Word. I guess I will avoid refreshing fields until I am done with the document...

I don't think this is BibWord specific. I tried switching to APA. I could not wait for hours, but I can say that the first few minutes showed similar symptoms (memory leaks, high CPU).

Another workaround which appears to work well:

  • Copy and paste a document fragment (e.g. a chapter) into a blank docx, hit refresh, then copy and paste it back into the original docx... The refresh takes less than a minute when there are fewer than 200 sources, and the copy carries only the sources referenced by the fragment into the new docx. The issue is that it could break numbering and references from other parts of the document.
Aug 22, 2013 at 2:02 PM
This looks the same as my issue in:

https://bibword.codeplex.com/workitem/11212

Does anyone know if the XSLT processing was improved between Word 2007 and later versions?
Coordinator
Sep 4, 2013 at 4:23 PM
There are still little bugs in there which were reported back in 2007 and only require a single line of code to fix. Yet, they still haven't fixed them. So my guess would be that the chance of an update to the underlying engine is rather small.
Sep 4, 2013 at 5:30 PM
I've done some experiments, and it looks like the slow performance is not necessarily the same issue as the one that causes Word to crash/hang/freeze.

I believe I have diagnosed the cause of the hang/crash/freeze problem when a user tries to "update citations and bibliography" with a large number of sources: it appears to be caused by Word's UNDO handling.

Basically, for every single citation update it seems to create an entry in the undo stack. In my case each one of these is between 8 MB and 14 MB (I'm guessing the difference relates to the number of times a source is cited in the document). So for 900+ sources, we are looking at an undo stack of 7 to 13 GB, with another 45 MB on top for the application and document. I don't know if there's a finite addressable limit for a 32-bit app, but it looks like what's happening is that as the "update citations and bibliography" proceeds, it just creates a HUGE UNDO STACK which eventually exceeds either some internal address limit, an operating system limit, or a physical resources limit (in my case it gets to about 1.8 GB in RAM). Then it falls over. Since it falls over when Task Manager shows I still have plenty of memory left, I think it's an internal limit.
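
The arithmetic behind that estimate can be checked quickly. A minimal sketch in Python, using only the figures quoted above (900 citation updates, 8 to 14 MB per undo entry, a 45 MB baseline, failure at roughly 1.8 GB, taken here as 1800 MB); the function names are made up for illustration:

```python
def undo_stack_estimate_mb(updates, mb_per_update):
    """Total undo-stack size if every update leaves one entry behind."""
    return updates * mb_per_update


def updates_until_limit(limit_mb, baseline_mb, mb_per_update):
    """Whole number of updates that fit between the baseline and the limit."""
    return (limit_mb - baseline_mb) // mb_per_update


UPDATES = 900          # "900+ sources" from the post
BASELINE_MB = 45       # Word plus the loaded document
OBSERVED_LIMIT_MB = 1800  # "about 1.8 GB", rounded for this sketch

low = undo_stack_estimate_mb(UPDATES, 8)
high = undo_stack_estimate_mb(UPDATES, 14)
print(f"estimated undo stack: {low} MB to {high} MB")
# prints "estimated undo stack: 7200 MB to 12600 MB"

worst = updates_until_limit(OBSERVED_LIMIT_MB, BASELINE_MB, 14)
best = updates_until_limit(OBSERVED_LIMIT_MB, BASELINE_MB, 8)
print(f"updates before the limit: {worst} to {best}")
# prints "updates before the limit: 125 to 219"
```

Under these assumptions the process would run out of room after a few hundred updates at most, well short of 900.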

How did I come to this conclusion?

I've spent days researching. I previously tried switching off autosave, with no change.
I monitored the memory use of the winword.exe process while manually updating my citations one at a time. The app with a blank doc takes about 25 MB. Add in my doc and it rises to 45 MB. Edit the first citation and it rises to 57 MB, edit the next and it's 69 MB, and so on. I quickly get to 150 or 170 MB before I chicken out, exit "manage sources," and save the doc. I verify sources.xml is also updated.
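
The post does not say which tool was used for the monitoring. One way to take the same readings from a script, sketched in Python under the assumption that the Windows `tasklist` command is available and reports memory as a figure followed by "K": `parse_tasklist_csv` and `sample_winword` are hypothetical helper names, and the sample line is made-up data in the shape `tasklist /FO CSV /NH` produces.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Return {pid: memory_in_mb} from `tasklist /FO CSV /NH` output."""
    usage = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5:
            continue  # blank lines and "INFO: No tasks ..." messages
        # The memory column looks like "58,368 K"; keep only the digits.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            usage[int(row[1])] = int(digits) / 1024  # K to MB
    return usage


def sample_winword():
    """Windows only: ask tasklist about winword.exe and parse the reply."""
    reply = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq winword.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(reply)


# Made-up sample line, so the parser can be tried without Word running.
SAMPLE = '"WINWORD.EXE","4321","Console","1","58,368 K"\n'
print(parse_tasklist_csv(SAMPLE))  # prints {4321: 57.0}
```

Calling `sample_winword()` before and after each citation edit would give the same kind of 45, 57, 69 MB series described above.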

SAVE does NOT clear the undo stack - there's no change in memory before and after SAVE.

On the ribbon, the Undo button offers multiple instances of "select bibliography style" even after save.

In older versions of Word you could turn off the UNDO function, but it appears that from Word 2007 onwards you can't. If anyone knows of a way to disable UNDO in Word, that should solve this issue.

On other fora I hear that there has been no significant change in Word 2010 or 2013 in this area.

It's a diagnosis but not a solution.
Dec 10, 2013 at 5:04 PM
FIXED (kinda) in the 64-bit edition of Word for Office 365!

I eventually found a phone number and got through to MS tech support. I uninstalled Office 365, then reinstalled the 64-bit edition of Office 365 (MS hide this under the "languages and advanced options" link). I ran "update citations and bibliography" on the problem doc. It ran for 3 hours, and I watched the winword process go up to 3.2 GB for a while.

But it completed WITHOUT CRASHING!!

Therefore it is the memory limit in the 32-bit app that is making it fall over, BUT there has to be a smarter way to code this transaction, from both a memory use standpoint and a performance standpoint.

Along the way, MS confirmed that renaming the undo via a macro does not in fact disable the undo stack in Word versions after 2003, so I was on the money blaming the bloated undo stack.