Issue Details (XML | Word | Printable)

Key: CAL-263
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: David Chui
Reporter: Dave Loeng
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Confluence Calendar Plugin

Calendar extractor causes Confluence indexing to stall completely due to what seems like an infinite loop inside ical-4j

Created: 17/Jun/08 09:15 PM   Updated: 03/Aug/08 08:13 PM
Component/s: Indexer
Affects Version/s: 2.7
Fix Version/s: 2.7.2

Time Tracking:
Not Specified

File Attachments: 1. File cal-263.diff (63 kB)
2. File cal-263.diff (39 kB)
3. File cal-263.diff (39 kB)

Image Attachments:

1. cal-263-configure-extractor-2.png
(59 kB)

2. cal-263-configure-extractor.png
(57 kB)

Labels:


 Description  « Hide
A recent rebuild of the index of QA-EAC stalled at 99%. Several thread dumps performed during stall showed only one indexing thread:
"com.atlassian.confluence.search.lucene.ConfluenceMultiThreadedIndexer: 7" daemon prio=10 tid=0x00000000448cdc00 nid=0x12e9 runnable [0x0000000048315000..
0x0000000048316d10]
   java.lang.Thread.State: RUNNABLE
        at net.fortuna.ical4j.model.component.Observance.getCachedOnset(Observance.java:266)
        at net.fortuna.ical4j.model.component.Observance.getLatestOnset(Observance.java:158)
        at net.fortuna.ical4j.model.component.VTimeZone.getApplicableObservance(VTimeZone.java:239)
        at net.fortuna.ical4j.model.TimeZone.inDaylightTime(TimeZone.java:110)
        at java.util.TimeZone.getOffset(TimeZone.java:198)
        at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2026)
        at java.util.GregorianCalendar.computeTime(GregorianCalendar.java:2472)
        at java.util.Calendar.updateTime(Calendar.java:2463)
        at java.util.Calendar.getTimeInMillis(Calendar.java:1082)
        at java.util.Calendar.getTime(Calendar.java:1055)
        at net.fortuna.ical4j.model.Recur.getDates(Recur.java:521)
        at com.atlassian.confluence.extra.calendar.ical.model.EventDetails.processRecurrence(EventDetails.java:422)
        at com.atlassian.confluence.extra.calendar.ical.model.BaseEventDetails.getEvents(BaseEventDetails.java:179)
        at com.atlassian.confluence.extra.calendar.ical.model.ICalCalendar.findEvents(ICalCalendar.java:591)
        at com.atlassian.confluence.extra.calendar.model.CalendarGroup.findEvents(CalendarGroup.java:201)
        at com.atlassian.confluence.extra.calendar.extractor.CalendarExtractor.addFields(CalendarExtractor.java:54)
        at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:40)
        at com.atlassian.bonnie.search.BaseDocumentBuilder.getDocument(BaseDocumentBuilder.java:104)
        at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:89)
        at com.atlassian.confluence.search.lucene.ConfluenceObjectToDocumentConverter.convert(ConfluenceObjectToDocumentConverter.java:30)
        at com.atlassian.confluence.search.lucene.ConfluenceObjectQueue$1.indexCollection(ConfluenceObjectQueue.java:79)
        at com.atlassian.bonnie.index.QueueProcessingRunnableImpl.run(QueueProcessingRunnableImpl.java:39)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:304)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at $Proxy69.run(Unknown Source)
        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:987)
        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:528)
        at java.lang.Thread.run(Thread.java:619)

All other indexing threads had completed their respective queues, with only one thread remaining stuck trying to index the calendar events for a particular usage of the calendar macro.

In all thread dumps, this line appears:

        at net.fortuna.ical4j.model.Recur.getDates(Recur.java:521)

Digging the source of this method indicates a while(true) loop. Combined with possibly corrupt data, this is our current guess on where the problem lies.

Beta-4 of the ical4j shows a new implementation of the while loop that depends on maxCount. We have not investigated whether upgrading to this version will address the problem.

Until this is resolved, and to avoid stalling such an important administration task, we plan to completely remove the mapping to the extractor from the descriptor file (to prevent customers enabling it and harming themselves). Since this feature will have been broken for several versions of Confluence to due lucene compatibility issues anyway with little or no complaint from customers, we feel it safe to disable it for now until this is resolved.



 All   Comments   Work Log   Change History   FishEye   Crucible   Builds      Sort Order: Ascending order - Click to sort in descending order
David Chui added a comment - 22/Jul/08 01:59 AM
I did some investigation into this. Turns out it isn't the ical4j library that is causing the infinite loop (not a bug with ical4j). However, the problem has something to do with recurring events. I'm able to confirm this because my attempt to upgrade the library didn't solve the problem as we hoped it would. FYI, the upgrade required small changes to the plugin's code.

The problem seems to be caused by the extractor. It indexes all events from the start of time itself (year 0001) till 10,000 years later. That basically says if you have a weekly event scheduled to start in the first week of the first year in 0001, the extractor would have to index over 52000 events. That's just one weekly recurring event. There might be more.

For your reference, please refer to line 54 and 55 of the Extractor source.

I've created a patch for this plugin which allows the administrator to configure the duration of events to be indexed – so that it isn't fixed to 10000 years. I'll attach it with more details in a while.

Also, I've noticed one little thing with the indexing feature of this plugin. It is not complete. New events do not get indexed and my guess is, event changes and removals aren't indexed either.


David Chui added a comment - 22/Jul/08 02:04 AM
Clicking the configure link in the extractor module details screen (in the plugin details screen) brings you to the following screen:

That is where an administrator can configure how many events (approx) can be indexed.


David Chui added a comment - 22/Jul/08 02:05 AM
If the approach of the patch is acceptable, please let me know. I'll commit it to trunk.

Dave Loeng added a comment - 23/Jul/08 01:16 AM
David,

Thanks for the investigation mate. Awesome catch! We should've been suspicious of seeing a method call taking in 10000 as a parameter

However, I don't see great value in indexing more than one occurrence of the an recurring event in the index, even if it is scoped to two years. It would be better to simply index unique events and one occurrence of a recurring event. This is sufficient to identify a page with the calendar we want to find. Otherwise, we would be polluting the index with potentially hundreds of occurrences of terms used in a recurring event description with little benefit.

Cheers,
Dave


David Chui added a comment - 23/Jul/08 06:50 AM
You're right. How about this? We make it to have a configurable scope and we only index unique events?

Dave Loeng added a comment - 23/Jul/08 08:29 PM
From a quick look, here is some of my feedback:
  • your implementation of the WikiContentHandler does not honour the contract of the interface (you do not append any result to the buffer passed in). I would recommend against doing this. Unfortunately, there is nothing in the v2 renderer API that simply returns macro objects. Everything seems geared towards rendering and appending to buffers.
  • instead why not just use a simple regex to look for the id's
    "
    {calendar:.*id\\s*=(.+?)\\|?\\}"
  • when testing for this, instead of sub-classing the extractor with a method that collects ID's, you should specifically test the class you are using to extract the ID's instead.
  • i didn't see a test to ensure we only index events in your range

David Chui added a comment - 23/Jul/08 08:45 PM - edited

your implementation of the WikiContentHandler does not honour the contract of the interface (you do not append any result to the buffer passed in). I would recommend against doing this. Unfortunately, there is nothing in the v2 renderer API that simply returns macro objects. Everything seems geared towards rendering and appending to buffers.

That's true. It was never meant for rendering, that's why it is not appending to the buffer. This is related to:

instead why not just use a simple regex to look for the id's

This doesn't take care of the case where matching markup might appear in bodies of other macros which do not render their bodies. For instance, you could have the calendar macro markup in the body of {noformat}. Also, it doesn't take account of what happens to the index if the calendar macro is disabled and there are still event attached to the page. I think it shouldn't be indexed if the macro is disabled or uninstalled.

The only reliable way I can think of to find all calendar IDs and address the problems above would be to "collect" them as the WikiMarkupParser goes through them.

i didn't see a test to ensure we only index events in your range

No. There isn't a test for that. I planned to write some if the scoped indexing is an acceptable idea.


Dave Loeng added a comment - 23/Jul/08 10:58 PM
I accept your point about handling noformat macros.

Great work!


David Chui added a comment - 24/Jul/08 09:55 PM - edited
Attaching updated calendar event indexing timespan screen and patch that indexes only unique events.


Dave Loeng added a comment - 27/Jul/08 11:41 PM
Can we please simplify the extractor configuration so that:
  1. we only allow duration to be specified as a year unit (I can't think of any reasons right now why someone would want to specify a smaller granularity, especially since all versions till present has been locked at 10,000 years)
  2. we limit the range to 50 years so we prevent anyone from being able to crash indexing using silly values. No plugin should stall the core indexing process.

Cheers,
Dave


David Chui added a comment - 28/Jul/08 05:11 AM
Good idea. I'll make that change.

David Chui added a comment - 28/Jul/08 05:22 AM
Changes applied. I've committed the changes.

David Chui added a comment - 03/Aug/08 08:13 PM
Resolving issue.