Clicking the configure link in the extractor module details screen (in the plugin details screen) brings you to the following screen:
That is where an administrator can configure approximately how many events can be indexed. If the approach of the patch is acceptable, please let me know and I'll commit it to trunk.
David,
Thanks for the investigation mate. Awesome catch! We should've been suspicious of seeing a method call taking in 10000 as a parameter. However, I don't see great value in indexing more than one occurrence of a recurring event, even if it is scoped to two years. It would be better to simply index unique events and one occurrence of each recurring event. This is sufficient to identify a page with the calendar we want to find. Otherwise, we would be polluting the index with potentially hundreds of occurrences of the terms used in a recurring event's description, with little benefit. Cheers,
You're right. How about this? We give it a configurable scope and we only index unique events?
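As a rough illustration of the "configurable scope, unique events only" idea, here is a minimal sketch. The `Occurrence` class, its fields, and `textToIndex` are hypothetical stand-ins and not the plugin's actual types; the sketch only assumes that every occurrence of a recurring event shares one event ID.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UniqueEventIndexer {
    /** Hypothetical stand-in for one calendar occurrence; not the plugin's real type. */
    static final class Occurrence {
        final String eventId;      // shared by all occurrences of a recurring event (assumption)
        final String description;  // text that would be handed to the search index
        final LocalDate date;

        Occurrence(String eventId, String description, LocalDate date) {
            this.eventId = eventId;
            this.description = description;
            this.date = date;
        }
    }

    /**
     * Returns one description per event ID, considering only occurrences
     * whose date lies in [from, to). Repeats of a recurring event are skipped.
     */
    static List<String> textToIndex(List<Occurrence> occurrences, LocalDate from, LocalDate to) {
        Map<String, String> byEvent = new LinkedHashMap<>();
        for (Occurrence o : occurrences) {
            boolean inScope = !o.date.isBefore(from) && o.date.isBefore(to);
            if (inScope) {
                byEvent.putIfAbsent(o.eventId, o.description); // keep the first occurrence only
            }
        }
        return new ArrayList<>(byEvent.values());
    }
}
```

With this shape, a weekly meeting contributes its description to the index once, no matter how wide the configured scope is.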
From a quick look, here is some of my feedback:
That's true. It was never meant for rendering, which is why it does not append to the buffer. This is related to:
This doesn't take care of the case where matching markup appears in the bodies of other macros that do not render their bodies. For instance, you could have the calendar macro markup in the body of {noformat}. Also, it doesn't account for what happens to the index if the calendar macro is disabled and there are still events attached to the page. I think the events shouldn't be indexed if the macro is disabled or uninstalled. The only reliable way I can think of to find all calendar IDs and address the problems above would be to "collect" them as the WikiMarkupParser goes through them.
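To make the {noformat} concern concrete, here is a small sketch of a naive text scan for calendar IDs. The `{calendar:id=...}` markup shape, the regular expression, and the class name are assumptions for illustration only; this is not the patch's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NaiveCalendarIdScan {
    // Assumed markup shape: {calendar:id=someId} or {calendar:id=someId|other=params}
    private static final Pattern CALENDAR =
            Pattern.compile("\\{calendar:id=([^|}]+)[^}]*\\}");

    /** Collects every ID that textually matches, regardless of where it appears. */
    static List<String> scan(String markup) {
        List<String> ids = new ArrayList<>();
        Matcher m = CALENDAR.matcher(markup);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        String page = "{calendar:id=team}\n"
                + "{noformat}{calendar:id=example}{noformat}";
        // The second ID is only sample text inside {noformat}, yet it is still reported.
        System.out.println(scan(page));
    }
}
```

A text scan like this cannot tell a rendered macro from quoted markup, which is the reason for preferring to collect IDs while the markup is actually being parsed.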
No. There isn't a test for that. I planned to write some if the scoped indexing is an acceptable idea. I accept your point about handling noformat macros.
Great work! Attaching updated calendar event indexing timespan screen and patch that indexes only unique events.
Can we please simplify the extractor configuration so that:
Cheers, Good idea. I'll make that change.
Changes applied. I've committed the changes.
The problem seems to be caused by the extractor. It indexes all events from the start of time itself (year 0001) until 10,000 years later. In other words, a weekly event that starts in the first week of year 0001 forces the extractor to index over 520,000 occurrences (about 52 per year for 10,000 years). That's just one weekly recurring event. There might be more.
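The arithmetic can be checked with a short sketch that counts the weekly occurrences falling inside an indexing window. The class and method names are hypothetical, and the two-year window is just one example of a narrower scope.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class OccurrenceEstimate {
    /**
     * Counts weekly occurrences that start on windowStart and fall
     * before windowEnd (exclusive): days 0, 7, 14, ... of the window.
     */
    static long weeklyOccurrences(LocalDate windowStart, LocalDate windowEnd) {
        if (!windowEnd.isAfter(windowStart)) {
            return 0;
        }
        long days = ChronoUnit.DAYS.between(windowStart, windowEnd);
        return (days + 6) / 7; // ceiling of days / 7
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(1, 1, 1);
        // Window comparable to the one described above: year 0001 plus 10,000 years.
        System.out.println(weeklyOccurrences(start, start.plusYears(10_000)));
        // A narrower, two-year window for comparison.
        System.out.println(weeklyOccurrences(start, start.plusYears(2)));
    }
}
```

The gap between the two results is what a configurable window buys for every weekly recurring event in a calendar.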
For reference, see lines 54 and 55 of the Extractor source.
I've created a patch for this plugin which allows the administrator to configure the duration of events to be indexed – so that it isn't fixed to 10000 years. I'll attach it with more details in a while.
Also, I've noticed one little thing about the indexing feature of this plugin: it is not complete. New events do not get indexed, and my guess is that event changes and removals aren't indexed either.