
Key: UWC-106
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Laura Kolker
Reporter: Laura Kolker
Votes: 0
Watchers: 0
Confluence Universal Wiki Converter

UWC should disallow illegal page names when importing to Confluence

Created: 19/Apr/07 09:02 AM   Updated: 30/May/07 11:50 AM
Component/s: framework
Affects Version/s: 40
Fix Version/s: 45

Time Tracking:
Not Specified

Description
Goal:
  • Disallow the characters that Confluence does not allow. (Try adding a page with a comma in it and see the list in the error message. Where is this information kept? Can we get it dynamically, so that if it changes we reflect the requirements of the given Confluence instance?)
  • While still converting the page, and keeping links accurate

So....
I'm thinking a two step process
From within the ConverterEngine, we always add two converters to the end of the converter string list.
These converters do the following things:
1) Make a list of pagenames that are illegal
2) For each illegal pagename, create a substitute that is legal
3) Change the illegal pagename to the legal substitute
4) Look through every page for links to the illegal page name and change the link to the legal substitute.

(Step 4 requires that steps 1-3 have already been completed and their results saved as state (maybe a hashtable?). Step 4 is therefore the second step in the "two-step" process, so we need two converters: the first does steps 1-3, and the second does step 4.)
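
The two-pass idea above can be sketched roughly as follows. This is an illustration in Python, not UWC's actual code: the function names, the underscore substitution rule, and the plain `[Page Title]` link syntax are all assumptions, and the illegal character set is taken from the Confluence error message quoted in the comments on this issue.

```python
# Assumed illegal characters, copied from the Confluence error message
# quoted in the comments on this issue.
ILLEGAL_CHARS = set(':@/\\|^#;[]{}<>')


def is_illegal(title):
    """Step 1: decide whether a page name contains an illegal character."""
    return any(ch in ILLEGAL_CHARS for ch in title)


def make_legal(title):
    """Step 2: hypothetical substitution rule, one underscore per illegal char."""
    return ''.join('_' if ch in ILLEGAL_CHARS else ch for ch in title)


def build_title_map(pages):
    """First converter (steps 1-3): rename pages and remember old -> new names."""
    mapping = {}   # the saved state: illegal name -> legal substitute
    renamed = {}
    for title, body in pages.items():
        if is_illegal(title):
            mapping[title] = make_legal(title)
        renamed[mapping.get(title, title)] = body
    return mapping, renamed


def rewrite_links(pages, mapping):
    """Second converter (step 4): point links at the legal substitutes."""
    result = {}
    for title, body in pages.items():
        for old, new in mapping.items():
            # Assumes links are written as [Page Title] with no alias part.
            body = body.replace('[' + old + ']', '[' + new + ']')
        result[title] = body
    return result
```

Because the second pass looks up exact old names in the saved mapping, it only works if every page in the set has gone through the first pass. The sketch also does not handle two illegal names that collapse to the same substitute.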



Laura Kolker added a comment - 14/May/07 01:26 PM
Alternatively, if step 3 is simple and predictable enough, we could just run the links from step 4 through the same syntax converter (instead of keeping the state in a hashtable).

Laura Kolker added a comment - 15/May/07 03:12 PM
What about the starting chars?

The message describing illegal pagenames says:

Page titles can not contain (:, @, /, \, |, ^, #, ;, [, ], {, }, <, >) or start with ($, .., ~).
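
A check against that message, including the starting characters, might look something like the sketch below. The function name `title_problems` is hypothetical, and the character and prefix lists are hard-coded from the message above rather than fetched from Confluence, so they would go stale if a given Confluence version changed its rules.

```python
# Both lists are copied from the error message above (an assumption that
# they match the target Confluence instance).
ILLEGAL_CHARS = ':@/\\|^#;[]{}<>'
ILLEGAL_PREFIXES = ('$', '..', '~')


def title_problems(title):
    """Return the illegal characters and prefixes found in a page title."""
    problems = [ch for ch in ILLEGAL_CHARS if ch in title]
    problems += [p for p in ILLEGAL_PREFIXES if title.startswith(p)]
    return problems
```

An empty list means the title passes both rules. Note that only a leading `..` is rejected here, so a title starting with a single `.` passes.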

Laura Kolker added a comment - 30/May/07 11:50 AM
There were generally two possible solutions (or a combination of the two): one that maintained the illegal names in some sort of state, and one that didn't.
The problem with the stateful one is that the user can't then convert their pages in batches, i.e., they have to run the UWC on all their pages at once in order to get the links correct.

The stateless one can't handle all edge cases for links. Currently, the most problematic edge case is if there are right brackets in the illegal name. The page name will be changed correctly, but the links won't be.

Luckily, many wikis have right brackets as illegal characters for page names as well, so this case is unlikely.
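
To make the right-bracket edge case concrete, here is a rough Python sketch of a stateless link rewrite. It is not the code that went into UWC: the regular expression, the underscore substitution, and the plain `[Page Title]` link syntax are assumptions chosen for illustration.

```python
import re

# Assumed illegal characters, from the Confluence error message quoted earlier.
ILLEGAL_CHARS = set(':@/\\|^#;[]{}<>')

# A link is assumed to run from '[' to the first ']' that follows it.
LINK = re.compile(r'\[([^\]]*)\]')


def make_legal(title):
    """Hypothetical substitution rule: one underscore per illegal character."""
    return ''.join('_' if ch in ILLEGAL_CHARS else ch for ch in title)


def rewrite_links_stateless(body):
    """Apply the page-name rule to every link target; no saved state needed."""
    return LINK.sub(lambda m: '[' + make_legal(m.group(1)) + ']', body)
```

With these assumptions, a link such as `[Notes: Draft]` is rewritten to match the renamed page, and each batch of pages can be converted independently. A page named `a]b` is renamed to `a_b`, but its link `[a]b]` is parsed as a link to `a` followed by stray text, so the link is left unchanged and no longer matches the page.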

In the event that it does become a use case of interest, perhaps we could offer the stateful logic as an option that can be turned on in addition to the stateless logic.