Several calls to StringBuilder.ToString(int startIndex, int length)
pass an end index where a length is expected, which can cause a
System.ArgumentOutOfRangeException.
Explanations with examples and suggested fixes follow
below:
===================================================
Analysis\DE\GermanStemmer.Strip(System.Text.StringBuilder buffer)
The subexpression:
...buffer.ToString(buffer.Length - 2, buffer.Length)...
causes:
System.ArgumentOutOfRangeException
"Index and length must refer to a location within the string.
Parameter name: length"
Example:
StringBuilder.ToString(int startIndex, int length) ==>
buffer.ToString(buffer.Length - 2, buffer.Length) with
buffer.Length = 5 ==>
buffer.ToString(5 - 2, 5) ==>
startIndex + length = 3 + 5 = 8 > buffer.Length = 5 ==>
System.ArgumentOutOfRangeException
solution:
buffer.ToString(buffer.Length - 2, 2)
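The failure and the fix can be reproduced in isolation. The following is only a minimal sketch, not the stemmer's actual code: the class name ToStringDemo and the helper Tail are hypothetical, introduced to show that the second argument is a character count rather than an end index.

```csharp
using System;
using System.Text;

public static class ToStringDemo
{
    // Hypothetical helper: returns the last `count` characters of `buffer`,
    // or the whole buffer if it holds `count` characters or fewer.
    public static string Tail(StringBuilder buffer, int count)
    {
        if (count >= buffer.Length)
        {
            return buffer.ToString();
        }
        // The second argument is a length, not an end index.
        return buffer.ToString(buffer.Length - count, count);
    }

    public static void Main()
    {
        StringBuilder buffer = new StringBuilder("abcde");

        try
        {
            // startIndex 3 + length 5 exceeds buffer.Length (5), so this throws.
            buffer.ToString(buffer.Length - 2, buffer.Length);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("ArgumentOutOfRangeException");
        }

        Console.WriteLine(Tail(buffer, 2)); // prints "de"
    }
}
```

The length guard in Tail is an extra precaution for short buffers; the fix suggested above only changes the second argument.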
===================================================
Analysis\DE\GermanStemmer.RemoveParticleDenotion(System.Text.StringBuilder buffer)
The statement:
...
for (int c = 0; c < buffer.Length - 3; c++)
{
    if (buffer.ToString(c, c + 4).Equals("gege"))
    {
        buffer.Remove(c, c + 2 - c);
        return;
    }
}
...
causes, during the second iteration (c = 1) if the buffer has a
length of 5:
System.ArgumentOutOfRangeException
"Index and length must refer to a location within the string.
Parameter name: length"
Example:
StringBuilder.ToString(int startIndex, int length) ==>
buffer.ToString(c, c + 4) with buffer.Length = 5 and c = 1 ==>
buffer.ToString(1, 1 + 4) ==>
startIndex + length = 1 + 5 = 6 > buffer.Length = 5 ==>
System.ArgumentOutOfRangeException
solution 1:
// use this if multiple occurrences of "gege..." like "gegegege..."
// should be truncated to "ge..."
while (buffer.Length > 4 && buffer.ToString(0, 4).Equals("gege"))
{
    buffer.Remove(0, 2);
}
return;
solution 2:
// use this if a word starting with "gege..." should be
// truncated to "ge..."
if (buffer.Length > 4 && buffer.ToString(0, 4).Equals("gege"))
{
    buffer.Remove(0, 2);
}
return;
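To compare the two proposals side by side, here is a small self-contained sketch. The class GegeDemo and the method names StripRepeated and StripOnce are hypothetical wrappers around the two snippets above; they are not part of GermanStemmer.

```csharp
using System;
using System.Text;

public static class GegeDemo
{
    // Solution 1: strips "ge" repeatedly while the buffer still starts with "gege".
    public static void StripRepeated(StringBuilder buffer)
    {
        while (buffer.Length > 4 && buffer.ToString(0, 4).Equals("gege"))
        {
            buffer.Remove(0, 2);
        }
    }

    // Solution 2: strips a single leading "ge" if the buffer starts with "gege".
    public static void StripOnce(StringBuilder buffer)
    {
        if (buffer.Length > 4 && buffer.ToString(0, 4).Equals("gege"))
        {
            buffer.Remove(0, 2);
        }
    }

    public static void Main()
    {
        StringBuilder a = new StringBuilder("gegegeben");
        StripRepeated(a);
        Console.WriteLine(a); // prints "geben"

        StringBuilder b = new StringBuilder("gegegeben");
        StripOnce(b);
        Console.WriteLine(b); // prints "gegeben"
    }
}
```

Both variants only inspect the start of the buffer, whereas the quoted loop scans every position for "gege"; whether that difference matters is a judgment call for the maintainers.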
===================================================
Analysis\DE\GermanStemmer.Optimize(System.Text.StringBuilder buffer)
The subexpression:
...buffer.ToString(buffer.Length - 5, buffer.Length)...
causes:
System.ArgumentOutOfRangeException
"Index and length must refer to a location within the string.
Parameter name: length"
Example:
StringBuilder.ToString(int startIndex, int length) ==>
buffer.ToString(buffer.Length - 5, buffer.Length) with
buffer.Length = 7 ==>
buffer.ToString(7 - 5, 7) ==>
startIndex + length = 2 + 7 = 9 > buffer.Length = 7 ==>
System.ArgumentOutOfRangeException
solution:
buffer.ToString(buffer.Length - 5, 5)
Logged In: YES
user_id=778461
I've used lucene.net-1.4.3.final-002-22Feb05.src (which
should be the latest version)
Logged In: YES
user_id=446709
Exception traces for me on Mono 1.1.7:
http://leuksman.com/pages/lucene-bug
A patch that I think works:
http://leuksman.com/misc/Lucene-Net-Analysis-DE.diff