1. Scan a selected volume/HD folder for duplicate
files/folders across the entire collection.
(That could be useful when preparing an MP3 CD
to burn, to avoid burning duplicates.)
2. A smarter search algorithm (I mean: John
Smith == Jhon Smith == joHn Smith == Jòhn Smith ==
Jihn Smith)
Here is my idea of how to do that:
A. Convert accented letters to plain ones (ò->o, è->e, ...)
B. Convert to lowercase
C. Ignore anything but letters and numbers
D. Count how many times each character occurs
E. Decide whether two names look alike based on
a "maximum different characters" parameter
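Steps A to D could be sketched roughly like this in Python. This is only an illustration of the idea, not an existing implementation: the function name `signature` is made up, and stripping accents through Unicode decomposition is just one possible way to do step A.

```python
import unicodedata
from collections import Counter

def signature(name: str) -> Counter:
    """Hypothetical helper: character counts of a normalized name (steps A-D)."""
    # A. Split accented letters into base letter + combining mark,
    #    then drop the combining marks (assumed approach for step A).
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # B. Lowercase.
    lowered = stripped.lower()
    # C. Keep only letters and digits (spaces and punctuation are dropped).
    kept = [ch for ch in lowered if ch.isalnum()]
    # D. Count how many times each character occurs.
    return Counter(kept)
```

Two names whose signatures are equal contain exactly the same letters, whatever the case, accents, spacing or letter order.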
Example:
1. John Smith
   -> Converting: John Smith
   -> Lowercase: john smith
   -> Ignoring: johnsmith
   -> Counting characters: h(2) i(1) j(1) m(1) n(1) o(1) s(1) t(1)
2. Jhon Smith
   -> Converting: Jhon Smith
   -> Lowercase: jhon smith
   -> Ignoring: jhonsmith
   -> Counting characters: h(2) i(1) j(1) m(1) n(1) o(1) s(1) t(1)
3. joHn Smith
   -> Converting: joHn Smith
   -> Lowercase: john smith
   -> Ignoring: johnsmith
   -> Counting characters: h(2) i(1) j(1) m(1) n(1) o(1) s(1) t(1)
4. Jòhn Smith
   -> Converting: John Smith
   -> Lowercase: john smith
   -> Ignoring: johnsmith
   -> Counting characters: h(2) i(1) j(1) m(1) n(1) o(1) s(1) t(1)
5. Jihn Smith
   -> Converting: Jihn Smith
   -> Lowercase: jihn smith
   -> Ignoring: jihnsmith
   -> Counting characters: h(2) i(2) j(1) m(1) n(1) s(1) t(1)
Comparing results, we find that:
A. 1, 2, 3 & 4 match exactly
B. 5 matches 1, 2, 3 & 4 with a 2-character
difference (i++ & o--), and it can easily be
seen that 'o' has been replaced by 'i'.
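The comparison (step E) could look something like the sketch below. The names `char_difference`, `looks_alike` and `max_diff` are invented for the example, and the accented name is assumed to be "Jòhn". The difference is counted the same way as above: one for every extra or missing character.

```python
import unicodedata
from collections import Counter

def signature(name: str) -> Counter:
    """Hypothetical helper: normalize a name and count its characters."""
    decomposed = unicodedata.normalize("NFKD", name)
    # isalnum() is False for combining accents, spaces and punctuation,
    # so this filter strips accents and ignores non-alphanumerics in one go.
    return Counter(ch for ch in decomposed.lower() if ch.isalnum())

def char_difference(a: str, b: str) -> int:
    """Number of extra plus missing characters between two names."""
    sa, sb = signature(a), signature(b)
    # Counter subtraction keeps only positive counts, so both directions
    # are needed: (sa - sb) is what b lacks, (sb - sa) is what b has extra.
    return sum((sa - sb).values()) + sum((sb - sa).values())

def looks_alike(a: str, b: str, max_diff: int = 2) -> bool:
    """True if the names differ by at most max_diff characters (assumed default)."""
    return char_difference(a, b) <= max_diff

names = ["John Smith", "Jhon Smith", "joHn Smith", "Jòhn Smith", "Jihn Smith"]
for other in names[1:]:
    print(other, char_difference(names[0], other))
# Prints a difference of 0 for the first three and 2 for "Jihn Smith".
```

One thing to keep in mind: since only the counts are compared, any two names made of the same letters in a different order also match, so the `max_diff` value would probably need some tuning.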
Thanks.
Bye