OrderBy ignoring accented letters
I want a method like OrderBy() that always sorts as if accented letters were their non-accented equivalents. I already tried to override OrderBy(), but it seems I can't because it is a static method.
So now I want to create a custom lambda expression for OrderBy(), like this:
public static IOrderedEnumerable<TSource> ToOrderBy<TSource, TKey>(
    this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    if (source == null)
        return null;
    var seenKeys = new HashSet<TKey>();
    var culture = new CultureInfo("pt-PT");
    return source.OrderBy(element => seenKeys.Add(keySelector(element)),
        StringComparer.Create(culture, false));
}
However, I'm getting this error:
Error 2 The type arguments for method 'System.Linq.Enumerable.OrderBy<TSource,TKey>(System.Collections.Generic.IEnumerable<TSource>, System.Func<TSource,TKey>, System.Collections.Generic.IComparer<TKey>)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
It seems it doesn't like StringComparer. How can I solve this?
Note: I already tried to use RemoveDiacritics() from here, but I don't know how to use that method in this case. So I tried something like this, which seems nice too.
OrderBy takes a keySelector as its first explicit argument. For a sequence of strings, this keySelector should be a Func<string, TKey>. So you need a method that takes a string and returns the value by which your enumeration should be sorted.
Unfortunately, I'm not sure how to determine whether a character is an "accented letter". The RemoveDiacritics method doesn't work for my é.
So let's assume you have a method called IsAccentedLetter that determines whether a character is an accented letter:
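public bool IsAccentedLetter(char c)
{
    // I'm afraid this does NOT really do the job: a precomposed 'é' is a
    // single letter, so no NonSpacingMark shows up unless the string is
    // first normalized to FormD.
    return CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark;
}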
So you can sort your list like this:
var culture = new CultureInfo("pt-PT"); // same culture as in the question
string[] myStrings = getStrings(); // wherever your strings come from
var ordered = myStrings.OrderBy(
    s => new string(s.Select(c => IsAccentedLetter(c) ? ' ' : c).ToArray()),
    StringComparer.Create(culture, false));
The lambda expression takes a string and returns the same string with each accented letter replaced by a space. OrderBy then sorts your enumeration by these strings and so "ignores" the accented letters.
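A small sketch of why the per-character check above misses é: the usual precomposed form is one code point classified as a letter, and the combining accent only appears as a separate NonSpacingMark after normalizing to FormD. The names DecompositionDemo and Categories are made up for this illustration.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

public static class DecompositionDemo
{
    // Hypothetical helper: lists the Unicode category of every char in s.
    public static string Categories(string s) =>
        string.Join(",", s.Select(c => CharUnicodeInfo.GetUnicodeCategory(c)));

    public static void Main()
    {
        string precomposed = "\u00E9"; // 'é' as a single code point
        string decomposed = precomposed.Normalize(NormalizationForm.FormD);

        Console.WriteLine(Categories(precomposed)); // LowercaseLetter
        Console.WriteLine(Categories(decomposed));  // LowercaseLetter,NonSpacingMark
    }
}
```

So a check for NonSpacingMark is only meaningful on text that has been decomposed first.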
UPDATE: If you have a working method RemoveDiacritics(string s) that returns the string with the accented letters replaced as you want, you can simply call OrderBy like this:
string[] myStrings = getStrings();
var ordered = myStrings.OrderBy(RemoveDiacritics, StringComparer.Create(culture, false));
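A different route, not the one taken in this thread, is to leave the keys untouched and let the comparer ignore combining marks through CompareOptions.IgnoreNonSpace. The sketch below assumes .NET 4.5 or later (for Comparer<string>.Create) and a runtime with normal culture data. AccentInsensitiveSketch and IgnoringAccents are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class AccentInsensitiveSketch
{
    // Hypothetical helper: a comparer that treats "á" and "a" as equal
    // by asking the culture's CompareInfo to ignore non-spacing marks.
    public static IComparer<string> IgnoringAccents(CultureInfo culture)
    {
        CompareInfo info = culture.CompareInfo;
        return Comparer<string>.Create(
            (a, b) => info.Compare(a, b, CompareOptions.IgnoreNonSpace));
    }

    public static void Main()
    {
        // The invariant culture keeps the sketch portable; the question uses pt-PT.
        var comparer = IgnoringAccents(CultureInfo.InvariantCulture);
        string[] words = { "zebra", "\u00E1gua", "bola" };
        foreach (var word in words.OrderBy(s => s, comparer))
            Console.WriteLine(word);
    }
}
```

With this approach the original strings are never modified, which may matter if the sorted values are displayed afterwards.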
Solved! I was getting that error because StringComparer can only be used when the key that the OrderBy() expression sorts by is a string. In my first attempt the key was the bool returned by seenKeys.Add(), so the compiler could not infer a single TKey.
So I convert the key to a string and run it through the RemoveDiacritics() method, so that accented letters are treated like their non-accented equivalents.
public static IOrderedEnumerable<TSource> ToOrderBy<TSource, TKey>(
    this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    if (!source.SafeAny())
        return null;
    return source.OrderBy(element => Utils.RemoveDiacritics(keySelector(element).ToString()));
}
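For comparison, here is a minimal sketch of how the inference error goes away once the key type is fixed to string, so that it lines up with the IComparer<string> that StringComparer implements. OrderByText and OrderingSketch are hypothetical names, and passing the culture as a parameter is a choice made only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class OrderingSketch
{
    // Hypothetical variant: the key selector must return a string, so
    // TKey is unambiguous and StringComparer is an acceptable comparer.
    public static IOrderedEnumerable<TSource> OrderByText<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, string> keySelector,
        CultureInfo culture)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return source.OrderBy(keySelector, StringComparer.Create(culture, false));
    }

    public static void Main()
    {
        string[] names = { "Rui", "Ana", "Eva" };
        // Invariant culture for portability; the question uses pt-PT.
        var sorted = names.OrderByText(s => s, CultureInfo.InvariantCulture);
        Console.WriteLine(string.Join(", ", sorted)); // Ana, Eva, Rui
    }
}
```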
To guarantee that RemoveDiacritics() works correctly, I added an HtmlDecode() line.
public static string RemoveDiacritics(string text)
{
    // Return early on null; otherwise Normalize() below would throw.
    if (text == null)
        return null;
    text = WebUtility.HtmlDecode(text);
    // FormD splits each accented letter into a base letter plus combining marks.
    string formD = text.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    foreach (char ch in formD)
    {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
        if (uc != UnicodeCategory.NonSpacingMark)
        {
            sb.Append(ch);
        }
    }
    return sb.ToString().Normalize(NormalizationForm.FormC);
}
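A self-contained usage sketch of the same decompose, strip and recompose idea applied as a sort key. It leaves out the HtmlDecode() step and uses an ordinal comparer so the result does not depend on the machine's culture. SortDemo and StripMarks are hypothetical names, not part of the code above.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

public static class SortDemo
{
    // Hypothetical compact equivalent of the stripping loop:
    // decompose, drop combining marks, recompose.
    public static string StripMarks(string s)
    {
        var kept = s.Normalize(NormalizationForm.FormD)
            .Where(c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            .ToArray();
        return new string(kept).Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        string[] words = { "zebra", "\u00E1rvore", "casa", "\u00E1gua", "bola" };

        // Sort by the stripped key, but keep the original strings in the result.
        var ordered = words.OrderBy(StripMarks, StringComparer.Ordinal);

        Console.WriteLine(string.Join(", ", ordered));
        // água, árvore, bola, casa, zebra
    }
}
```

Without the stripped key, an ordinal sort would place the words starting with á after zebra, because U+00E1 has a higher code point than z.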