Splitting a string into word-length-based lists in C#

Anonymous (unverified), submitted 2019-12-03 02:24:01

Question:

I have a string of words separated by spaces. How can I split the string into lists of words based on word length?

Example

input:

" aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa " 

output:

List 1 = { aa, bb, cc}
List 2 = { aaa, bbb, ccc}
List 3 = { aaaa, bbbb, cccc}

Answer 1:

Edit: I'm glad my original answer helped the OP solve their problem. However, after pondering the problem a bit, I've adapted it (and I strongly advise against my former solution, which I have left at the end of the post).

A simple approach

string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa ";
var words = input.Trim().Split().Distinct();
var lookup = words.ToLookup(word => word.Length);

Explanation

First, we trim the input to avoid empty elements from the outer spaces. Then, we split the string into an array. If multiple spaces can occur between the words, you'd need to use StringSplitOptions as in Mark's answer.
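To illustrate why consecutive spaces matter, here is a minimal sketch. The class name SplitDemo and the helper SplitWords are hypothetical names chosen for this example; only Split, Trim and StringSplitOptions.RemoveEmptyEntries come from the answers.

```csharp
using System;
using System.Linq;

public static class SplitDemo
{
    // Hypothetical helper: splits on whitespace and drops empty entries.
    public static string[] SplitWords(string input)
    {
        // A null separator array makes Split treat whitespace as the delimiter.
        return input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string input = " aa  aaa   bb ";

        // Trim + Split() removes only the outer spaces; inner runs still yield empty strings.
        string[] naive = input.Trim().Split();
        Console.WriteLine(naive.Count(w => w.Length == 0)); // 3

        // With RemoveEmptyEntries, only the real words remain.
        string[] clean = SplitWords(input);
        Console.WriteLine(string.Join("|", clean)); // aa|aaa|bb
    }
}
```

With RemoveEmptyEntries the Trim call becomes unnecessary, since the leading and trailing empty entries are dropped as well.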

After calling Distinct to only include each word once, we now convert words from IEnumerable<string> to Lookup<int, string>, where the words' length is represented by the key (int) and the words themselves are stored in the value (string).

Hang on, how is that even possible? Don't we have multiple words for each key? Sure, but that's exactly what the Lookup class is there for:

Lookup<TKey, TElement> represents a collection of keys each mapped to one or more values. A Lookup<TKey, TElement> resembles a Dictionary<TKey, TValue>. The difference is that a Dictionary maps keys to single values, whereas a Lookup maps keys to collections of values.

You can create an instance of a Lookup by calling ToLookup on an object that implements IEnumerable<T>.


Note
There is no public constructor to create a new instance of a Lookup. Additionally, Lookup objects are immutable, that is, you cannot add or remove elements or keys from a Lookup after it has been created.
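As a rough illustration of the description quoted above, the following sketch builds a lookup and reads from it. LookupDemo and ByLength are hypothetical names for this example. One detail worth knowing: unlike a Dictionary, indexing a lookup with a missing key returns an empty sequence instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Hypothetical helper: groups distinct words by their length.
    public static ILookup<int, string> ByLength(IEnumerable<string> words)
    {
        return words.Distinct().ToLookup(word => word.Length);
    }

    public static void Main()
    {
        var lookup = ByLength(new[] { "aa", "bbb", "cc", "aa" });

        Console.WriteLine(string.Join(", ", lookup[2])); // aa, cc
        Console.WriteLine(lookup.Contains(5));           // False
        Console.WriteLine(lookup[5].Count());            // 0 (empty sequence, no exception)
    }
}
```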

word => word.Length is the KeySelector lambda: it defines that we want to index (or group, if you will) the Lookup by the Length of the words.

Usage

Write all the words to the console

(similar to the question's originally requested output)

foreach (var grouping in lookup)
{
    Console.WriteLine("{0}: {1}", grouping.Key, string.Join(", ", grouping));
}

Output

2: aa, bb, cc
3: aaa, bbb, ccc
4: aaaa, bbbb, cccc

Put all words of a certain length in a List

List<String> list3 = lookup[3].ToList(); 

Order by key

(note that these will return IOrderedEnumerable<T>, so access by key is no longer possible)

var orderedAscending = lookup.OrderBy(grouping => grouping.Key);
var orderedDescending = lookup.OrderByDescending(grouping => grouping.Key);
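If both sorted iteration and access by key are needed, one possible approach (not part of the original answer) is to copy the groupings into a SortedDictionary. The names OrderedGroupsDemo and ToSorted below are hypothetical; this is a sketch rather than the recommended solution.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedGroupsDemo
{
    // Hypothetical helper: copies a lookup into a dictionary that keeps its keys sorted.
    public static SortedDictionary<int, List<string>> ToSorted(ILookup<int, string> lookup)
    {
        var sorted = new SortedDictionary<int, List<string>>();
        foreach (var grouping in lookup)
        {
            sorted[grouping.Key] = grouping.ToList();
        }
        return sorted;
    }

    public static void Main()
    {
        var lookup = new[] { "cccc", "aa", "bbb", "bb" }.ToLookup(w => w.Length);
        var sorted = ToSorted(lookup);

        Console.WriteLine(string.Join(", ", sorted.Keys)); // 2, 3, 4
        Console.WriteLine(string.Join(", ", sorted[2]));   // aa, bb
    }
}
```

Alternatively, simply keep the original lookup around for keyed access and use the ordered sequence only for iteration.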

Original answer - please don't do this (bad performance, code clutter):

string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa ";
Dictionary<int, string[]> results = new Dictionary<int, string[]>();
var grouped = input.Trim().Split().Distinct().GroupBy(s => s.Length)
    .OrderBy(g => g.Key); // or: OrderByDescending(g => g.Key);
foreach (var grouping in grouped)
{
    results.Add(grouping.Key, grouping.ToArray());
}


Answer 2:

You can use Where to find elements that match a predicate (in this case, having the correct length):

string[] words = input.Split();

List<string> twos = words.Where(s => s.Length == 2).ToList();
List<string> threes = words.Where(s => s.Length == 3).ToList();
List<string> fours = words.Where(s => s.Length == 4).ToList();

Alternatively you could use GroupBy to find all the groups at once:

var groups = words.GroupBy(s => s.Length); 
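A short sketch of how the groups returned by GroupBy might be consumed: each group exposes its Key (the word length) and can itself be enumerated. GroupByDemo and Describe are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // Hypothetical helper: formats each length group as "length: word, word".
    public static List<string> Describe(IEnumerable<string> words)
    {
        return words
            .GroupBy(s => s.Length)
            .Select(g => g.Key + ": " + string.Join(", ", g))
            .ToList();
    }

    public static void Main()
    {
        foreach (string line in Describe(new[] { "aa", "aaa", "bb", "bbb" }))
        {
            Console.WriteLine(line);
        }
        // 2: aa, bb
        // 3: aaa, bbb
    }
}
```

Groups come out in the order in which each key first appears in the input, so add an OrderBy on the key if sorted output is required.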

You can also use ToLookup so that you can easily index to find all the words of a specific length:

var lookup = words.ToLookup(s => s.Length);
foreach (var word in lookup[3])
{
    Console.WriteLine(word);
}

Result:

aaa
bbb
ccc

See it working online: ideone


In your update it looks like you want to remove the empty strings and duplicated words. You can do the former by using StringSplitOptions.RemoveEmptyEntries and the latter by using Distinct.

var words = input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
                 .Distinct();
var lookup = words.ToLookup(s => s.Length);

Output:

aa, bb, cc
aaa, bbb, ccc
aaaa, bbbb, cccc

See it working online: ideone



Answer 3:

First, let's declare a class that can hold a length as well as a list of words

public class WordList
{
    public int WordLength { get; set; }
    public List<string> Words { get; set; }
}

Now, we can build a list of word lists with

string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
string[] words = input.Trim().Split();
List<WordList> list = words
    .GroupBy(w => w.Length)
    .OrderBy(group => group.Key)
    .Select(group => new WordList {
        WordLength = group.Key,
        Words = group.Distinct().OrderBy(s => s).ToList()
    })
    .ToList();

The word lists are sorted by length, and the words within each list are sorted alphabetically.


Result

e.g.

list[2].WordLength ==> 4
list[2].Words[1] ==> "bbbb"

UPDATE

If you want, you can process the result immediately, instead of putting it into a data structure

string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";

var query = input
    .Trim()
    .Split()
    .GroupBy(w => w.Length)
    .OrderBy(group => group.Key);

// Process the result here
foreach (var group in query)
{
    // group.Key ==> length of words
    foreach (string word in group.Distinct().OrderBy(w => w))
    {
        ...
    }
}


Answer 4:

You can use LINQ's GroupBy.

Edit: I have now applied LINQ to generate the string list you wanted for output.

Edit 2: Handled repeated input words so that each appears once in the output, as in the edited question. It is just a Distinct call in LINQ.

string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";

var list = input.Split(' ');

var grouped = list.GroupBy(s => s.Length);

foreach (var elem in grouped)
{
    string header = "List " + elem.Key + ": ";
    // var line = elem.Aggregate((workingSentence, next) => next + ", " + workingSentence);

    // if you want single items, use this
    var line = elem.Distinct().Aggregate((workingSentence, next) => next + ", " + workingSentence);
    string full = header + " " + line;
    Console.WriteLine(full);
}

// output: please note the last blank in the input string! this generates the 0 list
List 0:  ,
List 2:  cc, bb, aa
List 3:  ccc, bbb, aaa
List 4:  cccc, bbbb, aaaa


Answer 5:

A somewhat lengthy solution, but it does get the result into a dictionary (a SortedDictionary keyed by word length):

class Program
{
    public static void Main()
    {
        Print();
        Console.ReadKey();
    }

    private static void Print()
    {
        GetListOfWordsByLength();

        foreach (var list in WordSortedDictionary)
        {
            list.Value.ForEach(i => { Console.Write(i + ","); });
            Console.WriteLine();
        }
    }

    private static void GetListOfWordsByLength()
    {
        string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";

        string[] inputSplitted = input.Split(' ');

        inputSplitted.ToList().ForEach(AddToList);
    }

    static readonly SortedDictionary<int, List<string>> WordSortedDictionary = new SortedDictionary<int, List<string>>();

    private static void AddToList(string s)
    {
        if (s.Length > 0)
        {
            if (WordSortedDictionary.ContainsKey(s.Length))
            {
                List<string> list = WordSortedDictionary[s.Length];
                list.Add(s);
            }
            else
            {
                WordSortedDictionary.Add(s.Length, new List<string> {s});
            }
        }
    }
}
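For comparison, the same dictionary-based idea can be written more compactly. This is only a sketch under assumptions: WordBuckets and Build are hypothetical names, it swaps ContainsKey plus indexer for TryGetValue, and the inline `out` declaration requires C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

public static class WordBuckets
{
    // Hypothetical helper: buckets the words of the input by length, with keys kept sorted.
    public static SortedDictionary<int, List<string>> Build(string input)
    {
        var buckets = new SortedDictionary<int, List<string>>();
        foreach (string word in input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        {
            // TryGetValue avoids the double lookup of ContainsKey followed by the indexer.
            if (!buckets.TryGetValue(word.Length, out List<string> list))
            {
                list = new List<string>();
                buckets.Add(word.Length, list);
            }
            list.Add(word);
        }
        return buckets;
    }

    public static void Main()
    {
        foreach (var pair in Build(" aa aaa aaaa bb bbb bbbb cc ccc cccc "))
        {
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
        }
        // 2: aa, bb, cc
        // 3: aaa, bbb, ccc
        // 4: aaaa, bbbb, cccc
    }
}
```

Like the answer above, this version keeps duplicate words; add a Distinct call or use a HashSet<string> per bucket if each word should appear only once.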

