I am reading in a large X12 file and parsing the information within. I have two bottleneck functions that I can't seem to work around: read_line() and get_element(). Is there any way I could make these two functions faster? The main bottleneck in get_element() seems to be the Substring method.
public String get_element(int element_number) {
    int count = 0;
    int start_index = 0;
    int end_index = 0;
    int current_index = 0;
    while (count < element_number && current_index != -1) {
        current_index = line_text.IndexOf(x12_reader.element_delimiter, start_index);
        start_index = current_index + 1;
        count++;
    }
    if (current_index != -1) {
        end_index = line_text.IndexOf(x12_reader.element_delimiter, start_index);
        if (end_index == -1) end_index = line_text.Length;
        return line_text.Substring(start_index, end_index - start_index);
    } else {
        return "";
    }
}
private String read_line() {
    string_builder.Clear();
    int n;
    while ((n = stream_reader.Read()) != -1) {
        if (n == line_terminator) return string_builder.ToString();
        string_builder.Append((char)n);
    }
    return string_builder.ToString();
}
I am reading X12 data. Here is an example of what it looks like: http://examples.x12.org/005010X221/dollars-and-data-sent-together/
Since your profiler tells you that get_element is a bottleneck, and the method itself is already coded efficiently, you need to minimize the number of times this method is called.
Calling get_element repeatedly in a loop forces it to perform the same parsing job again on every call, because each call rescans the line from the start:
for (int i = 0 ; i != n ; i++) {
    var element = get_element(i);
    ... // Do something with the element
}
You should be able to fix this problem by rewriting get_element as GetElements, returning all elements as a collection, and then taking individual elements from the same collection in a loop:
var allElements = GetElements();
for (int i = 0 ; i != n ; i++) {
    var element = allElements[i];
    ... // Do something with the element
}
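For illustration, here is a minimal sketch of what such a GetElements method could look like. The class name X12Line and the lineText and elementDelimiter parameters are hypothetical; they stand in for the line_text field and x12_reader.element_delimiter from the question so that the block is self-contained. It walks the line once with IndexOf, so each delimiter is located only once per line:

```csharp
using System;
using System.Collections.Generic;

public static class X12Line
{
    // Hypothetical helper (not part of the original code): splits one line
    // into all of its elements in a single left-to-right pass.
    public static List<string> GetElements(string lineText, char elementDelimiter)
    {
        var elements = new List<string>();
        int start = 0;
        while (true)
        {
            int end = lineText.IndexOf(elementDelimiter, start);
            if (end == -1)
            {
                // Last element: everything after the final delimiter.
                elements.Add(lineText.Substring(start));
                return elements;
            }
            elements.Add(lineText.Substring(start, end - start));
            start = end + 1;
        }
    }
}
```

Calling X12Line.GetElements(line, '*') once per line and indexing into the returned list replaces n calls that each rescan the line. Whether this is a net win depends on how many elements per line you actually use, since every element is materialized as a string.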
"In most cases I only need one or two elements."
In this case you could write a method that retrieves all the required indexes at once, for example by passing it a BitArray of the required indexes.
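A rough sketch of that idea follows, under stated assumptions: the class name X12LineSelective, the parameter names, and the choice to return null for elements that were not requested (or that lie past the end of the line) are all hypothetical and not from the original code. The line is scanned once, and Substring is only called for the indexes whose bit is set:

```csharp
using System;
using System.Collections;

public static class X12LineSelective
{
    // Hypothetical helper: returns an array with one slot per bit in `wanted`.
    // Slot i holds element i if wanted[i] is set and the line has that many
    // elements; otherwise the slot stays null.
    public static string[] GetElements(string lineText, char elementDelimiter, BitArray wanted)
    {
        var result = new string[wanted.Length];
        int start = 0;
        for (int i = 0; i < wanted.Length && start <= lineText.Length; i++)
        {
            int end = lineText.IndexOf(elementDelimiter, start);
            if (end == -1) end = lineText.Length;
            // The string allocation is only paid for requested elements.
            if (wanted[i]) result[i] = lineText.Substring(start, end - start);
            start = end + 1;
        }
        return result;
    }
}
```

The BitArray only needs to be as long as the highest index you want, so the scan also stops early instead of walking the whole line.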
Ok, second try. Discarding String.Split due to performance reasons, something like this should work much faster than your implementation:
// DISCLAIMER: typed on my cell phone, not tested. It surely has bugs, but you should get the idea.
public string get_element(int index)
{
    var buffer = new StringBuilder();
    var counter = 0; // index of the element currently being scanned
    foreach (var c in line_text)
    {
        if (c == x12_reader.element_delimiter)
        {
            counter++;
            if (counter > index)
                break; // past the requested element, stop scanning
        }
        else if (counter == index)
        {
            buffer.Append(c);
        }
    }
    return buffer.ToString();
}
I'm not sure what you are doing exactly, but if I'm understanding your code correctly, wouldn't get_element be simpler as follows?
public string get_element(int index)
{
    var elements = line_text.Split(new[] { x12_reader.element_delimiter });
    // Valid indexes run from 0 to Length - 1, so index == Length is also out of range.
    if (index >= elements.Length)
        return "";
    return elements[index];
}
Source: https://stackoverflow.com/questions/39230037/is-there-any-way-i-can-make-this-c-sharp-code-faster