I have the following HTML code:
106.2%
I extract the number from it in two phases:
R
I want to share the solution I have found for my problem.
So, I can have HTML tags like the following:
0.9
106.4%
Or simpler:
51.4
First, I take the entire line with the following code:
MatchCollection mPrevious = Regex.Matches(html, "\\s*(.*?)\\s* ", RegexOptions.Singleline);
And second, I use the following code to extract the numbers only:
foreach (Match m in mPrevious)
{
    if (m.Groups[1].Value.Contains("span"))
    {
        // SPAN case: match the two numbers joined by `">`, then keep only the text after '>'.
        string stringtemp = Regex.Match(m.Groups[1].Value, "-?\\d+.\\d+.\">-?\\d+.\\d+|-?\\d+.\\d+\">-?\\d+.\\d+|-?\\d+.\">-?\\d+|-?\\d+\">-?\\d+").Value;
        int indextemp = stringtemp.IndexOf(">");
        if (indextemp <= 0) break; // nothing matched: stop processing
        lPrevious.Add(stringtemp.Remove(0, indextemp + 1));
    }
    else
    {
        // Simple case: the captured text holds a single number.
        lPrevious.Add(Regex.Match(m.Groups[1].Value, @"-?\d+.\d+|-?\d+").Value);
    }
}
First I check whether there is a SPAN tag. If there is, I match the two numbers together, using a regular expression that covers the different possibilities (with or without decimals). Then I find the ">" character, which marks where the unimportant information ends, and remove everything up to and including it. If there is no SPAN tag, I extract the number directly.
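As a possible simplification of the second phase, a capture group can pick out the number after `">` directly, so the IndexOf/Remove step is not needed. This is only a sketch and not the code I actually run: the class name `NumberExtractor`, the method `ExtractNumber`, and the sample markup `<span title="106.2%">106.4%</span>` are assumptions made for illustration. It also escapes the decimal point, which is stricter than the patterns above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original solution.
public static class NumberExtractor
{
    // Optional sign, digits, optional decimal part (e.g. -3, 51.4).
    private const string Number = @"-?\d+(?:\.\d+)?";

    // Captures the number that immediately follows `">`.
    private static readonly Regex AfterAttribute = new Regex("\">(" + Number + ")");

    // Matches the first number anywhere in the text.
    private static readonly Regex Plain = new Regex(Number);

    public static string ExtractNumber(string fragment)
    {
        if (fragment.Contains("span"))
        {
            // SPAN case: keep only the number after the closing quote and '>'.
            Match m = AfterAttribute.Match(fragment);
            return m.Success ? m.Groups[1].Value : string.Empty;
        }

        // Simple case: the fragment holds a single number.
        return Plain.Match(fragment).Value;
    }
}
```

With the assumed markup, `NumberExtractor.ExtractNumber("<span title=\"106.2%\">106.4%</span>")` returns `106.4`, and `NumberExtractor.ExtractNumber("51.4")` returns `51.4`. An empty string is returned when a SPAN fragment has no number after `">`, so the caller can skip that item instead of leaving the loop.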
It works perfectly.
Thank you all for the support and quick answers.