Table1 has an nvarchar column called umsg that contains Unicode text and sometimes English text as well.
I want to find the English text present in the umsg column.
You did not say what you want when a single string contains both Unicode and ASCII characters, so I will give you one idea and one solution for the case where you only want to tell "pure English" rows from "mixed" rows.
You need a table of natural numbers to do this. If you don't have such a table, you can generate it like this:
select top 1000000 row_number() over(order by getdate()) as n
into dbo.nums
from sys.messages m1 cross join sys.messages m2;
alter table dbo.nums alter column n int not null;
alter table dbo.nums add constraint PK_nums_n primary key(n);
Now that you have a table of natural numbers, we are going to decompose your strings into single characters and check whether ascii(character) = unicode(character):
declare @t table(col Nvarchar(200));
insert into @t values
(N'ref no été'), (N'The livers detoxification processes.'), (N'फेंगशुई के छोटे-छोटे टिप्स से आप जीवन की विषमताओं से');
select t.col, n, substring(t.col, n, 1) as nth_character,
ascii(substring(t.col, n, 1)) as ascii,
unicode(substring(t.col, n, 1)) as uni
from @t t join dbo.nums n
on n.n <= len(t.col); -- this is to give you an idea how to see if it's unicode character or ascii
with cte as
(
select t.col, n, substring(t.col, n, 1) as nth_character,
ascii(substring(t.col, n, 1)) as ascii,
unicode(substring(t.col, n, 1)) as uni
from @t t join dbo.nums n
on n.n <= len(t.col)
)
select col,
case
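when sum(case when ascii = uni and uni < 128 then 1 else 0 end) = count(*) then 'English only' -- uni < 128 excludes code-page characters such as é, whose ascii() and unicode() values can coincide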
else 'Not only English'
end as eng_or_not
from cte
group by col -- row level solution
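The first part of the code shows your string character by character, along with each character's ASCII and Unicode codes: where the two are the same and below 128, it's an ASCII character. Equality alone is not always enough, because on a collation that uses code page 1252, accented characters such as é can also return matching values.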
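To get a feel for the comparison, you can try it on a few single characters. This is only a minimal sketch: the three sample characters are my own choice, and the value returned by ascii() depends on the database's default collation.

```sql
-- Compare ascii() and unicode() for a plain letter, an accented letter,
-- and a Devanagari letter. The sample characters are illustrative only.
select v.ch,
       ascii(v.ch)   as ascii_code,   -- nvarchar is converted to varchar first
       unicode(v.ch) as unicode_code  -- code point of the first character
from (values (N'A'), (N'é'), (N'फ')) as v(ch);
```

On a database whose default collation uses code page 1252, the two codes should typically match for A and for é, while फ cannot be converted and comes back from ascii() as the code of '?'. That is why checking that the Unicode code is below 128 is the safer test.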
The second part just checks whether all the characters are ASCII.
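Applied to your own table, the row-level check could look roughly like the sketch below. It assumes the table lives in the dbo schema (a guess on my part), that dbo.nums from above exists, and it adds a unicode(...) < 128 test so that code-page characters such as é are not counted as English.

```sql
-- Sketch only: dbo.Table1 is assumed from the question; adjust the schema name.
with cte as
(
    select t.umsg,
           ascii(substring(t.umsg, n.n, 1))   as ascii,
           unicode(substring(t.umsg, n.n, 1)) as uni
    from dbo.Table1 t
    join dbo.nums n
      on n.n <= len(t.umsg)   -- one row per character position
)
select umsg,
       case
           when sum(case when ascii = uni and uni < 128 then 1 else 0 end) = count(*)
               then 'English only'
           else 'Not only English'
       end as eng_or_not
from cte
group by umsg;
```

Rows where umsg is NULL or empty produce no characters in the join, so they will not appear in the result. If you need them, handle them separately.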