non-printing-characters

Trying to remove non-printable characters (junk values) from a UNIX file

蓝咒 submitted on 2021-01-20 04:18:50
Question: I am trying to remove non-printable characters (e.g. ^@) from the records in my file. Because the file holds so many records, using cat in a loop is not an option: the loop takes too much time. I tried sed -i 's/[^@a-zA-Z 0-9`~!@#$%^&*()_+\[\]\\{}|;'\'':",.\/<>?]//g' FILENAME, but the ^@ characters are still not removed. I also tried awk '{ sub("[^a-zA-Z0-9\"!@#$%^&*|_\[](){}", ""); print } FILENAME > NEW FILE, but that did not help either. Can anybody suggest some alternative
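
One way to approach the question above without a per-line shell loop is to stream the file in large chunks and delete the unwanted bytes in a single pass. The following is only a minimal sketch in Python: the helper name strip_junk, the chunk size, and the choice of which bytes count as "junk" (ASCII control characters other than tab, LF and CR, plus DEL) are assumptions, not something stated in the question.

```python
import io

# Assumption: "junk" means every ASCII control byte except tab, LF and CR,
# plus DEL (0x7f). Adjust KEEP if your data needs other control characters.
KEEP = {0x09, 0x0A, 0x0D}
JUNK = bytes(b for b in range(0x20) if b not in KEEP) + b"\x7f"


def strip_junk(src, dst, chunk_size=1 << 20):
    """Copy src to dst in fixed-size chunks, deleting every byte in JUNK."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # bytes.translate(None, delete) removes the listed bytes and maps nothing.
        dst.write(chunk.translate(None, JUNK))


# In-memory demonstration; for real files, open both in binary mode ("rb"/"wb").
source = io.BytesIO(b"abc\x00def\x01\nline two\x7f\n")
cleaned = io.BytesIO()
strip_junk(source, cleaned)
print(cleaned.getvalue())
```

Because the deletion works on single bytes, chunk boundaries cannot split a match, so the chunk size only affects memory use. If NUL (^@) is the only unwanted byte, tr -d '\000' < FILENAME > NEWFILE is a commonly used shell alternative.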

Can someone explain exactly what the definition below means in the C standard about directives

霸气de小男生 submitted on 2020-07-03 01:40:23
Question: What I need to know is exactly which characters are allowed before the start of a directive. As we all know, newline and whitespace characters may appear before the # that starts a directive. I read the C standard on this and found the following definition: A preprocessing directive consists of a sequence of preprocessing tokens that satisfies the following constraints: The first token in the sequence is a # preprocessing token that (at the start of

How to remove non-printable/invisible characters in ruby?

时间秒杀一切 submitted on 2020-01-10 01:58:12
Question: Sometimes I have evil non-printable characters in the middle of a string. These strings are user input, so my program must handle them gracefully instead of trying to change the source of the problem. For example, they can contain a zero width no-break space in the middle of the string. While parsing a .po file, one problematic part was the string "he is a man of god" in the middle of the file. While everything seems correct, inspecting it with irb shows: "he is a man of god"
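
The filtering idea behind this question can be illustrated in a language-neutral way. The sketch below uses Python rather than Ruby, and both the helper name strip_invisible and the position of the hidden character in the sample string are assumptions made for illustration; the excerpt does not show where the character actually sits.

```python
def strip_invisible(text):
    """Drop every character that str.isprintable() reports as non-printable."""
    # Note: isprintable() also rejects tabs and newlines, so this filter
    # removes them too; keep them explicitly if the input spans several lines.
    return "".join(ch for ch in text if ch.isprintable())


# Hypothetical sample: U+FEFF (zero width no-break space) hidden mid-string.
sample = "he is a man\ufeff of god"
cleaned = strip_invisible(sample)

print(repr(sample))   # repr() makes the hidden character visible as \ufeff
print(repr(cleaned))
```

Printing with repr() plays the same role as inspecting the string in irb: the two strings look identical on screen but differ by one code point.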

Remove special characters from data frame

穿精又带淫゛_ submitted on 2019-12-18 12:34:56
Question: I have a matrix that contains the string "Energy per �m". Before the 'm' is a diamond-shaped symbol with a question mark in it; I don't know what it is. I tried to get rid of it by running this on the column of the matrix: a=gsub('Energy per �m','',a) (copying and pasting the first argument of gsub), but it does not work [unexpected symbol in "a=rep(5,Energy per"]. When I try to extract something from the original matrix with grepl I get: 46: In grepl("ref. value", raw$parameter) : input

Remove unwanted non-printable characters from large CSV files with millions of records in Python 3 or 2.7

瘦欲@ submitted on 2019-12-13 09:50:43
Question: I receive large CSV files, delimited with a comma, | or ^, that contain millions of records. Some of the fields contain non-printable characters such as CR|LF, which are interpreted as the end of a field. This is on Windows 10. I need to write Python code that goes through the file and removes CR|LF inside the fields. However, I can't remove all of them, because then the lines would be merged. I have gone through several posts here on how to remove non-printable characters. My thought is to load the data into a pandas DataFrame, then check every field for CR|LF and
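
A streaming alternative to loading everything into a DataFrame is to let the csv module separate record terminators from line breaks inside fields. This is a sketch under one important assumption that the excerpt does not confirm: fields containing CR or LF are enclosed in double quotes. The function name clean_rows and the sample data are hypothetical.

```python
import csv
import io
import re

LINE_BREAKS = re.compile(r"[\r\n]+")


def clean_rows(src, dst, delimiter=","):
    """Stream rows from src to dst, replacing CR/LF runs inside fields with a space."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        writer.writerow([LINE_BREAKS.sub(" ", field) for field in row])


# Hypothetical sample: the second record has a quoted field spanning two lines.
raw = 'id,comment\r\n1,"first line\r\nsecond line"\r\n2,plain\r\n'
src = io.StringIO(raw, newline="")  # newline="" leaves line endings untouched
dst = io.StringIO()
clean_rows(src, dst)
print(dst.getvalue())
```

For real files, open both with newline="" as the csv documentation recommends, and pass delimiter="|" or delimiter="^" as needed. Rows are processed one at a time, so memory use stays flat regardless of file size. If the embedded line breaks are not quoted, the reader cannot tell them from record ends, and a different heuristic (for example, counting delimiters per line) would be needed.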

Print_r() to PHP error_log() not working. (non-printing characters)

时光怂恿深爱的人放手 submitted on 2019-12-13 05:06:23
Question: I have a static method in a helper class, Utility::error_log(), that helps us gracefully debug HUGE objects in PHP. The method and its helper method Utility::toArray() are below: static function error_log($message, $data=null, $max=2) { if(is_array($data) || is_object($data)) $data = print_r(self::toArray($data, $max),true); if(is_array($message) || is_object($message)) $message = print_r(self::toArray($message, $max),true); if(!empty($data)) $data = "\n".$data; if (!strstr($message, PHP_EOL

Is it possible to echo some non-printable characters in batch/cmd?

橙三吉。 submitted on 2019-12-12 09:43:38
Question: Motivation: I have a rather long third-party .bat file, written for a specific function, that would take considerable effort to rewrite (an effort that my problem also hinders). In for loops, the most basic way to debug it would seem to be echoing some information to the screen. In other languages I used to do this with the \r (0x0D) character, which on some terminals/consoles rewrites the same line (to avoid flooding the output, since in my case the last line would contain the error). I already save the

Ruby convert non-printable characters into numbers

Deadly submitted on 2019-12-10 17:45:11
Question: I have a string with non-printable characters. What I am currently doing is replacing them with a tilde using: string.gsub!(/[^[:print:]]/, "~") However, I would actually like to convert them to their integer value. I tried this, but it always outputs 0: string.gsub!(/[^[:print:]]/, "#{$1.to_i}") Thoughts?
Answer 1: String#gsub and String#gsub! accept an optional block. The return value of the block is used for substitution. "\x01Hello\x02".gsub(/[^[:print:]]/) { |x| x.ord } # => "1Hello2"
Answer 2: Object

How do I get rid of this unicode character?

↘锁芯ラ submitted on 2019-12-07 17:09:14
Question: Any idea how to get rid of this irritating character, U+0092, from a bunch of text files? I've tried all of the below, but none of it works. The character map calls it U+0092 + control. sed -i 's/\xc2\x92//' * sed -i 's/\u0092//' * sed -i 's///' * Ah, I've found a way: CHARS=$(python2 -c 'print u"\u0092".encode("utf8")') sed 's/['"$CHARS"']//g' But is there a direct sed method for this?
Answer 1: Try sed "s/\`//g" * . (I added the g so it will remove all the backticks it finds). EDIT : It's not a
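
The asker's workaround builds the character's UTF-8 bytes with Python and hands them to sed. The same removal can be sketched entirely in Python, assuming the files are UTF-8 encoded (the question implies this but does not state it). The helper name strip_u0092 and the sample text are hypothetical.

```python
# U+0092 is a C1 control character; in UTF-8 it is the two-byte sequence C2 92.
TARGET = "\u0092".encode("utf-8")


def strip_u0092(data: bytes) -> bytes:
    """Remove every UTF-8 encoded U+0092 from a byte string."""
    # Byte-level replacement is safe in UTF-8: the sequence C2 92 cannot
    # appear inside the encoding of any other character.
    return data.replace(TARGET, b"")


# Hypothetical sample: U+0092 standing where an apostrophe was intended.
sample = "it\u0092s done".encode("utf-8")
cleaned = strip_u0092(sample)
print(TARGET)
print(cleaned)
```

Working on bytes avoids decoding errors if a file turns out to contain stray non-UTF-8 data. In a file that is actually Windows-1252 rather than UTF-8, the character appears as the single byte 0x92 instead, and TARGET would need to be b"\x92".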