extract

Extract single file from RAR archive with rarfile in Python

老子叫甜甜 submitted on 2021-02-08 07:54:35
Question: I have a RAR archive with 2 files and I want to extract only one. I found in another answer that I could use the rarfile package, which according to the documentation contains the extract function. However, when I try to run a script I get a FileNotFoundError: [WinError 2] and the following information: During handling of the above exception, another exception occurred: ... rarfile.RarCannotExec: Unrar not installed? (rarfile.UNRAR_TOOL='unrar'). From the information I could find, I saw it
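
The RarCannotExec message suggests that rarfile could not find an unrar executable under the default name. A minimal sketch of one way to handle this, assuming an unrar binary is installed somewhere on PATH; the helper names find_unrar and extract_one, and the file names in the usage note, are hypothetical and not part of the question:

```python
import shutil


def find_unrar(candidates=("unrar", "UnRAR", "unrar.exe", "UnRAR.exe")):
    """Return the full path of the first unrar-like executable on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def extract_one(archive_path, member, dest_dir, unrar_path=None):
    """Extract a single member of a RAR archive into dest_dir."""
    import rarfile  # third-party package, imported lazily: pip install rarfile

    tool = unrar_path or find_unrar()
    if tool:
        # Point rarfile at the backend explicitly, before the first extraction,
        # instead of relying on the default name 'unrar' being resolvable.
        rarfile.UNRAR_TOOL = tool
    with rarfile.RarFile(archive_path) as archive:
        archive.extract(member, path=dest_dir)
```

A call such as extract_one("archive.rar", "wanted.txt", "out") would then extract only that member. If find_unrar() returns None, the unrar tool itself is probably not installed, and installing the Python package alone will not help.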

Python pandas: extracting numbers within text to a new column

三世轮回 submitted on 2021-02-08 04:44:23
Question: I have the following text in column A: A hellothere_3.43 hellothere_3.9 I would like to extract only the numbers to another new column B (next to A), e.g.: B 3.43 3.9 I use: str.extract('(\d.\d\d)', expand=True) but this copies only the 3.43 (i.e. the exact number of digits). Is there a way to make it more generic? Many thanks! Answer 1: Use Regex. Ex: import pandas as pd df = pd.DataFrame({"A": ["hellothere_3.43", "hellothere_3.9"]}) df["B"] = df["A"].str.extract(r"(\d*\.?\d+)", expand=True) print
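
To see why the answer's pattern is more generic than the one in the question, it can be tried directly with the standard re module. This is only a sketch: the helper name first_number is hypothetical, and the pandas call in the comment is the assumed equivalent on a DataFrame.

```python
import re

# The answer's pattern as a raw string: optional leading digits, an optional
# decimal point, then at least one digit. Unlike '(\d.\d\d)', it does not
# require exactly two digits after the point.
NUMBER = re.compile(r"(\d*\.?\d+)")


def first_number(text):
    """Return the first number found in text as a string, or None."""
    match = NUMBER.search(text)
    return match.group(1) if match else None


values = ["hellothere_3.43", "hellothere_3.9"]
print([first_number(v) for v in values])

# Assumed pandas equivalent:
#   df["B"] = df["A"].str.extract(r"(\d*\.?\d+)", expand=False)
```

The extracted values are still strings; converting them with float() (or astype(float) in pandas) would be a separate step.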

Extract specific rows based on the set cut-off values in columns

◇◆丶佛笑我妖孽 submitted on 2021-02-04 19:46:46
Question: I have a TAB-delimited .txt file that looks like this. Gene_name A B C D E F Gene1 1 0 5 2 0 0 Gene2 4 45 0 0 32 1 Gene3 0 23 0 4 0 54 Gene4 12 0 6 8 7 4 Gene5 4 0 0 6 0 7 Gene6 0 6 8 0 0 5 Gene7 13 45 64 234 0 6 Gene8 11 6 0 7 7 9 Gene9 6 0 12 34 0 11 Gene10 23 4 6 7 89 0 I want to extract rows in which at least 3 columns have values > 0. How do I do this using pandas? I am clueless about how to use conditions in .txt files. Thanks very much! Update: adding on to this question, how do I
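
The condition "at least 3 columns have values > 0" amounts to counting the positive cells in each row and comparing the count with a cut-off. In pandas this is commonly written as df[(df.iloc[:, 1:] > 0).sum(axis=1) >= 3] after pd.read_csv(path, sep="\t"), assuming the first column holds the gene name. The sketch below spells out the same logic with the standard csv module so that it runs without extra packages; the function name rows_with_min_positive is hypothetical, and SAMPLE holds only the first four rows of the question's data.

```python
import csv
import io

# First four rows of the table from the question, TAB-delimited.
SAMPLE = (
    "Gene_name\tA\tB\tC\tD\tE\tF\n"
    "Gene1\t1\t0\t5\t2\t0\t0\n"
    "Gene2\t4\t45\t0\t0\t32\t1\n"
    "Gene3\t0\t23\t0\t4\t0\t54\n"
    "Gene4\t12\t0\t6\t8\t7\t4\n"
)


def rows_with_min_positive(handle, min_positive=3):
    """Return (header, rows) keeping rows with >= min_positive values above 0."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    kept = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # row[0] is the gene name; the remaining cells are numeric.
        positives = sum(1 for cell in row[1:] if float(cell) > 0)
        if positives >= min_positive:
            kept.append(row)
    return header, kept


header, kept = rows_with_min_positive(io.StringIO(SAMPLE))
print([row[0] for row in kept])
```

For a real file, the io.StringIO object would be replaced by an open file handle, for example open(path, newline="").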

Efficient way to extract text from between tags

元气小坏坏 submitted on 2021-02-04 13:44:11
Question: Suppose I have something like this: var = '<li> <a href="/...html">Energy</a> <ul> <li> <a href="/...html">Coal</a> </li> <li> <a href="/...html">Oil </a> </li> <li> <a href="/...html">Carbon</a> </li> <li> <a href="/...html">Oxygen</a> </li' What is the best (most efficient) way to extract the text in between the tags? Should I use regex for this? My current technique relies on splitting the string on li tags and using a for loop; I am just wondering if there is a faster way to do this. Answer 1: You
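
The answer is cut off here, so the following is not the answerer's method. It is a minimal sketch of one common alternative to regex or manual splitting, using the standard library's html.parser. The class name TextCollector and the function extract_text are hypothetical, and the sample is a shortened, well-formed version of the string in the question.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-blank text that appears between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only runs between tags
            self.chunks.append(text)


def extract_text(markup):
    """Return the list of text fragments found in markup."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return parser.chunks


sample = (
    '<li> <a href="/...html">Energy</a> <ul> '
    '<li> <a href="/...html">Coal</a> </li> '
    '<li> <a href="/...html">Oil </a> </li> '
    '</ul> </li>'
)
print(extract_text(sample))
```

A parser-based approach tends to tolerate nesting and attribute variations better than a regex. Whether it is faster than splitting on li tags would need to be measured on the real data.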

Efficient way to extract text from between tags

半城伤御伤魂 submitted on 2021-02-04 13:44:06
Question: Suppose I have something like this: var = '<li> <a href="/...html">Energy</a> <ul> <li> <a href="/...html">Coal</a> </li> <li> <a href="/...html">Oil </a> </li> <li> <a href="/...html">Carbon</a> </li> <li> <a href="/...html">Oxygen</a> </li' What is the best (most efficient) way to extract the text in between the tags? Should I use regex for this? My current technique relies on splitting the string on li tags and using a for loop; I am just wondering if there is a faster way to do this. Answer 1: You

Locate text position, extract text and insert in new column in MySQL

久未见 submitted on 2021-02-04 13:28:25
Question: I have the following example of rows in a MySQL table: Column A Row1 Lauguage=English&Country=USA&Gender=Male Row2 Gender=Female&Language=French&Country= Row3 Country=Canada&Gender=&Language=English How can I achieve the following? For example, I need to look for Country: I need to locate the position of Country in this text column, which changes from row to row. I then need to ignore the parameter 'Country=' and extract only the value. In some cases this will be NULL (as in Row2)

Extracting a substring using regex in SAS

蓝咒 submitted on 2021-01-29 12:09:30
Question: I have a string like this: dfjkldjfdsldfkdslfkd dfkdjd/FR018/HAHDFKDLFDAFHDKFJL/ABCD//NAME/I WANT TO EXTRACT THIS/JJJJ//NAME/blah blah blah In this string, I want to be able to pull the string I WANT TO EXTRACT THIS. In other words, I want to extract everything that follows /ABCD//NAME/ and comes before /JJJJ. How can I write this using regular expressions? Thanks. Answer 1: I am not familiar with SAS, but from the documentation it seems like you can do: re = prxparse('/\/ABCD\/\/NAME\/(.*?)\/(.*?)\
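
The SAS answer is truncated, so its full pattern is not recoverable here. The capture itself can still be checked outside SAS. The sketch below uses Python's re module, not SAS, and assumes the delimiters are exactly /ABCD//NAME/ and /JJJJ as described in the question; the function name between_markers is hypothetical.

```python
import re

TEXT = (
    "dfjkldjfdsldfkdslfkd dfkdjd/FR018/HAHDFKDLFDAFHDKFJL/ABCD//NAME/"
    "I WANT TO EXTRACT THIS/JJJJ//NAME/blah blah blah"
)

# Lazy capture of everything after '/ABCD//NAME/' up to the first '/JJJJ'.
PATTERN = re.compile(r"/ABCD//NAME/(.*?)/JJJJ")


def between_markers(text):
    """Return the text between the two markers, or None if they are absent."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(between_markers(TEXT))
```

In a SAS pattern delimited by slashes, each literal / would additionally need to be escaped as \/, as the truncated answer does.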

Extract companies' register number in Python by getting the next word

我是研究僧i submitted on 2021-01-29 06:54:40
Question: I am trying to get the German Handelsregisternummer (companies' register number), which is usually written directly after the word HRB. However, there are exceptions which I would like to catch with my regex. The goal is to call the function and set the keyword (in this case it is HRB). Then the function returns the number. Please see the regex demo! This is what I have so far, but it doesn't catch all cases. def get_company_register_number(string, keyword): reg_1 = fr'\b{keyword}\b[,:|\s]*(\w+)'
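
The function in the question is cut off after its first pattern. One way it might be completed is sketched below, using only the pattern that is visible. The return logic, the use of re.escape and the sample strings in the usage note are assumptions, not the asker's code, and the exceptions mentioned in the question are not covered.

```python
import re


def get_company_register_number(string, keyword):
    """Return the word that follows keyword, or None if keyword is absent."""
    # re.escape guards against keywords that contain regex metacharacters.
    pattern = fr"\b{re.escape(keyword)}\b[,:|\s]*(\w+)"
    match = re.search(pattern, string)
    return match.group(1) if match else None
```

With hypothetical inputs, get_company_register_number("Amtsgericht München, HRB 12345", "HRB") returns "12345", and a string without the keyword returns None. Because the capture is \w+, a word directly after the keyword is returned even when it is not a number; replacing \w+ with \d+ would be a stricter variant.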

Unexpected for-loop result after changing the third part of the for-loop

落花浮王杯 submitted on 2021-01-29 04:59:37
Question: When I use a for-loop in my source file, I get an unexpected result. Here is the minimal source file (I have omitted the file header and the print_set function): // the main function int main(void) { set<int> test{3, 5}; print_set(test); for (auto it = test.begin(); it != test.end();) { auto node = test.extract(it); ++it; } print_set(test); } Then I use these commands to compile and run: $ g++ --version g++ (Debian 8.3.0-6) 8.3.0 ... (information not very important for this question) $ g++ -std=c++17 temp

Extract values from specific row, column using bash

拜拜、爱过 submitted on 2021-01-28 15:14:58
Question: I have a sample log file with fields and values in this format: value1 value2 50 100 value3 value4 10 150 value5 200 I need to extract the fields and values into this format: value1=50 value2=100 value3=10 value4=150 value5=200 Answer 1: Try this awk: gawk '{split($0,n_arr," "); getline; n=split($0,v_arr," "); getline; for (i=1;i<=n;i++){print n_arr[i] "=" v_arr[i]}}' Answer 2: awk '/value/ { if ((getline values) > 0) { split(values, array) for (i = 1; i <= NF; i++) print $i "=" array[i] } }'