openrefine

Text with /n matching in regex and Openrefine

て烟熏妆下的殇ゞ submitted on 2019-12-11 17:55:34
Question: I'm trying to filter a text that has new lines in OpenRefine. The input is:

Them Spanish girls love me like I'm Aventura
I'm the man, y'all don't get it, do ya?
Type of money, everybody acting like they knew ya
Go Uptown, New York City, bitch
Them Spanish girls love me like I'm Aventura
Tell Uncle Luke I'm out in Miami, too
Them Spanish girls love me like I'm Aventura

The expected result would be:

Type of money, everybody acting like they knew ya
Go Uptown, New York City, bitch
Them Spanish

OpenRefine - Fill between cells but not at the end of the list

倖福魔咒の submitted on 2019-12-11 13:57:15
Question: I have a list of stock prices for several stocks. Some of the values are missing due to weekends, holidays and probably other reasons. The gaps are not consistent: some are two days long and some are longer. I want to fill the gaps with the last known value, but not at the end of the list. In Excel I tried testing a few cells below and, if one is not empty, doing the fill. The problem is that, because the gaps are inconsistent, changing the function for all the cases is a tedious task.
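
The rule described in the question (fill with the last known value, but leave the trailing gap empty) can be sketched outside OpenRefine in plain Python. This is only an illustration under stated assumptions: the function name `fill_interior_gaps` and the sample prices are hypothetical, missing values are represented as `None`, and the list holds the prices of a single stock.

```python
def fill_interior_gaps(values):
    """Forward-fill None gaps with the last known value, leaving trailing gaps empty."""
    # Index of the last non-empty entry; nothing after it gets filled.
    last_known = max((i for i, v in enumerate(values) if v is not None), default=-1)
    filled = []
    previous = None
    for i, v in enumerate(values):
        if v is None and i < last_known:
            # Interior gap: reuse the last known value (None if the list starts with a gap).
            filled.append(previous)
        else:
            filled.append(v)
        if v is not None:
            previous = v
    return filled


# Hypothetical sample data for one stock.
prices = [10.0, None, None, 10.5, None, 11.0, None, None]
print(fill_interior_gaps(prices))
# [10.0, 10.0, 10.0, 10.5, 10.5, 11.0, None, None]
```

With several stocks, the same function would be applied to each stock's list separately so that values never carry over from one stock to the next.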

special characters in replace function

眉间皱痕 submitted on 2019-12-11 06:35:07
Question: The GREL replace function expects three strings, or a string, a regex and a string. In the third string, used for the replacement, some characters have a special behavior: \, \\, \t, \n, \', \" and maybe some other combinations.

\ does nothing, or causes an error
\\ is interpreted as \
\t is interpreted as a tab character
\n is interpreted as a new line
\" is interpreted as "
\' is interpreted as '

Ex: "abab".replace('b',"\") -> "Parsing error at offset 19: Missing number, string, identifier, regex, or

OpenRefine: create a shifted copy of a column

杀马特。学长 韩版系。学妹 submitted on 2019-12-11 04:38:49
Question: I wonder whether OpenRefine lets you access data from other rows when creating a new column. I suspect it does not (and that would be a sane design principle), but there could be a hack around that. Here is an example of what one could want to do: shifting a column by one row. I have the following table:

╔═════╦════════╗
║ row ║ Model  ║
╠═════╬════════╣
║ 1   ║ Quest  ║
║ 2   ║ DF     ║
║ 3   ║ Waw    ║
║ 4   ║ Strada ║
╚═════╩════════╝

And I want to obtain the following result:

╔═════╦════════╦══════════╗
║ row ║

How to merge rows in OpenRefine

拥有回忆 submitted on 2019-12-09 03:56:30
Question: How to merge rows based on some ID field?

Original Table

ID   | Field1 | Field2
-----|--------|-------
A    | 5      |
A    |        | 10
B    | 1      |
B    |        | 3
C    | 4      |
C    |        | 150

New Table

ID   | Field1 | Field2
-----|--------|-------
A    | 5      | 10
B    | 1      | 3
C    | 4      | 150

I want to fill a given cell value based on the values in a group identified by some ID field. That is, I want to aggregate the table and use the non-empty value in each column as the aggregation function.

Answer 1: I think a simpler solution would be to use: 1° The feature "Edit Cells / Blank Down" on your ID
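
For comparison, the aggregation the question asks for (one row per ID, taking the non-empty value of each column) can be sketched in plain Python outside OpenRefine. This is not the method from the answer, which is cut off in this excerpt. The function name `merge_rows_by_id` is hypothetical, and the sketch assumes empty cells are empty strings or `None` and that the first non-empty value wins when a group has several.

```python
def merge_rows_by_id(rows, id_key="ID"):
    """Collapse rows sharing an ID, keeping the first non-empty value per column."""
    merged = {}  # maps ID -> merged row; dicts keep insertion order
    for row in rows:
        target = merged.setdefault(row[id_key], {id_key: row[id_key]})
        for column, value in row.items():
            if column == id_key or value in (None, ""):
                continue  # skip the key column and empty cells
            if target.get(column) in (None, ""):
                target[column] = value  # first non-empty value wins
    return list(merged.values())


# The table from the question, with empty cells as empty strings.
rows = [
    {"ID": "A", "Field1": "5", "Field2": ""},
    {"ID": "A", "Field1": "", "Field2": "10"},
    {"ID": "B", "Field1": "1", "Field2": ""},
    {"ID": "B", "Field1": "", "Field2": "3"},
    {"ID": "C", "Field1": "4", "Field2": ""},
    {"ID": "C", "Field1": "", "Field2": "150"},
]
for merged_row in merge_rows_by_id(rows):
    print(merged_row)
```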

Import columns to existing OpenRefine project

时光毁灭记忆、已成空白 submitted on 2019-12-08 11:33:30
Question: How do I add a column from an external .csv file to an existing project? I tried to find the solution online, but I wasn't successful.

Answer 1: Using the file you provided, I did this in less than one minute. I had a project with one column. If you know a little Python, try Jython. Use Edit Column > Add column based on this column and choose Language: Jython, like this:

import csv
# we are going to use DictReader to transform our imported rows into dict,
# so we can later just refer to the column we

How to execute OpenRefine JSON on CSV in Python?

北战南征 submitted on 2019-12-06 09:47:34
Question: I am trying to find a Python solution that can execute the following OpenRefine Python commands, stored as JSON, without the OpenRefine server running. My OpenRefine JSON contains mappings and custom Python commands for each field of any properly formatted CSV file, so this is not a basic JSON-reading task. One example of OpenRefine JSON code, containing only regex mappings: [ { "op": "core/text-transform", "description": "Text transform on cells in column Sleep using expression jython:import re\n\nvalue = re.sub(\"h0\

Trying to parse a Json with Open Refine GREL

你说的曾经没有我的故事 submitted on 2019-12-06 07:03:55
Question: I'm trying to parse this JSON but really can't find a way to extract the data I want. { "results" : [ { "address_components" : [ { "long_name" : "44", "short_name" : "44", "types" : [ "street_number" ] }, { "long_name" : "Rue Montaigne", "short_name" : "Rue Montaigne", "types" : [ "route" ] }, { "long_name" : "Agen", "short_name" : "Agen", "types" : [ "locality", "political" ] }, { "long_name" : "Lot-et-Garonne", "short_name" : "Lot-et-Garonne", "types" : [ "administrative_area_level_2",

Open Refine - Add another file to the existing Project

拟墨画扇 submitted on 2019-12-06 04:07:44
Question: I've imported a CSV file into OR (OpenRefine). Since the CSV file contains over 200,000 records, I decided to split it into separate files, because uploading the large file wouldn't work on my computer (it takes too long, and I'm not even sure it is actually importing). I was able to create three .csv files out of the single large file. I've successfully imported each of the .csv files, but now I want to import all three into one project in OR. Is that even possible?

Answer 1: Add the three file into one
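
The answer is cut off in this excerpt, so the following is not necessarily what it goes on to describe. As one possible approach, the three files could be concatenated into a single CSV before import. The sketch below assumes all parts share the same header row; the function name `concatenate_csv` and the sample data are hypothetical, and in-memory `io.StringIO` objects stand in for real files.

```python
import csv
import io


def concatenate_csv(sources, destination):
    """Append the rows of several CSV sources that share one header into destination."""
    writer = csv.writer(destination)
    header = None
    for source in sources:
        reader = csv.reader(source)
        current_header = next(reader, None)
        if current_header is None:
            continue  # skip empty sources
        if header is None:
            header = current_header
            writer.writerow(header)  # write the header only once
        elif current_header != header:
            raise ValueError("CSV headers differ; align the columns first")
        writer.writerows(reader)


# Hypothetical stand-ins for the three split files.
parts = [
    io.StringIO("id,name\n1,a\n"),
    io.StringIO("id,name\n2,b\n"),
    io.StringIO("id,name\n3,c\n"),
]
combined = io.StringIO()
concatenate_csv(parts, combined)
print(combined.getvalue())
```

With real files, each source would be opened with `open(path, newline="")`, as the `csv` module documentation recommends.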

Value.match() Regex in Google Refine

浪尽此生 submitted on 2019-12-04 22:32:56
Question: I am trying to extract a sequence of numbers from a column in Google Refine. Here is my code for doing it:

value.match(/[\d]+/)[0]

The data in my column is in the format:

abcababcabc 1234566 abcabcbacdf

The result is "null". I have no idea why! It is also null if I try \w instead of \d.

Answer 1: OpenRefine doesn't add implicit wildcards to the end of the pattern as some systems do (and as one might expect). Try this pattern instead:

value.match(/.*?(\d+).*?/)[0]

You need the lazy/non-greedy
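
The behavior described in the answer can be reproduced in Python as a rough analogy. The sketch assumes `re.fullmatch` is a fair stand-in for a match that must cover the whole string; the helper name `grel_like_match` is hypothetical and is not part of OpenRefine.

```python
import re


def grel_like_match(value, pattern):
    """Return the capture groups if pattern matches the WHOLE string, else None."""
    m = re.fullmatch(pattern, value)
    return list(m.groups()) if m else None


value = "abcababcabc 1234566 abcabcbacdf"

# The bare digit pattern cannot cover the whole string, so there is no match.
print(grel_like_match(value, r"[\d]+"))        # None

# Explicit wildcards around a capture group let the pattern span the string.
print(grel_like_match(value, r".*?(\d+).*?"))  # ['1234566']
```

The leading lazy `.*?` consumes as little as possible, so the capture group starts at the first digit and takes the full run of digits.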