unnest

Unnesting a data frame containing lists

蹲街弑〆低调 submitted on 2021-02-16 20:33:52
Question: I have a data frame that contains lists, like below:

    # Load packages
    library(dplyr)

    # Create data frame
    df <- structure(list(ID = 1:3,
      A = structure(list(c(9, 8), c(7, 6), c(6, 9)), ptype = numeric(0), class = c("vctrs_list_of", "vctrs_vctr")),
      B = structure(list(c(3, 5), c(2, 6), c(1, 5)), ptype = numeric(0), class = c("vctrs_list_of", "vctrs_vctr")),
      C = structure(list(c(6, 5), c(7, 6), c(8, 7)), ptype = numeric(0), class = c("vctrs_list_of", "vctrs_vctr")),
      D = structure(list(c(5, 3), c(4,

How to Delete rows from Structure in bigquery

≡放荡痞女 submitted on 2021-02-11 12:28:50
Question: Can you please help me with my question? I am new to BigQuery. I have a table with multiple "record" type fields, and I need to delete a row from one of the records. Consider the example below:

    id    date      subid.id  subid.flag
    1234  1/4/2020  1234-1    1
                    1234-2    1
                    1234-3    1
                    1234-4    -1
    5678  1/5/2020  5678-1    1
                    5678-2    1

My requirement is to delete the row with flag -1 from the structure subid. What is the best way to do this? Please help.

Answer 1: Below is for BigQuery Standard SQL

Fill in same amount of characters where other column is NaN

左心房为你撑大大i submitted on 2021-02-04 05:37:37
Question: I have the following dummy dataframe:

    df = pd.DataFrame({'Col1':['a,b,c,d', 'e,f,g,h', 'i,j,k,l,m'], 'Col2':['aa~bb~cc~dd', np.NaN, 'ii~jj~kk~ll~mm']})

            Col1            Col2
    0    a,b,c,d     aa~bb~cc~dd
    1    e,f,g,h             NaN
    2  i,j,k,l,m  ii~jj~kk~ll~mm

The real dataset has shape 500000, 90. I need to unnest these values to rows and I'm using the new explode method for this, which works fine. The problem is the NaN values: they cause unequal lengths after the explode, so I need to fill in the same amount of delimiters
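The padding step the question describes can be sketched in plain Python, without pandas, to show the counting logic on its own. This is only an illustration under stated assumptions: the helper names `is_missing` and `pad_missing` are hypothetical, the separators `,` and `~` are taken from the dummy data, and a missing cell is assumed to be either `None` or a float NaN.

```python
import math


def is_missing(value):
    """Return True for None or a float NaN (assumed forms of a missing cell)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def pad_missing(reference, target, ref_sep=",", target_sep="~"):
    """Return target unchanged, or a run of delimiters that splits into
    as many (empty) parts as reference has parts."""
    if not is_missing(target):
        return target
    # n parts are separated by n - 1 delimiters
    return target_sep * reference.count(ref_sep)


# Same values as the dummy dataframe in the question
col1 = ["a,b,c,d", "e,f,g,h", "i,j,k,l,m"]
col2 = ["aa~bb~cc~dd", float("nan"), "ii~jj~kk~ll~mm"]

filled = [pad_missing(a, b) for a, b in zip(col1, col2)]

# After padding, both columns split into equally long lists per row,
# so they can be paired element by element ("exploded" together)
rows = [
    (left, right)
    for a, b in zip(col1, filled)
    for left, right in zip(a.split(","), b.split("~"))
]
```

Here the missing cell becomes `~~~`, which splits into four empty strings to match the four parts of `e,f,g,h`. On a real dataframe the same idea would be applied column-wise rather than in a Python loop, which matters at the 500000-row scale mentioned in the question.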

Join against the output of an array unnest without creating a temp table

余生长醉 submitted on 2021-01-27 06:38:27
Question: I have a query in a UDF (shown below) which unnest()s an array of integers and joins against it. I have created a local temp table in my PL/pgSQL UDF, since I know this works. Is it possible to use unnest directly in a query to perform a join, instead of having to create a temp table?

    CREATE OR REPLACE FUNCTION search_posts(
      forum_id_ INTEGER,
      query_ CHARACTER VARYING,
      offset_ INTEGER DEFAULT NULL,
      limit_ INTEGER DEFAULT NULL,
      from_date_ TIMESTAMP WITHOUT TIME ZONE DEFAULT NULL,
      to_date_

Unnesting in SQL (Athena): How to convert array of structs into an array of values plucked from the structs?

余生长醉 submitted on 2020-06-25 08:37:55
Question: I am taking samples from a Bayesian statistical model, serializing them with Avro, uploading them to S3, and querying them with Athena. I need help writing a query that unnests an array in the table. The CREATE TABLE query looks like:

    CREATE EXTERNAL TABLE `model_posterior`(
      `job_id` bigint,
      `model_id` bigint,
      `parents` array<struct<`feature_name`:string,`feature_value`:bigint, `is_zid`:boolean>>,
      `posterior_samples` struct <`parameter`:string,`is_scaled`:boolean,`samples`:array<double>>)

The

Purrr safely creating lists of lists

天大地大妈咪最大 submitted on 2020-06-13 09:24:33
Question: I've used safely to catch an error which occurs in my code when I'm purring. However, the result from safely is much more complex than I anticipated. First we create the necessary functions and example data.

    #base functions.
    SI_tall <- function(topheight, age, si ){
      paramasi <- 25
      parambeta <- 7395.6
      paramb2 <- -1.7829
      refAge <- 100
      d <- parambeta*(paramasi^paramb2)
      r <- (((topheight-d)^2)+(4*parambeta*topheight*(age^paramb2)))^0.5
      ## height at reference age
      h2 <- (topheight+d+r)/ (2+(4

Returning a tibble: how to vectorize with case_when?

北慕城南 submitted on 2020-05-26 19:16:34
Question: I have a function which returns a tibble. It runs OK, but I want to vectorize it.

    library(tidyverse)

    tibTest <- tibble(argX = 1:4, argY = 7:4)

    square_it <- function(xx, yy) {
      if(xx >= 4){
        tibble(x = NA, y = NA)
      } else if(xx == 3){
        tibble(x = as.integer(), y = as.integer())
      } else if (xx == 2){
        tibble(x = xx^2 - 1, y = yy^2 -1)
      } else {
        tibble(x = xx^2, y = yy^2)
      }
    }

It runs OK in a mutate when I call it with map2, giving me the result I wanted:

    tibTest %>% mutate(sq = map2(argX, argY, square