amazon-athena

Access Denied while querying S3 files from AWS Athena within Lambda in different account

馋奶兔 submitted on 2020-04-16 21:12:07
Question: I am trying to query an Athena view from my Lambda code. I created an Athena table for S3 files that are in a different account. The Athena query editor gives me the error below: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; I tried accessing the Athena view from my Lambda code. I created a Lambda execution role and also allowed this role in the bucket policy of the other account's S3 bucket, as below: { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "AWS"
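
For reference, a cross-account bucket policy of the kind the question describes could be sketched in Python roughly as follows. The account ID, role name, and bucket name are hypothetical placeholders, and the action list is an assumption based on what Athena generally needs to read source data (the bucket location, a bucket listing, and the objects themselves). It is not the asker's actual policy, which is cut off in the excerpt.

```python
import json


def build_cross_account_read_policy(bucket, role_arn):
    """Return a bucket policy dict granting read-only access to one IAM role.

    `bucket` and `role_arn` are illustrative inputs, not values from the question.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Sid": "AllowBucketLevelRead",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions apply to the keys under the bucket.
                "Sid": "AllowObjectRead",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# Hypothetical bucket in the data account and Lambda role in the other account.
policy = build_cross_account_read_policy(
    "example-data-bucket",
    "arn:aws:iam::111122223333:role/example-lambda-role",
)
print(json.dumps(policy, indent=2))
```

For cross-account access, the bucket policy is only one half: the IAM policy attached to the role also has to allow the same actions on the same resources. Encrypted objects or objects owned by a third account may need further permissions that this sketch does not cover.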

Connecting to Athena via R

Deadly submitted on 2020-04-16 05:12:49
Question: I am referring to this article for connecting R to Athena. When defining the driver, I get the following error: Error in .jfindClass(as.character(driverClass)[1]) : class not found. I did some research and arrived at this page. The accepted answer has a comment reporting the same problem. However, the solution provided (i.e. restarting R) didn't work. I have written the following code so far. library("pacman") pacman::p_load("RJDBC") pacman::p_load("dplyr") # Downloading Athena

Connecting to Athena via R

二次信任 submitted on 2020-04-16 05:10:39
Question: I am referring to this article for connecting R to Athena. When defining the driver, I get the following error: Error in .jfindClass(as.character(driverClass)[1]) : class not found. I did some research and arrived at this page. The accepted answer has a comment reporting the same problem. However, the solution provided (i.e. restarting R) didn't work. I have written the following code so far. library("pacman") pacman::p_load("RJDBC") pacman::p_load("dplyr") # Downloading Athena

Access AWS Athena from Python Lambda in different account

久未见 submitted on 2020-03-25 18:21:37
Question: I have two accounts, A and B. The S3 buckets and the Athena view are in account A, and the Lambda is in account B. I want to call Athena from my Lambda. I have also allowed the Lambda execution role in the S3 bucket policy. When I try to query the database from the Lambda, it gives me this error: 'Status': {'State': 'FAILED', 'StateChangeReason': 'SYNTAX_ERROR: line 1:15: Schema db_name does not exist' Below is my Lambda code: import boto3 import time def lambda_handler(event, context): athena_client = boto3.client('athena')
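
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: an Athena client created with the Lambda's own credentials looks up database names in the catalog of the account that owns those credentials (account B here), so a database registered only in account A would not be found. The sketch below shows the general shape of submitting a query and polling for its final state. The client is passed in as a parameter so the function can be exercised without AWS. `run_query` and its parameter names are hypothetical; `start_query_execution` and `get_query_execution` are the boto3 Athena client calls.

```python
import time

# States after which Athena will not change the query status any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_query(athena_client, sql, database, output_location,
              poll_seconds=1.0, max_polls=60):
    """Submit `sql` and poll until the query reaches a terminal state.

    Returns the final Status dict, which contains 'State' and, on failure,
    usually 'StateChangeReason'.
    """
    response = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = response["QueryExecutionId"]

    for _ in range(max_polls):
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)

    raise TimeoutError(f"Query {query_id} did not finish after {max_polls} polls")
```

Inside a Lambda the client would typically come from `boto3.client("athena")`. If the database lives in another account, one option is to call `sts.assume_role` for a role in that account and pass the returned temporary credentials to `boto3.client`, so the query runs against that account's catalog. Whether that fits depends on how the two accounts are set up.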

Access AWS Athena from Python Lambda in different account

本秂侑毒 submitted on 2020-03-25 18:21:10
Question: I have two accounts, A and B. The S3 buckets and the Athena view are in account A, and the Lambda is in account B. I want to call Athena from my Lambda. I have also allowed the Lambda execution role in the S3 bucket policy. When I try to query the database from the Lambda, it gives me this error: 'Status': {'State': 'FAILED', 'StateChangeReason': 'SYNTAX_ERROR: line 1:15: Schema db_name does not exist' Below is my Lambda code: import boto3 import time def lambda_handler(event, context): athena_client = boto3.client('athena')

Are parquet files created with pyarrow vs pyspark compatible?

…衆ロ難τιáo~ submitted on 2020-02-25 06:03:40
Question: I have to convert analytics data in JSON to parquet in two steps. For the large amount of existing data I am writing a PySpark job and doing df.repartition(*partitionby).write.partitionBy(partitionby).mode("append").parquet(output,compression=codec). However, for incremental data I plan to use AWS Lambda. PySpark would probably be overkill for that, so I plan to use PyArrow instead (I am aware that it unnecessarily involves Pandas, but I couldn't find a better alternative). So,

Are parquet files created with pyarrow vs pyspark compatible?

点点圈 submitted on 2020-02-25 06:03:39
Question: I have to convert analytics data in JSON to parquet in two steps. For the large amount of existing data I am writing a PySpark job and doing df.repartition(*partitionby).write.partitionBy(partitionby).mode("append").parquet(output,compression=codec). However, for incremental data I plan to use AWS Lambda. PySpark would probably be overkill for that, so I plan to use PyArrow instead (I am aware that it unnecessarily involves Pandas, but I couldn't find a better alternative). So,

resource type error while trying to use cloudformation

两盒软妹~` submitted on 2020-02-25 03:39:27
Question: I tried to use the exact example provided in the user guide linked below. It works from the console but fails to create the stack when using a client. https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-athena-namedquery.html I got an error while trying to execute the following: { "Resources": { "AthenaNamedQuery": { "Type": "AWS::Athena::NamedQuery", "Properties": { "Database": "swfnetadata", "Description": "A query that selects all aggregated data", "Name":

resource type error while trying to use cloudformation

夙愿已清 submitted on 2020-02-25 03:38:28
Question: I tried to use the exact example provided in the user guide linked below. It works from the console but fails to create the stack when using a client. https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-athena-namedquery.html I got an error while trying to execute the following: { "Resources": { "AthenaNamedQuery": { "Type": "AWS::Athena::NamedQuery", "Properties": { "Database": "swfnetadata", "Description": "A query that selects all aggregated data", "Name":

spark Athena connector

☆樱花仙子☆ submitted on 2020-01-30 03:44:30
Question: I need to use Athena from Spark, but Spark uses preparedStatement when working with JDBC drivers, and that gives me the exception "com.amazonaws.athena.jdbc.NotImplementedException: Method Connection.prepareStatement is not yet implemented". Can you please let me know how I can connect to Athena from Spark?
Answer 1: I don't know how you'd connect to Athena from Spark, but you don't need to - you can very easily query the data that Athena contains (or, more correctly, "registers") from Spark. There are two parts to