bulkloader

Uploading entity with parent using bulkloader

Submitted by 此生再无相见时 on 2019-12-10 23:07:18
Question: So, I am trying to create an entity with a parent using bulkloader. I have a Client entity:

    class Client(db.Model):
        identifier = db.StringProperty()
        www_ip = db.StringProperty()
        local_ip = db.StringProperty()
        status = db.BooleanProperty()

And I want to create a Data entity as a child of Client:

    class Data(db.Model):
        songscount = db.IntegerProperty()
        nextorder = db.IntegerProperty(default=1)
        players = db.ListProperty(str)
        previousplayer = db.StringProperty()

The Client entity exists. Data.yaml is
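The Data.yaml the question was about to show is cut off. For reference, the bulkloader's transform module provides create_deep_key for building a key with an ancestor path; the following is only a hedged sketch, and every external_name column name in it is an assumption, not taken from the original question:

```yaml
transformers:
- kind: Data
  connector: csv
  property_map:
    - property: __key__
      external_name: data_id
      # create_deep_key builds a key with an ancestor path: here the parent
      # Client key would come from a (hypothetical) client_key CSV column.
      import_transform: transform.create_deep_key(('Client', 'client_key'), ('Data', 'data_id'))
    - property: songscount
      external_name: songscount
      import_transform: transform.none_if_empty(int)
```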

PySpark: Can saveAsNewAPIHadoopDataset() be used as bulk loading to HBase?

Submitted by 拈花ヽ惹草 on 2019-12-10 18:20:52
Question: We currently import data into HBase tables via Spark RDDs (pyspark) by using saveAsNewAPIHadoopDataset(). Is this function using the HBase bulk loading feature via MapReduce? In other words, would saveAsNewAPIHadoopDataset(), which imports directly to HBase, be equivalent to using saveAsNewAPIHadoopFile() to write HFiles to HDFS and then invoking org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles to load them into HBase? Here is an example snippet of our HBase loading routine:

    conf = {"hbase
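The conf dict is truncated above. As a point of reference, writing through TableOutputFormat issues ordinary Puts via the region servers' normal write path, so it is not the HFile bulk-load route (that route is HFileOutputFormat2 plus LoadIncrementalHFiles). A standalone sketch of the kind of configuration typically involved; all host and table names here are hypothetical, and only the shape of the dict is the point:

```python
# Typical shape of the conf dict passed to saveAsNewAPIHadoopDataset()
# when writing to HBase through TableOutputFormat (names hypothetical).
conf = {
    "hbase.zookeeper.quorum": "zk-host:2181",
    "hbase.mapred.outputtable": "my_table",
    "mapreduce.outputformat.class":
        "org.apache.hadoop.hbase.mapreduce.TableOutputFormat",
    "mapreduce.job.output.key.class":
        "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
    "mapreduce.job.output.value.class":
        "org.apache.hadoop.io.Writable",
}

# TableOutputFormat turns each record into a Put, so data flows through the
# WAL and memstores rather than being bulk-loaded as pre-built HFiles.
print(sorted(conf))
```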

AppEngine bulk loader and automatically created property values

Submitted by 烂漫一生 on 2019-12-08 06:45:35
Question: In my model I have a property:

    created = db.DateTimeProperty(required=True, auto_now_add=True)

When an object of this type is created in the datastore, the created property is automatically populated. When I use the bulk loader tool with a table that does not have this field, the field is not automatically populated when I upload to AppEngine, even though new objects are created at that time. How can I make it set the created time on new objects uploaded from the bulk loader?

Answer 1: Add something like
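The answer is truncated at "Add something like". A plausible completion, offered only as a hedged sketch, is a post_import_function hook referenced from bulkloader.yaml (e.g. post_import_function: my_loaders.post_import_created, where the module name is hypothetical). The bulkloader passes a dict-like datastore.Entity as the second argument:

```python
import datetime

# Hedged sketch of a bulkloader post_import_function. auto_now_add only
# fires when the app itself creates the entity, not when rows arrive via
# the bulkloader, so the timestamp is stamped explicitly here.
def post_import_created(input_dict, instance, bulkload_state):
    # `instance` is dict-like (datastore.Entity); set the property by key
    # only when the source row did not supply one.
    if instance.get('created') is None:
        instance['created'] = datetime.datetime.utcnow()
    return instance
```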

bulkloader not importing ndb.model

Submitted by こ雲淡風輕ζ on 2019-12-07 16:02:38
I am still new to Python and GAE. I have an application on a local server that is running just fine: I can add entities to my datastore, I can view my website, etc. Now I am trying to use bulkloader to add entities to my datastore. I followed the tutorial at https://developers.google.com/appengine/docs/python/tools/uploadingdata . My loader is below:

    from google.appengine.ext import ndb
    from google.appengine.tools import bulkloader
    import my_model

    class ArticleLoader(bulkloader.Loader):
        def __init__(self):
            bulkloader.Loader.__init__(self, 'Article', [('title', str), ('author',

How to upload data with key_name by Google App Engine bulkloader

Submitted by 隐身守侯 on 2019-12-07 13:09:54
Question: I can upload data, but key_name is empty. How can I use the 'id' in the CSV as the key_name in the datastore? I'd like to use 'id' as the key_name because other data uses the 'id' as a foreign key. I'm new to Google App Engine. This is the CSV data:

    "id","name"
    "1","USA"
    "2","France"
    "3","Italy"

This is the YAML:

    - model: model.testcountry.TestCountry
      connector: csv
      connector_options:
        encoding: utf-8
        columns: from_header
      property_map:
      - property: __key__
        external_name: id
      - property: name
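The property_map above is cut off. As a hedged completion: to make the CSV id column become the entity's key_name, the usual pattern pairs create_foreign_key on import with key_id_or_name_as_string on export (the kind name TestCountry is taken from the model line above; treat the rest as a sketch):

```yaml
property_map:
- property: __key__
  external_name: id
  # Turns the CSV "id" value into a key of kind TestCountry on import,
  # and writes the key's id/name back out as a string on export.
  import_transform: transform.create_foreign_key('TestCountry')
  export_transform: transform.key_id_or_name_as_string
- property: name
  external_name: name
```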

Where are the reference pages of the Google App Engine bulkloader transform?

Submitted by 拈花ヽ惹草 on 2019-12-07 06:33:23
Question: From an empty datastore, I was able to auto-generate a bulkloader.yaml file. It only contains the python_preamble; the transformers section was empty.

    python_preamble:
    - import: google.appengine.ext.bulkload.transform
    - import: google.appengine.ext.bulkload.bulkloader_wizard
    - import: my_own_transformers
    - import: data_models  # This is where the SomeData class is defined.
    # some more imports here

Then, based on the examples in the documentation, I need to define a property map for each of
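In practice the closest thing to a reference page is the google.appengine.ext.bulkload.transform module itself, whose source and docstrings list the available helpers. A hedged example of a hand-written transformers section for the SomeData kind mentioned above, using helpers that exist in that module (the property and column names are assumptions):

```yaml
transformers:
- kind: SomeData
  connector: csv
  property_map:
    - property: __key__
      external_name: key
      export_transform: transform.key_id_or_name_as_string
    - property: created
      external_name: created
      # import_date_time parses a column with a strptime-style format string.
      import_transform: transform.import_date_time('%Y-%m-%dT%H:%M:%S')
```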

Bulkloader CSV size error

Submitted by 旧巷老猫 on 2019-12-06 20:07:32
Question: Bulkloader raises the following error when importing a CSV file with large cells:

    [ERROR ] Error in data source thread: field larger than field limit (131072)

This is a common problem for the csv module, which can be fixed with:

    csv.field_size_limit(sys.maxint)

How can I make bulkloader execute this?

Answer 1: Try this: in bulkloader.yaml add:

    python_preamble:
    - import: csv_fix
    ... # the rest of your imports

In csv_fix.py add:

    import csv, sys
    csv.field_size_limit(sys.maxint)

Source: https:/
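A standalone demonstration of the fix itself, outside the bulkloader. On the Python 2 runtime the answer's csv.field_size_limit(sys.maxint) works directly; sys.maxsize is the modern equivalent, though on some platforms it overflows the C long the csv module uses, so this sketch backs off until a value is accepted:

```python
import csv
import io
import sys

# Raise the csv module's per-field limit (default 131072), halving on
# platforms where sys.maxsize overflows the module's internal C long.
limit = sys.maxsize
while True:
    try:
        csv.field_size_limit(limit)
        break
    except OverflowError:
        limit //= 2

big_cell = "x" * 200000  # larger than the default 131072 limit
rows = list(csv.reader(io.StringIO('"%s",ok' % big_cell)))
print(len(rows[0][0]))  # → 200000
```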

Starting, Stopping, and Continuing the Google App Engine BulkLoader

Submitted by 萝らか妹 on 2019-12-06 11:30:08
Question: I have quite a bit of data that I will be uploading into Google App Engine. I want to use the bulkloader to help get it in there. However, I have so much data that I generally use up my CPU quota before it's done. Also, any other problem, such as a bad internet connection or a random computer issue, can stop the process. Is there any way to continue a bulkload from where you left off? Or to bulkload only data that has not yet been written to the datastore? I couldn't find anything in the docs, so I

AppEngine bulkloader export Model with self-defined Property

Submitted by 江枫思渺然 on 2019-12-06 09:45:03
Question: I want to use bulkloader to download all entities of a model with some self-defined Property. If I define a model like this:

    class MyType:
        def __init__(self, arg):
            self.name = arg['name']
            self.id = arg['id']

    class MyProperty(db.Property):
        def get_value_for_datastore(self, instance):
            val = super(MyProperty, self).get_value_for_datastore(instance)
            if type(val) == dict:
                val = MyType(val)
            return pickle.dumps(val)

        def make_value_from_datastore(self, val):
            return None if val is None else pickle
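The last line is cut off at "pickle"; presumably it ends with pickle.loads(val). The pickle round trip the property performs can be shown without the GAE SDK, with db.Property replaced by plain functions (the completion of make_value_from_datastore is an assumption):

```python
import pickle

class MyType:
    def __init__(self, arg):
        self.name = arg['name']
        self.id = arg['id']

def to_datastore(val):
    # Mirrors get_value_for_datastore: wrap dicts in MyType, then pickle.
    if isinstance(val, dict):
        val = MyType(val)
    return pickle.dumps(val)

def from_datastore(val):
    # Presumed completion of the truncated make_value_from_datastore.
    return None if val is None else pickle.loads(val)

blob = to_datastore({'name': 'song', 'id': 7})
restored = from_datastore(blob)
print(restored.name, restored.id)  # → song 7
```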

How can I use bulkuploader to populate class with a db.SelfReferenceProperty?

Submitted by 房东的猫 on 2019-12-05 21:48:49
I've got a class that uses db.SelfReferenceProperty to create a tree-like structure. When trying to populate the database using appcfg.py upload_data --config_file=bulkloader.yaml --kind=Group --filename=group.csv (...), I get an exception saying BadValueError: name must not be empty. (Full stack below.) I tried ordering the data to make sure that Groups that had a foreign key pointing at them came first. That didn't work. By commenting out the line in bulkloader.yaml that performs the transformation, "import_transform: transform.create_foreign_key('Group')", the data is uploaded, but
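The questioner notes that reordering alone did not fix the BadValueError, but for reference, one way to sort rows so every group follows its parent is a depth-first walk over the parent references. This helper is hypothetical, not from the question; the 'key' and 'parent_key' column names are assumptions:

```python
# Order dict rows so each row appears after the row its parent_key
# references (roots have an empty parent_key).
def order_rows(rows):
    by_key = {r['key']: r for r in rows}
    ordered, seen = [], set()

    def visit(row):
        if row['key'] in seen:
            return
        parent = row.get('parent_key')
        if parent and parent in by_key:
            visit(by_key[parent])   # emit the parent first
        seen.add(row['key'])
        ordered.append(row)

    for row in rows:
        visit(row)
    return ordered

rows = [
    {'key': 'child', 'parent_key': 'root', 'name': 'Child'},
    {'key': 'root', 'parent_key': '', 'name': 'Root'},
]
print([r['key'] for r in order_rows(rows)])  # → ['root', 'child']
```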