Question
I am trying to request data from a third-party API. The API endpoint we were given looks like this:
GET /api/v4/dblines
Suppose we have to insert data starting from 2020-04-12 07:07:00, and we already have original data from before that DateTime. The API returns at most 1000 records per request. How do I keep the script running and insert the real-time data one by one, continuously?
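For reference, the startTime query parameter is a Unix epoch timestamp in milliseconds; the 1586646420000 in the sample URL below corresponds to 2020-04-12 07:07:00 in a UTC+8 timezone. A minimal conversion sketch, assuming the DateTime is in the machine's local timezone:

from datetime import datetime
import time

# convert a local DateTime to the epoch-milliseconds value the API expects
start_dt = datetime(2020, 4, 12, 7, 7, 0)
start_ms = int(time.mktime(start_dt.timetuple()) * 1000)
print(start_ms)  # 1586646420000 when the local timezone is UTC+8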
Below are the sample JSON data and my sample code:
from sqlalchemy import create_engine
import pandas as pd
import requests
# get the JSON using requests
# data = requests.get('https://api.example.com/api/v4/dblines?a=ABC123&b=1min&startTime=1586646420000&limit=1000').json()
# data example
data = [
    [
        1593649440000,
        "2.9923453200",
        "2.9923453200",
        "2.0045299700",
        "2.0045299700",
        "2.2400009700",
        1593649499999,
        "2.0010870500",
        2,
        "2.0300009700",
        "2.0001359600",
        "0"
    ],
    [
        1593649500000,
        "2.9923453297",
        "2.9923453297",
        "2.9923453297",
        "2.9923453297",
        "25.950000970",
        1593649559999,
        "2.1176054000",
        4,
        "25.950000970",
        "2.1176054000",
        "0"
    ]
]
# create df using json from API
df = pd.DataFrame(
    data,
    columns=[
        # change aliases to column names here...
        'start_time',
        'alias1',
        'alias2',
        'alias3',
        'alias4',
        'alias5',
        'end_time',
        'alias6',
        'alias7',
        'alias8',
        'alias9',
        'alias10',
    ]
)
# drop unnecessary columns, remove duplicates, do any other df processing, etc...
# then initialize the db connection
engine = create_engine('mysql+pymysql://{user}:{pw}@localhost/{db}'
                       .format(user='db_user',
                               pw='db_password',
                               db='db_name'))
# insert the df into the db using the connection
# change if_exists if needed
df.to_sql(con=engine, name='table_name_here', if_exists='replace')  # the original data must not be replaced; how do I change this?
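To keep the rows that are already in the table, a common approach is to switch if_exists to 'append' and run the request in a loop, advancing startTime past the last record returned. Below is a minimal sketch of that idea, not a definitive implementation: it assumes the endpoint and the a/b/startTime/limit parameters shown above, a placeholder table name, and ideally a unique index on start_time in MySQL as a safety net against duplicates (a duplicate key would surface as an error from to_sql, which is why the loop advances startTime so it never re-requests rows it already stored).

import time

import pandas as pd
import requests
from sqlalchemy import create_engine

API_URL = 'https://api.example.com/api/v4/dblines'
COLUMNS = [
    'start_time', 'alias1', 'alias2', 'alias3', 'alias4', 'alias5',
    'end_time', 'alias6', 'alias7', 'alias8', 'alias9', 'alias10',
]

engine = create_engine('mysql+pymysql://db_user:db_password@localhost/db_name')

# startTime value from the question (2020-04-12 07:07:00), in milliseconds
next_start = 1586646420000

while True:
    rows = requests.get(API_URL, params={
        'a': 'ABC123',
        'b': '1min',
        'startTime': next_start,
        'limit': 1000,
    }).json()
    if rows:
        df = pd.DataFrame(rows, columns=COLUMNS)
        # 'append' adds the new rows without dropping the existing table
        df.to_sql(con=engine, name='table_name_here',
                  if_exists='append', index=False)
        # resume just after the last record we received
        next_start = int(df['end_time'].iloc[-1]) + 1
    if len(rows) < 1000:
        # fewer rows than the limit means we are caught up with real time,
        # so wait before polling again
        time.sleep(60)

With 'append', to_sql leaves earlier inserts untouched, which addresses the if_exists='replace' concern in the snippet above; deduplication then becomes the job of the unique index (or an explicit INSERT ... ON DUPLICATE KEY UPDATE step) rather than of pandas.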
Source: https://stackoverflow.com/questions/62707433/how-to-insert-real-time-data-to-database-always-from-third-party-api-max-1000