I have clickstream data such as referring URL, top landing pages, and top exit pages, and metrics such as page views, number of visits, and bounces, all in Google Analytics.
There are two important rules about loading data into a data warehouse:
First, when you design against the GA API, you need to load the initial historical data for a certain date range. This has its own complications: you might run into sampling and segmentation issues, data loss, and so on, and you need to handle pagination.
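For illustration, here is a minimal sketch of a paginated initial extract, assuming the GA Reporting API v4 via google-api-python-client and a service-account key file; the view ID, metrics, and dimensions are placeholders you would replace with your own:

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
creds = service_account.Credentials.from_service_account_file(
    'service_account.json', scopes=SCOPES)
analytics = build('analyticsreporting', 'v4', credentials=creds)

def fetch_report(view_id, start_date, end_date):
    """Yield every row for the date range, following nextPageToken."""
    page_token = None
    while True:
        request = {
            'viewId': view_id,  # placeholder: your GA view ID
            'dateRanges': [{'startDate': start_date, 'endDate': end_date}],
            'metrics': [{'expression': 'ga:pageviews'},
                        {'expression': 'ga:sessions'}],
            'dimensions': [{'name': 'ga:date'},
                           {'name': 'ga:landingPagePath'}],
            'pageSize': 10000,
        }
        if page_token:
            request['pageToken'] = page_token
        response = analytics.reports().batchGet(
            body={'reportRequests': [request]}).execute()
        report = response['reports'][0]
        for row in report.get('data', {}).get('rows', []):
            yield row
        page_token = report.get('nextPageToken')
        if not page_token:
            break  # no more pages for this date range
```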
Second, once the initial data load is complete, you run the extract in incremental mode, where you bring in only new data. This data gets appended to the same data warehouse tables and must not create duplicates where date ranges overlap.
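A simple way to drive the incremental runs is a high-water mark: read the last successfully loaded date from a job table and request only what comes after it. A sketch, assuming a hypothetical etl_jobs table with end_date and status columns:

```python
import datetime

def next_date_range(cursor, first_load_date=datetime.date(2015, 1, 1)):
    """Return (start, end) for the next incremental extract."""
    # High-water mark from the hypothetical etl_jobs table.
    cursor.execute(
        "SELECT MAX(end_date) FROM etl_jobs WHERE status = 'SUCCESS'")
    (last_loaded,) = cursor.fetchone()
    start = (last_loaded + datetime.timedelta(days=1)
             if last_loaded else first_load_date)
    # Stop at yesterday; GA data for the current day is still incomplete.
    end = datetime.date.today() - datetime.timedelta(days=1)
    return start, end
```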
On top of this, Google changes the GA API from time to time, so you need to stay on top of those changes as well.
Considering the above, we released a fully packaged data warehouse with Google Analytics and Salesforce data connectors. You can check out the details and get ideas on how you want to set up your own data warehouse: http://www.infocaptor.com/google-analytics-datawarehouse
The minimum you would need to design is some kind of background daemon that runs every day or at some other frequency. You will also need job tables to record the success or failure of each extract, so that a run can resume from where the error occurred.
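Here is one way the daemon body could record job status so a failed run can be retried for the same dates; the etl_jobs schema and the load_to_staging helper are hypothetical, and the scheduling itself can be left to cron:

```python
def run_daily_extract(conn, view_id, start, end):
    """Extract one date range, recording success/failure in etl_jobs."""
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO etl_jobs (view_id, start_date, end_date, status) "
        "VALUES (%s, %s, %s, 'RUNNING')",
        (view_id, start, end))
    job_id = cur.lastrowid
    conn.commit()
    try:
        # Hypothetical loader that writes fetch_report() rows to staging.
        load_to_staging(conn, fetch_report(view_id, str(start), str(end)))
        cur.execute("UPDATE etl_jobs SET status = 'SUCCESS' WHERE id = %s",
                    (job_id,))
    except Exception:
        # Leave a FAILED row behind so the next run can retry these dates.
        cur.execute("UPDATE etl_jobs SET status = 'FAILED' WHERE id = %s",
                    (job_id,))
        raise
    finally:
        conn.commit()
```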
Some of the other considerations:
1. What happens if you run the extract for the same date range twice?
2. What happens if a job fails for certain dates?
It is important to set primary keys on your DW target tables. In MySQL, using an INSERT statement with the ON DUPLICATE KEY UPDATE clause makes sure that no duplicate records are created if data is reloaded.
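Both considerations above come down to making the load idempotent. A sketch using mysql-connector-python; the fact_pageviews table is hypothetical, with (view_id, date, landing_page) assumed as its primary key:

```python
import mysql.connector

conn = mysql.connector.connect(host='localhost', user='etl',
                               password='secret', database='dw')
cur = conn.cursor()

upsert = """
    INSERT INTO fact_pageviews (view_id, date, landing_page,
                                pageviews, sessions)
    VALUES (%s, %s, %s, %s, %s)
    ON DUPLICATE KEY UPDATE
        pageviews = VALUES(pageviews),
        sessions  = VALUES(sessions)
"""

# Example rows as parsed from the GA response (illustrative values).
rows = [('12345678', '2016-01-01', '/home', 120, 80)]
cur.executemany(upsert, rows)
conn.commit()
```

Reloading the same date range now simply overwrites the existing rows instead of duplicating them.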
Another thing to design is your staging layer. You extract data from GA and dump it into staging tables. That way, if loading into the target fails, you can simply reload from staging; you are not burning through your GA API quota, and you save bandwidth as well.
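Reloading from staging can then be a single set-based statement that never touches the GA API; the stg_pageviews staging table here is hypothetical, mirroring the target columns:

```python
def reload_from_staging(conn, start, end):
    """Re-apply staged rows to the target without re-extracting from GA."""
    cur = conn.cursor()
    cur.execute("""
        INSERT INTO fact_pageviews (view_id, date, landing_page,
                                    pageviews, sessions)
        SELECT view_id, date, landing_page, pageviews, sessions
        FROM stg_pageviews
        WHERE date BETWEEN %s AND %s
        ON DUPLICATE KEY UPDATE
            pageviews = VALUES(pageviews),
            sessions  = VALUES(sessions)
    """, (start, end))
    conn.commit()
```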
You can see our complete design at this location: http://www.infocaptor.com/help/social_analytics___datawarehouse.htm
All the best with your DW effort.