large-data

Strategies for a one-to-many association where the “many” side has millions of entries

喜夏-厌秋 posted on 2019-12-13 05:24:44
Question: Giving an analogy: a Twitter-like scenario in which a person can be followed by a huge number of people (one-to-many). A few options I could think of: Use some O/R mapping tool with lazy loading; but when you access the "followers" side of the relation, it will still load all the data, even though lazily, so that is not a suitable option. Do not maintain the one-to-many relation (or do not use any O/R mapping); fetch the "Followers" side in a separate call and handle the paging etc. programmatically. Offload
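
One way to sketch the "separate call with paging" option is keyset pagination in plain SQL; the table and column names here are made up, and the LIMIT syntax varies by engine:

    -- Hypothetical schema: followers(followee_id, follower_id)
    -- Fetch one page; :user_id and :after_id are bind parameters,
    -- :after_id being the last follower_id seen on the previous page
    SELECT follower_id
    FROM followers
    WHERE followee_id = :user_id
      AND follower_id > :after_id   -- keyset: resume where the last page ended
    ORDER BY follower_id
    LIMIT 100;                      -- page size

Unlike OFFSET-based paging, the keyset predicate stays cheap even millions of rows into the list, which matters at this scale.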

How to sort a very large array in C

喜欢而已 posted on 2019-12-13 03:37:41
Question: I want to sort on the order of four million long long values in C. Normally I would just malloc() a buffer to use as an array and call qsort(), but four million * 8 bytes is one huge chunk of contiguous memory. What's the easiest way to do this? I rate ease over pure speed for this. I'd prefer not to use any libraries, and the result needs to run on a modest netbook under both Windows and Linux. Answer 1: Just allocate a buffer and call qsort. 32 MB isn't so very big these days, even on a modest
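
A minimal sketch of that answer, assuming the caller fills in the data:

    #include <stdio.h>
    #include <stdlib.h>

    /* Comparator for qsort: (a > b) - (a < b) avoids the overflow
       that returning (a - b) would risk with long long values. */
    static int cmp_ll(const void *pa, const void *pb)
    {
        long long a = *(const long long *)pa;
        long long b = *(const long long *)pb;
        return (a > b) - (a < b);
    }

    int main(void)
    {
        size_t n = 4000000;                      /* four million values, ~32 MB */
        long long *buf = malloc(n * sizeof *buf);
        if (!buf) {                              /* still worth checking */
            fprintf(stderr, "allocation failed\n");
            return 1;
        }
        /* ... fill buf with the data to sort ... */
        qsort(buf, n, sizeof *buf, cmp_ll);
        free(buf);
        return 0;
    }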

Ignore error rows when updating or inserting in SQL Server

。_饼干妹妹 posted on 2019-12-13 02:48:18
Question: My project has to deal with a huge database; in the worst case it can be more than 80 million rows. Now, I have two tables, T1 and T2. I have to copy data from table T1 to table T2: if a row in table T1 already exists in table T2 (same primary key), then update the other columns of that row in T2 from T1; otherwise, insert the row into T2 as new. At first, I used a while loop to iterate over the 80 million rows in T1 and then update or insert into T2. This is very, very slow; it takes more than 10 hours to finish.
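
A set-based alternative to the row-by-row loop is a single MERGE statement; this sketch assumes a key column id and one payload column col1:

    -- One set-based statement instead of an 80-million-iteration loop
    MERGE T2 AS target
    USING T1 AS source
        ON target.id = source.id          -- same primary key
    WHEN MATCHED THEN
        UPDATE SET target.col1 = source.col1
    WHEN NOT MATCHED THEN
        INSERT (id, col1) VALUES (source.id, source.col1);

Letting the engine process the whole set at once is what turns a 10-hour loop into a single (still heavy, but batched and logged as one operation) statement.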

Calculate quantiles for large data

坚强是说给别人听的谎言 posted on 2019-12-12 18:33:56
Question: I have about 300 files, each containing 1000 time-series realisations (~76 MB per file). I want to calculate the quantiles (0.05, 0.50, 0.95) at each time step from the full set of 300,000 realisations. I cannot merge the realisations into one file because it would become too large. What's the most efficient way of doing this? Each matrix is generated by running a model, but here is a sample containing random numbers: x <- matrix(rexp(10000000, rate=.1), nrow=1000) Answer 1: There are at
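
One memory-bounded way to do this, sketched under the assumption that each file holds a 1000 x 10000 matrix saved with saveRDS, is to process the time steps in blocks, pulling only those columns from every file:

    files <- list.files("realisations", full.names = TRUE)  # the 300 files (path assumed)
    n_steps    <- 10000          # time steps per realisation (assumed)
    block_size <- 500            # columns handled at once; bounds memory use
    probs <- c(0.05, 0.50, 0.95)
    out <- matrix(NA_real_, nrow = length(probs), ncol = n_steps)

    for (start in seq(1, n_steps, by = block_size)) {
      cols <- start:min(start + block_size - 1, n_steps)
      # Stack the same time-step columns from every file: 300 * 1000 rows
      block <- do.call(rbind, lapply(files, function(f) {
        m <- readRDS(f)          # assumes each file holds a 1000 x n_steps matrix
        m[, cols, drop = FALSE]
      }))
      out[, cols] <- apply(block, 2, quantile, probs = probs)
    }

Each pass re-reads every file, so fewer, wider blocks trade memory for I/O; tune block_size to the machine.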

Bind a ComboBox with huge data in WPF

假如想象 posted on 2019-12-12 16:21:14
Question: I am trying to bind a combobox to a custom object list. My object list has around 15K records, and the combobox takes a long time to show the data after it is clicked. Below is the code: <ComboBox Height="23" Name="comboBox1" Width="120" DisplayMemberPath="EmpName" SelectedValue="EmpID" VirtualizingStackPanel.IsVirtualizing="True" VirtualizingStackPanel.VirtualizationMode="Recycling"/> Code-behind: List<EmployeeBE> allEmployee = new List<EmployeeBE>(); allEmployee = EmployeeBO.GetEmployeeAll()
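
A commonly suggested fix is to give the ComboBox a virtualizing ItemsPanel, since its default panel does not virtualize even with the attached properties set (note, too, that SelectedValue="EmpID" above was probably meant to be SelectedValuePath="EmpID"); a sketch:

    <ComboBox Height="23" Name="comboBox1" Width="120"
              DisplayMemberPath="EmpName" SelectedValuePath="EmpID"
              VirtualizingStackPanel.IsVirtualizing="True"
              VirtualizingStackPanel.VirtualizationMode="Recycling">
        <ComboBox.ItemsPanel>
            <!-- Replace the default StackPanel so virtualization actually applies -->
            <ItemsPanelTemplate>
                <VirtualizingStackPanel />
            </ItemsPanelTemplate>
        </ComboBox.ItemsPanel>
    </ComboBox>

With virtualization in effect, only the visible items are templated, so opening the dropdown no longer materializes all 15K containers at once.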

Plotting too many points?

允我心安 posted on 2019-12-12 13:07:33
Question: How does R (base, lattice, or whatever) create a graph from a vector of 100,000 elements (or a function that outputs those values)? Does it plot some and reject others? Plot all on top of each other? How can I change this behaviour? How could I create a graph where, for every interval, I see the max and min values, as in trading "bar" charts? (Or any other idea to visualize that much information without having to calculate intervals, mins, and maxes myself beforehand, and without using financial packages.) How could
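
A base-R sketch of the per-interval min/max idea (the example series and the choice of 100 intervals are arbitrary):

    y   <- cumsum(rnorm(100000))              # example series
    idx <- seq_along(y)
    bins <- cut(idx, breaks = 100)            # 100 equal-width intervals
    lo  <- tapply(y, bins, min)               # per-interval minimum
    hi  <- tapply(y, bins, max)               # per-interval maximum
    mid <- tapply(idx, bins, median)          # x position of each bar
    plot(range(idx), range(y), type = "n", xlab = "index", ylab = "value")
    segments(mid, lo, mid, hi)                # one vertical "bar" per interval

This draws 100 segments instead of 100,000 overplotted points while still showing the full range of the data in each interval.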

Import/export a very large MySQL database in phpMyAdmin

拥有回忆 posted on 2019-12-12 08:43:04
Question: I have a database in phpMyAdmin with 3,000,000 records. I want to export this to another PC. When I export it, only 200,000 entries are exported into the .sql file, and that file also fails to import on the other PC. Answer 1: Answering this for anyone else who lands here. If you can only use phpMyAdmin because you do not have SSH access to the MySQL service or do not know how to use command-line tools, then this might help. However, as the comment above suggests, exporting a database of this size would be far
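
For anyone who does have shell access, the usual command-line route looks like this (user and database names are placeholders):

    # On the source machine: dump the whole database to a file
    mysqldump -u USER -p --single-transaction DBNAME > dump.sql
    # On the target machine: load it back in
    mysql -u USER -p DBNAME < dump.sql

The command-line tools stream the data and are not subject to the PHP execution-time and upload-size limits that typically truncate phpMyAdmin exports and imports.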

Assembly program refuses to accept a larger number [duplicate]

谁说胖子不能爱 posted on 2019-12-12 05:29:55
Question: This question already has an answer here: Converting a program to accept unsigned integer (1 answer). Closed 2 years ago. I am trying to write a program that applies Ulam's conjecture to a number. I have the program working; however, it refuses to accept the numbers 38836 and 38838. When these numbers are entered, it gives me the error: NUMBER OUT OF RANGE TRY AGAIN. The stack is at 256, and the variable used is a DW type. I am brand new to assembly, so I apologize if I did not include
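
A likely explanation, given the linked duplicate: DW reserves a 16-bit word, and if the input routine treats the value as signed, anything above 32767 is rejected even though it would fit as unsigned:

    signed 16-bit word:   -32768 .. 32767   (38836 and 38838 -> "out of range")
    unsigned 16-bit word:      0 .. 65535   (both numbers fit)

Note that even with unsigned input, intermediate 3n+1 values in the Ulam/Collatz sequence can exceed 65535, so a wider working register may still be needed.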

git lfs not working properly for files larger than 100 MB

好久不见. posted on 2019-12-12 03:36:28
Question: Git suggested that I use git lfs for large files. After I tracked them with git lfs and checked that they were added to .gitattributes, I still get the error that the files are larger than 100 MB, for the same exact files. What are the suggestions here, and how can I solve this problem? I would need to upload these large files to GitHub as part of the project as well.

    jalal@klein:~/computer_vision/py-faster-rcnn$ git push -u origin master
    Username for 'https://github.com': monajalal
    Password for
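
A common cause is that the large files were committed before LFS tracking began, so earlier commits still push the raw blobs. One fix, sketched here with a made-up file pattern, is to rewrite those commits with git lfs migrate:

    # Rewrite existing commits so matching files become LFS pointers
    git lfs migrate import --include="*.caffemodel"
    # History has been rewritten, so the push must be forced
    git push -u origin master --force

Tracking a pattern in .gitattributes only affects files added after that point; commits made earlier keep the original blobs unless they are rewritten.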

Removing non-unique values and rearranging vectors

点点圈 posted on 2019-12-12 01:13:07
Question: I worked with Sloan Digital Sky Survey (SDSS) data and got a final data product in this file. The first column is wLength (wavelength) and the second is flux. Storing the zeros in the zero_F variable with zero_F = find(a==0), I removed them from both columns using wLength(zero_F)=[]; and flux(zero_F)=[];. I want to plot wLength vs flux; flux depends on wLength, but wLength contains values which are non-unique. How can I get the indices of the non-unique values in the data so that I can remove the
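
A sketch of one way to find and drop the repeated wLength values in MATLAB, keeping the first occurrence of each (whether duplicates should instead be averaged depends on the data):

    [~, first_idx] = unique(wLength, 'first');     % index of each value's first occurrence
    dup = setdiff(1:numel(wLength), first_idx);    % indices of the repeats
    wLength(dup) = [];                             % remove them from both columns
    flux(dup)    = [];

After this, wLength is strictly unique and the wLength-vs-flux plot has one flux value per wavelength.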