large-data

Maximum number of columns that can be read using read.csv

烈酒焚心 submitted on 2019-12-25 15:02:31
Question: I want to read a CSV file of 4000 columns and 3000 rows, where the rows are of different lengths. I'm currently using the code below to read it, but the maximum number of columns that can be read is 2067. read_data <- function(filename) { setwd(dir) no_col <- max(count.fields(filename, sep = ",")) temp_data <- read.csv(filename, header = FALSE, sep = ",", row.names = NULL, na.strings = 0, fill = TRUE, col.names = 1:no_col) How do I solve this problem? Source: https://stackoverflow.com/questions/33195978/maximum-number
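
A minimal sketch of the same idea in Python/pandas (not the asker's R code): count the widest row first, then read with that many explicit column names so shorter rows are padded with NA. The filename here is a placeholder.

```python
# Sketch: two passes over the file -- find the maximum field count, then read
# with that many named columns so ragged rows are padded with NaN.
import csv
import pandas as pd

def read_ragged_csv(filename):
    # First pass: widest row determines the number of columns.
    with open(filename, newline="") as f:
        max_cols = max(len(row) for row in csv.reader(f))
    # Second pass: read with explicit names; missing trailing cells become NaN.
    return pd.read_csv(filename, header=None, names=range(max_cols))

df = read_ragged_csv("wide.csv")   # hypothetical filename
print(df.shape)
```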

Large SQL Server database timing out PHP web application

霸气de小男生 submitted on 2019-12-25 05:25:18
Question: We are running a hospital system, web based and written in PHP. The system was initially fast because the database was small, but it has now become slow. The following is an example query: select pa.id, pa.date as date, pa.visitno, pa.receiptno, pa.debitnoteno, pad.id as padid, pad.serviceid as serviceid, pad.waitno, pa.paytype, s.id as doctorid, s.fullname as doctorname, p.id as patientid, p.name as patient, p.regno, p.age, p.gender, p.doc, p.department, p.telno, p.address, pa.ins_prov
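
The excerpt cuts off mid-query, so any concrete fix is guesswork; one common mitigation for web-app timeouts is to page the result set on the server so each request stays small. Below is a hedged Python/pyodbc sketch of that pattern only — the connection string, the patient_attendance table name, and the ORDER BY column are assumptions, not the real schema — and OFFSET/FETCH requires SQL Server 2012 or later.

```python
# Sketch only: fetch one page at a time instead of the full joined result set.
import pyodbc

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                      "SERVER=localhost;DATABASE=hospital;Trusted_Connection=yes")

def fetch_visits_page(page, page_size=50):
    offset = page * page_size
    sql = """
        SELECT pa.id, pa.date, pa.visitno, pa.receiptno
        FROM patient_attendance AS pa          -- hypothetical table name
        ORDER BY pa.date DESC
        OFFSET ? ROWS FETCH NEXT ? ROWS ONLY   -- SQL Server 2012+ paging
    """
    cur = conn.cursor()
    cur.execute(sql, (offset, page_size))
    return cur.fetchall()

for row in fetch_visits_page(0):
    print(row.id, row.date)
```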

python recursion over 10000 nodes

蓝咒 submitted on 2019-12-25 04:30:27
Question: I have a list of over 10000 items and, using recursion, I want to iterate over all possible combinations of selecting them: starting from the 1st item, I branch on both selecting and not selecting it (2 branches), then decide on the 2nd item for each branch, and so on until the last item, in a DFS manner. I am using this code for an optimization; most branches are never visited, but at least one leaf will be reached in order to find the best result. The problem is that I wrote the recursive code in Python and it works fine for
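
CPython's default recursion limit is around 1000 frames, so a 10000-level include/exclude recursion will raise RecursionError long before memory runs out. The usual workaround is to rewrite the DFS with an explicit stack; the sketch below uses a made-up maximize-the-sum objective and a simple optimistic bound as stand-ins for the asker's real scoring and pruning logic.

```python
# Sketch: the same include/exclude DFS, iterative, so no call-depth limit applies.
def best_selection(items):
    best = None
    # Each stack entry: (index of next item to decide, running total, chosen items)
    stack = [(0, 0, [])]
    while stack:
        i, total, chosen = stack.pop()
        if i == len(items):                       # reached a leaf
            if best is None or total > best[0]:
                best = (total, chosen)
            continue
        # Prune: even taking every remaining item cannot beat the current best.
        if best is not None and total + sum(items[i:]) <= best[0]:
            continue
        stack.append((i + 1, total, chosen))                          # skip items[i]
        stack.append((i + 1, total + items[i], chosen + [items[i]]))  # take items[i]
    return best

print(best_selection([3, 1, 4, 1, 5]))   # -> (14, [3, 1, 4, 1, 5])
```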

How to increase query speed in db4o?

£可爱£侵袭症+ submitted on 2019-12-24 18:21:39
Question: OutOfMemoryError caused when a db4o database has 15000+ objects. My question is in reference to my previous question (above), for the same PostedMessage model and the same query. With 100,000 PostedMessage objects, the query takes about 1243 ms to return the first 20 PostedMessages. Now I have saved 1,000,000 PostedMessage objects in db4o, and the same query took 342,132 ms, which is non-linearly higher. How can I optimize the query speed? FYR: timeSent and timeReceived are indexed fields. I am using
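
The db4o-specific tuning is not visible from the truncated excerpt, but the usual reason such a query degrades non-linearly is that the whole result set gets materialized instead of being resolved through the index. Purely as an illustration of that general pattern — shown here with sqlite3, not db4o's API — filter and sort on the indexed field and fetch only the 20 objects needed:

```python
# Not db4o: a generic demonstration that an indexed sort + limit stays fast even
# with a million rows, because only 20 rows are ever materialized.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posted_message (id INTEGER PRIMARY KEY, time_sent INTEGER)")
conn.execute("CREATE INDEX idx_time_sent ON posted_message(time_sent)")
conn.executemany("INSERT INTO posted_message (time_sent) VALUES (?)",
                 ((i,) for i in range(1_000_000)))

start = time.perf_counter()
rows = conn.execute(
    "SELECT id, time_sent FROM posted_message ORDER BY time_sent DESC LIMIT 20"
).fetchall()
print(len(rows), "rows in", time.perf_counter() - start, "seconds")
```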

MySQL taking forever 'sending data'. Simple query, lots of data

瘦欲@ submitted on 2019-12-24 16:14:56
Question: I'm trying to run what I believe to be a simple query on a fairly large dataset, and it's taking a very long time to execute -- it stalls in the "Sending data" state for 3-4 hours or more. The table looks like this: CREATE TABLE `transaction` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `uuid` varchar(36) NOT NULL, `userId` varchar(64) NOT NULL, `protocol` int(11) NOT NULL, ... a few other fields (ints and small varchars) ... `created` datetime NOT NULL, PRIMARY KEY (`id`), KEY `uuid` (
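
Time spent in "Sending data" is usually MySQL scanning rows rather than transmitting them, so the first things to check are the EXPLAIN plan and whether an index matches the query's filter and grouping. The sketch below is an assumption-laden illustration — the actual query is truncated above, so the userId/created columns, credentials, and date literal are guesses — using the mysql-connector-python driver.

```python
# Sketch: inspect the plan, then add an index matching WHERE + GROUP BY.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="app",
                               password="secret", database="payments")
cur = conn.cursor()

# 1. See how MySQL plans to run the slow query (type=ALL means a full table scan).
cur.execute("EXPLAIN SELECT userId, COUNT(*) FROM `transaction` "
            "WHERE created >= '2015-01-01' GROUP BY userId")
for row in cur.fetchall():
    print(row)

# 2. Add a composite index that matches the filter and grouping, then re-check.
#    (On a huge table this DDL itself takes a while and may block writes.)
cur.execute("ALTER TABLE `transaction` "
            "ADD INDEX idx_created_user (created, userId)")
conn.commit()
```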

Generate and handle 3 billion elements matrix

心已入冬 submitted on 2019-12-24 13:19:36
Question: I need to generate the complete set of combinations obtained when choosing any 8 numbers from a vector of 25 elements. This can be done with the Combinator function, but that is very slow for a very large matrix and consumes a lot of RAM. Can anyone suggest a clever way to generate this? Example: "sample 3 from a vector of 4" would yield the following result (H=3 and L=4): 1 1 1, 1 1 2, 1 1 3, 1 1 4, 1 2 2, 1 2 3, 1 2 4, 1 3 3, 1 3 4, 1 4 4, 2 2 2, 2 2 3, 2 2 4, 2 3 3, 2 3 4, 2 4 4, 3 3 3, 3 3 4, 3 4 4, 4 4 4
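
Since the sample output above is exactly "combinations with repetition", Python's itertools can stream them lazily instead of building the full matrix in memory. The sketch below reproduces the 3-from-4 example and shows the 8-from-25 count, assuming repetition is indeed wanted, as in the example.

```python
# Sketch: generate combinations lazily instead of materializing a huge matrix.
from itertools import combinations_with_replacement
from math import comb

# The 20 triples listed above, produced on demand rather than stored as rows.
print(list(combinations_with_replacement(range(1, 5), 3)))

# For 8 numbers chosen (with repetition) from 25 there are C(32, 8) = 10,518,300
# combinations; iterate over them instead of holding the whole set in RAM.
print(comb(25 + 8 - 1, 8))
for row in combinations_with_replacement(range(1, 26), 8):
    pass  # process each combination here (filter, write to disk, etc.)
```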

INSERT IGNORE or INSERT WHERE NOT IN

走远了吗. submitted on 2019-12-24 12:22:02
Question: I have a table with 9 million rows and I'm struggling to handle all this data because of its sheer size. What I want to do is import a CSV into the table without overwriting existing data. Previously I would have done something like this: INSERT if not in(select email from tblName where source = "number" and email != "email") INTO (email...) VALUES ("email"...) But I'm worried that I'll crash the server again. I want to be able to insert tens of thousands of rows into a table, but only if each row is not already in the table with source
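
One way to let the server enforce "only insert if it isn't already there" is a unique key plus INSERT IGNORE, fed in batches so no single statement is huge. The sketch below makes several assumptions — a tblName table with email and source columns, no pre-existing duplicate pairs when the unique key is added, and the mysql-connector-python driver — so it is an illustration, not the asker's schema.

```python
# Sketch: unique key + batched INSERT IGNORE so duplicates are skipped server-side.
import csv
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="app",
                               password="secret", database="crm")
cur = conn.cursor()

# One-time step: the unique key lets MySQL reject duplicates itself.
# (Fails if duplicate (email, source) pairs already exist -- de-duplicate first.)
cur.execute("ALTER TABLE tblName ADD UNIQUE KEY uq_email_source (email, source)")

with open("import.csv", newline="") as f:
    rows = [(r["email"], r["source"]) for r in csv.DictReader(f)]

sql = "INSERT IGNORE INTO tblName (email, source) VALUES (%s, %s)"
BATCH = 10_000
for i in range(0, len(rows), BATCH):
    cur.executemany(sql, rows[i:i + BATCH])
    conn.commit()   # commit per batch to keep transactions and locks small
```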

Tips for working with a large quantity of .txt files (and a large overall size) - python?

穿精又带淫゛_ submitted on 2019-12-24 11:22:39
Question: I'm working on a script to parse txt files and store them in a pandas DataFrame that I can export to a CSV. My script worked easily when I was using <100 of my files, but now that I'm trying to run it on the full sample I'm running into a lot of issues. I'm dealing with ~8000 .txt files with an average size of 300 KB, so about 2.5 GB in total. I was wondering if I could get tips on how to make my code more efficient. For opening and reading files, I use: filenames = os.listdir('.') dict
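
Without the rest of the loop it's hard to be specific, but the usual culprit at this scale is growing a DataFrame row by row. A hedged sketch of the standard pattern: parse each file into a plain dict, collect the dicts in a list, and build the DataFrame once at the end (parse_txt and its fields are placeholders for the asker's real parsing).

```python
# Sketch: collect plain dicts, build the DataFrame once -- appending to a
# DataFrame inside the loop is quadratic and thrashes memory.
import glob
import pandas as pd

def parse_txt(path):
    with open(path, encoding="utf-8", errors="replace") as f:
        text = f.read()
    return {"file": path, "n_chars": len(text)}   # stand-in for the real fields

records = [parse_txt(p) for p in glob.glob("*.txt")]
df = pd.DataFrame.from_records(records)
df.to_csv("combined.csv", index=False)
```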

select2 + large number of records

此生再无相见时 submitted on 2019-12-24 05:53:14
Question: I am using a select2 dropdown. It works fine for a smaller number of items, but when the list is huge (more than 40000 items) it really slows down, and it is slowest in IE. A plain DropDownList otherwise works very fast up to about 1000 records. Are there any workarounds for this situation? Answer 1: ///////////////**** jQuery Code *******/////////////// var CompanypageSize = 10; function initCompanies() { var defaultTxtOnInit = 'a'; $("#DefaultCompanyId").select2({ allowClear: true, ajax: { url: "
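
The answer's jQuery snippet is cut off, but it is clearly switching select2 to remote (AJAX) data with a page size of 10, so the browser never renders all 40000 options at once. As a rough companion sketch only — the route, parameter names, and response shape below are assumptions, not taken from the answer — a paged search endpoint in Python/Flask might look like this:

```python
# Sketch: server-side search + paging that a remote-data select2 config can call.
from flask import Flask, jsonify, request

app = Flask(__name__)
COMPANIES = [f"Company {i}" for i in range(40_000)]   # stand-in for the real data

@app.route("/companies")
def companies():
    term = request.args.get("term", "").lower()
    page = request.args.get("page", 1, type=int)
    size = request.args.get("pageSize", 10, type=int)
    matches = [c for c in COMPANIES if term in c.lower()]
    start = (page - 1) * size
    return jsonify({
        "results": [{"id": c, "text": c} for c in matches[start:start + size]],
        "more": start + size < len(matches),   # tells select2 to keep paging
    })

if __name__ == "__main__":
    app.run(debug=True)
```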

jQuery plugin Chosen (enhances multiselects) works great in Chrome, but slow in Internet Explorer

a 夏天 submitted on 2019-12-24 04:27:07
Question: I'm currently using the Chosen jQuery plugin. Check out my fiddle here: http://jsfiddle.net/3XWSe/ Try the fiddle in both Chrome and Internet Explorer (I tested with IE version 11). Notice there is a delay (4 or 5 seconds) when clicking on the multiselect in Internet Explorer, compared to very little, almost none, in Chrome. This example dropdown lists all the cities in Texas and has close to 5000 options. I opened up chosen.jquery.js and traced the problem to this call: Chosen