large-data

Maximum number of columns that can be read using read.csv

烈酒焚心 submitted on 2019-12-25 15:02:31
Question: I want to read a CSV file of 4000 columns and 3000 rows, where the rows are of different lengths. I'm currently using the code below to read it, but the maximum number of columns that can be read is 2067. read_data <- function(filename) { setwd(dir) no_col <- max(count.fields(filename, sep = ",")) temp_data <- read.csv(filename, header = FALSE, sep = ",", row.names = NULL, na.strings = 0, fill = TRUE, col.names = 1:no_col) How do I solve this problem? Source: https://stackoverflow.com/questions/33195978/maximum-number
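
A minimal sketch of the same idea in Python/pandas (not the asker's R code): count the widest row first, then read with that many explicit column names so shorter rows are padded with NA. The filename here is a placeholder.

```python
# Sketch: two passes over the file -- find the maximum field count, then read
# with that many named columns so ragged rows are padded with NaN.
import csv
import pandas as pd

def read_ragged_csv(filename):
    # First pass: widest row determines the number of columns.
    with open(filename, newline="") as f:
        max_cols = max(len(row) for row in csv.reader(f))
    # Second pass: read with explicit names; missing trailing cells become NaN.
    return pd.read_csv(filename, header=None, names=range(max_cols))

df = read_ragged_csv("wide.csv")   # hypothetical filename
print(df.shape)
```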

Large SQL Server database timing out PHP web application

霸气de小男生 submitted on 2019-12-25 05:25:18
Question: We are running a hospital system, web based and written in PHP. The system was initially fast because the database was small, but it has now become slow. The following is an example query: select pa.id, pa.date as date, pa.visitno, pa.receiptno, pa.debitnoteno, pad.id as padid, pad.serviceid as serviceid, pad.waitno, pa.paytype, s.id as doctorid, s.fullname as doctorname, p.id as patientid, p.name as patient, p.regno, p.age, p.gender, p.doc, p.department, p.telno, p.address, pa.ins_prov
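
The excerpt cuts off mid-query, so any concrete fix is guesswork; one common mitigation for web-app timeouts is to page the result set on the server so each request stays small. Below is a hedged Python/pyodbc sketch of that pattern only — the connection string, the patient_attendance table name, and the ORDER BY column are assumptions, not the real schema — and OFFSET/FETCH requires SQL Server 2012 or later.

```python
# Sketch only: fetch one page at a time instead of the full joined result set.
import pyodbc

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                      "SERVER=localhost;DATABASE=hospital;Trusted_Connection=yes")

def fetch_visits_page(page, page_size=50):
    offset = page * page_size
    sql = """
        SELECT pa.id, pa.date, pa.visitno, pa.receiptno
        FROM patient_attendance AS pa          -- hypothetical table name
        ORDER BY pa.date DESC
        OFFSET ? ROWS FETCH NEXT ? ROWS ONLY   -- SQL Server 2012+ paging
    """
    cur = conn.cursor()
    cur.execute(sql, (offset, page_size))
    return cur.fetchall()

for row in fetch_visits_page(0):
    print(row.id, row.date)
```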

python recursion over 10000 nodes

蓝咒 submitted on 2019-12-25 04:30:27
Question: I have a list of over 10000 items and, using recursion, I want to iterate over all possible combinations of selecting them: starting from the 1st item, I branch on both selecting and not selecting it (2 branches), then decide on the 2nd item for each branch, and so on until the last item, in a DFS manner. I am using this code for an optimization; most branches are never visited, but at least one leaf will be reached in order to find the best result. The problem is that I wrote the recursive code in Python and it works fine for
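
CPython's default recursion limit is around 1000 frames, so a 10000-level include/exclude recursion will raise RecursionError long before memory runs out. The usual workaround is to rewrite the DFS with an explicit stack; the sketch below uses a made-up maximize-the-sum objective and a simple optimistic bound as stand-ins for the asker's real scoring and pruning logic.

```python
# Sketch: the same include/exclude DFS, iterative, so no call-depth limit applies.
def best_selection(items):
    best = None
    # Each stack entry: (index of next item to decide, running total, chosen items)
    stack = [(0, 0, [])]
    while stack:
        i, total, chosen = stack.pop()
        if i == len(items):                       # reached a leaf
            if best is None or total > best[0]:
                best = (total, chosen)
            continue
        # Prune: even taking every remaining item cannot beat the current best.
        if best is not None and total + sum(items[i:]) <= best[0]:
            continue
        stack.append((i + 1, total, chosen))                          # skip items[i]
        stack.append((i + 1, total + items[i], chosen + [items[i]]))  # take items[i]
    return best

print(best_selection([3, 1, 4, 1, 5]))   # -> (14, [3, 1, 4, 1, 5])
```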

How to increase query speed in db4o?

£可爱£侵袭症+ submitted on 2019-12-24 18:21:39
Question: OutOfMemoryError caused when a db4o database has 15000+ objects. My question is in reference to my previous question (above), for the same PostedMessage model and the same query. With 100,000 PostedMessage objects, the query takes about 1243 ms to return the first 20 PostedMessages. Now I have saved 1,000,000 PostedMessage objects in db4o, and the same query took 342,132 ms, which is non-linearly higher. How can I optimize the query speed? FYR: timeSent and timeReceived are indexed fields. I am using
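
The db4o-specific tuning is not visible from the truncated excerpt, but the usual reason such a query degrades non-linearly is that the whole result set gets materialized instead of being resolved through the index. Purely as an illustration of that general pattern — shown here with sqlite3, not db4o's API — filter and sort on the indexed field and fetch only the 20 objects needed:

```python
# Not db4o: a generic demonstration that an indexed sort + limit stays fast even
# with a million rows, because only 20 rows are ever materialized.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posted_message (id INTEGER PRIMARY KEY, time_sent INTEGER)")
conn.execute("CREATE INDEX idx_time_sent ON posted_message(time_sent)")
conn.executemany("INSERT INTO posted_message (time_sent) VALUES (?)",
                 ((i,) for i in range(1_000_000)))

start = time.perf_counter()
rows = conn.execute(
    "SELECT id, time_sent FROM posted_message ORDER BY time_sent DESC LIMIT 20"
).fetchall()
print(len(rows), "rows in", time.perf_counter() - start, "seconds")
```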

MySQL taking forever 'sending data'. Simple query, lots of data

瘦欲@ submitted on 2019-12-24 16:14:56
Question: I'm trying to run what I believe to be a simple query on a fairly large dataset, and it's taking a very long time to execute -- it stalls in the "Sending data" state for 3-4 hours or more. The table looks like this: CREATE TABLE `transaction` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `uuid` varchar(36) NOT NULL, `userId` varchar(64) NOT NULL, `protocol` int(11) NOT NULL, ... a few other fields (ints and small varchars) ... `created` datetime NOT NULL, PRIMARY KEY (`id`), KEY `uuid` (
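
Time spent in "Sending data" is usually MySQL scanning rows rather than transmitting them, so the first things to check are the EXPLAIN plan and whether an index matches the query's filter and grouping. The sketch below is an assumption-laden illustration — the actual query is truncated above, so the userId/created columns, credentials, and date literal are guesses — using the mysql-connector-python driver.

```python
# Sketch: inspect the plan, then add an index matching WHERE + GROUP BY.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="app",
                               password="secret", database="payments")
cur = conn.cursor()

# 1. See how MySQL plans to run the slow query (type=ALL means a full table scan).
cur.execute("EXPLAIN SELECT userId, COUNT(*) FROM `transaction` "
            "WHERE created >= '2015-01-01' GROUP BY userId")
for row in cur.fetchall():
    print(row)

# 2. Add a composite index that matches the filter and grouping, then re-check.
#    (On a huge table this DDL itself takes a while and may block writes.)
cur.execute("ALTER TABLE `transaction` "
            "ADD INDEX idx_created_user (created, userId)")
conn.commit()
```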

Generate and handle 3 billion elements matrix

心已入冬 submitted on 2019-12-24 13:19:36
Question: I need to generate the complete set of combinations obtained when choosing any 8 numbers from a vector of 25 elements. This can be done with the Combinator function, but that is very slow for a very large matrix and consumes a lot of RAM. Can anyone suggest a clever way to generate this? Example: "sample 3 from a vector of 4" would yield the following result (H=3 and L=4): 1 1 1, 1 1 2, 1 1 3, 1 1 4, 1 2 2, 1 2 3, 1 2 4, 1 3 3, 1 3 4, 1 4 4, 2 2 2, 2 2 3, 2 2 4, 2 3 3, 2 3 4, 2 4 4, 3 3 3, 3 3 4, 3 4 4, 4 4 4
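
Since the sample output above is exactly "combinations with repetition", Python's itertools can stream them lazily instead of building the full matrix in memory. The sketch below reproduces the 3-from-4 example and shows the 8-from-25 count, assuming repetition is indeed wanted, as in the example.

```python
# Sketch: generate combinations lazily instead of materializing a huge matrix.
from itertools import combinations_with_replacement
from math import comb

# The 20 triples listed above, produced on demand rather than stored as rows.
print(list(combinations_with_replacement(range(1, 5), 3)))

# For 8 numbers chosen (with repetition) from 25 there are C(32, 8) = 10,518,300
# combinations; iterate over them instead of holding the whole set in RAM.
print(comb(25 + 8 - 1, 8))
for row in combinations_with_replacement(range(1, 26), 8):
    pass  # process each combination here (filter, write to disk, etc.)
```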

INSERT IGNORE or INSERT WHERE NOT IN

走远了吗. submitted on 2019-12-24 12:22:02
Question: I have a table with 9 million rows and I'm struggling to handle all this data because of its sheer size. What I want to do is import a CSV into the table without overwriting existing data. Previously I would have done something like this: INSERT if not in(select email from tblName where source = "number" and email != "email") INTO (email...) VALUES ("email"...) But I'm worried that I'll crash the server again. I want to be able to insert tens of thousands of rows into a table, but only if each row is not already in the table with source
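
One way to let the server enforce "only insert if it isn't already there" is a unique key plus INSERT IGNORE, fed in batches so no single statement is huge. The sketch below makes several assumptions — a tblName table with email and source columns, no pre-existing duplicate pairs when the unique key is added, and the mysql-connector-python driver — so it is an illustration, not the asker's schema.

```python
# Sketch: unique key + batched INSERT IGNORE so duplicates are skipped server-side.
import csv
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="app",
                               password="secret", database="crm")
cur = conn.cursor()

# One-time step: the unique key lets MySQL reject duplicates itself.
# (Fails if duplicate (email, source) pairs already exist -- de-duplicate first.)
cur.execute("ALTER TABLE tblName ADD UNIQUE KEY uq_email_source (email, source)")

with open("import.csv", newline="") as f:
    rows = [(r["email"], r["source"]) for r in csv.DictReader(f)]

sql = "INSERT IGNORE INTO tblName (email, source) VALUES (%s, %s)"
BATCH = 10_000
for i in range(0, len(rows), BATCH):
    cur.executemany(sql, rows[i:i + BATCH])
    conn.commit()   # commit per batch to keep transactions and locks small
```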

Tips for working with a large quantity of .txt files (and a large overall size) - python?

穿精又带淫゛_ submitted on 2019-12-24 11:22:39
Question: I'm working on a script to parse txt files and store them in a pandas DataFrame that I can export to a CSV. My script worked easily when I was using <100 of my files, but now that I'm trying to run it on the full sample I'm running into a lot of issues. I'm dealing with ~8000 .txt files with an average size of 300 KB, so about 2.5 GB in total. I was wondering if I could get tips on how to make my code more efficient. For opening and reading files, I use: filenames = os.listdir('.') dict
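
Without the rest of the loop it's hard to be specific, but the usual culprit at this scale is growing a DataFrame row by row. A hedged sketch of the standard pattern: parse each file into a plain dict, collect the dicts in a list, and build the DataFrame once at the end (parse_txt and its fields are placeholders for the asker's real parsing).

```python
# Sketch: collect plain dicts, build the DataFrame once -- appending to a
# DataFrame inside the loop is quadratic and thrashes memory.
import glob
import pandas as pd

def parse_txt(path):
    with open(path, encoding="utf-8", errors="replace") as f:
        text = f.read()
    return {"file": path, "n_chars": len(text)}   # stand-in for the real fields

records = [parse_txt(p) for p in glob.glob("*.txt")]
df = pd.DataFrame.from_records(records)
df.to_csv("combined.csv", index=False)
```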

select2 + large number of records

此生再无相见时 submitted on 2019-12-24 05:53:14
Question: I am using a select2 dropdown. It works fine for a smaller number of items, but when the list is huge (more than 40000 items) it really slows down, and it is slowest in IE. A plain DropDownList otherwise works very fast up to about 1000 records. Are there any workarounds for this situation? Answer 1: ///////////////**** jQuery Code *******/////////////// var CompanypageSize = 10; function initCompanies() { var defaultTxtOnInit = 'a'; $("#DefaultCompanyId").select2({ allowClear: true, ajax: { url: "
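
The answer's jQuery snippet is cut off, but it is clearly switching select2 to remote (AJAX) data with a page size of 10, so the browser never renders all 40000 options at once. As a rough companion sketch only — the route, parameter names, and response shape below are assumptions, not taken from the answer — a paged search endpoint in Python/Flask might look like this:

```python
# Sketch: server-side search + paging that a remote-data select2 config can call.
from flask import Flask, jsonify, request

app = Flask(__name__)
COMPANIES = [f"Company {i}" for i in range(40_000)]   # stand-in for the real data

@app.route("/companies")
def companies():
    term = request.args.get("term", "").lower()
    page = request.args.get("page", 1, type=int)
    size = request.args.get("pageSize", 10, type=int)
    matches = [c for c in COMPANIES if term in c.lower()]
    start = (page - 1) * size
    return jsonify({
        "results": [{"id": c, "text": c} for c in matches[start:start + size]],
        "more": start + size < len(matches),   # tells select2 to keep paging
    })

if __name__ == "__main__":
    app.run(debug=True)
```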

jQuery plugin Chosen (enhances multiselects) works great in Chrome, but slow in Internet Explorer

a 夏天 submitted on 2019-12-24 04:27:07
Question: I'm currently using the Chosen jQuery plugin. Check out my fiddle here: http://jsfiddle.net/3XWSe/ Try the fiddle in both Chrome and Internet Explorer (I tested with IE version 11). Notice there is a delay (4 or 5 seconds) when clicking on the multiselect in Internet Explorer, compared to very little, almost none, in Chrome. This example dropdown lists all the cities in Texas and has close to 5000 options. I opened up chosen.jquery.js and traced the problem to this call: Chosen