mmap | 易学教程

Why doesn't Python's mmap work with large files?

Question: [Edit: This problem applies only to 32-bit systems. If your computer, your OS and your Python implementation are 64-bit, then mmap-ing huge files works reliably and is extremely efficient.] I am writing a module that amongst other things allows bitwise read access to files. The files can potentially be large (hundreds of GB) so I wrote a simple class that lets me treat the file like a string and hides all the seeking and reading. At the time I wrote my wrapper class I didn't know about the
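On a 32-bit system the address space cannot hold a mapping of hundreds of GB, so one common workaround is to map only a window of the file at a time. The sketch below is not the asker's wrapper class; `read_bit`, its MSB-first bit order, and the default window size are assumptions made for illustration. It uses the `offset` parameter of Python's `mmap.mmap`, which must be a multiple of `mmap.ALLOCATIONGRANULARITY`.

```python
import mmap
import os
import tempfile


def read_bit(path, bit_index, window_bytes=1 << 20):
    """Return one bit of a file (MSB-first within each byte) via a small mapping."""
    byte_index, bit_in_byte = divmod(bit_index, 8)
    file_size = os.path.getsize(path)
    if byte_index >= file_size:
        raise IndexError("bit index past end of file")

    # The mapping offset must be a multiple of the allocation granularity,
    # so round the wanted byte position down to the nearest boundary.
    gran = mmap.ALLOCATIONGRANULARITY
    window_start = (byte_index // gran) * gran
    length = min(max(window_bytes, gran), file_size - window_start)

    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ,
                       offset=window_start) as view:
            byte = view[byte_index - window_start]
    return (byte >> (7 - bit_in_byte)) & 1


def demo():
    # Hypothetical two-byte file: bits 10000000 00000001.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x80\x01")
        path = tmp.name
    try:
        return [read_bit(path, 0), read_bit(path, 1), read_bit(path, 15)]
    finally:
        os.remove(path)
```

Only `length` bytes of address space are used per call, regardless of the file size. A real wrapper would probably keep the current window open and remap only when an access falls outside it.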

Linux System Programming (5): The mmap Function in Files and I/O

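The mmap system call provides a way of accessing ordinary files that differs from the usual one: a process can operate on a file as if it were reading and writing memory. POSIX and System V shared-memory IPC exist purely for sharing, whereas shared memory is only one of the main applications of mmap(). With mmap, processes share memory by mapping the same ordinary file. Once a file is mapped into a process's address space, the process can access it like ordinary memory without calling read(), write(), and so on. Our program uses mmap heavily, and precisely for this ability to access a file as if it were ordinary memory. In practice, when a file is accessed frequently and the access position moves back and forth, mmap is much faster than the conventional methods. Put simply, mmap creates an image of the file's contents in memory, and memory is faster than disk. Essentially it maps a file onto a region of virtual memory and returns a pointer; later accesses to that memory are in effect accesses to the file. It is a form of fast file I/O that is as convenient as accessing memory, at the cost of consuming some of your virtual memory. The mmap system call can also operate directly on low-level resources, mapping hardware addresses to implement user-space drivers. #include <sys/mman.h> void *mmap(void *addr, size_t len, int prot, int flag, int filedes, off_t off); int munmap(void *addr,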
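The "access a file like ordinary memory" idea can be sketched with Python's standard `mmap` module, which wraps the same system call. The helper name `patch_in_place` and the sample file contents are assumptions made for illustration, not part of the article.

```python
import mmap
import os
import tempfile


def patch_in_place(path, offset, new_bytes):
    """Overwrite bytes inside a file through a shared, writable mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as mem:
            end = offset + len(new_bytes)
            before = mem[offset:end]   # random read, no seek() or read() call
            mem[offset:end] = new_bytes  # write by assigning to memory
            mem.flush()                # ask the OS to write dirty pages back
    return before


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello mmap world")
        path = tmp.name
    try:
        old = patch_in_place(path, 6, b"MMAP")
        with open(path, "rb") as f:
            return old, f.read()
    finally:
        os.remove(path)
```

The slice assignment cannot change the file's length, which mirrors the C API: a mapping covers a fixed range, and growing the file is a separate step.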

mmap() vs. reading blocks

Question: I'm working on a program that will be processing files that could potentially be 100GB or more in size. The files contain sets of variable length records. I've got a first implementation up and running and am now looking towards improving performance, particularly at doing I/O more efficiently since the input file gets scanned many times. Is there a rule of thumb for using mmap() versus reading in blocks via C++'s fstream library? What I'd like to do is read large blocks from disk into a
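The two approaches being compared can be put side by side in a small sketch. The question is about C++, and the record format is not given, so this Python version assumes newline-terminated records purely for illustration. It makes no claim about which approach is faster; that depends on the access pattern and the system.

```python
import mmap
import os
import tempfile


def count_records_blocks(path, block_size=1 << 16):
    """Count newline-terminated records with explicit block reads."""
    count = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            count += block.count(b"\n")
    return count


def count_records_mmap(path):
    """Count the same records by scanning a read-only mapping."""
    count = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mem:
            pos = mem.find(b"\n")
            while pos != -1:
                count += 1
                pos = mem.find(b"\n", pos + 1)
    return count


def demo():
    # 1000 hypothetical records of varying length.
    records = [b"x" * (i % 7 + 1) + b"\n" for i in range(1000)]
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"".join(records))
        path = tmp.name
    try:
        return count_records_blocks(path), count_records_mmap(path)
    finally:
        os.remove(path)
```

With block reads, a record that straddles a block boundary needs extra handling as soon as the code parses records rather than counting terminators. The mapped version sees one contiguous buffer, which is often the practical argument for mmap in multi-pass scans.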

Shared Memory

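An obvious benefit of communicating through shared memory is efficiency: processes read and write memory directly, with no data copying needed. Mechanisms such as pipes and message queues require four data copies between kernel and user space, whereas shared memory copies the data only twice [1]: once from the input file into the shared memory region, and once from the shared memory region to the output file. In practice, processes that share memory do not unmap the region after every small read or write and then set it up again for the next exchange. Instead they keep the shared region until communication is finished, so the data stays in shared memory and is not written back to the file. The contents of shared memory are usually written back to the file only when the mapping is removed. This is why communication through shared memory is very efficient. The Linux 2.2.x kernel supports several forms of shared memory, such as the mmap() system call, POSIX shared memory, and System V shared memory. Linux distributions such as Redhat 8.0 support the mmap() system call and System V shared memory but do not yet implement POSIX shared memory. This article mainly covers the principles and use of the mmap() system call and the System V shared memory API. I. How the kernel ensures that processes address the same memory pages of a shared memory region. 1. Distinguishing pages in the page cache and the swap cache: the physical pages of an accessed file reside in the page cache or the swap cache, and all information about a page is described by struct page. struct page has a pointer field, mapping, which points to a struct address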
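A minimal sketch of two processes communicating through a shared mapping, assuming a Unix system where `os.fork` is available. It uses an anonymous mapping rather than the file-backed or System V variants the article goes on to describe. The function name and the value 12345 are illustrative only.

```python
import mmap
import os
import struct


def share_value_with_child():
    """Parent and child exchange a value through one shared page (Unix only)."""
    # fileno -1 requests an anonymous mapping. On Unix it is shared by
    # default, so parent and child see the same physical page after fork().
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    struct.pack_into("<I", shared, 0, 0)

    pid = os.fork()
    if pid == 0:
        # Child: write straight into the shared page, then exit immediately.
        struct.pack_into("<I", shared, 0, 12345)
        os._exit(0)

    os.waitpid(pid, 0)  # wait until the child has finished writing
    (value,) = struct.unpack_from("<I", shared, 0)
    shared.close()
    return value
```

No data is copied through the kernel here: the child's store and the parent's load hit the same page. The `waitpid` call stands in for real synchronization; concurrent readers and writers would need a lock or semaphore.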

What does & stands for in C and mmap()

Question: int fd = open("/dev/mem", O_RDWR); present = (unsigned char *)mmap(0, getpagesize(), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x22400000); if ((*present & 1) == 0) { printf("Converter not present\n"); exit(1); } 1) What does the '&' operator mean in the preceding code? Answer 1: It is the bitwise AND operator. This means that the operation performs a binary AND of the two operands bit by bit (i.e., in a bitwise fashion). In this case it is checking that the first bit of the memory
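The same lowest-bit test can be tried without special hardware. This sketch substitutes an anonymous, zero-filled mapping for the `/dev/mem` mapping at `0x22400000` in the question, since that one needs root and the specific device. `converter_present` is a hypothetical name.

```python
import mmap


def converter_present(region):
    """Mirror the C test `(*present & 1) != 0`: is bit 0 of the first byte set?"""
    return (region[0] & 1) != 0


def demo():
    with mmap.mmap(-1, mmap.PAGESIZE) as region:  # anonymous page, zero-filled
        initially = converter_present(region)
        region[0] = 0b00000101   # lowest bit set, another bit set too
        with_bit = converter_present(region)
        region[0] = 0b00000100   # lowest bit clear, other bit still set
        without_bit = converter_present(region)
    return initially, with_bit, without_bit
```

Masking with 1 discards every bit except the lowest, so the other set bit in the test values has no effect on the result.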

[Linux] Ten Questions About Linux Virtual Memory Management

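Linux virtual memory management rests on a few key concepts: each process has its own virtual address space, and the virtual addresses a process accesses are not real physical addresses; virtual addresses are mapped to physical addresses through each process's page table; if the physical page behind a virtual address is not in physical memory, a page fault occurs, physical memory is actually allocated, and the process's page table is updated; if physical memory is exhausted at that point, some pages are evicted to disk according to the page replacement algorithm. Building on this, the article works from simple to advanced through ten questions that once puzzled the author about virtual memory management, and tries to answer them with examples and system commands. How is the Linux virtual address space laid out? How do 32-bit and 64-bit differ? How does malloc allocate memory? Does malloc consume as much physical memory as the size requested? How can you inspect a process's virtual address space usage? Is freed memory really released (returned to the OS)? If every malloc in the code has a matching free, are memory leaks impossible? Since heap memory cannot be released directly, why not allocate everything with mmap? How can you view a process's page-fault statistics? How can you view heap fragmentation? Besides glibc's malloc/free, are there third-party implementations? 1. How is the Linux virtual address space laid out? How do 32-bit and 64-bit differ? Linux uses virtual address spaces, which greatly enlarges the range a process can address. From low to high addresses the regions are: the read-only segment, which can be read but not written and includes the code segment,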

[Linux] How malloc and Shared Memory Differ in Principle

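This article analyzes how the system calls and library functions related to memory and I/O are implemented and, based on those principles, points out pitfalls to watch for and where to focus optimization. The calls covered include readahead, pread/pwrite, read/write, mmap, readv/writev, sendfile, fsync/fdatasync/msync, shmget, and malloc. The article first briefly introduces the basic principles of how applications and the I/O system use memory, which helps greatly in understanding how these system calls and library functions are implemented. 1 Memory management basics: Linux manages physical memory in units of pages, typically 4KB each. At initialization Linux allocates management data structures for all physical memory to track every physical page. Each application has its own address space. These addresses are virtual: the application's page table translates a virtual address into the actual physical address. Although the system can translate virtual addresses to physical ones, not every block of an application's virtual memory is backed by physical memory. Linux allocates physical memory to applications on demand, and this on-demand allocation is implemented with page faults. For example, when an application dynamically allocates 10MB of memory, the allocation only marks that address range as in use in the application's virtual memory area structures; the kernel assigns no physical memory at that point. Only when the application uses (reads or writes) the region does the kernel find that no physical memory backs the address, and a page fault is raised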
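The 10MB example can be observed from user space, at least roughly. The sketch below assumes a Linux system that exposes `VmRSS` in `/proc/self/status` and returns `None` elsewhere. It uses an anonymous `mmap` rather than `malloc`, and the helper names are made up for illustration. The figures are approximate, because the kernel's resident-size counters are not exact at every instant.

```python
import mmap


def resident_kib():
    """Return this process's resident set size in KiB, or None without /proc."""
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def demand_paging_demo(size=10 * 1024 * 1024):
    """Map `size` bytes, then touch every page; return RSS growth of each step."""
    start = resident_kib()
    if start is None:
        return None
    region = mmap.mmap(-1, size)      # reserves address space only
    after_map = resident_kib()
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1            # first write to a page triggers a page fault
    after_touch = resident_kib()
    region.close()
    return after_map - start, after_touch - after_map
```

On a typical Linux system the first number stays near zero while the second is close to the mapping size in KiB, which is the on-demand behavior the paragraph describes.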

C - syscall - 64-bit - pointer

Question: I am on 64-bit Linux x86. I need to execute the mmap syscall using the syscall function. The mmap syscall number is 9: printf("mmap-1: %lli\n", syscall(9, 0, 10, 3, 2 | 32, -1, 0)); printf("mmap-2: %lli\n", mmap( 0, 10, 3, 2 | 32, -1, 0)); However, when I run it, the syscall function gives wrong results. mmap-1: 2236940288 mmap-2: 140503502090240 mmap-1: 3425849344 mmap-2: 140612065181696 mmap-1: 249544704 mmap-2: 139625341366272 mmap works just fine, but the "addresses" returned by syscall result in
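The excerpt cuts off before any answer, but the printed numbers are consistent with one explanation: the value returned by syscall() was truncated to 32 bits somewhere, for example by being treated as an int. The arithmetic below checks this against the figures quoted above. It assumes that in each run the first mapping was placed exactly one 4096-byte page above the second, which is a guess about the kernel's placement, not something the question states.

```python
PAGE = 4096


def low_32_bits(address):
    """Keep only the low 32 bits, as when a 64-bit value passes through a 32-bit int."""
    return address & 0xFFFFFFFF


# (value printed for syscall(), value printed for mmap()) from the three runs.
runs = [
    (2236940288, 140503502090240),
    (3425849344, 140612065181696),
    (249544704, 139625341366272),
]

# Assumption: the syscall() mapping sits one page above the mmap() mapping.
matches = [low_32_bits(mmap_addr + PAGE) == syscall_value
           for syscall_value, mmap_addr in runs]
```

All three runs fit the pattern, so the upper 32 bits of the address appear to have been lost rather than the syscall itself misbehaving.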

Unable to allocate memory with mmap in x86 Linux Assembly Language

Question: I have successfully opened a file and have the file descriptor (7) stored in FILE, and I also have the size of the file (153kb) stored in SIZE. That being said, this mmap system call returns a -14. I'm not sure what I'm doing wrong: push %esi #Save non-general-purpose registers push %edi #Save non-general-purpose registers push %ebp #Save non-general-purpose registers movl FILE, %edi #Move file descriptor into edi movl $0, %ebp #Offset to 0 movl $0x2, %esi #MAP_PRIVATE movl $0x3, %edx #PROT
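A raw Linux system call reports failure by returning the negated errno value, so the -14 can be decoded mechanically. The helper below is a hypothetical name for illustration. It identifies which error the kernel reported but does not diagnose the assembly, which is truncated in the excerpt.

```python
import errno
import os


def decode_raw_syscall_result(result):
    """Interpret a raw Linux syscall return value: -4095..-1 encodes -errno."""
    if -4095 <= result < 0:
        code = -result
        return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)
    return None  # not an error; for mmap this would be the mapped address


name, message = decode_raw_syscall_result(-14)
```

Here `name` is `EFAULT`, the error the kernel returns when it cannot read memory at an address it was given.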

Using MapViewOfFile, pointer eventually walks out of memory space

Question: All, I'm using MapViewOfFile to hold part of a file in memory. There is a stream that points to this file and writes to it, and then is rewound. I use the pointer to the beginning of the mapped file, and read until I get to the null char I write as the final character. int fd; yyout = tmpfile(); fd = fileno(yyout); #ifdef WIN32 HANDLE fm; HANDLE h = (HANDLE) _get_osfhandle (fd); fm = CreateFileMapping( h, NULL, PAGE_READWRITE|SEC_RESERVE, 0, 4096, NULL); if (fm == NULL) { fprintf (stderr, "%s
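Reading "until the null char" is only safe if the search is bounded by the size of the mapped view. The sketch below shows that bound in Python with an anonymous 4096-byte mapping, matching the size passed to CreateFileMapping in the question. It is not the Windows API code from the question, and `read_c_string` is a hypothetical helper.

```python
import mmap


def read_c_string(view, start=0):
    """Return bytes from start up to the first NUL, never reading past the mapping."""
    end = view.find(b"\x00", start)
    if end == -1:
        # No terminator inside the mapped window: stop at the boundary.
        end = len(view)
    return view[start:end]


def demo():
    with mmap.mmap(-1, 4096) as view:
        view[:5] = b"hello"            # the rest of the page is zero-filled
        terminated = read_c_string(view)
        view[:] = b"x" * 4096          # overwrite every byte, so no NUL remains
        unterminated = read_c_string(view)
    return terminated, len(unterminated)
```

In C the equivalent is to carry the view length alongside the pointer and stop when either the terminator or the end of the view is reached. Data written beyond the mapped 4096 bytes is not reachable through that view at all.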
