Module memfiles

This module provides support for memory mapped files (Posix's mmap) on the different operating systems.

It also provides some fast iterators over lines in text files (or other "line-like", variable length, delimited records).

Imports

winlean, os, streams

Types

MemFile = object
  mem*: pointer        ## a pointer to the memory mapped file. The pointer
                       ## can be used directly to change the contents of the
                       ## file, if it was opened with write access.
  size*: int           ## size of the memory mapped file
  when defined(windows):
    fHandle: Handle
    mapHandle: Handle
    wasOpened: bool    ## only close if wasOpened
  else:
    handle: cint

represents a memory mapped file

MemSlice = object
  data*: pointer
  size*: int

represents a slice of a MemFile for iteration over delimited lines/records

MemMapFileStream = ref MemMapFileStreamObj

a stream that encapsulates a MemFile

MemMapFileStreamObj = object of Stream
  mf: MemFile
  mode: FileMode
  pos: ByteAddress

Procs

proc mapMem(m: var MemFile; mode: FileMode = fmRead; mappedSize = -1; offset = 0): pointer {.
    raises: [IOError, OSError], tags: [].}

returns a pointer to a mapped portion of MemFile m

A mappedSize of -1 maps the whole file; offset must be a multiple of the page size of your OS.

proc unmapMem(f: var MemFile; p: pointer; size: int) {.raises: [OSError], tags: [].}

unmaps the memory region [p, p+size) of the mapped file f. All changes are written back to the file system, if f was opened with write access.

size must be exactly the size that was requested via mapMem.
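
The pairing of mapMem and unmapMem can be sketched as follows. This is a minimal illustration, not a canonical usage pattern: the scratch file name is hypothetical, and the file is opened with allowRemap = true so that its handle stays available for the second mapping.

```nim
import memfiles, os

# Hypothetical scratch file; any writable path would do.
let path = getTempDir() / "memfiles_mapmem_demo.bin"
writeFile(path, "hello, mapped world")

# allowRemap = true keeps the file handle open so mapMem can be called later.
var mf = memfiles.open(path, mode = fmReadWrite, allowRemap = true)

# Map the first 5 bytes a second time; offset 0 is a multiple of any page size.
let view = mf.mapMem(mode = fmReadWrite, mappedSize = 5, offset = 0)
cast[ptr char](view)[] = 'H'   # write one byte through the second mapping

# The size passed to unmapMem must equal the mappedSize given to mapMem.
mf.unmapMem(view, 5)
mf.close()

let contents = readFile(path)
echo contents
removeFile(path)
```

The view returned by mapMem is independent of mf.mem, so it has to be released with unmapMem before the MemFile itself is closed.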

proc open(filename: string; mode: FileMode = fmRead; mappedSize = -1; offset = 0;
         newFileSize = -1; allowRemap = false): MemFile {.raises: [IOError, OSError],
    tags: [].}

opens a memory mapped file. If this fails, OSError is raised.

newFileSize can only be set if the file does not exist and is opened with write access (e.g., with fmReadWrite).

mappedSize and offset can be used to map only a slice of the file.

offset must be a multiple of the page size of your OS (usually 4K or 8K, but the exact value depends on the OS).

allowRemap only needs to be true if you want to call mapMem on the resulting MemFile; else file handles are not kept open.

Example:

import memfiles

var
  mm, mm_full, mm_half: MemFile

# Create a new 1024-byte file
mm = memfiles.open("/tmp/test.mmap", mode = fmWrite, newFileSize = 1024)
mm.close()

# Map the whole file; this would fail if newFileSize was set
mm_full = memfiles.open("/tmp/test.mmap", mode = fmReadWrite, mappedSize = -1)
mm_full.close()

# Map only the first 512 bytes
mm_half = memfiles.open("/tmp/test.mmap", mode = fmReadWrite, mappedSize = 512)
mm_half.close()

proc flush(f: var MemFile; attempts: Natural = 3) {.raises: [OSError], tags: [].}

Flushes f's buffer, making up to attempts attempts. If errors occurred, an OSError exception is raised.
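
A minimal sketch of writing through the mem pointer and then flushing, assuming a hypothetical scratch file in the temporary directory. The message is chosen to be exactly as long as the mapping, because writes through the raw pointer are not bounds checked.

```nim
import memfiles, os

let path = getTempDir() / "memfiles_flush_demo.bin"   # hypothetical scratch path

# Create a 16-byte file and map it with write access.
var mf = memfiles.open(path, mode = fmReadWrite, newFileSize = 16)

# Fill the mapping through the raw pointer; there is no bounds checking here.
var msg = "flushed to disk!"          # exactly 16 bytes
copyMem(mf.mem, addr msg[0], msg.len)

mf.flush()     # write the changes back; raises OSError if every attempt fails
mf.close()

let onDisk = readFile(path)
echo onDisk
removeFile(path)
```

Since close also writes changes back, an explicit flush is mainly of interest when the mapping stays open for a long time.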

proc close(f: var MemFile) {.raises: [OSError], tags: [].}

closes the memory mapped file f. All changes are written back to the file system, if f was opened with write access.

proc `==`(x, y: MemSlice): bool {.raises: [], tags: [].}

Compare a pair of MemSlice for strict equality.

proc `$`(ms: MemSlice): string {.inline, raises: [], tags: [].}

Return a Nim string built from a MemSlice.
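
The two MemSlice operators can be illustrated with a short sketch. The file name and its contents are made up for the example; the slices are collected with the memSlices iterator from this module.

```nim
import memfiles, os

let path = getTempDir() / "memfiles_slice_demo.txt"   # hypothetical scratch path
writeFile(path, "alpha\nbeta\nalpha\n")

var mf = memfiles.open(path)
var slices: seq[MemSlice] = @[]
var texts: seq[string] = @[]
for s in memSlices(mf):
  slices.add s        # a (data, size) view into the mapping, no copy
  texts.add $s        # `$` copies the bytes into a Nim string

# Two distinct views with the same bytes compare equal.
let firstEqualsThird = slices[0] == slices[2]
let firstEqualsSecond = slices[0] == slices[1]
echo texts, " ", firstEqualsThird, " ", firstEqualsSecond

mf.close()   # the slices point into the mapping and must not be used after this
removeFile(path)
```

Strings produced by `$` are ordinary copies and stay valid after close, whereas the MemSlice values themselves do not.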

proc newMemMapFileStream(filename: string; mode: FileMode = fmRead; fileSize: int = -1): MemMapFileStream {.
    raises: [IOError, OSError], tags: [].}

creates a new stream from the file named filename with the mode mode. Raises OSError if the file cannot be opened. See the system module for a list of available FileMode enums. fileSize can only be set if the file does not exist and is opened with write access (e.g., with fmReadWrite).
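
A minimal sketch of reading through such a stream with the generic procs from the streams module. The scratch file and its contents are assumptions made for the example.

```nim
import memfiles, streams, os

let path = getTempDir() / "memfiles_stream_demo.txt"   # hypothetical scratch path
writeFile(path, "first line\nsecond line\n")

# The default mode is fmRead; the stream reads from the mapped file.
let s = newMemMapFileStream(path)
let first = s.readLine()
let second = s.readLine()
let done = s.atEnd()     # true once the position has reached the mapped size
s.close()                # also closes the underlying MemFile

echo first
echo second
removeFile(path)
```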

Iterators

iterator memSlices(mfile: MemFile; delim = '\n'; eat = '\c'): MemSlice {.inline,
    raises: [], tags: [].}

Iterates over [optional eat] delim-delimited slices in MemFile mfile.

Default parameters parse lines ending in either Unix (\l) or Windows (\r\l) style on a line-by-line basis, i.e., not every line needs the same ending. Unlike readLine(File) and lines(File), archaic MacOS9 \r-delimited lines are not supported as a third option for each line. Such archaic MacOS9 files can be handled by passing delim='\r', eat='\0', though.

Delimiters are not part of the returned slice. A final, unterminated line or record is returned just like any other.

Non-default delimiters can be passed to allow iteration over other sorts of "line-like" variable length records. Pass eat='\0' to be strictly delim-delimited. (Eating an optional prefix equal to '\0' is not supported.)
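
The effect of delim and eat on non-line records can be sketched as below. The semicolon-separated content and the scratch file name are made up for illustration.

```nim
import memfiles, os

let path = getTempDir() / "memfiles_records_demo.txt"   # hypothetical scratch path
writeFile(path, "a;b ;c")

var mf = memfiles.open(path)

# Strictly ';'-delimited: eat = '\0' disables trimming, so "b " keeps its space.
var strict: seq[string] = @[]
for s in memSlices(mf, delim = ';', eat = '\0'):
  strict.add $s

# With eat = ' ', one optional space directly before each ';' is dropped.
var trimmed: seq[string] = @[]
for s in memSlices(mf, delim = ';', eat = ' '):
  trimmed.add $s

mf.close()
removeFile(path)
echo strict
echo trimmed
```

The final record "c" has no terminating delimiter and is still returned, as described above.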

This zero copy, memchr-limited interface is probably the fastest way to iterate over line-like records in a file. However, returned (data,size) objects are not Nim strings, bounds checked Nim arrays, or even terminated C strings. So, care is required to access the data (e.g., think C mem* functions, not str* functions).
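
One way to apply that care is to check the slice size before touching any bytes and to compare raw memory instead of relying on a terminator. The helper hasPrefix below is hypothetical and not part of this module; the file contents are made up for the sketch.

```nim
import memfiles, os

let path = getTempDir() / "memfiles_prefix_demo.txt"   # hypothetical scratch path
writeFile(path, "# comment\nkey=value\n#x\n\n")

# Hypothetical helper: does the slice start with the given prefix?
# The length is checked first and raw bytes are compared, because a
# MemSlice is neither NUL-terminated nor bounds checked.
proc hasPrefix(ms: MemSlice; prefix: string): bool =
  if prefix.len == 0: return true
  if ms.size < prefix.len: return false
  result = equalMem(ms.data, unsafeAddr prefix[0], prefix.len)

var mf = memfiles.open(path)
var comments = 0
for s in memSlices(mf):
  if s.hasPrefix("#"): inc comments   # the empty last line is rejected by the size check
mf.close()
removeFile(path)
echo comments
```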

Example:

import memfiles

var mf = memfiles.open("foo")
var count = 0
for slice in memSlices(mf):
  # count non-empty records that do not start with '#'
  if slice.size > 0 and cast[cstring](slice.data)[0] != '#':
    inc(count)
mf.close()
echo count

iterator lines(mfile: MemFile; buf: var TaintedString; delim = '\n'; eat = '\c'): TaintedString {.
    inline, raises: [], tags: [].}

Replace contents of passed buffer with each new line, like readLine(File). delim, eat, and the delimiting logic are exactly as for memSlices, but Nim strings are returned.

Example:

import memfiles
var mf = memfiles.open("foo")
var buffer: TaintedString = ""
for line in lines(mf, buffer):
  echo line
mf.close()

iterator lines(mfile: MemFile; delim = '\n'; eat = '\c'): TaintedString {.inline,
    raises: [], tags: [].}

Return each line in a file as a Nim string, like lines(File). delim, eat, and the delimiting logic are exactly as for memSlices, but Nim strings are returned.

Example:

var mf = memfiles.open("foo")
for line in lines(mf):
  echo line
mf.close()

© 2006–2018 Andreas Rumpf
Licensed under the MIT License.
https://nim-lang.org/docs/memfiles.html