API Reference

Submodules

Module contents

megfile.smart_access(path: Union[str, os.PathLike], mode: megfile.pathlike.Access) → bool[source]

Test if path has access permission described by mode

Parameters
  • path – Path to be tested

  • mode – Access mode (Access.READ, Access.WRITE, Access.BUCKETREAD, Access.BUCKETWRITE)

Returns

True if the path has the access described by mode, else False

megfile.smart_cache(path, cacher=<class 'megfile.smart.SmartCacher'>, **options)[source]

Return a path to Posixpath Interface

Parameters
  • path – Path to cache

  • cacher – Cacher for s3 path

  • options – Optional arguments for the cacher

megfile.smart_combine_open(path_glob: str, mode: str = 'rb', open_func=<function smart_open>) → megfile.lib.combine_reader.CombineReader[source]

Open a unified reader that supports multi file reading.

Parameters
  • path_glob – A path that may contain shell wildcard characters

  • mode – Mode to open file, supports ‘rb’

Returns

A `CombineReader`

megfile.smart_copy(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike], callback: Optional[Callable[int, None]] = None, followlinks: bool = False) → None[source]

Copy file from source path to destination path

Here are a few examples:

>>> from tqdm import tqdm
>>> from megfile import smart_copy, smart_stat
>>> class Bar:
...     def __init__(self, total=10):
...         self._bar = tqdm(total=total)
...
...     def __call__(self, bytes_num):
...         self._bar.update(bytes_num)
...
>>> src_path = 'test.png'
>>> dst_path = 'test1.png'
>>> smart_copy(src_path, dst_path, callback=Bar(total=smart_stat(src_path).size), followlinks=False)
856960it [00:00, 260592384.24it/s]
Parameters
  • src_path – Given source path

  • dst_path – Given destination path

  • callback – Called periodically during the copy; its argument is the number of bytes copied since the last call

  • followlinks – Set to False to treat a symlink as a file, otherwise True
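
The callback contract described above can be illustrated without megfile. The sketch below uses only the standard library; `copy_with_callback` and `ByteCounter` are hypothetical names introduced for illustration and are not part of megfile. It shows that each call receives only the bytes copied since the previous call, so the values add up to the file size.

```python
import os
import tempfile


def copy_with_callback(src_path, dst_path, callback=None, chunk_size=4096):
    """Hypothetical stand-in for a copy routine; not megfile's implementation."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            if callback is not None:
                # Report only the bytes copied since the previous call.
                callback(len(chunk))


class ByteCounter:
    """Callback that accumulates the incremental byte counts."""

    def __init__(self):
        self.total = 0
        self.calls = 0

    def __call__(self, bytes_num):
        self.total += bytes_num
        self.calls += 1


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.bin")
    dst = os.path.join(tmp, "dst.bin")
    with open(src, "wb") as f:
        f.write(b"x" * 10000)
    counter = ByteCounter()
    copy_with_callback(src, dst, callback=counter)
    copied_size = os.path.getsize(dst)
```

A progress bar such as the tqdm-based Bar above fits the same shape: it only needs to be callable with one integer.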

megfile.smart_exists(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if path or s3_url exists

Parameters

path – Path to be tested

Returns

True if path exists, else False

megfile.smart_getmtime(path: Union[str, os.PathLike]) → float[source]

Get the last-modified time of the file at the given s3_url or file path (as a Unix timestamp). If the path is an existing directory, return the latest modified time of all files in it. The mtime of an empty directory is 1970-01-01 00:00:00

Parameters

path – Given path

Returns

Last-modified time

Raises

FileNotFoundError
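
As a rough local-filesystem illustration of the directory rule above, the following standard-library sketch returns the newest mtime among the files under a directory and 0.0 (1970-01-01 00:00:00 UTC) for an empty one. `latest_mtime` is a hypothetical helper, and the recursion into subdirectories is an assumption; this is not megfile's implementation.

```python
import os
import tempfile


def latest_mtime(path):
    """Stdlib sketch of the documented rule; not megfile's implementation."""
    if os.path.isfile(path):
        return os.path.getmtime(path)
    if not os.path.isdir(path):
        raise FileNotFoundError(path)
    mtimes = [
        os.path.getmtime(os.path.join(root, name))
        for root, _dirs, files in os.walk(path)
        for name in files
    ]
    # An empty directory maps to timestamp 0, i.e. 1970-01-01 00:00:00 UTC.
    return max(mtimes, default=0.0)


with tempfile.TemporaryDirectory() as tmp:
    empty_mtime = latest_mtime(tmp)
    for name, stamp in (("old.txt", 1000), ("new.txt", 2000)):
        file_path = os.path.join(tmp, name)
        open(file_path, "w").close()
        os.utime(file_path, (stamp, stamp))  # set atime and mtime explicitly
    directory_mtime = latest_mtime(tmp)
```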

megfile.smart_getsize(path: Union[str, os.PathLike]) → int[source]

Get the file size at the given s3_url or file path (in bytes). If the path is a directory, return the total size of all files in it, including files in subdirectories (if any). The result excludes the size of the directory itself; in other words, an empty directory returns 0 bytes.

Parameters

path – Given path

Returns

File size

Raises

FileNotFoundError
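
The directory rule above can be sketched for a local filesystem with the standard library alone. `directory_size` is a hypothetical helper written for illustration under that assumption; it is not megfile's implementation.

```python
import os
import tempfile


def directory_size(path):
    """Sum the sizes of all regular files under path, recursively."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    if not os.path.isdir(path):
        raise FileNotFoundError(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            # Directories themselves contribute nothing to the total.
            total += os.path.getsize(os.path.join(root, name))
    return total


with tempfile.TemporaryDirectory() as tmp:
    empty_size = directory_size(tmp)
    os.makedirs(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "a.bin"), "wb") as f:
        f.write(b"12345")
    with open(os.path.join(tmp, "sub", "b.bin"), "wb") as f:
        f.write(b"123")
    nested_size = directory_size(tmp)
```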

megfile.smart_glob_stat(pathname: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → Iterator[megfile.pathlike.FileEntry][source]

The given pathname may contain shell wildcard characters. Return an iterator of tuples of path and file stat, in ascending alphabetical order, for the paths that match the glob pattern

Parameters
  • pathname – A path pattern that may contain shell wildcard characters

  • recursive – If False, this function will not glob recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

megfile.smart_glob(pathname: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → List[str][source]

The given pathname may contain shell wildcard characters. Return a list of the paths that match the glob pattern, in ascending alphabetical order

Parameters
  • pathname – A path pattern that may contain shell wildcard characters

  • recursive – If False, this function will not glob recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError
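
For local paths, the roles of recursive and missing_ok can be approximated with the standard library glob module. The sketch below is an illustration only; `sorted_glob` is a hypothetical helper, and the assumption that `**` behaves like the standard library's recursive glob is mine rather than something stated above.

```python
import glob
import os
import tempfile


def sorted_glob(pathname, recursive=True, missing_ok=True):
    """Stdlib sketch: sorted glob matches, optionally raising when nothing matches."""
    matches = sorted(glob.glob(pathname, recursive=recursive))
    if not matches and not missing_ok:
        raise FileNotFoundError("No match for pattern: %s" % pathname)
    return matches


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for relative in ("a.txt", "c.log", os.path.join("sub", "b.txt")):
        open(os.path.join(tmp, relative), "w").close()
    pattern = os.path.join(tmp, "**", "*.txt")
    # With recursive=True, "**" also matches zero directories, so a.txt is found.
    recursive_names = [os.path.relpath(p, tmp) for p in sorted_glob(pattern)]
    # With recursive=False, "**" acts like "*" and matches exactly one level.
    flat_names = [os.path.relpath(p, tmp) for p in sorted_glob(pattern, recursive=False)]
```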

megfile.smart_iglob(pathname: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → Iterator[str][source]

The given pathname may contain shell wildcard characters. Return an iterator over the paths that match the glob pattern, in ascending alphabetical order

Parameters
  • pathname – A path pattern that may contain shell wildcard characters

  • recursive – If False, this function will not glob recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

megfile.smart_isdir(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if a file path or an s3 url is directory

Parameters

path – Path to be tested

Returns

True if path is directory, else False

megfile.smart_isfile(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if a file path or an s3 url is file

Parameters

path – Path to be tested

Returns

True if path is file, else False

megfile.smart_listdir(path: Union[str, os.PathLike, None] = None) → List[str][source]

Get all contents of the given s3_url or file path. The result is in ascending alphabetical order.

Parameters

path – Given path

Returns

All contents of the given s3_url or file path, in ascending alphabetical order.

Raises

FileNotFoundError, NotADirectoryError

megfile.smart_load_content(path: Union[str, os.PathLike], start: Optional[int] = None, stop: Optional[int] = None) → bytes[source]

Get the bytes of the specified file in the range [start, stop)

Parameters
  • path – Specified path

  • start – start index

  • stop – stop index

Returns

bytes content in range [start, stop)
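
The half-open range [start, stop) follows the same convention as Python slicing: the byte at start is included and the byte at stop is excluded. A minimal sketch on an in-memory bytes object, where `load_range` is a hypothetical helper and the treatment of None as "from the beginning" or "to the end" is an assumption:

```python
def load_range(data, start=None, stop=None):
    """Hypothetical helper mirroring the half-open range with a plain slice."""
    return data[start:stop]


content = b"0123456789"
first_four = load_range(content, 0, 4)  # bytes 0, 1, 2, 3; byte 4 is excluded
tail = load_range(content, 7)
everything = load_range(content)
```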

megfile.smart_save_content(path: Union[str, os.PathLike], content: bytes) → None[source]

Save bytes content to specified path

Parameters

path – Path to save content

megfile.smart_load_from(path: Union[str, os.PathLike]) → BinaryIO[source]

Read all content in binary on specified path and write into memory

User should close the BinaryIO manually

Parameters

path – Specified path

Returns

BinaryIO

megfile.smart_load_text(path: Union[str, os.PathLike]) → str[source]

Read content from path

Parameters

path – Path to be read

megfile.smart_save_text(path: Union[str, os.PathLike], text: str) → None[source]

Save text to specified path

Parameters

path – Path to save text

megfile.smart_makedirs(path: Union[str, os.PathLike], exist_ok: bool = False) → None[source]

Create a directory if the path is on fs. If it is on s3, this function actually checks whether the target exists and whether the bucket has WRITE access

Parameters
  • path – Given path

  • exist_ok – If False and the target directory exists, raise FileExistsError

Raises

PermissionError, FileExistsError

megfile.smart_open(path: Union[str, os.PathLike], mode: str = 'r', s3_open_func: Callable[[str, str], BinaryIO] = <function s3_buffered_open>, encoding: Optional[str] = None, errors: Optional[str] = None, **options) → IO[AnyStr][source]

Open a file on the path

Note

On fs, the difference between this function and io.open is that this function creates directories automatically, instead of raising FileNotFoundError

Currently, supported protocols are:

  1. s3: “s3://<bucket>/<key>”

  2. http(s): http(s) url

  3. stdio: “stdio://-”

  4. FS file: Besides the protocols mentioned above, any other path is considered an fs path

Here are a few examples:

>>> import cv2
>>> import numpy as np
>>> from megfile import smart_open
>>> raw = smart_open('https://ss2.bdstatic.com/70cFvnSh_Q1YnxGkpoWK1HF6hhy/it/u=2275743969,3715493841&fm=26&gp=0.jpg').read()
>>> img = cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_ANYDEPTH | cv2.IMREAD_COLOR)
Parameters
  • path – Given path

  • mode – Mode to open file, supports r'[rwa][tb]?\+?'

  • s3_open_func – Function used to open s3_url. The function must accept 2 required parameters: file path and mode

  • encoding – encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode.

  • errors – errors is an optional string that specifies how encoding and decoding errors are to be handled—this cannot be used in binary mode.

Returns

File-Like object

Raises

FileNotFoundError, IsADirectoryError, ValueError
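
To see which mode strings the documented pattern admits, the sketch below checks candidates with the standard library re module. It reads the pattern as `[rwa][tb]?\+?`, which is an assumption that the plus sign is meant literally; `is_supported_mode` is a hypothetical helper, not a megfile function.

```python
import re

# Assumed reading of the documented pattern: one of r/w/a, an optional t or b,
# then an optional literal "+".
MODE_PATTERN = re.compile(r"[rwa][tb]?\+?")


def is_supported_mode(mode):
    """Return True if the whole mode string matches the assumed pattern."""
    return MODE_PATTERN.fullmatch(mode) is not None


accepted = [m for m in ("r", "rb", "w+", "ab+", "wt") if is_supported_mode(m)]
rejected = [m for m in ("x", "rw", "br", "r++") if not is_supported_mode(m)]
```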

megfile.smart_path_join(path: Union[str, os.PathLike], *other_paths: Union[str, os.PathLike]) → str[source]

Concatenate two or more paths into a complete path

Parameters
  • path – Given path

  • other_paths – Paths to be concatenated

Returns

Concatenated complete path

Note

For URI, the difference between this function and os.path.join is that this function ignores the left-side slash (which indicates an absolute path) in other_paths and concatenates directly, e.g.

os.path.join('s3://path', 'to', '/file') => '/file'

smart_path_join('s3://path', 'to', '/file') => '/path/to/file'

But for fs path, this function behaves exactly like os.path.join, e.g.

smart_path_join('/path', 'to', '/file') => '/file'

megfile.smart_realpath(path: Union[str, os.PathLike])[source]

Return the real path of given path

Parameters

path – Given path

Returns

Real path of given path

megfile.smart_remove(path: Union[str, os.PathLike], missing_ok: bool = False) → None[source]

Remove the file or directory on s3 or fs. Removing s3:// or s3://bucket is not permitted

Parameters
  • path – Given path

  • missing_ok – If False and the target file/directory does not exist, raise FileNotFoundError

Raises

PermissionError, FileNotFoundError

megfile.smart_move(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike]) → None[source]

Move file/directory on s3 or fs. s3:// or s3://bucket is not allowed to move

Parameters
  • src_path – Given source path

  • dst_path – Given destination path

megfile.smart_rename(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike]) → None[source]

Move file on s3 or fs. s3:// or s3://bucket is not allowed to move

Parameters
  • src_path – Given source path

  • dst_path – Given destination path

megfile.smart_save_as(file_object: BinaryIO, path: Union[str, os.PathLike]) → None[source]

Write the opened binary stream to specified path, but the stream won’t be closed

Parameters
  • file_object – Stream to be read

  • path – Specified target path

megfile.smart_scan_stat(path: Union[str, os.PathLike], missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat

Parameters
  • path – Given path

  • missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Raises

UnsupportedError

Returns

A generator of tuples of path string and file stat

megfile.smart_scan(path: Union[str, os.PathLike], missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a path string.

  • If path is a file path, yield the file only

  • If path is a non-existent path, return an empty generator

  • If path is a bucket path, return all file paths in the bucket

Parameters
  • path – Given path

  • missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Raises

UnsupportedError

Returns

A file path generator
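
The cases listed for this function can be mimicked on a local filesystem with the standard library. In the sketch below, `scan_files` is a hypothetical helper, not megfile's implementation; it collects and sorts paths before yielding, which is a simplification rather than a truly lazy traversal.

```python
import os
import tempfile


def scan_files(path):
    """Stdlib sketch: yield only file paths, in alphabetical order."""
    if os.path.isfile(path):
        yield path
        return
    found = []
    # os.walk yields nothing for a non-existent path, so the result is empty.
    for root, _dirs, files in os.walk(path):
        found.extend(os.path.join(root, name) for name in files)
    # Collecting before sorting is a simplification; it is not lazy.
    yield from sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a"))
    os.makedirs(os.path.join(tmp, "empty"))
    for relative in ("b.txt", os.path.join("a", "x.txt")):
        open(os.path.join(tmp, relative), "w").close()
    scanned = [os.path.relpath(p, tmp) for p in scan_files(tmp)]
    single = list(scan_files(os.path.join(tmp, "b.txt")))
    missing = list(scan_files(os.path.join(tmp, "nope")))
```

Directories, including the empty one, never appear in the output; only files do.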

megfile.smart_scandir(path: Union[str, os.PathLike, None] = None) → Iterator[megfile.pathlike.FileEntry][source]

Get all content of given s3_url or file path.

Parameters

path – Given path

Returns

An iterator over all contents that have the prefix path

Raises

FileNotFoundError, NotADirectoryError

megfile.smart_stat(path: Union[str, os.PathLike], follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of s3_url or file path

Parameters

path – Given path

Returns

StatResult

Raises

FileNotFoundError

megfile.smart_sync(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike], callback: Optional[Callable[[str, int], None]] = None, followlinks: bool = False, callback_after_copy_file: Optional[Callable[[str, str], None]] = None, src_file_stats: Optional[Iterable[megfile.pathlike.FileEntry]] = None, map_func: Callable[[Callable, Iterable], Any] = <class 'map'>, force: bool = False) → None[source]

Sync file or directory on s3 and fs

Note

When the parameter is a file, this function behaves like smart_copy.

If a file and a directory have the same name at the same level, sync treats the path as a file first.

Here are a few examples:

>>> from tqdm import tqdm
>>> from threading import Lock
>>> from megfile import smart_sync, smart_stat, smart_glob
>>> class Bar:
...     def __init__(self, total_file):
...         self._total_file = total_file
...         self._bar = None
...         self._now = None
...         self._file_index = 0
...         self._lock = Lock()
...     def __call__(self, path, num_bytes):
...         with self._lock:
...             if path != self._now:
...                 self._file_index += 1
...                 print("copy file {}/{}:".format(self._file_index, self._total_file))
...                 if self._bar:
...                     self._bar.close()
...                 self._bar = tqdm(total=smart_stat(path).size)
...                 self._now = path
...             self._bar.update(num_bytes)
>>> total_file = len(list(smart_glob('src_path')))
>>> smart_sync('src_path', 'dst_path', callback=Bar(total_file=total_file))
Parameters
  • src_path – Given source path

  • dst_path – Given destination path

  • callback – Called periodically during the copy; its arguments are a file path and the number of bytes copied since the last call

  • followlinks – Set to False to treat a symlink as a file, otherwise True

  • callback_after_copy_file – Called after a file is copied successfully; its arguments are the source file path and the destination file path

  • src_file_stats – If not None, only the files in this parameter are synced, and src_path is the root path of these files, used to calculate the path of each target file. This parameter exists to reduce the number of file traversals.

  • map_func – A callable that behaves like map. You can use ThreadPoolExecutor.map, Pool.map and so on if you need concurrency. The default is the standard library map.
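
The map_func parameter relies on the fact that ThreadPoolExecutor.map accepts the same (function, iterable) arguments as the built-in map and preserves input order. The sketch below demonstrates only that interchangeability; `process_all` and `fake_copy` are hypothetical names, and no megfile call is involved.

```python
from concurrent.futures import ThreadPoolExecutor


def process_all(func, items, map_func=map):
    """Hypothetical helper: apply func to every item through a map-like callable."""
    return list(map_func(func, items))


def fake_copy(name):
    """Stand-in for per-file work; it only transforms the name."""
    return name.upper()


names = ["a.txt", "b.txt", "c.txt"]
serial = process_all(fake_copy, names)
with ThreadPoolExecutor(max_workers=2) as executor:
    # executor.map has a map-compatible call shape and keeps the input order.
    threaded = process_all(fake_copy, names, map_func=executor.map)
```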

megfile.smart_touch(path: Union[str, os.PathLike])[source]

Create a new file on path

Parameters

path – Path to create file

Remove the file on s3 or fs

Parameters
  • path – Given path

  • missing_ok – If False and the target file does not exist, raise FileNotFoundError

Raises

PermissionError, FileNotFoundError, IsADirectoryError

megfile.smart_walk(path: Union[str, os.PathLike], followlinks: bool = False) → Iterator[Tuple[str, List[str], List[str]]][source]

Generate the file names in a directory tree by walking the tree top-down. For each directory in the tree rooted at directory path (including path itself), it yields a 3-tuple (root, dirs, files).

  • root: a string of the current path

  • dirs: name list of subdirectories (excluding ‘.’ and ‘..’ if they exist) in ‘root’. The list is sorted in ascending alphabetical order

  • files: name list of non-directory files (a link is regarded as a file) in ‘root’. The list is sorted in ascending alphabetical order

If path does not exist, return an empty generator. If path is a file, return an empty generator. If walk() is applied to an unsupported path, raise UnsupportedError

Parameters

path – Given path

Raises

UnsupportedError

Returns

A 3-tuple generator
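
The shape of the yielded tuples matches the standard library's os.walk, so a local approximation is straightforward. In this sketch `sorted_walk` is a hypothetical wrapper that adds the sorted ordering described above; it is not megfile's implementation.

```python
import os
import tempfile


def sorted_walk(path):
    """Top-down walk yielding (root, dirs, files) with sorted name lists."""
    for root, dirs, files in os.walk(path):
        dirs.sort()  # in-place sort also fixes the order in which os.walk descends
        files.sort()
        yield root, dirs, files


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "b"))
    os.makedirs(os.path.join(tmp, "a"))
    for name in ("z.txt", "y.txt"):
        open(os.path.join(tmp, name), "w").close()
    entries = [
        (os.path.relpath(root, tmp), dirs, files)
        for root, dirs, files in sorted_walk(tmp)
    ]
    # os.walk silently yields nothing for a missing path or a plain file.
    missing = list(sorted_walk(os.path.join(tmp, "nope")))
    on_file = list(sorted_walk(os.path.join(tmp, "y.txt")))
```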

megfile.smart_getmd5(path: Union[str, os.PathLike], recalculate: bool = False)[source]

Get md5 value of file

Parameters
  • path – File path

  • recalculate – Whether to calculate md5 in real time; if not, return the s3 etag when path is on s3

Create a symbolic link pointing to src_path named path.

Parameters
  • src_path – Source path

  • dst_path – Destination path

Return a string representing the path to which the symbolic link points.

Parameters

path – Path to be read

Returns

A string representing the path to which the symbolic link points

megfile.smart_lstat(path: Union[str, os.PathLike]) → megfile.pathlike.StatResult[source]

Get StatResult of path but do not follow symbolic links

Parameters

path – Given path

Returns

StatResult

Raises

FileNotFoundError

megfile.smart_concat(src_paths: List[Union[str, os.PathLike]], dst_path: Union[str, os.PathLike]) → None[source]

Concatenate src_paths to dst_path

Parameters
  • src_paths – List of source paths

  • dst_path – Destination path

megfile.is_s3(path: Union[str, os.PathLike]) → bool[source]
  1. According to aws-cli, test if a path is an s3 path.

  2. megfile also supports paths like s3[+profile_name]://bucket/key

Parameters

path – Path to be tested

Returns

True if path is s3 path, else False
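
The two path forms above can be recognised with a simple regular expression. The sketch below is an assumption-laden illustration: `looks_like_s3` is a hypothetical helper, and the set of characters allowed in a profile name (`\w`) is a guess rather than megfile's actual rule.

```python
import re

# Assumed pattern: "s3", an optional "+profile_name", then "://".
S3_URL_PATTERN = re.compile(r"^s3(\+\w+)?://")


def looks_like_s3(path):
    """Return True if path starts with s3:// or s3+<profile>://."""
    return S3_URL_PATTERN.match(str(path)) is not None


plain = looks_like_s3("s3://bucket/key")
with_profile = looks_like_s3("s3+prod://bucket/key")
local = looks_like_s3("/local/path")
other_scheme = looks_like_s3("s3a://bucket/key")
```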

megfile.s3_access(path: Union[str, os.PathLike], mode: megfile.pathlike.Access = <Access.READ: 1>, followlinks: bool = False) → bool[source]

Test if path has access permission described by mode

Parameters
  • path – Given path

  • mode – access mode

Returns

True if the bucket of s3_url has the access described by mode, else False

megfile.s3_buffered_open(s3_url: Union[str, os.PathLike], mode: str, followlinks: bool = False, *, max_concurrency: Optional[int] = None, max_buffer_size: int = 134217728, forward_ratio: Optional[float] = None, block_size: int = 8388608, limited_seekable: bool = False, buffered: bool = True, share_cache_key: Optional[str] = None, cache_path: Optional[str] = None) → Union[megfile.lib.s3_prefetch_reader.S3PrefetchReader, megfile.lib.s3_buffered_writer.S3BufferedWriter, _io.BufferedReader, _io.BufferedWriter, megfile.lib.s3_memory_handler.S3MemoryHandler][source]

Open an asynchronous prefetch reader, to support fast sequential read

Note

User should make sure that reader / writer are closed correctly

Supports context manager

Settings that may perform well: max_concurrency=10 or 20, block_size=8 or 16 MB. The default value None for max_concurrency means the global thread pool is used

Parameters
  • max_concurrency – Max download thread number, None by default

  • max_buffer_size – Max cached buffer size in memory, 128MB by default

  • block_size – Size of single block, 8MB by default. Each block will be uploaded or downloaded by single thread.

  • limited_seekable – Whether the write handle supports limited seek (both the head and the tail of the file can seek within block_size). Note: this parameter is valid only for write handles; read handles support arbitrary seek
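
The default values in the signature are given in bytes. The arithmetic below converts them to MiB and works out how many default-sized blocks cover a file of a given size. It is plain arithmetic on the documented defaults, not a statement about how megfile schedules blocks internally; `blocks_needed` is a hypothetical helper.

```python
MAX_BUFFER_SIZE = 134217728  # default max_buffer_size from the signature
BLOCK_SIZE = 8388608  # default block_size from the signature
MIB = 1024 * 1024


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of blocks required to cover file_size bytes (ceiling division)."""
    return -(-file_size // block_size)


buffer_mib = MAX_BUFFER_SIZE // MIB
block_mib = BLOCK_SIZE // MIB
blocks_per_buffer = MAX_BUFFER_SIZE // BLOCK_SIZE
blocks_for_100_mib = blocks_needed(100 * MIB)
```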

Returns

An opened S3PrefetchReader object

Raises

S3FileNotFoundError

megfile.s3_cached_open(s3_url: Union[str, os.PathLike], mode: str, followlinks: bool = False, *, cache_path: Optional[str] = None) → megfile.lib.s3_cached_handler.S3CachedHandler[source]

Open a local-cache file reader / writer, for frequent random read / write

Note

User should make sure that reader / writer are closed correctly

Supports context manager

cache_path can specify the path of cache file. Performance could be better if cache file path is on ssd or tmpfs

Parameters
  • mode – Mode to open file, could be one of “rb”, “wb” or “ab”

  • cache_path – cache file path

Returns

An opened BufferedReader / BufferedWriter object

megfile.s3_copy(src_url: Union[str, os.PathLike], dst_url: Union[str, os.PathLike], followlinks: bool = False, callback: Optional[Callable[int, None]] = None) → None[source]

Copy a file on S3: copy the content of the file at src_url to dst_url. It is the caller’s responsibility to ensure that s3_isfile(src_url) == True

Parameters
  • src_url – Given path

  • dst_url – Target file path

  • callback – Called periodically during the copy; its argument is the number of bytes copied since the last call

megfile.s3_download(src_url: Union[str, os.PathLike], dst_url: Union[str, os.PathLike], followlinks: bool = False, callback: Optional[Callable[int, None]] = None) → None[source]

Downloads a file from s3 to local filesystem.

Parameters
  • src_url – source s3 path

  • dst_url – target fs path

  • callback – Called periodically during the copy; its argument is the number of bytes copied since the last call

megfile.s3_exists(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if s3_url exists

If the bucket of s3_url is not permitted to be read, return False

Parameters

path – Given path

Returns

True if s3_url exists, else False

megfile.s3_getmd5(path: Union[str, os.PathLike], recalculate: bool = False, followlinks: bool = False) → str[source]

Get the md5 meta info of files that were uploaded/copied via megfile

If meta info is lost or non-existent, return None

Parameters
  • path – Given path

  • recalculate – Calculate md5 in real time, or return the s3 etag

  • followlinks – If True, calculate md5 for the real file

Returns

md5 meta info

megfile.s3_getmtime(path: Union[str, os.PathLike], follow_symlinks: bool = False) → float[source]

Get the last-modified time of the file at the given s3_url path (as a Unix timestamp). If the path is an existing directory, return the latest modified time of all files in it. The mtime of an empty directory is 1970-01-01 00:00:00

If s3_url is not an existing path, which means s3_exists(s3_url) returns False, raise S3FileNotFoundError

Parameters

path – Given path

Returns

Last-modified time

Raises

S3FileNotFoundError, UnsupportedError

megfile.s3_getsize(path: Union[str, os.PathLike], follow_symlinks: bool = False) → int[source]

Get the file size at the given s3_url path (in bytes). If the path is a directory, return the total size of all files in it, including files in subdirectories (if any). The result excludes the size of the directory itself; in other words, an empty directory returns 0 bytes.

If s3_url is not an existing path, which means s3_exists(s3_url) returns False, raise S3FileNotFoundError

Parameters

path – Given path

Returns

File size

Raises

S3FileNotFoundError, UnsupportedError

megfile.s3_glob_stat(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Return a generator of tuples of path and file stat, in ascending alphabetical order, for the paths that match the glob pattern

Notes: Only glob in bucket. If trying to match bucket with wildcard characters, raise UnsupportedError

Parameters
  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Raises

UnsupportedError, when bucket part contains wildcard characters

Returns

A generator of tuples of path and file stat, in which the paths match s3_pathname

megfile.s3_glob(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → List[str][source]

Return a list of the s3 paths that match the glob pattern, in ascending alphabetical order

Notes: Only glob in bucket. If trying to match bucket with wildcard characters, raise UnsupportedError

Parameters
  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Raises

UnsupportedError, when bucket part contains wildcard characters

Returns

A list of the paths that match s3_pathname

megfile.s3_hasbucket(path: Union[str, os.PathLike]) → bool[source]

Test if the bucket of s3_url exists

Parameters

path – Given path

Returns

True if the bucket of s3_url exists, else False

megfile.s3_iglob(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source]

Return an iterator over the s3 paths that match the glob pattern, in ascending alphabetical order

Notes: Only glob in bucket. If trying to match bucket with wildcard characters, raise UnsupportedError

Parameters
  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Raises

UnsupportedError, when bucket part contains wildcard characters

Returns

An iterator over the paths that match s3_pathname

megfile.s3_isdir(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if an s3 url is a directory. Specific procedures are as follows:

  • If there exists a suffix, of which os.path.join(s3_url, suffix) is a file

  • If the url is an empty bucket or s3://

Parameters
  • path – Given path

  • followlinks – The result is the same whether followlinks is True or False, because s3 symlinks do not support directories.

Returns

True if path is s3 directory, else False

megfile.s3_isfile(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if an s3_url is file

Parameters

path – Given path

Returns

True if path is s3 file, else False

megfile.s3_listdir(path: Union[str, os.PathLike], followlinks: bool = False) → List[str][source]

Get all contents of the given s3_url. The result is in ascending alphabetical order.

Parameters

path – Given path

Returns

All contents that have the prefix s3_url, in ascending alphabetical order

Raises

S3FileNotFoundError, S3NotADirectoryError

megfile.s3_load_content(s3_url, start: Optional[int] = None, stop: Optional[int] = None, followlinks: bool = False) → bytes[source]

Get the bytes of the specified file in the range [start, stop)

Parameters
  • s3_url – Specified path

  • start – start index

  • stop – stop index

Returns

bytes content in range [start, stop)

megfile.s3_load_from(path: Union[str, os.PathLike], followlinks: bool = False) → BinaryIO[source]

Read all content in binary on specified path and write into memory

User should close the BinaryIO manually

Parameters

path – Given path

Returns

BinaryIO

megfile.s3_makedirs(path: Union[str, os.PathLike], exist_ok: bool = False)[source]

Create an s3 directory. Purely creating a directory is invalid because it is unavailable on OSS. This function only tests whether the target bucket has WRITE access.

Parameters
  • path – Given path

  • exist_ok – If False and target directory exists, raise S3FileExistsError

Raises

S3BucketNotFoundError, S3FileExistsError

megfile.s3_memory_open(s3_url: Union[str, os.PathLike], mode: str, followlinks: bool = False) → megfile.lib.s3_memory_handler.S3MemoryHandler[source]

Open a memory-cache file reader / writer, for frequent random read / write

Note

User should make sure that reader / writer are closed correctly

Supports context manager

Parameters

mode – Mode to open file, could be one of “rb”, “wb”, “ab”, “rb+”, “wb+” or “ab+”

Returns

An opened BufferedReader / BufferedWriter object

megfile.s3_open(s3_url: Union[str, os.PathLike], mode: str, followlinks: bool = False, *, max_concurrency: Optional[int] = None, max_buffer_size: int = 134217728, forward_ratio: Optional[float] = None, block_size: int = 8388608, limited_seekable: bool = False, buffered: bool = True, share_cache_key: Optional[str] = None, cache_path: Optional[str] = None) → Union[megfile.lib.s3_prefetch_reader.S3PrefetchReader, megfile.lib.s3_buffered_writer.S3BufferedWriter, _io.BufferedReader, _io.BufferedWriter, megfile.lib.s3_memory_handler.S3MemoryHandler]

Open an asynchronous prefetch reader, to support fast sequential read

Note

User should make sure that reader / writer are closed correctly

Supports context manager

Settings that may perform well: max_concurrency=10 or 20, block_size=8 or 16 MB. The default value None for max_concurrency means the global thread pool is used

Parameters
  • max_concurrency – Max download thread number, None by default

  • max_buffer_size – Max cached buffer size in memory, 128MB by default

  • block_size – Size of single block, 8MB by default. Each block will be uploaded or downloaded by single thread.

  • limited_seekable – Whether the write handle supports limited seek (both the head and the tail of the file can seek within block_size). Note: this parameter is valid only for write handles; read handles support arbitrary seek

Returns

An opened S3PrefetchReader object

Raises

S3FileNotFoundError

megfile.s3_path_join(path: Union[str, os.PathLike], *other_paths: Union[str, os.PathLike]) → str[source]

Concatenate two or more paths into a complete path

Parameters
  • path – Given path

  • other_paths – Paths to be concatenated

Returns

Concatenated complete path

Note

The difference between this function and os.path.join is that this function ignores the left-side slash (which indicates an absolute path) in other_paths and concatenates directly, e.g.

os.path.join('/path', 'to', '/file') => '/file'

s3_path_join('/path', 'to', '/file') => '/path/to/file'
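
The rule in this note can be reproduced in a few lines. In the sketch below, `join_ignoring_leading_slash` is a hypothetical function written to match the example in the note; it is not megfile's implementation and does not handle edge cases such as a bare scheme prefix. posixpath.join is used for the comparison so the result does not depend on the host platform.

```python
import posixpath


def join_ignoring_leading_slash(path, *other_paths):
    """Join parts with "/" while ignoring leading slashes in other_paths."""
    result = path
    for other in other_paths:
        # Drop the leading slash so the part cannot reset the joined path.
        result = result.rstrip("/") + "/" + other.lstrip("/")
    return result


stdlib_result = posixpath.join("/path", "to", "/file")
sketch_result = join_ignoring_leading_slash("/path", "to", "/file")
```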

megfile.s3_pipe_open(s3_url: Union[str, os.PathLike], mode: str, followlinks: bool = False, *, join_thread: bool = True) → megfile.lib.s3_pipe_handler.S3PipeHandler[source]

Open an asynchronous read-write reader / writer, to support fast sequential read / write

Note

User should make sure that reader / writer are closed correctly

Supports context manager

When join_thread is False, closing the file handle does not wait for the asynchronous write to finish. False does not affect read handles, but it can speed up write handles because the file is written asynchronously. However, asynchronous behaviour cannot guarantee that the file is written successfully, and frequent execution may exhaust threads and file handles

Parameters
  • mode – Mode to open file, either “rb” or “wb”

  • join_thread – Whether to wait after function execution until s3 finishes writing

Returns

An opened BufferedReader / BufferedWriter object

megfile.s3_prefetch_open(s3_url: Union[str, os.PathLike], mode: str = 'rb', followlinks: bool = False, *, max_concurrency: Optional[int] = None, max_block_size: int = 8388608) → megfile.lib.s3_prefetch_reader.S3PrefetchReader[source]

Open an asynchronous prefetch reader, to support fast sequential read and random read

Note

User should make sure that reader / writer are closed correctly

Supports context manager

Some parameter setting may perform well: max_concurrency=10 or 20, max_block_size=8 or 16 MB, default value None means using global thread pool

Parameters
  • max_concurrency – Max download thread number, None by default

  • max_block_size – Max data size downloaded by each thread, in bytes, 8MB by default

Returns

An opened S3PrefetchReader object

Raises

S3FileNotFoundError

megfile.s3_remove(path: Union[str, os.PathLike], missing_ok: bool = False) → None[source]

Remove the file or directory on s3. Removing s3:// or s3://bucket is not permitted

Parameters
  • path – Given path

  • missing_ok – If False and the target file/directory does not exist, raise S3FileNotFoundError

Raises

S3PermissionError, S3FileNotFoundError, UnsupportedError

megfile.s3_rename(src_url: Union[str, os.PathLike], dst_url: Union[str, os.PathLike])[source]

Move s3 file path from src_url to dst_url

Parameters

dst_url – Given destination path

megfile.s3_move(src_url: Union[str, os.PathLike], dst_url: Union[str, os.PathLike]) → None[source]

Move file/directory path from src_url to dst_url

Parameters
  • src_url – Given path

  • dst_url – Given destination path

megfile.s3_sync(src_url: Union[str, os.PathLike], dst_url: Union[str, os.PathLike], followlinks: bool = False, force: bool = False) → None[source]

Copy file/directory on src_url to dst_url

Parameters
  • src_url – Given path

  • dst_url – Given destination path

  • followlinks – False if regard symlink as file, else True

  • force – Force the sync; do not skip files that are the same

megfile.s3_save_as(file_object: BinaryIO, path: Union[str, os.PathLike])[source]

Write the opened binary stream to specified path, but the stream won’t be closed

Parameters
  • path – Given path

  • file_object – Stream to be read

megfile.s3_scan_stat(path: Union[str, os.PathLike], missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat

Parameters
  • path – Given path

  • missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Raises

UnsupportedError

Returns

A file path generator

megfile.s3_scan(path: Union[str, os.PathLike], missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source]

Iteratively traverse only files in given s3 directory, in alphabetical order. Every iteration on generator yields a path string.

  • If s3_url is a file path, yield the file only

  • If s3_url is a non-existent path, return an empty generator

  • If s3_url is a bucket path, return all file paths in the bucket

  • If s3_url is an empty bucket, return an empty generator

  • If s3_url doesn’t contain any bucket, which is s3_url == ‘s3://’, raise UnsupportedError. walk() on complete s3 is not supported in megfile

Parameters
  • path – Given path

  • missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Raises

UnsupportedError

Returns

A file path generator

megfile.s3_scandir(path: Union[str, os.PathLike], followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Get all contents of given s3_url, the order of result is not guaranteed.

Parameters

path – Given path

Returns

All contents that have the prefix s3_url

Raises

S3FileNotFoundError, S3NotADirectoryError

megfile.s3_stat(path: Union[str, os.PathLike], follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of s3_url file, including file size and mtime, referring to s3_getsize and s3_getmtime

If s3_url is not an existent path, which means s3_exist(s3_url) returns False, raise S3FileNotFoundError.

If attempting to get the StatResult of the complete s3, such as s3_dir_url == ‘s3://’, raise S3BucketNotFoundError.

Parameters

path – Given path

Returns

StatResult

Raises

S3FileNotFoundError, S3BucketNotFoundError

megfile.s3_lstat(path: Union[str, os.PathLike]) → megfile.pathlike.StatResult[source]

Like Path.stat() but, if the path points to a symbolic link, return the symbolic link’s information rather than its target’s.

Remove the file on s3

Parameters
  • path – Given path

  • missing_ok – If False and the target file does not exist, raise S3FileNotFoundError

Raises

S3PermissionError, S3FileNotFoundError, S3IsADirectoryError

megfile.s3_upload(src_url: Union[str, os.PathLike], dst_url: Union[str, os.PathLike], callback: Optional[Callable[int, None]] = None, **kwargs) → None[source]

Uploads a file from local filesystem to s3.

Parameters
  • src_url – Source fs path

  • dst_url – Target s3 path

  • callback – Called periodically during copy, and the input parameter is the data size (in bytes) of copy since the last call

megfile.s3_walk(path: Union[str, os.PathLike], followlinks: bool = False) → Iterator[Tuple[str, List[str], List[str]]][source]

Iteratively traverse the given s3 directory, in top-bottom order. In other words, firstly traverse parent directory, if subdirectories exist, traverse the subdirectories in alphabetical order. Every iteration on generator yields a 3-tuple: (root, dirs, files)

  • root: Current s3 path;

  • dirs: Name list of subdirectories in current directory. The list is sorted by name in ascending alphabetical order;

  • files: Name list of files in current directory. The list is sorted by name in ascending alphabetical order;

  • If s3_url is a file path, return an empty generator

  • If s3_url is a non-existent path, return an empty generator

  • If s3_url is a bucket path, the bucket will be the top directory and will be returned at the first iteration of the generator

  • If s3_url is an empty bucket, only yield one 3-tuple (note: s3 doesn’t have empty directories)

  • If s3_url doesn’t contain any bucket, which is s3_url == ‘s3://’, raise UnsupportedError. walk() on complete s3 is not supported in megfile

Parameters
  • path – Given path

  • followlinks – The result is the same whether followlinks is True or False, because s3 symlinks do not support directories.

Raises

UnsupportedError

Returns

A 3-tuple generator

Create a symbolic link pointing to src_path named dst_path.

Parameters
  • src_path – Given path

  • dst_path – Destination path

Raises

S3NameTooLongError, S3BucketNotFoundError, S3IsADirectoryError

Return a string representing the path to which the symbolic link points.

Returns

Return a string representing the path to which the symbolic link points.

Raises

S3NameTooLongError, S3BucketNotFoundError, S3IsADirectoryError, S3NotALinkError

megfile.s3_concat(src_paths: List[Union[str, os.PathLike]], dst_path: Union[str, os.PathLike], block_size: int = 8388608, max_workers: int = 128) → None[source]

Concatenate s3 files to one file.

Parameters
  • src_paths – Given source paths

  • dst_path – Given destination path

megfile.is_fs(path: Union[PathLike, int]) → bool[source]

Test if a path is fs path

Parameters

path – Path to be tested

Returns

True if the path is an fs path, else False

megfile.fs_abspath(path: Union[str, os.PathLike]) → str[source]

Return the absolute path of given path

Parameters

path – Given path

Returns

Absolute path of given path

megfile.fs_access(path: Union[str, os.PathLike], mode: megfile.pathlike.Access = <Access.READ: 1>) → bool[source]

Test if path has the access permission described by mode, using os.access

Parameters
  • path – Given path

  • mode – access mode

Returns

bool, whether the path has the read/write access described by mode.

megfile.fs_exists(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if the path exists

Note

The difference between this function and os.path.exists is that this function regards a symlink as a file. In other words, this function is equivalent to os.path.lexists

Parameters
  • path – Given path

  • followlinks – False to treat a symlink as a file, else True

Returns

True if the path exists, else False
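
The note above can be illustrated with the standard library alone. The following is a minimal sketch, assuming a POSIX system where symlinks can be created; the helper name exists_treating_symlink_as_file is hypothetical and is not part of megfile.

```python
import os
import tempfile


def exists_treating_symlink_as_file(path: str) -> bool:
    """Hypothetical helper: a symlink counts as existing even if its target is gone."""
    return os.path.lexists(path)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.txt")
    link = os.path.join(tmp, "link.txt")
    with open(target, "w") as f:
        f.write("data")
    os.symlink(target, link)
    os.remove(target)  # the link is now dangling

    follows_link = os.path.exists(link)  # follows the link to the missing target
    link_as_file = exists_treating_symlink_as_file(link)  # looks at the link itself
```

With a dangling link, the two checks disagree, which is the difference the note describes.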

megfile.fs_getmtime(path: Union[str, os.PathLike], follow_symlinks: bool = False) → float[source]

Get last-modified time of the file on the given path (in Unix timestamp format). If the path is an existent directory, return the latest modified time of all files in it.

Parameters

path – Given path

Returns

last-modified time

megfile.fs_getsize(path: Union[str, os.PathLike], follow_symlinks: bool = False) → int[source]

Get file size on the given file path (in bytes). If the path is a directory, return the sum of the sizes of all files in it, including files in subdirectories (if any exist). The result excludes the size of the directory itself. In other words, return 0 bytes for an empty directory path.

Parameters

path – Given path

Returns

File size
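
The directory rule above (sum of file sizes, directory entries themselves excluded) can be sketched with os.walk. This is an illustration under stated assumptions, not megfile's implementation; the name total_size is hypothetical.

```python
import os
import tempfile


def total_size(path: str) -> int:
    """Hypothetical helper: size of a file, or the sum of all file sizes under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            # lstat does not follow symlinks, so a link counts as a small file
            total += os.lstat(os.path.join(root, name)).st_size
    return total


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    os.mkdir(os.path.join(tmp, "empty"))
    with open(os.path.join(tmp, "a.bin"), "wb") as f:
        f.write(b"abc")
    with open(os.path.join(tmp, "sub", "b.bin"), "wb") as f:
        f.write(b"12345")

    nested_size = total_size(tmp)  # 3 bytes + 5 bytes
    empty_size = total_size(os.path.join(tmp, "empty"))
```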

megfile.fs_glob_stat(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → Iterator[megfile.pathlike.FileEntry][source]

Return a list contains tuples of path and file stat, in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

A list of tuples of path and file stat, in which paths match pathname

megfile.fs_glob(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → List[str][source]

Return path list in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

A list of paths that match pathname
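
Because the result may contain the same path more than once (item 2 above), callers that need distinct paths can deduplicate while keeping the sorted order. A minimal sketch follows; the helper unique_paths is hypothetical, and the input list stands in for a glob result rather than being produced by megfile.

```python
from typing import Iterable, List


def unique_paths(paths: Iterable[str]) -> List[str]:
    """Hypothetical helper: drop repeated paths, keeping first-seen order."""
    # dict preserves insertion order, so the ascending order of the input survives
    return list(dict.fromkeys(paths))


# Stand-in for a glob result in which one path was matched twice
matched = ["/a/b/c/b/d.txt", "/a/b/c/b/d.txt", "/a/b/e.txt"]
distinct = unique_paths(matched)
```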

megfile.fs_iglob(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → Iterator[str][source]

Return path iterator in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

An iterator over paths that match pathname

megfile.fs_isabs(path: Union[str, os.PathLike]) → bool[source]

Test whether a path is absolute

Parameters

path – Given path

Returns

True if a path is absolute, else False

megfile.fs_isdir(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if a path is directory

Note

The difference between this function and os.path.isdir is that this function regards a symlink as a file

Parameters
  • path – Given path

  • followlinks – False to treat a symlink as a file, else True

Returns

True if the path is a directory, else False

megfile.fs_isfile(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if a path is file

Note

The difference between this function and os.path.isfile is that this function regards a symlink as a file

Parameters
  • path – Given path

  • followlinks – False to treat a symlink as a file, else True

Returns

True if the path is a file, else False

Test whether a path is a symbolic link

Parameters

path – Given path

Returns

If path is a symbolic link return True, else False

Return type

bool

megfile.fs_ismount(path: Union[str, os.PathLike]) → bool[source]

Test whether a path is a mount point

Parameters

path – Given path

Returns

True if a path is a mount point, else False

megfile.fs_listdir(path: Union[str, os.PathLike]) → List[str][source]

Get all contents of given fs path. The result is in ascending alphabetical order.

Parameters

path – Given path

Returns

All contents in the path, in ascending alphabetical order

megfile.fs_load_from(path: Union[str, os.PathLike]) → BinaryIO[source]

Read all content on specified path and write into memory

User should close the BinaryIO manually

Parameters

path – Given path

Returns

Binary stream

megfile.fs_makedirs(path: Union[str, os.PathLike], exist_ok: bool = False)[source]

Make a directory on fs, including parent directories

If there exists a file on the path, raise FileExistsError

Parameters
  • path – Given path

  • exist_ok – If False and target directory exists, raise FileExistsError

Raises

FileExistsError

megfile.fs_realpath(path: Union[str, os.PathLike]) → str[source]

Return the real path of given path

Parameters

path – Given path

Returns

Real path of given path

megfile.fs_relpath(path: Union[str, os.PathLike], start: Optional[str] = None) → str[source]

Return the relative path of given path

Parameters
  • path – Given path

  • start – Given start directory

Returns

Relative path from start

megfile.fs_remove(path: Union[str, os.PathLike], missing_ok: bool = False) → None[source]

Remove the file or directory on fs

Parameters
  • path – Given path

  • missing_ok – If False and the target file/directory does not exist, raise FileNotFoundError

megfile.fs_rename(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike]) → None[source]

rename file on fs

Parameters
  • src_path – Given path

  • dst_path – Given destination path

megfile.fs_move(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike]) → None[source]

rename file on fs

Parameters
  • src_path – Given path

  • dst_path – Given destination path

megfile.fs_sync(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike], followlinks: bool = False, force: bool = False) → None[source]

Copy file/directory on src_path to dst_path

Parameters
  • src_path – Given path

  • dst_path – Target file path

  • followlinks – False to treat a symlink as a file, else True

  • force – Force the sync; files that are already identical are not skipped

megfile.fs_save_as(file_object: BinaryIO, path: Union[str, os.PathLike])[source]

Write the opened binary stream to path. If the parent directory of path doesn’t exist, it will be created.

Parameters
  • path – Given path

  • file_object – stream to be read
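
The behaviour described for fs_save_as (create the missing parent directory, then write the stream) can be sketched with the standard library. This is an illustration only; the helper name save_stream is hypothetical, and leaving the stream open for the caller is an assumption of this sketch.

```python
import io
import os
import shutil
import tempfile
from typing import BinaryIO


def save_stream(file_object: BinaryIO, path: str) -> None:
    """Hypothetical helper: write a binary stream to path, creating parent directories."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "wb") as dst:
        # Copy in chunks; the caller keeps ownership of file_object in this sketch
        shutil.copyfileobj(file_object, dst)


with tempfile.TemporaryDirectory() as tmp:
    stream = io.BytesIO(b"hello")
    out_path = os.path.join(tmp, "new", "nested", "out.bin")
    save_stream(stream, out_path)
    with open(out_path, "rb") as f:
        saved = f.read()
    stream_still_open = not stream.closed
```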

megfile.fs_scan_stat(path: Union[str, os.PathLike], missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat

Parameters
  • path – Given path

  • missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Returns

A file path generator

megfile.fs_scan(path: Union[str, os.PathLike], missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a path string.

  • If path is a file path, yield the file only

  • If path is a non-existent path, return an empty generator

  • If path is a bucket path, return all file paths in the bucket

Parameters
  • path – Given path

  • missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Returns

A file path generator

megfile.fs_scandir(path: Union[str, os.PathLike]) → Iterator[megfile.pathlike.FileEntry][source]

Get all content of given file path.

Parameters

path – Given path

Returns

An iterator over all contents that have the prefix path

megfile.fs_stat(path: Union[str, os.PathLike], follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of file on fs, including file size and mtime, referring to fs_getsize and fs_getmtime

Parameters

path – Given path

Returns

StatResult

megfile.fs_lstat(path: Union[str, os.PathLike]) → megfile.pathlike.StatResult[source]

Like Path.stat() but, if the path points to a symbolic link, return the symbolic link’s information rather than its target’s.

Parameters

path – Given path

Returns

StatResult

Remove the file on fs

Parameters
  • path – Given path

  • missing_ok – If False and the target file does not exist, raise FileNotFoundError

megfile.fs_walk(path: Union[str, os.PathLike], followlinks: bool = False) → Iterator[Tuple[str, List[str], List[str]]][source]

Generate the file names in a directory tree by walking the tree top-down. For each directory in the tree rooted at directory path (including path itself), it yields a 3-tuple (root, dirs, files).

  • root: a string of the current path

  • dirs: name list of subdirectories (excluding ‘.’ and ‘..’ if they exist) in ‘root’. The list is sorted in ascending alphabetical order

  • files: name list of non-directory files (a link is regarded as a file) in ‘root’. The list is sorted in ascending alphabetical order

If path does not exist, or path is a file (a link is regarded as a file), return an empty generator

Note

Be aware that setting followlinks to True can lead to infinite recursion if a link points to a parent directory of itself. fs_walk() does not keep track of the directories it visited already.

Parameters
  • path – Given path

  • followlinks – False to treat a symlink as a file, else True

Returns

A 3-tuple generator
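
The traversal rules above (sorted dirs and files, links counted as files, empty result for a file or missing path) can be approximated on top of os.walk. The sketch below assumes followlinks is False and is not megfile's implementation; sorted_walk is a hypothetical name.

```python
import os
import tempfile
from typing import Iterator, List, Tuple


def sorted_walk(path: str) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Hypothetical helper: top-down walk with sorted names and links treated as files."""
    # A missing path, a regular file, or a symlink yields nothing
    if os.path.islink(path) or not os.path.isdir(path):
        return
    for root, dirs, files in os.walk(path):
        # os.walk reports symlinked directories under dirs; move them to files
        links = [d for d in dirs if os.path.islink(os.path.join(root, d))]
        # Assigning in place also fixes the order in which os.walk descends
        dirs[:] = sorted(d for d in dirs if d not in links)
        yield root, dirs, sorted(files + links)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("b", "a"):
        os.mkdir(os.path.join(tmp, name))
    for name in ("z.txt", "y.txt"):
        open(os.path.join(tmp, name), "w").close()

    top = tmp
    walked = list(sorted_walk(tmp))
    from_file = list(sorted_walk(os.path.join(tmp, "y.txt")))
```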

megfile.fs_cwd() → str[source]

Return current working directory

Returns

Current working directory

megfile.fs_home()[source]

Return the home directory

Returns

Home directory path

megfile.fs_expanduser(path: Union[str, os.PathLike])[source]

Expand ~ and ~user constructions. If user or $HOME is unknown, do nothing.

megfile.fs_resolve(path: Union[str, os.PathLike]) → str[source]

Equal to fs_realpath, return the real path of given path

Parameters

path – Given path

Returns

Real path of given path

megfile.fs_getmd5(path: Union[str, os.PathLike], recalculate: bool = False, followlinks: bool = True)[source]

Calculate the md5 value of the file

Parameters
  • path – Given path

  • recalculate – Ignore this parameter, just for compatibility

  • followlinks – Ignore this parameter, just for compatibility

Returns

md5 of file
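
For orientation, a file's md5 can be computed in fixed-size chunks with hashlib so that large files are not read into memory at once. This is a minimal sketch, not megfile's implementation; the name file_md5, the chunk size, and the choice of a hex digest as the result are assumptions.

```python
import hashlib
import os
import tempfile


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hypothetical helper: hex md5 of a file, read in chunks of chunk_size bytes."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.bin")
    with open(sample_path, "wb") as f:
        f.write(b"hello" * 1000)
    chunked_digest = file_md5(sample_path, chunk_size=64)
    whole_digest = hashlib.md5(b"hello" * 1000).hexdigest()
```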

Create a symbolic link pointing to src_path named dst_path.

Parameters
  • src_path – Given path

  • dst_path – Destination path

Return a string representing the path to which the symbolic link points.

Returns

Return a string representing the path to which the symbolic link points.

megfile.is_http(path: Union[str, os.PathLike]) → bool[source]

http scheme definition: http(s)://domain/path

Parameters

path – Path to be tested

Returns

True if path is http url, else False

megfile.http_open(path: Union[str, os.PathLike], mode: str = 'rb', *, encoding: Optional[str] = None, errors: Optional[str] = None, max_concurrency: Optional[int] = None, max_buffer_size: int = 134217728, forward_ratio: Optional[float] = None, block_size: int = 8388608, **kwargs) → Union[_io.BufferedReader, megfile.lib.http_prefetch_reader.HttpPrefetchReader][source]

Open a BytesIO to read binary data of given http(s) url

Note

Essentially, it reads data of http(s) url to memory by requests, and then return BytesIO to user.

Parameters
  • path – Given path

  • mode – Only supports ‘rb’ mode now

  • encoding – encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode.

  • errors – errors is an optional string that specifies how encoding and decoding errors are to be handled—this cannot be used in binary mode.

  • max_concurrency – Max download thread number, None by default

  • max_buffer_size – Max cached buffer size in memory, 128MB by default

  • block_size – Size of single block, 8MB by default. Each block will be uploaded or downloaded by single thread.

Returns

BytesIO initialized with http(s) data
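
To make the block_size parameter concrete, the sketch below shows one way a resource of known size could be divided into inclusive byte ranges, one per block. It is an illustration under stated assumptions, not how megfile schedules its downloads; block_ranges is a hypothetical name.

```python
from typing import List, Tuple


def block_ranges(total_size: int, block_size: int = 8 * 1024 * 1024) -> List[Tuple[int, int]]:
    """Hypothetical helper: inclusive (start, end) byte ranges covering total_size bytes."""
    return [
        (start, min(start + block_size, total_size) - 1)
        for start in range(0, total_size, block_size)
    ]


# A 20-byte resource split into 8-byte blocks: two full blocks and one short block
small_example = block_ranges(20, block_size=8)
```

Each range could then be fetched by a separate worker, which is the role the parameter description assigns to a single thread.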

megfile.http_stat(path: Union[str, os.PathLike], follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of http_url response, including size and mtime, referring to http_getsize and http_getmtime

Parameters
  • path – Given path

  • follow_symlinks – Ignore this parameter, just for compatibility

Returns

StatResult

Raises

HttpPermissionError, HttpFileNotFoundError

megfile.http_getsize(path: Union[str, os.PathLike], follow_symlinks: bool = False) → int[source]

Get file size on the given http_url path.

If the http response headers do not include Content-Length, return None

Parameters
  • path – Given path

  • follow_symlinks – Ignore this parameter, just for compatibility

Returns

File size (in bytes)

Raises

HttpPermissionError, HttpFileNotFoundError

megfile.http_getmtime(path: Union[str, os.PathLike], follow_symlinks: bool = False) → float[source]

Get Last-Modified time of the http request on the given http_url path.

If the http response headers do not include Last-Modified, return None

Parameters
  • path – Given path

  • follow_symlinks – Ignore this parameter, just for compatibility

Returns

Last-Modified time (in Unix timestamp format)

Raises

HttpPermissionError, HttpFileNotFoundError
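
The conversion from a Last-Modified header to a Unix timestamp can be sketched with the standard library. This is an illustration only, not megfile's implementation; the helper last_modified_timestamp and the plain dict of headers are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def last_modified_timestamp(headers: Mapping[str, str]) -> Optional[float]:
    """Hypothetical helper: Unix timestamp from a Last-Modified header, or None if absent."""
    value = headers.get("Last-Modified")
    if value is None:
        return None
    # HTTP dates use the RFC 5322 style format that email.utils can parse
    return parsedate_to_datetime(value).timestamp()


with_header = last_modified_timestamp({"Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"})
without_header = last_modified_timestamp({})
expected = datetime(2015, 10, 21, 7, 28, tzinfo=timezone.utc).timestamp()
```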

megfile.http_exists(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if http path exists

Parameters
  • path – Given path

  • followlinks (bool, optional) – ignore this parameter, just for compatibility

Returns

return True if exists

Return type

bool

megfile.is_stdio(path: Union[str, os.PathLike]) → bool[source]

stdio scheme definition: stdio://-

Note

Only tests protocol

Parameters

path – Path to be tested

Returns

True if the path is a stdio url, else False

megfile.stdio_open(path: Union[str, os.PathLike], mode: str = 'rb', encoding: Optional[str] = None, errors: Optional[str] = None, **kwargs) → IO[AnyStr][source]

Used to read or write stdio

Note

Essentially invoke sys.stdin.buffer | sys.stdout.buffer to read or write

Parameters
  • path – Given path

  • mode – Only supports ‘rb’ and ‘wb’ now

Returns

STDReader, STDWriter

megfile.is_sftp(path: Union[str, os.PathLike]) → bool[source]

Test if a path is sftp path

Parameters

path – Path to be tested

Returns

True if the path is an sftp path, else False

Return a SftpPath instance representing the path to which the symbolic link points.

Parameters

path – Given path

Returns

Return a SftpPath instance representing the path to which the symbolic link points.

megfile.sftp_absolute(path: Union[str, os.PathLike]) → megfile.sftp_path.SftpPath[source]

Make the path absolute, without normalization or resolving symlinks. Returns a new path object

megfile.sftp_glob(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → List[str][source]

Return path list in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • path – Given path

  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

A list of paths that match pathname

megfile.sftp_iglob(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → Iterator[str][source]

Return path iterator in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • path – Given path

  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

An iterator over paths that match pathname

megfile.sftp_glob_stat(path: Union[str, os.PathLike], recursive: bool = True, missing_ok: bool = True) → Iterator[megfile.pathlike.FileEntry][source]

Return a list contains tuples of path and file stat, in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. sftp_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • path – Given path

  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

A list of tuples of path and file stat, in which paths match pathname

megfile.sftp_resolve(path: Union[str, os.PathLike], strict=False) → str[source]

Equal to fs_realpath

Parameters
  • path – Given path

  • strict – Ignore this parameter, just for compatibility

Returns

Return the canonical path of the specified filename, eliminating any symbolic links encountered in the path.

Return type

SftpPath

megfile.sftp_isdir(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if a path is directory

Note

The difference between this function and os.path.isdir is that this function regards a symlink as a file

Parameters
  • path – Given path

  • followlinks – False to treat a symlink as a file, else True

Returns

True if the path is a directory, else False

megfile.sftp_exists(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if the path exists

Parameters
  • path – Given path

  • followlinks – False to treat a symlink as a file, else True

Returns

True if the path exists, else False

megfile.sftp_scandir(path: Union[str, os.PathLike]) → Iterator[megfile.pathlike.FileEntry][source]

Get all content of given file path.

Parameters

path – Given path

Returns

An iterator over all contents that have the prefix path

megfile.sftp_getmtime(path: Union[str, os.PathLike], follow_symlinks: bool = False) → float[source]

Get last-modified time of the file on the given path (in Unix timestamp format). If the path is an existent directory, return the latest modified time of all files in it.

Parameters

path – Given path

Returns

last-modified time

megfile.sftp_getsize(path: Union[str, os.PathLike], follow_symlinks: bool = False) → int[source]

Get file size on the given file path (in bytes). If the path is a directory, return the sum of the sizes of all files in it, including files in subdirectories (if any exist). The result excludes the size of the directory itself. In other words, return 0 bytes for an empty directory path.

Parameters

path – Given path

Returns

File size

megfile.sftp_isfile(path: Union[str, os.PathLike], followlinks: bool = False) → bool[source]

Test if a path is file

Note

The difference between this function and os.path.isfile is that this function regards a symlink as a file

Parameters
  • path – Given path

  • followlinks – False to treat a symlink as a file, else True

Returns

True if the path is a file, else False

megfile.sftp_listdir(path: Union[str, os.PathLike]) → List[str][source]

Get all contents of given sftp path. The result is in ascending alphabetical order.

Parameters

path – Given path

Returns

All contents in the path, in ascending alphabetical order

megfile.sftp_load_from(path: Union[str, os.PathLike]) → BinaryIO[source]

Read all content on specified path and write into memory

User should close the BinaryIO manually

Parameters

path – Given path

Returns

Binary stream

megfile.sftp_makedirs(path: Union[str, os.PathLike], mode=511, parents: bool = False, exist_ok: bool = False)[source]

Make a directory on sftp, including parent directories

If there exists a file on the path, raise FileExistsError

Parameters
  • path – Given path

  • mode – If mode is given, it is combined with the process’ umask value to determine the file mode and access flags.

  • parents – If parents is true, any missing parents of this path are created as needed; if parents is false (the default), a missing parent raises FileNotFoundError.

  • exist_ok – If False and target directory exists, raise FileExistsError

Raises

FileExistsError

megfile.sftp_realpath(path: Union[str, os.PathLike]) → str[source]

Return the real path of given path

Parameters

path – Given path

Returns

Real path of given path

megfile.sftp_rename(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike]) → megfile.sftp_path.SftpPath[source]

rename file on sftp

Parameters
  • src_path – Given path

  • dst_path – Given destination path

megfile.sftp_move(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike]) → megfile.sftp_path.SftpPath[source]

move file on sftp

Parameters
  • src_path – Given path

  • dst_path – Given destination path

megfile.sftp_remove(path: Union[str, os.PathLike], missing_ok: bool = False) → None[source]

Remove the file or directory on sftp

Parameters
  • path – Given path

  • missing_ok – If False and the target file/directory does not exist, raise FileNotFoundError

megfile.sftp_scan(path: Union[str, os.PathLike], missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a path string.

  • If path is a file path, yield the file only

  • If path is a non-existent path, return an empty generator

  • If path is a bucket path, return all file paths in the bucket

Parameters
  • path – Given path

  • missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Returns

A file path generator

megfile.sftp_scan_stat(path: Union[str, os.PathLike], missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat

Parameters
  • path – Given path

  • missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Returns

A file path generator

megfile.sftp_stat(path: Union[str, os.PathLike], follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of file on sftp, including file size and mtime, referring to fs_getsize and fs_getmtime

Parameters

path – Given path

Returns

StatResult

megfile.sftp_lstat(path: Union[str, os.PathLike]) → megfile.pathlike.StatResult[source]

Get StatResult of file on sftp, including file size and mtime, referring to fs_getsize and fs_getmtime

Parameters

path – Given path

Returns

StatResult

Remove the file on sftp

Parameters
  • path – Given path

  • missing_ok – if False and target file does not exist, raise FileNotFoundError

megfile.sftp_walk(path: Union[str, os.PathLike], followlinks: bool = False) → Iterator[Tuple[str, List[str], List[str]]][source]

Generate the file names in a directory tree by walking the tree top-down. For each directory in the tree rooted at directory path (including path itself), it yields a 3-tuple (root, dirs, files).

  • root: a string of current path

  • dirs: name list of subdirectories (excluding ‘.’ and ‘..’ if they exist) in ‘root’. The list is sorted in ascending alphabetical order

  • files: name list of non-directory files (link is regarded as file) in ‘root’. The list is sorted in ascending alphabetical order

If path does not exist, or path is a file (link is regarded as file), return an empty generator

Note

Be aware that setting followlinks to True can lead to infinite recursion if a link points to a parent directory of itself. fs_walk() does not keep track of the directories it visited already.

Parameters
  • path – Given path

  • followlinks – False if regard symlink as file, else True

Returns

A 3-tuple generator
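
The shape of the yielded 3-tuples can be illustrated with the standard library on a local tree. This is a rough sketch under the assumption that os.walk is an acceptable local analogue; sorted_walk is a hypothetical name, and the symlink handling described above is not reproduced here.

```python
import os
import tempfile
from typing import Iterator, List, Tuple


def sorted_walk(path: str) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Hypothetical local stand-in: top-down walk with sorted dirs and files."""
    if not os.path.isdir(path):
        return  # missing path or plain file: empty generator
    for root, dirs, files in os.walk(path, topdown=True):
        dirs.sort()  # sort in place so os.walk descends in alphabetical order
        yield root, list(dirs), sorted(files)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a"))
    os.makedirs(os.path.join(tmp, "b"))
    for rel in ("z.txt", os.path.join("a", "x.txt")):
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("x")
    tree = [(os.path.relpath(root, tmp), dirs, files)
            for root, dirs, files in sorted_walk(tmp)]
    on_file = list(sorted_walk(os.path.join(tmp, "z.txt")))
```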

megfile.sftp_path_join(path: Union[str, os.PathLike], *other_paths: Union[str, os.PathLike]) → str[source]

Concatenate two or more paths into a complete path

Parameters
  • path – Given path

  • other_paths – Paths to be concatenated

Returns

Concatenated complete path

Note

The difference between this function and os.path.join is that this function ignores a leading slash (which indicates an absolute path) in other_paths and concatenates directly. e.g. os.path.join('/path', 'to', '/file') => '/file', but sftp_path_join('/path', 'to', '/file') => '/path/to/file'
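
The contrast in this note can be reproduced without an sftp connection. The sketch below is only an illustration of the described semantics, not megfile's code; join_keep_prefix is a hypothetical helper, and posixpath.join is used so the standard-library result is the same on every platform.

```python
import posixpath


def join_keep_prefix(path: str, *other_paths: str) -> str:
    """Hypothetical stand-in: drop leading slashes of later components, then join."""
    result = path
    for other in other_paths:
        result = result.rstrip("/") + "/" + other.lstrip("/")
    return result


# The standard library restarts from the last absolute component.
stdlib_result = posixpath.join("/path", "to", "/file")
# The sketch keeps every component, as the note describes.
sketch_result = join_keep_prefix("/path", "to", "/file")
```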

megfile.sftp_getmd5(path: Union[str, os.PathLike], recalculate: bool = False, followlinks: bool = True)[source]

Calculate the md5 value of the file

Parameters
  • path – Given path

  • recalculate – Ignore this parameter, just for compatibility

  • followlinks – Ignore this parameter, just for compatibility

Returns

md5 of file
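
Computing an md5 over a file-like object is usually done in chunks so the whole file never has to sit in memory. The following is a generic sketch with hashlib, not the library's implementation; stream_md5 and its chunk_size parameter are hypothetical.

```python
import hashlib
import io
from typing import BinaryIO


def stream_md5(stream: BinaryIO, chunk_size: int = 8192) -> str:
    """Hypothetical helper: hex md5 of a binary stream, read chunk by chunk."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


payload = b"megfile" * 5000
chunked = stream_md5(io.BytesIO(payload))
one_shot = hashlib.md5(payload).hexdigest()
```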

Create a symbolic link pointing to src_path named dst_path.

Parameters
  • src_path – Given path

  • dst_path – Destination path

Test whether a path is a symbolic link

Parameters

path – Given path

Returns

If path is a symbolic link return True, else False

Return type

bool

megfile.sftp_save_as(file_object: BinaryIO, path: Union[str, os.PathLike])[source]

Write the opened binary stream to path. If the parent directory of path doesn’t exist, it will be created.

Parameters
  • path – Given path

  • file_object – stream to be read
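
The behaviour described here (create missing parent directories, then write the stream) can be sketched on the local filesystem. This is an assumption-laden illustration, not the sftp implementation; save_stream is a hypothetical name.

```python
import io
import os
import shutil
import tempfile
from typing import BinaryIO


def save_stream(file_object: BinaryIO, path: str) -> None:
    """Hypothetical local stand-in: write a binary stream to path."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # create missing parent directories
    with open(path, "wb") as dst:
        shutil.copyfileobj(file_object, dst)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "new", "nested", "out.bin")
    source = io.BytesIO(b"payload")
    save_stream(source, target)
    with open(target, "rb") as f:
        written = f.read()
    still_open = not source.closed  # the sketch leaves the source stream open
```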

megfile.sftp_open(path: Union[str, os.PathLike], mode: str = 'r', buffering=-1, encoding: Optional[str] = None, errors: Optional[str] = None, **kwargs) → IO[AnyStr][source]

Open a file on the path.

Parameters
  • path – Given path

  • mode – Mode to open file

  • buffering – buffering is an optional integer used to set the buffering policy.

  • encoding – encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode.

  • errors – errors is an optional string that specifies how encoding and decoding errors are to be handled—this cannot be used in binary mode.

Returns

File-Like object

megfile.sftp_chmod(path: Union[str, os.PathLike], mode: int, follow_symlinks: bool = True)[source]

Change the file mode and permissions, like os.chmod().

Parameters
  • path – Given path

  • mode – the file mode you want to change

  • follow_symlinks – Ignore this parameter, just for compatibility

megfile.sftp_rmdir(path: Union[str, os.PathLike])[source]

Remove this directory. The directory must be empty.

megfile.sftp_copy(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike], callback: Optional[Callable[int, None]] = None, followlinks: bool = False)[source]

Copy the file to the given destination path.

Parameters
  • src_path – Given path

  • dst_path – The destination path to copy the file to.

  • callback – An optional callback function that takes an integer parameter and is called periodically during the copy operation to report the number of bytes copied.

  • followlinks – Whether to follow symbolic links when copying directories.

Raises
  • IsADirectoryError – If the source is a directory.

  • OSError – If there is an error copying the file.
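
A callback of this kind is typically driven by a chunked copy loop. The sketch below works on local files and assumes the callback receives the number of bytes copied since the previous call (as documented for smart_copy); copy_with_callback and chunk_size are hypothetical, and this is not the sftp implementation.

```python
import os
import tempfile
from typing import Callable, List, Optional


def copy_with_callback(src: str, dst: str,
                       callback: Optional[Callable[[int], None]] = None,
                       chunk_size: int = 4) -> None:
    """Hypothetical local stand-in: chunked copy that reports progress."""
    if os.path.isdir(src):
        raise IsADirectoryError(src)
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        while True:
            chunk = fsrc.read(chunk_size)
            if not chunk:
                break
            fdst.write(chunk)
            if callback is not None:
                callback(len(chunk))  # bytes copied since the previous call


reported: List[int] = []
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.bin")
    dst = os.path.join(tmp, "dst.bin")
    with open(src, "wb") as f:
        f.write(b"0123456789")
    copy_with_callback(src, dst, callback=reported.append)
    copied_size = os.path.getsize(dst)
    try:
        copy_with_callback(tmp, dst)
        dir_rejected = False
    except IsADirectoryError:
        dir_rejected = True
```

Summing the reported values gives the total number of bytes copied, which is what a progress bar would accumulate.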

megfile.sftp_sync(src_path: Union[str, os.PathLike], dst_path: Union[str, os.PathLike], followlinks: bool = False, force: bool = False)[source]

Copy file/directory from src_path to dst_path

Parameters
  • src_path – Given path

  • dst_path – Given destination path

  • followlinks – False if regard symlink as file, else True

  • force – Force the sync; do not skip files that are the same

megfile.sftp_concat(src_paths: List[Union[str, os.PathLike]], dst_path: Union[str, os.PathLike]) → None[source]

Concatenate sftp files to one file.

Parameters
  • src_paths – Given source paths

  • dst_path – Given destination path

class megfile.S3Path(path: Union[str, os.PathLike], *other_paths: Union[str, os.PathLike])[source]

Bases: megfile.pathlike.URIPath

absolute() → megfile.s3_path.S3Path[source]

Make the path absolute, without normalization or resolving symlinks. Returns a new path object

access(mode: megfile.pathlike.Access = <Access.READ: 1>, followlinks: bool = False) → bool[source]

Test if path has access permission described by mode

Parameters

mode – access mode

Returns

bool, if the bucket of s3_url has read/write access.

copy(dst_url: Union[str, os.PathLike], followlinks: bool = False, callback: Optional[Callable[int, None]] = None) → None[source]

File copy on S3. Copy the content of the file at this path to dst_url. It’s the caller’s responsibility to ensure that s3_isfile(src_url) == True

Parameters
  • dst_url – Target file path

  • callback – Called periodically during copy, and the input parameter is the data size (in bytes) of copy since the last call

cwd() → megfile.s3_path.S3Path[source]

Return current working directory

Returns

Current working directory

exists(followlinks: bool = False) → bool[source]

Test if s3_url exists

If the bucket of s3_url is not readable, return False

Returns

True if s3_url exists, else False

getmtime(follow_symlinks: bool = False) → float[source]

Get last-modified time of the file on the given s3_url path (in Unix timestamp format). If the path is an existent directory, return the latest modified time of all files in it. The mtime of an empty directory is 1970-01-01 00:00:00

If s3_url is not an existent path, which means s3_exist(s3_url) returns False, then raise S3FileNotFoundError

Returns

Last-modified time

Raises

S3FileNotFoundError, UnsupportedError

getsize(follow_symlinks: bool = False) → int[source]

Get file size on the given s3_url path (in bytes). If the path is a directory, return the sum of the sizes of all files in it, including files in subdirectories (if any). The result excludes the size of the directory itself. In other words, return 0 bytes for an empty directory path.

If s3_url is not an existent path, which means s3_exist(s3_url) returns False, then raise S3FileNotFoundError

Returns

File size

Raises

S3FileNotFoundError, UnsupportedError

glob(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → List[megfile.s3_path.S3Path][source]

Return a list of s3 paths that match the glob pattern, in ascending alphabetical order.

Note: globbing only works within a bucket. If trying to match the bucket with wildcard characters, raise UnsupportedError

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Raises

UnsupportedError, when bucket part contains wildcard characters

Returns

A list of paths that match s3_pathname

glob_stat(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Return a generator of tuples of path and file stat, in ascending alphabetical order, for paths that match the glob pattern.

Note: globbing only works within a bucket. If trying to match the bucket with wildcard characters, raise UnsupportedError

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Raises

UnsupportedError, when bucket part contains wildcard characters

Returns

A generator of tuples of path and file stat, for paths that match s3_pathname

hasbucket() → bool[source]

Test if the bucket of s3_url exists

Returns

True if bucket of s3_url exists, else False

iglob(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.s3_path.S3Path][source]

Return an iterator over s3 paths that match the glob pattern, in ascending alphabetical order.

Note: globbing only works within a bucket. If trying to match the bucket with wildcard characters, raise UnsupportedError

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Raises

UnsupportedError, when bucket part contains wildcard characters

Returns

An iterator over paths that match s3_pathname

is_dir(followlinks: bool = False) → bool[source]

Test if an s3 url is a directory. The url is treated as a directory in either of these cases:

  • There exists a suffix for which os.path.join(s3_url, suffix) is a file

  • The url is an empty bucket or s3://

Parameters

followlinks – The result is the same whether followlinks is True or False, because s3 symlinks do not support directories.

Returns

True if path is s3 directory, else False
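
Because an object store has no real directories, the suffix rule amounts to a prefix check over keys. The following is a toy sketch over an in-memory key list, not a call into S3 or megfile; looks_like_dir is a hypothetical name, and the empty-bucket and s3:// cases are not modelled.

```python
from typing import Iterable


def looks_like_dir(keys: Iterable[str], path: str) -> bool:
    """Hypothetical check: some key lives strictly below path + '/'."""
    prefix = path.rstrip("/") + "/"
    return any(key.startswith(prefix) for key in keys)


keys = ["bucket/data/a.txt", "bucket/data/sub/b.txt"]
checks = {
    "bucket/data": looks_like_dir(keys, "bucket/data"),
    "bucket/data/a.txt": looks_like_dir(keys, "bucket/data/a.txt"),
    "bucket/dat": looks_like_dir(keys, "bucket/dat"),
}
```

Appending the separator before comparing keeps a partial name such as bucket/dat from matching bucket/data.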

is_file(followlinks: bool = False) → bool[source]

Test if an s3_url is file

Returns

True if path is s3 file, else False

Test whether a path is link

Returns

True if a path is link, else False

Raises

S3NotALinkError

iterdir(followlinks: bool = False) → Iterator[megfile.s3_path.S3Path][source]

Get all contents of given s3_url. The result is in ascending alphabetical order.

Returns

All contents that have the prefix s3_url, in ascending alphabetical order

Raises

S3FileNotFoundError, S3NotADirectoryError

listdir(followlinks: bool = False) → List[str][source]

Get all contents of given s3_url. The result is in ascending alphabetical order.

Returns

All contents that have the prefix s3_url, in ascending alphabetical order

Raises

S3FileNotFoundError, S3NotADirectoryError

load(followlinks: bool = False) → BinaryIO[source]

Read all content in binary on specified path and write into memory

User should close the BinaryIO manually

Returns

BinaryIO

lstat() → megfile.pathlike.StatResult[source]

Like Path.stat() but, if the path points to a symbolic link, return the symbolic link’s information rather than its target’s.

md5(recalculate: bool = False, followlinks: bool = False) → str[source]

Get the md5 meta info of files that were uploaded/copied via megfile

If meta info is lost or non-existent, return None

Parameters
  • recalculate – calculate md5 in real-time or return s3 etag

  • followlinks – If True, calculate md5 for the real file

Returns

md5 meta info

mkdir(mode=511, parents: bool = False, exist_ok: bool = False)[source]

Create an s3 directory. Creating a bare directory is not possible because OSS does not support it, so this function only tests whether the target bucket has WRITE access.

Parameters
  • mode – Ignored; present only for compatibility with pathlib.Path

  • parents – Ignored; present only for compatibility with pathlib.Path

  • exist_ok – If False and target directory exists, raise S3FileExistsError

Raises

S3BucketNotFoundError, S3FileExistsError

move(dst_url: Union[str, os.PathLike]) → None[source]

Move file/directory path from src_url to dst_url

Parameters

dst_url – Given destination path

open(mode: str = 'r', *, encoding: Optional[str] = None, errors: Optional[str] = None, s3_open_func: Callable[[str, str], BinaryIO] = <function s3_buffered_open>, **kwargs) → IO[AnyStr][source]

Open the file with mode.

path_with_protocol[source]

Return path with protocol, like file:///root, s3://bucket/key

path_without_protocol[source]

Return path without protocol, example: if path is s3://bucket/key, return bucket/key

protocol = 's3'

Return a S3Path instance representing the path to which the symbolic link points.

Returns

Return a S3Path instance representing the path to which the symbolic link points.

Raises

S3NameTooLongError, S3BucketNotFoundError, S3IsADirectoryError, S3NotALinkError

remove(missing_ok: bool = False) → None[source]

Remove the file or directory on s3, s3:// and s3://bucket are not permitted to remove

Parameters

missing_ok – if False and target file/directory does not exist, raise S3FileNotFoundError

Raises

S3PermissionError, S3FileNotFoundError, UnsupportedError

rename(dst_path: Union[str, os.PathLike]) → megfile.s3_path.S3Path[source]

Move s3 file path from src_url to dst_url

Parameters

dst_path – Given destination path

save(file_object: BinaryIO)[source]

Write the opened binary stream to specified path, but the stream won’t be closed

Parameters

file_object – Stream to be read

scan(missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source]

Iteratively traverse only files in given s3 directory, in alphabetical order. Every iteration on generator yields a path string.

  • If s3_url is a file path, yield the file only

  • If s3_url is a non-existent path, return an empty generator

  • If s3_url is a bucket path, return all file paths in the bucket

  • If s3_url is an empty bucket, return an empty generator

  • If s3_url doesn’t contain any bucket, which is s3_url == ‘s3://’, raise UnsupportedError. walk() on complete s3 is not supported in megfile

Parameters

missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Raises

UnsupportedError

Returns

A file path generator

scan_stat(missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat

Parameters

missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Raises

UnsupportedError

Returns

A generator of path and file stat entries

scandir(followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Get all contents of given s3_url, the order of result is not guaranteed.

Returns

All contents have prefix of s3_url

Raises

S3FileNotFoundError, S3NotADirectoryError

stat(follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of s3_url file, including file size and mtime, referring to s3_getsize and s3_getmtime

If s3_url is not an existent path, which means s3_exist(s3_url) returns False, then raise S3FileNotFoundError.

If attempting to get the StatResult of complete s3, such as s3_dir_url == ‘s3://’, raise S3BucketNotFoundError.

Returns

StatResult

Raises

S3FileNotFoundError, S3BucketNotFoundError

Create a symbolic link pointing to src_path named dst_path.

Parameters

dst_path – Destination path

Raises

S3NameTooLongError, S3BucketNotFoundError, S3IsADirectoryError

sync(dst_url: Union[str, os.PathLike], followlinks: bool = False, force: bool = False) → None[source]

Copy file/directory on src_url to dst_url

Parameters
  • dst_url – Given destination path

  • followlinks – False if regard symlink as file, else True

  • force – Force the sync; do not skip files that are the same

Remove the file on s3

Parameters

missing_ok – if False and target file does not exist, raise S3FileNotFoundError

Raises

S3PermissionError, S3FileNotFoundError, S3IsADirectoryError

walk(followlinks: bool = False) → Iterator[Tuple[str, List[str], List[str]]][source]

Iteratively traverse the given s3 directory in top-down order: the parent directory is traversed first, then its subdirectories (if any) in alphabetical order. Every iteration on generator yields a 3-tuple: (root, dirs, files)

  • root: Current s3 path;

  • dirs: Name list of subdirectories in current directory. The list is sorted by name in ascending alphabetical order;

  • files: Name list of files in current directory. The list is sorted by name in ascending alphabetical order;

  • If s3_url is a file path, return an empty generator

  • If s3_url is a non-existent path, return an empty generator

  • If s3_url is a bucket path, bucket will be the top directory, and will be returned at first iteration of generator

  • If s3_url is an empty bucket, only yield one 3-tuple (notes: s3 doesn’t have empty directory)

  • If s3_url doesn’t contain any bucket, which is s3_url == ‘s3://’, raise UnsupportedError. walk() on complete s3 is not supported in megfile

Parameters

followlinks – The result is the same whether followlinks is True or False, because s3 symlinks do not support directories.

Raises

UnsupportedError

Returns

A 3-tuple generator
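
The top-down traversal can be pictured as grouping a flat key listing by its next path segment. The sketch below operates on an in-memory list of keys and is only an illustration, not megfile's implementation; walk_keys is a hypothetical name, and unlike the behaviour described above it yields nothing for an empty bucket.

```python
from typing import Iterator, List, Set, Tuple


def walk_keys(keys: List[str], root: str) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Hypothetical sketch: yield (root, dirs, files) top-down over flat keys."""
    prefix = root.rstrip("/") + "/"
    dirs: Set[str] = set()
    files: List[str] = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        head, sep, _tail = key[len(prefix):].partition("/")
        if sep:
            dirs.add(head)      # something deeper exists below this name
        elif head:
            files.append(head)  # the key ends here, so it is a file
    if not dirs and not files:
        return  # file path or non-existent path: empty generator
    yield root, sorted(dirs), sorted(files)
    for name in sorted(dirs):
        yield from walk_keys(keys, prefix + name)


keys = ["bucket/b.txt", "bucket/a/x.txt", "bucket/a/y/z.txt"]
walked = list(walk_keys(keys, "bucket"))
on_file = list(walk_keys(keys, "bucket/b.txt"))
```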

class megfile.FSPath(path: Union[PathLike, int], *other_paths: Union[str, os.PathLike])[source]

Bases: megfile.pathlike.URIPath

file protocol e.g. file:///data/test/ or /data/test

absolute() → megfile.fs_path.FSPath[source]

Make the path absolute, without normalization or resolving symlinks. Returns a new path object

abspath() → str[source]

Return the absolute path of given path

Returns

Absolute path of given path

access(mode: megfile.pathlike.Access = <Access.READ: 1>) → bool[source]

Test if path has access permission described by mode, using os.access

Parameters

mode – access mode

Returns

Access: Enum, the read/write access that path has.

anchor[source]

chmod(mode: int, *, follow_symlinks: bool = True)[source]

Change the file mode and permissions, like os.chmod().

This method normally follows symlinks. Some Unix flavours support changing permissions on the symlink itself; on these platforms you may add the argument follow_symlinks=False, or use lchmod().

copy(dst_path: Union[str, os.PathLike], callback: Optional[Callable[int, None]] = None, followlinks: bool = False)[source]

File copy on file system. Copy the content (excluding metadata) of the file at src_path to dst_path. dst_path must be a complete file name

Note

The differences between this function and shutil.copyfile are:

  1. If parent directory of dst_path doesn’t exist, create it

  2. Allow callback function, None by default. callback: Optional[Callable[[int], None]]; the int argument is the size (in bytes) of the written data, passed periodically

  3. This function is thread-unsafe

Parameters
  • dst_path – Target file path

  • callback – Called periodically during copy, and the input parameter is the data size (in bytes) of copy since the last call

  • followlinks – False if regard symlink as file, else True

cwd() → megfile.fs_path.FSPath[source]

Return current working directory

Returns

Current working directory

drive[source]

exists(followlinks: bool = False) → bool[source]

Test if the path exists

Note

The difference between this function and os.path.exists is that this function regards a symlink as a file. In other words, this function is equal to os.path.lexists

Parameters

followlinks – False if regard symlink as file, else True

Returns

True if the path exists, else False
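
The difference between following a symlink and treating it as a file shows up most clearly with a dangling link. This standard-library sketch assumes a platform where unprivileged symlink creation works (for example Linux or macOS); it does not call megfile.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "dangling")
    # Point the link at a target that does not exist.
    os.symlink(os.path.join(tmp, "missing-target"), link)
    follows_link = os.path.exists(link)    # resolves the link, target is missing
    link_as_file = os.path.lexists(link)   # looks at the link itself
```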

expanduser()[source]

Expand ~ and ~user constructions. If user or $HOME is unknown, do nothing.

classmethod from_uri(path: str) → megfile.fs_path.FSPath[source]

getmtime(follow_symlinks: bool = False) → float[source]

Get last-modified time of the file on the given path (in Unix timestamp format). If the path is an existent directory, return the latest modified time of all files in it.

Returns

last-modified time

getsize(follow_symlinks: bool = False) → int[source]

Get file size on the given file path (in bytes). If the path is a directory, return the sum of the sizes of all files in it, including files in subdirectories (if any). The result excludes the size of the directory itself. In other words, return 0 bytes for an empty directory path.

Returns

File size

glob(pattern, recursive: bool = True, missing_ok: bool = True) → List[megfile.fs_path.FSPath][source]

Return a list of paths that match the glob pattern, in ascending alphabetical order

  1. If no path matches, return an empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

A list of paths that match pathname
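
The duplicate-result case in point 2 can be explored with the standard library's glob module, which the notes above use as the reference behaviour. This is a small experiment on a temporary directory rather than a call into megfile; with two ** segments the same file can be reached through more than one expansion, so the sketch deduplicates before comparing.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "a", "b", "c", "b")
    os.makedirs(nested)
    target = os.path.join(nested, "d.txt")
    with open(target, "w") as f:
        f.write("x")
    # Same shape as the /**/b/**/*.txt pattern from the notes.
    pattern = os.path.join(tmp, "**", "b", "**", "*.txt")
    matches = glob.glob(pattern, recursive=True)
    # Collapse repeated hits of the same file into one entry.
    unique = sorted({os.path.normpath(m) for m in matches})
    expected = os.path.normpath(target)
```

Callers that need each path exactly once can deduplicate in this way before processing the result.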

glob_stat(pattern, recursive: bool = True, missing_ok: bool = True) → Iterator[megfile.pathlike.FileEntry][source]

Return a list of tuples of path and file stat, in ascending alphabetical order, for paths that match the glob pattern

  1. If no path matches, return an empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

A list of tuples of path and file stat, for paths that match pathname

group() → str[source]

Return the name of the group owning the file. KeyError is raised if the file’s gid isn’t found in the system database.

Make this path a hard link to the same file as target.

home()[source]

Return the home directory

Returns

Home directory path

iglob(pattern, recursive: bool = True, missing_ok: bool = True) → Iterator[megfile.fs_path.FSPath][source]

Return an iterator over paths that match the glob pattern, in ascending alphabetical order

  1. If no path matches, return an empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same result as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

An iterator over paths that match pathname

is_absolute() → bool[source]

Test whether a path is absolute

Returns

True if a path is absolute, else False

is_block_device() → bool[source]

Return True if the path points to a block device (or a symbolic link pointing to a block device), False if it points to another kind of file.

False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.

is_char_device() → bool[source]

Return True if the path points to a character device (or a symbolic link pointing to a character device), False if it points to another kind of file.

False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.

is_dir(followlinks: bool = False) → bool[source]

Test if a path is directory

Note

The difference between this function and os.path.isdir is that this function regards a symlink as a file

Parameters

followlinks – False if regard symlink as file, else True

Returns

True if the path is a directory, else False

is_fifo() → bool[source]

Return True if the path points to a FIFO (or a symbolic link pointing to a FIFO), False if it points to another kind of file.

False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.

is_file(followlinks: bool = False) → bool[source]

Test if a path is file

Note

The difference between this function and os.path.isfile is that this function regards a symlink as a file

Parameters

followlinks – False if regard symlink as file, else True

Returns

True if the path is a file, else False

is_mount() → bool[source]

Test whether a path is a mount point

Returns

True if a path is a mount point, else False

is_socket() → bool[source]

Return True if the path points to a Unix socket (or a symbolic link pointing to a Unix socket), False if it points to another kind of file.

False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.

Test whether a path is a symbolic link

Returns

If path is a symbolic link return True, else False

Return type

bool

iterdir() → Iterator[megfile.fs_path.FSPath][source]

Get all contents of given fs path. The result is in ascending alphabetical order.

Returns

All contents in the path, in ascending alphabetical order

joinpath(*other_paths: Union[str, os.PathLike]) → megfile.fs_path.FSPath[source]

Calling this method is equivalent to combining the path with each of the other arguments in turn

listdir() → List[str][source]

Get all contents of given fs path. The result is in ascending alphabetical order.

Returns

All contents in the path, in ascending alphabetical order

load() → BinaryIO[source]

Read all content on specified path and write into memory

User should close the BinaryIO manually

Returns

Binary stream

lstat() → megfile.pathlike.StatResult[source]

Like Path.stat() but, if the path points to a symbolic link, return the symbolic link’s information rather than its target’s.

Returns

StatResult

md5(recalculate: bool = False, followlinks: bool = True)[source]

Calculate the md5 value of the file

Parameters
  • recalculate – Ignore this parameter, just for compatibility

  • followlinks – Ignore this parameter, just for compatibility

Returns

md5 of file

mkdir(mode=511, parents: bool = False, exist_ok: bool = False)[source]

make a directory on fs, including parent directory

If there exists a file on the path, raise FileExistsError

Parameters
  • mode – If mode is given, it is combined with the process’ umask value to determine the file mode and access flags.

  • parents – If parents is true, any missing parents of this path are created as needed; if parents is false (the default), a missing parent raises FileNotFoundError.

  • exist_ok – If False and target directory exists, raise FileExistsError

Raises

FileExistsError

open(mode: str = 'r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, **kwargs) → IO[AnyStr][source]

Open the file with mode.

owner() → str[source]

Return the name of the user owning the file. KeyError is raised if the file’s uid isn’t found in the system database.

parts[source]

A tuple giving access to the path’s various components

property path_with_protocol

Return path with protocol, like file:///root, s3://bucket/key

protocol = 'file'

Return a FSPath instance representing the path to which the symbolic link points.

Returns

A FSPath instance representing the path to which the symbolic link points.

realpath() → str[source]

Return the real path of given path

Returns

Real path of given path

relpath(start: Optional[str] = None) → str[source]

Return the relative path of given path

Parameters

start – Given start directory

Returns

Relative path from start

remove(missing_ok: bool = False) → None[source]

Remove the file or directory on fs

Parameters

missing_ok – if False and target file/directory does not exist, raise FileNotFoundError

rename(dst_path: Union[str, os.PathLike]) → megfile.fs_path.FSPath[source]

rename file on fs

Parameters

dst_path – Given destination path

replace(dst_path: Union[str, os.PathLike]) → megfile.fs_path.FSPath[source]

move file on fs

Parameters

dst_path – Given destination path

resolve(strict=False) → megfile.fs_path.FSPath[source]

Equal to fs_realpath

Returns

Return the canonical path of the specified filename, eliminating any symbolic links encountered in the path.

Return type

FSPath

rmdir()[source]

Remove this directory. The directory must be empty.

root[source]

save(file_object: BinaryIO)[source]

Write the opened binary stream to path. If the parent directory of path doesn’t exist, it will be created.

Parameters

file_object – stream to be read

scan(missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a path string.

  • If path is a file path, yield the file only

  • If path is a non-existent path, return an empty generator

  • If path is a bucket path, return all file paths in the bucket

Parameters

missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Returns

A file path generator

scan_stat(missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat

Parameters

missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Returns

A generator of path and file stat entries

scandir() → Iterator[megfile.pathlike.FileEntry][source]

Get all contents of the given file path.

Returns

An iterator over all contents that have the path as prefix

stat(follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of file on fs, including file size and mtime, referring to fs_getsize and fs_getmtime

Returns

StatResult

Create a symbolic link pointing to src_path named dst_path.

Parameters

dst_path – Destination path

sync(dst_path: Union[str, os.PathLike], followlinks: bool = False, force: bool = False) → None[source]

Force write of everything to disk.

Parameters
  • dst_path – Target file path

  • followlinks – False if regard symlink as file, else True

  • force – Force the sync; do not skip files that are the same

Remove the file on fs

Parameters

missing_ok – if False and target file does not exist, raise FileNotFoundError

utime(atime: Union[float, int], mtime: Union[float, int])[source]

Set the access and modified times of the file specified by path.

Parameters
  • atime – a float or int representing the access time to be set. If it is set to None, the access time is set to the current time.

  • mtime – a float or int representing the modified time to be set. If it is set to None, the modified time is set to the current time.

Returns

None
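For local files, the standard-library counterpart of this method is `os.utime`, which takes the two times as an `(atime, mtime)` tuple. The sketch below only illustrates that call and the meaning of the two values (Unix timestamps in seconds); it does not use megfile itself.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as f:
        f.write("data")

    # Times are Unix timestamps in seconds: (access time, modified time).
    os.utime(path, (1_600_000_000, 1_500_000_000))

    st = os.stat(path)
    atime, mtime = st.st_atime, st.st_mtime

print(atime, mtime)
```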

walk(followlinks: bool = False) → Iterator[Tuple[str, List[str], List[str]]][source]

Generate the file names in a directory tree by walking the tree top-down. For each directory in the tree rooted at directory path (including path itself), it yields a 3-tuple (root, dirs, files).

  • root: a string of current path

  • dirs: name list of subdirectories (excluding ‘.’ and ‘..’ if they exist) in ‘root’. The list is sorted by ascending alphabetical order

  • files: name list of non-directory files (link is regarded as file) in ‘root’. The list is sorted by ascending alphabetical order

If path does not exist, or path is a file (link is regarded as file), return an empty generator

Note

Be aware that setting followlinks to True can lead to infinite recursion if a link points to a parent directory of itself. fs_walk() does not keep track of the directories it visited already.

Parameters

followlinks – If False, treat symlinks as files; if True, follow them

Returns

A 3-tuple generator
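The shape of the yielded 3-tuples can be illustrated with `os.walk`, sorting both name lists to get the ascending order described above. `sorted_walk` is a hypothetical helper; unlike the method documented here, `os.walk` lists symlinks to directories under dirs, so this sketch does not reproduce the link handling.

```python
import os
import tempfile
from typing import Iterator, List, Tuple


def sorted_walk(top: str) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Walk top-down, yielding (root, dirs, files) with sorted name lists."""
    for root, dirs, files in os.walk(top, followlinks=False):
        dirs.sort()   # in-place sort also fixes the traversal order
        files.sort()
        yield root, dirs, files


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("b.txt", "a.txt", os.path.join("sub", "x.txt")):
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("x")

    walked = [
        (os.path.relpath(root, tmp), dirs, files)
        for root, dirs, files in sorted_walk(tmp)
    ]
    # Walking a file path (or a missing path) yields nothing.
    walked_file = list(sorted_walk(os.path.join(tmp, "a.txt")))

for entry in walked:
    print(entry)
```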

class megfile.HttpPath(path: Union[str, os.PathLike], *other_paths: Union[str, os.PathLike])[source]

Bases: megfile.pathlike.URIPath

exists(followlinks: bool = False) → bool[source]

Test if http path exists

Parameters

followlinks (bool, optional) – ignore this parameter, just for compatibility

Returns

True if the path exists, else False

Return type

bool

getmtime(follow_symlinks: bool = False) → float[source]

Get Last-Modified time of the http request on the given http_url path.

If the http response headers do not include Last-Modified, return None

Parameters

follow_symlinks – Ignore this parameter, just for compatibility

Returns

Last-Modified time (in Unix timestamp format)

Raises

HttpPermissionError, HttpFileNotFoundError
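To make the "Unix timestamp format" concrete, here is a minimal sketch of turning a Last-Modified header value into a timestamp with the standard library. `last_modified_timestamp` and the plain-dict headers are assumptions for illustration; megfile's own parsing may differ.

```python
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def last_modified_timestamp(headers: Mapping[str, str]) -> Optional[float]:
    """Return Last-Modified as a Unix timestamp, or None if it is absent."""
    value = headers.get("Last-Modified")
    if value is None:
        return None
    # HTTP dates use the RFC 5322 style format, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    return parsedate_to_datetime(value).timestamp()


with_header = last_modified_timestamp(
    {"Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"}
)
without_header = last_modified_timestamp({"Content-Type": "text/plain"})

print(with_header, without_header)
```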

getsize(follow_symlinks: bool = False) → int[source]

Get file size on the given http_url path.

If the http response headers do not include Content-Length, return None

Parameters

follow_symlinks – Ignore this parameter, just for compatibility

Returns

File size (in bytes)

Raises

HttpPermissionError, HttpFileNotFoundError

open(mode: str = 'rb', *, max_concurrency: Optional[int] = None, max_buffer_size: int = 134217728, forward_ratio: Optional[float] = None, block_size: int = 8388608, **kwargs) → Union[_io.BufferedReader, megfile.lib.http_prefetch_reader.HttpPrefetchReader][source]

Open a BytesIO to read binary data of given http(s) url

Note

Essentially, it reads data of http(s) url to memory by requests, and then return BytesIO to user.

Parameters
  • mode – Only supports ‘rb’ mode now

  • encoding – encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode.

  • errors – errors is an optional string that specifies how encoding and decoding errors are to be handled—this cannot be used in binary mode.

  • max_concurrency – Max download thread number, None by default

  • max_buffer_size – Max cached buffer size in memory, 128MB by default

  • block_size – Size of single block, 8MB by default. Each block will be uploaded or downloaded by single thread.

Returns

BytesIO initialized with http(s) data
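The relationship between block_size and max_buffer_size can be pictured by splitting a file into fixed-size byte ranges. The helper below is a hypothetical sketch: it assumes blocks map onto inclusive byte ranges (as in an HTTP `Range: bytes=start-end` header), which is an assumption about the reader and not something this reference states.

```python
from typing import List, Tuple

BLOCK_SIZE = 8 * 1024 * 1024          # 8388608, the documented default
MAX_BUFFER_SIZE = 128 * 1024 * 1024   # 134217728, the documented default


def block_ranges(total_size: int, block_size: int = BLOCK_SIZE) -> List[Tuple[int, int]]:
    """Split total_size bytes into inclusive (start, end) ranges of block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        (start, min(start + block_size, total_size) - 1)
        for start in range(0, total_size, block_size)
    ]


small = block_ranges(20, block_size=8)
blocks_in_buffer = MAX_BUFFER_SIZE // BLOCK_SIZE

print(small)
print(blocks_in_buffer)
```

With the default values, the buffer has room for 16 blocks of 8MB each.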

protocol = 'http'

stat(follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of http_url response, including size and mtime, referring to http_getsize and http_getmtime

Parameters

follow_symlinks – Ignore this parameter, just for compatibility

Returns

StatResult

Raises

HttpPermissionError, HttpFileNotFoundError

class megfile.HttpsPath(path: Union[str, os.PathLike], *other_paths: Union[str, os.PathLike])[source]

Bases: megfile.http_path.HttpPath

protocol = 'https'

class megfile.StdioPath(path: Union[str, os.PathLike])[source]

Bases: megfile.pathlike.BaseURIPath

open(mode: str = 'rb', encoding: Optional[str] = None, errors: Optional[str] = None, **kwargs) → IO[AnyStr][source]

Used to read or write stdio

Note

Essentially, it reads from sys.stdin.buffer or writes to sys.stdout.buffer

Parameters

mode – Only supports ‘rb’ and ‘wb’ now

Returns

STDReader, STDWriter

protocol = 'stdio'

class megfile.SmartPath(path: Union[str, os.PathLike, int], *other_paths: Union[str, os.PathLike])[source]

Bases: megfile.pathlike.BasePath

absolute(*args, **kwargs)
abspath(*args, **kwargs)
access(*args, **kwargs)
property anchor
as_posix(*args, **kwargs)
as_uri(*args, **kwargs)
chmod(*args, **kwargs)
cwd(*args, **kwargs)
property drive
exists(*args, **kwargs)
expanduser(*args, **kwargs)
classmethod from_uri(path: str)[source]
getmtime(*args, **kwargs)
getsize(*args, **kwargs)
glob(*args, **kwargs)
glob_stat(*args, **kwargs)
group(*args, **kwargs)
home(*args, **kwargs)
iglob(*args, **kwargs)
is_absolute(*args, **kwargs)
is_block_device(*args, **kwargs)
is_char_device(*args, **kwargs)
is_dir(*args, **kwargs)
is_fifo(*args, **kwargs)
is_file(*args, **kwargs)
is_mount(*args, **kwargs)
is_relative_to(*args, **kwargs)
is_reserved(*args, **kwargs)
is_socket(*args, **kwargs)
iterdir(*args, **kwargs)
joinpath(*args, **kwargs)
lchmod(*args, **kwargs)
listdir(*args, **kwargs)
load(*args, **kwargs)
lstat(*args, **kwargs)
match(*args, **kwargs)
md5(*args, **kwargs)
mkdir(*args, **kwargs)
property name
open(*args, **kwargs)
owner(*args, **kwargs)
property parent
property parents
property parts
property protocol
read_bytes(*args, **kwargs)
read_text(*args, **kwargs)
realpath(*args, **kwargs)
classmethod register(path_class, override_ok: bool = False)[source]
relative_to(*args, **kwargs)
relpath(*args, **kwargs)
remove(*args, **kwargs)
rename(*args, **kwargs)
replace(*args, **kwargs)
resolve(*args, **kwargs)
rglob(*args, **kwargs)
rmdir(*args, **kwargs)
property root
samefile(*args, **kwargs)
save(*args, **kwargs)
scan(*args, **kwargs)
scan_stat(*args, **kwargs)
scandir(*args, **kwargs)
stat(*args, **kwargs)
property stem
property suffix
property suffixes
touch(*args, **kwargs)
utime(*args, **kwargs)
walk(*args, **kwargs)
with_name(*args, **kwargs)
with_stem(*args, **kwargs)
with_suffix(*args, **kwargs)
write_bytes(*args, **kwargs)
write_text(*args, **kwargs)
class megfile.SftpPath(path: Union[str, os.PathLike], *other_paths: Union[str, os.PathLike])[source]

Bases: megfile.pathlike.URIPath

sftp protocol

uri format:

  • absolute path
      • sftp://[username[:password]@]hostname[:port]//file_path

  • relative path
      • sftp://[username[:password]@]hostname[:port]/file_path
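The difference between the two forms is the double slash after the host: `//file_path` is absolute, `/file_path` is relative. A minimal sketch of splitting such a URI with `urllib.parse` follows; `parse_sftp_uri` and its returned dictionary are hypothetical and only illustrate the format, not how SftpPath parses it internally.

```python
from urllib.parse import urlsplit


def parse_sftp_uri(uri: str) -> dict:
    """Split an sftp URI into connection details and a remote path."""
    parts = urlsplit(uri)
    if parts.scheme != "sftp":
        raise ValueError(f"Not an sftp URI: {uri!r}")
    # "//file_path" after the host marks an absolute remote path;
    # a single "/" separates the host from a relative path.
    absolute = parts.path.startswith("//")
    return {
        "username": parts.username,
        "password": parts.password,
        "hostname": parts.hostname,
        "port": parts.port,
        "path": parts.path[1:],
        "absolute": absolute,
    }


abs_info = parse_sftp_uri("sftp://user:secret@example.com:2222//var/data/file.txt")
rel_info = parse_sftp_uri("sftp://example.com/data/file.txt")

print(abs_info)
print(rel_info)
```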

absolute() → megfile.sftp_path.SftpPath[source]

Make the path absolute, without normalization or resolving symlinks. Returns a new path object

chmod(mode: int, follow_symlinks: bool = True)[source]

Change the file mode and permissions, like os.chmod().

Parameters
  • mode – the file mode you want to change

  • follow_symlinks – Ignore this parameter, just for compatibility

copy(dst_path: Union[str, os.PathLike], callback: Optional[Callable[int, None]] = None, followlinks: bool = False)[source]

Copy the file to the given destination path.

Parameters
  • dst_path – The destination path to copy the file to.

  • callback – An optional callback function that takes an integer parameter and is called periodically during the copy operation to report the number of bytes copied.

  • followlinks – Whether to follow symbolic links when copying directories.

Raises
  • IsADirectoryError – If the source is a directory.

  • OSError – If there is an error copying the file.

cwd() → megfile.sftp_path.SftpPath[source]

Return current working directory

Returns

Current working directory

exists(followlinks: bool = False) → bool[source]

Test if the path exists

Parameters

followlinks – If False, treat symlinks as files; if True, follow them

Returns

True if the path exists, else False

getmtime(follow_symlinks: bool = False) → float[source]

Get last-modified time of the file on the given path (in Unix timestamp format). If the path is an existent directory, return the latest modified time of all files in it.

Returns

last-modified time

getsize(follow_symlinks: bool = False) → int[source]

Get file size on the given file path (in bytes). If the path is a directory, return the sum of the sizes of all files in it, including files in subdirectories (if any). The result excludes the size of the directory itself. In other words, return 0 bytes for an empty directory path.

Returns

File size

glob(pattern, recursive: bool = True, missing_ok: bool = True) → List[megfile.sftp_path.SftpPath][source]

Return path list in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same results as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

A list of paths that match the pattern
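Two of the points above, the effect of recursive on ** and the exclusion of hidden files, can be observed with the standard library's `glob.glob`, which this method is described as mirroring. The sketch runs against a temporary local directory; `glob_relative` is a hypothetical helper and no sftp connection is involved.

```python
import glob
import os
import tempfile
from typing import List


def glob_relative(root: str, pattern: str, recursive: bool = True) -> List[str]:
    """Glob pattern under root and return sorted paths relative to root."""
    full_pattern = os.path.join(glob.escape(root), pattern)
    matches = glob.glob(full_pattern, recursive=recursive)
    return sorted(os.path.relpath(match, root) for match in matches)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.txt", ".hidden.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("x")

    pattern = os.path.join("**", "*.txt")
    # recursive=True: ** matches zero or more directory levels.
    recursive_matches = glob_relative(tmp, pattern, recursive=True)
    # recursive=False: ** behaves like a single *, so only one level matches.
    flat_matches = glob_relative(tmp, pattern, recursive=False)

print(recursive_matches)
print(flat_matches)
```

In both cases `.hidden.txt` is absent from the result, because `*` does not match names that start with a dot.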

glob_stat(pattern, recursive: bool = True, missing_ok: bool = True) → Iterator[megfile.pathlike.FileEntry][source]

Return an iterator of file entries (path and file stat), in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. sftp_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same results as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

An iterator of file entries (path and file stat) whose paths match the pattern

iglob(pattern, recursive: bool = True, missing_ok: bool = True) → Iterator[megfile.sftp_path.SftpPath][source]

Return path iterator in ascending alphabetical order, in which path matches glob pattern

  1. If doesn’t match any path, return empty list

    Notice: glob.glob in standard library returns [‘a/’] instead of empty list when pathname is like a/**, recursive is True and directory ‘a’ doesn’t exist. fs_glob behaves like glob.glob in standard library under such circumstance.

  2. No guarantee that each path in result is different, which means:

    Assume there exists a path /a/b/c/b/d.txt use path pattern like /**/b/**/*.txt to glob, the path above will be returned twice

  3. ** will match any matched file, directory, symlink and ‘’ by default, when recursive is True

  4. fs_glob returns the same results as glob.glob(pathname, recursive=True), in ascending alphabetical order.

  5. Hidden files (filenames starting with ‘.’) will not be found in the result

Parameters
  • pattern – Glob the given relative pattern in the directory represented by this path

  • recursive – If False, ** will not search directory recursively

  • missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError

Returns

An iterator over paths that match the pattern

is_dir(followlinks: bool = False) → bool[source]

Test if a path is directory

Note

The difference between this function and os.path.isdir is that this function regards a symlink as a file

Parameters

followlinks – If False, treat symlinks as files; if True, follow them

Returns

True if the path is a directory, else False

is_file(followlinks: bool = False) → bool[source]

Test if a path is file

Note

The difference between this function and os.path.isfile is that this function regards a symlink as a file

Parameters

followlinks – If False, treat symlinks as files; if True, follow them

Returns

True if the path is a file, else False
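The notes on is_dir and is_file say a symlink is regarded as a file unless followlinks is True. A local-filesystem sketch of that rule, next to what `os.path.isdir` reports, might look as follows. The two functions are hypothetical stand-ins written for this example, not the sftp implementation, and the demo assumes a platform where symlinks can be created.

```python
import os
import tempfile


def is_dir(path: str, followlinks: bool = False) -> bool:
    """Directory test that treats a symlink as a file unless followlinks is True."""
    if os.path.islink(path) and not followlinks:
        return False
    return os.path.isdir(path)


def is_file(path: str, followlinks: bool = False) -> bool:
    """File test that treats a symlink as a file unless followlinks is True."""
    if os.path.islink(path) and not followlinks:
        return True
    return os.path.isfile(path)


with tempfile.TemporaryDirectory() as tmp:
    real_dir = os.path.join(tmp, "real")
    link = os.path.join(tmp, "link")
    os.mkdir(real_dir)
    os.symlink(real_dir, link)  # link -> real (a directory)

    results = {
        "os.path.isdir": os.path.isdir(link),            # follows the link
        "is_dir": is_dir(link),                          # link counts as a file
        "is_dir_follow": is_dir(link, followlinks=True),
        "is_file": is_file(link),
    }

print(results)
```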

Test whether a path is a symbolic link

Returns

If path is a symbolic link return True, else False

Return type

bool

iterdir() → Iterator[megfile.sftp_path.SftpPath][source]

Get all contents of given sftp path. The result is in ascending alphabetical order.

Returns

All contents in the path, in ascending alphabetical order

listdir() → List[str][source]

Get all contents of given sftp path. The result is in ascending alphabetical order.

Returns

All contents in the path, in ascending alphabetical order

load() → BinaryIO[source]

Read all content on specified path and write into memory

User should close the BinaryIO manually

Returns

Binary stream
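Since the caller is responsible for closing the returned stream, a `with` block is a convenient way to do so. The sketch below reads a local file into an in-memory stream; `load_into_memory` is a hypothetical helper that assumes the returned object behaves like `io.BytesIO`.

```python
import io
import os
import tempfile


def load_into_memory(path: str) -> io.BytesIO:
    """Read the whole file at path into an in-memory binary stream."""
    with open(path, "rb") as f:
        return io.BytesIO(f.read())


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"payload")

    # The with block closes the in-memory stream when done.
    with load_into_memory(path) as stream:
        content = stream.read()

closed_after = stream.closed
print(content, closed_after)
```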

lstat() → megfile.pathlike.StatResult[source]

Get StatResult of file on sftp, including file size and mtime, referring to fs_getsize and fs_getmtime

Returns

StatResult

md5(recalculate: bool = False, followlinks: bool = True)[source]

Calculate the md5 value of the file

Parameters
  • recalculate – Ignore this parameter, just for compatibility

  • followlinks – Ignore this parameter, just for compatibility

Returns

md5 of file
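For reference, an md5 digest of a file can be computed in fixed-size chunks with `hashlib`, which avoids loading the whole file into memory. `file_md5` is a hypothetical helper; returning a hexadecimal string is an assumption of this sketch, as the return format is not stated above.

```python
import hashlib
import os
import tempfile


def file_md5(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hexadecimal md5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hello.txt")
    with open(path, "wb") as f:
        f.write(b"hello world")
    checksum = file_md5(path, chunk_size=4)

print(checksum)
```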

mkdir(mode=511, parents: bool = False, exist_ok: bool = False)[source]

Make a directory on sftp, including parent directories

If there exists a file on the path, raise FileExistsError

Parameters
  • mode – If mode is given, it is combined with the process’ umask value to determine the file mode and access flags.

  • parents – If parents is true, any missing parents of this path are created as needed; if parents is false (the default), a missing parent raises FileNotFoundError.

  • exist_ok – If False and the target directory exists, raise FileExistsError

Raises

FileExistsError
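The parents and exist_ok parameters follow the same conventions as `pathlib.Path.mkdir`, so their interaction can be tried locally. This sketch uses pathlib on a temporary directory, not an sftp server; note that the default mode 511 is the decimal form of 0o777.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "a" / "b"

    # parents=False: the missing parent "a" raises FileNotFoundError.
    try:
        nested.mkdir(parents=False)
        missing_parent_error = False
    except FileNotFoundError:
        missing_parent_error = True

    # parents=True creates "a" and "b"; mode is combined with the umask.
    nested.mkdir(mode=0o777, parents=True, exist_ok=False)
    created = nested.is_dir()

    # exist_ok=False on an existing directory raises FileExistsError.
    try:
        nested.mkdir(exist_ok=False)
        exists_error = False
    except FileExistsError:
        exists_error = True

    # exist_ok=True silently accepts the existing directory.
    nested.mkdir(exist_ok=True)

print(missing_parent_error, created, exists_error)
```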

open(mode: str = 'r', buffering=-1, encoding: Optional[str] = None, errors: Optional[str] = None, **kwargs) → IO[AnyStr][source]

Open a file on the path.

Parameters
  • mode – Mode to open file

  • buffering – buffering is an optional integer used to set the buffering policy.

  • encoding – encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode.

  • errors – errors is an optional string that specifies how encoding and decoding errors are to be handled—this cannot be used in binary mode.

Returns

File-Like object

parts[source]

A tuple giving access to the path’s various components

protocol = 'sftp'

Return a SftpPath instance representing the path to which the symbolic link points.

Returns

A SftpPath instance representing the path to which the symbolic link points

realpath() → str[source]

Return the real path of given path

Returns

Real path of given path

remove(missing_ok: bool = False) → None[source]

Remove the file or directory on sftp

Parameters

missing_ok – If False and the target file/directory does not exist, raise FileNotFoundError

rename(dst_path: Union[str, os.PathLike]) → megfile.sftp_path.SftpPath[source]

Rename file on sftp

Parameters

dst_path – Given destination path

replace(dst_path: Union[str, os.PathLike]) → megfile.sftp_path.SftpPath[source]

Move file on sftp

Parameters

dst_path – Given destination path

resolve(strict=False) → megfile.sftp_path.SftpPath[source]

Equal to sftp_realpath

Parameters

strict – Ignore this parameter, just for compatibility

Returns

Return the canonical path of the specified filename, eliminating any symbolic links encountered in the path.

Return type

SftpPath

rmdir()[source]

Remove this directory. The directory must be empty.

save(file_object: BinaryIO)[source]

Write the opened binary stream to path. If the parent directory of path doesn’t exist, it will be created.

Parameters

file_object – stream to be read

scan(missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a path string.

  • If path is a file path, yield only that file

  • If path is a non-existent path, return an empty generator

  • If path is a bucket path, return all file paths in the bucket

Parameters

missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Returns

A file path generator

scan_stat(missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source]

Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat

Parameters

missing_ok – If False and there’s no file in the directory, raise FileNotFoundError

Returns

A generator of file entries (path and file stat)

scandir() → Iterator[megfile.pathlike.FileEntry][source]

Get all content of given file path.

Returns

An iterator over all entries whose paths have the given path as a prefix

stat(follow_symlinks=True) → megfile.pathlike.StatResult[source]

Get StatResult of file on sftp, including file size and mtime, referring to fs_getsize and fs_getmtime

Returns

StatResult

Create a symbolic link pointing to src_path named dst_path.

Parameters

dst_path – Destination path

sync(dst_path: Union[str, os.PathLike], followlinks: bool = False, force: bool = False)[source]

Copy the file or directory at this path to dst_path

Parameters
  • dst_path – Given destination path

  • followlinks – False if regard symlink as file, else True

  • force – Force the sync; do not skip files that are the same

Remove the file on sftp

Parameters

missing_ok – If False and the target file does not exist, raise FileNotFoundError

utime(atime: Union[float, int], mtime: Union[float, int]) → None[source]

Set the access and modified times of the file specified by path.

Parameters
  • atime (Union[float, int]) – The access time to be set.

  • mtime (Union[float, int]) – The modification time to be set.

Returns

None

walk(followlinks: bool = False) → Iterator[Tuple[str, List[str], List[str]]][source]

Generate the file names in a directory tree by walking the tree top-down. For each directory in the tree rooted at directory path (including path itself), it yields a 3-tuple (root, dirs, files).

  • root: a string of current path

  • dirs: name list of subdirectories (excluding ‘.’ and ‘..’ if they exist) in ‘root’. The list is sorted by ascending alphabetical order

  • files: name list of non-directory files (link is regarded as file) in ‘root’. The list is sorted by ascending alphabetical order

If path does not exist, or path is a file (link is regarded as file), return an empty generator

Note

Be aware that setting followlinks to True can lead to infinite recursion if a link points to a parent directory of itself. fs_walk() does not keep track of the directories it visited already.

Parameters

followlinks – If False, treat symlinks as files; if True, follow them

Returns

A 3-tuple generator