megfile.s3_path module
-
class
megfile.s3_path.S3Path(path: Union[str, os.PathLike], *other_paths: Union[str, os.PathLike])[source] Bases: megfile.pathlike.URIPath
-
absolute() → megfile.s3_path.S3Path[source] Make the path absolute, without normalization or resolving symlinks. Returns a new path object
-
access(mode: megfile.pathlike.Access = <Access.READ: 1>, followlinks: bool = False) → bool[source] Test if path has access permission described by mode
- Parameters
mode – access mode
- Returns
True if the bucket of s3_url has the read/write access described by mode, else False
-
copy(dst_url: Union[str, os.PathLike], followlinks: bool = False, callback: Optional[Callable[int, None]] = None) → None[source] Copy the content of the file at src_url to dst_url on S3. The caller is responsible for ensuring that s3_isfile(src_url) == True
- Parameters
dst_url – Target file path
callback – Called periodically during the copy; its argument is the number of bytes copied since the last call
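The callback contract above (each call receives the bytes copied since the previous call, not a running total) can be illustrated with a small stand-in. This is a sketch using in-memory streams, not megfile's copy; copy_stream and ProgressCounter are hypothetical names introduced only for illustration.

```python
import io


def copy_stream(src, dst, callback=None, chunk_size=4):
    """Copy src to dst in chunks, reporting each chunk's size to callback."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        if callback is not None:
            # The argument is the number of bytes copied since the last call,
            # so the callback has to accumulate if it wants a total.
            callback(len(chunk))


class ProgressCounter:
    """Accumulates the per-call byte counts into a running total."""

    def __init__(self):
        self.total = 0
        self.calls = 0

    def __call__(self, nbytes):
        self.total += nbytes
        self.calls += 1


progress = ProgressCounter()
source = io.BytesIO(b"0123456789")
target = io.BytesIO()
copy_stream(source, target, callback=progress)
```

A callable of this shape could plausibly be passed as callback to copy(), since the documented signature only requires that it accept an int.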
-
cwd() → megfile.s3_path.S3Path[source] Return current working directory
- Returns
Current working directory
-
exists(followlinks: bool = False) → bool[source] Test if s3_url exists
If the bucket of s3_url is not permitted to be read, return False
- Returns
True if s3_url exists, else False
-
getmtime(follow_symlinks: bool = False) → float[source] Get the last-modified time of the file at the given s3_url path (as a Unix timestamp). If the path is an existing directory, return the latest modified time of all files in it. The mtime of an empty directory is 1970-01-01 00:00:00
If s3_url is not an existing path, which means s3_exist(s3_url) returns False, raise S3FileNotFoundError
- Returns
Last-modified time
- Raises
S3FileNotFoundError, UnsupportedError
-
getsize(follow_symlinks: bool = False) → int[source] Get the file size at the given s3_url path (in bytes). If the path is a directory, return the sum of the sizes of all files in it, including files in subdirectories (if any). The result excludes the size of the directory itself. In other words, return 0 bytes for an empty directory path.
If s3_url is not an existing path, which means s3_exist(s3_url) returns False, raise S3FileNotFoundError
- Returns
File size
- Raises
S3FileNotFoundError, UnsupportedError
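The directory rules for getmtime and getsize can be sketched over an in-memory listing. This is an illustration of the documented semantics only, not megfile's implementation; OBJECTS, dir_size and dir_mtime are hypothetical names, and the S3FileNotFoundError case is not modelled.

```python
# Hypothetical in-memory stand-in for a bucket listing: key -> (size, mtime).
OBJECTS = {
    "data/a.txt": (3, 100.0),
    "data/sub/b.txt": (5, 250.0),
    "other/c.txt": (7, 300.0),
}


def _files_under(prefix):
    """Return (size, mtime) pairs for every key below prefix."""
    prefix = prefix.rstrip("/") + "/"
    return [meta for key, meta in OBJECTS.items() if key.startswith(prefix)]


def dir_size(prefix):
    """Sum of file sizes under prefix, subdirectories included; 0 if empty."""
    return sum(size for size, _ in _files_under(prefix))


def dir_mtime(prefix):
    """Latest mtime under prefix; 0.0 (1970-01-01 00:00:00) if empty."""
    return max((mtime for _, mtime in _files_under(prefix)), default=0.0)
```

Here dir_size("data") adds the sizes of data/a.txt and data/sub/b.txt, and dir_mtime("data") picks the newer of the two timestamps.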
-
glob(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → List[megfile.s3_path.S3Path][source] Return a list of the s3 paths that match the glob pattern, in ascending alphabetical order. Note: globbing only works within a bucket. If the bucket part contains wildcard characters, raise UnsupportedError
- Parameters
pattern – Glob the given relative pattern in the directory represented by this path
recursive – If False, ** will not search directory recursively
missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError
- Raises
UnsupportedError, when bucket part contains wildcard characters
- Returns
A list of the paths that match s3_pathname
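How recursive and missing_ok interact can be sketched with a simplified glob-to-regex translation over a list of keys. This is not megfile's matcher: glob_to_regex, glob_keys and KEYS are hypothetical, character classes such as [abc] are not handled, and the bucket wildcard check is not modelled.

```python
import re


def glob_to_regex(pattern, recursive=True):
    """Translate a glob pattern into a regex over '/'-separated keys."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            # Recursive: zero or more directory levels. Otherwise like "*/".
            parts.append("(?:.*/)?" if recursive else "[^/]*/")
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*" if recursive else "[^/]*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # any run of characters within one level
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # exactly one character within one level
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def glob_keys(keys, pattern, recursive=True, missing_ok=True):
    """Return matching keys in ascending alphabetical order."""
    regex = glob_to_regex(pattern, recursive)
    matched = sorted(key for key in keys if regex.match(key))
    if not matched and not missing_ok:
        raise FileNotFoundError(pattern)
    return matched


KEYS = ["logs/2024/a.log", "logs/b.log", "logs/2024/01/c.log", "readme.md"]
```

With recursive=True the pattern logs/**/*.log reaches every depth below logs/; with recursive=False the ** behaves like a single *, so only keys exactly one level down match.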
-
glob_stat(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source] Return a generator of tuples of path and file stat for the paths that match the glob pattern, in ascending alphabetical order. Note: globbing only works within a bucket. If the bucket part contains wildcard characters, raise UnsupportedError
- Parameters
pattern – Glob the given relative pattern in the directory represented by this path
recursive – If False, ** will not search directory recursively
missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError
- Raises
UnsupportedError, when bucket part contains wildcard characters
- Returns
A generator of tuples of path and file stat, for the paths that match s3_pathname
-
hasbucket() → bool[source] Test if the bucket of s3_url exists
- Returns
True if the bucket of s3_url exists, else False
-
iglob(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.s3_path.S3Path][source] Return an iterator over the s3 paths that match the glob pattern, in ascending alphabetical order. Note: globbing only works within a bucket. If the bucket part contains wildcard characters, raise UnsupportedError
- Parameters
pattern – Glob the given relative pattern in the directory represented by this path
recursive – If False, ** will not search directory recursively
missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError
- Raises
UnsupportedError, when bucket part contains wildcard characters
- Returns
An iterator over the paths that match s3_pathname
-
is_dir(followlinks: bool = False) → bool[source] Test if an s3 url is a directory. The url is treated as a directory in either of the following cases:
If there exists a suffix for which os.path.join(s3_url, suffix) is a file
If the url is an empty bucket or s3://
- Parameters
followlinks – The result is the same whether followlinks is True or False, because s3 symlinks do not support directories.
- Returns
True if path is an s3 directory, else False
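The suffix rule can be sketched as a prefix test over a set of keys. This is a simplified stand-in, not megfile's implementation: is_dir, is_file and KEYS below are hypothetical, and the empty bucket and s3:// cases are not modelled.

```python
def is_dir(keys, path):
    """True if some key lies strictly below path, i.e. path + '/' + suffix."""
    prefix = path.rstrip("/") + "/"
    return any(key.startswith(prefix) and len(key) > len(prefix) for key in keys)


def is_file(keys, path):
    """True if path is itself a stored key."""
    return path in keys


KEYS = {"bucket/data/a.txt", "bucket/data/sub/b.txt"}
```

Note that bucket/dat is not a directory in this model even though it is a string prefix of several keys, because the prefix must end at a path separator.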
-
is_file(followlinks: bool = False) → bool[source] Test if an s3_url is a file
- Returns
True if path is an s3 file, else False
-
is_symlink() → bool[source] Test whether a path is a link
- Returns
True if the path is a link, else False
- Raises
S3NotALinkError
-
iterdir(followlinks: bool = False) → Iterator[megfile.s3_path.S3Path][source] Get all contents of the given s3_url. The result is in ascending alphabetical order.
- Returns
All contents that have the prefix of s3_url, in ascending alphabetical order
- Raises
S3FileNotFoundError, S3NotADirectoryError
-
listdir(followlinks: bool = False) → List[str][source] Get all contents of the given s3_url. The result is in ascending alphabetical order.
- Returns
All contents that have the prefix of s3_url, in ascending alphabetical order
- Raises
S3FileNotFoundError, S3NotADirectoryError
-
load(followlinks: bool = False) → BinaryIO[source] Read all content at the specified path in binary mode and load it into memory
The user should close the returned BinaryIO manually
- Returns
BinaryIO
-
lstat() → megfile.pathlike.StatResult[source] Like Path.stat() but, if the path points to a symbolic link, return the symbolic link’s information rather than its target’s.
-
md5(recalculate: bool = False, followlinks: bool = False) → str[source] Get the md5 meta info of files that were uploaded/copied via megfile
If the meta info is lost or does not exist, return None
- Parameters
recalculate – If True, calculate the md5 in real time; otherwise return the s3 etag
followlinks – If True, calculate the md5 of the real file that the link points to
- Returns
md5 meta info
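Recalculating an md5 means reading the content and hashing it. A minimal sketch of chunked hashing with the standard library follows; stream_md5 is a hypothetical helper and the in-memory stream stands in for an opened s3 file, so this is not megfile's code path. As a general S3 caveat, an ETag is not always the md5 of the content (for example after a multipart upload), which is one reason a real-time calculation can differ from the stored value.

```python
import hashlib
import io


def stream_md5(file_object, chunk_size=8192):
    """Compute the hex md5 of a binary stream without loading it all at once."""
    digest = hashlib.md5()
    # iter() with a sentinel keeps reading until read() returns b"".
    for chunk in iter(lambda: file_object.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


checksum = stream_md5(io.BytesIO(b"hello"), chunk_size=2)
```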
-
mkdir(mode=511, parents: bool = False, exist_ok: bool = False)[source] Create an s3 directory. Creating a bare directory is not possible because it is unavailable on OSS, so this function only tests that the target bucket has WRITE access.
- Parameters
mode – Ignored; accepted only for compatibility with pathlib.Path (the default 511 is 0o777)
parents – Ignored; accepted only for compatibility with pathlib.Path
exist_ok – If False and target directory exists, raise S3FileExistsError
- Raises
S3BucketNotFoundError, S3FileExistsError
-
move(dst_url: Union[str, os.PathLike]) → None[source] Move the file/directory from src_url to dst_url
- Parameters
dst_url – Given destination path
-
open(mode: str = 'r', *, encoding: Optional[str] = None, errors: Optional[str] = None, s3_open_func: Callable[[str, str], BinaryIO] = <function s3_buffered_open>, **kwargs) → IO[AnyStr][source] Open the file with mode.
-
path_with_protocol[source] Return path with protocol, like file:///root, s3://bucket/key
-
path_without_protocol[source] Return path without protocol, example: if path is s3://bucket/key, return bucket/key
-
protocol= 's3'
-
readlink() → megfile.s3_path.S3Path[source] Return an S3Path instance representing the path to which the symbolic link points.
- Returns
An S3Path instance representing the path to which the symbolic link points.
- Raises
S3NameTooLongError, S3BucketNotFoundError, S3IsADirectoryError, S3NotALinkError
-
remove(missing_ok: bool = False) → None[source] Remove the file or directory on s3. Removing s3:// or s3://bucket is not permitted
- Parameters
missing_ok – If False and the target file/directory does not exist, raise S3FileNotFoundError
- Raises
S3PermissionError, S3FileNotFoundError, UnsupportedError
-
rename(dst_path: Union[str, os.PathLike]) → megfile.s3_path.S3Path[source] Move the s3 file from src_url to dst_url
- Parameters
dst_path – Given destination path
-
save(file_object: BinaryIO)[source] Write the opened binary stream to the specified path. The stream is not closed by this call
- Parameters
file_object – Stream to be read
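Because the stream is left open, closing it remains the caller's job, and a with block is one way to handle that. The sketch below uses a dict as a stand-in for s3; the save function and store here are hypothetical and only mirror the documented contract.

```python
import io


def save(store, key, file_object):
    """Read the whole stream into store[key]; closing stays the caller's job."""
    store[key] = file_object.read()


store = {}
with io.BytesIO(b"payload") as stream:
    save(store, "bucket/key", stream)
    still_open = not stream.closed  # save() did not close the stream
```

The same pattern applies to the BinaryIO returned by load(), which the user is also expected to close.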
-
scan(missing_ok: bool = True, followlinks: bool = False) → Iterator[str][source] Iteratively traverse only files in given s3 directory, in alphabetical order. Every iteration on generator yields a path string.
If s3_url is a file path, yields the file only
If s3_url is a non-existent path, return an empty generator
If s3_url is a bucket path, return all file paths in the bucket
If s3_url is an empty bucket, return an empty generator
If s3_url doesn’t contain any bucket, which is s3_url == ‘s3://’, raise UnsupportedError. walk() on complete s3 is not supported in megfile
- Parameters
missing_ok – If False and there’s no file in the directory, raise FileNotFoundError
- Raises
UnsupportedError
- Returns
A file path generator
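The cases listed for scan can be sketched over a flat list of keys. This models the documented behaviour only and is not megfile's implementation: the scan function and KEYS below are hypothetical, paths are written without the s3:// protocol, and ValueError stands in for UnsupportedError.

```python
def scan(keys, path, missing_ok=True):
    """Yield file paths at or below path, in alphabetical order."""
    if path.strip("/") == "":
        # Stand-in for UnsupportedError when no bucket is given.
        raise ValueError("scanning all of s3 is not supported")
    prefix = path.rstrip("/") + "/"
    # A file path matches itself; a directory matches everything below it.
    found = sorted(k for k in keys if k == path or k.startswith(prefix))
    if not found and not missing_ok:
        raise FileNotFoundError(path)
    yield from found


KEYS = ["bucket/b.txt", "bucket/a/x.txt", "bucket/a.txt"]
```

Since scan is a generator, the errors in this sketch surface when iteration starts rather than when the function is called.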
-
scan_stat(missing_ok: bool = True, followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source] Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat
- Parameters
missing_ok – If False and there’s no file in the directory, raise FileNotFoundError
- Raises
UnsupportedError
- Returns
A generator of tuples of path string and file stat
-
scandir(followlinks: bool = False) → Iterator[megfile.pathlike.FileEntry][source] Get all contents of given s3_url, the order of result is not guaranteed.
- Returns
All contents have prefix of s3_url
- Raises
S3FileNotFoundError, S3NotADirectoryError
-
stat(follow_symlinks=True) → megfile.pathlike.StatResult[source] Get the StatResult of the s3_url file, including file size and mtime; see s3_getsize and s3_getmtime
If s3_url is not an existing path, which means s3_exist(s3_url) returns False, raise S3FileNotFoundError
If attempting to get the StatResult of complete s3, such as s3_dir_url == ‘s3://’, raise S3BucketNotFoundError
- Returns
StatResult
- Raises
S3FileNotFoundError, S3BucketNotFoundError
-
symlink(dst_path: Union[str, os.PathLike]) → None[source] Create a symbolic link pointing to src_path named dst_path.
- Parameters
dst_path – Destination path
- Raises
S3NameTooLongError, S3BucketNotFoundError, S3IsADirectoryError
-
sync(dst_url: Union[str, os.PathLike], followlinks: bool = False, force: bool = False) → None[source] Copy file/directory on src_url to dst_url
- Parameters
dst_url – Given destination path
followlinks – If False, treat a symlink as a file; if True, follow it
force – If True, sync every file, including files that are already the same at the destination
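The effect of force can be sketched with two dicts standing in for the source and destination. How megfile decides that two files are the same is not stated here, so byte equality is used purely as an assumption; the sync function below is hypothetical and returns the copied keys only to make the behaviour visible.

```python
def sync(src, dst, force=False):
    """Copy entries of src into dst; return the keys actually copied."""
    copied = []
    for key in sorted(src):
        if not force and dst.get(key) == src[key]:
            # Assumed notion of "same file": identical bytes at the destination.
            continue
        dst[key] = src[key]
        copied.append(key)
    return copied


source = {"a": b"1", "b": b"2"}
target = {"a": b"1"}
first_run = sync(source, target)
forced_run = sync(source, target, force=True)
```

The first run copies only the missing entry, while the forced run copies everything again even though the destination is already up to date.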
-
unlink(missing_ok: bool = False) → None[source] Remove the file on s3
- Parameters
missing_ok – If False and the target file does not exist, raise S3FileNotFoundError
- Raises
S3PermissionError, S3FileNotFoundError, S3IsADirectoryError
-
walk(followlinks: bool = False) → Iterator[Tuple[str, List[str], List[str]]][source] Iteratively traverse the given s3 directory in top-down order. In other words, traverse the parent directory first, then, if subdirectories exist, traverse them in alphabetical order. Every iteration of the generator yields a 3-tuple: (root, dirs, files)
root: Current s3 path;
dirs: Name list of subdirectories in current directory. The list is sorted by name in ascending alphabetical order;
files: Name list of files in current directory. The list is sorted by name in ascending alphabetical order;
If s3_url is a file path, return an empty generator
If s3_url is a non-existent path, return an empty generator
If s3_url is a bucket path, bucket will be the top directory, and will be returned at first iteration of generator
If s3_url is an empty bucket, only yield one 3-tuple (notes: s3 doesn’t have empty directory)
If s3_url doesn’t contain any bucket, which is s3_url == ‘s3://’, raise UnsupportedError. walk() on complete s3 is not supported in megfile
- Parameters
followlinks – The result is the same whether followlinks is True or False, because s3 symlinks do not support directories.
- Raises
UnsupportedError
- Returns
A 3-tuple generator
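The top-down order and the shape of the yielded 3-tuples can be sketched over a flat list of keys. This is an illustration rather than megfile's implementation: the walk function and KEYS are hypothetical, paths omit the s3:// protocol, and the empty-bucket and s3:// cases are not modelled.

```python
def walk(keys, top):
    """Yield (root, dirs, files) top-down for '/'-separated keys under top."""
    prefix = top.rstrip("/") + "/"
    dirs, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # ignore a key that is exactly the prefix
        head, sep, _ = rest.partition("/")
        if sep:
            dirs.add(head)  # something lives below this name: a subdirectory
        else:
            files.append(head)
    if not dirs and not files:
        return  # file path or non-existent path: empty generator
    dirs = sorted(dirs)
    yield top.rstrip("/"), dirs, sorted(files)
    for name in dirs:
        # The parent is yielded first, then subdirectories in alphabetical order.
        yield from walk(keys, prefix + name)


KEYS = ["bucket/a/x.txt", "bucket/a/y.txt", "bucket/b/z.txt", "bucket/top.txt"]
```

Walking bucket in this model yields the bucket itself first, followed by bucket/a and bucket/b, while walking a file path such as bucket/top.txt yields nothing.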
-